
dc.rights.license: CC BY-NC-ND. Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.
dc.contributor: Zaki, Mohammed J., 1971-
dc.contributor: Stewart, Charles V.
dc.contributor: Gittens, Alex
dc.contributor.author: Elliott, Dylan
dc.date.accessioned: 2021-11-03T08:55:40Z
dc.date.available: 2021-11-03T08:55:40Z
dc.date.created: 2018-02-21T13:01:50Z
dc.date.issued: 2017-12
dc.identifier.uri: https://hdl.handle.net/20.500.13015/2107
dc.description: December 2017
dc.description: School of Science
dc.description.abstract: This thesis presents a comparison of four commonly used neural network models for learning to classify and encode short text sequences. We first evaluate the performance of the models on a supervised classification task over three short text datasets. The results of these tests suggest that performance can depend on a combination of the model architecture and the complexity of the features to be learned. We then train each model on a semi-supervised learning task with a K-means clustering objective for one of the short text datasets, after which we encode the dataset with the trained models and perform clustering on the encoded representations. The clustering results reveal that a model's performance on the classification task does not necessarily correlate positively with its performance on the semi-supervised task, and we relate these observations to data about each model's behavior during learning. Overall, we find that if a model does not separate its feature representations too quickly, it may have a better chance at clustering because it retains the ability to correct initial alignment mistakes. These insights provide guidance for future work in which more complex models will be used and knowledge bases will be constructed from raw text scraped from the web.
dc.language.iso: ENG
dc.publisher: Rensselaer Polytechnic Institute, Troy, NY
dc.relation.ispartof: Rensselaer Theses and Dissertations Online Collection
dc.rights: Attribution-NonCommercial-NoDerivs 3.0 United States
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/3.0/us/
dc.subject: Computer science
dc.title: Neural network architectures for short text
dc.type: Electronic thesis
dc.type: Thesis
dc.digitool.pid: 178735
dc.digitool.pid: 178736
dc.digitool.pid: 178737
dc.rights.holder: This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
dc.description.degree: MS
dc.relation.department: Dept. of Computer Science
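
The semi-supervised evaluation described in the abstract (encode the short texts with a trained model, then cluster the encoded representations with K-means) can be sketched as follows. This is a minimal illustration, not the thesis's actual code: the "encoded representations" here are hypothetical synthetic vectors standing in for the neural encodings, and the K-means routine is a plain NumPy implementation of Lloyd's algorithm.

```python
import numpy as np

def kmeans(X, k, iters=100, seed=0):
    """Plain K-means (Lloyd's algorithm): alternate between assigning
    points to the nearest centroid and recomputing centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # Assign each encoded text to its nearest centroid (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical "encoded representations": two well-separated Gaussian blobs
# standing in for the encodings of two classes of short texts.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 8)),
               rng.normal(3.0, 0.1, (20, 8))])
labels, _ = kmeans(X, k=2)
```

In the thesis's setting, cluster quality on the encodings (rather than supervised accuracy) is the quantity of interest, which is why the clustering step operates only on the model's output vectors.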

