Application of unsupervised learning methods to an author separation task

Authors
Barlett, Kevin W.
Issue Date
2008-08
Type
Electronic thesis
Thesis
Language
ENG
Keywords
Computer science
Abstract
As the amount of information stored as text grows, the ability to organize and sort that information efficiently becomes crucial. Supervised learning methods solve some of these tasks; however, they rely on human interaction for the initial classification of texts. Unsupervised learning methods also solve some of these tasks and, by removing the human component, increase speed and efficiency, thereby lowering the barriers to applying them. This work explores the common structure that unsupervised learning methods take when applied to text mining problems, followed by a brief overview of its four components: feature extraction, feature selection, clustering, and cluster evaluation. This framework is then applied to an author separation problem, in which many excerpts of literary works are presented and the task is to divide them into groupings corresponding to individual authors. The applicability of various learning methods is then considered based on their relative performance on the given task.
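The four components named in the abstract (feature extraction, feature selection, clustering, and cluster evaluation) can be sketched as a small pipeline. This is only an illustration under stated assumptions, not the method used in the thesis: the choice of word-frequency features, variance-based selection, k-means with farthest-first seeding, and purity as the evaluation measure are all assumptions, as are the function names, the vocabulary, and the four placeholder excerpts, which are made up and not taken from any literary work.

```python
import re
from collections import Counter

def extract_features(texts, vocabulary):
    """Feature extraction: relative frequency of each vocabulary word per text."""
    rows = []
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts = Counter(tokens)
        total = max(len(tokens), 1)
        rows.append([counts[word] / total for word in vocabulary])
    return rows

def select_features(rows, k):
    """Feature selection: keep the k columns with the highest variance."""
    n = len(rows)
    def variance(j):
        column = [row[j] for row in rows]
        mean = sum(column) / n
        return sum((x - mean) ** 2 for x in column) / n
    keep = sorted(sorted(range(len(rows[0])), key=variance, reverse=True)[:k])
    return [[row[j] for j in keep] for row in rows], keep

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, iterations=20):
    """Clustering: plain k-means with deterministic farthest-first seeding."""
    centroids = [list(rows[0])]
    while len(centroids) < k:
        farthest = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(farthest))
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assign each row to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(row, centroids[c]))
                  for row in rows]
        # Move each centroid to the mean of its members (skip empty clusters).
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def purity(labels, authors):
    """Cluster evaluation: share of texts that match their cluster's majority author."""
    correct = 0
    for c in set(labels):
        members = [a for a, lab in zip(authors, labels) if lab == c]
        correct += Counter(members).most_common(1)[0][1]
    return correct / len(authors)

# Hypothetical placeholder excerpts (invented for illustration only).
texts = [
    "the cat and the dog and the bird",
    "the sun and the moon and the sea",
    "of rain of wind of snow but of ice",
    "of gold of iron but of salt of tin",
]
authors = ["A", "A", "B", "B"]          # known only for evaluation
vocabulary = ["the", "and", "of", "but", "a"]

features = extract_features(texts, vocabulary)
selected, kept_columns = select_features(features, k=3)
labels = kmeans(selected, k=2)
score = purity(labels, authors)
print(labels, score)
```

The author labels are used only in the final evaluation step; the clustering itself never sees them, which is what makes the approach unsupervised.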
Description
August 2008
School of Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY