Author
Mastylo, Damian Z.
Other Contributors
Magdon-Ismail, Malik; Anshelevich, Elliot; Patterson, Stacy
Date Issued
2016-05
Subject
Computer science
Degree
MS
Terms of Use
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author.
Abstract
Sparse Principal Component Analysis (SPCA) builds upon regular Principal Component Analysis (PCA) by including a sparsity factor to further reduce the number of dimensions. The goal of this thesis is to demonstrate the benefits of using an SPCA method that focuses on minimizing information loss as opposed to maximizing variance. Current state-of-the-art SPCA methods include TPower, GPower, and Zou's SPCA as implemented in the SpaSM toolkit. These methods focus on maximizing variance. We hypothesize that the other approach, minimizing information loss, may yield better results in machine learning. We employ a practical approach to examine this problem. Many optimizations are possible by tuning the sparsity factor, the number of left singular vectors used, or the column subset selection method. Many combinations of these approaches are examined, and their efficacy is reported by comparing information loss, symmetric variance, and the classification accuracy of a Support Vector Machine (SVM) using the transformed data set.
Description
May 2016; School of Science
Department
Dept. of Computer Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY
Relationships
Rensselaer Theses and Dissertations Online Collection
Access
Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.