Empirical analysis of sparse principal component analysis
Author
Mastylo, Damian Z.
Other Contributors
Magdon-Ismail, Malik; Anshelevich, Elliot; Patterson, Stacy
Date Issued
2016-05
Subject
Computer science
Degree
MS
Terms of Use
This electronic version is a licensed copy owned by Rensselaer Polytechnic Institute, Troy, NY. Copyright of original work retained by author. Attribution-NonCommercial-NoDerivs 3.0 United States
Abstract
Sparse Principal Component Analysis (SPCA) builds upon regular Principal Component Analysis (PCA) by including a sparsity factor to further reduce the number of dimensions. The goal of this thesis is to demonstrate the benefits of using an SPCA method that focuses on minimizing information loss as opposed to maximizing variance. Current state-of-the-art SPCA methods include TPower, GPower, and Zou's SPCA as implemented in the SpaSM toolkit. These methods focus on maximizing variance. We hypothesize that the other approach, minimizing information loss, may yield better results in machine learning. We employ a practical approach to examine this problem.
Many optimizations are possible by tweaking the sparsity factor, the number of left singular vectors used, or the column subset selection method. Many combinations of these approaches are examined, and their efficacy is reported by comparing information loss, symmetric variance, and the classification accuracy of a Support Vector Machine (SVM) using the transformed data set.
Description
May 2016; School of Science
Department
Dept. of Computer Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY
Relationships
Rensselaer Theses and Dissertations Online Collection
Access
CC BY-NC-ND. Users may download and share copies with attribution in accordance with a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License. No commercial use or derivatives are permitted without the explicit approval of the author.