Empirical analysis of sparse principal component analysis

Authors
Mastylo, Damian Z.
Issue Date
2016-05
Type
Electronic thesis
Thesis
Language
ENG
Keywords
Computer science
Abstract
Sparse Principal Component Analysis (SPCA) builds upon regular Principal Component Analysis (PCA) by including a sparsity factor to further reduce the number of dimensions. The goal of this thesis is to demonstrate the benefits of an SPCA method that focuses on minimizing information loss rather than maximizing variance. Current state-of-the-art SPCA methods, including TPower, GPower, and Zou's SPCA as implemented in the SpaSM toolkit, focus on maximizing variance. We hypothesize that the alternative approach, minimizing information loss, may yield better results in machine learning, and we take a practical approach to examining this question.
Many variants exist, obtained by tweaking the sparsity factor, the number of left singular vectors used, or the column subset selection method. Many combinations of these approaches are examined, and their efficacy is reported by comparing information loss, symmetric variance, and the classification accuracy of a Support Vector Machine (SVM) using the transformed data set.
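To illustrate the kind of evaluation the abstract describes, the following is a minimal sketch (not the thesis code) of comparing a dense and a sparse PCA transform on the same data set by reconstruction error (a proxy for information loss) and SVM classification accuracy. The data set, component count, and sparsity level are illustrative assumptions; scikit-learn's SparsePCA is used as a stand-in for the SPCA variants studied in the thesis.
```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA, SparsePCA
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def evaluate(model, name):
    # Project the data into the (sparse) principal subspace.
    Z_train = model.fit_transform(X_train)
    Z_test = model.transform(X_test)

    # Information loss: squared reconstruction error relative to the data's
    # squared norm. SparsePCA components are not orthogonal, so this
    # reconstruction is only an approximation.
    X_rec = Z_train @ model.components_ + getattr(model, "mean_", 0.0)
    loss = np.linalg.norm(X_train - X_rec) ** 2 / np.linalg.norm(X_train) ** 2

    # Classification accuracy of an SVM trained on the transformed data.
    acc = SVC(kernel="linear").fit(Z_train, y_train).score(Z_test, y_test)
    print(f"{name}: relative information loss={loss:.3f}, SVM accuracy={acc:.3f}")

evaluate(PCA(n_components=5), "PCA")
evaluate(SparsePCA(n_components=5, alpha=1.0, random_state=0), "SparsePCA")
```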
Description
May 2016
School of Science
Publisher
Rensselaer Polytechnic Institute, Troy, NY