Received 01 Oct 2018
Accepted 11 Sep 2019
Accepted author version posted online: 26 Sep 2019
Published online: 29 Oct 2019
 

ABSTRACT

Sparse principal component analysis (PCA) is used to obtain stable and interpretable principal components (PCs) from high-dimensional data. A robust sparse PCA method is proposed to handle potential outliers in the data. The proposed method is based on the least trimmed squares PCA method, which provides robust but non-sparse PC estimates. To obtain sparse solutions, our method incorporates a regularization penalty on the loading vectors. The principal directions are determined sequentially to prevent outliers in the PC subspace from destroying the sparse structure of the loadings. Simulation studies and real data examples show that the new method gives accurate estimates, even when the data are highly contaminated. Moreover, the computation time is substantially lower than that of existing robust sparse PCA methods. Supplementary materials providing more simulation results and discussion, and an R package to compute the proposed method, are available online.
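
To make the combination of trimming and sparsity concrete, the following is a minimal sketch of a single trimmed sparse principal direction. It is not the LTS-SPCA algorithm of the article and not the interface of the ltsspca package. It assumes coordinate-wise median centering, an initial subset of the h rows closest to the center, and a soft-thresholding update with a relative threshold `lam`. The function names `soft_threshold`, `trimmed_sparse_pc`, and `make_demo_data` are hypothetical and introduced here for illustration only.

```python
import numpy as np


def soft_threshold(z, thresh):
    """Shrink each entry of z toward zero by thresh (lasso-type shrinkage)."""
    return np.sign(z) * np.maximum(np.abs(z) - thresh, 0.0)


def trimmed_sparse_pc(X, h, lam=0.1, n_iter=50):
    """Illustrative first sparse PC fitted on the h best-fitting rows.

    Hypothetical helper, not the ltsspca implementation.
    lam in [0, 1) is a relative threshold (an assumption of this sketch):
    loadings whose update is below lam * max|z| are set to zero.
    """
    Xc = X - np.median(X, axis=0)  # robust centering (assumption)
    # Start from the h rows closest to the center.
    subset = np.argsort(np.linalg.norm(Xc, axis=1))[:h]
    v = np.linalg.svd(Xc[subset], full_matrices=False)[2][0]
    for _ in range(n_iter):
        scores = Xc @ v
        # Squared orthogonal distance of each row to the line spanned by v.
        resid = np.sum((Xc - np.outer(scores, v)) ** 2, axis=1)
        new_subset = np.sort(np.argsort(resid)[:h])  # trimming step
        z = Xc[new_subset].T @ scores[new_subset]
        z = soft_threshold(z, lam * np.max(np.abs(z)))  # sparsity step
        norm = np.linalg.norm(z)
        if norm == 0.0:
            break
        v_new = z / norm
        converged = (np.array_equal(new_subset, np.sort(subset))
                     and np.allclose(v_new, v))
        v, subset = v_new, new_subset
        if converged:
            break
    return v, subset


def make_demo_data(seed=0, n=100, p=10, n_out=20):
    """Toy data: one sparse direction plus n_out outlying rows (illustrative)."""
    rng = np.random.default_rng(seed)
    v_true = np.zeros(p)
    v_true[:3] = 1.0 / np.sqrt(3.0)
    X = np.outer(rng.normal(0.0, 5.0, n), v_true) + 0.1 * rng.normal(size=(n, p))
    X[:n_out] = 0.1 * rng.normal(size=(n_out, p))
    X[:n_out, -2:] += 20.0  # outliers load on the last two variables
    return X, v_true
```

Calling `trimmed_sparse_pc(X, h=75)` on the toy data returns a unit-length loading vector together with the indices of the rows retained by the trimming step. Further directions could be obtained one at a time by deflating the data and repeating the fit, which is only a loose analogue of the sequential estimation described in the abstract.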

Supplementary Materials

The supplementary materials for this article are contained in a single archive (zip file) which is available online and can be downloaded from the journal’s website. The supplementary materials consist of the following items.

Extra results: The supplementary results file provides some empirical results regarding the estimation of the center, conditions on the eigenvalues for successful recovery of the sparse PCs, additional simulation results, a more detailed discussion on the escalator video example, a toy example illustrating the advantages of sequential estimation of the PCs, and the results of a simulation study with five PCs. (PDF file)

R package ltsspca: The R package ltsspca contains code to perform the LTS-SPCA method described in the article. The package also contains code to generate datasets according to the simulation design in the article. The R package can also be downloaded at https://wis.kuleuven.be/stat/robust/software. (GNU zipped tar file)

Additional information

Funding

The authors gratefully acknowledge the support of the International Funds KU Leuven grant C16/15/068 and COST Action IC1408 CRoNoS.
