Journal articles

Selecting the number of components in PCA using cross-validation approximations

Abstract : Cross-validation is a tried and tested approach to selecting the number of components in principal component analysis (PCA); its main drawback, however, is its computational cost. In a regression (or non-parametric regression) setting, criteria such as generalized cross-validation (GCV) provide convenient approximations to leave-one-out cross-validation. They are based on the relation between the prediction error and the residual sum of squares weighted by elements of a projection matrix (or a smoothing matrix). Such a relation is established here for PCA, using an original presentation of PCA with a unique projection matrix. It enables the definition of two cross-validation approximation criteria: the smoothing approximation of the cross-validation criterion (SACV) and the GCV criterion. The method is assessed with simulations and gives promising results.
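To make the regression-side relation mentioned in the abstract concrete, here is a minimal sketch for ordinary least squares only. It is not the SACV or GCV criterion that the article derives for PCA. It uses the standard identity that the leave-one-out residual equals the ordinary residual divided by one minus the corresponding diagonal element of the projection matrix, and the usual GCV form that replaces each diagonal element by their average. The function name `loo_and_gcv` and the simulated data are assumptions made for illustration.

```python
import numpy as np

def loo_and_gcv(X, y):
    """Return (brute-force LOO error, projection-matrix shortcut, GCV) for OLS."""
    n = X.shape[0]
    # Projection ("hat") matrix onto the column space of X.
    H = X @ np.linalg.pinv(X)
    residuals = y - H @ y
    leverages = np.diag(H)

    # Shortcut: the leave-one-out residual is e_i / (1 - h_ii).
    loo_shortcut = np.mean((residuals / (1.0 - leverages)) ** 2)

    # GCV replaces every h_ii by the average tr(H) / n.
    gcv = np.mean(residuals ** 2) / (1.0 - np.trace(H) / n) ** 2

    # Brute force: refit the model n times, each time without observation i.
    errors = []
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        errors.append((y[i] - X[i] @ beta) ** 2)
    loo_brute = np.mean(errors)
    return loo_brute, loo_shortcut, gcv

# Hypothetical simulated data: intercept plus three Gaussian predictors.
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.5, size=50)

loo_brute, loo_shortcut, gcv = loo_and_gcv(X, y)
print(loo_brute, loo_shortcut, gcv)
```

The first two values agree up to floating-point error, which is why the shortcut avoids the n refits; GCV is a further approximation and will generally differ slightly from both.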

https://hal-agrocampus-ouest.archives-ouvertes.fr/hal-00729614
Contributor : Céline Martel
Submitted on : Friday, September 7, 2012 - 3:47:39 PM
Last modification on : Friday, May 20, 2022 - 9:04:43 AM

Citation

Julie Josse, François Husson. Selecting the number of components in PCA using cross-validation approximations. Computational Statistics and Data Analysis, Elsevier, 2012, 56 (6), pp.1869-1879. ⟨10.1016/j.csda.2011.11.012⟩. ⟨hal-00729614⟩