Selecting the number of components in PCA using cross-validation approximations

Abstract: Cross-validation is a tried and tested approach to selecting the number of components in principal component analysis (PCA); its main drawback, however, is its computational cost. In a regression (or nonparametric regression) setting, criteria such as generalized cross-validation (GCV) provide convenient approximations to leave-one-out cross-validation. They are based on the relation between the prediction error and the residual sum of squares weighted by elements of a projection matrix (or a smoothing matrix). Such a relation is established here for PCA, using an original presentation of PCA with a unique projection matrix. It enables the definition of two cross-validation approximation criteria: the smoothing approximation of the cross-validation criterion (SACV) and the GCV criterion. The method is assessed with simulations and gives promising results.
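
The regression-side relation mentioned in the abstract can be illustrated with a minimal sketch. It covers ordinary least squares only and is not the PCA criteria (SACV, GCV) defined in the article. The function names and the simulated data are assumptions made for illustration. For a linear fit with projection matrix H, the leave-one-out residual equals the ordinary residual divided by 1 - h_ii, and GCV replaces each h_ii by the average tr(H)/n.

```python
import numpy as np

def loo_shortcut(X, y):
    """Leave-one-out mean squared error from a single fit, using diag(H)."""
    H = X @ np.linalg.pinv(X)            # projection onto the column space of X
    resid = y - H @ y
    return np.mean((resid / (1.0 - np.diag(H))) ** 2)

def gcv(X, y):
    """Generalized cross-validation: each h_ii replaced by tr(H) / n."""
    H = X @ np.linalg.pinv(X)
    resid = y - H @ y
    n = len(y)
    return np.mean(resid ** 2) / (1.0 - np.trace(H) / n) ** 2

def loo_brute_force(X, y):
    """Reference value: refit the model n times, leaving out one row each time."""
    n = len(y)
    errs = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        errs[i] = (y[i] - X[i] @ beta) ** 2
    return errs.mean()

# Simulated data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
n = 50
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])
y = X @ np.array([1.0, 2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=n)

print(loo_shortcut(X, y), loo_brute_force(X, y), gcv(X, y))
```

The shortcut and the brute-force loop agree up to floating-point error, while GCV is an approximation that needs only the trace of H. This is the kind of saving the abstract refers to when it describes cross-validation approximations.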
Document type:
Journal article
Computational Statistics and Data Analysis, 2012, 56 (6), pp.1869-1879. 〈10.1016/j.csda.2011.11.012〉

https://hal-agrocampus-ouest.archives-ouvertes.fr/hal-00729614
Contributor: Céline Martel
Submitted on: Friday, 7 September 2012 - 15:47:39
Last modified on: Friday, 22 June 2018 - 01:19:54

Citation

Julie Josse, François Husson. Selecting the number of components in PCA using cross-validation approximations. Computational Statistics and Data Analysis, 2012, 56 (6), pp.1869-1879. 〈10.1016/j.csda.2011.11.012〉. 〈hal-00729614〉
