Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases
ESAIM: Probability and Statistics, Volume 21 (2017), pp. 412-451

See the article record from the source Numdam

We investigate the optimality for model selection of the so-called slope heuristics, V-fold cross-validation and V-fold penalization in a heteroscedastic regression context with random design. We consider a new class of linear models that we call strongly localized bases and that generalize histograms, piecewise polynomials and compactly supported wavelets. We derive sharp oracle inequalities that prove the asymptotic optimality of the slope heuristics – when the optimal penalty shape is known – and of V-fold penalization. Furthermore, V-fold cross-validation seems to be suboptimal for a fixed value of V, since it asymptotically recovers the oracle learned from a sample whose size is a fraction $1 - V^{-1}$ of the original amount of data. Our results are based on genuine concentration inequalities for the true and empirical excess risks that are of independent interest. Our experiments show the good behavior of the slope heuristics for the selection of linear wavelet models. Furthermore, V-fold cross-validation and V-fold penalization have comparable efficiency.
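To make the V-fold cross-validation criterion mentioned in the abstract concrete, here is a minimal sketch in Python. It is not the authors' experimental setup (their experiments use linear wavelet models): it assumes the simplest strongly localized basis named in the abstract, a regular histogram on [0, 1), together with a hypothetical regression function and noise level chosen only for illustration. All function names are hypothetical. The helper `training_fraction` shows the fraction 1 - 1/V of the data on which each fold's estimator is trained.

```python
import math
import random


def fit_histogram(xs, ys, n_bins):
    """Least-squares fit on a regular histogram of [0, 1): the mean of y per bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        k = min(int(x * n_bins), n_bins - 1)
        sums[k] += y
        counts[k] += 1
    # Empty bins predict 0.0 (an arbitrary convention for this sketch).
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def predict(bin_means, x):
    """Evaluate the fitted piecewise-constant function at x in [0, 1)."""
    n_bins = len(bin_means)
    return bin_means[min(int(x * n_bins), n_bins - 1)]


def vfold_cv_risk(xs, ys, n_bins, V):
    """V-fold cross-validation estimate of the quadratic prediction risk."""
    n = len(xs)
    total = 0.0
    for v in range(V):
        # Interleaved folds; acceptable here because the sample is i.i.d.
        train = [i for i in range(n) if i % V != v]
        test = [i for i in range(n) if i % V == v]
        bin_means = fit_histogram(
            [xs[i] for i in train], [ys[i] for i in train], n_bins
        )
        total += sum((ys[i] - predict(bin_means, xs[i])) ** 2 for i in test)
    return total / n


def select_n_bins(xs, ys, candidates, V):
    """Return the candidate dimension that minimizes the V-fold criterion."""
    return min(candidates, key=lambda d: vfold_cv_risk(xs, ys, d, V))


def training_fraction(V):
    """Fraction of the sample used to train each fold's estimator: 1 - 1/V."""
    return 1 - 1 / V


# Simulated data (assumptions): uniform random design, heteroscedastic noise
# whose standard deviation 0.1 + 0.5 * x depends on the design point.
rng = random.Random(0)
xs = [rng.random() for _ in range(200)]
ys = [math.sin(2 * math.pi * x) + (0.1 + 0.5 * x) * rng.gauss(0.0, 1.0) for x in xs]

candidates = [1, 2, 4, 8, 16, 32]
selected = select_n_bins(xs, ys, candidates, V=5)
print("selected number of bins:", selected)
print("training fraction for V=5:", training_fraction(5))
```

With V = 5 and 200 observations, each fold's estimator is trained on 160 points, which is the reduced sample size the abstract refers to when it describes the oracle that V-fold cross-validation recovers.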

DOI: 10.1051/ps/2017005
Classification: 62G08, 62G09
Keywords: Nonparametric regression, heteroscedastic noise, random design, model selection, cross-validation, wavelets

Navarro, Fabien 1 ; Saumard, Adrien 1

1 CREST, ENSAI, Campus de Ker-Lann, rue Blaise Pascal, BP 37203, 35172 Bruz Cedex, France.
@article{PS_2017__21__412_0,
     author = {Navarro, Fabien and Saumard, Adrien},
     title = {Slope heuristics and {V-Fold} model selection in heteroscedastic regression using strongly localized bases},
     journal = {ESAIM: Probability and Statistics},
     pages = {412--451},
     publisher = {EDP-Sciences},
     volume = {21},
     year = {2017},
     doi = {10.1051/ps/2017005},
     mrnumber = {3743921},
     zbl = {1395.62093},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.1051/ps/2017005/}
}
TY  - JOUR
AU  - Navarro, Fabien
AU  - Saumard, Adrien
TI  - Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases
JO  - ESAIM: Probability and Statistics
PY  - 2017
SP  - 412
EP  - 451
VL  - 21
PB  - EDP-Sciences
UR  - http://geodesic.mathdoc.fr/articles/10.1051/ps/2017005/
DO  - 10.1051/ps/2017005
LA  - en
ID  - PS_2017__21__412_0
ER  - 
%0 Journal Article
%A Navarro, Fabien
%A Saumard, Adrien
%T Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases
%J ESAIM: Probability and Statistics
%D 2017
%P 412-451
%V 21
%I EDP-Sciences
%U http://geodesic.mathdoc.fr/articles/10.1051/ps/2017005/
%R 10.1051/ps/2017005
%G en
%F PS_2017__21__412_0
Navarro, Fabien; Saumard, Adrien. Slope heuristics and V-Fold model selection in heteroscedastic regression using strongly localized bases. ESAIM: Probability and Statistics, Volume 21 (2017), pp. 412-451. doi: 10.1051/ps/2017005
