On two approaches to concentration for sampling without replacement
Teoriâ veroâtnostej i ee primeneniâ, Tome 61 (2016) no. 3, pp. 464-488 Cet article a éte moissonné depuis la source Math-Net.Ru

Voir la notice de l'article

This paper considers the concentration of values of functions of random variables sampled without replacement from a fixed finite set close to their expectations — a problem which is relevant to a variety of applications, including the transductive formulation of statistical learning theory. Apart from the review of known results, the paper studies two general approaches leading in many cases to sufficiently exact concentration inequalities. The first is based on the sub-Gaussian inequality of Bobkov [Ann. Probab., 32 (2004), pp. 2884–2907] for functions defined on a slice of the discrete cube. The second approach proposed by Hoeffding [J. Amer. Statist. Assoc., 58 (1963), pp. 13–30] reduces the problem to studying a sample of independent random variables.
Keywords: concentration inequalities, empirical processes, choice without replacement.
@article{TVP_2016_61_3_a2,
     author = {I. O. Tolstikhin},
     title = {On two approaches to concentration for sampling without replacement},
     journal = {Teori\^a vero\^atnostej i ee primeneni\^a},
     pages = {464--488},
     year = {2016},
     volume = {61},
     number = {3},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/TVP_2016_61_3_a2/}
}
TY  - JOUR
AU  - I. O. Tolstikhin
TI  - On two approaches to concentration for sampling without replacement
JO  - Teoriâ veroâtnostej i ee primeneniâ
PY  - 2016
SP  - 464
EP  - 488
VL  - 61
IS  - 3
UR  - http://geodesic.mathdoc.fr/item/TVP_2016_61_3_a2/
LA  - ru
ID  - TVP_2016_61_3_a2
ER  - 
%0 Journal Article
%A I. O. Tolstikhin
%T On two approaches to concentration for sampling without replacement
%J Teoriâ veroâtnostej i ee primeneniâ
%D 2016
%P 464-488
%V 61
%N 3
%U http://geodesic.mathdoc.fr/item/TVP_2016_61_3_a2/
%G ru
%F TVP_2016_61_3_a2
I. O. Tolstikhin. On two approaches to concentration for sampling without replacement. Teoriâ veroâtnostej i ee primeneniâ, Tome 61 (2016) no. 3, pp. 464-488. http://geodesic.mathdoc.fr/item/TVP_2016_61_3_a2/

[1] Vapnik V. N., Chervonenkis A. Ya., “O ravnomernoi skhodimosti chastot poyavleniya sobytii k ikh veroyatnostyam”, Teoriya veroyatn. i ee primen., 16:2 (1971), 264–280 | Zbl

[2] Vapnik V. N., Chervonenkis A. Ya., Teoriya raspoznavaniya obrazov, Nauka, M., 1974

[3] Vorontsov K. V., “Kombinatornaya teoriya pereobucheniya: rezultaty, prilozheniya i otkrytye problemy”, Matematicheskie metody raspoznavaniya obrazov, 15-ya Vseross. konf., MAKS Press, M., 2011, 40–43

[4] Tolstikhin I. O., “Lokalizatsiya otsenok izbytochnogo riska v kombinatornoi teorii pereobucheniya”, Mezhdunar. konf. IOI-9 (2012), 54–57

[5] Bardenet R., Maillard O. A., “Concentration inequalities for sampling without replacement”, Bernoulli, 21:3 (2015), 1361–1385 | DOI | MR | Zbl

[6] Bartlett P., Bousquet O., Mendelson S., “Local Rademacher Complexities”, Ann. Statist., 33:4 (2005), 1497–1537 | DOI | MR | Zbl

[7] Blum A., Langford J., “PAC-MDL bounds”, Lecture Note Comp. Sci., 2777, 2003, 344–357 | DOI | Zbl

[8] Bobkov S., “Concentration of normalized sums and a central limit theorem for noncorrelated random variables”, Ann. Probab., 32:4 (2004), 2884–2907 | DOI | MR | Zbl

[9] Boucheron S., Lugosi G., Bousquet O., “Theory of Classification: a Survey of Recent Advances”, ESAIM: Probability and Statistics, 2005, no. 9, 323–375 | DOI | MR | Zbl

[10] Boucheron S., Lugosi G., Massart P., Concentration Inequalities: A Nonasymptotic Theory of Independence, Oxford Univ. Press, Oxford, 2013, 496 pp. | MR | Zbl

[11] Bousquet O., Concentration inequalities and empirical processes theory applied to the analysis of learning algorithms, PhD Thesis, Ecole Polytechnique, 2002

[12] Bousquet O., Boucheron S., Lugosi G., Introduction to Statistical Learning Theory. Lecture Notes, Art. Intell., 2004

[13] Cortes C., Mohri M., Pechyony D., Rastogi A., Stability analysis and learning bounds for transductive regression algorithms, 2009, arXiv: 0904.0814 [cs.LG]

[14] Derbeko P., El-Yaniv R., Meir R., “Explicit learning curves for transduction and application to clustering and compression algorithms”, J. Artific. Intell. Res., 22 (2004), 117–142 | MR | Zbl

[15] El-Yaniv R., Pechyony D., “Transductive Rademacher complexity and its applications”, J. Artific. Intell. Res., 35 (2009), 193–234 | MR | Zbl

[16] Gross D., Nesme V., Note on sampling without replacing from a finite collection of matrices, 2010, arXiv: 1001.2738 [cs.IT]

[17] Hoeffding W., “Probability inequalities for sums of bounded random variables”, J. Amer. Statist. Assoc., 58:301 (1963), 13–30 | DOI | MR | Zbl

[18] Koltchinskii V., “Local Rademacher complexities and oracle inequalities in risk minimization”, Ann. Statis., 34:6 (2006), 2593–2656 | DOI | MR

[19] Koltchinskii V., Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems, Springer, Berlin, 2011, 254 pp. | MR | Zbl

[20] Koltchinskii V., Panchenko D., “Rademacher processes and bounding the risk of function learning”, High Dimensional Probability, v. II, eds. D. E. Gine, J. Wellner, Birkhauser, Boston, 1999, 443–457 | MR

[21] Pascal M., “Rates of convergence in the central limit theorem for empirical processes”, Ann. de l'institut Henri Poincaré (B) Probabiliés et Statistiques, 22:4 (1986), 381–423 | MR | Zbl

[22] McDiarmid C., “On the method of bounded differences”, Surveys Combin., 1989, 148–188 | MR | Zbl

[23] Serfling R. J., “Probability inequalities for the sum in sampling without replacement”, Ann. Statist., 1:1 (1974), 39–48 | DOI | MR

[24] Talagrand M., “New concentration inequalities in product spaces”, Inventiones Mathematicae, 126:3 (1996), 505–563 | DOI | MR | Zbl

[25] Vapnik V., Statistical Learning Theory, Wiley, New York, 1998, 768 pp. | MR | Zbl

[26] Vorontsov K. V., “Exact combinatorial bounds on the probability of overfitting for empirical risk minimization”, Pattern Recogn. Image Anal., 20:3 (2010), 269–285 | DOI

[27] Vorontsov K. V., Ivahnenko A. A., “Tight combinatorial generalization bounds for threshold conjunction rules”, Pattern Recognition and Machine Intelligence, PReMI'11, eds. S. O. Kuznetsov, D. P. Mandal, M. K. Kundu, S. K. Pal, Springer-Verlag, Berlin, 2011, 66–73 | DOI

[28] Vorontsov K. V., Frey A. I., Sokolov E. A., “Computable Combinatorial Overitting Bounds”, Machine Learning and Data Analysis, 1:6 (2013), 734–743