Combinatorial bounds of overfitting for threshold classifiers
Ufa mathematical journal, Tome 10 (2018) no. 1, pp. 49-63 Cet article a éte moissonné depuis la source Math-Net.Ru

Voir la notice de l'article

Estimating the generalization ability is a fundamental objective of statistical learning theory. However, accurate and computationally efficient bounds are still unknown even for many very simple cases. In this paper, we study one-dimensional threshold decision rules. We use the combinatorial theory of overfitting based on a single probabilistic assumption that all partitions of a set of objects into an observed training sample and a hidden test sample are of equal probability. We propose a polynomial algorithm for computing both probability of overfitting and of complete cross-validation. The algorithm exploits the recurrent calculation of the number of admissible paths while walking over a three-dimensional lattice between two prescribed points with restrictions of special form. We compare the obtain sharp estimate of the generalized ability and demonstrate that the known upper bound are too overstated and they can not be applied for practical problems.
Keywords: computational learning theory, empirical risk minimization, combinatorial theory of overfitting, probability of overfitting, complete cross-validation, generalization ability, threshold classifier, computational complexity.
@article{UFA_2018_10_1_a3,
     author = {Sh. Kh. Ishkina},
     title = {Combinatorial bounds of overfitting for threshold classifiers},
     journal = {Ufa mathematical journal},
     pages = {49--63},
     year = {2018},
     volume = {10},
     number = {1},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/UFA_2018_10_1_a3/}
}
TY  - JOUR
AU  - Sh. Kh. Ishkina
TI  - Combinatorial bounds of overfitting for threshold classifiers
JO  - Ufa mathematical journal
PY  - 2018
SP  - 49
EP  - 63
VL  - 10
IS  - 1
UR  - http://geodesic.mathdoc.fr/item/UFA_2018_10_1_a3/
LA  - en
ID  - UFA_2018_10_1_a3
ER  - 
%0 Journal Article
%A Sh. Kh. Ishkina
%T Combinatorial bounds of overfitting for threshold classifiers
%J Ufa mathematical journal
%D 2018
%P 49-63
%V 10
%N 1
%U http://geodesic.mathdoc.fr/item/UFA_2018_10_1_a3/
%G en
%F UFA_2018_10_1_a3
Sh. Kh. Ishkina. Combinatorial bounds of overfitting for threshold classifiers. Ufa mathematical journal, Tome 10 (2018) no. 1, pp. 49-63. http://geodesic.mathdoc.fr/item/UFA_2018_10_1_a3/

[1] Vapnik V. N., Chervonenkis A. Ya., “O ravnomernoi skhodimosti chastot poyavleniya sobytii k ikh veroyatnostyam”, Teoriya veroyatnostei i ee primeneniya, 16:2 (1971), 264–280

[2] S. Boucheron, O. Bousquet, G. Lugosi, “Theory of classification: A survey of some recent advances”, ESAIM: Probability and Statistics, 9 (2005), 323–375 | DOI | MR

[3] V. Koltchinskii, “Oracle Inequalities in Empirical Risk Minimization and Sparse Recovery Problems”, École d'Été de Probabilités de Saint-Flour XXXVIII-2008, Lecture Notes in Mathematics, Springer, 2011 | DOI | MR

[4] K. V. Vorontsov, “Combinatorial probability and the tightness of generalization bounds”, Pattern Recognition and Image Analysis, 18:2 (2008), 243–259 | DOI

[5] D. Haussler, N. Littlestone, M. K. Warmuth, “Predicting $\{0,1\}$-functions on randomly drawn points”, Inf. Comput., 115, December (1994), 248–292 | DOI | MR

[6] Vorontsov K. V., “Kombinatornye otsenki kachestva obucheniya po pretsedentam”, Doklady RAN, 394:2 (2004), 175–178 | MR

[7] Vorontsov K. V., “Tochnye otsenki veroyatnosti pereobucheniya”, Doklady RAN, 429:1 (2009), 15–18 | MR

[8] K. V. Vorontsov, “Splitting and similarity phenomena in the sets of classifiers and their effect on the probability of overfitting”, Pattern Recognition and Image Analysis, 19:3 (2009), 412–420 | DOI | MR

[9] K. V. Vorontsov, “Exact combinatorial bounds on the probability of overfitting for empirical risk minimization”, Pattern Recognition and Image Analysis, 20:3 (2010), 269–285 | DOI

[10] V. Koltchinskii, “Rademacher Penalties and Structural Risk Minimization”, IEEE Trans. Inf. Theory, 47:5 (2001), 1902–1914 | DOI | MR

[11] K. V. Vorontsov, “Combinatorial Theory of Overfitting: How Connectivity and Splitting Reduces the Local Complexity”, Proc. 9th IFIP WG 12.5 International Conference, AIAI 2013 (Paphos, Cyprus, September 30–October 2, 2013), Springer-Verlag, Berlin–Heidelberg, 2013

[12] K. V. Vorontsov, A. A. Ivahnenko, “Tight combinatorial generalization bounds for threshold conjunction rules”, 4th International Conference on Pattern Recognition and Machine Intelligence (PReMI'11) (June 27–July 1, 2011), Lecture Notes in Computer Science, Springer-Verlag, 2011, 66–73 | DOI

[13] Zhivotovskii N. K., Vorontsov K. V., “Kriterii tochnosti kombinatornykh otsenok obobschayuschei sposobnosti”, Intellektualizatsiya obrabotki informatsii, IOI-2012, Dokl., Torus Press, M., 2012, 25–28

[14] Zhuravlev Yu. I., Ryazanov V. V., Senko O. V., «Raspoznavanie». Matematicheskie metody. Programmnaya sistema. Prakticheskie primeneniya, Fazis, M., 2006, 176 pp.

[15] Zhuravlev Yu. I., “Ob algebraicheskom podkhode k resheniyu zadach raspoznavaniya ili klassifikatsii”, Problemy kibernetiki, 33, 1978, 5–68

[16] Guz I. S., “Konstruktivnye otsenki polnogo skolzyaschego kontrolya dlya porogovoi klassifikatsii”, Matematicheskaya biologiya i bioinformatika, 6:2 (2011), 173–189 | DOI | MR

[17] Vorontsov K. V., Frei A. I., Sokolov E. A., “Vychislimye kombinatornye otsenki veroyatnosti pereobucheniya”, Mashinnoe obuchenie i analiz dannykh, 1:6 (2013), 734–743

[18] Frei A. I., Tolstikhin I. O., “Kombinatornye otsenki veroyatnosti pereobucheniya na osnove klasterizatsii i pokrytii mnozhestva algoritmov”, Mashinnoe obuchenie i analiz dannykh, 1:6 (2013), 761–778

[19] Frei A. I., Tolstikhin I. O., “Kombinatornye otsenki veroyatnosti pereobucheniya na osnove pokrytii mnozhestva algoritmov”, Doklady RAN, 455:3 (2014), 265–268 | DOI