On the convergence rate of the subgradient method with metric variation and its applications in neural network approximation schemes
Vestnik Tomskogo gosudarstvennogo universiteta. Matematika i mehanika, no. 55 (2018), pp. 22-37. This article was harvested from the Math-Net.Ru source.


In this paper, the relaxation subgradient method with rank-two correction of metric matrices is studied. It is proven that, on strongly convex functions, when there exists a linear coordinate transformation that reduces the degree of ill-conditioning of the problem, the method has a linear convergence rate determined by that degree of ill-conditioning. The paper offers a new efficient tool for choosing the initial approximation of an artificial neural network. The use of regularization makes it possible to eliminate overfitting and to efficiently remove insignificant neurons and inter-neuron connections. The ability to solve such problems efficiently is ensured by the subgradient method with rank-two correction of the metric matrix. It is shown experimentally that the convergence rates of the quasi-Newton method and of the method under study are virtually equivalent on smooth functions, and the method retains a high convergence rate on nonsmooth functions. These computational capabilities are used to build efficient neural network training algorithms. The paper describes an artificial neural network training algorithm which, together with suppression of redundant neurons, allows obtaining reliable approximations in a single run.
Keywords: method, subgradient, minimization, rate of convergence, neural networks, regularization.
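Below is a minimal illustrative sketch (Python/NumPy) of the general idea described in the abstract: subgradient descent in a variable metric, where the metric matrix is corrected by a rank-two update built from pairs of steps and subgradient differences. The correction used here is a generic BFGS-style formula, and the function names, step-size rule, and test function are assumptions for illustration only; this is not the specific correction formula or learning algorithm studied in the paper.

import numpy as np

def subgradient_rank2(f, subgrad, x0, n_iter=300, step0=1.0):
    """Sketch: subgradient descent with a metric matrix H updated by a
    generic BFGS-style rank-two correction (an assumption for illustration,
    not the paper's exact correction formula)."""
    x = np.asarray(x0, dtype=float)
    H = np.eye(x.size)                         # metric (inverse-Hessian-like) matrix
    g = subgrad(x)
    best_x, best_f = x.copy(), f(x)
    for k in range(1, n_iter + 1):
        d = H @ g                              # descent direction in the current metric
        nd = np.linalg.norm(d)
        if nd < 1e-12:                         # (near-)zero direction: stop
            break
        x_new = x - (step0 / k) * d / nd       # diminishing step along -H g
        g_new = subgrad(x_new)
        s, y = x_new - x, g_new - g            # pair: step and subgradient difference
        ys = y @ s
        if ys > 1e-12:                         # curvature check: skip update otherwise
            rho = 1.0 / ys
            V = np.eye(x.size) - rho * np.outer(s, y)
            H = V @ H @ V.T + rho * np.outer(s, s)   # rank-two correction of H
        x, g = x_new, g_new
        fx = f(x)
        if fx < best_f:                        # keep the best point seen so far
            best_x, best_f = x.copy(), fx
    return best_x, best_f

# usage on a nonsmooth strongly convex test function f(x) = ||x||_1 + 0.5 ||x||^2
f = lambda x: np.abs(x).sum() + 0.5 * (x @ x)
subgrad = lambda x: np.sign(x) + x             # a valid subgradient at every x
x_star, f_star = subgradient_rank2(f, subgrad, x0=np.array([3.0, -2.0, 5.0]))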
@article{VTGU_2018_55_a2,
     author = {V. N. Krutikov and N. S. Samoilenko},
     title = {On the convergence rate of the subgradient method with metric variation and its applications in neural network approximation schemes},
     journal = {Vestnik Tomskogo gosudarstvennogo universiteta. Matematika i mehanika},
     pages = {22--37},
     year = {2018},
     number = {55},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/VTGU_2018_55_a2/}
}
TY  - JOUR
AU  - V. N. Krutikov
AU  - N. S. Samoilenko
TI  - On the convergence rate of the subgradient method with metric variation and its applications in neural network approximation schemes
JO  - Vestnik Tomskogo gosudarstvennogo universiteta. Matematika i mehanika
PY  - 2018
SP  - 22
EP  - 37
IS  - 55
UR  - http://geodesic.mathdoc.fr/item/VTGU_2018_55_a2/
LA  - ru
ID  - VTGU_2018_55_a2
ER  - 
%0 Journal Article
%A V. N. Krutikov
%A N. S. Samoilenko
%T On the convergence rate of the subgradient method with metric variation and its applications in neural network approximation schemes
%J Vestnik Tomskogo gosudarstvennogo universiteta. Matematika i mehanika
%D 2018
%P 22-37
%N 55
%U http://geodesic.mathdoc.fr/item/VTGU_2018_55_a2/
%G ru
%F VTGU_2018_55_a2
V. N. Krutikov; N. S. Samoilenko. On the convergence rate of the subgradient method with metric variation and its applications in neural network approximation schemes. Vestnik Tomskogo gosudarstvennogo universiteta. Matematika i mehanika, no. 55 (2018), pp. 22-37. http://geodesic.mathdoc.fr/item/VTGU_2018_55_a2/
