About an algorithm for consistent weights initialization of deep neural networks and neural networks ensemble learning
Vestnik Sankt-Peterburgskogo universiteta. Prikladnaâ matematika, informatika, processy upravleniâ, no. 4 (2016), pp. 66-74. This article was harvested from the Math-Net.Ru source.


The use of the pretraining mechanism for multilayer perceptrons has greatly improved the quality and speed of training deep networks. In this paper we propose another way of initializing the weights, drawing on the principles of supervised learning, the self-taught learning approach, and transfer learning. Specifically, we propose an iterative weights-initialization algorithm based on successive refinement of the hidden-layer weights of the neural network by solving the original classification or regression problem, as well as a method for constructing a neural network ensemble that arises naturally from the proposed learning algorithm. Experiments demonstrating the performance of the approach have been carried out, and further steps and directions for developing the method are suggested. Refs 14. Figs 5. Tables 2.
Keywords: deep learning, neural networks weights initialization, ensemble of neural networks.
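
The abstract describes the algorithm only at a high level; below is a minimal PyTorch sketch of one plausible reading of the scheme: hidden layers are added one at a time, at each depth the whole network is retrained on the original classification task, and the intermediate networks are collected into an ensemble. All function names, layer sizes and training settings are illustrative assumptions and are not taken from the paper.

# A minimal sketch (not the paper's code) of layer-wise supervised weight
# initialization: hidden layers are added one at a time, the whole network
# is retrained on the original classification task at each depth, and the
# intermediate networks are kept as ensemble members. All layer sizes,
# optimizer settings and epoch counts are illustrative assumptions.
import copy

import torch
import torch.nn as nn


def train(model, loader, epochs=5, lr=1e-3):
    """Train `model` on the original supervised task."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    return model


def layerwise_init(loader, in_dim, hidden_dims, n_classes, epochs=5):
    """Iteratively grow and retrain the network; return the deepest
    network and the ensemble of intermediate networks."""
    layers = []        # hidden layers accumulated so far
    ensemble = []      # snapshots of each intermediate network
    prev = in_dim
    for width in hidden_dims:
        layers += [nn.Linear(prev, width), nn.ReLU()]
        prev = width
        # Attach a fresh output layer and refine all weights on the task;
        # the hidden layers trained here initialize the next, deeper net.
        net = nn.Sequential(*layers, nn.Linear(prev, n_classes))
        train(net, loader, epochs)
        ensemble.append(copy.deepcopy(net))   # freeze this stage for the ensemble
    return net, ensemble


def ensemble_predict(ensemble, x):
    """Average the softmax outputs of all ensemble members."""
    with torch.no_grad():
        probs = torch.stack([net(x).softmax(dim=1) for net in ensemble])
    return probs.mean(dim=0).argmax(dim=1)

Averaging the members' softmax outputs is only one simple way to combine the intermediate networks; the paper may use a different combination rule.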
@article{VSPUI_2016_4_a5,
     author = {I. S. Drokin},
     title = {About an algorithm for consistent weights initialization of deep neural networks and neural networks ensemble learning},
     journal = {Vestnik Sankt-Peterburgskogo universiteta. Prikladna\^a matematika, informatika, processy upravleni\^a},
     pages = {66--74},
     year = {2016},
     number = {4},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/VSPUI_2016_4_a5/}
}
TY  - JOUR
AU  - I. S. Drokin
TI  - About an algorithm for consistent weights initialization of deep neural networks and neural networks ensemble learning
JO  - Vestnik Sankt-Peterburgskogo universiteta. Prikladnaâ matematika, informatika, processy upravleniâ
PY  - 2016
SP  - 66
EP  - 74
IS  - 4
UR  - http://geodesic.mathdoc.fr/item/VSPUI_2016_4_a5/
LA  - ru
ID  - VSPUI_2016_4_a5
ER  - 
%0 Journal Article
%A I. S. Drokin
%T About an algorithm for consistent weights initialization of deep neural networks and neural networks ensemble learning
%J Vestnik Sankt-Peterburgskogo universiteta. Prikladnaâ matematika, informatika, processy upravleniâ
%D 2016
%P 66-74
%N 4
%U http://geodesic.mathdoc.fr/item/VSPUI_2016_4_a5/
%G ru
%F VSPUI_2016_4_a5
I. S. Drokin. About an algorithm for consistent weights initialization of deep neural networks and neural networks ensemble learning. Vestnik Sankt-Peterburgskogo universiteta. Prikladnaâ matematika, informatika, processy upravleniâ, no. 4 (2016), pp. 66-74. http://geodesic.mathdoc.fr/item/VSPUI_2016_4_a5/

[1] Raina R., Battle A., Lee H., Packer B., Ng A. Y., “Self-taught learning: Transfer learning from unlabeled data”, Proc. of the 24th Intern. conference on machine learning (2007, June 20–24), 759–766

[2] Hinton G. E., Salakhutdinov R. R., “Reducing the dimensionality of data with Neural Networks”, Science, 313:5786 (2006), 504–507 | DOI | MR | Zbl

[3] Vincent P., Larochelle H., Lajoie I., Bengio Y., Manzagol P.-A., “Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion”, The Journal of Machine Learning Research archive, 11 (2010), 3371–3408 | MR | Zbl

[4] Masci J., Meier U., Ciresan D., Schmidhuber J., “Stacked convolutional auto-encoders for hierarchical feature extraction”, 21st Intern. conference on Artificial Neural Networks (Espoo, Finland, 2011, June 14–17), v. I, 52–59

[5] Gehring J., Miao Y., Metze F., Waibel A., “Extracting deep bottleneck features using stacked auto-encoders”, Acoustics, speech and signal processing, IEEE Intern. conference ICASSP (2013), 3377–3381

[6] Baldi P., Hornik K., “Neural Networks and principal component analysis: learning from examples without local minima”, Neural Networks, 2 (1989), 53–58 | DOI

[7] Caruana R., “Multitask learning”, Machine learning, 28 (1997), 41–75 | DOI

[8] UFLDL Tutorial (accessed 19.04.2016) http://ufldl.stanford.edu/wiki/index.php/UFLDL_Tutorial

[9] Ciresan D. C., Meier U., Schmidhuber J., “Transfer learning for Latin and Chinese characters with deep Neural Networks”, The 2012 Intern. joint conference on Neural Networks (IJCNN) (2012), 1–6

[10] CIFAR (accessed 19.04.2016) http://www.cs.toronto.edu/~kriz/cifar.html

[11] Rolfe J. T., LeCun Y., “Discriminative recurrent sparse auto-encoders”, The Intern. conference on learning representations (2013)

[12] Masci J., Meier U., Ciresan D., Schmidhuber J., “Stacked convolutional auto-encoders for hierarchical feature extraction”, Intern. conference artificial Neural Networks and machine learning (2011), 52–59

[13] Glorot X., Bengio Y., “Understanding the difficulty of training deep feedforward neural networks”, Intern. conference on artificial intelligence and statistics (2010), 249–256

[14] Pascanu R., Mikolov T., Bengio Y., Understanding the exploding gradient problem, Tech. Rep., Universite de Montreal, Montreal, 2012, 11 pp.