Multilevel minimization for deep residual networks
ESAIM. Proceedings, Volume 71 (2021), pp. 131-144.

See the article record from the source EDP Sciences

We present a new multilevel minimization framework for the training of deep residual networks (ResNets), which has the potential to significantly reduce training time and effort. Our framework is based on the dynamical systems viewpoint, which formulates a ResNet as the discretization of an initial value problem. The training process is then formulated as a time-dependent optimal control problem, which we discretize using different time-discretization parameters, eventually generating a multilevel hierarchy of auxiliary networks with different resolutions. The training of the original ResNet is then enhanced by training the auxiliary networks with reduced resolutions. By design, our framework is conveniently independent of the training strategy chosen on each level of the multilevel hierarchy. By means of numerical examples, we analyze the convergence behavior of the proposed method and demonstrate its robustness. For our examples, we employ multilevel gradient-based methods. Comparisons with standard single-level methods show a speedup of more than a factor of three while achieving the same validation accuracy.
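To make the dynamical systems viewpoint concrete, the following minimal sketch treats a scalar ResNet as a forward Euler discretization of an initial value problem on a fixed time interval and builds coarser auxiliary networks by halving the number of layers. It is an illustration only, not the authors' implementation: the scalar state, the tanh activation, the function names `resnet_forward`, `coarsen`, and `build_hierarchy`, and the choice of keeping every second layer as the transfer between levels are all assumptions made for this example.

```python
import math


def resnet_forward(x, weights, biases, T=1.0):
    """Forward Euler discretization of dx/dt = tanh(w(t) * x + b(t)) on [0, T].

    Each residual layer k computes x <- x + h * tanh(w_k * x + b_k),
    where h = T / N and N = len(weights) is the number of layers.
    """
    h = T / len(weights)
    for w, b in zip(weights, biases):
        x = x + h * math.tanh(w * x + b)
    return x


def coarsen(params):
    """Keep every second layer's parameter (assumed injection-style transfer)."""
    return params[::2]


def build_hierarchy(weights, biases, levels):
    """Return [(weights, biases), ...] ordered from the finest to the coarsest level."""
    hierarchy = [(weights, biases)]
    for _ in range(levels - 1):
        weights, biases = coarsen(weights), coarsen(biases)
        hierarchy.append((weights, biases))
    return hierarchy


# Hypothetical fine network with 8 layers and constant parameters.
N = 8
weights = [0.5] * N
biases = [0.1] * N

hierarchy = build_hierarchy(weights, biases, levels=3)
depths = [len(w) for w, _ in hierarchy]

# All levels discretize the same interval [0, T]; only the step size h differs.
outputs = [resnet_forward(1.0, w, b) for w, b in hierarchy]
```

Because every level integrates over the same time interval, the coarse networks have fewer layers and a larger step size but approximate the same underlying dynamics, which is why training them can plausibly inform the training of the fine network.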
DOI: 10.1051/proc/202171131

Lisa Gaedke-Merzhäuser 1 ; Alena Kopaničáková 1 ; Rolf Krause 1

1 Institute of Computational Science, Università della Svizzera italiana
@article{EP_2021_71_a12,
     author = {Lisa Gaedke-Merzh\"auser and Alena Kopani\v{c}\'akov\'a and Rolf Krause},
     title = {Multilevel minimization for deep residual networks},
     journal = {ESAIM. Proceedings},
     pages = {131--144},
     publisher = {mathdoc},
     volume = {71},
     year = {2021},
     doi = {10.1051/proc/202171131},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.1051/proc/202171131/}
}
Lisa Gaedke-Merzhäuser; Alena Kopaničáková; Rolf Krause. Multilevel minimization for deep residual networks. ESAIM. Proceedings, Volume 71 (2021), pp. 131-144. doi: 10.1051/proc/202171131. http://geodesic.mathdoc.fr/articles/10.1051/proc/202171131/
