On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning
Theoretical and applied mechanics, Volume 52 (2025) no. 1, p. 67

See the article record from the source eLibrary of the Mathematical Institute of the Serbian Academy of Sciences and Arts

We analyze geometric aspects of the gradient descent algorithm in Deep Learning (DL), and give a detailed discussion of the circumstance that, in underparametrized DL networks, zero loss minimization cannot generically be attained. As a consequence, we conclude that the distribution of training inputs must necessarily be non-generic in order to produce zero loss minimizers, both for the method constructed in \cite{cheewa-2,cheewa-4} and for gradient descent \cite{ch-7} (which assume clustering of training data).
DOI : 10.2298/TAM250121008C
Classification : 57R70, 62M45
Keywords: deep learning, underparametrization, generic training data, zero loss
Thomas Chen; Patricia Muñoz Ewald. On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning. Theoretical and applied mechanics, Volume 52 (2025) no. 1, p. 67. doi: 10.2298/TAM250121008C
@article{10_2298_TAM250121008C,
     author = {Thomas Chen and Patricia Mu\~noz Ewald},
     title = {On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning},
     journal = {Theoretical and applied mechanics},
     pages = {67},
     year = {2025},
     volume = {52},
     number = {1},
     doi = {10.2298/TAM250121008C},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.2298/TAM250121008C/}
}
TY  - JOUR
AU  - Thomas Chen
AU  - Patricia Muñoz Ewald
TI  - On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning
JO  - Theoretical and applied mechanics
PY  - 2025
SP  - 67 
VL  - 52
IS  - 1
UR  - http://geodesic.mathdoc.fr/articles/10.2298/TAM250121008C/
DO  - 10.2298/TAM250121008C
LA  - en
ID  - 10_2298_TAM250121008C
ER  - 
%0 Journal Article
%A Thomas Chen
%A Patricia Muñoz Ewald
%T On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning
%J Theoretical and applied mechanics
%D 2025
%P 67 
%V 52
%N 1
%U http://geodesic.mathdoc.fr/articles/10.2298/TAM250121008C/
%R 10.2298/TAM250121008C
%G en
%F 10_2298_TAM250121008C
