On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning

Thomas Chen; Patricia Muñoz Ewald

doi:10.2298/TAM250121008C

Theoretical and applied mechanics, Volume 52 (2025) no. 1, p. 67

See the article record from the source eLibrary of Mathematical Institute of the Serbian Academy of Sciences and Arts

Abstract

We analyze geometric aspects of the gradient descent algorithm in Deep Learning (DL), and discuss in detail why, in underparametrized DL networks, zero loss minimization cannot generically be attained. As a consequence, we conclude that the distribution of training inputs must necessarily be non-generic in order to produce zero loss minimizers, both for the method constructed in \cite{cheewa-2,cheewa-4} and for gradient descent \cite{ch-7} (which assume clustering of training data).
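The underparametrization obstruction can be seen in a deliberately tiny setting. The sketch below is not the construction from the paper: it assumes a hypothetical one-parameter linear model f(x) = w * x fitted to three training pairs, with made-up data, and uses the closed-form least-squares minimizer in place of gradient descent. It only illustrates that, with fewer parameters than training equations, the minimal $\mathcal L^2$ loss is positive for arbitrarily chosen targets and vanishes only when the data satisfy a special (non-generic) relation.

```python
# Toy illustration only (assumed setup, not the paper's networks):
# a one-parameter linear model f(x) = w * x and three training pairs,
# so there are fewer parameters (1) than equations to satisfy (3).

def best_fit_loss(xs, ys):
    """Return (w_star, loss), where w_star minimizes sum((w*x - y)^2) / (2N)."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    w_star = sxy / sxx  # closed-form least-squares minimizer
    loss = sum((w_star * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * len(xs))
    return w_star, loss

xs = [1.0, 2.0, 3.0]
generic_ys = [1.0, 0.5, 2.0]  # arbitrary targets, not proportional to xs
special_ys = [2.0, 4.0, 6.0]  # non-generic targets lying exactly on y = 2x

w_generic, loss_generic = best_fit_loss(xs, generic_ys)
w_special, loss_special = best_fit_loss(xs, special_ys)

print("generic data: minimal loss =", loss_generic)
print("special data: minimal loss =", loss_special)
```

In this toy case the set of targets that admit zero loss is a line inside three-dimensional target space, so almost every choice of targets leaves a strictly positive minimal loss.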

Classification: 57R70, 62M45
Keywords: deep learning, underparametrization, generic training data, zero loss

Thomas Chen; Patricia Muñoz Ewald. On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning. Theoretical and applied mechanics, Volume 52 (2025) no. 1, p. 67. doi: 10.2298/TAM250121008C

@article{10_2298_TAM250121008C,
     author = {Thomas Chen and Patricia Mu\~noz Ewald},
     title = {On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning},
     journal = {Theoretical and applied mechanics},
     pages = {67},
     year = {2025},
     volume = {52},
     number = {1},
     doi = {10.2298/TAM250121008C},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.2298/TAM250121008C/}
}

TY  - JOUR
AU  - Thomas Chen
AU  - Patricia Muñoz Ewald
TI  - On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning
JO  - Theoretical and applied mechanics
PY  - 2025
SP  - 67 
VL  - 52
IS  - 1
UR  - http://geodesic.mathdoc.fr/articles/10.2298/TAM250121008C/
DO  - 10.2298/TAM250121008C
LA  - en
ID  - 10_2298_TAM250121008C
ER  -

%0 Journal Article
%A Thomas Chen
%A Patricia Muñoz Ewald
%T On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning
%J Theoretical and applied mechanics
%D 2025
%P 67 
%V 52
%N 1
%U http://geodesic.mathdoc.fr/articles/10.2298/TAM250121008C/
%R 10.2298/TAM250121008C
%G en
%F 10_2298_TAM250121008C
