On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning
Theoretical and applied mechanics, Volume 52 (2025) no. 1, p. 67
See the article record provided by the eLibrary of the Mathematical Institute of the Serbian Academy of Sciences and Arts
We analyze geometric aspects of the gradient descent algorithm in Deep Learning (DL), and discuss in detail the fact that, in underparametrized DL networks, zero loss minimization cannot generically be attained. As a consequence, we conclude that the distribution of training inputs must necessarily be non-generic in order to produce zero loss minimizers, both for the method constructed in \cite{cheewa-2,cheewa-4} and for gradient descent \cite{ch-7} (which assume clustering of training data).
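The parameter-counting intuition behind the abstract can be illustrated with a toy example. The sketch below is not the construction from the paper: it assumes a hypothetical one-parameter linear model x -> w * x fitted to three training points, so the model is underparametrized in the simplest possible sense. The names `loss` and `gradient_descent`, the learning rate, and the data are all illustrative assumptions. With generically chosen targets, gradient descent converges to a minimizer with strictly positive loss; zero loss is reached only when the targets are non-generic (here, lying exactly on a line through the origin).

```python
import random


def loss(w, xs, ys):
    """Mean squared (L^2) loss of the one-parameter model x -> w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, lr=0.05, steps=2000):
    """Plain gradient descent on the scalar weight w, starting from w = 0."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Derivative of the mean squared loss with respect to w.
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


rng = random.Random(0)  # fixed seed so the run is reproducible
xs = [1.0, 2.0, 3.0]    # three training inputs, one trainable parameter

# Generic targets: drawn at random, so they almost surely do not lie
# on any line through the origin.
generic_ys = [rng.uniform(-1.0, 1.0) for _ in xs]

# Non-generic targets: chosen to lie exactly on the line y = 2x.
special_ys = [2.0 * x for x in xs]

w_generic = gradient_descent(xs, generic_ys)
w_special = gradient_descent(xs, special_ys)

generic_loss = loss(w_generic, xs, generic_ys)
special_loss = loss(w_special, xs, special_ys)

print("generic targets, final loss:", generic_loss)
print("non-generic targets, final loss:", special_loss)
```

In this toy setting the set of targets that admit zero loss is a one-dimensional subspace of a three-dimensional space, which is the linear analogue of the non-genericity condition described in the abstract.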
Classification: 57R70, 62M45
Keywords: deep learning, underparametrization, generic training data, zero loss
Thomas Chen; Patricia Muñoz Ewald. On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning. Theoretical and applied mechanics, Volume 52 (2025) no. 1, p. 67. doi: 10.2298/TAM250121008C
@article{10_2298_TAM250121008C,
author = {Thomas Chen and Patricia Mu\~noz Ewald},
title = {On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning},
journal = {Theoretical and applied mechanics},
pages = {67},
year = {2025},
volume = {52},
number = {1},
doi = {10.2298/TAM250121008C},
language = {en},
url = {http://geodesic.mathdoc.fr/articles/10.2298/TAM250121008C/}
}
TY - JOUR
AU - Thomas Chen
AU - Patricia Muñoz Ewald
TI - On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning
JO - Theoretical and applied mechanics
PY - 2025
SP - 67
VL - 52
IS - 1
UR - http://geodesic.mathdoc.fr/articles/10.2298/TAM250121008C/
DO - 10.2298/TAM250121008C
LA - en
ID - 10_2298_TAM250121008C
ER -
%0 Journal Article
%A Thomas Chen
%A Patricia Muñoz Ewald
%T On non-approximability of zero loss global $\mathcal L^2$ minimizers by gradient descent in deep learning
%J Theoretical and applied mechanics
%D 2025
%P 67
%V 52
%N 1
%U http://geodesic.mathdoc.fr/articles/10.2298/TAM250121008C/
%R 10.2298/TAM250121008C
%G en
%F 10_2298_TAM250121008C