Accelerated Zero-Order SGD under High-Order Smoothness and Overparameterized Regime
Russian Journal of Nonlinear Dynamics, Vol. 20 (2024), no. 5, pp. 759-788.


We present a novel gradient-free algorithm for solving convex stochastic optimization problems, such as those encountered in medicine, physics, and machine learning (e.g., the adversarial multi-armed bandit problem), where the objective function can only be computed through numerical simulation, either as the result of a real experiment or as feedback provided by function evaluations from an adversary. We therefore assume that only black-box access to the function values of the objective is available, possibly corrupted by adversarial noise, either deterministic or stochastic. The noisy setup arises naturally when randomness is modeled within a simulation or introduced by computer discretization, when exact function values are withheld due to privacy issues, or when nonconvex problems are solved as convex ones with an inexact function oracle. By exploiting higher-order smoothness, which holds, e.g., for logistic regression, we improve on the performance of zero-order methods developed under the assumption of classical smoothness (i.e., a Lipschitz gradient). The proposed algorithm enjoys optimal oracle complexity and is designed for the overparameterization setup, i.e., when the number of model parameters is much larger than the size of the training dataset. Overparameterized models fit the training data perfectly while still generalizing well and outperforming underparameterized models on unseen data. We provide convergence guarantees for the proposed algorithm under both types of noise. Moreover, we estimate the maximum permissible adversarial noise level that maintains the desired accuracy in the Euclidean setup, and then extend our results to a non-Euclidean setup. Our theoretical results are verified on the logistic regression problem.
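To illustrate the kind of zero-order scheme the abstract describes, the sketch below shows a generic two-point, kernel-smoothed gradient estimator plugged into plain (non-accelerated) SGD. It is a minimal illustration, not the authors' algorithm: the kernel choice K(r) = 3r (valid for smoothness order 2), the step size, and all function names are assumptions made for the example; only noisy function values are used.

import numpy as np

def kernel_two_point_estimator(f, x, h, rng, kernel=lambda r: 3.0 * r):
    """Two-point zero-order gradient estimate with kernel smoothing.

    e is a random direction on the unit sphere, r is uniform on [-1, 1],
    and the kernel K(r) (here the illustrative order-2 choice K(r) = 3r)
    exploits higher-order smoothness of the objective. Only (possibly
    noisy) function values f(.) are queried.
    """
    d = x.size
    e = rng.normal(size=d)
    e /= np.linalg.norm(e)                  # uniform direction on the sphere
    r = rng.uniform(-1.0, 1.0)
    delta = f(x + h * r * e) - f(x - h * r * e)
    return (d / (2.0 * h)) * delta * kernel(r) * e

def zero_order_sgd(f, x0, steps=2000, lr=0.05, h=1e-2, seed=0):
    """Plain zero-order SGD driven by the estimator above (no acceleration)."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for _ in range(steps):
        x -= lr * kernel_two_point_estimator(f, x, h, rng)
    return x

# Toy usage: minimize a smooth convex quadratic through function values only.
if __name__ == "__main__":
    f = lambda x: 0.5 * np.dot(x, x)
    x_final = zero_order_sgd(f, x0=np.ones(5))
    print(np.linalg.norm(x_final))          # should be close to 0

The accelerated method analyzed in the paper replaces the plain SGD update with an accelerated (momentum-type) scheme and tunes the smoothing parameter h to the noise level and smoothness order; the sketch only conveys how a gradient is built from two function evaluations.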
Keywords: zero-order optimization, gradient-free algorithms, high-order smoothness, overparametrization, kernel approximation
@article{ND_2024_20_5_a4,
     author = {G. K. Bychkov and D. M. Dvinskikh and A. V. Antsiferova and A. V. Gasnikov and A. V. Lobanov},
     title = {Accelerated {Zero-Order} {SGD} under {High-Order} {Smoothness} and {Overparameterized} {Regime}},
     journal = {Russian journal of nonlinear dynamics},
     pages = {759--788},
     publisher = {mathdoc},
     volume = {20},
     number = {5},
     year = {2024},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/ND_2024_20_5_a4/}
}
TY  - JOUR
AU  - G. K. Bychkov
AU  - D. M. Dvinskikh
AU  - A. V. Antsiferova
AU  - A. V. Gasnikov
AU  - A. V. Lobanov
TI  - Accelerated Zero-Order SGD under High-Order Smoothness and Overparameterized Regime
JO  - Russian journal of nonlinear dynamics
PY  - 2024
SP  - 759
EP  - 788
VL  - 20
IS  - 5
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/ND_2024_20_5_a4/
LA  - en
ID  - ND_2024_20_5_a4
ER  - 
%0 Journal Article
%A G. K. Bychkov
%A D. M. Dvinskikh
%A A. V. Antsiferova
%A A. V. Gasnikov
%A A. V. Lobanov
%T Accelerated Zero-Order SGD under High-Order Smoothness and Overparameterized Regime
%J Russian journal of nonlinear dynamics
%D 2024
%P 759-788
%V 20
%N 5
%I mathdoc
%U http://geodesic.mathdoc.fr/item/ND_2024_20_5_a4/
%G en
%F ND_2024_20_5_a4
G. K. Bychkov; D. M. Dvinskikh; A. V. Antsiferova; A. V. Gasnikov; A. V. Lobanov. Accelerated Zero-Order SGD under High-Order Smoothness and Overparameterized Regime. Russian Journal of Nonlinear Dynamics, Vol. 20 (2024), no. 5, pp. 759-788. http://geodesic.mathdoc.fr/item/ND_2024_20_5_a4/
