On the application of reinforcement learning in the task of choosing the optimal trajectory
News of the Kabardin-Balkar scientific center of RAS, Volume 27 (2025) no. 2, pp. 86-102

See the article record from the source Math-Net.Ru

This paper reviews state-of-the-art reinforcement learning methods, focusing on their application in dynamic and complex environments. It begins by analysing the main approaches to reinforcement learning: dynamic programming, Monte Carlo methods, temporal-difference methods and policy gradients. Special attention is given to Generative Adversarial Imitation Learning (GAIL) and its effect on the optimisation of agents' strategies. Model-free learning is examined, and criteria are identified for selecting agents capable of operating in continuous action and state spaces. The experimental part analyses the training of agents equipped with different types of sensors, including visual sensors, and demonstrates their ability to adapt to the environment despite resolution constraints. Results are compared in terms of cumulative reward and episode length, showing improved agent performance in the later stages of training. The study confirms that imitation learning significantly improves agent performance by reducing time costs and improving decision-making strategies. Directions for further work include mechanisms for improving sensor resolution and the fine-tuning of hyperparameters.
Keywords: reinforcement learning, optimal trajectory, highly automated vehicles, policy-based learning, actor-critic architectures, imitation learning, sensors, continuous states, discrete states, PPO, intelligent agents, SAC
@article{IZKAB_2025_27_2_a5,
     author = {M. G. Gorodnichev},
     title = {On the application of reinforcement learning in the task of choosing the optimal trajectory},
     journal = {News of the Kabardin-Balkar scientific center of RAS},
     pages = {86--102},
     publisher = {mathdoc},
     volume = {27},
     number = {2},
     year = {2025},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/IZKAB_2025_27_2_a5/}
}
M. G. Gorodnichev. On the application of reinforcement learning in the task of choosing the optimal trajectory. News of the Kabardin-Balkar scientific center of RAS, Volume 27 (2025) no. 2, pp. 86-102. http://geodesic.mathdoc.fr/item/IZKAB_2025_27_2_a5/