On the application of reinforcement learning in the task of choosing the optimal trajectory
News of the Kabardin-Balkar scientific center of RAS, Volume 27 (2025) no. 2, pp. 86-102
See the article record from the source Math-Net.Ru
This paper reviews state-of-the-art reinforcement learning methods, with a focus on their
application in dynamic and complex environments. The study begins by analysing the main approaches to
reinforcement learning, such as dynamic programming, Monte Carlo methods, temporal-difference methods,
and policy gradients. Special attention is given to the Generative Adversarial Imitation Learning (GAIL)
methodology and its impact on the optimisation of agents' strategies. A study of model-free learning is
presented and criteria for selecting agents capable of operating in continuous action and state spaces are
highlighted. The experimental part is devoted to analysing the learning of agents using different types of
sensors, including visual sensors, and demonstrates their ability to adapt to the environment despite
resolution constraints. A comparison of results based on cumulative reward and episode length is
presented, revealing improved agent performance in the later stages of training. The study confirms that
the use of simulated learning significantly improves agent performance by reducing time costs and
improving decision-making strategies. The work opens the way to further study of
mechanisms for improving sensor resolution and fine-tuning hyperparameters.
Keywords: reinforcement learning, optimal trajectory, highly automated vehicles,
policy-based learning, actor-critic architectures, simulated learning, sensors, continuous states, discrete
states, PPO
Keywords: intelligent agents, SAC
@article{IZKAB_2025_27_2_a5,
author = {M. G. Gorodnichev},
title = {On the application of reinforcement learning in the task of choosing the optimal trajectory},
journal = {News of the Kabardin-Balkar scientific center of RAS},
pages = {86--102},
publisher = {mathdoc},
volume = {27},
number = {2},
year = {2025},
language = {ru},
url = {http://geodesic.mathdoc.fr/item/IZKAB_2025_27_2_a5/}
}
TY  - JOUR
AU  - M. G. Gorodnichev
TI  - On the application of reinforcement learning in the task of choosing the optimal trajectory
JO  - News of the Kabardin-Balkar scientific center of RAS
PY  - 2025
SP  - 86
EP  - 102
VL  - 27
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/IZKAB_2025_27_2_a5/
LA  - ru
ID  - IZKAB_2025_27_2_a5
ER  -
%0 Journal Article
%A M. G. Gorodnichev
%T On the application of reinforcement learning in the task of choosing the optimal trajectory
%J News of the Kabardin-Balkar scientific center of RAS
%D 2025
%P 86-102
%V 27
%N 2
%I mathdoc
%U http://geodesic.mathdoc.fr/item/IZKAB_2025_27_2_a5/
%G ru
%F IZKAB_2025_27_2_a5
M. G. Gorodnichev. On the application of reinforcement learning in the task of choosing the optimal trajectory. News of the Kabardin-Balkar scientific center of RAS, Volume 27 (2025) no. 2, pp. 86-102. http://geodesic.mathdoc.fr/item/IZKAB_2025_27_2_a5/