Reinforcement Learning in the Task of Spherical Robot Motion Control
Russian Journal of Nonlinear Dynamics, Volume 20 (2024), no. 2, pp. 295-310.

View the article record from the source Math-Net.Ru

This article applies DDPG (Deep Deterministic Policy Gradient), a deep reinforcement learning algorithm, to the problem of motion control of a spherical robot. Inside the spherical robot's shell there is a platform with a wheel, and the robot is simulated in the MuJoCo physics engine. The goal is to teach the robot to move along an arbitrary closed curve with minimal error. The resulting controller is a pair of trained neural networks, an actor and a critic: the actor network produces the control torques applied to the robot's wheel, while the critic network is involved only in the learning process. Training results are presented for the robot's motion along ten arbitrary trajectories, where the main quality criterion is the average error magnitude relative to the trajectory length scale. The algorithm is implemented using the PyTorch machine learning library.
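To make the actor-critic structure concrete, below is a minimal PyTorch sketch of a DDPG agent of the kind described in the abstract. It is an illustration under assumed parameters, not the paper's implementation: the state dimension, action dimension, layer widths, torque bound, and the hyperparameters gamma and tau are all placeholder choices.

import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, MAX_TORQUE = 12, 2, 1.0  # illustrative, assumed sizes

class Actor(nn.Module):
    # Maps a robot state to bounded control torques for the internal wheel.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, ACTION_DIM), nn.Tanh())
    def forward(self, state):
        return MAX_TORQUE * self.net(state)

class Critic(nn.Module):
    # Estimates Q(s, a); used only during training, as noted in the abstract.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1))
    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1))

def ddpg_update(actor, critic, actor_t, critic_t, actor_opt, critic_opt,
                batch, gamma=0.99, tau=0.005):
    # One DDPG update on a minibatch (s, a, r, s2, done) from a replay buffer.
    s, a, r, s2, done = batch
    with torch.no_grad():
        # Bootstrapped target computed with the slowly updated target networks.
        y = r + gamma * (1.0 - done) * critic_t(s2, actor_t(s2))
    critic_loss = nn.functional.mse_loss(critic(s, a), y)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # The actor is trained to maximize the critic's value of its own actions.
    actor_loss = -critic(s, actor(s)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Polyak (soft) update of both target networks.
    with torch.no_grad():
        for t, p in zip(actor_t.parameters(), actor.parameters()):
            t.mul_(1.0 - tau).add_(tau * p)
        for t, p in zip(critic_t.parameters(), critic.parameters()):
            t.mul_(1.0 - tau).add_(tau * p)

In a complete training loop these networks would be paired with a replay buffer and exploration noise on the actor's output while interacting with the MuJoCo simulation; the optimizers would typically be Adam (see reference [13]), e.g. torch.optim.Adam(actor.parameters(), lr=1e-4).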
Keywords: control, control of a mechanical system, spherical robot, mechanics, artificial intelligence, reinforcement learning, Q-learning, actor-critic, multilayer neural network, DDPG, MuJoCo, PyTorch
@article{ND_2024_20_2_a6,
     author = {N. V. Nor},
     title = {Reinforcement {Learning} in the {Task} of {Spherical} {Robot} {Motion} {Control}},
     journal = {Russian Journal of Nonlinear Dynamics},
     pages = {295--310},
     publisher = {mathdoc},
     volume = {20},
     number = {2},
     year = {2024},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/ND_2024_20_2_a6/}
}
TY  - JOUR
AU  - N. V. Nor
TI  - Reinforcement Learning in the Task of Spherical Robot Motion Control
JO  - Russian Journal of Nonlinear Dynamics
PY  - 2024
SP  - 295
EP  - 310
VL  - 20
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/ND_2024_20_2_a6/
LA  - en
ID  - ND_2024_20_2_a6
ER  - 
%0 Journal Article
%A N. V. Nor
%T Reinforcement Learning in the Task of Spherical Robot Motion Control
%J Russian Journal of Nonlinear Dynamics
%D 2024
%P 295-310
%V 20
%N 2
%I mathdoc
%U http://geodesic.mathdoc.fr/item/ND_2024_20_2_a6/
%G en
%F ND_2024_20_2_a6
N. V. Nor. Reinforcement Learning in the Task of Spherical Robot Motion Control. Russian Journal of Nonlinear Dynamics, Volume 20 (2024), no. 2, pp. 295-310. http://geodesic.mathdoc.fr/item/ND_2024_20_2_a6/

[1] Borisov, A. V., Kilin, A. A., and Mamaev, I. S., “How to Control Chaplygin's Sphere Using Rotors”, Regul. Chaotic Dyn., 17:3–4 (2012), 258–272

[2] Karavaev, Yu. L. and Kilin, A. A., “The Dynamics and Control of a Spherical Robot with an Internal Omniwheel Platform”, Regul. Chaotic Dyn., 20:2 (2015), 134–152

[3] Karavaev, Yu. L., “Spherical Robots: An Up-to-Date Overview of Designs and Features”, Russian J. Nonlinear Dyn., 18:4 (2022), 699–740

[4] Li, M., Sun, H., Ma, L., Gao, P., Huo, D., Wang, Zh., and Sun, P., “Special Spherical Mobile Robot for Planetary Surface Exploration: A Review”, Int. J. Adv. Robot. Syst., 20:2 (2023), 20 pp.

[5] Diouf, A., Belzile, B., Saad, M., and St.-Onge, D., “Spherical Rolling Robots — Design, Modeling, and Control: A Systematic Literature Review”, Rob. Auton. Syst., 175 (2024), Art. 104657, 15 pp.

[6] Hess, G. and Ljungbergh, W., Deep Deterministic Path Following, 2021, 4 pp., arXiv:2104.06014 [cs.RO]

[7] Kamran, D., Zhu, J., and Lauer, M., “Learning Path Tracking for Real Car-Like Mobile Robots from Simulation”, European Conf. on Mobile Robots (ECMR, Prague, Czech Republic), 2019, 6 pp.

[8] Cheng, X., Zhang, S., Cheng, S., Xia, Q., and Zhang, J., “Path-Following and Obstacle Avoidance Control of Nonholonomic Wheeled Mobile Robot Based on Deep Reinforcement Learning”, Appl. Sci., 12 (2022), Art. 6874, 14 pp.

[9] Gou, W. and Liu, Y., “Trajectory Tracking Control of Wheeled Mobile Robot Based on Improved LSTM-DDPG Algorithm”, J. Phys. Conf. Ser., 2303 (2022), Art. 012069, 7 pp.

[10] Todorov, E., Erez, T., and Tassa, Y., “MuJoCo: A Physics Engine for Model-Based Control”, 2012 IEEE/RSJ Internat. Conf. on Intelligent Robots and Systems (Vilamoura-Algarve, Portugal, Oct 2012), 5026–5033

[11] Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., and Riedmiller, M., Playing Atari with Deep Reinforcement Learning, 2013, 9 pp., arXiv:1312.5602 [cs.LG]

[12] Lillicrap, T. P., Hunt, J. J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., and Wierstra, D., Continuous Control with Deep Reinforcement Learning, 2015, 14 pp., arXiv:1509.02971 [cs.LG]

[13] Kingma, D. P. and Ba, J. L., Adam: A Method for Stochastic Optimization, 2014, 15 pp., arXiv:1412.6980 [cs.LG]

[14] Ioffe, S. and Szegedy, Ch., Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift, 2015, 11 pp., arXiv:1502.03167 [cs.LG]

[15] PyTorch, https://pytorch.org

[16] The Results of Learning the Robot, 2022