Classification of multi-agent reinforcement
News of the Kabardin-Balkar scientific center of RAS, no. 3 (2021), pp. 32-44.

See the article record from the source Math-Net.Ru

With the advent of deep single-agent reinforcement learning (SARL), multi-agent reinforcement learning (MARL) has received a new impetus in the form of deep multi-agent reinforcement learning (MDRL). The rapid development of methods in this area over the past few years has made their systematization and classification a pressing issue. Existing works classify MDRL methods by the mechanisms those methods use. However, the applicability of a particular method is determined not only by the class of the method but also by the class of the MARL problem. The purpose of this work is to formalize and classify MARL problems. To this end, the existing classifications of SARL problems are mathematically formalized and generalized. The features that arise in the transition from the SARL problem to the MARL problem are considered and mathematically formalized. The essential features are identified, and MARL problems are classified using a set-theoretic approach. This approach made it possible to identify classes of MARL problems that similar works treat in a generalized way but that have specific properties, which can be used to develop more efficient methods for solving them. The proposed formalism and classification are expected to be useful to researchers as a tool for stating a problem and locating their research within the general structure of MARL methods and problems, and to developers for making a well-founded choice of MARL methods based on the class of the problem being solved.
Keywords: multi-agent reinforcement learning, multi-agent systems, classification.
@article{IZKAB_2021_3_a2,
     author = {V. I. Petrenko},
     title = {Classification of multi-agent reinforcement},
     journal = {News of the Kabardin-Balkar scientific center of RAS},
     pages = {32--44},
     publisher = {mathdoc},
     number = {3},
     year = {2021},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/IZKAB_2021_3_a2/}
}
TY  - JOUR
AU  - V. I. Petrenko
TI  - Classification of multi-agent reinforcement
JO  - News of the Kabardin-Balkar scientific center of RAS
PY  - 2021
SP  - 32
EP  - 44
IS  - 3
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/IZKAB_2021_3_a2/
LA  - ru
ID  - IZKAB_2021_3_a2
ER  - 
%0 Journal Article
%A V. I. Petrenko
%T Classification of multi-agent reinforcement
%J News of the Kabardin-Balkar scientific center of RAS
%D 2021
%P 32-44
%N 3
%I mathdoc
%U http://geodesic.mathdoc.fr/item/IZKAB_2021_3_a2/
%G ru
%F IZKAB_2021_3_a2
V. I. Petrenko. Classification of multi-agent reinforcement. News of the Kabardin-Balkar scientific center of RAS, no. 3 (2021), pp. 32-44. http://geodesic.mathdoc.fr/item/IZKAB_2021_3_a2/

[1] V. Mnih et al., “Human-level control through deep reinforcement learning”, Nature, 518:7540 (2015), 529–533

[2] V. I. Petrenko, F. B. Tebueva, S. S. Ryabtsev, M. M. Gurchinsky, I. V. Struchkov, “Consensus achievement method for a robotic swarm about the most frequently feature of an environment”, IOP Conference Series: Materials Science and Engineering, 919:4 (2020)

[3] G. Kovács, N. Yussupova, D. Rizvanov, “Resource management simulation using multi-agent approach and semantic constraints”, Pollack Periodica, 12:1 (2017)

[4] V. Kh. Pshikhopov, M. Yu. Medvedev, “Group motion control of mobile robots in an uncertain environment using unstable modes”, Proceedings of SPIIRAS, 60:5 (2018), 39–63

[5] A. K. Tugengold, E. A. Lukyanov, Intelligent functions and control of autonomous technological mechatronic objects, Don State Technical University, Rostov-on-Don, 2013, 203 pp.

[6] K. V. Mironov, M. U. Pongratz, “Applying neural networks for prediction of flying objects trajectory”, Vestn. UGATU, 2013, no. 6

[7] O. V. Darintsev, A. B. Migranov, “Distributed control system for groups of mobile robots”, Vestnik USATU, 2:76 (2017)

[8] V. I. Petrenko, F. B. Tebueva, M. M. Gurchinsky, S. S. Ryabtsev, “Analysis of information security technologies for multi-agent robotic systems with swarm intelligence”, Science and business development paths, 2020, no. 4 (106), 96–99

[9] N. Yusupova, D. Rizvanov, D. Andrushko, “Cyber-Physical Systems and Reliability Issues”, Proceedings of the 8th Scientific Conference on Information Technologies for Intelligent Decision Making Support, ITIDS 2020, Atlantis Press, 2020, 133–137

[10] R. Lowe et al., “Multi-agent actor-critic for mixed cooperative-competitive environments”, Advances in Neural Information Processing Systems, 2017

[11] H. Wang, Z. Liu, J. Yi, Z. Pu, “Multiagent hierarchical cognition difference policy for multiagent cooperation”, Algorithms, 14:3 (2021) | MR

[12] F. L. Da Silva, C. E. H. Nishida, D. M. Roijers, A. H. R. Costa, “Coordination of Electric Vehicle Charging through Multiagent Reinforcement Learning”, IEEE Trans. Smart Grid, 11:3 (2020) | DOI

[13] J. Cui, Y. Liu, A. Nallanathan, “Multi-Agent Reinforcement Learning-Based Resource Allocation for UAV Networks”, IEEE Trans. Wirel. Commun., 19:2 (2020) | DOI

[14] A. Shamsoshoara, M. Khaledi, F. Afghah, A. Razi, J. Ashdown, Distributed cooperative spectrum sharing in UAV networks using multi-agent reinforcement learning, 2018, arXiv: 1811.05053

[15] H. Qie et al., “Joint Optimization of Multi-UAV Target Assignment and Path Planning Based on Multi-Agent Reinforcement Learning”, IEEE Access, 7 (2019) | DOI

[16] X. Fang et al., “Multi-agent reinforcement learning approach for residential microgrid energy scheduling”, Energies, 13:1 (2019) | DOI

[17] I. A. Pshenokova, Z. A. Sundukov, “Development of a simulation model for predicting the behavior of an intelligent agent based on an invariant of a recursive multi-agent neurocognitive architecture”, News of the Kabardino-Balkarian Scientific Center of the RAS, 2020, no. 6 (98), 80–90

[18] I. A. Pshenokova, O. V. Nagoeva, I. A. Gurtueva, A. A. Airan, “Learning algorithm for an intelligent decision making system based on multi-agent neurocognitive architectures”, News of the Kabardino-Balkarian Scientific Center of the RAS, 2020, no. 3 (95), 23–31

[19] P. Hernandez-Leal, B. Kartal, M. E. Taylor, “A survey and critique of multiagent deep reinforcement learning”, Auton. Agent. Multi. Agent. Syst., 33:6 (2019) | DOI | MR

[20] L. Buşoniu, R. Babuška, B. De Schutter, “A comprehensive survey of multiagent reinforcement learning”, IEEE Transactions on Systems, Man and Cybernetics. Part C: Applications and Reviews, 38:2 (2008)

[21] P. Hernandez-Leal, M. Kaisers, T. Baarslag, E. De Cote, A survey of learning in multiagent environments: Dealing with non-stationarity, 2017, arXiv: 1707.09183

[22] K. Zhang, Z. Yang, T. Başar, Multi-agent reinforcement learning: A selective overview of theories and algorithms, 2019, arXiv: 1911.10635

[23] J. Hao, D. Huang, Y. Cai, H.-F. Leung, “The dynamics of reinforcement social learning in networked cooperative multiagent systems”, Eng. Appl. Artif. Intell., 58 (2017) | DOI

[24] F. L. Da Silva, A. H. Reali Costa, “A survey on transfer learning for multiagent reinforcement learning systems”, J. Artif. Intell. Res., 64 (2019) | DOI | MR

[25] T. T. Nguyen, N. D. Nguyen, S. Nahavandi, “Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications”, IEEE Trans. Cybern., 50:9 (2020) | DOI | Zbl

[26] Y. Yang et al., Q-value path decomposition for deep multiagent reinforcement learning, 2020, arXiv: 2002.03950

[27] A. Shamsoshoara, M. Khaledi, F. Afghah, A. Razi, J. Ashdown, “Distributed Cooperative Spectrum Sharing in UAV Networks Using Multi-Agent Reinforcement Learning”, 2019 16th IEEE Annual Consumer Communications and Networking Conference, CCNC 2019

[28] K. Tuyls, G. Weiss, “Multiagent learning: Basics, challenges, and prospects”, AI Magazine, 33:3 (2012) | DOI

[29] L. Matignon, G. J. Laurent, N. Le Fort-Piat, “Independent reinforcement learners in cooperative Markov games: A survey regarding coordination problems”, Knowledge Engineering Review, 27:1 (2012) | DOI | MR

[30] M. L. Littman, “Markov games as a framework for multi-agent reinforcement learning”, Proceedings of the Eleventh International Conference on Machine Learning (ICML 1994), 1994, 157–163

[31] A. Tampuu et al., “Multiagent cooperation and competition with deep reinforcement learning”, PLoS One, 12:4 (2017) | DOI