Modification of a deep learning algorithm for distributing functions
News of the Kabardin-Balkar scientific center of RAS, Volume 26 (2024) no. 6, pp. 208-218.

See the article record from the Math-Net.Ru source

Real-world conditions are rarely stable, so robotic systems must be able to adapt to uncertainty. Human-robot collaboration increases productivity, but it requires effective task allocation methods that account for the capabilities of both parties. The aim of this work is to determine optimal strategies for distributing tasks between humans and collaborative robots and to provide adaptive control of a collaborative robot under uncertainty in a changing environment. Research methods. The paper develops a graph-based approach to task allocation based on the capabilities of the human and the robot. An LSTM memory mechanism is built into the reinforcement learning algorithm to address the partial observability caused by inaccurate sensor measurements and environmental noise, and the Hindsight Experience Replay (HER) method is used to overcome the problem of sparse rewards. Results. The trained model demonstrated stable convergence and achieved a high success rate in object manipulation. Integrating LSTM and HER into reinforcement learning makes it possible to distribute tasks between a human and a robot under uncertainty in a changing environment. The proposed method can be applied in a variety of collaborative-robot scenarios in complex and changing conditions.
Keywords: human-robot interaction, adaptive control algorithm, task distribution, reinforcement learning
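For readers who want a concrete picture of the two mechanisms named in the abstract, the following Python sketch shows (a) an LSTM policy that consumes observation sequences, one common way to handle partial observability in reinforcement learning, and (b) a Hindsight Experience Replay relabeling routine for sparse rewards. This is a minimal illustration assuming a PyTorch-style, goal-conditioned setup; it is not the authors' implementation, and all class and function names (RecurrentPolicy, her_relabel, sparse_reward) are hypothetical.

import random

import torch
import torch.nn as nn


class RecurrentPolicy(nn.Module):
    """LSTM over observation/goal sequences -> action logits (handles partial observability)."""

    def __init__(self, obs_dim, goal_dim, hidden_dim, n_actions):
        super().__init__()
        self.lstm = nn.LSTM(obs_dim + goal_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, n_actions)

    def forward(self, obs_seq, goal_seq, hidden=None):
        # obs_seq: (batch, time, obs_dim), goal_seq: (batch, time, goal_dim)
        x = torch.cat([obs_seq, goal_seq], dim=-1)
        out, hidden = self.lstm(x, hidden)          # out: (batch, time, hidden_dim)
        return self.head(out), hidden               # logits per time step, recurrent state


def sparse_reward(achieved, goal, tol=0.05):
    """Sparse reward: 0 if the achieved goal is within tolerance of the target, else -1."""
    return 0.0 if torch.norm(achieved - goal) < tol else -1.0


def her_relabel(episode, reward_fn, k=4):
    """HER 'future' strategy: replay each step with goals actually achieved later in the episode.

    episode is a list of tuples (obs, action, achieved_goal, desired_goal).
    """
    relabeled = []
    for t, (obs, action, achieved, _goal) in enumerate(episode):
        future = episode[t:]
        for _ in range(k):
            _, _, new_goal, _ = random.choice(future)       # goal reached at some later step
            reward = reward_fn(achieved, new_goal)           # recompute the sparse reward
            relabeled.append((obs, action, new_goal, reward))
    return relabeled

In such a scheme, each episode's transitions would be stored once with the original goal and again after her_relabel, so even unsuccessful episodes contribute transitions that count as successes for the goals that were actually reached, which is what mitigates the sparse-reward problem the abstract refers to.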
@article{IZKAB_2024_26_6_a17,
     author = {M. A. Shereuzhev and G. Wu and V. V. Serebrenny},
     title = {Modification of a deep learning algorithm for distributing functions},
     journal = {News of the Kabardin-Balkar scientific center of RAS},
     pages = {208--218},
     publisher = {mathdoc},
     volume = {26},
     number = {6},
     year = {2024},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/IZKAB_2024_26_6_a17/}
}
TY  - JOUR
AU  - M. A. Shereuzhev
AU  - G. Wu
AU  - V. V. Serebrenny
TI  - Modification of a deep learning algorithm for distributing functions
JO  - News of the Kabardin-Balkar scientific center of RAS
PY  - 2024
SP  - 208
EP  - 218
VL  - 26
IS  - 6
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/IZKAB_2024_26_6_a17/
LA  - ru
ID  - IZKAB_2024_26_6_a17
ER  - 
%0 Journal Article
%A M. A. Shereuzhev
%A G. Wu
%A V. V. Serebrenny
%T Modification of a deep learning algorithm for distributing functions
%J News of the Kabardin-Balkar scientific center of RAS
%D 2024
%P 208-218
%V 26
%N 6
%I mathdoc
%U http://geodesic.mathdoc.fr/item/IZKAB_2024_26_6_a17/
%G ru
%F IZKAB_2024_26_6_a17
M. A. Shereuzhev; G. Wu; V. V. Serebrenny. Modification of a deep learning algorithm for distributing functions. News of the Kabardin-Balkar scientific center of RAS, Volume 26 (2024) no. 6, pp. 208-218. http://geodesic.mathdoc.fr/item/IZKAB_2024_26_6_a17/

[1] M. Fiore, A. Clodic, R. Alami, “On planning and task achievement modalities for human-robot collaboration”, Experimental Robotics: The 14th International Symposium on Experimental Robotics, Marrakech, Morocco, Springer, 2016, 293–306 | DOI

[2] A. Ghadirzadeh, X. Chen, W. Yin et al., “Human-centered collaborative robots with deep reinforcement learning”, IEEE Robotics and Automation Letters, 6(2) (2020), 566–571 | DOI

[3] A. H. Qureshi, Y. Nakamura, Y. Yoshikawa, H. Ishiguro, “Robot gains social intelligence through multimodal deep reinforcement learning”, IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), 2016, 745–751 | DOI

[4] Y. K. Kwok, I. Ahmad, “Static scheduling algorithms for allocating directed task graphs to multiprocessors”, ACM Computing Surveys, 31(4) (1999), 406–471 | DOI

[5] A. A. Malik, A. Bilberg, “Complexity-based task allocation in human-robot collaborative assembly”, Industrial Robot: International Journal of Robotics Research and Application, 46(4) (2019), 471–480 | DOI

[6] L. Lucignano, F. Cutugno, S. Rossi, A. Finzi, “A dialogue system for multimodal human robot interaction”, Proceedings of the 15th ACM International Conference on Multimodal Interaction, 2013, 197–204 | DOI

[7] C. Qiu, Y. Hu, Y. Chen, B. Zeng, “Deep deterministic policy gradient (DDPG)-based energy harvesting wireless communications”, IEEE Internet of Things Journal, 6(5) (2019), 8577–8588 | DOI

[8] S. Hochreiter, J. Schmidhuber, “Long short-term memory”, Neural Computation, 9(8) (1997), 1735–1780

[9] M. Andrychowicz, F. Wolski, A. Ray et al., “Hindsight experience replay”, Advances in Neural Information Processing Systems, 30 (2017)

[10] M. Towers, A. Kwiatkowski, J. Terry et al., “Gymnasium: A standard interface for reinforcement learning environments”, arXiv:2407.17032 (2024) | DOI