An automated driving strategy generating method based on WGAIL–DDPG
International Journal of Applied Mathematics and Computer Science, Tome 31 (2021) no. 3, pp. 461-470.


Reliability, efficiency, and generalization are basic evaluation criteria for a vehicle automated driving system. This paper proposes an automated driving decision-making method based on Wasserstein generative adversarial imitation learning combined with the deep deterministic policy gradient (WGAIL–DDPG(λ)). The reward function is designed around the requirements of a vehicle's driving performance, i.e., safety, dynamics, and ride comfort. The model's training efficiency is improved through the proposed imitation learning strategy, and a gain regulator is designed to smooth the transition from the imitation phase to the reinforcement phase. Test results show that the proposed decision-making model can generate actions quickly and accurately according to the surrounding environment. Meanwhile, the imitation learning strategy based on expert experience, together with the gain regulator, effectively improves the training efficiency of the reinforcement learning model. An extended test also demonstrates good adaptability to different driving conditions.
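The abstract describes two mechanisms that a short sketch can make concrete: a composite reward built from safety, dynamics, and ride-comfort terms, and a gain regulator that smooths the transition from the imitation phase to the reinforcement phase. The following Python snippet is a minimal illustration under assumed weights, signal names (ttc, jerk, target_speed), and a sigmoid schedule; it is not the authors' WGAIL–DDPG(λ) implementation.

# Hypothetical sketch of a composite driving reward and a gain regulator
# that blends an imitation (discriminator-style) signal with an
# environment reward as training progresses. All weights, shapes and
# names are illustrative assumptions, not values from the paper.
import math

def driving_reward(ttc, speed, target_speed, jerk,
                   w_safety=1.0, w_dynamic=0.5, w_comfort=0.3):
    """Composite reward: penalise a small time-to-collision, deviation
    from the target speed, and large jerk (weights are assumed)."""
    r_safety = -1.0 / max(ttc, 1e-3)          # safety: time-to-collision
    r_dynamic = -abs(speed - target_speed)    # dynamics: speed-tracking error
    r_comfort = -abs(jerk)                    # ride comfort: jerk penalty
    return w_safety * r_safety + w_dynamic * r_dynamic + w_comfort * r_comfort

def gain_regulator(step, switch_step=50_000, sharpness=1e-4):
    """Gain lambda in [0, 1] that rises smoothly with the training step,
    shifting the objective from imitation toward reinforcement."""
    return 1.0 / (1.0 + math.exp(-sharpness * (step - switch_step)))

def blended_signal(imitation_reward, env_reward, step):
    """Training signal fed to the critic: imitation-dominated early,
    environment-reward-dominated later."""
    lam = gain_regulator(step)
    return (1.0 - lam) * imitation_reward + lam * env_reward

For example, blended_signal(-0.2, driving_reward(ttc=2.5, speed=18.0, target_speed=20.0, jerk=0.4), step=10_000) is dominated by the imitation term early in training, while the same call at step=200_000 is dominated by the environment reward.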
Keywords: automated driving system, deep learning, deep reinforcement learning, imitation learning, deep deterministic policy gradient
@article{IJAMCS_2021_31_3_a6,
     author = {Zhang, Mingheng and Wan, Xing and Gang, Longhui and Lv, Xinfei and Wu, Zengwen and Liu, Zhaoyang},
     title = {An automated driving strategy generating method based on {WGAIL{\textendash}DDPG}},
     journal = {International Journal of Applied Mathematics and Computer Science},
     pages = {461--470},
     publisher = {mathdoc},
     volume = {31},
     number = {3},
     year = {2021},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/IJAMCS_2021_31_3_a6/}
}
Zhang, Mingheng; Wan, Xing; Gang, Longhui; Lv, Xinfei; Wu, Zengwen; Liu, Zhaoyang. An automated driving strategy generating method based on WGAIL–DDPG. International Journal of Applied Mathematics and Computer Science, Tome 31 (2021) no. 3, pp. 461-470. http://geodesic.mathdoc.fr/item/IJAMCS_2021_31_3_a6/
