A kernel based true online Sarsa(λ) for continuous space control problems
Computer Science and Information Systems, Volume 14 (2017) no. 3.

See the article record from the source Computer Science and Information Systems website

Reinforcement learning is an efficient method for solving control problems: an agent interacts with the environment to obtain an optimal policy. However, it faces challenges such as low convergence accuracy and slow convergence. Moreover, conventional reinforcement learning algorithms can hardly solve continuous control problems. Kernel-based methods can accelerate convergence and improve convergence accuracy, and the policy gradient method is a good way to deal with continuous space problems. We propose a Sarsa(λ) version of the true online temporal difference algorithm, named True Online Sarsa(λ) (TOSarsa(λ)), based on a clustering-based sample specification method and a selective kernel-based value function. The TOSarsa(λ) algorithm gives results consistent with both the forward view and the backward view, which ensures that an optimal policy is obtained in less time. We also combine TOSarsa(λ) with heuristic dynamic programming. The experiments show that the proposed algorithm works well on continuous control problems.
Keywords: reinforcement learning, kernel method, true online, policy gradient, Sarsa(λ)
@article{CSIS_2017_14_3_a15,
     author = {Fei Zhu and Haijun Zhu and Yuchen Fu and Xiaoke Zhou},
     title = {A kernel based true online {Sarsa(\ensuremath{\lambda})} for continuous space control problems},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {14},
     number = {3},
     year = {2017},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2017_14_3_a15/}
}
TY  - JOUR
AU  - Fei Zhu
AU  - Haijun Zhu
AU  - Yuchen Fu
AU  - Xiaoke Zhou
TI  - A kernel based true online Sarsa(λ) for continuous space control problems
JO  - Computer Science and Information Systems
PY  - 2017
VL  - 14
IS  - 3
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2017_14_3_a15/
ID  - CSIS_2017_14_3_a15
ER  - 
%0 Journal Article
%A Fei Zhu
%A Haijun Zhu
%A Yuchen Fu
%A Xiaoke Zhou
%T A kernel based true online Sarsa(λ) for continuous space control problems
%J Computer Science and Information Systems
%D 2017
%V 14
%N 3
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2017_14_3_a15/
%F CSIS_2017_14_3_a15
Fei Zhu; Haijun Zhu; Yuchen Fu; Xiaoke Zhou. A kernel based true online Sarsa(λ) for continuous space control problems. Computer Science and Information Systems, Volume 14 (2017) no. 3. http://geodesic.mathdoc.fr/item/CSIS_2017_14_3_a15/