First passage risk probability optimality for continuous time Markov decision processes
Kybernetika, Volume 55 (2019) no. 1, pp. 114-133
This article was harvested from the Czech Digital Mathematics Library.
In this paper, we study continuous time Markov decision processes (CTMDPs) with a denumerable state space, a Borel action space, unbounded transition rates, and a nonnegative reward function. The optimality criterion considered is the first passage risk probability criterion. To ensure non-explosion of the state process, we first introduce a drift condition, which is weaker than the well-known regular condition for semi-Markov decision processes (SMDPs). Then, under suitable conditions and using a value iteration technique, we establish the optimality equation, prove the uniqueness of the value function, and show the existence of optimal policies. Finally, two examples illustrate our results.
DOI: 10.14736/kyb-2019-1-0114
Classification: 60E20, 90C40
Keywords: continuous time Markov decision processes; first passage time; risk probability criterion; optimal policy
@article{10_14736_kyb_2019_1_0114,
author = {Huo, Haifeng and Wen, Xian},
title = {First passage risk probability optimality for continuous time {Markov} decision processes},
journal = {Kybernetika},
pages = {114--133},
year = {2019},
volume = {55},
number = {1},
doi = {10.14736/kyb-2019-1-0114},
mrnumber = {3935417},
zbl = {07088881},
language = {en},
url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-1-0114/}
}
TY - JOUR
AU - Huo, Haifeng
AU - Wen, Xian
TI - First passage risk probability optimality for continuous time Markov decision processes
JO - Kybernetika
PY - 2019
SP - 114
EP - 133
VL - 55
IS - 1
UR - http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-1-0114/
DO - 10.14736/kyb-2019-1-0114
LA - en
ID - 10_14736_kyb_2019_1_0114
ER -
%0 Journal Article
%A Huo, Haifeng
%A Wen, Xian
%T First passage risk probability optimality for continuous time Markov decision processes
%J Kybernetika
%D 2019
%P 114-133
%V 55
%N 1
%U http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-1-0114/
%R 10.14736/kyb-2019-1-0114
%G en
%F 10_14736_kyb_2019_1_0114
Huo, Haifeng; Wen, Xian. First passage risk probability optimality for continuous time Markov decision processes. Kybernetika, Volume 55 (2019) no. 1, pp. 114-133. doi: 10.14736/kyb-2019-1-0114