Time-varying Markov decision processes with state-action-dependent discount factors and unbounded costs
Kybernetika, Volume 55 (2019) no. 1, pp. 166-182.

See the article record from the source Czech Digital Mathematics Library

In this paper we are concerned with a class of time-varying discounted Markov decision models $\mathcal{M}_n$ with unbounded costs $c_n$ and state-action-dependent discount factors. Specifically, we study controlled systems whose state process evolves according to the equation $x_{n+1}=G_n(x_n,a_n,\xi_n)$, $n=0,1,\ldots$, with state-action-dependent discount factors of the form $\alpha_n(x_n,a_n)$, where $a_n$ and $\xi_n$ are the control and the random disturbance at time $n$, respectively. Assuming that the sequences of functions $\lbrace\alpha_n\rbrace$, $\lbrace c_n\rbrace$ and $\lbrace G_n\rbrace$ converge, in a certain sense, to $\alpha_\infty$, $c_\infty$ and $G_\infty$, our objective is to introduce a suitable control model for this class of systems and then to show the existence of optimal policies for the limit system $\mathcal{M}_\infty$ corresponding to $\alpha_\infty$, $c_\infty$ and $G_\infty$. Finally, we illustrate our results and their applicability in a class of semi-Markov control models.
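As a rough illustration of the kind of system described in the abstract, the following Python sketch simulates one trajectory of $x_{n+1}=G_n(x_n,a_n,\xi_n)$ and accumulates a cost discounted by the running product of the factors $\alpha_n(x_n,a_n)$. The form of the performance index (a sum of $c_n$ weighted by the product of earlier discount factors), the finite truncation horizon, the Gaussian disturbance, and all concrete functions below (`G`, `alpha`, `c`, `policy`) are assumptions made for this example; none of them are taken from the paper.

```python
import random


def simulate_discounted_cost(G, alpha, c, policy, x0, horizon, rng):
    """Return sum_{n<horizon} Gamma_n * c(n, x_n, a_n) along one simulated path.

    Gamma_0 = 1 and Gamma_{n+1} = Gamma_n * alpha(n, x_n, a_n).
    This criterion is an assumed, commonly used form, truncated at `horizon`.
    """
    x, gamma, total = x0, 1.0, 0.0
    for n in range(horizon):
        a = policy(n, x)                 # control chosen at time n
        total += gamma * c(n, x, a)      # cost weighted by accumulated discount
        gamma *= alpha(n, x, a)          # state-action-dependent discounting
        xi = rng.gauss(0.0, 1.0)         # random disturbance (assumed Gaussian)
        x = G(n, x, a, xi)               # time-varying dynamics
    return total


# Hypothetical time-varying model whose components settle down as n grows.
def G(n, x, a, xi):
    return (0.5 + 1.0 / (n + 2)) * x + a + 0.1 * xi


def alpha(n, x, a):
    # Discount factor in (0, 1) that depends on the time, state and action.
    return (0.9 - 0.1 / (n + 1)) / (1.0 + 0.05 * abs(a) + 0.01 * abs(x))


def c(n, x, a):
    # Unbounded quadratic cost with a vanishing time-dependent term.
    return x * x + a * a + 1.0 / (n + 1)


def policy(n, x):
    # Simple linear feedback, chosen only for illustration.
    return -0.3 * x


estimate = simulate_discounted_cost(G, alpha, c, policy, 1.0, 200, random.Random(0))
print(f"one-path truncated discounted cost: {estimate:.4f}")
```

Averaging the returned value over many independent paths would give a Monte Carlo estimate of the expected cost of the chosen policy under these assumed model components.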
DOI: 10.14736/kyb-2019-1-0166
Classification: 90C40, 93E20
Keywords: discounted optimality; non-constant discount factor; time-varying Markov decision processes
@article{10_14736_kyb_2019_1_0166,
     author = {Escobedo-Trujillo, Beatris A. and Higuera-Chan, Carmen G.},
     title = {Time-varying {Markov} decision processes with state-action-dependent discount factors and unbounded costs},
     journal = {Kybernetika},
     pages = {166--182},
     publisher = {mathdoc},
     volume = {55},
     number = {1},
     year = {2019},
     doi = {10.14736/kyb-2019-1-0166},
     mrnumber = {3935420},
     zbl = {07088884},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-1-0166/}
}
TY  - JOUR
AU  - Escobedo-Trujillo, Beatris A.
AU  - Higuera-Chan, Carmen G.
TI  - Time-varying Markov decision processes with state-action-dependent discount factors and unbounded costs
JO  - Kybernetika
PY  - 2019
SP  - 166
EP  - 182
VL  - 55
IS  - 1
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-1-0166/
DO  - 10.14736/kyb-2019-1-0166
LA  - en
ID  - 10_14736_kyb_2019_1_0166
ER  - 
%0 Journal Article
%A Escobedo-Trujillo, Beatris A.
%A Higuera-Chan, Carmen G.
%T Time-varying Markov decision processes with state-action-dependent discount factors and unbounded costs
%J Kybernetika
%D 2019
%P 166-182
%V 55
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-1-0166/
%R 10.14736/kyb-2019-1-0166
%G en
%F 10_14736_kyb_2019_1_0166
Escobedo-Trujillo, Beatris A.; Higuera-Chan, Carmen G. Time-varying Markov decision processes with state-action-dependent discount factors and unbounded costs. Kybernetika, Volume 55 (2019) no. 1, pp. 166-182. doi: 10.14736/kyb-2019-1-0166. http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-1-0166/
