Monotone optimal policies in discounted Markov decision processes with transition probabilities independent of the current state: existence and approximation
Kybernetika, Volume 49 (2013) no. 5, pp. 705-719.

See the article record from the source Czech Digital Mathematics Library

This paper considers Markov decision processes (MDPs) with the discounted cost as the objective function and with state and decision spaces that are subsets of the real line, not necessarily finite or denumerable. The cost function is possibly unbounded, the dynamics are independent of the current state, and the decision sets are possibly non-compact. In this setting, conditions are provided under which an increasing or a decreasing optimal stationary policy exists; these conditions do not require convexity assumptions. Versions of the policy iteration algorithm (PIA) for approximating increasing or decreasing optimal stationary policies are detailed, and an illustrative example is presented. Finally, comments are given on the monotonicity conditions and the monotone versions of the PIA as applied to discounted MDPs with rewards.
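To make the setting in the abstract concrete, here is a minimal sketch of ordinary policy iteration on a toy MDP whose transition law depends on the action only. It is not the monotone PIA of the paper, whose details are not given in this record. The finite grid of states and actions, the names `policy_iteration`, `cost` and `trans`, and the particular cost and transition arrays are all assumptions made for illustration; the paper itself works with subsets of the real line, possibly unbounded costs and possibly non-compact decision sets.

```python
import numpy as np

def policy_iteration(cost, trans, alpha, max_iter=100):
    """Ordinary policy iteration for a finite discounted-cost MDP (illustrative).

    cost[x, a]  : one-stage cost of action a in state x
    trans[a, y] : probability of moving to state y under action a
                  (no dependence on the current state x)
    alpha       : discount factor in (0, 1)
    """
    n_states, _ = cost.shape
    policy = np.zeros(n_states, dtype=int)  # arbitrary initial stationary policy
    for _ in range(max_iter):
        # Policy evaluation: solve (I - alpha * P_f) V = c_f.
        P_f = trans[policy]                      # row x equals trans[policy[x], :]
        c_f = cost[np.arange(n_states), policy]
        V = np.linalg.solve(np.eye(n_states) - alpha * P_f, c_f)
        # Policy improvement: the continuation term depends on the action only.
        Q = cost + alpha * (trans @ V)[None, :]
        new_policy = Q.argmin(axis=1)            # smallest minimizer on ties
        if np.array_equal(new_policy, policy):
            break
        policy = new_policy
    return policy, V

# Hypothetical toy data: 6 states and 6 actions on the grid {0, ..., 5}.
n = 6
x = np.arange(n)[:, None]
a = np.arange(n)[None, :]
cost = (x - a) ** 2 + a                  # assumed cost; the cross term is -2*x*a
trans = 0.5 * np.eye(n) + 0.5 / n        # assumed law: next state tends to equal a
alpha = 0.9

policy, V = policy_iteration(cost.astype(float), trans, alpha)
print(policy)
```

Because the transition law ignores the current state, the improvement step minimizes `cost[x, a]` plus a term that depends on `a` alone. In this toy instance the assumed cost has decreasing differences in `(x, a)`, so the smallest minimizer is nondecreasing in `x`; this illustrates an increasing policy but is not a statement of the conditions used in the paper.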
Classification: 90C40, 93E20
Keywords: Markov decision process; total discounted cost; total discounted reward; increasing optimal policy; decreasing optimal policy; policy iteration algorithm
@article{KYB_2013__49_5_a2,
     author = {Flores-Hern\'andez, Rosa Mar{\'\i}a},
     title = {Monotone optimal policies in discounted {Markov} decision processes with transition probabilities independent of the current state: existence and approximation},
     journal = {Kybernetika},
     pages = {705--719},
     publisher = {mathdoc},
     volume = {49},
     number = {5},
     year = {2013},
     mrnumber = {3182635},
     zbl = {1278.90425},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/KYB_2013__49_5_a2/}
}
TY  - JOUR
AU  - Flores-Hernández, Rosa María
TI  - Monotone optimal policies in discounted Markov decision processes with transition probabilities independent of the current state: existence and approximation
JO  - Kybernetika
PY  - 2013
SP  - 705
EP  - 719
VL  - 49
IS  - 5
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/KYB_2013__49_5_a2/
LA  - en
ID  - KYB_2013__49_5_a2
ER  - 
%0 Journal Article
%A Flores-Hernández, Rosa María
%T Monotone optimal policies in discounted Markov decision processes with transition probabilities independent of the current state: existence and approximation
%J Kybernetika
%D 2013
%P 705-719
%V 49
%N 5
%I mathdoc
%U http://geodesic.mathdoc.fr/item/KYB_2013__49_5_a2/
%G en
%F KYB_2013__49_5_a2
Flores-Hernández, Rosa María. Monotone optimal policies in discounted Markov decision processes with transition probabilities independent of the current state: existence and approximation. Kybernetika, Volume 49 (2013) no. 5, pp. 705-719. http://geodesic.mathdoc.fr/item/KYB_2013__49_5_a2/