Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach
Kybernetika, Volume 52 (2016) no. 1, pp. 66-75.

See the article record from the source Czech Digital Mathematics Library

Many optimization problems, ranging from Linear Programming to Markov Decision Processes (MDPs), have more than one optimal solution. The study of this non-uniqueness is of great mathematical interest. In this paper the authors show that, in a specific family of discounted MDPs, non-uniqueness is a “fragile” property: through Ekeland's Principle, for each problem with at least two optimal policies, a perturbed model is produced with a unique optimal policy. This result not only supersedes previous papers on the subject, but also renews interest in the corresponding questions of well-posedness, genericity and structural stability of MDPs.
DOI: 10.14736/kyb-2016-1-0066
Classification: 90C40, 93E20
Keywords: discounted Markov decision processes; dynamic programming; unique optimal policy; non-uniqueness of optimal policies; Ekeland's variational principle
@article{10_14736_kyb_2016_1_0066,
     author = {Ortega-Guti\'errez, R. Israel and Montes-de-Oca, Ra\'ul and Lemus-Rodr{\'\i}guez, Enrique},
     title = {Uniqueness of optimal policies as a generic property of discounted {Markov} decision processes: {Ekeland's} variational principle approach},
     journal = {Kybernetika},
     pages = {66--75},
     publisher = {mathdoc},
     volume = {52},
     number = {1},
     year = {2016},
     doi = {10.14736/kyb-2016-1-0066},
     mrnumber = {3482611},
     zbl = {1374.90407},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2016-1-0066/}
}
TY  - JOUR
AU  - Ortega-Gutiérrez, R. Israel
AU  - Montes-de-Oca, Raúl
AU  - Lemus-Rodríguez, Enrique
TI  - Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach
JO  - Kybernetika
PY  - 2016
SP  - 66
EP  - 75
VL  - 52
IS  - 1
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/articles/10.14736/kyb-2016-1-0066/
DO  - 10.14736/kyb-2016-1-0066
LA  - en
ID  - 10_14736_kyb_2016_1_0066
ER  - 
%0 Journal Article
%A Ortega-Gutiérrez, R. Israel
%A Montes-de-Oca, Raúl
%A Lemus-Rodríguez, Enrique
%T Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach
%J Kybernetika
%D 2016
%P 66-75
%V 52
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/articles/10.14736/kyb-2016-1-0066/
%R 10.14736/kyb-2016-1-0066
%G en
%F 10_14736_kyb_2016_1_0066
Ortega-Gutiérrez, R. Israel; Montes-de-Oca, Raúl; Lemus-Rodríguez, Enrique. Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach. Kybernetika, Volume 52 (2016) no. 1, pp. 66-75. doi: 10.14736/kyb-2016-1-0066. http://geodesic.mathdoc.fr/articles/10.14736/kyb-2016-1-0066/
