Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach

Ortega-Gutiérrez, R. Israel; Montes-de-Oca, Raúl; Lemus-Rodríguez, Enrique

doi:10.14736/kyb-2016-1-0066

Kybernetika, Volume 52 (2016) no. 1, pp. 66-75. This article was harvested from the Czech Digital Mathematics Library.

Abstract (original language)

Many examples in optimization, ranging from Linear Programming to Markov Decision Processes (MDPs), present more than one optimal solution. The study of this non-uniqueness is of great mathematical interest. In this paper the authors show that, in a specific family of discounted MDPs, non-uniqueness is a “fragile” property: through Ekeland's Principle, for each problem with at least two optimal policies a perturbed model is produced with a unique optimal policy. This result not only supersedes previous papers on the subject, but it also renews the interest in the corresponding questions of well-posedness, genericity and structural stability of MDPs.
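The fragility described in the abstract can be seen on a toy model. The sketch below is only an illustration and is not the construction used in the paper: the two-state, two-action MDP, the hand-picked reward perturbation `eps`, and the helper names `q_values` and `optimal_actions` are all assumptions made for this example. It runs plain value iteration on a discounted MDP in which every action is optimal, then shows that an arbitrarily small change in the rewards leaves a single optimal action per state.

```python
def q_values(rewards, transitions, discount, tol=1e-12):
    """Run value iteration and return the optimal Q-table.

    rewards[s][a] is the one-step reward; transitions[s][a] is the list of
    next-state probabilities for taking action a in state s.
    """
    n_states = len(rewards)
    values = [0.0] * n_states
    while True:
        q = [
            [
                rewards[s][a]
                + discount * sum(p * values[t] for t, p in enumerate(transitions[s][a]))
                for a in range(len(rewards[s]))
            ]
            for s in range(n_states)
        ]
        new_values = [max(row) for row in q]
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return q
        values = new_values


def optimal_actions(q, tie_tol=1e-9):
    """For each state, list every action whose Q-value attains the maximum."""
    return [[a for a, v in enumerate(row) if max(row) - v <= tie_tol] for row in q]


# Hypothetical toy model: action 0 moves to state 0, action 1 moves to state 1.
transitions = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
discount = 0.9

# All rewards equal: every deterministic stationary policy is optimal.
tied_rewards = [[1.0, 1.0], [1.0, 1.0]]
tied = optimal_actions(q_values(tied_rewards, transitions, discount))

# Hand-picked perturbation of sup-norm size eps (an assumption of this sketch).
eps = 1e-3
perturbed_rewards = [[1.0 + eps, 1.0], [1.0 + eps, 1.0]]
unique = optimal_actions(q_values(perturbed_rewards, transitions, discount))

print(tied)    # each state lists both actions
print(unique)  # each state lists a single action
```

Here the perturbation is chosen by inspection; the point of the paper is that, within its family of models, a suitable perturbation with a unique optimal policy can be obtained through Ekeland's variational principle rather than by hand.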

DOI: 10.14736/kyb-2016-1-0066

Classification: 90C40, 93E20
Keywords: discounted Markov decision processes; dynamic programming; unique optimal policy; non-uniqueness of optimal policies; Ekeland's variational principle

@article{10_14736_kyb_2016_1_0066,
     author = {Ortega-Guti\'errez, R. Israel and Montes-de-Oca, Ra\'ul and Lemus-Rodr{\'\i}guez, Enrique},
     title = {Uniqueness of optimal policies as a generic property of discounted {Markov} decision processes: {Ekeland's} variational principle approach},
     journal = {Kybernetika},
     pages = {66--75},
     year = {2016},
     volume = {52},
     number = {1},
     doi = {10.14736/kyb-2016-1-0066},
     mrnumber = {3482611},
     zbl = {1374.90407},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2016-1-0066/}
}

TY  - JOUR
AU  - Ortega-Gutiérrez, R. Israel
AU  - Montes-de-Oca, Raúl
AU  - Lemus-Rodríguez, Enrique
TI  - Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach
JO  - Kybernetika
PY  - 2016
SP  - 66
EP  - 75
VL  - 52
IS  - 1
UR  - http://geodesic.mathdoc.fr/articles/10.14736/kyb-2016-1-0066/
DO  - 10.14736/kyb-2016-1-0066
LA  - en
ID  - 10_14736_kyb_2016_1_0066
ER  -

%0 Journal Article
%A Ortega-Gutiérrez, R. Israel
%A Montes-de-Oca, Raúl
%A Lemus-Rodríguez, Enrique
%T Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach
%J Kybernetika
%D 2016
%P 66-75
%V 52
%N 1
%U http://geodesic.mathdoc.fr/articles/10.14736/kyb-2016-1-0066/
%R 10.14736/kyb-2016-1-0066
%G en
%F 10_14736_kyb_2016_1_0066

Ortega-Gutiérrez, R. Israel; Montes-de-Oca, Raúl; Lemus-Rodríguez, Enrique. Uniqueness of optimal policies as a generic property of discounted Markov decision processes: Ekeland's variational principle approach. Kybernetika, Volume 52 (2016) no. 1, pp. 66-75. doi: 10.14736/kyb-2016-1-0066
