Keywords: Markov decision chains; second order optimality; optimality conditions for transient, discounted and average models; policy iterations; value iterations
@article{10_14736_kyb_2017_6_1086,
author = {Sladk\'y, Karel},
title = {Second {Order} optimality in {Markov} decision chains},
journal = {Kybernetika},
pages = {1086--1099},
year = {2017},
volume = {53},
number = {6},
doi = {10.14736/kyb-2017-6-1086},
mrnumber = {3758936},
zbl = {06861642},
language = {en},
url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2017-6-1086/}
}
Sladký, Karel. Second Order optimality in Markov decision chains. Kybernetika, Volume 53 (2017) no. 6, pp. 1086-1099. doi: 10.14736/kyb-2017-6-1086
[1] Feinberg, E. A., Fei, J.: Inequalities for variances of total discounted costs. J. Appl. Probab. 46 (2009), 1209-1212.
[2] Gantmakher, F. R.: The Theory of Matrices. Chelsea, London 1959.
[3] Jaquette, S. C.: Markov decision processes with a new optimality criterion: Discrete time. Ann. Statist. 1 (1973), 496-505.
[4] Mandl, P.: On the variance in controlled Markov chains. Kybernetika 7 (1971), 1-12.
[5] Markowitz, H.: Portfolio Selection - Efficient Diversification of Investments. Wiley, New York 1959.
[6] Puterman, M. L.: Markov Decision Processes - Discrete Stochastic Dynamic Programming. Wiley, New York 1994.
[7] Bäuerle, N., Rieder, U.: Markov Decision Processes with Application to Finance. Springer-Verlag, Berlin 2011.
[8] Righter, R.: Stochastic comparison of discounted rewards. J. Appl. Probab. 48 (2011), 293-294.
[9] Sladký, K.: On mean reward variance in semi-Markov processes. Math. Meth. Oper. Res. 62 (2005), 387-397.
[10] Sladký, K.: Risk-sensitive and mean variance optimality in Markov decision processes. Acta Oeconomica Pragensia 7 (2013), 146-161.
[11] Sladký, K.: Second order optimality in transient and discounted Markov decision chains. In: Proc. 33rd Internat. Conf. Math. Methods in Economics MME 2015 (D. Martinčík, ed.), University of West Bohemia, Plzeň 2015, pp. 731-736.
[12] Sobel, M.: The variance of discounted Markov decision processes. J. Appl. Probab. 19 (1982), 794-802.
[13] Van Dijk, N. M., Sladký, K.: On the total reward variance for continuous-time Markov reward chains. J. Appl. Probab. 43 (2006), 1044-1052.
[14] Veinott, A. F., Jr.: Discrete dynamic programming with sensitive discount optimality criteria. Ann. Math. Statist. 40 (1969), 1635-1660.
[15] White, D. J.: Mean, variance and probability criteria in finite Markov decision processes: A review. J. Optimizat. Th. Appl. 56 (1988), 1-29.
[16] Wu, X., Guo, X.: First passage optimality and variance minimisation of Markov decision processes with varying discount factors. J. Appl. Probab. 52 (2015), 441-456.