An $\varepsilon$-optimal control of finite Markov chain with average
Teoriâ veroâtnostej i ee primeneniâ, Volume 25 (1980) no. 1, pp. 71-82. This article was harvested from the Math-Net.Ru source.

A discrete-time Markov decision chain with the average reward criterion is considered. It is proved that if the state space is finite and the action sets are measurable subsets of a Polish space, then non-randomized Markov $\varepsilon$-optimal policies exist. An example is constructed of a Markov decision chain with a countable state space and finite action sets for which no randomized Markov $\varepsilon$-optimal policy exists.
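
For orientation, the average reward criterion and $\varepsilon$-optimality named in the abstract can be written as below. This is a sketch under one common convention (the lower limit of expected time averages), which is an assumption and not necessarily the paper's exact definition. The symbols $w$, $r$, $x_t$, $a_t$, $\pi$ and $\sigma$ are introduced here for illustration only and do not come from the record.

$$
w(x,\pi) \;=\; \liminf_{N\to\infty} \frac{1}{N}\, \mathbb{E}_x^{\pi} \sum_{t=0}^{N-1} r(x_t, a_t),
\qquad
\pi \ \text{is } \varepsilon\text{-optimal if}\quad w(x,\pi) \;\ge\; \sup_{\sigma} w(x,\sigma) - \varepsilon \ \text{ for every state } x .
$$

Here $r(x_t, a_t)$ denotes the one-step reward for taking action $a_t$ in state $x_t$, and the supremum ranges over all admissible policies $\sigma$.
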
@article{TVP_1980_25_1_a5,
     author = {E. A. Feǐnberg},
     title = {An $\varepsilon$-optimal control of finite {Markov} chain with average},
     journal = {Teori\^a vero\^atnostej i ee primeneni\^a},
     pages = {71--82},
     year = {1980},
     volume = {25},
     number = {1},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/TVP_1980_25_1_a5/}
}
TY  - JOUR
AU  - E. A. Feǐnberg
TI  - An $\varepsilon$-optimal control of finite Markov chain with average
JO  - Teoriâ veroâtnostej i ee primeneniâ
PY  - 1980
SP  - 71
EP  - 82
VL  - 25
IS  - 1
UR  - http://geodesic.mathdoc.fr/item/TVP_1980_25_1_a5/
LA  - ru
ID  - TVP_1980_25_1_a5
ER  - 
%0 Journal Article
%A E. A. Feǐnberg
%T An $\varepsilon$-optimal control of finite Markov chain with average
%J Teoriâ veroâtnostej i ee primeneniâ
%D 1980
%P 71-82
%V 25
%N 1
%U http://geodesic.mathdoc.fr/item/TVP_1980_25_1_a5/
%G ru
%F TVP_1980_25_1_a5
E. A. Feǐnberg. An $\varepsilon$-optimal control of finite Markov chain with average. Teoriâ veroâtnostej i ee primeneniâ, Volume 25 (1980) no. 1, pp. 71-82. http://geodesic.mathdoc.fr/item/TVP_1980_25_1_a5/