On a problem of D. Blackwell from the theory of dynamic programming
Teoriâ veroâtnostej i ee primeneniâ, Volume 15 (1970) no. 4, pp. 740-745
This article was harvested from the source Math-Net.Ru
This paper considers the positive case of a dynamic programming problem. We prove that, for any probability measure $p$ on the state space $S$ and any $\lambda<1$, there exists a stationary policy $\pi^*$ such that $$ p\{I^{\pi^*}\ge\lambda\sup_\pi I^\pi\}=1, $$ where $I^\pi$ denotes the mean reward under the policy $\pi$.
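The inequality in the abstract can be checked by hand on a toy example. The sketch below is not taken from the paper: the three-state deterministic model `MODEL`, the helper names, and the choice of $p$ and $\lambda$ are all hypothetical, and $I^\pi$ is assumed to mean the expected total reward with nonnegative rewards. It searches the stationary deterministic policies by brute force for one satisfying $I^{\pi^*}\ge\lambda\sup_\pi I^\pi$ on the support of $p$.

```python
from itertools import product

# Hypothetical toy model (an assumption, not from the paper).
# MODEL[s][a] = (reward, next_state); all rewards are nonnegative ("positive case").
# State 2 is absorbing with zero reward.
MODEL = {
    0: {0: (0.0, 0), 1: (1.0, 1)},  # action 0 idles forever; action 1 earns 1 and moves on
    1: {0: (2.0, 2), 1: (0.5, 2)},
    2: {0: (0.0, 2)},
}
STATES = sorted(MODEL)


def total_reward(policy, horizon=50):
    """Total reward I^pi(s) of a stationary policy, iterating v <- r + v(next) from v = 0.

    The truncation at `horizon` steps is exact here because every run of the
    toy model stops earning reward after at most two steps.
    """
    v = {s: 0.0 for s in STATES}
    for _ in range(horizon):
        v = {s: MODEL[s][policy[s]][0] + v[MODEL[s][policy[s]][1]] for s in STATES}
    return v


def optimal_value(horizon=50):
    """Approximate sup_pi I^pi(s) by value iteration started from v = 0."""
    v = {s: 0.0 for s in STATES}
    for _ in range(horizon):
        v = {s: max(r + v[t] for r, t in MODEL[s].values()) for s in STATES}
    return v


def find_stationary_policy(p, lam, horizon=50):
    """Return (policy, I^policy, sup value) for the first stationary deterministic
    policy with I^policy(s) >= lam * sup value(s) on the support of p, else None."""
    v_star = optimal_value(horizon)
    support = [s for s in STATES if p[s] > 0]
    for choice in product(*(sorted(MODEL[s]) for s in STATES)):
        policy = dict(zip(STATES, choice))
        v = total_reward(policy, horizon)
        if all(v[s] >= lam * v_star[s] for s in support):
            return policy, v, v_star
    return None


if __name__ == "__main__":
    p = {0: 0.5, 1: 0.5, 2: 0.0}  # hypothetical initial distribution
    print(find_stationary_policy(p, lam=0.9))
```

In this toy model the policy that idles in state 0 satisfies the optimality equation there (reward 0 plus the optimal value 3 equals 3) yet earns nothing, so picking a maximizing action state by state is not enough. In a finite example like this one the search still finds an exactly optimal stationary policy; the factor $\lambda<1$ matters only in more general settings than this sketch covers.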
@article{TVP_1970_15_4_a13,
author = {E. B. Frid},
title = {On a~problem of {D.~Blackwell} from the theory of dynamic programming},
journal = {Teori\^a vero\^atnostej i ee primeneni\^a},
pages = {740--745},
year = {1970},
volume = {15},
number = {4},
language = {ru},
url = {http://geodesic.mathdoc.fr/item/TVP_1970_15_4_a13/}
}
E. B. Frid. On a problem of D. Blackwell from the theory of dynamic programming. Teoriâ veroâtnostej i ee primeneniâ, Volume 15 (1970) no. 4, pp. 740-745. http://geodesic.mathdoc.fr/item/TVP_1970_15_4_a13/