On a problem of D. Blackwell from the theory of dynamic programming
Teoriâ veroâtnostej i ee primeneniâ, Volume 15 (1970) no. 4, pp. 740-745
See the article record from the source Math-Net.Ru
In this paper the positive case of a dynamic programming problem is considered. We prove that, for any probability $p$ on the set of states $S$ and any $\lambda<1$, there exists a stationary policy $\pi^*$ such that
$$
p\{I^{\pi^*}\ge\lambda\sup_\pi I^\pi\}=1,
$$
where $I^\pi$ is the mean reward.
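To see why the statement is made for $\lambda<1$ rather than $\lambda=1$, the following is a minimal sketch of a toy model that is not taken from the paper. It assumes a single non-absorbing state with actions $n=1,2,\dots$, where action $n$ pays $1-1/n$ once and then moves to a reward-free absorbing state. Here $\sup_\pi I^\pi=1$ is never attained, yet for every $\lambda<1$ some stationary policy reaches at least $\lambda\sup_\pi I^\pi$. The names `reward`, `SUP_REWARD` and `lambda_optimal_action` are hypothetical and used only for illustration.

```python
import math
from fractions import Fraction


def reward(n: int) -> Fraction:
    """Total reward of the stationary policy that always plays action n (toy model)."""
    return 1 - Fraction(1, n)


# Supremum of 1 - 1/n over n = 1, 2, ...; approached but never attained.
SUP_REWARD = Fraction(1)


def lambda_optimal_action(lam: Fraction) -> int:
    """Smallest action n with reward(n) >= lam * SUP_REWARD, for 0 <= lam < 1."""
    if not 0 <= lam < 1:
        raise ValueError("lam must satisfy 0 <= lam < 1")
    # 1 - 1/n >= lam  is equivalent to  n >= 1 / (1 - lam)
    return math.ceil(1 / (1 - Fraction(lam)))


if __name__ == "__main__":
    for lam in (Fraction(1, 2), Fraction(9, 10), Fraction(99, 100)):
        n = lambda_optimal_action(lam)
        print(f"lambda={lam}: action n={n}, reward={reward(n)}")
```

The required action index grows without bound as $\lambda$ approaches 1, which is why no single stationary policy works for $\lambda=1$ in this toy model.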
@article{TVP_1970_15_4_a13,
author = {E. B. Frid},
title = {On a~problem of {D.~Blackwell} from the theory of dynamic programming},
journal = {Teori\^a vero\^atnostej i ee primeneni\^a},
pages = {740--745},
publisher = {mathdoc},
volume = {15},
number = {4},
year = {1970},
language = {ru},
url = {http://geodesic.mathdoc.fr/item/TVP_1970_15_4_a13/}
}
E. B. Frid. On a problem of D. Blackwell from the theory of dynamic programming. Teoriâ veroâtnostej i ee primeneniâ, Volume 15 (1970) no. 4, pp. 740-745. http://geodesic.mathdoc.fr/item/TVP_1970_15_4_a13/