Convergence of probability measures and Markov decision models with incomplete information

Eugene A. Feinberg; Pavlo O. Kasyanov; Michael Z. Zgurovsky

Trudy Matematicheskogo Instituta imeni V.A. Steklova, Stochastic calculus, martingales, and their applications, Vol. 287 (2014), pp. 103-124

This article was harvested from the source Math-Net.Ru

Abstract

This paper deals with three major types of convergence of probability measures on metric spaces: weak convergence, setwise convergence, and convergence in total variation. First, it describes and compares necessary and sufficient conditions for these types of convergence, some of which are well-known, in terms of convergence of probabilities of open and closed sets and, for the probabilities on the real line, in terms of convergence of distribution functions. Second, it provides criteria for weak and setwise convergence of probability measures and continuity of stochastic kernels in terms of convergence of probabilities defined on the base of the topology generated by the metric. Third, it provides applications to control of partially observable Markov decision processes and, in particular, to Markov decision models with incomplete information.
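For orientation, the three modes of convergence named in the abstract are usually defined as follows. This is a sketch in standard textbook notation, not the paper's own formulation: the symbols $S$ (a metric space), $\mathcal{B}(S)$ (its Borel $\sigma$-algebra), $P_n$, $P$ (probability measures on $S$), and $\delta_x$ (the point mass at $x$) are assumptions introduced here for illustration.

```latex
% Weak convergence: integrals of bounded continuous functions converge.
P_n \xrightarrow{\;w\;} P
  \iff \int_S f \, dP_n \to \int_S f \, dP
  \quad \text{for every bounded continuous } f : S \to \mathbb{R}.

% Equivalent characterization via open sets (portmanteau theorem):
P_n \xrightarrow{\;w\;} P
  \iff \liminf_{n \to \infty} P_n(O) \ge P(O)
  \quad \text{for every open } O \subseteq S.

% Setwise convergence: probabilities of every Borel set converge.
P_n \xrightarrow{\;s\;} P
  \iff P_n(B) \to P(B)
  \quad \text{for every } B \in \mathcal{B}(S).

% Convergence in total variation: the convergence is uniform over Borel sets.
P_n \xrightarrow{\;TV\;} P
  \iff \sup_{B \in \mathcal{B}(S)} \lvert P_n(B) - P(B) \rvert \to 0.

% Implications (neither converse holds in general):
\text{total variation} \;\Longrightarrow\; \text{setwise} \;\Longrightarrow\; \text{weak}.

% Example on S = \mathbb{R}: weak but not setwise convergence.
\delta_{1/n} \xrightarrow{\;w\;} \delta_0,
\qquad \text{but} \qquad
\delta_{1/n}(\{0\}) = 0 \not\to 1 = \delta_0(\{0\}).
```

The point-mass example shows why the distinction matters: shifting mass by an arbitrarily small distance preserves weak convergence but breaks setwise convergence, and therefore also convergence in total variation.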

How to cite

@article{TM_2014_287_a5,
     author = {Eugene A. Feinberg and Pavlo O. Kasyanov and Michael Z. Zgurovsky},
     title = {Convergence of probability measures and {Markov} decision models with incomplete information},
     journal = {Trudy Matematicheskogo Instituta imeni V.A. Steklova},
     pages = {103--124},
     year = {2014},
     volume = {287},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/TM_2014_287_a5/}
}

TY  - JOUR
AU  - Eugene A. Feinberg
AU  - Pavlo O. Kasyanov
AU  - Michael Z. Zgurovsky
TI  - Convergence of probability measures and Markov decision models with incomplete information
JO  - Trudy Matematicheskogo Instituta imeni V.A. Steklova
PY  - 2014
SP  - 103
EP  - 124
VL  - 287
UR  - http://geodesic.mathdoc.fr/item/TM_2014_287_a5/
LA  - en
ID  - TM_2014_287_a5
ER  -

%0 Journal Article
%A Eugene A. Feinberg
%A Pavlo O. Kasyanov
%A Michael Z. Zgurovsky
%T Convergence of probability measures and Markov decision models with incomplete information
%J Trudy Matematicheskogo Instituta imeni V.A. Steklova
%D 2014
%P 103-124
%V 287
%U http://geodesic.mathdoc.fr/item/TM_2014_287_a5/
%G en
%F TM_2014_287_a5

Eugene A. Feinberg; Pavlo O. Kasyanov; Michael Z. Zgurovsky. Convergence of probability measures and Markov decision models with incomplete information. Trudy Matematicheskogo Instituta imeni V.A. Steklova, Stochastic calculus, martingales, and their applications, Vol. 287 (2014), pp. 103-124. http://geodesic.mathdoc.fr/item/TM_2014_287_a5/

