On optimality in probability and almost surely for processes with a communication property. I. The discrete time case
Teoriâ veroâtnostej i ee primeneniâ, Volume 50 (2005) no. 1, pp. 3-26. This article was harvested from the source Math-Net.Ru.


We establish conditions under which the strategy minimizing the expected value of a cost functional has a much stronger property: it minimizes the random cost functional itself for all realizations of the controlled process belonging to a set whose probability is close to one for large time horizons. The main difference between these conditions and those obtained earlier is that they do not involve properties of the value function; instead, they concern the possibility of the controlled process passing from one state to another in a time with finite mean. This makes the conditions much easier to verify in a number of situations of a general form. The first part of the paper concerns processes in discrete time; the second part will be devoted to processes in continuous time.
Keywords: controlled processes, controlled Markov chains, asymptotic in probability.
@article{TVP_2005_50_1_a0,
     author = {T. A. Belkina and V. I. Rotar'},
     title = {On optimality in probability and almost surely for processes with a communication property. {I.~The} discrete time case},
     journal = {Teori\^a vero\^atnostej i ee primeneni\^a},
     pages = {3--26},
     year = {2005},
     volume = {50},
     number = {1},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/TVP_2005_50_1_a0/}
}
T. A. Belkina; V. I. Rotar'. On optimality in probability and almost surely for processes with a communication property. I. The discrete time case. Teoriâ veroâtnostej i ee primeneniâ, Volume 50 (2005) no. 1, pp. 3-26. http://geodesic.mathdoc.fr/item/TVP_2005_50_1_a0/
