Estimation and adaptive control of span-contracting Markov decision processes
Kybernetika, Volume 27 (1991) no. 1, pp. 66-71. This article was harvested from the Czech Digital Mathematics Library.

Classification: 90C40
@article{KYB_1991_27_1_a5,
     author = {H\"ubner, Gerhard},
     title = {Estimation and adaptive control of span-contracting {Markov} decision processes},
     journal = {Kybernetika},
     pages = {66--71},
     year = {1991},
     volume = {27},
     number = {1},
     mrnumber = {1099515},
     zbl = {0744.90099},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/KYB_1991_27_1_a5/}
}
TY  - JOUR
AU  - Hübner, Gerhard
TI  - Estimation and adaptive control of span-contracting Markov decision processes
JO  - Kybernetika
PY  - 1991
SP  - 66
EP  - 71
VL  - 27
IS  - 1
UR  - http://geodesic.mathdoc.fr/item/KYB_1991_27_1_a5/
LA  - en
ID  - KYB_1991_27_1_a5
ER  - 
%0 Journal Article
%A Hübner, Gerhard
%T Estimation and adaptive control of span-contracting Markov decision processes
%J Kybernetika
%D 1991
%P 66-71
%V 27
%N 1
%U http://geodesic.mathdoc.fr/item/KYB_1991_27_1_a5/
%G en
%F KYB_1991_27_1_a5
Hübner, Gerhard. Estimation and adaptive control of span-contracting Markov decision processes. Kybernetika, Volume 27 (1991) no. 1, pp. 66-71. http://geodesic.mathdoc.fr/item/KYB_1991_27_1_a5/

[1] R. S. Acosta-Abreu, O. Hernandez-Lerma: Iterative adaptive control of denumerable state average-cost Markov systems. Control Cybernet. 14 (1985), 313-322. | MR

[2] V. V. Baranov: Recursive algorithms of adaptive control in stochastic systems. Cybernetics 17 (1981), 815-824. | MR

[3] A. Federgruen: Markovian Control Problems. Math. Centre Tracts 97, Amsterdam 1983. | MR | Zbl

[4] A. Federgruen, P. J. Schweitzer: Nonstationary Markov decision problems with converging parameters. J. Optim. Theory Appl. 34 (1981), 207-241. | MR | Zbl

[5] A. Federgruen, P. J. Schweitzer, H. C. Tijms: Contraction mappings underlying undiscounted Markov decision problems. J. Math. Anal. Appl. 65 (1978), 711-730. | MR

[6] A. Federgruen, H. C. Tijms: The optimality equation in average cost denumerable state semi-Markov decision problems, recurrency conditions and algorithms. J. Appl. Probab. 15 (1978), 356-373. | MR | Zbl

[7] O. Hernandez-Lerma: Adaptive Control Processes. Springer-Verlag, Berlin-Heidelberg-New York 1989. | MR

[8] K. Hinderer: On approximate solutions of finite-stage dynamic programs. In: Dynamic Programming and Its Applications (M. L. Puterman, ed.), Academic Press, New York 1978, pp. 289-317. | MR | Zbl

[9] G. Hübner: Contraction properties of Markov decision models with applications to the elimination of non-optimal actions. In: Dynamische Optimierung, Bonner Math. Schriften 98 (1977), 57-65. | MR

[10] G. Hübner: A unified approach to adaptive control of average reward Markov decision processes. OR Spektrum 10 (1988), 161-166. | MR

[11] M. Kurano: Discrete-time Markovian decision processes with an unknown parameter - average return criterion. J. Oper. Res. Soc. Japan 15 (1972), 67-76. | MR | Zbl

[12] M. Kurano: Adaptive policies in Markov decision processes with uncertain matrices. J. Inf. Optim. 4 (1983), 21-40. | MR

[13] M. Kurano: Learning algorithms for Markov decision processes. J. Appl. Probab. 24 (1987), 270-276. | MR | Zbl

[14] P. Mandl: Estimation and control of Markov chains. Adv. in Appl. Probab. 6 (1974), 40-60. | MR

[15] P. Mandl: On the adaptive control of countable Markov chains. In: Probability Theory, Banach Centre Publications, Warsaw 1979, pp. 159-173. | MR | Zbl

[16] W. Whitt: Approximations of dynamic programs. Math. Oper. Res. 3 (1978), 231-243. | MR | Zbl