Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost
Applicationes Mathematicae, Volume 25 (1999) no. 3, pp. 339-358
This article was harvested from the source Institute of Mathematics Polish Academy of Sciences
This paper considers Bayesian parameter estimation and an associated adaptive control scheme for controlled Markov chains and diffusions with time-averaged cost. Asymptotic behaviour of the posterior law of the parameter given the observed trajectory is analyzed. This analysis suggests a "cost-biased" estimation scheme and associated self-tuning adaptive control. This is shown to be asymptotically optimal in the almost sure sense.
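As a rough illustration of the kind of posterior computation the abstract refers to, the sketch below updates a posterior over a finite set of candidate parameters from observed transitions of a controlled Markov chain. It is a minimal sketch under stated assumptions, not the paper's construction: the finite parameter set, the two-state and two-action kernels in `KERNELS`, and the function names are all hypothetical, and the cost-biased modification is not implemented here.

```python
import math

# Hypothetical candidates: KERNELS[theta][u][x][y] is the probability of
# moving from state x to state y under action u when the parameter is theta.
KERNELS = [
    [[[0.9, 0.1], [0.2, 0.8]],   # theta 0, action 0
     [[0.5, 0.5], [0.5, 0.5]]],  # theta 0, action 1
    [[[0.5, 0.5], [0.2, 0.8]],   # theta 1, action 0
     [[0.5, 0.5], [0.5, 0.5]]],  # theta 1, action 1
]


def log_lik(p):
    """Logarithm that maps probability zero to -inf instead of raising."""
    return math.log(p) if p > 0.0 else -math.inf


def posterior_probs(prior, kernels, transitions):
    """Posterior over candidate parameters given (state, action, next_state) triples.

    Assumes at least one candidate assigns positive probability to the
    whole observed trajectory.
    """
    log_post = [log_lik(p) for p in prior]
    for x, u, y in transitions:
        # Bayes' rule in the log domain: add each candidate's log-likelihood.
        log_post = [lp + log_lik(P[u][x][y]) for lp, P in zip(log_post, kernels)]
    # Normalise with the max subtracted for numerical stability.
    m = max(log_post)
    weights = [math.exp(lp - m) for lp in log_post]
    z = sum(weights)
    return [w / z for w in weights]
```

For example, `posterior_probs([0.5, 0.5], KERNELS, [(0, 0, 0)])` weights the two candidates in proportion 0.9 to 0.5. A transition observed under action 1 leaves the posterior unchanged, because the hypothetical candidates agree under that action. This is a small instance of how the chosen controls affect what the observed trajectory reveals about the parameter.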
DOI: 10.4064/am-25-3-339-358
Keywords:
time-averaged cost, adaptive control, asymptotic optimality, cost-biased estimate, Bayesian estimation
Author affiliations:
V. Borkar 1 ; S. Associate 1
@article{10_4064_am_25_3_339_358,
author = {V. Borkar and S. Associate},
title = {Bayesian parameter estimation and adaptive control of {Markov} processes with time-averaged cost},
journal = {Applicationes Mathematicae},
pages = {339--358},
year = {1999},
volume = {25},
number = {3},
doi = {10.4064/am-25-3-339-358},
zbl = {0992.93086},
language = {en},
url = {http://geodesic.mathdoc.fr/articles/10.4064/am-25-3-339-358/}
}
TY  - JOUR
AU  - V. Borkar
AU  - S. Associate
TI  - Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost
JO  - Applicationes Mathematicae
PY  - 1999
SP  - 339
EP  - 358
VL  - 25
IS  - 3
UR  - http://geodesic.mathdoc.fr/articles/10.4064/am-25-3-339-358/
DO  - 10.4064/am-25-3-339-358
LA  - en
ID  - 10_4064_am_25_3_339_358
ER  -
%0 Journal Article
%A V. Borkar
%A S. Associate
%T Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost
%J Applicationes Mathematicae
%D 1999
%P 339-358
%V 25
%N 3
%U http://geodesic.mathdoc.fr/articles/10.4064/am-25-3-339-358/
%R 10.4064/am-25-3-339-358
%G en
%F 10_4064_am_25_3_339_358
V. Borkar; S. Associate. Bayesian parameter estimation and adaptive control of Markov processes with time-averaged cost. Applicationes Mathematicae, Volume 25 (1999) no. 3, pp. 339-358. doi: 10.4064/am-25-3-339-358