TY  - JOUR
AU  - Rolando Cavazos-Cadena
AU  - Raúl Montes-de-Oca
TI  - Estimation and control in finite Markov decision processes with the average reward criterion
JO  - Applicationes Mathematicae
PY  - 2004
SP  - 127
EP  - 154
VL  - 31
IS  - 2
UR  - http://geodesic.mathdoc.fr/articles/10.4064/am31-2-1/
DO  - 10.4064/am31-2-1
LA  - en
ID  - 10_4064_am31_2_1
ER  -