TY  - JOUR
AU  - Rolando Cavazos-Cadena
AU  - Raúl Montes-de-Oca
TI  - Estimation and control in finite Markov decision processes with the average reward criterion
JO  - Applicationes Mathematicae
PY  - 2004
SP  - 127
EP  - 154
VL  - 31
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/articles/10.4064/am31-2-1/
DO  - 10.4064/am31-2-1
LA  - en
ID  - 10_4064_am31_2_1
ER  -