Solutions of semi-Markov control models with recursive discount rates and approximation by $\epsilon$-optimal policies
Kybernetika, Volume 55 (2019) no. 3, pp. 495-517
This article was harvested from the Czech Digital Mathematics Library

This paper studies a class of discrete-time discounted semi-Markov control models on Borel spaces. We allow possibly unbounded costs and a non-stationary discount factor of exponential form that depends on a rate, called the discount rate. Given an initial discount rate, its evolution in subsequent steps depends on both the previous discount rate and the sojourn time of the system in the current state. The new results provided here are the existence and the approximation of optimal policies for this class of discounted Markov control models with non-stationary rates, for both finite and infinite horizons. Under regularity conditions on the sojourn time distributions and measurable selector conditions, we show the validity of the dynamic programming algorithm for the finite-horizon case. Using the convergence of the finite-step value functions, we guarantee the existence of non-stationary optimal policies for the infinite-horizon case, and we approximate them using non-stationary $\epsilon$-optimal policies. We illustrate our results with a discounted semi-Markov linear-quadratic model in which the discount rate evolves according to an appropriate type of stochastic differential equation.
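The recursive discounting described in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes the accumulated discount factor at the n-th decision epoch is the exponential of minus the sum of (rate × sojourn time) over earlier epochs, and the names `discounted_cost` and `update_rate` are hypothetical, introduced only for illustration.

```python
import math


def discounted_cost(costs, sojourn_times, alpha0, update_rate):
    """Total discounted cost along one trajectory (illustrative sketch only).

    costs[n]         : stage cost paid at the n-th decision epoch
    sojourn_times[n] : time spent in the n-th state before the next epoch
    alpha0           : initial discount rate
    update_rate      : hypothetical map (alpha, sojourn_time) -> next alpha
    """
    total = 0.0
    exponent = 0.0  # accumulated sum of alpha_k * delta_k over past epochs
    alpha = alpha0
    for cost, delta in zip(costs, sojourn_times):
        # Assumed exponential form: discount the current cost by exp(-exponent).
        total += math.exp(-exponent) * cost
        # The sojourn just completed is discounted at the current rate.
        exponent += alpha * delta
        # The next rate depends on the previous rate and the sojourn time.
        alpha = update_rate(alpha, delta)
    return total


def constant_rate(alpha, delta):
    """Special case: the rate never changes (ordinary exponential discounting)."""
    return alpha
```

With `constant_rate`, unit sojourn times, and `alpha0 = math.log(2)`, each step halves the discount factor, so three unit costs sum to 1 + 1/2 + 1/4. A rate that reacts to the sojourn time can be tried by passing a different `update_rate`.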
DOI: 10.14736/kyb-2019-3-0495
Classification: 49L20, 93E20
Keywords: optimal stochastic control; dynamic programming method; semi-Markov processes
@article{10_14736_kyb_2019_3_0495,
     author = {Garc{\'\i}a, Yofre H. and Gonz\'alez-Hern\'andez, Juan},
     title = {Solutions of {semi-Markov} control models with recursive discount rates and approximation by $\epsilon$-optimal policies},
     journal = {Kybernetika},
     pages = {495--517},
     year = {2019},
     volume = {55},
     number = {3},
     doi = {10.14736/kyb-2019-3-0495},
     mrnumber = {4015995},
     zbl = {07144950},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-3-0495/}
}
TY  - JOUR
AU  - García, Yofre H.
AU  - González-Hernández, Juan
TI  - Solutions of semi-Markov control models with recursive discount rates and approximation by $\epsilon$-optimal policies
JO  - Kybernetika
PY  - 2019
SP  - 495
EP  - 517
VL  - 55
IS  - 3
UR  - http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-3-0495/
DO  - 10.14736/kyb-2019-3-0495
LA  - en
ID  - 10_14736_kyb_2019_3_0495
ER  - 
%0 Journal Article
%A García, Yofre H.
%A González-Hernández, Juan
%T Solutions of semi-Markov control models with recursive discount rates and approximation by $\epsilon$-optimal policies
%J Kybernetika
%D 2019
%P 495-517
%V 55
%N 3
%U http://geodesic.mathdoc.fr/articles/10.14736/kyb-2019-3-0495/
%R 10.14736/kyb-2019-3-0495
%G en
%F 10_14736_kyb_2019_3_0495
García, Yofre H.; González-Hernández, Juan. Solutions of semi-Markov control models with recursive discount rates and approximation by $\epsilon$-optimal policies. Kybernetika, Volume 55 (2019) no. 3, pp. 495-517. doi: 10.14736/kyb-2019-3-0495

