Markov decision processes on finite spaces with fuzzy total rewards
Kybernetika, Volume 58 (2022) no. 2, pp. 180-199
This article was harvested from the Czech Digital Mathematics Library


The paper concerns Markov decision processes (MDPs) in which both the state and the decision spaces are finite and the total reward is the objective function. For such MDPs, the authors assume that the reward function is fuzzy: specifically, it has a suitable trapezoidal shape that is a function of a standard non-fuzzy reward. The fuzzy control problem consists of determining a control policy that maximizes the fuzzy expected total reward, where the maximization is taken with respect to the partial order on the $\alpha$-cuts of fuzzy numbers. The optimal policy and the optimal value function of the fuzzy optimal control problem are characterized by means of the dynamic programming equation of the standard optimal control problem. The main conclusions are that the optimal policies of the standard and the fuzzy problems coincide and that the fuzzy optimal value function has a convenient trapezoidal form. As illustrations, fuzzy extensions of an optimal stopping problem and of a red-black gambling model are presented.
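The ordering on $\alpha$-cuts mentioned in the abstract can be illustrated with a minimal Python sketch. The names `Trapezoid`, `leq`, and `fuzzify`, as well as the multiplicative shape tuple, are hypothetical and chosen only for illustration; in particular, the assumption that the fuzzy reward is obtained by scaling a nonnegative crisp reward by four fixed factors is not stated in the abstract and may differ from the authors' construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trapezoid:
    """Trapezoidal fuzzy number (a, b, c, d) with a <= b <= c <= d."""
    a: float
    b: float
    c: float
    d: float

    def alpha_cut(self, alpha):
        """Interval of points whose membership is at least alpha, 0 <= alpha <= 1."""
        lower = self.a + alpha * (self.b - self.a)
        upper = self.d - alpha * (self.d - self.c)
        return (lower, upper)


def leq(x, y):
    """Partial order: x <= y iff both end points of every alpha-cut are ordered.

    For trapezoids the end points are linear in alpha, so it is enough
    to compare the cuts at alpha = 0 and alpha = 1.
    """
    for alpha in (0.0, 1.0):
        x_lo, x_hi = x.alpha_cut(alpha)
        y_lo, y_hi = y.alpha_cut(alpha)
        if not (x_lo <= y_lo and x_hi <= y_hi):
            return False
    return True


def fuzzify(value, shape=(0.5, 0.9, 1.1, 1.5)):
    """ASSUMPTION: scale a nonnegative crisp value by fixed increasing factors."""
    return Trapezoid(*(s * value for s in shape))


# Crisp order is preserved under the assumed scaling.
print(leq(fuzzify(2.0), fuzzify(3.0)))
# The order is only partial: these two trapezoids are incomparable.
p, q = Trapezoid(0, 1, 2, 5), Trapezoid(1, 2, 3, 4)
print(leq(p, q), leq(q, p))
```

Under this assumed scaling, a larger crisp value always yields a larger fuzzy value in the $\alpha$-cut order, which gives some intuition for why a policy that is optimal for the standard problem could remain optimal for the fuzzy one.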
DOI: 10.14736/kyb-2022-2-0180
Classification: 90C40, 93C40
Keywords: Markov decision process; total reward; fuzzy reward; trapezoidal fuzzy number; optimal stopping problem; gambling model
@article{10_14736_kyb_2022_2_0180,
     author = {Carrero-Vera, Karla and Cruz-Su\'arez, Hugo and Montes-de-Oca, Ra\'ul},
     title = {Markov decision processes on finite spaces with fuzzy total rewards},
     journal = {Kybernetika},
     pages = {180--199},
     year = {2022},
     volume = {58},
     number = {2},
     doi = {10.14736/kyb-2022-2-0180},
     mrnumber = {4467492},
     zbl = {07584152},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2022-2-0180/}
}
TY  - JOUR
AU  - Carrero-Vera, Karla
AU  - Cruz-Suárez, Hugo
AU  - Montes-de-Oca, Raúl
TI  - Markov decision processes on finite spaces with fuzzy total rewards
JO  - Kybernetika
PY  - 2022
SP  - 180
EP  - 199
VL  - 58
IS  - 2
UR  - http://geodesic.mathdoc.fr/articles/10.14736/kyb-2022-2-0180/
DO  - 10.14736/kyb-2022-2-0180
LA  - en
ID  - 10_14736_kyb_2022_2_0180
ER  - 
%0 Journal Article
%A Carrero-Vera, Karla
%A Cruz-Suárez, Hugo
%A Montes-de-Oca, Raúl
%T Markov decision processes on finite spaces with fuzzy total rewards
%J Kybernetika
%D 2022
%P 180-199
%V 58
%N 2
%U http://geodesic.mathdoc.fr/articles/10.14736/kyb-2022-2-0180/
%R 10.14736/kyb-2022-2-0180
%G en
%F 10_14736_kyb_2022_2_0180
Carrero-Vera, Karla; Cruz-Suárez, Hugo; Montes-de-Oca, Raúl. Markov decision processes on finite spaces with fuzzy total rewards. Kybernetika, Volume 58 (2022) no. 2, pp. 180-199. doi: 10.14736/kyb-2022-2-0180

[1] Abbasbandy, S., Hajjari, T.: A new approach for ranking of trapezoidal fuzzy numbers. Comput. Math. Appl. 57 (2009), 413-419. | DOI | MR

[2] Ban, A. I.: Triangular and parametric approximations of fuzzy numbers - inadvertences and corrections. Fuzzy Sets and Systems 160 (2009), 3048-3058. | DOI | MR

[3] Bartle, R. G.: The Elements of Integration. Wiley, New York 1995. | MR

[4] Bellman, R. E., Zadeh, L. A.: Decision-making in a fuzzy environment. Management Sci. 17 (1970), 141-164. | DOI | MR

[5] Cavazos-Cadena, R., Montes-de-Oca, R.: Existence of optimal stationary policies in finite dynamic programs with nonnegative rewards. Probab. Engrg. Inform. Sci. 15 (2001), 557-564. | DOI | MR

[6] Chen, S. H.: Operations of fuzzy numbers with step form membership function using function principle. Information Sci. 108 (1998), 149-155. | DOI | MR | Zbl

[7] Diamond, P., Kloeden, P.: Metric Spaces of Fuzzy Sets: Theory and Applications. World Scientific, Singapore 1994. | MR

[8] Driankov, D., Hellendoorn, H., Reinfrank, M.: An Introduction to Fuzzy Control. Springer Science and Business Media, New York 2013. | MR

[9] Efendi, R., Arbaiy, N., Deris, M. M.: A new procedure in stock market forecasting based on fuzzy random auto-regression time series model. Information Sci. 441 (2018), 113-132. | DOI | MR

[10] Fakoor, M., Kosari, A., Jafarzadeh, M.: Humanoid robot path planning with fuzzy Markov decision processes. J. Appl. Res. Tech. 14 (2016), 300-310. | DOI

[11] Furukawa, N.: Parametric orders on fuzzy numbers and their roles in fuzzy optimization problems. Optimization 40 (1997), 171-192. | DOI | MR

[12] Kurano, M., Yasuda, M., Nakagami, J., Yoshida, Y.: Markov decision processes with fuzzy rewards. In: Proc. Int. Conf. on Nonlinear Analysis, Hirosaki 2002, pp. 221-232. | MR

[13] López-Díaz, M., Ralescu, D. A.: Tools for fuzzy random variables: embeddings and measurabilities. Comput. Statist. Data Anal. 51 (2006), 109-114. | DOI | MR

[14] Pedrycz, W.: Why triangular membership functions? Fuzzy Sets and Systems 64 (1994), 21-30. | DOI | MR

[15] Puri, M. L., Ralescu, D. A.: Fuzzy random variable. J. Math. Anal. Appl. 114 (1986), 402-422. | DOI | MR

[16] Puterman, M. L.: Markov Decision Processes: Discrete Stochastic Dynamic Programming. First edition. Wiley-Interscience, California 2005. | MR

[17] Rezvani, S., Molani, M.: Representation of trapezoidal fuzzy numbers with shape function. Ann. Fuzzy Math. Inform. 8 (2014), 89-112. | MR

[18] Ross, S.: Dynamic programming and gambling models. Adv. Appl. Probab. 6 (1974), 593-606. | DOI | MR

[19] Ross, S.: Introduction to Stochastic Dynamic Programming. Academic Press, New York 1983. | MR

[20] Semmouri, A., Jourhmane, M., Belhallaj, Z.: Discounted Markov decision processes with fuzzy costs. Ann. Oper. Res. 295 (2020), 769-786. | DOI | MR

[21] Syropoulos, A., Grammenos, T.: A Modern Introduction to Fuzzy Mathematics. Wiley, New Jersey 2020.

[22] Zadeh, L.: Fuzzy sets. Inform. Control 8 (1965), 338-353. | DOI | MR | Zbl

[23] Zeng, W., Li, H.: Weighted triangular approximation of fuzzy numbers. Int. J. Approx. Reason. 46 (2007), 137-150. | DOI | MR
