Keywords: approximate dynamic programming; Bellman equation; approximate HDMR minimization; trust region problem
@article{KYB_2013_49_5_a3,
author = {Pi\v{s}t\v{e}k, Miroslav},
title = {Approximate dynamic programming based on high dimensional model representation},
journal = {Kybernetika},
pages = {720--737},
year = {2013},
volume = {49},
number = {5},
mrnumber = {3182636},
zbl = {1278.90423},
language = {en},
url = {http://geodesic.mathdoc.fr/item/KYB_2013_49_5_a3/}
}
Pištěk, Miroslav. Approximate dynamic programming based on high dimensional model representation. Kybernetika, Tome 49 (2013) no. 5, pp. 720-737. http://geodesic.mathdoc.fr/item/KYB_2013_49_5_a3/
[1] Busygin, S.: A new trust region technique for the maximum weight clique problem. Discrete Appl. Math. 154 (2002), 2006. | MR | Zbl
[2] Demiralp, M.: High dimensional model representation and its application varieties. In: Proc. Fourth International Conference on Tools for Mathematical Modelling, St. Petersburg 2003, pp. 146-159. | Zbl
[3] Feil, K. S., Shah, N.: Volatility calibration using spline and high dimensional model representation models. Wilmott J. 1 (2009), 179-195. | DOI
[4] Garivier, A., Cappé, O.: The KL-UCB algorithm for bounded stochastic bandits and beyond. In: Proc. 24th Annual Conference on Learning Theory, Budapest 2011.
[5] George, A., Powell, W. B., Kulkarni, S. R.: Value function approximation using multiple aggregation for multiattribute resource management. J. Machine Learning Res. 9 (2008), 2079-2111. | Zbl
[6] Gittins, J.: Bandit processes and dynamic allocation indices. J. Roy. Statist. Soc. Ser.B 41 (1979), 2, 148-177. | MR | Zbl
[7] Gosavi, A.: Reinforcement learning: A tutorial survey and recent advances. INFORMS J. Comput. 21 (2009), 178-192. | DOI | MR | Zbl
[8] Hauskrecht, M.: Value-function approximations for partially observable markov decision processes. J. Artif. Internat. Res. 13 (2000), 33-94. | MR | Zbl
[9] Jaakkola, T., Singh, S. P., Jordan, M. I.: Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems. MIT Press, 1995.
[10] Karp, R. M.: Reducibility among combinatorial problems. In: Complexity of Computer Computations (R. Miller and J. W. Thatcher, eds.), Plenum, New York 1972. | DOI | MR | Zbl
[11] Kushner, H.: Introduction to Stochastic Control. Holt, Rinehart and Winston, New York 1970. | MR | Zbl
[12] LeBlanc, M., Tibshirani, R.: Combining estimates in regression and classification. J. Amer. Statist. Assoc. 91 (1996), 1641-1650. | MR | Zbl
[13] Li, G., Hu, J., Wang, S.-W., Georgopoulos, P. G., Schoendorf, J., Rabitz, H.: Random sampling-high dimensional model representation (rs-hdmr) and orthogonality of its different order component functions. J. Phys. Chem. A 110 (2006), 7, 2474-2485. | DOI
[14] Luus, R.: Iterative Dynamic Programming. Chapman and Hall/CRC Monographs and Surveys in Pure and Applied Mathematics, 2000. | MR | Zbl
[15] Ma, X., Zabaras, N.: An adaptive high-dimensional stochastic model representation technique for the solution of stochastic partial differential equations. J. Comput. Phys. 229 (2010), 3884-3915. | DOI | MR | Zbl
[16] Matoušek, J., Nešetřil, J.: Invitation to Discrete Mathematics. Clarendon Press, 1998. | MR | Zbl
[17] Miller, W., Sutton, R., Werbos, P.: Neural Networks for Control. Neural Network Modeling and Connectionism. Mit Press, 1995.
[18] Olsson, C., Eriksson, A., Kahl, F.: Solving large scale binary quadratic problems: Spectral methods vs. semidefinite programming. In: Computer Vision and Pattern Recognition, 2007.
[19] Peterka, V.: Bayesian system identification. In: Trends and Progress in System Identification (P. Eykhoff, ed.), Pergamon Press, Oxford 1981, pp. 239-304. | MR | Zbl
[20] Pištěk, M.: On implicit approximation of the bellman equation. In: 15th IFAC Symposium on System Identification, Saint-Malo 2009.
[21] Powell, W. B.: Approximate Dynamic Programming: Solving the Curses of Dimensionality. Wiley-Interscience, 2007. | MR | Zbl
[22] Rabitz, H., Alis, O.: General foundations of high-dimensional model representations. J. Math. Chem. 25 (1999), 197-233. | DOI | MR | Zbl
[23] Rahman, S.: A polynomial dimensional decomposition for stochastic computing. Internat. J. Numer. Methods in Engrg. 76 (2008), 2091-2116. | DOI | MR | Zbl
[24] Rojas, M., Santos, S. A., Sorensen, D. C.: A new matrix-free algorithm for the large-scale trust-region subproblem. SIAM J. Optim. 11 (2000), 611-646. | DOI | MR | Zbl
[25] Schrijver, A.: Theory of Linear and Integer Programming. Wiley-Interscience Series in Discrete Mathematics and Optimization, John Wiley and Sons, 1998. | MR | Zbl
[26] Sorensen, D. C.: Newton's method with a model trust region modification. SIAM J. Numer. Anal. 19 (1982), 2, 409-426. | DOI | MR | Zbl
[27] Sutton, R. S., Barto, A. G.: Reinforcement Learning: An Introduction. MIT Press, 1998.