Optimally approximating exponential families
Kybernetika, Volume 49 (2013) no. 2, pp. 199-215. This article was harvested from the Czech Digital Mathematics Library.


This article studies exponential families $\mathcal{E}$ on finite sets such that the information divergence $D(P\|\mathcal{E})$ of an arbitrary probability distribution $P$ from $\mathcal{E}$ is bounded by some constant $D>0$. A particular class of low-dimensional exponential families that have low values of $D$ can be obtained from partitions of the state space. The main results concern optimality properties of these partition exponential families. The case where $D=\log(2)$ is studied in detail. This case is special, because if $D<\log(2)$, then $\mathcal{E}$ contains all probability measures with full support.
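The bound for partition families can be checked numerically on a small example. The sketch below is an illustration and is not taken from the article. It assumes that the partition exponential family consists of the distributions that are uniform within each block of the partition, so that the closest member to $P$ keeps the mass of each block and spreads it evenly over the block. The function name `divergence_from_partition_model` and the example distributions are hypothetical.

```python
import math

def divergence_from_partition_model(p, blocks):
    """Information divergence D(P || E) in nats.

    Assumption (not from the article text): E is the family of
    distributions that are constant on every block of `blocks`, so the
    minimizing Q assigns Q(x) = P(block) / |block| to each x in a block.
    """
    total = 0.0
    for block in blocks:
        mass = sum(p[x] for x in block)   # P(block)
        q = mass / len(block)             # uniform spread inside the block
        for x in block:
            if p[x] > 0:                  # convention: 0 * log 0 = 0
                total += p[x] * math.log(p[x] / q)
    return total

# State space {0, 1, 2, 3} split into two blocks of size two.
blocks = [(0, 1), (2, 3)]

p_point = [1.0, 0.0, 0.0, 0.0]        # point mass on one state
p_uniform = [0.25, 0.25, 0.25, 0.25]  # already uniform within blocks

d_point = divergence_from_partition_model(p_point, blocks)
d_uniform = divergence_from_partition_model(p_uniform, blocks)
print(d_point, math.log(2), d_uniform)
```

Under this assumption the divergence equals $\sum_i P(A_i)\,(\log|A_i| - H(P(\cdot\,|A_i)))$ over the blocks $A_i$, so blocks of size at most two give a divergence of at most $\log(2)$, which the point mass attains.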
Classification: 62B10, 94A15, 94A17
Keywords: exponential family; information divergence
@article{KYB_2013_49_2_a0,
     author = {Rauh, Johannes},
     title = {Optimally approximating exponential families},
     journal = {Kybernetika},
     pages = {199--215},
     year = {2013},
     volume = {49},
     number = {2},
     mrnumber = {3085392},
     zbl = {06176033},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/KYB_2013_49_2_a0/}
}
Rauh, Johannes. Optimally approximating exponential families. Kybernetika, Volume 49 (2013) no. 2, pp. 199-215. http://geodesic.mathdoc.fr/item/KYB_2013_49_2_a0/
