Multi-variate correlation and mixtures of product measures
Kybernetika, Tome 56 (2020) no. 3, pp. 459-499
This article was harvested from the Czech Digital Mathematics Library.


Total correlation (`TC') and dual total correlation (`DTC') are two classical ways to quantify the correlation among an $n$-tuple of random variables. They both reduce to mutual information when $n=2$. The first part of this paper sets up the theory of TC and DTC for general random variables, not necessarily finite-valued. This level of generality has not previously been set out in the literature. The second part considers the structural implications when a joint distribution $\mu$ has small TC or DTC. If $\mathrm{TC}(\mu) = o(n)$, then $\mu$ is close to a product measure according to a suitable transportation metric: this follows directly from Marton's classical transportation-entropy inequality. If $\mathrm{DTC}(\mu) = o(n)$, then the structural consequence is more complicated: $\mu$ is a mixture of a controlled number of terms, most of them close to product measures in the transportation metric. This is the main new result of the paper.
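The reduction of both quantities to mutual information at $n=2$ can be checked numerically. The sketch below uses the standard finite-alphabet definitions, $\mathrm{TC} = \sum_i H(X_i) - H(X_1,\dots,X_n)$ and $\mathrm{DTC} = H(X_1,\dots,X_n) - \sum_i H(X_i \mid X_{[n]\setminus\{i\}})$; the function names and the example distribution are illustrative, not from the paper.

```python
import math

def H(p):
    """Shannon entropy (in nats) of a distribution given as a dict value -> probability."""
    return -sum(q * math.log(q) for q in p.values() if q > 0)

def marginal(joint, coords):
    """Marginal of a joint distribution (dict mapping tuples -> prob) onto the given coordinates."""
    out = {}
    for x, q in joint.items():
        key = tuple(x[i] for i in coords)
        out[key] = out.get(key, 0.0) + q
    return out

def TC(joint, n):
    """Total correlation: sum_i H(X_i) - H(X_1,...,X_n)."""
    return sum(H(marginal(joint, [i])) for i in range(n)) - H(joint)

def DTC(joint, n):
    """Dual total correlation: H(X_1,...,X_n) - sum_i H(X_i | X_{-i}),
    using H(X_i | X_{-i}) = H(X_1,...,X_n) - H(X_{-i})."""
    joint_H = H(joint)
    return joint_H - sum(
        joint_H - H(marginal(joint, [j for j in range(n) if j != i]))
        for i in range(n)
    )

# Two correlated bits (n = 2): TC and DTC should both equal I(X1; X2).
joint = {(0, 0): 0.4, (1, 1): 0.4, (0, 1): 0.1, (1, 0): 0.1}
I = H(marginal(joint, [0])) + H(marginal(joint, [1])) - H(joint)
assert abs(TC(joint, 2) - I) < 1e-12
assert abs(DTC(joint, 2) - I) < 1e-12
```

For $n = 2$ the two definitions collapse algebraically to the same expression, $H(X_1) + H(X_2) - H(X_1, X_2)$, which is exactly $I(X_1; X_2)$; for $n \ge 3$ they genuinely differ.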
DOI : 10.14736/kyb-2020-3-0459
Classification : 60B99, 60G99, 62B10, 94A17
Keywords: total correlation; dual total correlation; transportation inequalities; mixtures of products
@article{10_14736_kyb_2020_3_0459,
     author = {Austin, Tim},
     title = {Multi-variate correlation and mixtures of product measures},
     journal = {Kybernetika},
     pages = {459--499},
     year = {2020},
     volume = {56},
     number = {3},
     doi = {10.14736/kyb-2020-3-0459},
     mrnumber = {4131739},
     zbl = {07250733},
     language = {en},
     url = {http://geodesic.mathdoc.fr/articles/10.14736/kyb-2020-3-0459/}
}
Austin, Tim. Multi-variate correlation and mixtures of product measures. Kybernetika, Tome 56 (2020) no. 3, pp. 459-499. doi: 10.14736/kyb-2020-3-0459

[1] Abdallah, S. A., Plumbley, M. D.: Predictive Information, Multi-Information and Binding Information. Technical Report C4DM-TR-10-10, Queen Mary University of London, 2010.

[2] Ahlswede, R.: An elementary proof of the strong converse theorem for the multiple-access channel. J. Combin. Inform. System Sci. 7 (1982), 3, 216-230.

[3] Ahlswede, R.: The rate-distortion region for multiple descriptions without excess rate. IEEE Trans. Inform. Theory 31 (1985), 6, 721-726.

[4] Austin, T.: The structure of low-complexity Gibbs measures on product spaces. Ann. Probab. 47 (2019), 6, 4002-4023.

[5] Austin, T.: Measure concentration and the weak Pinsker property. Publ. Math. Inst. Hautes Études Sci. 128 (2018), 1-119.

[6] Ay, N., Olbrich, E., Bertschinger, N., Jost, J.: A unifying framework for complexity measures of finite systems. Working Paper 06-08-028, Santa Fe Institute, 2006.

[7] Balister, P., Bollobás, B.: Projections, entropy and sumsets. Combinatorica 32 (2012), 2, 125-141.

[8] Chatterjee, S., Dembo, A.: Nonlinear large deviations. Adv. Math. 299 (2016), 396-450.

[9] Chung, F. R. K., Graham, R. L., Frankl, P., Shearer, J. B.: Some intersection theorems for ordered sets and graphs. J. Combin. Theory Ser. A 43 (1986), 1, 23-37.

[10] Coja-Oghlan, A., Krzakala, F., Perkins, W., Zdeborová, L.: Information-theoretic thresholds from the cavity method. Adv. Math. 333 (2018), 694-795.

[11] Coja-Oghlan, A., Perkins, W.: Bethe states of random factor graphs. Comm. Math. Phys. 366 (2019), 1, 173-201.

[12] Cover, T. M., Thomas, J. A.: Elements of Information Theory. Second edition. Wiley-Interscience, John Wiley and Sons, Hoboken, NJ 2006.

[13] Crooks, G.: On Measures of Entropy and Information. Technical note.

[14] Csiszár, I.: Sanov property, generalized $I$-projection and a conditional limit theorem. Ann. Probab. 12 (1984), 3, 768-793.

[15] Csiszár, I., Narayan, P.: Secrecy capacities for multiple terminals. IEEE Trans. Inform. Theory 50 (2004), 12, 3047-3061.

[16] Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Second edition. Springer-Verlag, Stochastic Modelling and Applied Probability 38, Berlin 2010.

[17] Dobrušin, R. L.: A general formulation of Shannon's fundamental theorem in the theory of information. Dokl. Akad. Nauk SSSR 126 (1959), 474-477.

[18] Dougherty, R., Freiling, C., Zeger, K.: Networks, matroids, and non-Shannon information inequalities. IEEE Trans. Inform. Theory 53 (2007), 6, 1949-1969.

[19] Dudley, R. M.: Real Analysis and Probability. Cambridge University Press, Cambridge Studies in Advanced Mathematics 74, Cambridge 2002.

[20] Dueck, G.: The strong converse of the coding theorem for the multiple-access channel. J. Combin. Inform. System Sci. 6 (1981), 3, 187-196.

[21] Eldan, R.: Gaussian-width gradient complexity, reverse log-Sobolev inequalities and nonlinear large deviations. Geometr. Funct. Anal. 28 (2018), 6, 1548-1596.

[22] Eldan, R., Gross, R.: Exponential random graphs behave like mixtures of stochastic block models. Preprint, available on arXiv.

[23] Eldan, R., Gross, R.: Decomposition of mean-field Gibbs distributions into product measures. Electron. J. Probab. 23 (2018), 35, 24.

[24] Ellis, D., Friedgut, E., Kindler, G., Yehudayoff, A.: Geometric stability via information theory. Discrete Anal. 10 (2016), 28 pp.

[25] Fritz, T., Chaves, R.: Entropic inequalities and marginal problems. IEEE Trans. Inform. Theory 59 (2013), 2, 803-817.

[26] Fujishige, S.: Polymatroidal dependence structure of a set of random variables. Inform. Control 39 (1978), 1, 55-72.

[27] Gelfand, I., Kolmogorov, A., Yaglom, I.: On the general definition of the quantity of information. Doklady Akad. Nauk SSSR 111 (1956), 4, 745-748.

[28] Han, T. S.: Linear dependence structure of the entropy space. Inform. Control 29 (1975), 4, 337-368.

[29] Han, T. S.: Nonnegative entropy measures of multivariate symmetric correlations. Inform. Control 36 (1978), 2, 133-156.

[30] Kolmogorov, A. N.: On the Shannon theory of information transmission in the case of continuous signals. IEEE Trans. Inform. Theory IT-2 (1956), 102-108.

[31] Ledoux, M.: On Talagrand's deviation inequalities for product measures. ESAIM Probab. Statist. 1 (1995/97), 63-87.

[32] Madiman, M., Tetali, P.: Information inequalities for joint distributions, with interpretations and applications. IEEE Trans. Inform. Theory 56 (2010), 6, 2699-2713.

[33] Makarychev, K., Makarychev, Y., Romashchenko, A., Vereshchagin, N.: A new class of non-Shannon-type inequalities for entropies. Commun. Inf. Syst. 2 (2002), 2, 147-165.

[34] Marton, K.: A simple proof of the blowing-up lemma. IEEE Trans. Inform. Theory 32 (1986), 3, 445-446.

[35] Marton, K.: Bounding $\overline{d}$-distance by informational divergence: a method to prove measure concentration. Ann. Probab. 24 (1996), 2, 857-866.

[36] Matúš, F.: Two constructions on limits of entropy functions. IEEE Trans. Inform. Theory 53 (2007), 1, 320-330.

[37] McDiarmid, C.: On the method of bounded differences. In: Surveys in Combinatorics, Norwich 1989, London Math. Soc. Lecture Note Ser. 141, Cambridge Univ. Press, Cambridge 1989, pp. 148-188.

[38] McGill, W. J.: Multivariate information transmission. Trans. I.R.E. PGIT-4 (1954), 93-111.

[39] Pearl, J., Paz, A.: Graphoids: a graph-based logic for reasoning about relevance relations. In: Advances in Artificial Intelligence - II (B. Du Boulay, D. Hogg, and L. Steels, eds.), North Holland, Amsterdam 1987, pp. 357-363.

[40] Perez, A.: Information theory with an abstract alphabet. Generalized forms of McMillan's limit theorem for the case of discrete and continuous times. Theory Probab. Appl. 4 (1959), 99-102.

[41] Perez, A.: $\epsilon$-admissible simplifications of the dependence structure of a set of random variables. Kybernetika 13 (1977), 6, 439-449.

[42] Pinsker, M. S.: Information and Information Stability of Random Variables and Processes. Holden-Day, Inc., San Francisco 1964.

[43] Radhakrishnan, J.: Entropy and counting. In: Computational Mathematics, Modelling and Applications (IIT Kharagpur, Golden Jubilee Volume) (J. Mishra, ed.), Narosa Publishers, 2001, pp. 146-168.

[44] Schneidman, E., Still, S., Berry, M. J., Bialek, W.: Network information and connected correlations. Phys. Rev. Lett. 91 (2003), 238701.

[45] Studený, M., Vejnarová, J.: The multiinformation function as a tool for measuring stochastic dependence. In: Proc. NATO Advanced Study Institute on Learning in Graphical Models, Kluwer Academic Publishers, Norwell 1998, pp. 261-297.

[46] Timme, N., Alford, W., Flecker, B., Beggs, J. M.: Synergy, redundancy, and multivariate information measures: An experimentalist's perspective. J. Comput. Neurosci. 36 (2014), 2, 119-140.

[47] Watanabe, S.: Information theoretical analysis of multivariate correlation. IBM J. Res. Develop. 4 (1960), 66-82.

[48] Yan, J.: Nonlinear large deviations: beyond the hypercube. Preprint, available on arXiv.

[49] Zhang, Z., Yeung, R. W.: On characterization of entropy function via information inequalities. IEEE Trans. Inform. Theory 44 (1998), 4, 1440-1452.
