Convergence of the Algorithm of Additive Regularization of Topic Models
Trudy Instituta Matematiki i Mekhaniki UrO RAN, Vol. 26 (2020), no. 3, pp. 56-68. This article was harvested from Math-Net.Ru.

The problem of probabilistic topic modeling is as follows: given a collection of text documents, find the conditional distribution over topics for each document and the conditional distribution over words (or terms) for each topic. Log-likelihood maximization is used to solve this problem. In general, the problem has infinitely many solutions and is ill-posed in the sense of Hadamard. In the framework of Additive Regularization of Topic Models (ARTM), a weighted sum of regularization criteria is added to the main log-likelihood criterion. The numerical method for solving this optimization problem is a variant of the iterative EM algorithm, written in a general form for an arbitrary smooth regularizer as well as for a linear combination of smooth regularizers. This paper studies the convergence of this iterative EM process. Sufficient conditions are obtained for convergence to a stationary point of the regularized log-likelihood. The constraints imposed on the regularizer are not overly restrictive, and we interpret them from the standpoint of the practical implementation of the algorithm. A modification of the algorithm is proposed that improves convergence at no additional cost in time or memory. Experiments on a news text collection show that the modification both accelerates convergence and improves the value of the optimized criterion.
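In symbols, the optimization problem the abstract refers to can be stated as follows; this is the standard ARTM formulation from the literature, with notation introduced here for illustration ($n_{dw}$ is the count of word $w$ in document $d$, $\phi_{wt} = p(w \mid t)$, $\theta_{td} = p(t \mid d)$):

\sum_{d \in D} \sum_{w \in d} n_{dw} \ln \sum_{t \in T} \phi_{wt}\,\theta_{td} \;+\; R(\Phi, \Theta) \;\to\; \max_{\Phi,\,\Theta}, \qquad R(\Phi, \Theta) = \sum_{i} \tau_i R_i(\Phi, \Theta),

subject to $\phi_{wt} \ge 0$, $\sum_{w} \phi_{wt} = 1$ and $\theta_{td} \ge 0$, $\sum_{t} \theta_{td} = 1$, where the $R_i$ are smooth regularizers with weights $\tau_i$; the choice $R \equiv 0$ recovers PLSA.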
Keywords: natural language processing, probabilistic topic modeling, probabilistic latent semantic analysis (PLSA), latent Dirichlet allocation (LDA), additive regularization of topic models (ARTM), EM algorithm, sufficient conditions for convergence.
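To make the algorithm concrete, below is a minimal runnable sketch (Python with NumPy) of the regularized EM iterations described in the abstract, following the standard update formulas from the ARTM literature: the E-step computes expected topic counts under the current model, and the M-step adds the terms phi * dR/dphi and theta * dR/dtheta before truncating negatives and renormalizing. All names here (norm, artm_em, dR_dphi, dR_dtheta) are illustrative and not taken from the paper.

import numpy as np

def norm(x, axis):
    # ARTM-style normalization: truncate negatives to zero, then
    # normalize along `axis` so each slice sums to one.
    x = np.maximum(x, 0.0)
    s = x.sum(axis=axis, keepdims=True)
    s[s == 0.0] = 1.0          # guard against empty topics/documents
    return x / s

def artm_em(n_dw, T, dR_dphi=None, dR_dtheta=None, n_iter=50, seed=0):
    # n_dw: (D, W) matrix of word counts; T: number of topics.
    # dR_dphi(phi, theta) -> (W, T) gradient of the regularizer R in phi;
    # dR_dtheta(phi, theta) -> (T, D) gradient of R in theta.
    rng = np.random.default_rng(seed)
    D, W = n_dw.shape
    phi = norm(rng.random((W, T)), axis=0)    # p(w|t), columns sum to 1
    theta = norm(rng.random((T, D)), axis=0)  # p(t|d), columns sum to 1
    for _ in range(n_iter):
        # E-step, folded into expected counts without storing p(t|d,w):
        Z = phi @ theta                       # (W, D): p(w|d)
        Z = np.where(Z == 0.0, 1.0, Z)
        q = n_dw.T / Z                        # (W, D): n_dw / p(w|d)
        n_wt = phi * (q @ theta.T)            # (W, T) expected counts
        n_td = theta * (phi.T @ q)            # (T, D) expected counts
        # M-step with regularization: add phi * dR/dphi and
        # theta * dR/dtheta, then truncate and renormalize.
        if dR_dphi is not None:
            n_wt = n_wt + phi * dR_dphi(phi, theta)
        if dR_dtheta is not None:
            n_td = n_td + theta * dR_dtheta(phi, theta)
        phi = norm(n_wt, axis=0)
        theta = norm(n_td, axis=0)
    return phi, theta

For instance, passing dR_dphi = lambda phi, theta: beta / np.maximum(phi, 1e-12) with a constant beta > 0 adds beta to every expected count n_wt, which corresponds to LDA-style Dirichlet smoothing of the word-topic distributions. This sketch illustrates the update scheme under the stated assumptions; it is not the authors' implementation.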
@article{TIMM_2020_26_3_a5,
     author = {I. A. Irkhin and K. V. Vorontsov},
     title = {Convergence of the {Algorithm} of {Additive} {Regularization} of {Topic} {Models}},
     journal = {Trudy Instituta matematiki i mehaniki},
     pages = {56--68},
     year = {2020},
     volume = {26},
     number = {3},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/TIMM_2020_26_3_a5/}
}
