Regularization of multilingual topic models
Numerical methods and programming, Volume 16 (2015) no. 1, pp. 26-38.

See the article record from the source Math-Net.Ru

A multilingual probabilistic topic model based on additive regularization (ARTM) is proposed; it allows a parallel or comparable corpus to be combined with a bilingual translation dictionary. Two approaches to incorporating information from a bilingual dictionary are discussed: the first takes into account only the existence of a link between a word and its translations, whereas the second learns translation probabilities for each topic. To measure the quality of the proposed multilingual topic model, a cross-language search is performed: for each query document in one language, its translation in the other language is retrieved. It is shown that combining word translations from a bilingual dictionary with linked documents improves cross-language search compared to models that use only one source of information. Learning the word translation probabilities for the bilingual dictionary improves the quality of the model and makes it possible to determine, for each pair of word translations, a context (a set of topics) in which these translations are appropriate.
Keywords: multilingual topic model, probabilistic topic model, parallel corpus, comparable corpus, bilingual dictionary, regularization, cross-language search.
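The cross-language search evaluation described in the abstract can be illustrated with a minimal sketch. It assumes that each document is already represented by a topic distribution, that document i in one language is the translation of document i in the other, and that candidates are ranked by cosine similarity. The abstract does not name the similarity measure, so cosine similarity, the function names, and the toy vectors below are all assumptions made for illustration, not the article's actual procedure.

```python
import math

def cosine(u, v):
    """Cosine similarity between two topic vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def top1_accuracy(theta_src, theta_tgt):
    """Share of query documents whose nearest target document is their translation.

    theta_src[i] and theta_tgt[i] are the topic distributions of a document
    and of its translation (hypothetical input layout).
    """
    hits = 0
    for i, query in enumerate(theta_src):
        # Index of the target document most similar to the query.
        best = max(range(len(theta_tgt)), key=lambda j: cosine(query, theta_tgt[j]))
        hits += (best == i)
    return hits / len(theta_src)

# Toy topic distributions over three topics (made-up numbers).
theta_ru = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.2, 0.2, 0.6]]
theta_en = [[0.7, 0.2, 0.1], [0.2, 0.7, 0.1], [0.1, 0.3, 0.6]]
accuracy = top1_accuracy(theta_ru, theta_en)
```

A model that aligns topics well across languages should place each translation at or near the top of the ranking, so a higher value of this score suggests better cross-language search quality under these assumptions.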
@article{VMP_2015_16_1_a3,
     author = {M. A. Dudarenko},
     title = {Regularization of multilingual topic models},
     journal = {Numerical methods and programming},
     pages = {26--38},
     publisher = {mathdoc},
     volume = {16},
     number = {1},
     year = {2015},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/VMP_2015_16_1_a3/}
}
TY  - JOUR
AU  - M. A. Dudarenko
TI  - Regularization of multilingual topic models
JO  - Numerical methods and programming
PY  - 2015
SP  - 26
EP  - 38
VL  - 16
IS  - 1
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/VMP_2015_16_1_a3/
LA  - ru
ID  - VMP_2015_16_1_a3
ER  - 
%0 Journal Article
%A M. A. Dudarenko
%T Regularization of multilingual topic models
%J Numerical methods and programming
%D 2015
%P 26-38
%V 16
%N 1
%I mathdoc
%U http://geodesic.mathdoc.fr/item/VMP_2015_16_1_a3/
%G ru
%F VMP_2015_16_1_a3
M. A. Dudarenko. Regularization of multilingual topic models. Numerical methods and programming, Volume 16 (2015) no. 1, pp. 26-38. http://geodesic.mathdoc.fr/item/VMP_2015_16_1_a3/