Topic-Sensitive Multi-document Summarization Algorithm
Computer Science and Information Systems, Volume 12 (2015) no. 4.

See the article record from the source Computer Science and Information Systems website

Latent Dirichlet Allocation (LDA) has recently been used to extract topics from text corpora. However, not all of the estimated topics are equally important or correspond to genuine themes of the domain. Some topics may be collections of irrelevant words or represent insignificant themes. This paper proposes a topic-sensitive algorithm for multi-document summarization. The algorithm uses an LDA model and a weighted linear combination strategy to identify significant topics, which are then used to calculate sentence weights. Each topic is measured by three different LDA criteria, and topic significance is evaluated by combining these criteria through a weighted linear combination. In addition to topic features, the proposed approach also considers statistical features such as term frequency, sentence position, and sentence length. It thus exploits the advantages of statistical features while working together with the topic model. Experiments show that the proposed algorithm outperforms other state-of-the-art algorithms on the DUC2002 corpus.
Keywords: multi-document summarization, LDA, topic model, weighted linear combination
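The scoring idea described in the abstract can be illustrated with a minimal sketch. The abstract does not name the three LDA criteria, the weights, or the exact sentence-weight formula, so everything below is an assumption made for illustration: the criterion scores are placeholder numbers, min-max normalization is one possible way to make the criteria comparable, and the `alpha` parameter that mixes the topic part with the statistical part is hypothetical.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def topic_significance(criteria: Sequence[Sequence[float]],
                       weights: Sequence[float]) -> List[float]:
    """Weighted linear combination of per-topic criterion scores.

    criteria[c][k] is the score of topic k under criterion c.
    The criteria themselves are placeholders; the abstract only says
    that three LDA-based criteria are used.
    """
    normalized = [min_max(row) for row in criteria]
    n_topics = len(criteria[0])
    return [sum(w * row[k] for w, row in zip(weights, normalized))
            for k in range(n_topics)]


def sentence_weight(topic_mix: Sequence[float],
                    significance: Sequence[float],
                    stat_features: Sequence[float],
                    stat_weights: Sequence[float],
                    alpha: float = 0.5) -> float:
    """Combine a topic-based score with statistical features.

    topic_mix[k] is the sentence's proportion of topic k.
    stat_features holds values such as term frequency, sentence
    position, and sentence length, assumed to be scaled to [0, 1].
    alpha is a hypothetical mixing parameter, not taken from the paper.
    """
    topic_part = sum(p * s for p, s in zip(topic_mix, significance))
    stat_part = sum(w * f for w, f in zip(stat_weights, stat_features))
    return alpha * topic_part + (1 - alpha) * stat_part


# Placeholder scores for three topics under three assumed criteria.
criteria = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0], [0.0, 0.0, 4.0]]
significance = topic_significance(criteria, weights=[0.5, 0.3, 0.2])
score = sentence_weight([0.2, 0.3, 0.5], significance,
                        stat_features=[0.5, 1.0], stat_weights=[0.5, 0.5])
```

In this sketch a sentence scores higher when its topic mixture concentrates on topics with high significance, and the statistical features contribute independently of the topic model; sentences would then be ranked by `score` to select summary content.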
@article{CSIS_2015_12_4_a15,
     author = {Liu Na and Di Tang and Lu Ying and Tang Xiao-jun and Wang Hai-wen},
     title = {Topic-Sensitive {Multi-document} {Summarization} {Algorithm}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {12},
     number = {4},
     year = {2015},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2015_12_4_a15/}
}
TY  - JOUR
AU  - Liu Na
AU  - Di Tang
AU  - Lu Ying
AU  - Tang Xiao-jun
AU  - Wang Hai-wen
TI  - Topic-Sensitive Multi-document Summarization Algorithm
JO  - Computer Science and Information Systems
PY  - 2015
VL  - 12
IS  - 4
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2015_12_4_a15/
ID  - CSIS_2015_12_4_a15
ER  - 
%0 Journal Article
%A Liu Na
%A Di Tang
%A Lu Ying
%A Tang Xiao-jun
%A Wang Hai-wen
%T Topic-Sensitive Multi-document Summarization Algorithm
%J Computer Science and Information Systems
%D 2015
%V 12
%N 4
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2015_12_4_a15/
%F CSIS_2015_12_4_a15
Liu Na; Di Tang; Lu Ying; Tang Xiao-jun; Wang Hai-wen. Topic-Sensitive Multi-document Summarization Algorithm. Computer Science and Information Systems, Volume 12 (2015) no. 4. http://geodesic.mathdoc.fr/item/CSIS_2015_12_4_a15/