Simple essential improvements to the ROUGE-W algorithm
Žurnal Sibirskogo federalʹnogo universiteta. Matematika i fizika, Volume 8 (2015) no. 4, pp. 497-501.

See the article record from the source Math-Net.Ru

The ROUGE-W algorithm for calculating the similarity of texts has been cited in more than 500 scientific publications since 2004. The power of the algorithm depends on the choice of the weight function. The optimal selection of the weight function is studied. The weight functions used previously are far from optimal. An example of incorrect output of the algorithm is provided. Simple changes that ensure the expected result are described.
Keywords: ROUGE-W, sequence alignment, longest common subsequence, edit distance, string similarity, optimization, complexity bounds.
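
For orientation, the role of the weight function mentioned in the abstract can be illustrated with a minimal sketch of the weighted longest common subsequence recurrence as it is commonly stated for the original ROUGE-W (Lin, 2004). This is not the improved algorithm of the article: the function name `wlcs_score` and the default weight function f(k) = k^2 are assumptions chosen for illustration only.

```python
from typing import Callable, Sequence


def wlcs_score(x: Sequence, y: Sequence,
               f: Callable[[int], float] = lambda k: k * k) -> float:
    """Weighted LCS score using the recurrence usually given for ROUGE-W.

    Hypothetical helper for illustration; f is the weight function and
    f(k) = k^2 is only an assumed default, not a recommended choice.
    """
    m, n = len(x), len(y)
    # c[i][j]: weighted score for the prefixes x[:i] and y[:j]
    # w[i][j]: length of the run of consecutive matches ending at (i, j)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    w = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                k = w[i - 1][j - 1]
                # Extending a run of length k to k + 1 adds f(k + 1) - f(k)
                c[i][j] = c[i - 1][j - 1] + f(k + 1) - f(k)
                w[i][j] = k + 1
            else:
                # No match: carry the better neighbour, the run length stays 0
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]


# Both candidates share an ordinary LCS of length 4 with the reference,
# but only the first one matches it as a single consecutive run.
print(wlcs_score("ABCDEFG", "ABCDHIK"))  # 16
print(wlcs_score("ABCDEFG", "AHBKCID"))  # 4
```

With a linear weight function both candidates would receive the same score, which is why the result depends so strongly on the choice of f.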
@article{JSFU_2015_8_4_a12,
     author = {Sergej V. Znamenskij},
     title = {Simple essential improvements to the {ROUGE-W} algorithm},
     journal = {\v{Z}urnal Sibirskogo federalʹnogo universiteta. Matematika i fizika},
     pages = {497--501},
     publisher = {mathdoc},
     volume = {8},
     number = {4},
     year = {2015},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/JSFU_2015_8_4_a12/}
}
Sergej V. Znamenskij. Simple essential improvements to the ROUGE-W algorithm. Žurnal Sibirskogo federalʹnogo universiteta. Matematika i fizika, Volume 8 (2015) no. 4, pp. 497-501. http://geodesic.mathdoc.fr/item/JSFU_2015_8_4_a12/
