Algorithm for Document Authorship Identification and Plagiarism Evaluation Based on Generalized Suffix Tree
Review of the National Center for Digitization, Volume 37 (2020) no. 1
This article was harvested from the source eLibrary of Mathematical Institute of the Serbian Academy of Sciences and Arts
Identifying the author of an anonymous text document is an important problem when dealing with historical data. Because authors have their own characteristic writing styles, expressed through specific phrases, sentence constructions, or word choices, their documents carry that style and create an implicit connection with the author. This paper proposes an approach for identifying the authors of anonymous documents, based on a generalized suffix tree data structure and a defined similarity score, suitable for the analysis of digitized historical text documents. The proposed method can also be used for detecting and evaluating plagiarism, where the document author is known but the document shows a high similarity to documents by another author.
@article{NCD_2020_37_1_a4,
author = {Aleksandar Veljkovi\'c},
title = {Algorithm for {Document} {Authorship} {Identification} and {Plagiarism} {Evaluation} {Based} on {Generalized} {Suffix} {Tree}},
journal = {Review of the National Center for Digitization},
pages = {46--51},
year = {2020},
volume = {37},
number = {1},
url = {http://geodesic.mathdoc.fr/item/NCD_2020_37_1_a4/}
}
TY  - JOUR
AU  - Aleksandar Veljković
TI  - Algorithm for Document Authorship Identification and Plagiarism Evaluation Based on Generalized Suffix Tree
JO  - Review of the National Center for Digitization
PY  - 2020
SP  - 46
EP  - 51
VL  - 37
IS  - 1
UR  - http://geodesic.mathdoc.fr/item/NCD_2020_37_1_a4/
ID  - NCD_2020_37_1_a4
ER  -
%0 Journal Article
%A Aleksandar Veljković
%T Algorithm for Document Authorship Identification and Plagiarism Evaluation Based on Generalized Suffix Tree
%J Review of the National Center for Digitization
%D 2020
%P 46-51
%V 37
%N 1
%U http://geodesic.mathdoc.fr/item/NCD_2020_37_1_a4/
%F NCD_2020_37_1_a4
Aleksandar Veljković. Algorithm for Document Authorship Identification and Plagiarism Evaluation Based on Generalized Suffix Tree. Review of the National Center for Digitization, Volume 37 (2020) no. 1. http://geodesic.mathdoc.fr/item/NCD_2020_37_1_a4/