Automatic text classification in the system of concepts lexical ontology
Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki (Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki), Vol. 152 (2010), no. 1, pp. 255-267. This article was harvested from the source Math-Net.Ru.

In this work we consider the problem of semantic text indexing from the standpoint of text classification with lexical ontology units. We propose using this approach for object identification and for improving the quality of information retrieval. We formulate the basic differences between classification with lexical ontology units and both the text indexing task and the word sense disambiguation (WSD) task. A new classification method, named OntoKlass, is proposed. A formal statement of the problem is given, and an engineering implementation of the method has been carried out.
Keywords: artificial intelligence, computational linguistics, lexical ontology, text mining, text classification.
@article{UZKU_2010_152_1_a23,
     author = {S. I. Danchenkov and V. N. Polyakov},
     title = {Automatic text classification in the system of concepts lexical ontology},
     journal = {U\v{c}\"enye zapiski Kazanskogo universiteta. Seri\^a Fiziko-matemati\v{c}eskie nauki},
     pages = {255--267},
     year = {2010},
     volume = {152},
     number = {1},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/UZKU_2010_152_1_a23/}
}
S. I. Danchenkov; V. N. Polyakov. Automatic text classification in the system of concepts lexical ontology. Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki, Vol. 152 (2010), no. 1, pp. 255-267. http://geodesic.mathdoc.fr/item/UZKU_2010_152_1_a23/
