Topic categorization based on collectives of term weighting methods for natural language call routing

Roman B. Sergienko; Muhammad Shan; Wolfgang Minker; Eugene S. Semenkin

Žurnal Sibirskogo federalʹnogo universiteta. Matematika i fizika, Volume 9 (2016) no. 2, pp. 235-245

See the article record from the source Math-Net.Ru

Abstract

Natural language call routing is an important data analysis problem with applications in many domains, including the airspace industry. This paper investigates collectives of term weighting methods for natural language call routing based on text classification. The main idea is that a collective of different term weighting methods can improve classification effectiveness while using the same classification algorithm. Seven unsupervised and supervised term weighting methods were tested and compared with each other for classification with k-NN. Different combinations of term weighting methods were then formed as collectives. Two approaches to handling the collectives were considered: a meta-classifier based on rule induction and a majority vote procedure. The numerical experiments show that the best result is obtained with the vote of all seven term weighting methods. This combination significantly increases classification effectiveness in comparison with the most effective individual term weighting methods.

Keywords: natural language call routing, term weighting, text classification.
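
The majority vote procedure named in the abstract can be illustrated with a minimal sketch. It assumes that each of the seven term weighting methods feeds its own k-NN classifier and that each classifier outputs one class label per utterance. The function name `majority_vote`, the class labels, and the tie-breaking rule are hypothetical choices made for illustration; the abstract does not specify them.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the largest number of classifiers.

    `predictions` holds one predicted class label per classifier.
    Ties go to the tied label that appears first in the list; this
    tie rule is an assumption of the sketch, not taken from the paper.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    best = max(counts.values())
    # Walk the predictions in order so the tie rule is explicit.
    for label in predictions:
        if counts[label] == best:
            return label


# Hypothetical labels from seven classifiers for a single utterance,
# one classifier per term weighting method.
votes = ["billing", "billing", "operator", "billing",
         "operator", "billing", "tech_support"]
routed_to = majority_vote(votes)
print(routed_to)
```

The vote only needs the predicted labels, so the same function applies to any subset of the seven methods when smaller collectives are compared.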

@article{JSFU_2016_9_2_a12,
     author = {Roman B. Sergienko and Muhammad Shan and Wolfgang Minker and Eugene S. Semenkin},
     title = {Topic categorization based on collectives of term weighting methods for natural language call routing},
     journal = {\v{Z}urnal Sibirskogo federalʹnogo universiteta. Matematika i fizika},
     pages = {235--245},
     publisher = {mathdoc},
     volume = {9},
     number = {2},
     year = {2016},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/JSFU_2016_9_2_a12/}
}

TY  - JOUR
AU  - Roman B. Sergienko
AU  - Muhammad Shan
AU  - Wolfgang Minker
AU  - Eugene S. Semenkin
TI  - Topic categorization based on collectives of term weighting methods for natural language call routing
JO  - Žurnal Sibirskogo federalʹnogo universiteta. Matematika i fizika
PY  - 2016
SP  - 235
EP  - 245
VL  - 9
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/JSFU_2016_9_2_a12/
LA  - en
ID  - JSFU_2016_9_2_a12
ER  -

%0 Journal Article
%A Roman B. Sergienko
%A Muhammad Shan
%A Wolfgang Minker
%A Eugene S. Semenkin
%T Topic categorization based on collectives of term weighting methods for natural language call routing
%J Žurnal Sibirskogo federalʹnogo universiteta. Matematika i fizika
%D 2016
%P 235-245
%V 9
%N 2
%I mathdoc
%U http://geodesic.mathdoc.fr/item/JSFU_2016_9_2_a12/
%G en
%F JSFU_2016_9_2_a12

Roman B. Sergienko; Muhammad Shan; Wolfgang Minker; Eugene S. Semenkin. Topic categorization based on collectives of term weighting methods for natural language call routing. Žurnal Sibirskogo federalʹnogo universiteta. Matematika i fizika, Volume 9 (2016) no. 2, pp. 235-245. http://geodesic.mathdoc.fr/item/JSFU_2016_9_2_a12/
