Analysis of machine learning models by solving the text data classification problem
Journal of computational and engineering mathematics, Volume 8 (2021), no. 2, pp. 33-45
This article was harvested from the source Math-Net.Ru
This article studies the use of machine learning models for text classification, using as an example the classification of technical support requests submitted through a mobile application's chat bot. The following methods were considered: the Naive Bayes classifier, the K-Nearest Neighbors (KNN) algorithm, Decision Tree, Random Forest, Support Vector Machines (SVM), and Logistic Regression, together with 21 models based on these methods. The best models for classifying text requests to the technical support chat bot were one based on Logistic Regression and one based on the Random Forest Classifier. Among the trained models, the Complement Naive Bayes model from the Naive Bayes group had the shortest tuning time while retaining acceptable accuracy. The proposed methodology can be used to analyze and classify text data.
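To make the Complement Naive Bayes idea mentioned in the abstract more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the class name `ComplementNaiveBayes`, the `tokenize` helper, the whitespace tokenization, and the six toy support requests with their labels are all assumptions made up for illustration. The sketch scores each class by how well the word counts of all *other* classes explain the request, then picks the class whose complement fits worst.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


class ComplementNaiveBayes:
    """Toy Complement Naive Bayes over bag-of-words counts (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing constant

    def fit(self, texts, labels):
        self.classes_ = sorted(set(labels))
        per_class = defaultdict(Counter)
        for text, label in zip(texts, labels):
            per_class[label].update(tokenize(text))
        total = Counter()
        for counts in per_class.values():
            total.update(counts)
        vocab = sorted(total)
        self.weights_ = {}
        for c in self.classes_:
            # Word counts from every class except c (the complement of c).
            comp = {w: total[w] - per_class[c][w] for w in vocab}
            denom = sum(comp.values()) + self.alpha * len(vocab)
            self.weights_[c] = {
                w: math.log((comp[w] + self.alpha) / denom) for w in vocab
            }
        return self

    def predict(self, text):
        counts = Counter(tokenize(text))

        def complement_score(c):
            w = self.weights_[c]
            # Words never seen in training are ignored.
            return sum(n * w[t] for t, n in counts.items() if t in w)

        # Choose the class whose complement explains the text worst.
        return min(self.classes_, key=complement_score)


# Hypothetical support requests and labels, invented for this sketch.
TRAIN = [
    ("cannot log in to my account", "login"),
    ("forgot my password and cannot log in", "login"),
    ("payment failed with my card", "billing"),
    ("charged twice for one payment", "billing"),
    ("app crashes when i open the camera", "bug"),
    ("screen freezes and the app crashes", "bug"),
]

texts, labels = zip(*TRAIN)
model = ComplementNaiveBayes().fit(texts, labels)
print(model.predict("i forgot my password"))
```

Training here is a single pass over word counts with no iterative optimization, which is consistent with the short tuning time the abstract reports for this family of models. A real study would add proper text preprocessing, a held-out test set, and comparison against the other listed methods.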
Keywords: machine learning methods, natural language, text data analysis, text classification, regression.
@article{JCEM_2021_8_2_a2,
author = {A. V. Pchelin and N. A. Kononov and V. S. Serova and E. V. Bunova and A. D. Marchenko and A. E. Shevchenko},
title = {Analysis of machine learning models by solving the text data classification problem},
journal = {Journal of computational and engineering mathematics},
pages = {33--45},
year = {2021},
volume = {8},
number = {2},
language = {en},
url = {http://geodesic.mathdoc.fr/item/JCEM_2021_8_2_a2/}
}
TY - JOUR
AU - A. V. Pchelin
AU - N. A. Kononov
AU - V. S. Serova
AU - E. V. Bunova
AU - A. D. Marchenko
AU - A. E. Shevchenko
TI - Analysis of machine learning models by solving the text data classification problem
JO - Journal of computational and engineering mathematics
PY - 2021
SP - 33
EP - 45
VL - 8
IS - 2
UR - http://geodesic.mathdoc.fr/item/JCEM_2021_8_2_a2/
LA - en
ID - JCEM_2021_8_2_a2
ER -
%0 Journal Article
%A A. V. Pchelin
%A N. A. Kononov
%A V. S. Serova
%A E. V. Bunova
%A A. D. Marchenko
%A A. E. Shevchenko
%T Analysis of machine learning models by solving the text data classification problem
%J Journal of computational and engineering mathematics
%D 2021
%P 33-45
%V 8
%N 2
%U http://geodesic.mathdoc.fr/item/JCEM_2021_8_2_a2/
%G en
%F JCEM_2021_8_2_a2
A. V. Pchelin; N. A. Kononov; V. S. Serova; E. V. Bunova; A. D. Marchenko; A. E. Shevchenko. Analysis of machine learning models by solving the text data classification problem. Journal of computational and engineering mathematics, Volume 8 (2021), no. 2, pp. 33-45. http://geodesic.mathdoc.fr/item/JCEM_2021_8_2_a2/