Analysis of the texts for predicting the churn of ISP
Čelâbinskij fiziko-matematičeskij žurnal, Volume 3 (2018) no. 2, pp. 227-236

See the article record from the source Math-Net.Ru

The possibility of forecasting customer churn from the data of a Russian ISP is considered. The main stages of, and approaches to, preprocessing the text of operators' comments are determined. Classification algorithms such as logistic regression, the $k$-nearest neighbors method, gradient boosting, and the naive Bayes algorithm are proposed. As a sample, an input data array of 23 features for 380 000 subscribers was formed. In the textual information, typos are corrected using the Damerau-Levenshtein distance and the words are lemmatized; the texts are then converted into feature vectors using the TF-IDF method and added to the model. The main approaches to encoding categorical features are determined. Forecast models are constructed. The results obtained with the different classifiers are compared and conclusions are drawn.
Keywords: prediction, customer churn, ISP, Python, customer calls, classification, text analysis, TF-IDF.
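The text-preprocessing steps named in the abstract (typo correction by Damerau-Levenshtein distance, then TF-IDF vectorization) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy English vocabulary and comments, the distance threshold, and the plain tf × log(N/df) weighting are all assumptions made for illustration, and lemmatization is omitted. The distance used is the restricted (optimal string alignment) variant of Damerau-Levenshtein.

```python
import math
from collections import Counter


def osa_distance(a: str, b: str) -> int:
    """Restricted Damerau-Levenshtein (optimal string alignment) distance."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # transposition of two adjacent characters
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


def correct(token: str, vocabulary: list, max_distance: int = 1) -> str:
    """Replace a token by the closest vocabulary word if it is near enough.

    max_distance is a hypothetical threshold, not a value from the paper.
    """
    if token in vocabulary:
        return token
    best = min(vocabulary, key=lambda w: osa_distance(token, w))
    return best if osa_distance(token, best) <= max_distance else token


def tfidf(docs: list) -> list:
    """Return one {term: weight} dict per tokenized document.

    Weight = (term count / document length) * log(N / document frequency).
    """
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


# Toy data (assumed, not from the paper's dataset).
vocabulary = ["internet", "slow", "price", "cancel", "contract"]
comments = ["intrenet slow", "cancel contrcat", "internet price"]

cleaned = [[correct(tok, vocabulary) for tok in c.split()] for c in comments]
vectors = tfidf(cleaned)
```

In this sketch a term that occurs in fewer comments receives a higher weight than one spread across many, which is the property that makes TF-IDF vectors usable as additional classifier features alongside the tabular ones.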
@article{CHFMJ_2018_3_2_a8,
     author = {A. A. Karyakina and D. S. Botov},
     title = {Analysis of the texts for predicting the churn of {ISP}},
     journal = {\v{C}el\^abinskij fiziko-matemati\v{c}eskij \v{z}urnal},
     pages = {227--236},
     publisher = {mathdoc},
     volume = {3},
     number = {2},
     year = {2018},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/CHFMJ_2018_3_2_a8/}
}
TY  - JOUR
AU  - A. A. Karyakina
AU  - D. S. Botov
TI  - Analysis of the texts for predicting the churn of ISP
JO  - Čelâbinskij fiziko-matematičeskij žurnal
PY  - 2018
SP  - 227
EP  - 236
VL  - 3
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CHFMJ_2018_3_2_a8/
LA  - ru
ID  - CHFMJ_2018_3_2_a8
ER  - 
%0 Journal Article
%A A. A. Karyakina
%A D. S. Botov
%T Analysis of the texts for predicting the churn of ISP
%J Čelâbinskij fiziko-matematičeskij žurnal
%D 2018
%P 227-236
%V 3
%N 2
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CHFMJ_2018_3_2_a8/
%G ru
%F CHFMJ_2018_3_2_a8
A. A. Karyakina; D. S. Botov. Analysis of the texts for predicting the churn of ISP. Čelâbinskij fiziko-matematičeskij žurnal, Volume 3 (2018) no. 2, pp. 227-236. http://geodesic.mathdoc.fr/item/CHFMJ_2018_3_2_a8/