SMS Sentiment Classification based on Lexical Features, Emoticons and Informal Abbreviations
Serdica Journal of Computing, Tome 13 (2019) no. 1-2, pp. 081-096
This article was harvested from the Bulgarian Digital Mathematics Library
In this paper, we investigate how emoticons, informal speech, and lexical and
other linguistic features influence the sentiment expressed in SMS messages.
Using a dataset of approximately 6,000 samples, we trained a linear SVM classifier
that distinguishes positive, negative and neutral sentiment. The dataset mostly
contains messages in Serbian, along with some in English and German. The classifier
achieved an average accuracy of 92.3% in 5-fold cross-validation, with F1-scores of
92.1%, 74.0% and 93.3% for the positive, negative and neutral classes, respectively.
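The abstract reports overall accuracy alongside a separate F1-score for each class. As a minimal sketch of how such figures are typically computed from true and predicted labels (this is not the author's code; the function names `per_class_f1` and `accuracy` and the toy labels are assumptions made for illustration), each class is scored one-vs-rest as F1 = 2TP / (2TP + FP + FN):

```python
from collections import Counter


def per_class_f1(y_true, y_pred, labels):
    """Return {label: F1}, treating each label one-vs-rest."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[guess] += 1   # predicted class gains a false positive
            fn[truth] += 1   # true class gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy labels invented for illustration; they are NOT from the paper's dataset.
y_true = ["pos", "pos", "neg", "neu", "neu", "neg"]
y_pred = ["pos", "neu", "neg", "neu", "neu", "pos"]

f1 = per_class_f1(y_true, y_pred, ["pos", "neg", "neu"])
acc = accuracy(y_true, y_pred)
```

Reporting F1 per class rather than accuracy alone makes weaker classes visible: in the abstract, the negative class scores noticeably lower (74.0%) than the other two even though overall accuracy is 92.3%.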
Keywords:
Computer Application in Arts and Humanities, Web-Based Services, Document Analysis
@article{SJC_2019_13_1-2_a5,
author = {\v{S}andrih, Branislava},
title = {SMS {Sentiment} {Classification} based on {Lexical} {Features,} {Emoticons} and {Informal} {Abbreviations}},
journal = {Serdica Journal of Computing},
pages = {081--096},
year = {2019},
volume = {13},
number = {1-2},
language = {en},
url = {http://geodesic.mathdoc.fr/item/SJC_2019_13_1-2_a5/}
}
TY  - JOUR
AU  - Šandrih, Branislava
TI  - SMS Sentiment Classification based on Lexical Features, Emoticons and Informal Abbreviations
JO  - Serdica Journal of Computing
PY  - 2019
SP  - 081
EP  - 096
VL  - 13
IS  - 1-2
UR  - http://geodesic.mathdoc.fr/item/SJC_2019_13_1-2_a5/
LA  - en
ID  - SJC_2019_13_1-2_a5
ER  -
Šandrih, Branislava. SMS Sentiment Classification based on Lexical Features, Emoticons and Informal Abbreviations. Serdica Journal of Computing, Tome 13 (2019) no. 1-2, pp. 081-096. http://geodesic.mathdoc.fr/item/SJC_2019_13_1-2_a5/