Methods of implicit aspect detection in Russian publicism sentences

A. Y. Poletaev; I. V. Paramonov; E. M. Kolupaev

Modelirovanie i analiz informacionnyh sistem, Vol. 31 (2024) no. 3, pp. 226–239

This article was harvested from the source Math-Net.Ru

Abstract

The paper compares the performance of several methods for automatic detection of implicit aspects in Russian publicism sentences. Implicit aspect detection is an auxiliary task in aspect-based sentiment analysis. The experiments were conducted on a corpus of sentences extracted from political campaign materials. The best results, with an F1-measure of up to 0.84, were obtained with Navec embeddings and classifiers based on the support vector machine. Fairly high results, with an F1-measure of up to 0.77, were obtained with the bag-of-words model and a naive Bayes classifier. The other methods performed worse. The experiments also showed that detection quality can differ significantly between aspects. It is highest for aspects associated with characteristic marker words, for example, “health care” and “holding elections”. More general aspects, such as “quality of governance”, are detected worst.

Keywords: aspect detection, implicit aspects, sentiment analysis, publicism.
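
To make the bag-of-words and naive Bayes baseline from the abstract more concrete, the following is a minimal pure-Python sketch, not the authors' implementation. The class name `NaiveBayesAspectDetector`, the helper `f1_per_aspect`, the whitespace tokenizer, and the four English training sentences are all assumptions invented for illustration; the paper itself works with Russian sentences from campaign materials. The per-aspect F1 helper shows how quality can be reported separately for each aspect.

```python
import math
from collections import Counter, defaultdict


def tokenize(sentence):
    """Lowercase and split on whitespace; a real pipeline would also lemmatize."""
    return sentence.lower().split()


class NaiveBayesAspectDetector:
    """Hypothetical multinomial naive Bayes over bag-of-words counts (add-one smoothing)."""

    def fit(self, sentences, aspects):
        self.class_counts = Counter(aspects)
        self.word_counts = defaultdict(Counter)
        for sentence, aspect in zip(sentences, aspects):
            self.word_counts[aspect].update(tokenize(sentence))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        total = sum(self.class_counts.values())
        best_aspect, best_score = None, -math.inf
        for aspect, n_docs in self.class_counts.items():
            counts = self.word_counts[aspect]
            denom = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n_docs / total)  # log prior of the aspect
            for word in tokenize(sentence):
                if word in self.vocabulary:  # words never seen in training are skipped
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_aspect, best_score = aspect, score
        return best_aspect


def f1_per_aspect(gold, predicted):
    """Return {aspect: F1} so that differences between aspects are visible."""
    pairs = list(zip(gold, predicted))
    scores = {}
    for aspect in set(gold) | set(predicted):
        tp = sum(g == aspect and p == aspect for g, p in pairs)
        fp = sum(g != aspect and p == aspect for g, p in pairs)
        fn = sum(g == aspect and p != aspect for g, p in pairs)
        scores[aspect] = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return scores


# Toy training data, invented for this sketch only.
TRAIN = [
    ("hospitals lack doctors and medicine", "health care"),
    ("the clinic in our district was closed", "health care"),
    ("ballots were counted behind closed doors", "holding elections"),
    ("observers were removed from the polling station", "holding elections"),
]

detector = NaiveBayesAspectDetector().fit(*zip(*TRAIN))
print(detector.predict("doctors left the hospitals"))
print(f1_per_aspect(["a", "a", "b"], ["a", "b", "b"]))
```

Marker words such as "doctors" or "ballots" dominate the decision in this toy setting, which is in line with the abstract's observation that aspects with characteristic marker words are the easiest to detect. The embedding-based variant would replace the word counts with averaged word vectors fed to a support vector machine; it is omitted here because it depends on external model files.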

@article{MAIS_2024_31_3_a0,
     author = {A. Y. Poletaev and I. V. Paramonov and E. M. Kolupaev},
     title = {Methods of implicit aspect detection in {Russian} publicism sentences},
     journal = {Modelirovanie i analiz informacionnyh sistem},
     pages = {226--239},
     year = {2024},
     volume = {31},
     number = {3},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/MAIS_2024_31_3_a0/}
}

TY  - JOUR
AU  - A. Y. Poletaev
AU  - I. V. Paramonov
AU  - E. M. Kolupaev
TI  - Methods of implicit aspect detection in Russian publicism sentences
JO  - Modelirovanie i analiz informacionnyh sistem
PY  - 2024
SP  - 226
EP  - 239
VL  - 31
IS  - 3
UR  - http://geodesic.mathdoc.fr/item/MAIS_2024_31_3_a0/
LA  - ru
ID  - MAIS_2024_31_3_a0
ER  -

%0 Journal Article
%A A. Y. Poletaev
%A I. V. Paramonov
%A E. M. Kolupaev
%T Methods of implicit aspect detection in Russian publicism sentences
%J Modelirovanie i analiz informacionnyh sistem
%D 2024
%P 226-239
%V 31
%N 3
%U http://geodesic.mathdoc.fr/item/MAIS_2024_31_3_a0/
%G ru
%F MAIS_2024_31_3_a0

A. Y. Poletaev; I. V. Paramonov; E. M. Kolupaev. Methods of implicit aspect detection in Russian publicism sentences. Modelirovanie i analiz informacionnyh sistem, Vol. 31 (2024) no. 3, pp. 226–239. http://geodesic.mathdoc.fr/item/MAIS_2024_31_3_a0/

References

[1] B. Liu, Sentiment Analysis and Opinion Mining, Springer, 2022, 167 pp.

[2] W. Zhang, X. Li, Y. Deng, L. Bing, W. Lam, “A survey on aspect-based sentiment analysis: Tasks, methods, and challenges”, IEEE Transactions on Knowledge and Data Engineering, 35:11 (2022), 11019–11038

[3] M. M. Trusca, F. Frasincar, “Survey on aspect detection for aspect-based sentiment analysis”, Artificial Intelligence Review, 56:5 (2023), 3797–3846

[4] A. Naumov, R. Rybka, A. Sboev, A. Selivanov, A. Gryaznov, “Neural-network method for determining text author's sentiment to an aspect specified by the named entity”, CEUR Workshop Proceedings, 2648, 2020, 134–143

[5] E. V. Sergeeva, “Features of speech exposure in the preelection media discourse”, Aktual'nye problemy gumanitarnogo znaniya v tekhnicheskom vuze, 2021, 237–239 (in Russian)

[6] A. Nazir, Y. Rao, L. Wu, L. Sun, “Issues and challenges of aspect-based sentiment analysis: A comprehensive survey”, IEEE Transactions on Affective Computing, 13:2 (2020), 845–863

[7] P. K. Soni, R. Rambola, “A survey on implicit aspect detection for sentiment analysis: Terminology, issues, and scope”, IEEE Access, 10 (2022), 63932–63957

[8] B. Mohammed et al., “Hybrid approach to extract adjectives for implicit aspect identification in opinion mining”, 11th International Conference on Intelligent Systems: Theories and Applications (SITA), IEEE, 2016, 1–5

[9] A. O. Kornej, E. N. Kryuchkova, “Semantiko-statisticheskij algoritm opredeleniya kategorij aspektov v zadachah sentiment-analiza”, Izvestiya Yuzhnogo federal'nogo universiteta. Tekhnicheskie nauki, 2020, no. 6(216), 66–74 (in Russian)

[10] E. I. Gribkov, Y. P. Ekhlakov, “Nejrosetevaya model' na osnove sistemy perekhodov dlya izvlecheniya sostavnyh ob'ektov i ih atributov iz tekstov na estestvennom yazyke”, Doklady Tomskogo gosudarstvennogo universiteta sistem upravleniya i radioelektroniki, 23:1 (2020), 47–52 (in Russian)

[11] L. Hickman, S. Thapa, L. Tay, M. Cao, P. Srinivasan, “Text preprocessing for text mining in organizational research: Review and recommendations”, Organizational Research Methods, 25:1 (2022), 114–146

[12] S. Bird, E. Klein, E. Loper, Natural language processing with Python: analyzing text with the natural language toolkit, O'Reilly Media, Inc., 2009

[13] U. Naseem, I. Razzak, P. W. Eklund, “A survey of pre-processing techniques to improve short-text quality: A case study on hate speech detection on Twitter”, Multimedia Tools and Applications, 80 (2021), 35239–35266

[14] J. Coates, D. Bollegala, “Frustratingly easy meta-embedding: Computing meta-embeddings by averaging source word embeddings”, Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, v. 2, Short Papers, Association for Computational Linguistics, 2018, 194–198

[15] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, 2013, arXiv: 1301.3781v3 [cs.CL]

[16] I. Yamada et al., “Wikipedia2Vec: An efficient toolkit for learning and visualizing the embeddings of words and entities from Wikipedia”, Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Association for Computational Linguistics, 2020, 23–30

[17] A. Joulin, E. Grave, P. Bojanowski, T. Mikolov, “Bag of tricks for efficient text classification”, Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, v. 2, Short Papers, Association for Computational Linguistics, 2017, 427–431

[18] A. Kukushkin, Navec: kompaktnye embeddingi dlya russkogo yazyka, 2020 (in Russian) (visited on 08/11/2024) https://natasha.github.io/navec/

[19] J. Pennington, R. Socher, C. D. Manning, “GloVe: Global vectors for word representation”, Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Association for Computational Linguistics, 2014, 1532–1543

[20] Q. Le, T. Mikolov, “Distributed representations of sentences and documents”, International conference on machine learning, PMLR, 2014, 1188–1196

[21] F. Pedregosa et al., “Scikit-learn: Machine learning in Python”, Journal of Machine Learning Research, 12 (2011), 2825–2830
