Neural Coreference Resolution for Slovene Language
Computer Science and Information Systems, Volume 19 (2022) no. 2.

See the article record from the source Computer Science and Information Systems website

Coreference resolution systems aim to recognize and cluster together mentions of the same underlying entity. While a large body of research exists for widely spoken languages such as English and Chinese, research on coreference in other languages is comparatively scarce. In this work we first present SentiCoref 1.0, a coreference resolution dataset for the Slovene language that is comparable to English-based corpora. Further, we conduct a series of analyses using models of varying complexity, ranging from simple linear models to current state-of-the-art deep neural coreference approaches that leverage pre-trained contextual embeddings. Apart from SentiCoref, we also evaluate the models on the smaller coref149 Slovene dataset to justify the creation of a new corpus. We investigate the robustness of the models using cross-domain data and data augmentations. Models using contextual embeddings achieve the best results, up to an average F1 score of 0.92 on the SentiCoref dataset. Cross-domain experiments indicate that SentiCoref allows the models to learn more general patterns, which enables them to outperform models trained on coref149 only.
Keywords: coreference resolution, Slovene language, neural networks, word embeddings
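To make the task in the abstract concrete, the following is a minimal sketch of what "clustering mentions of the same entity" and scoring the result can look like. It is not the authors' method: the function names `cluster_mentions` and `b_cubed_f1` are hypothetical, mentions are simplified to plain strings rather than text spans, and B-cubed is used only as one common coreference metric, since the abstract does not state which metrics make up the reported average F1.

```python
from collections import defaultdict


def cluster_mentions(mentions, coreferent_pairs):
    """Group mentions into entity clusters from pairwise links (union-find).

    Hypothetical helper: a real system would predict the pairs with a model.
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in coreferent_pairs:
        parent[find(a)] = find(b)  # merge the two clusters

    clusters = defaultdict(set)
    for m in mentions:
        clusters[find(m)].add(m)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


def b_cubed_f1(predicted, gold):
    """B-cubed F1 over mentions present in both clusterings (illustrative metric)."""
    pred_of = {m: set(c) for c in predicted for m in c}
    gold_of = {m: set(c) for c in gold for m in c}
    shared = pred_of.keys() & gold_of.keys()
    if not shared:
        return 0.0
    # Per-mention overlap between its predicted and gold cluster, averaged.
    precision = sum(len(pred_of[m] & gold_of[m]) / len(pred_of[m]) for m in shared) / len(shared)
    recall = sum(len(pred_of[m] & gold_of[m]) / len(gold_of[m]) for m in shared) / len(shared)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    mentions = ["Ana", "she", "her", "Ljubljana", "the city"]
    links = [("Ana", "she"), ("she", "her"), ("Ljubljana", "the city")]
    gold = [["Ana", "she", "her"], ["Ljubljana", "the city"]]
    predicted = cluster_mentions(mentions, links)
    print(predicted, b_cubed_f1(predicted, gold))
```

Union-find is chosen here only because it turns pairwise decisions into clusters in a few lines; mention-ranking neural models reach the same kind of output by linking each mention to an antecedent.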
@article{CSIS_2022_19_2_a2,
     author = {Matej Klemen and Slavko \v{Z}itnik},
     title = {Neural {Coreference} {Resolution} for {Slovene} {Language}},
     journal = {Computer Science and Information Systems},
     publisher = {mathdoc},
     volume = {19},
     number = {2},
     year = {2022},
     url = {http://geodesic.mathdoc.fr/item/CSIS_2022_19_2_a2/}
}
TY  - JOUR
AU  - Matej Klemen
AU  - Slavko Žitnik
TI  - Neural Coreference Resolution for Slovene Language
JO  - Computer Science and Information Systems
PY  - 2022
VL  - 19
IS  - 2
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/CSIS_2022_19_2_a2/
ID  - CSIS_2022_19_2_a2
ER  - 
%0 Journal Article
%A Matej Klemen
%A Slavko Žitnik
%T Neural Coreference Resolution for Slovene Language
%J Computer Science and Information Systems
%D 2022
%V 19
%N 2
%I mathdoc
%U http://geodesic.mathdoc.fr/item/CSIS_2022_19_2_a2/
%F CSIS_2022_19_2_a2
Matej Klemen; Slavko Žitnik. Neural Coreference Resolution for Slovene Language. Computer Science and Information Systems, Volume 19 (2022) no. 2. http://geodesic.mathdoc.fr/item/CSIS_2022_19_2_a2/