Recovering word forms by context for morphologically rich languages

A. M. Alekseev; S. I. Nikolenko

Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part I, Vol. 499 (2021), pp. 129-136

See the article record from the source Math-Net.Ru

Abstract

In this work, we focus on “sentence-level unlemmatization”, the task of generating a grammatical sentence from a lemmatized one, which humans can usually do easily. We treat this setting as a machine translation problem and, as a first attempt, apply a sequence-to-sequence model to the texts of Russian Wikipedia articles. We quantitatively evaluate the effect of training set size and achieve a BLEU score of 67.3 with the largest training set available. We discuss preliminary results and the flaws of traditional machine translation evaluation methods for this task, and suggest directions for future research.
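The setup described in the abstract can be illustrated with a small sketch. This is not the authors' code: the lemma table `TOY_LEMMAS` and the helper names `lemmatize`, `make_pair`, and `bleu` are hypothetical, the toy sentence is English rather than Russian for readability, and the BLEU variant shown (single reference, sentence level, no smoothing) is only one common formulation, which may differ from the one used in the paper.

```python
import math
from collections import Counter

# Hypothetical toy lemma table (an assumption for illustration);
# a real setup would use a morphological analyzer instead.
TOY_LEMMAS = {"cats": "cat", "sat": "sit", "mats": "mat"}


def lemmatize(tokens):
    """Map each token to its lemma, leaving unknown tokens unchanged."""
    return [TOY_LEMMAS.get(t, t) for t in tokens]


def make_pair(sentence):
    """Build one (source, target) pair: lemmatized tokens -> original tokens."""
    target = sentence.lower().split()
    return lemmatize(target), target


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(hypothesis, reference, max_n=4):
    """Sentence-level BLEU with one reference and no smoothing."""
    if not hypothesis:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp, ref = ngrams(hypothesis, n), ngrams(reference, n)
        total = sum(hyp.values())
        # Clip each hypothesis n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in hyp.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, one empty n-gram order zeroes the score
        log_precisions.append(math.log(clipped / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    if len(hypothesis) > len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(hypothesis))
    return brevity * math.exp(sum(log_precisions) / max_n)


source, target = make_pair("The cats sat on the mats")
perfect = bleu(target, target)    # model output identical to the reference
untouched = bleu(source, target)  # model output that just copies the lemmas
```

In this toy case, copying the lemmatized input shares half of its unigrams with the reference but no trigram, so the unsmoothed score collapses to zero. This is one example of how n-gram overlap can behave coarsely when the hypothesis and reference differ only in word forms.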

Export
How to cite

@article{ZNSL_2021_499_a8,
     author = {A. M. Alekseev and S. I. Nikolenko},
     title = {Recovering word forms by context for~morphologically~rich~languages},
     journal = {Zapiski Nauchnykh Seminarov POMI},
     pages = {129--136},
     publisher = {mathdoc},
     volume = {499},
     year = {2021},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/ZNSL_2021_499_a8/}
}

TY  - JOUR
AU  - A. M. Alekseev
AU  - S. I. Nikolenko
TI  - Recovering word forms by context for morphologically rich languages
JO  - Zapiski Nauchnykh Seminarov POMI
PY  - 2021
SP  - 129
EP  - 136
VL  - 499
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/ZNSL_2021_499_a8/
LA  - en
ID  - ZNSL_2021_499_a8
ER  -

%0 Journal Article
%A A. M. Alekseev
%A S. I. Nikolenko
%T Recovering word forms by context for morphologically rich languages
%J Zapiski Nauchnykh Seminarov POMI
%D 2021
%P 129-136
%V 499
%I mathdoc
%U http://geodesic.mathdoc.fr/item/ZNSL_2021_499_a8/
%G en
%F ZNSL_2021_499_a8

A. M. Alekseev; S. I. Nikolenko. Recovering word forms by context for morphologically rich languages. Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part I, Vol. 499 (2021), pp. 129-136. http://geodesic.mathdoc.fr/item/ZNSL_2021_499_a8/
