Robust word vectors: context-informed embeddings for noisy texts

T. Khakhulin; V. Logacheva; V. Malykh

Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part I, Volume 499 (2021), pp. 248-266

See the article record from the source Math-Net.Ru

Abstract

We propose a new language-independent architecture for robust word vectors (RoVe). It is designed to alleviate the problem of typos and misspellings, which are common in almost any user-generated content and hinder automatic text processing. Our model is morphologically motivated, which allows it to handle unseen word forms in morphologically rich languages. We present results on a number of natural language processing (NLP) tasks and languages for a variety of related architectures and show that the proposed architecture is robust to typos.

Export
How to cite

@article{ZNSL_2021_499_a13,
     author = {T. Khakhulin and V. Logacheva and V. Malykh},
     title = {Robust word vectors: context-informed embeddings for noisy texts},
     journal = {Zapiski Nauchnykh Seminarov POMI},
     pages = {248--266},
     publisher = {mathdoc},
     volume = {499},
     year = {2021},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/ZNSL_2021_499_a13/}
}

TY  - JOUR
AU  - T. Khakhulin
AU  - V. Logacheva
AU  - V. Malykh
TI  - Robust word vectors: context-informed embeddings for noisy texts
JO  - Zapiski Nauchnykh Seminarov POMI
PY  - 2021
SP  - 248
EP  - 266
VL  - 499
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/ZNSL_2021_499_a13/
LA  - en
ID  - ZNSL_2021_499_a13
ER  -

%0 Journal Article
%A T. Khakhulin
%A V. Logacheva
%A V. Malykh
%T Robust word vectors: context-informed embeddings for noisy texts
%J Zapiski Nauchnykh Seminarov POMI
%D 2021
%P 248-266
%V 499
%I mathdoc
%U http://geodesic.mathdoc.fr/item/ZNSL_2021_499_a13/
%G en
%F ZNSL_2021_499_a13

T. Khakhulin; V. Logacheva; V. Malykh. Robust word vectors: context-informed embeddings for noisy texts. Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part I, Volume 499 (2021), pp. 248-266. http://geodesic.mathdoc.fr/item/ZNSL_2021_499_a13/
