Wav2Vec2 without Attention: do you need Hopfield Networks for Self-Supervised Learning of Speech Representations?

D. Grebenkin; I. Bondarenko

Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part II–1, Volume 529 (2023), pp. 43-53

See the article record from the source Math-Net.Ru

Abstract

In this work, we consider the possibility of replacing multi-head attention with dense associative memory (DAM) layers in the wav2vec2 automatic speech recognition algorithm. We examine the hypothesis that modern Hopfield networks are better suited than multi-head attention to two tasks: restoring missing fragments of the audio signal and speech-to-text. Our experiments indicate that the model with the new architecture improves speech recognition quality and can be used for pretraining on large amounts of audio data.

How to cite

@article{ZNSL_2023_529_a3,
     author = {D. Grebenkin and I. Bondarenko},
     title = {Wav2Vec2 without {Attention:} do you need {Hopfield} {Networks} for {Self-Supervised} {Learning} of {Speech} {Representations?}},
     journal = {Zapiski Nauchnykh Seminarov POMI},
     pages = {43--53},
     publisher = {mathdoc},
     volume = {529},
     year = {2023},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/ZNSL_2023_529_a3/}
}

TY  - JOUR
AU  - D. Grebenkin
AU  - I. Bondarenko
TI  - Wav2Vec2 without Attention: do you need Hopfield Networks for Self-Supervised Learning of Speech Representations?
JO  - Zapiski Nauchnykh Seminarov POMI
PY  - 2023
SP  - 43
EP  - 53
VL  - 529
PB  - mathdoc
UR  - http://geodesic.mathdoc.fr/item/ZNSL_2023_529_a3/
LA  - en
ID  - ZNSL_2023_529_a3
ER  -

%0 Journal Article
%A D. Grebenkin
%A I. Bondarenko
%T Wav2Vec2 without Attention: do you need Hopfield Networks for Self-Supervised Learning of Speech Representations?
%J Zapiski Nauchnykh Seminarov POMI
%D 2023
%P 43-53
%V 529
%I mathdoc
%U http://geodesic.mathdoc.fr/item/ZNSL_2023_529_a3/
%G en
%F ZNSL_2023_529_a3

D. Grebenkin; I. Bondarenko. Wav2Vec2 without Attention: do you need Hopfield Networks for Self-Supervised Learning of Speech Representations? Zapiski Nauchnykh Seminarov POMI, Investigations on applied mathematics and informatics. Part II–1, Volume 529 (2023), pp. 43-53. http://geodesic.mathdoc.fr/item/ZNSL_2023_529_a3/
