Automated recognition of paralinguistic signals in spoken dialogue systems: ways of improvement
Žurnal Sibirskogo federalʹnogo universiteta. Matematika i fizika, Tome 8 (2015) no. 2, pp. 208-216.

See the article record from the Math-Net.Ru source

The ability of artificial systems to recognize paralinguistic signals, such as emotions, depression, or openness, is useful in various applications. However, the performance of such recognizers is far from perfect. In this study we consider several directions that can significantly improve the performance of such systems. Firstly, we propose building speaker- or gender-specific emotion models, so that the emotion recognition (ER) procedure is coupled with a gender or speaker identifier. The speaker- or gender-specific information is then either included directly in the feature vector or used to create a separate emotion recognition model for each gender or speaker. Secondly, since feature selection is an important part of any classification problem, we propose a feature selection technique based on a genetic algorithm or on an information-gain criterion. Both methods yield higher performance than baseline methods without any feature selection. Finally, we suggest analysing not only audio signals but also combined audio-visual cues. The early fusion method (feature-level fusion) is used in our investigations to combine the different modalities into a multimodal approach. The results obtained show that the multimodal approach outperforms the single modalities on the corpora considered. The suggested methods have been evaluated on a number of emotional databases in three languages (English, German and Japanese), in both acted and non-acted settings. The results of the numerical experiments are also reported in the study.
Keywords: recognition of paralinguistic signals, machine learning algorithms, speaker-adaptive emotion recognition, multimodal approach.
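Two of the ideas summarized in the abstract, information-gain feature ranking and early (feature-level) fusion, can be sketched in a few lines. The code below is an illustrative sketch only, not the authors' implementation: the feature values, labels, and function names are hypothetical, and real systems would operate on continuous acoustic descriptors rather than the toy discrete features used here.

```python
# Illustrative sketch (NOT the paper's code): information-gain ranking of
# discrete features and early (feature-level) fusion of audio/video vectors.
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy H(Y) of a label sequence, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """IG(F) = H(Y) - H(Y | F) for one discrete feature."""
    n = len(labels)
    cond = 0.0
    for v in set(feature_values):
        subset = [y for x, y in zip(feature_values, labels) if x == v]
        cond += len(subset) / n * entropy(subset)
    return entropy(labels) - cond

def early_fusion(audio_vec, video_vec):
    """Feature-level fusion: concatenate per-utterance feature vectors."""
    return audio_vec + video_vec

# Toy data: feature f1 separates the two emotion classes, f2 does not,
# so f1 receives the higher information-gain score.
labels = ["anger", "anger", "neutral", "neutral"]
f1 = [1, 1, 0, 0]
f2 = [1, 0, 1, 0]
assert information_gain(f1, labels) > information_gain(f2, labels)

# The fused vector simply stacks both modalities before classification.
assert early_fusion([0.2, 0.5], [0.9]) == [0.2, 0.5, 0.9]
```

Selecting the top-ranked features by this score, or evolving a feature subset with a genetic algorithm, then feeding the fused vector to a classifier, follows the general scheme the abstract describes.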
@article{JSFU_2015_8_2_a10,
     author = {Maxim Sidorov and Alexander Schmitt and Eugene S. Semenkin},
     title = {Automated recognition of paralinguistic signals in spoken dialogue systems: ways of improvement},
     journal = {\v{Z}urnal Sibirskogo federalʹnogo universiteta. Matematika i fizika},
     pages = {208--216},
     publisher = {mathdoc},
     volume = {8},
     number = {2},
     year = {2015},
     language = {en},
     url = {http://geodesic.mathdoc.fr/item/JSFU_2015_8_2_a10/}
}
