Suitability of speech files for automatic speech recognition systems after noise reduction procedures

R. Kh. Latypov; E. L. Stolov

Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki, Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki, Vol. 157 (2015) no. 4, pp. 49-55

This article was harvested from the source Math-Net.Ru


Abstract

The results of experiments on noise reduction in speech files, with subsequent submission of the files to the Google automatic speech recognition (ASR) system, are presented. An original method is developed to identify intervals that contain only noise. It is shown that applying a modified Wiener filter in the time domain improves the recognition quality of noisy files.

Keywords: speech file, noise reduction, speech recognition, Wiener filter, Google.
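
As a rough illustration of the kind of processing the abstract describes, the following Python sketch applies a fixed FIR Wiener filter in the time domain. It is not the authors' modified filter, and it does not reproduce their original method for finding noise-only intervals: here the lowest-energy frames are simply assumed to contain only noise. The function names, the frame length, the filter order, and the 10% fraction are hypothetical choices made for the example. The taps follow the standard relation h = R_y^{-1}(r_y - r_v), where R_y is the Toeplitz autocorrelation matrix of the noisy signal and r_y, r_v are the autocorrelation vectors of the noisy signal and of the noise.

```python
import numpy as np


def autocorr(x, order):
    """Biased autocorrelation estimates r[0..order-1] of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    return np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order)])


def toeplitz(r):
    """Symmetric Toeplitz matrix whose first column is r."""
    lags = np.abs(np.arange(len(r))[:, None] - np.arange(len(r))[None, :])
    return r[lags]


def quietest_frames(y, frame_len, fraction=0.1):
    """Assumption: the lowest-energy frames contain noise only."""
    n_frames = len(y) // frame_len
    frames = y[: n_frames * frame_len].reshape(n_frames, frame_len)
    energy = (frames ** 2).mean(axis=1)
    keep = max(1, int(fraction * n_frames))
    return frames[np.argsort(energy)[:keep]]


def wiener_taps(y, noise_frames, order):
    """FIR Wiener taps h = R_y^{-1} (r_y - r_v)."""
    r_y = autocorr(y, order)
    # Average per-frame estimates so frame boundaries do not mix lags.
    r_v = np.mean([autocorr(f, order) for f in noise_frames], axis=0)
    # Small diagonal loading keeps the linear system well conditioned.
    R_y = toeplitz(r_y) + 1e-8 * r_y[0] * np.eye(order)
    return np.linalg.solve(R_y, r_y - r_v)


def denoise(y, frame_len=256, order=20):
    """Estimate noise statistics from quiet frames, then filter causally."""
    y = np.asarray(y, dtype=float)
    noise_frames = quietest_frames(y, frame_len)
    h = wiener_taps(y, noise_frames, order)
    # Causal FIR filtering: x_hat[n] = sum_k h[k] * y[n - k].
    return np.convolve(y, h)[: len(y)]
```

Calling `denoise(samples)` on a mono signal held in a NumPy array returns a filtered array of the same length. Because a single set of taps is computed for the whole file, the sketch treats the signal as stationary; a frame-wise variant would be closer to how such filters are typically used on speech.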

@article{UZKU_2015_157_4_a3,
     author = {R. Kh. Latypov and E. L. Stolov},
     title = {Suitability of speech files for automatic speech recognition systems after noise reduction procedures},
     journal = {U\v{c}\"enye zapiski Kazanskogo universiteta. Seri\^a Fiziko-matemati\v{c}eskie nauki},
     pages = {49--55},
     year = {2015},
     volume = {157},
     number = {4},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/UZKU_2015_157_4_a3/}
}

TY  - JOUR
AU  - R. Kh. Latypov
AU  - E. L. Stolov
TI  - Suitability of speech files for automatic speech recognition systems after noise reduction procedures
JO  - Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki
PY  - 2015
SP  - 49
EP  - 55
VL  - 157
IS  - 4
UR  - http://geodesic.mathdoc.fr/item/UZKU_2015_157_4_a3/
LA  - ru
ID  - UZKU_2015_157_4_a3
ER  -

%0 Journal Article
%A R. Kh. Latypov
%A E. L. Stolov
%T Suitability of speech files for automatic speech recognition systems after noise reduction procedures
%J Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki
%D 2015
%P 49-55
%V 157
%N 4
%U http://geodesic.mathdoc.fr/item/UZKU_2015_157_4_a3/
%G ru
%F UZKU_2015_157_4_a3

R. Kh. Latypov; E. L. Stolov. Suitability of speech files for automatic speech recognition systems after noise reduction procedures. Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki, Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki, Vol. 157 (2015) no. 4, pp. 49-55. http://geodesic.mathdoc.fr/item/UZKU_2015_157_4_a3/

