Parameters Describing Local Properties of Speech Records

R. R. Nigmatullin; E. L. Stolov

Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki (Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki), Vol. 155 (2013), no. 2, pp. 100-107

This article was harvested from the source Math-Net.Ru

Abstract

We propose new parameters for describing short fragments of speech signals, together with methods for estimating them. We develop a technique for the exact localization of the “explosion” in syllables beginning with stop consonants. We also present a method for evaluating the instantaneous frequency in the sound part of the syllable; the method does not require the Hilbert transform. We show that the distribution of instantaneous frequencies in a speech signal can be used for speaker identification.

Keywords: stop consonants, explosion localization, approximation of instantaneous frequency, distribution of instantaneous frequencies, speaker identification.

@article{UZKU_2013_155_2_a8,
     author = {R. R. Nigmatullin and E. L. Stolov},
     title = {Parameters {Describing} {Local} {Properties} of {Speech} {Records}},
     journal = {U\v{c}\"enye zapiski Kazanskogo universiteta. Seri\^a Fiziko-matemati\v{c}eskie nauki},
     pages = {100--107},
     year = {2013},
     volume = {155},
     number = {2},
     language = {ru},
     url = {http://geodesic.mathdoc.fr/item/UZKU_2013_155_2_a8/}
}

TY  - JOUR
AU  - R. R. Nigmatullin
AU  - E. L. Stolov
TI  - Parameters Describing Local Properties of Speech Records
JO  - Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki
PY  - 2013
SP  - 100
EP  - 107
VL  - 155
IS  - 2
UR  - http://geodesic.mathdoc.fr/item/UZKU_2013_155_2_a8/
LA  - ru
ID  - UZKU_2013_155_2_a8
ER  -

%0 Journal Article
%A R. R. Nigmatullin
%A E. L. Stolov
%T Parameters Describing Local Properties of Speech Records
%J Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki
%D 2013
%P 100-107
%V 155
%N 2
%U http://geodesic.mathdoc.fr/item/UZKU_2013_155_2_a8/
%G ru
%F UZKU_2013_155_2_a8

R. R. Nigmatullin; E. L. Stolov. Parameters Describing Local Properties of Speech Records. Učënye zapiski Kazanskogo universiteta. Seriâ Fiziko-matematičeskie nauki (Uchenye Zapiski Kazanskogo Universiteta. Seriya Fiziko-Matematicheskie Nauki), Vol. 155 (2013), no. 2, pp. 100-107. http://geodesic.mathdoc.fr/item/UZKU_2013_155_2_a8/
