See the record for this article from the source Math-Net.Ru
@article{IZKAB_2022_1_a3,
  author = {I. A. Gurtueva and K. Ch. Bzhikhatlov},
  title = {Analytical review and classification of methods},
  journal = {News of the Kabardin-Balkar scientific center of RAS},
  pages = {41--58},
  publisher = {mathdoc},
  number = {1},
  year = {2022},
  language = {ru},
  url = {http://geodesic.mathdoc.fr/item/IZKAB_2022_1_a3/}
}
I. A. Gurtueva; K. Ch. Bzhikhatlov. Analytical review and classification of methods. News of the Kabardin-Balkar scientific center of RAS, no. 1 (2022), pp. 41-58. http://geodesic.mathdoc.fr/item/IZKAB_2022_1_a3/
[1] J. R. Deller, J. G. Proakis, J. H. L. Hansen, Discrete-Time Processing of Speech Signals, Wiley-IEEE Press, Hoboken NJ, 1999, 936 pp.
[2] V. Gupta, “A Survey of Natural Language Processing Techniques”, International Journal of Computer Science Engineering Technology (IJCSET), 5:1 (2014), 14–16
[3] J. W. Picone, “Signal modeling techniques in speech recognition”, Proceedings of the IEEE, 81:9 (1993), 1215–1245
[4] D. Jurafsky, J. Martin, Speech and Language Processing: An introduction to natural language processing, computational linguistics, and speech recognition, Prentice Hall, NJ, 2009, 1024 pp.
[5] J. O. Pickles, An Introduction to the Physiology of Hearing, Academic Press, New York, 1988, 400 pp.
[6] H. Fletcher, W. A. Munson, “Relation between Loudness and Masking”, J. Acoust. Soc. Am., 9 (1937), 1–10
[7] E. Zwicker, R. Feldtkeller, The Ear as a Receiver of Information, Svyaz', Moscow, 1971, 255 pp. (In Russian)
[8] M. R. Schroeder, “Optimizing Digital Speech Coders by Exploiting Masking Properties of the Human Ear”, J. Acoust. Soc. Am., 66:6 (1979), 1647–1652
[9] H. Traunmüller, “Analytical Expressions for the Tonotopic Sensory Scale”, The Journal of the Acoustical Society of America, 88:1 (1990), 97–100
[10] A. N. Kavalchuk, “The formula for the transition from the frequency domain to the bark scale and vice versa”, Informatics, 2011, no. 4 (32), 71–81 (In Russian)
[11] I. Aldoshina, R. Pritts, Musical acoustics. Textbook, Composer, St. Petersburg, 2014, 720 pp. (In Russian)
[12] B. Moore, Frequency selectivity in hearing, Springer, Boston, MA, 1986, 456 pp.
[13] J. O. Smith, Mathematics of the Discrete Fourier Transform (DFT) with Audio Applications, W3K Publishing, 2007, 322 pp., http://books.w3k.org/
[14] L. R. Rabiner, R. W. Schafer, Digital Processing of Speech Signals, Prentice-Hall, New Jersey, 1978, 496 pp. (Russ. ed.: Rabiner L. R., Shafer R. V., Tsifrovaya obrabotka rechevykh signalov, Radio i svyaz' Publ., Moscow, 1981)
[15] S. Davis, P. Mermelstein, “Experiments in syllable-based recognition of continuous speech”, IEEE Transactions on Acoustics, Speech and Signal Processing, 28 (1980), 357–366
[16] S. Chakroborty, A. Roy, G. Saha, “Fusion of a complementary feature set with MFCC for improved closed set text-independent speaker identification”, IEEE International Conference on Industrial Technology (Mumbai), 2006, 387–390
[17] Y. Shao, D. L. Wang, “Robust speaker identification using auditory features and computational auditory scene analysis”, Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2008, Las Vegas, NV, USA), 2008, 1589–1592
[18] L. V. Novikov, Fundamentals of Wavelet Signal Analysis, IanP RAN, St. Petersburg, 1999, 152 pp. (In Russian)
[19] N. E. Huang, “The empirical mode decomposition and the Hilbert spectrum for nonlinear and non-stationary time series analysis”, Proceedings of the Royal Society of London A, 454 (1998), 903–995
[20] M. Todisco, H. Delgado, N. Evans, “A new feature for automatic speaker verification antispoofing: constant Q cepstral coefficients”, The Speaker and Language Recognition Workshop (Odyssey 2016, Bilbao, Spain), 2016, 283–290
[21] G. Fant, Acoustic Theory of Speech Production, Walter de Gruyter, 1970, 328 pp.
[22] L. Rabiner, B. H. Juang, Fundamentals of speech recognition, Prentice-Hall, Inc, NJ, 1993, 507 pp.