
Robust Features for Noisy Speech Recognition using MFCC Computation from Magnitude Spectrum of Higher Order Autocorrelation Coefficients

International Journal of Computer Applications
© 2010 by IJCA Journal
Number 8 - Article 7
Year of Publication: 2010
Authors:
Dr. Amita Dev
Poonam Bansal
DOI: 10.5120/1499-2016

Dr. Amita Dev and Poonam Bansal. Robust Features for Noisy Speech Recognition using MFCC Computation from Magnitude Spectrum of Higher Order Autocorrelation Coefficients. International Journal of Computer Applications 10(8):36–38, November 2010. Published by Foundation of Computer Science.

BibTeX

@article{key:article,
	author = {Dr. Amita Dev and Poonam Bansal},
	title = {Robust Features for Noisy Speech Recognition using MFCC Computation from Magnitude Spectrum of Higher Order Autocorrelation Coefficients},
	journal = {International Journal of Computer Applications},
	year = {2010},
	volume = {10},
	number = {8},
	pages = {36--38},
	month = {November},
	note = {Published By Foundation of Computer Science}
}

Abstract

Noise robustness is one of the most challenging problems in automatic speech recognition. The goal of robust feature extraction is to improve the performance of speech recognition in adverse conditions. The mel-scaled frequency cepstral coefficients (MFCCs) derived from Fourier transform and filter bank analysis are perhaps the most widely used front-ends in state-of-the-art speech recognition systems. One of the major issues with MFCCs is that they are very sensitive to additive noise. To improve the robustness of speech front-ends, we introduce in this paper a new set of MFCC vectors, estimated in three steps. First, the relative higher order autocorrelation coefficients are extracted. Then the magnitude spectrum of the resultant speech signal is estimated through the fast Fourier transform (FFT) and differentiated with respect to frequency. Finally, the differentiated magnitude spectrum is transformed into MFCC-like coefficients. These are called MFCCs extracted from the Differentiated Relative Higher Order Autocorrelation Sequence Spectrum (DRHOASS). Speech recognition experiments for various tasks indicate that the new feature vector is more robust than traditional MFCCs in additive noise conditions.
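
The three steps named in the abstract can be illustrated with a minimal single-frame sketch in Python. This is not the authors' implementation: the abstract does not define the "relative" autocorrelation step or any parameter values, so the sketch simply discards low autocorrelation lags and uses hypothetical defaults (`min_lag_ms`, `n_fft`, `n_filters`, `n_ceps`). Taking the absolute value of the spectral difference before the logarithm is also an assumption. The function names `mel_filterbank` and `drhoass_like_features` are illustrative only.

```python
import numpy as np


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular mel filters over n_bins linearly spaced from 0 Hz to Nyquist."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    # Filter edges are equally spaced on the mel scale.
    mel_edges = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    hz_edges = mel_to_hz(mel_edges)
    bin_freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)

    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def drhoass_like_features(frame, sample_rate=8000, min_lag_ms=2.0,
                          n_fft=512, n_filters=20, n_ceps=13):
    """MFCC-like coefficients from a differentiated higher-lag autocorrelation spectrum.

    All default parameter values are assumptions, not taken from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)

    # Step 1: one-sided autocorrelation (lags 0..n-1), then keep only the
    # higher lags. The "relative" part of the method is not specified in
    # the abstract and is omitted here (assumption).
    acf = np.correlate(frame, frame, mode="full")[n - 1:]
    min_lag = int(round(min_lag_ms * 1e-3 * sample_rate))
    higher = acf[min_lag:] * np.hamming(n - min_lag)

    # Step 2: magnitude spectrum via FFT, differentiated along frequency
    # with a first-order difference. The absolute value keeps the result
    # non-negative for the logarithm below (assumption).
    magnitude = np.abs(np.fft.rfft(higher, n_fft))
    diff_mag = np.abs(np.diff(magnitude))

    # Step 3: mel filter bank, log compression, and a DCT-II to obtain
    # MFCC-like coefficients.
    bank = mel_filterbank(n_filters, len(diff_mag), sample_rate)
    log_energies = np.log(bank @ diff_mag + 1e-10)
    k = np.arange(n_ceps)[:, None]
    m = np.arange(n_filters)[None, :]
    dct_basis = np.cos(np.pi * k * (m + 0.5) / n_filters)
    return dct_basis @ log_energies
```

Calling `drhoass_like_features(frame)` on a 256-sample frame at 8 kHz returns a vector of 13 coefficients. Discarding the low lags follows the general idea that additive noise tends to concentrate at low autocorrelation lags, but the lag threshold used by the authors is not given in the abstract.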
