Robust ASR Systems using Auditory Filter in Impulsive Noise Environment

Issam Bel Haj Yahia; Zied Hajaiej

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 137 - Number 10

Year of Publication: 2016

Authors: Issam Bel Haj Yahia, Zied Hajaiej

DOI: 10.5120/ijca2016908802

Issam Bel Haj Yahia, Zied Hajaiej. Robust ASR Systems using Auditory Filter in Impulsive Noise Environment. International Journal of Computer Applications. 137, 10 (March 2016), 22-27. DOI=10.5120/ijca2016908802

@article{ 10.5120/ijca2016908802,
author = { Issam Bel Haj Yahia and Zied Hajaiej },
title = { Robust ASR Systems using Auditory Filter in Impulsive Noise Environment },
journal = { International Journal of Computer Applications },
issue_date = { March 2016 },
volume = { 137 },
number = { 10 },
month = { March },
year = { 2016 },
issn = { 0975-8887 },
pages = { 22-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume137/number10/24309-2016908802/ },
doi = { 10.5120/ijca2016908802 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:37:59.509773+05:30
%A Issam Bel Haj Yahia
%A Zied Hajaiej
%T Robust ASR Systems using Auditory Filter in Impulsive Noise Environment
%J International Journal of Computer Applications
%@ 0975-8887
%V 137
%N 10
%P 22-27
%D 2016
%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper develops new automatic methods for recognizing isolated words corrupted by impulsive sounds. It presents a technique for parameterizing speech signals containing impulsive noise, based on modeling the auditory filter with a gammachirp filterbank (GFB). The work has two parts: the first is devoted to traditional techniques, and the second deals with modern methods that incorporate an auditory filter model called the gammachirp. In the second part, we extract the features of isolated words with impulsive noise from the TIMIT database using the Perceptual Linear Prediction (PLP) parameterization technique combined with the GFB. The recognition system is implemented on the Hidden Markov Model Toolkit (HTK) platform, which is based on hidden Markov models (HMMs). For evaluation, a comparative study was carried out against standard PLP and Mel Frequency Cepstral Coefficients (MFCC). We study the performance of the proposed parameterization techniques, GFB_PLP and GFB_MFCC, in the presence of different impulsive noises. Three types of impulsive noise are used (blast door, glass break, and explosion), and tests were carried out at different SNR levels (15 dB, 10 dB, 5 dB, 0 dB, and -3 dB). The GFB_PLP technique gives the best results across the tests.
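
To make the test conditions in the abstract more concrete, the following Python sketch shows one way to sample a gammachirp impulse response and to mix a noise recording into a speech signal at a target SNR. It is a minimal sketch under stated assumptions, not the authors' implementation. The function names (gammachirp, mix_at_snr, power) are hypothetical, the SNR is assumed to be measured over the whole utterance, and the gammachirp parameters (n = 4, b = 1.019, c = -2, with the Glasberg and Moore ERB formula) are illustrative values that the abstract does not specify.

```python
import math


def gammachirp(fc, fs, duration=0.025, n=4, b=1.019, c=-2.0, phase=0.0):
    """Sample a gammachirp impulse response centred at fc (Hz).

    g(t) = t^(n-1) * exp(-2*pi*b*ERB(fc)*t) * cos(2*pi*fc*t + c*ln(t) + phase)
    The parameter defaults are illustrative assumptions, not values from the paper.
    """
    # Equivalent rectangular bandwidth (Glasberg and Moore formula), in Hz.
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)
    samples = []
    # Start at k = 1 because ln(t) is undefined at t = 0.
    for k in range(1, int(duration * fs) + 1):
        t = k / fs
        envelope = t ** (n - 1) * math.exp(-2.0 * math.pi * b * erb * t)
        carrier = math.cos(2.0 * math.pi * fc * t + c * math.log(t) + phase)
        samples.append(envelope * carrier)
    # Normalise to unit peak amplitude.
    peak = max(abs(s) for s in samples)
    return [s / peak for s in samples]


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the whole-utterance SNR equals snr_db."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    noise_power = power(noise)
    if noise_power == 0.0:
        raise ValueError("noise has zero power")
    # Choose the gain so that power(speech) / power(gain * noise) = 10^(snr_db / 10).
    gain = math.sqrt(power(speech) / (noise_power * 10.0 ** (snr_db / 10.0)))
    return [s + gain * v for s, v in zip(speech, noise)]
```

With such helpers, the five conditions listed in the abstract could be produced by calling mix_at_snr once per level in (15, 10, 5, 0, -3). Because impulsive noise is concentrated in short bursts, a whole-utterance SNR understates the local SNR during a burst. A different SNR definition would change the gain computation.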

Index Terms

Computer Science

Information Sciences

Keywords

MFCC, PLP, GFB_MFCC, GFB_PLP