Voice Activity Detection for Robust Speaker Identification System

Research Article

Published on September 2012 by El Bachir Tazi, Abderrahim Benabbou, Mostafa Harti

Software Engineering, Databases and Expert Systems

Foundation of Computer Science USA

SEDEX - Number 2

September 2012

Authors: El Bachir Tazi, Abderrahim Benabbou, Mostafa Harti

El Bachir Tazi, Abderrahim Benabbou, Mostafa Harti. Voice Activity Detection for Robust Speaker Identification System. Software Engineering, Databases and Expert Systems. SEDEX, 2 (September 2012), 35-39.

@article{
author = { El Bachir Tazi, Abderrahim Benabbou, Mostafa Harti },
title = { Voice Activity Detection for Robust Speaker Identification System },
journal = { Software Engineering, Databases and Expert Systems },
issue_date = { September 2012 },
volume = { SEDEX },
number = { 2 },
month = { September },
year = { 2012 },
issn = { 0975-8887 },
pages = { 35-39 },
numpages = { 5 },
url = { /specialissues/sedex/number2/8365-1016/ },
publisher = { Foundation of Computer Science (FCS), NY, USA },
address = { New York, USA }
}

%0 Special Issue Article
%1 Software Engineering, Databases and Expert Systems
%A El Bachir Tazi
%A Abderrahim Benabbou
%A Mostafa Harti
%T Voice Activity Detection for Robust Speaker Identification System
%J Software Engineering, Databases and Expert Systems
%@ 0975-8887
%V SEDEX
%N 2
%P 35-39
%D 2012
%I International Journal of Computer Applications

Abstract

The performance of Speaker Identification Systems (SIS) is strongly influenced by the quality of the speech signal. Most of these systems are based on Gaussian Mixture Models (GMM) trained on a speech database. A mismatch between training and testing conditions strongly degrades their accuracy and is a barrier to their use in real conditions, which are generally affected by noise. Voice Activity Detection (VAD) is a useful technique for improving the performance of systems working in such scenarios. In this paper, we use a robust VAD module within the feature extraction process; it yields high speech/non-speech discrimination accuracy and improves the performance of the SIS in noisy environments. We conducted a set of experiments on our own database of 37 Arabic speakers to evaluate our SIS, which is based on a gammatone frequency cepstral coefficient (GFCC) front-end combined with the VAD algorithm. The results show a 7.84% average improvement in Identification Rate (IR) for the robust GFCC method compared to a baseline Mel frequency cepstral coefficient (MFCC) method. An average accuracy improvement of 2.13% is attributable to the VAD technique when the Signal-to-Noise Ratio (SNR) varies from 40 dB to 0 dB.
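The abstract does not say which VAD algorithm the authors use, so the following is only a minimal illustrative sketch of the general idea: a frame-level, energy-threshold speech/non-speech decision, plus a helper that corrupts a clean signal with white noise at a chosen SNR. The function names, the 160-sample frame length, the 6 dB margin, the 8 kHz sampling rate, and the assumption that the first few frames contain no speech are all hypothetical choices made for this example, not details taken from the paper.

```python
import math
import random


def frame_energies_db(signal, frame_len):
    """Log-energy (dB) of each non-overlapping frame of `signal`."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-12))  # avoid log(0)
    return energies


def energy_vad(signal, frame_len=160, margin_db=6.0, noise_frames=5):
    """Return one boolean per frame: True where the frame looks like speech.

    Assumption: the first `noise_frames` frames contain no speech, so their
    mean log-energy can serve as the noise-floor estimate.
    """
    energies = frame_energies_db(signal, frame_len)
    if not energies:
        return []
    head = energies[:noise_frames]
    noise_floor = sum(head) / len(head)
    return [e > noise_floor + margin_db for e in energies]


def add_noise_at_snr(clean, snr_db, rng):
    """Add white Gaussian noise so that signal power / noise power = snr_db."""
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in clean]


def synthetic_utterance(rng, frame_len=160, frames_each=10):
    """Hypothetical test signal: near-silence, a 440 Hz tone, near-silence."""
    n = frame_len * frames_each
    lead = [rng.gauss(0.0, 0.001) for _ in range(n)]
    tone = [math.sin(2.0 * math.pi * 440.0 * t / 8000.0) for t in range(n)]
    tail = [rng.gauss(0.0, 0.001) for _ in range(n)]
    return lead + tone + tail


rng = random.Random(0)
decisions = energy_vad(synthetic_utterance(rng))
print(decisions.count(True), "of", len(decisions), "frames flagged as speech")
```

In a front-end of the kind the abstract describes, only the frames flagged as speech would be passed on to feature extraction (GFCC or MFCC) and GMM scoring. A fixed energy margin such as this one tends to become unreliable as the SNR approaches 0 dB, which is one reason more robust VAD designs exist.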


Index Terms

Computer Science

Information Sciences

Keywords

Gaussian Mixture Models (GMM), Mel Frequency Cepstral Coefficients (MFCC), Gammatone Frequency Cepstral Coefficients (GFCC), Speaker Identification System (SIS), Voice Activity Detection (VAD)