Performance of Complementary Features for Robust Speaker Identification

Sharada V. Chougule; Mahesh S. Chavan

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 123 - Number 9

Year of Publication: 2015

10.5120/ijca2015905617

Sharada V. Chougule, Mahesh S. Chavan. Performance of Complementary Features for Robust Speaker Identification. International Journal of Computer Applications. 123, 9 (August 2015), 21-27. DOI=10.5120/ijca2015905617

@article{ 10.5120/ijca2015905617,
author = { Sharada V. Chougule, Mahesh S. Chavan },
title = { Performance of Complementary Features for Robust Speaker Identification },
journal = { International Journal of Computer Applications },
issue_date = { August 2015 },
volume = { 123 },
number = { 9 },
month = { August },
year = { 2015 },
issn = { 0975-8887 },
pages = { 21-27 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume123/number9/21987-2015905617/ },
doi = { 10.5120/ijca2015905617 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T23:12:14.892688+05:30
%A Sharada V. Chougule
%A Mahesh S. Chavan
%T Performance of Complementary Features for Robust Speaker Identification
%J International Journal of Computer Applications
%@ 0975-8887
%V 123
%N 9
%P 21-27
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper considers the problem of acoustic mismatch caused by the use of different sensors in digital gadgets and hand-held devices. Two complementary features derived from conventional cepstral features are proposed: linear/mel spectral subband centroids (L/M-SSC) and log filter bank energies (LFBE). The performance of these complementary features is compared with that of conventional features under acoustic mismatch conditions. To isolate the effect of the features themselves, all other processing and classification steps are kept constant, allowing a controlled comparison. A multi-variability speech database (IITG-MV) with acoustic mismatch (different microphones) is used for experimental evaluation. All of these features show almost equal performance for text-independent speaker identification when the acoustic conditions match. Under mismatch, however, the spectral subband centroid (L/M-SSC) features prove more robust than the other features when used alone. Further, adding dynamic features together with channel and noise compensation improves the identification rate of the system in all cases of acoustic mismatch, with spectral subband centroid features performing comparably to conventional features.
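To make the two feature types named in the abstract concrete, the following is a minimal sketch of how subband centroids and log band energies can be computed for a single speech frame. It is not the authors' implementation: the function name `subband_features`, the use of equal-width rectangular subbands on a linear frequency axis, the Hamming window, and the choice of four bands are all assumptions made for illustration. The paper's linear/mel filter bank design is not described in the abstract.

```python
import numpy as np

def subband_features(frame, fs, n_bands=4, eps=1e-12):
    """Return (centroids in Hz, log energies) for one frame.

    Hypothetical helper: rectangular, equal-width linear subbands are an
    assumption, not the filter bank used in the paper.
    """
    # Power spectrum of the Hamming-windowed frame.
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)

    centroids, log_energies = [], []
    # Split the FFT bins into n_bands contiguous groups.
    for idx in np.array_split(np.arange(len(power)), n_bands):
        band_power = power[idx]
        energy = band_power.sum()
        # Centroid: power-weighted mean frequency inside the subband.
        centroids.append((freqs[idx] * band_power).sum() / (energy + eps))
        # Log filter bank energy: log of the total power in the subband.
        log_energies.append(np.log(energy + eps))
    return np.array(centroids), np.array(log_energies)

# Usage: a 1500 Hz tone sampled at 8 kHz falls in the second subband.
fs = 8000
t = np.arange(256) / fs
frame = np.sin(2 * np.pi * 1500 * t)
ssc, lfbe = subband_features(frame, fs)
```

In this sketch the centroid records where the energy sits within each band, while the log energy records how much energy the band holds, which is one way to read the abstract's description of the two features as complementary.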

Index Terms

Computer Science

Information Sciences

Keywords

MFCC, LFCC, Linear/Mel scale spectral subband centroids (L/M-SSC), Log filter bank energy (LFBE)