Determining Number of Speakers in Multi-Speaker Condition with Additive Noise

Research Article

Published in September 2015 by Namratha N, R Kumaraswamy

National Conference “Electronics, Signals, Communication and Optimization”

Foundation of Computer Science USA

NCESCO2015 - Number 3

September 2015

Authors: Namratha N, R Kumaraswamy

Namratha N, R Kumaraswamy. Determining Number of Speakers in Multi-Speaker Condition with Additive Noise. National Conference “Electronics, Signals, Communication and Optimization”. NCESCO2015, 3 (September 2015), 15-18.

@article{
author = { Namratha N and R Kumaraswamy },
title = { Determining Number of Speakers in Multi-Speaker Condition with Additive Noise },
journal = { National Conference "Electronics, Signals, Communication and Optimization" },
issue_date = { September 2015 },
volume = { NCESCO2015 },
number = { 3 },
month = { September },
year = { 2015 },
issn = { 0975-8887 },
pages = { 15-18 },
numpages = { 4 },
url = { /proceedings/ncesco2015/number3/22309-5325/ },
publisher = { Foundation of Computer Science (FCS), NY, USA },
address = { New York, USA }
}

%0 Proceeding Article
%1 National Conference "Electronics, Signals, Communication and Optimization"
%A Namratha N
%A R Kumaraswamy
%T Determining Number of Speakers in Multi-Speaker Condition with Additive Noise
%J National Conference "Electronics, Signals, Communication and Optimization"
%@ 0975-8887
%V NCESCO2015
%N 3
%P 15-18
%D 2015
%I International Journal of Computer Applications

Abstract

The performance of a speaker recognition system degrades considerably when the speech sample used for recognition contains voices from several speakers in close vicinity. Solutions to this problem are needed, especially for signals collected in practical environments, such as a room with background noise and reverberation. This paper presents a method for determining the number of speakers in a multi-speaker condition using excitation source information. Speech in a multi-speaker environment is collected using two spatially separated microphones, so the speech of a given speaker arrives at the two microphones with a time delay specific to that speaker. This time delay is estimated from the cross-correlation function of the Hilbert envelopes of the linear prediction (LP) residual signals. Thus, by estimating the distinct time delays associated with different speakers, the number of speakers can be determined. The performance of the proposed method is evaluated by adding different types of noise to the clean speech signal, and the results illustrate the robustness of the method.
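The processing chain described in the abstract (LP residual, Hilbert envelope, cross-correlation, delay) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the LP order of 10, the frame length, the lag range, and the rule "count every delay that wins at least 20% of the frames" are all assumptions made here, as are the hypothetical helpers toy_source and delayed, which only generate test data.

```python
import numpy as np


def lp_residual(x, order=10):
    """LP residual of x via the autocorrelation method (one LP fit per call)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularised.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + 1e-9 * (r[0] + 1.0) * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Inverse filter: e[n] = x[n] - sum_k a[k] * x[n - k].
    res = np.convolve(x, np.concatenate(([1.0], -a)))[:n]
    res[:order] = 0.0  # drop the start-up transient caused by the frame edge
    return res


def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed with the FFT."""
    n = len(x)
    gain = np.zeros(n)
    gain[0] = 1.0
    gain[1 : (n + 1) // 2] = 2.0  # positive frequencies
    if n % 2 == 0:
        gain[n // 2] = 1.0  # Nyquist bin
    return np.abs(np.fft.ifft(np.fft.fft(x) * gain))


def estimate_delay(mic1, mic2, max_lag=20, order=10):
    """Lag in samples by which mic2 trails mic1 (negative if mic2 leads)."""
    e1 = hilbert_envelope(lp_residual(mic1, order))
    e2 = hilbert_envelope(lp_residual(mic2, order))
    e1, e2 = e1 - e1.mean(), e2 - e2.mean()
    n = len(e1)
    lags = np.arange(-max_lag, max_lag + 1)
    # xcorr[lag] = sum_n e1[n] * e2[n + lag]
    xcorr = [
        np.dot(e1[max(0, -l) : n - max(0, l)], e2[max(0, l) : n - max(0, -l)])
        for l in lags
    ]
    return int(lags[int(np.argmax(xcorr))])


def count_speakers(mic1, mic2, frame_len=500, max_lag=20, min_share=0.2):
    """Assumed counting rule: one speaker per delay seen in >= min_share of frames."""
    delays = [
        estimate_delay(mic1[s : s + frame_len], mic2[s : s + frame_len], max_lag)
        for s in range(0, len(mic1) - frame_len + 1, frame_len)
    ]
    _, counts = np.unique(delays, return_counts=True)
    return int(np.sum(counts >= min_share * len(delays))), delays


def toy_source(rng, n):
    """Hypothetical test signal: jittered pulse train through a 2-pole resonator."""
    exc = np.zeros(n)
    pos = np.cumsum(rng.integers(40, 90, size=n // 40))
    exc[pos[pos < n]] = 1.0
    out = np.zeros(n)
    for i in range(2, n):
        out[i] = exc[i] + 1.3 * out[i - 1] - 0.8 * out[i - 2]
    return out


def delayed(x, d):
    """Delay x by d >= 0 samples, zero-padding the start."""
    return np.concatenate((np.zeros(d), x[: len(x) - d]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = toy_source(rng, 4000), toy_source(rng, 4000)
    # Speaker A reaches mic2 5 samples late; speaker B reaches mic1 6 samples late.
    mic1 = np.concatenate((a, delayed(b, 6))) + 0.01 * rng.standard_normal(8000)
    mic2 = np.concatenate((delayed(a, 5), b)) + 0.01 * rng.standard_normal(8000)
    print(count_speakers(mic1, mic2))
```

In this toy setup the two synthetic speakers talk one after the other and white noise is added to both channels, loosely mirroring the additive-noise evaluation mentioned in the abstract. Real recordings would need sampling-rate-dependent choices for the frame length and lag range, and overlapping speech would make the per-frame delay follow whichever speaker dominates that frame.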


Index Terms

Computer Science

Information Sciences

Keywords

Excitation Source Information, Instants of Glottal Closure (GCIs), Linear Prediction (LP) Residual, Hilbert Envelope (HE), Time Delay Estimation, Different Types of Noise