Speaker Identification using Spectrograms of Varying Frame Sizes

H. B. Kekre; Vaishali Kulkarni; Prashant Gaikar; Nishant Gupta

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 50 - Number 20

Year of Publication: 2012

Authors: H. B. Kekre, Vaishali Kulkarni, Prashant Gaikar, Nishant Gupta

DOI: 10.5120/7921-1228

H. B. Kekre, Vaishali Kulkarni, Prashant Gaikar, Nishant Gupta. Speaker Identification using Spectrograms of Varying Frame Sizes. International Journal of Computer Applications. 50, 20 (July 2012), 27-33. DOI=10.5120/7921-1228

@article{10.5120/7921-1228,
  author = {H. B. Kekre and Vaishali Kulkarni and Prashant Gaikar and Nishant Gupta},
  title = {Speaker Identification using Spectrograms of Varying Frame Sizes},
  journal = {International Journal of Computer Applications},
  issue_date = {July 2012},
  volume = {50},
  number = {20},
  month = {July},
  year = {2012},
  issn = {0975-8887},
  pages = {27--33},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume50/number20/7921-1228/},
  doi = {10.5120/7921-1228},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T20:48:50.502156+05:30
%A H. B. Kekre
%A Vaishali Kulkarni
%A Prashant Gaikar
%A Nishant Gupta
%T Speaker Identification using Spectrograms of Varying Frame Sizes
%J International Journal of Computer Applications
%@ 0975-8887
%V 50
%N 20
%P 27-33
%D 2012
%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper proposes a text-dependent speaker recognition algorithm based on spectrograms. The spectrograms are generated using the Discrete Fourier Transform (DFT) for varying frame sizes, with 25% and 50% overlap between speech frames. The feature vector is the row mean vector of each spectrogram. For feature matching, two distance measures are used: Euclidean distance and Manhattan distance. Results are computed on two databases: a locally created database and the CSLU speaker recognition database. The maximum accuracy is 92.52%, obtained with 50% overlap between speech frames and Manhattan distance as the similarity measure.
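The pipeline summarized in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not specify windowing, whether magnitude or power spectra are used, or the exact frame sizes, so the sketch assumes unwindowed frames, magnitude spectra, and a default frame size of 256 samples. All function names (`spectrogram`, `row_mean_feature`, `identify`) are hypothetical and chosen only for this example.

```python
import numpy as np

def spectrogram(signal, frame_size, overlap):
    """Magnitude spectrogram: one row per DFT bin, one column per frame.

    Assumption: frames are cut without a window function and only the
    magnitude of the DFT is kept; the abstract does not state either choice.
    """
    hop = int(frame_size * (1.0 - overlap))  # 50% overlap -> hop = frame_size / 2
    if hop < 1 or len(signal) < frame_size:
        raise ValueError("signal too short or overlap too large")
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_size]
                       for i in range(n_frames)])
    # rfft keeps the non-redundant half of the DFT of a real-valued signal
    return np.abs(np.fft.rfft(frames, axis=1)).T

def row_mean_feature(signal, frame_size=256, overlap=0.5):
    """Feature vector: mean of each spectrogram row (frequency bin) over time."""
    return spectrogram(signal, frame_size, overlap).mean(axis=1)

def euclidean(a, b):
    return float(np.sqrt(np.sum((a - b) ** 2)))

def manhattan(a, b):
    return float(np.sum(np.abs(a - b)))

def identify(test_feature, enrolled, distance=manhattan):
    """Return the enrolled speaker whose feature vector is closest."""
    return min(enrolled, key=lambda name: distance(test_feature, enrolled[name]))
```

In use, each enrolled speaker would contribute one feature vector computed from a reference utterance, and a test utterance of the same phrase would be assigned to the speaker at the smallest distance, for example `identify(row_mean_feature(test_signal), enrolled, distance=euclidean)`, where `enrolled` is a dictionary mapping speaker names to feature vectors.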

Index Terms

Computer Science

Information Sciences

Keywords

Discrete Fourier Transform (DFT), Row Mean, Euclidean Distance, Manhattan Distance