Speaker Independent Speech Recognition using MFCC with Cubic-Log Compression and VQ Analysis

Neeraj Kaberpanthi; Ashutosh Datar

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 95 - Number 26

Year of Publication: 2014

Authors: Neeraj Kaberpanthi, Ashutosh Datar

10.5120/16962-7081

Neeraj Kaberpanthi, Ashutosh Datar. Speaker Independent Speech Recognition using MFCC with Cubic-Log Compression and VQ Analysis. International Journal of Computer Applications. 95, 26 (June 2014), 33-37. DOI=10.5120/16962-7081

@article{10.5120/16962-7081,
  author = {Neeraj Kaberpanthi and Ashutosh Datar},
  title = {Speaker Independent Speech Recognition using MFCC with Cubic-Log Compression and VQ Analysis},
  journal = {International Journal of Computer Applications},
  issue_date = {June 2014},
  volume = {95},
  number = {26},
  month = {June},
  year = {2014},
  issn = {0975-8887},
  pages = {33-37},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume95/number26/16962-7081/},
  doi = {10.5120/16962-7081},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T22:20:32.367040+05:30
%A Neeraj Kaberpanthi
%A Ashutosh Datar
%T Speaker Independent Speech Recognition using MFCC with Cubic-Log Compression and VQ Analysis
%J International Journal of Computer Applications
%@ 0975-8887
%V 95
%N 26
%P 33-37
%D 2014
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Speech processing has emerged as one of the most important application areas of digital signal processing. Research fields within speech processing include speech recognition, speaker identification, speech synthesis, and speech coding. The objective of speaker independent speech recognition is to extract, characterize, and recognize the information in a speech signal, with the aim of developing a speech recognition system that does not depend on the speaker. The extracted information is useful for controlling and operating electronic gadgets and machinery efficiently through the human voice. Feature extraction is the first step in speech recognition, and researchers have proposed and developed many algorithms for it. In this work, Mel-Frequency Cepstrum Coefficient (MFCC) feature extraction with cubic-log compression is used to extract features from the speech signal for designing a speaker independent speech recognition system. The extracted features are used to train and test the system with a Vector Quantization (VQ) approach.
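
The train-and-test role of Vector Quantization described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one codebook per vocabulary word, trained by plain k-means, and recognition by lowest average distortion. The feature vectors are synthetic stand-ins for MFCC frames, and the word labels, codebook size, and function names (`train_codebook`, `recognize`, `fake_features`) are hypothetical. The cubic-log compression step is not reproduced, since the abstract does not define it.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, codebook):
    """Return (index, squared distance) of the codeword closest to vec."""
    best = min(range(len(codebook)), key=lambda i: sq_dist(vec, codebook[i]))
    return best, sq_dist(vec, codebook[best])


def train_codebook(vectors, size, iters=20, seed=0):
    """Build a VQ codebook with plain k-means (assumed training rule)."""
    rng = random.Random(seed)
    codebook = [list(v) for v in rng.sample(vectors, size)]
    for _ in range(iters):
        cells = [[] for _ in range(size)]
        for v in vectors:
            idx, _ = nearest(v, codebook)
            cells[idx].append(v)
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codeword
                codebook[i] = [sum(col) / len(cell) for col in zip(*cell)]
    return codebook


def avg_distortion(vectors, codebook):
    """Mean squared distance from each frame to its nearest codeword."""
    return sum(nearest(v, codebook)[1] for v in vectors) / len(vectors)


def recognize(vectors, codebooks):
    """Pick the word whose codebook quantizes the utterance with least distortion."""
    return min(codebooks, key=lambda word: avg_distortion(vectors, codebooks[word]))


def fake_features(center, n, rng, dim=4):
    """Synthetic stand-in for MFCC frames: Gaussian vectors around `center`."""
    return [[rng.gauss(center, 0.3) for _ in range(dim)] for _ in range(n)]


rng = random.Random(42)
codebooks = {
    "yes": train_codebook(fake_features(0.0, 200, rng), size=4),
    "no": train_codebook(fake_features(5.0, 200, rng), size=4),
}
test_utterance = fake_features(5.0, 30, rng)
result = recognize(test_utterance, codebooks)
print(result)
```

In a speaker independent setting, the training frames for each word would presumably be pooled from several speakers so that the codebook captures the word rather than a single voice; the synthetic clusters here only demonstrate the matching rule.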

Index Terms

Computer Science

Information Sciences

Keywords

Speech Recognition, Speaker Independent Speech Recognition, MFCC, Mel Frequency Cepstrum Coefficient, Vector Quantization, VQ Approach, Cubic-Log Compression