MFCC and Prosodic Feature Extraction Techniques: A Comparative Study

International Journal of Computer Applications
© 2012 by IJCA Journal
Volume 54 - Number 1
Year of Publication: 2012
Authors:
Nilu Singh
R. A. Khan
Raj Shree
DOI: 10.5120/8529-2061

Nilu Singh, R. A. Khan and Raj Shree. MFCC and Prosodic Feature Extraction Techniques: A Comparative Study. International Journal of Computer Applications 54(1):9-13, September 2012. Full text available.

BibTeX

@article{key:article,
	author = {Nilu Singh and R. A. Khan and Raj Shree},
	title = {MFCC and Prosodic Feature Extraction Techniques: A Comparative Study},
	journal = {International Journal of Computer Applications},
	year = {2012},
	volume = {54},
	number = {1},
	pages = {9-13},
	month = {September},
	note = {Full text available}
}

Abstract

This paper aims to describe the differences between cepstral and non-cepstral feature extraction techniques, covering the main points of comparison between Mel Frequency Cepstral Coefficients (MFCC) and prosodic features. In speaker recognition, two types of feature extraction techniques are available: short-term features, i.e. MFCC, and long-term (prosodic) features. We explore the usefulness of prosodic features for syllable classification and of MFCC for feature extraction from a speech signal, and then compare the two. MFCC is one of the most important feature extraction techniques and is required in many kinds of speech applications. The MFCC features are extracted from the speaker's phonemes in pre-segmented speech sentences. Prosodic features are now used in most emotion recognition algorithms; they are relatively simple in structure and are known to be effective in some speech recognition tasks. Various ways of generating prosodic syllable contour features have recently been applied to enhance speaker recognition systems.
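
To make the contrast between short-term cepstral features and prosodic features concrete, the following is a minimal sketch and not the authors' implementation. It assumes the conventional MFCC pipeline (pre-emphasis, framing, Hamming window, power spectrum, mel filterbank, logarithm, DCT) and, for the prosodic side, a simple autocorrelation pitch estimate on a single frame. All function names and parameter values (16 kHz sampling, 25 ms frames, 10 ms hop, 26 filters, 13 coefficients, 0.97 pre-emphasis, 60-400 Hz pitch range) are illustrative assumptions that do not come from the paper.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose edges are evenly spaced on the mel scale.
    mels = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def frame_signal(signal, frame_len, hop_len):
    # Split the signal into overlapping frames (incomplete tail is dropped).
    n_frames = 1 + (len(signal) - frame_len) // hop_len
    idx = np.arange(frame_len)[None, :] + hop_len * np.arange(n_frames)[:, None]
    return signal[idx]

def mfcc(signal, sample_rate=16000, frame_ms=25, hop_ms=10,
         n_fft=512, n_filters=26, n_ceps=13):
    # Short-term cepstral features: one coefficient vector per frame.
    frame_len = int(sample_rate * frame_ms / 1000)
    hop_len = int(sample_rate * hop_ms / 1000)
    # Pre-emphasis (assumed coefficient 0.97) boosts high frequencies.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    frames = frame_signal(emphasized, frame_len, hop_len) * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    fbank = mel_filterbank(n_filters, n_fft, sample_rate)
    log_energies = np.log(power @ fbank.T + 1e-10)
    # Unnormalized DCT-II over the filterbank axis, keeping n_ceps terms.
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps),
                                    np.arange(n_filters) + 0.5) / n_filters)
    return log_energies @ basis.T

def pitch_autocorr(frame, sample_rate, fmin=60.0, fmax=400.0):
    # Prosodic cue: fundamental frequency from the strongest
    # autocorrelation peak inside the assumed pitch range.
    frame = frame - np.mean(frame)
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lo, hi = int(sample_rate / fmax), int(sample_rate / fmin)
    lag = lo + np.argmax(ac[lo:hi + 1])
    return sample_rate / lag

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr) / sr
    tone = np.sin(2 * np.pi * 200.0 * t)      # synthetic 200 Hz test tone
    print(mfcc(tone, sr).shape)               # (frames, coefficients)
    print(pitch_autocorr(tone[:640], sr))     # pitch estimate in Hz
```

In this sketch the MFCC matrix describes the spectral envelope of each short frame, whereas a pitch value tracked over many frames (together with statistics such as its mean or range over a syllable) would form a long-term prosodic contour. A practical system would also need voicing detection and energy and duration measures, which are omitted here.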
