Research Article

A Fuzzy C-Means based GMM for Classifying Speech and Music Signals

by R. Thiruvengatanadhan, P. Dhanalakshmi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 102 - Number 5
Year of Publication: 2014
Authors: R. Thiruvengatanadhan, P. Dhanalakshmi
DOI: 10.5120/17810-8638

R. Thiruvengatanadhan, P. Dhanalakshmi. A Fuzzy C-Means based GMM for Classifying Speech and Music Signals. International Journal of Computer Applications. 102, 5 (September 2014), 16-22. DOI=10.5120/17810-8638

@article{ 10.5120/17810-8638,
author = { R. Thiruvengatanadhan and P. Dhanalakshmi },
title = { A Fuzzy C-Means based GMM for Classifying Speech and Music Signals },
journal = { International Journal of Computer Applications },
issue_date = { September 2014 },
volume = { 102 },
number = { 5 },
month = { September },
year = { 2014 },
issn = { 0975-8887 },
pages = { 16-22 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume102/number5/17810-8638/ },
doi = { 10.5120/17810-8638 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:32:19.441332+05:30
%A R. Thiruvengatanadhan
%A P. Dhanalakshmi
%T A Fuzzy C-Means based GMM for Classifying Speech and Music Signals
%J International Journal of Computer Applications
%@ 0975-8887
%V 102
%N 5
%P 16-22
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper classifies audio signals as speech or music using a Gaussian Mixture Model (GMM) whose parameters are estimated with the fuzzy c-means algorithm. Feature extraction precedes classification, and classification accuracy relies mainly on the strength of the feature extraction techniques. Simple time-domain and frequency-domain audio features are adopted. The time-domain features are Zero Crossing Rate (ZCR) and Short Time Energy (STE). The frequency-domain features are Spectral Centroid (SC), Spectral Flux (SF), Spectral Roll-off (SR), Spectral Entropy (SE) and Discrete Wavelet Transforms. The extracted features are then used for classification. A GMM commonly uses the Expectation Maximization (EM) algorithm to determine its parameters. The proposed GMM instead uses the fuzzy c-means algorithm to estimate them: the probability density function is computed and the Gaussian parameters are fixed. The proposed model classifies a given input signal as either speech or music, and its performance is compared with that of a GMM trained using the EM algorithm.
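
The abstract does not give the update equations, so the following is only a minimal sketch of how fuzzy c-means memberships could be turned into GMM parameters and used for a two-class decision. Everything specific here is an assumption rather than the authors' method: the function names, the fuzzifier m = 2, the diagonal covariances, the membership-weighted formulas for weights, means and variances, and the choice of one GMM per class with a maximum-likelihood decision. Feature extraction is omitted; X stands for any matrix of per-frame feature vectors.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means; returns the membership matrix U (N x C)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)           # each row sums to 1
    for _ in range(n_iter):
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Squared Euclidean distance from every point to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)               # avoid division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U

def gmm_from_memberships(X, U, m=2.0, var_floor=1e-6):
    """Assumed mapping: membership-weighted weights, means, diagonal variances."""
    Um = U ** m
    mass = Um.sum(axis=0)                        # fuzzy "count" per component
    weights = mass / mass.sum()
    means = (Um.T @ X) / mass[:, None]
    sq = (X[:, None, :] - means[None, :, :]) ** 2
    variances = (Um[:, :, None] * sq).sum(axis=0) / mass[:, None] + var_floor
    return weights, means, variances

def gmm_log_likelihood(X, weights, means, variances):
    """Total log-likelihood of X under a diagonal-covariance GMM."""
    diff2 = (X[:, None, :] - means[None, :, :]) ** 2 / variances[None, :, :]
    log_norm = np.log(2.0 * np.pi * variances).sum(axis=1)
    log_comp = np.log(weights)[None, :] - 0.5 * (log_norm[None, :] + diff2.sum(axis=2))
    mx = log_comp.max(axis=1, keepdims=True)     # log-sum-exp for stability
    return float((mx[:, 0] + np.log(np.exp(log_comp - mx).sum(axis=1))).sum())

def fit_fcm_gmm(X, n_components, m=2.0):
    """Fit one GMM to one class's feature vectors via fuzzy c-means."""
    U = fuzzy_c_means(X, n_components, m=m)
    return gmm_from_memberships(X, U, m=m)

def classify(X, models):
    """Return the label whose GMM gives the highest log-likelihood for X."""
    return max(models, key=lambda label: gmm_log_likelihood(X, *models[label]))
```

In use, one model would be fitted per class, for example `models = {"speech": fit_fcm_gmm(speech_feats, 2), "music": fit_fcm_gmm(music_feats, 2)}`, where `speech_feats` and `music_feats` are hypothetical feature matrices, and `classify(test_feats, models)` would return the more likely label. An EM-trained GMM on the same features would be the baseline the abstract compares against.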

Index Terms

Computer Science
Information Sciences

Keywords

Classification, Feature extraction, Discrete Wavelet Transform, Fuzzy c-means, Gaussian Mixture Model