Indexing and Retrieval of Music using Gaussian Mixture Model Techniques

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
Authors:
R. Thiruvengatanadhan, P. Dhanalakshmi
10.5120/ijca2016911095

R. Thiruvengatanadhan and P. Dhanalakshmi. Indexing and Retrieval of Music using Gaussian Mixture Model Techniques. International Journal of Computer Applications 148(3):35-41, August 2016.

BibTeX

@article{10.5120/ijca2016911095,
	author = {R. Thiruvengatanadhan and P. Dhanalakshmi},
	title = {Indexing and Retrieval of Music using Gaussian Mixture Model Techniques},
	journal = {International Journal of Computer Applications},
	issue_date = {August 2016},
	volume = {148},
	number = {3},
	month = {Aug},
	year = {2016},
	issn = {0975-8887},
	pages = {35-41},
	numpages = {7},
	url = {http://www.ijcaonline.org/archives/volume148/number3/25740-2016911095},
	doi = {10.5120/ijca2016911095},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

Audio processing systems have taken giant leaps into the everyday life of most people in developed countries, and the technologies are now entrenched in consumer entertainment. Digital audio techniques dominate audio delivery, with CD players, internet radio, MP3 players and iPods being the systems of choice in many cases. With the rapid growth of digital music databases, people have begun to realize the importance of managing them effectively through music content analysis. The goal of a music indexing and retrieval system is to provide the user with the capability to index and retrieve music data efficiently, and efficient retrieval requires some measure of music similarity. In this paper, we propose a method for indexing and retrieval of music using Gaussian mixture models (GMMs). Acoustic features, namely MFCC, chromagram, tempogram and MPEG-7 features, are used to create the index. Retrieval is based on the highest probability density function value, and the experimental analysis shows that, on average, five clips are retrieved for each query.
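The retrieval scheme described in the abstract can be sketched as follows: each indexed clip is modeled by a GMM fitted to its acoustic feature frames, and a query is matched to the clips whose models assign the query's frames the highest likelihood. Below is a minimal NumPy sketch of this idea using diagonal-covariance GMMs fitted by EM; the 4-dimensional random "features", clip names, and the functions `fit_diag_gmm`, `avg_loglik` and `retrieve` are all illustrative assumptions, not the paper's actual implementation (which uses MFCC, chromagram, tempogram and MPEG-7 features).

```python
import numpy as np

def fit_diag_gmm(X, k=2, iters=25, seed=0):
    """Fit a diagonal-covariance GMM to feature frames X (n_frames x dim) via EM."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mu = X[rng.choice(n, k, replace=False)]       # initial means: random frames
    var = np.tile(X.var(axis=0) + 1e-3, (k, 1))   # initial per-dimension variances
    w = np.full(k, 1.0 / k)                       # uniform mixture weights
    for _ in range(iters):
        # E-step: per-frame log-density under each diagonal Gaussian component
        ll = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                      + np.log(2 * np.pi * var)).sum(-1)
              + np.log(w))
        r = np.exp(ll - ll.max(1, keepdims=True))
        r /= r.sum(1, keepdims=True)              # responsibilities
        # M-step: re-estimate weights, means, variances from responsibilities
        nk = r.sum(0) + 1e-9
        w = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ (X ** 2)) / nk[:, None] - mu ** 2 + 1e-3
    return w, mu, var

def avg_loglik(X, model):
    """Average per-frame log-likelihood of frames X under a fitted GMM."""
    w, mu, var = model
    ll = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                  + np.log(2 * np.pi * var)).sum(-1)
          + np.log(w))
    m = ll.max(1, keepdims=True)                  # stable log-sum-exp over components
    return float((m + np.log(np.exp(ll - m).sum(1, keepdims=True))).mean())

def retrieve(query, index, top=5):
    """Rank indexed clips by the likelihood of the query frames under each clip's GMM."""
    scores = {name: avg_loglik(query, gmm) for name, gmm in index.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top]

# Toy demo: two "clips" with well-separated feature statistics.
rng = np.random.default_rng(1)
clip_a = rng.normal(0.0, 1.0, (200, 4))
clip_b = rng.normal(5.0, 1.0, (200, 4))
index = {"clip_a": fit_diag_gmm(clip_a), "clip_b": fit_diag_gmm(clip_b)}
query = rng.normal(5.0, 1.0, (50, 4))             # query resembles clip_b
print(retrieve(query, index, top=1)[0])           # expected: clip_b
```

In this sketch the index is simply the set of fitted model parameters per clip, so matching a query costs one likelihood evaluation per indexed model rather than a comparison against raw audio.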

References

  1. Gal Chechik, Eugene Ie, Martin Rehn, Samy Bengio and Dick Lyon, 2008, Large-Scale Content-Based Audio Retrieval from Text Queries, International Conference on Multimedia Information Retrieval, pp. 105-112.
  2. Hyoung-Gook Kim, Nicolas Moreau and Thomas Sikora, 2004, MPEG-7 Audio and Beyond: Audio Content Indexing and Retrieval, John Wiley and Sons Ltd.
  3. Masataka Goto, 2004, A Real-Time Music-Scene Description System: Predominant-F0 Estimation for Detecting Melody and Bass Lines in Real-World Audio Signals, Speech Communication, vol. 43, pp. 311-329.
  4. Cheong Hee Park, 2015, Query by Humming Based on Multiple Spectral Hashing and Scaled Open-End Dynamic Time Warping, Signal Processing, vol. 108, pp. 220-225.
  5. Balaji Thoshkahna and K. R. Ramakrishnan, 2005, Projekt Quebex: A Query by Example System for Audio Retrieval, IEEE International Conference on Multimedia and Expo, pp. 265-268.
  6. Seungmin Rho, Byeongjun Han, Eenjun Hwang and Minkoo Kim, 2007, Musemble: A Music Retrieval System Based on Learning Environment, IEEE International Conference on Multimedia and Expo, pp. 1463-1466.
  7. Thomas Lidy, 2006, Evaluation of New Audio Features and Their Utilization in Novel Music Retrieval Applications, Master's Thesis, Vienna University of Technology.
  8. Peter M. Grosche, 2012, Signal Processing Methods for Beat Tracking, Music Segmentation and Audio Retrieval, Thesis, Universität des Saarlandes.
  9. Yin-Fu Huang, Sheng-Min Lin, Huan-Yu Wu and Yu-Siou Li, 2014, Music Genre Classification Based on Local Feature Selection using a Self-Adaptive Harmony Search Algorithm, Data & Knowledge Engineering, vol. 92, pp. 60-76.
  10. Jesper Højvang Jensen, 2009, Feature Extraction for Music Information Retrieval, Thesis, Aalborg University.
  11. Ahmad R. Abu-El-Quran, Rafik A. Goubran, and Adrian D. C. Chan, 2006, Security Monitoring using Microphone Arrays and Audio Classification, IEEE Transaction on Instrumentation and Measurement, vol. 55, no. 4, pp. 1025-1032.
  12. A. Meng and J. Shawe-Taylor, 2005, An Investigation of Feature Models for Music Genre Classification using the Support Vector Classifier, International Conference on Music Information Retrieval, Queen Mary, University of London, UK, pp. 604-609.
  13. Francois Pachet and Pierre Roy, 2009, Analytical Features: A Knowledge-Based Approach to Audio Feature Generation, Journal on Applied Signal Processing.
  14. D. FitzGerald and J. Paulus, 2006, Unpitched Percussion Transcription, in Signal Processing Methods for Music Transcription, Springer, pp. 131-162.
  15. Zhe Zuo, 2011, Towards Automated Segmentation of Repetitive Music Recordings, Master's Thesis, Saarland University.
  16. M. Müller, 2007, Information Retrieval for Music and Motion, Springer.
  17. R. N. Shepard, 1964, Circularity in Judgments of Relative Pitch, The Journal of the Acoustical Society of America, vol. 36, pp. 2346-2353.
  18. M. R. Schroeder, B. S. Atal and J. L. Hall, 1979, Optimizing Digital Speech Coders by Exploiting Masking Properties of the Human Ear, Journal of the Acoustical Society of America, vol. 66, pp. 1647-1652.
  19. Philipp von Styp-Rekowsky, 2011, Towards Time-Adaptive Feature Design in Music Signal Processing, Master's Thesis, Saarland University.
  20. A. Eronen and A. Klapuri, 2010, Music Tempo Estimation with k-NN Regression, IEEE Transactions on Audio, Speech and Language Processing, vol. 18, no. 1, pp. 50-57.
  21. Venkatesh Kulkarni, 2014, Towards Automatic Audio Segmentation of Indian Carnatic Music, Master's Thesis, Friedrich Alexander University.
  22. H. Duifhuis, L. Willems and R. Sluyter, 1982, Measurement of Pitch in Speech: An Implementation of Goldstein's Theory of Pitch Perception, Journal of the Acoustical Society of America, vol. 71, no. 6, pp. 1568-1580.
  23. William Fong and Simon J. Godsill, 2002, Monte Carlo Smoothing with Application to Audio Signal Enhancement, IEEE Transactions on Signal Processing, vol. 50, no. 2, pp. 438-449.
  24. Eberhard Zwicker and Hugo Fastl, 1999, Psychoacoustics-Facts and Models, Springer Series of Information Sciences, Berlin.
  25. Petr Motlicek, 2003, Modeling of Spectra and Temporal Trajectories in Speech Processing, PhD Thesis, Brno University of Technology.
  26. Menaka Rajapakse and Lonce Wyse, 2005, Generic Audio Classification using a Hybrid Model Based on GMMs and HMMs, IEEE International Multimedia Modelling Conference, pp. 53-58.
  27. M. Zanoni, D. Ciminieri, A. Sarti and S. Tubaro, 2012, Searching for Dominant High-Level Features for Music Information Retrieval, European Signal Processing Conference, pp. 2025-2029.
  28. Chunhui Wang, Qianqian Zhu, Zhenyu Shan, Yingjie Xia and Yuncai Liu, 2014, Fusing Heterogeneous Traffic Data by Kalman Filters and Gaussian Mixture Models, IEEE International Conference on Intelligent Transportation Systems, pp. 276-281.
  29. Rafael Iriya and Miguel Arjona Ramírez, 2014, Gaussian Mixture Models with Class-Dependent Features for Speech Emotion Recognition, IEEE Workshop on Statistical Signal Processing, pp. 480-483.
  30. H. Tang, S. M. Chu, M. Hasegawa-Johnson and T. S. Huang, 2012, Partially Supervised Speaker Clustering, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 5, pp. 959-971.
  31. Chien-Lin Huang, Chiori Hori and Hideki Kashioka, 2013, Semantic Inference Based on Neural Probabilistic Language Modeling for Speech Indexing, IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 8480-8484.
  32. Hélène Papadopoulos and Geoffroy Peeters, 2007, Large-Scale Study of Chord Estimation Algorithms Based on Chroma Representation and HMM, International Workshop on Content-Based Multimedia Indexing, pp. 53-60.

Keywords

Acoustic Feature Extraction, Indexing and Retrieval, Gaussian Mixture Model, Probability Density Function