Audio-video based Segmentation and Classification using SVM and AANN

K. Subashini; S. Palanivel

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 53 - Number 18

Year of Publication: 2012

Authors: K. Subashini, S. Palanivel

DOI: 10.5120/8525-2271

K. Subashini, S. Palanivel. Audio-video based Segmentation and Classification using SVM and AANN. International Journal of Computer Applications. 53, 18 (September 2012), 43-49. DOI=10.5120/8525-2271

@article{10.5120/8525-2271,
  author = {K. Subashini and S. Palanivel},
  title = {Audio-video based Segmentation and Classification using SVM and AANN},
  journal = {International Journal of Computer Applications},
  issue_date = {September 2012},
  volume = {53},
  number = {18},
  month = {September},
  year = {2012},
  issn = {0975-8887},
  pages = {43-49},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume53/number18/8525-2271/},
  doi = {10.5120/8525-2271},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article

%1 2024-02-06T20:54:26.756493+05:30

%A K. Subashini

%A S. Palanivel

%T Audio-video based Segmentation and Classification using SVM and AANN

%J International Journal of Computer Applications

%@ 0975-8887

%V 53

%N 18

%P 43-49

%D 2012

%I Foundation of Computer Science (FCS), NY, USA

Abstract

In this paper, we propose a method for combining audio and video for segmentation and classification. The objective of segmentation is to detect the category change point, such as the transition from news to advertisement. The classification system assigns the audio-video data to one of the predefined categories: news, advertisement, sports, serial and movies. Mel frequency cepstral coefficients (MFCC) are used as acoustic features and the color histogram is used as the visual feature for segmentation and classification. Support vector machine (SVM) and autoassociative neural network (AANN) models are used for segmentation and classification. The evidence from audio and video is combined using the weighted sum rule for both segmentation and classification.
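The weighted sum rule mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the weight, the score scaling, or the model outputs, so everything here is an assumption made for illustration: the function names `weighted_sum_fusion` and `classify`, the parameter `audio_weight`, and the example scores are hypothetical, and the per-category scores are assumed to be already normalised to a comparable range.

```python
# Hypothetical category order; the five labels come from the abstract.
CATEGORIES = ["news", "advertisement", "sports", "serial", "movies"]


def weighted_sum_fusion(audio_scores, video_scores, audio_weight=0.5):
    """Combine per-category scores as w * audio + (1 - w) * video.

    audio_weight (w) is an assumed tuning parameter in [0, 1];
    the abstract does not state the value used by the authors.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    if len(audio_scores) != len(video_scores):
        raise ValueError("audio and video score lists must have equal length")
    return [
        audio_weight * a + (1.0 - audio_weight) * v
        for a, v in zip(audio_scores, video_scores)
    ]


def classify(audio_scores, video_scores, audio_weight=0.5):
    """Return the category with the highest fused score, plus the fused scores."""
    fused = weighted_sum_fusion(audio_scores, video_scores, audio_weight)
    best_index = max(range(len(fused)), key=fused.__getitem__)
    return CATEGORIES[best_index], fused


if __name__ == "__main__":
    # Made-up scores standing in for audio-model and video-model outputs.
    audio = [0.1, 0.6, 0.1, 0.1, 0.1]
    video = [0.5, 0.2, 0.1, 0.1, 0.1]
    label, fused = classify(audio, video, audio_weight=0.5)
    print(label, fused)
```

With equal weights the two modalities contribute equally; moving `audio_weight` toward 0 or 1 lets the decision lean on the video or the audio evidence, respectively. The same combination could in principle be applied to per-frame change-point evidence for segmentation, though the abstract does not describe how that evidence is computed.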

Index Terms

Computer Science

Information Sciences

Keywords

Support vector machines (SVM), Autoassociative neural network (AANN), Mel frequency cepstral coefficients, Color histogram, Audio and video segmentation, Audio and video classification, Weighted sum rule