Emotion Recognition from Speech using Discriminative Features

Purnima Chandrasekar; Santosh Chapaneri; Deepak Jayaswal

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 101 - Number 16

Year of Publication: 2014

Authors: Purnima Chandrasekar, Santosh Chapaneri, Deepak Jayaswal

10.5120/17775-8913

Purnima Chandrasekar, Santosh Chapaneri, Deepak Jayaswal. Emotion Recognition from Speech using Discriminative Features. International Journal of Computer Applications. 101, 16 (September 2014), 31-36. DOI=10.5120/17775-8913

@article{10.5120/17775-8913,
  author = {Purnima Chandrasekar and Santosh Chapaneri and Deepak Jayaswal},
  title = {Emotion Recognition from Speech using Discriminative Features},
  journal = {International Journal of Computer Applications},
  issue_date = {September 2014},
  volume = {101},
  number = {16},
  month = {September},
  year = {2014},
  issn = {0975-8887},
  pages = {31--36},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume101/number16/17775-8913/},
  doi = {10.5120/17775-8913},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T22:31:51.721435+05:30
%A Purnima Chandrasekar
%A Santosh Chapaneri
%A Deepak Jayaswal
%T Emotion Recognition from Speech using Discriminative Features
%J International Journal of Computer Applications
%@ 0975-8887
%V 101
%N 16
%P 31-36
%D 2014
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Building an accurate Speech Emotion Recognition (SER) system depends on extracting features from speech that are relevant to emotion. In this paper, the features extracted from the speech samples are Mel Frequency Cepstral Coefficients (MFCC), energy, pitch, spectral flux, spectral roll-off, and spectral stationarity. To avoid the 'curse of dimensionality', statistical parameters, i.e. mean, variance, median, maximum, minimum, and index of dispersion, are computed over the extracted features. To classify the emotion in an unknown test sample, a Support Vector Machine (SVM) is chosen for its proven efficiency. Experiments on the chosen features yield an average classification accuracy of 86.6% using a one-vs-all multi-class SVM, which improves to 100% when the task is reduced to a binary classification problem. Classifier metrics, viz. precision, recall, and F-score, show that the proposed system gives improved accuracy on Emo-DB.
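The dimensionality-reduction step described in the abstract can be illustrated with a minimal sketch. It assumes each feature (energy, pitch, and so on) has already been extracted as a list of per-frame values; the function names `summarize_contour` and `utterance_vector`, the sample values, and the choice of population variance are illustrative assumptions, not details taken from the paper. The index of dispersion is taken here as the variance-to-mean ratio.

```python
from statistics import mean, median, pvariance


def summarize_contour(values):
    """Collapse one frame-level feature contour into six statistics."""
    mu = mean(values)
    var = pvariance(values, mu)  # population variance (an assumption)
    # Index of dispersion = variance / mean; undefined when the mean is zero.
    dispersion = var / mu if mu != 0 else float("nan")
    return [mu, var, median(values), max(values), min(values), dispersion]


def utterance_vector(contours):
    """Concatenate the six statistics of every contour into one vector."""
    vector = []
    for name in sorted(contours):  # fixed ordering keeps dimensions aligned
        vector.extend(summarize_contour(contours[name]))
    return vector


# Hypothetical frame-level values for two features of one utterance.
contours = {
    "energy": [1.0, 2.0, 3.0, 4.0],
    "pitch": [100.0, 120.0, 110.0],
}
features = utterance_vector(contours)
print(len(features))  # 12: six statistics for each of the two contours
```

Whatever the number of frames in an utterance, the resulting vector has a fixed length of six values per feature, which is the kind of input a classifier such as an SVM expects.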

Index Terms

Computer Science

Information Sciences

Keywords

Feature extraction, dimensionality reduction, feature classification, Support Vector Machines, emotion recognition