Research Article

Emotion Recognition and Classification in Speech using Artificial Neural Networks

by Akash Shaw, Rohan Kumar Vardhan, Siddharth Saxena
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 145 - Number 8
Year of Publication: 2016
DOI: 10.5120/ijca2016910710

Akash Shaw, Rohan Kumar Vardhan, Siddharth Saxena. Emotion Recognition and Classification in Speech using Artificial Neural Networks. International Journal of Computer Applications 145, 8 (Jul 2016), 5-9. DOI=10.5120/ijca2016910710

@article{10.5120/ijca2016910710,
  author    = {Akash Shaw and Rohan Kumar Vardhan and Siddharth Saxena},
  title     = {Emotion Recognition and Classification in Speech using Artificial Neural Networks},
  journal   = {International Journal of Computer Applications},
  issue_date = {Jul 2016},
  volume    = {145},
  number    = {8},
  month     = {Jul},
  year      = {2016},
  issn      = {0975-8887},
  pages     = {5-9},
  numpages  = {5},
  url       = {https://ijcaonline.org/archives/volume145/number8/25296-2016910710/},
  doi       = {10.5120/ijca2016910710},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address   = {New York, USA}
}
Abstract

To date, little research has been done in emotion classification and recognition in speech. This article therefore discusses why the topic is of interest and presents a system for classifying and recognizing emotions in speech using artificial neural networks. The proposed system will be speaker independent, since a database of emotional speech samples will be used. Various classifiers will be used to differentiate emotions such as neutral, anger, happiness, and sadness. Prosodic features such as pitch, energy, and formant frequencies, together with spectral features such as mel-frequency cepstral coefficients (MFCCs), will be used in the system. The classifiers will be trained on these features to classify emotions accurately; following classification, the same features will be used to recognize the emotion of a given speech sample. Thus, many components, including speech pre-processing, MFCC features, prosodic features, and classifiers, come together in the implementation of a speech-based emotion recognition system.
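
As a concrete illustration of the pipeline described above, the sketch below extracts MFCC and simple prosodic features (pitch and energy) from speech files and trains a small feed-forward neural network to classify the emotion. This is a minimal sketch under stated assumptions, not the authors' implementation: the paper does not name its tools, so librosa, scikit-learn's MLPClassifier, the file paths, and all parameter values here are illustrative choices, and formant features are omitted for brevity.

# Minimal illustrative sketch (assumed libraries: librosa, scikit-learn).
import numpy as np
import librosa
from sklearn.neural_network import MLPClassifier

def extract_features(path, n_mfcc=13):
    """Return a fixed-length vector of spectral (MFCC) and prosodic (pitch, energy) features."""
    y, sr = librosa.load(path, sr=16000)
    y, _ = librosa.effects.trim(y)                    # basic pre-processing: trim leading/trailing silence
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    pitch = librosa.yin(y, fmin=60, fmax=400, sr=sr)  # fundamental-frequency (pitch) track
    energy = librosa.feature.rms(y=y)[0]              # short-time energy
    return np.concatenate([
        mfcc.mean(axis=1), mfcc.std(axis=1),
        [pitch.mean(), pitch.std(), energy.mean(), energy.std()],
    ])

def train_classifier(wav_paths, emotion_labels):
    """Fit a small multilayer perceptron on features extracted from labelled speech samples."""
    X = np.vstack([extract_features(p) for p in wav_paths])
    clf = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=2000, random_state=0)
    clf.fit(X, emotion_labels)
    return clf

# Hypothetical usage with a labelled emotional-speech database:
# clf = train_classifier(["db/anger_01.wav", "db/sad_01.wav"], ["anger", "sad"])
# print(clf.predict([extract_features("db/test.wav")]))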

Index Terms

Computer Science
Information Sciences

Keywords

ANN, MFCC, prosodic features, emotion classification and recognition, pre-processing