
Designing and Recording Emotional Speech Databases

IJCA Proceedings on National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2012)
© 2012 by IJCA Journal
ncipet - Number 14
Year of Publication: 2012
Authors:
Swati D. Bhutekar
M. B. Chandak

Swati D. Bhutekar and M. B. Chandak. Article: Designing and Recording Emotional Speech Databases. IJCA Proceedings on National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2012) ncipet(14):4-10, March 2012. Full text available. BibTeX:

@article{key:article,
	author = {Swati D. Bhutekar and M. B. Chandak},
	title = {Article: Designing and Recording Emotional Speech Databases},
	journal = {IJCA Proceedings on National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2012)},
	year = {2012},
	volume = {ncipet},
	number = {14},
	pages = {4-10},
	month = {March},
	note = {Full text available}
}

Abstract

This paper describes the factors involved in designing and recording large speech databases for applications requiring speech synthesis. Given the growing demand for customized and domain-specific voices for use in corpus-based synthesis systems, good practices should be established for the creation of these databases, which are a key factor in the quality of the resulting speech synthesizer. This paper focuses on the factors affecting the design of the recording prompts, the speaker selection procedure, the recording setup, and the quality control of the resulting database. One way to find emotions in speech is as follows: once the speech has been recorded from the user, it is converted into text; at the same time, the stressed word in the speech is isolated, and the fundamental frequency of that word is then measured to identify the corresponding emotion.
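As a rough illustration of that last step, the sketch below is a minimal Python example, not code from the paper: it estimates the fundamental frequency (F0) of a voiced segment with a simple autocorrelation pitch estimator and maps the result to a coarse emotion label. The frequency bands used for the mapping are illustrative assumptions, not values taken from the paper.

import numpy as np

def estimate_f0(segment: np.ndarray, sample_rate: int) -> float:
    """Estimate the fundamental frequency of a voiced segment via autocorrelation."""
    segment = segment - segment.mean()
    corr = np.correlate(segment, segment, mode="full")
    corr = corr[len(corr) // 2:]            # keep non-negative lags only
    # Search lags corresponding to a plausible speech F0 range (75-400 Hz).
    min_lag = sample_rate // 400
    max_lag = sample_rate // 75
    peak_lag = min_lag + int(np.argmax(corr[min_lag:max_lag]))
    return sample_rate / peak_lag

def label_emotion(f0_hz: float) -> str:
    """Map F0 to a coarse emotion label; the thresholds are assumed, not from the paper."""
    if f0_hz > 250:
        return "excited/angry"      # strongly raised pitch (assumed band)
    if f0_hz > 180:
        return "happy/surprised"    # moderately raised pitch (assumed band)
    return "neutral/sad"            # low pitch (assumed band)

if __name__ == "__main__":
    sr = 16000
    t = np.arange(2048) / sr                       # one short analysis frame
    stressed_word = np.sin(2 * np.pi * 220.0 * t)  # stand-in for a stressed word at 220 Hz
    f0 = estimate_f0(stressed_word, sr)
    print(f"F0 ~ {f0:.1f} Hz -> {label_emotion(f0)}")

In practice the stressed word would come from a recording aligned with the recognized text, and the F0 estimate would feed into a classifier alongside other prosodic features, but the flow mirrors the pipeline the abstract describes.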

References

  • Gregor O. Hofer, “Emotional Speech Synthesis”, MSc thesis, School of Informatics, University of Edinburgh, 2004.
  • Ibon Saratxaga, Eva Navas, Inmaculada Hernáez and Iker Luengo, “Designing and Recording an Emotional Speech Database for Corpus Based Synthesis in Basque”, Aholab, Dept. of Electronics and Telecommunications, Faculty of Engineering, University of the Basque Country.
  • Inger S. Engberg, Anya V. Hansen, Ove Andersen and Paul Dalsgaard, “Design, Recording and Verification of a Danish Emotional Speech Database”.
  • Luís C. Oliveira, Sérgio Paulo, Luís Figueira, Carlos Mendes, Ana Nunes and Joaquim Godinho, “Methodologies for Designing and Recording Speech Databases for Corpus Based Synthesis”.
  • Masaki Kurematsu, Jun Hakura and Hamido Fujita, “An Extraction of Emotion in Human Speech Using Speech Synthesize and Classifiers for Each Emotion”, in International Journal of Circuits, Systems and Signal Processing.
  • Dimitrios Ververidis and Constantine Kotropoulos, “A Review of Emotional Speech Databases”, in Proc. 9th Panhellenic Conference on Informatics (PCI), pp. 560-574, Thessaloniki, Greece, November 2003.
  • “Models of Speech Synthesis”, draft version of a paper presented at the Colloquium on Human-Machine Communication by Voice, Irvine, California, February 8-9, 1993, organized by the National Academy of Sciences, USA.
  • “Features and Algorithms for the Recognition of Emotions in Speech”, in Proceedings of the 1st International Conference on Speech Prosody (2002).
  • C. Lee and S. Narayanan, “Toward Detecting Emotions in Spoken Dialogs”, IEEE Transactions on Speech and Audio Processing, vol. 13, 2005.
  • B. Kort, R. Reilly and R. W. Picard, “An Affective Model of Interplay Between Emotions and Learning: Reengineering Educational Pedagogy-Building a Learning Companion”, in Proceedings of the International Conference on Advanced Learning Technologies (ICALT 2001), Madison, Wisconsin, August 2001.
  • Slobodan T. Jovičić, Zorka Kašić, Miodrag Đorđević and Mirjana Rajković, “Serbian Emotional Speech Database: Design, Processing and Evaluation”, presented at SPECOM’2004: 9th Conference on Speech and Computer, St. Petersburg, Russia, September 20-22, 2004.