Research Article

Designing and Recording Emotional Speech Databases

Published on March 2012 by Swati D. Bhutekar, M. B. Chandak
2nd National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2013)
Foundation of Computer Science USA
NCIPET - Number 14
March 2012
Authors: Swati D. Bhutekar, M. B. Chandak

Swati D. Bhutekar, M. B. Chandak. Designing and Recording Emotional Speech Databases. 2nd National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2013). NCIPET, 14 (March 2012), 4-10.

@article{
author = { Swati D. Bhutekar, M. B. Chandak },
title = { Designing and Recording Emotional Speech Databases },
journal = { 2nd National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2013) },
issue_date = { March 2012 },
volume = { NCIPET },
number = { 14 },
month = { March },
year = { 2012 },
issn = { 0975-8887 },
pages = { 4-10 },
numpages = 7,
url = { /proceedings/ncipet/number14/5294-1106/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 2nd National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2013)
%A Swati D. Bhutekar
%A M. B. Chandak
%T Designing and Recording Emotional Speech Databases
%J 2nd National Conference on Innovative Paradigms in Engineering and Technology (NCIPET 2013)
%@ 0975-8887
%V NCIPET
%N 14
%P 4-10
%D 2012
%I International Journal of Computer Applications
Abstract

This paper describes the factors involved in designing and recording large speech databases for applications requiring speech synthesis. Given the growing demand for customized and domain-specific voices for use in corpus-based synthesis systems, good practices should be established for creating these databases, which are a key factor in the quality of the resulting speech synthesizer. This paper focuses on the factors affecting the design of the recording prompts, the speaker selection procedure, the recording setup, and the quality control of the resulting database. One way to find the emotion in speech is as follows: once the speech has been recorded from the user, it is converted into text; at the same time, the stressed word in the speech is noted, and the frequency of that word is then determined in order to record the corresponding emotion.
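
The last step of the abstract (find the stressed word, then find its frequency) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' method: the abstract does not say how stress or frequency is measured, so the sketch assumes that the stressed word is the one with the highest RMS energy and that "frequency" means fundamental frequency estimated from the autocorrelation peak. The function names, the 75-400 Hz search range, and the synthetic word segments are all hypothetical. Speech-to-text and the mapping from frequency to an emotion label are left out because the abstract does not specify them.

```python
import math


def rms_energy(samples):
    """Root-mean-square amplitude of a list of float samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def estimate_f0(samples, sample_rate, f_min=75.0, f_max=400.0):
    """Estimate fundamental frequency in Hz from the autocorrelation peak.

    The f_min/f_max search range is an assumed range for speech pitch.
    """
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(samples) - 1)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation of the segment at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def stressed_word_f0(word_segments, sample_rate):
    """Pick the highest-energy word (assumed proxy for stress) and estimate its f0.

    word_segments is a list of (word, samples) pairs; in practice these would
    come from a recognizer that provides word timings (assumed, not shown).
    """
    word, samples = max(word_segments, key=lambda seg: rms_energy(seg[1]))
    return word, estimate_f0(samples, sample_rate)


def tone(freq_hz, amplitude, sample_rate, n_samples=400):
    """Synthetic sine segment standing in for one recorded word."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


SAMPLE_RATE = 8000  # Hz, chosen only for this example
segments = [
    ("i", tone(120.0, 0.3, SAMPLE_RATE)),
    ("won", tone(200.0, 0.9, SAMPLE_RATE)),    # loudest segment
    ("today", tone(130.0, 0.4, SAMPLE_RATE)),
]
word, f0 = stressed_word_f0(segments, SAMPLE_RATE)
print(word, round(f0, 1))
```

With real recordings, each segment would be the audio for one recognized word, and the resulting frequency would then be compared against whatever emotion criteria the database design defines.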

Index Terms

Computer Science
Information Sciences

Keywords

Extraction of Emotion in Speech, Database Recording