Research Article

Survey on Various Methods of Text to Speech Synthesis

by Desai Siddhi, Jashin M. Verghese, Desai Bhavik
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 165 - Number 6
Year of Publication: 2017
10.5120/ijca2017913891

Desai Siddhi, Jashin M. Verghese, Desai Bhavik. Survey on Various Methods of Text to Speech Synthesis. International Journal of Computer Applications. 165, 6 (May 2017), 26-30. DOI=10.5120/ijca2017913891

@article{10.5120/ijca2017913891,
author = {Desai Siddhi and Jashin M. Verghese and Desai Bhavik},
title = {Survey on Various Methods of Text to Speech Synthesis},
journal = {International Journal of Computer Applications},
issue_date = {May 2017},
volume = {165},
number = {6},
month = {May},
year = {2017},
issn = {0975-8887},
pages = {26-30},
numpages = {9},
url = {https://ijcaonline.org/archives/volume165/number6/27578-2017913891/},
doi = {10.5120/ijca2017913891},
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:11:43.201345+05:30
%A Desai Siddhi
%A Jashin M. Verghese
%A Desai Bhavik
%T Survey on Various Methods of Text to Speech Synthesis
%J International Journal of Computer Applications
%@ 0975-8887
%V 165
%N 6
%P 26-30
%D 2017
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The primary objective of this paper is to provide an overview of existing text-to-speech synthesis techniques. Text-to-speech synthesis can be broadly divided into three categories: formant-based, concatenative, and articulatory. Formant-based synthesis relies on techniques such as the cascade, parallel, Klatt, and PARCAS models. Concatenative synthesis can in turn be divided into three categories (diphone-based, corpus-based, and hybrid), whereas articulatory synthesis involves vocal tract models, acoustic models, glottis models, and noise source models. This paper explains each of these methods along with its pros and cons.
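The classification described in the abstract can be written down as a small lookup table. The following is only an illustrative sketch: the dictionary layout and the helper name `methods_of` are assumptions made here for clarity and do not come from the paper itself.

```python
# Taxonomy of text-to-speech synthesis methods as listed in the abstract.
# The data layout and the helper `methods_of` are hypothetical, chosen
# only to make the three-way classification explicit.
TTS_TAXONOMY = {
    "formant": ["cascade", "parallel", "Klatt", "PARCAS"],
    "concatenative": ["diphone-based", "corpus-based", "hybrid"],
    "articulatory": [
        "vocal tract models",
        "acoustic models",
        "glottis models",
        "noise source models",
    ],
}


def methods_of(category: str) -> list[str]:
    """Return the techniques the abstract lists under one category."""
    return TTS_TAXONOMY[category.lower()]


if __name__ == "__main__":
    # Print each top-level category with its sub-techniques.
    for category, techniques in TTS_TAXONOMY.items():
        print(f"{category}: {', '.join(techniques)}")
```

Calling `methods_of("Concatenative")` would return the three concatenative sub-categories named in the abstract.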

Index Terms

Computer Science
Information Sciences

Keywords

Text to speech synthesis, Formant speech synthesis, Concatenative speech synthesis, Articulatory speech synthesis