Performance Evaluation of Speech Synthesis Techniques for Marathi Language

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2015
Authors:
Sangramsing Kayte, Monica Mundada, Charansing Kayte
DOI: 10.5120/ijca2015907023

Sangramsing Kayte, Monica Mundada and Charansing Kayte. Performance Evaluation of Speech Synthesis Techniques for Marathi Language. International Journal of Computer Applications 130(3):45-50, November 2015. Published by Foundation of Computer Science (FCS), NY, USA.

@article{key:article,
	author = {Sangramsing Kayte and Monica Mundada and Charansing Kayte},
	title = {Performance Evaluation of Speech Synthesis Techniques for Marathi Language},
	journal = {International Journal of Computer Applications},
	year = {2015},
	volume = {130},
	number = {3},
	pages = {45-50},
	month = {November},
	note = {Published by Foundation of Computer Science (FCS), NY, USA}
}

Abstract

Text-to-speech (TTS) synthesis is the production of artificial speech by a machine from input text. Speech can be synthesized by concatenative and hidden Markov model (HMM) techniques, and the resulting voice should be evaluated for quality. This study presents a comparative analysis of the quality of speech synthesized using the HMM and unit selection approaches. Quality is measured subjectively using the mean opinion score (MOS) and objectively using mean squared error (MSE) and peak signal-to-noise ratio (PSNR). It is also assessed using Mel-frequency cepstral coefficient (MFCC) features of the synthesized speech. The experimental analysis shows that the unit selection method yields a better synthesized voice than the HMM method.
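
As an illustration only, the subjective and objective measures named in the abstract might be computed as in the following minimal Python sketch. It is not taken from the paper: it assumes the standard textbook definitions (mean squared error between time-aligned reference and synthesized samples, PSNR derived from that error, and MOS as the arithmetic mean of listener ratings on a 1 to 5 scale). The function names and the default peak amplitude of 1.0 for normalized signals are hypothetical choices made for this example.

```python
import math


def mean_squared_error(reference, synthesized):
    """Average squared sample difference between two equal-length signals."""
    if not reference or len(reference) != len(synthesized):
        raise ValueError("signals must be non-empty and of equal length")
    total = sum((r - s) ** 2 for r, s in zip(reference, synthesized))
    return total / len(reference)


def psnr_db(reference, synthesized, peak=1.0):
    """Peak signal-to-noise ratio in decibels.

    `peak` is the assumed maximum amplitude (1.0 for normalized audio).
    Identical signals have zero error, so the ratio is reported as infinity.
    """
    mse = mean_squared_error(reference, synthesized)
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


def mean_opinion_score(ratings):
    """Arithmetic mean of listener ratings, each assumed to lie in 1..5."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must lie between 1 and 5")
    return sum(ratings) / len(ratings)


if __name__ == "__main__":
    # Toy values for illustration; these are not data from the study.
    natural = [0.5, 0.5, 0.0, 0.0]
    synthetic = [0.0, 0.0, 0.0, 0.0]
    print("MSE:", mean_squared_error(natural, synthetic))
    print("PSNR (dB):", psnr_db(natural, synthetic))
    print("MOS:", mean_opinion_score([4, 5, 3, 4]))
```

Under these definitions a lower MSE and a higher PSNR indicate a synthesized signal closer to the reference, while a higher MOS indicates better perceived quality. Real recordings would first need to be time-aligned and amplitude-normalized, which this sketch does not attempt.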

Keywords

TTS, MOS, HMM, Unit Selection, Mean, Variance.