Research Article

Modeling of Phoneme Transitions for Natural Synthesis of Speech Signals

by H. M. L. N. K. Herath, J. V. Wijayakulasooriya
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 181 - Number 23
Year of Publication: 2018
Authors: H. M. L. N. K. Herath, J. V. Wijayakulasooriya
DOI: 10.5120/ijca2018918008

H. M. L. N. K. Herath, J. V. Wijayakulasooriya. Modeling of Phoneme Transitions for Natural Synthesis of Speech Signals. International Journal of Computer Applications. 181, 23 (Oct 2018), 22-29. DOI=10.5120/ijca2018918008

@article{ 10.5120/ijca2018918008,
author = { H. M. L. N. K. Herath and J. V. Wijayakulasooriya },
title = { Modeling of Phoneme Transitions for Natural Synthesis of Speech Signals },
journal = { International Journal of Computer Applications },
issue_date = { Oct 2018 },
volume = { 181 },
number = { 23 },
month = { Oct },
year = { 2018 },
issn = { 0975-8887 },
pages = { 22-29 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume181/number23/30026-2018918008/ },
doi = { 10.5120/ijca2018918008 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:06:49.032334+05:30
%A H. M. L. N. K. Herath
%A J. V. Wijayakulasooriya
%T Modeling of Phoneme Transitions for Natural Synthesis of Speech Signals
%J International Journal of Computer Applications
%@ 0975-8887
%V 181
%N 23
%P 22-29
%D 2018
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Natural-sounding speech synthesis must capture the minute variations that occur in phonemes during production, which are affected by many factors. A well-known problem in speech synthesis is the occurrence of audible discontinuities at phoneme boundaries, which make synthetic speech sound unnatural. This study introduces a novel low-bit-rate method to improve the naturalness of synthetic speech. It presents a sinusoidal-noise-based mathematical method that reconstructs the transition region from one phoneme to the next using fewer parameters. The speech information (amplitude, phase, and frequency) was extracted using three different algorithms: the Fast Fourier Transform (FFT), an Auto Regressive (AR) model with Linear Predictive Coding (LPC), and an Auto Regressive Moving Average (ARMA) model with the Steiglitz-McBride method. Polynomial coefficients were then estimated to represent this information with fewer parameters. The results show that the synthesized output is more highly correlated with the source signal for the FFT method than for the AR and ARMA models.
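The last two steps of the abstract, representing a parameter track with polynomial coefficients and scoring the resynthesized signal by its correlation with the source, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single sinusoid whose amplitude and frequency tracks are already known at ten frame positions (in the paper they would come from FFT, LPC, or Steiglitz-McBride analysis). The sample rate, transition length, polynomial degree, track shapes, and all function names are hypothetical choices made for illustration.

```python
import math

def polyfit(xs, ys, degree):
    """Least-squares polynomial fit; returns coefficients, lowest order first."""
    n = degree + 1
    # Normal equations: a[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs

def polyval(coeffs, x):
    """Evaluate a polynomial given lowest-order-first coefficients."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

def synthesize(amp_at, freq_at, num_samples, fs):
    """Generate one sinusoid by accumulating phase sample by sample."""
    out, phase = [], 0.0
    for n in range(num_samples):
        u = n / (num_samples - 1)  # normalized time in [0, 1]
        out.append(amp_at(u) * math.sin(phase))
        phase += 2.0 * math.pi * freq_at(u) / fs
    return out

def correlation(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((p - mx) * (q - my) for p, q in zip(x, y))
    vx = sum((p - mx) ** 2 for p in x)
    vy = sum((q - my) ** 2 for q in y)
    return cov / math.sqrt(vx * vy)

# Hypothetical setup: a 50 ms transition sampled at 8 kHz.
FS, NUM_SAMPLES, NUM_FRAMES, DEGREE = 8000, 400, 10, 3

def true_freq(u):
    return 200.0 + 50.0 * (1.0 - math.cos(math.pi * u))  # 200 Hz -> 300 Hz

def true_amp(u):
    return 1.0 - 0.5 * u * u  # 1.0 -> 0.5

original = synthesize(true_amp, true_freq, NUM_SAMPLES, FS)

# Stand-in for analysis output: track values at ten frame positions.
frame_times = [k / (NUM_FRAMES - 1) for k in range(NUM_FRAMES)]
freq_coeffs = polyfit(frame_times, [true_freq(u) for u in frame_times], DEGREE)
amp_coeffs = polyfit(frame_times, [true_amp(u) for u in frame_times], DEGREE)

# Resynthesize from the polynomial coefficients alone, then compare.
resynth = synthesize(lambda u: polyval(amp_coeffs, u),
                     lambda u: polyval(freq_coeffs, u), NUM_SAMPLES, FS)
rho = correlation(original, resynth)
print(f"coefficients stored: {len(freq_coeffs) + len(amp_coeffs)}, correlation: {rho:.4f}")
```

In this toy setup, eight polynomial coefficients stand in for twenty frame-level track values. A real transition would carry many sinusoids plus a noise component, so the saving and the achievable correlation would differ.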

Index Terms

Computer Science
Information Sciences

Keywords

Fast Fourier Transform algorithm (FFT), Auto Regressive model (AR), Auto Regressive Moving Average model (ARMA), Speech Synthesis, Correlation Coefficient, Phoneme Transition