Research Article

Text-to-Speech Recognition using Google API

by Orlunwo Placida Orochi, Ledisi Giok Kabari
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 183 - Number 15
Year of Publication: 2021

Orlunwo Placida Orochi, Ledisi Giok Kabari. Text-to-Speech Recognition using Google API. International Journal of Computer Applications. 183, 15 (Jul 2021), 18-20. DOI=10.5120/ijca2021921474

@article{ 10.5120/ijca2021921474,
author = { Orlunwo Placida Orochi, Ledisi Giok Kabari },
title = { Text-to-Speech Recognition using Google API },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2021 },
volume = { 183 },
number = { 15 },
month = { Jul },
year = { 2021 },
issn = { 0975-8887 },
pages = { 18-20 },
numpages = {9},
url = { },
doi = { 10.5120/ijca2021921474 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:16:52.783720+05:30
%A Orlunwo Placida Orochi
%A Ledisi Giok Kabari
%T Text-to-Speech Recognition using Google API
%J International Journal of Computer Applications
%@ 0975-8887
%V 183
%N 15
%P 18-20
%D 2021
%I Foundation of Computer Science (FCS), NY, USA

Speech is the most natural mode of human communication. When machines can handle human speech, a computer can act as an intermediary for human experts, responding accurately and reliably to human voices. This can be accomplished by a text-to-speech recognition device, which allows a data processor to interpret the language in which a message was written and convert it to an audio file that can be played through an output device such as a speaker. The aim of this study is to implement a text-to-speech model in the Python programming language and to check whether written messages are read aloud. Using the Google API, text-to-speech conversion was successful.
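
The conversion described in the abstract can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the library used, so the sketch assumes the third-party gTTS package as the Google-backed synthesizer. The names text_to_speech, gtts_backend and message.mp3 are hypothetical and chosen only for this example.

```python
from pathlib import Path
from typing import Callable

# A backend takes (text, language code, output path) and writes an audio file.
Synthesizer = Callable[[str, str, Path], None]


def gtts_backend(text: str, lang: str, out_path: Path) -> None:
    """Assumed backend: the gTTS package (pip install gTTS).

    gTTS sends the text to Google's text-to-speech service, so this call
    needs network access. The paper's abstract does not confirm this library.
    """
    from gtts import gTTS  # imported lazily so the module loads without it

    gTTS(text=text, lang=lang).save(str(out_path))


def text_to_speech(
    text: str,
    out_path: str = "message.mp3",
    lang: str = "en",
    backend: Synthesizer = gtts_backend,
) -> Path:
    """Convert a written message to an audio file and return the file path."""
    message = text.strip()
    if not message:
        raise ValueError("nothing to read: the message is empty")
    target = Path(out_path)
    backend(message, lang, target)  # the backend writes the audio to disk
    return target
```

With gTTS installed and a network connection available, calling text_to_speech("Speech is the most natural mode of human communication.") would be expected to write message.mp3, which can then be played through a speaker. The backend is passed as a parameter so that another synthesizer can be substituted without changing the calling code.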

Index Terms

Computer Science
Information Sciences


API, Artificial Intelligence, Speech, Text