
Text-to-Speech Recognition using Google API

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2021
Orlunwo Placida Orochi, Ledisi Giok Kabari

Orlunwo Placida Orochi and Ledisi Giok Kabari. Text-to-Speech Recognition using Google API. International Journal of Computer Applications 183(15):18-20, July 2021.

BibTeX

	author = {Orlunwo Placida Orochi and Ledisi Giok Kabari},
	title = {Text-to-Speech Recognition using Google API},
	journal = {International Journal of Computer Applications},
	issue_date = {July 2021},
	volume = {183},
	number = {15},
	month = {Jul},
	year = {2021},
	issn = {0975-8887},
	pages = {18-20},
	numpages = {3},
	url = {},
	doi = {10.5120/ijca2021921474},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


Speech is the most natural mode of human communication. To enable machines to understand human speech, computers can act as intermediaries for human experts, allowing them to respond accurately and reliably to human voices. This can be accomplished with a text-to-speech recognition system, which lets a computer interpret the language in which a message was written and convert it to an audio file that can be played through an output device such as a speaker. The aim of the study is to implement a text-to-speech model in the Python programming language and to check whether written messages are read aloud. Using the Google API, text-to-speech conversion was successful.
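The abstract does not say which Google service or Python package the study used, so the following is only a minimal sketch of how such a conversion could look. It assumes the Google Cloud Text-to-Speech REST endpoint (`text:synthesize`) and an API key supplied by the caller; the function names, the default language code, and the output file name are hypothetical and chosen for illustration.

```python
import base64
import json
import urllib.request

# Assumption: the Google Cloud Text-to-Speech v1 REST endpoint is used.
# The paper's abstract does not name the exact service.
TTS_ENDPOINT = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_request_body(text, language_code="en-US"):
    """Build the JSON body for a text:synthesize request."""
    return {
        "input": {"text": text},
        "voice": {"languageCode": language_code},
        "audioConfig": {"audioEncoding": "MP3"},
    }


def decode_audio(response_json):
    """Return raw audio bytes from the base64 'audioContent' field."""
    return base64.b64decode(response_json["audioContent"])


def synthesize(text, api_key, out_path="message.mp3"):
    """Send text to the service and save the returned MP3.

    Requires network access and a valid API key (both assumptions here).
    """
    request = urllib.request.Request(
        TTS_ENDPOINT,
        data=json.dumps(build_request_body(text)).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-Goog-Api-Key": api_key,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        audio = decode_audio(json.load(response))
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return out_path
```

Calling `synthesize("Hello, world", api_key)` would write the spoken message to `message.mp3`, which can then be played through a speaker. Keeping the request building and response decoding in separate functions makes it possible to check them without contacting the service.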




API, Artificial Intelligence, Speech, Text.