T.U.E.S.D.A.Y (Translation Using machine learning from English Speech to Devanagari Automated for You)

Varun Soni; Rizwan Shaikh; Sayantan Mahato; Shaikh Phiroj

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 183 - Number 9

Year of Publication: 2021

Authors: Varun Soni, Rizwan Shaikh, Sayantan Mahato, Shaikh Phiroj

10.5120/ijca2021921387

Varun Soni, Rizwan Shaikh, Sayantan Mahato, Shaikh Phiroj. T.U.E.S.D.A.Y (Translation Using machine learning from English Speech to Devanagari Automated for You). International Journal of Computer Applications. 183, 9 (Jun 2021), 1-6. DOI=10.5120/ijca2021921387

@article{ 10.5120/ijca2021921387,

author = { Varun Soni and Rizwan Shaikh and Sayantan Mahato and Shaikh Phiroj },

title = { T.U.E.S.D.A.Y (Translation Using machine learning from English Speech to Devanagari Automated for You) },

journal = { International Journal of Computer Applications },

issue_date = { Jun 2021 },

volume = { 183 },

number = { 9 },

month = { Jun },

year = { 2021 },

issn = { 0975-8887 },

pages = { 1-6 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume183/number9/31952-2021921387/ },

doi = { 10.5120/ijca2021921387 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-07T01:16:16.519178+05:30

%A Varun Soni

%A Rizwan Shaikh

%A Sayantan Mahato

%A Shaikh Phiroj

%T T.U.E.S.D.A.Y (Translation Using machine learning from English Speech to Devanagari Automated for You)

%J International Journal of Computer Applications

%@ 0975-8887

%V 183

%N 9

%P 1-6

%D 2021

%I Foundation of Computer Science (FCS), NY, USA

Abstract

In today's globalized world, language remains a barrier to the healthy exchange of information. Platforms such as YouTube and Facebook make it easy to share knowledge with people around the world, yet language can still impede that flow. Someone in India who wants to read written content from another country can use a service such as Google Translate. The same does not extend to content in an audio-visual medium, since no such apparatus has been developed to help people comprehend content of that type. Students in particular experience this first-hand when they try to access content from other universities that is delivered in a language they are not well versed in. With these issues in mind, this group is building an automatic voice dubbing system: a speech-to-speech translation pipeline that helps users understand one another without worrying about the language barrier. The model is called T.U.E.S.D.A.Y (Translation Using machine learning from English Speech to Devanagari Automated for You) and is divided into three conversion modules: English speech to English text, English text to Devanagari text, and finally Devanagari text to Hindi speech. These modules work in tandem to form an integrated model.
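The three-module structure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name DubbingPipeline, the type aliases, and the three stub functions are hypothetical and only show how the stages chain together. In a real system each stub would be replaced by a trained model (for example, a speech recognizer, a translation model, and a speech synthesizer).

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures (assumptions, not taken from the paper).
SpeechToText = Callable[[bytes], str]   # English audio -> English text
Translate = Callable[[str], str]        # English text -> Devanagari text
TextToSpeech = Callable[[str], bytes]   # Devanagari text -> Hindi audio


@dataclass(frozen=True)
class DubbingPipeline:
    """Chains the three conversion modules in the order the abstract lists them."""
    speech_to_text: SpeechToText
    translate: Translate
    text_to_speech: TextToSpeech

    def run(self, english_audio: bytes) -> bytes:
        english_text = self.speech_to_text(english_audio)
        devanagari_text = self.translate(english_text)
        return self.text_to_speech(devanagari_text)


# Stub stages for illustration only: "audio" here is just UTF-8 encoded text.
def stub_speech_to_text(audio: bytes) -> str:
    return audio.decode("utf-8")


def stub_translate(text: str) -> str:
    lookup = {"hello": "नमस्ते"}  # toy dictionary, not a real translation model
    return lookup.get(text.lower(), text)


def stub_text_to_speech(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = DubbingPipeline(stub_speech_to_text, stub_translate, stub_text_to_speech)
hindi_audio = pipeline.run(b"hello")
```

Keeping each stage behind a plain callable means any one module can be swapped or tested in isolation without touching the other two.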

References

DeepSpeech - an open-source speech-to-text engine. https://github.com/mozilla/DeepSpeech. Accessed: 2020-05-13.
Google Colab - a Python development environment that runs in the browser using Google Cloud. https://research.google.com/colaboratory/. Accessed: 2020-05-13.
Keras - an open-source software library for artificial neural networks. https://keras.io/. Accessed: 2020-05-13.
TensorBoard - TensorFlow's visualization toolkit. https://www.tensorflow.org/tensorboard. Accessed: 2020-05-13.
TensorFlow - a free and open-source software library for machine learning. https://www.tensorflow.org/. Accessed: 2020-05-13.
Yi-Hsiu Liao, Hung-yi Lee, and Lin-shan Lee. Towards structured deep neural network for automatic speech recognition. In 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pages 137–144. IEEE, 2015.
Yi Ren, Yangjun Ruan, Xu Tan, Tao Qin, Sheng Zhao, Zhou Zhao, and Tie-Yan Liu. FastSpeech: Fast, robust and controllable text to speech, 2019.
Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, Jeff Klingner, Apurva Shah, Melvin Johnson, Xiaobing Liu, Łukasz Kaiser, Stephan Gouws, Yoshikiyo Kato, Taku Kudo, Hideto Kazawa, Keith Stevens, George Kurian, Nishant Patil, Wei Wang, Cliff Young, Jason Smith, Jason Riesa, Alex Rudnick, Oriol Vinyals, Greg Corrado, Macduff Hughes, and Jeffrey Dean. Google's neural machine translation system: Bridging the gap

Index Terms

Computer Science

Information Sciences

Keywords

Neural Networks, TensorFlow, DeepSpeech, Keras, FastSpeech