Telephony Speech Recognition System: Challenges

Research Article

Published in November 2012 by Joyanta Basu, Rajib Roy, Milton S. Bepari, Soma Khan

National Conference on Communication Technologies & its impact on Next Generation Computing 2012

Foundation of Computer Science USA

CTNGC - Number 1

November 2012

Authors: Joyanta Basu, Rajib Roy, Milton S. Bepari, Soma Khan

Joyanta Basu, Rajib Roy, Milton S. Bepari, Soma Khan. Telephony Speech Recognition System: Challenges. National Conference on Communication Technologies & its impact on Next Generation Computing 2012. CTNGC, 1 (November 2012), 30-36.

@article{
author = { Joyanta Basu, Rajib Roy, Milton S. Bepari, Soma Khan },
title = { Telephony Speech Recognition System: Challenges },
journal = { National Conference on Communication Technologies & its impact on Next Generation Computing 2012 },
issue_date = { November 2012 },
volume = { CTNGC },
number = { 1 },
month = { November },
year = { 2012 },
issn = 0975-8887,
pages = { 30-36 },
numpages = 7,
url = { /proceedings/ctngc/number1/9051-1007/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Proceeding Article
%1 National Conference on Communication Technologies & its impact on Next Generation Computing 2012
%A Joyanta Basu
%A Rajib Roy
%A Milton S. Bepari
%A Soma Khan
%T Telephony Speech Recognition System: Challenges
%J National Conference on Communication Technologies & its impact on Next Generation Computing 2012
%@ 0975-8887
%V CTNGC
%N 1
%P 30-36
%D 2012
%I International Journal of Computer Applications

Abstract

This paper describes the challenges of designing a telephony Automatic Speech Recognition (ASR) system. Telephonic speech data are collected automatically from all geographical regions of West Bengal to cover the major dialectal variations of spoken Bangla. All incoming calls are handled by an Asterisk server, i.e., a computer telephony interface (CTI). The system asks a set of queries, and the users' spoken responses are stored and transcribed manually for ASR system training. In a real-time scenario, telephonic speech contains channel drops, silence or no-speech events, truncated speech signals, noisy signals, etc., along with the desired speech event. The paper describes these challenges and briefly outlines techniques that handle such unwanted signals to a certain extent, so that a signal close to the desired speech is passed to the ASR system.
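The abstract does not say which techniques are used to screen out unwanted signals. As an illustration only, the sketch below assumes two common temporal features, short-time energy and zero-crossing rate, and uses them to flag a recorded response as containing no speech, as possibly truncated (speech touching the start or end of the recording), or as usable. The function names, thresholds, frame sizes, and the 8 kHz sampling rate are all assumptions made for this example and are not taken from the paper.

```python
import math

def frame_features(samples, frame_len=200, hop=80):
    """Return (short-time energy, zero-crossing rate) for each frame.

    Assumes 8 kHz audio scaled to [-1, 1]; 200 samples = 25 ms, 80 = 10 ms.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

def classify_call(samples, energy_thresh=1e-3, zcr_max=0.35,
                  frame_len=200, hop=80, edge_frames=2):
    """Label a recording 'no_speech', 'truncated', or 'ok'.

    Thresholds are hypothetical and would need tuning on real call data.
    """
    feats = frame_features(samples, frame_len, hop)
    # A frame counts as speech-like if it is loud enough and not noise-like.
    speech = [e > energy_thresh and z < zcr_max for e, z in feats]
    if not any(speech):
        return "no_speech"
    # Speech in the first or last frames suggests the utterance was cut off.
    if any(speech[:edge_frames]) or any(speech[-edge_frames:]):
        return "truncated"
    return "ok"

# Synthetic demo: a 200 Hz tone stands in for voiced speech.
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(3200)]
pad = [0.0] * 2400
for name, signal in [("silence", [0.0] * 8000),
                     ("padded tone", pad + tone + pad),
                     ("tone at start", tone + pad + pad)]:
    print(name, "->", classify_call(signal))
```

A simple rule-based check like this is cheap enough to run on every stored response before manual transcription, but fixed thresholds are fragile across handsets and line conditions, so in practice they would be calibrated per channel or replaced by an adaptive estimate.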

Index Terms

Computer Science

Information Sciences

Keywords

Asterisk Server, Interactive Voice Response, Transcription Tool, Temporal and Spectral Features, Knowledge Base