Research Article

Telephony Speech Recognition System: Challenges

Published in November 2012 by Joyanta Basu, Rajib Roy, Milton S. Bepari, Soma Khan
National Conference on Communication Technologies & its impact on Next Generation Computing 2012
Foundation of Computer Science USA
CTNGC - Number 1
November 2012
Authors: Joyanta Basu, Rajib Roy, Milton S. Bepari, Soma Khan

Joyanta Basu, Rajib Roy, Milton S. Bepari, Soma Khan. Telephony Speech Recognition System: Challenges. National Conference on Communication Technologies & its impact on Next Generation Computing 2012. CTNGC, 1 (November 2012), 30-36.

@article{
author = { Joyanta Basu, Rajib Roy, Milton S. Bepari, Soma Khan },
title = { Telephony Speech Recognition System: Challenges },
journal = { National Conference on Communication Technologies & its impact on Next Generation Computing 2012 },
issue_date = { November 2012 },
volume = { CTNGC },
number = { 1 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 30-36 },
numpages = 7,
url = { /proceedings/ctngc/number1/9051-1007/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference on Communication Technologies & its impact on Next Generation Computing 2012
%A Joyanta Basu
%A Rajib Roy
%A Milton S. Bepari
%A Soma Khan
%T Telephony Speech Recognition System: Challenges
%J National Conference on Communication Technologies & its impact on Next Generation Computing 2012
%@ 0975-8887
%V CTNGC
%N 1
%P 30-36
%D 2012
%I International Journal of Computer Applications
Abstract

This paper describes the challenges of designing a telephony Automatic Speech Recognition (ASR) system. Telephonic speech data are collected automatically from all geographical regions of West Bengal to cover the major dialectal variations of spoken Bangla. All incoming calls are handled by an Asterisk server, i.e. the computer telephony interface (CTI). The system asks a series of queries, and the users' spoken responses are stored and transcribed manually for ASR system training. In a real-time scenario, telephonic speech contains channel drops, silence or no-speech events, truncated speech signals, noisy signals, etc., along with the desired speech event. This paper describes these challenges and briefly outlines techniques that handle such unwanted signals to a certain extent, so that a signal close to the desired speech can be passed to the ASR system.
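The abstract does not name the techniques used to screen out unwanted recordings. As an illustration only, the sketch below assumes one common approach based on two temporal features, short-time energy and zero-crossing rate, computed over fixed-length frames. The function names, the 160-sample frame (20 ms at an assumed 8 kHz telephone sampling rate), and the energy threshold are all hypothetical choices, not values taken from the paper.

```python
import math


def frame_features(samples, frame_len=160):
    """Return (short-time energy, zero-crossing rate) for each full frame.

    frame_len=160 is an assumed value (20 ms at 8 kHz), not from the paper.
    Frames do not overlap; a trailing partial frame is ignored.
    """
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose signs differ.
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def classify_recording(samples, frame_len=160, energy_threshold=1e-3):
    """Label a recording using frame energy only (hypothetical heuristic).

    "no_speech":          no frame exceeds the energy threshold
    "possibly_truncated": the last frame is still active, so the caller
                          may have been cut off
    "ok":                 activity is present and ends before the recording
    """
    active = [e > energy_threshold for e, _ in frame_features(samples, frame_len)]
    if not any(active):
        return "no_speech"
    if active[-1]:
        return "possibly_truncated"
    return "ok"


# Synthetic test signals: 0.2 s of silence and 0.2 s of a 200 Hz tone at 8 kHz.
silence = [0.0] * 1600
tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(1600)]

print(classify_recording(silence))
print(classify_recording(silence + tone + silence))
print(classify_recording(silence + tone))
```

In this sketch the zero-crossing rate is computed but not used for the decision; a fuller system might combine it with energy to separate noise from speech, and would tune the threshold on real telephone recordings.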

Index Terms

Computer Science
Information Sciences

Keywords

Asterisk Server, Interactive Voice Response, Transcription Tool, Temporal And Spectral Features, Knowledge Base