Speech Recognition System: A Review

Nitin Washani; Sandeep Sharma

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 115 - Number 18

Year of Publication: 2015

Authors: Nitin Washani, Sandeep Sharma

10.5120/20249-2617

Nitin Washani, Sandeep Sharma. Speech Recognition System: A Review. International Journal of Computer Applications. 115, 18 (April 2015), 7-10. DOI=10.5120/20249-2617

@article{10.5120/20249-2617,
  author = {Nitin Washani and Sandeep Sharma},
  title = {Speech Recognition System: A Review},
  journal = {International Journal of Computer Applications},
  issue_date = {April 2015},
  volume = {115},
  number = {18},
  month = {April},
  year = {2015},
  issn = {0975-8887},
  pages = {7--10},
  numpages = {4},
  url = {https://ijcaonline.org/archives/volume115/number18/20249-2617/},
  doi = {10.5120/20249-2617},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T22:55:11.848271+05:30
%A Nitin Washani
%A Sandeep Sharma
%T Speech Recognition System: A Review
%J International Journal of Computer Applications
%@ 0975-8887
%V 115
%N 18
%P 7-10
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

The ability to control devices by voice has long intrigued mankind. Today, after intense research, speech recognition systems have carved out a niche for themselves and can be seen in many walks of life. Their accuracy remains one of the most important research challenges, affected by factors such as noise, speaker variability, language variability, vocabulary size, and domain. The design of a speech recognition system requires careful attention to challenges such as the various types of speech classes and speech representation, speech preprocessing stages, feature extraction techniques, databases, and performance evaluation. This paper presents the advances made and highlights the pressing problems for speech recognition systems. It also divides the system into a Front End and a Back End for a clearer understanding and representation of each part.
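The abstract names speech preprocessing and voice activity detection (VAD) as front-end concerns but, on this page, gives no algorithmic detail. The following is only a minimal illustrative sketch of what such a front end might look like, not the authors' method: it applies first-order pre-emphasis, splits the signal into overlapping frames, and flags frames as speech with a simple energy threshold. The function names, the pre-emphasis coefficient 0.97, the frame sizes, and the threshold ratio 0.1 are all assumptions chosen for illustration.

```python
import math

def pre_emphasis(signal, alpha=0.97):
    """First-order high-pass filter y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1]
                          for n in range(1, len(signal))]

def frame_signal(signal, frame_len, hop):
    """Split the signal into overlapping frames, dropping any incomplete tail."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)

def energy_vad(frames, ratio=0.1):
    """Flag a frame as speech when its energy exceeds ratio * max energy.

    This thresholding rule is a hypothetical, deliberately simple choice.
    """
    energies = [short_time_energy(f) for f in frames]
    threshold = ratio * max(energies)
    return [e > threshold for e in energies]

# Synthetic test input (assumed): silence, a 200 Hz tone, silence, sampled at 8 kHz.
rate = 8000
silence = [0.0] * 800
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(800)]
signal = silence + tone + silence

frames = frame_signal(pre_emphasis(signal), frame_len=200, hop=100)
speech_flags = energy_vad(frames)
```

In a full system, the frames flagged as speech would then be passed to feature extraction and on to the Back End. A practical VAD would typically need a noise-adaptive threshold rather than the fixed ratio assumed here.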

Index Terms

Computer Science

Information Sciences

Keywords

VAD, Feature Extraction, Hidden Markov Model, Neural Networks