Issues and Limitations of HMM in Speech Processing: A Survey

Chandralika Chakraborty; P.H. Talukdar

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 141 - Number 7

Year of Publication: 2016

10.5120/ijca2016909693

Chandralika Chakraborty, P.H. Talukdar. Issues and Limitations of HMM in Speech Processing: A Survey. International Journal of Computer Applications. 141, 7 (May 2016), 13-17. DOI=10.5120/ijca2016909693

@article{ 10.5120/ijca2016909693,

author = { Chandralika Chakraborty and P.H. Talukdar },

title = { Issues and Limitations of HMM in Speech Processing: A Survey },

journal = { International Journal of Computer Applications },

issue_date = { May 2016 },

volume = { 141 },

number = { 7 },

month = { May },

year = { 2016 },

issn = { 0975-8887 },

pages = { 13-17 },

numpages = {5},

url = { https://ijcaonline.org/archives/volume141/number7/24796-2016909693/ },

doi = { 10.5120/ijca2016909693 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T23:42:49.953971+05:30

%A Chandralika Chakraborty

%A P.H. Talukdar

%T Issues and Limitations of HMM in Speech Processing: A Survey

%J International Journal of Computer Applications

%@ 0975-8887

%V 141

%N 7

%P 13-17

%D 2016

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Speech is the most natural mode of communication among humans. It consists of two parts, namely sound and sense. The intelligent production and synthesis of speech have long intrigued humankind, and efforts at automated speech recognition have gone through various phases. Hidden Markov Models (HMMs) provide a simple and effective framework for modeling time-varying spectral vector sequences. The application of HMMs to speech recognition has seen considerable success and gained much popularity. As a consequence, almost all present-day speech recognition systems are based on HMMs. The current paper presents a brief study of the HMM-based technique applied to speech recognition and also discusses the issues and limitations of HMMs in speech processing.
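As an illustration of how an HMM scores an observation sequence, the following is a minimal sketch of the standard forward algorithm for a discrete-output HMM. It is not taken from the paper: the function name `forward_likelihood`, the parameter names, and the two-state toy model are assumptions made for the example. Practical speech recognizers typically model continuous spectral vectors (for example with Gaussian mixture densities) rather than discrete symbols.

```python
def forward_likelihood(init, trans, emit, observations):
    """Return P(observations | model) for a discrete-output HMM.

    init[i]     : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][k]  : probability of emitting symbol k while in state i
    observations: non-empty list of symbol indices
    (All names are illustrative assumptions, not from the paper.)
    """
    n_states = len(init)
    # alpha[i] = P(o_1, ..., o_t, state at time t = i)
    alpha = [init[i] * emit[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        # Sum over all predecessor states, then weight by the emission.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs]
            for j in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state, two-symbol toy model.
init = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.5], [0.1, 0.9]]

likelihood = forward_likelihood(init, trans, emit, [0, 1])
```

In a recognizer built this way, one such model would be trained per word or phone, and the model giving the highest likelihood for the observed sequence would be selected.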

Index Terms

Computer Science

Information Sciences

Keywords

Speech recognition, speech representation, Hidden Markov Model, implementation issues, limitations, challenges.