Speaker Dependent and Independent Isolated Hindi Word Recognizer using Hidden Markov Model (HMM)

International Journal of Computer Applications
© 2012 by IJCA Journal
Volume 52 - Number 7
Year of Publication: 2012
Authors:
Ishan Bhardwaj
Narendra D. Londhe
10.5120/8217-1639

Ishan Bhardwaj and Narendra D Londhe. Article: Speaker Dependent and Independent Isolated Hindi Word Recognizer using Hidden Markov Model (HMM). International Journal of Computer Applications 52(7):34-40, August 2012. Full text available. BibTeX

@article{key:article,
	author = {Ishan Bhardwaj and Narendra D. Londhe},
	title = {Article: Speaker Dependent and Independent Isolated Hindi Word Recognizer using Hidden Markov Model (HMM)},
	journal = {International Journal of Computer Applications},
	year = {2012},
	volume = {52},
	number = {7},
	pages = {34-40},
	month = {August},
	note = {Full text available}
}

Abstract

Hindi is a complex language with a large number of phonemes, spoken with various accents in different regions of India. In this manuscript, speaker-dependent and speaker-independent isolated Hindi word recognizers using the Hidden Markov Model (HMM) are implemented under a noisy environment. For this study, a set of 10 Hindi names was chosen as the test set on which training and testing are performed. The scheme uses Mel Frequency Cepstral Coefficients (MFCC) to compute the acoustic features of the speech signal. The K-means algorithm is then used for codebook generation by clustering the resulting feature space. The Baum-Welch algorithm is used to re-estimate the HMM parameters, and finally the Viterbi algorithm is used to select the recognized Hindi word as the one whose model likelihood is highest. This work achieved a 98.6% recognition rate for speaker-dependent recognition with a total of 10 speakers (6 male, 4 female), and 97.5% for the speaker-independent isolated word recognizer with 10 speakers (male).
