Research Article

Deep Belief Networks for Kannada Phoneme Recognition

Published in September 2015 by Akhila K.S. and R. Kumaraswamy
National Conference “Electronics, Signals, Communication and Optimization”
Foundation of Computer Science USA
NCESCO2015 - Number 1

Akhila K.S., R. Kumaraswamy. Deep Belief Networks for Kannada Phoneme Recognition. National Conference “Electronics, Signals, Communication and Optimization”. NCESCO2015, 1 (September 2015), 25-30.

@article{
author = { Akhila K.S. and R. Kumaraswamy },
title = { Deep Belief Networks for Kannada Phoneme Recognition },
journal = { National Conference “Electronics, Signals, Communication and Optimization” },
issue_date = { September 2015 },
volume = { NCESCO2015 },
number = { 1 },
month = { September },
year = { 2015 },
issn = {0975-8887},
pages = { 25-30 },
numpages = 6,
url = { /proceedings/ncesco2015/number1/22296-5305/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference “Electronics, Signals, Communication and Optimization”
%A Akhila K.S.
%A R. Kumaraswamy
%T Deep Belief Networks for Kannada Phoneme Recognition
%J National Conference “Electronics, Signals, Communication and Optimization”
%@ 0975-8887
%V NCESCO2015
%N 1
%P 25-30
%D 2015
%I International Journal of Computer Applications
Abstract

In this paper, a baseline phoneme recognition system for the Kannada language is built using MFCC features and Deep Belief Networks (DBNs). Phonemes are segmented from continuous Kannada speech, and MFCC features are extracted from each speech frame. These features are then used as input to the recognizer. DBNs are probabilistic generative models constructed by stacking Restricted Boltzmann Machines (RBMs). A DBN is trained in two phases: a pre-training phase followed by a fine-tuning phase. For comparison, conventional speech recognition methods, namely Multi-Layer Feed Forward Neural Networks (ML-FFNNs) and Support Vector Machines (SVMs), are also evaluated. The experimental results show that the DBN outperforms these conventional methods for recognition of Kannada phonemes using MFCC features.
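
The stacking and pre-training idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes binary-binary RBMs trained with one-step contrastive divergence (CD-1) on inputs scaled to [0, 1], whereas real-valued MFCC frames are often modeled with a Gaussian visible layer instead. The names `RBM`, `cd1_update`, and `pretrain_dbn` are hypothetical, and the supervised fine-tuning phase is omitted.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class RBM:
    """Binary-binary restricted Boltzmann machine (illustrative assumption)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.n_visible = n_visible
        self.n_hidden = n_hidden
        self.rng = rng
        # Small random weights; biases start at zero.
        self.w = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_vis = [0.0] * n_visible
        self.b_hid = [0.0] * n_hidden

    def hidden_probs(self, v):
        """P(h_j = 1 | v) for every hidden unit."""
        return [sigmoid(self.b_hid[j]
                        + sum(v[i] * self.w[i][j] for i in range(self.n_visible)))
                for j in range(self.n_hidden)]

    def visible_probs(self, h):
        """P(v_i = 1 | h) for every visible unit."""
        return [sigmoid(self.b_vis[i]
                        + sum(h[j] * self.w[i][j] for j in range(self.n_hidden)))
                for i in range(self.n_visible)]

    def cd1_update(self, v0, lr):
        """One CD-1 parameter update on a single training vector."""
        h0 = self.hidden_probs(v0)
        # Sample binary hidden states, then reconstruct the visible layer.
        h0_sample = [1.0 if self.rng.random() < p else 0.0 for p in h0]
        v1 = self.visible_probs(h0_sample)
        h1 = self.hidden_probs(v1)
        # Positive-phase statistics minus negative-phase statistics.
        for i in range(self.n_visible):
            for j in range(self.n_hidden):
                self.w[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_vis[i] += lr * (v0[i] - v1[i])
        for j in range(self.n_hidden):
            self.b_hid[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, layer_sizes, epochs=5, lr=0.1, seed=0):
    """Greedy layer-wise pre-training: each RBM learns from the hidden
    activations of the RBM below it."""
    rng = random.Random(seed)
    rbms = []
    layer_input = data
    n_in = len(data[0])
    for n_hidden in layer_sizes:
        rbm = RBM(n_in, n_hidden, rng)
        for _ in range(epochs):
            for v in layer_input:
                rbm.cd1_update(v, lr)
        rbms.append(rbm)
        # Hidden probabilities become the next layer's training data.
        layer_input = [rbm.hidden_probs(v) for v in layer_input]
        n_in = n_hidden
    return rbms


# Toy usage with made-up 4-dimensional "feature" vectors (not real MFCCs).
toy_frames = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]] * 10
dbn_layers = pretrain_dbn(toy_frames, layer_sizes=[3, 2])
```

In a full recognizer along the lines the abstract describes, the pre-trained weights would typically initialize a feed-forward network with an output layer over phoneme classes, which is then fine-tuned with labeled frames; that step is left out here to keep the sketch short.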

Index Terms

Computer Science
Information Sciences

Keywords

Kannada Phoneme Recognition, MFCC Features, Deep Belief Networks (DBNs), Multi-Layer Feed Forward Neural Networks (ML-FFNNs), Support Vector Machines (SVMs)