Research Article

Automatic Speech Recognition and Verification using LPC, MFCC and SVM

by Aaron M. Oirere, Ganesh B. Janvale, Ratnadeep R. Deshmukh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 127 - Number 8
Year of Publication: 2015
10.5120/ijca2015906447

Aaron M. Oirere, Ganesh B. Janvale, Ratnadeep R. Deshmukh. Automatic Speech Recognition and Verification using LPC, MFCC and SVM. International Journal of Computer Applications. 127, 8 (October 2015), 47-52. DOI=10.5120/ijca2015906447

@article{ 10.5120/ijca2015906447,
author = { Aaron M. Oirere and Ganesh B. Janvale and Ratnadeep R. Deshmukh },
title = { Automatic Speech Recognition and Verification using LPC, MFCC and SVM },
journal = { International Journal of Computer Applications },
issue_date = { October 2015 },
volume = { 127 },
number = { 8 },
month = { October },
year = { 2015 },
issn = { 0975-8887 },
pages = { 47-52 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume127/number8/22753-2015906447/ },
doi = { 10.5120/ijca2015906447 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:18:01.920887+05:30
%A Aaron M. Oirere
%A Ganesh B. Janvale
%A Ratnadeep R. Deshmukh
%T Automatic Speech Recognition and Verification using LPC, MFCC and SVM
%J International Journal of Computer Applications
%@ 0975-8887
%V 127
%N 8
%P 47-52
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Speech has great potential as an interface between humans and computers, within the field of Human-Computer Interaction (HCI). The major challenge is that the speech signal varies constantly. This paper presents the development of a speech recognition system using a Swahili speech database collected in three sets (digits, isolated words and sentences) from both native and non-native speakers of Swahili. Two feature extraction techniques are deployed in the system: Linear Prediction Coding (LPC) and Mel-Frequency Cepstral Coefficients (MFCC). We used 12 coefficients from MFCC and 20 coefficients from LPC as features. Both feature extraction techniques were applied to and tested on our own Swahili speech database. Recognition and verification were performed using a confusion matrix and a Support Vector Machine (SVM) classifier. Linear Discriminant Analysis (LDA) was tested on the entire dataset for dimensionality reduction and gave good clustering. System performance was assessed by accuracy: confusion matrix with MFCC, 50.9%; confusion matrix with LPC, 50.1%. The highest recognition rates for each data set were as follows: numeric data, MFCC 75% and LPC 72%; isolated-word data, MFCC 65.2% and LPC 66.67%; sentence data, MFCC 63.8% and LPC 59.6%.
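
As an illustration of the LPC feature extraction named in the abstract, the following is a minimal sketch of estimating 20 LPC coefficients for a single frame with the autocorrelation method and the Levinson-Durbin recursion. It is not the authors' implementation: the function names, the 16 kHz sampling rate, the 400-sample frame, the Hamming window and the synthetic test signal are all assumptions made for the example.

```python
import math
import random


def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err) with a[0] == 1.0, where the predictor is
    x[n] ~ -sum(a[k] * x[n - k] for k in 1..order) and err is the
    remaining prediction error energy.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or perfectly predictable frame
            break
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Assumed setup: a synthetic 25 ms frame at 16 kHz (400 samples),
# standing in for one frame of recorded speech.
rng = random.Random(0)
frame = [math.sin(2 * math.pi * 440 * n / 16000) + 0.1 * rng.gauss(0, 1)
         for n in range(400)]

# Hamming window (an assumed, commonly used choice) before analysis.
windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * n / (len(frame) - 1)))
            for n, s in enumerate(frame)]

coeffs, residual = lpc(windowed, order=20)
features = coeffs[1:]  # 20 LPC values for this frame, as in the abstract
print(len(features))
```

In a full system, a vector like `features` would be computed for every frame of an utterance and then passed to the classifier; how the paper aggregates frames into one vector per utterance is not stated in the abstract.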

Index Terms

Computer Science
Information Sciences

Keywords

Swahili, Swahili Text Corpus, Phonetics, Text Corpus and Speech Corpus, Automatic Speech Recognition