Research Article

Voice activity detection Algorithm for Speech Recognition Applications

Published in March 2012 by Nitin N Lokhande, Navnath S Nehe, Pratap S Vikhe
International Conference in Computational Intelligence
Foundation of Computer Science USA
ICCIA - Number 6

Nitin N Lokhande, Navnath S Nehe, Pratap S Vikhe. Voice activity detection Algorithm for Speech Recognition Applications. International Conference in Computational Intelligence. ICCIA, 6 (March 2012), 5-7.

@article{
author = { Nitin N Lokhande, Navnath S Nehe, Pratap S Vikhe },
title = { Voice activity detection Algorithm for Speech Recognition Applications },
journal = { International Conference in Computational Intelligence },
issue_date = { March 2012 },
volume = { ICCIA },
number = { 6 },
month = { March },
year = { 2012 },
issn = { 0975-8887 },
pages = { 5-7 },
numpages = 3,
url = { /proceedings/iccia/number6/5134-1046/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference in Computational Intelligence
%A Nitin N Lokhande
%A Navnath S Nehe
%A Pratap S Vikhe
%T Voice activity detection Algorithm for Speech Recognition Applications
%J International Conference in Computational Intelligence
%@ 0975-8887
%V ICCIA
%N 6
%P 5-7
%D 2012
%I International Journal of Computer Applications
Abstract

Determining where speech begins and ends in the presence of background noise is a difficult problem. This paper labels sections of a speech signal as silence, voiced speech, or unvoiced speech. The labels are computed from two functions of the speech samples: the zero-crossing rate and the short-term energy. Both measures have been used extensively to detect the endpoints of an utterance.
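
The labeling idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the frame length, the two fixed thresholds, the three-way decision rule, and all function names (`short_term_energy`, `zero_crossing_rate`, `label_frames`, `find_endpoints`) are hypothetical choices made for the example. It treats a low-energy frame as silence, a high-energy frame with few zero crossings as voiced, and a high-energy frame with many zero crossings as unvoiced.

```python
import math


def short_term_energy(frame):
    """Sum of squared sample values over one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def label_frames(samples, frame_len, hop, energy_thresh, zcr_thresh):
    """Label each frame as 'silence', 'voiced', or 'unvoiced'.

    The thresholds are assumed to be supplied by the caller; a practical
    detector would usually estimate them from the background noise.
    """
    labels = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        if short_term_energy(frame) < energy_thresh:
            labels.append("silence")
        elif zero_crossing_rate(frame) <= zcr_thresh:
            labels.append("voiced")
        else:
            labels.append("unvoiced")
    return labels


def find_endpoints(labels):
    """Return (first, last) indices of non-silence frames, or None."""
    active = [i for i, label in enumerate(labels) if label != "silence"]
    return (active[0], active[-1]) if active else None


# Synthetic test signal at an assumed 8 kHz sampling rate, 80 samples per part:
# zeros, a 100 Hz sine (voiced-like), and a sign-alternating sequence
# (a crude stand-in for noise-like unvoiced speech).
silence = [0.0] * 80
voiced = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(80)]
unvoiced = [0.5 * (-1) ** n for n in range(80)]

labels = label_frames(silence + voiced + unvoiced, frame_len=80, hop=80,
                      energy_thresh=1.0, zcr_thresh=0.3)
print(labels)                  # ['silence', 'voiced', 'unvoiced']
print(find_endpoints(labels))  # (1, 2)
```

Fixed thresholds keep the sketch short. With real recordings, unvoiced speech and background noise can both show high zero-crossing rates at low energy, so the threshold values would need to be tuned to the noise conditions.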

Index Terms

Computer Science
Information Sciences

Keywords

Short Term Energy (STE), Short Term Power (STP), Zero Crossing Rate (ZCR), Voice Activity Detection (VAD)