Automatic Speech Segmentation and Recognition using Class-Specific Features

J. Ujwala Rekha; K. Shahu Chatrapati; A Vinaya Babu



International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 113 - Number 17

Year of Publication: 2015

Authors: J. Ujwala Rekha, K. Shahu Chatrapati, A Vinaya Babu

DOI: 10.5120/19916-2055

J. Ujwala Rekha, K. Shahu Chatrapati, A Vinaya Babu. Automatic Speech Segmentation and Recognition using Class-Specific Features. International Journal of Computer Applications. 113, 17 (March 2015), 4-9. DOI=10.5120/19916-2055

@article{ 10.5120/19916-2055,

author = { J. Ujwala Rekha and K. Shahu Chatrapati and A Vinaya Babu },

title = { Automatic Speech Segmentation and Recognition using Class-Specific Features },

journal = { International Journal of Computer Applications },

issue_date = { March 2015 },

volume = { 113 },

number = { 17 },

month = { March },

year = { 2015 },

issn = { 0975-8887 },

pages = { 4-9 },

numpages = {6},

url = { https://ijcaonline.org/archives/volume113/number17/19916-2055/ },

doi = { 10.5120/19916-2055 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T22:51:30.060719+05:30

%A J. Ujwala Rekha

%A K. Shahu Chatrapati

%A A Vinaya Babu

%T Automatic Speech Segmentation and Recognition using Class-Specific Features

%J International Journal of Computer Applications

%@ 0975-8887

%V 113

%N 17

%P 4-9

%D 2015

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Class-specific automatic speech recognition systems construct an individual classifier for each class, each based on its own feature set; the feature set for a class is selected so that it distinguishes that class from the other classes as accurately as possible. Consequently, a different feature sequence must be fed into each classifier, and the outputs of the classifiers must be combined to predict the actual class of an observation sequence. However, speech is continuous: to apply class-specific features, the speech must first be segmented before it is fed to the classifiers, which requires identifying segmentation cues. This paper proposes a framework that uses a recursive formulation to jointly segment speech and combine the outputs of the class-specific classifiers in the absence of any segmentation cues.
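The abstract does not spell out the recursive formulation, so the following is only a generic illustration of how joint segmentation and classification can be posed as a dynamic program, not the authors' algorithm. It assumes each class supplies its own feature extractor and a scorer whose outputs are comparable across classes (for example, log-likelihood ratios against a common reference). The names `joint_segment_and_classify`, `toy_models`, and `toy_frames`, the squared-error scorers, and the per-segment penalty of 1.0 are all hypothetical choices made for this sketch.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical model type: (class-specific feature extractor, segment scorer).
ClassModel = Tuple[Callable[[Sequence[float]], List[float]],
                   Callable[[Sequence[float]], float]]


def joint_segment_and_classify(
    frames: Sequence[float],
    models: Dict[str, ClassModel],
    max_len: int,
) -> List[Tuple[int, int, str]]:
    """Return (start, end, label) segments maximizing the summed segment scores.

    Recursion: best[t] = max over s < t and class c of
               best[s] + score_c(extract_c(frames[s:t])).
    """
    n = len(frames)
    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)   # best[t]: best total score for frames[:t]
    best[0] = 0.0
    back = [None] * (n + 1)      # back[t]: (segment start, label) achieving best[t]

    for t in range(1, n + 1):
        for s in range(max(0, t - max_len), t):
            if best[s] == neg_inf:
                continue
            segment = frames[s:t]
            # Each class scores the segment using its own feature set.
            for label, (extract, score) in models.items():
                candidate = best[s] + score(extract(segment))
                if candidate > best[t]:
                    best[t] = candidate
                    back[t] = (s, label)

    # Backtrack from the end of the signal to recover boundaries and labels.
    segments = []
    t = n
    while t > 0:
        s, label = back[t]
        segments.append((s, t, label))
        t = s
    segments.reverse()
    return segments


# Toy setup (assumed for illustration): two classes with different features.
# "low" scores raw values near 0; "high" scores values re-centred around 5.
# The constant -1.0 is a per-segment penalty that discourages over-segmentation.
toy_models: Dict[str, ClassModel] = {
    "low": (lambda seg: list(seg),
            lambda f: -sum(v * v for v in f) - 1.0),
    "high": (lambda seg: [v - 5.0 for v in seg],
             lambda f: -sum(v * v for v in f) - 1.0),
}
toy_frames = [0.1, 0.0, 0.2, 5.1, 4.9, 5.0]
result = joint_segment_and_classify(toy_frames, toy_models, max_len=6)
```

On this toy input no boundary is supplied in advance; the recursion places one boundary between the third and fourth frames and labels the two segments itself. A real system would replace the scalar frames with acoustic feature vectors and the squared-error scorers with trained class-specific classifiers.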


Index Terms

Computer Science

Information Sciences

Keywords

Class-specific feature set, speech segmentation, speech recognition