International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 2
Year of Publication: 2025
Authors: Rekha S. Kotwal, Geetanjali Jindal
Rekha S. Kotwal, Geetanjali Jindal. AI Powered Speech Recognition System using Wavelet Multi-Resolution Analysis with One-Dimentional CNN-LSTM. International Journal of Computer Applications. 187, 2 (May 2025), 72-81. DOI=10.5120/ijca2025924807
The objective of this project is to develop a deep learning (DL)-based speech emotion detection system that can identify and categorize emotional states such as happiness and sadness. To capture spatial and temporal patterns in the audio input, the system uses mel-spectrogram features, which are processed by a hybrid model that combines convolutional neural networks (CNNs) and long short-term memory networks (LSTMs). The efficacy of pre-trained models in this field is further demonstrated by fine-tuning the transformer-based Wav2Vec2 model for emotion classification. The proposed methods accurately identify speech emotions, making them useful for customer service, healthcare, and human-computer interaction.
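The mel-spectrogram features mentioned in the abstract rest on the mel scale, which spaces frequency bands more densely at low frequencies. The following is a minimal sketch of that spacing, not the authors' implementation. It assumes the common HTK-style mel formula; the function names, the band count, and the frequency range are hypothetical choices made for illustration, since the abstract does not state them.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert frequency in Hz to mels (HTK-style formula, assumed here)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel: convert mels back to Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_mels: int, f_min: float, f_max: float) -> list[float]:
    """Return center frequencies (Hz) of n_mels bands evenly spaced in mels.

    The band edges f_min and f_max are excluded, so the centers lie
    strictly between them.
    """
    mel_min = hz_to_mel(f_min)
    mel_max = hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_mels + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_mels)]


if __name__ == "__main__":
    # Hypothetical settings: 8 bands between 0 Hz and 8 kHz.
    for center in mel_center_frequencies(n_mels=8, f_min=0.0, f_max=8000.0):
        print(f"{center:8.1f} Hz")
```

Because the bands are evenly spaced in mels rather than in Hz, the gap between neighboring centers widens as frequency increases. A mel spectrogram built on such bands would be the kind of two-dimensional time-frequency input that a CNN-LSTM model consumes.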