International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 17
Year of Publication: 2025
Authors: Kaushik Sinha, Debalina Sinha Jana
Kaushik Sinha, Debalina Sinha Jana. Detection of Synthetic or Cloned Voices using Deep Learning and Acoustic Feature Analysis. International Journal of Computer Applications. 187, 17 (Jul 2025), 47-52. DOI=10.5120/ijca2025925228
Advances in generative deep learning models have enabled the creation of synthetic and cloned voices that are increasingly hard to distinguish from genuine human speech. While these innovations provide numerous benefits in accessibility and personalized services, they also raise serious concerns for cybersecurity, misinformation, and digital forensics. This paper proposes a robust detection framework that combines deep neural networks with advanced spectro-temporal acoustic features. A hybrid CNN-BiLSTM model performs binary classification between real and synthetic speech. The model is evaluated on a comprehensive dataset that includes a wide range of synthesized voices generated using state-of-the-art voice cloning technologies. The proposed system achieves a detection accuracy of 96.4% and generalizes well across synthesis methods and audio compression formats. The findings underscore the model's potential as a vital tool in multimedia forensics and digital authentication.
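To make the hybrid CNN-BiLSTM idea concrete, the following is a minimal PyTorch sketch of a binary real-versus-synthetic classifier. It is not the authors' implementation: the abstract does not specify the architecture, so the class name `CnnBiLstmDetector`, the layer sizes, the use of log-mel spectrograms as the spectro-temporal input, and the convention that a high output means "synthetic" are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class CnnBiLstmDetector(nn.Module):
    """Hypothetical CNN-BiLSTM classifier; all sizes are illustrative assumptions."""

    def __init__(self, n_mels: int = 64, hidden: int = 64):
        super().__init__()
        # Two small conv blocks. Pooling halves the frequency axis only,
        # so the time resolution is preserved for the recurrent layer.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1),
            nn.BatchNorm2d(16),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 1)),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(2, 1)),
        )
        feat_dim = 32 * (n_mels // 4)  # channels x remaining frequency bins
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)

    def forward(self, spec: torch.Tensor) -> torch.Tensor:
        # spec: (batch, n_mels, frames), assumed to be a log-mel spectrogram
        x = self.cnn(spec.unsqueeze(1))                  # (batch, 32, n_mels/4, frames)
        b, c, f, t = x.shape
        x = x.permute(0, 3, 1, 2).reshape(b, t, c * f)   # (batch, frames, feat_dim)
        out, _ = self.lstm(x)                            # (batch, frames, 2*hidden)
        pooled = out.mean(dim=1)                         # average over time
        return self.head(pooled).squeeze(-1)             # one logit per utterance


model = CnnBiLstmDetector()
model.eval()
with torch.no_grad():
    # Random tensor standing in for real acoustic features (4 clips, 100 frames).
    logits = model(torch.randn(4, 64, 100))
probs = torch.sigmoid(logits)  # assumed convention: probability of "synthetic"
```

In this sketch the convolutional layers capture local spectral patterns, while the bidirectional LSTM models how those patterns evolve over time in both directions. Training would typically pair the logits with `nn.BCEWithLogitsLoss`, though the abstract does not state the loss the authors used.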