Research Article

Real Time Audio Deepfake Identification: A Hybrid Framework Utilizing OpenAI Whisper Feature and Deep Neural Networks

by Abhijeet More, Vibhuti Awasthi, Laharika Bhoga, Pratham Kalamkar, Sanika Taru
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 105
Year of Publication: 2026
Authors: Abhijeet More, Vibhuti Awasthi, Laharika Bhoga, Pratham Kalamkar, Sanika Taru
10.5120/ijca7f198880abc2

Abhijeet More, Vibhuti Awasthi, Laharika Bhoga, Pratham Kalamkar, Sanika Taru. Real Time Audio Deepfake Identification: A Hybrid Framework Utilizing OpenAI Whisper Feature and Deep Neural Networks. International Journal of Computer Applications. 187, 105 (May 2026), 38-43. DOI=10.5120/ijca7f198880abc2

@article{ 10.5120/ijca7f198880abc2,
author = { Abhijeet More and Vibhuti Awasthi and Laharika Bhoga and Pratham Kalamkar and Sanika Taru },
title = { Real Time Audio Deepfake Identification: A Hybrid Framework Utilizing OpenAI Whisper Feature and Deep Neural Networks },
journal = { International Journal of Computer Applications },
issue_date = { May 2026 },
volume = { 187 },
number = { 105 },
month = { May },
year = { 2026 },
issn = { 0975-8887 },
pages = { 38-43 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume187/number105/real-time-audio-deepfake-identification/ },
doi = { 10.5120/ijca7f198880abc2 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2026-05-17T02:29:22.226155+05:30
%A Abhijeet More
%A Vibhuti Awasthi
%A Laharika Bhoga
%A Pratham Kalamkar
%A Sanika Taru
%T Real Time Audio Deepfake Identification: A Hybrid Framework Utilizing OpenAI Whisper Feature and Deep Neural Networks
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 105
%P 38-43
%D 2026
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Recent advances in artificial intelligence and deep learning make it possible to generate highly realistic synthetic audio, which poses a serious challenge to digital security and public trust. This paper proposes a reliable, low-latency system that distinguishes genuine human speech from AI-generated deepfake audio. It is a hybrid solution that combines feature extraction using OpenAI's Whisper with classification by Deep Neural Networks (DNNs). The system's central capability is detecting key acoustic signatures, e.g., pitch, timbre changes, and spectral irregularities, which are symptoms of digital "artifacts" that are very difficult for human listeners to detect. The main goal of the study is to find the best trade-off between two competing objectives, accurate detection and fast operational response, paving the way for real-time application pipelines covering telephone authentication, financial transactions, and secure communication networks. The approach is multi-stage: audio capture comes first, followed by ML-based feature extraction, and finally classification, which produces a ready-to-use quality score that alerts users to a possible fraud attempt. Experimental results highlight the model's effectiveness across varied real-world cases, where it offers a scalable solution to the problem of recognizing deepfake audio in the digital age.
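
The three stages named in the abstract (capture, feature extraction, classification into a score) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation. The abstract does not specify the Whisper model size, the DNN architecture, or the alert threshold, so everything here is an assumption: the Whisper encoder is replaced by two hand-computed stand-in features (log-energy and zero-crossing rate), the network uses untrained placeholder weights, and all names (`frame_features`, `spoof_score`, `ALERT_THRESHOLD`, and so on) are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence

FRAME_SIZE = 400        # assumed: 25 ms frames at 16 kHz
HOP_SIZE = 160          # assumed: 10 ms hop at 16 kHz
ALERT_THRESHOLD = 0.5   # assumed: cut-off for warning the user


def frame_features(samples: Sequence[float]) -> List[List[float]]:
    """Stand-in for the Whisper feature extractor (NOT Whisper itself).

    Returns [log-energy, zero-crossing rate] for each full frame.
    """
    feats = []
    for start in range(0, len(samples) - FRAME_SIZE + 1, HOP_SIZE):
        frame = samples[start:start + FRAME_SIZE]
        energy = sum(x * x for x in frame) / FRAME_SIZE
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
        feats.append([math.log(energy + 1e-10), crossings / (FRAME_SIZE - 1)])
    return feats


def mean_pool(feats: List[List[float]]) -> List[float]:
    """Average the per-frame features into one clip-level vector."""
    n = len(feats)
    return [sum(f[i] for f in feats) / n for i in range(len(feats[0]))]


def dense(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """One fully connected layer: weights @ x + bias."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_placeholder_params(seed: int = 0, n_in: int = 2, n_hidden: int = 4) -> Dict[str, list]:
    """Untrained random weights; a real system would load trained ones."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]],
        "b2": [0.0],
    }


def spoof_score(samples: Sequence[float], params: Dict[str, list]) -> float:
    """Capture -> features -> small feed-forward net -> score in [0, 1]."""
    feats = frame_features(samples)
    if not feats:
        raise ValueError("clip is shorter than one frame")
    pooled = mean_pool(feats)
    hidden = [max(0.0, h) for h in dense(pooled, params["w1"], params["b1"])]  # ReLU
    logit = dense(hidden, params["w2"], params["b2"])[0]
    return sigmoid(logit)


if __name__ == "__main__":
    # One second of a 440 Hz tone stands in for captured audio.
    clip = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    score = spoof_score(clip, make_placeholder_params())
    print(f"score={score:.3f}", "ALERT" if score >= ALERT_THRESHOLD else "ok")
```

Because the weights are untrained, the printed score carries no meaning. The sketch only shows how a clip flows through the three stages and where a threshold would turn the score into a user alert.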

Index Terms

Computer Science
Information Sciences

Keywords

Fake audio detection, real-time system, deepfake audio, audio classification, machine learning, voice spoofing