| International Journal of Computer Applications |
| Foundation of Computer Science (FCS), NY, USA |
| Volume 187 - Number 59 |
| Year of Publication: 2025 |
| Authors: Elliot Q.C. Garcia, Nicéias Silva Vilela, Kátia Pires Nascimento do Sacramento, Tiago A.E. Ferreira |
10.5120/ijca2025925961
Elliot Q.C. Garcia, Nicéias Silva Vilela, Kátia Pires Nascimento do Sacramento, Tiago A.E. Ferreira. Text-Independent Speaker Identification using Audio Looping with Margin based Loss Functions. International Journal of Computer Applications. 187, 59 (Nov 2025), 1-8. DOI=10.5120/ijca2025925961
Speaker identification has become a crucial component in various applications, including security systems, virtual assistants, and personalized user experiences. This paper investigates the effectiveness of CosFace Loss and ArcFace Loss for text-independent speaker identification using a Convolutional Neural Network architecture based on the VGG16 model, modified to accept mel spectrogram inputs of variable sizes generated from the VoxCeleb1 dataset. Both loss functions are implemented to analyze their effects on model accuracy and robustness, with the Softmax loss function serving as a comparative baseline. The study also examines how mel spectrogram size and audio duration influence model performance, using 3 seconds as the baseline duration and 10 seconds as the maximum. The experimental results show that the margin-based loss functions achieve higher identification accuracy than the traditional Softmax loss with this model. Finally, the paper discusses the implications of these findings for future research.
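To make the difference between the compared loss functions concrete, the following is a minimal sketch of how CosFace and ArcFace modify the target-class logit before the usual softmax cross-entropy. It is an illustration under stated assumptions, not the authors' implementation: the function names, the scale `s`, the margin `m`, and the example cosine values are all hypothetical, and the case where the angle plus margin exceeds pi is not handled.

```python
import math

def margin_logits(cosines, target, kind="arcface", s=30.0, m=0.35):
    """Return scaled logits with a margin applied to the target class.

    cosines: cosine similarities between a normalized embedding and each
             normalized class weight vector (values in [-1, 1]).
    target:  index of the true speaker.
    kind:    "softmax" (no margin), "cosface", or "arcface".
    s, m:    scale and margin; illustrative values, not taken from the paper.
    """
    logits = []
    for j, c in enumerate(cosines):
        if j == target and kind == "cosface":
            c = c - m  # CosFace: additive margin on the cosine
        elif j == target and kind == "arcface":
            theta = math.acos(max(-1.0, min(1.0, c)))
            c = math.cos(theta + m)  # ArcFace: additive margin on the angle
        logits.append(s * c)
    return logits

def cross_entropy(logits, target):
    """Numerically stable softmax cross-entropy for a single sample."""
    peak = max(logits)
    log_sum = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum - logits[target]

# Hypothetical similarities of one utterance to three enrolled speakers.
cosines = [0.8, 0.3, -0.1]
for kind in ("softmax", "cosface", "arcface"):
    loss = cross_entropy(margin_logits(cosines, 0, kind), 0)
    print(f"{kind}: {loss:.4f}")
```

In this sketch both margins lower the target logit, so the same embedding incurs a larger loss than under plain Softmax. During training this pushes embeddings of the same speaker closer to their class direction, which is the usual motivation for margin-based losses.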