International Journal of Computer Applications |
Foundation of Computer Science (FCS), NY, USA |
Volume 187 - Number 9 |
Year of Publication: 2025 |
Authors: Muhammad Anwarul Azim, Md Gias Uddin, Mohammad Khairul Islam, Abu Nowshed Chy |
Muhammad Anwarul Azim, Md Gias Uddin, Mohammad Khairul Islam, Abu Nowshed Chy. Bangla News Document Categorization using Deep Learning Approaches and Fine-tuned BERT. International Journal of Computer Applications. 187, 9 (May 2025), 1-9. DOI=10.5120/ijca2025924469
With the explosive growth of text documents available in digital form, document categorization has become a critical challenge in managing digital data effectively and precisely. Researchers therefore apply supervised, semi-supervised, and unsupervised approaches to categorize text documents. Recently, Transformer-based models have shown outstanding results in downstream natural language processing tasks such as text classification, sentiment analysis, emotion classification, named entity recognition, and spam email detection. Because Bangla is a widely spoken language, we deploy the deep neural network-based CBiLSTM, BiLSTM, and FastText classifier models and a Transformer-based BERT classifier model to categorize Bangla news documents into predefined categories. We use pre-trained fastText word embedding vectors with the CBiLSTM and BiLSTM classifier models. The dataset contains 28,800 news documents in 12 categories. To find the best outcome, we fine-tune each model with different hyperparameters. The fine-tuned BERT classifier model achieves the highest accuracy, 94.74%, among the classifier models. We also compare the accuracy of the different classifier models on Bangla news documents.
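The accuracy comparison described in the abstract can be illustrated with a minimal sketch. It assumes each classifier has already produced one predicted category per document; the function names (`accuracy`, `rank_models`), the model names, the category labels, and the toy predictions are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence, Tuple


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of documents whose predicted category equals the gold category."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def rank_models(
    y_true: Sequence[str], predictions: Dict[str, Sequence[str]]
) -> List[Tuple[str, float]]:
    """Return (model_name, accuracy) pairs sorted from best to worst."""
    scores = [(name, accuracy(y_true, preds)) for name, preds in predictions.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy gold labels and predictions, invented for illustration only;
# they are NOT results or categories from the paper.
gold = ["sports", "politics", "economy", "sports"]
toy_predictions = {
    "model_a": ["sports", "politics", "economy", "sports"],
    "model_b": ["sports", "economy", "economy", "sports"],
}
ranking = rank_models(gold, toy_predictions)
```

Here `ranking` lists the hypothetical models from highest to lowest accuracy, which mirrors the kind of comparison the abstract reports, though not its actual evaluation pipeline.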