Research Article

Advancing Bengali Sentiment Analysis: A Benchmark Study with ELMo and XLNet Architecture

by Prithwiraj Bhattacharjee
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 44
Year of Publication: 2025
Authors: Prithwiraj Bhattacharjee
10.5120/ijca2025925755

Prithwiraj Bhattacharjee. Advancing Bengali Sentiment Analysis: A Benchmark Study with ELMo and XLNet Architecture. International Journal of Computer Applications. 187, 44 (Sep 2025), 32-36. DOI=10.5120/ijca2025925755

@article{ 10.5120/ijca2025925755,
author = { Prithwiraj Bhattacharjee },
title = { Advancing Bengali Sentiment Analysis: A Benchmark Study with ELMo and XLNet Architecture },
journal = { International Journal of Computer Applications },
issue_date = { Sep 2025 },
volume = { 187 },
number = { 44 },
month = { Sep },
year = { 2025 },
issn = { 0975-8887 },
pages = { 32-36 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume187/number44/advancing-bengali-sentiment-analysis-a-benchmark-study-with-elmo-and-xlnet-architecture/ },
doi = { 10.5120/ijca2025925755 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2025-09-23T00:37:27.927454+05:30
%A Prithwiraj Bhattacharjee
%T Advancing Bengali Sentiment Analysis: A Benchmark Study with ELMo and XLNet Architecture
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 44
%P 32-36
%D 2025
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Although Bengali is the sixth most spoken language in the world, many areas of Bengali natural language processing remain underexplored, and sentiment analysis is one of the most important of them. This study applies ELMo (Embeddings from Language Models) and XLNet (built on Transformer-XL) to Bengali sentiment classification for the first time. Both models were trained on a dataset of more than 40,000 user comments collected from YouTube, for binary classification (positive and negative) and for ternary classification (positive, neutral, and negative). The ELMo two-class model achieved the highest accuracy, 71%, while the XLNet two-class model reached 56%. These findings highlight the potential of context-rich representations such as ELMo and XLNet for Bengali sentiment analysis and reveal the difficulty of the more nuanced ternary classification. Overall, the research provides new insights into applying modern deep learning models to low-resource languages such as Bengali.
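
The two evaluation settings described in the abstract can be illustrated with a minimal Python sketch. It is not the author's pipeline: the helper names `to_binary` and `accuracy`, the toy comments, and the choice to derive the two-class set by dropping neutral examples are all assumptions made for illustration, since the abstract does not say how the binary subset was built.

```python
from typing import List, Tuple

Example = Tuple[str, str]  # (comment text, sentiment label)

TERNARY_LABELS = ("positive", "neutral", "negative")
BINARY_LABELS = ("positive", "negative")


def to_binary(examples: List[Example]) -> List[Example]:
    """Keep only positive/negative examples.

    Assumption: the two-class task is obtained by discarding neutral
    comments. The abstract does not confirm this.
    """
    return [(text, label) for text, label in examples if label in BINARY_LABELS]


def accuracy(gold: List[str], predicted: List[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


# Toy placeholder data, not taken from the paper's dataset.
toy = [
    ("comment A", "positive"),
    ("comment B", "neutral"),
    ("comment C", "negative"),
    ("comment D", "positive"),
]

binary_toy = to_binary(toy)
gold = [label for _, label in binary_toy]
predicted = ["positive", "negative", "negative"]  # hypothetical model output
score = accuracy(gold, predicted)
```

Accuracy here is simply the share of comments whose predicted label equals the annotated one, which is the metric the abstract reports for the two-class models.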

Index Terms

Computer Science
Information Sciences

Keywords

YouTube Comments, ELMo, XLNet, Deep Learning