Research Article

Sentiment Analysis of FIFA-Related Tweets: Integrating NLTK’s VADER with BERT for Enhanced Classification Accuracy

by Mirsad Hadžić, Zerina Mašetić, Fatima Mašić
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 58
Year of Publication: 2025
Authors: Mirsad Hadžić, Zerina Mašetić, Fatima Mašić
10.5120/ijca2025925988

Mirsad Hadžić, Zerina Mašetić, Fatima Mašić. Sentiment Analysis of FIFA-Related Tweets: Integrating NLTK’s VADER with BERT for Enhanced Classification Accuracy. International Journal of Computer Applications. 187, 58 (Nov 2025), 73-79. DOI=10.5120/ijca2025925988

@article{ 10.5120/ijca2025925988,
author = { Mirsad Hadžić and Zerina Mašetić and Fatima Mašić },
title = { Sentiment Analysis of FIFA-Related Tweets: Integrating NLTK’s VADER with BERT for Enhanced Classification Accuracy },
journal = { International Journal of Computer Applications },
issue_date = { Nov 2025 },
volume = { 187 },
number = { 58 },
month = { Nov },
year = { 2025 },
issn = { 0975-8887 },
pages = { 73-79 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume187/number58/sentiment-analysis-of-fifa-related-tweets-integrating-nltks-vader-with-bert-for-enhanced-classification-accuracy/ },
doi = { 10.5120/ijca2025925988 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2025-11-18T21:11:20.820202+05:30
%A Mirsad Hadžić
%A Zerina Mašetić
%A Fatima Mašić
%T Sentiment Analysis of FIFA-Related Tweets: Integrating NLTK’s VADER with BERT for Enhanced Classification Accuracy
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 58
%P 73-79
%D 2025
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This research performs sentiment analysis on Twitter data using Natural Language Processing (NLP) techniques, primarily the NLTK library in Python within a Jupyter notebook environment. The study explores sentiment classification methods, evaluating the emotional tone of tweets and categorizing each as neutral, positive, or negative with NLTK's SentimentIntensityAnalyzer. The sample consists of Twitter data loaded from a CSV file with columns such as 'Tweet' and 'Sentiment'. The methodology involves tokenizing and processing the text, scoring sentiment, counting occurrences of the hashtag #fifa, and analyzing word frequencies [1]. In addition to the lexicon-based VADER approach, the study incorporates a transformer-based deep learning model, BERT (Bidirectional Encoder Representations from Transformers), to enhance sentiment classification accuracy. BERT is pre-trained on large corpora and can capture context and nuanced language, which makes it a state-of-the-art alternative to traditional models. Including it allows a comparative analysis between rule-based and deep learning approaches and highlights BERT’s effectiveness in handling complex tweet structures. The study also investigates the impact of removing stopwords and reports the list of stopwords that were eliminated. The expected results include insights into the prevalent sentiments on Twitter regarding the chosen topic, the frequency of the hashtag #fifa, and an overview of word usage, depicted visually through wordclouds of the most frequent words. Possible limitations include the inherent subjectivity of sentiment analysis, variations in language use, reliance on hashtag frequency as an indicator of topic prevalence, and the effectiveness of stopword removal, which may be context-dependent.
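
The text-processing steps named in the abstract (tokenizing, counting the hashtag #fifa, removing stopwords, and computing word frequencies) can be sketched with the Python standard library alone. The block below is an illustrative sketch, not the authors' notebook: the function names, the tiny stopword set, and the sample tweets are all assumptions made for the example. The 0.05 cutoff on the VADER compound score is the commonly used convention, not a value stated in the abstract, and the optional `vader_compound` helper assumes NLTK and its `vader_lexicon` resource are installed.

```python
import re
from collections import Counter

# Tiny illustrative stopword set (assumption); NLTK ships a fuller English list.
STOPWORDS = {"the", "a", "an", "is", "are", "was", "to", "of", "and", "in", "for", "on"}

# A token is a run of word characters, optionally prefixed by '#'.
TOKEN_RE = re.compile(r"#?\w+")


def tokenize(text):
    """Lowercase a tweet and split it into word and hashtag tokens."""
    return TOKEN_RE.findall(text.lower())


def count_hashtag(tweets, tag="#fifa"):
    """Count exact occurrences of `tag` as a whole token across all tweets."""
    tag = tag.lower()
    return sum(tokenize(tweet).count(tag) for tweet in tweets)


def word_frequencies(tweets, stopwords=STOPWORDS):
    """Return (Counter of kept words, set of stopwords that were removed)."""
    kept, removed = Counter(), set()
    for tweet in tweets:
        for token in tokenize(tweet):
            if token.startswith("#"):
                continue  # hashtags are counted separately
            if token in stopwords:
                removed.add(token)
            else:
                kept[token] += 1
    return kept, removed


def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a three-way sentiment label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def vader_compound(text):
    """Score one tweet with NLTK's VADER; requires nltk and 'vader_lexicon'."""
    from nltk.sentiment.vader import SentimentIntensityAnalyzer
    return SentimentIntensityAnalyzer().polarity_scores(text)["compound"]


# Hypothetical sample tweets, used only to exercise the helpers.
SAMPLE_TWEETS = [
    "The #FIFA World Cup is amazing",
    "Worst match of the #fifa season",
    "Kickoff is at 8 #FIFA2022",
]
```

With NLTK available, `label_from_compound(vader_compound(tweet))` would give the three-way label for a single tweet, and the `Counter` returned by `word_frequencies` is the kind of frequency table a wordcloud can be drawn from. Matching `#fifa` as a whole token means `#FIFA2022` is not counted, which is a design choice in this sketch rather than something the abstract specifies.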

References
  1. Saif M. Mohammad (2017). Challenges in Sentiment Analysis. arXiv preprint. https://ufal.mff.cuni.cz/~hana/teaching/Mohammad2017_Chapter_ChallengesInSentimentAnalysis.pdf
  2. VADER. (2024). https://www.geeksforgeeks.org/python-sentiment-analysis-using-vader/
  3. Liu, B. (2012). Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies. https://www.cs.uic.edu/~liub/FBS/SentimentAnalysis-and-OpinionMining.pdf
  4. Liu, B. (2012). Sentiment Analysis and Opinion Mining. Synthesis Lectures on Human Language Technologies, 5(1), 1-167.
  5. Pang, B., & Lee, L. (2008). Thumbs up? Sentiment Classification using Machine Learning Techniques. https://www.cs.cornell.edu/home/llee/papers/sentiment.pdf
  6. Devlin, J., et al. (2018). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. https://arxiv.org/abs/1810.04805
  7. Zhang, L., Wang, S., & Liu, B. (2018). Deep Learning for Sentiment Analysis: A Survey. https://arxiv.org/abs/1801.07883
  8. Saif M. Mohammad, & S. Kiritchenko (2018). https://svkir.com/papers/Mohammad-Kiritchenko-Tweets-VAD-EI-LREC-2018.pdf
  9. Caliskan, A., et al. (2017). Semantics derived automatically from language corpora contain human-like biases. Science. https://www.science.org/doi/10.1126/science.aal4230
  10. "Natural Language Processing in Python: Exploring Word Frequencies with NLTK" - Medium. (2021) https://medium.com/@siglimumuni/natural-language-processing-in-python-exploring-word-frequencies-with-nltk-918f33c1e4c3
  11. Dataset. (2022). https://www.kaggle.com/datasets/tirendazacademy/fifa-world-cup-2022-tweets
  12. NLTK. (2025). https://www.nltk.org/
  13. "Simple WordCloud using NLTK Library in Python" - NLPfy. (2021) https://nlpfy.com/simple-wordcloud-using-nltk-library-in-python/
  14. Mueller, A. (2012). WordCloud Documentation. https://github.com/amueller/word_cloud
  15. Hutto, C. J., & Gilbert, E. (2014). VADER: A parsimonious rule-based model for sentiment analysis of social media text. Proceedings of the International AAAI Conference on Web and Social Media, 8(1), 216-225.
  16. MNB. (2024). https://www.geeksforgeeks.org/multinomial-naive-bayes/
  17. Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. https://arxiv.org/abs/1810.04805
  18. Kaggle (2023). https://www.kaggle.com/datasets/tirendazacademy/fifa-world-cup-2022-tweets/data.
  19. Snscrape (2007). https://github.com/JustAnotherArchivist/snscrape
Index Terms

Computer Science
Information Sciences

Keywords

Sentiment analysis, Twitter data mining, VADER, BERT, NLP