Comparative Study on Machine Learning Algorithms for Sentiment Classification

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2018
Mohammad Mohaiminul Islam, Naznin Sultana

Mohammad Mohaiminul Islam and Naznin Sultana. Comparative Study on Machine Learning Algorithms for Sentiment Classification. International Journal of Computer Applications 182(21):1-7, October 2018. BibTeX

	author = {Mohammad Mohaiminul Islam and Naznin Sultana},
	title = {Comparative Study on Machine Learning Algorithms for Sentiment Classification},
	journal = {International Journal of Computer Applications},
	issue_date = {October 2018},
	volume = {182},
	number = {21},
	month = {Oct},
	year = {2018},
	issn = {0975-8887},
	pages = {1-7},
	numpages = {7},
	url = {},
	doi = {10.5120/ijca2018917961},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


Sentiment analysis is the study of people’s opinions and emotional feedback towards an entity, which can be a product, service, individual or event. These opinions are most likely to be expressed as reviews or comments. With the advent of social networks, forums and blogs, such reviews have emerged as an important factor in customers’ decisions to purchase or choose an item. Today, large, scalable computing environments provide sophisticated ways of carrying out data-intensive natural language processing (NLP) and machine learning tasks to analyze these reviews. One such task is text classification, an effective way of predicting customers’ sentiment. This paper investigates different approaches to sentiment analysis of customer reviews using machine learning algorithms. To classify text by overall sentiment, we considered two classes, i.e. predicting whether a comment or review is positive or negative. In our study, we used two popular public datasets and six machine learning algorithms: Naïve Bayes (Multinomial and Bernoulli), Logistic Regression, SGD (Stochastic Gradient Descent), Linear SVM (Support Vector Machine) and RF (Random Forest). Moreover, we applied parameter optimization to the SVM and SGD classifiers over different threshold values to identify and analyze differences in classifier accuracy and to obtain the optimal outcome from the model.
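
As a rough illustration of the two-class setup described in the abstract, the sketch below implements one of the listed algorithms, Multinomial Naïve Bayes, from scratch with add-one (Laplace) smoothing. It is not the authors' implementation, which the abstract does not show. The function names (`tokenize`, `train_multinomial_nb`, `predict`), the whitespace tokenizer, and the toy reviews are all assumptions made for this example; the paper's actual datasets and preprocessing are not reproduced here.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a deliberately simple tokenizer."""
    return text.lower().split()


def train_multinomial_nb(docs, labels, alpha=1.0):
    """Fit a multinomial Naive Bayes model with add-alpha (Laplace) smoothing."""
    classes = sorted(set(labels))
    word_counts = {c: Counter() for c in classes}
    doc_counts = Counter(labels)
    for text, label in zip(docs, labels):
        word_counts[label].update(tokenize(text))

    # The vocabulary is the union of all words seen in training.
    vocab = set()
    for c in classes:
        vocab.update(word_counts[c])

    model = {"classes": classes, "vocab": vocab,
             "log_prior": {}, "log_likelihood": {}}
    for c in classes:
        model["log_prior"][c] = math.log(doc_counts[c] / len(docs))
        total = sum(word_counts[c].values()) + alpha * len(vocab)
        model["log_likelihood"][c] = {
            w: math.log((word_counts[c][w] + alpha) / total) for w in vocab
        }
    return model


def predict(model, text):
    """Return the class with the highest posterior log-probability."""
    scores = {}
    for c in model["classes"]:
        score = model["log_prior"][c]
        for w in tokenize(text):
            if w in model["vocab"]:  # words unseen in training are ignored
                score += model["log_likelihood"][c][w]
        scores[c] = score
    return max(scores, key=scores.get)


# Hypothetical toy reviews, invented for illustration only.
train_docs = [
    "great product works perfectly",
    "excellent quality and great value",
    "loved it highly recommend",
    "terrible product broke quickly",
    "poor quality waste of money",
    "awful experience do not recommend",
]
train_labels = ["positive", "positive", "positive",
                "negative", "negative", "negative"]

model = train_multinomial_nb(train_docs, train_labels)
print(predict(model, "great quality highly recommend"))  # positive
print(predict(model, "terrible waste of money"))         # negative
```

A comparative study of the kind described would train each classifier on the same training split and compare accuracy on a held-out test split; the toy data here is far too small for that and only shows the mechanics of a single classifier.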


Keywords: Natural Language Processing, Sentiment Analysis, Opinion Mining, Machine Learning.