A Study using Support Vector Machines to Classify the Sentiments of Tweets

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2017
Authors:
Wassim A. Zgheib, Aziz M. Barbar
10.5120/ijca2017914690

Wassim A Zgheib and Aziz M Barbar. A Study using Support Vector Machines to Classify the Sentiments of Tweets. International Journal of Computer Applications 170(2):8-12, July 2017.

@article{10.5120/ijca2017914690,
	author = {Wassim A. Zgheib and Aziz M. Barbar},
	title = {A Study using Support Vector Machines to Classify the Sentiments of Tweets},
	journal = {International Journal of Computer Applications},
	issue_date = {July 2017},
	volume = {170},
	number = {2},
	month = {Jul},
	year = {2017},
	issn = {0975-8887},
	pages = {8-12},
	numpages = {5},
	url = {http://www.ijcaonline.org/archives/volume170/number2/28040-2017914690},
	doi = {10.5120/ijca2017914690},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

It is difficult to sidestep Big Data today, as the industry is abuzz with its promises. The trend is towards data-driven decision-making in all aspects of business, because making sense of data is profitable and valuable. People use social media, especially Twitter, to share their opinions and sentiments. However, because the available data is often noisy, varied, and unfiltered, and because manually labeling large numbers of tweets to train classifiers is impractical, acquiring training data for sentiment analysis classifiers is becoming more and more of a challenge. This paper proposes a solution for easily acquiring large amounts of automatically labeled and filtered training data from Twitter, to be given as input to a support vector machine classifier. The proposed solution works around the lack of labeled data by using Twitter hashtags to automatically infer the sentiment of a tweet (positive or negative). The neutral class is trained using tweets generated by newspaper accounts. A test study was conducted to show the accuracy of the applied features on the classifier. As a result, tweets trending on Twitter can now be analyzed to infer their sentiments, which helps organizations make future data-driven decisions.
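
The hashtag-based labeling idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the seed hashtags, the account names, and the function name `auto_label` are hypothetical placeholders, since the abstract does not list the hashtags or newspaper accounts actually used. Stripping the seed hashtags from the text before training is likewise an assumption made here, so that a classifier would not simply memorise the label-bearing token.

```python
import re

# Hypothetical seed hashtags; the paper's actual lists are not given in the abstract.
POSITIVE_TAGS = {"#happy", "#love", "#great"}
NEGATIVE_TAGS = {"#sad", "#angry", "#fail"}
# Hypothetical account names standing in for newspaper accounts (neutral class).
NEWS_ACCOUNTS = {"example_daily", "example_times"}

HASHTAG_RE = re.compile(r"#\w+")


def auto_label(text, author):
    """Return (label, cleaned_text), or None when no label can be assigned."""
    # Tweets from newspaper accounts are taken as neutral examples.
    if author.lower() in NEWS_ACCOUNTS:
        return "neutral", text

    tags = {t.lower() for t in HASHTAG_RE.findall(text)}
    has_pos = bool(tags & POSITIVE_TAGS)
    has_neg = bool(tags & NEGATIVE_TAGS)
    if has_pos == has_neg:
        # Either no seed hashtag is present or both polarities appear: skip.
        return None

    label = "positive" if has_pos else "negative"

    # Assumption: drop the seed hashtags so the label does not leak into the features.
    seeds = POSITIVE_TAGS | NEGATIVE_TAGS
    cleaned = HASHTAG_RE.sub(
        lambda m: "" if m.group().lower() in seeds else m.group(), text
    )
    return label, " ".join(cleaned.split())
```

Under these assumptions, a tweet with a single-polarity seed hashtag gets that polarity as its label, a tweet from a listed news account is labeled neutral, and anything ambiguous is discarded rather than guessed. The resulting (label, text) pairs would then be converted to feature vectors and passed to a support vector machine, a step this sketch leaves out.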

Keywords

Big Data, data-driven, Twitter, automatically labeled, training, support vector machine, unlabeled data, hashtags, newspapers.