A Study using Support Vector Machines to Classify the Sentiments of Tweets

Wassim A. Zgheib; Aziz M. Barbar


Research Article


International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 170 - Number 2

Year of Publication: 2017

Authors: Wassim A. Zgheib, Aziz M. Barbar

DOI: 10.5120/ijca2017914690

Wassim A. Zgheib, Aziz M. Barbar. A Study using Support Vector Machines to Classify the Sentiments of Tweets. International Journal of Computer Applications. 170, 2 (Jul 2017), 8-12. DOI=10.5120/ijca2017914690

@article{ 10.5120/ijca2017914690,

author = { Wassim A. Zgheib and Aziz M. Barbar },

title = { A Study using Support Vector Machines to Classify the Sentiments of Tweets },

journal = { International Journal of Computer Applications },

issue_date = { Jul 2017 },

volume = { 170 },

number = { 2 },

month = { Jul },

year = { 2017 },

issn = { 0975-8887 },

pages = { 8-12 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume170/number2/28040-2017914690/ },

doi = { 10.5120/ijca2017914690 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-07T00:17:23.100736+05:30

%A Wassim A. Zgheib

%A Aziz M. Barbar

%T A Study using Support Vector Machines to Classify the Sentiments of Tweets

%J International Journal of Computer Applications

%@ 0975-8887

%V 170

%N 2

%P 8-12

%D 2017

%I Foundation of Computer Science (FCS), NY, USA

Abstract

It is difficult to sidestep Big Data today, as the industry is abuzz with its promises. The trend is towards data-driven decision-making in all aspects of business, because making sense of data is both profitable and valuable. People use social media, especially Twitter, to share their opinions and sentiments. However, this data tends to be noisy, varied, and unfiltered, and manually labeling large numbers of tweets to train classifiers is impractical, so acquiring training data for sentiment analysis classifiers is increasingly challenging. This paper proposes a way to easily acquire a large, automatically labeled, and filtered training set from Twitter to be given as input to a support vector machine classifier. The proposed solution works around the lack of labeled data by using Twitter hashtags to automatically infer the sentiment of a tweet (positive or negative). The neutral class is trained on tweets posted by newspaper accounts. A test study was conducted to show the classifier's accuracy with the applied features. As a result, tweets trending on Twitter can be analyzed to infer their sentiment, which helps organizations make future data-driven decisions.
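The labeling scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the seed hashtags, the news account names, and the helpers `label_tweet` and `build_training_set` are hypothetical, chosen only for illustration. Stripping the seed hashtags from the text and skipping tweets with conflicting hashtags are also assumptions that the abstract does not state.

```python
import re

# Hypothetical seed hashtags and news accounts, chosen only for illustration;
# the abstract does not list the ones the authors used.
POSITIVE_TAGS = {"#happy", "#love", "#great"}
NEGATIVE_TAGS = {"#sad", "#angry", "#fail"}
NEWS_ACCOUNTS = {"nytimes", "bbcnews", "reuters"}

HASHTAG_RE = re.compile(r"#\w+")


def _drop_seed(match):
    """Replace a seed hashtag with nothing; keep every other hashtag."""
    tag = match.group()
    return "" if tag.lower() in (POSITIVE_TAGS | NEGATIVE_TAGS) else tag


def label_tweet(text, author):
    """Return (label, cleaned_text), or None if no label can be inferred."""
    tags = {t.lower() for t in HASHTAG_RE.findall(text)}
    has_pos = bool(tags & POSITIVE_TAGS)
    has_neg = bool(tags & NEGATIVE_TAGS)

    if author.lower() in NEWS_ACCOUNTS:
        label = "neutral"      # tweets from newspaper accounts
    elif has_pos and not has_neg:
        label = "positive"
    elif has_neg and not has_pos:
        label = "negative"
    else:
        return None            # no seed hashtag, or conflicting ones

    # Assumption: remove the seed hashtags so that a classifier trained on
    # this text cannot simply memorize the labeling signal.
    cleaned = HASHTAG_RE.sub(_drop_seed, text)
    return label, " ".join(cleaned.split())


def build_training_set(tweets):
    """Turn an iterable of (author, text) pairs into (label, text) pairs."""
    labeled = (label_tweet(text, author) for author, text in tweets)
    return [pair for pair in labeled if pair is not None]
```

The resulting (label, text) pairs would then be converted to feature vectors and passed to a support vector machine classifier. That step is left out here because the abstract does not specify the features.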


Index Terms

Computer Science

Information Sciences

Keywords

Big Data, data-driven, Twitter, automatically labeled training, support vector machine, unlabeled data, hashtags, newspapers