Multilabel Classification of Tweets

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2017
Authors: Abha Tewari, Pratik Sawant, Jai Samtani, Sanket Sawant, Gaurav Massand
DOI: 10.5120/ijca2017912209

Abha Tewari, Pratik Sawant, Jai Samtani, Sanket Sawant and Gaurav Massand. Multilabel Classification of Tweets. International Journal of Computer Applications 159(1):1-4, February 2017.

BibTeX

@article{10.5120/ijca2017912209,
	author = {Abha Tewari and Pratik Sawant and Jai Samtani and Sanket Sawant and Gaurav Massand},
	title = {Multilabel Classification of Tweets},
	journal = {International Journal of Computer Applications},
	issue_date = {February 2017},
	volume = {159},
	number = {1},
	month = {Feb},
	year = {2017},
	issn = {0975-8887},
	pages = {1-4},
	numpages = {4},
	url = {http://www.ijcaonline.org/archives/volume159/number1/26962-2017912209},
	doi = {10.5120/ijca2017912209},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

Many news providers use social networking sites to share their headlines on microblogging platforms such as Twitter. We propose a system that classifies tweets into groups and labels so that a user can find the tweets belonging to a particular category. We use 120-character tweets for our analysis, extracted from a selection of active, verified Twitter accounts. Each tweet is first classified into one of two categories: spam or non-spam. Spam tweets are then further classified as advertisement, malicious, or URL links, while non-spam tweets are classified into six labels. The classified tweets are used to train several machine learning techniques. The words of each tweet are treated as features, and a feature vector is built for each instance using the bag-of-words approach. Models will be trained on the data using SVM (Support Vector Machine), Naive Bayes, and K-nearest-neighbor techniques, and their efficiency will be compared.
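
The two-stage scheme described in the abstract (spam versus non-spam first, then a label within the chosen group, with bag-of-words features) can be sketched as follows. This is a minimal illustration and not the authors' implementation. It uses only a small hand-written multinomial Naive Bayes classifier, one of the three techniques named above. The toy tweets, the non-spam label names ("politics", "sports"), and the helper names `bag_of_words`, `train_two_stage`, and `classify` are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def bag_of_words(text):
    """Lowercase the tweet, split on whitespace, and count each word."""
    return Counter(text.lower().split())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(bag_of_words(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word, freq in bag_of_words(text).items():
                if word in self.vocab:  # ignore words never seen in training
                    score += freq * math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: (tweet text, top-level group, label within the group).
TRAIN = [
    ("buy cheap watches now discount offer", "spam", "advertisement"),
    ("limited offer buy followers cheap", "spam", "advertisement"),
    ("your account is locked verify password here", "spam", "malicious"),
    ("verify your password to unlock account", "spam", "malicious"),
    ("government announces new budget for schools", "non-spam", "politics"),
    ("minister debates budget in parliament", "non-spam", "politics"),
    ("team wins final match in overtime", "non-spam", "sports"),
    ("striker scores twice as team wins", "non-spam", "sports"),
]


def train_two_stage(rows):
    """Train one spam/non-spam model plus one label model per group."""
    top = NaiveBayes().fit([t for t, _, _ in rows], [g for _, g, _ in rows])
    sub = {}
    for group in {g for _, g, _ in rows}:
        subset = [(t, label) for t, g, label in rows if g == group]
        sub[group] = NaiveBayes().fit(
            [t for t, _ in subset], [label for _, label in subset]
        )
    return top, sub


def classify(tweet, top, sub):
    """Pick the group first, then the label within that group."""
    group = top.predict(tweet)
    return group, sub[group].predict(tweet)


top, sub = train_two_stage(TRAIN)
print(classify("cheap discount offer buy now", top, sub))
print(classify("team wins the match", top, sub))
```

Comparing techniques, as the abstract proposes, would mean training an SVM and a K-nearest-neighbor classifier on the same bag-of-words vectors and evaluating all three on held-out tweets.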

Keywords

SVM (Support Vector Machine), NLP (Natural Language Processing), NB (Naïve Bayes), KNN (K-Nearest Neighbor)