Research Article

Feature Subset Selection for Twitter Spam Detection

Published on August 2018 by Harshita Tiwary, Indu Kashyap
National Conference on Networking, Cloud Computing, Analytics and Computing Technology
Foundation of Computer Science USA
NCNCCACT2017 - Number 1
August 2018

Harshita Tiwary, Indu Kashyap. Feature Subset Selection for Twitter Spam Detection. National Conference on Networking, Cloud Computing, Analytics and Computing Technology. NCNCCACT2017, 1 (August 2018), 24-28.

author = { Harshita Tiwary, Indu Kashyap },
title = { Feature Subset Selection for Twitter Spam Detection },
journal = { National Conference on Networking, Cloud Computing, Analytics and Computing Technology },
issue_date = { August 2018 },
volume = { NCNCCACT2017 },
number = { 1 },
month = { August },
year = { 2018 },
issn = { 0975-8887 },
pages = { 24-28 },
numpages = 5,
url = { /proceedings/ncnccact2017/number1/29779-7015/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
%0 Proceeding Article
%1 National Conference on Networking, Cloud Computing, Analytics and Computing Technology
%A Harshita Tiwary
%A Indu Kashyap
%T Feature Subset Selection for Twitter Spam Detection
%J National Conference on Networking, Cloud Computing, Analytics and Computing Technology
%@ 0975-8887
%N 1
%P 24-28
%D 2018
%I International Journal of Computer Applications

The rapid growth of social networking has had an immense effect on society and on the Web. Social networking sites have grown quickly in both size and popularity in recent years, and Twitter is one of the fastest growing among them. As the volume of data on Twitter increases, detecting spam in real time has become a challenging task for researchers as well as for Twitter itself. A great deal of work has been done on spam detection, but earlier work did not give satisfactory results for content-based spam detection on Twitter. In this paper, accuracy is first analyzed using classical approaches, namely the Naïve Bayes and Random Forest algorithms, and it is observed that these algorithms do not give accurate results. To increase the accuracy of spam detection, Random Forest is combined with Feature Subset Selection. The aim is to propose a Feature Subset Based Classification Approach in which a set of features is tested using a Random Forest classifier for Twitter spam detection, thereby extending the capabilities of the Random Forest classifier for detecting spam.
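
The abstract does not list the features, the dataset, or the exact selection procedure, so the following is only a minimal sketch of the general idea: score candidate feature subsets with a forest-style classifier and keep the subset with the best validation accuracy. Everything in it is an assumption made for illustration: the data is synthetic, the four features are unnamed placeholders, the search is exhaustive, and the classifier is a deliberately tiny stand-in for a Random Forest (bagged one-split trees with one randomly chosen feature per tree) rather than the authors' implementation.

```python
import itertools
import random


def make_toy_data(n, rng):
    """Synthetic rows with 4 placeholder features; label 1 stands for spam."""
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        rows.append([
            rng.gauss(2.0 * y, 1.0),  # feature 0: informative by construction
            rng.gauss(1.5 * y, 1.0),  # feature 1: informative by construction
            rng.gauss(0.0, 1.0),      # feature 2: pure noise
            rng.gauss(0.0, 1.0),      # feature 3: pure noise
        ])
        labels.append(y)
    return rows, labels


def fit_stump(X, y, feats, rng):
    """Fit a one-split tree on a single feature drawn at random from feats."""
    f = rng.choice(feats)
    best = None
    for t in sorted({row[f] for row in X}):
        # Rows classified correctly when predicting spam for values above t.
        correct = sum((row[f] > t) == bool(lab) for row, lab in zip(X, y))
        # polarity 1: spam above t; polarity 0: spam at or below t.
        for polarity, score in ((1, correct), (0, len(y) - correct)):
            if best is None or score > best[0]:
                best = (score, f, t, polarity)
    return best[1:]


def fit_forest(X, y, feats, n_trees, rng):
    """Train n_trees stumps, each on its own bootstrap sample of the rows."""
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in X]
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats, rng))
    return forest


def predict(forest, row):
    """Majority vote over all stumps in the forest."""
    votes = sum((row[f] > t) == bool(pol) for f, t, pol in forest)
    return 1 if 2 * votes > len(forest) else 0


def accuracy(forest, X, y):
    return sum(predict(forest, r) == lab for r, lab in zip(X, y)) / len(y)


def select_subset(X_tr, y_tr, X_va, y_va, n_features, n_trees=15, seed=0):
    """Try every non-empty feature subset; keep the best validation accuracy."""
    best_subset, best_acc = None, -1.0
    for k in range(1, n_features + 1):
        for subset in itertools.combinations(range(n_features), k):
            forest = fit_forest(X_tr, y_tr, list(subset), n_trees, random.Random(seed))
            acc = accuracy(forest, X_va, y_va)
            if acc > best_acc:
                best_subset, best_acc = subset, acc
    return best_subset, best_acc


rng = random.Random(42)
X, y = make_toy_data(200, rng)
X_tr, y_tr, X_va, y_va = X[:120], y[:120], X[120:], y[120:]
best_subset, best_acc = select_subset(X_tr, y_tr, X_va, y_va, n_features=4)
print("best subset:", best_subset, "validation accuracy:", round(best_acc, 3))
```

Exhaustive search is only practical for a handful of features, since the number of subsets doubles with each feature added; with a larger feature set a greedy forward or backward search would be the usual substitute. Because the subset is chosen on the validation split, a separate held-out test split would be needed for an unbiased accuracy estimate.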

Index Terms

Computer Science
Information Sciences


Labeled Dataset, Feature Subset Selection, Random Forest