
Social Media Forensics for Hate Speech Opinion Mining

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
George Wafula Wanjala, Andrew M. Kahonge

George Wafula Wanjala and Andrew M Kahonge. Social Media Forensics for Hate Speech Opinion Mining. International Journal of Computer Applications 155(1):39-47, December 2016. BibTeX

	author = {George Wafula Wanjala and Andrew M. Kahonge},
	title = {Social Media Forensics for Hate Speech Opinion Mining},
	journal = {International Journal of Computer Applications},
	issue_date = {December 2016},
	volume = {155},
	number = {1},
	month = {Dec},
	year = {2016},
	issn = {0975-8887},
	pages = {39-47},
	numpages = {9},
	url = {},
	doi = {10.5120/ijca2016912258},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


Hate speech on social media has continued to grow both locally and globally, driven by the spread of online social media platforms and web forums such as Facebook, Twitter and blogs. Locally, smartphone adoption and mobile data penetration have propelled it even further. Global and local terrorism has made it vital for technologists to find ways to investigate, prosecute, predict and prevent social media hate speech.

This study provides a social media digital forensics tool through the design, development and implementation of a software application. The application is built on Linux, Apache, MySQL, PHP and Python (LAMP). It uses Scrapy, a Python web-crawling framework, together with page ranking to perform web crawling, and the crawled data is placed in a MySQL database for data mining.
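The crawl-and-store step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses only the Python standard library in place of Scrapy, and an in-memory SQLite database stands in for MySQL so the example runs without a server. The names `PageParser`, `open_store`, `store_page` and the `pages(url, body)` table are hypothetical.

```python
import sqlite3
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collect visible text and outgoing links from one HTML page."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.text_parts.append(data.strip())


def open_store():
    """Create the hypothetical pages table (SQLite stands in for MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, body TEXT)")
    return conn


def store_page(conn, url, html):
    """Save the page text and return its links for the crawler to follow."""
    parser = PageParser()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    body = " ".join(parser.text_parts)
    conn.execute(
        "INSERT OR REPLACE INTO pages (url, body) VALUES (?, ?)", (url, body)
    )
    conn.commit()
    return parser.links
```

In a real crawler the returned links would be queued for fetching; a framework such as Scrapy handles that scheduling, along with request throttling and duplicate filtering.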

The application was developed using the Agile software development methodology. Twenty websites serve as the sample to demonstrate how the application works, with Python libraries as the framework for web crawling. MySQL data mining and database query models are used to search the crawled data for a lexicon of hate speech keywords, and inferences are drawn from the data mined from the crawled web pages.
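The lexicon search over stored pages might look like the following sketch. It is an illustration under stated assumptions, not the study's method: the lexicon entries are neutral placeholders, the matching is done with a Python regular expression rather than MySQL queries, and SQLite again stands in for MySQL. The names `LEXICON`, `make_demo_db`, `build_pattern` and `flag_pages` are hypothetical.

```python
import re
import sqlite3

# Placeholder lexicon: neutral stand-ins, not the keywords used in the study.
LEXICON = ["badword", "slur phrase"]


def make_demo_db():
    """Build a small in-memory table of crawled pages for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (url TEXT PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO pages (url, body) VALUES (?, ?)",
        [
            ("http://example.org/a", "This post contains badword once."),
            ("http://example.org/b", "Badword here and a slur phrase there."),
            ("http://example.org/c", "Nothing to see."),
        ],
    )
    return conn


def build_pattern(lexicon):
    """Compile one case-insensitive pattern matching any whole lexicon term."""
    escaped = (re.escape(term) for term in lexicon)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


def flag_pages(conn, lexicon):
    """Return (url, hit_count) for pages with at least one hit, most hits first."""
    pattern = build_pattern(lexicon)
    results = []
    for url, body in conn.execute("SELECT url, body FROM pages"):
        hits = len(pattern.findall(body))
        if hits:
            results.append((url, hits))
    # Sort by descending hit count, then by URL for a stable order.
    results.sort(key=lambda r: (-r[1], r[0]))
    return results


if __name__ == "__main__":
    for url, hits in flag_pages(make_demo_db(), LEXICON):
        print(url, hits)
```

Keyword counts of this kind only flag candidate pages. Drawing inferences about opinion or intent, as the study describes, would still require further analysis of the flagged text.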




Web Forums, Social Media, Linux Apache MySQL PHP Python (LAMP), Hate Speech, Opinion Mining, Digital Forensics.