Research Article

Sherlock: An Ensemble based Deep Learning Framework for Fake News Detection

by Sameer Kulkarni, R. M. Samant, Atharva Bhusari
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 176 - Number 22
Year of Publication: 2020
10.5120/ijca2020920218

Sameer Kulkarni, R. M. Samant, Atharva Bhusari. Sherlock: An Ensemble based Deep Learning Framework for Fake News Detection. International Journal of Computer Applications. 176, 22 (May 2020), 37-42. DOI=10.5120/ijca2020920218

@article{ 10.5120/ijca2020920218,
author = { Sameer Kulkarni and R. M. Samant and Atharva Bhusari },
title = { Sherlock: An Ensemble based Deep Learning Framework for Fake News Detection },
journal = { International Journal of Computer Applications },
issue_date = { May 2020 },
volume = { 176 },
number = { 22 },
month = { May },
year = { 2020 },
issn = { 0975-8887 },
pages = { 37-42 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume176/number22/31333-2020920218/ },
doi = { 10.5120/ijca2020920218 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:43:14.167151+05:30
%A Sameer Kulkarni
%A R. M. Samant
%A Atharva Bhusari
%T Sherlock: An Ensemble based Deep Learning Framework for Fake News Detection
%J International Journal of Computer Applications
%@ 0975-8887
%V 176
%N 22
%P 37-42
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Fake news threatens societal harmony and peace. Given the magnitude of this harm, a solution is needed to curb the online spread of fake news. Fake news detection has recently been tackled with approaches such as manual checks, statistical classification algorithms, and deep learning techniques. The task is tricky, however, because news reporting is not binary (entirely true or entirely false). Results reported in existing research call for deeper investigation, such as classifying news articles on a scale from entirely true to entirely false rather than as a binary choice. This paper proposes Sherlock, a novel ensemble-based framework that detects fake news articles using natural language processing (NLP) and deep learning techniques. Because a single approach gives unsatisfactory results, the framework consists of three distinct classification tasks based on the article’s semantic structure, its source credibility, and the sentiment of the news. Using pre-trained word vectors as word embeddings for semantic analysis boosted performance by 2-4%. Additionally, a scale for measuring the fakeness of news is proposed: Sherlock classifies a given news article into one of four degrees of fakeness, “true”, “mostly-fake”, “entirely-fake”, or “uncertain”. A comparison of text classification performance across various statistical machine learning algorithms and deep neural networks is also reported on two publicly available benchmark datasets. The best test accuracies, 94% for binary classification and 65.5% for multiclass classification, were obtained with a GRU (Gated Recurrent Unit) based deep neural network model, which has been incorporated into the proposed framework. Sherlock uses a browser plugin to accept news for detection via web scraping, and the training dataset is updated accordingly to establish context for current affairs. To the best of our knowledge, an indigenous dataset that is frequently updated with Indian news context is introduced for the first time. The overall product experience of Sherlock largely intervenes in the impulsive behavior of forwarding news, and thereby offers a solution to curb the rampant spread of fake news.
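The abstract does not say how the three component classifiers are combined, so the following is only an illustrative sketch of one way an ensemble could map component scores onto the four degrees of fakeness. The `ComponentScores` class, the `degree_of_fakeness` function, the weights, and every threshold are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentScores:
    """Hypothetical fakeness scores in [0, 1]; 1.0 means most likely fake."""
    semantic: float   # assumed output of a text classifier on the article body
    source: float     # assumed output of a source-credibility check
    sentiment: float  # assumed output of a sentiment model


def degree_of_fakeness(scores, weights=(0.5, 0.3, 0.2), disagreement_cutoff=0.6):
    """Map three component scores to one of the four labels from the abstract.

    The weighted average, the weights, and the cutoffs are illustrative
    assumptions, not the combination rule used by Sherlock.
    """
    values = (scores.semantic, scores.source, scores.sentiment)
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("scores must lie in [0, 1]")

    # If the components disagree strongly, report no clear verdict.
    if max(values) - min(values) > disagreement_cutoff:
        return "uncertain"

    # Otherwise take a weighted average and bucket it into three ranges.
    combined = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    if combined < 0.35:
        return "true"
    if combined < 0.7:
        return "mostly-fake"
    return "entirely-fake"


if __name__ == "__main__":
    example = ComponentScores(semantic=0.9, source=0.8, sentiment=0.95)
    print(degree_of_fakeness(example))
```

Routing strong disagreement to “uncertain” is one possible reading of that label; a learned meta-classifier over the component outputs would be an equally plausible design.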

Index Terms

Computer Science
Information Sciences

Keywords

Fake News, Natural Language Processing, Deep Learning, Semantic Analysis, Sentiment Analysis, Pre-trained Vectors, Gated Recurrent Unit