Sherlock: An Ensemble based Deep Learning Framework for Fake News Detection

Sameer Kulkarni; R. M. Samant; Atharva Bhusari

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 176 - Number 22

Year of Publication: 2020

Authors: Sameer Kulkarni, R. M. Samant, Atharva Bhusari

DOI: 10.5120/ijca2020920218

Sameer Kulkarni, R. M. Samant, Atharva Bhusari. Sherlock: An Ensemble based Deep Learning Framework for Fake News Detection. International Journal of Computer Applications. 176, 22 (May 2020), 37-42. DOI=10.5120/ijca2020920218

@article{ 10.5120/ijca2020920218,
  author = { Sameer Kulkarni and R. M. Samant and Atharva Bhusari },
  title = { Sherlock: An Ensemble based Deep Learning Framework for Fake News Detection },
  journal = { International Journal of Computer Applications },
  issue_date = { May 2020 },
  volume = { 176 },
  number = { 22 },
  month = { May },
  year = { 2020 },
  issn = { 0975-8887 },
  pages = { 37-42 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume176/number22/31333-2020920218/ },
  doi = { 10.5120/ijca2020920218 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

Abstract

Fake news is damaging societal harmony and peace. Given the magnitude of this harm, a solution is needed to curb the online spread of fake news. In recent times, fake news detection has been tackled with approaches such as manual checks, statistical classification algorithms and deep learning techniques. The task is tricky, however, because news reporting is not binary (entirely true or entirely false). Results reported in existing research call for deeper investigation, such as classifying news articles on a scale from entirely true to entirely false rather than into two classes. This paper proposes Sherlock, a novel ensemble-based framework that detects fake news articles using natural language processing (NLP) and deep learning techniques. Because a single approach gave unsatisfactory results, the framework combines three distinct classification tasks, based on the article’s semantic structure, the credibility of its source and the sentiment of the news. Using pre-trained word vectors as word embeddings for semantic analysis improved performance by 2-4%. A scale for measuring the fakeness of news is also proposed: Sherlock classifies a given news article into one of four degrees of fakeness, “true”, “mostly-fake”, “entirely-fake” or “uncertain”. A comparison of text classification performance across various statistical machine learning algorithms and deep neural networks is reported on two publicly available benchmark datasets. The best test accuracies, 94% for binary classification and 65.5% for multiclass classification, were obtained with a GRU (Gated Recurrent Unit) based deep neural network model, which has been incorporated into the proposed framework. Sherlock uses a browser plugin to accept news for detection via web scraping, and the training dataset is updated with this news to establish context for current affairs. To the best of our knowledge, this is the first introduction of an indigenous dataset that is frequently updated with Indian news context. The overall product experience of Sherlock interrupts the impulsive behavior of forwarding news, and thereby helps curb the rampant spread of fake news.
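As a rough illustration of the kind of GRU-based classifier the abstract describes, the sketch below runs a single-layer GRU over a sequence of word embeddings and applies a softmax over the four fakeness labels. It is not the authors' model: the layer sizes, the random (untrained) weights, and the helper names `init_params`, `gru_step` and `classify` are all assumptions made for illustration, and the update rule follows one common GRU convention.

```python
import math
import random

# The four degrees of fakeness named in the abstract.
LABELS = ["true", "mostly-fake", "entirely-fake", "uncertain"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def init_params(embed_dim, hidden_dim, n_classes, seed=0):
    """Hypothetical random weights; a real model would learn these."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for name in ("z", "r", "h"):  # update gate, reset gate, candidate state
        p["W" + name] = mat(hidden_dim, embed_dim)
        p["U" + name] = mat(hidden_dim, hidden_dim)
        p["b" + name] = [0.0] * hidden_dim
    p["Wo"] = mat(n_classes, hidden_dim)  # output (softmax) layer
    p["bo"] = [0.0] * n_classes
    return p


def gate(p, name, x, h, act):
    """Compute act(W x + U h + b) for the named gate."""
    wx = matvec(p["W" + name], x)
    uh = matvec(p["U" + name], h)
    return [act(a + b + c) for a, b, c in zip(wx, uh, p["b" + name])]


def gru_step(p, x, h):
    """One GRU time step: h_t = z * h_prev + (1 - z) * candidate."""
    z = gate(p, "z", x, h, sigmoid)
    r = gate(p, "r", x, h, sigmoid)
    reset_h = [ri * hi for ri, hi in zip(r, h)]
    cand = gate(p, "h", x, reset_h, math.tanh)
    return [zi * hi + (1.0 - zi) * ci for zi, hi, ci in zip(z, h, cand)]


def classify(embeddings, p):
    """Return softmax probabilities over LABELS for a sequence of word vectors."""
    h = [0.0] * len(p["bz"])
    for x in embeddings:
        h = gru_step(p, x, h)
    logits = [a + b for a, b in zip(matvec(p["Wo"], h), p["bo"])]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    params = init_params(embed_dim=5, hidden_dim=3, n_classes=len(LABELS))
    rng = random.Random(1)
    # Stand-in for pre-trained word vectors of a four-word article.
    article = [[rng.uniform(-1, 1) for _ in range(5)] for _ in range(4)]
    for label, prob in zip(LABELS, classify(article, params)):
        print(label, round(prob, 3))
```

In practice the input vectors would come from a pre-trained embedding table, as the abstract notes, and the weights would be trained on labeled articles. With untrained weights the probabilities are meaningless and only show the data flow.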

Index Terms

Computer Science

Information Sciences

Keywords

Fake News, Natural Language Processing, Deep Learning, Semantic Analysis, Sentiment Analysis, Pre-trained Vectors, Gated Recurrent Unit