Research Article

Web Phishing Detection System: Bayesian and Clustering Approach

by Nilima Ramdas Narad, Sandeep U. Kadam
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 145 - Number 10
Year of Publication: 2016
Authors: Nilima Ramdas Narad, Sandeep U. Kadam

Nilima Ramdas Narad, Sandeep U. Kadam. Web Phishing Detection System: Bayesian and Clustering Approach. International Journal of Computer Applications. 145, 10 (Jul 2016), 7-10. DOI=10.5120/ijca2016910767

@article{ 10.5120/ijca2016910767,
author = { Nilima Ramdas Narad and Sandeep U. Kadam },
title = { Web Phishing Detection System: Bayesian and Clustering Approach },
journal = { International Journal of Computer Applications },
issue_date = { Jul 2016 },
volume = { 145 },
number = { 10 },
month = { Jul },
year = { 2016 },
issn = { 0975-8887 },
pages = { 7-10 },
numpages = {9},
url = { },
doi = { 10.5120/ijca2016910767 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:48:24.139442+05:30
%A Nilima Ramdas Narad
%A Sandeep U. Kadam
%T Web Phishing Detection System: Bayesian and Clustering Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 145
%N 10
%P 7-10
%D 2016
%I Foundation of Computer Science (FCS), NY, USA

Phishing is an online crime in which attackers build genuine-looking websites to lure users into disclosing sensitive information on those fraudulent sites. Website phishing is one of the major attacks by which Internet users are deceived. The best protection against phishing is the ability to recognize a phish. Phishing emails usually appear to come from a well-known organization and ask for personal information such as a credit card number, security number, account number, or password. What does the attacker actually do? The attacker creates a number of replicas of authentic sites and draws users to them with attractive offers.

Based on the standards published by the W3C (World Wide Web Consortium), I select features that describe the difference between a legitimate site and a phishing site, and I propose a model that uses them to identify fraudulent sites. To detect a phishing attack, the URL features and HTML features of a web page are considered. A clustering algorithm, K-Means, is applied to the database, and a prediction technique, the Naive Bayes classifier, is applied as well. Together these give the probability that a website is a valid phish or an invalid phish. When the validity of a URL is checked and the validity of the web page still cannot be decided, the Naive Bayes classifier is applied. A training model is also used to extract the HTML tag features of the site and to compute the probability.
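The abstract does not give the feature list or the exact way the two algorithms are combined, so the following is only a minimal sketch under stated assumptions. It assumes five illustrative binary URL features, a cluster-first decision that trusts a K-Means cluster only when nearly all of its labeled members agree, and a Bernoulli Naive Bayes fallback with Laplace smoothing. The names `url_features`, `cluster_purity`, `classify`, the 0.9 purity threshold, and the toy URLs are all hypothetical and are not taken from the paper.

```python
import math
from urllib.parse import urlparse

def url_features(url):
    """Map a URL to binary features (illustrative choices, not the paper's list)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return (
        int(host.replace(".", "").isdigit()),  # host is a raw IPv4 address
        int("@" in url),                       # '@' can hide the real host
        int(len(url) > 54),                    # long URL (arbitrary cutoff)
        int(host.count(".") > 3),              # many nested subdomains
        int(parsed.scheme != "https"),         # not served over HTTPS
    )

def nearest(centroids, p):
    """Index of the centroid closest to p (squared Euclidean distance)."""
    dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
    return dists.index(min(dists))

def kmeans(points, k=2, iters=20):
    """Lloyd's k-means, started from the first k distinct points (deterministic)."""
    centroids = [list(p) for p in dict.fromkeys(points)][:k]
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        centroids = [
            [sum(col) / len(members) for col in zip(*members)] if members else c
            for members, c in zip(clusters, centroids)
        ]
    return centroids

def cluster_purity(centroids, X, y):
    """For each cluster: (majority label, share of members with that label)."""
    members = [[] for _ in centroids]
    for x, label in zip(X, y):
        members[nearest(centroids, x)].append(label)
    result = []
    for labels in members:
        if not labels:
            result.append((None, 0.0))
            continue
        top = max(set(labels), key=labels.count)
        result.append((top, labels.count(top) / len(labels)))
    return result

class BernoulliNaiveBayes:
    """Naive Bayes over binary features with Laplace (add-one) smoothing."""
    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.prior, self.cond = {}, {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.prior[c] = len(rows) / len(y)
            self.cond[c] = [(sum(col) + 1) / (len(rows) + 2) for col in zip(*rows)]
        return self

    def predict_proba(self, x):
        log_post = {}
        for c in self.classes:
            lp = math.log(self.prior[c])
            for xj, p in zip(x, self.cond[c]):
                lp += math.log(p if xj else 1 - p)
            log_post[c] = lp
        m = max(log_post.values())
        z = sum(math.exp(v - m) for v in log_post.values())
        return {c: math.exp(v - m) / z for c, v in log_post.items()}

def classify(url, centroids, purity, model, threshold=0.9):
    """Trust the cluster label if the cluster is nearly pure, else use Naive Bayes."""
    x = url_features(url)
    label, share = purity[nearest(centroids, x)]
    if share >= threshold:
        return label
    probs = model.predict_proba(x)
    return max(probs, key=probs.get)

# Toy labeled database (made-up URLs on reserved example domains and addresses).
train = [
    ("https://www.example.com/login", "legit"),
    ("https://mail.example.org/inbox", "legit"),
    ("https://shop.example.net/cart", "legit"),
    ("http://192.168.10.5/secure/update.php", "phish"),
    ("http://account.verify.login.secure.example.test/confirm?id=1&session=abcdef", "phish"),
    ("http://bank.example.com@203.0.113.7/signin", "phish"),
]
X = [url_features(u) for u, _ in train]
y = [label for _, label in train]
centroids = kmeans(X, k=2)
purity = cluster_purity(centroids, X, y)
model = BernoulliNaiveBayes().fit(X, y)

for candidate in ("http://198.51.100.23/paypal/login.php", "https://docs.example.com/guide"):
    print(candidate, "->", classify(candidate, centroids, purity, model))
```

On this toy data one cluster mixes legitimate and phishing URLs, so queries that land in it are passed to the Naive Bayes step, which mirrors the fallback described in the abstract. A real system would also need the HTML tag features, which are left out here.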

Index Terms

Computer Science
Information Sciences


Anti Phishing, Bayesian technique, Data Mining, Database Clustering, Phishing Attack