Research Article

Phishing URL Detection: A Machine Learning and Web Mining-based Approach

by Bhagyashree E. Sananse, Tanuja K. Sarode
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 123 - Number 13
Year of Publication: 2015
10.5120/ijca2015905665

Bhagyashree E. Sananse, Tanuja K. Sarode. Phishing URL Detection: A Machine Learning and Web Mining-based Approach. International Journal of Computer Applications. 123, 13 (August 2015), 46-50. DOI=10.5120/ijca2015905665

@article{ 10.5120/ijca2015905665,
author = { Bhagyashree E. Sananse and Tanuja K. Sarode },
title = { Phishing URL Detection: A Machine Learning and Web Mining-based Approach },
journal = { International Journal of Computer Applications },
issue_date = { August 2015 },
volume = { 123 },
number = { 13 },
month = { August },
year = { 2015 },
issn = { 0975-8887 },
pages = { 46-50 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume123/number13/22022-2015905665/ },
doi = { 10.5120/ijca2015905665 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
Abstract

The development and use of online transactions have grown rapidly over the past decade. The increased sophistication of cyber criminals has led to a proliferation of phishing attacks, and the continuous expansion of the World Wide Web has led to the rapid spread of phishing, malware and spamming. This paper proposes a feature-based approach to classify URLs as phishing or non-phishing. A variety of URL features are derived by studying the anatomy of URLs. Two different algorithms are used for classification. The Random Forest machine learning algorithm is used to build an efficient classifier that decides whether a given URL is phishing or not. In addition, a novel scheme is proposed to detect phishing URLs by mining the publicly available content on the URLs.
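
To make the idea of features derived from URL anatomy concrete, the following is a minimal sketch in Python. The function name `extract_url_features` and the specific features chosen are illustrative assumptions; the abstract does not list the paper's actual feature set, so this should not be read as the authors' method.

```python
import re
from urllib.parse import urlparse


def extract_url_features(url: str) -> dict:
    """Return a few lexical features of a URL.

    The feature set is hypothetical and chosen for illustration only;
    it is not the feature set used in the paper.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""  # hostname is None when the URL has no netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_hyphens_in_host": host.count("-"),
        # A dotted-quad host (raw IPv4 address) instead of a domain name
        "has_ip_host": bool(re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host)),
        # '@' can hide the real host behind a familiar-looking prefix
        "has_at_symbol": "@" in url,
        # Number of non-empty path segments
        "path_depth": len([seg for seg in parsed.path.split("/") if seg]),
        "uses_https": parsed.scheme == "https",
    }


if __name__ == "__main__":
    for example in (
        "https://www.example.com/",
        "http://192.168.0.1/login/update.php",
    ):
        print(example, extract_url_features(example))
```

Feature dictionaries of this kind could then be converted to numeric vectors and passed to any Random Forest implementation for training, with labelled phishing and benign URLs as the training data.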

Index Terms

Computer Science
Information Sciences

Keywords

Phishing URL, web mining, benign, machine learning