Research Article

Anti-Phishing based on Text Classification using Bayesian Approach

Published in February 2015 by Pankaj H. Gawale, D. R. Patil
International Conference on Advances in Science and Technology
Foundation of Computer Science USA
ICAST2014 - Number 3

Pankaj H. Gawale, D. R. Patil. Anti-Phishing based on Text Classification using Bayesian Approach. International Conference on Advances in Science and Technology. ICAST2014, 3 (February 2015), 19-22.

@article{
author = { Pankaj H. Gawale, D. R. Patil },
title = { Anti-Phishing based on Text Classification using Bayesian Approach },
journal = { International Conference on Advances in Science and Technology },
issue_date = { February 2015 },
volume = { ICAST2014 },
number = { 3 },
month = { February },
year = { 2015 },
issn = 0975-8887,
pages = { 19-22 },
numpages = 4,
url = { /proceedings/icast2014/number3/19487-5036/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference on Advances in Science and Technology
%A Pankaj H. Gawale
%A D. R. Patil
%T Anti-Phishing based on Text Classification using Bayesian Approach
%J International Conference on Advances in Science and Technology
%@ 0975-8887
%V ICAST2014
%N 3
%P 19-22
%D 2015
%I International Journal of Computer Applications
Abstract

Phishing is an attack in which an individual or group steals confidential personal information, such as credit card details, bank account details, and passwords, from unsuspecting victims for use in illegal activities. In this paper we implement a text classifier based on a Bayesian approach for phishing detection. The text classifier operates on textual content to measure the similarity between a genuine web page and a suspicious web page. Stemming is used to simplify the model. To generate the classification threshold we use a probabilistic approach with a large data set of homepage URLs. In our experiments the phishing-page detection ratio is 98.87% and the false alarm ratio (FAR) is nearly zero.
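
The abstract gives no implementation details, so the following Python sketch only illustrates the general idea it describes: stem the words of two pages, compare their term vectors, and use Bayes' rule to decide whether the measured similarity points to phishing. The suffix-stripping stemmer, the cosine similarity measure, the ten-bin histogram, and every function name here are assumptions made for illustration, not the authors' method.

```python
import math
import re
from collections import Counter

# Crude suffix stripping; a stand-in (assumption) for a real stemmer.
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word):
    """Strip one common suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def term_vector(text):
    """Count stemmed lowercase alphabetic tokens in the page text."""
    return Counter(stem(w) for w in re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity of two term-count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def bin_index(sim, bins):
    """Map a similarity in [0, 1] to a histogram bin."""
    return min(int(sim * bins), bins - 1)


def fit(phish_sims, legit_sims, bins=10):
    """Estimate the prior and per-bin likelihoods from labelled similarities."""
    def likelihood(sims):
        counts = Counter(bin_index(s, bins) for s in sims)
        total = len(sims) + bins  # Laplace smoothing avoids zero likelihoods
        return [(counts[i] + 1) / total for i in range(bins)]

    n = len(phish_sims) + len(legit_sims)
    return {
        "bins": bins,
        "prior_phish": len(phish_sims) / n,
        "lik_phish": likelihood(phish_sims),
        "lik_legit": likelihood(legit_sims),
    }


def posterior_phish(model, sim):
    """P(phishing | similarity) by Bayes' rule over the histogram bins."""
    i = bin_index(sim, model["bins"])
    p = model["prior_phish"] * model["lik_phish"][i]
    q = (1 - model["prior_phish"]) * model["lik_legit"][i]
    return p / (p + q)


def is_phishing(model, protected_text, suspect_text):
    """Flag the suspect page when the posterior favours phishing."""
    sim = cosine_similarity(term_vector(protected_text), term_vector(suspect_text))
    return posterior_phish(model, sim) > 0.5
```

In this sketch the decision threshold is implicit: it is the similarity value at which the posterior crosses 0.5, so it moves with the training data rather than being fixed by hand.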

Index Terms

Computer Science
Information Sciences

Keywords

Uniform Resource Locator (URL), Web Pages, Phishing, Text Classifier, Bayesian Approach, Correct Classification Ratio (CCR), F-score, Matthews Correlation Coefficient (MCC), False Negative Ratio (FNR), False Alarm Ratio (FAR)