Research Article

Phishing URL Detection: A novel hybrid Approach using Long Short-Term Memory and Gated Recurrent Units

by B.A.S. Dilhara
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 183 - Number 44
Year of Publication: 2021
Authors: B.A.S. Dilhara
10.5120/ijca2021921859

B.A.S. Dilhara. Phishing URL Detection: A novel hybrid Approach using Long Short-Term Memory and Gated Recurrent Units. International Journal of Computer Applications. 183, 44 (Dec 2021), 41-54. DOI=10.5120/ijca2021921859

@article{ 10.5120/ijca2021921859,
author = { B.A.S. Dilhara },
title = { Phishing URL Detection: A novel hybrid Approach using Long Short-Term Memory and Gated Recurrent Units },
journal = { International Journal of Computer Applications },
issue_date = { Dec 2021 },
volume = { 183 },
number = { 44 },
month = { Dec },
year = { 2021 },
issn = { 0975-8887 },
pages = { 41-54 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume183/number44/32230-2021921859/ },
doi = { 10.5120/ijca2021921859 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:19:42.262983+05:30
%A B.A.S. Dilhara
%T Phishing URL Detection: A novel hybrid Approach using Long Short-Term Memory and Gated Recurrent Units
%J International Journal of Computer Applications
%@ 0975-8887
%V 183
%N 44
%P 41-54
%D 2021
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Phishing is one of the oldest types of cyber-attack. It mostly arrives as camouflaged URLs that deceive users into disclosing personal information for the attacker's malicious purposes, and it is one of the easiest ways of inducing people to reveal personal credentials, including credit card details. Most phishing attacks take the form of fake websites that mimic a trustworthy website, and attackers use the URLs of these malicious websites to carry out data breaches. It is therefore necessary to distinguish benign URLs from malicious ones. This study proposes three non-hybrid deep learning models, namely CNN (1D), LSTM, and GRU, and four hybrid deep learning models, namely GRU-LSTM, LSTM-LSTM, BI (GRU)-LSTM, and BI (LSTM)-LSTM. Based on the results obtained, the BI (GRU)-LSTM model performed best, with an accuracy of 93.91%, precision of 93.94%, recall of 93.38%, and F1-score of 93.66%. The primary objective of this paper is thus to provide insight into hybrid deep learning approaches to phishing URL detection by evaluating their accuracy, precision, recall, and F1-score.
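The four metrics named in the abstract follow the standard definitions for binary classification. The sketch below is illustrative only: the function names and the confusion-matrix counts are hypothetical and are not taken from the paper. It also checks that the reported F1-score is consistent with the reported precision and recall, assuming F1 is the usual harmonic mean of the two.

```python
def accuracy(tp, tn, fp, fn):
    """Fraction of all URLs that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def precision(tp, fp):
    """Fraction of URLs flagged as phishing that really are phishing."""
    return tp / (tp + fp)


def recall(tp, fn):
    """Fraction of actual phishing URLs that were flagged."""
    return tp / (tp + fn)


def f1_score(p, r):
    """Harmonic mean of precision p and recall r."""
    return 2 * p * r / (p + r)


# Hypothetical confusion-matrix counts, for illustration only
# (these are NOT the paper's data).
tp, tn, fp, fn = 80, 90, 20, 10
p = precision(tp, fp)
r = recall(tp, fn)
print(f"accuracy={accuracy(tp, tn, fp, fn):.4f} "
      f"precision={p:.4f} recall={r:.4f} f1={f1_score(p, r):.4f}")

# Consistency check on the figures reported in the abstract (percentages).
reported_f1 = f1_score(93.94, 93.38)
print(round(reported_f1, 2))  # 93.66
```

Under that assumption, the harmonic mean of 93.94% and 93.38% rounds to the 93.66% quoted for the BI (GRU)-LSTM model.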

Index Terms

Computer Science
Information Sciences

Keywords

Deep Learning, URL Classification, Hybrid Approach, Phishing URL detection