Kernel k-Means Clustering for Phishing Website and Malware Categorization

Kanti Sahu; S K. Shrivastava

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 111 - Number 9

Year of Publication: 2015

Authors: Kanti Sahu, S K. Shrivastava

10.5120/19565-1326

Kanti Sahu, S K. Shrivastava. Kernel k-Means Clustering for Phishing Website and Malware Categorization. International Journal of Computer Applications. 111, 9 (February 2015), 20-25. DOI=10.5120/19565-1326

@article{10.5120/19565-1326,
  author = {Kanti Sahu and S K. Shrivastava},
  title = {Kernel k-Means Clustering for Phishing Website and Malware Categorization},
  journal = {International Journal of Computer Applications},
  issue_date = {February 2015},
  volume = {111},
  number = {9},
  month = {February},
  year = {2015},
  issn = {0975-8887},
  pages = {20-25},
  numpages = {6},
  url = {https://ijcaonline.org/archives/volume111/number9/19565-1326/},
  doi = {10.5120/19565-1326},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T22:47:25.033351+05:30
%A Kanti Sahu
%A S K. Shrivastava
%T Kernel k-Means Clustering for Phishing Website and Malware Categorization
%J International Journal of Computer Applications
%@ 0975-8887
%V 111
%N 9
%P 20-25
%D 2015
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Malware and phishing are two of the most common Internet attacks today. Malware, short for malicious software, is designed to damage a computer system without the user's knowledge. Phishing is a newer form of Internet crime than malware. It is a form of online fraud based on social engineering: e-mails, instant messages, or online advertisements lure users to phishing websites that pretend to be trustworthy in order to trick individuals into revealing sensitive information, for example financial account details, passwords, and personal identification numbers, which is then used for profit. Malware and phishing websites share two properties. First, both are increasing at a rate of thousands per day. Second, phishing webpages represented by the term frequencies of their content share characteristics with malware samples represented by the instruction frequencies of their executable code. In the past few years, many techniques have been developed to detect malware and phishing websites. These techniques first extract features from the phishing website or malware sample and then categorize the samples into groups. In this paper, we propose kernel k-means clustering to categorize malware and phishing websites. Kernel k-means is an advanced version of the k-means algorithm in which vectors are mapped from the input vector space to a higher-dimensional feature space through a kernel function, and k-means is then applied in that feature space. Kernel k-means can therefore find clusters that are not linearly separable in the input vector space, which improves the accuracy of phishing website and malware categorization.
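The idea of applying k-means in a kernel-induced feature space can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes an RBF (Gaussian) kernel, a small hypothetical matrix of frequency vectors, and illustrative names (`rbf_kernel`, `kernel_kmeans`, `gamma`, `init_labels`) that do not come from the paper. The feature map is never computed explicitly; the squared distance from a sample to a cluster mean is expressed through kernel values only, as K_ii - (2/m) * sum_j K_ij + (1/m^2) * sum_{j,l} K_jl, where j and l range over the m members of the cluster.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Gaussian kernel matrix K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative round-off


def kernel_kmeans(K, k, init_labels=None, n_iter=100, seed=0):
    """Cluster n samples into k groups using only the n x n kernel matrix K.

    init_labels is an optional starting assignment (an assumption of this
    sketch); if omitted, labels are initialised at random from `seed`.
    """
    n = K.shape[0]
    if init_labels is None:
        labels = np.random.default_rng(seed).integers(0, k, size=n)
    else:
        labels = np.asarray(init_labels).copy()

    diag = np.diag(K)
    for _ in range(n_iter):
        # dist[i, c] = squared feature-space distance from sample i
        # to the mean of cluster c; empty clusters stay at infinity.
        dist = np.full((n, k), np.inf)
        for c in range(k):
            mask = labels == c
            m = mask.sum()
            if m == 0:
                continue
            cross = K[:, mask].sum(axis=1) / m            # (1/m) sum_j K_ij
            within = K[np.ix_(mask, mask)].sum() / m**2   # (1/m^2) sum_jl K_jl
            dist[:, c] = diag - 2.0 * cross + within
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
    return labels


# Hypothetical toy data: rows are samples, columns are term (or instruction)
# frequencies. The first four rows and the last four rows form two groups.
X = np.array([
    [5, 0, 1], [4, 1, 0], [5, 1, 1], [4, 0, 0],
    [0, 5, 4], [1, 4, 5], [0, 4, 4], [1, 5, 5],
], dtype=float)

K = rbf_kernel(X, gamma=0.1)
# A fixed, deliberately imperfect starting assignment keeps the run reproducible.
labels = kernel_kmeans(K, k=2, init_labels=[0, 0, 0, 1, 0, 1, 1, 1])
print(labels)  # [0 0 0 0 1 1 1 1]
```

Because the algorithm only reads `K`, a different kernel can be substituted without changing `kernel_kmeans`. The choice of `gamma` and of the starting assignment affects which local optimum is reached, so in practice several initialisations would typically be compared.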

Index Terms

Computer Science

Information Sciences

Keywords

Malware, Phishing website, Kernel k-means clustering algorithm