Research Article

Malware Classification through HEX Conversion and Mining

Published in December 2012 by A. Pratheema Manju Prabha, P. Kavitha
EGovernance and Cloud Computing Services - 2012
Foundation of Computer Science USA
EGOV - Number 4
December 2012

A. Pratheema Manju Prabha, P. Kavitha. Malware Classification through HEX Conversion and Mining. EGovernance and Cloud Computing Services - 2012. EGOV, 4 (December 2012), 6-12.

@article{
author = { A. Pratheema Manju Prabha and P. Kavitha },
title = { Malware Classification through HEX Conversion and Mining },
journal = { EGovernance and Cloud Computing Services - 2012 },
issue_date = { December 2012 },
volume = { EGOV },
number = { 4 },
month = { December },
year = { 2012 },
issn = { 0975-8887 },
pages = { 6-12 },
numpages = 7,
url = { /proceedings/egov/number4/9503-1031/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 EGovernance and Cloud Computing Services - 2012
%A A. Pratheema Manju Prabha
%A P. Kavitha
%T Malware Classification through HEX Conversion and Mining
%J EGovernance and Cloud Computing Services - 2012
%@ 0975-8887
%V EGOV
%N 4
%P 6-12
%D 2012
%I International Journal of Computer Applications
Abstract

Malicious code is commonly referred to as malware. Systems remain vulnerable to traditional attacks, and attackers continue to find new ways around existing protection mechanisms in order to execute their injected code. Malware is a pervasive problem in distributed computer and network systems, and new malicious executables are created at a rate of thousands every year. Threats to these systems take several forms, for example viruses, worms, Trojan horses, and other malware. Malware represents a serious threat to confidentiality, since it may result in loss of control over private data for computer users. It is typically hidden from the user and difficult to detect, yet it can create significant unwanted CPU activity, disk usage, and network traffic. In existing systems, new malicious programs can be detected through F-Sign, an automatic signature generation technique that extracts unique signatures from malware files. F-Sign is primarily intended for high-speed network traffic, and its signature extraction process is based on a comparison with a common function repository. The data mining framework employed in this research learns by analyzing the behavior of existing malicious and benign code in large datasets. We employed three classifiers, namely the Naïve Bayes (NB) algorithm, the k-Nearest Neighbor (kNN) algorithm, and the J48 decision tree, and evaluated their performance. The approach extracts opcode sequences from the dataset, constructs a classification model, and uses that model to identify a program as malicious or benign. Our approach showed a 98.4% detection rate on new programs whose data was not used in the model building process.
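
The pipeline outlined in the abstract (convert a program to hex, derive sequence features, classify as malicious or benign) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes overlapping hex byte n-grams as a stand-in for opcode sequences, cosine similarity as the distance measure, and a plain k-nearest-neighbor vote. The function names and the toy byte strings are hypothetical and serve only as illustration.

```python
from collections import Counter
import math


def to_hex(data: bytes) -> str:
    """Render raw bytes as a lowercase hex string, two characters per byte."""
    return data.hex()


def hex_ngrams(data: bytes, n: int = 2) -> Counter:
    """Count overlapping n-grams of hex byte tokens (e.g. '558b' for n=2)."""
    tokens = [f"{b:02x}" for b in data]
    return Counter("".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse n-gram count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(train, sample: bytes, k: int = 3, n: int = 2) -> str:
    """Label `sample` by majority vote among its k most similar training items."""
    query = hex_ngrams(sample, n)
    ranked = sorted(
        train,
        key=lambda item: cosine(hex_ngrams(item[0], n), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Assumption: toy, hand-made byte strings. These are NOT real malware
# samples or real benign programs; the labels are purely illustrative.
TRAIN = [
    (bytes.fromhex("558bec83ec10e8000000005b"), "malicious"),
    (bytes.fromhex("558bec83ec20e8000000005d"), "malicious"),
    (bytes.fromhex("e8000000005b81eb05104000"), "malicious"),
    (bytes.fromhex("4883ec28488d0d00000000ff15"), "benign"),
    (bytes.fromhex("4883ec38488d1500000000ff15"), "benign"),
    (bytes.fromhex("488d0d00000000ff1500000000"), "benign"),
]

if __name__ == "__main__":
    unseen = bytes.fromhex("558bec83ec18e8000000005b")
    print(to_hex(unseen), "->", knn_predict(TRAIN, unseen))
```

A real system would disassemble the executable to obtain true opcode sequences and would train on a large labeled corpus. The n-gram length `n` and the neighbor count `k` are tunable assumptions here, not values reported in the paper.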

Index Terms

Computer Science
Information Sciences

Keywords

Malware, F-Sign, Naïve Bayes Algorithm, J48 Algorithm, k-Nearest Neighbor Algorithm