Research Article

An Efficient Decision Tree Model for Classification of Attacks with Feature Selection

by Akhilesh Kumar Shrivas, S. K. Singhai, H. S. Hota
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 84 - Number 14
Year of Publication: 2013
Authors: Akhilesh Kumar Shrivas, S. K. Singhai, H. S. Hota
10.5120/14647-2967

Akhilesh Kumar Shrivas, S. K. Singhai, H. S. Hota. An Efficient Decision Tree Model for Classification of Attacks with Feature Selection. International Journal of Computer Applications. 84, 14 (December 2013), 42-48. DOI=10.5120/14647-2967

@article{ 10.5120/14647-2967,
author = { Akhilesh Kumar Shrivas, S. K. Singhai, H. S. Hota },
title = { An Efficient Decision Tree Model for Classification of Attacks with Feature Selection },
journal = { International Journal of Computer Applications },
issue_date = { December 2013 },
volume = { 84 },
number = { 14 },
month = { December },
year = { 2013 },
issn = { 0975-8887 },
pages = { 42-48 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume84/number14/14647-2967/ },
doi = { 10.5120/14647-2967 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:00:56.756582+05:30
%A Akhilesh Kumar Shrivas
%A S. K. Singhai
%A H. S. Hota
%T An Efficient Decision Tree Model for Classification of Attacks with Feature Selection
%J International Journal of Computer Applications
%@ 0975-8887
%V 84
%N 14
%P 42-48
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Internet use is growing rapidly in almost every domain, including online transactions and data communication, and the number of attacks is growing with it. Protecting information on a victim computer therefore requires a security layer that identifies and prevents attacks, in the form of an intrusion detection system (IDS). An IDS is essentially a classifier that labels network data as normal or attack. The main aim of this research is to develop a robust binary classifier to serve as an IDS, using various decision-tree-based techniques applied to the NSL-KDD data set. Because the data set is high-dimensional, a ranking-based feature selection technique is used to select critical features and discard unimportant ones; the reduced feature set is then applied to the random forest model, which was found to be one of the best models. Empirical results show that the random forest model achieves the highest accuracy, 99.84% (almost 100%), with only 19 features. The performance of the model with reduced feature subsets is also evaluated using other measures, namely true positive rate (TPR), false positive rate (FPR), precision, F-measure, and receiver operating characteristic (ROC) area, and the results are found to be satisfactory.
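
The ranking-based feature selection mentioned in the abstract can be illustrated with a minimal sketch. The keywords name gain ratio as the ranking criterion; the code below computes it for discrete features only, and the helper names (`entropy`, `gain_ratio`, `rank_features`) and the toy records are hypothetical. They are not taken from the paper or from the NSL-KDD data set, whose continuous features would first need to be discretized. The cut-off of two features is likewise arbitrary and only stands in for the paper's 19.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of one discrete feature: information gain / split information."""
    total = len(labels)
    # Group the class labels by the feature value they occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected class entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # A feature with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


def rank_features(rows, labels, names):
    """Return (feature name, gain ratio) pairs, highest score first."""
    scores = {
        name: gain_ratio([row[i] for row in rows], labels)
        for i, name in enumerate(names)
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy records with made-up values (assumption: NOT real NSL-KDD data).
names = ["protocol", "flag", "service"]
rows = [
    ("tcp", "SF", "http"),
    ("tcp", "S0", "http"),
    ("udp", "SF", "dns"),
    ("tcp", "S0", "ftp"),
    ("udp", "SF", "dns"),
    ("tcp", "SF", "ftp"),
]
labels = ["normal", "attack", "normal", "attack", "normal", "normal"]

ranking = rank_features(rows, labels, names)
top_k = [name for name, _ in ranking[:2]]  # keep the two best-ranked features
print(ranking)
print(top_k)
```

In this toy data the `flag` feature separates the two classes perfectly, so it ranks first; a classifier such as a random forest would then be trained on the retained columns only.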

Index Terms

Computer Science
Information Sciences

Keywords

Gain ratio feature selection, Binary class, Multi class, Intrusion detection system (IDS), NSL-KDD, Random forest