Research Article

Application of Feature Selection Methods and Ensembles on Network Security Dataset

by Neeraj Bisht, Amir Ahmad, Shilpi Bisht
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 135 - Number 11
Year of Publication: 2016
10.5120/ijca2016908532

Neeraj Bisht, Amir Ahmad, Shilpi Bisht. Application of Feature Selection Methods and Ensembles on Network Security Dataset. International Journal of Computer Applications. 135, 11 (February 2016), 1-5. DOI=10.5120/ijca2016908532

@article{ 10.5120/ijca2016908532,
author = { Neeraj Bisht and Amir Ahmad and Shilpi Bisht },
title = { Application of Feature Selection Methods and Ensembles on Network Security Dataset },
journal = { International Journal of Computer Applications },
issue_date = { February 2016 },
volume = { 135 },
number = { 11 },
month = { February },
year = { 2016 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume135/number11/24090-2016908532/ },
doi = { 10.5120/ijca2016908532 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:35:29.218979+05:30
%A Neeraj Bisht
%A Amir Ahmad
%A Shilpi Bisht
%T Application of Feature Selection Methods and Ensembles on Network Security Dataset
%J International Journal of Computer Applications
%@ 0975-8887
%V 135
%N 11
%P 1-5
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Intrusion detection systems (IDS) generally use all available data features to classify packets as normal or anomalous. Studies have observed that some of these features may be redundant or contribute little to the classification. The authors study the NSL-KDD dataset using feature subsets selected by the Gain Ratio and Chi-Square feature selection methods. Experiments are first carried out with a single Decision Tree and then with two ensembles, Random Forests and Decision Trees with Bagging. The results show that selecting significant features is very important in the design of a lightweight and efficient intrusion detection system. Random Forests perform better than a single Decision Tree and Decision Trees with Bagging on this dataset, and Gain Ratio performs better than the Chi-Square feature selection method.
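To make the two feature selection criteria named in the abstract concrete, the following is a minimal sketch of how Gain Ratio and the Chi-Square statistic can be computed for a categorical feature against a class label. It is not the authors' implementation. The function names, the eight-row toy data, and the feature values are assumptions made purely for illustration and are not taken from the NSL-KDD dataset or from the paper's experiments.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature, labels):
    """Information gain of the feature divided by its split information."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    # Expected class entropy after splitting on the feature.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return (entropy(labels) - conditional) / split_info


def chi_square(feature, labels):
    """Pearson chi-square statistic of the feature/class contingency table."""
    total = len(labels)
    observed = Counter(zip(feature, labels))
    feature_counts = Counter(feature)
    label_counts = Counter(labels)
    stat = 0.0
    for value, value_count in feature_counts.items():
        for label, label_count in label_counts.items():
            expected = value_count * label_count / total
            stat += (observed[(value, label)] - expected) ** 2 / expected
    return stat


# Hypothetical toy data: one feature that separates the classes perfectly
# and one that is independent of the class.
labels = ["normal"] * 4 + ["anomaly"] * 4
features = {
    "protocol": ["tcp"] * 4 + ["udp"] * 4,
    "flag": ["SF", "S0"] * 4,
}

# Rank features by each criterion, highest score first.
by_gain_ratio = sorted(features, key=lambda f: gain_ratio(features[f], labels), reverse=True)
by_chi_square = sorted(features, key=lambda f: chi_square(features[f], labels), reverse=True)
print(by_gain_ratio, by_chi_square)
```

In a feature selection step, only the top-ranked features under a chosen criterion would be passed on to the classifier. How many features to keep is a design choice that this sketch does not address.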

Index Terms

Computer Science
Information Sciences

Keywords

Network security, NSL KDD, classifier ensembles, Decision trees, Random Forests, Chi Square, Gain Ratio