Research Article

Feature Selection by Mining Optimized Association Rules based on Apriori Algorithm

by K. Rajeswari
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 119 - Number 20
Year of Publication: 2015
Authors: K. Rajeswari
DOI: 10.5120/21186-3531

K. Rajeswari. Feature Selection by Mining Optimized Association Rules based on Apriori Algorithm. International Journal of Computer Applications 119, 20 (June 2015), 30-34. DOI=10.5120/21186-3531

@article{10.5120/21186-3531,
  author     = {K. Rajeswari},
  title      = {Feature Selection by Mining Optimized Association Rules based on Apriori Algorithm},
  journal    = {International Journal of Computer Applications},
  issue_date = {June 2015},
  volume     = {119},
  number     = {20},
  month      = {June},
  year       = {2015},
  issn       = {0975-8887},
  pages      = {30-34},
  numpages   = {5},
  url        = {https://ijcaonline.org/archives/volume119/number20/21186-3531/},
  doi        = {10.5120/21186-3531},
  publisher  = {Foundation of Computer Science (FCS), NY, USA},
  address    = {New York, USA}
}
%0 Journal Article
%A K. Rajeswari
%T Feature Selection by Mining Optimized Association Rules based on Apriori Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 119
%N 20
%P 30-34
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper presents a novel feature selection method based on association rule mining over a reduced dataset. The key idea of the proposed work is to find closely related features using association rule mining. The Apriori algorithm is used to identify closely related attributes via support and confidence measures, and a number of association rules are mined from these attributes. Among these rules, only the few related to the desired class label are needed for classification. We have implemented a technique that reduces the number of rules generated by working on a reduced dataset, thereby improving the performance of the Association Rule Mining (ARM) algorithm. Experimental results of the proposed algorithm on standard University of California, Irvine (UCI) datasets demonstrate that it classifies accurately with a minimal attribute set when compared with other feature selection algorithms.
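To make the mining step concrete, the sketch below implements a toy Apriori pass that keeps only rules whose consequent is the class label and takes the antecedent attributes as the selected feature subset. It is a minimal illustration of the idea described in the abstract, not the paper's implementation: the sample records, the min_support and min_confidence thresholds, and helper names such as class_rules are assumptions made for the example.

```python
# Minimal sketch of Apriori-based feature selection, not the paper's code.
# Records are sets of attribute=value items plus a class label; thresholds
# (min_support, min_confidence) are illustrative assumptions.
records = [
    {"outlook=sunny", "humidity=high", "class=no"},
    {"outlook=sunny", "humidity=normal", "class=yes"},
    {"outlook=rain", "humidity=high", "class=yes"},
    {"outlook=rain", "humidity=normal", "class=yes"},
    {"outlook=sunny", "humidity=high", "class=no"},
]

def support(itemset, records):
    """Fraction of records that contain every item in itemset."""
    return sum(itemset <= r for r in records) / len(records)

def apriori(records, min_support=0.3, max_len=3):
    """Return all frequent itemsets up to max_len via the Apriori property:
    every subset of a frequent itemset must itself be frequent."""
    items = {i for r in records for i in r}
    frequent = [frozenset([i]) for i in items
                if support(frozenset([i]), records) >= min_support]
    all_frequent = list(frequent)
    k = 2
    while frequent and k <= max_len:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        frequent = [c for c in candidates if support(c, records) >= min_support]
        all_frequent.extend(frequent)
        k += 1
    return all_frequent

def class_rules(records, min_support=0.3, min_confidence=0.7):
    """Keep only rules whose consequent is the class label; the antecedents
    of the surviving rules form the candidate feature subset."""
    rules = []
    for itemset in apriori(records, min_support):
        labels = {i for i in itemset if i.startswith("class=")}
        if len(labels) == 1 and len(itemset) > 1:
            antecedent = itemset - labels
            confidence = support(itemset, records) / support(antecedent, records)
            if confidence >= min_confidence:
                rules.append((antecedent, labels.pop(), confidence))
    return rules

selected = set()
for antecedent, label, confidence in class_rules(records):
    print(sorted(antecedent), "=>", label, f"(conf={confidence:.2f})")
    selected |= {item.split("=")[0] for item in antecedent}
print("Selected features:", sorted(selected))
```

On this toy data, rules such as outlook=rain => class=yes and {outlook=sunny, humidity=high} => class=no survive both thresholds, so the selected attribute subset is {humidity, outlook}; on a real UCI dataset the same loop would run over discretized attributes.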

Index Terms

Computer Science
Information Sciences

Keywords

Feature selection, Association Rule Mining (ARM), Apriori, Classification