An Improved Progressive Sampling based Approach for Association Rule Mining

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2017
Authors:
S. S. Thakur, Shalini Zanzote Ninoria
DOI: 10.5120/ijca2017913928

S. S. Thakur and Shalini Zanzote Ninoria. An Improved Progressive Sampling based Approach for Association Rule Mining. International Journal of Computer Applications 165(7):27-35, May 2017.

BibTeX

@article{10.5120/ijca2017913928,
	author = {S. S. Thakur and Shalini Zanzote Ninoria},
	title = {An Improved Progressive Sampling based Approach for Association Rule Mining},
	journal = {International Journal of Computer Applications},
	issue_date = {May 2017},
	volume = {165},
	number = {7},
	month = {May},
	year = {2017},
	issn = {0975-8887},
	pages = {27-35},
	numpages = {9},
	url = {http://www.ijcaonline.org/archives/volume165/number7/27586-2017913928},
	doi = {10.5120/ijca2017913928},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

Data mining is the multistage process of extracting useful information from large databases. Association rule mining is one of its important techniques: it discovers relationships among the items present in transactions. Many association rule mining algorithms are available, but most are time consuming, so the run time and memory overheads they incur are extremely high, especially for very large databases. Sampling is a notable way to speed up association rule mining and reduces its complexity to some extent, yet conventional sampling still consumes considerable time and memory. Progressive sampling is a novel approach in association rule mining that reduces the overheads of the usual sampling-based approaches, and it is very effective for large databases. In this paper, we extend the progressive sampling based approach presented by Umarani & Punithavalli, 2009 [22] and perform an extensive experimental analysis of it for the partitioned itemsets 1/3, 1/4, 2/3 and 3/4 on a sample dataset. In addition, we evaluate the performance of this Improved Progressive Sampling Based Approach against the progressive sampling based approach of Umarani & Punithavalli, 2009 [22]. The experimental results illustrate the complexity of the algorithm in terms of run time as well as memory utilization. The complete implementation was done in Java JDK 6.1 and MySQL 5.0 on the sample dataset CompPeriPurchase.
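
To make the general idea of progressive sampling concrete, the following is a minimal Python sketch, not the authors' Java/MySQL implementation and not the algorithm of [22]. It assumes a simple stopping rule (stop when two successive samples yield the same set of frequent itemsets) and a brute-force itemset counter limited to pairs. The function names `frequent_itemsets` and `progressive_sample_mining`, the `max_size` limit, and the sample schedule are all hypothetical; the schedule merely borrows the values 1/4, 1/3, 2/3 and 3/4 for illustration and does not reproduce the paper's partitioning.

```python
import random
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Return {itemset: support} for itemsets of up to max_size items
    whose support (fraction of transactions containing them) is at
    least min_support. Brute-force counting, for illustration only."""
    n = len(transactions)
    counts = {}
    for transaction in transactions:
        items = sorted(set(transaction))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[combo] = counts.get(combo, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def progressive_sample_mining(transactions, min_support,
                              fractions=(1 / 4, 1 / 3, 2 / 3, 3 / 4), seed=0):
    """Mine progressively larger random samples and stop as soon as two
    successive samples give the same set of frequent itemsets.
    Returns (fraction_used, {itemset: support}).
    The stopping rule and the schedule are assumptions of this sketch."""
    rng = random.Random(seed)
    shuffled = list(transactions)
    rng.shuffle(shuffled)  # a prefix of a shuffled list is a random sample
    previous = None
    for fraction in sorted(fractions):
        size = max(1, round(fraction * len(shuffled)))
        current = frequent_itemsets(shuffled[:size], min_support)
        if previous is not None and set(current) == set(previous):
            return fraction, current
        previous = current
    # No stable sample was found: fall back to the full database.
    return 1.0, frequent_itemsets(shuffled, min_support)
```

For example, `progressive_sample_mining(db, 0.5)` on a list of transactions `db` returns the sample fraction at which the result stabilised together with the frequent itemsets found in that sample. Because each sample is a prefix of one shuffled list, every larger sample contains the smaller ones, which is one common way to grow a sample without redrawing it.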

References

  1. Agrawal, R. and Srikant, R. “Fast algorithms for mining association rules”, In the Proceedings of 20th Int’l Conf. Very Large Data Bases, pp. 487-499, 1994.
  2. Agrawal, R., Imieliński, T. and Swami, A. “Mining association rules between sets of items in large databases”, In the Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data (SIGMOD '93), p. 207, 1993.
  3. Atul Palandurkar, “NetBeans IDE How-to”, Packt Publishing, UK (2013).
  4. B. Mobasher, N. Jain, E.H. Han, and J. Srivastava, “Web Mining: Pattern Discovery from World Wide Web Transactions” Department of Computer Science, University of Minnesota, Technical Report TR96-050, (March, 1996).
  5. Basel et al., “A new sampling technique for association rule mining”, Journal of Information Science, Vol. 35, pp. 358-376, June 2009.
  6. Bodon, F. “A Fast Apriori Implementation”, In the Proceedings of the IEEE ICDM Workshop on Frequent Itemset Mining Implementations, Vol. 90, Melbourne, 2003.
  7. Dunham, M. H. and Sridhar, S. “Data Mining: Introductory and Advanced Topics”, Pearson Education, New Delhi, ISBN: 81-7758-785-4, 1st Edition, 2006.
  8. Farah Hanna AL-Zawaidah, Marwan AL-Abed Abu-Zanona and Yosef Hasan Jbara, “An Improved Algorithm for Mining Association Rules in Large Databases”, (WCSIT) ISSN: 2221-0741, Vol. 1, No. 7, 311-316, 2011.
  9. G. K. Gupta, “Introduction to Data Mining with Case Studies”, Prentice-Hall of India Pvt. Ltd., New Delhi, India (2006).
  10. Herbert Schildt, “Java: The Complete Reference, Seventh Edition”, Tata McGraw-Hill Publishing Company Ltd., New Delhi.
  11. “Introduction to Data Mining and Knowledge Discovery”, Third Edition, ISBN: 1-892095-02-5, Two Crows Corporation, 10500 Falls Road, Potomac, MD 20854 (U.S.A.), 1999.
  12. Jiawei Han and Micheline Kamber, “Data Mining: Concepts and Techniques”, Morgan Kaufmann, Elsevier Science India (2002).
  13. Jiming, L., Yiuiming, C. and Hujun, Y. “Intelligent data engineering and automated learning”, In the Proceedings of the 4th International Conference IDEAL, Springer, 2003.
  14. Maojo, V. and Sanandres, J. “A Survey of Data Mining Techniques”, Lecture Notes in Computer Science, Vol. 1933, pp. 77-92, 2000.
  15. Mohammed Javeed Zaki, Srinivasan Parthasarathy, Wei Li and Mitsunori Ogihara, “Evaluation of Sampling for Data Mining of Association Rules”, Proc. Int’l Workshop on Research Issues in Data Engineering, 1997.
  16. Sotiris, K. and Dimitris, K. “Association Rules Mining: A Recent Overview”, GESTS International Transactions on Computer Science and Engineering, Vol. 32, No.1, pp.71-82, 2006.
  17. Srikant, R., Vu, Q. and Agrawal, R. “Mining association rules with item constraints”, In the Proceedings of 3rd Int’l Conf. on Knowledge Discovery and Data Mining, 1997.
  18. Srinivasan Parthasarathy, “Efficient Progressive Sampling for Association Rules”, In the Proceedings of the 2002 IEEE International Conference on Data Mining, pp. 354, 2002.
  19. Toivonen, H. “Sampling large databases for association rules”, The VLDB Journal, pp. 134-135, 1996.
  20. Tsau, Y.L. “Sampling in association rule mining”, International Conference on Data mining and knowledge discovery: Theory, Tools and Technology VI, Vol.5433, pp. 161-167, 2004.
  21. Umarani, V. and Punithavalli, M. “A Novel Progressive Sampling based Approach for Effective Mining of Association Rules”, International Journal of Computer Science and Research, Vol. 10, Issue 11 (2010).
  22. Umarani, V. and Punithavalli, M. “Developing Novel and Effective Approach for Association Rule Mining Using Progressive Sampling”, International Conference on Computer and Electrical Engineering, IEEE (2009).
  23. Umarani, V. and Punithavalli, M. “Sampling Based Association Rule Mining: A Recent Overview”, International Journal of Computer Science and Engineering, Vol. 2, Issue 2 (2010).
  24. Umarani, V. and Punithavalli, M. “On developing an effectual progressive sampling based approach for Association Rule Discovery”, In the Proceedings of the 2nd IEEE ICIME Int’l Conference on Information and Data Management, Chengdu, China (2010).
  25. Venkatesan, T. C, Vinayaka, P. and Yogish, S. “Analysis of sampling techniques for Association Rule Mining.”, In the Proceedings of the 12th International Conference on Database Theory, Vol.361, pp. 276-283, 2009.
  26. Vikram Vaswani, “MySQL: The Complete Reference”, Tata McGraw-Hill Publishing Company Ltd., New Delhi.

Keywords

Association Rule Mining, Frequent Itemsets, Negative Border, Partitioned Itemsets.