Research Article

A New Dynamic Distributed Algorithm for Frequent Itemsets Mining

by Azam Adelpoor, Mohammad Saniee Abadeh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 67 - Number 15
Year of Publication: 2013
Authors: Azam Adelpoor, Mohammad Saniee Abadeh
DOI: 10.5120/11472-7081

Azam Adelpoor, Mohammad Saniee Abadeh. A New Dynamic Distributed Algorithm for Frequent Itemsets Mining. International Journal of Computer Applications. 67, 15 (April 2013), 21-28. DOI=10.5120/11472-7081

@article{ 10.5120/11472-7081,
author = { Azam Adelpoor and Mohammad Saniee Abadeh },
title = { A New Dynamic Distributed Algorithm for Frequent Itemsets Mining },
journal = { International Journal of Computer Applications },
issue_date = { April 2013 },
volume = { 67 },
number = { 15 },
month = { April },
year = { 2013 },
issn = { 0975-8887 },
pages = { 21-28 },
numpages = {8},
url = { https://ijcaonline.org/archives/volume67/number15/11472-7081/ },
doi = { 10.5120/11472-7081 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:24:55.150115+05:30
%A Azam Adelpoor
%A Mohammad Saniee Abadeh
%T A New Dynamic Distributed Algorithm for Frequent Itemsets Mining
%J International Journal of Computer Applications
%@ 0975-8887
%V 67
%N 15
%P 21-28
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Mining association rules between items in large transactional databases is a central problem in knowledge discovery, with important applications in decision support and marketing strategy. Both centralized and distributed association rule mining (DARM) consist of two phases: frequent itemset extraction and strong rule generation. Frequent itemset mining (FIM) is the most important part of association rule mining (ARM), and many algorithms have been proposed for it in recent years. This paper focuses on distributed Apriori-like frequent itemset mining and proposes a distributed algorithm, New Dynamic Distributed Frequent Itemsets Mining (NDD-FIM), for geographically distributed data sets. NDD-FIM uses a merger site to reduce communication overhead and dynamically reduces the size of dataset partitions. The experimental results show that the algorithm generates the support counts of candidate itemsets more quickly than other DARM algorithms and reduces the average transaction size, the dataset size, and the number of message exchanges.
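
The general idea of Apriori-like mining over partitioned data with a merger site can be illustrated with a minimal Python sketch. This is not the authors' NDD-FIM implementation, whose details are not given in the abstract. It assumes a plain count-distribution scheme: each site counts candidates locally, a merger step sums the counts, and partitions are trimmed after each pass. The function names (`local_counts`, `merge_counts`, `next_candidates`, `mine`) and the trimming rule are hypothetical choices made for illustration, and the sites are simulated in a single process rather than over a network.

```python
from collections import Counter
from itertools import combinations


def local_counts(partition, candidates):
    """At one site: count local transactions that contain each candidate."""
    counts = Counter()
    for transaction in partition:
        for candidate in candidates:
            if candidate <= transaction:
                counts[candidate] += 1
    return counts


def merge_counts(per_site_counts):
    """At the merger site: sum local counts into global support counts."""
    total = Counter()
    for counts in per_site_counts:
        total.update(counts)
    return total


def next_candidates(frequent, k):
    """Apriori generation: k-itemsets whose (k-1)-subsets are all frequent."""
    items = sorted({item for itemset in frequent for item in itemset})
    return [
        frozenset(combo)
        for combo in combinations(items, k)
        if all(frozenset(sub) in frequent for sub in combinations(combo, k - 1))
    ]


def mine(partitions, min_support):
    """Return {itemset: global support count} for every frequent itemset."""
    partitions = [[frozenset(t) for t in p] for p in partitions]
    candidates = list({frozenset([i]) for p in partitions for t in p for i in t})
    result, k = {}, 1
    while candidates:
        merged = merge_counts(local_counts(p, candidates) for p in partitions)
        frequent = {c: n for c, n in merged.items() if n >= min_support}
        result.update(frequent)
        # Assumed trimming rule (illustrative only): drop items that occur in
        # no frequent k-itemset, then drop transactions that are too short to
        # contain any (k+1)-itemset.
        keep = {item for itemset in frequent for item in itemset}
        partitions = [[t & keep for t in p if len(t & keep) > k] for p in partitions]
        k += 1
        candidates = next_candidates(set(frequent), k)
    return result


# Two simulated sites holding five transactions in total.
sites = [
    [{"a", "b", "c"}, {"a", "b"}],
    [{"a", "c"}, {"b", "c"}, {"a", "b", "c"}],
]
frequent_itemsets = mine(sites, min_support=3)
```

In this sketch only support counts travel to the merger step, never raw transactions, and the trimming step shrinks each partition between passes. Both choices mirror the goals stated in the abstract, but the concrete rules here are assumptions.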

Index Terms

Computer Science
Information Sciences

Keywords

Distributed Data Mining, Frequent Itemsets, Association Rule, Apriori Algorithm