Incremental Cluster Detection using a Soft Computing Approach

International Journal of Computer Applications
© 2010 by IJCA Journal
Number 8 - Article 3
Year of Publication: 2010
Authors:
Alpa Reshamwala
Vijay Katkar
Mamta Ubnare
DOI: 10.5120/1604-2155

Alpa Reshamwala, Vijay Katkar and Mamta Ubnare. Incremental Cluster Detection using a Soft Computing Approach. International Journal of Computer Applications 11(8):13–17, December 2010. Published by Foundation of Computer Science.

BibTeX

@article{key:article,
	author = {Alpa Reshamwala and Vijay Katkar and Mamta Ubnare},
	title = {Incremental Cluster Detection using a Soft Computing Approach},
	journal = {International Journal of Computer Applications},
	year = {2010},
	volume = {11},
	number = {8},
	pages = {13--17},
	month = {December},
	note = {Published By Foundation of Computer Science}
}

Abstract

Clustering is the process of locating patterns in large data sets. As databases continue to grow in size, efficient and effective clustering algorithms play a paramount role in data mining applications. Traditional clustering approaches usually analyze static datasets in which objects remain unchanged after being processed, but many practical datasets are modified dynamically, which means that some previously learned patterns have to be updated accordingly. Re-clustering the whole dataset from scratch is not a good choice because data modifications are frequent and out-of-service time is limited, so the development of incremental clustering approaches is highly desirable. In this paper, we propose an incremental algorithm, IPYRAMID (Incremental Parallel hYbrid clusteRing using genetic progrAmming and Multiobjective fItness with Density), which employs a combination of data parallelism, genetic programming (GP), special operators, and a multi-objective, density-based incremental fitness function. Many incremental clustering algorithms have been proposed that handle the insertion of new records properly but cannot handle the deletion of records. The proposed algorithm resolves this issue, and its density-based incremental fitness function helps to handle outliers. The use of parallelism increases the speed of execution and helps identify clusters of arbitrary shapes. The incremental merge engine can dynamically determine the number of clusters. Preliminary experimental results show that the algorithm can increase the efficiency of the clustering process.
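To make the idea of incremental maintenance concrete, the following is a minimal Python sketch of a clustering structure that supports both insertion and deletion without re-clustering the whole dataset. It is not IPYRAMID: it uses no genetic programming, parallelism, or density-based fitness, and substitutes a simple nearest-centroid rule. The class name `IncrementalClusters`, the `radius` threshold, and the assumption that all points are distinct are hypothetical choices made only for illustration.

```python
import math


class IncrementalClusters:
    """Toy incremental clustering (illustrative only, not IPYRAMID).

    Each cluster is summarized by a running coordinate sum and a count,
    so inserting or deleting one point costs one pass over the clusters
    instead of a full re-clustering of the dataset.
    """

    def __init__(self, radius):
        self.radius = radius   # hypothetical distance threshold for joining a cluster
        self.clusters = {}     # cluster id -> [coordinate sums, point count]
        self.assignment = {}   # point -> cluster id (assumes distinct points)
        self._next_id = 0

    def centroid(self, cid):
        total, count = self.clusters[cid]
        return tuple(s / count for s in total)

    def insert(self, point):
        point = tuple(point)
        # Find the nearest existing centroid, if any.
        best, best_d = None, None
        for cid in self.clusters:
            d = math.dist(point, self.centroid(cid))
            if best_d is None or d < best_d:
                best, best_d = cid, d
        # Open a new cluster when nothing is close enough; the number of
        # clusters is therefore not fixed in advance.
        if best is None or best_d > self.radius:
            best = self._next_id
            self._next_id += 1
            self.clusters[best] = [[0.0] * len(point), 0]
        total = self.clusters[best][0]
        for i, x in enumerate(point):
            total[i] += x
        self.clusters[best][1] += 1
        self.assignment[point] = best
        return best

    def delete(self, point):
        point = tuple(point)
        cid = self.assignment.pop(point)
        total = self.clusters[cid][0]
        # Undo the point's contribution to its cluster summary.
        for i, x in enumerate(point):
            total[i] -= x
        self.clusters[cid][1] -= 1
        # Drop clusters that have become empty.
        if self.clusters[cid][1] == 0:
            del self.clusters[cid]
        return cid
```

Under these assumptions, calling `insert` on a point far from every centroid opens a new cluster, and calling `delete` on the last point of a cluster removes that cluster, so the cluster count adapts as records come and go. A density-based criterion such as the one described in the abstract would replace the fixed `radius` test.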
