Research Article

Article:Incremental Cluster Detection using a Soft Computing Approach

by Alpa Reshamwala, Vijay Katkar, Mamta Ubnare
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 11 - Number 8
Year of Publication: 2010
Authors: Alpa Reshamwala, Vijay Katkar, Mamta Ubnare
10.5120/1604-2155

Alpa Reshamwala, Vijay Katkar, Mamta Ubnare. Article:Incremental Cluster Detection using a Soft Computing Approach. International Journal of Computer Applications. 11, 8 (December 2010), 13-17. DOI=10.5120/1604-2155

@article{ 10.5120/1604-2155,
author = { Alpa Reshamwala and Vijay Katkar and Mamta Ubnare },
title = { Article:Incremental Cluster Detection using a Soft Computing Approach },
journal = { International Journal of Computer Applications },
issue_date = { December 2010 },
volume = { 11 },
number = { 8 },
month = { December },
year = { 2010 },
issn = { 0975-8887 },
pages = { 13-17 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume11/number8/1604-2155/ },
doi = { 10.5120/1604-2155 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:00:01.250633+05:30
%A Alpa Reshamwala
%A Vijay Katkar
%A Mamta Ubnare
%T Article:Incremental Cluster Detection using a Soft Computing Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 11
%N 8
%P 13-17
%D 2010
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Clustering is the process of locating patterns in large data sets. As databases continue to grow in size, efficient and effective clustering algorithms play a paramount role in data mining applications. Traditional clustering approaches usually analyze static datasets in which objects remain unchanged after being processed, but many practical datasets are modified dynamically, which means that some previously learned patterns have to be updated accordingly. Re-clustering the whole dataset from scratch is not a good choice because of the frequent data modifications and the limited out-of-service time, so the development of incremental clustering approaches is highly desirable. In this paper, we propose an incremental algorithm, IPYRAMID (Incremental Parallel hYbrid clusteRing using genetic progrAmming and Multiobjective fItness with Density), which employs a combination of data parallelism, genetic programming (GP), special operators, and a multi-objective density-based incremental fitness function. Many incremental clustering algorithms have been proposed that handle the insertion of new records properly but cannot handle the deletion of records properly. The proposed algorithm resolves this issue, and its density-based incremental fitness function helps to handle outliers. The use of parallelism increases the speed of execution and supports the identification of clusters of arbitrary shapes. The incremental merge engine can dynamically determine the number of clusters. Preliminary experimental results show that the algorithm can increase the efficiency of the clustering process.
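
To make the idea of incremental maintenance concrete, the sketch below shows a deliberately simple grid-density scheme in Python. It is not IPYRAMID and uses no genetic programming or parallelism; the class name `IncrementalGridClusters` and the parameters `cell_size` and `min_points` are assumptions introduced only for illustration. It shows how insertions and deletions can each update a small piece of state instead of triggering a full re-clustering, how sparse cells can be treated as outliers, and how the number of clusters can fall out of the data rather than being fixed in advance.

```python
from collections import defaultdict


class IncrementalGridClusters:
    """Toy incremental density clustering on a 2-D grid (illustrative only)."""

    def __init__(self, cell_size=1.0, min_points=2):
        self.cell_size = cell_size    # side length of a grid cell (assumed parameter)
        self.min_points = min_points  # a cell is "dense" at or above this count
        self.counts = defaultdict(int)  # grid cell -> number of points inside it

    def _cell(self, point):
        x, y = point
        return (int(x // self.cell_size), int(y // self.cell_size))

    def insert(self, point):
        # Insertion touches exactly one cell counter.
        self.counts[self._cell(point)] += 1

    def delete(self, point):
        # Deletion is the mirror image of insertion.
        cell = self._cell(point)
        if self.counts.get(cell, 0) == 0:
            raise KeyError(f"no point recorded in cell {cell}")
        self.counts[cell] -= 1
        if self.counts[cell] == 0:
            del self.counts[cell]

    def clusters(self):
        # Clusters are connected groups of dense cells (8-neighbourhood).
        # Points in sparse cells are treated as outliers and ignored.
        dense = {c for c, n in self.counts.items() if n >= self.min_points}
        seen, result = set(), []
        for start in sorted(dense):
            if start in seen:
                continue
            seen.add(start)
            stack, component = [start], set()
            while stack:
                cx, cy = stack.pop()
                component.add((cx, cy))
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        neighbour = (cx + dx, cy + dy)
                        if neighbour in dense and neighbour not in seen:
                            seen.add(neighbour)
                            stack.append(neighbour)
            result.append(component)
        return result


grid = IncrementalGridClusters(cell_size=1.0, min_points=2)
for p in [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.5, 0.5), (5.0, 5.0)]:
    grid.insert(p)
print(len(grid.clusters()))  # the lone point at (5.0, 5.0) is an outlier
grid.insert((5.2, 5.1))      # a second nearby point makes that cell dense
print(len(grid.clusters()))
grid.delete((5.0, 5.0))      # deleting it makes the cell sparse again
print(len(grid.clusters()))
```

In this toy version each update costs constant time, and the connected-component pass only runs when clusters are requested. A real incremental algorithm would also need to maintain cluster membership locally after each update, which is the harder problem the abstract refers to.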

Index Terms

Computer Science
Information Sciences

Keywords

Data Mining, Clustering, Genetic Programming, Parallelism, Density, Incremental mining