An Agglomerative Clustering Method for Large Data Sets

Omar Kettani; Faycal Ramdani; Benaissa Tadili

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 92 - Number 14

Year of Publication: 2014

Authors: Omar Kettani, Faycal Ramdani, Benaissa Tadili

10.5120/16074-4952

Omar Kettani, Faycal Ramdani, Benaissa Tadili. An Agglomerative Clustering Method for Large Data Sets. International Journal of Computer Applications. 92, 14 (April 2014), 1-7. DOI=10.5120/16074-4952

@article{ 10.5120/16074-4952,
author = { Omar Kettani, Faycal Ramdani, Benaissa Tadili },
title = { An Agglomerative Clustering Method for Large Data Sets },
journal = { International Journal of Computer Applications },
issue_date = { April 2014 },
volume = { 92 },
number = { 14 },
month = { April },
year = { 2014 },
issn = { 0975-8887 },
pages = { 1-7 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume92/number14/16074-4952/ },
doi = { 10.5120/16074-4952 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T22:14:16.897230+05:30
%A Omar Kettani
%A Faycal Ramdani
%A Benaissa Tadili
%T An Agglomerative Clustering Method for Large Data Sets
%J International Journal of Computer Applications
%@ 0975-8887
%V 92
%N 14
%P 1-7
%D 2014
%I Foundation of Computer Science (FCS), NY, USA

Abstract

In data mining, agglomerative clustering algorithms are widely used because of their flexibility and conceptual simplicity. However, their main drawback is their slowness. In this paper, a simple agglomerative clustering algorithm with low computational complexity is proposed. This method is especially convenient for clustering large data sets, and it could also be used as a linear-time initialization method for other clustering algorithms, such as the commonly used k-means algorithm. Experiments conducted on some standard data sets confirm that the proposed approach is effective.
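The abstract does not describe the proposed algorithm itself, so the following is only a minimal sketch of the initialization idea it mentions: any initial partition of the data can be turned into starting centers for k-means (Lloyd's iterations). The function names (`centers_from_partition`, `kmeans`) and the hand-written initial labels are assumptions made for illustration; they stand in for the output of an agglomerative step and are not the authors' method.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]


def centroid(points: List[Point]) -> List[float]:
    """Coordinate-wise mean of a non-empty list of points."""
    dim = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dim)]


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centers_from_partition(data: List[Point], labels: List[int], k: int) -> List[List[float]]:
    """Turn an initial partition (hypothetical output of some agglomerative
    step) into initial centers by taking the centroid of each non-empty group."""
    groups: List[List[Point]] = [[] for _ in range(k)]
    for p, lab in zip(data, labels):
        groups[lab].append(p)
    return [centroid(g) for g in groups if g]


def kmeans(data: List[Point], centers: List[Point], max_iter: int = 100) -> Tuple[List[List[float]], List[int]]:
    """Plain Lloyd iterations starting from the supplied centers."""
    centers = [list(c) for c in centers]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in data
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: recompute each center from its current members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:
                centers[j] = centroid(members)
    return centers, labels


# Usage: the initial labels below are made up for the example.
data = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
initial_labels = [0, 0, 1, 1]
init_centers = centers_from_partition(data, initial_labels, k=2)
final_centers, final_labels = kmeans(data, init_centers)
```

Starting k-means from centers that already sit near the true groups typically shortens the number of Lloyd iterations, which is why a cheap initial clustering can be useful as a seeding step.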

Index Terms

Computer Science

Information Sciences

Keywords

Agglomerative clustering, k-means initialization