Research Article

Effect of Distance Functions on Simple K-means Clustering Algorithm

by Richa Loohach, Kanwal Garg
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 49 - Number 6
Year of Publication: 2012
Authors: Richa Loohach, Kanwal Garg
10.5120/7629-0698

Richa Loohach, Kanwal Garg. Effect of Distance Functions on Simple K-means Clustering Algorithm. International Journal of Computer Applications. 49, 6 (July 2012), 7-9. DOI=10.5120/7629-0698

@article{ 10.5120/7629-0698,
author = { Richa Loohach and Kanwal Garg },
title = { Effect of Distance Functions on Simple K-means Clustering Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { July 2012 },
volume = { 49 },
number = { 6 },
month = { July },
year = { 2012 },
issn = { 0975-8887 },
pages = { 7-9 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume49/number6/7629-0698/ },
doi = { 10.5120/7629-0698 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:45:33.249411+05:30
%A Richa Loohach
%A Kanwal Garg
%T Effect of Distance Functions on Simple K-means Clustering Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 49
%N 6
%P 7-9
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Clustering analysis is the most significant step in data mining. This paper discusses the k-means clustering algorithm and the distance functions used with it, such as the Euclidean distance function and the Manhattan distance function. Experimental results show the effect of the Manhattan and Euclidean distance functions on the k-means clustering algorithm. These results also show that the choice of distance function affects the size of the clusters formed by the k-means clustering algorithm.
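
As an illustration of the two distance functions named in the abstract, the following minimal Python sketch shows how each one plugs into the assignment step of k-means. It is not the authors' implementation: the function names (`euclidean`, `manhattan`, `assign`, `kmeans`), the fixed initial centroids, and the sample points are all assumptions made for this example. The sketch also keeps the usual mean-based centroid update for both distances, which is a simplification.

```python
import math


def euclidean(p, q):
    """Straight-line (L2) distance between two equal-length points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def manhattan(p, q):
    """City-block (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def assign(points, centroids, dist):
    """Assignment step: index of the nearest centroid for each point."""
    return [
        min(range(len(centroids)), key=lambda j: dist(p, centroids[j]))
        for p in points
    ]


def kmeans(points, centroids, dist, iterations=10):
    """Plain k-means from caller-supplied initial centroids (hypothetical helper).

    Assumption: centroids are updated to the coordinate-wise mean of their
    members regardless of which distance function is used.
    """
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        labels = assign(points, centroids, dist)
        new_centroids = []
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(xs) / len(members) for xs in zip(*members))
                new_centroids.append(mean)
            else:
                # An empty cluster keeps its previous centroid.
                new_centroids.append(centroids[j])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# The same point can be assigned to different centroids under each distance.
point = (0, 0)
candidates = [(3, 3), (5, 0)]
print(assign([point], candidates, euclidean))  # nearest by L2
print(assign([point], candidates, manhattan))  # nearest by L1
```

For the point `(0, 0)` and candidate centroids `(3, 3)` and `(5, 0)`, the Euclidean distances are about 4.24 and 5, while the Manhattan distances are 6 and 5, so the two functions pick different centroids. Differences of this kind in the assignment step are one way the choice of distance function can change the sizes of the resulting clusters.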

Index Terms

Computer Science
Information Sciences

Keywords

K-means clustering, distance functions, clustering, Euclidean distance function, Manhattan distance function