Research Article

Performance Evaluation of K-means Clustering Algorithm with Various Distance Metrics

by Y. S. Thakare, S. B. Bagal
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 110 - Number 11
Year of Publication: 2015
DOI: 10.5120/19360-0929

Y. S. Thakare, S. B. Bagal. Performance Evaluation of K-means Clustering Algorithm with Various Distance Metrics. International Journal of Computer Applications. 110, 11 (January 2015), 12-16. DOI=10.5120/19360-0929

@article{ 10.5120/19360-0929,
author = { Y. S. Thakare and S. B. Bagal },
title = { Performance Evaluation of K-means Clustering Algorithm with Various Distance Metrics },
journal = { International Journal of Computer Applications },
issue_date = { January 2015 },
volume = { 110 },
number = { 11 },
month = { January },
year = { 2015 },
issn = { 0975-8887 },
pages = { 12-16 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume110/number11/19360-0929/ },
doi = { 10.5120/19360-0929 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:46:04.928719+05:30
%A Y. S. Thakare
%A S. B. Bagal
%T Performance Evaluation of K-means Clustering Algorithm with Various Distance Metrics
%J International Journal of Computer Applications
%@ 0975-8887
%V 110
%N 11
%P 12-16
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Cluster analysis has been widely used in several disciplines, such as statistics, software engineering, biology, psychology and other social sciences, to identify natural groups in large amounts of data. Clustering has also been widely adopted by researchers within computer science, especially the database community. K-means is one of the most famous clustering algorithms. In this paper, the performance of the basic K-means algorithm is evaluated using various distance metrics on the iris, wine, vowel, ionosphere and crude oil datasets while varying the number of clusters. The result analysis shows that the performance of the K-means algorithm depends on the distance metric used for the selected dataset. Thus, this work will help in selecting a suitable distance metric for a particular application.
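
To make the idea of running K-means with different distance metrics concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not list the metrics or the initialization used in the paper, so the three metrics here (Euclidean, Manhattan, Chebyshev), the evenly spaced initial centroids, the arithmetic-mean centroid update, and the tiny `toy_points` dataset are all assumptions chosen for illustration only.

```python
import math


def euclidean(a, b):
    """Straight-line (L2) distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Chebyshev (L-infinity) distance: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def kmeans(points, k, distance, max_iter=100):
    """Basic K-means where only the assignment step uses `distance`.

    Assumptions (not taken from the paper): initial centroids are picked
    at evenly spaced positions in `points`, and centroids are always
    updated as the arithmetic mean of their members.
    """
    n = len(points)
    centroids = [list(points[i * n // k]) for i in range(k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


# Hypothetical toy data: two well-separated groups of three points each.
toy_points = [
    (1.0, 1.0), (1.2, 0.8), (0.9, 1.1),
    (8.0, 8.0), (8.2, 7.9), (7.8, 8.1),
]

if __name__ == "__main__":
    for name, metric in [("euclidean", euclidean),
                         ("manhattan", manhattan),
                         ("chebyshev", chebyshev)]:
        labels, _ = kmeans(toy_points, k=2, distance=metric)
        print(name, labels)
```

On data this cleanly separated every metric should yield the same grouping; differences between metrics, of the kind the paper studies, would only be expected on real datasets where clusters overlap or features have different scales.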

Index Terms

Computer Science
Information Sciences

Keywords

Clustering algorithms, Pattern recognition