Avoiding Objects with few Neighbors in the K-Means Process and Adding ROCK Links to Its Distance

Hadi A. Alnabriss; Wesam Ashour

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 28 - Number 10

Year of Publication: 2011

Authors: Hadi A. Alnabriss, Wesam Ashour

10.5120/3421-4040

Hadi A. Alnabriss, Wesam Ashour. Avoiding Objects with few Neighbors in the K-Means Process and Adding ROCK Links to Its Distance. International Journal of Computer Applications. 28, 10 (August 2011), 12-17. DOI=10.5120/3421-4040

@article{ 10.5120/3421-4040,

author = { Hadi A. Alnabriss, Wesam Ashour },

title = { Avoiding Objects with few Neighbors in the K-Means Process and Adding ROCK Links to Its Distance },

journal = { International Journal of Computer Applications },

issue_date = { August 2011 },

volume = { 28 },

number = { 10 },

month = { August },

year = { 2011 },

issn = { 0975-8887 },

pages = { 12-17 },

numpages = {9},

url = { https://ijcaonline.org/archives/volume28/number10/3421-4040/ },

doi = { 10.5120/3421-4040 },

publisher = {Foundation of Computer Science (FCS), NY, USA},

address = {New York, USA}

}

%0 Journal Article

%1 2024-02-06T20:14:25.398875+05:30

%A Hadi A. Alnabriss

%A Wesam Ashour

%T Avoiding Objects with few Neighbors in the K-Means Process and Adding ROCK Links to Its Distance

%J International Journal of Computer Applications

%@ 0975-8887

%V 28

%N 10

%P 12-17

%D 2011

%I Foundation of Computer Science (FCS), NY, USA

Abstract

K-means is one of the most common and powerful algorithms in data clustering. In this paper we present new techniques to solve two problems in the traditional K-means clustering algorithm. The first problem is its sensitivity to outliers. Here we rely on a function that decides whether an object is an outlier; if it is, the object is excluded from our calculations, which helps K-means produce good results even when more outlier points are added. In the second part we make K-means depend on ROCK links in addition to its traditional distance. ROCK links take into account the number of common neighbors between two objects, which enables K-means to detect shapes that the traditional K-means cannot detect.
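The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the outlier function or say how links are combined with the distance, so everything below is an assumption made for illustration. Two objects are treated as neighbors when their Euclidean distance is at most a hypothetical `radius`, an object is not counted as its own neighbor, and an object is treated as an outlier when it has fewer than a hypothetical `min_neighbors` neighbors. The function names are also hypothetical.

```python
from math import dist


def neighbor_sets(points, radius):
    """For each point, return the indices of the other points within `radius`.

    Assumption: neighborhood is defined by Euclidean distance; the paper may
    use a different similarity measure or threshold.
    """
    n = len(points)
    return [
        {j for j in range(n) if j != i and dist(points[i], points[j]) <= radius}
        for i in range(n)
    ]


def rock_link(neighbors, i, j):
    """ROCK-style link: the number of neighbors shared by objects i and j."""
    return len(neighbors[i] & neighbors[j])


def drop_sparse_points(points, radius, min_neighbors):
    """Keep only the points that have at least `min_neighbors` neighbors.

    Assumption: a simple neighbor-count threshold stands in for the paper's
    outlier function, which the abstract does not specify.
    """
    neighbors = neighbor_sets(points, radius)
    return [p for p, nb in zip(points, neighbors) if len(nb) >= min_neighbors]


# Four points on a unit square plus one isolated point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
neighbors = neighbor_sets(points, radius=1.5)
kept = drop_sparse_points(points, radius=1.5, min_neighbors=2)

print(kept)                        # the isolated point is excluded
print(rock_link(neighbors, 0, 1))  # common neighbors of two square corners
```

In this toy data the isolated point has no neighbors and is filtered out before any centroid would be computed, while two corners of the square share two common neighbors. How such a link count should be weighted against the Euclidean distance inside the K-means assignment step is left open here, since the abstract does not describe it.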

Index Terms

Computer Science

Information Sciences

Keywords

Robust K-means; ROCK links; Initializing K-means; Electing centroids; Optimizing K-means distance measurement