Cluster based Outlier Detection

Pranjali Kasture; Jayant Gadge

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 58 - Number 10

Year of Publication: 2012

Authors: Pranjali Kasture, Jayant Gadge

10.5120/9317-3549

Pranjali Kasture, Jayant Gadge. Cluster based Outlier Detection. International Journal of Computer Applications. 58, 10 (November 2012), 11-15. DOI=10.5120/9317-3549

@article{ 10.5120/9317-3549,
author = { Pranjali Kasture and Jayant Gadge },
title = { Cluster based Outlier Detection },
journal = { International Journal of Computer Applications },
issue_date = { November 2012 },
volume = { 58 },
number = { 10 },
month = { November },
year = { 2012 },
issn = { 0975-8887 },
pages = { 11-15 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume58/number10/9317-3549/ },
doi = { 10.5120/9317-3549 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

Abstract

Outlier detection is a fundamental issue in data mining; specifically, it has been used to detect and remove anomalous objects from data. The proposed approach to outlier detection comprises three steps: clustering, pruning, and computing an outlier score. For clustering, the k-means algorithm is used, which partitions the dataset into a given number of clusters. In pruning, points that are close to the centroid of each cluster are pruned, based on a distance measure. For the unpruned points, the local distance-based outlier factor (LDOF) is calculated. LDOF indicates how much a point deviates from its neighbors. A high LDOF value indicates that the point deviates strongly from its neighbors and is probably an outlier.
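
The three steps described in the abstract can be sketched as follows. This is a minimal illustration in Python, not the authors' implementation. The abstract does not state the pruning threshold, so the sketch assumes that a point is pruned when its distance to its centroid is below the mean centroid distance of its cluster. LDOF is taken, as it is usually defined, to be the mean distance from a point to its k nearest neighbors divided by the mean pairwise distance among those neighbors. The function names, the deterministic centroid initialization, the choice to draw neighbors from the full dataset, and the sample data are all assumptions made for illustration.

```python
import math


def kmeans(points, n_clusters, iters=50):
    """Plain Lloyd's k-means on a list of tuples; returns (centroids, labels).

    Assumption: centroids start at evenly spaced input points so that the
    example is reproducible; a real run would typically use random restarts.
    """
    centroids = [points[i * len(points) // n_clusters] for i in range(n_clusters)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(n_clusters), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(n_clusters):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                updated.append(centroids[c])  # empty cluster keeps its centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def prune(points, centroids, labels):
    """Return indices of points kept as outlier candidates.

    Assumed rule: drop a point whose distance to its centroid is below the
    mean centroid distance of its cluster.
    """
    dist = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    kept = []
    for c in range(len(centroids)):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if idx:
            radius = sum(dist[i] for i in idx) / len(idx)
            kept.extend(i for i in idx if dist[i] >= radius)
    return sorted(kept)


def ldof(i, points, k):
    """LDOF of points[i]; needs k >= 2 and neighbours that are not all identical."""
    p = points[i]
    others = [q for j, q in enumerate(points) if j != i]
    knn = sorted(others, key=lambda q: math.dist(p, q))[:k]
    d_knn = sum(math.dist(p, q) for q in knn) / k
    pairs = [(a, b) for n, a in enumerate(knn) for b in knn[n + 1:]]
    d_inner = sum(math.dist(a, b) for a, b in pairs) / len(pairs)
    return d_knn / d_inner


def detect_outliers(points, n_clusters, k, top_n):
    """Cluster, prune, score the remaining points, and return the top_n indices."""
    centroids, labels = kmeans(points, n_clusters)
    candidates = prune(points, centroids, labels)
    scores = {i: ldof(i, points, k) for i in candidates}
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical data: two tight groups and one isolated point (index 10).
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
          (5, 20)]
print(detect_outliers(points, n_clusters=2, k=3, top_n=1))  # -> [10]
```

Pruning only reduces the number of points for which LDOF has to be computed; in this sketch the score of a surviving point is the same as it would be without pruning, because its neighbors are still taken from the whole dataset.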

Index Terms

Computer Science

Information Sciences

Keywords

Outlier, cluster, pruning, outlier score, k nearest neighbor