Research Article

Onion-Peeling Outlier Detection in 2-D data Sets

by Archit Harsh, John E. Ball, Pan Wei
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 139 - Number 3
Year of Publication: 2016
DOI: 10.5120/ijca2016909122

Archit Harsh, John E. Ball, Pan Wei. Onion-Peeling Outlier Detection in 2-D data Sets. International Journal of Computer Applications. 139, 3 (April 2016), 26-31. DOI=10.5120/ijca2016909122

@article{ 10.5120/ijca2016909122,
author = { Archit Harsh and John E. Ball and Pan Wei },
title = { Onion-Peeling Outlier Detection in 2-D data Sets },
journal = { International Journal of Computer Applications },
issue_date = { April 2016 },
volume = { 139 },
number = { 3 },
month = { April },
year = { 2016 },
issn = { 0975-8887 },
pages = { 26-31 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume139/number3/24471-2016909122/ },
doi = { 10.5120/ijca2016909122 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:39:57.738262+05:30
%A Archit Harsh
%A John E. Ball
%A Pan Wei
%T Onion-Peeling Outlier Detection in 2-D data Sets
%J International Journal of Computer Applications
%@ 0975-8887
%V 139
%N 3
%P 26-31
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Outlier detection is a critical research task because of its applications in a variety of domains, including data mining, clustering, statistical analysis, fraud detection, network intrusion detection, and disease diagnosis. Over the last few decades, distance-based outlier detection algorithms have gained a significant reputation as a viable alternative to more traditional statistical approaches because they are scalable, non-parametric, and simple to implement. In this paper, we present a modified onion-peeling (convex hull) genetic algorithm to detect outliers in a Gaussian 2-D point data set. We present three outlier detection scenarios using (a) the Euclidean distance metric, (b) the standardized Euclidean distance metric, and (c) the Mahalanobis distance metric. Finally, we analyze the performance and evaluate the results.
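
The abstract does not give implementation details, so the following Python sketch is only an illustration of the building blocks it names, not the authors' method. It peels convex hull layers from a 2-D point set (using Andrew's monotone chain, a choice made here for brevity) and computes the three distances from the sample mean. All function names (`convex_hull`, `onion_layers`, `mean_and_cov`, and the distance helpers) are hypothetical, and how the paper combines peeling with a distance threshold is left open.

```python
import math

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def onion_layers(points):
    """Repeatedly remove the convex hull; layer 0 is the outermost hull."""
    remaining = set(points)
    layers = []
    while remaining:
        hull = convex_hull(list(remaining))
        layers.append(hull)
        remaining -= set(hull)
    return layers

def mean_and_cov(points):
    """Sample mean and unbiased 2x2 covariance, returned as (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def euclidean(p, mu):
    return math.hypot(p[0] - mu[0], p[1] - mu[1])

def standardized_euclidean(p, mu, cov):
    """Euclidean distance with each axis scaled by its own variance."""
    sxx, _, syy = cov
    dx, dy = p[0] - mu[0], p[1] - mu[1]
    return math.sqrt(dx * dx / sxx + dy * dy / syy)

def mahalanobis(p, mu, cov):
    """sqrt(d^T S^-1 d), using the closed-form inverse of a 2x2 matrix."""
    sxx, sxy, syy = cov
    dx, dy = p[0] - mu[0], p[1] - mu[1]
    det = sxx * syy - sxy * sxy
    return math.sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)

if __name__ == "__main__":
    # Toy data (made up for illustration): two nested squares and a center.
    pts = [(0, 0), (4, 0), (4, 4), (0, 4),
           (1, 1), (3, 1), (3, 3), (1, 3), (2, 2)]
    mu, cov = mean_and_cov(pts)
    for depth, layer in enumerate(onion_layers(pts)):
        for p in layer:
            print(depth, p, round(mahalanobis(p, mu, cov), 3))
```

In a sketch like this, points on the outermost layers with a large distance from the mean would be the natural outlier candidates; the cutoff itself is an assumption that the abstract does not specify.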

Index Terms

Computer Science
Information Sciences

Keywords

Onion Peeling, Convex Hull, Outlier Detection, Computational Statistics