Onion-Peeling Outlier Detection in 2-D data Sets

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
Authors:
Archit Harsh, John E. Ball, Pan Wei
10.5120/ijca2016909122

Archit Harsh, John E Ball and Pan Wei. Article: Onion-Peeling Outlier Detection in 2-D data Sets. International Journal of Computer Applications 139(3):26-31, April 2016. Published by Foundation of Computer Science (FCS), NY, USA.

BibTeX

@article{key:article,
	author = {Archit Harsh and John E. Ball and Pan Wei},
	title = {Article: Onion-Peeling Outlier Detection in 2-D data Sets},
	journal = {International Journal of Computer Applications},
	year = {2016},
	volume = {139},
	number = {3},
	pages = {26-31},
	month = {April},
	note = {Published by Foundation of Computer Science (FCS), NY, USA}
}

Abstract

Outlier detection is a critical research task because of its applications in a variety of domains, including data mining, clustering, statistical analysis, fraud detection, network intrusion detection, and disease diagnosis. Over the last few decades, distance-based outlier detection algorithms have gained a significant reputation as a viable alternative to more traditional statistical approaches because they are scalable, non-parametric, and simple to implement. In this paper, we present a modified onion-peeling (convex hull) genetic algorithm to detect outliers in a Gaussian 2-D point data set. We present three outlier detection scenarios using (a) the Euclidean distance metric, (b) the standardized Euclidean distance metric, and (c) the Mahalanobis distance metric. Finally, we analyze the performance and evaluate the results.
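The abstract does not spell out the modified algorithm, so the following is only a minimal Python sketch of the two generic ingredients it names: peeling a 2-D point set into nested convex hull layers, and measuring a point's distance from the sample mean with the three metrics listed. The function names (`convex_hull`, `onion_layers`, `mean_and_cov`, `distances`) are hypothetical, the hull routine is Andrew's monotone chain rather than anything confirmed by the paper, and how the authors combine the layers with the distances is not assumed here.

```python
from math import sqrt


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices counter-clockwise.

    Points lying strictly inside a hull edge are not reported as vertices.
    """
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def onion_layers(points):
    """Peel successive convex hulls; layers[0] is the outermost hull."""
    remaining = list(set(points))
    layers = []
    while remaining:
        hull = convex_hull(remaining)
        layers.append(hull)
        hull_set = set(hull)
        remaining = [p for p in remaining if p not in hull_set]
    return layers


def mean_and_cov(points):
    """Sample mean and unbiased 2x2 covariance, returned as (sxx, sxy, syy)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def distances(p, mean, cov):
    """Euclidean, standardized Euclidean, and Mahalanobis distance to mean."""
    dx, dy = p[0] - mean[0], p[1] - mean[1]
    sxx, sxy, syy = cov
    euclid = sqrt(dx * dx + dy * dy)
    # Standardized Euclidean: scale each axis by its own variance
    std_euclid = sqrt(dx * dx / sxx + dy * dy / syy)
    # Mahalanobis: d^T S^-1 d with the closed-form inverse of a 2x2 matrix
    det = sxx * syy - sxy * sxy
    mahal = sqrt((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return euclid, std_euclid, mahal
```

In a sketch like this, points on the outermost layers are natural outlier candidates, and the three distances give alternative ways to score them. When the off-diagonal covariance term is zero, the standardized Euclidean and Mahalanobis distances coincide; they differ only when the two coordinates are correlated.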

Keywords

Onion Peeling, Convex Hull, Outlier Detection, Computational Statistics