Research Article

Comparative Analysis of Outlier Detection Techniques

by Kamal Malik, H. Sadawarti, Kalra G. S
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 97 - Number 8
Year of Publication: 2014
Authors: Kamal Malik, H. Sadawarti, Kalra G. S
DOI: 10.5120/17026-7318

Kamal Malik, H. Sadawarti, Kalra G. S. Comparative Analysis of Outlier Detection Techniques. International Journal of Computer Applications. 97, 8 (July 2014), 12-21. DOI=10.5120/17026-7318

@article{ 10.5120/17026-7318,
author = { Kamal Malik and H. Sadawarti and Kalra G. S },
title = { Comparative Analysis of Outlier Detection Techniques },
journal = { International Journal of Computer Applications },
issue_date = { July 2014 },
volume = { 97 },
number = { 8 },
month = { July },
year = { 2014 },
issn = { 0975-8887 },
pages = { 12-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume97/number8/17026-7318/ },
doi = { 10.5120/17026-7318 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:23:34.107071+05:30
%A Kamal Malik
%A H. Sadawarti
%A Kalra G. S
%T Comparative Analysis of Outlier Detection Techniques
%J International Journal of Computer Applications
%@ 0975-8887
%V 97
%N 8
%P 12-21
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining refers to the extraction of interesting patterns from massive data sets. Outlier detection is an important aspect of data mining that identifies observations deviating from commonly expected behavior. Outlier detection and analysis is sometimes known as outlier mining. In this paper, we provide a broad and comprehensive literature survey of outliers and outlier detection techniques in one place, so as to explain the richness and complexity associated with each technique. We also give a broad comparison of the various methods within the different outlier detection techniques.

Index Terms

Computer Science
Information Sciences

Keywords

Outliers, Data Mining, Clustering, Neural Network