Detection of Discordant Observations and Visualization of Data

International Conference and Workshop on Emerging Trends in Technology
© 2011 by IJCA Journal
Number 9 - Article 3
Year of Publication: 2011
Authors:
Loshma Gunisetti

Loshma Gunisetti. Detection of Discordant Observations and Visualization of Data. IJCA Proceedings on International Conference and workshop on Emerging Trends in Technology (ICWET) (9):15-18, 2011. Full text available. BibTeX

@article{key:article,
	author = {Loshma Gunisetti},
	title = {Detection of Discordant Observations and Visualization of Data},
	journal = {IJCA Proceedings on International Conference and workshop on Emerging Trends in Technology (ICWET)},
	year = {2011},
	number = {9},
	pages = {15-18},
	note = {Full text available}
}

Abstract

Discordant observations are values in a data set that deviate so much from the other observations as to arouse suspicion that they were generated by a different mechanism. They are also called anomalies or outliers, and they can be used to identify special, extraordinary, or fraudulent cases in day-to-day transactions. Preprocessing can identify noise in the data, and removing that noise improves data quality. Anomaly detection has applications such as traffic analysis and credit card fraud detection. We applied anomaly detection to a traffic data set to identify anomalous traffic stations on the highway. The detected stations represent abnormalities in the traffic sensor data, and we use this information to identify faulty traffic sensors located at the highway stations. A two-dimensional visualization of the outliers is provided, which can be used to analyze the data efficiently. Traffic management becomes easier once the abnormal traffic sensors at the corresponding stations are identified. The method used here can be easily applied to very large data sets.
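
The abstract does not state which detection technique the paper uses, so the following is only an illustrative sketch of the general idea of flagging stations whose sensor readings are discordant with the rest. It assumes a modified z-score based on the median and the median absolute deviation (MAD), a common robust rule that is not taken from the paper. The function name, the dictionary layout, the station identifiers, the sample values, and the 3.5 threshold are all hypothetical.

```python
from statistics import median


def flag_discordant_stations(readings, threshold=3.5):
    """Return the sorted ids of stations whose mean reading is discordant.

    readings: dict mapping a station id to a list of sensor values
    (a hypothetical layout, not the paper's data format).
    A station is flagged when the modified z-score of its mean reading,
    0.6745 * |mean - median| / MAD, exceeds the threshold.
    """
    # Summarize each station by the mean of its sensor values.
    means = {sid: sum(vals) / len(vals) for sid, vals in readings.items()}
    center = median(means.values())
    # Median absolute deviation: a spread estimate that outliers barely affect.
    mad = median(abs(m - center) for m in means.values())
    if mad == 0:
        return []  # no spread among stations, so nothing can be flagged
    return sorted(
        sid
        for sid, m in means.items()
        if 0.6745 * abs(m - center) / mad > threshold
    )


# Made-up readings: four stations report similar values, one reports near zero.
stations = {
    "S1": [52, 55, 50],
    "S2": [48, 51, 53],
    "S3": [54, 49, 52],
    "S4": [50, 53, 51],
    "S5": [0, 1, 0],
}
flagged = flag_discordant_stations(stations)
print(flagged)
```

With these made-up readings, only station S5 is flagged, which is the kind of result that would prompt a check of the sensor at that station. The median and MAD are used instead of the mean and standard deviation because a single faulty station would otherwise inflate the spread estimate and hide itself.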
