Research Article

Outlier Detection in Dataset using Hybrid Approach

by Shivani P. Patel, Vinita Shah, Jay Vala
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 122 - Number 8
Year of Publication: 2015
10.5120/21723-4874

Shivani P. Patel, Vinita Shah, Jay Vala. Outlier Detection in Dataset using Hybrid Approach. International Journal of Computer Applications. 122, 8 (July 2015), 38-41. DOI=10.5120/21723-4874

@article{ 10.5120/21723-4874,
author = { Shivani P. Patel and Vinita Shah and Jay Vala },
title = { Outlier Detection in Dataset using Hybrid Approach },
journal = { International Journal of Computer Applications },
issue_date = { July 2015 },
volume = { 122 },
number = { 8 },
month = { July },
year = { 2015 },
issn = { 0975-8887 },
pages = { 38-41 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume122/number8/21723-4874/ },
doi = { 10.5120/21723-4874 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:10:03.910824+05:30
%A Shivani P. Patel
%A Vinita Shah
%A Jay Vala
%T Outlier Detection in Dataset using Hybrid Approach
%J International Journal of Computer Applications
%@ 0975-8887
%V 122
%N 8
%P 38-41
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

An outlier is a data point that deviates markedly from the rest of a dataset, and most real-world datasets contain outliers. Outlier analysis is a data mining technique whose task is to discover data that behave exceptionally compared with the remaining data. Outlier detection plays an important role in data mining and is useful in many fields, such as medicine, network intrusion detection, credit card fraud detection, and fault diagnosis in machines. Clustering is used to deal with outliers: outlier detection consists of clustering the dataset and then finding outliers by applying an outlier detection technique. K-means is widely used to cluster the dataset. Statistical-based, distance-based, deviation-based, and density-based methods are used to detect outliers. The experimental results show that the existing algorithm performs better than the proposed cluster-based and distance-based algorithm.
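
The two-stage idea described in the abstract (cluster the data with K-means, then apply a distance-based test within each cluster) might be sketched as follows. This is a minimal illustration and not the authors' algorithm: the function names `kmeans` and `distance_outliers`, the choice of the first k points as initial centroids, and the threshold rule (a point is flagged when its distance to its centroid exceeds `factor` times the median distance in its cluster) are all assumptions made for the example.

```python
import math
from statistics import median


def kmeans(points, k, iterations=50):
    """Plain Lloyd's k-means on a list of coordinate tuples.

    Assumption for this sketch: the first k points serve as the
    initial centroids, which keeps the result deterministic.
    Returns (centroids, labels).
    """
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(axis) / len(members) for axis in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


def distance_outliers(points, centroids, labels, factor=2.0):
    """Flag points that lie unusually far from their own centroid.

    Assumed rule (hypothetical, not from the paper): a point is an
    outlier when its distance exceeds factor * median distance of
    the points in the same cluster.
    """
    dists = [math.dist(p, centroids[lab]) for p, lab in zip(points, labels)]
    outliers = []
    for c in range(len(centroids)):
        cluster = [d for d, lab in zip(dists, labels) if lab == c]
        if not cluster:
            continue
        threshold = factor * median(cluster)
        outliers += [
            p for p, d, lab in zip(points, dists, labels)
            if lab == c and d > threshold
        ]
    return outliers


if __name__ == "__main__":
    data = [(0, 0), (10, 10), (0, 1), (1, 0), (1, 1),
            (10, 11), (11, 10), (11, 11), (10, 40)]
    centroids, labels = kmeans(data, k=2)
    print(distance_outliers(data, centroids, labels))  # -> [(10, 40)]
```

A median-based threshold is used here because a single extreme point drags the cluster mean toward itself; other distance rules, such as a fixed radius or a multiple of the standard deviation, would fit the same structure.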

Index Terms

Computer Science
Information Sciences

Keywords

Data Mining, Outlier, Clustering Approach, k-mean Algorithm, Distance Based Approach