Research Article

Global High Dimension Outlier Algorithm for Efficient Clustering and Outlier Detection

by Nidhi Nigam, Tripti Saxena
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 131 - Number 18
Year of Publication: 2015
Authors: Nidhi Nigam, Tripti Saxena
10.5120/ijca2015905363

Nidhi Nigam, Tripti Saxena. Global High Dimension Outlier Algorithm for Efficient Clustering and Outlier Detection. International Journal of Computer Applications. 131, 18 (December 2015), 1-4. DOI=10.5120/ijca2015905363

@article{ 10.5120/ijca2015905363,
author = { Nidhi Nigam and Tripti Saxena },
title = { Global High Dimension Outlier Algorithm for Efficient Clustering and Outlier Detection },
journal = { International Journal of Computer Applications },
issue_date = { December 2015 },
volume = { 131 },
number = { 18 },
month = { December },
year = { 2015 },
issn = { 0975-8887 },
pages = { 1-4 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume131/number18/23546-2015905363/ },
doi = { 10.5120/ijca2015905363 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:27:41.038709+05:30
%A Nidhi Nigam
%A Tripti Saxena
%T Global High Dimension Outlier Algorithm for Efficient Clustering and Outlier Detection
%J International Journal of Computer Applications
%@ 0975-8887
%V 131
%N 18
%P 1-4
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In this digital era, most information is available in digital form. For many years, researchers have held the hypothesis that using phrases to represent documents and topics should perform better than using terms. In this paper we examine and investigate this claim, considering several state-of-the-art data processing methods that give satisfactory results, in order to improve the effectiveness of the discovered patterns. We implement a pattern detection method to solve the problems of term-based methods and to improve results that are useful in information retrieval systems. Our proposal is also evaluated on several distinct domains, providing reliable taxonomies in all cases, as measured by precision and recall together with F-measure. For the experiment we use a large dataset, and the results are expected to show that we improve pattern discovery compared with previous text mining methods. The results of the experimental setup are expected to show that keyword-based methods do not perform better than the pattern-based method. The results also indicate that removing meaningless patterns not only reduces the cost of computation but also improves the effectiveness of the system.

Index Terms

Computer Science
Information Sciences

Keywords

KDD, DBSCAN, Noisy data, Distributed solving set, Lazy distributed solving set