Research Article

Towards Unsupervised and Consistent High Dimensional Data Clustering

by R. G. Mehta, N. J. Mistry, M. Raghuwanshi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 87 - Number 2
Year of Publication: 2014
10.5120/15183-3532

R. G. Mehta, N. J. Mistry, M. Raghuwanshi. Towards Unsupervised and Consistent High Dimensional Data Clustering. International Journal of Computer Applications. 87, 2 (February 2014), 40-44. DOI=10.5120/15183-3532

@article{ 10.5120/15183-3532,
author = { R. G. Mehta and N. J. Mistry and M. Raghuwanshi },
title = { Towards Unsupervised and Consistent High Dimensional Data Clustering },
journal = { International Journal of Computer Applications },
issue_date = { February 2014 },
volume = { 87 },
number = { 2 },
month = { February },
year = { 2014 },
issn = { 0975-8887 },
pages = { 40-44 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume87/number2/15183-3532/ },
doi = { 10.5120/15183-3532 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:04:55.301708+05:30
%A R. G. Mehta
%A N. J. Mistry
%A M. Raghuwanshi
%T Towards Unsupervised and Consistent High Dimensional Data Clustering
%J International Journal of Computer Applications
%@ 0975-8887
%V 87
%N 2
%P 40-44
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Growing demand for information and improved data acquisition have increased both the size and the dimensionality of data, which poses a major challenge for data mining algorithms. Clustering groups data with similar characteristics together to improve the performance of knowledge-based systems. High-dimensional, large data sets degrade the performance of existing clustering algorithms. PROCLUS is an efficient high-dimensional clustering algorithm, but it has significant issues, such as inconsistent results and subspaces that require expert supervision. MPROCLUS, a modified PROCLUS algorithm, is proposed to improve running time and consistency and to select parameters such as the average number of dimensions without supervision. The promising and consistent results of MPROCLUS open the way for further research on its use in stream data mining.
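As a rough illustration of the projected-clustering idea that PROCLUS builds on, the sketch below assigns each point to the medoid that is closest when distance is measured only over that medoid's own subset of dimensions, using the Manhattan segmental distance (the average absolute difference over the selected dimensions). It is not the MPROCLUS procedure described in the paper: the function names, the sample points, the medoids, and the per-medoid dimension sets are all assumptions chosen for the example, and the dimension sets are given by hand rather than selected automatically.

```python
from typing import List, Sequence


def manhattan_segmental_distance(
    p: Sequence[float], q: Sequence[float], dims: Sequence[int]
) -> float:
    """Average absolute difference between p and q over the chosen dimensions only."""
    if not dims:
        raise ValueError("dims must be non-empty")
    return sum(abs(p[d] - q[d]) for d in dims) / len(dims)


def assign_to_medoids(
    points: Sequence[Sequence[float]],
    medoids: Sequence[Sequence[float]],
    medoid_dims: Sequence[Sequence[int]],
) -> List[int]:
    """Label each point with the index of its nearest medoid.

    Each medoid is compared using its own dimension subset, so dimensions
    that are irrelevant to a cluster do not influence the assignment.
    """
    labels = []
    for p in points:
        dists = [
            manhattan_segmental_distance(p, m, dims)
            for m, dims in zip(medoids, medoid_dims)
        ]
        labels.append(dists.index(min(dists)))
    return labels


# Hypothetical data: cluster 0 is tight in dimensions 0 and 2,
# cluster 1 is tight in dimensions 0 and 1; the remaining dimension is noise.
points = [(1.0, 50.0, 2.0), (1.2, -30.0, 2.1), (9.0, 5.0, 40.0), (9.1, 5.2, -10.0)]
medoids = [(1.1, 0.0, 2.0), (9.0, 5.0, 0.0)]
medoid_dims = [(0, 2), (0, 1)]  # assumed, hand-picked subspaces

labels = assign_to_medoids(points, medoids, medoid_dims)
print(labels)
```

In this toy setting the noisy dimension of each point is ignored for the cluster it belongs to, which is the reason subspace-specific distances help in high-dimensional data. Choosing the dimension sets and their average size is exactly the step the abstract says needs expert supervision in PROCLUS.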

References
  1. Vaishali, P. and Rupa, M., "Modified k-Means Clustering Algorithm", Computational Intelligence and Information Technology (2011), Vol. 250, 307-312
  2. Aggarwal, C. C., Joel, L. W., Philip, S. Yu, Cecilia, P., and Jong, S. P., "Fast algorithms for projected clustering", ACM SIGMOD international conference on Management of data (May 1999), 28(2), 61-72
  3. Hans-Peter, K., Peer K., and Arthur, Z., "Clustering high dimensional data: A survey on subspace clustering, pattern based clustering, and correlation clustering", ACM Transactions on Knowledge Discovery from Data (April 2009), 3(1)
  4. Hall, M. A., and Holmes, G., "Benchmarking attribute selection techniques for discrete class data mining", IEEE Transactions on Knowledge and Data Engineering (Nov 2003), 15(6), 1437-1447
  5. Aggarwal, C., Hinneburg, A. and Keim, D., "On the surprising behavior of distance metrics in high dimensional space", Database Theory -- ICDT 2001, Springer, 420-435
  6. Aggarwal, C., and Philip S. Yu., "Finding generalized projected clusters in high dimensional spaces", Proceedings of the 2000 ACM SIGMOD international conference on Management of data (Feb 2000), 70-81
  7. Kevin, Y. and David W., "Harp: A practical projected clustering algorithm", IEEE Transactions on Knowledge and Data Engineering (Nov 2004), 16(11), 1387-1397
  8. Woo, K., Lee J. and Kim, M., "FINDIT: a fast and intelligent subspace clustering algorithm using dimension voting", Information and Software Technology (March 2004), 46(4), 255-271
  9. Bharat T., Rupa M., "A Novel Approach For High Dimensional Data Clustering", LAP LAMBERT Academic Publishing, 2012
  10. UCI Machine learning data set repository: http://archive.ics.uci.edu/ml
Index Terms

Computer Science
Information Sciences

Keywords

High dimensional clustering, Unsupervised and consistent clustering, PROCLUS