Towards Unsupervised and Consistent High Dimensional Data Clustering

International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 87 - Number 2
Year of Publication: 2014
Authors:
R. G. Mehta
N. J. Mistry
M. Raghuwanshi
DOI: 10.5120/15183-3532

R G Mehta, N J Mistry and M Raghuwanshi. Article: Towards Unsupervised and Consistent High Dimensional Data Clustering. International Journal of Computer Applications 87(2):40-44, February 2014. Full text available.

BibTeX

@article{key:article,
	author = {R. G. Mehta and N. J. Mistry and M. Raghuwanshi},
	title = {Article: Towards Unsupervised and Consistent High Dimensional Data Clustering},
	journal = {International Journal of Computer Applications},
	year = {2014},
	volume = {87},
	number = {2},
	pages = {40-44},
	month = {February},
	note = {Full text available}
}

Abstract

Growing demand for information and improved data acquisition have increased both the size and the dimensionality of data, which poses a major challenge for data mining algorithms. Clustering groups data with similar characteristics together to improve the performance of knowledge-based systems. Existing clustering algorithms perform poorly on large, high-dimensional data. PROCLUS is an efficient high-dimensional clustering algorithm, but it has significant issues: its results are inconsistent, and its subspaces depend on expert supervision. This paper proposes MPROCLUS, a modified PROCLUS algorithm that aims to improve running time and consistency and to select parameters, such as the average number of dimensions, without supervision. The promising and consistent results of MPROCLUS open the way for further research on its use in stream data mining.
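
For readers unfamiliar with projected clustering, the following minimal sketch illustrates the Manhattan segmental distance that PROCLUS is generally described as using to compare a point with a cluster medoid inside that cluster's subspace. It is not the MPROCLUS method from this paper; the function name, the sample point, the medoid, and the chosen dimension sets are all illustrative assumptions.

```python
from typing import Sequence


def manhattan_segmental_distance(
    x: Sequence[float], y: Sequence[float], dims: Sequence[int]
) -> float:
    """Average absolute difference between x and y over the dimensions in dims.

    Dividing by len(dims) keeps distances comparable between clusters whose
    subspaces contain different numbers of dimensions.
    """
    if not dims:
        raise ValueError("dims must contain at least one dimension index")
    return sum(abs(x[d] - y[d]) for d in dims) / len(dims)


# Hypothetical data: the point is close to the medoid in dimensions 0 and 1
# but far from it in dimensions 2 and 3.
point = [1.0, 5.0, 9.0, 2.0]
medoid = [1.5, 5.5, 0.0, 8.0]

full_space = manhattan_segmental_distance(point, medoid, [0, 1, 2, 3])
subspace = manhattan_segmental_distance(point, medoid, [0, 1])

print(full_space)  # 4.0
print(subspace)    # 0.5
```

The point looks distant from the medoid in the full space but close once only the relevant dimensions are considered. This is why the choice of subspace, and of how many dimensions each cluster keeps on average, has such a strong effect on the resulting clusters.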
