Reducing and Clustering high Dimensional Data through Principal Component Analysis

International Journal of Computer Applications
© 2010 by IJCA Journal
Number 8 - Article 1
Year of Publication: 2010
Authors:
R.Indhumathi
Dr.S.Sathiyabama
10.5120/1606-2158

R.Indhumathi and Dr.S.Sathiyabama. Reducing and Clustering high Dimensional Data through Principal Component Analysis. International Journal of Computer Applications 11(8):1–4, December 2010. Published by Foundation of Computer Science.

BibTeX

@article{key:article,
	author = {R.Indhumathi and Dr.S.Sathiyabama},
	title = {Article:Reducing and Clustering high Dimensional Data through Principal Component Analysis},
	journal = {International Journal of Computer Applications},
	year = {2010},
	volume = {11},
	number = {8},
	pages = {1--4},
	month = {December},
	note = {Published By Foundation of Computer Science}
}

Abstract

High-dimensional data is common in real-world data mining applications. Developing effective clustering methods for high-dimensional datasets is challenging because of the curse of dimensionality. The k-means clustering algorithm is commonly used, but it is time-consuming and computationally expensive, and the quality of the resulting clusters depends on the choice of initial centroids and on the dimension of the data. When the dimension of the dataset is high, the accuracy of the result may fall short of expectations, because the chosen dataset cannot be assumed to be free of noise and flaws. Hence, to improve the efficiency and accuracy of mining tasks on high-dimensional data, the data must be pre-processed with an efficient dimensionality reduction method. This paper proposes a method in which the high-dimensional data is first reduced through Principal Component Analysis, and bisecting k-means clustering, which requires no initialization of the centroids, is then performed on the reduced data.
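
The two-stage pipeline described in the abstract (PCA reduction followed by bisecting k-means) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `pca_reduce`, `two_means`, and `bisecting_kmeans` are hypothetical, the number of retained components and the number of clusters are chosen by the caller, and the cluster to split next is picked by largest sum of squared errors, which is one common convention. The abstract states that no centroid initialization is needed but does not say how each bisection is started, so the deterministic seeding used inside `two_means` is purely an assumption of this sketch.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components."""
    Xc = X - X.mean(axis=0)  # center every feature
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def two_means(X, n_iter=100):
    """Split X into two groups with 2-means and return 0/1 labels.

    Assumption of this sketch: the seeds are the point farthest from the
    mean and the point farthest from that one, so no random start is used.
    """
    c0 = X[np.argmax(np.linalg.norm(X - X.mean(axis=0), axis=1))]
    c1 = X[np.argmax(np.linalg.norm(X - c0, axis=1))]
    centers = np.stack([c0, c1])
    labels = np.zeros(len(X), dtype=int)
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        if labels.min() == labels.max():
            break  # degenerate: every point fell on the same side
        new_centers = np.stack([X[labels == j].mean(axis=0) for j in (0, 1)])
        if np.allclose(new_centers, centers):
            break  # converged
        centers = new_centers
    return labels


def bisecting_kmeans(X, k):
    """Start from one cluster and repeatedly bisect the one with largest SSE."""
    labels = np.zeros(len(X), dtype=int)
    for new_label in range(1, k):
        sse = [((X[labels == j] - X[labels == j].mean(axis=0)) ** 2).sum()
               for j in range(new_label)]
        idx = np.flatnonzero(labels == int(np.argmax(sse)))
        split = two_means(X[idx])
        if split.min() == split.max():
            break  # the chosen cluster cannot be split further
        labels[idx[split == 1]] = new_label
    return labels


# Usage on synthetic data: three Gaussian blobs in 10 dimensions.
rng = np.random.default_rng(0)
means = np.zeros((3, 10))
means[1, 0] = 20.0
means[2, 1] = 60.0
X = np.vstack([rng.normal(m, 1.0, size=(40, 10)) for m in means])

reduced = pca_reduce(X, n_components=2)   # 10 dimensions -> 2
labels = bisecting_kmeans(reduced, k=3)   # cluster in the reduced space
print(np.bincount(labels))                # cluster sizes
```

Clustering in the reduced space keeps the distance computations cheap, and because each bisection starts from a deterministic seed, repeated runs on the same data give the same partition.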
