Research Article

Feature Selection based Semi-Supervised Subspace Clustering

Published on January 2013 by V. R. Saraswathy, N. Kasthuri, M. Revathi
Amrita International Conference of Women in Computing - 2013
Foundation of Computer Science USA
AICWIC - Number 4

V. R. Saraswathy, N. Kasthuri, M. Revathi. Feature Selection based Semi-Supervised Subspace Clustering. Amrita International Conference of Women in Computing - 2013. AICWIC, 4 (January 2013), 10-14.

@article{
author = { V. R. Saraswathy and N. Kasthuri and M. Revathi },
title = { Feature Selection based Semi-Supervised Subspace Clustering },
journal = { Amrita International Conference of Women in Computing - 2013 },
issue_date = { January 2013 },
volume = { AICWIC },
number = { 4 },
month = { January },
year = { 2013 },
issn = { 0975-8887 },
pages = { 10-14 },
numpages = 5,
url = { /proceedings/aicwic/number4/9882-1324/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 Amrita International Conference of Women in Computing - 2013
%A V. R. Saraswathy
%A N. Kasthuri
%A M. Revathi
%T Feature Selection based Semi-Supervised Subspace Clustering
%J Amrita International Conference of Women in Computing - 2013
%@ 0975-8887
%V AICWIC
%N 4
%P 10-14
%D 2013
%I International Journal of Computer Applications
Abstract

Clustering assigns a set of n objects to clusters (groups). Dimensionality reduction techniques can increase the accuracy of clustering results by removing redundant and irrelevant dimensions. In many situations, however, objects are related in different ways in different subsets of the dimensions. Dimensionality reduction tends to discard this relationship information and to generate clusters that do not fully reflect the properties of the real clusters. Subspace clustering preserves such relationships by detecting all clusters in all subspaces. The accuracy of subspace clustering results can be improved by using semi-supervised learning. However, searching for subspaces across all input dimensions may decrease clustering accuracy. This paper proposes a feature selection based semi-supervised subspace clustering method that first applies feature selection to eliminate unnecessary dimensions and then performs subspace clustering on the resulting dataset. This approach tends to improve the accuracy of the resulting clusters because subspace clustering is performed on a reduced dataset. Experimental results show that the proposed method produces higher-quality clusters than the semi-supervised subspace clustering algorithm.
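The two-stage idea in the abstract (feature selection first, then semi-supervised subspace clustering on the reduced data) can be illustrated with a minimal sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a variance-ranking filter as a stand-in for feature selection, seeded k-means as the semi-supervised step, and a per-cluster choice of the lowest-variance dimensions as a simplified notion of a cluster's subspace. The function names `select_features`, `seeded_kmeans`, and `cluster_subspaces` and the toy data are hypothetical.

```python
from statistics import pvariance


def select_features(rows, keep):
    """Filter step (assumed stand-in): keep the `keep` highest-variance columns."""
    n_cols = len(rows[0])
    variances = [pvariance([r[j] for r in rows]) for j in range(n_cols)]
    ranked = sorted(range(n_cols), key=lambda j: variances[j], reverse=True)
    return sorted(ranked[:keep])


def mean_point(points):
    """Component-wise mean of a non-empty list of points."""
    dims = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(dims)]


def seeded_kmeans(rows, seeds, iters=10):
    """Seeded k-means: `seeds` maps row index -> known label in 0..k-1.

    Assumes every cluster label has at least one seed.
    """
    k = max(seeds.values()) + 1
    dims = len(rows[0])
    # Initial centroids come from the labelled seed points only.
    centroids = [
        mean_point([rows[i] for i, lab in seeds.items() if lab == c])
        for c in range(k)
    ]
    labels = [0] * len(rows)
    for _ in range(iters):
        for i, r in enumerate(rows):
            if i in seeds:
                labels[i] = seeds[i]  # seeds always keep their given label
            else:
                labels[i] = min(
                    range(k),
                    key=lambda c: sum(
                        (r[j] - centroids[c][j]) ** 2 for j in range(dims)
                    ),
                )
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if labels[i] == c]
            centroids[c] = mean_point(members)
    return labels


def cluster_subspaces(rows, labels, n_dims):
    """Per cluster, pick the `n_dims` columns with the smallest within-cluster variance."""
    dims = len(rows[0])
    subspaces = {}
    for c in sorted(set(labels)):
        members = [r for r, lab in zip(rows, labels) if lab == c]
        var = [pvariance([m[j] for m in members]) for j in range(dims)]
        subspaces[c] = sorted(sorted(range(dims), key=lambda j: var[j])[:n_dims])
    return subspaces


# Toy data: column 2 is constant, so the filter drops it.
rows = [
    [0.0, 1.0, 5.0], [0.2, 1.5, 5.0], [0.1, 0.5, 5.0],
    [10.0, 9.0, 5.0], [11.0, 9.1, 5.0], [9.0, 8.9, 5.0],
]
kept = select_features(rows, keep=2)
reduced = [[r[j] for j in kept] for r in rows]
labels = seeded_kmeans(reduced, seeds={0: 0, 3: 1})
print(kept, labels, cluster_subspaces(reduced, labels, n_dims=1))
```

In this toy data, the first cluster is tight in one retained column and the second cluster is tight in the other, which is the situation the abstract describes: objects related in different ways in different subsets of the dimensions.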

Index Terms

Computer Science
Information Sciences

Keywords

Subspace, Feature Selection, Semi-supervised Learning