Research Article

Effective Clustering Algorithms for Gene Expression Data

by T.Chandrasekhar, K.Thangavel, E.Elayaraja
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 32 - Number 4
Year of Publication: 2011

T. Chandrasekhar, K. Thangavel, E. Elayaraja. Effective Clustering Algorithms for Gene Expression Data. International Journal of Computer Applications. 32, 4 (October 2011), 25-29. DOI=10.5120/3893-5454

@article{ 10.5120/3893-5454,
author = { T. Chandrasekhar and K. Thangavel and E. Elayaraja },
title = { Effective Clustering Algorithms for Gene Expression Data },
journal = { International Journal of Computer Applications },
issue_date = { October 2011 },
volume = { 32 },
number = { 4 },
month = { October },
year = { 2011 },
issn = { 0975-8887 },
pages = { 25-29 },
numpages = {9},
url = { },
doi = { 10.5120/3893-5454 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:18:18.322737+05:30
%A T.Chandrasekhar
%A K.Thangavel
%A E.Elayaraja
%T Effective Clustering Algorithms for Gene Expression Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 32
%N 4
%P 25-29
%D 2011
%I Foundation of Computer Science (FCS), NY, USA

Microarrays have made it possible to simultaneously monitor the expression profiles of thousands of genes under various experimental conditions. Identifying co-expressed genes and coherent patterns is the central goal of microarray (gene expression) data analysis and an important task in bioinformatics research. In this paper, the K-Means algorithm hybridised with the Cluster Centre Initialization Algorithm (CCIA) is proposed for gene expression data. The proposed algorithm overcomes the drawback of having to specify the number of clusters in K-Means methods. Experimental analysis shows that the proposed method performs well on gene expression data when compared with traditional K-Means clustering and the Silhouette Coefficient cluster measure.
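The abstract does not describe how CCIA works, so the following is not the authors' method. It is only a minimal sketch, assuming plain K-Means with random initial centres and the standard silhouette coefficient s(i) = (b - a) / max(a, b), of how a silhouette measure can be used to compare candidate numbers of clusters. The function names, the synthetic "expression" matrix, and the range of k are all hypothetical choices made for illustration.

```python
import numpy as np


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's K-Means with random initial centres (not CCIA)."""
    rng = np.random.default_rng(seed)
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assign each row to its nearest centre (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centres; keep the old centre if a cluster is empty.
        new_centres = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres


def silhouette_score(X, labels):
    """Mean silhouette coefficient over all rows of X."""
    labels = np.asarray(labels)
    clusters = np.unique(labels)
    if len(clusters) < 2:
        return 0.0  # undefined for a single cluster; 0 by convention here
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            continue  # singleton cluster: s(i) = 0 by convention
        # a: mean distance to the other members of the same cluster.
        a = D[i, same].sum() / (n_same - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(D[i, labels == c].mean() for c in clusters if c != labels[i])
        denom = max(a, b)
        scores[i] = (b - a) / denom if denom > 0 else 0.0
    return float(scores.mean())


# Hypothetical data: 90 "genes" x 5 "conditions", drawn around 3 profiles.
rng = np.random.default_rng(42)
profiles = np.array([[0.0] * 5, [6.0] * 5, [-6.0] * 5])
X = np.vstack([p + rng.normal(scale=0.5, size=(30, 5)) for p in profiles])

# Score each candidate k and keep the one with the highest silhouette.
scores = {k: silhouette_score(X, kmeans(X, k)[0]) for k in range(2, 7)}
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Because the initial centres here are random, a single run can settle in a poor local optimum; in practice one would repeat each k with several seeds, or replace the random start with a deterministic initialisation scheme.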

Index Terms

Computer Science
Information Sciences


Clustering, CCIA, K-Means, Gene expression data