
Gene Expression Data Analysis using Fuzzy C-means Clustering Technique

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
Thomas Scaria, Gifty Stephen, Juby Mathew

Thomas Scaria, Gifty Stephen and Juby Mathew. Article: Gene Expression Data Analysis using Fuzzy C-means Clustering Technique. International Journal of Computer Applications 135(8):33-36, February 2016. Published by Foundation of Computer Science (FCS), NY, USA. BibTeX

	author = {Thomas Scaria and Gifty Stephen and Juby Mathew},
	title = {Article: Gene Expression Data Analysis using Fuzzy C-means Clustering Technique},
	journal = {International Journal of Computer Applications},
	year = {2016},
	volume = {135},
	number = {8},
	pages = {33-36},
	month = {February},
	note = {Published by Foundation of Computer Science (FCS), NY, USA}


A central challenge in microarray analysis is analyzing and interpreting the large volume of data it produces. Clustering techniques from data mining address this challenge. In hard clustering, such as hierarchical and k-means clustering, the data are divided into distinct clusters and each data element belongs to exactly one cluster, so the clustering outcome may often be incorrect. Fuzzy clustering can overcome these problems. Among fuzzy clustering methods, fuzzy c-means (FCM) is the most suitable for microarray gene expression data. The drawback of fuzzy c-means is that the number of clusters to generate for a given dataset must be specified in advance. The main objective of the proposed possibilistic fuzzy c-means (PFCM) method is to determine the precise number of clusters and to interpret them efficiently. PFCM is a good clustering algorithm for classification tests because it can give more importance to either typicalities or membership values. PFCM is a hybridization of possibilistic c-means (PCM) and FCM that often avoids various problems of PCM, FCM and fuzzy possibilistic c-means (FPCM). The entire study is based on the sample dataset 'lung'. Existing research in this area does not work exclusively with cancer genes. In this work, the modified possibilistic fuzzy c-means algorithm is found to match cancer genes better. The algorithm is implemented in MATLAB. The accuracy on the dataset can be assessed by using different training sets. The possibilistic fuzzy c-means algorithm gave better results in identifying cancer genes. An experimental analysis was carried out to evaluate the feasibility of the PFCM clustering approach.
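To make the contrast between hard and fuzzy clustering concrete, the following is a minimal sketch of the standard fuzzy c-means iteration, written in Python with the standard library only. It is not the authors' MATLAB implementation and it does not include the possibilistic (typicality) term of PFCM. The function name `fuzzy_c_means`, the fuzzifier default `m=2.0`, the stopping tolerance and the toy `profiles` data are all assumptions made for illustration; note that the number of clusters `c` has to be supplied in advance, which is the limitation the abstract describes.

```python
import random


def fuzzy_c_means(points, c, m=2.0, iters=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means on a list of equal-length numeric tuples.

    Returns (centers, u), where u[k][i] is the degree to which point k
    belongs to cluster i. Illustrative sketch only (hypothetical helper).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships; each row is normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(iters):
        # Centre update: mean of all points weighted by membership^m.
        centers = []
        for i in range(c):
            w = [u[k][i] ** m for k in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[k] * points[k][d] for k in range(n)) / total
                 for d in range(dim)]
            )

        # Membership update from squared distances to every centre.
        new_u = []
        for k in range(n):
            d2 = [sum((points[k][d] - centers[i][d]) ** 2 for d in range(dim))
                  for i in range(c)]
            if min(d2) == 0.0:
                # Point coincides with a centre: share full membership there.
                hits = sum(1 for v in d2 if v == 0.0)
                row = [1.0 / hits if v == 0.0 else 0.0 for v in d2]
            else:
                inv = [v ** (-1.0 / (m - 1.0)) for v in d2]
                total = sum(inv)
                row = [v / total for v in inv]
            new_u.append(row)

        # Stop once no membership value changes by more than tol.
        change = max(abs(new_u[k][i] - u[k][i])
                     for k in range(n) for i in range(c))
        u = new_u
        if change < tol:
            break

    return centers, u


# Toy "expression profiles": two tight groups plus one in-between point.
profiles = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0),
            (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
            (2.5, 2.5)]
centers, u = fuzzy_c_means(profiles, c=2)
for p, row in zip(profiles, u):
    print(p, [round(v, 3) for v in row])
```

In this sketch every profile receives a membership degree in each cluster, and the degrees for a profile sum to 1. A profile lying between the two groups therefore ends up with a split membership instead of being forced into a single cluster, as it would be under k-means.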




Clustering, Microarray, Gene Expression, Fuzzy Clustering, FCM, PFCM