Analysis of Gene Expression Microarray Dataset for Feature Selection

IJCA Proceedings on National Conference on Communication Technologies & its impact on Next Generation Computing 2012
© 2012 by IJCA Journal
CTNGC - Number 3
Year of Publication: 2012
Authors:
G. Baskar
P. Ponmuthuramalingam

G Baskar and P Ponmuthuramalingam. Article: Analysis of Gene Expression Microarray Dataset for Feature Selection. IJCA Proceedings on National Conference on Communication Technologies & its impact on Next Generation Computing 2012 CTNGC(3):33-35, November 2012. Full text available.

BibTeX

@article{key:article,
	author = {G. Baskar and P. Ponmuthuramalingam},
	title = {Article: Analysis of Gene Expression Microarray Dataset for Feature Selection},
	journal = {IJCA Proceedings on National Conference on Communication Technologies \& its impact on Next Generation Computing 2012},
	year = {2012},
	volume = {CTNGC},
	number = {3},
	pages = {33-35},
	month = {November},
	note = {Full text available}
}

Abstract

Microarrays are a powerful technology for biological exploration that makes it possible to measure the activity levels of thousands of genes simultaneously, and they are used in a variety of cancer studies. Clustering is an important data mining technique for extracting useful information from high-dimensional datasets. A wide range of clustering algorithms is available, and clustering remains an open area of research. The k-means algorithm, introduced by MacQueen in 1967, is one of the most basic and simplest partitioning clustering techniques. This paper presents sample weighting, together with an efficient margin-based sample weighting algorithm, to improve the stability of feature selection. We propose a weighted k-means to improve cluster stability and present an experimental evaluation of the proposed method. Experiments on microarray datasets show that feature selection algorithms such as SVM-RFE are more stable in gene selection.
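
The abstract does not spell out the weighted k-means procedure, so the following is only a minimal sketch of the general idea, assuming that each sample carries a non-negative weight and that each cluster center is updated as the weighted mean of its members. The function name `weighted_kmeans`, its parameters, and the example weights are hypothetical; in particular, the weights here are supplied by hand and are not computed by the margin-based scheme described in the paper.

```python
import random


def weighted_kmeans(points, weights, k, iters=100, seed=0):
    """Lloyd-style k-means in which every sample has a non-negative weight.

    Illustrative sketch only: the weights are assumed to be given, and the
    margin-based weighting from the paper is not reproduced here.
    """
    rng = random.Random(seed)
    # Initialise the centers from k distinct samples chosen at random.
    centers = [list(p) for p in rng.sample(points, k)]
    labels = None
    dim = len(points[0])
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        new_labels = []
        for p in points:
            dists = [sum((x - m) ** 2 for x, m in zip(p, c)) for c in centers]
            new_labels.append(dists.index(min(dists)))
        if new_labels == labels:
            break  # assignments are stable, so the centers will not move
        labels = new_labels
        # Update step: each center becomes the weighted mean of its members.
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            total = sum(weights[i] for i in members)
            if total > 0:  # leave a center unchanged if it has no weight
                centers[c] = [
                    sum(weights[i] * points[i][d] for i in members) / total
                    for d in range(dim)
                ]
    return centers, labels


# With a single cluster, the center is pulled toward the heavier sample.
samples = [(0.0,), (10.0,)]
centers, labels = weighted_kmeans(samples, weights=[3.0, 1.0], k=1)
print(centers)  # [[2.5]]
```

Setting all weights to 1 recovers ordinary k-means, which makes it easy to compare clusterings with and without sample weighting on the same data.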
