Research Article

A TOPSIS based Method for Gene Selection for Cancer Classification

by I. M. Abd-el Fattah, W. I. Khedr, K. M. Sallam
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 67 - Number 17
Year of Publication: 2013
Authors: I. M. Abd-el Fattah, W. I. Khedr, K. M. Sallam
DOI: 10.5120/11490-7195

I. M. Abd-el Fattah, W. I. Khedr, K. M. Sallam. A TOPSIS based Method for Gene Selection for Cancer Classification. International Journal of Computer Applications. 67, 17 (April 2013), 39-44. DOI=10.5120/11490-7195

@article{ 10.5120/11490-7195,
author = { I. M. Abd-el Fattah and W. I. Khedr and K. M. Sallam },
title = { A TOPSIS based Method for Gene Selection for Cancer Classification },
journal = { International Journal of Computer Applications },
issue_date = { April 2013 },
volume = { 67 },
number = { 17 },
month = { April },
year = { 2013 },
issn = { 0975-8887 },
pages = { 39-44 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume67/number17/11490-7195/ },
doi = { 10.5120/11490-7195 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:25:44.130790+05:30
%A I. M. Abd-el Fattah
%A W. I. Khedr
%A K. M. Sallam
%T A TOPSIS based Method for Gene Selection for Cancer Classification
%J International Journal of Computer Applications
%@ 0975-8887
%V 67
%N 17
%P 39-44
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Cancer classification based on microarray gene expression data is an important problem. This work proposes a new gene selection technique that combines TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) with the F-score method to select a subset of relevant genes. The technique first picks the most informative genes from the thousands available. The selected genes are then fed into four different classifiers, K-Nearest Neighbour (KNN), Decision Tree (DT), Support Vector Machine (SVM) and Naive Bayes (NB), yielding four hybrid cancer classification systems that are used to classify the microarray data sets. The goal of the proposed approach is to select the most informative subset of features (genes) so as to improve classification accuracy.
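The abstract names the two building blocks but does not say how they are combined, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes a two-class problem, the commonly used two-class F-score (between-class separation divided by the sum of within-class sample variances), and a TOPSIS ranking in which each gene is an alternative. The second criterion (absolute difference of class means), the equal weights, the toy expression values, and the names `f_score` and `topsis` are all hypothetical and chosen purely for illustration.

```python
import math

def f_score(pos, neg):
    """Two-class F-score of one gene from its expression values per class."""
    n_pos, n_neg = len(pos), len(neg)
    mean_pos = sum(pos) / n_pos
    mean_neg = sum(neg) / n_neg
    mean_all = (sum(pos) + sum(neg)) / (n_pos + n_neg)
    # Within-class spread: sample variances with an (n - 1) denominator.
    var_pos = sum((x - mean_pos) ** 2 for x in pos) / (n_pos - 1)
    var_neg = sum((x - mean_neg) ** 2 for x in neg) / (n_neg - 1)
    between = (mean_pos - mean_all) ** 2 + (mean_neg - mean_all) ** 2
    return between / (var_pos + var_neg)

def topsis(matrix, weights):
    """Closeness coefficient per row; every criterion is treated as a benefit.

    Assumes no criterion column is all zeros and that the rows are not
    all identical, otherwise a division by zero occurs.
    """
    n_crit = len(weights)
    # Vector-normalise each criterion column, then apply the weights.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    v = [[weights[j] * row[j] / norms[j] for j in range(n_crit)] for row in matrix]
    best = [max(r[j] for r in v) for j in range(n_crit)]   # ideal solution
    worst = [min(r[j] for r in v) for j in range(n_crit)]  # anti-ideal solution
    scores = []
    for r in v:
        d_best = math.dist(r, best)
        d_worst = math.dist(r, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores

# Hypothetical toy data: gene -> (expression in class 1, expression in class 2).
genes = {
    "gene_a": ([5.0, 6.0, 7.0], [1.0, 2.0, 3.0]),
    "gene_b": ([3.0, 4.0, 5.0], [3.0, 5.0, 4.0]),
    "gene_c": ([4.0, 5.0, 6.0], [2.0, 3.0, 4.0]),
}
names = list(genes)

# Decision matrix: one row per gene, columns = [F-score, |mean difference|].
matrix = []
for pos, neg in genes.values():
    mean_diff = abs(sum(pos) / len(pos) - sum(neg) / len(neg))
    matrix.append([f_score(pos, neg), mean_diff])

scores = topsis(matrix, weights=[0.5, 0.5])  # equal weights are an assumption
ranking = sorted(names, key=lambda g: scores[names.index(g)], reverse=True)
print(ranking)
```

In this sketch the top-ranked genes would be the ones passed on to a classifier such as KNN, DT, SVM or NB; how many genes to keep is left open here because the abstract does not state it.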

Index Terms

Computer Science
Information Sciences

Keywords

TOPSIS, Gene Selection, Cancer classification, Neural Network, Decision Tree, Naive Bayes, K-Nearest Neighbour