Research Article

A Comparative Study on Bioinformatics Feature Selection and Classification

by Amal Tamer, Amr Badr
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 43 - Number 3
Year of Publication: 2012
Authors: Amal Tamer, Amr Badr

Amal Tamer, Amr Badr. A Comparative Study on Bioinformatics Feature Selection and Classification. International Journal of Computer Applications. 43, 3 (April 2012), 5-8. DOI=10.5120/6081-8219

This paper presents an application of supervised machine learning approaches to the classification of colon cancer gene expression data. Established feature selection techniques based on principal component analysis (PCA), independent component analysis (ICA), genetic algorithms (GA) and support vector machines (SVM) are, for the first time, applied to this data set to support learning and classification. Different classifiers are implemented to investigate the impact of combining feature selection and classification methods. The classifiers implemented include K-Nearest Neighbors (KNN) and SVM. The results of the comparative studies demonstrate that effective feature selection is essential to the development of classifiers intended for use in high-dimensional domains. This research also shows that feature selection increases computational efficiency while improving classification accuracy.
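
The combination described above (reduce the feature space, then classify on a hold-out split) can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only the PCA and KNN pairing, uses NumPy, and runs on a synthetic matrix rather than the colon cancer data. The matrix size, the number of informative genes, the number of components, the value of k, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def make_synthetic(n_samples=60, n_genes=500, n_informative=20, seed=0):
    """Synthetic stand-in for an expression matrix (assumed sizes, not real data)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n_samples)
    X = rng.normal(size=(n_samples, n_genes))
    # Shift a few "informative" genes for class 1 so the classes are separable.
    X[:, :n_informative] += 2.0 * y[:, None]
    return X, y

def holdout_split(X, y, test_fraction=0.3, seed=0):
    """Random hold-out split into training and test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test, train = idx[:n_test], idx[n_test:]
    return X[train], y[train], X[test], y[test]

def pca_fit(X_train, n_components):
    """Fit PCA on the training data only, via SVD of the centered matrix."""
    mean = X_train.mean(axis=0)
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project samples onto the retained principal components."""
    return (X - mean) @ components.T

def knn_predict(X_train, y_train, X_test, k=3):
    """Binary KNN with squared Euclidean distance and majority vote (odd k)."""
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)

def run(n_components=5, k=3):
    """Return hold-out accuracy without and with PCA reduction."""
    X, y = make_synthetic()
    X_tr, y_tr, X_te, y_te = holdout_split(X, y)
    mean, comps = pca_fit(X_tr, n_components)
    Z_tr = pca_transform(X_tr, mean, comps)
    Z_te = pca_transform(X_te, mean, comps)
    acc_raw = float((knn_predict(X_tr, y_tr, X_te, k) == y_te).mean())
    acc_pca = float((knn_predict(Z_tr, y_tr, Z_te, k) == y_te).mean())
    return acc_raw, acc_pca

if __name__ == "__main__":
    acc_raw, acc_pca = run()
    print(f"KNN on all features: {acc_raw:.2f}")
    print(f"KNN on PCA features: {acc_pca:.2f}")
```

The projection is fitted on the training portion only and then applied to the held-out samples, so the test set does not influence the reduced feature space. Accuracies on this synthetic data say nothing about the results reported in the paper.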

Index Terms

Computer Science
Information Sciences


Hold Out PCA SVM KNN ICA Features Classification Feature Selection Accuracy Colon Cancer