Research Article

Kernel PCA-Enhanced Deep Learning for Cancer Classification in High-Dimensional Microarray Gene Expression Data

by Anil Kumar R.J., Veena M.N., Monica R., Nirmala M.S.
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 187 - Number 121
Year of Publication: 2026
10.5120/ijca6dcb99d70d67

Anil Kumar R.J., Veena M.N., Monica R., Nirmala M.S. Kernel PCA-Enhanced Deep Learning for Cancer Classification in High-Dimensional Microarray Gene Expression Data. International Journal of Computer Applications. 187, 121 (Jun 2026), 25-33. DOI=10.5120/ijca6dcb99d70d67

@article{ 10.5120/ijca6dcb99d70d67,
author = { Anil Kumar R.J., Veena M.N., Monica R., Nirmala M.S. },
title = { Kernel PCA-Enhanced Deep Learning for Cancer Classification in High-Dimensional Microarray Gene Expression Data },
journal = { International Journal of Computer Applications },
issue_date = { Jun 2026 },
volume = { 187 },
number = { 121 },
month = { Jun },
year = { 2026 },
issn = { 0975-8887 },
pages = { 25-33 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume187/number121/kernel-pca-enhanced-deep-learning-for-cancer-classification-in-high-dimensional-microarray-gene-expression-data/ },
doi = { 10.5120/ijca6dcb99d70d67 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2026-07-01T03:10:16.305499+05:30
%A Anil Kumar R.J.
%A Veena M.N.
%A Monica R.
%A Nirmala M.S.
%T Kernel PCA-Enhanced Deep Learning for Cancer Classification in High-Dimensional Microarray Gene Expression Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 187
%N 121
%P 25-33
%D 2026
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Gene expression datasets used for cancer analysis are frequently high-dimensional and complex, making accurate classification difficult. This work presents a unified machine learning approach for classifying cancer types using multiple benchmark gene expression datasets, including leukaemia, DLBCL, brain, breast cancer, Golub, and colon cancer. First, the datasets are pre-processed using standard feature scaling to reduce variations in gene expression values. To address the dimensionality problem, kernel principal component analysis (KPCA) with a radial basis function (RBF) kernel is employed to extract relevant nonlinear features. Next, the class labels are converted to a numerical format, and Min-Max normalization is applied to improve learning efficiency. The processed data is divided into training and testing sets, and a feedforward deep neural network is trained for cancer prediction. The model’s performance is evaluated using classification accuracy. The experimental results demonstrate that the proposed framework effectively handles high-dimensional gene expression data and achieves consistent classification performance across five cancer datasets.
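The sequence of steps described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation: the synthetic data from `make_classification`, the function name `run_pipeline`, the number of KPCA components, the hidden-layer sizes, the 70/30 split, and the use of `MLPClassifier` as the feedforward network are all assumptions chosen for illustration, since the abstract does not state these details.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import KernelPCA
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import LabelEncoder, MinMaxScaler, StandardScaler


def run_pipeline(X, labels, n_components=10, seed=0):
    """Hypothetical helper following the order of steps given in the abstract."""
    # 1. Standard feature scaling (zero mean, unit variance per gene).
    X_std = StandardScaler().fit_transform(X)

    # 2. Kernel PCA with an RBF kernel to extract nonlinear features.
    #    n_components is an assumed value, not taken from the paper.
    kpca = KernelPCA(n_components=n_components, kernel="rbf")
    X_kpca = kpca.fit_transform(X_std)

    # 3. Convert class labels to integers.
    y = LabelEncoder().fit_transform(labels)

    # 4. Min-Max normalization of the extracted features.
    X_norm = MinMaxScaler().fit_transform(X_kpca)

    # 5. Train/test split (assumed 70/30, stratified by class).
    X_train, X_test, y_train, y_test = train_test_split(
        X_norm, y, test_size=0.3, stratify=y, random_state=seed
    )

    # 6. Feedforward network; the layer sizes are assumptions.
    model = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500,
                          random_state=seed)
    model.fit(X_train, y_train)

    # 7. Evaluate with classification accuracy on the held-out set.
    return accuracy_score(y_test, model.predict(X_test)), X_norm


# Synthetic stand-in for a microarray dataset: few samples, many features.
X_demo, y_int = make_classification(n_samples=72, n_features=2000,
                                    n_informative=20, random_state=0)
labels_demo = np.where(y_int == 0, "ALL", "AML")  # illustrative class names

accuracy, features = run_pipeline(X_demo, labels_demo)
print(f"test accuracy: {accuracy:.3f}")
```

The sketch fits the scalers and KPCA on the full dataset before splitting, mirroring the order given in the abstract. In practice, fitting these transforms on the training portion only is a common way to avoid leaking test information into the features.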

Index Terms

Computer Science
Information Sciences

Keywords

RBF, DLBCL, KPCA, Gene expression datasets