Research Article

Comparative Prediction Performance with Support Vector Machine and Random Forest Classification Techniques

by Ashfaq Ahmed K, Sultan Aljahdali, Syed Naimatullah Hussain
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 69 - Number 11
Year of Publication: 2013
DOI: 10.5120/11885-7922

Ashfaq Ahmed K, Sultan Aljahdali, Syed Naimatullah Hussain. Comparative Prediction Performance with Support Vector Machine and Random Forest Classification Techniques. International Journal of Computer Applications 69, 11 (May 2013), 12-16. DOI=10.5120/11885-7922

@article{10.5120/11885-7922,
  author     = {Ashfaq Ahmed K and Sultan Aljahdali and Syed Naimatullah Hussain},
  title      = {Comparative Prediction Performance with Support Vector Machine and Random Forest Classification Techniques},
  journal    = {International Journal of Computer Applications},
  issue_date = {May 2013},
  volume     = {69},
  number     = {11},
  month      = {May},
  year       = {2013},
  issn       = {0975-8887},
  pages      = {12-16},
  numpages   = {5},
  url        = {https://ijcaonline.org/archives/volume69/number11/11885-7922/},
  doi        = {10.5120/11885-7922},
  publisher  = {Foundation of Computer Science (FCS), NY, USA},
  address    = {New York, USA}
}
Abstract

Machine learning with classification can be applied effectively to many problems, especially those involving complex measurements. Classification techniques can therefore be used to predict diseases such as cancer, liver disorders, and heart disease, all of which involve complex measurements, and interest in such predictive diagnosis is growing. It has also been established that classification and learning methods can effectively improve the accuracy of predicting a disease and its recurrence. In the present work, two machine learning techniques, Support Vector Machine (SVM) and Random Forest (RF), are used to learn, classify, and compare cancer, liver, and heart disease data with varying kernels and kernel parameters. The results obtained with SVM and RF are compared across the different data sets, with the kernel results tuned through proper parameter selection, and then analyzed to establish the better learning technique for prediction.
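As a rough illustration of the workflow the abstract describes, the sketch below tunes SVM kernels (RBF and sigmoid, matching the keywords) and kernel parameters, then compares cross-validated accuracy against a Random Forest. The choice of scikit-learn and of its bundled Wisconsin breast cancer data is an assumption for illustration only; the paper does not state its tooling, exact data sets, or parameter grids.

# Minimal sketch, assuming scikit-learn and its built-in breast cancer data
# as stand-ins for the paper's disease data sets and tooling.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

# SVM with RBF and sigmoid kernels; C and gamma are tuned by grid search,
# mirroring the "varying kernels and kernel parameters" comparison.
svm_grid = GridSearchCV(
    make_pipeline(StandardScaler(), SVC()),
    param_grid={
        "svc__kernel": ["rbf", "sigmoid"],
        "svc__C": [0.1, 1, 10, 100],
        "svc__gamma": ["scale", 0.01, 0.001],
    },
    cv=5,
)
svm_grid.fit(X, y)
print("SVM best params:", svm_grid.best_params_)
print("SVM CV accuracy: %.3f" % svm_grid.best_score_)

# Random Forest under the same cross-validation for comparison.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
print("RF  CV accuracy: %.3f" % cross_val_score(rf, X, y, cv=5).mean())

The grids and fold count here are illustrative defaults, not the values used in the paper.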

Index Terms

Computer Science
Information Sciences

Keywords

Support Vector Machine, Random Forest, Kernels, Radial Basis Function, Sigmoid