Improving Classification Accuracy based on Random Forest Model with Uncorrelated High Performing Trees

International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 101 - Number 13
Year of Publication: 2014
S. Bharathidason
C. Jothi Venkataeswaran

S. Bharathidason and C. Jothi Venkataeswaran. Improving Classification Accuracy based on Random Forest Model with Uncorrelated High Performing Trees. International Journal of Computer Applications 101(13):26-30, September 2014.

@article{bharathidason2014improving,
	author = {S. Bharathidason and C. Jothi Venkataeswaran},
	title = {Improving Classification Accuracy based on Random Forest Model with Uncorrelated High Performing Trees},
	journal = {International Journal of Computer Applications},
	year = {2014},
	volume = {101},
	number = {13},
	pages = {26-30},
	month = {September}
}

Random forests achieve high classification performance through an ensemble of decision trees, each grown on a randomly selected subspace of the data. The performance of an ensemble learner depends strongly both on the accuracy of each component learner and on the diversity among those components. In a random forest, randomization can produce poorly performing trees and highly correlated trees, which degrade the ensemble's classification decisions. In this paper an attempt is made to improve the performance of the model by retaining only uncorrelated, high-performing trees in the random forest. Experimental results show that the classification accuracy of the random forest can be further enhanced in this way.
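The pruning idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact algorithm: it scores each tree of a scikit-learn forest on held-out data, then greedily keeps trees that are both accurate and weakly correlated (measured here, as an assumption, by pairwise prediction agreement below a threshold) with the trees already selected.

```python
# Hedged sketch: prune a random forest to accurate, weakly correlated trees.
# The 0.95 agreement threshold and the cap of 30 trees are illustrative choices.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

# Per-tree predictions on the validation set, and per-tree accuracy.
preds = np.array([tree.predict(X_val) for tree in forest.estimators_])
acc = (preds == y_val).mean(axis=1)

# Greedy selection: visit trees from most to least accurate; keep a tree only
# if its predictions agree with every already-selected tree less than 95% of
# the time (i.e. it adds diversity to the pruned ensemble).
selected = []
for i in np.argsort(-acc):
    if all((preds[i] == preds[j]).mean() < 0.95 for j in selected):
        selected.append(i)
    if len(selected) == 30:
        break

# Majority vote over the pruned ensemble.
vote = (preds[selected].mean(axis=0) > 0.5).astype(int)
pruned_acc = (vote == y_val).mean()
```

In practice the agreement threshold and ensemble size would be tuned, and out-of-bag predictions could replace the held-out split so that no training data is sacrificed for tree scoring.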
