Prediction Improvement using Optimal Scaling on Random Forest Models for Highly Categorical Data

Saurabh Mangal; Aditya Shankar

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 108 - Number 3

Year of Publication: 2014

Authors: Saurabh Mangal, Aditya Shankar

10.5120/18895-0183

Saurabh Mangal, Aditya Shankar. Prediction Improvement using Optimal Scaling on Random Forest Models for Highly Categorical Data. International Journal of Computer Applications. 108, 3 (December 2014), 40-43. DOI=10.5120/18895-0183

@article{10.5120/18895-0183,
  author = { Saurabh Mangal and Aditya Shankar },
  title = { Prediction Improvement using Optimal Scaling on Random Forest Models for Highly Categorical Data },
  journal = { International Journal of Computer Applications },
  issue_date = { December 2014 },
  volume = { 108 },
  number = { 3 },
  month = { December },
  year = { 2014 },
  issn = { 0975-8887 },
  pages = { 40-43 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume108/number3/18895-0183/ },
  doi = { 10.5120/18895-0183 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-06T22:42:03.882878+05:30
%A Saurabh Mangal
%A Aditya Shankar
%T Prediction Improvement using Optimal Scaling on Random Forest Models for Highly Categorical Data
%J International Journal of Computer Applications
%@ 0975-8887
%V 108
%N 3
%P 40-43
%D 2014
%I Foundation of Computer Science (FCS), NY, USA

Abstract

Random Forests are an effective and increasingly popular ensemble method, particularly for binary classification problems. One of the most widely used Random Forest algorithms is Breiman and Cutler's, which forms the basis of the "randomForest" package in R. However, a Random Forest model built with this package has a limitation that is especially pronounced when computational power is limited: it cannot handle highly categorical data. In this paper, we present one of the several techniques we tried for improving the performance of a Random Forest model on highly categorical data. The improvement was achieved solely through advanced pre-processing techniques such as Optimal Scaling, hence the title of the paper.
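The abstract does not spell out the authors' procedure, so the following is only an illustrative sketch of the general idea behind optimal scaling as a pre-processing step: each level of a categorical predictor is replaced by a numeric quantification, so that the tree learner sees one numeric column instead of a factor with many levels. The sketch assumes the simplest case, a single nominal predictor and a binary target, where the quantifications that maximise correlation with the target are the per-level target means (standardised here). The function names, the toy data, and the choice of Python rather than R are all assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def nominal_quantifications(categories, target):
    """Map each category level to a numeric score (hypothetical helper).

    Assumption: one nominal predictor, numeric/binary target. In that case
    the scores maximising correlation with the target are the per-level
    target means, standardised to zero mean and unit variance over the rows.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, target):
        sums[cat] += y
        counts[cat] += 1
    means = {cat: sums[cat] / counts[cat] for cat in sums}

    n = len(categories)
    grand_mean = sum(means[c] for c in categories) / n
    variance = sum((means[c] - grand_mean) ** 2 for c in categories) / n
    scale = sqrt(variance) if variance > 0 else 1.0  # guard: constant column
    return {cat: (m - grand_mean) / scale for cat, m in means.items()}


def encode(categories, scores, default=0.0):
    """Replace each level by its score; unseen levels get the overall mean (0)."""
    return [scores.get(cat, default) for cat in categories]


# Toy data, made up for illustration: a categorical column and a binary label.
city = ["A", "A", "B", "B", "B", "C", "C", "D"]
label = [1, 1, 0, 1, 0, 0, 0, 1]

scores = nominal_quantifications(city, label)
city_numeric = encode(city, scores)  # numeric column usable by any tree learner
```

In practice the quantifications would be estimated on the training rows only and then applied to held-out rows with `encode`, since scores derived from the target would otherwise leak label information into the evaluation.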

Index Terms

Computer Science

Information Sciences

Keywords

Ensemble Methods; Random Forest; Prediction with Categorical Variables; Optimal Scaling; Classification; Machine Learning; Non-Linear Categorical Prediction