Theoretical Study of Decision Tree Algorithms to Identify Pivotal Factors for Performance Improvement: A Review
DOI: 10.5120/ijca2016909926
Pooja Gulati, Amita Sharma and Manish Gupta. Theoretical Study of Decision Tree Algorithms to Identify Pivotal Factors for Performance Improvement: A Review. International Journal of Computer Applications 141(14):19-25, May 2016.
@article{10.5120/ijca2016909926,
  author = {Pooja Gulati and Amita Sharma and Manish Gupta},
  title = {Theoretical Study of Decision Tree Algorithms to Identify Pivotal Factors for Performance Improvement: A Review},
  journal = {International Journal of Computer Applications},
  issue_date = {May 2016},
  volume = {141},
  number = {14},
  month = {May},
  year = {2016},
  issn = {0975-8887},
  pages = {19-25},
  numpages = {7},
  url = {http://www.ijcaonline.org/archives/volume141/number14/24852-2016909926},
  doi = {10.5120/ijca2016909926},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}
Abstract
A decision tree is a data mining technique used for classification and forecasting. It is a supervised learning algorithm that follows a greedy approach and builds the tree in a top-down manner. A decision tree is a white-box model that classifies data in a hierarchical structure, which makes the data easy to represent and understand. It can handle large databases and works well with both numerical and categorical variables. A variety of decision tree algorithms have been proposed in the literature, such as ID3 (Iterative Dichotomiser 3), C4.5 (the successor of ID3), CART (Classification and Regression Tree), and CHAID (Chi-squared Automatic Interaction Detector). Each of these algorithms has specific mechanisms based on certain criteria. Studying these criteria is essential for the analysis of decision tree (DT) algorithms. The aim of this paper is to identify and examine these vital criteria, or factors, of DT algorithms. The major contribution of this review is to provide a path for selecting a specific factor to improve a DT algorithm according to the requirement or problem at hand.
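As an illustration of the kind of criteria referred to above, the following is a minimal Python sketch. It assumes the standard textbook definitions of entropy, information gain (commonly associated with ID3), gain ratio (C4.5), and the Gini index (CART). The function names and the toy data are hypothetical and are not taken from the paper. It shows only a single greedy step, choosing the attribute to split on at one node, not a full tree-building algorithm.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def partition(rows, labels, attr):
    """Group the class labels by the value each row takes for `attr`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    return groups


def information_gain(rows, labels, attr):
    """Reduction in entropy obtained by splitting on a categorical attribute."""
    n = len(labels)
    groups = partition(rows, labels, attr)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(rows, labels, attr):
    """Information gain normalised by the split information of the attribute."""
    n = len(labels)
    groups = partition(rows, labels, attr)
    split_info = -sum(len(g) / n * log2(len(g) / n) for g in groups.values())
    if split_info == 0:
        return 0.0  # the attribute has a single value, so it cannot split the node
    return information_gain(rows, labels, attr) / split_info


def best_attribute(rows, labels, attrs, score=information_gain):
    """Greedy step: pick the attribute with the highest score at this node."""
    return max(attrs, key=lambda a: score(rows, labels, a))


# Hypothetical toy data: the class is fully determined by "outlook".
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rain", "windy": "no"},
    {"outlook": "rain", "windy": "yes"},
]
labels = ["no", "no", "yes", "yes"]

print(best_attribute(rows, labels, ["outlook", "windy"]))
```

Passing `score=gain_ratio` to `best_attribute` swaps the selection criterion without changing the greedy procedure, which is one way to see the splitting criterion as a separable factor of a DT algorithm.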
Keywords
Data Mining, Decision Tree Technique, Decision Tree Methodology, Decision Tree Factors.