Research Article

An Approach to Automation Selection of Decision Tree based on Training Data Set

by D. Saravanakumar, N. Ananthi, M. Devi
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 64 - Number 21
Year of Publication: 2013
DOI: 10.5120/10755-5500

D. Saravanakumar, N. Ananthi, M. Devi. An Approach to Automation Selection of Decision Tree based on Training Data Set. International Journal of Computer Applications 64, 21 (February 2013), 1-4. DOI=10.5120/10755-5500

@article{ 10.5120/10755-5500,
author = { D. Saravanakumar, N. Ananthi, M. Devi },
title = { An Approach to Automation Selection of Decision Tree based on Training Data Set },
journal = { International Journal of Computer Applications },
issue_date = { February 2013 },
volume = { 64 },
number = { 21 },
month = { February },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-4 },
numpages = { 4 },
url = { https://ijcaonline.org/archives/volume64/number21/10755-5500/ },
doi = { 10.5120/10755-5500 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:17:11.620476+05:30
%A D. Saravanakumar
%A N. Ananthi
%A M. Devi
%T An Approach to Automation Selection of Decision Tree based on Training Data Set
%J International Journal of Computer Applications
%@ 0975-8887
%V 64
%N 21
%P 1-4
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

In data mining applications, very large training data sets with several million records are common. Decision trees are a powerful and widely used technique for both classification and prediction problems. Many decision tree construction algorithms have been proposed to handle large or small training data: some perform best on large data sets and others on small ones, each working best under its own criteria. Decision tree algorithms classify categorical and continuous attributes well, but most handle only smaller data sets efficiently and consume more time on large ones. Supervised Learning In Quest (SLIQ) and Scalable Parallelizable Induction of Decision Trees (SPRINT) handle very large data sets; however, SLIQ requires that the class labels be available in main memory beforehand, whereas SPRINT is designed for large data sets and removes these memory restrictions. This research work deals with the automatic selection of a decision tree algorithm based on the training data set size. The proposed system first estimates the training data set size using a mathematical measure, and the resulting size is checked against the available memory space. If sufficient memory is available, tree construction proceeds. After the data are classified, the accuracy of the classifier is estimated. The main advantages of the proposed method are that the system takes less time and avoids memory problems.
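The size-based selection described in the abstract can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the paper's actual mathematical measure and thresholds are not given on this page, so the byte-per-value estimate, the function names, and the fallback to a SPRINT-style learner are all assumptions for illustration.

```python
def estimate_dataset_bytes(n_records, n_attributes, bytes_per_value=8):
    """Rough in-memory footprint of a training set of n_records x n_attributes.
    The 8-bytes-per-value figure is an illustrative assumption."""
    return n_records * n_attributes * bytes_per_value


def select_decision_tree_algorithm(n_records, n_attributes, available_memory_bytes):
    """Choose a tree-construction strategy by comparing the estimated
    training-set size against available main memory, mirroring the
    automatic selection idea in the abstract."""
    size = estimate_dataset_bytes(n_records, n_attributes)
    if size <= available_memory_bytes:
        # Fits in memory: an ordinary in-memory learner suffices.
        return "in-memory"
    # Too large for main memory: a scalable, disk-resident learner such as
    # SPRINT avoids SLIQ's requirement that class labels stay in memory.
    return "SPRINT"


# Example: 5 million records with 20 attributes against 512 MB of memory.
# The estimate is 5,000,000 * 20 * 8 = 800 MB, which exceeds 512 MB.
choice = select_decision_tree_algorithm(5_000_000, 20, 512 * 1024**2)
print(choice)  # -> SPRINT
```

Smaller data sets that fit within the memory budget would instead return the in-memory strategy, so the check degrades gracefully as data grows.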

References
  1. Amir Bar-Or, Daniel Keren, Assaf Schuster, and Ran Wolff, "Hierarchical Decision Tree Induction in Distributed Genomic Databases", IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 8, August 2005.
  2. Arun K. Pujari, "Data Mining Techniques", Universities Press, 2001.
  3. M. Banerjee and M. K. Chakraborty, "Rough Logics: A Survey with Further Directions", Rough Sets Analysis, Physica-Verlag, Heidelberg, 1997.
  4. L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone, "Classification and Regression Trees", Wadsworth, Belmont, 1984.
  5. J. Bala, J. Huang, H. Vafaie, K. DeJong, and H. Wechsler, "Hybrid Learning Using Genetic Algorithms and Decision Trees for Pattern Classification", 2003.
  6. Carla E. Brodley and Paul E. Utgoff, "Multivariate versus Univariate Decision Trees", COINS Technical Report 92-8, January 1992.
  7. Andrew B. Nobel, "Analysis of a Complexity-Based Pruning Scheme for Classification Trees", IEEE Transactions on Information Theory, vol. 48, pp. 2362-2368, 2002.
  8. Rakesh Agrawal, Tomasz Imielinski, and Arun Swami, "Database Mining: A Performance Perspective", IEEE Transactions on Knowledge and Data Engineering, 5(6):914-925, December 1993.
  9. Donato Malerba, Floriana Esposito, and Giovanni Semeraro, "A Further Comparison of Simplification Methods for Decision-Tree Induction", Springer-Verlag, 1996.
  10. Floriana Esposito, Donato Malerba, and Giovanni Semeraro, "A Comparative Analysis of Methods for Pruning Decision Trees", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 19, no. 5, May 1997.
  11. Johannes Gehrke, Raghu Ramakrishnan, and Venkatesh Ganti, "RainForest - A Framework for Fast Decision Tree Construction of Large Datasets", Proceedings of the 24th VLDB Conference, New York, USA, 1998.
  12. V. Corruble, D. E. Brown, and C. L. Pittard, "A Comparison of Decision Classifiers with Backpropagation Neural Networks for Multimodal Classification Problems", Pattern Recognition, 26:953-961, 1993.
  13. Deborah R. Carvalho and Alex A. Freitas, "A Hybrid Decision Tree/Genetic Algorithm for Coping with the Problem of Small Disjuncts in Data Mining", 2004.
  14. Haixun Wang and Carlo Zaniolo, "CMP: A Fast Decision Tree Classifier Using Multivariate Predictions".
  15. D. Hand, H. Mannila, and P. Smyth, "Principles of Data Mining", MIT Press, Cambridge, MA, 2001.
Index Terms

Computer Science
Information Sciences

Keywords

Decision Tree Algorithm, Classification, Data Mining, Data Set