Research Article

C5.0 Algorithm to Improved Decision Tree with Feature Selection and Reduced Error Pruning

by Rutvija Pandya, Jayati Pandya
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 117 - Number 16
Year of Publication: 2015
Authors: Rutvija Pandya, Jayati Pandya
10.5120/20639-3318

Rutvija Pandya, Jayati Pandya. C5.0 Algorithm to Improved Decision Tree with Feature Selection and Reduced Error Pruning. International Journal of Computer Applications. 117, 16 (May 2015), 18-21. DOI=10.5120/20639-3318

@article{ 10.5120/20639-3318,
author = { Rutvija Pandya and Jayati Pandya },
title = { C5.0 Algorithm to Improved Decision Tree with Feature Selection and Reduced Error Pruning },
journal = { International Journal of Computer Applications },
issue_date = { May 2015 },
volume = { 117 },
number = { 16 },
month = { May },
year = { 2015 },
issn = { 0975-8887 },
pages = { 18-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume117/number16/20639-3318/ },
doi = { 10.5120/20639-3318 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:59:33.522351+05:30
%A Rutvija Pandya
%A Jayati Pandya
%T C5.0 Algorithm to Improved Decision Tree with Feature Selection and Reduced Error Pruning
%J International Journal of Computer Applications
%@ 0975-8887
%V 117
%N 16
%P 18-21
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining is a knowledge discovery process that analyzes data and generates useful patterns from it. Classification is a technique that uses pre-classified examples to classify new instances. A decision tree is used to model the classification process: it classifies instances using their feature values, and each node in the tree represents a feature of the instance to be classified. This research work compares ID3, C4.5 and C5.0 with each other. Among these classifiers, C5.0 gives more accurate and efficient results. This work uses C5.0 as the base classifier, so the proposed system classifies the result set with high accuracy and low memory usage. The classification process generates fewer rules than other techniques, which keeps memory usage low. The error rate is low, so accuracy on the result set is high, and because a pruned tree is generated, the system produces results faster than other techniques. The proposed system uses a C5.0 classifier that performs feature selection and reduced error pruning, both of which are described in this paper. Feature selection assumes that the data contains many redundant features; features that provide no useful information in any context are removed, and the relevant features that are useful for model construction are selected. Cross-validation gives a more reliable estimate of predictive performance. The overfitting problem of the decision tree is addressed by reduced error pruning. The proposed system achieves a 1 to 3% gain in accuracy, a reduced error rate, and a decision tree constructed in less time.
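
To make the pruning step mentioned in the abstract concrete, the following is a minimal sketch of generic reduced error pruning in Python. It is not the authors' implementation and not the C5.0 code: the `Node` class, the function names, the hand-built tree and the tiny validation set are all hypothetical and exist only for illustration. The sketch assumes that each node stores the majority class of the training data that reached it, and it replaces a subtree with a leaf whenever doing so does not increase the error on a held-out validation set.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Example = Tuple[Dict[str, str], str]  # (feature values, class label)


@dataclass
class Node:
    label: str                     # majority training class at this node (assumed given)
    feature: Optional[str] = None  # None means the node is a leaf
    children: Dict[str, "Node"] = field(default_factory=dict)


def predict(node: Node, x: Dict[str, str]) -> str:
    # Follow branches until a leaf or an unseen feature value is reached.
    while node.feature is not None and x.get(node.feature) in node.children:
        node = node.children[x[node.feature]]
    return node.label


def errors(node: Node, data: List[Example]) -> int:
    # Number of misclassified examples in data.
    return sum(predict(node, x) != y for x, y in data)


def reduced_error_prune(node: Node, validation: List[Example]) -> Node:
    if node.feature is None:
        return node
    # Prune bottom-up: each child only sees the validation rows that reach it.
    for value, child in list(node.children.items()):
        subset = [(x, y) for x, y in validation if x.get(node.feature) == value]
        node.children[value] = reduced_error_prune(child, subset)
    leaf = Node(label=node.label)
    # Keep the leaf when it is at least as good as the subtree on validation data.
    if errors(leaf, validation) <= errors(node, validation):
        return leaf
    return node


# Hypothetical tree: the "windy" split under "sunny" is assumed to be noise.
tree = Node(label="yes", feature="outlook", children={
    "sunny": Node(label="yes", feature="windy", children={
        "true": Node(label="no"),
        "false": Node(label="yes"),
    }),
    "rainy": Node(label="no"),
})

# Hypothetical held-out validation examples.
validation = [
    ({"outlook": "sunny", "windy": "true"}, "yes"),
    ({"outlook": "sunny", "windy": "false"}, "yes"),
    ({"outlook": "rainy", "windy": "true"}, "no"),
    ({"outlook": "rainy", "windy": "false"}, "no"),
]

pruned = reduced_error_prune(tree, validation)
```

In this toy case the `windy` subtree misclassifies one validation example while a single leaf misclassifies none, so the subtree is collapsed. The split on `outlook` is kept because removing it would raise the validation error.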

Index Terms

Computer Science
Information Sciences

Keywords

REP, Decision Tree induction, C5 classifier, KNN, SVM