Research Article

Handling Missing Value in Decision Tree Algorithm

by Preeti Patidar, Anshu Tiwari
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 70 - Number 13
Year of Publication: 2013
Authors: Preeti Patidar, Anshu Tiwari
DOI: 10.5120/12023-8063

Preeti Patidar, Anshu Tiwari. Handling Missing Value in Decision Tree Algorithm. International Journal of Computer Applications. 70, 13 (May 2013), 31-36. DOI=10.5120/12023-8063

@article{ 10.5120/12023-8063,
author = { Preeti Patidar and Anshu Tiwari },
title = { Handling Missing Value in Decision Tree Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { May 2013 },
volume = { 70 },
number = { 13 },
month = { May },
year = { 2013 },
issn = { 0975-8887 },
pages = { 31-36 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume70/number13/12023-8063/ },
doi = { 10.5120/12023-8063 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:32:46.874976+05:30
%A Preeti Patidar
%A Anshu Tiwari
%T Handling Missing Value in Decision Tree Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 70
%N 13
%P 31-36
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Nowadays, decision making and large-scale data analysis are carried out using computer applications. Such applications use data mining techniques to analyze the data. Research domains such as management, engineering, medicine and education use these techniques frequently. Educational data mining is an emerging discipline that applies data mining tools and techniques to educational data in order to study the data available in the educational field and bring out the hidden knowledge in it. In this research work, data mining techniques are used to make smart decisions for students and to analyze student performance in the educational domain; the analysis and decision making use the C5.0 decision tree. A comparative study is done on ID3, C4.5 and C5.0. Among these classifiers, C5.0 gives more accurate and efficient output at comparatively high speed. C5.0 also needs less memory to store the rule set because it generates a smaller decision tree. The proposed system uses C5.0 as its base classifier and therefore supports high accuracy, good speed and low memory usage: memory usage is lower than with the other techniques because fewer rules are generated, accuracy is high because the error rate on unseen cases is low, and classification is fast because the trees are pruned.
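The abstract does not spell out how missing values enter the split criterion, so the following is only a minimal sketch of one common C4.5-style treatment, not necessarily the authors' method: information gain is computed over the cases whose attribute value is known and then scaled by the fraction of known cases. The function names `entropy` and `gain_with_missing`, the use of `None` as the missing marker, and the small student data set are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_with_missing(values, labels, missing=None):
    """Information gain of one attribute, discounted for missing values.

    The gain is computed only over rows where the attribute is known and
    is then multiplied by the fraction of rows with a known value.
    """
    known = [(v, y) for v, y in zip(values, labels) if v is not missing]
    if not known:
        return 0.0
    known_labels = [y for _, y in known]

    # Group the class labels by attribute value (known rows only).
    groups = {}
    for v, y in known:
        groups.setdefault(v, []).append(y)

    # Weighted entropy remaining after splitting on the attribute.
    remainder = sum(len(g) / len(known) * entropy(g) for g in groups.values())
    fraction_known = len(known) / len(values)
    return fraction_known * (entropy(known_labels) - remainder)


# Hypothetical student records: one attendance value is missing (None).
attendance = ["high", "high", "low", "low", None, "high"]
result = ["pass", "pass", "fail", "fail", "pass", "pass"]
print(round(gain_with_missing(attendance, result), 4))
```

In this sketch an attribute with many missing values is penalized in proportion to how many cases it cannot account for, which is one reason a tree learner may prefer a slightly less informative but more complete attribute.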

Index Terms

Computer Science
Information Sciences

Keywords

Data Mining, Decision Tree, Educational Data Mining, C4.5 and C5.0 Algorithm