
Performance Analysis on Uncertain Data using Decision Tree

International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 96 - Number 7
Year of Publication: 2014
Authors: Bhosale J. D. and Patil B. M.
DOI: 10.5120/16805-6529

Bhosale J. D. and Patil B. M. Article: Performance Analysis on Uncertain Data using Decision Tree. International Journal of Computer Applications 96(7):15-19, June 2014. Full text available.

BibTeX

@article{key:article,
	author = {Bhosale J. D. and Patil B. M.},
	title = {Article: Performance Analysis on Uncertain Data using Decision Tree},
	journal = {International Journal of Computer Applications},
	year = {2014},
	volume = {96},
	number = {7},
	pages = {15-19},
	month = {June},
	note = {Full text available}
}

Abstract

Data uncertainty is common in emerging applications such as sensor networks, moving-object databases, and medical and biological data. It can arise from several sources, including limited measurement precision, outdated data sources, imprecise measurements, and transmission errors. Classification is one of the most popular data mining techniques, and decision trees are widely used to classify certain (precise) data. In this paper, however, we apply decision tree classification to uncertain data taken from the UCI Machine Learning Repository. We propose a decision-tree-based classification method for uncertain data, constructing the tree using entropy and information gain computed over uncertain data intervals. We also apply pruning techniques that improve the efficiency of the decision tree, and our experiments show that they significantly reduce tree-construction time.
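The core idea described above — computing entropy and information gain while accounting for uncertain attribute intervals — can be sketched as follows. This is an illustrative sketch, not the authors' implementation: it assumes each uncertain attribute value is given as an interval with a uniform distribution, so a sample contributes fractional probability mass to each side of a candidate split.

```python
import math

def entropy(class_weights):
    """Shannon entropy of (possibly fractional) class counts."""
    total = sum(class_weights.values())
    if total == 0:
        return 0.0
    h = 0.0
    for w in class_weights.values():
        if w > 0:
            p = w / total
            h -= p * math.log2(p)
    return h

def information_gain(samples, threshold):
    """Information gain of splitting interval-valued samples at `threshold`.

    Each sample is ((lo, hi), label): the attribute value is uncertain and
    assumed uniform on [lo, hi], so the sample goes left with probability
    equal to the interval mass below the threshold.
    """
    parent, left, right = {}, {}, {}
    for (lo, hi), label in samples:
        parent[label] = parent.get(label, 0.0) + 1.0
        if hi <= lo:  # degenerate interval: a precise point value
            frac_left = 1.0 if lo <= threshold else 0.0
        else:
            frac_left = min(max((threshold - lo) / (hi - lo), 0.0), 1.0)
        left[label] = left.get(label, 0.0) + frac_left
        right[label] = right.get(label, 0.0) + (1.0 - frac_left)
    n = sum(parent.values())
    n_left, n_right = sum(left.values()), sum(right.values())
    return (entropy(parent)
            - (n_left / n) * entropy(left)
            - (n_right / n) * entropy(right))

# Two classes whose intervals are well separated: a split at 3.5 is pure.
data = [((0.0, 2.0), "A"), ((1.0, 3.0), "A"),
        ((4.0, 6.0), "B"), ((5.0, 7.0), "B")]
print(round(information_gain(data, 3.5), 4))  # → 1.0
```

A tree builder would evaluate this gain over candidate thresholds (e.g. interval endpoints) and recurse on the fractional partitions; pruning candidate thresholds whose intervals cannot change the winning split is one way such a construction is made faster.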

References

  • http://archive.ics.uci.edu/ml/datasets.html
  • J. R. Quinlan, "Induction of Decision Trees," Machine Learning, vol. 1, no. 1, pp. 81-106, 1986.
  • J. R. Quinlan, C4.5: Programs for Machine Learning. Morgan Kaufmann, 1993.
  • C. L. Tsien, I. S. Kohane, and N. McIntosh, "Multiple Signal Integration by Decision Tree Induction to Detect Artefacts in the Neonatal Intensive Care Unit," Artificial Intelligence in Medicine, vol. 19, no. 3, pp. 189-202, 2000.
  • W. Street, W. Wolberg, and O. Mangasarian, "Nuclear Feature Extraction for Breast Tumor Diagnosis," Proc. SPIE, pp. 861-870, http://citeseer.ist.psu.edu/street93nuclear.html, 1993.
  • L. Breiman, "Technical Note: Some Properties of Splitting Criteria," Machine Learning, vol. 24, no. 1, pp. 41-47, 1996.
  • P. Langley, W. Iba, and K. Thompson, "An Analysis of Bayesian Classifiers," Proc. Tenth National Conference on Artificial Intelligence, pp. 223-228, 1992.
  • A. Bellaachia and E. Guven, "Predicting Breast Cancer Survivability Using Data Mining Techniques," www.siam.org/meetings/sdm06/workproceed/bellaachia.pdf, accessed Dec. 6, 2010.
  • T. Elomaa and J. Rousu, "General and Efficient Multisplitting of Numerical Attributes," Machine Learning, vol. 36, no. 3, pp. 201-244, 1999.
  • U. M. Fayyad and K. B. Irani, "On the Handling of Continuous-Valued Attributes in Decision Tree Generation," Machine Learning, 1992.
  • T. Elomaa and J. Rousu, "Efficient Multisplitting Revisited: Elimination of Partition Candidates," Data Mining and Knowledge Discovery, vol. 8, no. 2, pp. 97-126, 2004.
  • L. Hawarah, A. Simonet, and M. Simonet, "Dealing with Missing Values in a Probabilistic Decision Tree during Classification," Proc. Second International Workshop on Mining Complex Data, pp. 325-329, 2006.
  • T. M. Mitchell, Machine Learning. McGraw-Hill, 1997.