An Ensemble approach on Missing Value Handling in Hepatitis Disease Dataset

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2015
Authors:
Sridevi Radhakrishnan, D. Shanmuga Priyaa
DOI: 10.5120/ijca2015907197

Sridevi Radhakrishnan and D. Shanmuga Priyaa. An Ensemble approach on Missing Value Handling in Hepatitis Disease Dataset. International Journal of Computer Applications 130(17):23-27, November 2015. Published by Foundation of Computer Science (FCS), NY, USA.

BibTeX

@article{key:article,
	author = {Sridevi Radhakrishnan and D. Shanmuga Priyaa},
	title = {An Ensemble approach on Missing Value Handling in Hepatitis Disease Dataset},
	journal = {International Journal of Computer Applications},
	year = {2015},
	volume = {130},
	number = {17},
	pages = {23-27},
	month = {November},
	note = {Published by Foundation of Computer Science (FCS), NY, USA}
}

Abstract

Missing value imputation is a major task in data pre-processing, one of the primary stages of data mining, and it matters particularly in hepatitis disease diagnosis. Many health datasets are incomplete. Simply removing the incomplete cases from the original dataset can create more problems than it solves. An appropriate missing value imputation technique can help generate high-quality datasets for better analysis in clinical trials. This paper investigates the use of a machine learning technique as a missing value imputation process for incomplete Hepatitis data. Mean/mode imputation, ID3 algorithm imputation, decision tree imputation and the proposed bootstrap aggregation (bagging) based imputation are used to impute missing values, and the resulting datasets are classified using KNN. The experiment shows that classifier performance improves when the bagging-based imputation algorithm is used to predict missing attribute values.
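To make the contrast between mode imputation and bagging-based imputation concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the function names, the `MISSING` marker, the toy rows, and the one-split decision stump used as the base learner are all assumptions chosen for illustration, and the data is invented rather than taken from the Hepatitis dataset.

```python
import random
from collections import Counter

MISSING = None  # hypothetical marker for a missing attribute value


def mode_impute(column):
    """Baseline: fill every missing entry with the most frequent observed value."""
    observed = [v for v in column if v is not MISSING]
    mode = Counter(observed).most_common(1)[0][0]
    return [mode if v is MISSING else v for v in column]


def fit_stump(rows, labels):
    """Fit a one-split decision stump (assumed base learner) with the fewest training errors."""
    best = None
    for f in range(len(rows[0])):
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0] if right else l_lab
            errors = sum(y != l_lab for y in left) + sum(y != r_lab for y in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda r: l_lab if r[f] <= t else r_lab


def bagging_impute(features, target, n_estimators=25, seed=0):
    """Impute missing target values by majority vote over stumps fit on bootstrap samples."""
    rng = random.Random(seed)
    known = [i for i, y in enumerate(target) if y is not MISSING]
    models = []
    for _ in range(n_estimators):
        # Bootstrap: draw as many complete rows as exist, with replacement.
        sample = [rng.choice(known) for _ in known]
        models.append(fit_stump([features[i] for i in sample],
                                [target[i] for i in sample]))
    imputed = list(target)
    for i, y in enumerate(target):
        if y is MISSING:
            votes = Counter(model(features[i]) for model in models)
            imputed[i] = votes.most_common(1)[0][0]
    return imputed


# Toy data (invented): two numeric attributes per row, and one categorical
# attribute whose last two entries are missing.
features = [[22, 1.1], [25, 0.9], [28, 1.4], [31, 1.0], [35, 1.2],
            [55, 1.3], [58, 0.8], [61, 1.5], [65, 1.0],
            [20, 1.2], [70, 1.1]]
target = ["no", "no", "no", "no", "no",
          "yes", "yes", "yes", "yes",
          MISSING, MISSING]

print(mode_impute(target)[-2:])
print(bagging_impute(features, target)[-2:])
```

In this toy setup the mode baseline fills both gaps with the majority value, while the bagged vote uses the other attributes of each incomplete row. In a pipeline like the one the abstract describes, the completed column would then be passed on to a KNN classifier.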

Keywords

data mining, prediction, knn, imputation, missing values, bagging, bootstrap