Model for Predicting the Risk of Kidney Stone using Data Mining Techniques

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2019
Oladeji F. A., Idowu P. A., Egejuru N., Faluyi S. G., Balogun J. A.

Oladeji F. A., Idowu P. A., Egejuru N., Faluyi S. G. and Balogun J. A. Model for Predicting the Risk of Kidney Stone using Data Mining Techniques. International Journal of Computer Applications 182(38):36-56, January 2019.

BibTeX

	author = {Oladeji F. A. and Idowu P. A. and Egejuru N. and Faluyi S. G. and Balogun J. A.},
	title = {Model for Predicting the Risk of Kidney Stone using Data Mining Techniques},
	journal = {International Journal of Computer Applications},
	issue_date = {January 2019},
	volume = {182},
	number = {38},
	month = {Jan},
	year = {2019},
	issn = {0975-8887},
	pages = {36-56},
	numpages = {21},
	url = {},
	doi = {10.5120/ijca2019918404},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


This paper focused on the development of a predictive model for classifying the risk of kidney stones among Nigerians using data mining techniques, based on historical information elicited about that risk. After the risk factors of kidney stones were identified by experienced endocrinologists, structured questionnaires were used to collect information about the risk factors and the associated risk of kidney stones from selected respondents.

The predictive model for the risk of kidney stones was formulated using three (3) supervised machine learning algorithms (Decision Tree, Multi-layer Perceptron and Genetic Algorithm) following the identification of relevant features. The model was simulated in the Waikato Environment for Knowledge Analysis (WEKA) and validated against a historical dataset of kidney stone risk using four performance metrics: accuracy, true positive rate, precision and false positive rate.
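As an illustration of the four validation metrics named above, the following is a minimal Python sketch that derives them from predicted and actual class labels for one class of interest. It is not the authors' WEKA workflow; the function names, the class labels "high" and "low", and the example label lists are all assumptions made up for this sketch.

```python
def confusion_counts(actual, predicted, positive):
    """Count TP, FP, TN, FN, treating `positive` as the class of interest."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive:
            if a == positive:
                tp += 1  # predicted positive, actually positive
            else:
                fp += 1  # predicted positive, actually negative
        else:
            if a == positive:
                fn += 1  # predicted negative, actually positive
            else:
                tn += 1  # predicted negative, actually negative
    return tp, fp, tn, fn


def metrics(actual, predicted, positive):
    """Return accuracy, true positive rate, precision and false positive rate."""
    tp, fp, tn, fn = confusion_counts(actual, predicted, positive)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "tp_rate": tp / (tp + fn) if (tp + fn) else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "fp_rate": fp / (fp + tn) if (fp + tn) else 0.0,
    }


# Hypothetical labels invented for this sketch; they are not the study's data.
actual = ["high", "high", "low", "low", "low", "high", "low", "low"]
predicted = ["high", "low", "low", "low", "high", "high", "low", "low"]

result = metrics(actual, predicted, positive="high")
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

For a problem with more than two risk classes, the same counts can be computed once per class by passing each class in turn as `positive`.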

The paper concluded that the multi-layer perceptron had the best overall performance, reaching an accuracy of 100% with the 33 variables initially identified by the endocrinologists. When restricted to the 6 variables identified by the C4.5 decision tree algorithm, the models formulated using the genetic programming and multi-layer perceptron algorithms outperformed the model formulated using the C4.5 decision tree itself. The variables identified by the C4.5 decision tree algorithm were: obese from childhood, eating late at night, BMI class, family history of hypertension, taking coffee and sweating daily. In conclusion, the multi-layer perceptron algorithm is best suited to the development of a predictive model for the risk of kidney stones.
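The abstract does not say how the C4.5 algorithm arrived at its 6 variables. One plausible reading is that they are the attributes the induced tree actually splits on. C4.5 chooses split attributes by gain ratio (information gain divided by split information), and the sketch below computes that criterion in plain Python. The attribute names and toy values are invented for illustration and do not come from the study.

```python
import math
from collections import Counter


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((c / total) * math.log2(c / total) for c in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of a nominal attribute, the split criterion used by C4.5."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy of the class labels after splitting on the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data invented for this sketch; not taken from the paper.
labels = ["high", "high", "low", "low"]
attributes = {
    "informative": ["yes", "yes", "no", "no"],    # separates the classes perfectly
    "uninformative": ["yes", "no", "yes", "no"],  # carries no class information
}

scores = {name: gain_ratio(vals, labels) for name, vals in attributes.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking, scores)
```

Attributes with a gain ratio near zero are unlikely to be chosen as split nodes, which is one way a tree learner ends up using only a small subset of the available variables.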




Kidney Stone Risk Factors, C4.5, Prediction, Classification, Decision Trees, Genetic Algorithms, Multilayer Perceptron