Prognostication of Diabetes using Random Forest

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2020
Harsh Harwani, Mohammed Omar Khan, Ananya Arora

Harsh Harwani, Mohammed Omar Khan and Ananya Arora. Prognostication of Diabetes using Random Forest. International Journal of Computer Applications 175(29):40-43, November 2020.

BibTeX

	author = {Harsh Harwani and Mohammed Omar Khan and Ananya Arora},
	title = {Prognostication of Diabetes using Random Forest},
	journal = {International Journal of Computer Applications},
	issue_date = {November 2020},
	volume = {175},
	number = {29},
	month = {Nov},
	year = {2020},
	issn = {0975-8887},
	pages = {40-43},
	numpages = {4},
	url = {},
	doi = {10.5120/ijca2020920833},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


Diabetes is a serious disease characterized by abnormally high blood sugar levels. Although it can be deadly, it is common, and anyone can develop it. If untreated, it can damage a person’s kidneys, eyes, nerves, and other organs. Genes, environment, and preexisting medical conditions can all affect a person’s odds of developing diabetes. Because the disease is far more dangerous when discovered late, an accurate diabetes predictor is needed to enable early treatment. This paper demonstrates the prediction of diabetes using the Random Forest algorithm on the PIMA Indians Diabetes dataset. Using the recorded features of healthy and diabetic PIMA Indians, the model predicts the onset of diabetes. The performance of the algorithm is evaluated using accuracy, precision, and recall. Several suggestions for improving the model's effectiveness are also discussed.
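The three evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes binary labels where 1 means diabetic and 0 means healthy. The function names and the toy labels are hypothetical and chosen only for illustration; they are not the paper's code, data, or results.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = diabetic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def evaluate(y_true, y_pred):
    """Return accuracy, precision, and recall; a ratio with a zero denominator is reported as 0.0."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    total = tp + tn + fp + fn
    return {
        # Share of all predictions that are correct.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted-diabetic cases that are truly diabetic.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of truly diabetic cases that the model found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Hypothetical toy labels, not taken from the PIMA dataset.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 1]

metrics = evaluate(y_true, y_pred)
print(metrics)
```

For these toy labels there are 3 true positives, 2 true negatives, 2 false positives, and 1 false negative, giving an accuracy of 0.625, a precision of 0.6, and a recall of 0.75. In a screening setting, recall is often watched closely because a false negative corresponds to a missed diabetic case.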


  1. "Diabetes-A Silent Killer", 2020. [Online]. [Accessed: 02-Oct-2020].
  2. International Diabetes Federation, "Facts & figures", 2020. [Online]. [Accessed: 02-Oct-2020].
  3. "The Importance of Early Diabetes Detection", ASPE, 2020. [Online]. [Accessed: 02-Oct-2020].
  4. Q. Zou, K. Qu, Y. Luo, D. Yin, Y. Ju and H. Tang, "Predicting Diabetes Mellitus With Machine Learning Techniques", Frontiers in Genetics, vol. 9, 2018. Available: 10.3389/fgene.2018.00515 [Accessed: 02-Oct-2020].
  5. N. Sneha and T. Gangil, "Analysis of diabetes mellitus for early prediction using optimal features selection", Journal of Big Data, vol. 6, no. 1, 2019. Available: 10.1186/s40537-019-0175-6 [Accessed: 02-Oct-2020].
  6. T. Mahboob Alam et al., "A model for early prediction of diabetes", Informatics in Medicine Unlocked, vol. 16, p. 100204, 2019. Available: 10.1016/j.imu.2019.100204.
  7. R. Patil and S. Tamane, "A Comparative Analysis on the Evaluation of Classification Algorithms in the Prediction of Diabetes", International Journal of Electrical and Computer Engineering (IJECE), vol. 8, no. 5, p. 3966, 2018. Available: 10.11591/ijece.v8i5.pp3966-3975.
  8. S. Wei, X. Zhao and C. Miao, "A comprehensive exploration to the machine learning techniques for diabetes identification", 2018 IEEE 4th World Forum on Internet of Things (WF-IoT), 2018. Available: 10.1109/wf-iot.2018.8355130 [Accessed: 02-Oct-2020].
  9. A. P, M. M V and S. H A, "DRAP: Decision Tree and Random Forest Based Classification Model to Predict Diabetes", 2019 1st International Conference on Advances in Information Technology (ICAIT), 2019. Available: 10.1109/icait47043.2019.8987277 [Accessed: 02-Oct-2020].


Machine learning, Diabetes prediction, Random Forest, PIMA Indians diabetes dataset