Research Article

Predicting the Presence of Heart Diseases using Comparative Data Mining and Machine Learning Algorithms

by Daniel Ananey-Obiri, Enoch Sarku
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 176 - Number 11
Year of Publication: 2020
10.5120/ijca2020920034

Daniel Ananey-Obiri, Enoch Sarku. Predicting the Presence of Heart Diseases using Comparative Data Mining and Machine Learning Algorithms. International Journal of Computer Applications. 176, 11 (Apr 2020), 17-21. DOI=10.5120/ijca2020920034

@article{ 10.5120/ijca2020920034,
author = { Daniel Ananey-Obiri and Enoch Sarku },
title = { Predicting the Presence of Heart Diseases using Comparative Data Mining and Machine Learning Algorithms },
journal = { International Journal of Computer Applications },
issue_date = { Apr 2020 },
volume = { 176 },
number = { 11 },
month = { Apr },
year = { 2020 },
issn = { 0975-8887 },
pages = { 17-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume176/number11/31245-2020920034/ },
doi = { 10.5120/ijca2020920034 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:42:13.624588+05:30
%A Daniel Ananey-Obiri
%A Enoch Sarku
%T Predicting the Presence of Heart Diseases using Comparative Data Mining and Machine Learning Algorithms
%J International Journal of Computer Applications
%@ 0975-8887
%V 176
%N 11
%P 17-21
%D 2020
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Heart disease, a form of cardiovascular disease, is the leading cause of death worldwide. Recently, studies have concentrated on alternative, efficient techniques such as data mining and machine learning to diagnose diseases from an individual's features. This study will use exploratory data analysis and data mining techniques in Python to extract hidden patterns. From these, machine learning models (logistic regression, a decision tree classifier, and Gaussian Naïve Bayes) will be developed to predict the presence of heart disease in patients. The aim is better predictive performance, which would reduce the number of tests required to diagnose heart disease. k-fold cross validation will be used to assess the resulting models by their receiver operating characteristic (ROC) curves (sensitivity against 1 - specificity). The dataset, collected from the UCI Machine Learning Repository, contains information on patients with heart disease; it has 14 attributes measured on 303 individuals.
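
The comparison described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn, uses a synthetic stand-in with the same shape as the dataset (303 rows, 13 predictors plus a binary target) instead of the UCI data, and picks 10 folds and a tree depth of 4 arbitrarily, since the abstract does not state them. The helper name `compare_models` is hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption): 303 samples and 13 predictors plus a
# binary target, mirroring the 14-attribute shape given in the abstract.
# Replace X and y with the real UCI heart disease features and labels.
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=6, random_state=0
)

# The three model families named in the abstract. Hyperparameters here
# are illustrative assumptions, not values taken from the paper.
models = {
    "logistic_regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "gaussian_nb": GaussianNB(),
}

# k = 10 is an assumption; the abstract only says "k-fold".
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)


def compare_models(models, X, y, cv):
    """Return the mean cross-validated ROC AUC for each model."""
    results = {}
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=cv, scoring="roc_auc")
        results[name] = float(scores.mean())
    return results


results = compare_models(models, X, y, cv)
for name, auc in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean ROC AUC = {auc:.3f}")
```

Area under the ROC curve is used here as a single-number summary of each curve; stratified folds keep the class balance roughly constant across splits, which matters for a dataset of only 303 individuals.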

Index Terms

Computer Science
Information Sciences

Keywords

Classification, regression, k-fold cross validation, Receiver Operator Characteristics