Predicting the Presence of Heart Diseases using Comparative Data Mining and Machine Learning Algorithms

Daniel Ananey-Obiri; Enoch Sarku

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 176 - Number 11

Year of Publication: 2020

Authors: Daniel Ananey-Obiri, Enoch Sarku

10.5120/ijca2020920034

Daniel Ananey-Obiri, Enoch Sarku. Predicting the Presence of Heart Diseases using Comparative Data Mining and Machine Learning Algorithms. International Journal of Computer Applications. 176, 11 (Apr 2020), 17-21. DOI=10.5120/ijca2020920034

@article{ 10.5120/ijca2020920034,
author = { Daniel Ananey-Obiri and Enoch Sarku },
title = { Predicting the Presence of Heart Diseases using Comparative Data Mining and Machine Learning Algorithms },
journal = { International Journal of Computer Applications },
issue_date = { Apr 2020 },
volume = { 176 },
number = { 11 },
month = { Apr },
year = { 2020 },
issn = { 0975-8887 },
pages = { 17-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume176/number11/31245-2020920034/ },
doi = { 10.5120/ijca2020920034 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

Abstract

Heart disease, a type of cardiovascular disease, is the leading cause of death worldwide. Recent studies have concentrated on alternative, efficient techniques such as data mining and machine learning for diagnosing diseases from an individual's features. This study will use exploratory data analysis and data mining techniques in Python to extract hidden patterns. Machine learning models (logistic regression, a decision tree classifier, and Gaussian Naïve Bayes) will then be developed to predict the presence of heart disease in patients. The aim is better predictive performance, so as to reduce the number of tests required for the diagnosis of heart disease. The k-fold cross-validation approach will be used to assess the resulting models by their receiver operating characteristic (ROC) curves (sensitivity against 1 − specificity). The dataset was obtained from the UCI Machine Learning Repository and contains information on patients with heart disease. It has 14 attributes measured on 303 individuals.
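The evaluation idea in the abstract (k-fold cross-validation scored by the ROC curve) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code. It covers only one of the three models named above, Gaussian Naïve Bayes, and it runs on synthetic data that merely mimics the shape of the dataset (303 rows, 13 numeric features plus a binary label). The function names, the 10-fold setting, the unstratified split, and the 0.5 mean shift between classes are all assumptions made for illustration.

```python
import math
import random
from statistics import mean, pvariance


def make_synthetic(n=303, n_features=13, seed=0):
    """Hypothetical stand-in for the UCI data: numeric features, binary label."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        # Assumption: class 1 has every feature mean shifted by 0.5.
        X.append([rng.gauss(0.5 * label, 1.0) for _ in range(n_features)])
        y.append(label)
    return X, y


def fit_gnb(X, y):
    """Estimate the prior, per-feature means, and per-feature variances per class."""
    model = {}
    for c in set(y):
        rows = [x for x, t in zip(X, y) if t == c]
        cols = list(zip(*rows))
        model[c] = (
            len(rows) / len(X),
            [mean(col) for col in cols],
            [pvariance(col) + 1e-9 for col in cols],  # small floor avoids zero variance
        )
    return model


def score_gnb(model, x):
    """Return the posterior probability of class 1 for one sample."""
    log_post = {}
    for c, (prior, mus, variances) in model.items():
        ll = math.log(prior)
        for v, mu, var in zip(x, mus, variances):
            ll += -0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
        log_post[c] = ll
    m = max(log_post.values())  # subtract the max for numerical stability
    total = sum(math.exp(v - m) for v in log_post.values())
    return math.exp(log_post[1] - m) / total


def roc_auc(y_true, scores):
    """Area under the ROC curve: chance a positive outscores a negative (ties count half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def k_fold_auc(X, y, k=10, seed=0):
    """Shuffle once, split into k folds, and return the held-out AUC of each fold."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    aucs = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = fit_gnb([X[i] for i in train], [y[i] for i in train])
        scores = [score_gnb(model, X[i]) for i in fold]
        aucs.append(roc_auc([y[i] for i in fold], scores))
    return aucs


X, y = make_synthetic()
fold_aucs = k_fold_auc(X, y, k=10)
print(f"mean AUC over {len(fold_aucs)} folds: {mean(fold_aucs):.3f}")
```

Each fold is held out once while the model is fitted on the remaining folds, so every AUC is computed on data the model did not see. The same loop could compare several classifiers by swapping the fit and score functions.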

Index Terms

Computer Science

Information Sciences

Keywords

Classification, regression, k-fold cross validation, Receiver Operator Characteristics