Comparative Study of Data Mining Classifiers with Different Attributes and Different Databases Domain

P. Arumugam; Poompavai A.; Manimannan G.; R. Lakshmi Priya


Research Article


International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 177 - Number 47

Year of Publication: 2020

Authors: P. Arumugam, Poompavai A., Manimannan G., R. Lakshmi Priya

10.5120/ijca2020919840

P. Arumugam, Poompavai A., Manimannan G., R. Lakshmi Priya. Comparative Study of Data Mining Classifiers with Different Attributes and Different Databases Domain. International Journal of Computer Applications. 177, 47 (Mar 2020), 13-23. DOI=10.5120/ijca2020919840

@article{ 10.5120/ijca2020919840,
  author = { P. Arumugam and Poompavai A. and Manimannan G. and R. Lakshmi Priya },
  title = { Comparative Study of Data Mining Classifiers with Different Attributes and Different Databases Domain },
  journal = { International Journal of Computer Applications },
  issue_date = { Mar 2020 },
  volume = { 177 },
  number = { 47 },
  month = { Mar },
  year = { 2020 },
  issn = { 0975-8887 },
  pages = { 13-23 },
  numpages = {9},
  url = { https://ijcaonline.org/archives/volume177/number47/31224-2020919840/ },
  doi = { 10.5120/ijca2020919840 },
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

%0 Journal Article
%1 2024-02-07T00:48:52.571490+05:30
%A P. Arumugam
%A Poompavai A.
%A Manimannan G.
%A R. Lakshmi Priya
%T Comparative Study of Data Mining Classifiers with Different Attributes and Different Databases Domain
%J International Journal of Computer Applications
%@ 0975-8887
%V 177
%N 47
%P 13-23
%D 2020
%I Foundation of Computer Science (FCS), NY, USA

Abstract

This paper compares and cross-validates five classification methods in terms of precision, accuracy, and the kappa statistic, computed and visualized on four datasets drawn from different domains. The study was implemented in the R environment, and the results indicate which classifier is the most robust. An accuracy-based comparison of the classifiers is presented for each dataset. From the confusion matrices, the sensitivity, specificity, accuracy, true positive rate, and false positive rate of each classifier are calculated for all four datasets, and the kappa statistics are also compared. The aim of the work is to analyze the effectiveness of the most popular classification techniques. According to the experimental results, the Support Vector Machine model had the best performance, outperforming the other classifiers on all datasets used. The Naive Bayes classifier, Decision Tree, and Random Forest also performed well. The reported true positive rates are above 80% and the false positive rates below 20% for all four datasets. The kappa statistic measures the agreement between predicted and actual classes beyond what chance alone would produce; the classifiers are compared on this measure, with higher values indicating better performance.
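
As an illustration of the evaluation measures named in the abstract, the following minimal sketch derives sensitivity, specificity, accuracy, false positive rate, and Cohen's kappa from a confusion matrix. It is not the authors' code: the paper reports an R implementation, whereas this sketch uses plain Python, and the function names `binary_metrics` and `cohen_kappa` as well as the example counts are hypothetical, chosen only to show the arithmetic.

```python
from typing import Dict, Sequence


def binary_metrics(tp: int, fn: int, fp: int, tn: int) -> Dict[str, float]:
    """Rates derived from a 2x2 confusion matrix (hypothetical helper)."""
    total = tp + fn + fp + tn
    return {
        # Sensitivity is the same quantity as the true positive rate.
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        # The false positive rate equals 1 - specificity.
        "false_positive_rate": fp / (fp + tn),
    }


def cohen_kappa(matrix: Sequence[Sequence[int]]) -> float:
    """Cohen's kappa for a square confusion matrix.

    Rows are actual classes and columns are predicted classes.
    """
    total = sum(sum(row) for row in matrix)
    # Observed agreement: fraction of cases on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1 - expected)


# Made-up counts for illustration only; they are not results from the paper.
example = [[45, 5],    # actual positive: 45 predicted positive, 5 negative
           [10, 40]]   # actual negative: 10 predicted positive, 40 negative
metrics = binary_metrics(tp=45, fn=5, fp=10, tn=40)
kappa = cohen_kappa(example)
```

With these made-up counts the sensitivity is 0.90, the false positive rate is 0.20, the accuracy is 0.85, and kappa is 0.70. Kappa is lower than accuracy because half of the agreement would be expected by chance given the marginal totals.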


Index Terms

Computer Science

Information Sciences

Keywords

Decision Tree, Random Forest, Naive Bayes Classifier, Linear Discriminant Analysis, Support Vector Machine, Confusion Matrix, Kappa Statistics