Research Article

An Effective Intelligent Model for Finding an Optimal Number of Different Pathological Types of Diseases

by Mohamed A. El-Rashidy, Taha E. Taha, Nabil M. Ayad
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 35 - Number 1
Year of Publication: 2011
Authors: Mohamed A. El-Rashidy, Taha E. Taha, Nabil M. Ayad
DOI: 10.5120/4366-6023

Mohamed A. El-Rashidy, Taha E. Taha, Nabil M. Ayad. An Effective Intelligent Model for Finding an Optimal Number of Different Pathological Types of Diseases. International Journal of Computer Applications. 35, 1 (December 2011), 21-29. DOI=10.5120/4366-6023

@article{ 10.5120/4366-6023,
author = { Mohamed A. El-Rashidy, Taha E. Taha, Nabil M. Ayad },
title = { An Effective Intelligent Model for Finding an Optimal Number of Different Pathological Types of Diseases },
journal = { International Journal of Computer Applications },
issue_date = { December 2011 },
volume = { 35 },
number = { 1 },
month = { December },
year = { 2011 },
issn = { 0975-8887 },
pages = { 21-29 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume35/number1/4366-6023/ },
doi = { 10.5120/4366-6023 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:20:54.237090+05:30
%A Mohamed A. El-Rashidy
%A Taha E. Taha
%A Nabil M. Ayad
%T An Effective Intelligent Model for Finding an Optimal Number of Different Pathological Types of Diseases
%J International Journal of Computer Applications
%@ 0975-8887
%V 35
%N 1
%P 21-29
%D 2011
%I Foundation of Computer Science (FCS), NY, USA
Abstract

A new hybrid data mining model is proposed to provide a comprehensive analytic method for finding the optimal number of distinct pathological types of a disease and its complications, an optimal representative partitioning, and the most significant features of each pathological type. The model integrates characteristics of supervised and unsupervised learning and is based on clustering, feature selection, and classification. It aims for the highest classification accuracy during the clustering process. Experiments were conducted on three real medical datasets related to the diagnosis of breast cancer, heart disease, and post-operative infections. Performance is evaluated using information entropy, squared error, classification sensitivity, specificity, overall accuracy, and the Matthews correlation coefficient. The results show that the proposed model achieves the highest classification performance and compares favorably with the Naïve Bayes, Linear Support Vector Machine (Linear SVM), Polykernel Support Vector Machine (Polykernel SVM), Artificial Neural Network (ANN), and Support Feature Machines (SFM) models.
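
The classification measures named in the abstract (sensitivity, specificity, overall accuracy, and the Matthews correlation coefficient) can be illustrated with a minimal Python sketch. It uses the standard textbook definitions for a binary confusion matrix; the function name `binary_metrics` and the example counts are hypothetical and are not taken from the article, whose exact evaluation procedure is not given on this page.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Standard binary-classification measures from confusion-matrix counts.

    tp/fn/fp/tn are true positives, false negatives, false positives and
    true negatives. This is an illustrative helper, not the authors' code.
    """
    total = tp + fn + fp + tn
    # Sensitivity: share of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: share of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # Overall accuracy: share of all cases classified correctly.
    accuracy = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


# Hypothetical counts, chosen only to show how the function is called.
example = binary_metrics(tp=40, fn=10, fp=5, tn=45)
```

The Matthews correlation coefficient ranges from -1 to 1 and remains informative when the two classes are of unequal size, which is one common reason it is reported alongside accuracy for medical datasets.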

Index Terms

Computer Science
Information Sciences

Keywords

Clustering, feature selection, classification, SFM model, breast cancer, heart disease, post-operative infection