Research Article

Modified Deviation Approach to Deal with Missing Attribute Values in Data Mining with different Percentage of Missing Values

by Pallab Kumar Dey, Sripati Mukhopadhay
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 73 - Number 5
Year of Publication: 2013
10.5120/12734-9634

Pallab Kumar Dey, Sripati Mukhopadhay. Modified Deviation Approach to Deal with Missing Attribute Values in Data Mining with different Percentage of Missing Values. International Journal of Computer Applications. 73, 5 (July 2013), 1-7. DOI=10.5120/12734-9634

@article{ 10.5120/12734-9634,
author = { Pallab Kumar Dey and Sripati Mukhopadhay },
title = { Modified Deviation Approach to Deal with Missing Attribute Values in Data Mining with different Percentage of Missing Values },
journal = { International Journal of Computer Applications },
issue_date = { July 2013 },
volume = { 73 },
number = { 5 },
month = { July },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-7 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume73/number5/12734-9634/ },
doi = { 10.5120/12734-9634 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
Abstract

In practice, information systems often contain missing attribute values, which hampers accurate estimation in data mining. If missing attribute values can be predicted in the pre-processing stage of data mining, accuracy improves and existing data mining algorithms can be applied to the completed data. This work discusses the different types of methods available for handling incomplete information systems and then proposes an algorithm that replaces missing attribute values with minimal complexity. The proposed algorithm is shown to be better by applying it to different data sets with different percentages of missing values.
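
The general idea of filling in missing attribute values during pre-processing can be illustrated with a minimal sketch. This is not the modified deviation algorithm proposed in the paper, which the abstract does not specify; it is a common baseline (column mean for numeric attributes, most common value for symbolic ones). The names `MISSING`, `mask_values`, and `impute` are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter
from statistics import mean

MISSING = None  # hypothetical marker for a missing attribute value


def mask_values(rows, fraction, seed=0):
    """Return a copy of rows with int(cells * fraction) cells set to MISSING."""
    rng = random.Random(seed)
    cells = [(i, j) for i, row in enumerate(rows) for j in range(len(row))]
    hidden = set(rng.sample(cells, int(len(cells) * fraction)))
    return [[MISSING if (i, j) in hidden else v for j, v in enumerate(row)]
            for i, row in enumerate(rows)]


def impute(rows):
    """Baseline fill: column mean for numeric data, most common value otherwise."""
    filled = [list(row) for row in rows]
    for j in range(len(rows[0])):
        known = [row[j] for row in rows if row[j] is not MISSING]
        if not known:
            continue  # no observed values in this column; leave it untouched
        if all(isinstance(v, (int, float)) for v in known):
            fill = mean(known)
        else:
            fill = Counter(known).most_common(1)[0][0]
        for row in filled:
            if row[j] is MISSING:
                row[j] = fill
    return filled


# Toy data, invented for this sketch: one numeric and one symbolic attribute.
data = [[5.1, "a"], [4.9, "a"], [6.3, "b"], [5.8, "a"]]
for fraction in (0.125, 0.25, 0.5):
    completed = impute(mask_values(data, fraction))
```

Masking a known data set at several fractions and comparing the completed values with the originals is one simple way to study how a replacement method behaves as the percentage of missing values grows.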

Index Terms

Computer Science
Information Sciences

Keywords

Data Mining, Incomplete Information, Missing Attribute Values, Pre-processing, Modified Deviation Approach