Research Article

A Rough Set Approach for Generation and Validation of Rules for Missing Attribute Values of a Data Set

by Renu Vashist, M. L. Garg
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 42 - Number 14
Year of Publication: 2012
Authors: Renu Vashist, M. L. Garg
10.5120/5762-7938

Renu Vashist, M. L. Garg. A Rough Set Approach for Generation and Validation of Rules for Missing Attribute Values of a Data Set. International Journal of Computer Applications. 42, 14 (March 2012), 31-35. DOI=10.5120/5762-7938

@article{ 10.5120/5762-7938,
author = { Renu Vashist, M. L. Garg },
title = { A Rough Set Approach for Generation and Validation of Rules for Missing Attribute Values of a Data Set },
journal = { International Journal of Computer Applications },
issue_date = { March 2012 },
volume = { 42 },
number = { 14 },
month = { March },
year = { 2012 },
issn = { 0975-8887 },
pages = { 31-35 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume42/number14/5762-7938/ },
doi = { 10.5120/5762-7938 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:32:11.257792+05:30
%A Renu Vashist
%A M. L. Garg
%T A Rough Set Approach for Generation and Validation of Rules for Missing Attribute Values of a Data Set
%J International Journal of Computer Applications
%@ 0975-8887
%V 42
%N 14
%P 31-35
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Data mining has emerged as one of the most significant and continuously evolving fields of research because of its ever-growing and far-reaching applications in areas such as medicine, the military, financial markets, and banking. One of its most useful applications is extracting significant, previously unknown knowledge from real-world databases. This knowledge may take the form of rules. Rule generation from raw data is an effective and widely used data mining tool. Real-life data are frequently imperfect, erroneous, incomplete, uncertain, and vague. Many approaches exist for handling missing attribute values. In this paper we use the most common attribute value approach, i.e., replacing every missing attribute value with the most frequently occurring value of the attribute, thereby completing the information table. Subsequently, we find the reduct and core of the complete decision table and verify that the reduct and core found by our method are the same as those found by the ROSE2 software. Thereafter we generate rules based on the reduct. Our results are validated by conducting the same rough set analysis on the incomplete information system using ROSE2.
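
The pipeline summarized in the abstract (complete the table with the most common attribute value, then find the reduct and core, then read rules off the reduct) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the "?" marker for a missing value, the function names, and the six-row toy table are hypothetical and are not taken from the paper. The brute-force reduct search is a simple stand-in and is not claimed to be the procedure used by the authors or by ROSE2.

```python
from collections import Counter
from itertools import combinations

MISSING = "?"  # assumed marker for a missing attribute value


def impute_most_common(rows, attributes):
    """Fill each missing value with the most frequent known value of its attribute."""
    filled = [dict(r) for r in rows]
    for a in attributes:
        known = [r[a] for r in rows if r[a] != MISSING]
        if not known:
            continue  # nothing to impute from; leave the column untouched
        most_common = Counter(known).most_common(1)[0][0]
        for r in filled:
            if r[a] == MISSING:
                r[a] = most_common
    return filled


def positive_region(rows, attrs, decision):
    """Indices of rows whose equivalence class under attrs has one decision value."""
    classes = {}
    for i, r in enumerate(rows):
        classes.setdefault(tuple(r[a] for a in attrs), []).append(i)
    pos = set()
    for members in classes.values():
        if len({rows[i][decision] for i in members}) == 1:
            pos.update(members)
    return pos


def find_reducts(rows, attributes, decision):
    """Brute force: minimal attribute subsets that keep the full positive region."""
    target = positive_region(rows, attributes, decision)
    found = []
    for k in range(len(attributes) + 1):
        for subset in combinations(attributes, k):
            if any(set(f) <= set(subset) for f in found):
                continue  # a smaller reduct is already contained in this subset
            if positive_region(rows, subset, decision) == target:
                found.append(subset)
    return found


def find_core(reduct_list):
    """The core is the intersection of all reducts."""
    return set.intersection(*(set(r) for r in reduct_list)) if reduct_list else set()


def certain_rules(rows, reduct, decision):
    """One rule per distinct combination of reduct values in the positive region."""
    rules = {}
    for i in sorted(positive_region(rows, reduct, decision)):
        condition = tuple((a, rows[i][a]) for a in reduct)
        rules[condition] = rows[i][decision]
    return rules


# Hypothetical incomplete decision table (not data from the paper).
ATTRS = ["temperature", "headache", "nausea"]
TABLE = [
    {"temperature": "high", "headache": "yes", "nausea": "no", "flu": "yes"},
    {"temperature": "?", "headache": "yes", "nausea": "yes", "flu": "yes"},
    {"temperature": "normal", "headache": "no", "nausea": "no", "flu": "no"},
    {"temperature": "high", "headache": "?", "nausea": "yes", "flu": "yes"},
    {"temperature": "normal", "headache": "yes", "nausea": "no", "flu": "no"},
    {"temperature": "high", "headache": "no", "nausea": "?", "flu": "yes"},
]

completed = impute_most_common(TABLE, ATTRS)
reduct_list = find_reducts(completed, ATTRS, "flu")
core = find_core(reduct_list)
for condition, outcome in certain_rules(completed, reduct_list[0], "flu").items():
    lhs = " AND ".join(f"{a} = {v}" for a, v in condition)
    print(f"IF {lhs} THEN flu = {outcome}")
```

On this toy table the only reduct is the single attribute temperature, so the core coincides with it. The search enumerates every attribute subset and is therefore exponential in the number of attributes, which is acceptable only for small tables like this one.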

Index Terms

Computer Science
Information Sciences

Keywords

Data Mining, Knowledge Discovery From Database, Machine Learning, Reduct, Core, Missing Attribute Values, Rule Generation