Call for Paper - May 2023 Edition
IJCA solicits original research papers for the May 2023 Edition. Last date of manuscript submission is April 20, 2023. Read More

Mining Constant Conditional Functional Dependencies for Improving Data Quality

International Journal of Computer Applications
© 2013 by IJCA Journal
Volume 74 - Number 15
Year of Publication: 2013
D Devi Kalyani

Devi D Kalyani. Article: Mining Constant Conditional Functional Dependencies for Improving Data Quality. International Journal of Computer Applications 74(15):12-20, July 2013. Full text available. BibTeX

	author = {D Devi Kalyani},
	title = {Article: Mining Constant Conditional Functional Dependencies for Improving Data Quality},
	journal = {International Journal of Computer Applications},
	year = {2013},
	volume = {74},
	number = {15},
	pages = {12-20},
	month = {July},
	note = {Full text available}


This paper applies the data mining techniques in the area of data cleaning as effective in discovering Constant Conditional Functional Dependencies(CCFDs) from relational databases . These CCFDs are used as business rules for context dependent data validations. Conditional Functional Dependencies(CFDs) are an extension of Functional dependencies(FDs) which captures the consistency of data by supporting patterns of semantically related constants. Based on the hierarchy between FDs, CFDs and Association Rules :Union of Association Rules are CFDs, while union of CFDs are FDs. This paper proposes the algorithms used for Association Rule discovery to be reused for CCFD Mining i. e CFDs with constant patterns only . Three algorithms for CCFD mining namely CCFD-FPGrowth, CCFD-AprioriClose and CCFD-ZartMNR are provided in this paper. CCFD-FPGrowth uses FP-growth algorithm to find frequent itemsets and then generates rules as constant patterns from the set of frequent itemsets using modified Agrawal Association rule Generation algorithm. CCFD-AprioriClose uses Apriori algorithm to find frequent closed itemsets and then generates rules as constant patterns from the set of frequent closed itemsets using modified Agrawal Association rule Generation algorithm. CCFD-ZartMNR uses Zart algorithm to find closed itemsets and minimal generators and then generates minimal non-redundant rules from the set of closed itemsets. Experimental results on two real-world data sets show that this approach performs well across several dimensions such as recall, runtime and scalability.


  • W. Fan, F. Geerts, X. Jia, and A. Kementsietsidis, "Conditional functional dependencies for capturing data inconsistencies," TODS, vol. 33, no. 2, june 2008.
  • G. Cong, W. Fan, F. Geerts, X. Jia, and S. Ma, "Improving data quality:Consistency and accuracy," in VLDB, 2007
  • Wenfei Fan , Floris Geerts , Jianzhong Li , Ming Xiong. Discovering Conditional Functional Dependencies. IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, VOL. 23, NO. 5, May 2011.
  • Philip Bohannon, Wenfei Fan, Floris Geerts, Xibei Jia, and Anastasios Kementsietsidis. Conditional functional dependencies for data cleaning. In Proceedings of ICDE'07, April 15-20, Istanbul, Turkey, pages 746_755, 2007.
  • Mining Frequent Patterns without Candidate Generation: A Frequent-Pattern Tree Approach Data Mining and Knowledge Discovery Journal , 8, 53–87, 2004
  • Discovering Frequent Closed Itemsets for Association Rules, Nicolas Pasquier, Yves Bastide, Rafik Taouil, Lotfi Lakhal. ICDT '99 Proceedings of the 7th International Conference on Database Theory, Pages 398 - 416 , Springer-Verlag London, UK ©1999
  • Paul De Bra and Jan Paredaens. Conditional dependencies for horizontal decompositions. In Proceedings of the 10th Colloquium on Automata, Languages and Programming, pages 67_82, London, UK, 1983. Springer-Verlag.
  • Fei Chiang and Renée J. Miller. Discovering data quality rules. PVLDB,1(1):1166_1177, 2008.
  • Fast Algorithms for Mining Association Rules, Rakesh Agrawal, RamaKrishnan Srikant, In Proc. 20th Int. Conf. Very Large Data Bases, VLDB (December--JanuaryMay~ 1994), pp. 487-499
  • Rakesh Agrawal, Tomasz Imielinski, and Arun N. Swami. Mining association rules between sets of items in large databases. In Proceedings of the 1993 ACM SIGMOD International Conference on Management of Data, Washington, D. C. , May 26-28, 1993, pages 207_216. ACM Press, 1993.
  • Nicolas Pasquier, Yves Bastide, Ra_k Taouil, and Lot_ Lakhal. Discovering frequent closed itemsets for association rules. In ICDT, pages 398_416, 1999.
  • Lukasz Golab, Howard Karlo_, Flip Korn, Divesh Srivastava, and Bei Yu. On generating near-optimal tableaux for conditional functional dependencies. Proc. VLDB Endow. , 1(1):376_390, 2008.
  • A. Maydanchik Data Quality Assessment Technics Publications,336pp, ISBN-13: 9780977140022
  • Raoul Medina and Lhouari Nourine. A unified hierarchy for functional dependencies, conditional functional dependencies and association rules. In ICFCA, Lecture Notes in Computer Science, pages 235_248. Springer, 2009
  • Y. Huhtala, J. Kärkkäinen, P. Porkka, and H. Toivonen. Tane: An efficient algorithm for discovering functional and approximate dependencies. The Computer Journal, 42(2):100_111, 1999.
  • Stéphane Lopes, Jean-Marc Petit, and Lot_ Lakhal. Efficient discovery of functional dependencies and armstrong relations. In EDBT 2000, volume 1777 of LNCS, pages 350_364, Konstanz, Germany, 2000. Springer.
  • Noël Novelli and Rosine Cicchetti. Fun: An efficient algorithm for mining functional and embeddeddependencies. In Proceedings of the 8th International Conference on DatabaseTheory (ICDT'01), volume 1973 of Lecture Notes in Computer Science, pages 189_203, 2001.
  • Catharine Wyss, Chris Giannella, and Edward Robertson. Fastfds: A heuristic-driven, depth-first algorithm for mining functional dependencies from relation instances extended abstract. Data Warehousing and Knowledge Discovery, pages 101_110, 2001.
  • G. Birkhoff. Lattices theory. In coll. pub. XXV, Volume 25. American Mathematical Society,1967. Third edition.
  • B. A. Davey and H. A. Priestley. Introduction to Lattices and Order. Cambridge University Press,1994. Fourth edition.
  • L. Szathmary, A. Napoli, and S. O. Kuznetsov. ZART: A Multifunctional Itemset Mining Algorithm. In Proc. of the 5th Intl. Conf. on Concept Lattices and Their Applications (CLA '07), pages 26{37, Montpellier, France, Oct 2007
  • Marzena Kryszkiewicz: Representative Association Rules and Minimum Condition Maximum Consequence Association Rules. PKDD 1998: 361-369 Minimum
  • Kryszkiewicz, M. Concise representations of association rules. In: Pattern Detection and Discovery. (2002) 92–109
  • Pasquier, N. , Bastide, Y. , Taouil, R. , Lakhal, L. : Closed set based discovery of small covers for association rules. In: Proc. 15emes Journees Bases de Donnees Avancees, BDA. (1999) 361–381
  • Pasquier, N. : Mining association rules using formal concept analysis. In: Proc. Of the 8th International Conf. on Conceptual Structures (ICCS '00), Shaker-Verlag (2000) 259–264