De-Mystifying Data Testing and Applying Automation

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2015
Pad Balasubramanian, Prasanth Malla

Pad Balasubramanian and Prasanth Malla. Article: De-Mystifying Data Testing and Applying Automation. International Journal of Computer Applications 126(11):1-5, September 2015. Published by Foundation of Computer Science (FCS), NY, USA.

	author = {Pad Balasubramanian and Prasanth Malla},
	title = {Article: De-Mystifying Data Testing and Applying Automation},
	journal = {International Journal of Computer Applications},
	year = {2015},
	volume = {126},
	number = {11},
	pages = {1-5},
	month = {September},
	note = {Published by Foundation of Computer Science (FCS), NY, USA}


Being Agile has become the norm rather than a special need. Staying Agile in today’s world requires significant thought and innovative solutions. The testing industry has matured over the past decade, with hundreds of open source tools and frameworks now available, particularly for automation. QA teams have benefited significantly from this evolution, which enables them to meet the demand for agility throughout the lifecycle and keep pace with technology advancements.

One of the most important objectives of data testing is to recommend the corrective measures that back-end integration teams need to introduce into the software development life cycle (SDLC). Data validation plays an important role, and many techniques and tools are available in the market. However, end-to-end automation penetration is comparatively low in back-end data testing and ETL test automation, since data transformation predominantly happens through ETL processes on major enterprise systems. There is clear market and industry demand for automation in data testing. This space is gaining importance because of the volume of data to be handled, along with the need to assure its quality.

This paper explains the essentials of a data testing strategy: how data quality and data validation checks play an important role, where and how to bring in automation, and finally a method for arriving at faster, more accurate root-cause analysis. It can be argued that data quality checks are implicitly covered as part of validation; however, it is always recommended to address the problem at the source rather than at the destination. According to analyst findings in the public domain, significant revenue losses are reported due to poor data quality.
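To make the distinction between the two kinds of check concrete, the following is a minimal sketch and not the method of the paper itself. It assumes a hypothetical record layout `(customer_id, country, amount_cents)` and a hypothetical transformation rule (trim and upper-case the country code); the function names `quality_issues`, `transform`, and `validate` are illustrative only. The quality check inspects records at the source, while the validation check compares the expected transformed records with what actually arrived at the destination.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical record layout, assumed for illustration only.
Row = Tuple[str, str, int]  # (customer_id, country, amount_cents)


def quality_issues(rows: Iterable[Row]) -> List[Tuple[int, str]]:
    """Data quality check at the source: report (row index, problem) pairs."""
    issues = []
    for i, (customer_id, country, amount_cents) in enumerate(rows):
        if not customer_id:
            issues.append((i, "missing customer_id"))
        if len(country.strip()) != 2:
            issues.append((i, "country is not a 2-letter code"))
        if amount_cents < 0:
            issues.append((i, "negative amount"))
    return issues


def transform(row: Row) -> Row:
    """Stand-in for the ETL rule under test (an assumption, not a real rule)."""
    customer_id, country, amount_cents = row
    return (customer_id, country.strip().upper(), amount_cents)


def validate(source: Iterable[Row], destination: Iterable[Row]) -> Tuple[Set[Row], Set[Row]]:
    """Data validation: compare expected and actual destination records.

    Returns (missing, unexpected): records that should have been loaded but
    were not, and records that were loaded but cannot be explained by the source.
    """
    expected = {transform(r) for r in source}
    actual = set(destination)
    return expected - actual, actual - expected


source = [("c1", "us", 1046), ("c2", " de", 500), ("", "fr", 100)]
destination = [("c1", "US", 1046), ("c2", "DE", 500), ("c9", "GB", 700)]

print(quality_issues(source))
missing, unexpected = validate(source, destination)
print(missing, unexpected)
```

In this sketch the record with the empty `customer_id` is flagged by the source-side quality check before any comparison takes place, which is the point of addressing the problem at the source; the set differences then narrow root-cause analysis to the specific records that differ.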

The approach defined in this paper will benefit QA teams involved in back-end data testing. It will improve their understanding and enable them to apply the correct techniques as they move forward. Automation for data testing is often considered suitable only for people with a technical background. A proper understanding of what exactly happens at the back end once data is processed from the front end will enable a non-technical person to understand, enjoy, and appreciate the benefits of automation.




Keywords: Data quality check, data validations, ETL test automation, Agile development, third party systems, data source, data destination.