Research Article

Analysis of Data Quality and Performance Issues in Data Warehousing and Business Intelligence

by Nikhil Debbarma, Gautam Nath, Hillol Das
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 79 - Number 15
Year of Publication: 2013
Authors: Nikhil Debbarma, Gautam Nath, Hillol Das
10.5120/13818-1862

Nikhil Debbarma, Gautam Nath, Hillol Das. Analysis of Data Quality and Performance Issues in Data Warehousing and Business Intelligence. International Journal of Computer Applications. 79, 15 (October 2013), 20-26. DOI=10.5120/13818-1862

@article{ 10.5120/13818-1862,
author = { Nikhil Debbarma and Gautam Nath and Hillol Das },
title = { Analysis of Data Quality and Performance Issues in Data Warehousing and Business Intelligence },
journal = { International Journal of Computer Applications },
issue_date = { October 2013 },
volume = { 79 },
number = { 15 },
month = { October },
year = { 2013 },
issn = { 0975-8887 },
pages = { 20-26 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume79/number15/13818-1862/ },
doi = { 10.5120/13818-1862 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:53:05.296559+05:30
%A Nikhil Debbarma
%A Gautam Nath
%A Hillol Das
%T Analysis of Data Quality and Performance Issues in Data Warehousing and Business Intelligence
%J International Journal of Computer Applications
%@ 0975-8887
%V 79
%N 15
%P 20-26
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

A data warehouse is an integral part of enterprises that want clear business insights from customer and operational data. It comprises a collection of technologies aimed at enabling the knowledge worker (executive, manager, analyst) to make better and faster decisions. It is expected to present the right information in the right place at the right time and at the right cost in order to support the right decision. Over the years, the practice of data warehousing has shown that traditional Online Transaction Processing (OLTP) systems are not fully appropriate for decision support. From a survey and evaluation of the literature on data warehousing, and from consultation with and feedback from data warehouse practitioners working at renowned IT companies, it has been observed that the fundamental problems arise in populating a warehouse with quality data. This paper focuses on the issues that hinder the data quality and performance of a data warehouse, and on some of the means that may be adopted to achieve better performance with respect to accuracy and quality, so as to meet the challenging and dynamic needs of the corporate world.

Index Terms

Computer Science
Information Sciences

Keywords

Data Warehouse (DW), Data Profiling, OLTP, Data Quality (DQ), ETL