Research Article

Empirical Models for the Performance of ETL Processes

by M Mrunalini, T V Suresh Kumar, K Rajani Kanth
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 92 - Number 5
Year of Publication: 2014
Authors: M Mrunalini, T V Suresh Kumar, K Rajani Kanth
DOI: 10.5120/16007-5017

M Mrunalini, T V Suresh Kumar, K Rajani Kanth. Empirical Models for the Performance of ETL Processes. International Journal of Computer Applications. 92, 5 (April 2014), 36-41. DOI=10.5120/16007-5017

@article{ 10.5120/16007-5017,
author = { M Mrunalini and T V Suresh Kumar and K Rajani Kanth },
title = { Empirical Models for the Performance of ETL Processes },
journal = { International Journal of Computer Applications },
issue_date = { April 2014 },
volume = { 92 },
number = { 5 },
month = { April },
year = { 2014 },
issn = { 0975-8887 },
pages = { 36-41 },
numpages = {6},
url = { https://ijcaonline.org/archives/volume92/number5/16007-5017/ },
doi = { 10.5120/16007-5017 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:13:30.757189+05:30
%A M Mrunalini
%A T V Suresh Kumar
%A K Rajani Kanth
%T Empirical Models for the Performance of ETL Processes
%J International Journal of Computer Applications
%@ 0975-8887
%V 92
%N 5
%P 36-41
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The outcomes of software projects generally reflect several quality parameters, and empirical studies based on prototyping exercises are well suited to analyzing and understanding such systems. ETL (Extraction-Transformation-Loading) is the software responsible for extracting data, cleaning and transforming it, and loading it into a data warehouse. ETL is a large software system. The performance of a decision support system depends on the data warehouse it uses, and ETL tools play a major role in building that warehouse, so these tools must perform well for the whole system to perform well. An experimental study was conducted to analyze the performance of ETL tools. Two tools were considered: one with integrated security and one without. The data extraction time in different environments was recorded. Regression analysis was then applied to the experimental data to observe the behavior of the tools and to develop empirical models. Both tools showed the same performance behavior across different extraction data sizes.
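
The regression step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the form of the empirical models, so the straight-line model (time = intercept + slope × size), the function name `fit_linear`, and all measurement values below are assumptions made for illustration only; they are not the authors' models or data.

```python
from statistics import mean


def fit_linear(sizes, times):
    """Ordinary least squares fit of: time = intercept + slope * size.

    Returns (intercept, slope, r_squared). The linear form is an
    assumption; the paper's actual model form is not given in the abstract.
    """
    if len(sizes) != len(times) or len(sizes) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")

    x_bar, y_bar = mean(sizes), mean(times)
    sxx = sum((x - x_bar) ** 2 for x in sizes)
    if sxx == 0:
        raise ValueError("all sizes are identical; slope is undefined")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(sizes, times))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # Coefficient of determination: share of variance explained by the fit.
    ss_tot = sum((y - y_bar) ** 2 for y in times)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(sizes, times))
    r_squared = 1.0 if ss_tot == 0 else 1.0 - ss_res / ss_tot
    return intercept, slope, r_squared


if __name__ == "__main__":
    # Hypothetical extraction sizes (MB) and times (s); NOT from the paper.
    sizes_mb = [100, 200, 400, 800, 1600]
    time_without_security = [4.1, 7.9, 16.2, 31.8, 64.5]
    time_with_security = [5.0, 9.8, 19.9, 39.7, 80.1]

    for label, times in [
        ("without integrated security", time_without_security),
        ("with integrated security", time_with_security),
    ]:
        a, b, r2 = fit_linear(sizes_mb, times)
        print(f"{label}: time = {a:.3f} + {b:.4f} * size, R^2 = {r2:.4f}")
```

Fitting the same model form to both tools makes their behavior directly comparable: similar curve shapes with different coefficients would correspond to the abstract's observation that both tools behave alike as the extraction size grows.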

References
  1. Ralph Kimball, "The Data Warehouse ETL Toolkit", Wiley Publications, 2006.
  2. M Mrunalini, T V Suresh Kumar, K Rajani Kanth, 2013 "Secure ETL Process Model: An Assessment of Security in Different Phases of ETL", Proc. International Journal of Software Engineering, IJSE, Vol. 6 No. 1, January 2013 pp 33-63.
  3. Alkis Simitsis, Panos Vassiliadis, Timos Sellis, 2005 "Optimizing ETL Processes in Data Warehouses", Proc. of the 21st International Conference on Data Engineering (ICDE 2005), pp 564-575.
  4. Tony Brown, 2007 "Data Warehouse Efficiency Techniques with the SAS System", SAS Global Forum 2007, April 16-19, 2007 - Orlando, Florida, pp 1-10.
  5. Manole Velicanu, 2007 "Building a Data Warehouse step by step", Informatica Economică, nr. 2 (42)/2007, pp 83-89.
  6. Ion Lungu, Manole Velicanu, Adela Bara, Vlad Diaconița, Iuliana Botha, 2008 "Practices for Designing and Improving Data Extraction in a Virtual Data Warehouses Project", Int. J. of Computers, Communications & Control, ISSN 1841-9836, E-ISSN 1841-9844 Vol. III, Suppl. issue: Proceedings of ICCCC 2008, pp. 369-374.
  7. Stefano Rizzi, Alberto Abelló, Jens Lechtenbörger and Juan Trujillo, 2006 "Research in Data Warehouse Modeling and Design: Dead or Alive?" Proc. DOLAP'06, pp. 3-10.
  8. Dag I. K. Sjøberg, Bente Anda, Erik Arisholm, Tore Dybå, Magne Jørgensen, Amela Karahasanovic, Espen F. Koren, Marek Vokác, 2002 "Conducting realistic experiments in software engineering", Proc. 1st Int. Symposium on Empirical Software Engineering, pp 17-26.
  9. Jarke M., Jeusfeld M. A., Quix C. and Vassiliadis P, 1999 "Architecture and quality in data warehouses: An extended repository approach," Proc. Information Systems, vol. 24(3), pp. 229–253.
  10. Jarke M., Lenzerini M., Vassiliou Y. and Vassiliadis P, "Fundamentals of Data Warehousing," Springer Verlag, 2003.
  11. A. Simitsis, P. Vassiliadis and T. K. Sellis, 2005 "Optimizing ETL processes in data warehouses," Proc. ICDE, pp. 564–575.
  12. P. Vassiliadis, A. Simitsis, P. Georgantas, M. Terrovitis and S. Skiadopoulos, 2005 "A generic and customizable framework for the design of ETL scenarios," Proc. Information Systems, 30 (7), pp. 492–525.
  13. A. Simitsis, 2005 "Mapping conceptual to logical models for ETL processes," Proc. DOLAP, pp. 67–76.
  14. P. Vassiliadis, A. Simitsis and S. Skiadopoulos, 2002 "Conceptual modeling for ETL processes," Proc. DOLAP, pp. 14–21.
  15. M. Bouzeghoub, F. Fabret and M. Matulovic, 1999 "Modeling data warehouse refreshment process as a workflow application," Proc. DMDW, pp. 6.1–6.12.
  16. D. Calvanese, G. De Giacomo, M. Lenzerini, D. Nardi and R. Rosati, 1998 "Information integration: Conceptual modeling and reasoning support," Proc. CoopIS, pp. 280–291.
  17. Juan Trujillo and Sergio Lujan-Mora, "A UML based approach for Modeling ETL Processes in Data Warehouses," in LNCS, Springer Verlag, vol. 2813/2003, pp. 307–320, 2003.
  18. Vasiliki Tziovara, Panos Vassiliadis and Alkis Simitsis, 2007 "Deciding the Physical Implementation of ETL Workflows," Proc. ACM tenth international workshop on Data warehousing and OLAP (DOLAP'07), pp. 49–56.
  19. D. W. Embley, D. M. Campbell, Y. S. Jiang, S. W. Liddle, D. W. Lonsdale, Y.-K. Ng and R. D. Smith, 1999 "Conceptual-Model-Based Data Extraction from Multiple-Record Web Pages," in Data and Knowledge Engineering, vol. 31, no. 3, pp. 227–251.
  20. Alkis Simitsis. Modeling and Managing ETL Processes [Online]. Available: http://ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-76/simitsis.pdf
  21. Panos Vassiliadis, Alkis Simitsis and Spiros Skiadopoulos, 2002 "Logical Modeling of ETL Processes," Proc. International Conference on Advanced Information Systems Engineering, pp. 782–786.
  22. Panos Vassiliadis, Alkis Simitsis and Spiros Skiadopoulos, 2002 "Modeling ETL Activities as Graphs," Proc. DMDW'2002, Toronto, Canada, pp. 52–61.
  23. Dewayne E. Perry, Adam A. Porter, Lawrence G. Votta, 2000 "Empirical Studies of Software Engineering: A Roadmap", Proc. The Future of Software Engineering, pp 345–355.
  24. M Mrunalini, T V Suresh Kumar, K Rajani Kanth, 2013 "Assessing the Performance and Security Trade-offs at the Early Stages of Software Development", Proc. IndiaCom 2013, New Delhi, India, pp 353-360.
Index Terms

Computer Science
Information Sciences

Keywords

Regression Analysis, Integrated Security, Secure ETL, Performance Analysis, Experimental Study