Research Article

Methods to Enhance Transformation in Near Real Time ETL

by Mohammed Muddasir N., Ravi Kumar V., Prajwal V.
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 137 - Number 5
Year of Publication: 2016
Authors: Mohammed Muddasir N., Ravi Kumar V., Prajwal V.
10.5120/ijca2016908733

Mohammed Muddasir N., Ravi Kumar V., Prajwal V. Methods to Enhance Transformation in Near Real Time ETL. International Journal of Computer Applications. 137, 5 (March 2016), 20-24. DOI=10.5120/ijca2016908733

@article{ 10.5120/ijca2016908733,
author = { Mohammed Muddasir N. and Ravi Kumar V. and Prajwal V. },
title = { Methods to Enhance Transformation in Near Real Time ETL },
journal = { International Journal of Computer Applications },
issue_date = { March 2016 },
volume = { 137 },
number = { 5 },
month = { March },
year = { 2016 },
issn = { 0975-8887 },
pages = { 20-24 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume137/number5/24272-2016908733/ },
doi = { 10.5120/ijca2016908733 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:37:34.549113+05:30
%A Mohammed Muddasir N.
%A Ravi Kumar V.
%A Prajwal V.
%T Methods to Enhance Transformation in Near Real Time ETL
%J International Journal of Computer Applications
%@ 0975-8887
%V 137
%N 5
%P 20-24
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Techniques applied during the transformation phase of near real time ETL can improve both its speed and its accuracy. The transformation phase converts transactional data into a format that is semantically suitable for the data warehouse. We discuss solutions that could enhance the speed and accuracy of this phase, such as advanced query optimization techniques and redesigning the workflow so that some tasks can be rescheduled. For example, a function applied on two parallel flows could be applied only once if the flows converge. We also examine solutions for stream data: how stream data can be merged with stored data, and the associated challenges of speed and memory utilization. Finally, we explore event-based transformation for selected items and efficient handling of metadata so that it adds value to the transformation phase.
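
The workflow example in the abstract (a function on two parallel flows applied only once when the flows converge) can be illustrated with a small Python sketch. This is not the authors' implementation. The row schema, the `cents_to_units` function, the duplicate-removing merge, and the call counter are all assumptions made for illustration. The rewrite is only valid when the function is row-wise and deterministic, and the saving shown here comes from the merge removing rows that appear in both flows.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical row schema: (order_id, amount_cents).
Row = Tuple[int, int]


def union_distinct(*flows: Iterable[tuple]) -> List[tuple]:
    """Converge several flows into one, dropping duplicate rows (order kept)."""
    seen = set()
    merged = []
    for flow in flows:
        for row in flow:
            if row not in seen:
                seen.add(row)
                merged.append(row)
    return merged


class CountingTransform:
    """Wraps a row-wise function and counts how often it is invoked."""

    def __init__(self, fn: Callable[[tuple], tuple]) -> None:
        self.fn = fn
        self.calls = 0

    def __call__(self, row: tuple) -> tuple:
        self.calls += 1
        return self.fn(row)


def cents_to_units(row: Row) -> Tuple[int, float]:
    """Hypothetical transformation: convert an amount in cents to units."""
    order_id, amount_cents = row
    return (order_id, amount_cents / 100)


def transform_per_flow(flow_a, flow_b, fn):
    """Original plan: the same function sits on both parallel flows."""
    return union_distinct([fn(r) for r in flow_a], [fn(r) for r in flow_b])


def transform_after_merge(flow_a, flow_b, fn):
    """Rewritten plan: converge the flows first, then apply the function once."""
    return [fn(r) for r in union_distinct(flow_a, flow_b)]


# Two parallel flows that share some rows (made-up sample data).
flow_a = [(1, 1250), (2, 300), (3, 999)]
flow_b = [(2, 300), (3, 999), (4, 4200)]

per_flow = CountingTransform(cents_to_units)
after_merge = CountingTransform(cents_to_units)

result_per_flow = transform_per_flow(flow_a, flow_b, per_flow)
result_after_merge = transform_after_merge(flow_a, flow_b, after_merge)

print(result_per_flow == result_after_merge)  # both plans yield the same rows
print(per_flow.calls, after_merge.calls)      # invocations under each plan
```

In this sketch both plans produce the same output, but the rewritten plan invokes the function only on the rows that survive the merge. Whether such a rewrite pays off in a real workflow depends on the cost of the function and on how many rows the converging step eliminates.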

Index Terms

Computer Science
Information Sciences

Keywords

ETC CBR MDB