Research Article

A Survey on Data Placement Strategies for Cloud based Scientific Workflows

by Lalitha Singh, Jyoti Malhotra
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 141 - Number 6
Year of Publication: 2016

Lalitha Singh, Jyoti Malhotra. A Survey on Data Placement Strategies for Cloud based Scientific Workflows. International Journal of Computer Applications. 141, 6 (May 2016), 30-33. DOI=10.5120/ijca2016909651

@article{ 10.5120/ijca2016909651,
author = { Lalitha Singh and Jyoti Malhotra },
title = { A Survey on Data Placement Strategies for Cloud based Scientific Workflows },
journal = { International Journal of Computer Applications },
issue_date = { May 2016 },
volume = { 141 },
number = { 6 },
month = { May },
year = { 2016 },
issn = { 0975-8887 },
pages = { 30-33 },
numpages = {9},
url = { },
doi = { 10.5120/ijca2016909651 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:42:46.005784+05:30
%A Lalitha Singh
%A Jyoti Malhotra
%T A Survey on Data Placement Strategies for Cloud based Scientific Workflows
%J International Journal of Computer Applications
%@ 0975-8887
%V 141
%N 6
%P 30-33
%D 2016
%I Foundation of Computer Science (FCS), NY, USA

Scientific workflows perform computations that exceed a single workstation’s capabilities. When such data-intensive workflows run in a cloud distributed across several physical locations, execution time and resource utilization depend heavily on how the input datasets are initially placed and distributed across the virtual machines in the cloud. An ideal data placement scheme optimizes the execution of data-intensive scientific workflows by assigning tasks to execution sites so that file transfers and their associated costs are reduced. This paper reviews several data placement strategies for cloud-based scientific workflows. In particular, it studies BDAP (Big Data Placement strategy), a placement scheme for big data workflows that improves workflow performance by minimizing data movement across multiple virtual machines.
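The placement objective described in the abstract can be illustrated with a small sketch. This is not the BDAP algorithm or any other strategy from the surveyed papers; it is a toy model in which every name (`movement_cost`, `best_placement`, the dataset, task, and VM identifiers, the sizes and capacities) is an assumption made for illustration. It scores a placement by the total size of the datasets that tasks must read from a different virtual machine, then searches every feasible placement exhaustively, which is only practical for very small inputs.

```python
from itertools import product


def movement_cost(placement, task_site, task_inputs, sizes):
    """Sum the sizes of datasets read by a task running on a different VM."""
    return sum(
        sizes[d]
        for task, inputs in task_inputs.items()
        for d in inputs
        if placement[d] != task_site[task]
    )


def best_placement(task_site, task_inputs, sizes, vms, capacity):
    """Brute-force search for the cheapest placement that respects VM storage limits."""
    datasets = sorted(sizes)
    best, best_cost = None, float("inf")
    for combo in product(vms, repeat=len(datasets)):
        placement = dict(zip(datasets, combo))
        # Skip placements that overflow any VM's storage capacity.
        used = {vm: 0 for vm in vms}
        for d, vm in placement.items():
            used[vm] += sizes[d]
        if any(used[vm] > capacity[vm] for vm in vms):
            continue
        cost = movement_cost(placement, task_site, task_inputs, sizes)
        if cost < best_cost:
            best, best_cost = placement, cost
    return best, best_cost


# Hypothetical workflow: two tasks on two VMs sharing dataset d2.
sizes = {"d1": 10, "d2": 4, "d3": 6}             # dataset sizes (arbitrary units)
task_site = {"t1": "vm1", "t2": "vm2"}           # VM on which each task runs
task_inputs = {"t1": ["d1", "d2"], "t2": ["d2", "d3"]}
capacity = {"vm1": 14, "vm2": 12}                # storage limit per VM
vms = ["vm1", "vm2"]

# A poor placement: every unshared dataset sits away from the task that reads it.
naive = movement_cost({"d1": "vm2", "d2": "vm2", "d3": "vm1"},
                      task_site, task_inputs, sizes)
best, cost = best_placement(task_site, task_inputs, sizes, vms, capacity)
print(naive, cost, best)
```

In this example the poor placement moves 20 units of data, while the best feasible placement moves only 4: the shared dataset `d2` must cross the network once no matter where it is stored. Real strategies replace the exhaustive search with heuristics such as clustering or genetic algorithms, because the number of possible placements grows exponentially with the number of datasets.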

  1. G. Juve and E. Deelman, "Scientific workflows and clouds," Journal of Crossroads, vol. 16, no. 3, pp. 14-18, 2010.
  2. U. V. Catalyurek, K. Kaya and B. Ucar, "Integrated Data Placement and Task Assignment for Scientific Workflows in clouds," In Proceedings of the fourth international workshop on Data-intensive distributed computing, pp. 45-54, 2011.
  3. T. Fahringer, R. Prodan, R. Duan, J. Hofer, F. Nadeem, F. Nerieri, S. Podlipnig, J. Qin, M. Siddiqui, H.-L. Truong, A. Villazon, and M.
  4. Tera-Grid.
  5. E. Deelman and A. Chervenak, "Data Management Challenges of Data- Intensive Scientific workflows," In Cluster Computing and the Grid, 2008. CCGRID'08. 8th IEEE International Symposium on, pp. 687-692, 2008.
  6. D. Yuan, Y. Yang, X. Liu and J. Chen, "A data placement strategy in scientific cloud workflows," Future Generation Computer Systems, vol. 26, no. 8, pp. 1200-1214, 2010.
  7. Wei Guo and Xinjun Wang, "A Data Placement Strategy Based on Genetic Algorithm in Cloud Computing Platform," In 10th Web Information System and Application Conference, pp. 369-372, 2013.
  8. Qiang Li, Kun Wang, Suwei Wei, Xuefeng Han and Lili Xu, "A data placement strategy based on clustering and consistent hashing algorithm in Cloud Computing," In 9th International Conference on Communications and Networking in China (CHINACOM), pp. 278-283, 2014.
  9. J. Wang, D. Crawl, I. Altintas, W. Li, "Big Data Applications Using Workflows for Data parallel Computing," Journal of Computing in Science and Engineering, vol. 16, no. 4, pp. 11-21, 2014.
  10. Mahdi Ebrahimi, Aravind Mohan, Andrey Kashlev and Shiyong Lu, "BDAP: A Big Data Placement Strategy for Cloud-Based Scientific Workflows," In Big Data Computing Service and Applications, 2015. First IEEE International Conference, pp. 105-114.
Index Terms

Computer Science
Information Sciences


Keywords: Cloud computing, Big data, Scientific workflow, Data placement, Virtual machine