Research Article

A Comprehensive View of MapReduce Aware Scheduling Algorithms in Cloud Environments

by Hadi Yazdanpanah, Amin Shouraki, Abbas Ali Abshirini
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 127 - Number 6
Year of Publication: 2015
10.5120/ijca2015906395

Hadi Yazdanpanah, Amin Shouraki, Abbas Ali Abshirini. A Comprehensive View of MapReduce Aware Scheduling Algorithms in Cloud Environments. International Journal of Computer Applications. 127, 6 (October 2015), 10-15. DOI=10.5120/ijca2015906395

@article{ 10.5120/ijca2015906395,
author = { Hadi Yazdanpanah and Amin Shouraki and Abbas Ali Abshirini },
title = { A Comprehensive View of MapReduce Aware Scheduling Algorithms in Cloud Environments },
journal = { International Journal of Computer Applications },
issue_date = { October 2015 },
volume = { 127 },
number = { 6 },
month = { October },
year = { 2015 },
issn = { 0975-8887 },
pages = { 10-15 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume127/number6/22732-2015906395/ },
doi = { 10.5120/ijca2015906395 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T23:19:09.425769+05:30
%A Hadi Yazdanpanah
%A Amin Shouraki
%A Abbas Ali Abshirini
%T A Comprehensive View of MapReduce Aware Scheduling Algorithms in Cloud Environments
%J International Journal of Computer Applications
%@ 0975-8887
%V 127
%N 6
%P 10-15
%D 2015
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Cloud computing has emerged as a model that harnesses the massive capacity of data centers to host services in a cost-effective manner. MapReduce, proposed by Google in 2004, has since become a popular parallel computing framework and is widely used as a Big Data processing platform. It is best suited to embarrassingly parallel, data-intensive tasks. It is designed to read large amounts of data stored in a distributed file system such as the Google File System (GFS), process the data in parallel, aggregate the results, and store them back to the distributed file system. Scheduling is one of the most critical aspects of MapReduce, and it raises three important issues: locality, synchronization, and fairness. This paper surveys and analyzes thirteen different aware scheduling algorithms for MapReduce in Hadoop, which use different techniques and approaches, along with their scheduling issues and problems. Finally, the advantages and disadvantages of these algorithms are identified.
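
The read, process, aggregate, and store flow described in the abstract can be illustrated with a minimal single-process sketch. It is not taken from the paper and does not use Hadoop; the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the word-count job are assumptions chosen only for illustration. In a real cluster the map and reduce calls would run as separate tasks that a scheduler places on different nodes.

```python
from collections import defaultdict


def map_phase(split):
    """Map step: emit a (word, 1) pair for every word in one input split."""
    for word in split.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Shuffle step: group all intermediate values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: aggregate the values collected for one key."""
    return key, sum(values)


def run_job(splits):
    """Run map over every split, shuffle, then reduce each key group."""
    intermediate = [pair for split in splits for pair in map_phase(split)]
    grouped = shuffle(intermediate)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input splits standing in for blocks of a distributed file.
splits = ["map reduce map", "reduce schedule"]
counts = run_job(splits)
print(counts)
```

Each call to `map_phase` depends only on its own split, which is why the map stage is embarrassingly parallel; the reduce stage cannot start on a key until every map output for that key is available, which is the synchronization issue the abstract refers to.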

Index Terms

Computer Science
Information Sciences

Keywords

Cloud Computing, MapReduce, Scheduling algorithms