Dynamic Job Ordering and Slot Configuration for MapReduce Workloads

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2017
Authors: Sonali S. Birajadar, B. M. Patil, V. M. Chandode
DOI: 10.5120/ijca2017915351

Sonali S. Birajadar, B. M. Patil and V. M. Chandode. Dynamic Job Ordering and Slot Configuration for MapReduce Workloads. International Journal of Computer Applications 173(7):8-12, September 2017.

@article{10.5120/ijca2017915351,
	author = {Sonali S. Birajadar and B. M. Patil and V. M. Chandode},
	title = {Dynamic Job Ordering and Slot Configuration for MapReduce Workloads},
	journal = {International Journal of Computer Applications},
	issue_date = {September 2017},
	volume = {173},
	number = {7},
	month = {Sep},
	year = {2017},
	issn = {0975-8887},
	pages = {8-12},
	numpages = {5},
	url = {http://www.ijcaonline.org/archives/volume173/number7/28345-2017915351},
	doi = {10.5120/ijca2017915351},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}
}

Abstract

The amount of data being generated is growing exponentially, and rising internet use means that internet service providers must handle very large volumes of data. MapReduce is a well-established solution for implementing large-scale distributed data applications. A MapReduce workload generally contains a set of jobs, each of which consists of multiple map and reduce tasks. A job's map tasks execute before its reduce tasks; map tasks can run only in map slots, and reduce tasks can run only in reduce slots. As a result, different job execution orders and map/reduce slot configurations for a MapReduce workload yield different performance and system utilization. Makespan and total completion time are two key performance metrics. This paper proposes two classes of algorithms targeting these two metrics. The first class focuses on job ordering optimization for a MapReduce workload under a given slot configuration, and the second class optimizes the slot configuration for a MapReduce workload.
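
To make the two metrics concrete, the following Python sketch evaluates job orders and slot splits under a deliberately simplified model. It is not the paper's algorithm: the workload `JOBS`, the function names, the assumption that a phase's duration equals its work divided by the number of slots, and the brute-force search over orders and slot splits are all illustrative assumptions.

```python
from itertools import permutations

# Hypothetical workload: (name, map_work, reduce_work) in slot-seconds.
JOBS = [("A", 40, 10), ("B", 10, 30), ("C", 20, 20)]

MAKESPAN = 0          # index of makespan in the metrics tuple
TOTAL_COMPLETION = 1  # index of total completion time in the metrics tuple


def schedule_metrics(order, map_slots, reduce_slots):
    """Return (makespan, total_completion_time) for jobs run in `order`.

    Assumed model: a job's map phase lasts map_work / map_slots and its
    reduce phase lasts reduce_work / reduce_slots. Jobs use the map slots
    one after another, and a job's reduce phase starts only after its own
    map phase and the previous job's reduce phase have finished.
    """
    map_end = 0.0
    reduce_end = 0.0
    total_completion = 0.0
    for _name, map_work, reduce_work in order:
        map_end += map_work / map_slots
        reduce_end = max(reduce_end, map_end) + reduce_work / reduce_slots
        total_completion += reduce_end  # this job completes at reduce_end
    return reduce_end, total_completion


def best_order(jobs, map_slots, reduce_slots, metric=MAKESPAN):
    """Brute-force the job order that minimizes the chosen metric."""
    return min(
        permutations(jobs),
        key=lambda order: schedule_metrics(order, map_slots, reduce_slots)[metric],
    )


def best_configuration(jobs, total_slots, metric=MAKESPAN):
    """Try every map/reduce split of `total_slots`, each with its best order.

    Returns (metric_value, map_slots, reduce_slots, job_names_in_order).
    """
    best = None
    for map_slots in range(1, total_slots):
        reduce_slots = total_slots - map_slots
        order = best_order(jobs, map_slots, reduce_slots, metric)
        value = schedule_metrics(order, map_slots, reduce_slots)[metric]
        if best is None or value < best[0]:
            best = (value, map_slots, reduce_slots, [job[0] for job in order])
    return best


if __name__ == "__main__":
    print(schedule_metrics(JOBS, 2, 2))
    print(best_order(JOBS, 2, 2, MAKESPAN))
    print(best_configuration(JOBS, 4, TOTAL_COMPLETION))
```

With this hypothetical workload and two slots of each type, running the jobs in the order A, B, C gives a makespan of 50 and a total completion time of 115, while the order B, C, A gives 40 and 90. The brute-force search is only practical for a handful of jobs; it is used here to show how both the order and the slot split change the metrics, not as a scalable method.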

Keywords

MapReduce, Hadoop, Flow-shops, Scheduling algorithm, Job ordering.