DOI: 10.5120/ijca2017915130
Swati R Mahendrakar and B M Patil. PRISM: Fine-Grained Phase and Resource Information-aware Scheduler for Map-Reduce. International Journal of Computer Applications 172(4):32-39, August 2017. BibTeX
@article{10.5120/ijca2017915130, author = {Swati R. Mahendrakar and B. M. Patil}, title = {PRISM: Fine-Grained Phase and Resource Information-aware Scheduler for Map-Reduce}, journal = {International Journal of Computer Applications}, issue_date = {August 2017}, volume = {172}, number = {4}, month = {Aug}, year = {2017}, issn = {0975-8887}, pages = {32-39}, numpages = {8}, url = {http://www.ijcaonline.org/archives/volume172/number4/28241-2017915130}, doi = {10.5120/ijca2017915130}, publisher = {Foundation of Computer Science (FCS), NY, USA}, address = {New York, USA} }
Abstract
In recent years, Map Reduce has become a popular model for data-intensive computation. It can significantly reduce the execution time of data-intensive jobs by breaking each job down into small map and reduce tasks and executing them in parallel across a large number of machines. However, existing solutions mainly focus on scheduling at the task level, which offers sub-optimal job performance because a task's resource requirements may vary during its lifetime. This makes it difficult for existing task-level schedulers to utilize available resources effectively and thereby reduce job execution time.
To overcome this limitation, PRISM (Phase and Resource Information-aware Scheduler for Map-Reduce) is introduced. PRISM is a fine-grained, resource-aware Map Reduce scheduler that divides tasks into phases and performs resource-aware scheduling across the cluster at the level of phases rather than whole tasks. Each phase has a constant resource usage profile, so that no phase suffers from starvation. PRISM also offers high resource utilization and provides a 1.3x improvement in job running time compared to the current Hadoop schedulers.
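The difference between task-level and phase-level admission can be illustrated with a toy model. The following is a minimal sketch, not the scheduling algorithm of PRISM itself: the `Phase` class, the greedy `schedule_round` policy, the phase names, and all resource figures are hypothetical and chosen only for illustration. It assumes that a task-level scheduler reserves a task's peak demand for the task's whole lifetime, while a phase-level scheduler reserves only what the current phase needs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    task: str    # task the phase belongs to
    name: str    # illustrative phase name, e.g. "shuffle" or "reduce"
    cpu: float   # cores needed while the phase runs (assumed constant)
    mem: float   # GB needed while the phase runs (assumed constant)


def schedule_round(pending, cpu_capacity, mem_capacity):
    """Greedily admit pending items, in order, while they fit on the node."""
    free_cpu, free_mem = cpu_capacity, mem_capacity
    admitted = []
    for item in pending:
        if item.cpu <= free_cpu and item.mem <= free_mem:
            admitted.append(item)
            free_cpu -= item.cpu
            free_mem -= item.mem
    return admitted


def task_peak(phases):
    """Task-level view: a single reservation sized for the task's peak demand."""
    return Phase(phases[0].task, "whole-task",
                 max(p.cpu for p in phases), max(p.mem for p in phases))


# Three hypothetical reduce tasks, each with a light phase followed by a heavy one.
tasks = {
    t: [Phase(t, "shuffle", 0.5, 1.0), Phase(t, "reduce", 2.0, 3.0)]
    for t in ("r1", "r2", "r3")
}

# One hypothetical node with 4 cores and 8 GB of memory.
task_level = schedule_round([task_peak(p) for p in tasks.values()], 4.0, 8.0)
phase_level = schedule_round([p[0] for p in tasks.values()], 4.0, 8.0)

print(len(task_level), "tasks admitted with task-level reservations")
print(len(phase_level), "tasks admitted with phase-level reservations")
```

Under these made-up numbers, reserving the peak demand admits two tasks, whereas reserving only the demand of the current (shuffle) phase admits all three. This is the kind of utilization gap that phase-level scheduling is intended to close.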
Keywords
Map Reduce, scheduling, resource allocation.