Research Article

Comparative Analysis of Fault Tolerance Techniques in Grid Environment

by R. K. Bawa, Ramandeep Singh
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 41 - Number 1
Year of Publication: 2012
Authors: R. K. Bawa, Ramandeep Singh
DOI: 10.5120/5505-7520

R. K. Bawa, Ramandeep Singh. Comparative Analysis of Fault Tolerance Techniques in Grid Environment. International Journal of Computer Applications. 41, 1 (March 2012), 21-25. DOI=10.5120/5505-7520

@article{ 10.5120/5505-7520,
author = { R. K. Bawa and Ramandeep Singh },
title = { Comparative Analysis of Fault Tolerance Techniques in Grid Environment },
journal = { International Journal of Computer Applications },
issue_date = { March 2012 },
volume = { 41 },
number = { 1 },
month = { March },
year = { 2012 },
issn = { 0975-8887 },
pages = { 21-25 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume41/number1/5505-7520/ },
doi = { 10.5120/5505-7520 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T20:28:29.009442+05:30
%A R. K. Bawa
%A Ramandeep Singh
%T Comparative Analysis of Fault Tolerance Techniques in Grid Environment
%J International Journal of Computer Applications
%@ 0975-8887
%V 41
%N 1
%P 21-25
%D 2012
%I Foundation of Computer Science (FCS), NY, USA
Abstract

A Grid is a collection of heterogeneous resources connected through a network to execute complex jobs with high processing power requirements, which makes it more vulnerable to faults. Faults may degrade the performance and QoS of the Grid. They are handled either by avoiding them or by recovering from them, through re-execution or by resuming execution from the point of failure using checkpoints. The various fault tolerance techniques combine resource management and job scheduling services with a checkpointing scheme. Different techniques target different kinds of faults and have their respective advantages and limitations. In this paper, we analyze various faults, fault tolerance approaches, and techniques. Finally, the techniques are evaluated on resource utilization, redundancy, execution time, and checkpointing overhead.
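
The two recovery options named in the abstract, re-execution and resuming from a checkpoint, can be contrasted with a small illustrative sketch. This is not the paper's method: the function `steps_executed`, its parameters, and the single-fault model are assumptions made for illustration, and the cost of writing a checkpoint is ignored.

```python
def steps_executed(total_steps, fail_at, checkpoint_interval=None):
    """Count the steps run to finish a job that suffers one fault.

    Hypothetical model: the fault strikes once, after `fail_at` steps
    have completed. With checkpoint_interval=None the job restarts from
    step 0 (re-execution). Otherwise progress is saved every
    `checkpoint_interval` completed steps and the job resumes from the
    last saved step. Checkpointing overhead itself is not modelled.
    """
    if not 0 <= fail_at < total_steps:
        raise ValueError("fail_at must lie in [0, total_steps)")

    executed = 0          # total steps run, including repeated work
    done = 0              # steps completed in the current attempt
    last_checkpoint = 0   # most recent saved progress
    failed = False        # the single fault has not happened yet

    while done < total_steps:
        if not failed and done == fail_at:
            failed = True
            done = last_checkpoint  # roll back to the saved state
            continue
        done += 1
        executed += 1
        if checkpoint_interval and done % checkpoint_interval == 0:
            last_checkpoint = done
    return executed


if __name__ == "__main__":
    # A 100-step job that fails after 75 steps.
    print(steps_executed(100, 75))                          # 175
    print(steps_executed(100, 75, checkpoint_interval=20))  # 115
```

In this toy model re-execution repeats all 75 lost steps, while resuming from the checkpoint at step 60 repeats only 15. A real Grid would also pay a cost for each checkpoint written, which is the checkpointing overhead the evaluation considers.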

Index Terms

Computer Science
Information Sciences

Keywords

Fault Tolerance
Resource Management
Job Scheduling
Checkpointing
Replication