Performance improvement in Distributed Systems through Replication and Checkpointing

Sourabh Dave; Abhishek Raghuvanshi

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 42 - Number 19

Year of Publication: 2012

10.5120/5801-8039

Sourabh Dave, Abhishek Raghuvanshi. Performance improvement in Distributed Systems through Replication and Checkpointing. International Journal of Computer Applications. 42, 19 (March 2012), 17-21. DOI=10.5120/5801-8039

@article{ 10.5120/5801-8039,
author = { Sourabh Dave and Abhishek Raghuvanshi },
title = { Performance improvement in Distributed Systems through Replication and Checkpointing },
journal = { International Journal of Computer Applications },
issue_date = { March 2012 },
volume = { 42 },
number = { 19 },
month = { March },
year = { 2012 },
issn = { 0975-8887 },
pages = { 17-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume42/number19/5801-8039/ },
doi = { 10.5120/5801-8039 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Journal Article

%1 2024-02-06T20:31:44.291034+05:30

%A Sourabh Dave

%A Abhishek Raghuvanshi

%T Performance improvement in Distributed Systems through Replication and Checkpointing

%J International Journal of Computer Applications

%@ 0975-8887

%V 42

%N 19

%P 17-21

%D 2012

%I Foundation of Computer Science (FCS), NY, USA

Abstract

Fault tolerance is an important issue in distributed systems. Many applications that run on several processors face problems of consistency and availability, because the failure of a single component can cause the whole process to fail. Many existing approaches to reliable execution are based on fault tolerance mechanisms. The basic idea of fault tolerance is to make a networked system tolerant enough to keep working properly, perhaps at somewhat lower efficiency, when a fault occurs. A good fault-tolerant system also avoids further failures. After a transient failure, the main problem is to bring the distributed system back to a consistent state. We address two parts of this problem: creating consistent checkpoints in a distributed system, and replication. We present a replication algorithm and implement it in Java RMI. Using that algorithm, we first replicate the checkpoints and second replicate the servers on different systems.
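
The abstract does not reproduce the authors' algorithm, so the following is only an illustrative sketch of the general idea of replicating checkpoints across servers behind an RMI-style interface. All names here (CheckpointReplicationSketch, Checkpoint, CheckpointStore, InMemoryStore, replicate, recover) are hypothetical and do not come from the paper. To stay self-contained, the replicas are plain in-memory objects that implement a java.rmi.Remote interface and are never exported through an RMI registry; a "down" flag simulates a crashed server.

```java
import java.io.Serializable;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only: every name below is an assumption, not the paper's code.
class CheckpointReplicationSketch {

    /** Serializable snapshot of one process's state (hypothetical shape). */
    static final class Checkpoint implements Serializable {
        private static final long serialVersionUID = 1L;
        final String processId;
        final long sequence;   // higher sequence = newer checkpoint
        final byte[] state;

        Checkpoint(String processId, long sequence, byte[] state) {
            this.processId = processId;
            this.sequence = sequence;
            this.state = state.clone();
        }
    }

    /** RMI-style interface that a replica server could expose. */
    interface CheckpointStore extends Remote {
        void store(Checkpoint cp) throws RemoteException;
        Checkpoint latest(String processId) throws RemoteException;
    }

    /** In-memory replica; setDown(true) simulates an unreachable server. */
    static final class InMemoryStore implements CheckpointStore {
        private final Map<String, Checkpoint> latest = new ConcurrentHashMap<>();
        private volatile boolean down;

        void setDown(boolean down) { this.down = down; }

        @Override
        public void store(Checkpoint cp) throws RemoteException {
            if (down) throw new RemoteException("replica unreachable");
            // Keep only the newest checkpoint per process.
            latest.merge(cp.processId, cp,
                    (old, neu) -> neu.sequence > old.sequence ? neu : old);
        }

        @Override
        public Checkpoint latest(String processId) throws RemoteException {
            if (down) throw new RemoteException("replica unreachable");
            return latest.get(processId);
        }
    }

    /** Sends the checkpoint to every replica; returns how many accepted it. */
    static int replicate(List<? extends CheckpointStore> replicas, Checkpoint cp) {
        int acks = 0;
        for (CheckpointStore replica : replicas) {
            try {
                replica.store(cp);
                acks++;
            } catch (RemoteException e) {
                // A failed replica is skipped; the others still hold a copy.
            }
        }
        return acks;
    }

    /** Returns the newest checkpoint held by any reachable replica, or null. */
    static Checkpoint recover(List<? extends CheckpointStore> replicas, String processId) {
        Checkpoint best = null;
        for (CheckpointStore replica : replicas) {
            try {
                Checkpoint cp = replica.latest(processId);
                if (cp != null && (best == null || cp.sequence > best.sequence)) {
                    best = cp;
                }
            } catch (RemoteException e) {
                // Unreachable replica: try the next one.
            }
        }
        return best;
    }

    public static void main(String[] args) {
        InMemoryStore a = new InMemoryStore();
        InMemoryStore b = new InMemoryStore();
        List<InMemoryStore> replicas = List.of(a, b);

        replicate(replicas, new Checkpoint("p1", 1, new byte[] {1}));
        replicate(replicas, new Checkpoint("p1", 2, new byte[] {2}));

        a.setDown(true); // one server fails after both checkpoints were replicated
        Checkpoint cp = recover(replicas, "p1");
        System.out.println("recovered sequence: " + cp.sequence);
    }
}
```

In this sketch a checkpoint survives as long as at least one replica that received it is still reachable. A replica that was down during a write can hold a stale checkpoint, which is why recover compares sequence numbers rather than trusting the first answer. A real deployment would export each store (for example through UnicastRemoteObject and a registry) and would need a protocol to ensure the checkpoints themselves form a consistent global state, which this sketch does not attempt.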


Index Terms

Computer Science

Information Sciences

Keywords

Checkpointing, Replication, RMI