Research Article

Fault Tolerance Approach in Mobile Distributed Systems

Published in July 2015 by Renu, Praveen Kumar
Innovations in Computing and Information Technology (Cognition 2015)
Foundation of Computer Science USA
COGNITION2015 - Number 2
July 2015
Authors: Renu, Praveen Kumar

Renu, Praveen Kumar. Fault Tolerance Approach in Mobile Distributed Systems. Innovations in Computing and Information Technology (Cognition 2015). COGNITION2015, 2 (July 2015), 15-19.

@article{
author = { Renu, Praveen Kumar },
title = { Fault Tolerance Approach in Mobile Distributed Systems },
journal = { Innovations in Computing and Information Technology (Cognition 2015) },
issue_date = { July 2015 },
volume = { COGNITION2015 },
number = { 2 },
month = { July },
year = { 2015 },
issn = { 0975-8887 },
pages = { 15-19 },
numpages = 5,
url = { /proceedings/cognition2015/number2/21894-2124/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 Innovations in Computing and Information Technology (Cognition 2015)
%A Renu
%A Praveen Kumar
%T Fault Tolerance Approach in Mobile Distributed Systems
%J Innovations in Computing and Information Technology (Cognition 2015)
%@ 0975-8887
%V COGNITION2015
%N 2
%P 15-19
%D 2015
%I International Journal of Computer Applications
Abstract

Mobile agents have become very popular and have gained importance in recent years due to the exponential growth of internet applications. Designing a fault-tolerant system for this setting is challenging because of the limited bandwidth of wireless networks, the mobility of mobile hosts, limited local storage, limited battery power, and handoff. A distributed system is a collection of independent entities that cooperate to solve a problem that cannot be solved individually. A distributed system fails when it does not meet its specifications. Fault-tolerance techniques enable a system to perform its tasks even in the presence of faults. To deal with failures, a checkpoint is taken at a specific point in a program, where normal processing is interrupted specifically to preserve status information. To recover from a failure, one may restart the computation from the most recent checkpoint, thereby avoiding repeating computation from an earlier consistent global checkpoint. A mobile computing system is a distributed system in which some of the processes run on mobile hosts (MHs), whose location in the network changes with time. The number of processes that take checkpoints is minimized in order to 1) avoid awakening MHs that are in doze mode, 2) minimize thrashing of MHs with checkpointing activity, and 3) conserve the limited battery life of MHs and the low bandwidth of wireless channels. In this paper we provide an overview of fault tolerance in Mobile Distributed Systems (MDS).
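
The checkpoint-and-restart idea described in the abstract can be illustrated with a minimal sketch. It assumes a single process whose state fits in a dictionary; the class `CheckpointedProcess` and its methods are hypothetical names chosen for illustration and are not taken from the paper. It shows only local checkpointing and rollback, not the coordination needed to obtain a consistent global checkpoint across several processes.

```python
import copy


class CheckpointedProcess:
    """Hypothetical process that saves its state so it can roll back after a failure."""

    def __init__(self):
        self.state = {"step": 0, "total": 0}
        # The initial state serves as the first checkpoint.
        self._checkpoint = copy.deepcopy(self.state)

    def take_checkpoint(self):
        # Interrupt normal processing to preserve the current status information.
        self._checkpoint = copy.deepcopy(self.state)

    def do_work(self, value):
        # One unit of normal computation.
        self.state["step"] += 1
        self.state["total"] += value

    def recover(self):
        # Roll back to the most recent checkpoint instead of restarting from scratch.
        self.state = copy.deepcopy(self._checkpoint)


def run_demo():
    p = CheckpointedProcess()
    p.do_work(10)
    p.do_work(20)
    p.take_checkpoint()  # saved state: step=2, total=30
    p.do_work(5)         # this work is lost when the failure occurs
    p.recover()          # simulated failure followed by rollback
    return p.state
```

After the rollback, only the work done since the last checkpoint has to be repeated, which is the saving that checkpointing is meant to provide.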

Index Terms

Computer Science
Information Sciences

Keywords

Domino Effect, Rollback Recovery, Mobile Host, Mobile Support Station, Consistent Global Checkpoint