Research Article

Eliminating Homogeneous Cluster Setup for Efficient Parallel Data Processing

by Piyush Saxena, Satyajit Padhy, Praveen Kumar
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 64 - Number 17
Year of Publication: 2013
Authors: Piyush Saxena, Satyajit Padhy, Praveen Kumar
DOI: 10.5120/10723-5620

Piyush Saxena, Satyajit Padhy, Praveen Kumar. Eliminating Homogeneous Cluster Setup for Efficient Parallel Data Processing. International Journal of Computer Applications. 64, 17 (February 2013), 1-7. DOI=10.5120/10723-5620

@article{ 10.5120/10723-5620,
author = { Piyush Saxena and Satyajit Padhy and Praveen Kumar },
title = { Eliminating Homogeneous Cluster Setup for Efficient Parallel Data Processing },
journal = { International Journal of Computer Applications },
issue_date = { February 2013 },
volume = { 64 },
number = { 17 },
month = { February },
year = { 2013 },
issn = { 0975-8887 },
pages = { 1-7 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume64/number17/10723-5620/ },
doi = { 10.5120/10723-5620 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T21:16:40.606517+05:30
%A Piyush Saxena
%A Satyajit Padhy
%A Praveen Kumar
%T Eliminating Homogeneous Cluster Setup for Efficient Parallel Data Processing
%J International Journal of Computer Applications
%@ 0975-8887
%V 64
%N 17
%P 1-7
%D 2013
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This project proposes eliminating the homogeneous cluster setup in parallel data processing environments. A homogeneous cluster supports only static processing: resources cannot be allocated dynamically, which is a major obstacle to optimising response time for clients. Parallel data processing is increasingly common on today's internet, and a server must deliver its services to clients in optimal time. To achieve the highest client satisfaction, the server therefore needs to move away from the homogeneous cluster setup usually encountered in parallel data processing. The project also aims to ensure that each user's requirements are fulfilled in optimal time. This will improve overall resource utilization and, consequently, reduce processing cost.

Index Terms

Computer Science
Information Sciences

Keywords

Data mining, Data warehousing, Parallel data processing