Research Article

Big Data Analytics using Hadoop

by Bijesh Dhyani, Anurag Barthwal
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 108 - Number 12
Year of Publication: 2014
Authors: Bijesh Dhyani, Anurag Barthwal
10.5120/18960-0288

Bijesh Dhyani, Anurag Barthwal. Big Data Analytics using Hadoop. International Journal of Computer Applications. 108, 12 (December 2014), 1-5. DOI=10.5120/18960-0288

@article{ 10.5120/18960-0288,
author = { Bijesh Dhyani and Anurag Barthwal },
title = { Big Data Analytics using Hadoop },
journal = { International Journal of Computer Applications },
issue_date = { December 2014 },
volume = { 108 },
number = { 12 },
month = { December },
year = { 2014 },
issn = { 0975-8887 },
pages = { 1-5 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume108/number12/18960-0288/ },
doi = { 10.5120/18960-0288 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-06T22:42:45.993785+05:30
%A Bijesh Dhyani
%A Anurag Barthwal
%T Big Data Analytics using Hadoop
%J International Journal of Computer Applications
%@ 0975-8887
%V 108
%N 12
%P 1-5
%D 2014
%I Foundation of Computer Science (FCS), NY, USA
Abstract

This paper presents a basic understanding of what Big Data is and its usefulness to an organization from the performance perspective. Along with the introduction to Big Data, the important parameters and attributes that make this emerging concept attractive to organizations are highlighted. The paper also evaluates how the challenges faced by a small organization differ from those of a medium- or large-scale operation, and therefore how their approach to and treatment of Big Data differ. Several examples of Big Data implementations across industries that vary in strategy, product, and processes are presented. The second part of the paper deals with the technology aspects of Big Data and its implementation in organizations. Since Hadoop has emerged as a popular tool for Big Data implementation, the paper covers the overall architecture of Hadoop and then describes each of its components in detail.

Index Terms

Computer Science
Information Sciences

Keywords

Big data, Hadoop, analytic databases, analytic applications