Research Article

Hadoop: An Effective Framework for Big Data Analytics

Published on September 2016 by Dilbag Singh, Chirag Goyal
Recent Innovations in Computer Science and Information Technology
Foundation of Computer Science USA
RICSIT2016 - Number 1
September 2016
Authors: Dilbag Singh, Chirag Goyal

Dilbag Singh, Chirag Goyal. Hadoop: An Effective Framework for Big Data Analytics. Recent Innovations in Computer Science and Information Technology. RICSIT2016, 1 (September 2016), 13-16.

@article{
author = { Dilbag Singh, Chirag Goyal },
title = { Hadoop: An Effective Framework for Big Data Analytics },
journal = { Recent Innovations in Computer Science and Information Technology },
issue_date = { September 2016 },
volume = { RICSIT2016 },
number = { 1 },
month = { September },
year = { 2016 },
issn = { 0975-8887 },
pages = { 13-16 },
numpages = 4,
url = { /proceedings/ricsit2016/number1/26185-2019/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 Recent Innovations in Computer Science and Information Technology
%A Dilbag Singh
%A Chirag Goyal
%T Hadoop: An Effective Framework for Big Data Analytics
%J Recent Innovations in Computer Science and Information Technology
%@ 0975-8887
%V RICSIT2016
%N 1
%P 13-16
%D 2016
%I International Journal of Computer Applications
Abstract

Analyzing enormous amounts of data is becoming a major challenge for decision makers. Big data refers to datasets that are large in size and high in variety, velocity, and volume, so a means is needed to handle these datasets and extract valuable insights from them with better precision. Handling such data with traditional databases and techniques is tedious, and in some cases impossible, because it requires massively parallel processing and scalability that existing methods do not support. Hadoop supports scalability by providing large storage and distributing big datasets over a large number of servers operating in parallel. Traditional relational database systems do not scale to process big data, and scaling a traditional RDBMS to such data multiplies the cost beyond what is affordable. To reduce cost, organizations have had to down-sample data and classify it on assumptions, deleting raw data that may be useful only for a short term. Hadoop is designed as a scale-out architecture and can affordably store a company's data for future use. In this paper, big data analytics is carried out using an experimental research method. Structured queries are executed on a Hadoop cluster and in an RDBMS environment using secondary datasets, and the response time of the RDBMS is compared with that of the Hadoop framework.
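
As a rough illustration of how a structured query can be expressed in the map, shuffle, and reduce style that Hadoop distributes across servers, the following minimal sketch computes the equivalent of `SELECT region, SUM(amount) ... GROUP BY region` in plain Python. It is not the authors' experimental setup: the `sales` records, the function names, and the number of splits are all hypothetical, and the splits are processed sequentially here rather than on a cluster.

```python
from collections import defaultdict


def split_records(records, num_splits):
    """Partition the input into splits, one per hypothetical worker."""
    return [records[i::num_splits] for i in range(num_splits)]


def map_split(split):
    """Map phase: emit one (key, value) pair per record."""
    return [(region, amount) for region, amount in split]


def shuffle(mapped_splits):
    """Shuffle phase: group all emitted values by key across splits."""
    grouped = defaultdict(list)
    for pairs in mapped_splits:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped


def reduce_groups(grouped):
    """Reduce phase: aggregate each key's values (here, a sum)."""
    return {key: sum(values) for key, values in grouped.items()}


def grouped_sum(records, num_splits=3):
    """Run the three phases in sequence over the given records."""
    splits = split_records(records, num_splits)
    # On a real cluster each split would be mapped in parallel on a different node.
    mapped = [map_split(s) for s in splits]
    return reduce_groups(shuffle(mapped))


# Hypothetical sample data: (region, amount) rows.
sales = [("north", 10), ("south", 5), ("north", 7), ("east", 3), ("south", 1)]
totals = grouped_sum(sales)
print(totals)
```

Because each split is mapped independently, the result does not depend on how many splits are used, which is the property that lets this style of query scale out across servers.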

Index Terms

Computer Science
Information Sciences

Keywords

Big Data, Hadoop Cluster, HDFS, MapReduce