Research Article

Heterogeneous Data Processing using Hadoop and Java Map/Reduce

by Jasmeet Singh Puaar, Ramanjeet Kaur
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 146 - Number 9
Year of Publication: 2016
10.5120/ijca2016910846

Jasmeet Singh Puaar, Ramanjeet Kaur. Heterogeneous Data Processing using Hadoop and Java Map/Reduce. International Journal of Computer Applications 146, 9 (Jul 2016), 13-16. DOI=10.5120/ijca2016910846

@article{10.5120/ijca2016910846,
  author     = {Jasmeet Singh Puaar and Ramanjeet Kaur},
  title      = {Heterogeneous Data Processing using Hadoop and Java Map/Reduce},
  journal    = {International Journal of Computer Applications},
  issue_date = {Jul 2016},
  volume     = {146},
  number     = {9},
  month      = {Jul},
  year       = {2016},
  issn       = {0975-8887},
  pages      = {13-16},
  numpages   = {4},
  url        = {https://ijcaonline.org/archives/volume146/number9/25425-2016910846/},
  doi        = {10.5120/ijca2016910846},
  publisher  = {Foundation of Computer Science (FCS), NY, USA},
  address    = {New York, USA}
}
%0 Journal Article
%A Jasmeet Singh Puaar
%A Ramanjeet Kaur
%T Heterogeneous Data Processing using Hadoop and Java Map/Reduce
%J International Journal of Computer Applications
%@ 0975-8887
%V 146
%N 9
%P 13-16
%D 2016
%I Foundation of Computer Science (FCS), NY, USA
Abstract

The objective of this paper is to analyse heterogeneous sample data from the New York Stock Exchange (NYSE) using Java MapReduce on the Hadoop platform. The Java programming language and the Java MapReduce API are used to process a very large volume of data, i.e. big data. The source data is heterogeneous: the input files differ in format and structure, so handling the data and routing it to the mappers to produce a single reduced output file was a challenge. The NYSE data is analysed to find the maximum and minimum price of each stock for every year, and to calculate the average price of a stock for a particular year using the dividend records in the sample data. Two different input files are used: dividends.csv and sample_prices.csv. The output of the program is written to HDFS; it can then be exported with Sqoop or copied manually from HDFS to the local (NTFS) file system for further processing.
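The paper's source code is not reproduced on this page. The following is a minimal sketch of the approach the abstract describes: Hadoop's MultipleInputs attaches a separate mapper to each heterogeneous file, and a single reducer aggregates records per stock symbol and year. The CSV column positions (symbol, date, value), the "P:"/"D:" value-tagging scheme, and the class names are assumptions made for illustration, not the authors' actual implementation.

// Illustrative sketch only: the column layouts of sample_prices.csv and
// dividends.csv (symbol,date,value) are assumptions, not taken from the paper.
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.MultipleInputs;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class NyseStockAnalysis {

  // Mapper for sample_prices.csv; assumed layout: symbol,date,closing_price
  public static class PriceMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      String[] f = value.toString().split(",");
      if (f.length < 3 || f[0].equalsIgnoreCase("symbol")) return;   // skip header/bad rows
      // Key: "SYMBOL-YEAR"; value tagged "P:" so the reducer knows it is a price.
      ctx.write(new Text(f[0] + "-" + f[1].substring(0, 4)), new Text("P:" + f[2]));
    }
  }

  // Mapper for dividends.csv; assumed layout: symbol,date,dividend
  public static class DividendMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context ctx)
        throws IOException, InterruptedException {
      String[] f = value.toString().split(",");
      if (f.length < 3 || f[0].equalsIgnoreCase("symbol")) return;
      ctx.write(new Text(f[0] + "-" + f[1].substring(0, 4)), new Text("D:" + f[2]));
    }
  }

  // Reducer: for each symbol-year, compute max/min/average price and total dividend.
  public static class StatsReducer extends Reducer<Text, Text, Text, Text> {
    @Override
    protected void reduce(Text key, Iterable<Text> values, Context ctx)
        throws IOException, InterruptedException {
      double max = Double.NEGATIVE_INFINITY, min = Double.POSITIVE_INFINITY;
      double priceSum = 0, dividendSum = 0;
      long priceCount = 0;
      for (Text v : values) {
        String s = v.toString();
        double d = Double.parseDouble(s.substring(2));
        if (s.charAt(0) == 'P') {
          max = Math.max(max, d);
          min = Math.min(min, d);
          priceSum += d;
          priceCount++;
        } else {
          dividendSum += d;
        }
      }
      if (priceCount == 0) return;                                   // no price records for this key
      ctx.write(key, new Text("max=" + max + " min=" + min
          + " avg=" + (priceSum / priceCount) + " dividends=" + dividendSum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "NYSE heterogeneous data analysis");
    job.setJarByClass(NyseStockAnalysis.class);
    // Two heterogeneous inputs, each with its own mapper, joined in one reducer.
    MultipleInputs.addInputPath(job, new Path(args[0]), TextInputFormat.class, PriceMapper.class);
    MultipleInputs.addInputPath(job, new Path(args[1]), TextInputFormat.class, DividendMapper.class);
    job.setReducerClass(StatsReducer.class);
    job.setMapOutputKeyClass(Text.class);
    job.setMapOutputValueClass(Text.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    FileOutputFormat.setOutputPath(job, new Path(args[2]));          // results land in HDFS
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

Running the job with three arguments (the prices path, the dividends path, and an output directory) leaves the results in HDFS, from where they can be copied to the local file system with hdfs dfs -get or exported with Sqoop, as the abstract notes.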

Index Terms

Computer Science
Information Sciences

Keywords

Heterogeneous data processing, MapReduce, Big data, Data analysis, HDFS, Multiple inputs, NYSE data