Clustering Algorithms in MapReduce: A Review

Research Article

Published on June 2015 by Vinod S. Bawane, Sandesha M. Kale

National Conference on Recent Trends in Computer Science and Engineering

Foundation of Computer Science USA

MEDHA2015 - Number 4

June 2015

Authors: Vinod S. Bawane, Sandesha M. Kale

Vinod S. Bawane, Sandesha M. Kale. Clustering Algorithms in MapReduce: A Review. National Conference on Recent Trends in Computer Science and Engineering. MEDHA2015, 4 (June 2015), 15-18.

@article{
author = { Vinod S. Bawane, Sandesha M. Kale },
title = { Clustering Algorithms in MapReduce: A Review },
journal = { National Conference on Recent Trends in Computer Science and Engineering },
issue_date = { June 2015 },
volume = { MEDHA2015 },
number = { 4 },
month = { June },
year = { 2015 },
issn = 0975-8887,
pages = { 15-18 },
numpages = 4,
url = { /proceedings/medha2015/number4/21449-8059/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}

%0 Proceeding Article
%1 National Conference on Recent Trends in Computer Science and Engineering
%A Vinod S. Bawane
%A Sandesha M. Kale
%T Clustering Algorithms in MapReduce: A Review
%J National Conference on Recent Trends in Computer Science and Engineering
%@ 0975-8887
%V MEDHA2015
%N 4
%P 15-18
%D 2015
%I International Journal of Computer Applications

Abstract

MapReduce is a framework for processing very large amounts of unstructured data in parallel across a distributed cluster of processors or stand-alone computers. It is mostly used to analyze large datasets in cluster environments, and it has become a dominant parallel computing paradigm for big data. This paper describes well-known strategies in MapReduce and presents a comparative review of MapReduce algorithms in clustering environments.
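To make the map and reduce roles concrete, the following is a minimal single-machine sketch of one k-means iteration written in MapReduce style. It is an illustration only, not the method of this paper or of any cited work; the function names `map_point`, `reduce_cluster`, and `kmeans_iteration` are hypothetical, and the in-memory grouping stands in for the shuffle phase that a real framework would perform across nodes.

```python
from collections import defaultdict
from math import dist  # Euclidean distance, Python 3.8+


def map_point(point, centroids):
    """Map step: emit (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_cluster(points):
    """Reduce step: average all points that share one centroid index."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def kmeans_iteration(points, centroids):
    """One k-means pass: map every point, group by key, reduce each group."""
    groups = defaultdict(list)  # stands in for the shuffle phase
    for p in points:
        key, value = map_point(p, centroids)
        groups[key].append(value)
    # A centroid with no assigned points keeps its old position.
    return [
        reduce_cluster(groups[i]) if i in groups else tuple(centroids[i])
        for i in range(len(centroids))
    ]


if __name__ == "__main__":
    data = [(0, 0), (0, 2), (10, 10), (10, 12)]
    print(kmeans_iteration(data, [(0, 1), (9, 9)]))
```

In a distributed setting, each call to `map_point` could run on whichever node holds the point, and only the grouped values would travel to the reducers. A driver would repeat `kmeans_iteration` until the centroids stop changing.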

References

Bin Cao, Jianwei Yin, Qi Zhang, Yanming Ye, "A MapReduce-based architecture for rule matching in production system", 2nd IEEE International Conference on Cloud Computing Technology and Science, 2010.
Guoping Wang and CheeYong Chan, "MultiQuery Optimization in MapReduce Framework", 40th International Conference on Very Large Data Bases, September 2014.
Thomas Wirtz and Rong Ge, "Improving MapReduce Energy Efficiency for Computation Intensive Workloads", Green Computing Conference and Workshops (IGCC), 2011 International.
Prajesh P Anchalia, Anjan K Koundinya, Srinath N K, "MapReduce Design of K-Means Clustering Algorithm", 2013 IEEE.
Cheng T. Chu, Sang K. Kim, Yi A. Lin, Yuanyuan Yu, Gary R. Bradski, Andrew Y. Ng, and Kunle Olukotun, "Map-Reduce for Machine Learning on Multicore", NIPS, pages 281-288, MIT Press, 2006.
Yaobin He, Haoyu Tan, Wuman Luo, Huajian Mao, Di Ma, Shengzhong Feng, Jianping Fan, "MR-DBSCAN: An Efficient Parallel Density-based Clustering Algorithm using MapReduce", 2011 IEEE 17th International Conference on Parallel and Distributed Systems.
Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu, "A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise", Published in Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96).

Index Terms

Computer Science

Information Sciences

Keywords

Big Data, Clustering Algorithm, MapReduce