
Performance Optimization of Big Data Processing using Clustering Technique in Map Reduces Programming Model

International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Year of Publication: 2016
Ravindra Singh Raghuwanshi, Deepak Sain

Ravindra Singh Raghuwanshi and Deepak Sain. Performance Optimization of Big Data Processing using Clustering Technique in Map Reduces Programming Model. International Journal of Computer Applications 151(4):42-46, October 2016.

BibTeX

	author = {Ravindra Singh Raghuwanshi and Deepak Sain},
	title = {Performance Optimization of Big Data Processing using Clustering Technique in Map Reduces Programming Model},
	journal = {International Journal of Computer Applications},
	issue_date = {October 2016},
	volume = {151},
	number = {4},
	month = {Oct},
	year = {2016},
	issn = {0975-8887},
	pages = {42-46},
	numpages = {5},
	url = {},
	doi = {10.5120/ijca2016911748},
	publisher = {Foundation of Computer Science (FCS), NY, USA},
	address = {New York, USA}


Each new generation of technology and its requirements add to the demand placed on digital universe data. Day by day, the volume of digital universe data explodes, growing from megabytes to petabytes. This rate of growth demands a new generation of technology, such as big data processing. This paper optimizes the performance of the MapReduce programming model to enhance data processing. The modified programming model uses a clustering technique, which organizes the map data into task groups. Each task group of map data is correlated with a different data index for processing on the data nodes. The proposed model is implemented in the Hadoop framework and programmed in Java. Performance is evaluated on three standard datasets by measuring the processing time and the count value of the file.
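
The idea of grouping map data into indexed task groups before processing can be illustrated with a minimal plain-Java sketch. It is not the authors' implementation and does not use the Hadoop API: the class `TaskGroupSketch`, the methods `groupByIndex` and `countWords`, the hash-based grouping rule (a stand-in for a real clustering step), and the sample records are all assumptions made for illustration. It only shows the general shape of the pipeline: assign records to task groups by an index, process each group, merge the counts, and time the run.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class TaskGroupSketch {

    // Hypothetical grouping rule: a record's task-group index is derived from
    // the hash of its first token. A real clustering step would replace this.
    static Map<Integer, List<String>> groupByIndex(List<String> records, int groupCount) {
        Map<Integer, List<String>> groups = new TreeMap<>();
        for (String record : records) {
            String key = record.trim().split("\\s+")[0];
            int index = Math.floorMod(key.hashCode(), groupCount);
            groups.computeIfAbsent(index, k -> new ArrayList<>()).add(record);
        }
        return groups;
    }

    // Map step: tokenize every record of every task group.
    // Reduce step: merge the per-word counts into one sorted map.
    static Map<String, Integer> countWords(Map<Integer, List<String>> groups) {
        Map<String, Integer> totals = new TreeMap<>();
        for (List<String> group : groups.values()) {
            for (String record : group) {
                for (String word : record.trim().split("\\s+")) {
                    if (!word.isEmpty()) {
                        totals.merge(word, 1, Integer::sum);
                    }
                }
            }
        }
        return totals;
    }

    public static void main(String[] args) {
        // Illustrative input only; not one of the paper's datasets.
        List<String> records = List.of(
                "hadoop map reduce",
                "map task group",
                "hadoop cluster");

        long start = System.nanoTime();
        Map<Integer, List<String>> groups = groupByIndex(records, 2);
        Map<String, Integer> counts = countWords(groups);
        long elapsedMicros = (System.nanoTime() - start) / 1_000;

        System.out.println("task groups: " + groups.keySet());
        System.out.println("counts: " + counts);
        System.out.println("elapsed (us): " + elapsedMicros);
    }
}
```

In a Hadoop job, the grouping step would more plausibly live in a custom input format or partitioner, and the counting step in `Mapper` and `Reducer` classes; the sketch keeps everything in memory so that it runs without a cluster.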




Big Data, Hadoop, MapReduce, Clustering, Optimization