Research Article

Big Data Analytic of Nigeria Population Census Data using MapReduce and K-Means Algorithm

by Akande Oyebola, Osofisan Adenike
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 181 - Number 23
Year of Publication: 2018
Authors: Akande Oyebola, Osofisan Adenike
10.5120/ijca2018917999

Akande Oyebola, Osofisan Adenike. Big Data Analytic of Nigeria Population Census Data using MapReduce and K-Means Algorithm. International Journal of Computer Applications. 181, 23 (Oct 2018), 14-21. DOI=10.5120/ijca2018917999

@article{ 10.5120/ijca2018917999,
author = { Akande Oyebola and Osofisan Adenike },
title = { Big Data Analytic of Nigeria Population Census Data using MapReduce and K-Means Algorithm },
journal = { International Journal of Computer Applications },
issue_date = { Oct 2018 },
volume = { 181 },
number = { 23 },
month = { Oct },
year = { 2018 },
issn = { 0975-8887 },
pages = { 14-21 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume181/number23/30025-2018917999/ },
doi = { 10.5120/ijca2018917999 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T01:06:48.328998+05:30
%A Akande Oyebola
%A Osofisan Adenike
%T Big Data Analytic of Nigeria Population Census Data using MapReduce and K-Means Algorithm
%J International Journal of Computer Applications
%@ 0975-8887
%V 181
%N 23
%P 14-21
%D 2018
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Mining big data reveals hidden knowledge that medium-sized and sampled data sets cannot. This research analyzed Nigeria Population Census data to extract knowledge that can aid the Government in socio-economic decision-making. To this end, the k-means algorithm, an unsupervised learning technique, was implemented on MapReduce with the aim of discovering knowledge from Priority Table IX of the Nigeria Census Data of 2005. MapReduce was used to handle the computationally demanding steps of k-means efficiently: Euclidean distance computation, minimum sum of squared error (MSSE) computation, and global objective computation. The analysis revealed local government areas that need Government intervention in the form of low-cost housing, and local government areas that need urban restructuring to achieve a better distribution of population. Further work could apply the approach to other data sets, such as malaria data for children, to reveal hidden patterns and knowledge.
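
As a rough illustration of how k-means can be split into map and reduce steps, the following minimal Python sketch runs one iteration in a single process. It is not the authors' implementation and does not use Hadoop or any MapReduce framework. The function names (`map_assign`, `reduce_mean`, `kmeans_iteration`) are hypothetical, and the inputs in the usage note are toy values, not census data. The map step assigns each point to its nearest centroid by Euclidean distance, the reduce step averages the points in each cluster, and the sum of squared errors is computed as the global objective.

```python
from collections import defaultdict
from math import dist  # Euclidean distance (Python 3.8+)


def map_assign(point, centroids):
    """Map step: return (index of the nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_mean(points):
    """Reduce step: average the points assigned to one cluster."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(data, centroids):
    """One iteration: map, group by key, reduce, then compute the SSE."""
    groups = defaultdict(list)
    for point in data:
        key, value = map_assign(point, centroids)
        groups[key].append(value)  # stands in for the shuffle phase

    # Assumption: a cluster that received no points keeps its old centroid.
    new_centroids = [
        reduce_mean(groups[i]) if groups[i] else centroids[i]
        for i in range(len(centroids))
    ]

    # Global objective: sum of squared distances to the updated centroids.
    sse = sum(
        dist(p, new_centroids[i]) ** 2
        for i, pts in groups.items()
        for p in pts
    )
    return new_centroids, sse
```

Calling `kmeans_iteration([(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)], [(0.0, 0.0), (10.0, 10.0)])` returns the updated centroids and the objective value. Repeating the call with the returned centroids until the objective stops decreasing gives the usual k-means loop.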

Index Terms

Computer Science
Information Sciences

Keywords

k-means, MapReduce, Euclidean distance, MSSE, Global Objective Function