Research in Big Data and Analytics: An Overview

International Journal of Computer Applications
© 2014 by IJCA Journal
Volume 108 - Number 14
Year of Publication: 2014
Lekha R. Nair
Sujala D. Shetty

Lekha R Nair and Sujala D Shetty. Article: Research in Big Data and Analytics: An Overview. International Journal of Computer Applications 108(14):19-23, December 2014. Full text available.

	author = {Lekha R. Nair and Sujala D. Shetty},
	title = {Article: Research in Big Data and Analytics: An Overview},
	journal = {International Journal of Computer Applications},
	year = {2014},
	volume = {108},
	number = {14},
	pages = {19-23},
	month = {December},
	note = {Full text available}


Big Data Analytics has attracted considerable attention lately, as researchers from industry and academia try to extract and use as much knowledge as possible from the overwhelming amount of data being generated and received. Traditional data analytics methods struggle with the wide variety of data that arrives in huge volumes within a short period of time, which demands a paradigm shift in the storage, processing and analysis of Big Data. Owing to its significance, several agencies, including the U.S. government, have released substantial funds for research in Big Data and allied fields in recent years. This paper presents a brief overview of research progress in various areas associated with Big Data processing and analytics, and concludes with a discussion of research directions in the same area.

