Research Article

Various Data-Mining Techniques for Big Data

Published on October 2015 by Manisha R. Thakare, S. W. Mohod, and A. N. Thakare
International Conference on Advancements in Engineering and Technology (ICAET 2015)
Foundation of Computer Science USA
ICQUEST2015 - Number 8
October 2015

Manisha R. Thakare, S. W. Mohod, and A. N. Thakare. Various Data-Mining Techniques for Big Data. International Conference on Advancements in Engineering and Technology (ICAET 2015). ICQUEST2015, 8 (October 2015), 9-13.

@article{
author = { Manisha R. Thakare and S. W. Mohod and A. N. Thakare },
title = { Various Data-Mining Techniques for Big Data },
journal = { International Conference on Advancements in Engineering and Technology (ICAET 2015) },
issue_date = { October 2015 },
volume = { ICQUEST2015 },
number = { 8 },
month = { October },
year = { 2015 },
issn = {0975-8887},
pages = { 9-13 },
numpages = 5,
url = { /proceedings/icquest2015/number8/23029-2906/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 International Conference on Advancements in Engineering and Technology (ICAET 2015)
%A Manisha R. Thakare
%A S. W. Mohod
%A A. N. Thakare
%T Various Data-Mining Techniques for Big Data
%J International Conference on Advancements in Engineering and Technology (ICAET 2015)
%@ 0975-8887
%V ICQUEST2015
%N 8
%P 9-13
%D 2015
%I International Journal of Computer Applications
Abstract

Big data is a term used to describe very large collections of structured and unstructured data. The term originated with web search companies, which had to query very large, loosely structured, distributed data. It is now used to identify datasets that are difficult to manage because of their large size and complexity. Big data mining is the capability of extracting useful information from these large datasets or data streams, a task made difficult by their volume, variability, and velocity. Such data will become more diverse, larger, and faster. MapReduce provides the application programmer with the abstraction of map and reduce. It is a framework for writing applications that process large amounts of data in parallel on clusters. The main aim of this system is to improve performance by parallelizing operations such as loading the data. This paper explores an efficient implementation of the bisecting clustering algorithm with MapReduce in the context of grouping, along with a new fully distributed architecture that implements the MapReduce programming model. The architecture also uses queues to shuffle results from map to reduce. The clustering results indicate that using queues to overlap the map and shuffle stages seems to be a promising approach to improving MapReduce performance.
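
To make the map and reduce abstraction concrete, the following is a minimal single-process sketch of one cluster-splitting step written in map, shuffle, and reduce form. It is not the paper's implementation: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `split_step`), the one-dimensional points, and the starting centroids are all assumptions chosen for illustration, and a real deployment would run these phases across a cluster rather than in one Python process.

```python
from collections import defaultdict


def map_phase(points, centroids):
    """Map: emit (index of nearest centroid, point) for every input point."""
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        yield nearest, p


def shuffle_phase(pairs):
    """Shuffle: group emitted values by key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: average the points of each group to get its new centroid."""
    return {key: sum(values) / len(values) for key, values in groups.items()}


def split_step(points, centroids):
    """One assignment-and-update pass over the data (hypothetical helper)."""
    return reduce_phase(shuffle_phase(map_phase(points, centroids)))


# Illustrative data only: six 1-D points and two assumed starting centroids.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
new_centroids = split_step(points, [0.0, 5.0])
```

A bisecting scheme would repeat such a two-centroid pass until the split stabilizes, then pick another cluster to split. Because the shuffle stage here is a plain in-memory grouping, the sketch does not model the queue-based overlap of map and shuffle that the abstract describes.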

Index Terms

Computer Science
Information Sciences

Keywords

Big Data, Clustering, Classification, Clustering Algorithms, Data Mining, Map-reduce