A Novel Technique on Class Imbalance Big Data using Analogous under Sampling Approach

Mohammad Imran; Vaddi Srinivasa Rao

Research Article

International Journal of Computer Applications

Foundation of Computer Science (FCS), NY, USA

Volume 179 - Number 33

Year of Publication: 2018

Authors: Mohammad Imran, Vaddi Srinivasa Rao

10.5120/ijca2018916743

Mohammad Imran, Vaddi Srinivasa Rao. A Novel Technique on Class Imbalance Big Data using Analogous under Sampling Approach. International Journal of Computer Applications. 179, 33 (Apr 2018), 18-21. DOI=10.5120/ijca2018916743

@article{10.5120/ijca2018916743,
  author = {Mohammad Imran and Vaddi Srinivasa Rao},
  title = {A Novel Technique on Class Imbalance Big Data using Analogous under Sampling Approach},
  journal = {International Journal of Computer Applications},
  issue_date = {Apr 2018},
  volume = {179},
  number = {33},
  month = {Apr},
  year = {2018},
  issn = {0975-8887},
  pages = {18-21},
  numpages = {9},
  url = {https://ijcaonline.org/archives/volume179/number33/29210-2018916743/},
  doi = {10.5120/ijca2018916743},
  publisher = {Foundation of Computer Science (FCS), NY, USA},
  address = {New York, USA}
}

Abstract

In this paper, we propose a hybrid Random Under-Sampled Imbalance Big Data (USIBD) framework for extracting knowledge from class-imbalanced big data. We also propose a novel under-sampling method for the base learner to handle the dynamic class-imbalance problem caused by the gradual evolution of classes in big data. The proposed USIBD knowledge discovery framework is robust and less sensitive to outliers when the data are non-uniformly distributed. Empirical studies demonstrate the effectiveness of USIBD across various class-imbalanced big data scenarios in comparison with existing methods.
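
For readers unfamiliar with the general idea, the following is a minimal sketch of plain random under-sampling, the basic technique that the abstract builds on. It is not the authors' USIBD method, whose details are not given on this page. The function name `random_under_sample`, the `seed` parameter, and the toy data are assumptions made purely for illustration.

```python
import random
from collections import Counter, defaultdict


def random_under_sample(samples, labels, seed=0):
    """Illustrative random under-sampling (not the USIBD algorithm).

    Randomly discards samples from the larger classes until every
    class has as many samples as the smallest class.
    """
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for x, y in zip(samples, labels):
        by_class[y].append(x)

    # Every class is reduced to the size of the smallest class.
    target = min(len(items) for items in by_class.values())

    balanced = []
    for y, items in by_class.items():
        kept = rng.sample(items, target)  # sampling without replacement
        balanced.extend((x, y) for x in kept)

    # Shuffle so the classes are interleaved in the output.
    rng.shuffle(balanced)
    xs = [x for x, _ in balanced]
    ys = [y for _, y in balanced]
    return xs, ys


if __name__ == "__main__":
    # Hypothetical toy data: 90 majority samples and 10 minority samples.
    samples = list(range(100))
    labels = ["majority"] * 90 + ["minority"] * 10
    xs, ys = random_under_sample(samples, labels)
    print(Counter(ys))
```

On the toy data, each class ends up with 10 samples. Plain random under-sampling discards majority-class information indiscriminately, which is the usual motivation for more selective under-sampling schemes.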

Index Terms

Computer Science

Information Sciences

Keywords

Classification, Big data, Imbalanced data, Under Sampling, USIBD