Research Article

Unbalanced Data Set- State-of-the-art and its Research Challenges

by Deeksha Dhapola, Janmejay Pant
International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 178 - Number 15
Year of Publication: 2019
Authors: Deeksha Dhapola, Janmejay Pant
10.5120/ijca2019918931

Deeksha Dhapola, Janmejay Pant. Unbalanced Data Set- State-of-the-art and its Research Challenges. International Journal of Computer Applications. 178, 15 (May 2019), 62-64. DOI=10.5120/ijca2019918931

@article{ 10.5120/ijca2019918931,
author = { Deeksha Dhapola and Janmejay Pant },
title = { Unbalanced Data Set- State-of-the-art and its Research Challenges },
journal = { International Journal of Computer Applications },
issue_date = { May 2019 },
volume = { 178 },
number = { 15 },
month = { May },
year = { 2019 },
issn = { 0975-8887 },
pages = { 62-64 },
numpages = {9},
url = { https://ijcaonline.org/archives/volume178/number15/30611-2019918931/ },
doi = { 10.5120/ijca2019918931 },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Journal Article
%1 2024-02-07T00:50:33.028844+05:30
%A Deeksha Dhapola
%A Janmejay Pant
%T Unbalanced Data Set- State-of-the-art and its Research Challenges
%J International Journal of Computer Applications
%@ 0975-8887
%V 178
%N 15
%P 62-64
%D 2019
%I Foundation of Computer Science (FCS), NY, USA
Abstract

Real-world applications often encounter unbalanced datasets, which cause problems for machine learning methods. In this paper we survey the imbalanced dataset problem at the algorithmic level. Some researchers artificially rebalance the data by oversampling and undersampling, and show that modified SVMs, cost-sensitive classifiers, and class orientation methods can perform well on imbalanced datasets. Work on the imbalance problem is also shifting toward hybrid algorithms.
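
As a rough illustration of the data-level and cost-sensitive ideas named in the abstract, the following minimal Python sketch shows random oversampling, random undersampling, and inverse-frequency class weights on a toy dataset. The function names, the toy data, and the weighting formula (the common "balanced" heuristic, n_samples / (n_classes * class_count)) are assumptions chosen for illustration; they are not taken from the paper.

```python
import random
from collections import Counter


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen samples of the smaller classes
    until every class is as large as the largest one."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lbl in zip(X, y) if lbl == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out


def random_undersample(X, y, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = min(counts.values())
    X_out, y_out = [], []
    for label in counts:
        pool = [x for x, lbl in zip(X, y) if lbl == label]
        for x in rng.sample(pool, target):
            X_out.append(x)
            y_out.append(label)
    return X_out, y_out


def balanced_class_weights(y):
    """Inverse-frequency weights, a simple cost-sensitive heuristic:
    rarer classes get proportionally larger misclassification weights."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {label: n_samples / (n_classes * c) for label, c in counts.items()}


# Hypothetical toy data: 8 majority samples (class 0), 2 minority samples (class 1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_over, y_over = random_oversample(X, y)
X_under, y_under = random_undersample(X, y)
weights = balanced_class_weights(y)
```

On this toy data, oversampling yields 8 samples per class, undersampling yields 2 per class, and the minority class receives a weight four times that of the majority class. Oversampling keeps all the original data but repeats minority samples, while undersampling discards majority samples; class weights leave the data unchanged and shift the cost of errors instead.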

Index Terms

Computer Science
Information Sciences

Keywords

cost-sensitive learning, imbalanced data set, modified SVM, oversampling, undersampling