Research Article

A Review on Imbalanced Learning Methods

Published on December 2015 by Varsha S. Babar, Roshani Ade
National Conference on Advances in Computing
Foundation of Computer Science USA
NCAC2015 - Number 2
December 2015
Authors: Varsha S. Babar, Roshani Ade

Varsha S. Babar, Roshani Ade. A Review on Imbalanced Learning Methods. National Conference on Advances in Computing. NCAC2015, 2 (December 2015), 23-27.

@article{
author = { Varsha S. Babar and Roshani Ade },
title = { A Review on Imbalanced Learning Methods },
journal = { National Conference on Advances in Computing },
issue_date = { December 2015 },
volume = { NCAC2015 },
number = { 2 },
month = { December },
year = { 2015 },
issn = { 0975-8887 },
pages = { 23-27 },
numpages = 5,
url = { /proceedings/ncac2015/number2/23366-5029/ },
publisher = {Foundation of Computer Science (FCS), NY, USA},
address = {New York, USA}
}
%0 Proceeding Article
%1 National Conference on Advances in Computing
%A Varsha S. Babar
%A Roshani Ade
%T A Review on Imbalanced Learning Methods
%J National Conference on Advances in Computing
%@ 0975-8887
%V NCAC2015
%N 2
%P 23-27
%D 2015
%I International Journal of Computer Applications
Abstract

Learning from imbalanced data sets is a critical task in many data mining applications, such as fraud detection, anomaly detection, medical diagnosis, and information retrieval. The imbalanced learning problem arises when data is distributed unequally between classes: one class contains many samples while another contains very few. This imbalance makes it hard for a classifier to learn the minority class. The aim of this paper is to review the techniques used to address the imbalanced learning problem. The paper proposes a taxonomy in which each method for handling class imbalance is categorized by the technique it uses. The significant body of existing work falls into four categories: sampling-based methods, cost-based methods, kernel-based methods, and active learning-based methods. All of these methods address the imbalanced learning problem efficiently.
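
As a rough illustration of the first two categories, the sketch below shows random oversampling (a sampling-based method) and inverse-frequency class weights (a simple cost-based heuristic) on a toy data set. It is not taken from the paper: the function names random_oversample and inverse_frequency_weights, the toy data, and the weighting formula are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Sampling-based sketch: duplicate randomly chosen samples of each
    smaller class until every class matches the largest class in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, n in counts.items():
        # Pool of original samples belonging to this class.
        pool = [s for s, y in zip(samples, labels) if y == cls]
        for _ in range(target - n):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


def inverse_frequency_weights(labels):
    """Cost-based sketch: weight each class by
    n_samples / (n_classes * class_count), so rare classes cost more."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


# Hypothetical toy data: 8 majority samples (class 0), 2 minority (class 1).
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_res, y_res = random_oversample(X, y)
weights = inverse_frequency_weights(y)

print(Counter(y_res))  # class counts after oversampling
print(weights)         # per-class misclassification weights
```

Oversampling changes the training data itself, while class weights leave the data untouched and instead change how heavily a learner penalizes errors on each class. Which is preferable depends on the classifier and the data.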

Index Terms

Computer Science
Information Sciences

Keywords

Imbalanced Learning, Active Learning, Cost-sensitive Learning