International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Number 76
Year of Publication: 2025
Authors: Debashis Roy, Anandarup Roy, Utpal Roy
Debashis Roy, Anandarup Roy, Utpal Roy. A Novel Adaptive Framework for Data Complexity Analysis in Imbalanced Binary Classification. International Journal of Computer Applications. 186, 76 (Apr 2025), 42-51. DOI=10.5120/ijca2025924677
Improving classification performance on imbalanced datasets, in which the negative (majority) class greatly outnumbers the positive (minority) class, is one of the most important problems in machine learning. Researchers have addressed it with a variety of data-level and algorithm-level techniques. However, class imbalance is not the sole factor compromising classification performance: class overlap, local instance ambiguity, and intrinsic structural complexity, among other factors, also make classification harder. Very few studies have focused on data complexity, especially in combination with imbalanced datasets. This paper proposes a novel adaptive framework that measures data complexities such as instance overlap, multiresolution overlap, structural overlap, and kNN-based complexity for minority instances. This systematic, adaptive measure-selection framework assesses the complexity of the data according to how imbalanced the dataset is and suggests preprocessing steps and suitable models to make the classification task easier. The work includes a theoretical analysis, with a lemma and a corollary, as well as specific steps for putting the ideas into practice. The framework is taxonomy-aware and provides actionable insights that greatly improve the performance of imbalanced classification, which makes it novel and useful for both researchers and practitioners.
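To make the idea of a kNN-based complexity measure for minority instances concrete, the following is a minimal sketch and not the definition used in the paper. It assumes one common formulation: for each minority instance, the fraction of its k nearest neighbours that belong to the majority class. The function names `minority_knn_complexity` and `imbalance_ratio`, the Euclidean metric, and the default `k=5` are all illustrative assumptions.

```python
import numpy as np


def imbalance_ratio(y, minority_label=1):
    """Majority-to-minority count ratio (hypothetical helper, not from the paper)."""
    y = np.asarray(y)
    n_min = int((y == minority_label).sum())
    if n_min == 0:
        raise ValueError("no minority instances found")
    return (len(y) - n_min) / n_min


def minority_knn_complexity(X, y, k=5, minority_label=1):
    """Per-minority-instance share of majority points among the k nearest neighbours.

    A score near 0 suggests the instance sits in a safe minority region;
    a score near 1 suggests it is surrounded by majority instances.
    This is an assumed, generic formulation, not the paper's measure.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    if k >= len(X):
        raise ValueError("k must be smaller than the number of samples")
    minority_idx = np.flatnonzero(y == minority_label)
    # Euclidean distances from every minority point to every point.
    diffs = X[minority_idx, None, :] - X[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # Exclude each minority point from its own neighbourhood.
    dists[np.arange(len(minority_idx)), minority_idx] = np.inf
    neighbours = np.argsort(dists, axis=1)[:, :k]
    return (y[neighbours] != minority_label).mean(axis=1)


# Toy data: four majority points, one minority point inside them,
# and two minority points far away from the majority cluster.
X_demo = [[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [10, 10], [10, 11]]
y_demo = [0, 0, 0, 0, 1, 1, 1]
demo_scores = minority_knn_complexity(X_demo, y_demo, k=2)
demo_ir = imbalance_ratio(y_demo)
```

In a pipeline of the kind the abstract describes, scores like these could be summarised (for example, by their mean) and read alongside the imbalance ratio when deciding on preprocessing; how the paper actually combines them is not stated here.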