Research Output
Cluster-based oversampling with area extraction from representative points for class imbalance learning
  Class imbalance learning is challenging in many domains where training datasets contain a disproportionate number of samples in one class. Resampling methods are used to adjust the class distribution, but they often have limitations when the minority class forms small disjunct subsets. This paper introduces AROSS, an adaptive cluster-based oversampling approach that addresses these limitations. AROSS uses an optimized agglomerative clustering algorithm, guided by the Cophenetic Correlation Coefficient and the Bayesian Information Criterion, to identify representative areas of the minority class. Safe and half-safe areas are obtained using an incremental k-Nearest Neighbor strategy, and oversampling is performed with a truncated hyperspherical Gaussian distribution. Experimental evaluations on 70 binary datasets demonstrate the effectiveness of AROSS in improving class imbalance learning performance, making it a promising solution for mitigating class imbalance challenges, especially for small disjunct minority subsets.
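
  The final oversampling step named in the abstract, drawing synthetic points from a truncated hyperspherical Gaussian, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `sample_truncated_gaussian`, the rejection-sampling strategy, and the default `sigma = radius / 3` are assumptions made for illustration only, and the abstract does not say how AROSS chooses the centre or radius of each area.

```python
import math
import random


def sample_truncated_gaussian(center, radius, n_samples, sigma=None, rng=None):
    """Draw points from an isotropic Gaussian centred on `center`,
    keeping only draws that fall inside the hypersphere of `radius`.

    Hypothetical helper: truncation is done by simple rejection sampling.
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    rng = rng or random.Random()
    if sigma is None:
        # Assumption: spread chosen so that most draws land inside the sphere.
        sigma = radius / 3.0

    samples = []
    while len(samples) < n_samples:
        # One independent Gaussian draw per coordinate (isotropic covariance).
        point = [rng.gauss(c, sigma) for c in center]
        # Truncation: discard any draw outside the hypersphere.
        if math.dist(point, center) <= radius:
            samples.append(point)
    return samples


# Usage: five synthetic 2-D minority points around an assumed safe-area centre.
rng = random.Random(0)
synthetic = sample_truncated_gaussian([0.0, 0.0], radius=1.0, n_samples=5, rng=rng)
```

  Rejection sampling keeps the sketch short and makes the truncation explicit; every returned point lies within `radius` of `center`, so synthetic samples stay inside the area they were generated for.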

  • Type:

    Article

  • Date:

    16 March 2024

  • Publication Status:

    Published

  • Publisher:

    Elsevier BV

  • DOI:

    10.1016/j.iswa.2024.200357

  • Funders:

    Edinburgh Napier Funded

Citation

Farou, Z., Wang, Y., & Horváth, T. (2024). Cluster-based oversampling with area extraction from representative points for class imbalance learning. Intelligent Systems with Applications, 22, Article 200357. https://doi.org/10.1016/j.iswa.2024.200357

Keywords

Artificial Intelligence; Computer Science Applications; Computer Vision and Pattern Recognition; Signal Processing; Computer Science (miscellaneous)
