This notebook provides an overview for using and understanding the Class Imbalance check.
What is the Class Imbalance check
The ClassImbalance check produces the distribution of the target variable.
An uneven distribution across the label classes indicates an imbalanced dataset.
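To make the idea of a label distribution concrete, here is a minimal sketch in plain Python that computes each class's share of the target. It is an illustration only and does not use the check's own implementation; the helper name `class_distribution` and the toy labels are assumptions made for this example.

```python
from collections import Counter


def class_distribution(labels):
    """Return each class's share of the labels, most frequent class first.

    Hypothetical helper for illustration; not part of any library.
    """
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cls: n / total for cls, n in counts.most_common()}


# Toy target: 90 instances of class 0 and 10 instances of class 1.
labels = [0] * 90 + [1] * 10
distribution = class_distribution(labels)
print(distribution)  # {0: 0.9, 1: 0.1}
```

A distribution this far from uniform is the kind of result that suggests an imbalanced dataset.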
An imbalanced dataset poses its own challenges: learning the characteristics of the minority label, having only scarce minority instances to train on (or test on), and defining the right evaluation metric.
There are, however, many techniques to address these challenges, including artificially increasing the minority sample size (by over-sampling or using SMOTE), dropping instances from the majority class (under-sampling), using regularization, and adjusting the label class weights.
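Two of the techniques above can be sketched in a few lines of plain Python: adjusting class weights and random over-sampling. This is a simplified sketch, not a recommended pipeline. The helper names are hypothetical, the weight formula is the common "balanced" heuristic `n_samples / (n_classes * class_count)`, and the over-sampler simply duplicates minority rows at random (it is not SMOTE).

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency (hypothetical helper)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * c) for cls, c in counts.items()}


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows until every class matches the largest one."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [row for row, y in zip(rows, labels) if y == cls]
        extra = rng.choices(pool, k=target - count)  # empty for the majority class
        out_rows.extend(extra)
        out_labels.extend([cls] * len(extra))
    return out_rows, out_labels


# Toy data: 90 majority rows and 10 minority rows.
labels = [0] * 90 + [1] * 10
rows = [[float(i)] for i in range(len(labels))]

weights = balanced_class_weights(labels)
new_rows, new_labels = random_oversample(rows, labels)
print(weights)
print(Counter(new_labels))
```

Resampling of this kind is normally applied to the training split only, so that the test set still reflects the original class distribution.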
Run the check
Skew the target variable and run the check
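As a rough illustration of what skewing the target means, the sketch below overwrites a random 70% of a balanced toy label list with a single class and compares the class shares before and after. The data, the 70% fraction, and the helper `shares` are assumptions made for this example; it does not call the check itself.

```python
import random
from collections import Counter


def shares(labels):
    """Return each class's share of the labels (hypothetical helper)."""
    counts = Counter(labels)
    return {cls: n / len(labels) for cls, n in sorted(counts.items())}


rng = random.Random(0)

# Balanced toy target: 500 instances of each class.
labels = [i % 2 for i in range(1000)]

# Skew it: overwrite a random 70% of the instances with class 1.
skewed = list(labels)
n_overwrite = len(skewed) * 7 // 10
for i in rng.sample(range(len(skewed)), k=n_overwrite):
    skewed[i] = 1

before, after = shares(labels), shares(skewed)
print("before:", before)
print("after: ", after)
```

After the overwrite, class 1 covers at least 70% of the instances, so the distribution reported for the skewed target is clearly uneven.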
Define a condition
A manually defined ratio between the labels can also be set:
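The logic of a ratio-based condition can be sketched as follows. This is plain Python rather than the check's own API: the function names, the thresholds, and the pass direction (the condition passes when the least frequent class is at least `threshold` times as common as the most frequent one) are all assumptions made for illustration.

```python
from collections import Counter


def class_ratio(labels):
    """Ratio between the least and the most frequent class (hypothetical helper)."""
    counts = Counter(labels)
    return min(counts.values()) / max(counts.values())


def ratio_condition(labels, threshold=0.1):
    """Return (passed, ratio); assumed to pass when ratio >= threshold."""
    ratio = class_ratio(labels)
    return ratio >= threshold, ratio


# Toy target: 85 instances of class 0 and 15 instances of class 1.
labels = [0] * 85 + [1] * 15

default_passed, ratio = ratio_condition(labels)              # assumed default threshold
strict_passed, _ = ratio_condition(labels, threshold=0.3)    # manually chosen ratio
print(f"ratio={ratio:.3f} default={default_passed} strict={strict_passed}")
```

With these toy labels the ratio is 15/85, so the looser threshold passes while the manually chosen, stricter one does not.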