When a classifier is trained on imbalanced data, its ability to correctly predict each class generally suffers, sometimes severely. In this notebook, we will consider the case of a binary classifier trained to predict two classes that we will call the majority class and the minority class. Arbitrarily, the minority class will be coded 1, while the majority class will be coded 0. Strictly speaking, any departure from an equal split between the two classes (50% of 0 and 50% of 1) is an imbalance. In practice, problems occur when the imbalance is strong, for example 99% of 0 and only 1% of 1. It is also possible to have extremely few observations of the minority class in absolute terms. The typical example encountered in machine learning is fraud detection, where more than 99% of observations are non-fraudulent while less than one percent involve fraud.
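To make the setting concrete, here is a minimal sketch that simulates a fraud-like 99%/1% class distribution with scikit-learn's `make_classification` (the sample size, feature count, and exact proportions are illustrative assumptions, not requirements of what follows):

```python
import numpy as np
from sklearn.datasets import make_classification

# Illustrative assumption: 10,000 observations, ~99% majority (coded 0)
# and ~1% minority (coded 1), mimicking a fraud-detection dataset.
X, y = make_classification(
    n_samples=10_000,
    n_features=10,
    weights=[0.99, 0.01],  # target class proportions for classes 0 and 1
    flip_y=0,              # no label noise, so proportions stay exact
    random_state=0,
)

# Inspect the resulting class distribution.
values, counts = np.unique(y, return_counts=True)
for v, c in zip(values, counts):
    print(f"class {v}: {c} observations ({c / len(y):.1%})")
```

A model that always predicts the majority class would already reach about 99% accuracy on such data, which is why plain accuracy is a misleading metric in this setting.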