Post-Calibration Techniques: Balancing Calibration and Score Distribution Alignment

calibration
machine learning
ensemble methods
NeurIPS
workshop
presentation
paper
Arthur
Agathe
Author

Ewen Gallic

Published

November 9, 2024

Our paper titled Post-Calibration Techniques: Balancing Calibration and Score Distribution Alignment, co-authored with Agathe Fernandes Machado and Arthur Charpentier, has been accepted for publication in the proceedings of the NeurIPS 2024 Workshop on Bayesian Decision-making and Uncertainty (see also the workshop's page on the NeurIPS website).

The paper is available here:

The poster that Agathe will be presenting in Vancouver is available here:

Abstract

A binary scoring classifier can appear well-calibrated according to standard calibration metrics even when the distribution of its scores does not align with the distribution of the true events. In this paper, we investigate the impact of post-processing calibration (sometimes called “recalibration”) on the score distribution. Using simulated data, where the true probability is known, and then real-world datasets with prior knowledge of event distributions, we compare the performance of an XGBoost model before and after applying calibration techniques. The results show that while methods such as Platt scaling, Beta calibration, and isotonic regression can improve the model’s calibration, they may also increase the divergence between the score distribution and the underlying event probability distribution.
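To make the trade-off concrete, here is a minimal sketch in base R (illustrative only, not the paper's code, which lives in the repository linked below). It recalibrates simulated scores with Platt scaling and isotonic regression, then contrasts a calibration metric (Expected Calibration Error) with a distribution-alignment metric (the Kolmogorov-Smirnov distance to the true probabilities). The noisy `raw_score`, the `ece` helper, and the use of a KS distance as an alignment proxy are all assumptions made for the example.

```r
## A minimal sketch (illustrative, not the paper's code): recalibration on
## simulated data where the true probability is known, using base R only.
set.seed(123)

n <- 10000
true_p <- rbeta(n, 2, 5)   # true event probabilities
y <- rbinom(n, 1, true_p)  # observed binary outcomes

## Hypothetical noisy, miscalibrated scores standing in for a model's output.
raw_score <- plogis(qlogis(true_p) + rnorm(n, 0, 0.8))

## Platt scaling: logistic regression of the outcome on the raw score.
platt_fit <- glm(y ~ raw_score, family = binomial)
platt_score <- predict(platt_fit, type = "response")

## Isotonic regression: monotone fit of the outcome on the raw score.
ord <- order(raw_score)
iso_fit <- isoreg(raw_score[ord], y[ord])
iso_score <- numeric(n)
iso_score[ord] <- iso_fit$yf

## Expected Calibration Error over equal-width bins.
ece <- function(score, y, bins = 10) {
  bin <- cut(score, seq(0, 1, length.out = bins + 1), include.lowest = TRUE)
  gap <- tapply(y, bin, mean) - tapply(score, bin, mean)
  w <- table(bin) / length(y)
  sum(abs(gap) * w, na.rm = TRUE)
}

## Kolmogorov-Smirnov distance between scores and true probabilities,
## used here as a proxy for score-distribution alignment.
ks_dist <- function(score) {
  unname(suppressWarnings(ks.test(score, true_p)$statistic))
}

scores <- list(raw = raw_score, platt = platt_score, isotonic = iso_score)
round(sapply(scores, function(s) c(ECE = ece(s, y), KS = ks_dist(s))), 4)
```

Because the raw scores are noisy, recalibration pulls them toward conditional event rates, so a method can lower the ECE while pushing the score distribution further from that of `true_p`; the resulting table lets one observe exactly the tension the paper examines.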

A replication ebook is available on Agathe's GitHub:

The corresponding R code is also available on Agathe's GitHub:

We also prepared some slides:

Figure 1: Our poster.