Explainable post-training bias mitigation with distribution-based fairness metrics
Ryan Franks, Alexey Miroshnikov
Abstract
We develop a novel optimization framework with distribution-based fairness constraints for efficiently producing demographically blind, explainable models across a wide range of fairness levels. This is accomplished through post-processing, which avoids the need for retraining. Our framework, based on stochastic gradient descent, applies to many model types, with particular emphasis on post-processing gradient-boosted decision trees. Additionally, building on previous work, we design a broad class of interpretable global bias metrics compatible with our method. We empirically test our methodology on a variety of datasets and compare it to other methods.
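The abstract names three ingredients: SGD-based post-processing of a frozen model, a demographically blind transform, and a distribution-based bias measure. As an illustration only, not the paper's actual algorithm, the sketch below post-processes fixed model scores with an affine map trained by SGD against a cross-entropy utility plus an empirical Wasserstein-1 penalty between the two groups' score distributions; the synthetic data, the affine form of the post-processor, and the penalty weight `lam` are all hypothetical choices made for the example.

```python
import torch

torch.manual_seed(0)

# Hypothetical synthetic stand-in for a trained model's outputs: base scores s,
# binary labels y, and a protected-group indicator g (not from the paper).
n = 2000
g = torch.randint(0, 2, (n,))
s = torch.randn(n) + 0.8 * g.float()            # group-dependent score shift, i.e. bias
y = ((s + 0.3 * torch.randn(n)) > 0.4).float()

# Demographically blind post-processor: an affine map of the score alone.
# It never sees g at prediction time; g enters only through the training penalty.
theta = torch.nn.Parameter(torch.tensor([1.0, 0.0]))  # (scale, shift)
opt = torch.optim.SGD([theta], lr=0.05)

def w1(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """Empirical Wasserstein-1 distance between equal-size subsamples:
    the mean absolute gap between matched order statistics."""
    k = min(len(a), len(b))
    a = a[torch.randperm(len(a))[:k]]
    b = b[torch.randperm(len(b))[:k]]
    return (torch.sort(a).values - torch.sort(b).values).abs().mean()

lam = 2.0  # fairness knob; sweeping it yields models at different fairness levels
for step in range(500):
    t = theta[0] * s + theta[1]                                  # post-processed score
    utility = torch.nn.functional.binary_cross_entropy_with_logits(t, y)
    bias = w1(t[g == 0], t[g == 1])                              # distribution-based bias
    loss = utility + lam * bias
    opt.zero_grad()
    loss.backward()
    opt.step()
```

Sweeping `lam` and re-running the loop produces a family of post-processed models at different points on the accuracy-bias trade-off, all without retraining the base model, which is the regime the abstract describes.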