ML Bias Mitigation

Interventions

Having seen that a model trained without intervention leads to unfair outcomes, we now try to apply some of the mitigation algorithms from the literature. Where possible we use existing open source implementations, but have included our own implementations where necessary. All of our analysis can be explored on Binder and the code is all available on GitHub.

Fairness through unawareness

Fairness through unawareness excludes protected attributes when learning a classifier, with the aim of improving fairness. We consider the effect of applying fairness through unawareness on a number of observational group fairness notions.
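
As a minimal sketch (with purely illustrative data and column names), the intervention amounts to nothing more than dropping the protected attribute before fitting a standard classifier:

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # Purely illustrative data: "sex" is the protected attribute.
    rng = np.random.default_rng(0)
    n = 2000
    X = pd.DataFrame({"sex": rng.integers(0, 2, n),
                      "x1": rng.normal(size=n),
                      "x2": rng.normal(size=n)})
    y = rng.integers(0, 2, n)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Fairness through unawareness: drop the protected attribute before
    # fitting. Any proxies for "sex" left in the remaining features are
    # untouched, which is why the effect on fairness tends to be limited.
    clf = LogisticRegression().fit(X_train.drop(columns=["sex"]), y_train)
    predictions = clf.predict(X_test.drop(columns=["sex"]))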

Summary

Overall the intervention causes only a small loss in accuracy, but it also delivers only a limited improvement in fairness. It is therefore relatively harmless to apply, but insufficient by itself.

Feature modification - Feldman et al.

Feldman et al. introduce a pre-processing technique for imposing demographic parity. It is implemented in IBM's AI Fairness 360 library.
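
A minimal sketch of how this might be applied using the DisparateImpactRemover class from AI Fairness 360, on purely illustrative toy data (the class may additionally require the optional BlackBoxAuditing dependency):

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from aif360.datasets import BinaryLabelDataset
    from aif360.algorithms.preprocessing import DisparateImpactRemover

    # Purely illustrative toy data; "sex" is the protected attribute.
    rng = np.random.default_rng(0)
    n = 2000
    df = pd.DataFrame({"sex": rng.integers(0, 2, n),
                       "x1": rng.normal(size=n),
                       "x2": rng.normal(size=n),
                       "label": rng.integers(0, 2, n)})
    data = BinaryLabelDataset(df=df, label_names=["label"],
                              protected_attribute_names=["sex"])
    train, test = data.split([0.7], shuffle=True)

    # Repair the feature distributions so they look the same for each group.
    # repair_level interpolates between no change (0.0) and full repair (1.0).
    di_remover = DisparateImpactRemover(repair_level=1.0, sensitive_attribute="sex")
    train_repaired = di_remover.fit_transform(train)
    test_repaired = di_remover.fit_transform(test)

    # Train an ordinary classifier on the repaired features, excluding the
    # protected attribute column itself.
    idx = train.feature_names.index("sex")
    X_train = np.delete(train_repaired.features, idx, axis=1)
    X_test = np.delete(test_repaired.features, idx, axis=1)
    clf = LogisticRegression().fit(X_train, train_repaired.labels.ravel())
    predictions = clf.predict(X_test)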

Summary

In our experiments this intervention was not very effective. Other experiments we have seen with this method achieved better results by performing additional feature selection; in our case there was possibly too much correlation between features for the data modification to work well.

We note that the method of Feldman et al. can be adapted to work as a post-processing technique, an unpublished idea attributed to Hardt: the distribution-modification algorithm is applied to the scores of an existing model rather than to the data. This is a simple but effective strategy for imposing demographic parity.
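
What follows is not the algorithm of Feldman et al. itself, but a simplified, quantile-based sketch of this post-processing idea: each group's scores are replaced by their within-group quantiles, so that applying a single threshold to the repaired scores selects roughly equal proportions from each group.

    import numpy as np

    def repair_scores(scores, groups):
        """Map each group's scores onto their within-group quantiles, so the
        repaired score distribution is approximately the same for every group.
        Thresholding the repaired scores then yields (approximate)
        demographic parity."""
        repaired = np.empty(len(scores), dtype=float)
        for g in np.unique(groups):
            mask = groups == g
            ranks = scores[mask].argsort().argsort()      # 0 .. n_g - 1
            repaired[mask] = (ranks + 0.5) / mask.sum()   # within-group quantile
        return repaired

    # Hypothetical usage: `model_scores` are the scores of an existing model
    # and `protected` is the corresponding protected attribute array.
    # fair_scores = repair_scores(model_scores, protected)
    # fair_predictions = (fair_scores >= 0.5).astype(int)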

Decision threshold modification - Hardt et al.

Hardt et al. introduce a post-processing technique for imposing equalised odds and equal opportunity. It is implemented in IBM's AI Fairness 360 library, and Microsoft's FairLearn library.
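
A sketch of how this might be applied using the EqOddsPostprocessing class from AI Fairness 360, with purely illustrative toy data and a simple logistic regression as the base model:

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from aif360.datasets import BinaryLabelDataset
    from aif360.algorithms.postprocessing import EqOddsPostprocessing

    # Purely illustrative toy data; "sex" is the protected attribute.
    rng = np.random.default_rng(0)
    n = 3000
    df = pd.DataFrame({"sex": rng.integers(0, 2, n),
                       "x1": rng.normal(size=n),
                       "x2": rng.normal(size=n),
                       "label": rng.integers(0, 2, n)})
    data = BinaryLabelDataset(df=df, label_names=["label"],
                              protected_attribute_names=["sex"])
    train, val, test = data.split([0.5, 0.75], shuffle=True)

    # An unconstrained base model whose decisions will be post-processed.
    clf = LogisticRegression().fit(train.features, train.labels.ravel())

    def with_predictions(dataset):
        """Copy of an AIF360 dataset carrying the base model's predictions."""
        pred = dataset.copy(deepcopy=True)
        pred.labels = clf.predict(dataset.features).reshape(-1, 1)
        pred.scores = clf.predict_proba(dataset.features)[:, 1].reshape(-1, 1)
        return pred

    # Fit group-dependent (randomised) decision thresholds on a validation
    # split, then apply them to the test predictions.
    post = EqOddsPostprocessing(unprivileged_groups=[{"sex": 0}],
                                privileged_groups=[{"sex": 1}], seed=0)
    post.fit(val, with_predictions(val))
    fair_test = post.predict(with_predictions(test))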

Summary

The algorithm of Hardt et al. is extremely effective, which is not surprising as they prove in their paper that their intervention is optimal among post-processing algorithms for equalised odds.

There are perhaps two drawbacks. The first is that it achieves fairness through some randomisation of decision thresholds, which means that the post-processed classifier can fail individual fairness: two identical individuals could receive different outcomes. The second is that it fully mitigates bias, which can have negative performance implications; it is not possible to balance fairness and accuracy requirements by reducing the bias partially rather than fully.

Reject Option Classification - Kamiran et al.

Kamiran et al. introduce a post-processing technique for imposing multiple notions of fairness, including demographic parity, equalised odds and equal opportunity. It is implemented in IBM's AI Fairness 360 library.
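
Reusing the toy data, base classifier and with_predictions helper from the sketch in the section on Hardt et al. above, applying the AI Fairness 360 implementation might look something like this:

    from aif360.algorithms.postprocessing import RejectOptionClassification

    # Flip decisions that fall inside an uncertainty band around the decision
    # boundary, favouring the unprivileged group, until the chosen fairness
    # metric lies within [metric_lb, metric_ub].
    roc = RejectOptionClassification(unprivileged_groups=[{"sex": 0}],
                                     privileged_groups=[{"sex": 1}],
                                     metric_name="Statistical parity difference",
                                     metric_lb=-0.05, metric_ub=0.05)
    roc.fit(val, with_predictions(val))
    fair_test = roc.predict(with_predictions(test))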

Summary

This intervention is attractive because it can address multiple notions of fairness, and since it is a post-processing algorithm it is model agnostic and relatively straightforward to apply to existing models. Moreover the intervention itself can be easily understood and audited, as it corresponds to a deterministic modification of ambiguous decisions from the existing model.

It does however sacrifice more accuracy than some other methods, which in certain situations might be unacceptable. Furthermore, as noted above, the decision threshold modification algorithm of Hardt et al. is optimal among post-processing algorithms for equalised odds and equal opportunity, so we cannot expect better performance from this intervention. That said, the intervention of Hardt et al. introduces some stochasticity into predictions; if that is unacceptable, then Reject Option Classification might be a viable alternative.

Data reweighting - Kamiran & Calders

Kamiran and Calders introduce a pre-processing technique for imposing demographic parity based on reweighting the training data. It is implemented in IBM's AI Fairness 360 library.
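
A sketch of the reweighting step, reusing the toy training split from the Feldman et al. sketch above; the resulting instance weights can be passed to any model that accepts sample weights:

    from sklearn.linear_model import LogisticRegression
    from aif360.algorithms.preprocessing import Reweighing

    # Assign one weight per (group, label) combination so that the protected
    # attribute and the label are statistically independent in the weighted data.
    rw = Reweighing(unprivileged_groups=[{"sex": 0}],
                    privileged_groups=[{"sex": 1}])
    train_rw = rw.fit_transform(train)

    clf = LogisticRegression().fit(train_rw.features, train_rw.labels.ravel(),
                                   sample_weight=train_rw.instance_weights)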

Summary

This intervention does improve fairness without significantly impacting accuracy, but does not appear to be enough by itself to address demographic disparity if that is the goal. However, since it is a pre-processing step, it could easily be combined with other interventions to fully achieve demographic parity.

While the original paper is focussed on demographic parity, we observe that it is not clear that the intervention directly addresses it. Indeed, upweighting positive outcomes in the underprivileged class would generally result in fewer false negatives for that class, and hence could improve the equalised odds difference. Equally, improving performance on the underprivileged class may more directly address calibration. It seems that this intervention does not align perfectly with any of the notions of fairness we have at our disposal. Nevertheless, improving the representation of under-represented groups by reweighting the data is likely a reasonable thing to do.

Regularisation - Kamishima et al.

Kamishima et al. introduce an in-processing technique for imposing demographic parity based on adding a regularising term to the objective function that is being minimised. It is implemented in IBM's AI Fairness 360 library.
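
A sketch of the AI Fairness 360 implementation (PrejudiceRemover), again reusing the toy train/test splits from the Feldman et al. sketch above; the attribute names are those of the toy data:

    from aif360.algorithms.inprocessing import PrejudiceRemover

    # eta controls the strength of the fairness regulariser: larger values
    # penalise dependence between the predictions and the protected attribute
    # more heavily, trading accuracy for parity.
    pr = PrejudiceRemover(eta=25.0, sensitive_attr="sex", class_attr="label")
    pr.fit(train)
    test_pred = pr.predict(test)   # dataset with predicted labels and scores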

Summary

While the intervention improved fairness on both datasets, the resulting drop in accuracy is extreme, and probably too large for this algorithm to be practical. We could possibly improve performance by removing some features that are highly correlated with the protected attribute and by tuning the hyperparameters, but other interventions seem to offer better performance with less effort.

Information withholding - Pleiss et al.

Pleiss et al. introduce a post-processing algorithm that imposes a relaxed notion of equalised odds while preserving calibration. It is one of the few methods available that targets multiple notions of fairness simultaneously. It is implemented in IBM's AI Fairness 360 library.
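
A sketch using the CalibratedEqOddsPostprocessing class from AI Fairness 360, reusing the toy data, base classifier and with_predictions helper from the Hardt et al. sketch above:

    from aif360.algorithms.postprocessing import CalibratedEqOddsPostprocessing

    # cost_constraint chooses which error rate to equalise while preserving
    # calibration: "fnr" (roughly equal opportunity), "fpr", or "weighted".
    cpp = CalibratedEqOddsPostprocessing(unprivileged_groups=[{"sex": 0}],
                                         privileged_groups=[{"sex": 1}],
                                         cost_constraint="fnr", seed=0)
    cpp.fit(val, with_predictions(val))
    fair_test = cpp.predict(with_predictions(test))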

Summary

We had mixed success imposing equal opportunity while preserving calibration: the intervention worked well on the adult data, but had little effect on the recruiting data.

Imposing equalised odds was unsuccessful on both datasets. It may be possible to do better by controlling the weights in the relaxed definition of equalised odds, but as far as we can tell this option is not exposed by the implementation in AI Fairness 360.

Optimal Clustering - Zemel et al.

Zemel et al. introduce a pre-processing technique that learns fair representations of the data based on clustering. It is implemented in IBM's AI Fairness 360 library.
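
A sketch using the LFR (learned fair representations) class from AI Fairness 360, reusing the toy train/test splits from the Feldman et al. sketch above; the hyperparameter values are illustrative and would need tuning in practice:

    from aif360.algorithms.preprocessing import LFR

    # k prototypes (clusters); Ax, Ay and Az weight reconstruction error,
    # prediction error and group fairness respectively.
    lfr = LFR(unprivileged_groups=[{"sex": 0}], privileged_groups=[{"sex": 1}],
              k=5, Ax=0.01, Ay=1.0, Az=50.0, seed=0)
    lfr.fit(train)
    train_fair = lfr.transform(train)
    test_fair = lfr.transform(test)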

Summary

While the intervention improved fairness on both datasets, the accuracy cost is extreme, and probably too large for this algorithm to be practical. Performance could possibly be improved by removing features that are highly correlated with the protected attribute and by tuning the hyperparameters, but other interventions appear to offer better results with less effort.

Adversarial debiasing - Zhang et al.

In Mitigating Unwanted Biases with Adversarial Learning, Zhang et al. introduce a method for mitigating bias in a model using adversarial learning. Their approach can impose demographic parity, conditional demographic parity, and equalised odds with only minor modifications. There is an implementation in IBM's AI Fairness 360 library, but it can only address demographic parity, so we provide our own implementation in order to compare performance across the different definitions of fairness.
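
For the demographic parity case, a sketch of the AI Fairness 360 implementation, reusing the toy train/test splits from the Feldman et al. sketch above (the implementation builds a TF1-style graph, so a compatibility session may be needed under TensorFlow 2):

    import tensorflow.compat.v1 as tf
    from aif360.algorithms.inprocessing import AdversarialDebiasing

    tf.disable_eager_execution()
    sess = tf.Session()

    # The classifier is trained against an adversary that tries to recover the
    # protected attribute from its predictions; adversary_loss_weight controls
    # how strongly the adversary's success is penalised.
    adv = AdversarialDebiasing(unprivileged_groups=[{"sex": 0}],
                               privileged_groups=[{"sex": 1}],
                               scope_name="adversarial_debiasing",
                               sess=sess,
                               adversary_loss_weight=0.1,
                               num_epochs=50,
                               debias=True)   # debias=False gives an unconstrained baseline
    adv.fit(train)
    test_pred = adv.predict(test)
    sess.close()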

Summary

The adversarial debiasing technique is extremely effective for demographic parity and conditional demographic parity. It is less effective for equalised odds, likely for two reasons: the model does not see the labels that are passed to the discriminator, so it is hard for it to know which information it is allowed to use, and adversarial methods are inherently unstable.

With our implementation we showed that it is straightforward to implement this algorithm yourself; alternatively, the implementation in AI Fairness 360 can be used to impose demographic parity, though not the other notions of fairness.

Reductions approach via constrained optimisation - Agarwal et al.

Agarwal et al. introduce an in-processing technique which learns a fair classifier by solving a constrained minimisation problem. It is implemented in Microsoft's FairLearn library.
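
A sketch of how the reduction might be applied using the ExponentiatedGradient class from FairLearn, with purely illustrative data; an EqualizedOdds constraint could be substituted for DemographicParity:

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from fairlearn.reductions import ExponentiatedGradient, DemographicParity

    # Purely illustrative data; "sex" is the protected attribute.
    rng = np.random.default_rng(0)
    n = 2000
    X = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
    sex = rng.integers(0, 2, n)
    y = rng.integers(0, 2, n)

    # The reduction solves the constrained problem as a sequence of
    # cost-sensitive classification problems on reweighted data, calling the
    # base estimator repeatedly.
    mitigator = ExponentiatedGradient(LogisticRegression(),
                                      constraints=DemographicParity(),
                                      eps=0.01)   # allowed constraint violation
    mitigator.fit(X, y, sensitive_features=sex)
    predictions = mitigator.predict(X)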

Summary

The intervention improved fairness significantly on both datasets, although the observed trade-off between fairness and accuracy was better on the adult dataset than on the synthetic recruiting data. Compared to the other intervention techniques, the method of Agarwal et al. is highly competitive and leads to consistently good results in repeated independent experiments. This implementation therefore seems well suited to practical use.