Having seen that a model trained without intervention leads to unfair outcomes, we now try to apply some of the mitigation algorithms from the literature. Where possible we use existing open source implementations, but have included our own implementations where necessary. All of our analysis can be explored on Binder, and all of the code is available on GitHub.
Fairness through unawareness
Fairness through unawareness excludes protected attributes when learning a classifier, with the aim of improving fairness. We consider the effect of applying Fairness Through Unawareness for a number of observational group fairness notions.
This notebook contains the implementation of the common pre-processing intervention called Fairness Through Unawareness, in which the protected attribute is not included as a feature in the training data. Besides being an intervention, Fairness Through Unawareness can also be viewed as a fairness notion in its own right, corresponding to the absence of disparate treatment.
Although Fairness Through Unawareness is often applied by industry practitioners, its effect in terms of reducing unfairness is limited, since information on protected attributes can still be contained elsewhere in the data. More precisely, there may be features which are highly correlated with the protected attributes and therefore act as proxies for them.
Since the intervention is the same regardless of which fairness notion is considered, the reported accuracy of the Fairness Through Unawareness model is the same for every notion of fairness.
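As a concrete illustration, the intervention amounts to nothing more than dropping the protected column before fitting a model. Below is a minimal sketch in scikit-learn; the OpenML copy of the adult data, the column names and the model choice are assumptions for illustration and do not reproduce our exact pipeline.

```python
# A minimal sketch of Fairness Through Unawareness: the only "intervention"
# is dropping the protected attribute ("sex") before training. Dataset
# source, column names and model choice are illustrative assumptions.
from sklearn.compose import make_column_transformer
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

adult = fetch_openml("adult", version=2, as_frame=True).frame.dropna()
y = (adult["class"] == ">50K").astype(int)
X = adult.drop(columns=["class", "sex"])  # drop the label and the protected attribute

# One-hot encode the remaining categorical features, pass numeric ones through.
categorical = X.select_dtypes(include=["category", "object"]).columns
preprocess = make_column_transformer(
    (OneHotEncoder(handle_unknown="ignore"), categorical),
    remainder="passthrough",
)

unaware_model = make_pipeline(preprocess, LogisticRegression(max_iter=1000))
unaware_model.fit(X, y)

# Proxies such as marital-status remain in X, so information about sex can
# still leak into the model's predictions.
```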
Finance
We applied the intervention to the adult dataset in order to impose different notions of fairness with respect to sex. Run our analysis yourself on Binder.
Demographic parity
The intervention reduced demographic parity difference only slightly, from 0.193 to 0.174. Since we exclude a priori available information from the training process it is reasonable to expect some reduction in accuracy. However, the influence on achieved accuracy is small, reducing it from 85.3% to 84.9%.
Equalised odds
The reduction in equalised odds is more significant, reducing equalised odds difference from 0.128 to 0.074. Further details on mean scores are displayed in two graphs below which compare Fairness Through Unawareness and baseline.
Equal opportunity
The reduction in equal opportunity is identical to that for equalised odds.
Calibration
Compared to the baseline model, Fairness Through Unawareness has led to a clear loss in calibration.
Recruiting
Here we applied the intervention to the synthetic recruiting dataset and impose different notions of fairness with respect to race. Run our analysis yourself on Binder.
Demographic parity
The intervention slightly reduced the demographic parity difference, from 0.327 to 0.267. The loss in accuracy of less than one percentage point is fairly low, with 85.4% accuracy compared to 86.2% in the original model.
Equalised odds
The equalised odds difference was roughly halved, from 0.133 to 0.064, which is a significant reduction. Further details on mean scores are displayed in two graphs below which compare Fairness Through Unawareness and the baseline.
Equal opportunity
The improvement in equal opportunity is even better than for equalised odds, reducing the equal opportunity difference from 0.133 to 0.039.
Calibration
Similar to our experiments on the adult data, Fairness Through Unawareness has reduced the level of calibration compared to the baseline model.
Summary
Overall the intervention has only a small impact on accuracy, but it also has only a limited effect on fairness. It is therefore relatively harmless to apply, but insufficient by itself.
Feature modification - Feldman et al.
Feldman et al. introduce a pre-processing technique for imposing demographic parity. It is implemented in IBM's AI Fairness 360 library. The algorithm assumes a binary or categorical protected attribute. It adjusts the distributions of the features so that they are the same in each protected group. For example, in the Adult dataset hours_per_week is generally lower for women than for men. In this case the algorithm would increase the hours worked per week slightly for women in the dataset, and reduce hours worked per week for men, in such a way that the two distributions look the same.
The result of applying the algorithm is a modified dataset in which each feature has been decorrelated from the protected attribute. The idea is that a model trained on this data should not be able to learn to discriminate based on the protected attributes.
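As a rough illustration, the repair can be applied through AI Fairness 360's DisparateImpactRemover; the sketch below assumes the training data has already been wrapped in an aif360 StandardDataset called train_dataset, which is not defined here.

```python
# A sketch of the Feldman et al. repair via AI Fairness 360 (assumed
# variable: train_dataset, an aif360 StandardDataset of the adult data).
from aif360.algorithms.preprocessing import DisparateImpactRemover

# repair_level=1.0 fully aligns the per-group feature distributions;
# values between 0 and 1 interpolate between no repair and full repair.
repairer = DisparateImpactRemover(repair_level=1.0, sensitive_attribute="sex")
repaired_train = repairer.fit_transform(train_dataset)

# A downstream classifier is then trained on repaired_train.features,
# typically with the protected attribute column itself removed.
```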
Finance
We applied the intervention to the adult dataset in order to impose demographic parity with respect to sex. Run our analysis yourself on Binder. We found that training on the modified data didn't substantially change the results. There was a small drop in accuracy: whereas our baseline achieved 85.3% test set accuracy, the model trained on the fair data achieved 85.0%. However there was also hardly any change in the demographic parity difference, which went from 0.193 to 0.186. Below we show a box plot of the score distributions for the two models; they appear very similar.
This is likely because features that are highly correlated with sex, such as marital-status, remain in the data.
Recruiting
We also applied it to the synthetic recruiting data, this time trying to impose demographic parity with respect to race. Run our analysis yourself on Binder. In this case the intervention was marginally more effective, but still didn't come close to actually achieving fairness. Demographic parity difference decreased from 0.327 in the baseline model to 0.269 for the model trained on fair data. The bar chart of scores shows a modest improvement, but a significant disparity between the races remains.
Summary
In our experiments this intervention was not very effective. Other experiments we have seen with this method achieve better results by doing additional feature selection. In our case, there was possibly too much correlation between features for data modification to work well.
We note that the algorithm of Feldman et al. can be adapted to work as a post-processing technique, an unpublished idea attributed to Hardt. The distribution modification algorithm is applied to the scores of an existing model rather than to the data. This is a simple but effective strategy for imposing demographic parity.
Decision threshold modification - Hardt et al.
Hardt et al. introduce a post-processing technique for imposing equalised odds and equal opportunity. It is implemented in IBM's AI Fairness 360 library, and Microsoft's FairLearn library. Equalised odds requires that the true and false positive rates are equal for each protected group; equal opportunity requires only that the true positive rates are the same. In either case the algorithm achieves this by adjusting the group-specific decision thresholds used to determine the prediction. In some cases this alone is not enough to achieve equality, in which case two thresholds are set for each group, and the prediction is made by first randomly choosing between the two thresholds, then applying the chosen threshold.
The algorithm is very widely applicable, as it only needs access to the model outputs and the protected attribute. Moreover Hardt et al. show that their algorithm is optimal among post-processing algorithms for equalised odds. However, the possible randomness present in predictions may not be satisfactory when individual fairness is a concern, as two identical individuals could receive different predictions due to the stochasticity.
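A minimal sketch of how this post-processing step might be applied with FairLearn's ThresholdOptimizer is shown below; the fitted baseline model and the data splits (model, X_train, y_train, X_test, sex_train, sex_test) are assumed, and this is not our exact code.

```python
# A sketch of the Hardt et al. intervention using FairLearn's
# ThresholdOptimizer (assumed variables: a fitted classifier `model`,
# plus X_train, y_train, X_test and the corresponding sex columns).
from fairlearn.postprocessing import ThresholdOptimizer

postprocessor = ThresholdOptimizer(
    estimator=model,
    constraints="equalized_odds",  # "demographic_parity" is also supported
    prefit=True,                   # the baseline model is already trained
)
postprocessor.fit(X_train, y_train, sensitive_features=sex_train)

# Predictions are randomised between group-specific thresholds, so fix a
# random_state if reproducibility matters.
y_pred = postprocessor.predict(X_test, sensitive_features=sex_test, random_state=0)
```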
Finance
We applied the intervention to the adult dataset in order to impose equalised odds with respect to sex. Run our analysis yourself on Binder. The intervention is extremely effective: the test set equalised odds difference is negligible, while test set accuracy fell by about three percentage points compared to the baseline.
Recruiting
We also applied it to the synthetic recruiting data, this time trying to impose equalised odds with respect to race. Run our analysis yourself on Binder. Again it was extremely effective, with test set equalised odds difference being negligible, and test set accuracy falling about three percentage points.
Summary
The algorithm of Hardt et al. is extremely effective, which is not surprising as they prove in their paper that their intervention is optimal among post-processing algorithms for equalised odds.
There are perhaps two drawbacks. The first is that it achieves fairness through some randomisation of decision thresholds, which means that the post-processed classifier can fail individual fairness: two identical individuals could receive different outcomes. The second is that it fully mitigates bias, which can have negative performance implications. It is not possible to balance fairness and accuracy requirements by reducing the bias partially but not fully.
Reject Option Classification - Kamiran et al.
Kamiran et al. introduce a post-processing technique for imposing multiple notions of fairness, including demographic parity, equalised odds and equal opportunity. It is implemented in IBM's AI Fairness 360 library. In their paper Kamiran et al. introduce two algorithms; the one implemented by IBM, which we benchmark here, is called Reject Option Classification. The algorithm takes any points about which the model is unsure, i.e. where the probabilities it assigns to the different outcomes are not significantly different. Of those points, it assigns the favourable outcome to the disadvantaged protected class and the unfavourable outcome to the advantaged class. In effect, we make an intervention at the margin to balance outcomes overall. Individuals about whom the model is confident are unaffected by the intervention.
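A sketch of how this might look with the AI Fairness 360 implementation is shown below; the aif360 dataset objects holding the true labels and the baseline model's scores are assumed, and the metric bounds shown are illustrative.

```python
# A sketch of Reject Option Classification via AI Fairness 360 (assumed
# variables: validation/test aif360 datasets with true labels and with the
# baseline model's scores stored in `.scores`).
from aif360.algorithms.postprocessing import RejectOptionClassification

roc = RejectOptionClassification(
    unprivileged_groups=[{"race": 0}],
    privileged_groups=[{"race": 1}],
    # Target fairness metric; "Average odds difference" and
    # "Equal opportunity difference" are also available.
    metric_name="Statistical parity difference",
    metric_ub=0.05,   # acceptable upper bound on the metric
    metric_lb=-0.05,  # acceptable lower bound on the metric
)
roc.fit(val_dataset_true, val_dataset_pred)        # search for thresholds and margin
test_dataset_fair = roc.predict(test_dataset_pred)
```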
Finance
We applied the intervention to the adult dataset in order to impose demographic parity, equalised odds and equal opportunity with respect to sex. Run our analysis yourself on Binder. The interventions are largely effective: the demographic parity difference is reduced from 0.193 to 0.025 relative to the baseline, while accuracy decreased from 85.3% to 79.3%.
The intervention was less effective for equalised odds, only reducing the difference from 0.128 to 0.079 while similarly decreasing accuracy from 85.3% to 79.0%.
Imposing equal opportunity was slightly more effective, reducing the difference from 0.128 to 0.043 and reducing accuracy from 85.3% to 80.7%.
Recruiting
We also applied it to the synthetic recruiting data, this time trying to impose demographic parity, equalised odds and equal opportunity with respect to race. Run our analysis yourself on Binder. Again it was largely effective, with test set fairness improving under each intervention, but in some cases sacrificing a lot of accuracy.
For demographic parity, we saw the demographic parity difference decrease from 0.327 to 0.055, while accuracy fell from 86.2% to 79.4%.
Imposing equalised odds was more successful: the equalised odds difference fell from 0.133 to 0.032, while accuracy only fell from 86.2% to 84.1%.
Imposing equal opportunity was similarly effective, with equal opportunity difference falling from 0.133 to 0.010 and accuracy falling from 86.2% to 84.1%.
Summary
This intervention is attractive because it can address multiple notions of fairness, and since it is a post-processing algorithm it is model agnostic and relatively straightforward to apply to existing models. Moreover the intervention that is being taken can be easily understood and audited, as it corresponds to a deterministic intervention on ambiguous decisions from the existing model.
It does however sacrifice accuracy more than some other methods, which in certain situations might be unacceptable. Furthermore, as noted above, the decision threshold modification algorithm of Hardt et al. is optimal among post-processing algorithms for equalised odds and equal opportunity, which means we can't expect better performance from this intervention. That said, since the intervention of Hardt et al. introduces some stochasticity to predictions, if that is unacceptable then this might be a viable alternative.
Data reweighting - Kamiran & Calders
Kamiran and Calders introduce a pre-processing technique for imposing demographic parity based on reweighting the training data. It is implemented in IBM's AI Fairness 360 library. Classifiers can learn bias because members of the disadvantaged group with positive outcomes are poorly represented in the training data. The reweighting algorithm proposed by Kamiran and Calders identifies such points and upweights them so that they have a greater impact on model training.
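The reweighting itself is a one-liner in AI Fairness 360; the sketch below assumes an aif360 StandardDataset called train_dataset and a scikit-learn-style classifier, neither of which is our exact setup.

```python
# A sketch of the Kamiran & Calders reweighting via AI Fairness 360
# (assumed variable: train_dataset, an aif360 StandardDataset).
from aif360.algorithms.preprocessing import Reweighing

rw = Reweighing(
    unprivileged_groups=[{"sex": 0}],
    privileged_groups=[{"sex": 1}],
)
reweighted_train = rw.fit_transform(train_dataset)

# Features and labels are unchanged; only instance_weights differ. These can
# be passed to most classifiers via sample weights, e.g. in scikit-learn:
# model.fit(reweighted_train.features, reweighted_train.labels.ravel(),
#           sample_weight=reweighted_train.instance_weights)
```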
Finance
We applied the intervention to the adult dataset in order to impose demographic parity with respect to sex. Run our analysis yourself on Binder. The intervention did improve fairness, reducing demographic parity difference from the baseline value of 0.193 to 0.099, and only slightly decreased accuracy from 85.3% to 84.2%.
Recruiting
We also applied the intervention to the synthetic recruiting data in order to impose demographic parity with respect to race. Run our analysis yourself on Binder. The intervention was slightly less effective on the recruiting data, reducing demographic parity difference from the baseline value of 0.327 to 0.190, and reducing accuracy from 86.2% to 84.3%.
Summary
This intervention does improve fairness without significantly impacting accuracy, but appears to not be enough by itself to address demographic disparity if that is the goal. However, since this is a pre-processing step, it could easily be combined with other interventions to fully achieve demographic parity.
While the original paper is focussed on demographic parity, we observe that it's not clear that the intervention directly addresses it. Indeed, upweighting positive outcomes from the underprivileged class would generally result in fewer false negatives on that class, and hence could improve the equalised odds difference. Equally, improving performance on the underprivileged class may more directly address calibration. It seems that this intervention doesn't perfectly align with any of the notions of fairness we have at our disposal. Nevertheless, improving representation of under-represented groups by reweighting the data is likely a reasonable thing to do.
Regularisation - Kamishima et al.
Kamishima et al. introduce an in-processing technique for imposing demographic parity based on adding a regularising term to the objective function being minimised. It is implemented in IBM's AI Fairness 360 library. Kamishima et al. propose a regularisation term, approximating the mutual information between the predictions and the sensitive attributes, which is incorporated into the optimisation objective. Minimising the objective function thus encourages accurate prediction while not allowing too strong a relationship between the predictions and the protected attributes, thereby imposing demographic parity.
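A sketch of applying this regulariser through AI Fairness 360's PrejudiceRemover is shown below; the dataset objects, the attribute names and the value of the regularisation strength eta are assumptions for illustration.

```python
# A sketch of the Kamishima et al. regulariser via AI Fairness 360 (assumed
# variables: train_dataset and test_dataset, aif360 datasets whose protected
# attribute and label columns are named "sex" and "income").
from aif360.algorithms.inprocessing import PrejudiceRemover

# eta sets the strength of the fairness regulariser: larger values penalise
# the (approximate) mutual information between predictions and the protected
# attribute more heavily, at the cost of accuracy.
pr = PrejudiceRemover(eta=25.0, sensitive_attr="sex", class_attr="income")
pr.fit(train_dataset)
test_pred = pr.predict(test_dataset)  # returns a dataset with debiased scores
```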
Finance
We applied the intervention to the adult dataset in order to impose demographic parity with respect to sex. Run our analysis yourself on Binder. The intervention reduced the demographic parity difference from the baseline value of 0.193 to 0.059, and decreased accuracy from 85.3% to 80.6%. We can see from the box plots of the scores that the improvement in demographic parity difference appears to be driven by the scores for both protected groups being squeezed towards zero, so the classifier is closer to a constant classifier that predicts nobody is a high earner. This doesn't seem to be much of an improvement.
Recruiting
We also applied the intervention to our synthetic recruiting data to impose demographic parity with respect to race. Run our analysis yourself on Binder. We saw the demographic parity difference decrease from 0.327 to 0.067, but we also observed a major drop in accuracy from 86.2% to 77.7%.
Summary
While the intervention improved fairness on both datasets, the resulting drop in accuracy is extreme, and probably too much to make this algorithm practical. Possibly by removing some features which are highly correlated with the protected attribute, and by tuning the hyperparameters we could improve performance, but it seems that other interventions offer better performance with less effort.
Information withholding - Pleiss et al.
Pleiss et al. introduce a post-processing algorithm that imposes a relaxed notion of equalised odds while preserving calibration. It is one of the few methods available that targets multiple notions of fairness simultaneously. It is implemented in IBM's AI Fairness 360 library. The first component of the algorithm is an information withholding procedure: for a fixed proportion of randomly chosen points in each protected group, we predict the in-group class balance of the labels rather than return the output of the model. The observation of Pleiss et al. is that this procedure preserves calibration.
Since we control the proportion of data from each protected group that is classified without using the features, we can choose these proportions to optimise fairness. Pleiss et al. introduce a relaxation of equalised odds which, rather than requiring parity of both the true and false positive rates, requires that a weighted average of the true and false positive rates is equal between the two groups. Thus within each group it is possible to trade true and false positives to achieve equality. The optimal proportions can be calculated directly from the data.
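A sketch of this procedure using AI Fairness 360's CalibratedEqOddsPostprocessing is given below; the dataset objects are assumed, and the library exposes the weighting only through a coarse cost_constraint option.

```python
# A sketch of the Pleiss et al. post-processing via AI Fairness 360 (assumed
# variables: validation/test aif360 datasets with true labels and with the
# calibrated baseline scores stored in `.scores`).
from aif360.algorithms.postprocessing import CalibratedEqOddsPostprocessing

cpp = CalibratedEqOddsPostprocessing(
    unprivileged_groups=[{"sex": 0}],
    privileged_groups=[{"sex": 1}],
    # cost_constraint chooses the weighting in the relaxed criterion:
    # "fnr" roughly targets equal opportunity, "fpr" false positive rates,
    # and "weighted" a combination of the two.
    cost_constraint="fnr",
    seed=42,
)
cpp.fit(val_dataset_true, val_dataset_pred)        # learn the withholding proportions
test_dataset_fair = cpp.predict(test_dataset_pred)
```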
We had mixed success imposing equal opportunity, and were unsuccessful imposing equalised odds on both datasets.
Finance
We applied the intervention to the adult dataset in order to impose equal opportunity with respect to sex. Run our analysis yourself on Binder. The intervention reduced equal opportunity difference from 0.128 to 0.035, and only slightly decreased accuracy from 85.3% to 83.2%. In addition, calibration of the original model was preserved.
The intervention was however unsuccessful at imposing equalised odds: the algorithm returns the constant predictor for women, meaning every prediction for women is made with information withheld, according to the class balance. This results in a large increase in the equalised odds difference, from 0.128 to 0.644.
Recruiting
We also applied the intervention to our synthetic recruiting data to impose equal opportunity and equalised odds with respect to race. Run our analysis yourself on Binder. When imposing equal opportunity we saw the equal opportunity difference increase from 0.133 to 0.245, while accuracy fell significantly from 86.2% to 77.9%.
Our attempt to impose equalised odds also failed, with the difference increasing from 0.133 to 0.346, though the decrease in accuracy was less severe, falling from 86.2% to only 84.5%.
Summary
We had mixed success imposing equal opportunity while preserving calibration. The intervention worked well on the adult data, but didn't really help on the recruiting data.
Imposing equalised odds was unsuccessful on both datasets. It may be possible to do better by controlling the weights in the relaxed definition of equalised odds, but as far as we can tell this option is not exposed to us by the implementation in AI Fairness 360.
Optimal Clustering - Zemel et al.
Zemel et al. introduce a pre-processing technique that learns fair representations of the data based on clustering. It is implemented in IBM's AI Fairness 360 library. The approach described by Zemel et al. proceeds by clustering the data in such a way that the probability of being allocated to any particular cluster does not depend on the protected attributes. The cluster centres can then be used for classification, so each data point receives the classification of the corresponding cluster centre.
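Below is a sketch of this approach using AI Fairness 360's LFR implementation; the dataset objects and the hyperparameter values (number of prototypes and loss weights) are illustrative assumptions.

```python
# A sketch of learning fair representations (Zemel et al.) via AI Fairness
# 360 (assumed variables: train_dataset and test_dataset, aif360 datasets).
from aif360.algorithms.preprocessing import LFR

lfr = LFR(
    unprivileged_groups=[{"race": 0}],
    privileged_groups=[{"race": 1}],
    k=10,      # number of prototypes (cluster centres)
    Ax=0.01,   # weight on the reconstruction loss
    Ay=1.0,    # weight on the prediction loss
    Az=50.0,   # weight on the group-fairness loss
)
lfr.fit(train_dataset)
train_transformed = lfr.transform(train_dataset)
test_transformed = lfr.transform(test_dataset)
```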
Finance
We applied the intervention to the adult dataset in order to impose demographic parity with respect to sex and separately with respect to race. Run our analysis yourself on Binder here and here. The intervention on race was more successful, reducing the demographic parity difference from 0.100 to 0.004; however, the accuracy fell from 85.3% to 77.0%. The bar plot of outcomes shows that the corrected model ended up rarely predicting that an individual was a high earner.
When imposing demographic parity with respect to sex we had less success, seeing the accuracy fall from 85.3% to 79.2% and demographic parity difference increase from 0.193 to 0.370.
Recruiting
We also applied the intervention to our synthetic recruiting data to impose demographic parity with respect to race, which performed poorly. Run our analysis yourself on Binder. While demographic parity difference fell from 0.327 to 0.146, there was a catastrophic loss of accuracy, falling from 86.2% to 66.4%.
Summary
The results of this intervention were mixed: it improved demographic parity with respect to race on both datasets, but made the disparity with respect to sex on the adult data worse, and in every case the drop in accuracy was severe, probably too severe to make this algorithm practical. Possibly by removing some features which are highly correlated with the protected attribute, and by tuning the hyperparameters, we could improve performance, but it seems that other interventions offer better performance with less effort.
Adversarial debiasing - Zhang et al.
The paper Mitigating Unwanted Biases with Adversarial Learning by Zhang et al. introduces a method for mitigating bias in a model using adversarial learning. Their approach is able to impose demographic parity, conditional demographic parity, and equalised odds with only minor modifications. There is an implementation in IBM's AI Fairness 360 library, but it can only address demographic parity, so we provide our own implementation in order to compare performance across the different definitions of fairness.
The model is trained in tandem with an adversary, which we refer to as the discriminator. The discriminator monitors the model output, and tries to predict the protected attributes. If it were able to do so, this would be a sign that the model is treating the protected groups differently. Hence the model is trained to simultaneously optimise a performance objective and to fool the discriminator. If it learns to fool the discriminator, then the model outputs are unbiased.
To achieve conditional demographic parity we additionally pass legitimate risk factors to the discriminator, so that the model receives no benefit from removing information about the protected attributes from its output that is contained in those factors. Similarly to achieve equalised odds we allow the discriminator to additionally see the labels during training, so that the model is not incentivised to remove from its output any information about the sensitive data that is contained in the labels.
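For reference, the sketch below shows the AI Fairness 360 version of adversarial debiasing, which (as noted above) only targets demographic parity and is built on TensorFlow 1.x-style sessions; it is not our own implementation, and the dataset objects are assumed.

```python
# A sketch of adversarial debiasing for demographic parity via AI Fairness
# 360 (assumed variables: train_dataset and test_dataset, aif360 datasets).
import tensorflow.compat.v1 as tf
from aif360.algorithms.inprocessing import AdversarialDebiasing

tf.disable_eager_execution()
sess = tf.Session()

adv_model = AdversarialDebiasing(
    unprivileged_groups=[{"sex": 0}],
    privileged_groups=[{"sex": 1}],
    scope_name="adversarial_debiasing",
    sess=sess,
    adversary_loss_weight=0.1,  # how heavily fooling the discriminator is rewarded
    debias=True,                # set False to recover an unconstrained baseline
)
adv_model.fit(train_dataset)
test_pred = adv_model.predict(test_dataset)
sess.close()
```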
Finance
We attempt to enforce demographic parity, conditional demographic parity and equalised odds on the Adult dataset with respect to sex. Run our analysis yourself on Binder.
Demographic parity
This algorithm is very effective at imposing demographic parity. With minimal tuning we saw the demographic parity difference fall from 0.193 to 0.025, while accuracy only fell from 85.3% to 83.4%.
Conditional demographic parity
Similarly, imposing conditional demographic parity with respect to hours_per_week with this approach proved very effective. We saw the conditional demographic parity difference fall from 0.173 to 0.031 while accuracy fell from 85.3% to 83.4%.
Equal opportunity
Equal opportunity proved more challenging. We saw some improvement, with the equalised odds difference falling from 0.128 to 0.094 while accuracy only fell slightly, from 85.3% to 84.9%. If anything the model somewhat over-corrected. The core problem appears to be that the tension between performance and fairness constraints leads to some instability during training, which is typical of adversarial methods. Zhang et al. recommend a few possible strategies for addressing these problems, such as modifying the discriminator loss weight over time so as to slowly increase the penalty for unfairness. In our implementation we warm up without a fairness constraint, but then turn the fairness constraint on suddenly rather than increasing it slowly, so there are things we could have done differently to address some of the training issues. Nevertheless, it's clear that imposing equalised odds with this algorithm is more delicate than imposing the other definitions of fairness.
Recruiting
Similarly we impose all three definitions of fairness on the recruiting dataset. Run our analysis yourself on Binder.
Demographic parity
This was again very effective, reducing the demographic parity difference from 0.173 to 0.007, while accuracy fell from 86.2% to 84.5%.
Conditional demographic parity
This was similarly effective, reducing conditional demographic parity difference from 0.127 to 0.014, and accuracy from 86.2% to 84.5%.
Equal opportunity
Equal opportunity once again proved more challenging. This time the equalised odds difference actually got slightly worse, increasing from 0.088 to 0.119. The initial equalised odds difference is fairly small, though, which doesn't leave much room for improvement. Nevertheless, imposing equalised odds with this method appears more challenging.
Summary
The adversarial debiasing technique is extremely effective for demographic parity and conditional demographic parity. It is less effective for equalised odds, likely for a combination of reasons: the model does not see the labels that get passed to the discriminator, which makes it hard for the model to know what information it can use, and adversarial methods are inherently unstable.
We showed with our implementation that it is straightforward to implement this algorithm yourself, but there is also an implementation in AI Fairness 360, which can impose demographic parity but not the other notions of fairness.
Reductions approach via constrained optimisation - Agarwal et al.
Agarwal et al. introduce an in-processing technique which learns a fair classifier by solving a constrained minimisation problem. It is implemented in Microsoft's FairLearn library. The approach described by Agarwal et al. formulates fair classification as the minimisation of prediction error under a general form of linear constraint, which covers demographic parity and equalised odds as special cases. The optimisation is reformulated as a saddle point problem and solved via a sequence of cost-sensitive classification problems.
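A minimal sketch using FairLearn's ExponentiatedGradient reduction is shown below; the data variables and the choice of base estimator are assumptions for illustration.

```python
# A sketch of the Agarwal et al. reductions approach via FairLearn (assumed
# variables: X_train, y_train, X_test and the sensitive feature sex_train).
from fairlearn.reductions import DemographicParity, ExponentiatedGradient
from sklearn.linear_model import LogisticRegression

mitigator = ExponentiatedGradient(
    estimator=LogisticRegression(max_iter=1000),  # any classifier supporting sample weights
    constraints=DemographicParity(),              # EqualizedOdds() is also available
)
mitigator.fit(X_train, y_train, sensitive_features=sex_train)
y_pred = mitigator.predict(X_test)
```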
Finance
We applied the intervention to the adult dataset in order to impose different notions of fairness with respect to sex. Run our analysis yourself on Binder here.
Demographic parity
The intervention was very successful, reducing the demographic parity difference from 0.193 to 0.013, while only sacrificing a small amount of accuracy, which fell by 1.5 percentage points, from 85.3% to 83.8%.
Equalised odds
The intervention again improved fairness massively, reducing the equalised odds difference from 0.128 to 0.017, while almost no accuracy was lost: it only fell from 85.3% to 85.1%.
Equal opportunity
Similarly convincing results were achieved for equal opportunity, reducing the equal opportunity difference from 0.128 to 0.014, while the accuracy actually rose by 0.2 percentage points. Note that in some cases the restriction of the parameter space imposed by the fairness constraints can lead the optimiser to a local minimum that is preferable to the one found on the unrestricted space.
Recruiting
Here we applied the intervention to the synthetic recruiting dataset and impose different notions of fairness with respect to race. Run our analysis yourself on Binder here.
Demographic parity
The intervention significantly reduced the demographic parity difference, from 0.327 to 0.041. However, the loss in accuracy was relatively high, falling from 86.2% to 80.0%.
Equalised odds
The equalised odds difference was roughly halved, from 0.133 to 0.065, which is a significant reduction. In this case, the loss in accuracy was small, with a final accuracy of 84.3% compared to 86.2%.
Equal opportunity
The improvement in equal opportunity is similar to the improvement in equalised odds, reducing the equal opportunity difference from 0.133 to 0.084; the accuracy is similar too, with the fair model achieving 85.4%.
Summary
The intervention improved fairness significantly on both datasets, although the observed trade-off between fairness and accuracy was better for the adult dataset than for the synthetic recruiting data. Compared to the other intervention techniques, Agarwal et al.'s intervention is highly competitive and led to consistently good results in repeated independent experiments. This implementation hence seems well suited to practical use.