Bias Mitigation — Methods

How to build a Fair Model

Abhishek Dabas
10 min read · Aug 1, 2020

Understanding and solving “Bias in AI” is more important than ever before. The first step to solving bias is awareness of bias which is followed by bias mitigation.

AI has a large impact on our daily lives, which is why understanding and implementing fairness is so important!

Fairness: the quality of treating or judging people impartially, without favoritism or discrimination.

Sounds good? That's what our models need: fairness! Let's add it to our models now! Sounds simple? It's actually more nuanced than that. You first have to define the type of fairness you want in the model, or the types of bias you want to protect against.


Metrics for Evaluating Fairness and Inclusion in your Models:

  1. Disaggregated Evaluation: Evaluate the model on each subgroup separately and compare the results across subgroups. A single score on the whole test set only describes overall performance and can hide a gap between subgroups; comparing men and women, for example, exposes it.
  2. Intersectional Evaluation: Form subgroups by combining attributes such as gender and race (for example Black women or white men) and compare across all of the combined subgroups. For example, GM claims to employ both women and Black people, but that does not show fairness toward Black women, which is why intersectional evaluation matters.
  3. Causal Inference: Understanding causality can be very helpful in solving bias problems in ML. Knowing the causal relationships between the variables helps us control for bias in the data.
  4. Disparate Impact Remover: Disparate impact compares the rate of favorable outcomes in the unprivileged group with the rate in the privileged group, using the ratio below. A ratio of 1 means both groups receive the favorable outcome at the same rate. The remover then tries to take away the ability to distinguish between the groups.
P(Y=1 | X= unprivileged group) / P(Y=1 | X= privileged group)
  5. Confusion Matrix: A performance measurement technique for ML classification models. It compares the predicted value with the real value (ground truth) and sorts each prediction into one of 4 categories (True Positive, False Positive, False Negative, True Negative). It is extremely useful for measuring Recall, Precision, Specificity, Accuracy, and, most importantly, the AUC-ROC curve. Precision and Recall are important topics for NLP.
  • Recall or True Positive Rate (TPR): Out of all the actual positives, how many we predicted correctly. It should be as high as possible.
Recall = TP / (TP + FN)
  • Precision: Out of all the examples we predicted as positive, how many are actually positive.
Precision = TP / (TP + FP)
  • Accuracy: Out of all the predictions, how many were correct.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
  • Specificity or True Negative Rate (TNR): The proportion of actual negatives that are correctly identified.
Specificity = TN / (TN + FP)
  • False positives and false negatives are Type 1 and Type 2 errors, respectively.
  • Choose Recall if false negatives are unacceptable. For example, when screening for diabetes, you would rather accept some extra false positives (false alarms) than miss people who actually have diabetes (false negatives).
  • Choose Precision if you want to be more confident in your predicted positives. For example, with spam filtering, you would rather have some spam in your inbox than important emails in your spam folder, so you want to be extra sure that email X is spam before putting it there.
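To make disaggregated evaluation concrete, here is a minimal sketch in plain Python that computes the confusion-matrix metrics above per subgroup and then the disparate impact ratio. The function names and the toy data are my own assumptions for illustration, not part of any library. Passing tuples such as ("woman", "black") as group keys would turn the same code into an intersectional evaluation.

```python
from collections import defaultdict


def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def safe_div(num, den):
    """Return None instead of raising when a metric is undefined."""
    return num / den if den else None


def metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "recall": safe_div(tp, tp + fn),
        "precision": safe_div(tp, tp + fp),
        "specificity": safe_div(tn, tn + fp),
        "accuracy": safe_div(tp + tn, total),
        "selection_rate": safe_div(tp + fp, total),  # share predicted positive
    }


def disaggregated(y_true, y_pred, groups):
    """Compute the metrics separately for every subgroup key in `groups`."""
    buckets = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        buckets[g][0].append(t)
        buckets[g][1].append(p)
    return {g: metrics(ts, ps) for g, (ts, ps) in buckets.items()}


def disparate_impact(per_group, unprivileged, privileged):
    """P(prediction = 1 | unprivileged) / P(prediction = 1 | privileged)."""
    return safe_div(per_group[unprivileged]["selection_rate"],
                    per_group[privileged]["selection_rate"])


# Toy data (made up): both groups get the same accuracy, but not the same recall.
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["men"] * 4 + ["women"] * 4

per_group = disaggregated(y_true, y_pred, groups)
ratio = disparate_impact(per_group, unprivileged="women", privileged="men")
for name, values in per_group.items():
    print(name, values)
print("disparate impact:", ratio)
```

In this toy data the accuracy is 0.75 for both groups, yet recall is 1.0 for men and 0.5 for women, which is exactly the kind of gap an overall score hides.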

Some common tests for accuracy are:

  1. F-1 Score: It measures test accuracy by considering precision and recall at the same time, so it takes both false positives and false negatives into account. It is very helpful when you have an uneven class distribution.
f1 score = 2 * precision * recall / (precision + recall)

  2. AUC-ROC: It quantifies the performance of a classification model across all possible threshold levels. The Receiver Operating Characteristic (ROC) curve plots the true positive rate against the false positive rate at each threshold, and the Area Under the Curve (AUC) measures separability. AUC is sensitive only to rank order: it is the probability that a randomly chosen positive observation is ranked higher than a randomly chosen negative one.

We want the AUC to be as high as possible. When the score distributions of the two classes do not overlap at all, the AUC reaches 1, which means the model separates the classes perfectly.
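Both scores are easy to compute by hand, which helps build intuition. The sketch below is my own illustration in plain Python (the function names and toy scores are assumptions): it implements the F1 formula and computes AUC directly from its ranking definition, counting how often a positive example outscores a negative one, with ties counted as half.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def auc_by_ranking(y_true, scores):
    """AUC as P(score of a random positive > score of a random negative)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # a tie counts as half a win
    return wins / (len(pos) * len(neg))


y_true = [1, 1, 0, 0]
overlapping = [0.9, 0.4, 0.6, 0.1]   # one negative outranks one positive
separated = [0.9, 0.8, 0.2, 0.1]     # every positive outranks every negative

auc_overlap = auc_by_ranking(y_true, overlapping)
auc_separated = auc_by_ranking(y_true, separated)
# Rescaling the scores keeps their order, so the AUC does not change.
auc_rescaled = auc_by_ranking(y_true, [10 * s for s in overlapping])

print(f1_score(0.5, 1.0), auc_overlap, auc_separated, auc_rescaled)
```

The pairwise loop is quadratic, so it only suits small examples; libraries compute the same quantity from sorted scores.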


Tools developed for bias and Fairness:

Developing a legal and ethical framework for reducing unconscious bias is a good start toward improving trust in models. In business, having the right framework for company culture around data usage will set businesses apart and create a more inclusive mindset. For example, Google AI has outlined its principles for being socially beneficial, avoiding bias, improving safety, and being accountable, and it is transparent about them. Other firms such as IBM and Microsoft have taken similar steps. The Alan Turing Institute also provides comprehensive principles on the values with which data about an individual should be treated, in its guide Understanding Artificial Intelligence Ethics and Safety.

Source: Alan Turing Institute
  1. Google Facets: “Know your Data”. On the premise that better data builds better models, Facets is a tool for getting to know your data, which is critical to building a powerful machine learning system. It contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview gives a sense of the shape of each feature of your dataset, and Facets Dive lets you explore individual observations. All you have to do is load your dataset (CSV file), and Facets will summarize statistics for each feature and compare the training and test datasets.
  2. Google’s Responsible AI: Google has committed to the responsible development of AI and to sharing research, tools, datasets, and other resources with the larger community, including some of its current work and recommended practices. A leap by Google: its AI for breast cancer screening works well across countries.
  3. Google Project GlassBox: Another Google project, which aims to stop software from learning from a limited sample of data and to make it more transparent. “Be aware to be Fair” (Maya Gupta, a research scientist at Google).
  4. Perspective API: Jigsaw and Google’s Counter Abuse Technology team collaborated on a research project called Conversation-AI. It open-sources experiments, models, and research data to explore the strengths and weaknesses of ML as a tool for online discussion. Their first model identifies whether a comment could be perceived as “toxic” to a discussion. Example: the statement “He is a gay” was scored as 76.06% toxic.

5. IBM AI Fairness 360: An extensible open-source toolkit that can help you examine, report, and mitigate discrimination and bias in machine learning models throughout the AI application lifecycle. IBM has been working on bias, and it is one of their “5 in 5” topics for innovations that will help change people's lives. Within five years, IBM will have new solutions to counter a substantial increase in the number of biased AI systems and algorithms.

Given how many fairness metrics and bias mitigation techniques exist, IBM tried to build a comprehensive tool that includes all of them. It can help you try out different metrics and choose the right one for the stakeholders and the application (for specific use cases). The library also ships with a number of datasets to play around with.

The toolbox includes several bias mitigation techniques, which can be chosen according to the use case:

  • Reweighing: Weights the examples in each (group, label) combination differently to ensure fairness before classification.
  • Optimized Pre-processing: Learns a probabilistic transformation that can modify the features and the labels in the training data.
  • Adversarial Debiasing: Learns a classifier that maximizes prediction accuracy and simultaneously reduces an adversary’s ability to determine the protected attribute from the predictions. This approach leads to a fair classifier because the predictions cannot carry any group discrimination information that the adversary can exploit.
  • Reject Option Classification: Changes predictions from a classifier to make them fairer. Within a confidence band around the decision boundary, where uncertainty is highest, it gives favorable outcomes to unprivileged groups and unfavorable outcomes to privileged groups.
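To show what weighting each (group, label) combination can look like, here is a minimal sketch of one common scheme, usually attributed to Kamiran and Calders: weight = P(group) * P(label) / P(group, label). This is my own plain-Python illustration with made-up data and hypothetical function names, not the toolkit's code, and I am assuming the toolkit's technique follows the same idea.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight per (group, label) pair: expected count / observed count."""
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    pair_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * pair_counts[(g, y)])
        for (g, y) in pair_counts
    }


def weighted_positive_rate(groups, labels, weights, group):
    """Share of the group's total weight that sits on positive labels."""
    total = sum(weights[(g, y)] for g, y in zip(groups, labels) if g == group)
    positive = sum(weights[(g, y)] for g, y in zip(groups, labels)
                   if g == group and y == 1)
    return positive / total


# Toy data (made up): 4 of 6 privileged examples are positive, 1 of 4 unprivileged.
groups = ["priv"] * 6 + ["unpriv"] * 4
labels = [1, 1, 1, 1, 0, 0, 1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
rate_priv = weighted_positive_rate(groups, labels, weights, "priv")
rate_unpriv = weighted_positive_rate(groups, labels, weights, "unpriv")
print(weights)
print(rate_priv, rate_unpriv)
```

Under-represented combinations (here, unprivileged positives) get weights above 1, and after weighting both groups have the same positive rate. The weights would then be passed to a learner that accepts per-example weights.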

6. FairML: A toolbox written in Python for auditing machine learning models for fairness and bias. The main idea behind FairML is to measure the dependence of the model on its inputs: the library examines whether the model is sensitive to a particular feature. Perturbation is one of the techniques it uses: change an input and see how much the output moves. Perturbing one input at a time can mislead when inputs are correlated (multicollinearity), and the trick FairML uses to counter this is orthogonal projection. FairML orthogonally projects the inputs to measure the dependence of the predictive model on each attribute, which completely removes the linear dependence between attributes. It is an end-to-end toolbox for auditing predictive models by quantifying the relative significance of the model’s inputs.
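The idea of perturbation combined with orthogonal projection can be sketched in a few lines of plain Python. To be clear, this is not FairML's API or implementation; every name below is hypothetical and the data is made up. The toy model never reads the sensitive feature directly, but it reads a feature that is correlated with it.

```python
def mean(values):
    return sum(values) / len(values)


def project_out(column, direction):
    """Remove from `column` the part that is linearly explained by `direction`."""
    col_mean, dir_mean = mean(column), mean(direction)
    centered_col = [x - col_mean for x in column]
    centered_dir = [x - dir_mean for x in direction]
    norm_sq = sum(d * d for d in centered_dir)
    if norm_sq == 0:
        return list(column)  # a constant direction explains nothing
    coef = sum(c * d for c, d in zip(centered_col, centered_dir)) / norm_sq
    return [col_mean + c - coef * d for c, d in zip(centered_col, centered_dir)]


def dependence(predict, columns, j, use_projection):
    """Mean absolute change in predictions when feature j is neutralised."""
    n = len(columns[0])
    baseline = [predict([col[i] for col in columns]) for i in range(n)]
    perturbed = []
    for k, col in enumerate(columns):
        if k == j:
            perturbed.append([mean(col)] * n)  # replace feature j by a constant
        elif use_projection:
            perturbed.append(project_out(col, columns[j]))
        else:
            perturbed.append(list(col))
    changed = [predict([col[i] for col in perturbed]) for i in range(n)]
    return mean([abs(a - b) for a, b in zip(baseline, changed)])


def toy_model(row):
    """Hypothetical black box: uses feature 0 only, never feature 1."""
    return 2.0 * row[0]


columns = [
    [1.0, 2.0, 3.0, 4.0],  # feature 0: correlated with the sensitive feature
    [0.0, 0.0, 1.0, 1.0],  # feature 1: the sensitive attribute
]

naive = dependence(toy_model, columns, j=1, use_projection=False)
projected = dependence(toy_model, columns, j=1, use_projection=True)
print(naive, projected)
```

Perturbing the sensitive feature alone reports no dependence, because the model never reads it. Once its linear footprint is also projected out of the correlated feature, the predictions move, which reveals the indirect dependence.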

7. Microsoft FairLearn: Aimed at enabling users to enjoy the enhanced insights and efficiency of AI-powered services without discrimination and bias. Microsoft is committed to building Responsible AI, and FairLearn is one of the libraries built under Microsoft FATE: Fairness, Accountability, Transparency, and Ethics in AI.

FairLearn has a very easy-to-use dashboard. It uses common fairness metrics for assessment and bias mitigation in classification or regression models. The performance metrics used here are Precision, Accuracy, Recall, and Balanced Accuracy. The bias mitigation techniques in the library are:

  • Reduction: These algorithms take a standard black-box machine learning estimator and generate a set of retrained models using a sequence of re-weighted training datasets. For example, applicants of a certain gender might be up-weighted or down-weighted to retrain models and reduce disparities across different gender groups. Users can then pick a model that provides the best trade-off between accuracy (or another performance metric) and disparity, which generally would need to be based on business rules and cost calculations.
  • Post-processing: These algorithms take an existing classifier and the sensitive feature as input. Then, they derive a transformation of the classifier’s prediction to enforce the specified fairness constraints, typically by choosing decision thresholds per group (threshold optimization). The biggest advantage of threshold optimization is its simplicity and flexibility, as it does not need to retrain the model.
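As a rough illustration of post-processing, the sketch below picks a separate decision threshold per group so that each group ends up with the same selection rate, without touching the underlying model. This is a deliberately simplified stand-in that I wrote for this article, not FairLearn's algorithm (which is more sophisticated); the function names, scores, and target rate are all assumptions.

```python
def group_thresholds(scores, groups, target_rate):
    """Per-group threshold that selects roughly `target_rate` of each group."""
    by_group = {}
    for s, g in zip(scores, groups):
        by_group.setdefault(g, []).append(s)
    thresholds = {}
    for g, values in by_group.items():
        k = round(target_rate * len(values))  # how many to select in this group
        ranked = sorted(values, reverse=True)
        # Threshold is the k-th highest score; select nobody when k is 0.
        thresholds[g] = ranked[k - 1] if k > 0 else float("inf")
    return thresholds


def predict_with_thresholds(scores, groups, thresholds):
    return [1 if s >= thresholds[g] else 0 for s, g in zip(scores, groups)]


# Toy scores (made up) from an already trained classifier.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
groups = ["a"] * 4 + ["b"] * 4

# A single global threshold of 0.5 selects 4 of group a but only 1 of group b.
single = [1 if s >= 0.5 else 0 for s in scores]

thresholds = group_thresholds(scores, groups, target_rate=0.5)
adjusted = predict_with_thresholds(scores, groups, thresholds)
print(single)
print(thresholds, adjusted)
```

Tied scores at the threshold can select more than the target share, so a real implementation needs a tie-breaking rule. Equalising selection rates is also only one possible constraint; the right one depends on the application.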

Other Recommended Methods:

  1. Debiasing: This process removes the signals that produce problematic outputs such as stereotyping, sexism, and racism. Another approach is to add data that improves model performance, such as additional examples for data slices where the model performs badly. The aim is for the prediction to be uncorrelated with the sensitive attributes. This is not a perfect solution, because we may not always know which variables in our data are making the model biased. Example: the sensitive attributes are not always sex, race, etc.; even variables like zip codes can create bias in the model by favoring certain regions in the output.

Note: Sometimes the outcome really is highly correlated with these sensitive attributes. For example, heart failure is substantially more common in men than in women, so when predicting such a medical condition it is desirable for sex to be correlated with the predicted outcome.
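One simple way to look for hidden proxies such as zip codes is to measure how strongly each feature correlates with the sensitive attribute. The sketch below is a plain-Python illustration with made-up data; the feature names and the 0.5 cut-off are arbitrary assumptions, and Pearson correlation only catches linear relationships.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def flag_proxies(features, sensitive, cutoff=0.5):
    """Names of features whose |correlation| with `sensitive` exceeds the cutoff."""
    return [name for name, values in features.items()
            if abs(pearson(values, sensitive)) > cutoff]


# Toy data (made up): 1 marks membership in the protected group.
sensitive = [0, 0, 0, 1, 1, 1]
features = {
    "zip_region": [0, 0, 1, 1, 1, 1],       # tracks the sensitive attribute
    "owns_bicycle": [1, 0, 1, 1, 0, 1],     # unrelated to it
}

correlations = {name: pearson(values, sensitive)
                for name, values in features.items()}
proxies = flag_proxies(features, sensitive)
print(correlations)
print(proxies)
```

The same function can be applied to the model's predictions instead of a feature, to check whether the output itself is correlated with the sensitive attribute.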

2. Datasheets for Datasets: When releasing a dataset, the documentation should mention factors such as the purpose of the dataset and the biases in it. There is no such thing as a dataset that isn't biased: because a dataset is collected from the world as a subset, it is a biased sample of the world in some way. The important thing is to be aware of these biases: where they are, how much bias there is, and so on.

3. Model Cards for Model Reporting: Just like datasheets for datasets, a model card documents what the model does, how it works, and why it matters. Its purpose is to make sure that people do not use the model for purposes it was not intended for. A model card answers questions such as:

  • What is the intended use for the predictions or the output?
  • What datasets was the model trained and tested on?
  • What are the recommended use cases?
  • What are the possible biases in the data or the model?
  • Which bias and fairness mitigation methods were used in the model?
  • How does the model reach its decisions?
  • Which evaluation metrics were used?
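A model card can live next to the model as structured data, so that missing answers are easy to spot. The sketch below is one possible shape, written with Python's standard dataclasses; the field names are my own mapping of the questions above, not an official model card schema, and the example values are invented.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ModelCard:
    """Hypothetical model card; one field per question in the list above."""
    intended_use: str = ""
    training_data: str = ""
    evaluation_data: str = ""
    recommended_use_cases: list = field(default_factory=list)
    known_biases: list = field(default_factory=list)
    mitigation_methods: list = field(default_factory=list)
    decision_process: str = ""
    evaluation_metrics: dict = field(default_factory=dict)

    def missing_fields(self):
        """Names of the questions that have not been answered yet."""
        return [name for name, value in asdict(self).items() if not value]


# Invented example values, for illustration only.
card = ModelCard(
    intended_use="Rank loan applications for manual review",
    training_data="Applications from one region (hypothetical dataset)",
    evaluation_data="Held-out applications from the same region",
    recommended_use_cases=["Decision support for human reviewers"],
    known_biases=["Few examples from applicants under 25"],
    mitigation_methods=["Reweighing of (group, label) combinations"],
)

card_json = json.dumps(asdict(card), indent=2)
print(card_json)
print("Still to document:", card.missing_fields())
```

A check like `missing_fields()` could run before release, so that a model does not ship with an incomplete card.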

Transparency about model performance can help build responsible AI adoption and encourage fair models. Think of it as something like the verified Nutrition Facts label on the back of every food product!


  1. There is no single definition of fairness that can be quantified and integrated into a system. It depends heavily on the real-world scenario in which the model is deployed. See "21 fairness definitions and their politics" by Arvind Narayanan.
  2. With all these tools and libraries for fairness, we input the sensitive attribute and select the performance metric according to the application of the model.
  3. There is no one button or method to remove bias or make your model fair. Given the subjective and sociotechnical nature of fairness, there couldn’t be a single tool to address every challenge.
  4. The more we inject bias detection and mitigation mechanisms into AI the more they can help us be less biased, by alerting us that we do not behave fairly!! We might just improve ourselves!!
  5. Choose carefully between:
  • Fair Model: It is fair but doesn't work well!!
  • Unfair Model: It isn't fair but works well!!

Good reads on the topic:

  1. Reducing bias in AI-based Financial Services
  2. Google’s AI for Social Good Initiative




