Bias Mitigation — Methods


Metrics for Evaluating Fairness and Inclusion in your Models:

  1. Disaggregated Evaluation: Compute the evaluation metric separately for each subgroup in the data and compare the results across subgroups. The overall score on the test set summarizes the whole dataset and can hide poor performance on a particular subgroup; evaluating subgroups separately (for example, men and women) exposes those gaps.
  2. Intersectional Evaluation: For each combination of subgroup 1 and subgroup 2, compare across all pairs of subgroups. This combines features such as gender and race into groups like Black women or white men. For example, GM claims to employ both women and Black people, but fails to show fairness for Black women, which is why intersectional evaluation matters.
  3. Causal Inference: Understanding the causal relationships between variables helps us control for bias in the data, which makes it useful for addressing bias problems in ML.
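  4. Disparate Impact Remover: This method evaluates fairness by comparing the proportion of each group that receives the favorable outcome (Y = 1), expressed as the ratio of the unprivileged group's rate to the privileged group's rate. A ratio close to 1 means both groups receive the favorable outcome at similar rates. In this process, we then try to remove the ability to distinguish between the groups.
Disparate Impact = P(Y=1 | X = unprivileged group) / P(Y=1 | X = privileged group)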
  5. Confusion Matrix: A performance measurement technique for ML classification models. It compares the predicted values with the actual values (ground truth) and sorts each prediction into one of four categories (True Positive, False Positive, False Negative, True Negative). It is extremely useful for measuring Recall, Precision, Specificity, Accuracy, and, most importantly, the AUC-ROC curve. Precision and Recall are important topics in NLP.
  • Recall or True Positive Rate (TPR): Out of all the actual positives, how many we predicted correctly. It should be as high as possible.
Recall = TP / (TP + FN)
  • Precision: Out of all the instances we predicted as positive, how many are actually positive.
Precision = TP / (TP + FP)
  • Accuracy: Out of all the predictions made, how many were correct.
Accuracy = (TP + TN) / (TP + TN + FP + FN)
  • Specificity or True Negative Rate (TNR): The proportion of actual negatives that are correctly identified.
Specificity = TN / (TN + FP)
  • False positives and false negatives are Type I and Type II errors, respectively.
  • Choose Recall if false negatives are unacceptable. For example, when screening for diabetes, you would rather accept some extra false positives (false alarms) than let false negatives through, because we don't want to miss people who actually have diabetes.
  • Choose Precision if you want to be more confident in your predicted positives. For example, with spam filtering, you would rather have some spam in your inbox than some important emails in your spam folder, so you want to be extra sure that email X is spam before putting it there.
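
The metrics above can be combined with disaggregated evaluation in a few lines. What follows is a minimal sketch in plain Python, not a reference implementation: the toy labels, the group names, the choice of which group counts as privileged, and all function names are assumptions made for illustration. It computes Recall, Precision, Accuracy, and Specificity overall and per subgroup, plus the disparate impact ratio on the predictions.

```python
from collections import defaultdict

def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN, TN) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def safe_div(num, den):
    """Return num / den, or NaN when the denominator is zero."""
    return num / den if den else float("nan")

def metrics(y_true, y_pred):
    """Compute the four confusion-matrix metrics defined in the text."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return {
        "recall": safe_div(tp, tp + fn),
        "precision": safe_div(tp, tp + fp),
        "accuracy": safe_div(tp + tn, tp + tn + fp + fn),
        "specificity": safe_div(tn, tn + fp),
    }

def disaggregated_metrics(y_true, y_pred, groups):
    """Compute the same metrics separately for each subgroup."""
    by_group = defaultdict(lambda: ([], []))
    for t, p, g in zip(y_true, y_pred, groups):
        by_group[g][0].append(t)
        by_group[g][1].append(p)
    return {g: metrics(ts, ps) for g, (ts, ps) in by_group.items()}

def disparate_impact(y_pred, groups, unprivileged, privileged):
    """P(Y=1 | unprivileged) / P(Y=1 | privileged), computed on predictions."""
    def positive_rate(group):
        preds = [p for p, g in zip(y_pred, groups) if g == group]
        return safe_div(sum(preds), len(preds))
    return safe_div(positive_rate(unprivileged), positive_rate(privileged))

# Toy data (hypothetical): eight examples, four per subgroup.
y_true = [1, 0, 1, 1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 0, 0]
groups = ["men"] * 4 + ["women"] * 4

overall = metrics(y_true, y_pred)
per_group = disaggregated_metrics(y_true, y_pred, groups)
# Treating "women" as the unprivileged group is an assumption of this example.
di = disparate_impact(y_pred, groups, unprivileged="women", privileged="men")

print("overall:", overall)
for name, values in per_group.items():
    print(name, values)
print("disparate impact:", di)
```

In this toy data both subgroups have the same accuracy (0.75), yet recall is lower for one of them, which is the kind of gap a single overall score hides.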

Some common tests for accuracy are:

  1. F1 Score: It measures test accuracy by considering precision and recall at the same time, as their harmonic mean. The score takes both false positives and false negatives into account. It is very helpful when you have an uneven class distribution.
F1 Score = 2 * Precision * Recall / (Precision + Recall)
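
To see why the F1 score helps with uneven class distributions, here is a small sketch with made-up confusion-matrix counts (95 negatives, 5 positives). The function names and the counts are hypothetical, and returning 0.0 when precision or recall is undefined is a convention chosen for this example.

```python
def f1_from_counts(tp, fp, fn):
    """F1 = 2 * precision * recall / (precision + recall).

    Returns 0.0 when precision and recall are both zero or undefined
    (a convention assumed for this sketch).
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def accuracy_from_counts(tp, fp, fn, tn):
    """Accuracy = (TP + TN) / (TP + TN + FP + FN)."""
    return (tp + tn) / (tp + fp + fn + tn)

# Model A always predicts the negative class.
a = {"tp": 0, "fp": 0, "fn": 5, "tn": 95}
# Model B finds 4 of the 5 positives at the cost of 5 false alarms.
b = {"tp": 4, "fp": 5, "fn": 1, "tn": 90}

acc_a = accuracy_from_counts(**a)
acc_b = accuracy_from_counts(**b)
f1_a = f1_from_counts(a["tp"], a["fp"], a["fn"])
f1_b = f1_from_counts(b["tp"], b["fp"], b["fn"])

print(f"Model A: accuracy={acc_a:.2f}, F1={f1_a:.2f}")
print(f"Model B: accuracy={acc_b:.2f}, F1={f1_b:.2f}")
```

Model A scores higher on accuracy even though it never finds a positive case, while the F1 score ranks Model B higher.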

Tools developed for bias and Fairness:

Source: Alan Turing Institute
  1. Google Facets: “Know your Data”. Built on the idea that better data builds better models, Facets helps you get to know your data, which is critical to building a powerful machine learning system. It contains two visualizations for understanding and analyzing machine learning datasets: Facets Overview gives a sense of the shape of each feature, and Facets Dive lets you explore individual observations. You can check it out here. All you have to do is load your dataset (CSV file), and Facets will summarize statistics for each feature and compare the training and test datasets.
  2. Google’s Responsible AI: Google has shown a commitment to making progress in the responsible development of AI and to sharing research, tools, datasets, and other resources with the wider community. Here they have shared some of their current work and recommended practices. A leap by Google: Google's breast cancer AI works well across countries: link
  3. Google Project GlassBox: Another Google project, which aims to keep software from learning from a limited sample of data and to make it more transparent. “Be aware to be Fair” (Maya Gupta, a research scientist at Google)
  4. Perspective API: Jigsaw and Google’s Counter Abuse Technology team collaborated on a research project called Conversation-AI. It open-sources experiments, models, and research data to explore the strengths and weaknesses of ML as a tool for online discussion. Their first model identifies whether a comment could be perceived as “toxic” to a discussion. Example: the statement “He is a gay” was scored as 76.06% toxic.
  • Reweighting: Weights the examples in each (group, label) combination differently to ensure fairness before classification.
  • Optimized Pre-processing: Learns a probabilistic transformation that can modify the features and the labels in the training data.
  • Adversarial Debiasing: Learns a classifier that maximizes prediction accuracy and simultaneously reduces an adversary’s ability to determine the protected attribute from the predictions. This approach leads to a fair classifier as the predictions cannot carry any group discrimination information that the adversary can exploit.
  • Reject Option Based Classification: Changes predictions from a classifier to make them fairer. Within a confidence band around the decision boundary, where uncertainty is highest, it gives favorable outcomes to unprivileged groups and unfavorable outcomes to privileged groups.
  • Reduction: These algorithms take a standard black-box machine learning estimator and generate a set of retrained models using a sequence of re-weighted training datasets. For example, applicants of a certain gender might be up-weighted or down-weighted to retrain models and reduce disparities across different gender groups. Users can then pick a model that provides the best trade-off between accuracy (or another performance metric) and disparity, which generally would need to be based on business rules and cost calculations.
  • Post-processing: These algorithms take an existing classifier and the sensitive feature as input. Then, they derive a transformation of the classifier’s prediction to enforce the specified fairness constraints. The biggest advantage of threshold optimization is its simplicity and flexibility as it does not need to retrain the model.
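
As an illustration of the reweighting idea in the list above, the sketch below uses one common formulation: each (group, label) combination gets the weight it would need for group and label to look statistically independent, that is, expected frequency divided by observed frequency. That this is the exact formula used by any particular tool is an assumption here, and the function names and toy data are hypothetical.

```python
from collections import Counter

def reweighting_weights(groups, labels):
    """Weight for each (group, label) cell:
    (count(group) * count(label)) / (n * count(group, label)).
    """
    n = len(labels)
    group_counts = Counter(groups)
    label_counts = Counter(labels)
    cell_counts = Counter(zip(groups, labels))
    return {
        (g, y): (group_counts[g] * label_counts[y]) / (n * cell_counts[(g, y)])
        for (g, y) in cell_counts
    }

def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of favorable labels (1) inside one group."""
    total = sum(weights[(g, y)] for g, y in zip(groups, labels) if g == group)
    positive = sum(
        weights[(g, y)] for g, y in zip(groups, labels) if g == group and y == 1
    )
    return positive / total

# Toy training data (hypothetical): the privileged group has the favorable
# label 4 times out of 6, the unprivileged group only 1 time out of 4.
groups = ["privileged"] * 6 + ["unprivileged"] * 4
labels = [1, 1, 1, 1, 0, 0] + [1, 0, 0, 0]

weights = reweighting_weights(groups, labels)
for cell, w in sorted(weights.items()):
    print(cell, round(w, 4))

rate_priv = weighted_positive_rate(groups, labels, weights, "privileged")
rate_unpriv = weighted_positive_rate(groups, labels, weights, "unprivileged")
print("weighted positive rates:", rate_priv, rate_unpriv)
```

After weighting, both groups have the same weighted rate of favorable labels, so a classifier trained with these values as sample weights no longer sees a label imbalance between the groups.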

Other Recommended Methods:

  1. Debiasing: This process removes the signals that produce problematic outputs such as stereotyping, sexism, and racism. Another approach is to add desired variables to improve model performance, such as additional data for the data slices that perform badly. The aim is for the predictions to be uncorrelated with the sensitive attributes. It is not a perfect solution, because we may not always know which variables in the data are making the model biased. For example, the sensitive attributes are not always sex, race, and so on; even a variable like zip code can bias the model by favoring certain regions in the output.
  • What is the intended use for the predictions or the output?
  • What datasets was the model trained and tested on?
  • What are the recommended use cases?
  • What are the possible biases in the data or the model?
  • Which bias and fairness mitigation methods were used in the model?
  • How is the model's decision-making process explained?
  • Which evaluation metrics were used?
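
The debiasing goal stated earlier, that predictions should be uncorrelated with the sensitive attributes, can be checked directly. The sketch below computes a plain Pearson correlation between a sensitive attribute and both a candidate proxy feature and the model's predictions. The data, the variable names, and the idea of a binary "zip region" feature are made up for illustration, and a low correlation alone does not prove a model is fair.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences.

    Returns 0.0 when either sequence is constant (a convention
    assumed for this sketch, since the correlation is undefined).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)

# Hypothetical data: the sensitive attribute is never given to the model,
# but the zip-region feature overlaps with it heavily.
sensitive = [1, 1, 1, 1, 0, 0, 0, 0]
zip_region = [1, 1, 1, 0, 0, 0, 0, 1]
# A model that simply follows the zip region.
predictions = [1, 1, 1, 0, 0, 0, 0, 1]

proxy_corr = pearson(zip_region, sensitive)
prediction_corr = pearson(predictions, sensitive)

print("zip region vs sensitive attribute:", proxy_corr)
print("predictions vs sensitive attribute:", prediction_corr)
```

Even though the sensitive attribute was left out, the predictions stay correlated with it through the proxy, which is the zip code problem described in the debiasing item.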


  1. There is no single definition of fairness that can be quantified and integrated into a system. It depends heavily on the real-world scenario in which the system is implemented. See "21 fairness definitions and their politics" by Arvind Narayanan.
  2. With all these tools and libraries for fairness, we input the sensitive attribute and select the performance metric according to the application of the model.
  3. There is no one button or method to remove bias or make your model fair. Given the subjective and sociotechnical nature of fairness, no single tool can address every challenge.
  4. The more bias detection and mitigation mechanisms we build into AI, the more they can help us be less biased by alerting us when we do not behave fairly. We might just improve ourselves!
  5. Choose carefully between:
  • Fair Model: It is fair but doesn't work well.
  • Unfair Model: It isn't fair but works well.

Good reads on the topic:

  1. Reducing bias in AI-based Financial Services
  2. Google’s AI for Social Good Initiative




