Step 2: Mitigating Bias

To mitigate bias, it helps to remember that it can originate in the data, in the model, or in both.

Training data is generally only a noisy approximation of the function the ML model is trying to learn. If the data is not representative of the whole population (e.g. too few samples from unprivileged groups), the model will predict poorly for the underrepresented cases. Likewise, if the data contains historical human biases, conscious or unconscious, the model will continue to propagate them.
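One simple way to check whether a dataset is representative is to compare each group's share of the samples with its share of the population. The sketch below is illustrative only: the function name `representation_gap`, the group labels, and the population shares are all hypothetical, and real reference shares would have to come from an external source such as census data.

```python
from collections import Counter

def representation_gap(sample_groups, population_shares):
    """Return, per group, (share in the sample) - (share in the population).

    A negative value means the group is underrepresented in the sample.
    """
    counts = Counter(sample_groups)
    n = len(sample_groups)
    return {
        group: counts.get(group, 0) / n - share
        for group, share in population_shares.items()
    }

# Hypothetical toy data: group "b" makes up half of the population
# but only 20% of the training samples.
sample = ["a"] * 80 + ["b"] * 20
population = {"a": 0.5, "b": 0.5}

gaps = representation_gap(sample, population)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A large negative gap for a group is a hint, not proof, that the model may see too few examples from it to predict well.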

Bias can also arise at the level of the model. Models learn to generalize by minimizing the total prediction error. Because the minority group contributes fewer samples to that total, misclassifying it costs less than misclassifying the majority group, so a model that is not carefully designed and monitored may simply learn to misclassify the minority group. The model should also be designed with bias-related questions explicitly in mind (e.g. which features, inputs, and hyperparameters to use).
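The effect of minimizing total error can be seen by breaking the error rate down per group. The following is a minimal sketch on made-up data, not a real model: the helper `error_rates`, the group names, and the 90/10 split are assumptions chosen for illustration. It stands in for a classifier that has learned to always predict the majority label.

```python
from collections import defaultdict

def error_rates(y_true, y_pred, groups):
    """Return the overall error rate and a dict of per-group error rates."""
    errors = defaultdict(int)
    counts = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        counts[group] += 1
        errors[group] += int(truth != pred)
    overall = sum(errors.values()) / len(y_true)
    per_group = {g: errors[g] / counts[g] for g in counts}
    return overall, per_group

# Hypothetical toy data: 90 majority-group samples (label 0)
# and 10 minority-group samples (label 1).
groups = ["majority"] * 90 + ["minority"] * 10
y_true = [0] * 90 + [1] * 10

# Stand-in for a model that always predicts the majority label.
y_pred = [0] * 100

overall, per_group = error_rates(y_true, y_pred, groups)
print(overall)    # 0.1
print(per_group)  # {'majority': 0.0, 'minority': 1.0}
```

The overall error of 10% looks acceptable, yet every minority-group sample is misclassified, which is why per-group metrics are worth monitoring alongside the aggregate.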

Depending on the source of bias, we may decide to intervene to mitigate bias at one of three stages:
