Extracting Explanations from Machine Learning Models

Why does this matter?

Explainability is essential for building and maintaining trust across the whole ecosystem of stakeholders. It also matters from a regulatory perspective: for example, new AI regulations give users the right to know why a certain automated decision was taken (Right to an Explanation – EU General Data Protection Regulation (2016)).
A good explainability framework will be able to explain how an AI system works, what is driving its decisions and whether the model can be trusted or not. We identify two main routes to improve the explainability of the system:
  1. Documentation: e.g. having clear informative material for users, documenting the way the dataset and the model are built and used, so that it can be reproduced and understood, etc.
  2. Tools: e.g. techniques to extract meaningful explanations from models; debugging tools, etc.
This roadmap focuses on the second route, tools. A separate roadmap covers documentation, and further material covers how to create a good explainability framework.

This Roadmap

In this roadmap, we focus on tools that can be used to extract explanations from your model. First, we introduce the different types of techniques for algorithmic explainability. Second, we cover model-specific techniques. Finally, we present some model-agnostic methodologies for explainability in detail.