Step 2: In-processing methodologies for Explainability
The most direct route to explainability is to use models that are inherently easy to interpret. This type of approach is called in-processing because it is applied at the model design stage, and model-specific because it depends directly on the machine learning algorithm used.
Decision trees are a common example of such a model. There are various types of decision trees, but we can take as an example the classification and regression trees (CART) algorithm. The main idea is that the algorithm learns to predict an outcome by applying a sequence of decisions, each comparing a single feature with a threshold. The algorithm effectively partitions the feature space into rectangles, and inside each rectangle it fits a simple predictive model, such as a constant value.
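To make the partitioning idea concrete, here is a minimal sketch that assumes scikit-learn and NumPy are installed and uses synthetic one-dimensional data invented for illustration. It fits a shallow regression tree and checks that every leaf (one interval of the feature space) predicts a single constant, namely the mean of the training targets that fall into it.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data, invented purely for illustration: a noisy sine curve.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 10.0, size=(200, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=200)

# A depth-2 tree can create at most 4 leaves, i.e. 4 intervals of the x axis.
reg = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)

leaf_ids = reg.apply(X)        # index of the leaf each sample lands in
predictions = reg.predict(X)   # one constant value per leaf

for leaf in np.unique(leaf_ids):
    in_leaf = leaf_ids == leaf
    low, high = X[in_leaf, 0].min(), X[in_leaf, 0].max()
    print(
        f"leaf {leaf}: x in [{low:.2f}, {high:.2f}], "
        f"prediction = {predictions[in_leaf][0]:.3f}, "
        f"mean target = {y[in_leaf].mean():.3f}"
    )
```

With more than one feature the intervals become axis-aligned rectangles, but the principle is the same: one simple model per region.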
The decision-making process of the algorithm is straightforward to follow; you can refer to the figure below for clarity. You start from the root node (input data x), then move deeper into the tree by answering a series of questions (e.g. is x < 434.5?). With each answer, you move down a level, following either the left or the right branch (i.e. the answer determines which subspace your data point belongs to). The final node, called a leaf, gives the predicted outcome.
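The traversal described above can be sketched in a few lines. This example assumes scikit-learn is installed and uses the bundled iris dataset rather than the data shown in the figure; `walk_tree` is a hypothetical helper written for illustration, not part of scikit-learn. It prints the learned rules as text and then follows one sample from the root to a leaf.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()
X, y = iris.data, iris.target

# Keep the tree shallow so the printed rules stay readable.
clf = DecisionTreeClassifier(max_depth=2, random_state=0)
clf.fit(X, y)

# Text rendering of the learned rules, one line per node.
rules = export_text(clf, feature_names=list(iris.feature_names))
print(rules)


def walk_tree(model, x):
    """Follow one sample from the root to a leaf (illustrative helper)."""
    tree = model.tree_
    node = 0  # the root node always has index 0
    steps = []
    # In scikit-learn's tree arrays, a leaf has no children (index -1).
    while tree.children_left[node] != -1:
        feature = tree.feature[node]
        threshold = tree.threshold[node]
        go_left = x[feature] <= threshold
        steps.append((int(feature), float(threshold), bool(go_left)))
        node = tree.children_left[node] if go_left else tree.children_right[node]
    # The leaf predicts the class with the largest share of training samples.
    predicted = model.classes_[tree.value[node][0].argmax()]
    return steps, predicted


steps, predicted = walk_tree(clf, X[0])
for feature, threshold, go_left in steps:
    branch = "left" if go_left else "right"
    print(f"{iris.feature_names[feature]} <= {threshold:.2f}? go {branch}")
print("predicted class:", iris.target_names[predicted])
```

Note that scikit-learn tests `feature <= threshold` at each node, so the left branch always corresponds to a "yes" answer.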
The model is pretty intuitive, and can provide helpful explanations. Here are some interesting properties:
Feature importance can be calculated by going through all the splits that use a given feature and measuring how much each split reduces the impurity (variance for regression, Gini impurity for classification) with respect to the parent node.
Individual predictions can be explained by tracking a decision through the whole tree. This allows us to observe the whole decision-making process and the contribution of each decision node.
The tree structure provides a clear visualization of the decision-making process.
There is also a limitation: the deeper the tree, the harder it is to obtain explanations that are human-understandable.
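The properties listed above can be inspected directly on a fitted model. The following is a minimal sketch, again assuming scikit-learn and the iris dataset as a stand-in for your own data. It prints the impurity-based feature importances, lists the nodes visited when predicting a single sample, and compares the size of a depth-limited tree with an unrestricted one.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

iris = load_iris()
X, y = iris.data, iris.target

shallow = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
deep = DecisionTreeClassifier(random_state=0).fit(X, y)  # no depth limit

# Impurity-based importances: normalized so that they sum to 1.
for name, importance in zip(iris.feature_names, shallow.feature_importances_):
    print(f"{name}: {importance:.3f}")

# Nodes visited by the first sample, from the root (index 0) to its leaf.
node_indicator = shallow.decision_path(X[:1])
visited_nodes = node_indicator.indices[
    node_indicator.indptr[0]:node_indicator.indptr[1]
]
print("visited nodes:", visited_nodes.tolist())

# An unrestricted tree grows more leaves, which makes it harder to read.
for label, model in [("shallow", shallow), ("deep", deep)]:
    print(f"{label}: depth={model.get_depth()}, leaves={model.get_n_leaves()}")
```

Limiting `max_depth` is one simple way to trade some accuracy for a tree that remains readable; the right value depends on the data and on who needs to read the explanation.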
Decision trees for both classification and regression are available in scikit-learn (sklearn). You can find an example in our notebook, which can be viewed here or downloaded below: