Permutation feature importance#

The following content is based on tutorials provided by the scikit-learn developers.

Permutation feature importance overcomes two limitations of impurity-based feature importance (scikit-learn developers):

  • it is not biased toward high-cardinality features

  • it can be computed on a left-out test set.

The permutation feature importance is defined as the decrease in a model score when a single feature's values are randomly shuffled. Shuffling breaks the relationship between the feature and the target, so the resulting drop in the model score indicates how much the model depends on that feature. The technique is model-agnostic and can be repeated with different permutations of the feature to estimate the variability of the importance.
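As a minimal sketch of this definition, one can shuffle a single column of the feature matrix, re-score the fitted model, and compare against the unpermuted baseline score. The dataset and model below are illustrative choices, not prescribed by the text:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Illustrative setup: any fitted estimator on tabular data would do.
X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

baseline = model.score(X_test, y_test)
rng = np.random.default_rng(0)

# Importance of feature j: shuffle its column and measure
# the drop in the model score relative to the baseline.
for j in range(X_test.shape[1]):
    X_permuted = X_test.copy()
    X_permuted[:, j] = rng.permutation(X_permuted[:, j])
    importance = baseline - model.score(X_permuted, y_test)
    print(f"feature {j}: importance = {importance:.4f}")
```

In practice the shuffle is repeated several times per feature and the score drops are averaged, which is what scikit-learn's built-in helper does (shown below).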

Note

Permutation feature importance is a model inspection technique that can be used for any fitted estimator when the data is tabular (see the scikit-learn documentation for more details).

The permutation importance can be calculated on the training set to show how much the model relies on each feature during training. It can also be calculated on held-out test data to show how much each feature contributes to predictions that generalize to unseen data.
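scikit-learn exposes this procedure as `sklearn.inspection.permutation_importance`, which accepts any fitted estimator. The sketch below (again with an illustrative dataset and model) computes the importances on both the training and the test set so the two can be compared:

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)

# n_repeats controls how many random shuffles are averaged per feature.
for name, X_part, y_part in [("train", X_train, y_train),
                             ("test", X_test, y_test)]:
    result = permutation_importance(
        model, X_part, y_part, n_repeats=10, random_state=0
    )
    print(name, np.round(result.importances_mean, 4))
```

A feature that looks important on the training set but not on the test set is one the model relies on without that reliance generalizing, which is a common sign of overfitting.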