# Correlation Can Ruin Interpretability

### How to tell whether an ML model interpretation is affected by correlation, and what to do about it.

A baby weighing as much as an adult. A freezing summer day. A credit card debt of $3,102 but 0 credit cards.

You wouldn’t let these data points anywhere near your prediction model, would you? But if you interpret your model with **model-agnostic machine learning methods**, your model might be confronted with such data points.

The origin of these unrealistic data points is **correlated features**. Correlation can ruin interpretation both on a technical and a philosophical level.

**Correlated is the rule, not the exception.**

Predicting bike rentals? Season and temperature are correlated.

Credit scoring? Income correlates with age, job, ...

Diagnosing patients? Blood values are correlated, like multiple markers of inflammation, ...

In this issue, we will explore:

- why correlation can be problematic
- how to spot if correlation affects model interpretation
- which solutions you can use

## How to Spot Correlation

*Note: correlation here includes more general dependencies, not only linear correlation.*

There are multiple ways to identify if two or more features are correlated:

- Calculate dependence measures, such as:
  - Pearson correlation coefficient
  - Spearman's rank correlation coefficient
  - Hilbert-Schmidt independence criterion
  - …
- Visualize the data. Scatterplots can reveal a lot.
- Draw a directed acyclic graph (DAG), a non-data-driven way to think about dependencies.
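As a quick sketch, the first two measures can be computed with `scipy.stats`. The data here is simulated for illustration: a hypothetical "season signal" that tracks temperature plus noise, so the two features are correlated.

```python
# Quantify linear and monotonic dependence between two features.
# Both features are simulated; the names are illustrative only.
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(0)
temperature = rng.normal(20, 5, size=500)
season_signal = temperature + rng.normal(0, 2, size=500)  # correlated with temperature

r, _ = pearsonr(temperature, season_signal)     # linear correlation
rho, _ = spearmanr(temperature, season_signal)  # rank (monotonic) correlation
print(f"Pearson r = {r:.2f}, Spearman rho = {rho:.2f}")
```

Spearman's coefficient also picks up monotonic but nonlinear dependence that the Pearson coefficient can understate.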

## Correlation Causes Extrapolation

Model-agnostic interpretation methods create new data points that are used to compute the interpretation. The following interpretation methods "ignore" that features are correlated with each other:

- Partial dependence plot
- Shapley values
- LIME
- Permutation feature importance
- …

What happens when we apply the interpretation methods to correlated data?

The methods produce data points that extrapolate into unlikely regions. See the following scatterplot, where permuting feature x1 created new data points in the bottom-right region, which contains none of the original data points. Permutation feature importance, for example, would proceed exactly like this for feature x1.

These new data points lie outside of the distribution. The interpretation method uses these unrealistic data points to compute the interpretation.
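To make the extrapolation concrete, here is a small simulation with two hypothetical features, where x2 is roughly twice x1. Shuffling x1 on its own, as permutation feature importance does, produces feature combinations far off the true relationship:

```python
# Sketch of how permuting one correlated feature creates out-of-distribution
# points. The features and their relationship are invented for illustration.
import numpy as np

rng = np.random.default_rng(42)
x1 = rng.uniform(0, 10, size=1000)
x2 = 2 * x1 + rng.normal(0, 0.5, size=1000)  # strongly correlated with x1

x1_permuted = rng.permutation(x1)            # what permutation importance does

# Distance from the true relationship x2 = 2 * x1:
# small for real data, large after permutation
real_gap = np.abs(x2 - 2 * x1)
permuted_gap = np.abs(x2 - 2 * x1_permuted)
print(f"max gap before: {real_gap.max():.1f}, after: {permuted_gap.max():.1f}")
```

The permuted points with a large gap are exactly the unrealistic combinations the model never saw during training.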

**Extrapolation can ruin your interpretation:**

- The model wasn't trained on data from extrapolated regions: the quality of predictions there might be bad or even unknowable.
- Data might be undefined in this region, like a 100kg baby.
- Unrealistic data points become part of the interpretation.

*A short interlude: Interpretation with extrapolation can be insightful and even the very goal of the interpretation. If you want to debug, stress test, or audit a machine learning model, it's desirable to see how the model behaves when individual feature inputs are varied. This plays into another view on interpretation: true-to-data versus true-to-model. But that will be part of another post.*

We've established that correlation is common. Correlation causes extrapolation, and extrapolation can ruin the interpretation. What can we do?

## Options to Interpret Correlated Features

We have not one, but many means to interpret the model, even if features are correlated.

These are some general approaches to address correlated features:

- Put correlated features into a group and interpret the groups instead of the individual features, for example with grouped importance or grouped Shapley values.
- Decorrelate features with techniques such as PCA. This might involve retraining the model with the "new" features.
- Remove correlated features before modeling. That's what many feature selection algorithms do anyway.
- Use causal modeling to identify confounders, mediation effects, and so on.
- Use conditional interpretation methods.
- …
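As an illustration of the first option, here is a minimal sketch of grouped permutation importance: all features in a correlated group are shuffled jointly (with the same row order), so the within-group structure stays intact while the group's link to the target is broken. The model, data, and grouping are invented for this example:

```python
# Grouped permutation importance sketch: shuffle a whole group of correlated
# features with one shared row permutation and measure the score drop.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 500
base = rng.normal(size=n)
X = np.column_stack([
    base + rng.normal(0, 0.1, n),  # feature 0 \
    base + rng.normal(0, 0.1, n),  # feature 1  > correlated group
    rng.normal(size=n),            # feature 2: independent
])
y = X[:, 0] + X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.1, n)
model = LinearRegression().fit(X, y)

def grouped_importance(model, X, y, group, n_repeats=10, seed=1):
    rng = np.random.default_rng(seed)
    baseline = r2_score(y, model.predict(X))
    drops = []
    for _ in range(n_repeats):
        X_perm = X.copy()
        idx = rng.permutation(len(X))
        X_perm[:, group] = X[idx][:, group]  # joint shuffle keeps within-group structure
        drops.append(baseline - r2_score(y, model.predict(X_perm)))
    return np.mean(drops)

print("group {0,1} importance:", grouped_importance(model, X, y, [0, 1]))
print("feature 2 importance:  ", grouped_importance(model, X, y, [2]))
```

We get one importance value for the whole group, which already hints at the drawback discussed next: it tells us nothing about the features within the group.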

Unfortunately, none of these approaches perfectly solves the correlation issue:

- Grouped interpretation is coarse: interpretation now happens at the group level, not the feature level. For example, we get one importance value for the entire group, but can't say how important each feature within the group was. Grouping also doesn't work well for feature dependence plots.
- Decorrelation destroys feature interpretability. For example, PCA components mix many features.
- Removing features often worsens model performance, because we throw away information.
- Causal modeling requires a big mind shift. For example, each feature interpretation might require a different selection of features. Also, causal modeling can conflict with predictive performance.
- Conditional interpretation entangles the interpretation.

Many of the options like decorrelation, grouping, causal inference, and conditional interpretation entangle the interpretation of the features.

## Entanglement Makes Interpretation Difficult

We’ll have a look at the entanglement of interpretation through the last approach in the list: **conditional interpretation**.

Many interpretation methods have a "**marginal**" and a "**conditional**" version:

- The marginal versions pretend that features are independent.
- The conditional versions manipulate data but keep the conditional distribution intact.

For example, feature importance:

- Permutation feature importance (marginal): feature values are shuffled, which is a data-driven version of sampling from the marginal distribution.
- Conditional feature importance: values are sampled from the feature's distribution conditional on all other features.
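A rough way to see the difference in code: marginal importance shuffles the feature freely, while the conditional version below approximates conditioning by shuffling only within quantile bins of a correlated partner feature. The data, model, and binning approximation are all illustrative assumptions, not a reference implementation:

```python
# Marginal vs. (approximately) conditional permutation importance.
# x1 and x2 are strongly correlated simulated features.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 2000
x2 = rng.normal(size=n)
x1 = x2 + rng.normal(0, 0.3, n)            # x1 correlated with x2
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(0, 0.1, n)
model = LinearRegression().fit(X, y)
baseline = r2_score(y, model.predict(X))

def marginal_importance(j):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])   # ignores the correlation entirely
    return baseline - r2_score(y, model.predict(Xp))

def conditional_importance(j, partner, n_bins=20):
    Xp = X.copy()
    edges = np.quantile(X[:, partner], np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(X[:, partner], edges)
    for b in np.unique(bins):              # shuffle only within each partner bin
        mask = bins == b
        Xp[mask, j] = rng.permutation(X[mask, j])
    return baseline - r2_score(y, model.predict(Xp))

print("marginal importance of x1:   ", marginal_importance(0))
print("conditional importance of x1:", conditional_importance(0, partner=1))
```

The conditional importance comes out much smaller: once we condition on x2, most of x1's information is already accounted for, so x1's "own" contribution shrinks.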

**Conditional interpretation fixes the extrapolation problem, but it entangles the interpretation of a feature with its correlated features.**

Compare the two:

- Partial dependence plot (marginal): the average change in the prediction when we change a feature *and keep the other features fixed*.
- M-Plot (conditional): the average change in the prediction when we change a feature *and change all correlated features accordingly*.
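The marginal version, the partial dependence plot, is simple to sketch: for each grid value, force the feature to that value for every row and average the predictions. The model and data below are invented for illustration:

```python
# Minimal partial dependence sketch (marginal version).
# y depends quadratically on feature 0 and linearly on feature 1.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-2, 2, size=(500, 2))
y = X[:, 0] ** 2 + X[:, 1] + rng.normal(0, 0.1, 500)
model = GradientBoostingRegressor().fit(X, y)

def partial_dependence(model, X, feature, grid):
    pd_values = []
    for v in grid:
        X_mod = X.copy()
        X_mod[:, feature] = v          # force the feature to v, keep others fixed
        pd_values.append(model.predict(X_mod).mean())
    return np.array(pd_values)

grid = np.linspace(-2, 2, 9)
pd_curve = partial_dependence(model, X, feature=0, grid=grid)
print(dict(zip(np.round(grid, 1), np.round(pd_curve, 2))))
```

The `X_mod[:, feature] = v` line is exactly where correlation gets ignored: every row is forced to the grid value, no matter what the other (possibly correlated) features say.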

Entanglement is bad when we want an isolated interpretation. **We often want isolated interpretations, because they are easy to understand**: for example, the isolated effect of temperature on bike rentals.

It’s much harder to interpret a mashed-up effect of temperature, season, humidity, ...

Correlation is not merely a technical issue. Correlation tells you that the features share information. Causal inference offers explanations of why features are correlated: one feature might cause the other, or the features might share a confounder.

## A Recipe To Deal With Correlation

You have a model that you want to interpret.

Follow these steps:

1. Find out whether your interpretation method of choice is affected by correlation. A good start is the Interpretable Machine Learning book. Even if it isn't affected, do step 2 to understand your data; otherwise, you should be fine.
2. Analyze correlation in your data with visualization and correlation measures (like Pearson correlation). If correlations are negligible, you are fine.
3. Compare the outcomes of marginal and conditional interpretation. Are they close? Then the correlation doesn't affect the model much, and either interpretation is fine.
4. Are you fine with a conditional interpretation, even though it's entangled? Good, you can stop. But make sure to understand the correlation between features for an in-depth interpretation.
5. Are you fine with a coarse group interpretation, and is it available for your interpretation method of choice? Use it, and you should be fine.
6. If you got this far, your problem goes deeper than shuffled data points. Time for more causal thinking: draw a DAG of how the variables are causally connected. What are the causal roles of the features?

In my book Interpretable Machine Learning, correlated features are also a constant theme, and for each interpretation method I discuss this issue.

If you found this newsletter on correlation and interpretation useful, maybe your colleagues and peers will appreciate it if you share it with them:
