You Can Break A Predictive Model By Using It - How To Spot And Fix Performative Prediction
Predictions can change the actual outcome, a phenomenon called performative prediction. This distribution shift can severely hurt model performance, so I'll share approaches to mitigate the problem.
Predictions can change the future.
That’s a problem for supervised learning. Machine learning models don’t deal well with changes in data distribution. But this time it might totally be the model’s fault:
Predictions can change the actual outcome, a phenomenon called performative prediction.
Once You Understand Performative Prediction, You See It Everywhere
Performative prediction is common but often overlooked. Take a look at these typical modeling tasks:
Rent index: A model that landlords must adhere to when increasing rents.
Churn model: Predict which customers will quit their contracts.
Credit score: The model decides who gets a loan.
Navigation: Predict traffic jams.
Each of these models produces predictions. The predictions can affect the outcome they try to predict:
Rent: Future rents have to conform to the model predictions (self-fulfilling prophecy).
Churn: The marketing department could make a better offer to customers who are at risk of churning.
Credit: Applicants might start to optimize against the system.
Navigation: Drivers avoid busy streets.
These changes in human behavior cause a distribution shift: the relationship between features and actual outcome changes. The distribution shift can affect the performance of the model that was trained with the “old” distribution:
Rent: The model might work even better since landlords have to adhere to it: New rents are set according to the model.
Churn: The churn predictions are now invalid for the customers the marketing department has interacted with.
Credit: Loan applicants might change their behavior. Especially disadvantaged applicants (those whose loans were rejected) might try to improve their scores.
Navigation: The traffic jam might never occur because the alerted drivers avoid this route.
In all these examples, the model predictions affect the outcome. Through deployment, the model changes from observer to actor. And the predictions become part of the (future) data-generating process.
Expressed through the eyes of a statistician:
We train a model on a distribution.
Deployment causes a distribution shift.
This can change the model performance - for better or worse.
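To make those three steps concrete, here is a minimal simulation in a churn-like setting. Everything in it is made up for illustration: the feature, the decision threshold, and especially the assumption that a retention offer halves a customer's churn probability.

```python
# Toy simulation of performative prediction (all numbers are hypothetical):
# a churn model is trained on the "old" distribution, then deployment
# triggers retention offers that change the very outcome it predicts.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

def churn_prob(x, offer):
    # True churn probability; assume a retention offer halves the risk.
    base = 1 / (1 + np.exp(-(2 * x - 1)))
    return np.where(offer, 0.5 * base, base)

# 1. Train on the old distribution: no customer receives an offer.
x_train = rng.normal(size=5_000)
y_train = rng.binomial(1, churn_prob(x_train, offer=False))
model = LogisticRegression().fit(x_train.reshape(-1, 1), y_train)

# 2. Deploy: marketing sends offers to everyone predicted to churn.
x_new = rng.normal(size=5_000)
p_hat = model.predict_proba(x_new.reshape(-1, 1))[:, 1]
offer = p_hat > 0.5
y_new = rng.binomial(1, churn_prob(x_new, offer))

# 3. The intervention shifts the distribution and the measured performance.
auc_untouched = roc_auc_score(rng.binomial(1, churn_prob(x_new, offer=False)), p_hat)
auc_deployed = roc_auc_score(y_new, p_hat)
print(f"AUC without intervention: {auc_untouched:.3f}")
print(f"AUC after offers:         {auc_deployed:.3f}")
```

The second AUC comes out lower: the offers weaken the link between the model's ranking and the realized outcomes, exactly the shift described above.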
How to Spot Performative Prediction
Not all models are affected by performative prediction. But many are. Here are the minimal ingredients needed to cause performative prediction:
Deployed (or otherwise used) predictive model
Predictions affect humans or other actors in a system (trading algorithms, temperature controllers, self-driving AI, …)
Just two ingredients, and both are common. That’s why you find performative prediction everywhere.
Here is how to spot if a model is affected by performative prediction:
Identify whether predictions affect humans or other agents.
Monitor distribution shifts. Performative prediction always causes a distribution shift. A distribution shift, however, can also have other causes (see the monitoring sketch after this list).
Investigate the model through other mindsets:
Apply systems thinking: Which system is the model part of? How is the model connected with other system components? Are there feedback loops?
Draw a directed acyclic graph (DAG) to think through and discuss the causal implications of the predictions in the data-generating process.
Use the rational actor model: How would the behavior of a rational actor change when the prediction model becomes part of the equation?
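For step 2, here is a minimal monitoring sketch, assuming you log live feature values. The feature names, distributions, and significance threshold are all hypothetical; the idea is to compare each feature's live distribution against the training data with a two-sample Kolmogorov-Smirnov test.

```python
# Minimal distribution shift monitor (hypothetical features and thresholds):
# compare live data against training data with a two-sample KS test.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)
train = {"tenure": rng.normal(24, 6, 10_000), "usage": rng.normal(50, 10, 10_000)}
live = {"tenure": rng.normal(24, 6, 2_000), "usage": rng.normal(45, 10, 2_000)}

for feature in train:
    stat, p_value = ks_2samp(train[feature], live[feature])
    status = "possible shift" if p_value < 0.01 else "ok"
    print(f"{feature}: KS statistic={stat:.3f}, p={p_value:.3g} -> {status}")
```

A flagged feature only tells you that a shift happened, not why. Confirming performative prediction as the cause still requires the reasoning in step 3.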
Once you have identified performative prediction, you might want to address it.
How to Deal With Performative Prediction
In a situation of performative prediction, our goal might be performative stability: a distribution equilibrium where model predictions don’t further change the outcome.
How do we reach performative stability or at least reduce the impact of performative prediction?
Retrain the model frequently and hope that the distribution stabilizes down the road (a minimal sketch follows after this list).
Reframe the task as a reinforcement learning problem.
See deployment as an intervention. Analyze with causal inference.
Avoid non-causal features.
Keep model workings secret. That’s what many companies do so that their algorithms cannot be “gamed”. Examples are credit scoring and search rankings.
Observe a control group that is not affected by the predictions. Not always possible.
For some predictive tasks, performative prediction is a desirable outcome, as in the rent index example. In that case, do nothing.
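To illustrate the first strategy, here is a toy sketch of repeated retraining in a setting where, purely by assumption, the data-generating process reacts to the deployed model's coefficient. In this toy world, retraining converges to performative stability.

```python
# Toy sketch of repeated retraining toward performative stability.
# Assumption for illustration: the outcome's true slope drifts with the
# currently deployed coefficient theta (reaction strength 0.3).
import numpy as np

rng = np.random.default_rng(2)

def sample_data(theta, n=10_000):
    x = rng.normal(size=n)
    y = (2.0 + 0.3 * theta) * x + rng.normal(scale=0.5, size=n)
    return x, y

theta = 0.0  # coefficient of the initially deployed model
for step in range(10):
    x, y = sample_data(theta)          # distribution induced by deployment
    theta_new = (x @ y) / (x @ x)      # least-squares refit on induced data
    print(f"step {step}: theta = {theta_new:.3f}")
    if abs(theta_new - theta) < 1e-2:  # retraining barely moves the model:
        break                          # (approximately) performatively stable
    theta = theta_new
```

Here the refit is a contraction, so the coefficient settles after a few rounds. Real systems offer no such guarantee, which is why the other strategies on the list exist.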
Think Bigger: Even Statistical Findings Can Be Affected By Performative Prediction
Performative prediction is not restricted to supervised learning.
Maybe you’ve heard the claim that a glass of red wine per day is healthy for your heart. This research dates back to 1979 and is the outcome of statistical analysis. Diet research is entangled and complex, and I don’t want to discuss the validity and reliability of the findings here.
It’s about this question: Does the finding itself change the outcome?
“Red wine is healthy” is a meme in the sense of an idea that is echoed in our Western societies.
Health-aware people might follow this advice along with a big portfolio of other health-related behaviors such as working out, meditating, and eating a healthy diet.
If healthy people pick up red wine, they reinforce the correlation between red wine and healthy outcomes (but the causal relationship doesn’t change).
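A toy simulation makes the mechanism concrete (all effect sizes are made up): health awareness drives both wine drinking and heart health, so the two correlate even though wine has zero causal effect in this simulation.

```python
# Toy confounding simulation (hypothetical effect sizes): health awareness
# causes both wine drinking and heart health; wine itself does nothing.
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
awareness = rng.normal(size=n)                 # hard-to-measure confounder
wine = awareness + rng.normal(size=n) > 0.5    # health-aware people adopt the meme
heart_health = awareness + rng.normal(size=n)  # zero causal effect of wine

print(f"mean heart health, wine drinkers: {heart_health[wine].mean():.2f}")
print(f"mean heart health, non-drinkers:  {heart_health[~wine].mean():.2f}")
```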
Researchers try to adjust for such confounders. But given the publicity of the red wine meme, it might be much harder today to disentangle which effect comes from the red wine and which from a person’s hard-to-measure degree of “health-awareness”.
"See deployment as an intervention. Analyze with causal inference" - best of all in my view,
Plus "Keep model workings secret. That’s what many companies do so that their algorithms cannot be “gamed”. Examples are credit scoring and search rankings." - what will happen when XAI kicks in and counterfactuals can expose the model don't they ?
Thank you Chris for a great and informative article!