What A Horse Can Tell Us About Machine Learning
How to find out whether your model relies on spurious correlations
Von Osten, an amateur horse trainer and math teacher, thought a bit too much about how he could combine his two skills. He ended up teaching his horse Hans to do math.
When asked what 2 + 5 is, Hans would tap his hoof seven times to give the answer.
Hans impressed many people, earning him the honorary title “Clever Hans”.
Hans and his owner were tested thoroughly by a committee organized by the German government, which found no evidence of fraud. It took a second investigation, by the psychologist Oskar Pfungst, to reveal the secret.
Clever Hans didn’t know any math.
But how did he solve the math questions then?
Clever Hans was reading cues from his owner. When the taps reached the correct number, Von Osten subconsciously showed relief, and the horse picked up on his facial expression and body language. That’s why Hans failed when he couldn’t see his owner or when the person asking didn’t know the answer himself, as Oskar Pfungst revealed in his experiments. Von Osten doesn’t seem to have been a fraud, though, since Hans managed to get the answers right even when reading the cues of other people.
Clever Hans was clever indeed, just working in different ways than expected.
Clever Hans Machine Learning
Hans was a horse, not a machine.
Yet machine learning models will take similar shortcuts when given the option, and with the same problem: we think they make decisions in a certain way, while they actually do it quite differently. Some examples:
Image classifiers that rely on text printed on the images instead of visual features.
A wolf-versus-dog classifier that relies on snow in the background as a feature (instead of the animal’s appearance).
Whale detection that relies on artifacts in the audio files instead of the audio content.
One way to talk and think about these cases is the language of causality: Clever Hans Predictors (borrowing the term from the paper referenced at the end of this post) rely on spurious correlations.
Whenever you design or evaluate a machine learning model, put on the detective hat at least once and look for clues that the model might in fact be a Clever Hans Predictor. Or you could just put it in production and watch it wreak havoc.
Unfortunately, Clever Hans-like behavior doesn’t show up in the performance metric. On the contrary: using spurious correlations is not only a shortcut for making predictions, it can also lead to much better scores than would otherwise be possible.
The reason we don’t want Clever Hans predictors is that they aren’t robust. They might just work on the training data and nowhere else.
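To make that concrete, here is a minimal, purely synthetic sketch (using scikit-learn; the “snow in the background” style feature and all numbers are made up for illustration). A spurious feature tracks the label almost perfectly in the training distribution, the model leans on it and looks great on an i.i.d. test set, and then falls apart once the correlation breaks.

```python
# Minimal synthetic sketch of a Clever Hans predictor.
# A spurious feature tracks the label almost perfectly in the training
# distribution, so the model leans on it -- and the usual i.i.d. test metric
# looks great right up until that correlation breaks at deployment time.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_data(n, p_spurious_matches_label):
    """One weak genuine feature plus one spurious feature."""
    y = rng.integers(0, 2, size=n)
    genuine = y + rng.normal(scale=2.0, size=n)  # weak real signal
    spurious = np.where(rng.random(n) < p_spurious_matches_label, y, 1 - y)
    return np.column_stack([genuine, spurious]), y

# Training and i.i.d. test data: the spurious feature agrees with the label 95% of the time.
X_train, y_train = make_data(5000, 0.95)
X_test, y_test = make_data(5000, 0.95)
# "Deployment" data: the spurious correlation is gone (pure noise).
X_deploy, y_deploy = make_data(5000, 0.5)

model = LogisticRegression().fit(X_train, y_train)
print("i.i.d. test accuracy:", accuracy_score(y_test, model.predict(X_test)))        # looks great
print("deployment accuracy: ", accuracy_score(y_deploy, model.predict(X_deploy)))    # collapses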
What can you do to spot Clever Hans Predictors?
Is the model performance surprisingly good? This would be the first red flag.
Causal thinking: You might be able to spot problems just by reasoning about your data. Maybe draw a DAG (directed acyclic graph) to understand the relationships and to find out whether a confounder is missing. This works especially well for tabular data.
Look at the data: For non-tabular data like images, you will have a hard time with “causal thinking” alone. You actually have to look at the images to spot, for example, watermarks that the model might rely on.
Interpretation methods: Methods like LIME highlight the features, image regions, or text passages the model relied on, like a spotlight. If this spotlight shines on something unusual, you can investigate whether your model might have learned to behave like Clever Hans (a sketch follows after this list).
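As a rough illustration of that last point, here is what such a spotlight could look like with LIME’s image explainer. The `model` and `image` below are placeholders, not from the original post: any classifier whose `predict` function returns class probabilities for a batch of images will do.

```python
# Rough sketch: highlight which image regions a classifier relied on, using LIME.
# `model` and `image` are placeholders -- `model.predict` is assumed to return
# class probabilities for a batch of images, and `image` is assumed to be a
# uint8 (H, W, 3) numpy array in [0, 255].
import numpy as np
import matplotlib.pyplot as plt
from lime import lime_image
from skimage.segmentation import mark_boundaries

def classifier_fn(images):
    # LIME passes a batch of perturbed images; return class probabilities.
    return model.predict(np.asarray(images))

explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(
    image,
    classifier_fn,
    top_labels=1,
    hide_color=0,
    num_samples=1000,  # number of perturbed samples LIME evaluates
)

# Show the superpixels that contributed most to the top predicted class.
label = explanation.top_labels[0]
img, mask = explanation.get_image_and_mask(
    label, positive_only=True, num_features=5, hide_rest=False
)
plt.imshow(mark_boundaries(img / 255.0, mask))
plt.axis("off")
plt.show()
```

If the highlighted regions sit on a watermark, the snow, or the image border rather than on the object itself, you have probably found a Clever Hans Predictor.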
This post was inspired by the paper “Unmasking Clever Hans Predictors and assessing what machines really learn”.