Discover more from Mindful Modeler
The Case for Uninterpretable Machine Learning
Machine learning excels because it's not interpretable. Not in spite of it.
Interpretability is a constraint.1
Thanks for reading Mindful Modeler! Subscribe for free to receive new posts and support my work.
Every constraint shrinks the hypothesis space of models, making it less likely to find the best-performing one.
The deep learning revolution, sparked by AlexNet in 2012, showcased this. Designed for flexibility rather than interpretability, it dominated the ImageNet computer vision challenge. This trend extends to other fields like tabular prediction, where gradient-boosted trees like xgboost reign, and language modeling, in the grasp of transformers.
Embracing complexity reflects the intricacies of real-world tasks that seem to lack simple rules or semantics. And that means dropping interpretability.
Relying on flexible models such as neural networks and regulating them captures this complexity effectively. Computer scientists have fully embraced this complexity and taken over the steering wheel from statisticians for machine learning.
Interpretable models can still outperform complex ones, especially with smaller tabular datasets where strong inductive biases help. And when they do, it’s great! Simpler models are preferable for easier debugging, auditing, and communication. However, if you're agnostic about model selection, you must be open to the likelihood that a complex model could emerge as the best performer.
Some ML researchers and experts have taken this thinking to the extreme saying that we don’t need interpretability at all. However, embracing complexity doesn’t require abandoning all attempts to understand the model.
I’m a big fan of post-hoc interpretation methods because they are applied to the models after training. So they don’t interfere with the embrace of complexity that is the secret sauce of machine learning. But the post-hoc approach also means we have to let go of the hope of fully understanding or reconstructing the models. Post-hoc methods can only offer summaries of how the models behave.
The AlexNet team also took this route, studying the activation patterns of neurons. But only after model training. Performance was the priority.
Except for post-hoc interpretation methods as they are applied after the model is trained and don’t interfere with model training and selection.