Can you draw scientific conclusions with interpretable machine learning?
Guest post by Timo Freiesleben.
Welcome to a new edition of Mindful Modeler. Today’s post is a premiere — a guest post by Timo Freiesleben. Timo is a philosopher, a machine learning researcher, and a good friend. Together we are writing the book Supervised Machine Learning for Science, which builds on our research on interpretability that he shares in this post. Enjoy the read!
Applied scientists may not even care about machine learning models themselves; they care about insights into the underlying phenomenon. In such applications, the machine learning model is often a means to an end. Medical scientists want to identify risk factors, not just diagnose. Biologists want to understand why certain amino acid sequences form certain proteins. Education scientists want to learn which conditions lead to student success, not just predict it.
Interpretable machine learning held the bold promise of opening the black box and answering such questions. However, current interpretable machine learning is model-centric: it is unclear what insight, if any, techniques like LIME or SHAP provide into the data and the underlying phenomenon. These techniques routinely break dependencies in the data and probe the model in regions where it has to extrapolate, or where the data-generating process is not even defined.
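To see what breaking dependencies looks like in practice, here is a minimal sketch (a toy example, assuming a simple linear dependence between two features; none of the names or numbers come from the post or the papers): marginally perturbing one of two strongly correlated features, as many model-centric methods do, produces points far from the region where real data lives, which is exactly where the model has to extrapolate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Two strongly correlated features (think height and weight).
x1 = rng.normal(0, 1, n)
x2 = 0.95 * x1 + 0.05 * rng.normal(0, 1, n)
X = np.column_stack([x1, x2])

# Marginal perturbation: replace x2 with values drawn independently of x1,
# ignoring their dependence. This is what "breaking dependencies" means.
X_perturbed = X.copy()
X_perturbed[:, 1] = rng.permutation(X[:, 1])

# How far do the points lie from the data manifold x2 ≈ 0.95 * x1?
dist_original = np.abs(X[:, 1] - 0.95 * X[:, 0])
dist_perturbed = np.abs(X_perturbed[:, 1] - 0.95 * X_perturbed[:, 0])
print(f"mean distance from manifold, original data:  {dist_original.mean():.3f}")
print(f"mean distance from manifold, perturbed data: {dist_perturbed.mean():.3f}")
```

Interpretations built on such off-manifold model evaluations describe the model's behavior, not the phenomenon that generated the data.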
Why not take a data- and phenomenon-centric approach to interpretable machine learning? In our research, we proposed to develop interpretation techniques that provide insight into the data and phenomenon by design (see the sketch after the list):
1. Start with a scientific question.
2. Show that this question could be answered if you had the best possible prediction model (i.e., the Bayes-optimal model) and unlimited data.
3. Provide an approximation of the answer given your trained ML model and finite data (= model interpretation).
4. Quantify, or at least qualitatively describe, the uncertainty about how far the model interpretation is from the ground truth.
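To make the recipe concrete, here is a hedged sketch on simulated data where the ground truth is known. The data-generating process, the choice of partial dependence as the target estimand, and the model are illustrative assumptions, not examples taken from the papers.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
n = 5_000

# Step 1: scientific question -- "how does the expected outcome change with x1,
# averaging over the other features?"
# Known data-generating process: y = 2*x1 + x2 + noise.
x1 = rng.normal(0, 1, n)
x2 = rng.normal(0, 1, n)
y = 2 * x1 + x2 + rng.normal(0, 0.5, n)
X = np.column_stack([x1, x2])

# Step 2: with the Bayes-optimal model f*(x1, x2) = 2*x1 + x2 and unlimited data,
# the partial dependence on x1 equals the target estimand 2*x1 (since E[x2] = 0).

# Step 3: approximate the answer with a trained model and finite data
# (= model interpretation), here a Monte Carlo partial dependence estimate.
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

def pdp(model, X_ref, feature, grid):
    """Average prediction over the data with one feature fixed to each grid value."""
    estimates = []
    for g in grid:
        X_mod = X_ref.copy()
        X_mod[:, feature] = g
        estimates.append(model.predict(X_mod).mean())
    return np.array(estimates)

grid = np.array([-1.5, -0.5, 0.0, 0.5, 1.5])
estimates = pdp(model, X, feature=0, grid=grid)

# Step 4 (informally): compare the estimate to the known estimand; the gap
# reflects model bias/variance and Monte Carlo error.
for g, v in zip(grid, estimates):
    print(f"x1 = {g:+.1f}   PDP estimate = {v:+.2f}   estimand 2*x1 = {2 * g:+.2f}")
```

Because the data-generating process is simulated, the gap between estimate and estimand is directly visible; in real applications, it is exactly this gap that step 4 asks us to quantify.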
Grounding interpretation techniques in concrete scientific questions would make them more useful to the scientific community and encourage scientists to adopt them. It would also finally make clear what the ground-truth explanations are, namely the target estimands in the phenomenon.
We have published two papers on the subject:
One is a more philosophical paper that introduces the framework explained above. The paper connects machine learning models with traditional scientific models, discusses what insights into the data current interpretation techniques such as SAGE and conditional SHAP provide, and highlights the limited causal insight into the phenomenon we can gain from analyzing machine learning models.
The other is a statistical paper that shows how to quantify uncertainty for two global interpretation techniques, conditional feature importance and partial dependence plots. There we dissect different sources of uncertainty: the bias and variance of the model that arise from the learning process, and the variance that arises from the Monte Carlo approximation used to compute the interpretation.
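As a rough illustration of these two sources (not the estimators proposed in the paper), the sketch below separates, for a single grid point of a partial dependence plot, the Monte Carlo spread obtained by re-estimating the interpretation for a fixed model from the spread obtained by refitting the model on resampled training data. The simulated data, the naive resampling scheme, and all settings are assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n = 3_000

# Simulated data with known data-generating process: y = 2*x1 + x2 + noise.
x1 = rng.normal(0, 1, n)
x2 = rng.normal(0, 1, n)
y = 2 * x1 + x2 + rng.normal(0, 0.5, n)
X = np.column_stack([x1, x2])

def pdp_at(model, X_ref, value, feature=0):
    """Monte Carlo partial dependence at one grid value: average prediction
    over X_ref with the chosen feature fixed to `value`."""
    X_mod = X_ref.copy()
    X_mod[:, feature] = value
    return model.predict(X_mod).mean()

# Source 1: Monte Carlo variance of the interpretation for a *fixed* model,
# from averaging over different random subsamples of the data.
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
mc_estimates = [
    pdp_at(model, X[rng.choice(n, size=200, replace=False)], value=1.0)
    for _ in range(50)
]

# Source 2: variance from the learning process, crudely approximated by
# refitting the model on bootstrap resamples and evaluating the PDP on all
# data; bias shows up as a systematic offset from the known estimand.
boot_estimates = []
for b in range(20):
    idx = rng.choice(n, size=n, replace=True)
    model_b = RandomForestRegressor(n_estimators=100, random_state=b).fit(X[idx], y[idx])
    boot_estimates.append(pdp_at(model_b, X, value=1.0))

print(f"Monte Carlo std (fixed model, subsampled data): {np.std(mc_estimates):.3f}")
print(f"Bootstrap std (refitted models, full data):     {np.std(boot_estimates):.3f}")
print(f"Ground-truth estimand at x1 = 1.0:              {2 * 1.0:.2f}")
```

The resampling here is deliberately naive; the point is only to show the decomposition into learning-induced uncertainty and Monte Carlo approximation error that the paper makes precise.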
Ultimately, this unique perspective heavily influenced how we wrote the Interpretability chapter in Supervised Machine Learning for Science. I hope you enjoy reading the papers and chapter!