One might argue (per Breiman) that statisticians never had the wheel to begin with! ML has always been a practical affair.
That's also how I describe it in Modeling Mindset. ML is very task-oriented and less concerned with making the right assumptions about the data-generating process.
For non-tabular datasets, what are the available model-agnostic post hoc IML methods that are comparable to methods like ALE for tabular datasets? Is this the subject of Chapter 10 of your IML book (https://christophm.github.io/interpretable-ml-book/neural-networks.html) or do you have more to say on this topic?
For tabular datasets, I would think that model-agnostic post hoc methods like ALE resolve this conflict: go all out for performance, then interpret the best-performing model. From that perspective, I understand you to be saying that we should focus almost exclusively on performance while model-building and then use suitable interpretation methods afterwards, rather than limiting ourselves to only "interpretable" models while we are still in the development phase. Is that the basic gist of your article (from the tabular dataset perspective)?
Yes, that's a good summary. Maybe with the edge case that when an interpretable model is close in performance, it may be worth forgoing a bit of performance for a structurally simpler model.
All model-agnostic methods, from ALE to PFI and SHAP, are capable of this. They target different aspects of the model.
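To make that workflow concrete, here is a minimal sketch of the "performance first, interpret afterwards" approach using scikit-learn. I use permutation feature importance (PFI) as the post hoc method, since it ships with scikit-learn; ALE and SHAP live in separate packages (e.g. PyALE, shap) but plug into the same fitted model in the same way. The synthetic dataset and model choice here are just assumptions for illustration.

```python
# Sketch: train a black-box model for performance, then apply a
# model-agnostic post hoc interpretation method (PFI) to the fit.
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a tabular dataset.
X, y = make_regression(n_samples=500, n_features=5, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 1: go all out for performance with a black-box model.
model = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)

# Step 2: interpret the best-performing model post hoc,
# without relying on any model internals.
pfi = permutation_importance(model, X_test, y_test,
                             n_repeats=10, random_state=0)
for i, imp in enumerate(pfi.importances_mean):
    print(f"feature {i}: importance {imp:.3f}")
```

The same fitted `model` object could be handed to an ALE or SHAP implementation afterwards; each method probes a different aspect of the model (feature effects vs. importance vs. attributions).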
Hi Christoph, I can see where you are coming from. My concern is when interpreting the model is key. For example, being able to clearly explain to a regulator why someone didn't get a loan.
Fair enough. Sometimes there are external constraints that you have to account for in your modeling process.