From a business practitioner's perspective, I might add 'justify the investment and increase the chance of business implementation success'.
Some companies are complicated environments, with many stakeholders at different levels of knowledge and with different fears. To bring a model from idea to production, it can help to have interpretability-by-design, or at least interpretability-during-design, to align experts and drive commitment to change. Does this make sense?
I'd say justification in general is one of the goals, so yes, that makes a lot of sense. Being able to explain the model to others can be as important as demonstrating the model's performance.
I really enjoyed reading this. You've identified a real failing in the way machine learning interpretability is taught, where the big-picture question of "what is the goal?" is an afterthought or even ignored. Far too often, people produce an importance chart and call it a day.
I'm looking forward to the follow-up posts. I hope you can also tie the question of "which data to use" into these posts on "which tool to use", because I often find the former even more difficult than the latter, at least in my work. Currently I'm trying to design anti-discrimination tests that could be applied to machine learning models.
Questions on the COMPAS algorithm: is the issue really about ML, or simply that the algorithm isn't transparent? My understanding is that protecting IP is the real reason for the lack of transparency, and the particular algorithm used has little to do with it. If they released the code of their algorithm to the public, people like yourself could approximately (if not perfectly) understand how it's making decisions, but then you could also reverse-engineer it and sell it yourself.
I skimmed Rudin 2018, and I think that's what the paper is arguing, although I'm not sure I agree that drastically simplifying the model to gain interpretability is always worth the trade-off, even in "high-stakes" situations. I'm OK with 50-variable ML models, as long as good explainability is also provided.
Good points. Often it's just a lack of transparency: companies don't want to, and don't have to, share how the model works or provide any form of explainability.
Great summary post!