Book Launch: ML for Science 🐦⬛
A philosophical and pragmatic justification of supervised machine learning in science that you can read for free
“Supervised Machine Learning for Science” combines many strands of my career, from interpretability research to work in statistical consulting to my participation in Kaggle competitions. It’s the first book I’m writing 50/50 with a co-author: Timo Freiesleben brings his philosophy of science perspective and deep views on topics such as causality, interpretability, and robustness to the book.
We are happy to share this book project with you today. 🥳🥳
tl;dr: Read the book at ml-science-book.com
What’s in the book
Machine learning is gaining an increasing foothold in science. More and more researchers rely on machine learning to answer questions. Where are the African elephants? Train a classification model that predicts their occurrences. What will the weather be like? Deep learning. How does sarcasm work? Train a text classifier.
Part one of the book focuses on how exactly machine learning fits into science and how we may justify its use. Prediction and science are closely linked. Machine learning brings other strengths to scientific modeling as well: it can handle many different data modalities, it can be faster than some simulations and physical models, and it adapts the model to the world (that is, the data) rather than the other way around.
However, machine learning doesn’t come with “batteries included”. It’s like buying a video game console without a controller, without a memory card (you know, back in the day), without a power cord, and without a TV. On its own, machine learning seems like an incomplete tool for doing science: it lacks transparency, it learns correlations rather than causes, and it ignores domain knowledge.
That’s what part two of this book is about. The controllers, memory cards, and TVs for machine learning do exist; the problem is that the knowledge and tools are scattered all over the place. The second part of the book dives into these topics and discusses how they help make machine learning a better tool for research:
- Machine learning interpretability provides insights into the models
- Causal modeling provides language and tools to learn more than mere correlations
- Robustness guards models against distribution shifts
- Domain knowledge can be incorporated into the model
The book isn’t finished yet. Just like my first book, the plan is to release it chapter by chapter. But a good chunk of chapters is ready at launch: Part 1 is complete, and Part 2 currently covers Domain Knowledge, Interpretability, and Causality. The remaining chapters are coming soon.
The book is free to read for everyone with an Internet connection.
You can find it at https://ml-science-book.com/
The book is still a work in progress, which means we are happy to receive any feedback you might have. Enjoy the read!
Great book! I especially liked the chapter on interpretability.
Is there anywhere in the book where you distinguish between interpretability and explainability?
In my opinion, a subsection on gradient-based explanation methods would be nice, as Grad-CAM is popular (at least in the papers I read on explainability), and it would help to distinguish saliency maps from Grad-CAM.
Are you also planning on writing a chapter on PINNs?