Discussion about this post

Ibtesam Ahmed

Hi Christoph, thanks for writing a very useful article. You said: "If X2 is a noisy copy of a strong feature X1, then by sampling the features that a node can use for splitting, in many cases X2 will be available, but not X1, and the split will be based on X2." I was wondering why X2 would be picked over X1. Since X1 is less noisy, it would lead to a more homogeneous split, right? Ideally, X1 should be picked more

Pietro

Brilliant article! I'm a new reader, but I'm trying to catch up on the older articles and books, and I'm enjoying them very much, especially this inductive bias series.

I have a question: Why do you think tabular data tends to exhibit non-smooth patterns in the underlying 'real' function? Coming from a background in physics, I'm accustomed to mostly using smooth and continuous functions to model reality. It feels unusual to expect a non-smooth model to perform better. Could you elaborate on this? Do you have any papers or information that could provide more insight into this? Thanks!
