Brilliant article! I'm a new reader, but I'm trying to catch up on the older articles and books, and I'm enjoying them very much. I'm really enjoying this inductive bias series.
I have a question: Why do you think tabular data tends to exhibit non-smooth patterns in the underlying 'real' function? Coming from a background in physics, I'm accustomed to mostly using smooth and continuous functions to model reality. It feels unusual to expect a non-smooth model to perform better. Could you elaborate on this? Do you have any papers or information that could provide more insight into this? Thanks!
Thanks Pietro. That gives me a lot of motivation.
Good question. I'm just guessing, but let's say both smooth and non-smooth patterns occur in the "true" relationship between features and target. Tree-based models can capture non-smooth patterns well, and they can at least approximate smooth patterns. Maybe it's that neural networks can't approximate non-smooth patterns well enough. But that's just speculation.
As for papers, I relied most on "Why do tree-based models still outperform deep learning on tabular data?" (https://arxiv.org/abs/2207.08815) when writing this post.
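To make the guess above a bit more concrete, here is a toy sketch in plain Python. It is only an illustration under strong assumptions: the data are synthetic and one-dimensional, the "smooth" model is a straight-line least-squares fit standing in for a smooth learner (it is not a neural network), and the helper names (`fit_tree`, `predict_tree`, `fit_line`, `mse`) are made up for this example. It compares the two models on a step-shaped target and on a smooth target.

```python
def fit_tree(xs, ys, depth):
    """Fit a tiny 1-D regression tree (hypothetical helper).

    Returns a leaf mean (float) or a tuple (threshold, left, right).
    """
    mean = sum(ys) / len(ys)
    if depth == 0 or len(set(xs)) < 2:
        return mean
    pairs = sorted(zip(xs, ys))
    best = None
    # Try every split between two distinct neighbouring x values
    # and keep the one with the lowest sum of squared errors.
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        lm = sum(left) / len(left)
        rm = sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, i)
    i = best[1]
    threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
    lx, ly = zip(*pairs[:i])
    rx, ry = zip(*pairs[i:])
    return (threshold,
            fit_tree(list(lx), list(ly), depth - 1),
            fit_tree(list(rx), list(ry), depth - 1))


def predict_tree(tree, x):
    """Walk down the tree until a leaf mean is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def mse(preds, ys):
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


# Synthetic data (an assumption of this sketch, not real tabular data).
xs = [i / 99 for i in range(100)]
step = [1.0 if x > 0.5 else 0.0 for x in xs]   # non-smooth target
smooth = list(xs)                              # smooth target, y = x

stump = fit_tree(xs, step, depth=1)
slope, intercept = fit_line(xs, step)
tree_step_mse = mse([predict_tree(stump, x) for x in xs], step)
line_step_mse = mse([slope * x + intercept for x in xs], step)

deeper = fit_tree(xs, smooth, depth=4)
slope2, intercept2 = fit_line(xs, smooth)
tree_smooth_mse = mse([predict_tree(deeper, x) for x in xs], smooth)
line_smooth_mse = mse([slope2 * x + intercept2 for x in xs], smooth)

print("step target:   tree", tree_step_mse, "line", line_step_mse)
print("smooth target: tree", tree_smooth_mse, "line", line_smooth_mse)
```

On this toy data, a single split recovers the step, while the straight line cannot. On the smooth target the line fits essentially perfectly and the depth-4 tree gets close with a piecewise-constant approximation. This shows the asymmetry I was guessing at, but it says nothing about how neural networks behave on real tabular data.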