Hi Chris,
Do you know of a toy example for which a tree model beats TabPFN?
Regards
You can hover over the dots in https://huggingface.co/spaces/Neuralk-AI/tabbench and see for which ones tree-based models outperform TFMs.
Thank you! I would love to see the win cluster, but with additional feature engineering like that from https://arxiv.org/pdf/2606.02384
This is still on my to-read list. I would also be curious to see the impact of more automated feature engineering in these benchmarks.
Thanks for the post! I have a question: are TFMs adequate for very small datasets with p >> n without having to apply dimensionality reduction?
There is TabPFNWide, which is specifically pre-trained for such settings. Otherwise, I haven’t tried TFMs in such settings yet.
Thanks for bringing this to my/our attention!
I believe I left a similar comment somewhere in your series about PFNs, but the question still bugs me (and might be worth a chapter in your TFM book ;-))...
TFMs are trained mostly on synthetic data generated using causal graphs under the hood. Understanding whether this allows deriving a feature causality ranking, rather than a feature importance ranking, without having to create a causal graph for your features/data is super interesting in my mind. The TabBench folks seem to rely mostly on classical prediction KPIs, though, and not on causality KPIs.
Here's a recent paper that takes a more causality driven point of view: https://arxiv.org/pdf/2605.08786
Wouldn't a similar benchmark using something like the "Recall@k" from the above paper be crazy interesting?
There are attempts to apply PFNs for causal settings. I haven’t seen anything yet for feature causality ranking. I feel like we’ll see more applications of PFNs in the causality space.