2 Comments
Abdullah Mujeeb Khawaja

Someone recently showcased training an XGBoost model, feeding its leaf features as input to a small base transformer, and fine-tuning it with LoRA on roughly 6K trainable parameters; it reportedly outperformed all of these tabular foundation models on most tasks (a rough sketch of that pipeline is below).
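For context, here is a minimal sketch of what such a pipeline might look like, assuming per-tree leaf indices are used as token inputs to a tiny transformer. Every name and hyperparameter here is hypothetical, and the LoRA adapter is hand-rolled on the output head purely for illustration (a library like peft would apply it per attention layer); this is not the original post's actual setup.

```python
# Hypothetical sketch: XGBoost leaf features -> small transformer + LoRA.
import torch
import torch.nn as nn
import xgboost as xgb
from sklearn.datasets import make_classification

# 1. Train an XGBoost model and extract per-tree leaf indices as features.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
clf = xgb.XGBClassifier(n_estimators=32, max_depth=4).fit(X, y)
# Shape (n_samples, n_trees); each entry is the leaf a sample lands in.
leaves = clf.get_booster().predict(xgb.DMatrix(X), pred_leaf=True).astype(int)

# 2. Treat each tree's leaf index as a token for a tiny transformer encoder.
class LeafTransformer(nn.Module):
    def __init__(self, n_trees, n_leaves, d_model=32):
        super().__init__()
        self.embed = nn.Embedding(n_trees * n_leaves, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 2)
        # Offsets so tree t's leaf l maps to a unique embedding id.
        self.register_buffer("offsets", torch.arange(n_trees) * n_leaves)

    def forward(self, leaf_ids):                 # (batch, n_trees)
        h = self.encoder(self.embed(leaf_ids + self.offsets))
        return self.head(h.mean(dim=1))          # pool over tree tokens

model = LeafTransformer(n_trees=leaves.shape[1], n_leaves=leaves.max() + 1)

# 3. LoRA-style fine-tuning: freeze the base, train only low-rank adapters.
for p in model.parameters():
    p.requires_grad = False

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r=4):
        super().__init__()
        self.base = base                                             # frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))

    def forward(self, x):
        return self.base(x) + x @ self.A.T @ self.B.T               # Wx + BAx

model.head = LoRALinear(model.head)
trainable = [p for p in model.parameters() if p.requires_grad]
print(sum(p.numel() for p in trainable), "trainable parameters")

optim = torch.optim.Adam(trainable, lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
xb = torch.tensor(leaves, dtype=torch.long)
yb = torch.tensor(y, dtype=torch.long)
for _ in range(50):
    optim.zero_grad()
    loss_fn(model(xb), yb).backward()
    optim.step()
```

The parameter count here is far below 6K because only the head gets an adapter; adapting the attention projections as well would bring it into that range.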

What we should focus on instead is how to restructure tabular data so that nothing can come even close to the performance achieved by these much larger models - but that's just my take :D

Christoph Molnar

I saw the post. Its evaluation had some flaws and limitations, so I would take it with a grain of salt.