7 Comments
Abdelkrim Bouasria

Interesting! Here is an additional survey, with both a repo and a paper: https://github.com/LAMDA-Tabular/Tabular-Survey

Chris Bartley

Really appreciate this write-up, Christoph. I'd love to see more (unbiased) work on the performance and complexity tradeoffs of TFMs vs. traditional ML for different applications.

LW

Truly appreciate your write-up. I have learned a lot. I think Kumo has extended TFMs because, as most ML engineers say, assembling a dataset is 90% of the job. Once the data is in a tabular format, the difference between a SOTA TFM and CatBoost could be statistically significant but practically insignificant.

What Kumo adds is that there is no need to assemble a dataset through feature engineering (assuming all raw data sit in various tables in a database). Its FM also trains a third attention layer, which learns how "hopping" between tables should update the information in a cell (treating each table as a node and the relationships between them as edges). I have not given it a try yet, but the idea of getting away from a tabular format while completing a supervised learning task is quite refreshing. (BTW, I am not affiliated with Kumo.)

How do you see the direction/future of specialized TFMs, similar to the LLMs for finance?

David Landy

Something I'm not getting yet about TFMs relates to this comment of yours, Christoph: "And it kind of makes sense, because all of a sudden, you can pre-train a model for specific industries and scenarios."

If you have a foundation model trained on millions of synthetic datasets for, say, classification (classification in general), why would you need foundation models for specific industries? In the end, most of these industries simply need models suited to generic tasks like classification, regression, anomaly detection, etc. So you'd think that startups would form around the prediction task, not the industry. E.g., consider banking. Banks need default likelihood models, which are basically classification models. So just use a TFM suited to classification tasks. If the capabilities are that general, why the need for a specific banking model?

Christoph Molnar

Good point. Maybe the sentence would be better without "industries", because it's about the structure of the task and the data.

This paper is an example of that: https://arxiv.org/pdf/2603.22738. They adapt TabPFN for multi-task predictions, specifically geared towards predicting steel properties.

With pre-training, you can further "niche down" your TFM. But of course, the model trained for steel prediction also works in other industries where the task is similarly structured.