Stuck on Benchmark Island

Jul 16, 2024

When Model Performance Becomes A Self-Absorbing Goal

4 Comments

This "benchmark island" problem is actually identical to one found in many social science disciplines that rely on experimentation (known as the mutual-internal-validity problem). Here is the paper, in case you're interested (it's my own): https://doi.org/10.1177/174569162097477

Christoph Molnar

Jul 17

Thanks! There's a typo in the DOI: https://doi.org/10.1177/1745691620974773

Hause

Jul 17

Oops, thanks for catching that!

Gabriel

Jul 16

I think another reason for these benchmark islands is that academia, perhaps because of its limited access to real-world data and industrial use cases, focuses mainly on innovations in learning algorithms. Breaking this trend would require more collaboration with industry, but such collaboration is hard when a company is not open to publishing its research.

Mindful Modeler
