4 Comments

This "benchmark island" problem is actually identical to one found in many social science disciplines that rely on experimentation (known as the mutual-internal-validity problem). Here's a paper in case you're interested (full disclosure: it's mine): https://doi.org/10.1177/174569162097477

Thanks! There's a typo in the DOI: https://doi.org/10.1177/1745691620974773

Oops, thanks for catching that!

I think another reason for these benchmark islands is that academia, maybe due to its limited access to real-world data and industrial use cases, focuses mainly on innovations in learning algorithms. Breaking this trend would require more collaboration with industry, but such collaboration is hard when a company is not open to publishing its research.
