5 Comments

Insightful post! One question: why quantify aleatoric uncertainty via variance instead of, e.g., entropy? Is there a principled argument, or is it just a matter of modeling preference?

Thanks!

As a statistician, variance seems to me a "natural" choice when you frame aleatoric uncertainty through the conditional probabilistic model P(Y|X=x).

But entropy should also be a sensible option.

Makes sense. Are they monotonically correlated? If a distribution has higher variance, does it also have higher entropy?
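One way to probe this question is a small numerical check. The sketch below is an illustration only, not something from the post: the helper names (`variance`, `entropy`, `gaussian_entropy`) and the two toy distributions are assumptions chosen for the example. Within the Gaussian family, differential entropy grows with variance, but across arbitrary distributions the ordering can flip.

```python
import math

def variance(values, probs):
    """Variance of a discrete distribution given its support and probabilities."""
    mean = sum(p * v for v, p in zip(values, probs))
    return sum(p * (v - mean) ** 2 for v, p in zip(values, probs))

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def gaussian_entropy(sigma):
    """Differential entropy (in nats) of a Gaussian with standard deviation sigma."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

# Hypothetical toy distributions, chosen only to illustrate the point.
two_point = ([-10.0, 10.0], [0.5, 0.5])            # far apart, only two outcomes
three_point = ([-1.0, 0.0, 1.0], [1/3, 1/3, 1/3])  # close together, three outcomes

var_two, ent_two = variance(*two_point), entropy(two_point[1])
var_three, ent_three = variance(*three_point), entropy(three_point[1])

# The two-point distribution has the larger variance but the smaller entropy.
print(var_two > var_three, ent_two < ent_three)

# Within the Gaussian family, a larger sigma gives a larger entropy.
print(gaussian_entropy(2.0) > gaussian_entropy(1.0))
```

So in this sketch the two measures agree inside one parametric family but need not agree in general: entropy ignores how far apart the outcomes are, while variance depends on it.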

Great post! I have a question. In a simulated dataset, where one has more control over the data-generating process, suppose one deliberately controls which features contribute most to the aleatoric uncertainty. Is it then fair to say that those features carry aleatoric uncertainty, given that we are controlling the uncertainty there and the possible overlap between classes?

I would say it's fair to call it aleatoric uncertainty: even though you control the random number generator, no model can beat this uncertainty.
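A minimal simulation can make this concrete. Everything in the sketch below is an assumption made for illustration: a single feature, two equally likely classes centered at -1 and +1, Gaussian noise with a chosen `sigma`, and hypothetical helper names. Even the best possible rule for this setup (predict the sign of `x`) keeps making errors at a rate fixed by the class overlap we injected.

```python
import math
import random

def simulate(n, sigma, seed=0):
    """Draw n (x, y) pairs: y is -1 or +1 with equal probability, x ~ Normal(y, sigma)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        y = rng.choice([-1, 1])
        x = rng.gauss(y, sigma)
        data.append((x, y))
    return data

def bayes_error(sigma):
    """Theoretical error of the optimal rule sign(x) for this setup: Phi(-1 / sigma)."""
    return 0.5 * (1 + math.erf((-1 / sigma) / math.sqrt(2)))

def empirical_error(data):
    """Error rate of the optimal rule (predict +1 if x > 0, else -1) on simulated data."""
    wrong = sum(1 for x, y in data if (1 if x > 0 else -1) != y)
    return wrong / len(data)

sigma = 1.0  # assumed noise level; larger values mean more class overlap
data = simulate(100_000, sigma)
print(empirical_error(data), bayes_error(sigma))
```

Raising `sigma` increases the overlap and therefore the error floor, while collecting more samples only tightens the estimate of that floor without lowering it.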
