Unbiased estimators aren’t always good

When people take their first class in statistics, they usually learn to calculate the sample variance as

s^2 = Σ i=1:n (xi – m)^2 / (n – 1)

where m is the sample mean. (I'm writing m rather than x-bar because the Unicode character for x-bar doesn't display well in most browsers.)

This often seems counter-intuitive to the students, and the teachers usually explain that this expression is used instead of the more intuitive

s^2 = Σ i=1:n (xi – m)^2 / n

because the version in which you divide by n – 1 is an unbiased estimator, i.e. its expected value equals the variance of the distribution being sampled.
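A quick simulation makes the bias concrete. The sketch below (illustrative, not from the post; all variable names are mine) draws many samples of size n from a distribution with variance 1 and averages the two estimators:

```python
# Illustrative check: dividing by n - 1 gives an unbiased variance
# estimate, while dividing by n underestimates by a factor (n - 1)/n.
import random

random.seed(42)
n, trials = 5, 100_000
sum_unbiased = sum_biased = 0.0

for _ in range(trials):
    xs = [random.gauss(0, 1) for _ in range(n)]  # true variance = 1
    m = sum(xs) / n
    ss = sum((x - m) ** 2 for x in xs)
    sum_unbiased += ss / (n - 1)
    sum_biased += ss / n

print(sum_unbiased / trials)  # close to 1.0
print(sum_biased / trials)    # close to (n - 1)/n = 0.8
```

With small n the difference between the two denominators is substantial; it fades as n grows.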

This leaves people with the impression that unbiased estimators are better in some way than biased estimators are, but that's not always the case. Here's the standard example of an unbiased estimator that's not as good as a biased estimator.

Suppose that we have a Poisson distribution with mean λ, so that

P(X = n) = λ^n e^(–λ) / n!

and that we want to estimate the statistic W = e^(–3λ) based on a single sample X from the distribution. It's easy to see that (–2)^X is an unbiased estimator for W because

E[(–2)^X] = Σ x=0:∞ (–2)^x λ^x e^(–λ) / x!

= e^(–λ) Σ x=0:∞ (–2λ)^x / x!

= e^(–λ) e^(–2λ)

= e^(–3λ)

But this statistic isn't a very good estimator for W in some ways. In particular, it's negative whenever x is odd, while the quantity e^(–3λ) that it's estimating is always positive.

An estimator that makes more sense is max{(–2)^X, 0}, which happens to be a biased estimator. For every possible observation it's at least as close to the true value as the unbiased estimator, and for odd observations it's strictly closer. So just because an estimator is unbiased doesn't necessarily mean that it's good; it just means that its expected value is whatever it's trying to estimate.
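The dominance argument can be checked by simulation. The sketch below (illustrative; the simple inverse-CDF Poisson sampler is mine, since the standard library has no Poisson generator) compares the mean squared error of the two estimators:

```python
# Compare mean squared error of the unbiased estimator (-2)^X
# against the biased, truncated estimator max((-2)^X, 0).
import math
import random

random.seed(1)
lam = 1.0  # illustrative value
true_w = math.exp(-3 * lam)
trials = 100_000

def poisson(lam):
    # Simple inverse-CDF Poisson sampler (for illustration only).
    u, x = random.random(), 0
    p = c = math.exp(-lam)
    while u > c:
        x += 1
        p *= lam / x
        c += p
    return x

se_unbiased = se_truncated = 0.0
for _ in range(trials):
    est = (-2) ** poisson(lam)
    se_unbiased += (est - true_w) ** 2
    se_truncated += (max(est, 0) - true_w) ** 2

print(se_unbiased / trials >= se_truncated / trials)  # True
```

The comparison holds sample by sample, not just on average: whenever the unbiased estimate is negative, replacing it with 0 moves it closer to the positive true value, and otherwise the two estimates are identical.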
