Credible Intervals vs. Confidence Intervals: Why Would We Ever Use the Latter?

Say we want to know the value of some parameter X \in \mathbb{R}. The definition of a credible interval is easy to parse. After you observe evidence E, the interval [a, b] is your 90% credible interval just in case:

P(X \in [a, b]|E) = 0.9

This just says that, given E, the probability that the value of X is in [a, b] is 0.9. Simple enough.

The definition of a confidence interval in this setting is more complicated and a bit harder to parse. We start by defining a function f from values of E to intervals in \mathbb{R}. The output of this function, f(E), turns out to be our confidence interval, but only if we construct f according to a certain constraint. Note that E is a random variable, meaning it could take any of several values before we observe it, so f(E) is also a random variable, one that takes intervals in \mathbb{R} as values, and we do not know its value before we observe E. Here is the constraint that f must satisfy in order for its output to be a 90% confidence interval:

\forall x \in \mathbb{R}, P(X \in f(E)|X = x) \geq 0.9

If you’re used to thinking in terms of probability theory, this formula seems pretty weird. Let’s try parsing it in English. It says something like: for any value x that X might take, we need it to be the case that at least 90% of the time, that value is in the interval f(E). The value of E must depend on the value of X, since E is supposedly evidence about the value of X. So to construct a good f, we have to make sure that, conditional on X = x, at least 90% of the time E takes a value which f maps to an interval that includes x.

It might be best to walk through a simpler example. Suppose that X \in \{ 1, 2, 3, 4 \}, so that it can only take one of those four values, and that E \in \{ 0, 1, 2, 3, 4 \}. Suppose further that, given the value of X, 80% of the time E takes the same value, and 10% of the time E takes a value one less than X. The other 10% of the time E takes one of the remaining three values, chosen uniformly at random. So for example:

P(E = 1 | X = 1) = 0.8
P(E = 0 | X = 1) = 0.1
P(E = 2 | X = 1) \approx 0.0333
P(E = 3 | X = 1) \approx 0.0333
P(E = 4 | X = 1) \approx 0.0333

And similarly for other values of X. Now instead of having a confidence interval in this setting, we are going to use a confidence set, but the definition is exactly the same. We have to construct a function f that takes values of E and returns sets of possible values of X. Furthermore, in order for the output of f to be a 90% confidence set, we need f(E) to contain the true value of X at least 90% of the time, conditional on each possible value of X. This is easy enough to do in this example. We define f(E) = \{ E, E+1 \}.

Suppose that the value of X is 2, then 80% of the time, the value of E is also 2, and so f(E) = \{ 2, 3 \}, which contains 2. Still supposing that X = 2, 10% of the time, the value of E is 1, and so f(E) = \{1, 2 \}, which contains 2. This means that if X = 2, then 90% of the time, X is in the set f(E). This same argument could be used for all values of x, and so we have that:

\forall x, P(X \in f(E)|X = x) \geq 0.9

Which makes f(E) a confidence interval, or rather, a confidence set in this case, but the reasoning is exactly the same for intervals.
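The coverage argument can also be checked by brute force. The following is a minimal sketch in Python, assuming the likelihood described above (80% on E = X, 10% on E = X - 1, and the last 10% split evenly over the other three values); the helper names likelihood and coverage are made up for illustration.

```python
from fractions import Fraction

X_VALUES = [1, 2, 3, 4]
E_VALUES = [0, 1, 2, 3, 4]

def likelihood(e, x):
    """P(E = e | X = x) in the toy example."""
    if e == x:
        return Fraction(8, 10)
    if e == x - 1:
        return Fraction(1, 10)
    return Fraction(1, 30)  # remaining 10% split over the other three values

def f(e):
    """The confidence set {E, E + 1}."""
    return {e, e + 1}

def coverage(x):
    """P(X in f(E) | X = x): total probability of the e values whose set contains x."""
    return sum(likelihood(e, x) for e in E_VALUES if x in f(e))

for x in X_VALUES:
    print(x, coverage(x))  # each line shows 9/10
```

Exact fractions are used instead of floats so that the comparison with 0.9 is not blurred by rounding.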

So you may ask at this point why anyone would ever care about confidence intervals. The reason I wanted to write this post is that I think I have a good explanation, which if it isn’t novel, is certainly rarely given.

The standard criticism of confidence intervals is that the probability you assign to the value of X being in f(E) is not necessarily 90% once you condition on the value of E! Suppose that we had a flat prior over the values of X in the previous setting, and that we observe that the value of E is 3. This gives us a 90% confidence set of \{ 3, 4 \}. In this case P(X \in \{ 3, 4 \} | E = 3) is about 0.93, not 0.9. There are also cases where your posterior probability of the confidence set is lower than the confidence level.
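Here is a small sketch of that posterior calculation in Python, assuming the toy likelihood from the example and a flat prior over \{ 1, 2, 3, 4 \}. The function names are hypothetical. It also tries E = 0, which under the same assumptions is one case where the posterior falls below 0.9.

```python
from fractions import Fraction

X_VALUES = [1, 2, 3, 4]

def likelihood(e, x):
    """P(E = e | X = x) in the toy example."""
    if e == x:
        return Fraction(8, 10)
    if e == x - 1:
        return Fraction(1, 10)
    return Fraction(1, 30)

def f(e):
    """The confidence set {E, E + 1}."""
    return {e, e + 1}

def posterior_mass_of_confidence_set(e, prior):
    """P(X in f(e) | E = e) by Bayes' rule under the given prior."""
    joint = {x: likelihood(e, x) * prior[x] for x in X_VALUES}
    total = sum(joint.values())
    return sum(p for x, p in joint.items() if x in f(e)) / total

flat = {x: Fraction(1, 4) for x in X_VALUES}

print(posterior_mass_of_confidence_set(3, flat))  # 27/29, about 0.93
print(posterior_mass_of_confidence_set(0, flat))  # 1/2
```

When E = 0 the confidence set is \{ 0, 1 \}, and only X = 1 can actually lie in it, which is why the posterior mass drops so far.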

Here is the thing though: in order for us to agree about the posterior probability of X being in \{ 3, 4 \}, we had to agree about the prior over X. We would not have to agree about our prior over X to agree about the 90% confidence interval. In fact, we could assign any prior over X and we would still end up with the same confidence interval (this might give you a hint as to how to construct cases where the confidence level and the posterior probability diverge significantly).
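To make the hint concrete, here is a sketch in Python that keeps the observation E = 3 fixed and changes only the prior. The skewed prior (97% on X = 1) is an assumption chosen purely for illustration, as are the function names.

```python
from fractions import Fraction

X_VALUES = [1, 2, 3, 4]

def likelihood(e, x):
    """P(E = e | X = x) in the toy example."""
    if e == x:
        return Fraction(8, 10)
    if e == x - 1:
        return Fraction(1, 10)
    return Fraction(1, 30)

def f(e):
    """The confidence set {E, E + 1}; note that no prior appears here."""
    return {e, e + 1}

def posterior_mass_of_confidence_set(e, prior):
    """P(X in f(e) | E = e) by Bayes' rule under the given prior."""
    joint = {x: likelihood(e, x) * prior[x] for x in X_VALUES}
    total = sum(joint.values())
    return sum(p for x, p in joint.items() if x in f(e)) / total

flat = {x: Fraction(1, 4) for x in X_VALUES}
# Hypothetical prior of someone who is nearly certain that X = 1.
skewed = {1: Fraction(97, 100), 2: Fraction(1, 100),
          3: Fraction(1, 100), 4: Fraction(1, 100)}

print(f(3))                                         # {3, 4} for both priors
print(posterior_mass_of_confidence_set(3, flat))    # 27/29
print(posterior_mass_of_confidence_set(3, skewed))  # 27/125
```

Both people report the same 90% confidence set, but the second assigns it a posterior probability of only about 0.22.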

Consider what happens if we take the open part of the constraint we defined on f: P(X \in f(E)|X = x) and take the expectation of that expression wrt X. This is the following average:

P(X \in f(E) | X = x_1) P(X = x_1) + P(X \in f(E) | X = x_2) P(X = x_2) + \dots + P(X \in f(E) | X = x_n) P(X = x_n)

We know from our constraint on f that each conditional probability being averaged here, e.g. P(X \in f(E) | X = x_i), is at least 0.9, and since the weights P(X = x_i) sum to 1, the whole average is also at least 0.9. Also, reducing this average using the law of total probability, we end up with the following expression: P(X \in f(E)). In other words, that whole average gives us our prior probability that the value of X is in f(E), whatever f(E) turns out to be. This gives us the following theorem:

P(X \in f(E)) \geq 0.9

Our prior probability that the true value is in f(E) must be greater than or equal to 0.9, and this holds no matter what our prior is! The prior has dropped out of the bound by the law of total probability.
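As a sanity check on the theorem, the sketch below draws a few arbitrary priors for the toy example and computes P(X \in f(E)) for each via the law of total probability. The random priors, the seed, and the helper names are all assumptions made for illustration.

```python
import random
from fractions import Fraction

X_VALUES = [1, 2, 3, 4]
E_VALUES = [0, 1, 2, 3, 4]

def likelihood(e, x):
    """P(E = e | X = x) in the toy example."""
    if e == x:
        return Fraction(8, 10)
    if e == x - 1:
        return Fraction(1, 10)
    return Fraction(1, 30)

def f(e):
    """The confidence set {E, E + 1}."""
    return {e, e + 1}

def coverage(x):
    """P(X in f(E) | X = x)."""
    return sum(likelihood(e, x) for e in E_VALUES if x in f(e))

def random_prior(rng):
    """An arbitrary prior over X built from random positive integer weights."""
    weights = [rng.randint(1, 100) for _ in X_VALUES]
    total = sum(weights)
    return {x: Fraction(w, total) for x, w in zip(X_VALUES, weights)}

def prior_coverage(prior):
    """P(X in f(E)) = sum over x of P(X in f(E) | X = x) P(X = x)."""
    return sum(coverage(x) * prior[x] for x in X_VALUES)

rng = random.Random(0)  # fixed seed so the run is repeatable
results = [prior_coverage(random_prior(rng)) for _ in range(5)]
print(results)
```

In this toy example every conditional coverage is exactly 0.9, so every prior gives exactly 9/10. In general the theorem only promises at least 0.9.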

Now this does not mean that we agree very much; at best it means that we would have agreed, before we saw the value of E, about the probability of X being in f(E), if we somehow knew what f(E) was without gaining any information about E. Still, this is an interesting way to think about confidence intervals, and it helps us spot cases where using confidence intervals instead of credible intervals makes sense in practice. For one, it makes sense to use confidence intervals if you think that your prior has been chosen adversarially, or if you are sufficiently skeptical of your prior, for example, if you think your prior might favor the conclusion you wanted to be true, even after you try to account for that bias. It also makes sense to use confidence intervals in cases where we do not trust each other’s priors, or where our priors are not relevant to some social epistemic practice, such as science.

An interesting question I would like to work out some day is when in principle it makes sense to use confidence intervals instead of credible intervals. What are the exact formal conditions where betting according to confidence intervals gives you a better expected payout than betting according to your credible intervals? It’s not enough for your prior to have been chosen adversarially, or to be biased in favor of conclusions for no good reason. Your prior has to be messed up enough, perhaps messed up enough that you’re better off ignoring it, since confidence intervals do seem to ignore it overall. There should be some formal condition that tells us exactly what it means for your prior to be “messed up enough”, but until someone finds it, I suggest using confidence intervals whenever you or those you wish to convince might reasonably be skeptical of your prior.
