Statistics in the Pfizer Data – how good do they show the vaccine to be?

Both the UK and the FDA have released enough information that one can make a good bet on how well the Pfizer vaccine works. (See https://www.fda.gov/media/144245/download for example.) It makes for fascinating and informative reading. I am not competent to comment on the medical aspects described there, other than to say that when I first started reading about possible vaccines many months ago, I never found any virologist who predicted we would have a vaccine that was more than 70% effective. To have a vaccine that is likely about 95% effective for people 18-64 is nothing short of a medical miracle: we really lucked out.

However, when you look at the key statistical table (“Table 8: Subgroup Analyses of Second Primary Endpoint: First COVID-19 Occurrence From 7 Days After Dose 2, by Subgroup”), things get murkier. More precisely, what one sees is exactly what I thought would happen: the signal becomes really bad for people over 65 and completely useless for people over 75. Here’s an excerpt of that table, and then I will try to explain what is going on:

What I need to explain is how to think about what that “95% CI” in the last column means and why it is so important. “CI” stands for confidence interval, and it is the key tool when a statistician looks at data and tries to tease out signal from noise. The ideas behind a confidence interval are simple, although defining it precisely and then calculating it is a bit tricky.

In a nutshell, when we pull a single number from a bunch of measurements – whether it is the average weight of what’s in a bunch of boxes of cereal or how effective a vaccine is – we know that number isn’t going to be perfect. So what we want to do, and really should do, is not focus on that single number but give a range around it, and then ask that the odds be, say, 19/20 that the range captures the true value – that is, ask what happens if we do the experiment repeatedly. When we do this, the range we get is what statisticians call a “95% confidence interval”¹. The more data you have, the tighter you can make your confidence interval!
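If you want to see what this “19 out of 20” business looks like in practice, here is a quick simulation. To be clear, this is my own sketch with made-up numbers and a simple textbook interval, not anything from the Pfizer analysis: we pick a true value, rerun the experiment many times, compute an interval each time, and count how often the interval captures the truth.

import numpy as np

rng = np.random.default_rng(0)
true_p = 0.7      # made-up "true" proportion we are trying to estimate
n = 200           # sample size in each repeated experiment
trials = 10_000   # how many times we rerun the experiment

covered = 0
for _ in range(trials):
    successes = rng.binomial(n, true_p)
    p_hat = successes / n
    # textbook 95% normal-approximation interval for a proportion
    half_width = 1.96 * np.sqrt(p_hat * (1 - p_hat) / n)
    if p_hat - half_width <= true_p <= p_hat + half_width:
        covered += 1

print(f"Fraction of intervals containing the true value: {covered / trials:.1%}")
# prints roughly 95%, i.e. about 19 out of 20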

So now let’s look at some individual lines from the table above and tease out just what the signal is. The major line is for people 18-64, and there were enough cases to say that the 94.6% efficacy average for the vaccine has a 95% confidence interval that runs from 89.1 to 97.7. So what the biostatisticians who analyzed the data are telling us is that, roughly speaking, if we bet that this vaccine is between 89.1% and 97.7% effective for this group, that is an awfully good way to bet: we will win 95% of the time. These are astonishingly good numbers, and we all have a lot to be thankful for. (Although having data by age decile would have been better, I suspect they don’t have enough data to do that even in this bigger group.)

But then we have the next two lines, and they unfortunately confirm what I wrote about here (https://garycornell.com/2020/10/22/we-are-unlikely-to-have-a-vaccine-that-is-proven-effective-for-seniors-for-a-long-time-unless-dramatic-action-is-taken-now/). For people 65 to 74, while the average number (92.9%) looks great, the confidence interval is not. Roughly speaking, all it says is that a bet that the efficacy is between 53.2% and 99.8% is a good bet. Or, as I would put it, you really don’t have a great way to bet. A confidence interval like this says they didn’t have enough cases in this group to say much at all, and so the range is too large to be really useful.
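To make the “not enough cases” point concrete, here is a rough reconstruction of how a case split turns into an efficacy interval. The assumptions are mine, not Pfizer’s or the FDA’s: equal person-time in the two arms, so efficacy is one minus the ratio of vaccine cases to placebo cases, with an exact (Clopper-Pearson) interval on the share of cases that landed in the vaccine arm. The case splits below are chosen to reproduce the reported point estimates, so treat them as illustrative; the published intervals also adjust for surveillance time, which is why this lands close to, but not exactly on, the table’s numbers.

from scipy.stats import beta

def ve_interval(vaccine_cases, placebo_cases, level=0.95):
    """Vaccine efficacy and CI from the split of cases between the two arms."""
    alpha = 1 - level
    total = vaccine_cases + placebo_cases
    # Clopper-Pearson bounds on theta = P(a case came from the vaccine arm)
    theta_lo = (beta.ppf(alpha / 2, vaccine_cases, placebo_cases + 1)
                if vaccine_cases > 0 else 0.0)
    theta_hi = beta.ppf(1 - alpha / 2, vaccine_cases + 1, placebo_cases)
    to_ve = lambda t: (1 - 2 * t) / (1 - t)  # theta -> efficacy (1:1 randomization)
    # a low share of vaccine cases means high efficacy, so the bounds swap
    return to_ve(vaccine_cases / total), to_ve(theta_hi), to_ve(theta_lo)

for label, v, p in [("18-64", 8, 149), ("65-74", 1, 14)]:
    point, lo, hi = ve_interval(v, p)
    print(f"{label}: efficacy {point:.1%}, 95% CI ({lo:.1%}, {hi:.1%})")
# 18-64 comes out tight (roughly 89% to 98%); 65-74, with only 15 cases
# in total, comes out wide (roughly 53% to 99.8%)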

And when we get to people over 75, what they describe isn’t a confidence interval, it’s a joke. A confidence interval of -12.1 to 100 is a lot like saying they threw a bunch of darts at a dartboard at random and did everything from hit bystanders (i.e., the vaccine made things worse) to achieve perfect protection. Who would make any bets on what is going on in this situation? They simply didn’t have enough cases to say anything meaningful, and so what they report is just totally useless.
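The same sketch shows why the over-75 row is so hopeless. With (as the first commenter below reads the table) zero cases in the vaccine arm and five in the placebo arm, the point estimate is 100%, but five cases pin down almost nothing:

from scipy.stats import beta

placebo_cases = 5
# with zero vaccine cases, only the upper bound on theta is informative
theta_hi = beta.ppf(0.975, 1, placebo_cases)
ve_lo = (1 - 2 * theta_hi) / (1 - theta_hi)
print(f"75+: efficacy 100%, 95% CI ({ve_lo:.1%}, 100%)")
# prints an interval from roughly -9% to 100%; the reported -12.1 differs a
# bit because the actual analysis adjusts for surveillance time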

But I don’t want to end on a depressing note!  My friends who think about these questions feel pretty strongly that while the vaccine will likely be less effective in people over 65 than it is in younger people, the dropoff won’t be great enough to make a big difference. For example, if it is 20-25% less effective in these age groups (which they think is the worst case scenario), you still get a vaccine that is roughly between 70% and 75% effective – which is still pretty darn good.  

Still I wish they had enrolled enough people >65 to have a better signal!

  1. Added this footnote and clarified the text because, as an astute commenter pointed out, I was being a bit sloppy in the original version! A 95% confidence interval actually “relates to the reliability of the estimation procedure, not to a specific calculated interval” or, more precisely, it means that if the same population is sampled repeatedly, i.e., the experiment is repeated, the calculated region would contain the actual value in 95% of the cases. It does not mean that there is a 95% chance that the real efficacy of the vaccine is in the reported confidence interval. This is because we have no way of knowing whether the results Pfizer reported are one of the 95% good ones or one of the 5% bad ones. But since it will be true “on average”, it is the right way for us to bet! A good place to start for a further understanding of these issues is the Wikipedia article on “Confidence Intervals” and the references given there. That having been said, I have found that when teaching this material or explaining it to a layperson, saying that, roughly speaking, the odds are 19/20 and so this is how you should bet is a better approach. After all, “on average” this is true!

18 thoughts on “Statistics in the Pfizer Data – how good do they show the vaccine to be?”

  1. Looks like 9 cases among people who got the vaccine and 169 among those who got the placebo? No cases over 75 in the vaccine group, but 5 in the placebo group? If true, it seems somewhat risky to start vaccinating the elderly.

  2. Hello,

    Thanks for this post and calling attention to the fact that these are noisy subgroup estimates. However, I think things are a little more auspicious than you let on.

    Frequentist CIs like those in this analysis do not actually tell you that there is a “19/20 chance that the efficacy is between 53.2 to 99.8”. It tells us that 95% of hypothetical “runs” of that particular confidence interval will contain the true value. For instance, see coverage of this on Andrew Gelman’s blog: https://statmodeling.stat.columbia.edu/2019/04/21/no-its-not-correct-to-say-that-you-can-be-95-sure-that-the-true-value-will-be-in-the-confidence-interval/

    I think that we (including you) don’t really care about the binary outcome “does this confidence interval contain the true value or not”. We care instead about quantifying our uncertainty about the likelihood that the vaccine is within given ranges of effectiveness for the population of interest. A Bayesian approach would lead us to directly constructing that probability distribution using a prior belief distribution and updating it with the data. The confidence interval above is akin to a Bayesian analysis with a totally flat prior across all possible outcome values (negative infinity to 100). It seems rather absurd to me to think that all those values are equally possible for any subgroup of people. Instead, we probably went into this analysis with some modest prior beliefs:

    1. It’s very highly likely the vaccine will have efficacy greater than zero
    2. It’s quite unlikely that the vaccine will be 100%, or very nearly 100%, effective.
    3. It’s unlikely to be radically different for the elderly than for other people. Perhaps less effective, but likely not radically so.

    Given those basic prior beliefs, the results in the table are very auspicious. The beliefs above could be quantified, and we could get a good belief distribution and calculate the 95% credible interval, i.e., the interval that has a 95% chance of holding the true value. I don’t have the time to do that properly, but my guess is it would be a whole lot tighter than (-12.1, 100); a rough sketch of the idea follows. Hope this was a helpful perspective.
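    To make that concrete, here is a minimal sketch of the kind of update I mean, in Python. Everything in it is an assumption for illustration only: a mildly informative Beta prior on theta, the share of cases landing in the vaccine arm (assuming 1:1 randomization), and the 0-versus-5 case split for the over-75 group that the first commenter read off the table. Beta-binomial conjugacy makes the posterior just Beta(a + vaccine cases, b + placebo cases).

    from scipy.stats import beta

    # prior on theta = P(a case came from the vaccine arm);
    # Beta(3, 7) is centered near theta = 0.3, i.e. efficacy around 57%
    a, b = 3, 7
    vaccine_cases, placebo_cases = 0, 5
    posterior = beta(a + vaccine_cases, b + placebo_cases)

    to_ve = lambda t: (1 - 2 * t) / (1 - t)  # theta -> efficacy
    # a high share of vaccine cases means low efficacy, so the bounds swap
    lo, hi = to_ve(posterior.ppf(0.975)), to_ve(posterior.ppf(0.025))
    print(f"95% credible interval for efficacy: ({lo:.1%}, {hi:.1%})")

    With this particular prior the interval comes out at roughly (25%, 95%) rather than (-12.1, 100), though of course the exact answer depends on the prior you choose.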

    1. Your comment is obviously well taken and absolutely correct. I added a footnote and updated the post to clarify. However, in my defense, I found that when trying to explain this stuff to laypeople, or when I was teaching it to freshmen, being a little bit sloppy and saying “this isn’t quite right but it is basically the way it works” was often a better approach! After all, thinking of it as “the odds” is true on average.

    2. I don’t really understand the linked post very clearly. Would you mind helping out? What does it mean to say that once we see a specific interval, we can no longer attach a probability to it?

    3. The study notes that they used a Bayesian beta-binomial model for the confidence intervals. Who knows how they parameterized the prior distribution though…

      1. Oh wait, that was for the credible interval (which is not shown by subgroup in this table). The confidence interval shown in the table is, as you say (and by definition), non-Bayesian.

  3. I did some reanalysis of their confidence limits by age cohort here: “Are the Pfizer vaccine’s efficacy confidence intervals sensible?”

    The conclusions are parallel to yours: good evidence in the overall and 18-64 cohorts, marginal evidence in the 65-74 cohort, no real evidence in the 75+ cohort, and the 16-17 cohort is a joke that probably should have been removed. It does explain, however, why some VRBPAC members voted against it, because of the lack of evidence in 16-17 year olds when the approval question included them.
