Herd Immunity 1

Herd immunity is all over the news, but in many cases the news media are not describing the concept clearly. While the concept of herd immunity technically derives from mathematical models of disease transmission, when you drill down, it just generalizes common-sense thinking about how diseases spread.

Let’s start by imagining only one person still has a disease and everyone else is immune. Then, after you cure that person, assuming there is no animal reservoir (i.e. animals that can harbor the disease), the disease is over and done with. That’s what happened with smallpox on that glorious day, Oct. 26, 1977, when the last natural case of smallpox was diagnosed in Merka, Somalia. As a mathematician would say: “this is the boundary case for herd immunity: there’s simply no one left to transmit the disease to.”

More generally, I want you to imagine the following situation: you have a bunch of people who are immune and a bunch of people who are infected. Now also imagine there is a large group of people who are neither infected nor immune; they are the “susceptibles”. OK, an infected person comes into contact with a group of people. Obviously, he or she encounters a certain number of immune people and a certain number of people who are susceptible. Suppose that person infects only one person and subsequently becomes cured. This gives us a steady state (potentially for a very long time). Every infected person passes the disease on to one person and gets cured themselves. The disease stays at whatever state it is in – that’s why they call it a “steady state”. Yes, it still infects the susceptible population, but slowly, one person at a time, until there are no more susceptible people. Then and only then does it die out.

But diseases usually don’t work like that. More likely our one infected person infects more than one person: say two people, and then, of course, those two people each infect two more people for a total of four. In the next round those four infect eight, and so on. Then you have exponential growth and disaster ensues quickly. The disease will grow uncontrollably and then you’re down the rabbit hole. I talked about this in the parable of the Lily Pond (https://garycornell.com/2020/04/20/the-parable-of-the-lily-pond-or-why-you-should-shelter-in-place/).

Since mathematicians and statisticians have no trouble thinking about “fractional people”, I want you to join us and imagine that one person can only infect half a person. In this fantasy world of fractional people, that half a person will only infect a quarter of a person, the quarter person only an eighth of a person and so on. In this case the disease dies out.
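To make the three cases so far concrete (one person infects one, two, or half a person), here is a minimal Python sketch. The function name and the five-generation horizon are my own choices for illustration, and it assumes every infected person infects the same fixed average number of people, which real diseases don't do.

```python
def new_infections_by_generation(per_person, generations, initial=1.0):
    """Return the number of new infections in each generation.

    per_person: average number of people each infected person infects
                (held constant here purely for illustration).
    """
    counts = [initial]
    for _ in range(generations):
        # Each person infected in the last generation infects `per_person` more.
        counts.append(counts[-1] * per_person)
    return counts


steady = new_infections_by_generation(1, 5)     # one infects one: steady state
growing = new_infections_by_generation(2, 5)    # one infects two: doubling
dying = new_infections_by_generation(0.5, 5)    # one infects "half a person"

for label, counts in (("steady", steady), ("growing", growing), ("dying", dying)):
    print(label, counts)
```

The only thing that changes between the three runs is whether the per-person number is equal to, above, or below one.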

In a nutshell, this is exactly what herd immunity is about: it’s about throwing infected people into a pool of both immune people and susceptible people, having them interact, and then thinking about when the disease can no longer spread. More precisely, what the mathematical models try to do is predict when the number of immune people is so large that an infected person, on average, finds less than one susceptible person to infect before they can no longer spread the disease – even if that average is as high as “.99” of a person. Once the number of people an infected person infects on average is less than one person, “herd immunity” happens and the disease eventually dies out.

OK now you have seen what it’s all about, let’s try a more realistic situation especially for Covid-19. Imagine a situation where a person will infect at most three people on average. Whether they do or not will obviously depend on the number of susceptible people they encounter.

Let’s walk through a couple of scenarios:

Scenario 1: One infected person encounters two susceptible people, both get the disease, and each of these two infect two out of three more people (i.e. they infect four people in the second level). All hell breaks loose if this scenario continues for too long.

Scenario 2: One infected person always encounters one person who is susceptible and two who are immune. They infect that one person, and that person infects one person and so on. We are in a steady state for (potentially) a long time.

Scenario 3:  The infected person doesn’t encounter any susceptible people, he/she gets cured and the disease transmission stops.

Looking over these scenarios, it is clear that scenario 2 is the “tipping point”. In fact, if you allow me my fractional people, then if more than two out of three people are immune – even “2.01/3” of the people – the infected person will infect less than 1 person and the disease will (eventually) die out. Similarly, if fewer than 2 out of 3 people were immune, say “1.99/3” of the people (so 1.01 out of 3 are susceptible), the infected person would infect more than 1 person and the disease would grow.
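The tipping-point arithmetic can be checked with a few lines of Python. This is only a sketch of the basic model used in this post: it assumes an infected person would infect 3 people if nobody were immune, and that the share of their contacts who are immune simply scales that number down. The function name is hypothetical.

```python
def average_infections(r0, immune_fraction):
    """Average number of people one infected person infects when
    `immune_fraction` of the people they encounter are immune.
    Assumes the simple model in the text: contacts are a random
    sample of the population and only susceptibles get infected."""
    return r0 * (1 - immune_fraction)


for immune_out_of_3 in (1.99, 2.0, 2.01):
    infections = average_infections(3, immune_out_of_3 / 3)
    print(f"{immune_out_of_3}/3 immune: {infections:.2f} infected on average")
```

Just below two-thirds immune the average is above one and the disease grows; just above two-thirds it drops below one and the disease dies out.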

Well, we’ve just proved a special case of the most important basic theorem in disease transmission, it’s called the “Threshold Theorem” i.e. the threshold for herd immunity. While the model that led to this theorem is not necessarily the right one in all situations, nonetheless, we’ve essentially proved that if you have a disease where people can infect 3 people on average (R0 is 3 as epidemiologists would say), then herd immunity occurs when more than 2 out of 3 (>⅔, 66.67%) of the people are immune.

Next, imagine a scenario where each person can infect 4 people (R0=4). Can you see why the tipping point occurs if 3 out of every 4 people (75% = ¾) are immune?  With everyone infecting 5 people (R0 =5) the tipping point occurs when 4/5ths (80%) of the people are immune etc. 

And now I hope you can see how the general threshold theorem would be proved. Here is its statement:

The threshold theorem: In the basic model of herd immunity, herd immunity occurs when more than a fraction 1-(1/R0) of the people are immune (as a percentage, 100 × (1-(1/R0)) percent).
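As a quick check of the theorem against the numbers in this post, here is a small Python sketch. The function name is mine, and returning 0 when R0 is 1 or less is my own convention (in that case the disease doesn't grow even with nobody immune); neither comes from any standard library.

```python
def herd_immunity_threshold(r0):
    """Fraction of the population that must be immune for herd immunity
    in the basic model: 1 - 1/R0."""
    if r0 <= 1:
        # Assumption for this sketch: no immunity is needed if R0 <= 1.
        return 0.0
    return 1 - 1 / r0


for r0 in (2.5, 3, 4, 5):
    print(f"R0 = {r0}: more than {herd_immunity_threshold(r0):.1%} must be immune")
```

The larger R0 is, the closer the threshold creeps toward 100%.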

Many epidemiologists believe that the threshold theorem with R0 = 3 – i.e. our first situation – actually describes COVID-19 transmission pretty well. So in this case herd immunity to COVID requires roughly ⅔ (66.67%) of the population to be immune.

That’s it: you now know more about herd immunity than your average talking head and probably more than your average doctor who doesn’t have an MPH! But I obviously have to end by cautioning you that many epidemiologists want to use values of R0 other than 3 in the threshold theorem for Covid-19. For example, an R0 of 2.5 would give 60% (1 - 1/2.5 = 0.6), and an R0 of 4 would give 75%. And, more importantly, other epidemiologists want to use much more sophisticated models of disease transmission and throw out the basic threshold theorem completely when it comes to Covid-19. But as far as I know, no epidemiologist has a model that predicts anything like 20 or 25% as the level for herd immunity, and we have some facts on the ground that make, say, a radiologist who believes that an inhabitant of the crazy place in the science landscape.

We are unlikely to have a vaccine that is proven effective for seniors for a long time unless dramatic action is taken now!

(Cross posted with Medium)

The risks of both hospitalization and death from Covid 19 increase greatly with age. Approximately 80% of the deaths from Covid 19 are people over 65 (https://www.cdc.gov/coronavirus/2019-ncov/need-extra-precautions/older-adults.html). The unfortunate truth is that a vaccine that is proven highly effective for seniors is not likely for a very long time, unless we dramatically increase the number of seniors in current trials now.

Why? The gold standard to determine efficacy is a large, placebo-controlled, double-blind clinical trial. There are currently eight vaccines in large Phase 3 trials. But it is very unlikely, maybe even impossible, that any of these vaccine trials will give us definitive information about how effective these vaccines are for people over age 55, unless they change how they are currently set up.

Why? To begin with, none of the four trials that have released their protocols is properly stratified. While they aren’t lumping seniors into the same group as the 18 to 55-year-olds, they should be using three groups, i.e. one for each decade of age: 55 to 64, 65 to 74, and 75 and older. To be sure, it’s possible to tease out information about how different strata of seniors react to the vaccine even if they are lumped together in one group. However, it would likely take more data because you have to extract the information for each age group from a larger group.

But the bigger problem is that even if these trials give us some information about efficacy for seniors, they are unlikely to tell us everything we need to know quickly enough unless we have a far larger number of participants over 55 than the current trials are enrolling. Why? Since Covid-19 can be so deadly in older people, they and the communities they live in have generally been taking better precautions than the general public against becoming infected with the virus SARS-CoV-2 that causes Covid-19. For example, the CDC reports (https://www.cdc.gov/mmwr/volumes/69/wr/mm6939e1.htm#T1_down) that, as of August, roughly speaking, the prevalence among people >75 years old is 1/15 that of people 18–55 years old. And, even if you lump all people over 55 into one group, it is roughly 1/3. And, of course, we are all hoping that prevalence among seniors has gone down significantly since August. But prevalence is what determines the time needed to have a statistically significant efficacy signal from a vaccine trial.
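To see why a lower infection rate stretches out the time to an efficacy signal, here is a back-of-the-envelope Python sketch. It assumes a trial that waits for a fixed number of cases before it can be analyzed, and it assumes a constant monthly infection risk for every participant. Every number in it (the target of 50 cases, 10,000 participants, the 1% monthly risk) is hypothetical and chosen only to illustrate the 1/15 and 1/3 ratios; none of it comes from any trial protocol.

```python
def months_to_target_cases(target_cases, participants, monthly_infection_rate):
    """Rough expected number of months until `target_cases` infections
    occur among `participants` people who all share the same constant
    monthly infection risk. Ignores dropouts, changing rates, and the
    fact that a working vaccine reduces cases in the vaccinated arm."""
    expected_cases_per_month = participants * monthly_infection_rate
    return target_cases / expected_cases_per_month


# Hypothetical numbers, for illustration only.
younger_rate = 0.01                  # assumed monthly risk, ages 18-55
over_55_rate = younger_rate / 3      # the "roughly 1/3" ratio above
over_75_rate = younger_rate / 15     # the "1/15" ratio above

younger = months_to_target_cases(50, 10_000, younger_rate)
over_55 = months_to_target_cases(50, 10_000, over_55_rate)
over_75 = months_to_target_cases(50, 10_000, over_75_rate)
print(f"over 55: {over_55 / younger:.0f}x as long; over 75: {over_75 / younger:.0f}x as long")

# Enrolling more people in the slow group shortens the wait proportionally.
over_75_enlarged = months_to_target_cases(50, 150_000, over_75_rate)
```

In this simplified picture the only two levers are the infection rate, which nobody wants to raise, and the number of participants, which is the one the trial designers control.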

Here’s what information I have gotten from their published protocols for the percentages of seniors enrolled:

AstraZeneca: “Randomization will be stratified by age (≥ 18 and < 65 years, and ≥ 65 years), with at least 25% of participants to be enrolled in the older age stratum.” They are also using a 2 to 1 active to placebo division.

Johnson & Johnson: “The aim of having a minimum of approximately 25% of recruited participants ≥60 years of age has been adjusted to 30%”

Moderna: “At least 25% of enrolled participants, but not more than 40%, will be either ≥ 65 years of age or < 65 years of age and “at risk” at Screening”

Pfizer: “It is intended that a minimum of 40% of participants will be in the >55-year stratum”

I suppose we seniors should be thankful: originally it was much worse (https://www.nytimes.com/2020/06/19/health/vaccine-trials-elderly.html), and in at least one case the published protocols made this clear, as people over 55 years old were only added via a late amendment.

Anyway, regardless of what they were planning on doing originally, none of these four trials are enrolling a far larger percentage of people over 55 than they are enrolling under 55. This means the time needed for getting enough cases for people over 55, and especially among people over 75, will be longer than the time needed to get an efficacy signal from 18- to 55-year-olds. And you need to get that signal for seniors as quickly as you get a safety and efficacy signal for younger people. Why? Because once we have a vaccine that has been shown to be safe and effective for people under 55, I believe it is unethical to continue any placebo-controlled Phase 3 trial in the elderly for a disease so deadly to them — all elderly participants would have to be offered the vaccine that worked among younger people. Since designers of these trials are hardly stupid, it seems to me that they are either betting that they will have enough cases in seniors to have an efficacy signal quickly or that enough seniors will agree to stay enrolled in the placebo arm.

A better solution is obvious, don’t bet: dramatically increase the number of participants over 55 in the current trials quickly, no matter what the expense and difficulty is in doing so. The more people we have over 55 in a trial, the more likely it is we will have an efficacy signal before ethical considerations force us to stop the arm of the trial being done on seniors.

What happens if we don’t do this? Then the only thing we will know quickly is what the vaccine candidate does to immune system markers in seniors, such as the antibody levels it induces. And, since the immune system of someone who is 75 tends to work differently than that of someone who is 55, let alone 25, even having immune system markers in seniors that match those of a 25-year-old won’t mean enough to know anything definitive. But I want to make clear that, of course, seniors can and should take an approved vaccine based on the results in 18–55 year olds even without there being an efficacy signal for them.

Then, interestingly enough, I expect the same things to happen whether or not we have an efficacy signal for seniors: it’s just that the consequences, and the information we gain from them, are different. What I expect is that correctly stratified trials for people over 55 will quickly begin that compare the approved vaccine(s) to various tweaked formulations or dosing regimens. We did this in order to get a better flu vaccine for seniors, for example. But not only will these trials take time; unless we have that efficacy signal for seniors from the original trials, these trials can give us only relative information, not absolute information. For example, a trial might show that half the dose doesn’t work very well while four times the dose is not only safe but works twice as well. That sounds great, but we aren’t home free if we don’t know how well the original vaccine actually works on seniors of varying ages. Knowing that you have a vaccine tweak that works twice as well as the original vaccine actually doesn’t tell you anything about how well the improved vaccine will work without a baseline! For example, suppose the original vaccine was just 15% effective among people age 75 and older. Doubling the effectiveness with a tweaked formulation puts it at only 30%. Knowing that something is 2X doesn’t tell you anything without information about X. And information on X is what we won’t have unless we spend the money and effort needed to get it from greatly enlarged trials of the original vaccine in seniors now. This puts us in a completely different position than with the flu vaccine, where we knew quite well how the original flu vaccine worked in seniors, so the “tweaked” version’s efficacy was easy to compute.
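The "2X without X" problem is easy to see in a few lines of Python. This sketch follows the example above in reading "works twice as well" as doubling the efficacy; the function name and the candidate baseline values are hypothetical, since the whole point is that the real baseline for seniors would be unknown.

```python
def tweaked_efficacy(baseline_efficacy, relative_improvement):
    """Absolute efficacy of a tweaked vaccine, given the baseline
    efficacy X of the original vaccine and a measured multiplier
    (2 means 'works twice as well'). Capped at 100%."""
    return min(1.0, baseline_efficacy * relative_improvement)


# The follow-up trial tells us the multiplier (2) but not X.
for baseline in (0.15, 0.30, 0.45):   # hypothetical values of X
    print(f"X = {baseline:.0%}: 2X = {tweaked_efficacy(baseline, 2):.0%}")
```

The same "twice as well" result is compatible with a tweaked vaccine that is barely useful and one that is excellent, depending entirely on the baseline.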

To summarize: given what I feel are the inescapable ethical issues in completing any placebo-controlled study on seniors once you have a safe and effective vaccine based on trials in healthy 18–55 year olds, we must enlarge the number of seniors in the trials quickly. Failing to do so means that we likely won’t know enough about how well the original vaccine worked in one or more age groups of seniors. Your castle will be built on little if any foundation.

So, considering how deadly Covid 19 is among seniors, absent great therapeutics, would you, if you were over 55, really change the precautions you are taking such as not getting on a plane because you took a vaccine you have little absolute information about for your age group? I am and I wouldn’t! So, again, I am hoping (perhaps without hope) that we spend the money and take the effort to quickly expand the number of seniors in the current trials.

I want to end by explaining what will likely happen if we don’t change the current trials to include far more seniors. First off, you need to always keep in mind that absent that, we will likely be stuck in the twilight zone of having relative information but needing absolute information! Can biostatisticians do anything down the road to break the barrier between relative information and absolute information if we didn’t enroll enough seniors in the current trials? Of course. What they will do is what is called a paired retrospective study. This means they will look at seniors who chose to get the vaccine and compare them with a matched group of seniors that didn’t get the vaccine. Then, given enough cases and a good enough match between members of the two groups, we will finally have a way to get the absolute information we need. Only after that retrospective study is complete, would seniors know (roughly) how well the best of the vaccine tweaks works for them.

Still, a paired retrospective study would be both difficult and time consuming to perform. Why? The key to doing a paired retrospective study is to pair up the people so that there are no differences between them that can influence the incidence of the disease. And you need to know if seniors who declined to get an approved vaccine, or who didn’t have access to it, are different in some fundamental ways from those who did get the vaccine. I don’t know how to answer that, but I do know that the biostatisticians are going to have a difficult job designing a paired retrospective study.

So I personally would be shocked if we have any absolute information about the efficacy of a Covid-19 vaccine for seniors for a very long time to come unless we spend the money and effort to dramatically increase the number of seniors enrolled in the current trials now.

Numbers don’t lie: how badly has the US done?

As I’ve said before, the actuarial concept of “excess deaths”, together with analyzing the numbers of excess deaths, is the only way to figure out how many Americans have really died from Covid 19. The following very interesting paper begins the process: https://jamanetwork.com/journals/jama/fullarticle/2771841. The paper is mathematically sophisticated but the tables are useful for all. Alas, there are no surprises: we have done terribly when matched to comparable countries.

On September 19, 2020, the US reported a total of 198 589 COVID-19 deaths (60.3/100 000), higher than countries with low and moderate COVID-19 mortality but comparable with high-mortality countries (Table 1). For instance, Australia (low mortality) had 3.3 deaths per 100 000 and Canada (moderate mortality) had 24.6 per 100 000. Conversely, Italy had 59.1 COVID-19 deaths per 100 000; Belgium had 86.8 per 100 000. If the US death rates were comparable to Australia, the US would have had 187 661 fewer COVID-19 deaths (94% of reported deaths), and if comparable with Canada, 117 622 fewer deaths (59%).
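The "fewer deaths" figures quoted above can be roughly reproduced from the per-100 000 rates alone. Here is a minimal Python sketch of that arithmetic; it is my reconstruction, not the paper's method, and it assumes the US population can be backed out from the reported deaths and the reported US rate.

```python
US_DEATHS = 198_589   # reported US COVID-19 deaths as of September 19, 2020
US_RATE = 60.3        # reported US deaths per 100 000 people


def fewer_deaths_at_rate(other_rate):
    """How many fewer deaths the US would have had at another country's
    death rate (deaths per 100 000), using the population implied by
    US_DEATHS and US_RATE."""
    implied_population = US_DEATHS / US_RATE * 100_000
    deaths_at_other_rate = implied_population * other_rate / 100_000
    return US_DEATHS - deaths_at_other_rate


australia = fewer_deaths_at_rate(3.3)    # Australia: 3.3 per 100 000
canada = fewer_deaths_at_rate(24.6)      # Canada: 24.6 per 100 000

print(f"Australia: {australia:,.0f} fewer ({australia / US_DEATHS:.1%})")
print(f"Canada:    {canada:,.0f} fewer ({canada / US_DEATHS:.1%})")
```

Because the published rates are rounded to one decimal place, the results land close to, but not exactly on, the paper's 187 661 and 117 622.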

There’s a Catch-22 that means we are unlikely to have a vaccine that is proven highly effective for seniors for a long time

“That’s some catch, that Catch-22,” he observed. “It’s the best there is,” Doc Daneeka agreed. – a conversation in Joseph Heller’s novel “Catch-22”

The risks of both hospitalization and death from Covid 19 increase greatly with age. Approximately 80% of the deaths from Covid 19 are people over 65 (https://www.cdc.gov/coronavirus/2019-ncov/need-extra-precautions/older-adults.html). Reporting to date has not made clear the unfortunate truth that a hidden “Catch-22” means that no vaccine that is proven highly effective for seniors is likely for a very long time. What is the Catch-22? It’s that the faster we have information on safety and efficacy for healthy 18-55 year olds, then, as you will see below, the less likely it is that we will have that information for people over 55. But please note I am not saying that we won’t have a vaccine that seniors can use and that will very likely have benefit for them. What I am saying is that we won’t know quickly how much benefit seniors will get from any approved vaccine.

The gold standard to determine efficacy is a large, placebo-controlled, double-blind clinical trial. There are currently eight vaccines (nine if you count the Russian one) in large Phase 3 trials. But it is very unlikely, maybe even impossible, that any of these vaccine trials will give us definitive information about how effective these vaccines are for people over age 55. Why? To begin with, none of the three trials that have released their protocols is properly stratified. While they obviously aren’t lumping seniors into the same group as the 18- to 55-year-olds, they should be using three groups, i.e. one for each decade of age: 55 to 64, 65 to 74, and 75 and older. Enrolling participants in a properly stratified trial is more difficult and perhaps takes a bit longer than enrolling volunteers in a nonstratified trial. To be sure, it’s possible to tease out information about how different strata of seniors react to the vaccine even if they are lumped together in one group, but it takes more data about individual cases because you have to extract the information for each age group from a larger group. So even if these trials give us some information about efficacy for seniors, they are unlikely to tell us everything we need to know quickly. The only thing we will know for sure quickly is what the vaccine candidate does to immune system markers in seniors, such as the antibody levels it induces. And, since the immune system of someone who is 75 tends to work differently than that of someone who is 55, let alone 25, even having immune system markers in seniors that match those of a 25-year-old won’t mean enough.

So when will we have a completed, placebo-controlled, properly stratified Phase 3 trial on people over 55? As far as I can see, the answer is never. Why? Since Covid-19 can be so deadly in older people, they, their families, and their communities have generally been taking better precautions than the general public against becoming infected with the novel coronavirus. For example, the CDC reports (https://www.cdc.gov/mmwr/volumes/69/wr/mm6939e1.htm#T1_down) that, as of August, the prevalence among people >75 years old is roughly 1/15 that of people 18-55 years old. And, even if you lump all people over 55 into one group, it is roughly 1/3. And of course we are all hoping that prevalence has gone down since August. And prevalence affects the time needed to have a statistically significant efficacy signal out of the current vaccine trials: the time needed for people over 55, and especially for people over 75, will be far longer than the time needed to get an efficacy signal from 18- to 55-year-olds.

But that takes me to my point: once we have a vaccine that has been shown to be safe and effective for people under 55, I believe it is unethical to continue any placebo-controlled Phase 3 trial in the elderly — all participants should be offered the vaccine that worked among younger people. Thus the Catch-22: the faster we get a safe and effective vaccine based on trials on 18-55 year olds, the less likely it is we will have all the information we need on its efficacy for people older than 55.

Still, I do want to stress that a vaccine that is proven safe and effective on 18-55 year olds will have a benefit for people over 55. If you are in this group, you will definitely want to take it when it becomes available. What I am saying is you just won’t know how effective it is on your age group absent a completed placebo-controlled study.

What happens then? In an ideal world, correctly stratified trials for people over 55 will compare the approved vaccine(s) to various tweaked formulations or dosing regimens. But not only will these trials take time, they can give us only relative information, not absolute information. For example, a trial might show that half the dose doesn’t work very well while four times the dose is not only safe but works twice as well. This is what happened with the flu vaccine: the optimal dose for people over 65 turns out to be four times the antigen dose of the regular vaccine, and it is approximately 25% more effective than the original flu vaccine (https://www.nejm.org/doi/full/10.1056/NEJMoa1315727?query=featured_home), but that took time to determine.

That sounds great, but we aren’t home free. Since we don’t know how well the original vaccine actually works on seniors of varying ages, knowing that you have a vaccine tweak that works twice as well as the original vaccine actually doesn’t tell you anything about how well the vaccine is working! Suppose, say, the original vaccine was just 10% effective among people age 75 and older. Doubling the effectiveness with a tweaked formulation puts it at only 20%. Knowing that something is 2X doesn’t tell you anything without information about X, and that is what we won’t have. This puts us in a completely different position than with the flu vaccine, where we knew quite well how the original flu vaccine worked in seniors.

To summarize: given what seem to me to be the inescapable ethical issues in completing any placebo-controlled study on seniors, if you very quickly have a safe and effective vaccine based on trials in healthy 18-55 year olds, as people are hoping, you won’t know enough about how well the original vaccine worked in one or more age groups of seniors: you just don’t know that “X” for them. Your castle will be built on little if any foundation. So, considering how deadly Covid 19 is among seniors, absent great therapeutics, would you, if you were over 55, really change the precautions you are taking, such as not getting on a plane, because of a vaccine you have little absolute information about for your age group? I am and I wouldn’t!

We are stuck in the twilight zone of having relative information but needing absolute information! Can biostatisticians do anything to break the barrier between relative information and absolute information? Of course. What they will do is what is called a paired retrospective study. This means they will look at seniors who chose to get the vaccine and compare them with a matched group of seniors who didn’t get the vaccine. Then, given enough cases and a good enough match between members of the two groups, we will finally have a way to get the absolute information we need. Only after that retrospective study is complete will seniors know (roughly) how well the best of the vaccine tweaks works for them.

Still, a paired retrospective study would be both difficult and time consuming to perform. Why? The key to doing a paired retrospective study is to pair up the people so that there are no differences between them that can influence the incidence of the disease. And you need to know if seniors who declined to get an approved vaccine, or who didn’t have access to it, are different in some fundamental ways from those who did get the vaccine. I don’t know how to answer that, but I do know that the biostatisticians are going to have a difficult job designing a paired retrospective study.

So I personally would be shocked if we have any absolute information about the efficacy of a Covid-19 vaccine for seniors for a very long time to come.

A field guide to the remaining species of TV talking heads

I said a lot about MDs as not usually being a great source of information about the public health aspects of a pandemic in my last blog. Now I want to take up the remaining talking heads that you will see on TV.

Let’s start with politicians because they almost never have a clue. Why? Well, because they almost never have training in public health and they almost never are scientists. The Congressional Public Health Caucus has exactly one person with a master’s in public health: Rob Wittman (R-VA-1), who also has a Ph.D. in Public Health. No other members of Congress have an MPH as far as I can tell. There is one physicist Ph.D. and one applied math Ph.D. in the House, and as far as I can tell, there are no other scientists. There are lots of MDs of course, some of whom may or may not be, say, borderline acceptable (if not board certified) ophthalmologists, but otherwise, again as far as I can tell, they know nothing about public health issues. And alas, because of this they can and do say things that go from being harmless but nutty all the way up to being downright dangerous – all the while thinking they know more than the pros. When a politician with an MD thinks they know it all, they are a public hazard, not a public good. More generally, my advice is: don’t pay attention to any politicians when they talk about the health aspects of the pandemic.

Epidemiologists are probably the best people to listen to about life in general during a pandemic. While an infectious disease doctor studies disease in individuals, the “trees”, epidemiologists study the “forest”: how diseases work in whole populations. Usually, the ones on TV are Ph.D. level research scientists who understand quite a lot of statistics. A common path to becoming an epidemiologist is to start with a master’s of public health with a specialty in epidemiology and then get either a Ph.D. in epidemiology itself or a Ph.D. in a related public health field, e.g. environmental health or even biostatistics. They know a lot about the models for disease transmission, really understand herd immunity, and so on. Michael Osterholm, for example, is one of the best talking heads to listen to; he has an MPH in epidemiology, but his Ph.D. is in environmental health.

Virologists are the people to listen to about viruses. They are your go-to people on vaccines, on designing drugs that may work on Covid-19, on mutations in the virus that causes Covid-19, etc. They are either MDs who have chosen to do research in virology or Ph.D.s in a subject that gives them the tools to do virology. I’ve seen high powered virologists with degrees in biophysics, molecular biology, organic chemistry, and more (oh, as well as in virology itself). In most cases (Fauci being an obvious exception), they know no more than laypeople do about the public health issues of a pandemic, such as when herd immunity might happen or how to actually conduct a vaccine trial.

Biostatisticians. I kind of love biostatisticians. They are the people who keep medicine and doctors honest. They design the trials, they tell you when you have a statistically significant result, and they tell you when a drug has failed to deliver in a trial. They usually have a Ph.D. in statistics with lots of experience in the field that was gained by living through the messiness of actual clinical trials. Essentially, while any Ph.D. in mathematics can relatively easily learn the statistics behind clinical trials, what distinguishes a biostatistician from even a regular statistician (who does learn the math behind clinical trials as part of their training) is their real-life experience in the trenches. You rarely see biostatisticians on TV, and on the rare occasion when a biostatistician is on TV, they have no choice but to talk in generalities: the math they are using and the difficulties they have to deal with are hard to explain in a soundbite, alas.

Finally, Science Journalists. I do have a soft spot in my heart for science journalists. While some have professional training (all the way up to the Ph.D.), many do not – but it often doesn’t matter. The best of them have an uncanny ability to accurately translate what scientists are saying into descriptions that make sense to intelligent laypeople, and have this amazing ability to come up with analogies that really let laypeople understand what the scientists are doing. 

Dear Governor Evers: “Near exponential growth” is not a thing

“Wisconsin is now experiencing unprecedented, near-exponential growth of the number of COVID-19 cases in our state,” said Governor Tony Evers in a video message.

Dear Governor Evers,

“Near exponential growth” is not a thing. You either have exponential growth or you don’t – and I’m sorry to tell you, you do. I promise I’ll write you a longer blog letter to explain why this is true and go into much greater depth about the difference between exponential and non-exponential growth, but in a nutshell, here’s the thing: exponential growth (and if we could only have it, exponential decay) is characterized by one thing and one thing only: the rate of new cases is proportional to the number of present cases. (Another common term for this is compound growth.) The only question is how short your doubling time is. If every day you have even 1% more cases than the previous day, you have exponential growth. Your doubling time with only a 1% daily increase will be longer (it is approximately 69.6607168936 days, please trust me on this) but eventually, you will get to the situation described in my “Parable of the Lily Pond” (https://garycornell.com/2020/04/20/the-parable-of-the-lily-pond-or-why-you-should-shelter-in-place/).

Best regards

Gary Cornell PhD
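
For readers who would rather check than trust, the doubling time quoted in the letter can be reproduced in a few lines of Python. This is only a sketch: the function name doubling_time is made up for illustration, and it assumes the daily growth rate stays constant, which real epidemics rarely manage for long.

```python
import math

def doubling_time(daily_growth_rate: float) -> float:
    """Days needed for cases to double at a constant daily growth rate.

    daily_growth_rate is a fraction, so 0.01 means 1% more cases each day.
    Solves (1 + r) ** t == 2 for t.
    """
    if daily_growth_rate <= 0:
        raise ValueError("growth rate must be positive for cases to double")
    return math.log(2) / math.log(1 + daily_growth_rate)

# The 1% case is the one from the letter; the other rates are illustrative.
for rate in (0.01, 0.05, 0.10):
    print(f"{rate:.0%} daily growth: doubles every {doubling_time(rate):.2f} days")
```

The point of the calculation is that any positive daily growth rate gives a finite doubling time. A smaller rate only stretches the timeline; it does not change the kind of growth.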

A field guide to talking heads with MD degrees

I watch too many news shows and one consequence is that I see way too many talking heads saying things on which they are unqualified to have an opinion. The worst offenders are the MDs. Look, obviously becoming a board certified doctor in any specialty is hard and you are probably pretty smart, but it doesn’t make you qualified to say anything outside of your area of specialization. If you are a radiologist or an ophthalmologist, say, then absent any special training, like having a Masters in Public Health (the MPH), you will almost certainly know no more about vaccine trials, vaccine deployment or the epidemiology of a pandemic than your diligent reader of Scientific American, and probably less. That a talking head has an MPH along with their MD is pretty much the baseline before you should even start to think that they are qualified to say anything about these crucial areas – especially during a pandemic. Yes, there are rare exceptions because of career choices like doing research in this area or working in public health and so “learning on the job” without bothering to get the MPH – Anthony Fauci (!) and Leana Wen come to mind. But if an MD is on a TV show or even working for the president and they don’t have an MPH, it’s best to treat what they say as being their opinion as a layperson. Start out by giving it all the credibility that you would give your doctor if they started talking about physics!

A particularly annoying trait – for a mathematician like me – is when they talk about the statistics involved in clinical trials or the models of how a disease spreads. This is because the mathematics needed to say something intelligent about any of these areas is way beyond what most doctors ever knew. Almost all doctors, for example, know essentially nothing about the statistics behind clinical trials (the odds are slightly improved by having an MPH, but the problem is not cured) – that’s why the biostatisticians design the trial and decide when there is a statistically significant result. While an infectious disease doctor or an immunologist is your go-to person for treating patients and helping to conduct a clinical trial on a vaccine, usually they know barely enough statistics to understand what the biostatisticians are telling them, never mind doing the statistical analysis themselves. I would be shocked if more than a handful of infectious disease doctors really understand what, say, the very common “Cox Proportional Hazards Model” is, or why it is a good model to use in a vaccine trial. Heck, I would give odds on a bet that most couldn’t begin to understand how you would actually do the calculation if you didn’t, say, have an R program that gave you blanks to fill in with the data.

So my advice is: look at those chyrons to see if there is an MPH listed before or after that MD and be very wary if there is not.

Alas the usually awesome Rachel Maddow doesn’t do the math for herd immunity right

I love back of envelope calculations and I love Rachel Maddow, she is so bright and usually does her homework. Yesterday, not so much. In this segment:

http://www.msnbc.com/rachel-maddow/watch/math-on-trump-covid-strategy-has-millions-dying-before-it-works-91958341539

she completely screws up a back of envelope calculation on how many deaths might happen before we get to herd immunity. She starts out correctly: if you use the standard “threshold theorem” 1 , you conclude that roughly 65% of the people (roughly 215,000,000) need to be infected and become immune before you get to herd immunity.
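
For the curious, the usual textbook form of the threshold theorem says the fraction that needs to be immune is 1 - 1/R0, where R0 is the average number of people one infected person infects in a fully susceptible population. The sketch below shows how a figure near 65% can come out of that formula. Both inputs are assumptions chosen for illustration and are not stated in the segment: an R0 of 2.86 and a US population of 330 million.

```python
def herd_immunity_threshold(r0: float) -> float:
    """Fraction of the population that must be immune, using 1 - 1/R0.

    If R0 is 1 or less, each case infects at most one other person on
    average, so no immune fraction is needed for the outbreak to fade.
    """
    if r0 <= 1:
        return 0.0
    return 1 - 1 / r0

R0_ASSUMED = 2.86                    # hypothetical value, picked to land near 65%
US_POPULATION_ASSUMED = 330_000_000  # rough population figure, an assumption

threshold = herd_immunity_threshold(R0_ASSUMED)
people_needed = threshold * US_POPULATION_ASSUMED
print(f"threshold: {threshold:.1%}, people needed: {people_needed:,.0f}")
```

Small changes in R0 move the threshold noticeably, which is one reason to treat any single herd immunity percentage as a rough estimate.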

And then she goes completely off the rails. She says there are about 6.6 million confirmed cases of Covid-19 with about 200,000 deaths, so there is a case fatality rate of about 2.97% based on the total number of cases (roughly 200,000/6,600,000). That’s correct. And then this graphic shows up:

What she is doing here is taking the case fatality rate and multiplying it by the number of cases you would (roughly) need to have herd immunity. The problem is that you need to use the infection fatality rate, not the case fatality rate, in any calculation like this. And, because there are many, many more cases of Covid 19 that we don’t know about than those we have found through testing, these two numbers are very different. The infection fatality rate is much, much lower. Why didn’t she check with Ashish Jha or Michael Osterholm, who would have caught this horrible mistake? The good guys aren’t supposed to go so badly off the rails.

Look, any way you cut it, as I will shortly show you, the number of deaths is beyond horrible. But using the case fatality rate rather than the infection fatality rate means the number of deaths is too high by at least a factor of 3 – and that is absent any therapeutic changes, which she correctly points out later would lower this number.

So what was her mistake? Well, the 6.6 million confirmed cases are just the tip of the iceberg. Based on antibody testing (and of course the test positivity ratios we have seen), the best estimates of the number of infections in the United States I have seen say there are between three and twelve times as many infected people as there are confirmed cases, i.e. roughly between 20 and 80 million people have actually been infected by Covid 19. Using these numbers you get an infection fatality rate of between .25% (200,000/80,000,000) and 1% (200,000/20,000,000). Anthony Fauci thinks that the infection fatality rate is actually the higher number, i.e. about 1%, so I will use this number. This rate gives us (absent any therapeutic changes that greatly lower the death rate) 2.15 million deaths (1%*215,000,000) before herd immunity occurs at the high end with Fauci’s estimate, and about 540,000 (.25%*215,000,000) at the low end if the infection is far more widespread than testing would indicate.
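
The arithmetic above is easy to replay. The sketch below uses only the rounded figures from this post (6.6 million confirmed cases, 200,000 deaths, 20 to 80 million estimated infections, 215 million infections needed for herd immunity). The variable and function names are made up for illustration. With these rounded inputs the case fatality rate comes out at about 3.0% rather than exactly 2.97%, presumably because the on-air figure used unrounded counts, and the low-end death toll comes out near 540,000.

```python
CONFIRMED_CASES = 6_600_000             # rounded figure quoted in the post
DEATHS = 200_000                        # rounded figure quoted in the post
INFECTIONS_LOW_ESTIMATE = 20_000_000    # about 3x confirmed cases
INFECTIONS_HIGH_ESTIMATE = 80_000_000   # about 12x confirmed cases
HERD_IMMUNITY_INFECTIONS = 215_000_000  # roughly 65% of the US population

def projected_deaths(fatality_rate: float,
                     infections: int = HERD_IMMUNITY_INFECTIONS) -> float:
    """Deaths expected if `infections` people are infected at this rate."""
    return fatality_rate * infections

# Case fatality rate: deaths divided by confirmed cases only.
cfr = DEATHS / CONFIRMED_CASES
# Infection fatality rate: deaths divided by all estimated infections.
ifr_high = DEATHS / INFECTIONS_LOW_ESTIMATE   # fewer infections, higher rate
ifr_low = DEATHS / INFECTIONS_HIGH_ESTIMATE   # more infections, lower rate

for label, rate in [("case fatality rate (the mistake)", cfr),
                    ("infection fatality rate, high end", ifr_high),
                    ("infection fatality rate, low end", ifr_low)]:
    print(f"{label}: {rate:.2%} -> {projected_deaths(rate):,.0f} deaths")
```

Swapping the denominator from confirmed cases to estimated infections is the entire correction. Everything else in the calculation stays the same.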

So the situation is beyond horrible in both cases if we take the herd immunity route. But let’s not do the math wrong – it gives ammunition to the bad guys.

How to understand “statistically significant”, step the first

In a previous blog I said: “you compare the number of people getting the disease or getting severely ill in the vaccinated and placebo group, looking for a statistically significant difference.” Some people asked me whether that is just an expert’s way of saying “the results are obviously different.” Actually no, it is much more subtle than this. For example, in the early days of remdesivir testing, they found: “a 14-day mortality rate of 7.1% for the group receiving remdesivir versus 11.9% for the placebo group”. However, they also said “the difference in mortality was not statistically significant” (italics mine). But to a layperson it sure looks significant, so what is going on? Unfortunately, I can’t explain what is going on in this example yet, but if you keep reading this blog, I will slowly get you there.

Explaining what “statistically significant” means will take a lot more than a single blog because there is a lot of statistics needed to explain this simple phrase. What’s more, the phrase is controversial among statisticians. In this blog I want to start you on the road but I will stay away from anything really technical.

First off, the reason why statistical significance is so confusing and needs real math to explain it is that people aren’t wired to understand how powerful “randomness” can be in making rare events happen. For example, I once had a t-shirt that said “miracles happen to me” on the front and on the back it said: “once a year on average”. The idea behind this t-shirt is that there are more than 1/2 million minutes in a year, so something weirdly good that is very low probability could (actually would) likely happen to me in a year. It’s a dumb statistical joke I suppose, but the idea is real: random good and bad happens more often than you think.

Let’s think about flipping a coin and looking at the results of a single run of coin tosses. We are trying to decide if the coin is “loaded” and so not a fair coin. We flip it 4 times and we get all heads. Assuming a fair coin and using the “multiplication of probabilities” rule, this will happen only

(1/2)*(1/2)*(1/2)*(1/2) = 1/16 of the time

or 6.25% of the time. Is this enough for any statistician to say that there is a “statistically significant chance the coin is loaded”?

The answer is no. How unlikely does an event have to be before we say it is statistically significant? It used to be the case that statisticians routinely used a “5% threshold”, i.e. that the chance of it happening was less than 1/20, to decide if something was weird enough to say “Huh, something isn’t right here”. Here the chance of this happening was 1/16, which is more than 1/20, so statisticians would say that it “didn’t reach the 1/20 threshold (5%)” and therefore there isn’t enough evidence to conclude that we have a loaded coin.

So, now suppose you had actually flipped it five times and got all heads. That would happen only 1/32 (a little bit more than 3%) of the time by chance. So some statisticians would say: “yep, you probably have a loaded coin because you passed the 5% threshold: there is less than a 1 in 20 chance it could have occurred randomly”.

Personally, I (and many statisticians) think that is too low a bar to clear. We want something to be much rarer than 1 in 20 before we say it is statistically significant, and I and many other people would want there to be less than a 1/100 chance (1%) of it happening randomly. So, I would have flipped it 7 times to start with and yes, if I got 7 heads in a row, I would say “yeah, I will bet that it is loaded” – because this will happen only 1/128 of the time, which is less than 1% of the time!
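
The coin arithmetic above can be checked with exact fractions. This is a small sketch, assuming a fair coin and independent flips. The function names prob_all_heads and flips_needed are made up for illustration.

```python
from fractions import Fraction

def prob_all_heads(n_flips: int) -> Fraction:
    """Chance that a fair coin lands heads on every one of n_flips tosses."""
    return Fraction(1, 2) ** n_flips

def flips_needed(threshold: Fraction) -> int:
    """Smallest run of heads whose chance under a fair coin is below threshold."""
    n = 1
    while prob_all_heads(n) >= threshold:
        n += 1
    return n

# Show the chance of an all-heads run for the run lengths discussed above.
for n in (4, 5, 7):
    p = prob_all_heads(n)
    print(f"{n} heads in a row: {p} of the time ({float(p):.4f})")

print("flips needed for the 5% threshold:", flips_needed(Fraction(1, 20)))
print("flips needed for the 1% threshold:", flips_needed(Fraction(1, 100)))
```

Using Fraction keeps values like 1/16 and 1/128 exact, so the comparison against 1/20 or 1/100 does not depend on floating-point rounding.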

But while statistical significance can be easily illustrated by tossing a coin, it is used most often when doing “hypothesis testing” so I will take that up shortly. Stay tuned.