Since the beginning of the pandemic, the United States has reported 2.1 million cases of COVID-19, but not all cases are created equal. About half of states report two different types of positive cases: “lab-confirmed” cases and “probable cases.” The difference between the two types is important, but we continue to see questions about what a probable case actually is—and why reporting them matters.
The Centers for Disease Control and Prevention recommends that states separate their data into lab-confirmed and probable cases. According to the CDC’s COVID data tracker, only 24 states take the agency’s advice, along with four territories. We hope that all states and territories eventually provide both metrics.
According to CDC data, which tracks with the data we compile from states, probable cases that are clearly labeled as such currently make up less than one percent of all cases nationally. For six states that report them separately (Colorado, Delaware, Idaho, Maine, Michigan, and Wisconsin), however, probable cases make up between eight and 12 percent of all cases. In Wyoming, more than 20 percent of all known cases are probable cases, and in Puerto Rico, that figure is more than 75 percent. We would, therefore, expect the proportion of probable cases to rise nationally as more states either begin to report them or start to label them correctly.
At The COVID Tracking Project, our “cases” metric currently includes both lab-confirmed and probable cases in a single figure, but we’ve been internally tracking probable cases separately for states that publish them. We suspect that some states publish a case count that lumps probable and lab-confirmed cases together, but this has been difficult to verify. Our data team is reengineering our API and spreadsheets to include probable case metrics where we have them. However, until all states publish and clearly label probable and lab-confirmed cases, or verify that their case data counts only lab-confirmed cases, our case count will necessarily mix the two kinds of cases.
But what exactly is a probable case? To get there, we first need to define the other kind of COVID-19 case: “lab-confirmed.”
What’s a lab-confirmed COVID-19 case?
A lab-confirmed case is a person whose illness has been diagnosed as COVID-19 using an FDA-approved test that meets the standards set out in the April 5, 2020, Interim Case Definition for COVID-19 developed by the Council of State and Territorial Epidemiologists (CSTE). The case definition says that cases of COVID-19 are confirmed by a particular kind of test, “molecular amplification,” that can detect viral RNA.
There are two approved tests of this kind in broad use in the United States. The most common is the RT-PCR test. RT-PCR stands for “reverse transcription polymerase chain reaction,” and the Mayo Clinic’s research magazine offers an excellent primer on the science behind the test. RT-PCR is currently the gold standard for infectious disease molecular diagnostics.
The other molecular amplification test used in the US is the more controversial Abbott ID-NOW test. The molecular amplification and RNA detection methods used in this test differ slightly from those of the RT-PCR test. It produces results within about fifteen minutes, significantly less time than the four to five hours required by the RT-PCR test. But this test has proven to be less reliable, so the FDA weighs its positive and negative results differently: A positive result from an Abbott ID-NOW test is considered lab confirmation of COVID-19, but a negative result is considered only “presumptive.” In plain language, this means that if a patient with symptoms of COVID-19 receives a negative result from the Abbott ID-NOW test, they should also be given an RT-PCR test to verify the result.
To recap, someone with a lab-confirmed case of COVID-19 doesn’t have to show symptoms or have an exposure to a known case. They just need a positive result from one of the approved tests.
What’s a probable COVID-19 case?
The definition of a probable case is a bit more complicated. In the absence of a confirmatory lab test, public-health experts can piece together other evidence of infection to try to identify cases that are very likely to be COVID-19 and exclude cases that only might be COVID-19. The job of deciding what counts as a probable case falls, again, to the Council of State and Territorial Epidemiologists.
The CSTE’s Interim Case Definition for COVID-19 defines a probable case of COVID-19 as one that meets one of three sets of requirements:
Meet clinical criteria AND epidemiologic evidence with no confirmatory laboratory testing performed for COVID-19.
Meet presumptive laboratory evidence AND either clinical criteria OR epidemiologic evidence.
Meet vital records criteria with no confirmatory laboratory testing performed for COVID-19.
Requirements in the case definition refer to four kinds of information used to pinpoint COVID-19 infections: clinical criteria, epidemiologic evidence, presumptive laboratory evidence, and vital records criteria. Let’s take a closer look at those.
Four classes of information that can establish a probable case of COVID-19:
“Clinical criteria” refers to sets of COVID-19 specific symptoms and scan results (which are listed in the official CSTE case definition and on the CDC website) that must be present. However, clinical criteria are not met unless there is also “no alternative more likely diagnosis.” For some symptoms—cough, shortness of breath, or difficulty breathing—having one along with no alternative more likely diagnosis is enough to meet the standard of evidence for clinical criteria. For many other symptoms, including fever, sore throat, or loss of smell or taste, a suspected case must present at least two symptoms on the list, along with no alternative more likely diagnosis.
“Epidemiologic evidence” refers to whether a patient has been in close contact with someone with COVID-19 or has traveled to an area with sustained, ongoing community transmission. Contact tracing and travel histories provide this evidence.
“Presumptive laboratory evidence” refers to the results of antibody (serology) and antigen testing. (Remember, only molecular amplification detection tests that look for viral presence in samples can provide confirmatory laboratory evidence.)
“Vital records criteria” refers to death certificates that list COVID-19 or SARS-CoV-2 as a cause of death or a significant condition contributing to death. Cause-of-death determination is another area with nested sets of criteria. The CDC provides detailed guidance on when and how to list COVID-19 as a cause of death or significant condition contributing to death, and some states, like Colorado, have provided more specific death counts beyond the vital records criteria. This is a complex topic that we’ll discuss further in a future post.
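The symptom-counting rule in the clinical criteria can be sketched in a few lines of Python. This is only an illustration, not the CSTE’s or the CDC’s tooling: the function name, the symptom strings, and the two sets are hypothetical names chosen for this example, and the sets hold only the symptoms named in this post (the official list is longer). The sketch also leaves out scan results and severe respiratory illness such as ARDS.

```python
# Symptoms where a single one is enough (as named in this post).
ONE_IS_ENOUGH = {"cough", "shortness of breath", "difficulty breathing"}

# Symptoms where at least two are required. Partial list: only the
# examples named in this post; the official case definition has more.
TWO_REQUIRED = {"fever", "sore throat", "loss of smell or taste"}


def meets_clinical_criteria(symptoms, alternative_diagnosis_more_likely):
    """Return True if the symptom rule described above is satisfied.

    Assumption: `symptoms` is an iterable of strings matching the sets above.
    """
    # Clinical criteria are never met when another diagnosis is more likely.
    if alternative_diagnosis_more_likely:
        return False
    present = set(symptoms)
    # One respiratory symptom from the first group is sufficient.
    if present & ONE_IS_ENOUGH:
        return True
    # Otherwise at least two symptoms from the second group are needed.
    return len(present & TWO_REQUIRED) >= 2
```

Under these assumptions, a cough alone meets the rule, a fever alone does not, and a fever with a sore throat does, provided no other diagnosis is more likely.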
These criteria can interact in surprising ways. Here are some of the paths by which someone could be found to have a probable case of COVID-19:
Someone who has difficulty breathing AND no alternative more likely diagnosis (clinical criteria) AND has been in close contact with someone who has COVID-19 (epidemiologic evidence) is considered a probable case (unless or until they receive a test that counts as confirmatory).
Someone who has traveled to a known COVID-19 hotspot and then has a positive antibody test result is considered a probable case.
Someone whose death certificate lists COVID-19 as a cause of death or significant condition contributing to death, but who did not receive a molecular amplification detection test, is considered a probable case.
On the other hand, some suspected cases that may well be COVID-19 infections don’t meet the criteria to be considered a probable case. For example:
Someone who gets a positive antigen or antibody test result is not considered a probable case unless they also show specific clinical or epidemiologic evidence.
Someone with acute respiratory distress syndrome (ARDS) and no alternative more likely diagnosis (clinical criteria) is not considered to be a probable case unless they have an additional positive antigen or antibody test OR epidemiologic evidence.
Someone who shares a room with a person with lab-confirmed COVID-19 and who has unexplained loss of smell or taste and no other symptoms is not considered to be a probable case unless they have an additional positive antigen or antibody test.
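The three sets of requirements, and the examples above, can be expressed as a small decision function. What follows is a minimal sketch of the logic as this post describes it, not an official implementation. The `Evidence` class, its field names, the `classify` function, and the "not counted" label are all hypothetical, and each field is assumed to have been determined elsewhere (for example, by a clinician or a contact tracer).

```python
from dataclasses import dataclass


@dataclass
class Evidence:
    """Hypothetical summary of what is known about one suspected case."""
    clinical: bool = False                # clinical criteria met
    epidemiologic: bool = False           # close contact or relevant travel
    presumptive_lab: bool = False         # positive antibody or antigen test
    vital_records: bool = False           # death certificate lists COVID-19
    confirmatory_performed: bool = False  # molecular amplification test done
    confirmatory_positive: bool = False   # ...and it came back positive


def classify(e: Evidence) -> str:
    # A positive molecular amplification test confirms the case outright.
    if e.confirmatory_positive:
        return "confirmed"
    untested = not e.confirmatory_performed
    # Path 1: clinical criteria AND epidemiologic evidence, no confirmatory test.
    if e.clinical and e.epidemiologic and untested:
        return "probable"
    # Path 2: presumptive lab evidence AND (clinical OR epidemiologic).
    if e.presumptive_lab and (e.clinical or e.epidemiologic):
        return "probable"
    # Path 3: vital records criteria, no confirmatory test.
    if e.vital_records and untested:
        return "probable"
    return "not counted"


# The examples from the text, encoded under the assumptions above.
print(classify(Evidence(clinical=True, epidemiologic=True)))         # probable
print(classify(Evidence(epidemiologic=True, presumptive_lab=True)))  # probable
print(classify(Evidence(vital_records=True)))                        # probable
print(classify(Evidence(presumptive_lab=True)))                      # not counted
print(classify(Evidence(clinical=True)))                             # not counted
print(classify(Evidence(epidemiologic=True)))                        # not counted
```

The last call corresponds to the roommate example: a single symptom such as loss of smell does not satisfy the clinical criteria on its own, so only the epidemiologic evidence is set.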
The “probable” case criteria are quite strict by design. People who have COVID-19 but aren’t yet symptomatic, or who never develop noticeable symptoms, wouldn’t meet the clinical criteria. Because of uneven access to testing that would provide lab evidence and a lack of comprehensive contact tracing that could provide epidemiologic evidence, we can conclude that many actual COVID-19 infections do not meet the criteria to be reported as either lab-confirmed or probable cases. That’s not a bad thing; the definition is built to exclude all but the likeliest cases. But it’s useful to keep in mind that even probable case counts are likely to miss many infections.
Why it’s important to count probable cases
First, it’s what the experts want. On April 5, the Council of State and Territorial Epidemiologists recommended that both confirmed and probable cases of COVID-19 be “included in the ‘case’ count released outside of the public health agency.” The council also said that the CDC should publish both metrics. The CDC, in turn, began including probables in its online case counts.
Second, it provides more context for coronavirus data. Some CDC officials have expressed concern that current case counts from states are too low. In a June 8 article in The Washington Post, CDC spokeswoman Kristen Nordlund stated that “the current case and deaths counts reported to CDC are likely an undercount.” Sources ranging from public health researchers and officials to Goldman Sachs analysts have also repeatedly expressed the belief that official US data undercounts actual COVID-19 cases.
If more states included their official probable cases, the United States would still be unlikely to completely close the gap between the data and the reality, but its picture of the pandemic would improve. Nevertheless, many state health departments have failed to provide data on probable cases to the CDC or to the public, and not always for clear reasons. In a press conference on June 10, Indiana’s State Health Commissioner responded to a question about why Indiana was not following the CDC’s guidelines on reporting probable cases by stating, “We, just like most states, do not report that and we don’t report it on our website. There is no particular reason for that.”
In a response to The Washington Post, North Carolina’s state health department cited concerns about the reliability of antibody tests (even though antibody tests do not, on their own, provide evidence for a probable case) and “concerns that the CSTE’s definition of a probable case is overly broad.”
In the same Washington Post report on probable cases, officials in the District of Columbia, which had counted 9,799 confirmed cases of COVID-19 as of June 15, claimed that they hadn’t reported probable cases to the CDC because they didn’t have any. For comparison, Delaware, which had reported 9,378 confirmed cases of COVID-19 as of the same date, has reported nearly 1,000 probable cases.
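To put the Delaware figures in proportion, the share of probable cases can be worked out directly. This is a rough sketch: the confirmed count comes from the paragraph above, but the probable count is reported only as “nearly 1,000,” so the 1,000 used here is a rounded stand-in and the result is approximate. The function and variable names are hypothetical.

```python
# Confirmed count as of June 15, from the text above.
delaware_confirmed = 9_378
# Assumption: a rounded stand-in for "nearly 1,000" probable cases.
delaware_probable_approx = 1_000


def probable_share(confirmed: int, probable: int) -> float:
    """Return probable cases as a percentage of all (confirmed + probable) cases."""
    total = confirmed + probable
    return 100 * probable / total if total else 0.0


share = probable_share(delaware_confirmed, delaware_probable_approx)
print(f"Delaware probable share: roughly {share:.1f}%")
```

With these inputs the share comes out just under 10 percent, which falls within the eight-to-12-percent range cited earlier for Delaware and the other states that report probable cases separately.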
Until states report both confirmed and probable cases in accordance with CDC guidelines and the CSTE’s case definition, public health officials, researchers, journalists, and the public all lose valuable opportunities to compare outbreaks across regions.