In the first few months of the COVID-19 outbreak in the United States, most states reported a single “deaths” figure to account for everyone in their jurisdiction who had died of COVID-19. Since then, states have gradually begun to report “probable” COVID-19 deaths, and we are now able to compile both probable and confirmed COVID-19 deaths from 24 US states, fewer than half of the 56 jurisdictions we track.
Today, we are releasing two new fields in our API, breaking COVID-19 death counts into new probable and confirmed categories for those 24 states providing them. We have been compiling these data points internally since May 12, so this public release includes all compiled probable death data that states have reported since that date.
Unfortunately, in the process of compiling this data, we have discovered a new challenge that makes even these newly separate data points less clear and consistent than we’d like them to be. So before we introduce you to the new data points and definitions, we need to break down the inconsistencies we found.
Two ways to count COVID-19 deaths
In the United States, most states use one of two methods to count COVID-19 deaths: the death certificate method and the case classification method. Both methods are defensible, and each carries advantages and disadvantages. (Two states, Colorado and North Dakota, record deaths using both methods.)
Death certificates (people who died of COVID-19)
Using the death certificate method requires checking whether COVID-19 was listed as the cause of death on the death certificate, which it can be whether or not there is a confirmatory laboratory test. These are usually termed “deaths due to COVID-19” in public health reports.
This method generates the most epidemiologically accurate data for modelling pandemic severity or for COVID-19-associated population health research, because it removes deaths that were not due to COVID-19. The downside is that it takes, on average, one to two weeks to produce and review death certificates, so death reporting using this method lags significantly behind testing and hospitalization data. This makes the method less helpful for understanding how many people are dying, and where, in an ongoing outbreak.
To understand the real-time status of the pandemic and allow for the best informed response, death counts based on case data are most useful, which brings us to the second way of reporting deaths.
Case classification method (people who died with a known case of COVID-19)
Using the case classification method, deaths among previously identified cases (probable or confirmed) are immediately recorded as probable or confirmed deaths, respectively. These are usually termed “deaths among cases” in public health reports. Since there is no review process involved, this method results in real-time reporting of death counts, but at the cost of a less accurate count that includes some people who died with COVID-19, but not because of COVID-19.
Importantly, the differences between the two counts appear to be relatively small in some states and larger in others. For example, after reviewing its death certificate data, Washington state found that only 7 of the 1,233 deaths attributed to COVID-19 cases were not caused by the disease. In Colorado, the 1,696 “deaths among cases” include 154 more people than the “deaths due to COVID-19” to date. North Dakota reports deaths in a slightly more complicated way, providing counts calculated by both methods only for lab-confirmed deaths. Its case classification count for lab-confirmed deaths includes 11 more deaths than the 74 lab-confirmed deaths with COVID-19 listed on the death certificate.
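To put those three gaps on a common scale, here is a small worked calculation. It is only a sketch: the helper name `excess_share` is made up for illustration, and it expresses each gap as a fraction of the state's case classification count, which is one reasonable choice of denominator rather than an established convention.

```python
def excess_share(deaths_among_cases: int, excess: int) -> float:
    """Fraction of the case classification count that the death
    certificate count does not include."""
    return excess / deaths_among_cases

# Washington: 7 of 1,233 deaths among cases were not caused by COVID-19.
washington = excess_share(1233, 7)

# Colorado: 1,696 deaths among cases, 154 more than deaths due to COVID-19.
colorado = excess_share(1696, 154)
colorado_certificate_count = 1696 - 154

# North Dakota (lab-confirmed only): 74 on death certificates, 11 more
# under case classification, so the case classification count is 85.
north_dakota = excess_share(74 + 11, 11)

for name, share in [
    ("Washington", washington),
    ("Colorado", colorado),
    ("North Dakota", north_dakota),
]:
    print(f"{name}: {share:.1%}")
# Washington: 0.6%
# Colorado: 9.1%
# North Dakota: 12.9%
```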
So which method should states use? The federal government has sent out conflicting signals. The National Center for Health Statistics (NCHS), the arm of the US Centers for Disease Control and Prevention (CDC) that issued the official federal guidance on death reporting for COVID-19, recommends primarily using information from death certificates to count COVID-19 deaths. In a similar vein, the World Health Organization issued provisional ICD-10 codes (U07.1 and U07.2) for classifying COVID-19 deaths based on death certificates.
The CDC, however, has not followed rules from its own affiliate institutions in its main COVID-19 surveillance effort. According to its website, the CDC has used the case classification guidelines of the Council of State and Territorial Epidemiologists (CSTE) to determine death counts in its own COVID-19 tracker since April 14.
From the “About the Data” section of the CDC’s US cases page:
- Meeting clinical criteria AND epidemiologic evidence with no confirmatory laboratory testing performed for COVID-19
- Meeting presumptive laboratory evidence AND either clinical criteria OR epidemiologic evidence
- Meeting vital records criteria with no confirmatory laboratory testing performed for COVID19
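Read literally, the three quoted criteria can be written as a single boolean rule. The sketch below assumes the bullets are alternatives (any one is sufficient), which the excerpt itself does not state. The `Evidence` type and its field names are hypothetical, introduced only for this illustration, and the rule does not attempt to model confirmed cases.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Evidence:
    """Hypothetical summary of the evidence available for one person."""
    clinical_criteria: bool = False
    epidemiologic_evidence: bool = False
    presumptive_lab_evidence: bool = False
    vital_records_criteria: bool = False
    confirmatory_test_performed: bool = False

def meets_quoted_criteria(e: Evidence) -> bool:
    """True if any of the three quoted criteria is met.

    Assumption: the bullets are treated as alternatives.
    """
    no_confirmatory_test = not e.confirmatory_test_performed
    return (
        # Clinical criteria AND epidemiologic evidence, no confirmatory test
        (e.clinical_criteria and e.epidemiologic_evidence and no_confirmatory_test)
        # Presumptive lab evidence AND either clinical OR epidemiologic
        or (e.presumptive_lab_evidence
            and (e.clinical_criteria or e.epidemiologic_evidence))
        # Vital records criteria, no confirmatory test
        or (e.vital_records_criteria and no_confirmatory_test)
    )
```

For example, `meets_quoted_criteria(Evidence(vital_records_criteria=True))` is true, while clinical criteria alone do not satisfy any of the three bullets.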
To confirm that we understood the CDC’s method, we compared the CDC’s figures with the numbers reported by Colorado and North Dakota, both of which release at least some data on the two types of death counts, to check whether the CDC was reporting deaths among cases for those two states. It was.
The two methods are both defensible for their respective virtues, but generate different counts on different timelines. We think it’s most useful for states to report both, as Colorado and North Dakota do. If this is not possible, it would be very helpful for the federal government to clearly advise the use of one standard across the nation, so that all jurisdictions report deaths on synchronized timelines.
What we’re publishing today
Since the federal government has not provided that clear guidance, these new probable and confirmed deaths fields in our API—as well as our overall death counts—come with an asterisk: Because states report deaths using different methods, these fields lack a standardized and transparent definition across all jurisdictions. Their meanings will vary depending on the state. When states provide figures pertaining to both case classifications and death certificates, which two states (North Dakota and Colorado) do, we have prioritized information on deaths coming from death certificates as a temporary measure.
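The prioritization described above amounts to a simple fallback rule. The following is a minimal sketch of that rule, not the project's actual pipeline; the function and parameter names are hypothetical.

```python
from typing import Optional

def select_death_count(
    death_certificate_count: Optional[int],
    case_classification_count: Optional[int],
) -> Optional[int]:
    """Prefer the death certificate figure when a state publishes one;
    otherwise fall back to whatever case classification figure exists."""
    if death_certificate_count is not None:
        return death_certificate_count
    return case_classification_count

# Colorado's figures from the comparison earlier in this post:
# 1,696 deaths among cases, 154 fewer deaths due to COVID-19.
colorado = select_death_count(1696 - 154, 1696)  # 1542

# A state that publishes only a case classification figure keeps it.
case_only = select_death_count(None, 1696)  # 1696
```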
Here are the fields we are releasing today:
Deaths: This field reports the overall death count associated with COVID-19, including both confirmed and probable deaths in accordance with the CSTE expanded case definition. In states where the information is available, it only tracks total fatalities with COVID-19 listed on the death certificate in accordance with the NCHS and WHO standards.
Deaths (Confirmed): This new field tracks deaths of individuals with lab-confirmed COVID-19. In jurisdictions where the information is available, it tracks only those laboratory-confirmed deaths where COVID-19 also contributed to the death according to the death certificate, in accordance with the NCHS and WHO standards.
Deaths (Probable): This new field tracks probable COVID-19 deaths. How those deaths are counted depends on the state. States using case classifications usually count any individual who met the CSTE’s probable case criteria, which combine vital records evidence, epidemiological linkage evidence, serology test evidence, and syndromic evidence. States that follow death certificate criteria count individuals who had COVID-19 listed as a cause of death but did not have a confirmatory laboratory test. As of now, no state provides both, so we have used whatever figure the state provides.
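Because only some states provide the breakdown, code that reads these fields should expect missing values. The sketch below shows one defensive way to do that. The JSON key names (`death`, `deathConfirmed`, `deathProbable`) are assumptions made for illustration and should be checked against the API documentation, and the sample records are made-up placeholders, not real state data.

```python
from typing import Any, Mapping, Optional

# Assumed key names; verify against the API documentation before use.
CONFIRMED = "deathConfirmed"
PROBABLE = "deathProbable"

def has_breakdown(record: Mapping[str, Any]) -> bool:
    """True when a record carries both the confirmed and probable counts."""
    return record.get(CONFIRMED) is not None and record.get(PROBABLE) is not None

def probable_share(record: Mapping[str, Any]) -> Optional[float]:
    """Probable deaths as a fraction of confirmed plus probable deaths,
    or None when the state does not report the breakdown."""
    if not has_breakdown(record):
        return None
    reported = record[CONFIRMED] + record[PROBABLE]
    return record[PROBABLE] / reported if reported else None

# Made-up placeholder records, not real state data.
sample = [
    {"state": "AA", "death": 100, "deathConfirmed": 90, "deathProbable": 10},
    {"state": "BB", "death": 50, "deathConfirmed": None, "deathProbable": None},
]
shares = {r["state"]: probable_share(r) for r in sample}
```

Since the meaning of each field varies by state, a share computed this way is comparable across states only when they count deaths using the same method.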
We are currently investigating each jurisdiction's death counting policies by auditing available documentation and doing targeted outreach to state health departments. Once we have finished, we will release more detailed annotations and analysis about each state’s standards.
Eventually, we will provide more detailed data on deaths by splitting confirmed and probable deaths into two categories: first, the slower but more accurate count of deaths in which COVID-19 played a role according to the death certificate, and second, the faster but less precise count of deaths determined by case classification.
Many people at The COVID Tracking Project contributed to the research and data-compilation efforts that made this change possible. We would like to thank Jesse Anderson, Jennifer Clyde, Matt Hilliard, Betsy Ladyzhets, Camille Le, Brian Li, Daniel Lin, Anna Schmidt, Sharon Wang, and other contributors for their tireless work on this and many other data efforts.