Since its formation in 1946, the CDC has been the nation’s cornerstone for disease prevention and health promotion. As a federal agency within the Department of Health and Human Services (HHS), its primary role is to protect the United States from threats to public health. US public health professionals look to the CDC for scientific leadership, expertise, and guidance. For decades, the CDC has coordinated efforts across states and standardized epidemiological data and methods, giving us a nationwide snapshot of new diseases as they emerge.
In the case of COVID-19, it took more than 15 weeks from the first reported case in the US for the CDC to release its COVID-19 Data Tracker. The 56 different datasets produced by US states and territories demonstrate the problem of reporting this data without national guidance. Case counts, completed tests, and death tolls are reported inconsistently, in ways that make it very difficult to compile an accurate national picture of the pandemic.
The launch of the CDC’s new COVID Data Tracker is a major step. Ideally, disease modelers, researchers, and public health authorities would be working from the same data. The general public, too, should be able to trust that there is one set of reliable numbers. Unfortunately, the new data from the CDC doesn’t get us all the way there.
Five days after the launch of the CDC Data Tracker, The COVID Tracking Project at The Atlantic released a detailed evaluation of the new CDC data. In the paper, we compare the CDC’s COVID-19 data with the corresponding data publicly reported by the states and the District of Columbia. For many states, the testing numbers from the CDC and the testing numbers we compile from official state sources paint different pictures of the current state of testing.
We understand how complicated this data is—we’ve been gathering and analyzing it ourselves for months. Some differences in the state and CDC datasets are to be expected, and in over half the states, the testing numbers fall within 10% of one another. Other discrepancies are too large to ignore: in 13 states, the testing numbers differ by over 25%.
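To make the 10% and 25% thresholds concrete, here is a minimal sketch of one way such a comparison could be computed. It assumes the difference is measured relative to the state-reported total, which may not match the method used in our report. The function names and the figures are hypothetical placeholders, not real test counts.

```python
def percent_difference(state_total: int, cdc_total: int) -> float:
    """Absolute gap between the two totals, as a percentage of the state figure.

    Assumption: the state-reported total is the baseline. The report itself
    may define the difference another way.
    """
    if state_total <= 0:
        raise ValueError("state_total must be positive")
    return abs(cdc_total - state_total) / state_total * 100


def classify(pct: float) -> str:
    """Sort a percent difference into the bands discussed in the text."""
    if pct <= 10:
        return "within 10%"
    if pct > 25:
        return "over 25%"
    return "between 10% and 25%"


# Placeholder figures for illustration only: (state-reported tests, CDC-reported tests).
example_totals = {
    "State A": (100_000, 104_000),
    "State B": (200_000, 140_000),
    "State C": (50_000, 58_000),
}

buckets = {
    name: classify(percent_difference(state, cdc))
    for name, (state, cdc) in example_totals.items()
}

for name, bucket in buckets.items():
    print(f"{name}: {bucket}")
```

Using the state figure as the baseline is a design choice. Dividing by the CDC figure, or by the average of the two, would shift some states across a threshold, so any comparison like this should state its baseline.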
The CDC is uniquely positioned to unify and reconcile the many inconsistent datasets from the states and territories. Dozens of our volunteers, most of whom have already been working on this data for weeks or months, worked through nights and weekends to do this analysis. We hope our work will help state and federal agencies understand and close the gaps we’ve identified. Once that happens, we can turn the greater part of our attention to other areas where we have insufficient data about this pandemic. Until then, we’ll be here every day to bring you the numbers.
All the data we used for analysis is publicly available on our GitHub repo. This report is licensed CC-BY 4.0. Please attribute it to “The COVID Tracking Project at The Atlantic.” You can contact us anytime at https://covidtracking.com/contact.