Giving Thanks and Looking Ahead: Our Data Collection Work Is Done

Yesterday at about 7pm Eastern, our data entry team completed our final daily update to our national testing and outcomes dataset. Our race and ethnicity team’s final update came in a few hours later, and our team that compiles data on COVID-19 in nursing homes and other long-term-care facilities did their last shift on March 4. With those tasks concluded, The COVID Tracking Project closes the book on our data compilation work and moves into our final phase of analysis and documentation. We’ll be publishing research through May, and then we will—fully and accessibly—archive our work and be done.

Our project started as reporting, turned into a crisis response, and endured for the entirety of the first year of the pandemic, increasing in size and complexity as needed to deal with the country’s expanding and persistently messy landscape of COVID data. As we wind down at the one-year point, we are both satisfied that the federal government is now producing enough data to replace many of our metrics and eager to turn our attention to the analysis of those areas of the data—like race and ethnicity data, some states’ testing counts, and information on COVID-19 in long-term-care facilities—that remain inadequate.

In the past few months, our teams spent extra time researching federal datasets that cover the metrics we have been tracking for a year in the various wings of our project, and have assembled a series of guides to the data produced by federal public health agencies. Our data summary page will soon link to a directory of comparable federal datasets and dashboards. We have also collected links to all our posts about federal data, along with recordings and slides for six training sessions on various federal datasets, to help our data users make the transition to federal data. For everyday users, we’ve written a short primer on easy-to-use federal COVID-19 datasets and interpretations that can take the place of our daily tweets and weekly updates.

From the reporting that kicked off the project through our 100+ posts about the data in the ensuing months, we have often been sharply critical of the data that states and federal agencies have chosen to release. Every part of the US response to the pandemic has been nominally or actually based on the idea that we know what is happening across the country to a precise degree. And although this idea, that we have solid data about the movements of the pandemic and the effects of the response, is much closer to the truth now than it was a year ago, it’s very far from universally accurate.

US COVID-19 data is only as good as its initial collection points, and is often heavily shaped by the pipelines through which it travels on its way to state dashboards and the federal government. Other datasets, such as those covering race and ethnicity and long-term-care facilities, remain woefully inadequate. We will continue to publish everything we’ve learned about the fractured and incomplete character of our country’s public health data until our project comes to a close. We do this both in the hope that our work will help improve the underlying systems, and in the belief that the data’s unusual characteristics and imperfections must be fully and publicly recognized—not later, but now, during the response to the ongoing pandemic—to allow governments at every level to plan their efforts based on reality, and not on a mirage of seamless, real-time data that has never existed.

We must also acknowledge the heroic efforts of public health workers in every US city, county, and state—as well as in many parts of the federal government—to rise to the overwhelming challenge of collecting and reporting pandemic data at a finer level of detail and a larger scale than has ever been required in our country. Their year-long and ongoing crisis efforts have given us our only chance of understanding the pandemic, and the work of COVID Tracking Project volunteers relies entirely on their labor and sacrifice.

Our friends at The Atlantic, our advisory board, funders, and vendors, and especially our hundreds of mostly volunteer contributors deserve all our gratitude. Many of our contributors did this work at significant personal cost, and—like so many other people all over the world—through periods of wrenching collective and individual loss. We don’t have words to thank them for the public service they’ve done. It has been the honor of a lifetime to help steer this ship through an impossible year.

Erin Kissane is a co-founder of the COVID Tracking Project, and the project’s managing editor.

@kissane

Alexis C. Madrigal is a staff writer at The Atlantic, a co-founder of the COVID Tracking Project, and the author of Powering the Dream: The History and Promise of Green Technology.

@alexismadrigal
