This page is meant to guide you to the US federal COVID-19 data sources most comparable to the data compiled by The COVID Tracking Project from individual US states and territories.

Most federal data about COVID-19 cases, deaths, hospitalizations, and testing comes from the Department of Health and Human Services (HHS), the Centers for Disease Control and Prevention (CDC), and the Centers for Medicare and Medicaid Services (CMS). Both CDC and CMS are part of HHS organizationally, but in practice the three units sometimes function independently. The federal government provides access to these datasets through multiple data portals and uses them to create a wide variety of trackers, websites, reports, and charts.

Testing and outcomes

CDC United States COVID-19 Cases and Deaths by State over Time

About

Description

COVID-19 cases and deaths by state/territory. This aggregate cases and deaths dataset is used in the COVID Data Tracker, COVID Data Tracker Weekly Review, Community Profile Reports and State Profile Reports. It is available for download both from the CDC and on HealthData.gov, in a variety of formats including CSV and XML. The CSVs are straightforward and easy to work with. It is also possible to filter, sort, and visualize the data on the CDC website without downloading it. You can also query the data online via the Socrata Open Data API after consulting the excellent and comprehensive documentation provided.
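
As a rough illustration of the API route, the sketch below builds a Socrata query URL in Python. The dataset identifier is a placeholder, and the column names "state" and "submission_date" are assumptions made for the example; check the dataset's page on Data.CDC.gov for the real values. The $where, $order, and $limit parameters are standard Socrata query options.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: replace with the identifier shown on the
# dataset's page on Data.CDC.gov.
DATASET_ID = "xxxx-xxxx"
BASE_URL = "https://data.cdc.gov/resource"


def build_query_url(dataset_id, where=None, order=None, limit=1000):
    """Build a Socrata query URL that returns rows as JSON."""
    params = {"$limit": limit}
    if where:
        params["$where"] = where
    if order:
        params["$order"] = order
    # safe="$" keeps the "$" prefix of the parameter names readable.
    return f"{BASE_URL}/{dataset_id}.json?{urlencode(params, safe='$')}"


# Column names here are assumptions for illustration only.
url = build_query_url(
    DATASET_ID,
    where="state = 'NY'",
    order="submission_date DESC",
    limit=5,
)
print(url)
```

The function only assembles the URL and makes no network request; the result can be opened in a browser or passed to any HTTP client.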

Last updated March 15, 2021

HHS COVID-19 Reported Patient Impact and Hospital Capacity by State Timeseries

About

Description

This dataset reports hospital capacity and usage metrics by state and date. These include, for example, the current number of adult and pediatric patients with suspected or confirmed COVID-19 who are hospitalized in inpatient and intensive care unit (ICU) beds. Each of these metrics is reported by every US hospital every day, with the exception of psychiatric and rehabilitation hospitals, which report weekly. More than 6,000 hospitals report to HHS either directly or via their state or state hospital associations, and the underlying dataset is publicly accessible and used across federal and state agencies: the CDC uses this hospitalization data in its COVID Data Tracker, and it is included in the publicly available Community Profile Reports and State Profile Reports.

Different versions or “slices” of this dataset are also available on HealthData.gov. The very large facility-level dataset “COVID-19 Reported Patient Impact and Hospital Capacity by Facility,” with more than 92,000 rows, includes the same hospital capacity and usage metrics for each reporting hospital by name. The summary, non-timeseries dataset “COVID-19 Reported Patient Impact and Hospital Capacity by State” provides only the most recent (within the last four days) total value for each hospital metric by state.

Last updated March 15, 2021

HHS COVID-19 Diagnostic Laboratory Testing (PCR Testing) Time Series

About

Description

This testing dataset is a time series of PCR (polymerase chain reaction) tests and test results by state by day that begins March 1, 2020. It includes only PCR tests, not antibody (serology) tests or antigen tests, and it counts test specimens, not unique people tested. Tests in this dataset are organized by the date the test was administered or the date of the test result, not by date of report. The data comes from the COVID Electronic Laboratory Reporting Program (CELR), a system launched in the spring of 2020 specifically to collect COVID-19 data from laboratories. The dataset powers charts and visualizations on the HHS Protect Public Data Hub as well as weekly state-level charts of testing in the HHS State Profile Reports. CMS uses this dataset to compile and publish a weekly spreadsheet of the 14-day average of test positivity rates by county that is meant to help nursing homes estimate viral prevalence in their area.
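
To make the idea of a 14-day positivity figure concrete, here is a minimal Python sketch that pools positive and total test counts over a 14-day window. This is one plausible way to compute such a rate, not necessarily the method CMS uses, and the input structure and numbers are invented for illustration.

```python
from datetime import date, timedelta


def positivity_over_window(daily_counts, end_date, window_days=14):
    """Return positive tests / total tests over the window ending on end_date.

    daily_counts maps a date to a (positive_tests, total_tests) tuple.
    Returns None when no tests were recorded in the window.
    """
    start_date = end_date - timedelta(days=window_days - 1)
    positive = 0
    total = 0
    for day, (day_positive, day_total) in daily_counts.items():
        if start_date <= day <= end_date:
            positive += day_positive
            total += day_total
    return positive / total if total else None


# Made-up numbers: 20 days with 100 tests per day, 5 of them positive.
sample = {date(2021, 3, 1) + timedelta(days=i): (5, 100) for i in range(20)}
rate = positivity_over_window(sample, end_date=date(2021, 3, 20))
print(f"{rate:.1%}")
```

Pooling counts across the window weights busy testing days more heavily than a simple average of daily rates would, so the two approaches can give different answers.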

This dataset has several significant differences from the state-aggregated data compiled and published by The COVID Tracking Project. Please read our analysis of federal testing data to be sure you understand these differences.

Last updated March 15, 2021

CDC NCHS Provisional Death Counts for Coronavirus Disease Index of Files

About

Description

The CDC’s National Center for Health Statistics regularly publishes various datasets about COVID-19 deaths based on death certificate data submitted to the National Vital Statistics System. Because death certificates take several weeks to be received and entered into the NVSS system, this death data significantly lags other sources of reported death counts from COVID-19. Death certificates, however, contain a great deal of information about the person who died, so these datasets are particularly useful for demographic research on factors such as age, race/ethnicity, and geographic location. Death certificates can contain errors and omissions that NCHS takes time to correct, and the data in these files has not yet been fully investigated, so NCHS emphasizes that this data is to be considered “provisional” and “ad hoc.”

Last updated March 11, 2021

Race & ethnicity

CDC COVID-19 Case Surveillance Public Use Data

About

Description

The CDC COVID-19 Case Surveillance Public Use Data dataset is line-level data, not aggregate data, which means that it includes a de-identified line for each person reported as a case of COVID-19. Each line includes detailed demographic data for that person. This dataset is updated only once per month because of the complexity of working with such extremely detailed data, and the data itself is very unwieldy to work with: the downloadable version currently has nearly 21 million rows. The CDC’s COVID Data Tracker uses this dataset to provide a daily snapshot of current trends in COVID-19 cases and deaths by race/ethnicity, but these trends are not available in a timeseries; only the most current national counts and percentages are provided.

Because this data contains identifying and sensitive information, the CDC provides both a 12-data-element public-use dataset of the line list and a 32-data-element restricted-access dataset of the same data that includes potentially identifying information. The public-use dataset includes fields for sex, age_group, and race_ethnicity_combined, information about whether the person was hospitalized or sent to the ICU, as well as four separate fields that can help assign a date to the case: the date the person said their symptoms began, the date the person’s diagnostic test had a positive result, the date the case was reported to the CDC, and the earliest available of these three dates. The restricted-access dataset includes the state and county of residence of each person reported to have contracted COVID-19, as well as information on whether the person is a healthcare worker. If you want to use the COVID-19 Case Surveillance Restricted Access Detailed Data, you must apply to the CDC for permission. Data elements (fields) for both the public and the restricted versions of this dataset can be found on the COVID-19 case report form.
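
Because the line list is so large, one practical approach is to read it one row at a time and tally only what you need rather than loading the whole file into memory. The Python sketch below counts rows by race_ethnicity_combined. The sample rows and the category labels "Group A" and "Group B" are invented, and treating blank cells as "Missing" is a choice made for this example.

```python
import csv
import io
from collections import Counter


def count_cases_by_field(csv_file, field="race_ethnicity_combined"):
    """Count line-level case rows by one column, reading one row at a time."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        # Blank cells are grouped under "Missing" (a choice made here).
        counts[row.get(field) or "Missing"] += 1
    return counts


# Tiny made-up sample standing in for the multi-million-row download.
sample_csv = io.StringIO(
    "sex,age_group,race_ethnicity_combined\n"
    "Female,20 - 29 Years,Group A\n"
    "Male,30 - 39 Years,Group B\n"
    "Female,30 - 39 Years,Group B\n"
    "Male,50 - 59 Years,\n"
)
counts = count_cases_by_field(sample_csv)
for value, n in counts.most_common():
    print(value, n)
```

With the real download, the same function can be given a file opened with open(path, newline="", encoding="utf-8") in place of the in-memory sample.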

Last updated March 23, 2021

CDC COVID-NET Rates of COVID-19-Associated Hospitalization per 100,000 Population

About

Description

COVID-NET, the COVID-19-Associated Hospitalization Surveillance Network, is a group of more than 250 US hospitals in ninety-nine counties in fourteen states that gathers in-depth data about people who are hospitalized with confirmed cases of COVID-19. The fourteen states with selected participating hospitals are California, Colorado, Connecticut, Georgia, Iowa, Maryland, Michigan, Minnesota, New Mexico, New York, Ohio, Oregon, Tennessee, and Utah. The data from COVID-NET is representative, not comprehensive: only a small number of US hospitals, counties, and states are part of the network, which covers about 10% of the US population.

COVID-NET data can be downloaded by clicking the “Download data” button at the top right of the COVID-NET data page. COVID-NET reports data weekly by the number of the week in a particular year, so, for instance, the data begins with week 10 of the year 2020, which was the week ending March 7, 2020. The elements tracked are Age Category, Sex, and Race, and the data is standardized within each metric. Unfortunately, the Race category as tracked by COVID-NET is not perfectly comparable to the race_ethnicity_combined category tracked in the CDC COVID-19 Case Surveillance Data. Please read our analyses of federal race and ethnicity data to be sure you understand these differences.
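
Converting a week number into a calendar date takes a little care. The Python sketch below assumes the week numbers follow the Sunday-to-Saturday epidemiological (MMWR) week convention, in which week 1 is the week containing January 4. That assumption is not stated on this page, but it reproduces the week 10 example given above.

```python
from datetime import date, timedelta


def week_ending_date(year, week):
    """Return the Saturday that ends the given epidemiological week.

    Assumes Sunday-to-Saturday weeks in which week 1 is the week that
    contains January 4 (the first week with at least four days in the
    new year).
    """
    jan4 = date(year, 1, 4)
    # date.weekday() numbers Monday as 0 and Sunday as 6.
    days_since_sunday = (jan4.weekday() + 1) % 7
    week1_start = jan4 - timedelta(days=days_since_sunday)
    return week1_start + timedelta(weeks=week - 1, days=6)


print(week_ending_date(2020, 10))  # 2020-03-07, matching the example above
```

Note that this differs from Python's built-in ISO week numbering, which uses Monday-to-Sunday weeks.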

Last updated March 23, 2021

CDC NCHS Provisional Death Counts for Coronavirus Disease (COVID-19): Distribution of Deaths by Race and Hispanic Origin

About

Description

The CDC’s National Center for Health Statistics regularly publishes various datasets about COVID-19 deaths based on death certificate data submitted to the National Vital Statistics System. Because death certificates take several weeks to be received and entered into the NVSS system, this death data significantly lags other sources of reported death counts from COVID-19. Death certificates, however, contain a great deal of information about the person who died, so these datasets are particularly useful for demographic research on factors such as age, race/ethnicity, and geographic location. Death certificates can contain errors and omissions that NCHS takes time to correct, and the data in these files has not yet been fully investigated, so NCHS emphasizes that this data is to be considered “provisional” and “ad hoc.” A full list of these datasets about deaths from COVID-19 is available at https://www.cdc.gov/nchs/covid19/covid-19-mortality-data-files.htm.

This particular dataset concerning the distribution of COVID-19 deaths by race and Hispanic origin gives aggregate proportions by state and for the US of all those who have died since January 1, 2020 according to the ethnicity and race information on that person’s death certificate. It is updated weekly, and the dataset also gives the “as of” date so that a user can judge the recency of the information. Proportions are given both as unweighted (raw) percentage of all COVID-19 deaths and as a weighted distribution of population.

Last updated March 23, 2021

Long-term care

CMS COVID-19 Nursing Home Dataset

About

Description

This very large dataset includes facility-level data for Skilled Nursing Facilities that report COVID-19 information to the CDC’s National Healthcare Safety Network (NHSN). It does not include data before May 17, 2020, and it does not include data from state-regulated assisted living facilities and other resident care homes. Case data (both confirmed and suspected cases) and death data for both residents and staff are included, as is the total number of residents in each facility. Outbreaks are not reported by that term, but fields such as “Three or More Confirmed Cases of COVID-19 This Week” serve the same purpose. Charts and topline figures from this data appear on the CMS COVID-19 Nursing Home Data page.

This data can be difficult to work with, in part because of the sheer size of the dataset: it currently has nearly 600,000 rows and grows weekly. However, data definitions for all metrics are much more standardized in the CMS nursing home data than in the state aggregate data compiled by CTP. Please read our analyses of federal long-term-care data to be sure you understand these differences.

Last updated March 8, 2021

Vaccine metadata

CDC Federal Pharmacy Partnership for Long-Term Care (LTC) Program

About

Description

The CDC’s COVID Data Tracker gives a daily snapshot of COVID-19 vaccinations in US long-term-care facilities by state/territory. Metrics include total doses administered in long-term-care facilities, people in long-term-care facilities vaccinated who have received one or more doses, and people in long-term-care facilities vaccinated who have received two or more doses. The data does not differentiate between residents and staff of long-term-care facilities. The underlying data is available for download on the COVID Data Tracker page, but it is not a timeseries: it includes only the most recent totals by state.

The Federal Pharmacy Partnership for Long-Term Care (LTC) Program is a public/private partnership between the federal government and commercial pharmacies, notably Walgreens and CVS, with the purpose of distributing and administering vaccinations to one of the most at-risk populations in the US for death from COVID-19. West Virginia and most US territories chose not to participate in the program and so are not included in the data; Puerto Rico is a participant and is included. Facility-level data about vaccine administration in long-term-care facilities is stored in the Tiberius system but is generally not public. As far as we can tell, only South Carolina currently publishes facility-level data from Tiberius about vaccine administration in long-term-care settings. Walgreens and CVS also publish data from this program about vaccine administration.

Last updated March 12, 2021

CDC COVID-19 Vaccinations in the United States

About

Description

The CDC’s COVID Data Tracker gives a daily snapshot of COVID-19 vaccinations in the US by state. Metrics include doses delivered, doses administered, people vaccinated who have received one or more doses, and people vaccinated who have received two or more doses. Currently, only people who are at least eighteen years old are included. The underlying data is available for download on the page, but it is not a timeseries: it includes only the most recent totals by state.

Timeseries datasets for vaccine allocation by manufacturer are also available on Data.CDC.gov. The Pfizer dataset begins December 14, 2020, the Moderna dataset begins December 21, 2020, and the Janssen dataset begins March 1, 2021. These three datasets are updated weekly and provide information about both first and second dose allocations for each state/territory.

Last updated March 11, 2021

Federal trackers

HHS Protect Public Data Hub

HHS Protect is a system built in the early months of the pandemic on existing commercial technology for the purpose of collecting, standardizing, and sharing COVID-19 response data, primarily but not exclusively from hospitals. The HHS Protect Public Data Hub has hospital data, national PCR testing data, and data about the distribution of monoclonal antibody cocktails and other therapeutic treatments. Downloadable data is available from the site, and complete datasets are also available on HealthData.gov.

HHS COVID-19 Community Profile Reports

The COVID-19 Community Profile Report is produced daily (except Saturdays, as of March 6, 2021) by multiple federal agencies and offices to share key but comprehensive topline metrics from the previous seven days in color-coded form. It is available as both a PDF and a multi-tabbed spreadsheet. Metrics include case, death, test positivity, and hospital admission trends for the country, for each census region, and for each state, as well as highlights of counties that show concerning trends.

HHS COVID-19 State Profile Reports

The COVID-19 State Profile Reports are produced weekly for each state and territory, with a format and metrics similar to those of the daily Community Profile Report. Metrics include case, death, test positivity, and hospital admission trends for the state as well as highlights of the top counties that show concerning trends. No underlying data is available. Each report is meant to be read by that state’s governor to help in decision making.

CDC COVID Data Tracker Weekly Review

The COVID Data Tracker Weekly Review is a weekly “interpretive summary” of the current status of the pandemic.

The CDC COVID Data Tracker

The CDC COVID Data Tracker provides detailed information at the county, state, and national level of metrics such as COVID-19 cases and deaths, testing, hospitalizations, demographic trends, vaccinations, nursing home cases and deaths, and more. Downloadable data is available from the site, and complete datasets are also available on Data.CDC.gov.

Federal data portals

Data.CMS.gov

Data portal for the Centers for Medicare & Medicaid Services (CMS).

Data.CMS.gov/beta

New version of Data.CMS.gov in development.

Data.CDC.gov

Data portal for the Centers for Disease Control and Prevention (CDC).

HealthData.gov

Department of Health and Human Services (HHS) agency data, including CDC and CMS data, as well as state and local health data.

Data.gov

Federal, state, local, and tribal government data of all kinds.