Have a question about our data? Ask us. Our small team of mostly volunteers can’t always reply to all questions, but we will do our best to get back to you and/or post answers to frequently asked questions on this page.
Almost all of the data we compile is taken directly from the websites of state/territory public health authorities. See our data sources page or flip to the “States” tab of our public spreadsheet, which includes constantly updated annotations about data-source changes.
We update the dataset by hand once a day and release the data between 5pm and 6pm Eastern Time. If a state updates its data after our daily compilation, we won’t pick up the new information until the next day. Additionally, in some cases, we retrieve data from sources that states have hidden from their public dashboards. See the notes associated with each state and territory for more information about specific states’ practices.
Will you still be able to report hospital data given the new HHS reporting requirements of July 15, 2020?
We are following this issue closely. Hospitals are often required to report to states as well as to federal agencies, and we compile data from state public health agencies, so in most cases we believe we will still be able to compile current and cumulative hospitalization, ICU, and ventilator data. However, the state sources we rely on may be affected for states that get their hospitalization data from the CDC’s National Healthcare Safety Network.
As of July 17, three states (Idaho, Missouri, and Wyoming) have made statements about interruptions in their hospital data at the state level. The Missouri Hospital Association writes:
Please note, due to the abrupt change in data measures and the reporting platform issued by the White House on Monday, July 13, and effective Wednesday, July 15, MHA and the State of Missouri will be unable to access critical hospitalization data during the transition. While we are working to collect interim data, situational awareness will be limited. It is uncertain whether we will be able to produce all data included in this regional dashboard on Wednesday, July 22. We will resume producing the daily hospitalization snapshot and weekly regional dashboards as soon as data feeds are fully restored.
And a spokesperson for Idaho’s Department of Health and Welfare told the Idaho Statesman that the HHS directive “was issued abruptly and presents some significant challenges for Idaho to continue to monitor the number of hospitalizations in the state.” Idaho appears not to have updated its hospital data since July 16, when it posted data from July 13 (a standard delay for Idaho’s hospital information).
Wyoming has appended a note to its hospitalization data: “As part of the transition to a new HHS data system, we are experiencing data quality / reporting problems with this indicator and are working with hospitals to resolve.”
We anticipate seeing changes in the data throughout July as hospitals adjust to the new rules.
Why doesn’t your data match the data from the CDC, or Worldometer, or Johns Hopkins, or The New York Times, or another site?
There are several reasons our data might diverge from those of other trackers:
- Manual capture vs. automatic capture. Our volunteers manually update our numbers by visiting state/territory public health websites once a day, annotating any changes to data sources or data anomalies as they go. Our volunteers are often retrieving data from sources such as PDFs and livestreams of press conferences that automated tools have not been engineered to capture.
- Time lag. When other trackers rely on automated tools to collect data from state/territory public health authorities, their counts tend to be updated more frequently than ours. We currently spend about three hours every afternoon collecting data, and we publish it once each day between 5pm and 6pm Eastern.
- Data sources other than states/territories. Other trackers retrieve data directly from sources other than the state/territory public health authorities we use for our dataset.
  - For example, as explained on pp. 17-20 of our “Assessment of the CDC’s New COVID-19 Data Reporting,” some commercial laboratories and large hospital laboratories are reporting data directly to the federal government and might not be reporting all data, especially negative test results, to the states/territories.
  - Other trackers, including Johns Hopkins and The New York Times, rely on county data rather than state data. While in theory counties do report their data to the state, in practice the sum of county data points often differs from the totals the state reports, probably because the state normalizes county data to its own standards.
- Different data definitions. States/territories define data points in inconsistent ways, and the various trackers deal with those inconsistent definitions differently. For example, “deaths” is treated very differently by various states and trackers, especially when it comes to “probable deaths,” which are not reported by all states or trackers.
  - The state of New York, for instance, has not been reporting “probable deaths” from COVID-19, whereas New York City reports thousands of these deaths. Worldometer includes the NYC probables in its death counts, whereas The COVID Tracking Project does not. (Johns Hopkins also does not include the NYC probable deaths on its US map but does on its Global map.) When the state of New York includes these probable deaths in its reporting, we will include them in ours.
We do not currently have plans to collect data at the county level, both because we do not have the resources to do so manually and because both Johns Hopkins and The New York Times are collecting county-level data for cases and deaths.
There are very strong day-of-the-week effects in this dataset, largely because testing and reporting activity slows down on weekends. Health care staff and public health officials tend to “catch up” with their data reporting on Mondays and Tuesdays, causing spikes in the numbers.
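One common way to read past this weekly rhythm is a trailing 7-day average, since every full window then contains each weekday exactly once. The sketch below is only an illustration: the function name `trailing_average` and the daily counts are made up for this example and are not taken from our dataset or our tooling.

```python
from statistics import mean


def trailing_average(values, window=7):
    """Return the trailing mean over `window` days.

    Positions that do not yet have a full window behind them are None.
    """
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)
        else:
            smoothed.append(mean(values[i + 1 - window : i + 1]))
    return smoothed


# Hypothetical daily counts for two weeks, Monday through Sunday:
# a catch-up spike early in the week and a dip on the weekend.
daily = [120, 130, 100, 100, 100, 40, 40,
         120, 130, 100, 100, 100, 40, 40]

smoothed = trailing_average(daily)
# The first six entries are None; from the seventh day on, every window
# sums to 630, so each smoothed value is 90.
```

In this made-up series the underlying level never changes, and the average reflects that by staying flat even though the raw daily numbers swing between 40 and 130.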
We do hope to start collecting demographic data on patient sex in the future, although not every state reports it for every kind of information we track.
Unfortunately, age is a much more complicated problem, because the states group ages in incompatible ranges: one state might report ages 29-39 as a group, while another reports 25-35, and a third reports 30-45.
Because of this non-standardized reporting, age data is almost impossible to provide as a national set of metrics. As a non-profit, non-governmental volunteer project, we can politely ask state public health authorities to standardize their data, but unlike the federal government, we have no authority to compel anyone to publish anything, so when the data is too inconsistent to use, we’re stuck.
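To make the age problem concrete, the sketch below checks whether a set of states share any bracket boundaries, because counts can only be summed across states for groups that begin and end at ages where every state starts a new bracket. The state labels and the helper `shared_cut_points` are hypothetical; the three brackets are the ones from the example above.

```python
def shared_cut_points(state_brackets):
    """Return the ages at which every state starts a new bracket.

    `state_brackets` maps a state label to a list of inclusive
    (low, high) age brackets.
    """
    cut_sets = []
    for brackets in state_brackets.values():
        cuts = set()
        for low, high in brackets:
            cuts.add(low)       # a bracket starts here
            cuts.add(high + 1)  # the next bracket would start here
        cut_sets.append(cuts)
    return sorted(set.intersection(*cut_sets))


# Hypothetical states, each reporting one of the brackets mentioned above.
example_brackets = {
    "State A": [(29, 39)],
    "State B": [(25, 35)],
    "State C": [(30, 45)],
}

common = shared_cut_points(example_brackets)
# common is [] here: the three states share no boundary at all.
```

An empty result means no national age group can be assembled without splitting at least one state's bracket, which would require data the states do not publish.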
We are currently capturing data about both antibody tests and PCR tests, but to the best of our knowledge we are only publishing data about PCR tests. Some states do not clarify what kind of tests they are reporting. If states are silently including antibody test data in their overall test data, our dataset will also include it. Once we are sure that a majority of states are reliably reporting antibody test data, we will publish national- and state-level data about antibody testing.