Table of contents
- Questions about the end of The COVID Tracking Project’s data collection
- When will The COVID Tracking Project stop collecting data?
- Why aren’t you continuing to collect data indefinitely?
- What will happen to your API/data after March 7?
- Where can I go after March 7 to get the same data you have been publishing?
- Questions about the dataset
- Where do you get your data?
- Why are there so many spikes in the data?
- Why doesn’t your data match what I see on the official COVID-19 page for my state?
- Why doesn’t your data match the data from the CDC, or Worldometer, or Johns Hopkins, or USAFacts.org, or The New York Times, or another site?
- Why are your “new” cases, tests, or deaths counts for my state different from the “new” counts my state reports?
- Why do you sometimes report the same value for multiple days?
- How does the data deal with people who have had multiple tests for COVID-19?
- What are test encounters?
- How do you decide whether to record people who test positive on antigen tests as probable cases?
- Why do you list 56 states?
- Questions about long-term-care data
- Do the death totals in your long-term-care data tracker include deaths that occur in hospitals?
- Where does your data about vaccinations in Long-Term Care facilities come from?
- Questions about charts
- What states are included in the regions you display on your charts? What population figures do you use for per capita charts?
- Why does one hospitalization chart indicate that cases are increasing in a certain state when another hospitalization chart indicates the opposite?
- Questions about changes to the website or data
- Why are you removing values from the API field negative from various states starting on January 27, 2021?
- Why have you stopped reporting data about the number of people who have recovered from COVID-19?
- Why have you started including HHS hospitalization data on your data pages?
- Why have you stopped reporting national recoveries?
- Why have you stopped reporting national cumulative hospitalizations, ICU, and ventilation numbers on your website?
- Where has the “spreadsheet” option gone on the data page?
- Why don’t you report historical data on the state pages any more?
- Why did your national “total test results” numbers change on September 17?
- Why have your “Total test results” numbers changed for a particular state?
- Questions about information we don’t report
- Why don’t you report test positivity rates?
- Where can I find R(t) values?
- Are you planning to track vaccination data?
- Why don’t you report county-level data? Will you be doing so in the future?
- Why aren’t you tracking age and sex?
- Can you report COVID-19 data related to schools and/or colleges?
- Other questions
- Why don’t you harvest data automatically?
Questions about the end of The COVID Tracking Project’s data collection
When will The COVID Tracking Project stop collecting data?
We will conclude daily data compilation on March 7, 2021. We will continue with documentation, analysis, and archiving work through May 2021, and then close the project.
Why aren’t you continuing to collect data indefinitely?
The short answer is that—as we have said from the project’s inception—we believe that COVID-19 data provisioning is the responsibility of federal public health agencies. Now, it appears that the federal government is taking steps to provide COVID-19 data that is largely comparable to the data we compile from state and territory sources. We’ve written a post to explain further.
What will happen to your API/data after March 7?
We are archiving all our data and our website for long-term access. Our API and data downloads will be available in their current locations for historical data until at least May 1, 2021. We are committed to further long-term archiving beyond that date and will continue to provide more information in the coming weeks.
Where can I go after March 7 to get the same data you have been publishing?
We are publishing many resources to help users of COVID Tracking Project data switch to federal data sources. See especially the following:
- All Datasets: A list of CTP datasets with comparable federal data listed where available.
- Federal COVID Data 101: An article that links to all material in our Federal Data 101 training series: videos, slides, and articles on how to use federal COVID-19 data on cases and deaths, hospitalizations, race/ethnicity, and nursing homes.
- Federal Resources: A comprehensive list of the federal datasets that contain the metrics CTP tracked.
- Where to Find Simple COVID-19 Data for the US: An article about where to find federal charts, graphs, visualizations, reports, and summaries of COVID data.
- Federal COVID Data in a Single Stream: An article with instructions and code for recreating CTP’s daily aggregate of testing, hospitalizations, and outcomes data from federal datasets.
Additionally, The New York Times and Johns Hopkins trackers both compile case and death data from US jurisdictions, as does USA Facts. APM Research Lab’s Color of Coronavirus project offers demographic data on COVID-19 deaths compiled from US states.
Throughout February, March, April, and May, our researchers will continue to publish analyses of federal and state COVID-19 data, including close looks at the gaps between CTP data and federal equivalents.
Questions about the dataset
Where do you get your data?
Almost all of the data we compile is taken directly from the websites of state/territory public health authorities. See our data sources page for more information, or go to any individual state’s data page, such as Alabama’s data page, to see “Where this data comes from” at the top of the page. To see where any specific data point comes from, refer to our public spreadsheet of source notes. Read more about our data sources in our article “How We Source Our Data and Why It Matters.”
Why are there so many spikes in the data?
There are very strong day-of-the-week effects in this dataset. Testing and reporting activity slows down on weekends, and health care staff and public health officials tend to “catch up” with their data reporting on Mondays and Tuesdays, causing spikes in the numbers later in the week. Some states report some metrics only once per week, which causes a spike on the day of the week they report that metric. Holidays can create backlogs of data, causing large apparent decreases followed by large apparent increases: read more about that here.
There are also other occasions when a lab or a county “dumps” a great deal of data all at once on a particular day, which makes the state’s numbers for that day unusually large. We try to report all such unusual data spikes in the public notes on each state’s data page and on our Twitter feed.
In general, any date in our data should be understood to be defined as “the date on which data was collected by The COVID Tracking Project,” which is generally the date the state reported the data point to the public in its cumulative totals. We recommend analyzing our data with 7-day or 14-day averages instead of with single day values to help mitigate the effect of these reporting spikes.
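As a minimal sketch of the smoothing recommended above, a trailing 7-day average can be computed from daily values as follows. The function name and the sample figures are illustrative assumptions, not CTP code or CTP data.

```python
from collections import deque

def trailing_average(daily_values, window=7):
    """Return the trailing mean for each day once a full window is available.

    Days before the first full window yield None, so the output lines up
    index-for-index with the input.
    """
    recent = deque(maxlen=window)
    running_sum = 0
    averages = []
    for value in daily_values:
        if len(recent) == window:
            running_sum -= recent[0]  # value about to fall out of the window
        recent.append(value)
        running_sum += value
        averages.append(running_sum / window if len(recent) == window else None)
    return averages

# Hypothetical daily new-case counts: a weekend dip followed by a catch-up spike.
daily_new_cases = [100, 100, 100, 100, 100, 40, 30, 230, 100, 100]
smoothed = trailing_average(daily_new_cases)
```

In this made-up series, the single-day value jumps from 30 to 230 after the weekend, while the 7-day average moves only from about 81 to 100.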
Why doesn’t your data match what I see on the official COVID-19 page for my state?
There are several reasons why our tracker might show different data than your state’s COVID-19 page, even when we use that same page as a source:
- Date lag. We update the dataset by hand once a day and release the data between about 5:30pm and 7pm Eastern Time. If a state updates its data after our daily compilation, we won’t pick up the new information until the next day.
- Hidden data. In some cases, we retrieve data that states do not display on their public dashboards from data files that the state provides. This data is still public and still official, but might only be visible “behind the scenes” of a data dashboard or in an obscure corner of the state’s COVID-19 site. Missouri, for example, does not display the value for Total PCR tests (in specimens) on its dashboard, but the data is there, though not displayed, and we retrieve it with a machine query.
- Different data definitions. In the absence of national data standards, we might use the same name for a metric as your state but use a different definition. For instance, our case, death, and hospitalization metrics all include “probable” and “suspected” cases for states that report them, whereas your state might include only lab-confirmed cases in its official case count while reporting probable cases separately.
- Different ways of reporting “new” cases, tests, or deaths. Please note that our “new” values for cases, tests, deaths and other metrics are calculated as the increase in the total cumulative value reported by the state since yesterday. States themselves, however, frequently define “new” cases, tests, and deaths differently. See our FAQ on “new” data points below for more information.
- Backfilled / backdated data. As explained above, we report data once each day on the date the state adds that data to its systems, whereas states themselves frequently “backfill” data, meaning that they enter data for previous days. By doing this, states can connect data points to pertinent dates such as the date a death occurred or the date a laboratory completed its analysis of a test. For instance, Florida’s state report includes a graph titled “COVID-19: cases and laboratory testing over time” whose numbers by date change frequently: on 9/9/2020 the graph reported 2352 cases for 9/8/2020 and on 9/10/2020 the graph reported 2337 cases for 9/8/2020. Similarly, Rhode Island continually revises its historic values for “Cumulative people who tested positive” as it receives more results from laboratories, so our time series falls out of sync with the state’s time series. We do sometimes backfill our own historic data when states provide us with a time series for a metric in a structured format. This work is tracked in one of our GitHub repositories.
See the notes associated with each state and territory for more information about the data for that state.
Why doesn’t your data match the data from the CDC, or Worldometer, or Johns Hopkins, or USAFacts.org, or The New York Times, or another site?
There are several reasons why different data trackers show different data:
- Manual capture vs. automatic capture. Our volunteers manually update our numbers by visiting state/territory public health websites once a day, annotating any changes to data sources or data anomalies as they go. Our volunteers are often retrieving data from sources such as PDFs and livestreams of press conferences that automated tools have not been engineered to capture.
- Time lag. When other trackers rely on automated tools to collect data from state/territory public health authorities, their counts tend to be updated more frequently than ours. We currently spend about three hours every afternoon collecting data, and we publish it only once each day.
- Data sources other than states/territories. Other trackers retrieve data directly from sources other than the state/territory public health authorities we use for our dataset.
Many other trackers, including Johns Hopkins, USAFacts.org, and The New York Times, rely on county data rather than state data. While counties do report their data to the state, in practice the sum total of county data points can often differ from the totals the state reports, probably because the state normalizes county data to its own standards.
The CDC has direct access to other sources of data in addition to state public health authorities. For instance, as of August 25, 2020, the CDC reports on its COVID-19 testing tracker that “The data for each state are sourced from either data submitted directly by the state health department via COVID-19 electronic laboratory reporting (CELR), or a combination of commercial, public health, and in-house hospital laboratories.”
- Different data definitions. States/territories define data points in inconsistent ways, and the various trackers deal with those inconsistent definitions differently. For example, “deaths” is treated very differently by various states and trackers, especially when it comes to “probable deaths,” which are not reported by all states or trackers.
The state of New York, for instance, does not report a significant number of deaths from COVID-19 reported by the city of New York, including thousands of probable deaths. Worldometer includes the NYC probables in its death counts, whereas The COVID Tracking Project does not. (Johns Hopkins also does not include the NYC probable deaths on its US map but does on its Global map.) When the state of New York includes all deaths reported by the city of New York in its reporting, we will include them in ours.
Why are your “new” cases, tests, or deaths counts for my state different from the “new” counts my state reports?
Our “new” values for cases, tests, deaths and other metrics are calculated as the increase in the cumulative total number of cases, tests, or deaths reported by the state since yesterday. This way of calculating “new” data points is a function of how we collect data. For the most part, we enter data manually once each day by visiting the state’s official COVID-19 data sites, and we capture, record, and report the cumulative totals reported by the state to the public on that day.
States themselves, however, frequently enter data into their systems for previous dates. Cases might be recorded by the state with dates such as “date of symptom onset,” which is not usually the same as the first date the state reports that case to the public. Tests might be recorded by the state with dates such as “specimen collection date,” which might not be the same as the first date the state reported that test to the public. Deaths might be recorded by the state with dates such as “date of death,” which is not usually the same as the first date the state reports that death to the public.
On a Friday, for example, a state might enter five tests into its public reporting system, one whose specimen was collected on Wednesday, two whose specimens were collected on Thursday, and two whose specimens were collected on Friday. In that example, the state might report “2 new tests” for that Friday and we might report “5 new tests” for that Friday. We recommend analyzing our data with 7-day or 14-day averages instead of with single day values to help mitigate the effect of these reporting spikes.
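The Friday example above can be sketched in a few lines. This is only an illustration under assumed numbers: the cumulative totals, the list of collection days, and the function name are all hypothetical.

```python
def daily_increases(cumulative_totals):
    """Return day-over-day increases; the first day has no prior value."""
    if not cumulative_totals:
        return []
    increases = [None]
    for previous, current in zip(cumulative_totals, cumulative_totals[1:]):
        increases.append(current - previous)
    return increases

# Hypothetical cumulative test totals as published on Thursday and Friday.
published_totals = [1000, 1005]
ctp_style_new = daily_increases(published_totals)[-1]

# The same five tests, keyed by the day each specimen was collected.
collection_days = ["Wed", "Thu", "Thu", "Fri", "Fri"]
state_style_new_friday = collection_days.count("Fri")
```

Here the difference in cumulative totals gives 5 new tests for Friday, while counting by specimen collection date gives 2. Both describe the same five tests.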
Why do you sometimes report the same value for multiple days?
Not all states and territories report all data points daily. When state data reporting is interrupted by holidays or by technical issues, we also keep the most recent available value in our dataset and note the anomaly on that state’s notes page. For example, Arizona’s Confirmed cases and Probable cases values remained the same from July 18 through August 5, 2020 due to a problem with their data dashboard explained on Arizona’s notes page.
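Keeping the most recent available value is sometimes called carrying a value forward. A minimal sketch, assuming missing reports are represented as None (a representation chosen here for illustration, not one the dataset is known to use):

```python
def carry_forward(reported_values):
    """Replace missing reports (None) with the most recent available value."""
    filled = []
    last_seen = None
    for value in reported_values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled

# Hypothetical cumulative counts with two days of interrupted reporting.
reported = [150, 152, None, None, 160]
filled = carry_forward(reported)
```

One consequence worth noting: a day-over-day difference computed on the filled series is zero during the interruption and then shows the whole backlog on the day reporting resumes.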
How does the data deal with people who have had multiple tests for COVID-19?
COVID-19 “cases” refer to individual people, and even if a person tests positive for COVID-19 more than once, that person should in general only be counted once in the case counts. The same is true for death data, recovery data, and hospitalization data: those values should (barring mistakes) represent unique individuals.
Testing figures, however, might or might not include multiple tests administered to the same person: it depends on how the state “deduplicates” that data, meaning how it identifies and removes (or chooses not to remove) redundant / repeated information. It also depends to an extent on what units the state reports COVID-19 tests in. Some states report test results in units of “specimens tested,” some states report test results in units of “people tested,” some states report in units of “testing encounters” (meaning the number of times one person was tested), and some states report test results in more than one of these ways. On our data page and on each individual state’s data page, we list all three of these ways a state might be reporting the total number of tests conducted in its jurisdiction.
We have written about these issues in depth in our articles “Test Positivity in the US is a Mess” and “Counting COVID-19 Tests: How States Do It, How We Do It, and What’s Changing.” The best source of information about your own county or state’s method of deduplicating and reporting tests is your own county or state public health department.
What are test encounters?
“Test encounters” or “testing encounters” measures the number of people who have been tested in a single day. Though the phrase is probably unfamiliar, its definition just describes the way we talk about how many times people have been “tested for COVID-19” in everyday life. If a person was tested once every week for a month, she would likely say she had been tested four times. Those four occasions on which she was tested are four “testing encounters.” For more information, see our full data definition for testing encounters and our blog post on testing encounters.
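To make the three test units concrete, here is a small sketch that counts the same hypothetical test records three ways. The record layout is invented for illustration, and counting an encounter as one person on one day is our reading of the definition above, not a statement of how any particular state deduplicates.

```python
# Each hypothetical record: (person_id, test_date, specimen_id)
test_records = [
    ("A", "2020-09-01", "s1"),
    ("A", "2020-09-01", "s2"),  # second specimen, same person, same day
    ("A", "2020-09-08", "s3"),  # same person tested again a week later
    ("B", "2020-09-01", "s4"),
]

# Specimens tested: every distinct specimen counts.
specimens_tested = len({specimen for _, _, specimen in test_records})

# People tested: each person counts once, however often they are tested.
people_tested = len({person for person, _, _ in test_records})

# Test encounters: each person counts once per day on which they were tested.
test_encounters = len({(person, date) for person, date, _ in test_records})
```

For these four records the totals are 4 specimens, 2 people, and 3 test encounters, which is why totals reported in different units cannot be compared directly.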
How do you decide whether to record people who test positive on antigen tests as probable cases?
The most recent Council of State and Territorial Epidemiologists COVID-19 case definition designates an antigen test as one possible way to identify a probable case of COVID-19. Although there are other ways to identify probable cases, many states and territories use only antigen tests to identify probable cases. However, even when probable case counts include only individuals testing positive via antigen, they are different from a count of antigen positive individuals.
That’s because probable case counts are revised down if those cases are confirmed via a PCR test, a common occurrence. A count of antigen positive individuals tested, by contrast, never gets revised down. It is just a tally of everyone with a positive antigen test result. Both numbers provide valuable information but for different purposes. We capture probable cases in our API field probableCases and antigen positive individuals in positiveTestsPeopleAntigen. (And when states release the number of positive antigen tests rather than individuals testing positive, we store that number in positiveTestsAntigen.)
States are not always crystal clear about which of these two categories they are using. We make the decision about which category states’ figures fall into based on a combination of information on health department websites and outreach to state health departments. Where we are unable to determine whether a metric is counting probable cases or antigen positive individuals, we store their numbers in positiveTestsPeopleAntigen.
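The difference between a probable case count and a tally of antigen positive individuals can be sketched with sets. This assumes a state that identifies probable cases only through antigen tests; the person identifiers and variable names are hypothetical.

```python
# Hypothetical people with a positive antigen test result.
antigen_positive_people = {"p1", "p2", "p3"}

# One of them is later confirmed by a PCR test and becomes a confirmed case.
later_confirmed_by_pcr = {"p2"}

# A tally of antigen positive individuals is never revised down.
antigen_positive_tally = len(antigen_positive_people)

# A probable case count drops the people who have since been confirmed.
probable_case_count = len(antigen_positive_people - later_confirmed_by_pcr)
```

In this sketch the antigen positive tally stays at 3 while the probable case count falls to 2, so the two figures diverge even though they start from the same test results.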
Why do you list 56 states?
We track data for the District of Columbia as well as for US territories including American Samoa, Guam, the Mariana Islands, Puerto Rico, and the US Virgin Islands. We try to say “states and territories” everywhere that it’s appropriate, but sometimes we might use the short term “states” when we mean “states, territories, and the District of Columbia.”
Questions about long-term-care data
Do the death totals in your long-term-care data tracker include deaths that occur in hospitals?
To the best of our knowledge, most state data about deaths in nursing homes, assisted living facilities, and other long-term care facilities includes deaths that occur both on- and off-premises. Until recently, nursing home residents in New York who died from COVID-19 in hospitals “were not reflected in DOH’s published total nursing home death data” according to the Attorney General of New York. California notes that “COVID-19 fatalities in the dashboard below include deaths that were reported by facilities to the best of their knowledge. These include deaths that occurred at nursing homes and those that occurred in other locations, such as a hospital or private home, if the death occurred within the 14-day bed hold period after the resident transferred from the SNF. Deaths that occurred outside of this 14-day period may not be captured.” We are not aware of any other caveats about state tallies of long-term-care deaths.
Where does your data about vaccinations in Long-Term Care facilities come from?
On January 27, 2021, we began displaying data about vaccinations in long-term care facilities on our data page. This federal data provided by the CDC includes only vaccinations administered to residents and staff of nursing homes and assisted living facilities through the Pharmacy Partnership for Long-Term Care Program, a federal partnership with CVS, Walgreens, and selected pharmacies in the Managed Health Associates network. The data is updated daily.
Questions about charts
What states are included in the regions you display on your charts? What population figures do you use for per capita charts?
Regions displayed in our charts are defined by the US Census. Population estimates used in per capita charts are also from the US Census: we use the American Community Survey 5-Year Estimates for 2019.
Why does one hospitalization chart indicate that cases are increasing in a certain state when another hospitalization chart indicates the opposite?
The CHANGE IN CURRENTLY HOSPITALIZED: TODAY VS PREVIOUS WEEK chart shows the value for today compared to a single day a week ago, so only two (raw) values are involved in the comparison that determines the change. The KEY METRICS BY STATE hospitalization chart shows the 7-day rolling average represented with a line, so the change is measured with the average of 7 values. (The Key Metrics hospitalization chart also shows single-day values represented by bars.) The two charts are displaying different metrics.
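A short sketch shows how the two comparisons can point in opposite directions. The figures are hypothetical, and comparing the latest 7-day average with the average of the preceding 7 days is an assumption about how a change in the rolling average might be read, not a description of the chart's code.

```python
from statistics import mean

# Hypothetical "currently hospitalized" values for 14 consecutive days.
# The day exactly one week before today (index 6) happens to be unusually high.
hospitalized = [100] * 6 + [120] + [110] * 6 + [105]

# Comparison 1: today's raw value against the single raw value a week ago.
raw_change = hospitalized[-1] - hospitalized[-8]

# Comparison 2: the latest 7-day average against the previous 7-day average.
average_change = mean(hospitalized[-7:]) - mean(hospitalized[-14:-7])
```

With these numbers the raw comparison is negative (105 against 120) while the comparison of averages is positive (about 109.3 against about 102.9).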
Questions about changes to the website or data
Why are you removing values from the API field negative from various states starting on January 27, 2021?
As part of our larger project of moving to reporting explicit totals for all states, we are also removing negatives that were created from mixed units (specimens minus cases or test encounters minus cases) for states that are using explicit totals in our main total test results field, called totalTestResults in the API. (Check out the above FAQ entry and blog post for more information about changes in our totalTestResults.)
Before these states provided full histories of explicit totals, we were using positive plus negative (following early reporting practices of many states) to produce total test counts in order to get a full time-series. When states stopped reporting negatives directly, we computed them by subtracting the cases from the totals, so that positive+negative would equal the new explicitly reported values. In some cases, this led to mixing units in the negative field. Now that these states have provided full histories of their total tests, we have switched them away from positive plus negative for total test results and can remove these mixed unit values.
We are starting with AK, CA, DC, GA, KY, NY, OH, OR, TX, VA and WA on January 27, 2021, and we will continue to remove any negatives mixing units as we switch states over to explicit total test figures.
Why have you stopped reporting data about the number of people who have recovered from COVID-19?
On January 13, 2021, we removed all state-level “recovered” metrics from our website and shifted to reporting only “hospital discharges” for the eight states that report it. After a comprehensive review of the 56 states and territories we track, we determined that most “recovered” metrics are estimates rather than precise figures, that no two states define and calculate “recovered” in the same way, and that “recovered” metrics are often based on guidelines about whether a person is infectious rather than follow-up investigation with the patient to confirm that they have returned to health. Read more about this decision in our article on the difficulty of counting recoveries.
Why have you started including HHS hospitalization data on your data pages?
We have been keeping a close eye on HHS hospitalization data throughout the pandemic. We believe that comparing the data we collect from states and territories to the federal data the HHS collects from hospitals and states should increase public faith in both datasets, since the values these datasets report are very similar when different data definitions and reporting times are taken into account.
To facilitate this comparison, on December 8th, 2020, we introduced a new “card” on our national and state data pages that includes figures for Now hospitalized (confirmed + suspected), Now hospitalized (confirmed only), and Now in ICU (confirmed + suspected) from the HHS dataset “COVID-19 Reported Patient Impact and Hospital Capacity by State.”
Note that the hospitalization data we collect from states and territories includes suspected cases and pediatric cases of COVID-19 when available, but in many instances the jurisdiction either does not report this information or does not make clear whether its hospitalization data includes suspected and pediatric cases. Both Now hospitalized HHS metrics include both adult and pediatric COVID-19 patients, while the Now in ICU HHS metric includes only adult COVID-19 patients.
The HHS hospital data on our site is updated when a new version of the HHS dataset is published. The data is not available in our API nor in our downloadable CSVs, but it is available in both forms from healthdata.gov along with other COVID-19 datasets. Read more about this data from HHS in our article “What We’ve Learned About the HHS’s Hospitalization Data.”
Why have you stopped reporting national recoveries?
Not all states and territories report the number of Recovered COVID-19 patients or the number of COVID-19 patients discharged from the hospital, and large states like Florida, California, and Washington are among those who do not report Recovered. Therefore, adding only the available state and territory figures together to get a national Recovered total results in a significant undercount of the true national number of people who have survived COVID-19.
Furthermore, Recovered is a particularly non-standardized metric at the state level, since there is no official definition of clinical recovery from COVID-19: the CDC only gives guidance about when COVID-19 patients are no longer infectious and can therefore be released from isolation. Some states use the CDC definition for COVID-19 patients released from isolation to estimate recoveries, while some states define recovery in their own way, often considering COVID-19 patients who have not died within a certain interval of time after infection as “recovered,” no matter what their state of health: these Recovered figures therefore include people who suffer medium-term or long-term disability caused by COVID-19. Some states do not give any indication how they define or calculate recoveries.
To avoid confusion, as of November 16, 2020 we have stopped displaying the national value for Recovered COVID-19 patients on our website, and the metric recovered will be deprecated in and then removed from the API in December. When a more standard definition of Recovered is adopted nationally by a critical mass of states, we will restore the figure.
Why have you stopped reporting national cumulative hospitalizations, ICU, and ventilation numbers on your website?
Only about two-thirds of states and territories report data for Cumulative hospitalized/Ever hospitalized, and even fewer states report data for Cumulative in ICU/Ever in ICU and Cumulative on ventilator/Ever on ventilator. Therefore, adding these state and territory figures together to get a national count (as we do for other COVID-19 metrics with complete reporting such as cases and tests) drastically undercounts the true cumulative national number of COVID-19 patients who have ever been hospitalized, admitted to the ICU, or placed on a ventilator.
This incomplete reporting can lead to a misleading national picture. For example, since more states report the number of people currently in the ICU or on a ventilator than report them cumulatively, the national numbers for individuals currently in the ICU or on a ventilator sometimes exceed the cumulative values.
To avoid confusion, as of November 16, 2020 we have therefore decided to stop displaying these national sums of cumulative hospitalization, ICU, and ventilated values on our website, although the fields remain available in our API. We will continue to ask states to report cumulative hospitalization figures and hope to restore the national sums to our website when a critical mass of states report them.
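The way incomplete reporting distorts a national sum can be illustrated with a small sketch. The state names, dictionary keys, and figures are all hypothetical; None stands for a metric a state does not report.

```python
# Hypothetical ICU figures: every state reports a current value,
# but only one state reports a cumulative value.
state_reports = {
    "State A": {"icu_current": 50, "icu_cumulative": 120},
    "State B": {"icu_current": 90, "icu_cumulative": None},
    "State C": {"icu_current": 70, "icu_cumulative": None},
}

def national_sum(reports, metric):
    """Sum a metric over the states that report it, skipping missing values."""
    return sum(r[metric] for r in reports.values() if r[metric] is not None)

national_current = national_sum(state_reports, "icu_current")
national_cumulative = national_sum(state_reports, "icu_cumulative")
```

Here the national "current" sum is 210 while the national "cumulative" sum is only 120, an impossible relationship for a single complete dataset that arises purely because two of the three states do not report the cumulative figure.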
Where has the “spreadsheet” option gone on the data page?
Since the COVID Tracking Project started in March, we have been collating and publishing our data in the form of a single Google Sheet. Our API and website both used that sheet to publish our entire core dataset. As our data collection effort has matured, however, we have built new tools to improve our publishing process. All of our API and website data are now based on an improved publishing system that no longer uses Google Sheets.
However, people have been using our public sheets to import our data in ways that were never intended. We only support pulling data through our API. Supporting users whose applications broke because we changed the public sheet has had a significant impact on our support teams.
We encourage anyone who is using the public sheet for importing data to switch to our API, or import the CSV files available from our download page. As of October 26, 2020, we have removed the “Spreadsheet” button on the data page. As of November 28, 2020, the sheet will be static and no longer get new rows or columns, and on December 24, 2020, it will be taken offline.
Why don’t you report historical data on the state pages any more?
On Thursday, September 10, 2020, we removed the table of historical data on our state data pages (for example, the data page for Alaska). This table included screenshots, new tests, cases, negative test results, pending test results, hospitalized, deaths, and total test results. We removed this table because reporting “Negative” and “Total” test results so simply was misleading, given the multiple ways that states report test results, and given our legacy practice of calculating a state’s total tests by adding its positive and negative test results.
In the original historical data table, the “Negative” test results and “Total” test results could sometimes refer to different data (people tested, specimens tested, or testing encounters). We did a great deal of work to make sure that we were reporting different test units accurately for our website redesign of August 25, and we have moved the old “Total” figure on the state’s data page to the state’s history page for the category of Viral (PCR) Tests, where it appears as “Total test results - legacy (positive + negative).”
We realize that the full history page was a convenient way to get a time series of data elements for a single state. As the datasets of individual states change and our knowledge of those datasets improves, however, it has become clear that COVID-19 test and outcome reporting is getting even more complex than it was to begin with. States (and we) have also begun reporting new kinds of tests. To present a complete and accurate historical time series for a state’s data on our website would require more columns than a single web page could comfortably contain and would ultimately be a disservice to our users.
All historical data for every metric for every state is still available, and we are in fact providing more historical data on the web than we did previously.
We encourage you to get a state’s historical data in any or all of the following ways:
- Use our full-history pages for each data category, viewable by clicking “Historical data” in any data category on a state page. The list of screenshots for each state’s data sources is also available from that state’s page.
- Download the full CSV data for a state to build your own charts or do your own analysis.
- Use our API if you have automated, daily tasks that need to process our data.
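As a rough illustration of the do-it-yourself route, the sketch below turns a cumulative column from a state's downloaded CSV into daily increases. The sample rows and the column names (date, state, totalTestResults) are assumptions made for this example, not a guaranteed description of the real file layout.

```python
import csv
import io

# Hypothetical sample in the shape of a per-state history CSV. The column
# names and values are assumptions for illustration only.
SAMPLE_CSV = """date,state,totalTestResults
20200903,AK,100
20200902,AK,90
20200901,AK,75
"""

def daily_increase(csv_text, column):
    """Return [(date, increase)] oldest first, computed from cumulative values."""
    rows = sorted(csv.DictReader(io.StringIO(csv_text)), key=lambda r: r["date"])
    result = []
    previous = None
    for row in rows:
        value = int(row[column])
        if previous is not None:
            # The day's increase is the change in the cumulative total.
            result.append((row["date"], value - previous))
        previous = value
    return result

print(daily_increase(SAMPLE_CSV, "totalTestResults"))
# [('20200902', 15), ('20200903', 10)]
```

With a real download, the sample string would be replaced by the contents of the file, and the column name by whichever metric you want to chart.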
Why did your national “total test results” numbers change on September 17?
Since August 13, we have been preparing the API field totalTestResults on a state-by-state level to prefer units of testing encounters and specimens over our legacy calculation of summing states’ positive and negative figures. (You can read more about the motivation behind this policy change in the question directly below or here.) As of September 17, we had been able to make that switch for four states: Colorado, Massachusetts, North Dakota, and Rhode Island. The current list of included states is available in the “API Changes” section of our total tests documentation page, and each state’s data page is annotated accordingly. Future changes will immediately affect the US totalTestResults API field.
However, we did not immediately change the national totalTestResults field of our API, which continued to use positive+negative until September 17. This changeover resulted in a cumulative increase of 2,136,206 US tests. These tests are distributed over the entire time series back to March, so the daily difference is smaller, comprising 57,400 additional tests (about an 8% increase) on September 17.
These upticks are expected when we prioritize counting total tests in units of specimens and test encounters, which include repeat testing, over positive+negative, which usually does not. For all four states we have switched, totalTestResults previously reflected unique people, which explains the large cumulative difference.
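A toy example may make the size of the gap easier to see. The records below are entirely hypothetical; the sketch only shows that counting specimens includes repeat tests, while counting unique people does not.

```python
# Hypothetical test records as (person_id, result); person "a" was tested twice.
records = [
    ("a", "negative"),
    ("a", "negative"),
    ("b", "positive"),
    ("c", "negative"),
]

def count_specimens(records):
    """Every processed test counts, including repeat tests of one person."""
    return len(records)

def count_unique_people(records):
    """Each person counts once, however many times they were tested."""
    return len({person_id for person_id, _ in records})

print(count_specimens(records))      # 4
print(count_unique_people(records))  # 3
```

The same underlying testing activity yields a larger total in specimens than in people, which is the direction of the change described above.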
Please do not use the posNeg field on the national level. It has been deprecated and zeroed out since we are switching away from using positive + negative to calculate totalTestResults.
Why have your “Total test results” numbers changed for a particular state?
As of August 13, 2020, we have made and will continue to make a number of changes to our state-level “total test results” metric to clarify it and to make it more useful for gauging state and national testing capacity.
Lacking federal data standards, states and territories have been reporting test results in different ways, using different units, and often with unclear definitions and documentation. Most commonly, states chose to report “total tests” either in units of “specimens” (e.g., number of nasal swabs processed by a laboratory, even if a single person provided more than one swab) or in units of “people” (individuals tested for COVID-19). In many cases states have not made clear exactly how they are counting “tests” at all.
Given the substantial lack of clarity and consistency in total test results definitions between states, The COVID Tracking Project created the “Total Test Results” field (totalTestResults in our API) to assemble a national number, and filled it by a simple principle: we took whatever we could get. In the early months of our work, since we preferred to report in units of “people” rather than specimens, this usually meant summing a state’s figures for individuals receiving positive and negative results, because even when states directly provided a figure for total tests, that figure was often unclear or in units of “specimens.”
While we are still far from having a national data standard on how to count tests, most states have clarified their definitions enough that we can start switching states from using calculated positive+negative totals to using explicitly reported “total tests” figures in our main totalTestResults API field and our Total Test Results figures on our website. To support this change, we are launching a new policy about which units of total tests we prioritize in that column and are making it more evident which ones we are using in each state. The current list of included states is available in the “API Changes” section of our total tests documentation page, and each state page is also annotated on its data page.
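The selection logic described here can be sketched roughly as follows. This is only an illustration: the function name, the dictionary keys, and the relative order of encounters and specimens are assumptions, and the text only establishes that explicitly reported totals are preferred over the legacy positive+negative sum.

```python
from typing import Optional

# Assumed priority order for explicitly reported totals. The text says these
# units are preferred over positive + negative; the order between the two
# is an assumption made for this sketch.
PREFERRED_UNITS = ("encounters", "specimens")

def total_test_results(reported: dict, positive: Optional[int], negative: Optional[int]):
    """Return (value, source) for a state's total tests figure."""
    for unit in PREFERRED_UNITS:
        value = reported.get(unit)
        if value is not None:
            return value, unit
    # Legacy fallback: sum positive and negative results when both exist.
    if positive is not None and negative is not None:
        return positive + negative, "positive+negative"
    return None, "unavailable"

print(total_test_results({"specimens": 1200}, 100, 900))  # (1200, 'specimens')
print(total_test_results({}, 100, 900))                   # (1000, 'positive+negative')
```

Returning the source alongside the value mirrors the goal of making it evident which unit is in use for each state.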
Because we are rolling out these modifications gradually, you will see some movement in state/territorial totals for the totalTestResults API field and the Total Test Results figures on each state page. We will keep you posted whenever we make a change to the way we count a state’s test figures. Each state and territory has its own page on our site, linked to from the main Our Data page, and our notes for these changes will appear on that page, on each state page, in our notes for the state, in our API and relevant CSVs, as well as in a forthcoming central repository for everything we know about state and territorial test units.
Read more about the context of these changes here.
Questions about information we don’t report
Why don’t you report test positivity rates?
Test positivity, also called the “percent positive rate” or “positivity rate,” can change dramatically depending on which total test metric is used as the denominator. Until every state reports its most basic COVID-19 data in the same way, direct test positivity comparisons across states remain an intractable problem: read more about these issues in our blog post on the subject. The COVID Tracking Project does not currently calculate test positivity rates and will not do so until we are confident in our ability to communicate precisely about these complex issues. We urge caution when relying on any (governmental or non-governmental) test positivity calculation that does not transparently and prominently address the question of inconsistencies across jurisdictions. We particularly and emphatically recommend against an over-reliance on test positivity calculations to justify changes in public health responses or policies.
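To see how much the denominator matters, consider a minimal sketch with made-up numbers. The figures are hypothetical and the function is not a calculation this project publishes; it only shows the same positive count producing different rates under different test units.

```python
def positivity(positives, total_tests):
    """Share of tests that were positive, for whatever unit total_tests uses."""
    return positives / total_tests

# Hypothetical figures for one state on one day.
positives = 50
people_tested = 500       # each person counted once
specimens_tested = 1000   # repeat tests included

print(f"{positivity(positives, people_tested):.1%}")     # 10.0%
print(f"{positivity(positives, specimens_tested):.1%}")  # 5.0%
```

Two states with identical outbreaks could therefore appear to differ by a factor of two simply because one reports people and the other reports specimens.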
Where can I find R(t) values?
We do not calculate R(t) values on this site. The site rt.live used the data we compile to calculate R(t) from April 2020 until their recent shutdown in January 2021, but we are not affiliated with rt.live. We do compile official state and territorial data for metrics including tests, new cases, hospitalizations, deaths, demographics, and outbreaks in long-term-care facilities. You may find our detailed state data and interactive charts helpful.
Are you planning to track vaccination data?
The COVID Tracking Project is not tracking data about COVID-19 vaccine distribution and administration for the US as a whole, but as of January 27, 2021, we do display CDC data on vaccinations in long-term care facilities on our data page. We will also be keeping a close eye on how states define and report vaccination metrics and will be maintaining internal logs and annotations about interesting features of this data. The CDC is reporting vaccination distribution and administration data at https://covid.cdc.gov/covid-data-tracker/#vaccinations, and Bloomberg has launched a COVID-19 Vaccine Tracker at https://www.bloomberg.com/graphics/covid-vaccine-tracker-global-distribution.
Why don’t you report county-level data? Will you be doing so in the future?
We do not currently have plans to collect nationwide data at the county level, both because we do not have the resources to do so manually and because Johns Hopkins, The New York Times, and USAFacts.org are collecting county-level data automatically for cases and deaths. For a time, we did collect county-level data for select metro areas as part of our City Data project.
Why aren’t you tracking age and sex?
We had planned to track COVID-19 data by sex, but before we could muster the effort, the GenderSci Lab at Harvard published the US Gender/Sex COVID-19 Data Tracker, which “reports up-to-date and historical gender/sex-disaggregated data on COVID-19 cases and fatalities for 50 US States and 2 US Territories.”
Unfortunately, age is a complicated problem for us, because the states group ages in incompatible ranges: one state might report ages 29-39 as a group, while another reports 25-35, and a third reports 30-45. Because of this non-standardized reporting, age data is very difficult to provide as a national set of metrics. Age data for COVID-19 hospitalizations can be found in the CDC’s weekly COVID-NET summary.
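The incompatibility can be shown with the three example ranges above. In this sketch the state labels and the candidate national bin are hypothetical; a state's group can only be added to a national bin if it falls entirely inside it.

```python
def fits_within(state_range, national_bin):
    """True if the state's age group lies entirely inside the national bin."""
    low, high = state_range
    bin_low, bin_high = national_bin
    return bin_low <= low and high <= bin_high

# The three example groupings from the text, with hypothetical state labels.
state_ranges = {"State A": (29, 39), "State B": (25, 35), "State C": (30, 45)}

# A hypothetical national bin for people in their thirties.
national_bin = (30, 39)

for name, state_range in state_ranges.items():
    print(name, fits_within(state_range, national_bin))
# State A False
# State B False
# State C False
```

None of the three groups fits cleanly, so their counts cannot be combined without guessing how cases are spread within each range.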
Can you report COVID-19 data related to schools and/or colleges?
We do not currently have plans to track COVID-19 data related to either K-12 schools or colleges. Some states have begun to report COVID-19 data by K-12 school district, including South Carolina and New York. The New York Times has launched a college COVID cases tracker. For additional information on COVID-19 in your area schools or colleges, check your local news sources, your city or county public health department, or your state public health authority.
Other questions
Why don’t you harvest data automatically?
We do have tools that monitor, scrape, harvest, fetch, query, and otherwise capture data automatically, but because the 56 states and territories provide the dozens of data points we collect in so many different ways, and because they change and move and revise their systems and definitions for this data so continually, we rely on human intelligence first and technology second.