You might think that The COVID Tracking Project would have a lot of data about our volunteers. We don’t. That lack of data is at least partly by design: we wanted to function more like a porous online community than a rigid formal organization. This was an emergency response effort, after all. To avoid barriers to entry, we didn’t ask people for too much information up front. We didn’t assign volunteers to specific permanent teams, and we didn’t ask them to commit to volunteering a certain number of hours per week. We don’t have a directory of people with addresses, phone numbers, teams, roles, and demographic information.
What we do know about our volunteers is that they are remarkable. Amazing. Dedicated. Idealistic. Competent, and even brilliant. Funny. Heartily human. And committed: by our best estimate, our volunteers spent well over 20,000 hours this year doing data entry alone.
An overview of our volunteer system
In the single year of the project’s existence, we established a volunteer system consisting of set teams, recruitment strategies, application questions, review processes, onboarding steps, and training programs. We tweaked everything over time, but most processes were settled by June 2020.
How volunteers applied
On March 8, 2020, we set up a Google Form to take names and email addresses from people who responded to calls for volunteers on Twitter, the News Nerdery website, and Help With COVID. In May, we shut down the Google Form and opened a volunteer application form on our own website. In its final version, our volunteer application gave an overview of our work, schedule, and teams, then asked for a name, email, a URL, availability, time zone, and skills. We also asked each applicant to write a couple of paragraphs about the experience they would bring to the project.
Originally, we listed particular role descriptions―similar to job ads―and asked volunteers to apply for specific positions on specific teams. But we found that this was too restrictive: volunteers had such varying and plentiful skillsets that we realized we could make use of almost anyone. So we stopped the time-consuming task of writing position descriptions and switched to asking applicants to say generally what they’d like to volunteer to do in one of five “pools”:
I'd like to help collect data, ensure data quality, or build data tools.
I'd like to help create data visualizations or illustrations.
I'd like to help design or develop your website.
I am a journalism or communications student or professional who can help do outreach to state officials.
I am a health science student or professional who can help review your data and content.
We then fed the volunteer applications to Slack in five channels, one for each pool, where team leads reviewed and discussed applications and decided which people should be invited to which teams.
How many people volunteered
So many people wanted to help The COVID Tracking Project that we sometimes had trouble reviewing all the applications. The first version of our volunteer form received 735 responses, and the second version received nearly 1,700. We stopped accepting volunteers in January 2021.1
That means that almost 2,500 people offered to volunteer for The COVID Tracking Project in a span of just under 11 months. That averages to about seven people applying daily, but in summer and fall of 2020, we sometimes had as many as 30 or 40 applications in a day.
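The back-of-envelope arithmetic behind these figures can be checked in a few lines. This is just a sketch: the precise date in January 2021 when we stopped accepting volunteers is an assumption here.

```python
from datetime import date

# Application totals from the two volunteer forms
first_form = 735
second_form = 1_692  # unique applicants (1,889 responses minus duplicates)
total = first_form + second_form  # 2,427 — "almost 2,500"

# Span from the first Google Form (March 8, 2020) to late January 2021;
# the exact January closing date is assumed, not documented
days = (date(2021, 1, 31) - date(2020, 3, 8)).days
per_day = total / days
print(round(per_day, 1))  # about 7 applications per day on average
```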
Based on account information from Slack, we estimate that about 800 volunteers contributed to The COVID Tracking Project in its year of existence. We admitted a total of 1,513 people to our Slack, but about 700 of them were not contributing volunteers. At least 60 were data users whom we invited to a single channel so that we could ask each other questions more easily. Over 550 were people we invited to volunteer with us who then let their Slack account expire after 30 days. Another 98 people with active Slack accounts have never posted a message; these are probably volunteers we admitted who decided not to participate.
What teams volunteers could join
The Data Entry and Data Quality teams worked closely together and shared many team members. The Data Entry team ran two- to four-hour data entry shifts at 3:30pm Eastern every day (including weekends and holidays) to capture COVID testing, hospitalization, and outcome data by visiting coronavirus data dashboards for 56 US states and territories and entering data points in a spreadsheet. The Data Quality team was responsible for doing deeper investigations into the core testing and outcomes data set. They conducted research on data definitions and data pipelines, wrote and compiled “annotations” from this research to help clarify data, corrected previously entered data when necessary, and maintained the data entry spreadsheet and the quality assurance tool that automatically harvested available data from states. Data Quality also made sure that our screenshots of state coronavirus data dashboards were legible and kept the instructions for finding data (the “source notes”) up to date.
The Data Infrastructure team built and maintained the technical systems we relied on to collect and publish data: the scripts that captured screenshots of state data dashboards, the database where we stored data collected by the Data Entry team, an internal API to help preserve our data model and integrity, and many other tools and technologies.
The Data Visualization, Data Science, and Science Communication teams worked together on the task of interpreting the data for the general public. They analyzed data; produced illustrations, charts, and maps; and helped write content for our blog, newsletter, website, and social media accounts. Members of the Science Communication team also educated other volunteers and staffers about epidemiological standards and terminology, wrote and revised our data definitions, and reviewed our public content to make sure it was both clear and scientifically accurate.
The COVID Racial Data Tracker team (also called the “Race Data” team or the “Race and Ethnicity Data” team) ran twice-weekly data entry shifts every Sunday and Wednesday at 9pm Eastern to capture COVID race and ethnicity data by visiting the coronavirus data dashboards of US states and territories and entering data points in a spreadsheet. They also worked on quality assurance and interpretation of the race and ethnicity dataset as well as maintenance of their own data entry spreadsheet, with some assistance from other teams. The COVID Racial Data Tracker launched on April 15, 2020.
The Long-Term Care Data team ran a weekly data entry shift on Thursdays at 8pm Eastern to capture data about COVID-19 in long-term care facilities by visiting the coronavirus data dashboards of US states and territories and entering data points in a spreadsheet. They also ran a weekly “special projects” shift on Mondays at the same time for maintenance and interpretation of the data. This team did quality assurance and interpretation of the long-term care dataset as well as maintenance of their own data entry spreadsheet, with some assistance from other teams. The Long-Term Care Tracker began collecting data in May 2020 and launched on August 12, 2020.
The Reporting and Outreach team monitored state and territory press conferences and press releases and asked public health officials questions about their coronavirus data. This team raised issues about how states defined data points, why there were difficulties and anomalies in data reporting, and whether states could provide machine-readable data.
The Website Tech and Design teams worked together to build, maintain, and improve our website and to create visual patterns for presenting content and data clearly. In addition, the Design team created a distinctive look and feel for our site, and the Website Tech team was responsible for our site’s code and accessibility as well as the content management system. The Website Tech team also created and managed the external API that provided data to users in machine-readable form.
We also had teams and team leads with few to no volunteers, including a Project Management team that helped see initiatives through to completion, an Editorial team responsible for written materials, and a Community team that supported volunteers and performed general administrative tasks. A City Data team made up of volunteers from other teams worked asynchronously from May to October to estimate COVID impact in some metropolitan areas.
The Data Entry, Race Data, and Long-Term Care Data teams usually invited many people at once to join without an initial interview, whereas other teams were more likely to email applicants to set up a time to talk. Some teams, such as Data Quality and Data Infrastructure, recruited most of their volunteers internally, requiring previous experience on one or more of the data collection teams. Other teams had more success with personal networking than with either internal or external recruitment. For instance, Kara Oehler persuaded many fellow radio journalists to join the Reporting and Outreach team, and Tom Subak rounded up many web professionals from digital agencies to join the Website Tech team for a few weeks and help redesign our website.
How volunteers were onboarded
We wanted volunteers to go where they could do the most good, quickly learn the ropes, make valuable contributions, and feel useful and appreciated. But it was sometimes tough to figure out where in the organization each person would fit best―especially while we were still building the organization. So we settled on a system in which each volunteer was onboarded to a specific team but could then move to another team or begin working with more than one team. In addition, the work of each team wasn’t rigidly defined; who did what depended more on the skills and availability of individual people than on the responsibilities of teams. Many volunteers were willing to do anything and everything.
We also knew from the beginning that many people would find that volunteering for us wasn’t a good fit or that they simply didn’t have the time, so we set a 30-day time limit on most new volunteer Slack accounts in order to give people a chance to look around and try things out. More than one-third of the people we invited let their accounts expire. New volunteers were also usually limited to eight default Slack channels plus one or two team-specific channels until they completed initial training or otherwise began contributing regularly to the project. At that point, we removed all time and channel limits on the volunteer’s account and made the volunteer a full member of our Slack.
One of the first things we did was put together a Code of Conduct that drew on existing models for creating a healthy and inclusive community while also expressing our own expectations. We also wrote a Welcome document that later became our Field Guide. The Field Guide outlined our goals and processes, listed the teams, provided a schedule of data entry shifts and a Slack Channel Guide, suggested some first steps, and gave names of people to message with questions. We created a Slack Workflow to automatically send these two documents to all new volunteers. Most team leads also created automated workflows to send out team-specific information whenever a new volunteer joined their team’s channel.
The Data Quality, Data Visualization, Data Science, Science Communication, Reporting and Outreach, and Design teams all wrote introductory onboarding documents to help explain key goals and processes, and these teams usually had weekly optional teleconference meetings as well. Volunteers who helped on the Website Tech team had extensive documentation and a Storybook website of their own, and certain GitHub issues were labeled “good first issue” to help newcomers know where to start.
How volunteers learned to enter data
The three main data collection teams all developed training processes to teach people to gather data by hand in shifts. Volunteers on all three teams were given a recorded or live tutorial, written instructions, and some practice exercises, along with help from more experienced data collectors. Differences in training were often at least partly due to the differences in the datasets and in how often the data was collected: the more metrics tracked and the more often the data was published, the more training was necessary.
The training program for the daily Data Entry team was the first to be created and eventually became the most detailed. New volunteers for this team were added to an intake channel where they were asked to watch a training video or sign up for a live training session, to read and consult the Data Entry Instructions, and to enter data for three states in a practice version of the data entry spreadsheet. Once they had done this, they were asked to observe a daily data entry shift. Only then could the new volunteer sign up to collect data.
New volunteers were given the role of “Checker,” since their job was to check state and territory websites for coronavirus data and enter it in the spreadsheet. Once a Checker had completed 10 total data entry shifts and had proven to be careful and accurate, they were eligible for promotion to “Doublechecker.” Doublecheckers were responsible for re-checking every data point that the Checkers had entered, and they also stayed on each shift longer to help perform other quality reviews and to help check and double-check states that published data late. Once a Doublechecker had completed 15 total data entry shifts in this role, they were eligible for promotion to Shift Lead. Shift Leads ran the shift, helped Checkers and Doublecheckers, wrote public notes about data anomalies, and were responsible for making decisions whenever ambiguities or difficulties arose. Any trained Checker from the Data Entry team could also sign up for Data Quality shifts and would then be asked to read the Data Quality Field Guide.
The Race Data team set up a similar training process at first, with an intake channel, training video, live training sessions, and Race Data Entry Instructions. Later, however, new volunteers were asked simply to read the instructions carefully, work slowly, and ask questions often during their first race data collection shifts. Roles on the Race Data team were the same as on the Data Entry team: Checker, Doublechecker, and Shift Lead.
The Long-Term Care Data team (LTC Data) relied more on live training sessions led by Glen Johnson during their weekly data collection shift on Thursdays. New volunteers could come to the LTC data collection channel at the appointed time on Thursday evening and be given the Long-Term Care Data Entry Instructions and some guided practice before starting to collect data right away. The Checker, Doublechecker, and Shift Lead roles were also used, but on the LTC Data team any Checker could also serve as a Doublechecker.
Once volunteers were approved and trained to work on a particular data collection team, they could work as many or as few shifts as fit their schedule. Those who signed up were expected to show up, but volunteers could also sign up to “float,” meaning that they could arrive late and/or leave early. Volunteers who were promoted to Doublecheckers and Shift Leads usually worked at least one shift per week and stayed on shifts longer than newer volunteers, and very often, these dedicated people would show up to shifts even when they hadn’t signed up.
How much time volunteers gave
We didn’t keep track of our volunteers’ time, but our Slack analytics show an overall average of 294 weekly active users, meaning that on a typical day, nearly 300 people had read or sent a message within the previous week. Our peak day was December 13, 2020, when 386 people had been active in the previous week.
These were not the same people every week; it was common for volunteers to contribute for a while and then stop. This was not only accepted―it was encouraged. We told volunteers that they should be sure to take as much time away from the project as they needed, especially since we knew that it was a difficult year for everyone. This also meant that we could and did admit new volunteers throughout the year.
We also have some information about the generous amount of time volunteers contributed from the sign-up spreadsheet for data shifts we used throughout the year. Only the daily Data Entry and Data Quality volunteers and the twice-weekly Race Data volunteers used the sign-up sheet regularly, but even with this undercounting, the statistics are impressive. From the logs we kept, we estimate that The COVID Tracking Project’s volunteers performed at least 20,754 hours of work this past year on the daily Data Entry and Data Quality shifts and the twice-weekly Race and Ethnicity Data shifts.2
The 347 people listed in the sign-up spreadsheet as authorized contributors to data collection and data quality efforts―nearly half the total number of volunteers―logged a total of 6,918 shifts over the course of the year. Most shifts were at least two hours long, and it was common for shifts to last four hours or longer when states were late in updating data or when problems arose, so the bare-minimum estimate of more than 20,000 hours is based on multiplying the total number of recorded shift sign-ups by three.
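That bare-minimum calculation can be reproduced directly from the figures above. This is a sketch of the estimate as described, not the original person-hours script:

```python
# Totals from the sign-up spreadsheet for data shifts
authorized_volunteers = 347
recorded_shifts = 6_918

# Conservative estimate: count every recorded shift as three hours,
# even though many shifts ran four hours or longer
hours_per_shift = 3
minimum_hours = recorded_shifts * hours_per_shift
print(minimum_hours)  # 20754 — the "at least 20,754 hours" figure
```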
As with many open, distributed projects, a large group of people contributed a little and a small group of people contributed a lot, and the sum total of the work could only have been accomplished by the joint efforts of both casual and committed contributors. Of the 347 authorized data collection and data quality volunteers, 66 never officially signed up for a single shift, but many of them were trusted contributors in other areas of the project.
Who volunteered for The COVID Tracking Project
We have more knowledge than data about one another, because even though our intake process didn’t require much information, core members of the project got to know each other very well. Many teams had regular happy hours and some, beginning with Data Viz, offered regular “skillshares”―brief workshops on useful or interesting tools and techniques. Data shifts often began with icebreakers in which everyone shared their middle name (if any), their favorite way to eat eggs, their childhood transgressions, and their opinions on which animals would be the rudest if they could speak. We know a lot about one another’s pets. On Election Day, we held an all-day series of “Genuinely Random Skillshares” in which volunteers gave talks or workshops on their hobbies and areas of expertise. This list of lessons on topics from calculators to cupcakes is a good window into the personality of our collective.
From conversations we’ve had, we know that many of our volunteers were retirees, people working as independent consultants, or people who were unemployed, often because of the pandemic, though we also had plenty of full-time professionals contributing in their free time. We had at least a few veterans and current or former members of the US military. Several of our most dedicated volunteers were students in high school, college, and graduate school. We also admitted two groups of students as volunteers: undergraduate and graduate students in Professor Meredith Broussard’s Data Journalism classes at New York University and the graduate fellows in the CUNY Data for Public Good program led by Dr. Lisa Marie Rhody and Stephen Zweibel.
Over the course of the year, we added several custom fields to Slack profiles, and this data tells us a bit more about who our volunteers were. Among the 300 people who listed an “Area of expertise,” the top 20 terms show that our volunteers and contractors possess a wide range of general analytic and creative skills.
We didn’t collect race and ethnicity or gender data from our volunteers, but of the 363 people who provided pronouns, the majority preferred “she” and “her.”
Though we asked volunteers for their time zone on our sign-up form, we didn’t collect location information. Twice during the project, however, we sent volunteers gifts, and from the addresses given to us for the second mailing, we saw that nearly a quarter of the 171 volunteers who gave us their address live in California and almost as many in New York, with the other 56% distributed throughout another 25 US states. Only a few of the volunteers who gave us their addresses live outside the United States.3
The culture of The COVID Tracking Project
What made so many people want to stick around and do so much? Partly, we think, it was the importance and urgency of the work itself, but it’s also true that those who stayed formed close ties to each other. We held and recorded “All-Hands meetings” on alternate Tuesdays, usually attended by about 50 people, giving participants a chance to hear presentations from various members of the project and sometimes from external speakers. But these meetings were also a chance to see and hear one another after two weeks of mostly written communication in Slack. Team leads often reiterated and enacted the project’s core values of gratitude, transparency, accuracy, kindness, and flexibility. We also launched official community initiatives: the #virtual-coffee channel helped volunteers get to know one another, while the #wellbeing channel and a psychologist’s presentation at an All-Hands meeting helped everyone cope with such distressing data.
While we put a lot of deliberate effort into the team culture, much of it seemed to grow organically. Teams and channels developed their own rituals and argots and in-jokes. The #general channel was a steady stream of useful and interesting COVID news articles. The #random channel had the usual share of animal pictures. The #science channel saw sharing and discussion of preprints and peer-reviewed articles on the latest coronavirus research. Volunteers created channels for cooking, entertainment, and job opportunities. A few games of Among Us were played.
And then there was the #emoji channel. Over the course of the project, members added 2,279 custom emoji to our Slack workspace, and more than 350 of these were variants of our purple logo. Emoji were used for work, especially on data entry shifts, but creating and using new emoji also became a huge source of fun for many. A Slack Workflow was set up just to manage requests for new emoji. Early in 2021, there was even a March Madness bracket to determine the most beloved emoji.
We used many of these emoji on merchandise in an online store that listed 163 designs on dozens of products offered at cost, with zero markup or profit. Most of the designs were based on The COVID Tracking Project’s purple logo. In late 2020, we emailed over 400 of our most active volunteers gift cards to this store so that they could get merch either for free or very cheaply.
The store has distributed 620 items to date. The most popular items with volunteers were stickers, mugs, and magnets, while the most popular designs were our array of logos in rainbow colors, our full logo with text in white, the COVID Tracking Project version of the “this is fine” meme, our full logo with text in purple, and our custom party possum emoji.4
Thanks to our volunteers
We want to know more about our volunteers, and we want others to know more about them as well. So we asked volunteers to submit pictures and brief biographies to us for publication, and more than 250 volunteers did so and are featured on our Thank You page. (Contact Amanda French if you volunteered with us and aren’t listed there.) There will also be a printed yearbook just for volunteers.
We’ve also initiated an oral history project in which volunteers interview one another about their experience with The COVID Tracking Project. We have about 60 recordings so far, which we hope to save along with other materials related to the project in a long-term archive. Leslie Heyison, who has been conducting many of these interviews, has highlighted the following quotations:
“It was the most chaotic and organized Slack I had ever seen” – Brandon Park
“A team like this can do anything” – Lauran Hazan
“If there is an afterlife, I want it to be CTP [COVID Tracking Project]” – Deirdre Kennedy
“It’s a magical slice of amazingness” – Asia Lindsay
In the last year, this group became a precious community for many of us. We celebrated holidays and birthdays together, helped each other with homework, gave advice on topics ranging from childcare to careers to recipes, and comforted each other through the darkest moments of this pandemic, all while doing meaningful work. Not everyone felt this degree of loyalty and affection all the time: sometimes, some people didn’t feel appreciated or useful, some people were overwhelmed, and some people just had more important places to be. But in general, the people who were actively involved in this project will tell you that it was one of the things that sustained and rewarded them during the past difficult and tragic year. We are grateful to and for each other.
Marie Connelly created the first versions of the Code of Conduct and the Welcome document.
Joseph Bensimon suggested the term “Field Guide.”
Ines Wingert created the Slack Channel Guide.
Pat Kelly created the initial version of the data shift schedule graphic.
Amanda French, JD Maresco, SJ Klein, Brandon Park, and Brian Li trained volunteers in Data Entry.
Stacey Rupolo, Adeline Gutierrez-Nunez, Betsy Ladyzhets, and Alice Goldfarb trained volunteers in Race Data.
Glen Johnson trained volunteers in Long-Term Care Data and ran our All-Hands calls.
Erin Kissane proposed the Election Day Random Skillshares.
Quang Nguyen did the original shift person-hours calculation.
Kevin Miller pulled the data from the custom Slack profile fields.
Asia Lindsay arranged and managed #virtual-coffee.
Kim Bryant arranged for a psychologist to support us and ran #wellbeing.
Jason Santa Maria created The COVID Tracking Project logo.
Jùlia Ledur created Bob, who illustrates why test positivity is a mess.
Nicki Camberg made 35% of our custom emoji and also created and ran our online merch store with help from Hannah Hoffman.
Nicki Camberg, Hannah Hoffman, and Kara Schechtman organized the Emoji March Madness.
Carol Brandmaier-Monahan conceived of and is carrying out the yearbook project.
Kara Oehler had the idea for the oral history project.
Rachel Glickhouse and Mandy Brown edited several versions of this piece.
Word frequency lists were generated with Voyant Tools.
1 We received 1,889 responses to our second form, but many people applied twice, so there were only 1,692 unique applicants.
2 See this public spreadsheet of anonymized data on volunteer hours, skills, pronouns, and locations.
3 Of the 171 volunteers who gave us their address in late 2020, 40 had addresses in California and 35 in New York. See additional anonymized data on volunteer hours, skills, pronouns, and locations.
4 To date, the merch store has sold 195 stickers, 63 mugs, and 47 magnets in addition to many other items. Our array of CTP logos in rainbow colors was used on 60 items, our logo in white was used on 44 items, our version of the “this is fine” meme on 43 items, our logo in purple on 30 items, and our “party possum” on 22 items.
Amanda French, Community Lead and Data Entry Shift Lead at The COVID Tracking Project, has a doctorate in English and is an expert in digital humanities.
Nicki Camberg is a student journalist studying Political Science and Statistics at Barnard College, and the City Data Manager at CTP.