Assessment of the All of Us research program’s informed consent process

Abstract
Informed consent is the gateway to research participation. We report on the results of the formative evaluation that follows the electronic informed consent process for the All of Us Research Program. Of the nearly 250,000 participants included in this analysis, more than 95% could correctly answer questions distinguishing the program from medical care, the voluntary nature of participation, and the right to withdraw; comparatively, participants were less sure of the privacy risks of the program. We also report on a small mixed-methods study of the experience of persons of very low health literacy with All of Us informed consent materials. Of note, many of the words commonly employed in the consent process were unfamiliar to or differently defined by informants. In combination, these analyses may inform participant-centered development and highlight areas for refinement of informed consent materials for the All of Us Research Program and similar studies.


Introduction
Informed consent is fundamental to the ethical practice of human subjects research (OHRP & DHHS 2018; National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research (NCPHS) 1979). In this article we interrogate two data streams to enrich our understanding of innovations in informed consent, especially those focused on improving the accessibility of consent processes for persons with low literacy. We present a quantitative analysis of "quiz" data gathered from participants completing the All of Us Research Program (All of Us; the program) informed consent processes and findings from a mixed-methods investigation of the informational needs and presentation preferences of prospective All of Us participants of very low health literacy.
The All of Us Research Program, one of the largest and most diverse precision medicine research cohorts ever assembled in the United States, relies on a set of community-derived core values to guide program development and implementation (National Institutes of Health and All of Us Research Program 2020a, 2020b; Stein 2017; The All of Us Research Program Investigators 2019; The Precision Medicine Initiative Working Group 2015; The White House 2015). One of these core values is "trust will be earned through transparency." Consistent with this charge, ensuring a clear, meaningful informed consent experience, inclusive of prospective participants regardless of race, ethnic group, age, sex, gender identity, sexual orientation, disability status, access to care, income, educational attainment, or geographic location, has been one of the top priorities of the program since its inception (Hudson et al. 2015; Kraft and Doerr 2018; The Precision Medicine Initiative Working Group 2015).
When initially considering available approaches to informed consent, the program recognized that creating an electronic informed consent ("eConsent") process, accessible by Web browser or app, in addition to a traditional "long-form" consent would allow the All of Us informed consent process to scale to the enormous size (1 million or more persons), geographic spread (the 50 states, District of Columbia, and five inhabited territories of the United States), and duration (10 or more years) of the desired cohort, facilitating broad recruitment. Despite concerns about the "digital divide," the program's ethics review board also favored implementing an eConsent process to help ensure that prospective participants have a consistent informed consent experience regardless of their geographic location or affiliation (participants can join through local health centers or independently as "direct volunteers") (Anderson and Kumar 2019). To create such an informed consent process, the program convened the All of Us Consent Working Group. The Consent Working Group had 55 core members (see Acknowledgments) who met 22 times between August 2016 and March 2017, with additional small-group meetings and expert consultations (e.g., Doerr et al. 2019).
To inform the development of the eConsent and long-form consent, the Consent Working Group drew on previous normative and empirical literature regarding informed consent for genomics research, as well as then emerging literature on the use of eConsent processes in mobile health studies (e.g., Beskow and Dean 2008; Doerr, Suver, and Wilbanks 2016; Kass et al. 2015; Kaufman et al. 2016; McGuire and Beskow 2010; Moore et al. 2017). From this review, the Consent Working Group identified key design and informational features for the All of Us Research Program's eConsent, for example, iconographic representation of essential concepts (McCay-Peet, Lalmas, and Navalpakkam 2012). In consultation with the program's ethics board, the Consent Working Group modified this approach for the All of Us eConsent process, using animated video clips to highlight important information (Figure 1). These clips, ranging in length from 10 to 90 seconds, address the eight elements of informed consent highlighted by the Common Rule, and seek to reinforce concepts previously identified to be confusing to participants of mHealth studies (OHRP & DHHS 2018; Doerr et al. 2017). The choice of animation (over live action) for the video clips was deliberate, allowing for the voiceover to be re-recorded in multiple languages.
The Consent Working Group drafted the program's informed consent materials at the U.S. fifth-grade reading level (ages 10-11 years), acknowledging that this was still insufficient for the needs of many living in the United States; one-third of American adults have basic or below basic health literacy, with 14% of U.S. adults having health literacy skills at or below the third-grade reading level (age 8-9 years) (Kutner et al. 2006). Further, literacy levels are lower among many of the populations traditionally underrepresented in research that All of Us seeks to engage (Kutner et al. 2006; The All of Us Research Program Investigators 2019). The use of video clips with voiceover within the eConsent process is therefore also an aid to broaden accessibility. The informed consent materials for the program are publicly accessible through the program's website (https://allofus.nih.gov/about/protocol/all-us-consent-process).
Due to the longitudinal nature of the study, and based on the recommendations of expert panels and consistent with the core values of the program that promote participant choice, the All of Us informed consent process is modular (Hudson et al. 2015). Participants first navigate through the Primary Consent module, joining the program. A separate, optional HIPAA Authorization module allows participants to release their electronic health records to the program, with the flexibility for additional modules in the future as data types are added to the study. Each module comprises an eConsent process, a long-form consent that requires signature, and a four-question formative evaluation (Figure 2).
The Consent Working Group included a formative evaluation ("quiz") at the end of the eConsent process as a tool for reinforcing understanding of core concepts about the program and challenging misconceptions. As previously described, the formative evaluation is a form of "teach back" and acts as a pause before enrollment (Kraft and Doerr 2018). The formative evaluation highlights key concepts about study participation to facilitate informed autonomous decision making, but is not a barrier to enrollment.
We completed a quantitative analysis of data gathered from the All of Us Primary Consent and HIPAA Authorization modules' formative evaluations. Additionally, we report on a mixed methods investigation of prospective All of Us participants of very low health literacy. Although derived from the All of Us Research Program's informed consent process, our findings may be of interest to myriad research professionals interested in designing informed consent for research inclusive of persons of low literacy.

Methods

All of Us eConsent formative evaluations
Following the primary eConsent and HIPAA Authorizations, prior to being presented the informed consent form for signature, participants are presented with four formative evaluation questions. The questions are binary, with one correct and one incorrect answer presented, and there is a third option, "skip this question." The concepts assessed are detailed in Figure 2.
The institutional review board for the All of Us Research Program requested that the program collect information about navigation of the informed consent process to ensure that the eConsent process did not disadvantage or discourage any of the diverse populations targeted for enrollment. To this end, the program tracks limited data on people who begin the All of Us informed consent process, including results from the formative evaluation. (This data collection is disclosed in the Terms of Service for the All of Us Research Program website and participant portal.) For this analysis, we have linked the results of the formative evaluation for participants who report their preferred language is English with those participants' self-reported annual household income level and educational attainment, data that are collected from participants after they join the program. As the program does not directly capture any measure of participant literacy, these variables were chosen as the closest proxies for literacy available to us within the All of Us database (Reardon 2012; Bailey and Dynarski 2011; Park and Kyei 2011). Any skipped question was treated as a missing value and removed from subsequent analyses for that question only. All statistical analyses were performed using R base version 3.4.1 and the collection of tidyverse packages for common data representations (R Core Team 2017; Wickham 2017). Confidence intervals for proportion estimates were calculated without Yates's continuity correction. No hypothesis tests were performed, as the purpose of these summaries was descriptive in nature only.
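The handling of skipped questions and the interval calculation described above can be illustrated with a minimal sketch. It is written in Python rather than the R used for the analysis, and it rests on assumptions: the text does not name the interval method, so the Wilson score interval is used here as one common interval computed without a continuity correction (it is, for instance, what R's `prop.test` returns with `correct = FALSE`). The response labels `"correct"`, `"incorrect"`, and `"skip"` and the function names are hypothetical, chosen only for illustration.

```python
import math

Z_95 = 1.959963984540054  # standard normal quantile for a two-sided 95% interval


def wilson_interval(successes, n, z=Z_95):
    """Wilson score interval for a proportion, without continuity correction.

    Assumption: the paper does not state which interval was used; this is
    one standard choice that matches "without Yates's continuity correction".
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


def summarize_question(responses):
    """Correct response rate and 95% interval for one question.

    Skipped responses are treated as missing and dropped for this
    question only, mirroring the rule described in the text.
    The labels "correct" / "incorrect" / "skip" are hypothetical.
    """
    answered = [r for r in responses if r != "skip"]
    n = len(answered)
    correct = sum(1 for r in answered if r == "correct")
    lower, upper = wilson_interval(correct, n)
    return {"n": n, "rate": correct / n, "lower": lower, "upper": upper}
```

A call such as `summarize_question(["correct", "skip", "incorrect"])` would compute the rate over the two answered responses only; the skipped response does not count against the participant on that question.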

Low health literacy consent study
In order to better understand the experience of people in the United States who read below the fifth-grade reading level with the All of Us consent process, four federally qualified health centers (FQHC) partnered with Sage Bionetworks to identify specific information needs and communication preferences of people of very low literacy considering participation in the All of Us Research Program: Cherokee Health Systems (Tennessee), HRHCare (New York), Jackson-Hinds Comprehensive Health Center (Mississippi), and San Ysidro Health (California). The study was reviewed and approved by the All of Us Research Program's institutional review board. Research staff members at the recruitment centers for this study had previously completed training courses on protection of human subjects and cultural competency as it pertains to their work in engaging with underserved communities.
To identify potentially eligible adult participants, the local research teams used the Rapid Estimate of Adult Literacy in Medicine-Short Form (REALM-SF), a validated English-language screening tool for rapid, real-time assessment of adult health literacy (Arozullah et al. 2007). Eligible participants scored at or below the sixth-grade reading level (the scoring category closest to the fifth-grade education target of the program's informed consent materials).
Participants were able to choose between a one-on-one or a focus-group interview format. Interview and focus-group guides were co-developed by the research team, and interviews and focus groups were led by local study team members. Interviewees gave oral consent at the beginning of the interview/focus group.
The research team compiled informational requests of persons with low literacy with regard to the program ("question clustering"). Further, the team assessed interviewees' comfort with the specific vocabulary employed in the All of Us informed consent materials through word pair and word sorting activities. The word pair activity presented terms used within the All of Us informed consent materials and potential synonyms to help discern the preferred words and phrases of people of low health literacy. The word sorting activity was conducted within the focus groups only. Working in small groups, interviewees were asked to read the word/phrase and decide as a group how to categorize it: don't know/not familiar (red), kind of know/maybe familiar (yellow), or know/comfortable (green).
With the permission of interviewees, interviews/focus groups were recorded, and were then transcribed and deidentified by the study team. The Sage Bionetworks data analysis team used Dedoose, a cloud-based qualitative analysis software, for the coding of open-ended responses and discussion (Dedoose 2016). The analysis team reviewed transcripts and iteratively developed and refined a codebook to document primary and secondary codes and establish consistency in their application. The codebook, codes, and themes were presented in their raw form to the FQHC research partners, who then assisted in the data verification and analysis process to refine and finalize findings.
The analysis team compiled closed-ended responses from the word pair and word sorting activities in Microsoft Excel, and reviewed and evaluated these data using a combination of descriptive statistics and memos. Once scored and annotated, closed-ended data from the interviews and focus groups were reviewed with the entire study team for analysis verification and finalization.

Results

All of Us eConsent formative evaluations
Between May 31, 2017, and September 23, 2019, 264,234 people completed the All of Us eConsent process and enrolled in the study. Of that group, 249,454 participants reported their preferred language as English. Due to limitations within the All of Us database regarding the specification of the language of consent, we have included in our analysis only participants who stated their preferred language is English. Among those reporting their preferred language as English, all were adults (90% ages 25 to 75 years) and most reported being female at birth. Most had some college education (>40% with college degrees). Participants came from a broad spectrum of annual household income levels, although nearly a quarter reported household income greater than $100,000 per year (Table 1).
As previously described, in the course of completing the Primary Consent and HIPAA Authorization informed consent processes, prospective participants were asked to respond to a total of eight formative evaluation questions. In the following tables we present the questions asked, the correct response rate, and the upper and lower bounds of the 95% confidence interval for the four primary eConsent formative evaluation questions (Table 2) and for the four HIPAA Authorization formative evaluation questions (Table 3) for participants whose preferred language is English. These data are further summarized by self-reported educational attainment and income. The results are color coded, with black representing 90% or more of participants answering correctly, red representing 80% to <90% of participants answering correctly, and yellow representing 70% to <80% of participants answering correctly. No demographic category contains fewer than 539 participants (Table 1).
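The color-coding rule used in the tables can be written out as a small sketch, shown here in Python. The function name is hypothetical, and returning `None` for rates below 70% is an assumption: the text defines no band for that range.

```python
def color_band(correct_rate_pct):
    """Map a correct response rate (in percent) to the table color band.

    Bands follow the text: black for 90% or more, red for 80% to <90%,
    yellow for 70% to <80%. No band is described below 70%, so None is
    returned there (an assumption made for this sketch).
    """
    if not 0 <= correct_rate_pct <= 100:
        raise ValueError("rate must be a percentage between 0 and 100")
    if correct_rate_pct >= 90:
        return "black"
    if correct_rate_pct >= 80:
        return "red"
    if correct_rate_pct >= 70:
        return "yellow"
    return None
```

Applied to rates reported in this article, the overall 97.07% rate for the purpose question falls in the black band, while the 87.40% rate among participants reporting less than a fifth-grade education falls in the red band.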
Regardless of self-reported income level or educational attainment, more than 90% of participants could correctly answer the following questions.
From the Primary Consent: More than 90% of participants (overall 97.07%) could correctly answer the question "What is the purpose of All of Us?", with the exception of participants with a self-reported education level of less than fifth grade, of whom 87.40% answered correctly.
Overall, 90.60% of participants could correctly answer the HIPAA Authorization question "Do I have to give All of Us access to my EHR?" However, participants with lower educational attainment and those with lower annual household income had correct response rates that were below 90% (Table 3).
While still far greater than expected by chance alone, fewer participants (86.21%) could correctly answer the question regarding privacy within the Primary Consent: "Using smartphones, apps, and sensors are always a potential risk to privacy. If I give access to my EHR, is my privacy guaranteed?" For the HIPAA Authorization, the privacy question was rephrased to simplify its presentation: "If I give access to my EHR, is my privacy guaranteed?" Although 91.02% of participants overall answered this question correctly, participants of lower educational attainment or lower annual household income answered correctly less often (Table 3).

Low health literacy consent study
Local research team members from the Low Literacy Consent Study interviewed a total of 18 participants: nine individually and nine in focus groups. The nine interviews ranged in length from 29 to 48 min (mean: 37 min; median: 36 min). The two focus groups were 112 min and 69 min in length. All interviews took place in San Ysidro Health facilities during January and February 2018. Consistent with the eligibility criteria for the study, all interviewees had previously scored 3 or lower on the REALM-SF as administered by local research staff. The majority of participants spoke one or more languages in addition to English and may have had greater health literacy in those languages as compared to their assessed English health literacy level. Participants were all aged 18 years or older; 2 men and 16 women were interviewed. No other demographic information was collected from participants.

Question clustering
Throughout the interviews and focus groups, participants were prompted to share any questions they had about participating in the All of Us Research Program. These questions were compiled from the transcripts and clustered by the following themes: how to, how does it work, why is this part of research, what are you not telling me, privacy/confidentiality concerns, and results (Figure 3). No participant asked about any monetary incentives for study participation. One participant asked whether they would incur a cost for participating in All of Us.

Word pair activity
The word pair activity presented terms used within the All of Us informed consent materials and potential synonyms. The study team scored each of the individual and focus-group responses to the word pair activity to establish a preferred word; participants could also respond that they had no preference between the words, or suggest another word they preferred more than either of the two in the pair (Table 4). For several items the study team juxtaposed common terms and their technical equivalents (e.g., "pee" versus "urine"). Most of the participants preferred the use of technical terms; some participants stated this preference was due to commonly hearing these terms in health care-related settings. Similarly, some participants preferred "generation to generation" because this was a phrase that was familiar to them from their religious practice. Of particular note, the word pair "medicine/drug" received a strong reaction from many participants, including repeated discussion in each of the focus groups. Although used synonymously within the All of Us informed consent materials, "medicine" and "drug" were considered nonequivalent terms by interviewees, with medicines described as "good for you," or "something that the doctor gives," and drugs described as "illegal."

Word sorting activity
The participants were less familiar with many of the technical terms used in the All of Us informed consent materials (e.g., data breach), while they recognized several terms related to a health care setting (e.g., doctor) (Table 5). Of note, participants were not asked to define any of the terms, so they may have recognized a term or single words used in a phrase but may still have had little or no understanding of the term/phrase as a conceptual unit.

Discussion

All of Us eConsent formative evaluations
Although the data from the All of Us eConsent formative evaluations cannot be interpreted as signifiers of comprehension, they do support the observation that participants of diverse self-reported educational attainment and annual household income brackets are able to correctly answer key questions about participation in the All of Us Research Program following the informed consent process. This said, differences in correct response rates highlight two areas of consideration for further refinement by the program. First, participants were comparatively less likely to correctly answer the question within the HIPAA Authorization formative evaluation regarding whether sharing their EHR data with the program was voluntary. This question may have been difficult to answer because of the design of the All of Us Research Program itself: While participants are not required to give the program access to their electronic health records, during the time period covered in this investigation, only participants who completed the HIPAA Authorization and subsequently donated a blood/urine/saliva sample received the program's incentive payment. The eConsent process makes clear that the sharing is voluntary, but participants may have interpreted the incentive payment structure to have rendered this "option" moot. The program may reconsider its approach to incentive payments or make clear earlier in the consent process the incentive payment requirements.
Second, participants were least likely to answer correctly the two questions regarding the guarantee of privacy by the program. In the Primary Consent this question is preceded by a complex introductory sentence; when the sentence was removed in the HIPAA Authorization formative evaluation question, participants more readily identified the correct response. However, given that the primary risk of taking part in All of Us is to participants' privacy, lower rates of correct response, especially among participants from populations traditionally underrepresented in biomedical research, should give the program pause. Privacy is a complicated and fraught issue receiving growing attention in the popular press as health data are breached or used in ways that are inconsistent with popular understanding (Copeland 2019; Robbins 2019, 2020; Tahir 2019). Interestingly, Kaufman and colleagues reported that people who supported an early description of the All of Us Research Program but would not be willing to participate were less likely than those who would take part to agree with the statement "I trust the study to protect my privacy" (51% vs. 81%, respectively) (Kaufman et al. 2016). The program might consider further educational interventions within or adjacent to the eConsent process that directly confront misconceptions about privacy in big health data research to better ensure participant informedness.

Low health literacy consent study
Results from the Low Health Literacy Consent Study highlight ways in which the All of Us Research Program may consider future refinement of the consent process to better meet the needs of persons with low health literacy; however, given the study's size, further investigation of these themes may be warranted.
Notably, one set of questions from prospective participants about the program clustered around the theme "What are you not telling me." This finding may point to ongoing distrust in the research enterprise, or general discomfort with lack of understanding of consent materials among persons of low health literacy. As the consent process is the gateway to research participation, it may be important to address the roots of mistrust and highlight the specific actions of the program to address those concerns within the consent process itself. The influence of word choice on enfranchisement in research was highlighted in the study's findings by participants' strong reaction to the word pair "Drug/Medicine"; participants described "drugs" as "illegal" and "medicines" as "good." The synonymous use of "drugs" and "medicines" within the program's informed consent materials may unwittingly spark suspicion and engender mistrust from persons of low health literacy.
The All of Us Consent Working Group put significant effort into using words within the informed consent process that would be accessible to most people living in the United States based on reading level. This effort is consistent with consensus across the research community on the benefits of using plain language in research, for example, as advocated for by the U.S. Agency for Healthcare Research and Quality (AHRQ) in its Informed Consent and Authorization Toolkit for Minimal Risk Research (The AHRQ Informed Consent and Authorization Toolkit for Minimal Risk Research 2009). However, within the Low Health Literacy Consent Study word pair activity, participants often expressed a preference for more "technical" terms (terms heard in "official" venues) over plain language terms. The results of the word sorting activity, by contrast, highlight that participants of low health literacy are unfamiliar with many terms commonly used by longitudinal cohort studies like All of Us. To wit, even the phrase "research program" was sorted into the yellow "maybe familiar" category by focus-group participants.
There are at least two possible explanations for the paradoxical preference against common terms in the word pair activity coupled with categorization of most technical terms as unfamiliar/maybe familiar in the word sorting activity. First, the majority of the "technical" terms tested within the word pair activity, like saliva and blood sample, may be common enough that participants of low health literacy are equally familiar with them as they are with the plain language equivalents. Alternatively, participant preference against many of the common terms presented in the word pair activity may be a visceral reaction against "dumbed down" materials, an elitist argument previously common even among health care professionals (Stableford and Mettger 2007). As Adkins and Ozanne note, "As a social practice, literacy is a public act-not merely a private act of decoding and encoding. Thus, social evaluations play a role in the social practice of literacy" (Adkins and Ozanne 2005). The social stigma of low literacy can be so high, and fear of being exposed can be so powerful, that people of low literacy have been observed to work against their own health interests to avoid disclosure of their literacy status (Easton, Entwistle, and Williams 2013). Striking a balance between a "professional" tone and true comprehensibility is clearly a significant challenge. To this end, the program may consider how to incorporate explanations of key words or phrases in real time during the consent process. The eConsent format is well suited to such technology-facilitated assists to comprehension. For example, a participant could click on a word to have an explanation read aloud to them or hear the word used in a sentence. Greater integration of adaptive technology could allow participants a choice in how to consume the information presented. This participant-centered approach would support not only those of low health literacy, but persons of diverse learning styles and physical abilities.

Conclusion
The All of Us Research Program has engaged one of the largest and most diverse cohorts in precision medicine and continues to enroll participants with a concentration on recruitment of persons from populations previously underrepresented in biomedical research. We present two analyses of the program's informed consent process: quantitative assessment of the Primary Consent and HIPAA Authorization modules' formative evaluations and a mixed-methods study of the informational needs and presentation preferences of prospective participants with very low health literacy. We hope these data will help inform the program as it iteratively improves the informed consent processes for All of Us and that these data will be of use to others working to construct inclusive and informing consent processes for health research studies.
Us Research Program's informed consent process, and reviewed and approved this article.
Except as just described, members of the Low Health Literacy Consent Study Research Team collaborated to define the research approach and to verify data analysis, and reviewed and approved this article.

Funding
The All of Us Research Program informed consent development reported in this publication was supported by the Office of the Director, National Institutes of Health, under award number U24OD023176. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The Low Health Literacy Consent Study reported in this publication has been conducted under a contract with the MITRE Corporation, which is supporting the National Institutes of Health All of Us Research Program (Centers for Medicare and Medicaid Services) under contract number HHSM-500-2012-00008I/task order number HHSN263201600085U.