The challenge of recruiting diverse populations into health research: an embedded social science perspective

Addressing health disparities has become a central remit for conducting health research. In the following paper, we explore the conceptual and methodological challenges posed by the call to recruit medically underserved populations. This exploration is undertaken from the perspective of social science researchers embedded in a large clinical genomics research study. We suggest that these challenges arise in the development of recruiting strategies, in analyzing the data to understand and interpret the experiences of being medically underserved, and in comparing those experiences with the experiences of those who are not underserved. By way of conclusion, it is argued that there is an important role for social scientists within large health research studies which, if achieved successfully, can benefit study teams and society as a whole.


Introduction
As the twenty-first century progresses, the notion that scientific research can be conducted in isolation from social values and public accountability has come under increasing pressure. Furthermore, within health research, turning a blind eye to recruitment protocols that may restrict research benefits to the dominant socioeconomic populations has become unacceptable. This has led to formalized research protocols that are inclusive of diverse populations, and may specifically look to recruit such populations in order to redress current imbalances in our scientific knowledge. Embedded social scientists are part of this landscape of change wherein traditional measurement-based approaches to scientific knowledge are coupled with reflections upon how the science is being carried out and, more extensively, whether such research meets the remit of being equitable in the distribution of the potential benefits. In the following paper we, as social scientists and bioethicists embedded in a large clinical genomics research project, outline the conceptual and methodological challenges posed by recruitment of diverse populations and the analysis of study findings as part of this remit to reflect upon the scientific process and the equitable distribution of benefits. Further, we discuss ways forward to enable this dual role (as members of a team tasked with meeting recruitment targets and as critical analysts of the process by which targets are met and data interpreted) to be more successfully managed in future. Indeed, as discussed below, embedded social scientists have multiple and sometimes conflicting roles and responsibilities within a multidisciplinary research environment.

Embeddedness
At its core, the employment of social scientists, bioethicists, and ethnographers within the health sciences is one of a hoped-for mutual exchange of ideas to the benefit of the research team (or part of that team) and of society as a whole. (For the purposes of brevity these different disciplinary groups will largely be referred to as "social scientists" within the following paper.) By employing social scientists, multidisciplinary research teams as a whole can learn more about the research process and develop richer insights into how basic science can lead to societal benefits (Reiter-Theil 2004; Lewis and Russell 2011; Viseu 2015; Reynolds 2017; Vindrola-Padros et al. 2017). Perhaps more problematically, social scientists may also be seen as taking on the role of reducing or minimizing the potential controversies surrounding large scientific studies. Indeed, outlining both of these roles, Viseu (2015) has written that social scientists are increasingly employed in various capacities in large research studies "with the goal of maximizing societal benefits while reducing the possibility of negative impacts and public controversy." The employment of social scientists within a clinical genomics research study, the context for the following paper, provides a prime example of how social scientists might be seen as taking a role in maximizing social benefits and reducing the likelihood of controversy, especially given the high expectations of genomics to solve health issues and, conversely, the complex and controversial history of genetics and genomic studies in respect to social policy and practice (Kevles 1995; Duster 2004).
However, this picture of a mutual exchange of ideas has been noted as less than perfect. Marris et al. (2015) and Viseu (2015) both provide critical reflections as to how academic disciplinary expectations can result in unrealistic expectations of embedded social scientists to solve problems that are seen as outside the domain of the natural or hard sciences. In this respect, too much can be expected of social scientists within the team; as if they are able to solve deep-seated social issues or change public opinion through their knowledge of ethics and/or social science. Conversely, it may be the case that too little may be asked of embedded social scientists for fear that critical reflection could lead to project delays or failure. Indeed, there may be a perception that social scientists are largely present to criticize or slow down scientific research.
In summary, the role of embedded social scientists in health research is challenging, sometimes conflictual, and perhaps paradoxical in being tasked with the dual role of facilitating a respective project's progress while at the same time taking a position as a critical observer and commentator on the scientific process. This is made all the more challenging when one starts to unpack the conceptual and methodological issues surrounding the recruitment of underserved populations.

Study context
The study context for exploring the challenges faced by embedded social scientists in addressing health disparities is the UCSF Program in Prenatal and Pediatric Genome Sequencing (P3EGS). P3EGS is a large research study (comprising 845 enrolled families) providing whole exome sequencing for diagnostic purposes to the parents of children with otherwise unexplained developmental disorders and to the parents of unborn children with fetal anomalies. The study is part of the Clinical Sequencing Evidence Generating Research (CSER) consortium, comprising seven research sites which are studying "the effectiveness of integrating genome sequencing into the clinical care of diverse and medically underserved individuals" (https://cser-consortium.org/projects). The consortium is funded by the US National Institutes of Health (NIH), which provided the initial call for study sites. In applying for funding, applicant sites were required "to recruit a minimum of 60% of patients who come from racial or ethnic minority populations, underserved populations, or populations who experience poorer medical outcomes" (https://grants.nih.gov/grants/guide/rfa-files/RFA-HG-16-011.html). The study began recruiting families in late 2017 and is ongoing (although recruitment has ended). Of the 845 families enrolled in the P3EGS study, 78% are defined as from "racial or ethnic minority populations, underserved populations, or populations who experience poorer medical outcomes." The study site provides a fitting location to reflect upon the challenges faced by embedded social scientists as interlocutors tasked with critically analyzing research objectives and methods, communicating this analysis within and outside of the research team, and at the same time largely conforming to the goals and objectives of the project as set out by funding stipulations, especially in respect to the key remit of recruiting a minimum of 60% underserved populations.

The challenges
While the challenges that follow are broken down into constituent parts (broadly along the lines of recruitment, analysis, and conclusions drawn), they are all to a certain degree the product of the drive to recruit a minimum percentage of the target underserved population without clear instructions as to how to define this category and without a mandate (and time) to deeply examine the suitability of the adopted measures to explore "the effectiveness of integrating genome sequencing into the clinical care of diverse and medically underserved individuals." In practice, the remit of recruiting a minimum of 60% underserved was met by employing three indicators of "underserved": insurance status (not being privately or employee insured), living within an officially defined Health Professional Shortage Area (HPSA), and being a member of a racial/ethnic population that has been historically underrepresented in biomedical research (i.e. anyone who did not identify as "white"). If a family met two of these three measures they were included as being part of the medically underserved or underrepresented population.
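The two-of-three classification rule described above can be sketched in a few lines of code. This is a minimal illustration only, not the study's actual procedure: the record fields and function names are hypothetical, and the sketch assumes that "met two of these three measures" means "met at least two."

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Family:
    """Hypothetical record; field names are illustrative, not the study's schema."""
    privately_or_employee_insured: bool
    lives_in_hpsa: bool
    identifies_as_white: bool


def indicators_met(family: Family) -> int:
    """Count how many of the three proxy indicators of 'underserved' apply."""
    return sum([
        not family.privately_or_employee_insured,  # insurance status indicator
        family.lives_in_hpsa,                      # residence in a designated HPSA
        not family.identifies_as_white,            # racial/ethnic indicator
    ])


def is_nominally_underserved(family: Family, threshold: int = 2) -> bool:
    # Assumption: "met two of these three" is read as "at least two".
    return indicators_met(family) >= threshold
```

Written out this way, the operational character of the category is plain: a family is labeled underserved from three binary inputs, none of which records whether the family regards itself as underserved.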
Conceptually and methodologically it is evident that these proxy measures of underservedness provide only a superficial (and sometimes clearly erroneous) insight into what might be called the lived experience of being in a diverse and/or medically underserved community. Health insurance status is difficult to utilize as a measure of being medically underserved in respect to access to genomic sequencing because private/employee insurance does not necessarily provide genomic sequencing coverage. Furthermore, working in a large US consortium made it evident that public insurance offered radically different levels of health care coverage. Anecdotally, considerable ambiguities were also found by the social science team when applying the HPSA protocols to urban addresses, wherein addresses that were evidently well served by hospitals and family doctors' offices were deemed to be in medically underserved areas. Furthermore, particularly problematic issues are seen with respect to employing racial and ethnic categories in health and genomics research, including the long history of misuse of such categories and the tendency for racial and/or ethnic differences in health to be essentialized as if widespread health inequalities can be attributed to genomic difference (Lee, Mountain, and Koenig 2001; Sankar et al. 2004; Wailoo 2006; Sanchez and Garcia 2009; Phelan et al. 2014; Bliss 2012, 2020). Moreover, it is well established that there is a lack of conceptual clarity around how to interpret data employing racial and/or ethnic categories, especially in genetics and biomedical research. This is particularly evident when it is assumed that persons with the same racial and/or ethnic status are homogenous in respect to being underserved and/or members of different racial/ethnic populations are assumed to be distinct in respect to being underserved (Kaufman and Cooper 2001; Geiger 2006; Oliver 2008; Kindig 2017).
Added to these conceptual issues around race and/or ethnicity are methodological issues pertaining to how to collect racial and/or ethnic data (Lin and Kelsey 2001; Corbie-Smith et al. 2003).
In summary, the use of proxy measures of underservedness and diversity presents a fundamental challenge to embedded social scientists who find themselves caught between knowing that there are problems with the measures being used to classify populations, and also knowing that such measurement is essential to the process of achieving recruitment targets. When, how, and to whom to voice these issues is a fundamental challenge, because in voicing concerns the very essence of the project may be undermined. Indeed, an over-critical approach could lead to disciplinary hostility wherein social scientists and bioethicists find themselves marginalized from the rest of the team who are largely responsible for meeting recruitment targets. At best the inclusion of social scientists might highlight, at an early stage, that the measures used should be treated with caution. At worst, bringing up this discussion could look like nit-picking or a form of internal sabotage.
Not surprisingly, fundamentally unresolved issues with respect to meeting recruitment targets lead to difficulties with data analysis. In particular, questions arise as to how to interpret what it means to be part of this nominally underserved population with respect to the overarching CSER remit of exploring "the effectiveness of integrating genome sequencing into the clinical care." The question of meaning is perhaps the most complex challenge facing embedded social scientists. In part this is because meaning or conceptual clarity is not necessary in order to recruit and categorize populations; only operational measures are required in order to recruit.
This leads to complex and troubling phenomena. At an individual level a research participant, without being asked, becomes representative of one of these groups (underserved or not underserved). There is little sense when recruiting underserved populations for research that the recruited families necessarily view themselves as underserved; in practice they are not asked whether they feel underserved, only how they self-identify and where they live. Again, this leaves embedded social scientists with an uncomfortable reality: we are tasked with speaking about a population as if it were a socially salient collective, but with little or no indication that this saliency exists.
In addition, when exploring underservedness, another challenge emerges: how can we call a population medically underserved when the population is receiving services? Put in another manner, a core paradox arises in that recruitment was intended to enroll medically underserved populations, yet the population that was actually enrolled was that element of the nominally underserved population that found its way into a research study (i.e. was not necessarily medically underserved).
Again, there is a danger that awareness that we have not truly recruited an underserved population could undermine the worthiness of the recruitment objectives. The challenge, as embedded social scientists, is to recognize and work within these limited parameters. Rather than simply giving up or stopping the research, the challenge is to work with the population that we have to gain insights into the significance of racial and/or ethnic diversity and what it means generally to be marginalized with respect to medical access (and how this relates to accessing genomic research services), and in doing so work towards increasing our insights into the broader population of underserved who never get to hear about or enroll in such studies.
Finally, notably absent from the original funding remit was whether and/or how to compare underserved populations with the population deemed not to be underserved. To reiterate this point, without any explicit comparator we cannot say anything about the degree to which study findings overall, whether quantitative or qualitative, are distinct to this underserved population or are relatively familiar to all populations.
Again, embedded social scientists face challenges and difficult choices. Should we assume those outside of this population are different in their perspectives on the benefits of genomic sequencing? Or, given the absence of a structured system of comparison, should we simply avoid this question for the sake of getting the information out? Evidently, if the populations (served versus underserved, represented versus underrepresented) were more evenly divided we might be able to say more about differences. At best we can make a strong argument that what we learn provides empirical data on populations that would otherwise be less likely to be heard. However, in the absence of any defined method of comparison we cannot make assumptions about difference in experiences or expectations of genomic research between the nominally underserved and nominally "served" populations. The inherent danger is that in standardizing the use of the term underserved we turn what is an exploratory label (wherein the purpose is to investigate what it means to be part of the underserved population) into an explanatory label (wherein being labeled as underserved suggests within-group homogeneity and out-group difference). This is the challenge that embedded social scientists are tasked to meet: one of promoting conceptual clarity as to what exactly we can say about this particular population that might be considered distinct and what we cannot say or assume.

Discussion
Working as embedded social scientists in a large health research team provides opportunities to reflect upon and contribute to study design, data collection, and data analysis and work towards refining study methods and conceptual approaches to understanding health disparities in clinical genomic research. As illustrated above, these opportunities are likely to be experienced as challenges or even unresolvable paradoxes that embedded social scientists have to work through with respect to their communication with colleagues and in respect to the analysis and interpretation of the available data.
In Viseu's (2015) reflections on working as an embedded social scientist, the author writes "despite its ambivalent track record, integration is increasingly used as a preferred policy tool and as a model for many STS engagements with technoscience, making it all the more important to examine whether and how it is working." As with this study, Viseu reflects upon how, in this context, social scientists tend to be squeezed into the dominant hierarchy of numerical outputs and strict categorizations. Combining this with the drive to recruit diverse populations, Epstein's (2008, 2010) critical work on "recruitmentology" also explores the role of the ethnographer in meeting the recruitment drive while at the same time (or perhaps subsequently) unpacking what Epstein refers to as the "knowable sociocultural properties" of underserved populations. As embedded social scientists we are left in a peculiar and uncomfortable position: recognizing the imperfections of the underserved categorization mandate (and associated categorization process) but at the same time recognizing that such a mandate encompasses important scientific and social objectives that cannot and should not be jettisoned. If as a team (inclusive of social scientists, clinicians, geneticists, and others) we can find a working solution that is satisfactory, we should be in a stronger position to make proposals about how to interpret the data which would ultimately be beneficial to all (researchers and members of society).
Regarding health research and recruitment mandates such as those discussed with respect to the P3EGS project, it is self-evident that no population should be excluded or left out from the potential benefits of genetic research because they are not as easy to recruit or because of assumptions about inherent biological or genetic differences (Epstein 2008; Shim et al. 2014). Such exclusion by convenience or assumption is counter-productive in terms of the generalizability of research findings and is especially problematic when one considers the biases in genomic biobanking, which have been noted to work against our understanding of genomic differences among diverse populations (Petrovski and Goldstein 2016; Popejoy and Fullerton 2016; Hindorff et al. 2018; Landry et al. 2018). Timmermans and Epstein (2010, 78) have argued that we are witnessing a process of standardization with respect to the call for recruitment of underserved populations wherein researchers (and the public) may come to assume that labels such as underserved are markers for internal homogeneity and that persons who do not fall into this category can be assumed to be different in some manner. This assumption of in-group similarity and out-group difference becomes reinforced the more often a category is used in research (see Timmermans and Epstein 2010 for further discussion of this point). Indeed, much of the argument about underserved and not underserved appears to have parallels with conceptual issues concerning how difference is assumed to be the property of non-European origin populations, while the European population is seen as standard and thus left largely un-investigated (Louis 2005; Williams 2015). Embedded social scientists and bioethicists are in an ideal position to recognize when this is happening and convey this to the broader team, such that false dichotomies are avoided and interpretations of difference are provable rather than assumed.
Through embedded social scientific research, we can explore and even experience the paradoxes and tensions inherent in the process of recruiting diverse populations and interpreting the subsequent data and also make strong suggestions as projects are ongoing as to how to manage these tensions and paradoxes, so that at the very least we do not end up creating new forms of essentialized thinking about health inequalities. Embedded social scientists are in an ideal place both to note and work within this inherently contradictory and troubling space, communicating their respective findings to research colleagues about what it means to be underserved as well as highlighting the inherent dangers of reifying categories. They are not, however, in a position to resolve the tension that emerges from two fundamentally different approaches to recruitment: reaching a recruitment target for the specific population (as a remit of the study) and gaining a deep understanding of the experiences of the recruited population. Indeed, the former is largely concerned with creating a category and the latter is largely concerned with unpacking or challenging the boundaries of that category. The role of embedded social scientists is to continuously review what is happening during the course of a research project and communicate these findings to researchers in order to maximize our understanding of the data from different perspectives, noting the complexities inherent in such data. At its best, the role of embedded social scientists is truly co-creative; mixing observational insights based on actual practice to improve scientific practice while still enabling a project to function.

Limitations
Our reflections upon the challenge of embedded social science in health research come out of one study and are situated specifically in the context of offering genomic diagnostic services in a relatively urban area. They also reflect the process of the study itself as experienced, which is by definition unique. Other studies may have found that constant feedback between colleagues from different branches of the study largely eliminates the issues highlighted. For example, a study might have come to grips early on with the problem of comparability of data and found ways to deal with this or may have recruited in such a manner that such questions became of little consequence. More widely, biomedical studies that do not focus on genetics may not have to focus as forcefully on avoiding the pitfalls of biological essentialism. Nevertheless, it is argued the challenges presented and discussed are ones which will continue to be seen in many such study contexts within and outside of the health sciences.

Conclusion
We have explored the challenges posed by the remit of inclusion of underserved populations in biomedical research in order to address health disparities. We have suggested that such challenges are not fully resolvable, but in recognizing them we can incrementally improve research practice from within. This paper is not intended as a condemnation of the original call for applications or of the objective of recruiting underserved populations into health research. The inclusion of embedded social scientists alongside colleagues for whom the primary objective is to recruit, collect biological samples, and analyze these samples for clinical purposes provides a rich and complex stage from which we can learn more about how to maximize the benefits of such research to all populations.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
Research reported in this publication was supported by the National Human Genome Research Institute of the National Institutes of Health under [Award Number U01HG009599]. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

Ethics declaration
The study was approved by the UCSF Human Research Protection Program Institutional Review Board (IRB), approval number 17-23118. Participants were provided with written and verbal details of the study and provided verbal informed consent to be interviewed and recorded prior to interviews.