Selection bias in general practice research: analysis in a cohort of pregnant Danish women

Abstract
Objective: The aim of the present study was to examine selection in a general practice-based pregnancy cohort.
Design: Survey linked to administrative register data.
Setting and subjects: In spring 2015, GPs were recruited from two Danish regions. They were asked to invite all pregnant women in their practice who had their first prenatal care visit before 15 August 2016 to participate in the survey.
Outcome measures: The characteristics of GPs and the pregnant women were compared at each step in the recruitment process (the GP’s invitation, their agreement to participate, actual GP participation, and the women’s participation), and an uncertainty coefficient was used to identify the step at which the largest selection occurred.
Results: Significant differences were found between participating and non-participating practices with regard to practice characteristics such as the number of patients registered with the practice, the age and sex of doctors, and the type of practice. Despite these differences, the characteristics of the eligible patients differed little between participating and non-participating practices. In participating practices significant differences were, however, observed between recruited and non-recruited patients.
Conclusion: The skewed selection of patients was mainly caused by a high number of non-participants within practices that actively took part in the study. We recommend that studies aiming for representative samples of patient populations in general practice focus primarily on sampling within participating practices.
Key points
• Selection among general practitioners (GPs) is often unavoidable in practice-based studies, and we found significant differences between participating and non-participating practices. These include practice characteristics such as the number of GPs, the number of patients registered with the GP practice, as well as the sex and age of the GPs.
• Despite this, only small differences in the characteristics of the eligible patients were observed between participating and non-participating practices.
• In participating practices, however, significant differences were observed between recruited and non-recruited patients.
• Comprehensive sampling within participating practices may be the best way to generate representative samples of patients.


Introduction
Population-based research in primary care generally depends on gaining access to primary care settings and recruiting patients with a degree of diversity representative of the population. General practitioner (GP) participation is a crucial component. Many research projects in general practice are based on self-selection among GPs [1], raising questions about possible selection bias. All GPs may be invited to participate in a study, but not all decide to take part. For simple questionnaires aimed at GPs, response rates just above 50% are common [2][3][4][5]. More complex projects, which involve the inclusion of patients and extra clinical work, often have lower participation rates [6]. Various barriers to GP recruitment in research have been identified, including practical and organisational factors, such as competing time commitments and a lack of reimbursement [7], as well as personal factors [8].
Consecutive sampling of attending patients is widely used, given its simplicity and ease of implementation. The implications of using such an approach have, however, received scant attention. One study compared the representativeness of a visit-based sample with the population of patients seen during the same year and found that sampling of consecutive attenders typically underrepresents low users of the service [9]. Other studies have shown lower levels of participation among less privileged groups of society [10]: people with low income or low educational levels have typically been underrepresented in cohort studies in Western society [11]. Participation rates have also varied according to, for example, age and sex [11]. Sampling bias may, therefore, arise in various ways: GPs may decide not to participate, they may not invite some patients to participate, or patients may themselves decide not to accept the invitation [12,13]. Knowing the extent of these mechanisms would be valuable in developing strategies to obtain representative samples.
Studies of non-participation, aiming to compare data for participating and non-participating GPs and patients, must rely on population data produced independently of the study, in particular administrative register data. In Denmark, such data are available to researchers on an anonymised basis in national registers.
The aim of the present study was to examine selection in a general practice-based pregnancy cohort. We studied the differences in a range of practice characteristics and characteristics of the pregnant women at each step in the recruitment process. The steps in which the largest differences occur are the most critical in avoiding selection bias, and we discuss some implications of our results for study design.

Setting
The healthcare system in Denmark is tax-funded, and care is free of charge for the patient. The majority of Danes are registered with a GP who functions as gatekeeper to specialist secondary care. Citizens are entitled to a regular GP of their own choosing and thereby become registered with a practice. Some practices are small and single handed, while other practices comprise 2-6 GPs, who own the clinic jointly and share a larger number of patients.
By law, the GP offers a minimum of three prenatal care visits, at pregnancy weeks 6-10, 25, and 32. A fourth visit, for postnatal care, is conducted 8 weeks after delivery. The first visit is attended by almost all pregnant women wanting to keep their pregnancy and precedes other pregnancy-related contacts in the healthcare system. In this consultation, a thorough and structured record is established (the Pregnancy Health Record), which is then sent to midwives and the hospital department.

Recruitment to the study
The present report is based on the recruitment process for a cohort study of pregnant women recruited in general practice at the first prenatal care visit [14]. This study aimed to investigate the physical and mental well-being of the women during their pregnancy and postpartum.
All GPs working in the Capital Region of Denmark and Region Zealand, two of the five Danish administrative regions, were eligible to participate in the study. In spring 2015, a subgroup of these practices was selected and invited to participate and recruit pregnant women to the study. A systematic procedure was used for the selection: first, all practices were allocated to geographically defined subgroups using municipalities and postal codes. These subgroups were randomly ranked and the practices in these subgroups were then contacted in the order of the ranking. The initial contact was a telephone call from the principal investigator (RE) to the GPs, and if the GP indicated interest, this was followed up by an email with detailed information and, on some occasions, a visit to the practice.
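The selection procedure described above can be illustrated with a minimal sketch. It assumes each practice carries a single geographic label (standing in for the municipality and postal code grouping); the function name `contact_order`, the practice identifiers and the area labels are hypothetical and are not taken from the study.

```python
import random
from collections import defaultdict


def contact_order(practices, seed=None):
    """Return practice ids in the order they would be contacted.

    practices: iterable of (practice_id, area) pairs, where `area` is a
    hypothetical label for the geographically defined subgroup.
    """
    # Step 1: allocate every practice to its geographic subgroup.
    groups = defaultdict(list)
    for practice_id, area in practices:
        groups[area].append(practice_id)

    # Step 2: rank the subgroups in random order.
    areas = sorted(groups)
    random.Random(seed).shuffle(areas)

    # Step 3: contact practices subgroup by subgroup, following the ranking.
    return [pid for area in areas for pid in groups[area]]


if __name__ == "__main__":
    example = [("p1", "A"), ("p2", "B"), ("p3", "A"), ("p4", "C"), ("p5", "B")]
    print(contact_order(example, seed=1))
```

Under such a scheme the unit of randomisation is the subgroup rather than the individual practice, so practices from the same area are always contacted together.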
GPs who accepted participation (before 30 June 2015) were asked to invite all women booking an appointment for a first prenatal care visit in the recruitment period until 15 August 2016. GPs were offered a fee for each pregnant woman recruited to the study, an amount corresponding to reimbursement for one normal consultation for each woman.
During the study period, there was frequent communication between the principal investigator (RE) and the participating practices about recruitment, including e-mails about the progression of the study.

Identification of the source population
To identify selection differences for the purpose of the present study, data were obtained from the Danish national registers [15]. These registers are based on a 10-digit civil registration number, a unique identifier assigned to all individuals in Denmark at birth or upon immigration. Based on the civil registration number, Statistics Denmark provided an anonymised linkage between data on all pregnant women who were listed with the participating GP practices and had attended the first prenatal care visit (coded by the GP for remuneration purposes). Using the specific code (8110), it was possible to extract data on all women attending the first prenatal care visit in each practice during the recruitment period. Approval from the Danish Data Protection Agency was obtained (Journal 2014-41-3018). According to Danish law, observational studies and studies based entirely on data collected from registers do not need approval from a scientific ethics committee.

Statistical analysis
Differences were studied in practice characteristics between practices wishing to participate (accepted participation) versus practices not wishing to participate (declined participation), and differences between practices that actually recruited patients to the study and practices that agreed to participate, but did not recruit patients. Selective recruitment was studied by means of administrative data based on practice and patient characteristics. Finally, the characteristics of the recruited patients were compared to patients that were not recruited.
To compare the strength of the various selection effects for a given characteristic, uncertainty coefficients [16][17][18] were calculated. This coefficient is the percentage reduction in the variation of the characteristic (measured as entropy) due to a selection effect. As the selection effects are binary indicators, e.g. participating practice and non-participating practice, the variation that remains after removing a selection effect is calculated by pooling the two within-group variations. The value of the coefficient tends to decrease towards 0 as the number of categories of the characteristic increases, so there is no benchmark value indicating a particularly strong effect. However, higher values of the uncertainty coefficient indicate stronger selection effects, and coefficients for the same characteristic were compared to determine the strongest selection effect for that characteristic. A two-stage non-parametric bootstrap was used to obtain 95% confidence intervals for these coefficients, accounting for clustering of women in practices. The statistical analyses were performed in SAS version 9.4 (SAS Institute, Cary, NC) and R version 3.5.1.
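As an illustration of the measure described above, the following sketch computes an uncertainty coefficient as the percentage reduction in Shannon entropy of a categorical characteristic given a binary selection indicator, with a two-stage bootstrap that resamples practices and then women within practices. It is written in Python rather than the SAS and R used for the study, and the function names, the record layout and the use of a percentile interval are assumptions made for this example, not details reported by the authors.

```python
import math
import random
from collections import Counter


def entropy(values):
    """Shannon entropy (natural log) of a list of category labels."""
    n = len(values)
    return -sum((c / n) * math.log(c / n) for c in Counter(values).values())


def uncertainty_coefficient(characteristic, selected):
    """Percentage reduction in entropy of `characteristic` given `selected`."""
    total = entropy(characteristic)
    if total == 0:
        return 0.0  # no variation to explain
    n = len(characteristic)
    # Pool the within-group entropies, weighted by group size.
    within = 0.0
    for flag in set(selected):
        group = [x for x, s in zip(characteristic, selected) if s == flag]
        within += len(group) / n * entropy(group)
    return 100.0 * (total - within) / total


def two_stage_bootstrap_ci(records, n_boot=1000, seed=0):
    """Percentile 95% interval (an assumed choice) for the coefficient.

    records: list of (practice_id, characteristic, selected) tuples,
    one per woman.
    """
    rng = random.Random(seed)
    by_practice = {}
    for practice_id, x, s in records:
        by_practice.setdefault(practice_id, []).append((x, s))
    practices = list(by_practice)

    estimates = []
    for _ in range(n_boot):
        xs, ss = [], []
        # Stage 1: resample practices with replacement.
        for practice_id in rng.choices(practices, k=len(practices)):
            women = by_practice[practice_id]
            # Stage 2: resample women within the sampled practice.
            for x, s in rng.choices(women, k=len(women)):
                xs.append(x)
                ss.append(s)
        estimates.append(uncertainty_coefficient(xs, ss))

    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[min(n_boot - 1, int(0.975 * n_boot))]
    return lower, upper
```

A characteristic that is perfectly separated by the selection indicator gives a coefficient of 100%, and one distributed identically in both groups gives 0%, which is why coefficients are best compared across selection steps for the same characteristic.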

Results
The invitation, participation, and recruitment processes are shown in Figure 1. We invited 305 out of 1561 general practices in the Capital Region of Denmark and Region Zealand, following the systematic randomisation procedure. A total of 190 practices (62% of those invited) agreed to participate, but only 125 (41% of those invited) recruited one or more pregnant women during the study period. These active practices recruited an average of 12 women (range 1-84) and only 1508 (17%) were recruited to the study from the 9028 eligible pregnant women who attended the first prenatal care visit at a practice that had agreed to participate. For four individual women in the study, we were not able to determine their GP. During the study some women moved, some could not be traced because of an incorrect or missing civil registration number, and others had a spontaneous abortion; this left 1434 out of 1508 women to participate in the study.
Table 1 shows the characteristics of the practices at each step in the selection process. As seen from the uncertainty coefficients, the most pronounced differences in practice characteristics (number of patients on the list, type of practice, age and sex of doctors) were between practices that recruited women and those that did not recruit women. Geographic location of the practice was the only factor for which the difference was largest between those practices that were invited and those not invited.
Table 1. Differences in practice characteristics between practices that were invited into the study versus practices that were not invited into the study, practices that accepted participation into the study versus practices that declined to participate among those invited, and practices that recruited women into the study versus those that did not recruit women into the study among those which had agreed to participate. The boxes show the uncertainty coefficient in % (95% confidence interval in brackets), which quantifies the difference in distribution and thereby the relative strength of the selection effect in each step for each characteristic of the women. The uncertainty coefficient builds on Goodman and Kruskal's classic review of association measures [18].
Table 2 shows characteristics of all women who had a first prenatal care visit in the study period for each step in the selection process. Some effect of the sampling process was observed at all steps, but the uncertainty coefficients indicate that the most pronounced differences in socio-demographic characteristics were found between included and non-included women within practices that recruited patients. Patients that were included in the active practices were less likely to live alone and more likely to be born in Denmark, well educated, employed, have a higher household income, have other children, and have fewer contacts with the GP per year. However, for the women's age and their use of prescription medicines for central nervous system (ATC-code N), the largest differences were seen at the initial invitation stage, i.e. between invited and non-invited practices in our study.
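The recruitment proportions quoted at the start of the Results can be recomputed directly from the reported counts. The short sketch below does this arithmetic; the dictionary keys and the helper `pct` are names chosen for this example only.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


# Counts as reported in the Results section.
counts = {
    "practices_in_regions": 1561,
    "practices_invited": 305,
    "practices_agreed": 190,
    "practices_recruiting": 125,
    "women_eligible": 9028,
    "women_recruited": 1508,
    "women_participating": 1434,
}

steps = [
    ("agreed, of invited practices",
     counts["practices_agreed"], counts["practices_invited"]),
    ("recruiting, of invited practices",
     counts["practices_recruiting"], counts["practices_invited"]),
    ("recruited, of eligible women",
     counts["women_recruited"], counts["women_eligible"]),
]

for label, part, whole in steps:
    print(f"{label}: {part}/{whole} = {pct(part, whole)}%")

# Mean number of women recruited per active practice.
mean_per_practice = counts["women_recruited"] / counts["practices_recruiting"]
print(f"women per recruiting practice: {mean_per_practice:.1f}")
```

Laid out this way, the largest relative loss is visible at the patient level: most eligible women in practices that had agreed to participate were never recruited.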

Statement of principal findings
Considerable differences were found between practices that recruited women and practices that did not recruit women. Despite these differences, the characteristics of the eligible pregnant women in these practices differed little. Within active practices, however, considerable differences were observed between women who were recruited and women who were not recruited. Selection among GPs is often unavoidable in practice-based studies, and our study shows that the selective recruitment of individuals within the practice may be most critical for the representative or balanced sampling of patient populations.

Strengths and weaknesses of the study
Demographic information on GPs and patients was studied using the Danish National Registers. It is often difficult to get information about non-participants in cohort studies [19], but this pregnancy study offered a unique opportunity to identify the source population, because almost all pregnant women in Denmark get in contact with their GP early in pregnancy, and first pregnancy consultations are registered in a national database based on reimbursement data provided by GPs. Personal characteristics and socioeconomic information are also available in the national registers enabling us to describe participants and non-participants by data of high validity, without recall bias and with low risk of misclassification compared to self-reported data [20].
Our study analysed selection at the various stages of the recruitment process in relation to both practice characteristics and the sociodemographic characteristics of the pregnant women that were available in registers. It is important to stress that the representativity of the sampled population may be different for measures not available in registers, such as the occurrence of sleep problems, physical discomfort, depressive symptoms, and other issues in pregnancy. Such measures of interest may theoretically be distributed differently in women that are recruited and women that are not recruited, irrespective of selection in the sociodemographic variables in the cohort [14]; wellbeing may be related to participation status even when no selection is found in sociodemographic factors, and the other way around. The problems mentioned above, however, and indeed many other health problems, do show strong associations with sociodemographic characteristics [14,21,22]. We consider our survey of selection using the available register data indicative, therefore, of a general tendency of selection at various steps of sampling in our cohort.
We observe that the selective inclusion of pregnant women within the participating practices was more important for the observed differences in most sociodemographic characteristics of the women than the selection of the GPs. Selection among doctors could be more important for the selection of other groups of patients with diseases such as hypertension, acute infections or multimorbidity; the patient lists of older doctors may, for example, comprise more patients with complex multimorbidity than lists of younger doctors. Differences between interested doctors and non-participants may also be more important if the GP has a more active role in defining the eligible patients. In our study, we were surprised to find that whether or not a practice was invited was significantly associated with its geographic location and with the age of its patients. We had attempted a systematic random selection of GPs, but this evidently did not produce a fully representative set of invited practices. Future studies should try to investigate the relative importance of selection among doctors versus selective inclusion of patients by participating doctors in other patient groups.
Table 2. Differences in characteristics of the pregnant women between practices that were invited into the study versus practices that were not invited into the study, practices that accepted participation in the study versus practices that declined to participate among those invited and practices that recruited women into the study versus those that did not recruit women into the study among those which had agreed to participate. Furthermore, it shows differences in characteristics of women who were included in the study versus women who were not included in the study from the practices that actively recruited women into the study. The boxes show the uncertainty coefficient in % (95% confidence interval in brackets), which quantifies the difference in distribution and thereby the relative strength of the selection effect in each step for each characteristic of the women. The uncertainty coefficient builds on Goodman and Kruskal's classic review of association measures [18].

Comparison to other studies of sampling in general practice
Self-selection among doctors may be difficult to avoid. Common barriers to GP participation and retention in research projects include the following: GPs having little insight into research design; concern about the misuse of patient data; scepticism about the value of the research; survey overload; lack of time; and results that are not locally relevant [23][24][25][26]. GP recruitment is time-consuming and may involve many phone calls, emails and visits. In one Danish study GPs were invited by letter to participate in prospective registration of patients with a respiratory tract infection; only 8.5% of the invited practices agreed to participate [27]. A study investigating barriers and facilitators to patient recruitment in primary care sent 1662 invitation letters and enrolled 55 GPs [28]. Although it is difficult and time-consuming, a personal contact with the GP seems to be more effective than asking administrative staff for permission to send the practice an e-mail containing information about a project [29,30].
GPs who agree to participate in trials do not always then recruit patients. Only 41% of the invited GPs recruited one or more pregnant women to our cohort. A Dutch study [31], investigating the effectiveness of two treatment strategies for dyspepsia, reported that 48% of the GPs recruited one or more patients. A study involving patients with menorrhagia found that 41% of GPs who agreed to participate actually recruited patients [32], while a study investigating GP and patient recruitment in a trial to determine the usefulness of brain natriuretic peptide in the diagnosis of heart failure found that 31% of the participating GPs recruited patients [33]. Higher patient recruitment rates may be promoted by establishing a relationship with GPs and clinic staff, as well as keeping regular contacts, giving clear instructions and minimising tasks for participants [29]. A study investigating the validity of a response rate of 44% obtained in a national postal study of GPs surveyed about their work with patients with alcohol abuse found some significant evidence for the presence of non-response bias, but the low response rate did not necessarily affect the validity of the data collected [34].
Recruitment may depend on a number of factors related to GPs, the topic of the investigation and patient groups. Obviously GPs, as well as patients, may be more willing to join projects that interest them. The time required to take part may also be important. First pregnancy consultations can be time-consuming and this may prevent the inclusion of some patients. Concurrent studies in primary care involving pregnant women could also have lowered inclusion, but we are not aware of such studies in the study period. The number of women recruited by each of the GPs who participated in our study varied considerably, and this is similar to observations in other studies [33]. Organisational characteristics of our high-recruiter practices included: larger practices, group practices, female GPs, and practices located in rural areas or in smaller cities. Single-handed practices were over-represented among the low recruiters. A smaller number of patients may reduce the potential for recruitment and the study may thus be brought to the GP's attention less frequently. Such effects should be studied further. Studies in the Nordic countries and New Zealand report no major GP gender differences in recruitment patterns [28,33]. A study investigating differences in medical service and the demographics of participating GPs in the five Nordic countries reported findings consistent with our results: 47% of the GPs were women, they had a mean age of 50 and they generally shared their practice with other GPs [35]. In a study from Norway exploring the associations between GP characteristics and the quality of care for patients with type 2 diabetes, 73% of the invited practices participated and in total 55% of the GPs were male, 68% had specialist accreditation and 82% were born in Norway [36].
The characteristics associated with participation among the pregnant women were that they were born in Denmark, were well educated and had a good income. Similarly, the Danish National Birth Cohort (DNBC), a nationwide cohort study with data from 100,000 women, showed underrepresentation of women outside the workforce, with low education levels, low income and non-Danish origin [11]. A systematic review investigating participation bias in cohort studies found an average participation proportion of 64%, and only age, year of contact and study region were associated with participation. This led the authors to suggest that evidence about participation and compliance should be assessed prior to funding, and local knowledge should be included in addressing the potential participants [12].

Meaning of the study
We found significant differences between participating and non-participating practices with regard to practice characteristics such as number of GPs, number of patients registered with the GP practice and the sex and age of the GP. Only relatively small differences were, however, observed in the characteristics of the eligible patients between participating and non-participating practices. The most important differences in socio-demographic characteristics were found between those patients included and those not included in the practices that actively recruited patients. Comprehensive sampling within the participating practices may be the best way to generate representative samples of patients: the fact that some practices in our study achieved very high recruitment rates suggests that it was the GP's invitation to participate rather than the acceptance of participation by the patient that was the crucial factor in determining level of recruitment.
Selection of a specific group of women which is not representative of all pregnant women is, however, hard to avoid. This may bias results if selection is present both in the exposure and the outcome of interest. Some of this bias may be removed by adjusting the analysis for observable factors known to be subject to selection, for example, some of the factors investigated in the present paper [37]. A better approach, not necessarily possible in all studies, is to randomise the exposure or intervention; this removes confounding, notably confounding through selection.