The importance of collaboration and supervisor behaviour for gender differences in doctoral student performance and early career development

ABSTRACT
This article provides an explanation for previously observed gender differences in scientific performance during doctoral studies and the early career. Data is based on doctoral students in science, technology, and medicine at a Swedish university. We collected information on each doctoral student's publication and employment history. We also created publication histories for the doctoral candidates' main supervisors. The data was supplemented with information on gender, age, and research area. Informed by theories on academic socialization, our research questions focus on how gender differences in productivity during doctoral studies and the early career relate to research collaboration and behaviour/characteristics of the main supervisor. Results show that the gender gap in productivity during doctoral studies, and the early career, can be explained by the degree to which the doctoral students co-author publications with their main supervisors and the size of their collaborative networks.


Introduction
A meta-analysis published by the European Commission (2012) concluded that horizontal and vertical gender segregation persists in the higher education sector. Political initiatives to decrease gender segregation in the higher education sector focus mainly on three issues: decreasing gender segregation in educational choices, increasing the share of women who choose science as their vocation, and improving career prospects for women who have chosen science as their vocation. Despite similar patterns of gender segregation, there is considerable variation in levels of segregation among EU member nations and no general movement currently exists toward less gender segregation in European countries. Some of the differences between the countries, especially the share of women in the scientific workforce, can be explained by general levels of gender equality in the various countries. However, there is less cross-country variation as concerns the issue of career advancement for women in the higher education system, and the patterns are strikingly similar in each country. This is also true for the Nordic countries, which are those with the highest levels of gender equality.
In the case of Sweden, the extent to which women have entered the scientific workforce has changed substantially, with significant overall changes between 1973 and 2018 in the gender composition of research areas. In the natural sciences, the share of women completing their doctoral studies increased from 9.0% in 1973 to 35.7% in 2018. In technology, 2.9% of those completing their doctoral studies in 1973 were women, which increased to 32.0% in 2018. However, the greatest changes in gender composition are found in medicine (from 9.4% in 1973 to 58.7% in 2018), and social sciences (from 14.3% in 1973 to 57.6% in 2018). We also find a majority of women among those completing their doctoral studies in the humanities (from 31.0% in 1973 to 58.7% in 2018). It should be noted that the gender composition in medicine and social sciences is affected by the 'academization' of tertiary educational programmes strongly related to professions with a high female representation, such as nursing and social work.
Despite the changes in the gender composition of the Swedish higher education system, and the increased importance of gender equality in governmental policies, female researchers' career prospects within the higher education system, measured as the probability of becoming a professor, exhibit no observable improvement in Sweden during this same time period (Danell and Hjerm 2013). This also seems to be the case in other European countries (European Commission 2012). There may be many factors that explain why the promotion rate for women has not improved despite changes in the gender composition of research areas, and these factors can be of greater or lesser importance in different national contexts, depending on the levels of gender equality and variations in the setup of the career system at universities. However, a common factor for career success in all higher education systems is the demand for scientific publications. Career advancement in a university context is conditioned by the researcher's publication history, and even small differences in annual publication rates accumulate over time and can generate large differences in promotion rates, especially if the competition for positions is intense (Zuckerman 1991). Moreover, previous research indicates that productivity is a central driver for career development in science (see, e.g. van den Besselaar and Sandström 2016; Gaule and Piacentini 2018).
The so-called 'productivity puzzle' (Cole and Zuckerman 1984; Zuckerman 1991), i.e. the average difference in publication rates between male and female researchers, is a persistent phenomenon that has proven difficult to explain. Previous research has shown that performance differences between males and females, where males are on average more productive than females, arise already during the doctoral studies period (see, e.g. Epstein and Lachmann 2018; Pezzoni et al. 2016; Lindahl, Colliander, and Rickard 2019). While previous research on the productivity puzzle has focused on the early and later career phases (see, e.g. European Commission 2012; Ceci et al. 2014), less attention has been given to explaining gender differences in productivity during the doctoral studies period and how these differences are reproduced in the early career phase. In this study, we address this knowledge gap.
The purpose of this study is to gain a better understanding of the gender gap in productivity in the fields of science, technology and medicine (how it arises as early as during doctoral studies, and how it is reproduced in the early career phase), with a focus on research collaboration and supervisor behaviour. The main contribution and novelty of our study is that we examine how collaboration and supervisor behaviour affect productivity differences between women and men during completion of doctoral studies and the early career phase. During doctoral studies, the formal conditions are similar, if not identical, for all doctoral students, at least in Sweden. Swedish doctoral students are employed by the university, and as such have all the formal rights that government employment entails and a fixed amount of time allocated for research. Each doctoral student has three years of full-time employment for research, and one year of courses. The similarity in formal conditions for research among the doctoral students enables us to focus on other factors that are unique for doctoral education and that could affect gender differences in productivity, such as the scientific standing of their supervisor, and integration into collaborative research networks. In the article we present: (1) results from an estimation of how the gender difference in publication rates among doctoral students varies between different publication-based indicators; (2) an explanation of the observed gender difference among these students; and (3) a description of how this gender difference in publication rates is reproduced during the early career phase.
The main findings of this study are (1) that we can observe a gender gap in productivity already during doctoral studies and that this gender gap, to a large extent, can be explained by the size of the external collaborative network and co-authoring with the main supervisor; and (2) that collaborative behaviour continues to be an important factor in explaining the gender gap in productivity in the early career phase (i.e. the years after completing doctoral studies). The focus of this article is on doctoral students within science, technology, and medicine who completed their doctoral studies at a single Swedish university between 2006 and 2010.

Previous research and theoretical framework
Explanations of the productivity puzzle
The 'productivity puzzle', i.e. that male researchers publish on average more papers than female researchers, is well documented in the literature (Cole and Zuckerman 1984; Zuckerman 1991; Lemoine 1992; Long 1992; Xie and Shauman 1998; Prpic 2002; Sax et al. 2002; Taylor, Fender, and Burke 2006; Abramo, D'Angelo, and Caprasecca 2009; Frandsen et al. 2015; Ebadi and Schiffauerova 2016). Several explanations for the performance difference have been proposed; however, the results have pointed in different directions, and the general view is that gender-based performance differences have not been properly explained (European Commission 2012). Hence, the labelling of this phenomenon as the productivity puzzle.
One explanation for the productivity puzzle revolves around parenthood and to what degree having children affects gender differences in productivity (Ceci et al. 2014). However, while it has been shown that females more often interrupt their careers to have children and start a family (Prozesky 2008), studies of the effect of family-related variables on performance have shown mixed and to some extent contradictory results (Ceci et al. 2014). One line of studies suggests that having children has a relatively small effect, if any, for both males and females (see, e.g. Xie and Shauman 2003; Stack 2004; Sax et al. 2002). Another body of research suggests that having children does have a negative effect on female research performance in comparison with male performance (see, e.g. Fox 1995; Fuchs, von Stebut, and Allmendinger 2001; Hunter and Leahey 2010; Ecklund and Lincoln 2011; Lutter and Schröder 2019). The results of Ginther and Kahn (2006) provide a more complex picture, suggesting that there are field-specific differences in the effect of having children and that the effect is very low in most research fields. Ginther and Kahn (2006) conclude that having children cannot explain the overall productivity puzzle.
A second explanation is concerned with how time for research is distributed between males and females at different stages in their careers (Ceci et al. 2014). Previous research has shown that in academia, females tend to work fewer hours per week than males (Ferriman, Lubinski, and Benbow 2009; Lubinski and Benbow 2006; Manchester and Barbezat 2013). Xie and Shauman (2003) more specifically examined time for teaching versus time for research and found that the effect of gender on productivity decreases when controlling for time for teaching and time for research. The issue of time for research is related to a third explanation that examines gender, position in the academic hierarchy, and the resources these positions provide access to. Xie and Shauman (2003) found that the observed productivity differences between males and females disappeared when they controlled for factors such as academic track, academic position, type of institution, and available resources.
A fourth explanation is related to male and female involvement in professional networks and collaboration. Many studies have concluded that women researchers have relatively less access to social capital in terms of participating in international collaborations and co-authorship networks (Allison and Long 1990; McNamee, Willis, and Rotchford 1990; Grant and Ward 1991; Dundar and Lewis 1998; Renzulli, Aldrich, and Moody 2000; Prpic 2002; Lee and Bozeman 2005; Bland et al. 2006; Carayol and Matt 2006; Leahey 2006; Taylor, Fender, and Burke 2006; Puuska 2010). Collaboration affects researchers' productivity as well as the visibility of the research in terms of citation impact (Persson, Glänzel, and Danell 2003; Lee and Bozeman 2005). Consequently, the lower degree of collaboration among females in comparison with males may have a negative effect on their research productivity and their career development.
While the above explanations are some of those most commonly suggested, other less investigated lines of research relate the gender gap in productivity to, e.g. institutional resource decisions (Duch et al. 2012), and to the fact that females specialize less in their research topics than males, which may result in fewer publications (Leahey 2006).
Taking into consideration the large differences between countries in gender equality, labour market policy, and organization of higher education institutions, the fact that there are several different explanations for the productivity gap between males and females may not be that puzzling. However, what is puzzling is that gender differences in research productivity seem to exist in most western countries, regardless of national differences in gender equality, policy, and organization of higher education institutions. In this study, we understand the productivity puzzle within the framework suggested by Zuckerman (1991), in which the explanations of gender differences in productivity, and in career development at large, are categorized as caused by social selection (i.e. gender discrimination) and/or by self-selection (i.e. individual choice). In Zuckerman's (1991) framework, it is assumed that the effects of social selection and self-selection accumulate over time and that in each step they, on average, provide advantages for males and disadvantages for females. Zuckerman (1991) refers to this process as cumulative advantages and disadvantages. In the context of this study, cumulative advantages and disadvantages suggest that small initial differences in productivity between males and females in the early career phase will accumulate and become larger as the career progresses (Zuckerman 1991;Merton 1988). Cumulative advantages and disadvantages are not dependent on which particular factors determine the gender gap, but rather suggest that the gender gap grows incrementally, and that different factors may contribute at each step (Zuckerman 1991).
Previous studies have shown that gender-based differences in productivity can be observed as early as during the doctoral studies period (see, e.g. Lindahl et al. 2019;Epstein and Lachmann 2018;Pezzoni et al. 2016). However, potential explanations for gender differences during doctoral studies are scarce. Feldon et al. (2017) examined 336 first-year doctoral students in the biological sciences in the US and found that despite males spending, in comparison with females, less time on supervised research and less time on assigned tasks (i.e. less time in the laboratory), they were more likely than females to publish journal articles during their first year as doctoral students. As suggested by Epstein and Lachmann (2018), it is possible that the results of Feldon et al. (2017) are a consequence of supervisors rewarding male and female doctoral students differently. Such a phenomenon could be explained by the so-called Matilda effect, a theory that suggests that women's achievements are systematically under-recognized in science (Rossiter 1993). Epstein and Lachmann (2018) examined a sample of 730 doctoral students in the life sciences in Germany and found that integration into their scientific community, as measured by two scales of subjective assessment, had a positive effect on male doctoral students' first authored publications, but not on those of females. In this study, we are aiming to further advance our understanding of which factors contribute to the gender gap in productivity during doctoral studies. Against the backdrop of Zuckerman's (1991) theoretical framework of cumulative advantages and disadvantages, which suggests that these differences will grow and become larger over time, our assumption is that a key to attaining a better understanding of the gender gap in productivity and how it arises in the careers of young researchers is to focus on central aspects of doctoral education, e.g. supervisors, mentorship, knowledge production, and networking. 
In the next sub-section, we will elaborate further on a theoretical framework for examining these elements.

Academic socialization
Previous research on gender differences in research performance and career development, together with the theory of cumulative advantages and disadvantages, provides a framework for explaining gender differences in science. In this study, we utilize theories of academic socialization as a framework to examine the social aspect of knowledge production in academia (Weidman, Twale, and Stein 2001). Researchers are part of knowledge-producing communities, and as such, knowledge production and writing articles are not individual activities. Throughout the research career, scientific achievements are dependent on the researcher's ability to cultivate social relations that provide access to tacit knowledge, information, and other resources embedded in social networks (Etzkowitz, Kemelgor, and Uzzi 2000; Burt 2004). The framework of academic socialization is particularly relevant for this study since we are examining a sample of doctoral students in science, technology and medicine, fields which have a more collective model for doctoral education than is the case in the social sciences and the humanities. In the natural and life sciences, the thesis is often an important contribution to the supervisor's research project and the doctoral student is integrated into a larger group of researchers (Austin 2009; Becher and Trowler 2001; Delamont, Atkinson, and Parry 2000; Golde 2005; Knorr Cetina 1999; Pyhältö, Stubb, and Lonka 2009). According to Weidman, Twale, and Stein (2001), socialization during doctoral studies can, generally, be defined as 'the processes through which individuals gain knowledge, skills, and values necessary for successful entry into a professional career requiring an advanced level of specialized knowledge and skills' (iii). In this framework, the socialization process is closely connected with the institutional environment in which the doctoral studies take place and with the culture of these institutions (Weidman and Stein 2003).
However, it is important to note that the framework does recognize that universities are not isolated environments and that doctoral students may interact with communities and cultures from other institutions and contexts (Weidman and Stein 2003). Weidman and Stein (2003) suggest three central mechanisms of socialization: (1) interactions with others; (2) acclimatizing with the expectations of faculty and peers; and (3) learning the necessary knowledge and skills.
An important part of the socialization process in the context of doctoral education is related to the relationship between the doctoral student and his or her supervisor(s). The importance of the interpersonal interaction between supervisors and doctoral students has been relatively well studied in previous research (see, e.g. Basturkmen, East, and Bitchener 2012; Golde 2000; Kam 1997; Marsh, Rowe, and Martin 2002; McAlpine and Norton 2006; Ives and Rowley 2005; Pyhältö, Vekkaila, and Keskinen 2015; Williamson and Cable 2003). There are also several studies investigating how factors other than the supervisor affect doctoral student performance, for example, how junior and senior researchers in project groups and networks affect doctoral students' socialization and intellectual development (Austin 2002; Fenge 2012; Lee and Boud 2009). A relevant perspective on academic socialization during doctoral studies concerns mentors and mentoring. Previous research suggests that mentors and mentoring during doctoral studies have a positive effect on the overall doctoral student experience (Lyons, Scroggins, and Rule 1990), as well as on research performance (see, e.g. Cronan-Hillix et al. 1986; Green 1991; Reskin 1979; Paglis, Green, and Bauer 2006). Following the framework for mentoring suggested by Paglis, Green, and Bauer (2006), mentorship has three main functions: (1) psychosocial mentoring, which relates to the doctoral student's sense of competence, confidence, and effectiveness; (2) career-related mentoring, which includes activities that help prepare the doctoral student for advancing his or her career, e.g. challenging assignments and providing access to professional networks; and (3) research collaboration, which consists of providing doctoral students with the opportunity to co-author (e.g. conference papers, grant proposals, and journal articles) with their supervisors.
Our main focus in this study is on the third function of mentoring: research collaboration. Research collaboration with colleagues in the early career phase may be favourable for career development. Cameron and Blackburn (1981) conducted a study based on a questionnaire and interviews, with a sample consisting of 250 assistant, associate, and full professors in English, psychology, and sociology departments in the US, and found that receiving financial support from, and early collaboration with, a senior colleague is positively correlated with publication rate, number of received grants, collaboration, and professional network. Cameron and Blackburn (1981) also concluded that males had larger social networks than females, a factor that was associated with career success. It has been shown that mentors (e.g. supervisors) tend to associate with apprentices (e.g. doctoral students) who are similar to themselves with respect to, e.g. gender, race, and social class (Chandler 1996). A recent study found that doctoral students with supervisors of the same gender published more during their doctoral studies and were more likely to continue in academic science (Gaule and Piacentini 2018). It can be argued that this tendency of homophily (i.e. in the context of this study, that females and males prefer to collaborate with persons of the same sex; Epstein and Lachmann 2018) puts females at a unique disadvantage in science and academia due to their low representation at the upper end of the job hierarchy where most positions are held by males (Noe 1988). Since females tend to attain lower academic positions than males, homophily may be one reason why females benefit less from their collaborative networks than males (Epstein and Lachmann 2018). Moreover, previous research indicates that cross-gender mentorships are difficult for females due to, e.g. lack of access to information networks, stereotyping, social norms, and gender role expectations, and also because mixed-gender pairs might give rise to gossip, jealousy, and sexual attraction or tension (Noe 1988; Wright and Wright 1987; Chandler 1996). The number of potential mentors with a high similarity match is simply limited for females in science and academia, from which it follows that female doctoral students may have less access to the three main functions of mentorship: (1) psychosocial mentoring; (2) career-related mentoring; and (3) research collaboration.

Research questions
Based on our review of previous research on productivity differences between males and females and the theoretical framework of academic socialization, we have formulated three research questions. To make sure that the observed gender gap in performance during doctoral studies is not a consequence of a particular indicator, how the indicator is calculated, or the choice of database from which the data is collected, research question 1 is focused on validating the observed gender gap by testing different indicators, calculations, and databases. Research questions 2 and 3 are the main questions of this study, and are concerned with gender differences in performance during doctoral studies, and to what degree gender differences in performance during doctoral studies are reproduced in the early career phase.
Research question 1: What is the size of the gender gap in productivity during doctoral studies and how robust is the observed gender gap with respect to different indicators, counting schemes, and databases?
Research question 2: To what degree are observed gender differences in productivity among doctoral students related to (a) gender differences in research collaboration; and (b) behaviour and characteristics of the main supervisor?
Research question 3: To what degree are gender differences in productivity during the early career phase related to (a) performance during doctoral studies; (b) gender differences in research collaboration; and (c) behaviour and characteristics of the main supervisor?

Data
Our sample consists of 494 doctoral students who completed their studies at a Swedish university between 2006 and 2010 at the faculty of science and technology and the faculty of medicine. Publication data were collected from DiVA, a Swedish repository for research publications, and InCites, a citation-based research evaluation tool that utilizes bibliographic data from the citation indices accessible through Web of Science. The data from DiVA were downloaded in October 2018 and the citation data from InCites were downloaded in November 2018. A major advantage of using the DiVA repository is that we can uniquely identify doctoral students, supervisors, and other local collaborators. In DiVA, every publication is indexed with an author identifier, which is the same as the identity key universities assign to every student and employee. DiVA bibliographic records describing doctoral theses contain information about supervisors and, in the case of compilation theses, the included publications. This has enabled us to identify the supervisor(s) of the doctoral candidates and the publications of each supervisor. Publication data for doctoral candidates and supervisors has been supplemented with information on gender, age, research area, and organizational affiliation.

Design and variables
The overall design of this study consists of Design 1, which comprises the doctoral studies and is associated with research questions 1 and 2, and Design 2, which comprises the early career phase and is associated with research question 3. The early career phase is defined in accordance with Bazeley (2003) and comprises five years following the year of thesis completion. We chose the graduation cohorts of 2006-2010 for two reasons. The first reason is that from 2006, the coverage in the DiVA repository is robust. The second reason is related to the purpose and design. Consider an individual that graduated in 2010. For this individual, the early career phase extends to 2015. For the citation-based variables, e.g. the g-index, we used an open citation window with at least two years for publications in 2015 (Waltman 2016). Thus, the purpose and design of the study require seven years after the graduation year, which makes 2010 the most current year possible given that we collected the data in 2018. Gender equality in Swedish higher education became an important issue in the mid-1990s and has remained important through today (Utbildningsdepartementet 1994). An important question is whether there have been any reforms after 2006 in Sweden that may compromise the generalizability of the results for the 2006-2010 cohorts. However, since there have been no major gender equality reforms in Swedish higher education since 2006 (Vetenskapsrådet 2018), we assume that the generalizability to later cohorts is not compromised by such reforms. The dependent variable indicates the publication volume of a doctoral student during their doctoral studies for Design 1, and of an early-career researcher during the early years of their career for Design 2. Publication volume is operationalized as the number of publications in DiVA that are indexed under the category 'Peer reviewed'.
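The cohort-selection arithmetic described above can be sketched in a few lines. This is a minimal illustration, not the authors' procedure: the function and constant names are hypothetical, and the rule that the last required year must be a complete calendar year before the data collection year is an assumption inferred from the 2010 example.

```python
# Values stated in the text.
FIRST_ROBUST_DIVA_YEAR = 2006   # DiVA coverage is described as robust from 2006
EARLY_CAREER_YEARS = 5          # early career phase: five years after thesis completion
MIN_CITATION_WINDOW = 2         # at least two years of citations for the last publications


def early_career_end(graduation_year: int) -> int:
    """Last year of the early career phase for a graduation cohort."""
    return graduation_year + EARLY_CAREER_YEARS


def last_year_needed(graduation_year: int) -> int:
    """Last calendar year for which citation data are needed."""
    return early_career_end(graduation_year) + MIN_CITATION_WINDOW


def eligible_cohorts(data_collection_year: int) -> list[int]:
    """Graduation cohorts that satisfy both selection criteria.

    Assumption: the data collection year itself is incomplete, so the last
    needed year must fall strictly before it.
    """
    return [
        year
        for year in range(FIRST_ROBUST_DIVA_YEAR, data_collection_year)
        if last_year_needed(year) < data_collection_year
    ]


cohorts = eligible_cohorts(2018)
```

With data collected in 2018, the 2010 cohort needs data through 2017 and is therefore the latest cohort that qualifies under these assumptions.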
The main predictor that we are interested in examining operationalizes gender differences/gender bias and consists of a binary predictor where the value 1 represents males and 0 represents females. Coding: male.
We constructed 12 additional predictors. Our aim is to examine how our main predictor, male, changes when we add and control for these predictors in the model. In the following section, each predictor is presented and explained. Each paragraph presents a predictor and begins with the coding label of the predictor in italics, followed by an indicator in brackets denoting whether the predictor is used in Design 1, Design 2, or in both Designs 1 and 2. The section ends with a presentation of descriptive statistics of the predictors.
Age at completion of doctoral studies (Designs 1 and 2). The results in Costas, van Leeuwen, and Bordons (2010) indicate that top-performing researchers are younger than researchers with lower performance, independent of professional category (i.e. tenured scientist, research scientist, and research professor). The results also indicate that the impact of researchers' output (as measured by a citation-based indicator) decreases with age (Costas, van Leeuwen, and Bordons 2010). Thus, it seems reasonable to assume that the age of the doctoral student at completion of their doctoral degree may have an effect on performance during doctoral studies and/or in the early career phase.
# of publications during doctoral studies (Design 2). Previous research has shown a positive relationship between productivity during doctoral studies and productivity in the early career phase (see, e.g. Williamson and Cable 2003; Laurance et al. 2013). Publication volume during doctoral studies is identical with the dependent variable for Design 1, i.e. the number of publications in DiVA during doctoral studies, including the year of the completion of the doctoral degree.
Number of internal collaborators (Designs 1 and 2). The size of the doctoral students' internal collaborative network is operationalized as the number of unique co-authors employed at the university.
Number of external collaborators (Designs 1 and 2). The size of the doctoral students' external collaborative network is operationalized as the number of unique co-authors not employed at the university.
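The two network-size predictors can be illustrated with a short sketch. It assumes that each publication is available as a collection of author identifiers and that university employment can be decided by membership in a set of local identifiers; the function name and the data layout are hypothetical and are not taken from DiVA.

```python
def collaborator_counts(publications, student_id, local_ids):
    """Return (internal, external) numbers of unique co-authors of one student.

    publications: iterable of collections of author identifiers (assumed layout)
    student_id:   identifier of the doctoral student
    local_ids:    identifiers of authors employed at the university (assumed)
    """
    coauthors = set()
    for authors in publications:
        # Only publications that the student actually authored count.
        if student_id in authors:
            coauthors.update(a for a in authors if a != student_id)
    internal = {a for a in coauthors if a in local_ids}
    external = coauthors - internal
    return len(internal), len(external)


# Toy data: identifiers starting with "u" are local, "x" are external.
toy_publications = [
    {"s1", "u1", "x1"},
    {"s1", "u1", "x2"},
    {"u2", "x3"},          # not authored by s1, so it is ignored
]
internal_n, external_n = collaborator_counts(
    toy_publications, "s1", local_ids={"s1", "u1", "u2"}
)
```

Because sets are used, a co-author who appears on several publications is counted once, which matches the "unique co-authors" wording.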
Degree of collaboration (Designs 1 and 2). Collaboration and the degree of integration into the research community was operationalized with the collaborative coefficient (Ajiferuke, Burell, and Tague 1988). The collaborative coefficient is a weighted mean that incorporates the average number of authors per paper and the proportion of multi-authored papers in a single measure that can be defined as:
$$CC = 1 - \frac{\sum_{j=1}^{k} \frac{1}{j} f_j}{N}$$
where $f_j$ denotes the number of $j$-authored papers, $N$ denotes the total number of publications, and $k$ is the highest number of co-authors per paper of an author.
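As a worked illustration of the collaborative coefficient, the sketch below computes it from the number of authors on each of a researcher's papers, using the standard form in which each $j$-authored paper contributes $1/j$. The function name and the input layout are assumptions made for the example.

```python
from collections import Counter


def collaborative_coefficient(authors_per_paper):
    """Collaborative coefficient: 1 - (sum over j of f_j / j) / N.

    authors_per_paper: list with the number of authors of each paper
                       (assumed layout; every entry must be >= 1).
    """
    n_papers = len(authors_per_paper)
    if n_papers == 0:
        raise ValueError("at least one publication is required")
    # f[j] is the number of papers with exactly j authors.
    f = Counter(authors_per_paper)
    weighted_sum = sum(count / j for j, count in f.items())
    return 1.0 - weighted_sum / n_papers


only_single_authored = collaborative_coefficient([1, 1, 1])
only_two_authored = collaborative_coefficient([2, 2])
mixed = collaborative_coefficient([1, 2, 4])
```

The coefficient is 0 when every paper is single-authored and approaches 1 as the number of authors per paper grows; for one paper each with 1, 2, and 4 authors it equals 1 - (1 + 1/2 + 1/4) / 3.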
# of co-authored publications with supervisor (Designs 1 and 2). Previous research indicates that collaboration between doctoral students and supervisors in terms of co-authoring papers might increase productivity (Sinclair, Barnacle, and Cuthbert 2014). Collaboration with the supervisor is operationalized as the number of publications the doctoral student co-authored with the main supervisor during their doctoral studies, including the year of the thesis defence for Design 1, and during the first five years after thesis defence for Design 2.
Similarity with supervisor (Designs 1 and 2). Doctoral students with a thesis subject or a research interest that is topically very similar to their main supervisor's research interest may receive mentoring of higher quality due to the supervisor's knowledge and skills on the topic. The supervisor may also be more motivated to provide time for mentoring in cases where doctoral students pursue topics similar to their own. The topical similarity between the doctoral student and the main supervisor is operationalized with a term vector model, where each doctoral student and main supervisor are represented by a term vector. To derive these vectors, terms are extracted from the author's publications and TF-IDF weighting is applied (Salton 1989). For the doctoral students, publications published later than one year after their dissertation defence are not considered; this is to avoid potential topic drift if the research focus changes in later stages of the career.
$Sim_{ij}$ is the observed cosine similarity between the term vector of the PhD student $i$ and the term vector of his/her supervisor $j$. For interpretability purposes, we compare the observed similarity value with the expected similarity values between a randomly chosen doctoral student and a randomly chosen main supervisor ($sim_{*,*}$), as follows:
$$sim_{i,j} = \frac{Sim_{ij} - \mathrm{mean}(sim_{*,*})}{\mathrm{sd}(sim_{*,*})}$$
Thus, $sim_{i,j}$ expresses the deviation, in terms of standard deviations, from the expected similarity of unrelated pairs of doctoral students and supervisors.
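The similarity measure can be sketched end to end on toy data. Several choices here are assumptions rather than details given in the text: the TF-IDF variant (raw term frequency times log(N/df)), the use of all student-supervisor pairings as the reference distribution for unrelated pairs, and the population standard deviation. All names and the toy term lists are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def tfidf_vectors(docs):
    """docs: dict name -> list of terms. Returns dict name -> {term: weight}."""
    n_docs = len(docs)
    df = Counter()
    for terms in docs.values():
        df.update(set(terms))  # document frequency: count each term once per author
    return {
        name: {t: c * math.log(n_docs / df[t]) for t, c in Counter(terms).items()}
        for name, terms in docs.items()
    }


def cosine(u, v):
    """Cosine similarity between two sparse term vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def standardized_similarity(observed, reference):
    """Deviation of an observed similarity from the reference mean, in SD units."""
    return (observed - mean(reference)) / pstdev(reference)


docs = {
    "student_a": ["protein", "folding", "protein", "kinase"],
    "supervisor_a": ["protein", "kinase", "signalling"],
    "student_b": ["graph", "algorithm", "graph"],
    "supervisor_b": ["algorithm", "complexity", "graph"],
}
vecs = tfidf_vectors(docs)
# Assumed reference distribution: every student paired with every supervisor.
reference = [
    cosine(vecs[s], vecs[p])
    for s in ("student_a", "student_b")
    for p in ("supervisor_a", "supervisor_b")
]
observed = cosine(vecs["student_a"], vecs["supervisor_a"])
z = standardized_similarity(observed, reference)
```

In this toy setting the actual student-supervisor pair shares terms while the mismatched pairs do not, so the standardized value is positive, i.e. the pair is more similar than an unrelated pair would be expected to be.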
Male supervisor (Designs 1 and 2). Previous research indicates that the gender of the supervisor affects doctoral student productivity (see, e.g. Gaule and Piacentini 2018). Pezzoni et al. (2016) suggest that the gender of the supervisor might have an effect on the productivity of the doctoral student and that this effect may partly be influenced by the gender of the student. To take the effect of the gender of the main supervisor into consideration in our models, we constructed a binary predictor where the value 1 represents male supervisors and 0 represents female supervisors.
Age of supervisor (Designs 1 and 2). The age of the supervisor may reflect how much time the supervisor can spend on the doctoral student. Older supervisors may have less time for instruction and mentoring due to a higher workload in terms of, e.g. administration and teaching at more advanced positions (Gu et al. 2011).
G-index supervisor (Designs 1 and 2). Previous research has shown that doctoral students with more productive supervisors tend to publish more during their doctoral studies than those with less productive supervisors (see, e.g. Pezzoni et al. 2016; Gu et al. 2011). The overall research performance of the main supervisor is, in this study, operationalized with the g-index (Egghe 2006). We use the g-index since it is a measure that incorporates both the 'quantitative' and 'qualitative' dimensions of the publication output (Egghe 2006). For Design 1, all years up to and including the year of thesis defence are included in the calculation of the g-index. For Design 2, all years are included in the calculation.
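For readers unfamiliar with the g-index, the following Python sketch shows how it can be computed from a list of citation counts. It uses the common definition (the largest g such that the g most-cited papers together have at least g squared citations) and, as an assumption of this sketch, caps g at the number of papers. The citation counts are invented.

```python
def g_index(citations):
    """Largest g such that the g most-cited papers have >= g*g citations in total."""
    ranked = sorted(citations, reverse=True)
    total, g = 0, 0
    for rank, cites in enumerate(ranked, start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


# Hypothetical supervisor with five papers.
example = [10, 5, 3, 1, 0]
print(g_index(example))
```

Because the cumulative citation total is compared with the squared rank, a few highly cited papers can raise the g-index, which is how the measure reflects impact as well as volume.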
# of other doctoral students (Design 1). Gu et al. (2011) indicate that the ratio of doctoral students to supervisor affects productivity in that doctoral students whose supervisors have a higher number of doctoral students are less productive. A potential reason for the negative effect of doctoral student to supervisor ratio is that the time for instruction and mentoring decreases as the number of doctoral students increases (Gu et al. 2011). The effect of doctoral student to supervisor ratio is operationalized as the number of doctoral students of the main supervisor until thesis completion.
Adjusted for years employed (Design 2). In Period 2, the number of years employed at the university during the second to the fifth years might have an effect on productivity since an author with a longer employment period has had more time to write papers. To control for this effect, we constructed a categorical variable denoting whether an author had been employed one, two, three, or four years during the second to the fifth year after the year of thesis defence.
An issue with our data set is that the effect of the predictors may vary between disciplines and research areas. Previous research has shown that there are differences between different areas of research in terms of, e.g. productivity (see, e.g. Schubert and Braun 1986; Vinkler 1986); collaboration (see, e.g. Newman 2001); and generally in how the doctoral studies are organized (Becher 1994). In order to take this issue into consideration, each doctoral student was grouped into one of five research areas in accordance with an aggregated version of the classification scheme Standard for Swedish classification of research areas 2011 (HSV 2011). The aggregated classification scheme consists of five research areas: (1) Clinical medicine; (2) Basic medicine and other health sciences; (3) Life science; (4) Physical science; and (5) Engineering and technology. We then calculated log-transformed and square-root-transformed z-score variables adjusted for research area. The dependent variable (the number of publications during doctoral studies) and the predictors # of co-authored publications with supervisor, number of internal collaborators, and number of external collaborators are calculated as log-transformed z-scores normalized with the average and standard deviation for each research area (see Note 2). The predictors degree of collaboration and similarity with supervisor are not log-transformed but calculated as z-scores normalized with the average and the standard deviation for each research area. The predictor g-index supervisor is calculated as square-root-transformed z-scores normalized with the average and the standard deviation for each research area. The purpose of these transformations is to adjust for differences between research areas and make the models in our analyses conform to the OLS regression assumptions concerning linearity and normal distribution of errors. Table 1 presents descriptive statistics for the non-binary variables, both transformed and non-transformed. See Appendix 1 for descriptive statistics by male and female groups.
In our sample, among the doctoral students, 50.6% are male and 49.4% female. Among the main supervisors, 73.5% are male and 26.5% female. Notice that the mean and standard deviations for the transformed variables in Table 1 are not zero and one. This is the case because we first standardized the observations for each discipline and then summarized across disciplines when calculating the mean and standard deviations.
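The area-adjusted log transformation described above can be sketched in Python as follows. This is an illustrative sketch rather than the authors' code: the records are invented, one is added to each count before taking the logarithm (as stated in Note 2), and the population standard deviation within each research area is an assumption.

```python
import math
from collections import defaultdict
from statistics import mean, pstdev


def log_z_by_area(records):
    """Return z-scores of ln(count + 1), standardized within each research area."""
    by_area = defaultdict(list)
    for area, count in records:
        by_area[area].append(math.log(count + 1))
    # Mean and standard deviation of the logged counts per research area.
    stats = {area: (mean(vals), pstdev(vals)) for area, vals in by_area.items()}
    return [(math.log(count + 1) - stats[area][0]) / stats[area][1]
            for area, count in records]


# Hypothetical (research area, number of publications) records.
records = [
    ("Clinical medicine", 2),
    ("Clinical medicine", 5),
    ("Clinical medicine", 11),
    ("Physical science", 0),
    ("Physical science", 1),
    ("Physical science", 3),
]
z_scores = log_z_by_area(records)
for (area, count), z in zip(records, z_scores):
    print(area, count, round(z, 2))
```

Standardizing within each area means that a doctoral student is compared with peers in the same research area, so that areas with generally high publication counts do not dominate the pooled analysis.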

Size of gender differences in research performance during doctoral studies
Since time for research during doctoral studies is fairly limited and the publication output of doctoral students is relatively small, we start with an analysis of the size and robustness of the gender difference in productivity during doctoral studies. We expect the gender differences in productivity to be quite small (van den Besselaar and Sandström 2016). However, to get an indication of the robustness of the observations, and to make sure that the differences we observe are not a consequence of a particular indicator, how the indicator is calculated, or the database from which the data are collected, we compare male and female productivity using different indicators, calculations, and databases (Figure 1). The indicators are the number of Web of Science publications accessed through InCites, the number of publications registered in DiVA, and the number of normalized cited publications in Web of Science, i.e. the sum of Mean Normalized Citation Scores (MNCS). The MNCS is equivalent to the InCites indicator Category Normalized Citation Impact, which is calculated by dividing the number of citations to a document by the expected number of citations for documents with the same publication year, document type, and subject area (Clarivate Analytics 2018). We have also calculated a version of each of these indicators based on author fractions.
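The difference between whole counts, fractional counts, and citation-normalized counts can be made concrete with a small Python sketch. It is illustrative only: the paper records are invented, and it assumes that each author of a paper with n authors receives the fraction 1/n, which is one common fractionalization scheme and may differ from the one applied in the study.

```python
def publication_indicators(papers):
    """Compute four indicators for one researcher.

    papers: list of (n_authors, citations, expected_citations) tuples, where
    expected_citations is the average for documents with the same publication
    year, document type, and subject area.
    """
    whole = len(papers)
    # Assumption: equal author fractions, 1/n per paper.
    fractional = sum(1 / n for n, _, _ in papers)
    # Sum of normalized citation scores (citations / expected citations).
    ncs_sum = sum(c / e for _, c, e in papers)
    ncs_sum_fractional = sum((c / e) / n for n, c, e in papers)
    return whole, fractional, ncs_sum, ncs_sum_fractional


# Hypothetical researcher with two papers.
papers = [(2, 10, 5.0), (4, 3, 6.0)]
print(publication_indicators(papers))
```

A researcher who publishes mainly in large author teams keeps a high whole count but receives a much lower fractional count, which is why the choice of counting scheme can change the size of an observed gap.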
Considering Figure 1, two observations stand out. First, the previously observed pattern, i.e. that males are on average more productive than females, does not depend on how the indicator is constructed or on the choice of database. Second, the variation among the male doctoral students is greater, as shown both by the widths of the boxes and by the distances between the whiskers.
As a test for the size of the gender difference, we have used an equality of median test for each of the six indicators, and the size of the gender difference is indicated by the difference between the percentages of male and female doctoral students that are strictly above the median (a t-test is also reported). We observed a significant gender gap, independent of which indicator was used (Table 2). The size of the gender gap depends on the counting scheme, and fractionalization reduces the size of the gender gap for all indicators. Further, we observe the smallest gender gap for the sum of Mean Normalized Citation Scores (MNCS).
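The logic of the equality of median test can be sketched in Python as below. The sketch computes the share of each group strictly above the pooled median, the difference between these shares, and a Pearson chi-square statistic for the resulting 2x2 table. The publication counts are invented, and the uncorrected chi-square statistic is an assumption of this sketch; the study may have used a different variant or a continuity correction.

```python
from statistics import median


def median_test(group_a, group_b):
    """Return (share gap, chi-square) for observations strictly above the pooled median."""
    pooled_median = median(list(group_a) + list(group_b))
    above_a = sum(x > pooled_median for x in group_a)
    above_b = sum(x > pooled_median for x in group_b)
    table = [[above_a, len(group_a) - above_a],
             [above_b, len(group_b) - above_b]]
    n = len(group_a) + len(group_b)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:  # skip empty margins to avoid division by zero
                chi2 += (table[i][j] - expected) ** 2 / expected
    gap = above_a / len(group_a) - above_b / len(group_b)
    return gap, chi2


# Hypothetical publication counts for two groups of doctoral students.
men = [1, 2, 3, 4, 5, 6, 7, 8]
women = [0, 1, 1, 2, 3, 3, 4, 5]
gap, chi2 = median_test(men, women)
print(round(gap, 3), round(chi2, 3))
```

The share gap corresponds to the kind of percentage difference reported for each indicator, while the chi-square statistic (with one degree of freedom) would be used to judge whether the gap is statistically significant.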
In our regression models, we use the indicator that exhibits the largest gender gap, i.e. whole counts of publications indexed in DiVA. We use whole counts partly because a fraction of an article is a theoretical construct that does not exist, i.e. no one can refer to a fraction of an article in a CV, and partly because we have other predictors that will adjust for collaborative behaviour. However, we want to adjust the indicator for potential variation between different research areas in overall productivity, and we want to transform the indicator in such a way that it conforms to assumptions made in OLS regression. The variable has been transformed by calculating a z-score on the log-transformed indicator for each research area. A comparison of female and male doctoral students according to this new indicator is displayed in Figure 2.

Table 3 reports the results of seven OLS regression models in which we estimate the conditional size of the gender gap in research performance during doctoral studies, and study how the gender gap changes as new predictors are added to the models. The choice of OLS regression is based on a validation procedure indicating that, out of several modelling techniques, OLS regression provided the best-fitting model (see Note 3). Since the coefficients in Table 3 are difficult to interpret, we use the partial eta-squared effect size to interpret the changes in the gender gap (i.e. the male predictor) as additional predictors are added. The unadjusted average gender gap in publication volume is 0.33 (+/-0.15) standard deviations (Model 1) and the partial eta-squared is 0.032. In Model 2, we control for the age of the doctoral student at completion of doctoral studies and the collaborative coefficient. The collaborative coefficient, which adjusts for the weighted average number of authors per publication, is as expected a significant predictor with a large effect size (partial eta-squared = 0.43).
Introducing this predictor reduces the estimated gender gap to 0.21 (+/-0.12) standard deviations and the partial eta-squared effect size of gender is reduced to 0.022 in Model 2. Age at completion of doctoral studies is not a significant predictor. In Model 3 and Model 4, predictors related to the supervisor are included in the models; none of these is a significant predictor, and none reduces the estimated gender gap. In Model 5 and Model 6, we introduce predictors that aim to capture the relationship between the doctoral student and the supervisor. In Model 5, the doctoral student's topical similarity with the supervisor is introduced as a predictor, and the effect is positive and significant. In Model 6, we introduce a predictor for number of co-authored publications with the supervisor. As expected, this predictor has a major effect, with a partial eta-squared effect size of 0.29. Controlling for this predictor variable reduces the gender gap to an insignificant 0.10 (+/-0.11) standard deviations, and the partial eta-squared effect size of the gender gap is 0.007 (Model 6). Notice that when we introduce the predictor # of co-authored publications with supervisor in Model 6, the coefficient for the predictor male supervisor increases and becomes significant. This suggests that, all else being equal, doctoral students with a male supervisor are more productive than doctoral students with a female supervisor. However, while the coefficient for male supervisor is significant, the effect size is small, with a partial eta-squared of 0.014 in Model 6 and 0.011 in Model 7. It should also be noted that introducing the # of co-authored publications with supervisor changes the sign of the predictor for topical similarity with the supervisor from positive to negative, i.e. given the same level of supervisor collaboration, a lower degree of topical similarity indicates more independent publications.
The g-index also becomes significant and negative, indicating that, given the same level of supervisor collaboration, doctoral students with supervisors with a high g-index have fewer independent publications. In Model 7, the sizes of the doctoral student's internal and external networks are included. Both are significant predictors for publication volume during doctoral studies, but the effect size of the external network is considerably larger, with a partial eta-squared effect size of 0.14, as compared with 0.01 for the effect of the size of the internal network. The introduction of these variables further reduces the estimated gender gap.
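Partial eta-squared is defined as the effect sum of squares divided by the sum of the effect and error sums of squares. For a single predictor in an OLS model, this is equivalent to t^2 / (t^2 + residual degrees of freedom), where t is the predictor's t statistic. The Python sketch below applies this identity; the input values are purely illustrative and are not taken from Table 3.

```python
def partial_eta_squared(t_stat, df_residual):
    """Partial eta-squared of a single-degree-of-freedom predictor in OLS.

    Uses the identity SS_effect / (SS_effect + SS_error) = t^2 / (t^2 + df_residual).
    """
    t_squared = t_stat ** 2
    return t_squared / (t_squared + df_residual)


# Hypothetical predictor with t = 2.0 in a model with 96 residual degrees of freedom.
print(partial_eta_squared(2.0, 96))
```

Because the measure relates a predictor's contribution to the variation left unexplained by the other predictors, it can be compared across models even when the coefficients themselves are hard to interpret.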

Modelling gender difference during doctoral studies
To summarize the results reported in Table 3, we conclude that most of the gender gap during doctoral studies is explained by collaboration with the main supervisor and by the size of the doctoral student's external collaborative network.

Gender differences in the predictors
The gender gap in publication volume was stepwise reduced from 0.33 standard deviations (a partial eta-squared effect size of 0.032) to 0.08 standard deviations (an effect size of 0.005); see Table 3. It is therefore worth taking a closer look at gender differences in the predictors in order to clarify in what respect conditions for male and female doctoral students differ. Thus, we apply an equality of median test to the predictors with the largest effect size and test for median differences with respect to gender (t-tests are also reported).
We observe a gender difference in favour of the male doctoral students on all predictors (Table 4). For the predictor # of co-authored articles with supervisor, the predictor that explains most of the variation among doctoral students, the gender difference is 12.3%. The predictor number of external collaborators exhibits a gender difference of 12.5% and has the second largest effect size. For the predictor collaborative coefficient, i.e. the weighted average number of authors per paper, the gender difference is 13%, although this predictor has a much smaller effect size than the other two predictors. It is also interesting to note that despite the fairly large gender difference in collaborative coefficient and number of external collaborators, male and female doctoral students do not differ much in the number of internal collaborators, indicating that they tend to be part of research groups of fairly equal sizes.

Gender gap after completion of doctoral studies
In this section, we present six regression models for publication activities after the completion of doctoral studies (Table 5). Since our database is limited to those who have been employed at the university after the completion of their doctoral studies, the number of observations in the regression models displayed in Table 5 is 301, as compared with 494 in Table 3. Among the 494 doctoral students, 250 are women and 244 men. Among those 301 individuals who continued to be employed after completion of doctoral studies, 163 are women and 138 men, i.e. 54.1% are women. The probability of staying at the university after completion of doctoral studies differs between the research areas. The largest share of doctoral students who continue to work at the university is found in medical sciences, and the smallest share is found in biology and engineering. Among the doctoral students in the sample, women have an overall higher propensity to stay at the university. In Model 1, the gender gap after completion of doctoral studies is estimated when adjusting for years employed (Table 5). The coefficient for male is 0.36 (+/-0.22) standard deviations, which is almost equal to the coefficient of 0.33 (+/-0.15) for male during doctoral studies (see Model 1 in Table 3). The partial eta-squared effect size of the gender gap after completing doctoral studies is 0.037, as compared to 0.032 during doctoral studies. This would indicate that the gender gap is the same after doctoral studies. In Model 2, the size of the gender gap is re-estimated when controlling for three additional variables: age at completion of doctoral studies, # of publications during doctoral studies, and the collaborative coefficient after doctoral studies. 
Controlling for publication volume during doctoral studies and the collaborative coefficient during the period after doctoral studies reduces the size of the coefficient, indicating a decrease in the gender gap from 0.36 (+/-0.15) to 0.21 (+/-0.16), and the partial eta-squared effect size is reduced from 0.037 to 0.022 (see Model 2, Table 5). It is interesting to note that controlling for publication volume during doctoral studies, which contains a large portion of the initial gender difference, does not reduce the estimated gender gap more in Model 2 (i.e. despite controlling for the initial gender gap during doctoral studies, we still find a gender gap). Model 3 contains predictors related to the supervisor, but these variables do not reduce the estimated gender gap. In Model 4, we control for the student's topical similarity with their supervisor, but this predictor is not significant. However, introducing the predictor # of co-authored publications with supervisor (Model 5), which indicates to what degree the individual continues to collaborate with the supervisor after completing their doctoral studies, is clearly significant, with a coefficient of 0.35 (+/-0.10) and a partial eta-squared effect size of 0.16. Controlling for continued collaboration with the supervisor reduces the estimated gender gap to 0.15 (+/-0.15) publications, with a partial eta-squared effect size of 0.012, which also indicates that after the completion of doctoral studies the estimated gender gap in publication volume is related to supervisor behaviour. In Model 6, we add variables indicating the size of the individual network within the university and the size of the external network formed after completion of doctoral studies.
Table 4. T-test and equality of median test for gender differences in predictors.
Introducing these variables further reduces the estimated size of the gender gap, and the coefficient for the predictor male of 0.05 (+/-0.10) standard deviations is clearly no longer significant (Table 6). One explanation for the slightly larger unadjusted gender gap after doctoral studies could be a higher proportion of employed female researchers with zero publications. However, a closer look at the variable shows that 19% of female researchers have zero publications during the period, compared to 17% for male researchers, which is not a sizable difference. Re-estimating Model 1 while excluding individuals without publications during the period does not change the size of the gender gap, since the coefficient for male is 0.36 (+/-0.18) and the eta-squared effect size increases slightly to 0.056. It therefore does not seem probable, as is often suggested, that the main cause of this gender difference is long spells of absence from work due to family obligations, at least not in the early career phase.

The effect of gendered differences in productivity
Are the observed productivity differences between males and females large enough to be interesting? At the completion of their doctoral studies, the male doctoral students have an average of 4.5 publications, compared to an average of 3.4 publications for their female colleagues, i.e. an average difference of 1.1 publications. The average number of publications for the five-year period following the dissertation is 6.1 for the male researchers and 3.9 for the female researchers, i.e. an average difference of 2.2 publications. At first glance, the gender differences appear to be greater during the five-year period following the defence. However, it is misleading to focus solely on the size of the mean difference without considering whether the variation has increased or decreased. If the effect size is calculated for the mean value differences with Cohen's d, the effect size is 0.43 during doctoral studies and 0.35 for the five-year period after doctoral studies. The effect size for mean value differences using the normalized variable is slightly lower with a Cohen's d of 0.36 during the doctoral studies and 0.27 for the five years following completion of doctoral studies. How much early productivity differences affect career development among women is difficult to determine from a comparison of mean values. The effect of gender difference depends on how it impacts when female and male researchers compete for the same academic appointment. We have therefore defined different performance groups to see how large a proportion of women are able to meet the different threshold levels and what happens to the relative distribution when we tighten the requirements for productivity, that is, become more selective.
Table 6. T-test and equality of median test for gender differences in the predictors.
Although productivity is rarely the only criterion used in academic appointments, the differences in number of publications are difficult to overlook on such occasions.
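The reason why a larger mean difference can correspond to a smaller effect size is visible in the definition of Cohen's d, which divides the mean difference by a standard deviation. The Python sketch below uses the pooled sample standard deviation, which is a common choice but an assumption here, since the variant applied in the study is not stated. The publication counts are invented.

```python
import math
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's d: mean difference divided by the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Hypothetical publication counts for two small groups.
group_a = [2, 4, 6]
group_b = [1, 3, 5]
print(cohens_d(group_a, group_b))
```

If the spread of publication counts grows faster than the mean difference between two periods, d decreases even though the raw difference in the number of publications increases.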
Tables 7 and 8 illustrate the effect of the gender differences immediately after the individuals have completed their theses, and again after a five-year period. The first row in Table 7 shows the relative distribution of women and men if no productivity requirements are set, and the gender distribution is then very even. As performance requirements increase, the proportion of women decreases successively. The top 50% group consists of 42% women, and the higher the productivity demands applied, the fewer women are included in the top groups. In the most extreme performance group, i.e. the top 5%, only 26% are women. Five years after completion of doctoral studies, this pattern is repeated (Table 8). Although a larger proportion of women continue to be employed at the university, this proportion decreases when demands for productivity are applied. This is a theoretical example, but the outcome fits well with actual observations regarding different excellence initiatives in Sweden. Particularly noteworthy is the large investment in strong research environments, as no female research leader was granted any of these major grants (Sandström et al. 2010).
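The threshold exercise described above can be sketched in Python as follows. The sketch ranks researchers by publication count and reports the share of women in the top fraction. The records are invented, and the handling of group size (rounding up) and of ties at the threshold (kept in input order) are assumptions of this sketch, not the rules used for Tables 7 and 8.

```python
import math


def share_of_women_in_top(records, top_fraction):
    """Share of women among the top `top_fraction` of researchers by publications.

    records: list of (gender, publications) tuples with gender "F" or "M".
    """
    ranked = sorted(records, key=lambda r: r[1], reverse=True)
    k = max(1, math.ceil(len(ranked) * top_fraction))  # size of the top group
    top = ranked[:k]
    return sum(gender == "F" for gender, _ in top) / k


# Hypothetical sample with an even gender distribution.
records = [("M", 9), ("M", 8), ("F", 7), ("M", 6), ("F", 5),
           ("F", 4), ("M", 3), ("F", 2), ("F", 1), ("M", 0)]
for fraction in (1.0, 0.5, 0.2):
    print(fraction, share_of_women_in_top(records, fraction))
```

In this invented sample the share of women falls as the threshold is tightened, which is the same qualitative pattern as the one reported for the performance groups in the study.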

Discussion
In this section, we discuss the results in relation to our three research questions. Research question 1 focused on validating the observed gender gap during doctoral studies: What is the size of the gender gap in productivity during doctoral studies and how robust is the observed gender gap with respect to different indicators, counting schemes, and databases?
In line with previous research and what was expected (see, e.g. Lindahl et al. 2019; Epstein and Lachmann 2018; Pezzoni et al. 2016; Feldon et al. 2017), we were able to identify a gender gap in which males outperform females during doctoral studies. Contributing to previous research, we tested the robustness of this gender gap by conducting an equality of median test and a t-test for six different indicators of research performance. Even if the existence of a gender gap was independent of the indicator used, the size of the gender gap was affected by both the choice of database and the counting method. The estimated gender gap was larger if the DiVA repository was used as a source of information rather than Web of Science, which has 11% lower coverage. The gender gap was also larger if we used a whole count method instead of a fractional count method. The smallest gender gap was observed when publications were weighted with their impact.
Research question 2 is: To what degree are observed gender differences in productivity among doctoral students related to (a) gender differences in research collaboration, and (b) behaviour and characteristics of the main supervisor?
The estimated regression models for research question 2 indicate that variables with the highest explanatory value for the gender gap in research performance during doctoral studies were those that express different aspects of collaborative behaviour. In the final model, the variables with the largest effect were number of articles co-authored with the supervisor, number of external collaborators, and the collaborative coefficient. Controlling only for number of publications co-authored with the supervisor and the number of external collaborators reduced the conditional average gender differences to an insignificant level.
These results suggest that male doctoral students are more collaborative, have more external collaborators, and co-author more papers with their supervisors than female doctoral students, and that these factors, to a large extent, explain the observed productivity differences between males and females during doctoral studies. While our results indicate that, all else being equal, a male supervisor increases the productivity of both male and female doctoral students, male doctoral students do seem to collaborate more with their supervisors. An interesting question here is why we observed this pattern. It may be that males are better at taking advantage of the opportunities to collaborate with their supervisors, or that they are provided with better opportunities to collaborate than females. It could also be a consequence of the Matilda effect (Rossiter 1993), which states that females are systematically under-recognized in science, which in this context would suggest that they are provided with fewer opportunities for collaborating with their supervisors than males.
Another interesting result is that doctoral students with a male supervisor tend to be more productive than doctoral students with a female supervisor when the number of co-authored publications with the supervisor is controlled for. This suggests that among doctoral students who co-author equally with their main supervisor, those with a male supervisor tend to publish more papers in addition to the papers co-authored with their supervisors. This result does not align with what we expected based on the theory of homophily and previous research, where it has been shown that gender pairing matters for productivity (see, e.g. Pezzoni et al. 2016; Gaule and Piacentini 2018). It should be mentioned that during the analyses we tested for, but could not find, any significant interaction effects between the predictors for doctoral student gender and supervisor gender (note that these tests are not included in the results section). The difference between our results and previous research may be a consequence of differences in gender equality between countries and gender cultures in higher education institutions. However, comparative studies should be conducted to further examine these issues.
It is interesting that males have more external collaborators than females, and that this predictor seems to be significant in explaining the gender gap during doctoral studies. Previous research has shown that collaboration with more senior researchers in the early career phase provides several advantages for career progression (in terms of, for example, publication rate and the number of received grants), and that males tend to have larger social networks than females, a factor associated with career success (Cameron and Blackburn 1981). The importance of the external collaborative network suggests that activities that enable external networking (e.g. presenting at conferences) are important as early as during doctoral studies.
Research question 3 is: To what degree are gender differences in productivity during the early career phase related to (a) performance during the doctoral studies; (b) gender differences in research collaboration; and (c) behaviour and characteristics of the main supervisor?
In the regression models examining research question 3 and the conditional gender gap during the early career phase, we also controlled for doctoral student performance during doctoral studies. Despite controlling for doctoral student performance, which we know is a variable that contains a large gender difference, we still observed a significant gender gap in research performance during the early career phase. How can we interpret these results? van den Besselaar and Sandström (2016) suggest that if gender differences in the early career phase can be explained by performance differences during doctoral studies, performance will progress as expected, given that science functions as a meritocracy. However, if there are gender differences in performance in the early career phase that are not explained by performance differences during doctoral studies, there might be some other gender bias at work. In this study, the results indicate that the main factors explaining the gender gap in performance during the early career phase are continued collaboration with the supervisor and the size of the internal collaborative network. We noticed that while the external network is more important during the doctoral studies period, the internal network is more important in the early career phase. Since we only look at those students who stay at the university, these results may reflect (1) continued work with the supervisor and the surrounding research team, (2) that the early-career researcher joined another team at the university, or (3) participation in the formation of a new research group at the university.
Previous research has investigated the effects of mentoring during doctoral studies (see, e.g. Cameron and Blackburn 1981; Paglis, Green, and Bauer 2006), gender differences in productivity during doctoral studies (see, e.g. Epstein and Lachmann 2018; Feldon et al. 2017; van den Besselaar and Sandström 2016), and the effect on productivity of having a supervisor of the same gender (see, e.g. Pezzoni et al. 2016; Gaule and Piacentini 2018). We would like to specify our contribution to this body of research. A main contribution of this study is that we more specifically identify the factor that explains how interaction with the supervisor is important for the gender gap in productivity. The gender gap in productivity during doctoral studies can, to a large extent, be explained by the degree to which the doctoral student co-authors publications with the main supervisor during the doctoral programme. Another main contribution is that we not only identify that the degree of collaboration is important, but also specify what kind of collaborative network, besides collaboration with the supervisor, is important for explaining the gender gap during doctoral studies. The gender gap in productivity can also largely be explained by the size of the doctoral student's external collaborative network, i.e. the collaborative network that is external to the university where the doctoral student carries out his or her doctoral studies.

Conclusion
The purpose of this study is to gain a better understanding of the gender gap in productivity in science, technology and medicine, how the gap arises as early as during doctoral studies, and how it is reproduced in the early career phase. The article contributes to previous literature on the so-called productivity puzzle (i.e. that males, on average, are more productive than females in science) and academic socialization by focusing on the collaborative behaviour of doctoral students and their supervisors during the doctoral studies period and in the early career phase. Previous research has shown that productivity is a central driver for career development in science (see, e.g. van den Besselaar and Sandström 2016; Gaule and Piacentini 2018). The issue of gender differences in productivity is particularly relevant for researchers in the early years of their career and for doctoral students since small differences between males and females in the early career stages tend to grow over time and may lead to significant differences in career development through cumulative advantages and disadvantages (Zuckerman 1991). A better understanding of which factors during the doctoral studies period affect gender differences in productivity is arguably a central key to better understanding gender differences in researchers' careers.
Our main conclusions are: (1) The observed gender gap in productivity during doctoral studies can to a large extent be explained by the size of the external collaborative network and co-authoring with the supervisor. The question of how doctoral students become co-authors with their supervisors and colleagues, and to what degree this process differs for males and females, should be investigated in future research. Another future avenue of research should be to examine why male doctoral students have larger external networks, and what can be done to increase those networks for female students.
(2) Collaborative behaviour continues to be a central factor in explaining the gender gap in productivity in the early career phase. A task for future research should be to examine how male and female doctoral students attain co-authorships and access to collaborative networks, and the degree to which these processes are related to opportunities for continued collaboration with the supervisor and integration into collaborative networks in the early career phase.
The potential implication of this study is that it may be valuable to work towards: (1) increased awareness among management, faculty, and supervisors of how supervisors' collaborative behaviour may affect gender differences in productivity among doctoral students; (2) increased opportunities for females to collaborate and to publish as co-authors with their supervisors; and (3) increased opportunities for females to enlarge their external collaborative networks.
Finally, this study is not without limitations. In Design 2, where we examine the early career phase, we only look at early-career researchers who stay at the university. The conclusions of this study are therefore limited to this group. We must also remember that this is a limited study of doctoral students at a single Swedish university. Results presented in this article concern doctoral students who completed their doctoral studies in science, technology, or medicine from 2006 to 2010. Any conclusions may therefore not be valid for doctoral students within the social sciences and humanities, or for doctoral students in other time periods.

Notes
1. Our research questions are presented at the end of the 'Previous research and theoretical framework' section.
2. Transforming the publication variable into a geometric standard score, we assume a log normal distribution of errors, which has been confirmed in the diagnostic procedure. However, the interpretation of geometric standard scores is not easy and suggests the need for further elaboration. First, since there are zeros in the predictors, we added one to each observation. The geometric standard score, $z_i$, is calculated as

$$z_i = \frac{\ln(x_i) - \ln(m_g)}{\ln(s_g)} = \log_{s_g}\left(\frac{x_i}{m_g}\right)$$

where $x_i$ is the observed number of publications, $m_g$ is the geometric mean for the research area, and $s_g$ is the geometric standard deviation for the research area. The distribution of all $z_i$ is approximately normal with mean 0 and standard deviation 1. If the ratio $x_i/m_g = s_g$, then $z_i = 1$, and if the ratio $x_i/m_g = 1/s_g$, then $z_i = -1$, and so on.
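To make the interpretation of the geometric standard score in Note 2 more concrete, the calculation can be sketched in a few lines of Python. This is an illustration only, not the procedure actually used in the study: the function names are hypothetical, the publication counts in the usage example are invented, and the use of the population (rather than sample) standard deviation of the logged values is an assumption, since the note does not specify which was used.

```python
import math


def geometric_standard_score(x, m_g, s_g):
    """z = (ln(x) - ln(m_g)) / ln(s_g), i.e. the base-s_g logarithm of x / m_g."""
    return (math.log(x) - math.log(m_g)) / math.log(s_g)


def area_scores(publication_counts):
    """Geometric standard scores for the publication counts of one research area.

    One is added to every count before taking logarithms so that zero
    counts are defined, following the description in Note 2.
    """
    logs = [math.log(count + 1) for count in publication_counts]
    n = len(logs)
    mean_log = sum(logs) / n  # ln(m_g): log of the geometric mean
    # ln(s_g): log of the geometric standard deviation.
    # Assumption: population standard deviation (divide by n).
    sd_log = math.sqrt(sum((v - mean_log) ** 2 for v in logs) / n)
    if sd_log == 0:
        raise ValueError("all counts are identical; scores are undefined")
    return [(v - mean_log) / sd_log for v in logs]


# Invented counts for a single research area, for illustration only.
scores = area_scores([0, 1, 3, 7])
```

By construction, the scores within a research area have mean 0 and standard deviation 1, and an observation whose ratio to the geometric mean equals the geometric standard deviation receives a score of 1, as stated in the note.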

Appendix 1
Descriptive statistics by male and female groups for the variables in Table 1. Values for the male group are shown to the left of the slash and values for the female group are shown to the right of the slash.