Sex differences in scientific productivity and impact are largely explained by the proportion of highly productive individuals: a whole-population study of researchers across six disciplines in Sweden

ABSTRACT Sex differences in human performance have been documented across a wide array of human endeavours. Males tend to exhibit higher performance in intellectually demanding and competitive domains, and this difference tends to be more pronounced the higher the level of performance. Here, we analyse publishing performance for the whole population of associate and full professors in relatively sex-balanced disciplines, namely Education, Nursing and Caring Science, Psychology, Public Health, Sociology, and Social Work, comprising 426 women and 562 men. We find that sex differences in the number of publications, citations, and citations per publication were small across low and medium levels of productivity, but become more pronounced the higher the level of performance. In the top performing 10% the female proportion decreases from the average 43.2% to 26% (25 F, 71 M), which further decreases to 15% in the top 5%. The results are discussed with respect to the greater male variability hypothesis, sex differences in psychological traits, and environmental factors such as sex discrimination.


Introduction
In all societies there are occupations that are segregated with respect to sex, some very much so. Hairdresser, bricklayer, nurse, carpenter, psychotherapist, auto mechanic, social worker and computer technician come to mind. In fact, few occupations have close to equal proportions. Amongst the 30 most common occupations in Sweden, for example, only 'management and organisation developers' and 'cooks and cold-buffet staff' are close to 50%, followed by 'customer and service staff' and 'upper-secondary school teachers' at just above 40% males (Statistics Sweden 2020, 66). With regard to success, there are striking examples of domains where men are more successful, such as Nobel laureates, tech entrepreneurs, billionaires, music composers, and movie directors.
Sex segregation is seen as undesirable in many Western countries, and demands are raised to equalize the proportion of men and women in the labour market, in particular for male-dominated high-status occupations (Bonitz 2017; Browne 2018). Concrete measures to this end tend to be issued in countries with a large tax-funded public sector. For example, Swedish authorities reported having implemented 950 state-funded projects for increasing sex equality within academe between 1985 and 1994 (Utbildningsdepartementet 1995).
Academe is a particularly pertinent domain in which to study sex segregation and evaluate measures to reduce it. It is often publicly funded, and therefore controlled more or less directly by governmental authorities as well as indirectly through the allocation of funding to specific disciplines or for specific purposes. Its revenue is not contingent upon successful competition for price versus performance on an open market. This offers, in principle, the stability of resources for sustainability, and hence for the application of long-term perspectives, even in the face of apparent retrograde. Academe is also hugely variegated and caters to a wide range of interests and abilities, and yet applies the same fundamental conditions and criteria within the same organization. Scientific disciplines from the humanities to the natural sciences cover a vast array of subject matter, and there are also several different ways to be successful within disciplines. It could be as a creative innovator, an entrepreneur who can form and direct a team and procure funding for it, or simply in terms of research output, either as a specialist or a generalist, for example, an expert in a field or a method, or an integrator of knowledge and prolific writer. Taken together, these characteristics would seem to render academe an uncommonly level playing field across demographic categories while yet at a high level of performance.
Nevertheless, there is profound segregation with respect to sex in academe, both horizontally and vertically. The humanities and the social and life sciences tend to be dominated by females, whereas the natural sciences, technology, and engineering tend to be dominated by males. Female students receive a majority of the graduate and undergraduate degrees awarded in broad academic fields, such as health care and social services (82%), teaching methods and teacher training (80%), and in the humanities and social sciences, as well as law, business, and administration (64%) (Statistics Sweden 2020). In contrast, females receive a minority of degrees in the natural sciences, mathematics and information and communication (42%), and less than 20% in technology and engineering (Statistics Sweden 2016). None of the narrower fields included in those statistics has a proportion close to 50%, and many of them are even more extreme (Statistics Sweden 2020, 38). The vertical segregation in academe exhibits a different and generally one-sided pattern, where the proportion of females decreases with higher academic rank, and females occupy only 28% of the top level of full professors, as of 2018 (Statistics Sweden 2020).
In conclusion, both the horizontal and vertical patterns of sex segregation in academe are similar to those in society as a whole, as well as across societies, even though academe allows for diversity in interests and types of performance within the same organizational structure. Academic institutions strive mightily to decrease sex differences in most Western countries (Bonitz 2017), and although the overall proportion of females has increased substantially in recent decades, this seems not to be accompanied by a corresponding increase in their relative performance, nor their attainment of the higher ranks (Bendels et al. 2018; Joanis and Patil 2022; Madison and Fahlman 2020; Sebo, de Lucia, and Vernaz 2021; Strumia 2021; Yu and Madison 2021). The obvious question is why these patterns are so persistent, in spite of more than a century of striving for equal rights and opportunities for women, and 50 years of state and government interventions to counter them in countries like Sweden. In other words, if the present conditions do not lead to sex equality in outcomes, which conditions would? The answer lies, of course, in what the causes are.
Here, we aim to elucidate this question in the light of sex differences in scientific productivity and psychological traits, also considering commonly suggested environmental factors. Stern and Madison (2022) note that two fundamentally different explanations for sex differences are proposed. The most common one, in public discourse in particular, is that men and women tend to conform to expectations and to act according to cultural and social norms, often construed as reflecting a male-dominated power structure (for reviews see, e.g. Rosser 2004; Walby 1990). This leaves open why these particular norms exist in the first place. The second explanation acknowledges that men and women tend to gravitate towards activities that suit their genetically shaped predispositions, in terms of their different evolved physical, physiological, and psychological traits (e.g. Archer 2019; Stern and Madison 2022). Thus, the first focuses on causes in the environment, and the second on causes in biology. There is empirical support for both, and they are in no sense mutually exclusive. For the purpose of the present study, we note four particularly relevant findings.
First, females exhibit more interest in people and males more in things, on average, also known as the empathizing-systematizing dimension (Baron-Cohen 2010). This is associated with strongly differentiated preferences for people-oriented and things-oriented educational and occupational domains, with multivariate effect sizes well above 1 (Morris 2016). This would seem to explain a substantial part of the horizontal segregation, particularly for the STEM and nursing/caring sciences (Luoto 2020). However, science is by its very nature concerned with finding patterns, which is arguably the essence of systematizing (Baron-Cohen 2020). It is therefore reasonable to expect that more males than females strive towards higher levels of systematizing, which in science may correspond to reducing more data into fewer constructs and explaining more phenomena with simpler theories (Wilson 1998). Inasmuch as this is commensurate with scientific ideals and values, a stronger urge for systematizing should foster research outcomes that lead to greater success, in terms of attracting students, collaborators and funding, and being cited and invited to present one's work, for example. This sex difference could thereby also lead to vertical segregation.
Second, success in academe is associated with several psychological traits, including intelligence, personality, and preference for risk-taking (van der Linden, Dutton, and Madison 2020, 136). Groundbreaking research and novel theories are associated with a high risk of failure, and some funding bodies therefore explicitly encourage 'high-risk research', such as the National Institutes of Health (https://commonfund.nih.gov/highrisk) and the European Research Council (https://erc.europa.eu/apply-grant/advanced-grant). Females have a substantially lower preference for taking risks, cross-culturally (Falk and Hermle 2018).
Of course, these two types of sex differences might be dismissed as transient and fading effects of historical conditions that linger on because of cultural norms. There is, however, a vast literature documenting associations between psychological sex differences and sex hormones and other biomarkers, supporting a biological basis (e.g. Berenbaum and Beltz 2016; Miller and Halpern 2014; Schmitt 2017).
A third observation that more generally supports a biological rather than environmental basis for many sex differences is that they are larger in more sex-equal countries, whereas they would reasonably be smaller if they were caused by social attitudes, norms, and the like. This is known as the gender equality or sexual paradox (Pinker 2009). In fact, differences in interests and preferences, occupational and educational choices, personality, and several other psychological traits tend to be larger in countries with more sex-equal and feminist attitudes and smaller in countries with more traditional sex roles and division of labour (for brief reviews, see Schmitt et al. 2017; Stern and Madison 2022, 695). This is consistent with the idea that the greater freedom of choice and the often higher standard of living in these countries allow people to choose according to their interests and preferences rather than their needs for sustenance. Related to academic success, females in countries with higher sex-egalitarian values and goals tend to exhibit lower relative performance in STEM subjects (Stoet and Geary 2018), lower preference for risk-taking (Falk and Hermle 2018), and lower aspirations for occupations higher on the people-things dimension (Stoet and Geary 2022).
Fourth and finally, the so-called greater male variability hypothesis reflects the observation that males are generally more variable across a wide range of traits, both psychological (Borkenau et al. 2013; Thöni and Volk 2021) and physiological (Lehre et al. 2009), a phenomenon that is also found across species (Archer and Mehdikhani 2003). Being a highly demanding and competitive domain, the academic system is likely to select and retain individuals who are at the upper extremes in intellectual ability, motivation, perseverance, and whatever other psychological traits that favour academic productivity (Chamorro-Premuzic and Furnham 2008; Grosul and Feist 2014). This implies that males be relatively more numerous at higher levels of performance.
In conclusion, norms and expectations would lead to the typical horizontal sex segregation, as would the people-things preferences. A male-dominated power structure could lead to bias against women, for example in terms of limited access to resources and higher rank positions. The first would manifest as lower performance across ranks and the second as higher female performance within the same rank, because those who reach it would have had to overcome that bias by being more productive than males. These predictions might be moderated by the proportion of each sex, as a male-dominated power structure would be less potent in domains with larger proportions of females. Thus, there should be an association between this proportion and the relative general female performance, either becoming higher due to less thwarting of resources or lower due to less bias in attaining higher ranks. To the extent that systematizing leads to more profound and hence higher valued scientific contributions, this would lead to higher performance for males. However, because scientific work is generally highly intellectually demanding it can be assumed that such contributions require particularly high levels of systematizing. As these high levels are rare in the population, and much more so in females (Baron-Cohen 2010), this would lead to more males reaching distinguished levels of performance. This same prediction also applies to sex differences in risk-taking and other psychological traits associated with academic success in terms of differences in distribution, according to the so-called greater male variability hypothesis.
Studies of sex differences in academic publishing productivity indicate robust and fairly large male advantages across many disciplines, even after controlling for rational differences such as active years and academic rank. Thus, males have been found to publish 35% more than females in biology (Laurance et al. 2013), 40% more in ecology and evolutionary biology (Symonds et al. 2006), 30% more in economics (Manchester and Barbezat 2012), and 28% more in computer science, chemistry, electrical engineering, microbiology, and physics (Fox 2005). Likewise, the higher male than female productivity was found to be 43% in psychology (D'Amico, Vermigli, and Canetto 2011; Geraci, Balsis, and Busch 2015), 64% in organizational psychology (Fell and König 2016), 55% in sociology (Aksnes et al. 2011), 60% in sociology and linguistics (Leahey 2006), and 26 and 31% in political science, depending on whether non-SSCI or SSCI articles were considered (Schröder, Lutter, and Habicht 2021). In medicine males have been found to publish 41% more (Raj et al. 2016) or 45% more (Rachid et al. 2021), but the sex differences are often reported to be even larger, on the order of double or more when measured as the median (Holliday et al. 2014; Sebo, de Lucia, and Vernaz 2021). Amongst PhD students in medicine males had only 17% more journal articles (Frandsen et al. 2015). Fridner et al. (2015) represented productivity as having more or less than 16 publications, showing that amongst physicians males constituted 68% of the first category but 31% of the latter. Studies that consider several disciplines have found males to have 50% more publications (van den Basselar and Sandström 2017), 47% (Nielsen 2016), 88% (Puuska 2010), 35% (Rörstad and Aksnes 2015), 23.5% across the life sciences (Lerchenmueller and Sorenson 2018), 43% amongst researchers younger than 35 years of age (Prpic 2004), 40% in the social sciences and humanities, 46% in science and engineering and 58% in the health sciences (Lariviére et al. 2011), and 37% for peer-reviewed articles and 23% for non-peer-reviewed (Nakhaie 2002). The same tendency is also found by studies that do not directly quantify the number of publications per individual, whether in terms of ∼50% higher overall productivity (van den Basselar and Sandström 2016), a predictor in regression modelling (Horta and Santos 2016; Lindahl, Colliander, and Danell 2020), or larger proportions of males amongst high-productive authors (Aguinis, Hun Ji, and Joo 2018; Bendels et al. 2018). No difference was found in management studies, however (Williamson and Cable 2003). Even samples that are effectively matched on variables associated with productivity, such as age or academic position, exhibit the same trend. For example, males had more publications amongst academics who applied for tenure, regardless of whether it was granted (19.7% more) or not (10.7% more) (Riis, Hartman, and Levander 2011) or had just been appointed or promoted to full professors (Madison and Fahlman 2020).
With regard to the greater male variability hypothesis, several studies have found that the higher male performance becomes more pronounced the higher the level of performance. In other words, the proportion of males seems to increase with higher productivity (Abramo, Aksnes, and D'Angelo 2021; Aguinis, Hun Ji, and Joo 2018; Bendels et al. 2018; Chan and Torgler 2020; Holliday et al. 2014; Huang et al. 2020; Strumia 2021). Specifically, Huang et al. (2020, Figure 2) found, across 13 disciplines, similar annual and total productivity for the low and middle 20% performing researchers, but higher male productivity for the top 20%, a result also found for economics (Liu, Song, and Yang 2020). Amongst Nature Index journals, 'Women are underrepresented at prestigious authorships compared to men . . . [which] accentuates in highly competitive articles attracting the highest citation rates, namely, articles with many authors and articles that were published in highest-impact journals' (Bendels et al. 2018, 1) and 'we found significant gender-based differences in the heaviness of the distributions' right tails (i.e. lighter for women), . . . the underrepresentation of women is more and more extreme as we consider more elite ranges of performance' (Aguinis, Hun Ji, and Joo 2018, 17). However, a fundamental problem with these eight studies is that their samples are sourced from publication databases, and therefore ignore researchers with naught publications or whose publications are not included in the database. Such data may be biased. For example, if males are more frequent in this group their naught values would not contribute to the mean, leading to spuriously higher performance for males. It is therefore essential to employ a population-representative sample of academics, so that we include all individuals who are expected to publish, not only those who have actually published in the annals considered (cf. Madison and Sundell 2022). The sample should preferably also be drawn from disciplines that have relatively equal numbers of men and women, so as to avoid possible confounding effects of extremely sex-skewed environments, such as alienation, bias, or interest in the subject matter. Such effects have been suggested for engineering (Richman, van Dellen, and Wood 2011) and medicine (Han et al. 2018), for example. Finally, we should avoid samples that may have been subject to strong interventions, as this would bias the productivity differential between the sexes. A typical case would be that a full professor position entails considerably more time for research than that available to assistant or associate professors, which will likely increase publication performance after a few years (Rörstad and Aksnes 2015). If a less productive person is given such a position, these extra resources will counter that lower productivity. This may strongly bias the comparison of groups, inasmuch as they are systematically affected by such unequal treatment.
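The argument about database-sourced samples can be illustrated with a toy calculation. The sketch below is not taken from any of the cited studies: the two lists of publication counts are invented for illustration, and the function names are hypothetical. It only shows that two groups with identical population means can appear to differ when individuals with naught publications are left out, as happens when the sample is drawn from a publication database.

```python
from statistics import mean

def population_mean(counts):
    """Mean over everyone expected to publish, including naught values."""
    return mean(counts)

def database_mean(counts):
    """Mean over those who appear in a publication database (count > 0)."""
    published = [c for c in counts if c > 0]
    return mean(published)

# Hypothetical publication counts (assumption: invented for illustration).
men = [0, 0, 0, 4, 4]    # more individuals with naught publications
women = [0, 2, 2, 2, 2]

# Same population mean for both groups ...
print(population_mean(men), population_mean(women))
# ... but a spurious male advantage once the naught values are dropped.
print(database_mean(men), database_mean(women))
```

In this invented example the direction of the distortion follows from which group has more naught values; with real data the size and direction of the bias would have to be established empirically.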
Both productivity and citations are important for the assessment of merit for academic positions. More citations seem to be mainly a trivial effect of more publications, however. Sex differences in group average citations are rarely found when controlling for academic age (Nosek et al. 2010) or measuring citations for a certain time period (Aksnes et al. 2011; Frandsen et al. 2015; Nielsen 2016). Aksnes et al. argue that differences in citations even under these conditions are due to a cumulative effect of having more publications. Also, no sex bias tends to be found in self-citations or in a tendency to cite publications written by one's own sex (Mishra et al. 2020; Strumia 2021). However, even a small tendency to favour male researchers or research topics that are studied more frequently by males is likely to yield more citations to articles authored by males. In order to evade the reported effects of potentially having more publications overall, impact should be assessed by the number of citations per publication.
We choose Sweden as a model to test the predictions outlined above, for two main reasons. First, it is one of the most sex-equal countries (World Economic Forum 2015, p. 8), and has exercised governmental power through interventions to increase the number of females amongst university professors (Madison 2019). Second, our own knowledge about its academic system allows us to identify the target population with reasonable accuracy. We choose disciplines that are high in the people-end of the people-things dimension, so as to be maximally attractive to female researchers and hence have a large proportion of females: education, nursing and caring science, public health, psychology, social work, and sociology.
The studies reviewed above, which report greater male productivity, do not make it possible to discriminate between two quite different possible causes: females being disadvantaged in terms of less funding and time for research, and sex differences in average levels of traits associated with academic success. Therefore, our design includes the variable academic rank, meaning that individuals must have fulfilled certain academic criteria to attain this rank. Here, we employ two levels, associate and full professor, and assess their performance in terms of publications, citations, and citations per publication.
We hypothesize, first, that there are no sex differences in productivity or impact within each of these rank levels, because the criteria for attaining them are the same for both sexes. However, we add the possibility that female performance is generally higher, consistent with bias against females when evaluating them for a higher rank (a detailed account of this logic is given by, e.g. Madison and Fahlman 2020). The second hypothesis is that citations per publication is smaller for females for the same level of productivity, reflecting a bias against citing female researchers. The third hypothesis is that female researchers are relatively more productive in disciplines with larger proportions of females, because this renders a possible male-dominated power structure less potent. The final two predictions refer to the greater male variability in traits associated with academic success, such as intelligence, motivation, and several personality dimensions. Thus, the fourth hypothesis is that male variance in productivity is higher, in terms of publications, within both ranks. The fifth hypothesis more specifically states that sex differences be small and non-significant across low and medium levels of productivity but exhibit substantially higher male performance for high levels.
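One way to make the fifth hypothesis concrete is to tabulate the proportion of females amongst the most productive fraction of researchers and compare it with the proportion in the whole group. The following is a minimal sketch under stated assumptions, not the procedure used in this study: the function name, the record layout, the rounding of the cut-off, and the example data are all hypothetical, and ties at the cut-off are resolved arbitrarily by sort order.

```python
def female_share_in_top(records, top_fraction):
    """Share of females amongst the top `top_fraction` of researchers.

    `records` is a list of (sex, publications) tuples with sex coded as
    "F" or "M" (hypothetical layout). Ties at the cut-off are broken by
    the original order of the records.
    """
    ranked = sorted(records, key=lambda r: r[1], reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    top = ranked[:n_top]
    return sum(1 for sex, _ in top if sex == "F") / n_top

# Invented example data: five females and five males.
records = [("F", 1), ("M", 2), ("F", 3), ("M", 4), ("F", 5),
           ("M", 6), ("F", 7), ("M", 20), ("M", 30), ("F", 8)]

overall = female_share_in_top(records, 1.0)   # share in the whole group
top_half = female_share_in_top(records, 0.5)  # share in the top 50%
top_fifth = female_share_in_top(records, 0.2) # share in the top 20%
print(overall, top_half, top_fifth)
```

A declining share across increasingly selective cut-offs would be the pattern predicted by the hypothesis; a flat share would speak against it.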

Design
A cross-sectional design examines the effects of sex, academic rank, and discipline on the numbers of publications and citations, in the following called just publications and citations. The dependent variables also include the derived measures citations per publication and the distribution of publications. From the whole area of academe we sample the disciplines education, nursing and caring science, public health, psychology, social work, and sociology. This is motivated by their relatively equal sex proportions, consistent with their focus on people versus things. Whether their variation in sex proportion allows for evaluating H3 remains to be seen. We argue that disciplines with extremely skewed sex proportions amongst the research staff may be less representative for the phenomena of interest here. For example, the proportion of female academics in Sweden is 15.2% in electrical engineering, 19.5% in physics, 22.1% in computer and information science, 27.2% in philosophy, 61.1% in language and literature, 65.7% in education, and 70.7% in veterinary medicine (Statistics Sweden 2022). Some disciplines in the humanities have similar near-equal sex proportions, but are unsuitable because their low publication rates and different publication patterns provide little data (Huang and Chang 2008).
To maintain a fairly equal configuration of sex within the docent and professor groups across the measurement period, the data cover the period 2000-2009, which is effectively prior to the rapid increase in female professors from ∼19 to ∼25% between 2009 and 2014 (Madison and Fahlman 2020, Table 1). The data therefore cannot be influenced by the possible preferential hiring of female professors indicated by that study. Although we have no evidence as to the level of such bias before 2009, it is most likely smaller because the proportion of female professors did not increase during this period, being 18% in 1999, 17% in 2005, and 18% in 2007 (Statistics Sweden 2015).

Participants
We consider all academics that are expected to perform and publish research in the selected disciplines as the target population. The Swedish academic system mainly considers three levels of academic rank, lektor, docent, and professor. Lektor and professor are occupational positions, while docent is a purely academic title, corresponding to associate professor in the US and reader in the UK. Lektor means lecturer and is the starting academic position, similar to assistant professor in the US. Professor is the highest academic rank in Sweden, corresponding to tenured or full professor in the US. These ranks mainly differ in the proportion of time available for research. A lecturer teaches full time, but is often given 20-40% time for research if appointed to docent, while a professor has 30-70% of full time for research.
The population was identified by first determining departments with personnel holding a graduate degree in the target disciplines in 2010. The Swedish National Agency for Higher Education indicated 99 eligible departments or work units at 27 institutions in Sweden. Next, information about docents and professors at these departments was obtained, identifying a total of 1052 researchers active in these disciplines in 2010. Adjunct, guest, or emeritus professors were excluded, because their position is transient and their publication pattern often non-typical. The remaining 988 researchers were assigned to the discipline of their department, but when departments covered multiple areas of research they were assigned the discipline best corresponding to their research profile. The vast majority of these would have worked in academe for the full 10-year period, because it typically takes at least that amount of time to become docent after starting a PhD. We cannot exclude the possibility that some individuals spent some part of this period outside academe, or held positions with little opportunity to conduct research, but this is unlikely to interact systematically with sex.

Publication data
The selected disciplines are strongly empirical, and journal articles are the primary outlet for timely empirical results. We argue that journal articles are the most fair and useful type of publication by which to gauge productivity in these fields. First, the academics themselves know that articles are an important merit, and therefore strive to publish a substantial share of their work in journal articles. Second, they are subject to some level of quality control through peer-review, in that other academics in the same or similar fields accept the premises, methods, and presentation of the research. Third, journal articles are collected in global, comprehensive, and well-structured databases, which provide more equal probabilities of finding what researchers have actually published, as compared to searching different sources for different disciplines. Databases like the Web of Science also add another level of quality control by applying their own inclusion criteria for journals. Taken together, these arguments point to journal article metrics being measures of overall performance rather than merely quantity, because they reflect what has passed several levels of arbiters. Aggregated across several publications for each researcher, they should provide valid and reliable measures for comparing individuals within these disciplines (Testa 2009). Thus, numbers of journal articles and citations of those articles during the period 2000-2009 were obtained for each researcher from the Web of Science (WoS). They include articles written in any language. We do not consider co-authorship because it has been found to differ very little or not at all between the sexes (Abramo, D'Angelo, and Di Costa 2019), adjusting for numbers of co-authors is generally not found to make a difference (Ruscio et al. 2012), and even when females are found to have slightly more co-authors it did not mediate productivity or impact (Fell and König 2016). We also do not apply field normalization, because it is superfluous when comparing two groups that have comparable numbers of publications within each discipline.
The detailed search process is described in Madison and Sundell (2022), including the coverage of these disciplines in WoS, handling of ambiguous author names, identifying the correct author, types of articles, etcetera.

Statistical analyses
Publication and citation data are positively skewed, and were individually log-transformed using $\log_{10}(Y + 1)$. Group means are also reported as the arithmetic mean and the geometric mean (anti-log of the mean of the transformed data, i.e. $10^{\bar{Y}} - 1$, where $\bar{Y}$ is the mean of the transformed values). The male/female variance ratio (VR) for assessing H4 was computed as the male $s^2$ divided by the female $s^2$, based on individually $\log_{10}$-transformed data. Effect sizes are Cohen's d computed with pooled SD (Olejnik and Algina 2000, 245) and degrees of freedom according to the Welch-Satterthwaite equation. Simultaneous effects of variables including sex, discipline, and academic rank were estimated by multiple regression and semipartial correlations.
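The quantities described above can be sketched in a few lines of Python using only the standard library. This is an illustration of the stated definitions, not the analysis code used for this study; the function names are hypothetical, and the pooled SD uses the common formula weighted by degrees of freedom, which we assume corresponds to the cited definition.

```python
import math
from statistics import mean, variance

def log10_plus1(values):
    """Transform raw counts Y to log10(Y + 1)."""
    return [math.log10(v + 1) for v in values]

def geometric_mean_plus1(values):
    """Anti-log of the mean of the transformed data, minus 1."""
    return 10 ** mean(log10_plus1(values)) - 1

def variance_ratio(male, female):
    """Male sample variance divided by female sample variance,
    both computed on log10(Y + 1) transformed counts."""
    return variance(log10_plus1(male)) / variance(log10_plus1(female))

def cohens_d_pooled(a, b):
    """Cohen's d with the pooled standard deviation of groups a and b."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled_var)

def welch_satterthwaite_df(a, b):
    """Approximate degrees of freedom for two groups with unequal variances."""
    na, nb = len(a), len(b)
    va, vb = variance(a) / na, variance(b) / nb
    return (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
```

For example, `cohens_d_pooled(log10_plus1(male_counts), log10_plus1(female_counts))` would give the effect size on the transformed scale, where `male_counts` and `female_counts` are placeholder names for lists of raw publication counts.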

Results
In summary, the data comprise 16,038 publications that have been cited 158,365 times. The 43.2% females (426 F, 562 M) have authored 31.2% of the publications (5,007 vs. 11,031) and received 29.9% of the citations (47,295 vs. 111,070). Table 1 shows descriptive data for both transformed and untransformed variables across all participants, demonstrating that the strongly skewed raw data reach acceptable skewness and kurtosis (±2.0) when $\log_{10}$-transformed. These data are further detailed as a function of sex and discipline in Table A1 and as a function of sex, discipline, and academic rank in Table A2.
Noting that the minimum value is naught for all variables, we examine and confirm that this holds across all disciplines. In other words, there are researchers in every discipline who have no publication indexed in WoS at all, and hence also no citation, regardless of academic rank. This prompts a further exploration of how common this is, and to what extent it differs between the sexes. Amongst docents, females constitute 60.5% of those with zero publications in WoS (52 F, 34 M) and 59.2% of those having 0-2 publications (87 F, 60 M). These are slightly larger proportions than the 51.3% females in the docent population (232 F, 220 M). This is probably a trivial effect of a somewhat younger academic age of the female docents, implied by the fact that the number of females in academe has increased faster than males in the last decades. Amongst professors, females constitute 33.3% of those with zero publications (26 F, 52 M) and 35.4% of those having 0-2 publications (57 F, 104 M). This is slightly less than the 36.2% females in the professor population (194 F, 342 M), meaning that male professors tend to produce less at these lowest levels. It may seem peculiar that 86 docents and 78 professors had naught publications, but this is partly accounted for by the selectivity of WoS in excluding grey literature, such as book chapters and reports.
Table 2 lists the sex differences separately for each discipline and publication metric, with effect sizes ranging from small to medium. The difference between the two least productive disciplines (Education and Social Work) and the two most productive (Public Health and Nursing and Caring Science) is about two orders of magnitude for citations and about one order of magnitude for the two other metrics. This interaction between sex and discipline is illustrated in Figure 1 for publications and Figure 2 for citations. Citations per publication trivially reflect the same pattern, and are not plotted.
With regard to the mean difference across disciplines, the very large differences between them lead to underestimated effect sizes. Instead of computing the mean male-female d across disciplines and researchers, we therefore compute it for each discipline and then take the mean of these six values, which is 0.406 for publications, 0.376 for citations, and 0.257 for citations per publication. This clearly contradicts Hypothesis 1 (H1), that there is no sex difference, and as these effects are positive, they also contradict the variant of H1 that female performance is higher because of bias against females when promoting to higher ranks.
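The reason for averaging within-discipline effect sizes can be illustrated with a minimal sketch. The values below are hypothetical log-scale scores, not data from this study, and `cohens_d` is an illustrative helper using the pooled SD. When disciplines differ greatly in overall level, the between-discipline spread inflates the pooled SD and shrinks the d computed across all researchers, whereas the mean of the per-discipline values is unaffected.

```python
import math
import statistics

def cohens_d(male, female):
    """Cohen's d with pooled SD (illustrative helper)."""
    n_m, n_f = len(male), len(female)
    pooled_var = ((n_m - 1) * statistics.variance(male)
                  + (n_f - 1) * statistics.variance(female)) / (n_m + n_f - 2)
    return (statistics.mean(male) - statistics.mean(female)) / math.sqrt(pooled_var)

# Hypothetical log-scale scores for a low- and a high-productivity discipline.
disciplines = {
    "low":  {"male": [1.0, 1.2, 1.4], "female": [0.8, 1.0, 1.2]},
    "high": {"male": [3.0, 3.2, 3.4], "female": [2.8, 3.0, 3.2]},
}

# Effect size within each discipline, then the mean of those values.
per_discipline_d = [cohens_d(g["male"], g["female"]) for g in disciplines.values()]
mean_of_d = statistics.mean(per_discipline_d)

# Effect size computed across all researchers, ignoring discipline.
all_male = [x for g in disciplines.values() for x in g["male"]]
all_female = [x for g in disciplines.values() for x in g["female"]]
pooled_d = cohens_d(all_male, all_female)

print(f"mean of per-discipline d: {mean_of_d:.2f}")
print(f"d across all researchers: {pooled_d:.2f}")
```

In this constructed example the within-discipline difference is identical in both disciplines, yet the d computed across all researchers is several times smaller than the mean of the per-discipline values.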
In the following analyses the data are aggregated across academic rank, except when addressing specific hypotheses related to rank. This is based on several observations that performance is fairly evenly distributed across the ranks, including visual inspection of scatter plots. Indeed, professors can be found amongst the least productive researchers and docents amongst the most productive. Professors had 41.7% more publications (based on geometric means), which is not very much considering that a substantial difference is expected due to both higher competence and many more active years. For comparison, the male-female difference is 49.7% for docents and 13.3% for professors. The corresponding male-female d for publications across disciplines was 0.436 for docents and 0.271 for professors, and the mean professor-docent d for publications across disciplines was 0.612 for females and 0.432 for males (Cohen's d based on log10 values). Also, the lowest numbers of publications, reported above, exhibit an inconsistent pattern in relation to sex and rank.

The third hypothesis is that the relative performance of females increases with the proportion of females in a discipline. This was assessed by the associations of these variables across disciplines. Figure 3 plots the female-male differences in terms of effect size (d) in the top of the graph, related to the left ordinate, and the overall productivity in terms of publications as bars, related to the right ordinate. The proportion of females is represented along the abscissa, sorted by rank order and marked with the concomitant discipline. There was no sex difference for Education, but lower female performance for all other disciplines and metrics, except citations per publication for Nursing and Caring. Citations per publication are by definition controlled for productivity, but are included for completeness. These plots exhibit no systematic relationship to the proportion of females, but rather a jagged pattern with increasingly higher male performance for Sociology, Nursing and Caring, Public Health, Social Work, and Psychology. Indeed, zero-order correlations between the proportion of females and the sex differences varied from .01 for publications to .20 for citations, and to .41 for citations per publication. There is thus no correlation at all for productivity as such, refuting H3, and very small correlations for the citation-based metrics, explaining between 1.7% and 17% of the variance. In contrast, correlations between the sex differences and the overall performance, in terms of publications, varied from −.75 for publications over −.45 for citations to −.05 for citations per publication. There is thus a substantial positive association between overall level of performance and higher male than female performance, explaining 56% of the variance for publications and 20% for citations.
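The variance explained by a zero-order correlation is its square. A minimal sketch for the two correlations with overall performance reported above (the variable names are illustrative):

```python
# Correlations between the sex difference and overall publication level, from the text.
correlations = {"publications": -0.75, "citations": -0.45}

# Share of variance explained is the squared correlation coefficient.
variance_explained = {metric: r ** 2 for metric, r in correlations.items()}

for metric, r2 in variance_explained.items():
    print(f"{metric}: r^2 = {r2:.4f}")
```

Rounded to whole percentages, these squares correspond to the 56% and 20% stated for publications and citations.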
The proportion of females differed substantially between docents (51.3%) and professors (36.2%). To control for possible confounding effects of this, both sex and rank were included as predictors in multiple regressions, summarized in Table 3. Publications as dependent variable addresses H1, again showing that males produce more articles, with sex explaining another 1.4% of the variance when controlling for rank and discipline. Citations per publication as dependent variable addresses H2, showing that neither sex nor rank explains any variance in addition to publications and discipline. Discipline was not included in this analysis because it is strongly correlated with publications and hence redundant.

H2 states that females get cited less because of either a bias against female researchers or less scientific impact of their work, as these two causes cannot be distinguished. Because previous research has found a strong association between this metric and publications, this hypothesis can only be assessed while controlling for publications. Figure 4 plots citations per publication separately for each sex, as a function of productivity, in terms of decile bins of number of publications, increasing from left to right. The number of researchers in each bin is also stated along the abscissa, and the tenth decile is further divided into four 2.5 percentile bins in order to project possible differences within the topmost levels of productivity. No systematic sex difference is exhibited, and H2 is thus falsified. Moreover, the plots exhibit a very strong effect of the total number of publications, represented by the percentile of productivity, suggesting an asymptote from the 80th to the 95th percentile, which in this material corresponds to 19.8 and 49.2 publications. This is reasonably explained mainly by the number of years since publication, as citations accumulate across the window of up to 10 years in these data, but also by the greater exposure of a researcher's work entailed by having more publications as advertisements for it (Aksnes et al. 2011).

Table 4 indicates a trend for the variance to be larger for male professors, with VR at around 1.20 (p < .05 only for publications), and this was also the case for docents and publications. This provides partial support for H4.
For assessing H5, that high levels of performance exhibit substantially higher relative male performance, researchers were sorted in order of their number of publications and binned into deciles, as seen in Figure 5. Because there was much variability in the topmost decile, this was further divided into four 2.5 percentile bins. The same procedure was repeated for citations per publication, under the assumption that this is largely a trivial effect of publications, and not as much an effect of greater male variability in traits associated with academic success. Figure 5 exhibits remarkably increased male performance in the highest decile for publications, in terms of an almost linearly increasing proportion of males across the three highest 2.5% bins. Specifically, the four 2.5% bins contain 11, 7, 4, and 3 females, and 13, 17, 20, and 21 males, of whom the females have authored on average 40, 49, 65, and 123 articles, and the males on average 41, 49, 72, and 169 articles. In other words, the performance of the most productive females essentially does not differ from that of the most productive males, except perhaps for the three females and 21 males in the highest 2.5% bin. The main difference lies in the number of the respective sex at these levels of performance, whereas there is essentially no difference at all in the sex proportion amongst the lower-performing 90% of researchers, as seen in Figure 5. This pattern stands in stark contrast to that for citations per publication, for which the female proportion is essentially higher than 35% throughout all levels of performance.

Figure 5. The proportion of females for each decile of publications and citations per publication, respectively. The tenth decile is further divided into four 2.5 percentile bins.

Notes: Variance ratios (VR), effect sizes (d), and geometric means and SDs are based on individually log10-transformed data. VR is the male s² divided by the female s², so that values >1 indicate larger male variance.

This result is further detailed in Figure 6, where publications for each researcher are plotted as a function of decile, with an added random scatter along the abscissa to increase visibility of individual data points. The highest performance region is also magnified by truncating publications below 6 and deciles below 4, which excludes 496 or 50.2% of researchers, 228 females and 268 males. The remaining 198 females are seen as circles and the 294 males as triangles in Figure 6, exhibiting the remarkable pattern that the variability is by far largest in the tenth decile, and that female researchers tend to be most numerous in, and even dominate, its lower region. Nevertheless, some female researchers have very large numbers of publications, the ten largest being 51, 52, 55, 61, 63, 66, 71, 105, 110, and 155, as compared to the ten largest male numbers at 131, 132, 144, 149, 168, 178, 181, 205, 506, and 642 publications. Thus, we conclude that there is strong support for H5.

With regard to disciplines, these highest performing 7.5% include eight females and 25 males in Nursing and Caring Science, five females and 16 males in Public Health, one female and 13 males in Psychology, and four males in Education, Sociology, and Social Work. Thus, even the lowest performing disciplines are represented at the top level. This is also true of academic rank, where the highest performing 7.5% include 13 docents (3 F, 10 M) and 59 professors (10 F, 49 M).
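The female proportions in the top decile follow directly from the bin counts reported above for publications. A minimal sketch, using only those counts (the helper `female_share` and the variable names are illustrative):

```python
# Researchers per 2.5% bin within the top decile of publications,
# ordered from the lowest to the highest bin, as reported in the text.
females = [11, 7, 4, 3]
males = [13, 17, 20, 21]

def female_share(f, m):
    """Proportion of females among f females and m males."""
    return f / (f + m)

per_bin = [female_share(f, m) for f, m in zip(females, males)]
top_10 = female_share(sum(females), sum(males))          # all four bins
top_5 = female_share(sum(females[2:]), sum(males[2:]))   # two highest bins

for i, share in enumerate(per_bin, start=1):
    print(f"bin {i}: {share:.1%} females")
print(f"top 10%: {top_10:.1%}, top 5%: {top_5:.1%}")
```

The shares for the top 10% and top 5% reproduce the approximately 26% and 15% given in the abstract.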

Discussion
We set out to better understand the causes for the general and consistent higher male academic performance reported in previous research, guided by three hypotheses related to social-environmental factors (H1-H3), including norms and biases based on a male-dominated power structure, and two hypotheses related to sex differences in psychological traits (H4-H5). H1 stated that because we consider two levels of rank that are attained according to merits, there should be no difference in productivity, or alternatively that females have higher productivity because they would have needed higher merits to counter bias against females. H1 was falsified, showing instead substantially higher male performance within both ranks. This suggests a bias against males, such that the bar for promotion is higher for males. The alternative explanation that females in fact had equal or higher merits when being promoted, but were then disadvantaged to the extent that their performance relative to males was reversed during their time within that rank, is not supported by studies that consider merits at the precise time of being appointed (Lutter, Habicht, and Schröder 2021; Madison and Fahlman 2020; Schröder, Lutter, and Habicht 2021). It also seems unlikely that such a large change could occur consistently across both rank and individuals with quite different career paths. H2 and H3 were also falsified, in that there was no sex difference in citations per publication at the same level of productivity and no substantive association between the proportions of females in disciplines and sex differences in productivity. This outcome of H2 is consistent with the finding that although publications with female authors are cited slightly less, this difference disappears when controlling for the total number of publications (Aksnes et al. 2011). However, a substantive association was found between sex differences in productivity and the overall level of productivity for each discipline, such that the higher the productivity in a discipline, the lower the relative female performance. This is consistent with the greater male variability hypothesis, according to which there are more males than females who have the rare combination of traits that lead to a very high level of performance.

Likewise, the two predictions directly related to the greater male variability hypothesis were supported. Consistent with H4, males had higher variance in the number of publications, but more so for professors than for docents. This difference might be an effect of having categorized academic performance, such that most particularly high-performing individuals have already been sorted into the professor category, hence removing variability from the docent category. Another explanation could be inflated variability amongst female docents, due to either a greater inflow of younger academics (Madison and Fahlman 2020, Table 1) or the apparently lower bar for promotion of females to docent. Table 4 shows that metrics are substantially lower for female docents, but less so for professors. That females' variability is smaller for docents than for professors speaks against this, however.
The specific contribution of greater male variability was estimated by excluding the top performing 7.5% of researchers, amongst whom males were clearly overrepresented. Males still produced 5.8% more articles within the 92.5% of researchers with even sex proportion, which can be interpreted such that greater male variability accounts for 91.3% of the sex difference, or 61.3 percentage points. The remaining 5.8% is more difficult to interpret. That both sexes exist within the same two higher levels in an academic rank system suggests bias against males. Specifically, the promotion of less productive females could explain all or more of this productivity difference. If there is no such sex bias, in contrast, the explanation could be a productivity sex differential, such that females who reach a higher rank tend to relax their ambition and effort, or males increase theirs, or a combination of both. Nevertheless, previous research shows a robust productivity difference with the same magnitude even across academic rank, which speaks against sex bias as the sole explanation for the productivity difference. Both these explanations are consistent with sex differences in motivation, as females to a much greater extent report deriving pleasure from social relations (e.g. Bleske-Rechek and Gunseor 2021) and consider having children an important factor in career planning (Martinez et al. 2007). Females invest more time in childbirth and caring for children, but studies of the effect of this on academic performance are inconsistent and are often, through their design, unable to speak to the causal direction (see Stack 2004 for a meta-analysis and Lindahl 2020 for a review). More rigorous studies that control for confounding variables show that having children negatively affects the productivity of both sexes, but only marginally decreases the sex difference (Lutter and Schröder 2019).
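The decomposition of the sex difference into a component attributed to the top 7.5% and a remainder can be reconstructed arithmetically. This is a minimal sketch that assumes the two components are additive in percentage points, which is an interpretation of the figures in the text rather than a documented computation.

```python
# Figures taken from the text (percentage points of the male-female difference).
top_component = 61.3   # attributed to the top-performing 7.5% of researchers
remainder = 5.8        # difference remaining among the other 92.5%

# Assumption: the total difference is the sum of the two components.
total_difference = top_component + remainder
share_top = top_component / total_difference

print(f"total difference: {total_difference:.1f} percentage points")
print(f"share attributed to the top 7.5%: about {share_top:.0%}")
```

Under this assumption the top 7.5% account for roughly 91% of the difference, in line with the 91.3% reported, with the small discrepancy presumably due to rounding of the inputs.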
Females are also less competitive (e.g. Deaner et al. 2015) and less inclined to assume leadership roles (e.g. Davies et al. 2017). There are very few studies of professional time-use, but Manchester and Barbezat (2012) show that male academics in economics spent 29% more time on research, and published 30% more articles in peer-reviewed journals. There may of course be some combination, such that greater male variability exerts its influence also within the 92.5% of non-top performing researchers, and that motivation, interests, abilities, and time-use contribute to different degrees at all levels of performance.
In summary, the three hypotheses consistent with social-environmental factors were falsified, and the greater male variability hypothesis was supported through H4 and H5. Based on the present and reviewed studies, the conclusion must be threefold: that greater male variability contributes substantially to males' greater productivity, in terms of a larger proportion of male researchers who are very productive; that males are generally somewhat more productive, as an effect of greater motivation and time use; and that there is a tendency to promote females with lower publication merits.
Increasing the number of female professors has been an explicit political goal in Sweden since the 1980s (Madison 2019). The so-called promotion reform in 1998 (Lindberg, Riis, and Silander 2011) made it possible to be promoted to professor without being evaluated in competition with other applicants. In combination, these factors seem likely to have contributed to the rapid increase of female professors up to 2007 (Lindberg, Riis, and Silander 2011) and 2014 (Madison and Fahlman 2020), and as a consequence to their decreased relative merits. Indeed, female academics who were hired as, or promoted to, full professor in Sweden across the period 2009-2014 had as a group 40% fewer publications in medicine and 45% fewer publications in the social sciences, compared to the group of newly appointed male professors, suggesting that females were given access to such resources at a lower level of merit than males. Similar preferences for females are commonly reported in Western countries in recent years, both in academic hiring (Carlsson et al. 2020; Lutter, Habicht, and Schröder 2021; Schröder, Lutter, and Habicht 2021; Williams and Ceci 2015) and funding (Bol, de Vaan, and van de Rijt 2022).
The main limitation of the present study is the simplistic nature of the data, which does not allow us to account for a range of moderating variables. They could have included the type and quality of publications, e.g. primary or secondary empirical studies or theoretical studies, and biographical variables, such as marital status, number of children, or chronological and academic age (e.g. time of PhD, first publication, and promotion to academic positions and ranks). Depending on the outcomes of such variables, they might have been useful to nuance the effects and to better discriminate between possible alternative explanations. This limitation would not seem to invalidate our main conclusions, however, nor would any other design element, as argued in the method description.
Specifically, WoS is selective, as it applies stricter criteria than many other databases (Testa 2009), but we aver that this is appropriate for testing the present hypotheses. Also, six disciplines is a small number on which to test H3. Fortunately, the correlations between overall performance and the sex difference in performance were quite large, and therefore indicate a robust trend.
Females typically invest more time and effort in parenthood, which is therefore a viable factor in career success. However, the effects of this are presumably to a large extent filtered out of the present analyses, because the researchers in this study should nominally have reached the same level of performance when they passed the criteria for becoming docent or professor. Considering only this and the higher male performance, the most obvious explanation must be bias against males in promotion to these ranks. There are several other possibilities, in particular when involving the dynamics of having children and when one has them. For example, if females on average have a few more years of parental leave than males, they may be older when they reach the same rank. It is unclear what consequences this would have. Figure 4 shows no sex difference in the relation between publications and citations, however, suggesting that if such differences exist, their effects even out.
The main strengths of the present study are the analyses focused on the distribution of performance and inclusion of the entire population of researchers in each of these disciplines. In fact, 16.6% of the researchers had no publications in the Web of Science, and would therefore otherwise have been overlooked. The attrition would furthermore have been much larger than this, had we searched for publications based on discipline, topic, or some other index of research area. This is because we would not have identified all the relevant publications, and would therefore also have failed to identify many researchers that actually belong to the defined population. Likewise, researchers would also have been incorrectly identified as belonging to the selected disciplines, for example because they do some work outside or on the outskirts of their discipline, or collaborate across disciplines. Obviously, this would have entailed fragmented disciplinary samples, not to mention national and institutional fragmentation. Hence, it is crucial for the present research questions to identify and analyse the appropriate target population or, if it is large enough, randomized samples of it. Another strength is the sampling of disciplines with relatively large proportions of female researchers, as motivated above.
Finally, we want to emphasize that group level data cannot and should not in any way be applied to the individual level. Indeed, both males and females are found among the top 10, 5, and 2.5% performing researchers. Another important point is that our population of active researchers is very exclusive, and in no way representative of the general population, for example. The selection mechanisms for becoming a researcher are not well understood, and may also be different for males and females. It would therefore be incorrect to generalize the sex difference in this study group to males and females in general or to other possible researcher populations formed under different conditions. In fact, the performance differences even within ranks that are themselves contingent upon performance, such as docent and professor, strongly indicate a bias against males (Bol, de Vaan, and van de Rijt 2022; Carlsson et al. 2020; Lutter, Habicht, and Schröder 2021; Madison 2019; Madison and Fahlman 2020; Schröder, Lutter, and Habicht 2021; Williams and Ceci 2015). This would in itself lead to lower performance for females, and it is an open question whether the gap would close or even reverse in the absence of such bias.

Figure 1. Publications as a function of sex and discipline, in order of magnitude, across researchers and academic rank, plotted as geometric means and 0.95 confidence intervals.

Figure 2. Citations as a function of sex and discipline, in order of magnitude, across researchers and academic rank, plotted as geometric means and 0.95 confidence intervals.

Figure 3. Sex differences in each of the four performance metrics (points connected with lines) and the overall number of publications (bars) as a function of the proportion of females for each discipline along the abscissa.

Figure 4. Citations per publication as a function of productivity, in terms of decile bins of number of publications, with low productivity to the left and high to the right. N is the number of researchers in each bin, and the tenth decile is further divided into four 2.5 percentile bins with 24 researchers in each.

Figure 6. Scatterplot of publications for each of the 50% most productive researchers, as a function of sex and decile, with an added random scatter (−0.5 to 0.5 rectangular distribution) along the abscissa to increase visibility of individual data points. The lines are distance weighted least squares fits to the data for each sex.

Table 1. Descriptive summary statistics across all participants (N = 988) for publications, citations, and citations per publication, with and without log transformation.

Table 2. Raw and transformed means and sex difference indices for publications, citations, and citations per publication, as a function of discipline across participants. Measures are geometric means and their proportional sex difference (%) and log10-transformed means and effect size (d). Values in bold are statistically significant at p < .05 or lower according to one-tailed t-tests, and corresponding p-values are shown in the p column. Transformed data statistics are based on individually transformed raw data. The proportional sex difference is the (male-female)/female ratio of transformed data. Percentiles correspond to the effect size. a These means are non-representative because of the large skew, but included for transparency.

Table 3. Results of linear regressions with publications and citations per publication as dependent variables.

Table 4. Geometric means, sex differences in percent, and variance ratio as a function of academic rank, for publications and citations.