The mirage of scientific productivity and how women are left behind: the Colombian case

ABSTRACT Equity, diversity and inclusion (EDI) in the workforce are paramount for the betterment of the scientific endeavor. Colombia is a country with great scientific potential, but also multiple long-lasting socioeconomic difficulties. Here, we provide a quantitative analysis of the temporal trajectories of gender parity in scientific publishing in Colombia. Data were disaggregated by education level, researcher’s rank and research area, in order to elucidate differential patterns of scientific publishing. We controlled for gender-based differences in the number of researchers by quantifying per capita scientific productivity. Our results show widespread gender disparity in scientific publishing that persists across time. Gender-based differences in per capita scientific publishing indicate that gender disparity persists even after controlling for differences in the number of researchers. Temporal trajectories revealed a decrease in women publishing in the medical sciences and a widening of the per capita publishing gender gap. Women senior researchers and women researchers with doctoral degrees had the lowest publishing participation within their group, suggesting that access to postgraduate education and entry into the workforce do not, in themselves, prevent women from being underrepresented. We highlight the need to understand the problem of underrepresentation in science and possible ways to address it beyond increasing the number of women researchers.


Introduction
Diversity and inclusivity in STEMM (science, technology, engineering, mathematics and medicine) have become highly debated topics in the scientific community, identifying pervasive and systemic inequalities that have limited the participation of different communities globally (Potvin et al. 2018;Stirling 2007). Gender parity has become a foundational objective to promote the democratization of the scientific endeavor (Ceci and Williams 2011;Ceci, Williams, and Barnett 2009;Harding and Societies 1986;Pell 1996;Rossi 1965), opening a novel line of research that has gained increasing momentum in recent decades (Clark Blickenstaff 2005;Jones et al. 2014;Ley and Hamilton 2008;Meyer, Cimpian, and Leslie 2015;Moss-Racusin et al. 2012;Ovseiko et al. 2016;Smeding 2012;Weeden, Thébaud, and Gelbgiser 2017). From access to education and training (Christie et al. 2017), job security (e.g. promotion and access to funding) (Clark Blickenstaff 2005;Moss-Racusin et al. 2012;Pell 1996), to participation in the generation of new knowledge (e.g. scientific publishing) (Araújo et al. 2017;Holman and Morandin 2019;Holman, Stuart-Fox, and Hauser 2018;Huang et al. 2020;Jones et al. 2014), women have been historically marginalized and underrepresented in science (Ceci and Williams 2011;Ceci, Williams, and Barnett 2009;Partnership 2015;Valentova et al. 2017). Despite progress in recent decades in access to education that has seen an increased participation of women in STEMM (Franco-Orozco and Franco-Orozco 2018; UIS 2020), it is estimated that it will still take decades or even centuries to completely eliminate gender inequality (Holman, Stuart-Fox, and Hauser 2018;López-Aguirre 2019).
After women's access to education, which has shown significant improvements over the last 60 years (Franco-Orozco and Franco-Orozco 2018; O'Dea et al. 2018), the retention of women in research positions has gained importance as a major challenge (Clark Blickenstaff 2005; Pell 1996), perhaps best exemplified by the publishing gender gap (Holman, Stuart-Fox, and Hauser 2018; Huang et al. 2020). Scientific productivity is used as a proxy to assess the prowess of a researcher, and is largely measured in terms of the quantity and quality of published scientific papers (Astegiano, Sebastián-González, and Castanho 2019; Prpić 2002; van Arensbergen, van der Weijden, and van den Besselaar 2012). The h-index and Impact Factor (IF) are possibly the two most widely used metrics to estimate and compare the impact and productivity of a researcher, and are used by institutions and governments to determine hiring, promotion and the allocation of funds to researchers. These and other scientometric indexes have come under heavy scrutiny, which has revealed biases and limitations that have reinforced systemic inequalities, prolonging the marginalization of minorities in STEMM (Bradshaw et al. 2021). The spread of quantitative metrics of scientific productivity has inadvertently helped establish practices that have proven harmful to the scientific community. The so-called "publish-or-perish" phenomenon in academia has resulted in increasingly deleterious work cultures for early career researchers and underrepresented groups, leading to widespread mental health issues and increased dropout rates. This "tyranny of metrics" also nurtures fraudulent research practices such as disproportionate self-citation and plagiarism by authors, or coercive citation by publishers. Genderized dynamics are also evident, as men tend to represent the majority of cases of fraud and scientific misconduct in academia (Fang, Bennett, and Casadevall 2013).
Nevertheless, traditional measures of scientific productivity are still widely used and are a common hurdle that researchers from underrepresented groups must face.
One of the major limitations to effectively integrate underrepresented groups into the scientific community is the lack of knowledge on the systematicity and idiosyncrasies of the problem across research areas, cultures and nations. Large-scale studies on gender inequalities in scientific productivity globally have provided insightful evidence on historical trends and differences across research fields (Holman, Stuart-Fox, and Hauser 2018;Huang et al. 2020). Despite these global perspectives providing an important benchmark to inform debates, they can oversimplify local, cultural or guild-specific issues. As a result, localized research efforts to identify challenges that women face in specific settings are equally needed to inform policy and lawmakers at a national and institutional scale (López-Aguirre 2019; Valentova et al. 2017). Recent efforts to apply quantitative methods to study the interplay between society and academia (i.e. scientometrics) contrast with a robust tradition of science and technology studies (STS) centered in qualitative and descriptive approaches, creating a gap between both (Wyatt et al. 2017). STS studies have provided a crucial body of knowledge that critically discusses how societal issues shape the generation and sharing of new knowledge (see Harding, Pérez-Bustos, and Fernández-Pinto 2019 for an overview), while scientometric studies have described unidentified patterns in academia shaped by cultural and socioeconomic drivers (Huang et al. 2020). For a more overarching discussion of issues affecting academia and in order to propose comprehensive solutions, multidisciplinary studies bridging STS and scientometrics studies are needed (Wyatt et al. 2017). For example, a recent study on the statistics of funding-allocation for research in Mexico found no significant effect of gender, but rather that the underrepresentation of women at the highest levels was behind the unequal distribution of funds (Fabila-Castillo 2019). 
Furthermore, the ongoing global pandemic has had an important impact on the productivity of underrepresented groups in STEMM. Beyond the immediate effect of the pandemic in scientific productivity that has been reported, the long-term effect remains an ongoing issue that needs to be studied and addressed. In Brazil, the scientific productivity of black women and mothers was found to be the most impacted by the pandemic, while single men were the least affected (Staniscuaski et al. 2021).
Colombia has the fifth-highest scientific productivity among Latin American nations (after Brazil, Mexico, Argentina and Chile), accounting for 5.38% of all the scientific production in Latin America between 1996 and 2019 (Scimago 2020). Colombia's spending on research and development (R&D) has remained below 0.50% of GDP in recent decades, partly due to the half-century of internal armed conflict that served as a justification to prioritize defence and military spending across multiple administrations (MINCIENCIAS 2020). The 2016 peace deal between the Colombian government and the Revolutionary Armed Forces of Colombia (FARC) represented a new opportunity for Colombian researchers to demand increased funding, and for R&D to be highlighted as a national priority (Ocampo-Peñuela and Winton 2017; Salazar et al. 2017, 2018). Significant research efforts have focused on exploring the economic, societal and cultural drivers of inequality in Colombian science (Castelao-Huerta 2020; Daza and Bustos 2008; Daza-Caicedo, Farías, and Ariza 2016; Franco-Orozco and Franco-Orozco 2018; López-Aguirre 2019; Ramírez-Castañeda 2020). Progress in increasing gender parity in education has seen women reaching parity in access to undergraduate studies, reducing the underrepresentation in access to masters and doctoral studies to 47.7% and 38.3%, respectively (Daza-Caicedo, Farías, and Ariza 2016; Franco-Orozco and Franco-Orozco 2018; López-Aguirre 2019). Nevertheless, a widespread gender salary gap among postgraduate degree holders remains hard to address (up to 30% in medical science and engineering) (Franco-Orozco and Franco-Orozco 2018). Women are consistently underrepresented in the Colombian scientific community, making up 38% of the scientific workforce and 14% of the Colombian Academy of Exact, Physical and Natural Sciences membership (López-Aguirre 2019).
Estimates indicate that it could take centuries to reach gender parity in the workforce of some research areas (López-Aguirre 2019). English proficiency has also been found to be a significant disadvantage for Colombian researchers for their scientific productivity, tightly correlated with socioeconomic status (Ramírez-Castañeda 2020). Gender stereotypes in the workplace have been reported to disproportionally impact women researchers in Colombia, hindering their scientific productivity and job security (Castelao-Huerta 2020). Beyond an increase in the number of women in science and their scientific productivity, a complete reappraisal of how science is done and valued in Colombia is imperative to ensure science is produced in an inclusive and equitable ecosystem that addresses the specific needs of the Colombian society (Daza-Caicedo, Farías, and Ariza 2016).
Quantitative studies looking at the socioeconomic drivers of science are rare in Latin America, making Colombia an important case study to advance this area in the region. A recent global study of temporal trends of gender parity in scientific publishing found a decreasing participation of women in Colombia, projecting that gender parity will never be reached in the country if this trend remains unaddressed (Holman, Stuart-Fox, and Hauser 2018). However, this study focused on databases that are heavily biased towards publications in English, providing only a partial perspective of a complex reality. In this study, we provide a quantitative analysis of the temporal trajectories of gender parity in scientific publishing in Colombia using official governmental data. We decomposed our data based on education level, researcher's rank, and research area, in order to elucidate differential patterns of scientific publishing across time. To assess gender disparity beyond differences in the number of men and women researchers, we used data from the governmental biennial censuses of the scientific workforce to quantify per capita scientific productivity and how it varies across socioeconomic variables.

Data gathering
Data on scientific publishing by Colombian researchers are collected by the Colombian Ministry of Science (Minciencias) and made publicly available through the Science in Numbers (SN) online repository. Minciencias performs annual (biennial since 2016) censuses, prompting researchers to register on the SCIENTI platform to validate their profile and career development. Each census gathers data on the scientific community, including teaching, supervising and productivity data. Censuses also gather demographic data recording gender, education level, researcher's ranking, geographic origin and age, allowing the exploration of socioeconomic patterns. Productivity data are decomposed into different categories, depending on the nature of the output: generation of new knowledge (54% of total productivity), development of human resources (20%), social appropriation of knowledge (23%) and technological development and innovation (4%). For this study, we focused on the publication of scientific papers under the "generation of new knowledge" category, as it represents the majority of productivity items reported by researchers. We gathered data from the censuses of 2013, 2014, 2015, 2017 (covering 2016-2017) and 2019 (covering 2017-2019). By 2019 (latest census), 161,204 research papers from 16,796 researchers registered in SCIENTI had been reported. Each research paper is classified as either woman-authored or man-authored, based on the first author, which avoids double counting papers co-authored by men and women in the quantification of total scientific production. We collected demographic data on gender, research area, academic level and researcher rank level. Research area was coded following the Organisation for Economic Co-operation and Development (OECD) classification: agricultural sciences, engineering, humanities, medical and health sciences, natural sciences and social sciences.
Academic level was classified into four groups: undergraduate, diploma, masters and doctoral; whereas researcher rank level is classified into four groups: junior, associate, senior and emeritus. Data on emeritus researchers were not available for every time period, so no temporal patterns were analysed for this group. Additionally, we gathered data on the number of women and men researchers across temporal samples (censuses) and demographic categories also available in SN.

Statistical analysis
The gender gap in total productivity was estimated based on the accumulated paper production of Colombian researchers for each census. Temporal patterns of scientific publishing by Colombian researchers were reconstructed by sequentially estimating the difference in papers recorded between a given census and the immediately previous one, estimating scientific production per time period. This was estimated for the 2014, 2015, 2016-2017 and 2017-2019 time periods. Analyses of gender parity in scientific production were decomposed based on research area, education level and researcher ranking. Research area allowed us to elucidate patterns in gender-based differences in scientific productivity that reflect the publishing dynamics within each area. Education level enabled an assessment of how access to education correlates with scientific productivity, whereas researcher ranking can be indicative of how career trajectories differ based on gender. Gender-based differences were reported as the total number of scientific papers published per gender and the percentage of the total scientific production produced by each gender. Finally, we generated and analysed an additional dataset of per capita scientific production, standardizing our raw data based on the number of men and women researchers in each temporal sample and demographic category. Per capita scientific production allowed us to recover gender-based differences irrespective of disparities in the number of men and women researchers. Stacked barplots were used to visualize the proportion of scientific papers published per gender, whereas line plots were used to reflect trajectories in the total number of papers published per gender.
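The two derived quantities described above, production per time period and per capita production, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' actual pipeline: the `CensusCount` record, the function name `per_period_summary` and the numbers in the usage example are hypothetical; per capita values are assumed to divide each period's papers by the researcher counts of the closing census; and the gap is assumed to be the relative shortfall of women's per capita output with respect to men's, a definition the text does not spell out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CensusCount:
    """Cumulative counts reported in one census (hypothetical record layout)."""
    year: int
    papers_women: int        # cumulative papers with a woman as first author
    papers_men: int          # cumulative papers with a man as first author
    researchers_women: int   # women researchers registered in this census
    researchers_men: int     # men researchers registered in this census


def per_period_summary(censuses):
    """Difference consecutive cumulative counts and derive per capita values."""
    ordered = sorted(censuses, key=lambda c: c.year)
    summary = []
    for previous, current in zip(ordered, ordered[1:]):
        # Papers produced in the period = cumulative now minus cumulative before.
        women = current.papers_women - previous.papers_women
        men = current.papers_men - previous.papers_men
        # Assumption: standardize by the researcher counts of the closing census.
        pc_women = women / current.researchers_women
        pc_men = men / current.researchers_men
        summary.append({
            "period_end": current.year,
            "papers_women": women,
            "papers_men": men,
            "share_women_pct": 100.0 * women / (women + men),
            "per_capita_women": pc_women,
            "per_capita_men": pc_men,
            # Assumption: gap expressed relative to men's per capita output.
            "per_capita_gap_pct": 100.0 * (pc_men - pc_women) / pc_men,
        })
    return summary


# Invented numbers for illustration only; these are NOT the census data.
example = [
    CensusCount(2013, 100, 300, 50, 100),
    CensusCount(2014, 160, 420, 60, 100),
]
for row in per_period_summary(example):
    print(row)
```

Working from cumulative counts keeps the sketch close to how the censuses are described, and the same function could be applied separately to each research area, education level or ranking level.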

Results
Of the accumulated 161,204 scientific papers registered by Colombian researchers in the SCIENTI platform by 2019, 30.62% (49,359) were authored by women researchers and 69.38% (111,845) by men researchers (Figure 1). Between 2013 and 2019, the input of women researchers on the accumulated production of scientific papers in Colombia increased by 3.17% (Figure 1(A)). Scientific production per time period revealed that women researchers authored, on average, 30.93% of the papers published in Colombia between 2013 and 2019 (Figure 1(B)), an increase of 0.76 percentage points between 2014 (8,565 of 27,388 papers) and 2019 (14,499 of 45,263 papers). Per capita scientific production was, on average, lower for women (2.12 papers) than for men (2.67 papers), indicating a per capita productivity gender gap of 22.15% (Figure 1(C)). Temporal patterns of per capita productivity revealed an increase of 6.57 percentage points in the gender gap between 2014 (17.09%) and 2019 (23.66%). Scientific production increased with education level, with the majority of papers authored by researchers with a doctoral (109,162, 67.71%) and masters (39,351, 24.41%) education (Figure 2). Across education levels, women researchers authored an average of 32.28% of scientific papers, with the lowest participation amongst researchers with doctoral education (28.43%) followed by researchers with undergraduate (30.77%), masters (34.44%) and diploma (35.47%) education.
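As a quick arithmetic check, the shares above can be recomputed from the paper counts reported in this paragraph. The short Python sketch below uses only those reported counts; the helper name `share_pct` is hypothetical, and the changes are read as differences between two percentages, that is, percentage points.

```python
def share_pct(part, total):
    """Return part as a percentage of total."""
    return 100.0 * part / total


# Cumulative production by 2019, using the counts reported in the text.
women_cumulative = share_pct(49_359, 161_204)
men_cumulative = share_pct(111_845, 161_204)

# Women's share of the papers recorded in the 2014 and 2019 periods.
women_2014 = share_pct(8_565, 27_388)
women_2019 = share_pct(14_499, 45_263)
share_change_points = women_2019 - women_2014

# Change in the reported per capita gender gap (17.09% to 23.66%).
gap_change_points = 23.66 - 17.09

print(round(women_cumulative, 2), round(men_cumulative, 2))
print(round(women_2014, 2), round(women_2019, 2), round(share_change_points, 2))
print(round(gap_change_points, 2))
```

Rounded to two decimals, these values match the 30.62%, 69.38%, 0.76 and 6.57 figures given in the text.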
A gender productivity gap was found at all researcher ranking levels, being smallest among junior researchers, where 35.96% (22,822) of papers were authored by women (Figure 3). Women emeritus and associate researchers authored 34.53% and 33.48% of papers published within each ranking, whereas women senior researchers only authored 23.79% of total papers published by senior researchers. Temporal trajectories show diverging patterns of women's per capita productivity between ranking levels (Figure 4). Differences in gender parity were found in the cumulative scientific production across research areas, with engineering having the lowest women's publishing productivity (21.77%), followed by humanities (28.58%) and agricultural sciences (28.87%), whereas natural (29.09%), social (35.51%) and medical (37.76%) sciences had the highest women's publishing productivity (Figure 5). Research area-specific patterns revealed diverging temporal trajectories of gender-based productivity (Figure 6). Between 2014 and 2019, women researchers in agricultural (6.13%; Figure 6(A)), engineering (5.05%; Figure 6(B)), natural (3.64%; Figure 6(E)) and social (0.18%; Figure 6(F)) sciences increased their participation in scientific publishing. In contrast, humanities and medical sciences showed a decrease in gender parity of 3.93% and 0.65%, respectively (Figure 6(C,D)).

Discussion
Our results show widespread gender disparity in scientific publishing in Colombia persistent across time. Gender-based differences in per capita scientific publishing found in this study indicate that gender disparity persists even after controlling for differences in the number of men and women researchers. Multifaceted trajectories were found across research areas, researcher's ranking and education level, showing the presence and magnitude of gender disparity follow context-specific patterns that need to be clearly elucidated. Temporal trajectories of scientific publishing show the participation of women researchers is continuously increasing and that the per capita gender disparity gap is slowly widening. Decoupled research area-specific temporal patterns were found, indicating women participation is improving in some areas and worsening in others. Notably, scientific productivity for women seems to decrease as they climb up the ranking scale. Across education levels, gender disparity was higher in the lowest and highest levels (undergraduate and doctoral). Our findings indicate that gender disparity in scientific publishing in Colombia is not homogenous and needs to be studied and addressed in a context-specific setting. Moreover, we posit the need to reconsider how science is promoted and valued, based on quantitative measures of productivity and impact.
The results presented here build on an increasingly robust body of research analysing the socioeconomic drivers of gender disparity in Colombian science (Daza and Bustos 2008; Daza-Caicedo, Farías, and Ariza 2016; Franco-Orozco and Franco-Orozco 2018; López-Aguirre 2019). Studies documenting historical trajectories (López-Aguirre 2019), improvements in equal access to education (Daza-Caicedo, Farías, and Ariza 2016; Franco-Orozco and Franco-Orozco 2018), the persistent gender salary gap (Franco-Orozco and Franco-Orozco 2018) and the impact of language barriers (Ramírez-Castañeda 2020) position Colombia as a regional leader in the area and an informative case study. Consistent with recent findings of increased gender parity in access to education and research participation in STEMM (Daza-Caicedo, Farías, and Ariza 2016; López-Aguirre 2019), our findings show that women author an increasing proportion of the total scientific production in Colombia. Scientific productivity per time period has also seen an increase in gender parity, albeit with an irregular trajectory over the years (i.e. parity was higher in 2014 and 2016-2017 than in 2015 and 2018-2019). Global analyses of scientific publishing have provided contrasting evidence on the status of gender parity in Colombia. Holman, Stuart-Fox, and Hauser (2018) indicated that between 2002 and 2016, the percentage of papers authored by women researchers in Colombia decreased by 11%, showing a widening gender gap. In contrast, Huang et al. (2020) reported a gender gap in career length and total productivity favoring Colombian women researchers. Both studies are based on mining global data from international databases that are biased towards English-language and indexed publications (Holman, Stuart-Fox, and Hauser 2018; Huang et al. 2020).
Our results do not support either of these studies, as we found a general increase in women's participation across time, as well as widespread gender disparity in publishing. However, the increase of <1% in women's participation in scientific publishing contrasts with the 4.29% increase in the number of women researchers in the same time period (López-Aguirre 2019). The mismatch between the increase in the number of women researchers and their cumulative scientific production indicates that gender parity in the number of researchers does not translate into parity in scientific production. This is supported by our results of gender disparity in per capita productivity. Our study is based on official governmental data that capture all the scientific production in Colombia, including publications in languages other than English, as well as local and regional journals that may not be indexed in databases like Web of Science. For example, Web of Science reports ∼139,000 papers published by researchers with a Colombian address, compared to the official SN database that reports ∼161,000 papers. This highlights the importance of complementing global studies that provide general tendencies with localized studies that reveal context-specific patterns. The genderized impact of relying almost entirely on quantitative metrics of journal publications to evaluate productivity and proficiency needs to be considered and critically addressed, as it does not account for other forms of scholarly production that may better reflect cultural or guild-specific idiosyncrasies that underrepresented groups experience (e.g. oral tradition of indigenous knowledge and feminist political activism in Latin America; Harding, Pérez-Bustos, and Fernández-Pinto 2019).
Gender gap in per capita productivity in our results suggests that disparity is prevalent and has been increasing across time, even after controlling for differences in the number of women and men researchers. We highlight the need to understand the problem of underrepresentation in science and possible ways to address it beyond increasing the number of women researchers. Homophily (i.e. authors publishing more often with colleagues of the same gender) has been shown to be common practice in scientific publishing across many fields and regions (Araújo et al. 2017;Bravo-Hermsdorff et al. 2019;Díaz-Faes et al. 2020;Holman and Morandin 2019;Salerno et al. 2019). Additionally, comparisons of research networks between men and women have shown that women have tended to build more diverse research networks that end up promoting diversity in science and multidisciplinary studies (Araújo et al. 2017;Díaz-Faes et al. 2020;Salerno et al. 2019). We hypothesize that our results could be the result of the combined effect of a reduced number of women researchers and a reduced participation of women in men-led studies. To further test this hypothesis, future studies could examine gender differences in collaboration networks in Colombian science.
Our results also indicate that initiatives led by the Ministry of Science to increase the number of women in STEMM may prove insufficient to reduce the gender gap, and that it is not only a matter of having more women in the scientific workforce, but also of removing the barriers that women and other underrepresented groups face. The surprising increase in gender-based publishing disparity among senior researchers and researchers with doctoral degrees suggests that, by themselves, accessing postgraduate education and entering the workforce do not prevent women from being underrepresented. We highlight the need to also address gender disparity at senior levels of education and research, instead of only focusing on promoting STEMM education as a cure-all solution for gender inequality. Researchers, institutions and governments need to work towards building a nurturing, collaborative work culture in science that feeds on the plurality of its society. Moreover, we argue that a critical evaluation of the use of quantitative metrics of productivity and impact is long overdue in the face of the socioeconomic reality of the country and the region. The pandemic also presents a new challenge that exacerbates known systemic issues, so future studies should aim to study the uneven impact of the pandemic on underrepresented groups to comprehensively address the roots of inequality and discrimination in science (Staniscuaski et al. 2021). Our approach of focusing on first authorship has the possible caveat of underestimating women's productivity due to multi-authored papers with both men and women authors, highlighting the need to study gender-based collaboration dynamics in future studies.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work was supported by the University of Toronto Scarborough Office of the Vice Principal Academic and Dean.

Supplemental material
Supplemental material for this article can be accessed here: http://doi.org/10.1080/25729861.2022.2037819.

Notes on contributors
Camilo López-Aguirre has a PhD in biological sciences and is a postdoctoral fellow at the University of Toronto Scarborough, Canada. He has multidisciplinary research interests ranging from the study of past and modern biodiversity, to the study of issues affecting modern science such as public access and gender parity, especially in Latin America.
Diana Farías has a PhD in education and is an associate professor at the Universidad Nacional de Colombia. She studied Chemistry at the largest public university in Colombia, where she has worked in the Chemistry Department for the last 24 years. She pursued three master's degrees, in environmental education, pesticide science, and science education. Her work focuses on the intersection between scientific education and social studies of science, with an interest in gender and otherness.