Revealing migration schedule and potential breeding grounds of Lined Seedeaters using citizen science data

ABSTRACT The Lined Seedeater (Sporophila lineola) is a small intra-tropical migrant songbird. However, little is known about its breeding and wintering grounds, or its migratory habits. To investigate potentially distinct breeding populations and the migratory schedule of Lined Seedeaters, we analysed the species' spatial and temporal distribution using published breeding records, museum vouchers, and data from citizen science projects (eBird and WikiAves). Our findings suggest that there are three main breeding areas (northern Argentina and north-western Paraguay, south-eastern Brazil, and north-eastern Brazil) and that the breeding season seems to be restricted to November through May, with slight differences in timing among these three areas. The species winters in the northern part of South America (mostly in grassland areas) and possibly also in Amazonia. Moreover, rainfall predicts the latitudinal and longitudinal movements of Lined Seedeaters, with the migratory movements associated with an increase in rainfall. Taken together, these results provide a first comprehensive overview of the migration of the Lined Seedeater and call for further empirical field studies. Understanding intra-tropical migratory patterns is paramount to comprehending the potential impacts of environmental change on tropical ecosystems.


Introduction
Studying bird migration over large distances remains a great challenge, despite the prolific development of new technologies such as smaller and lighter geolocators (Robinson et al. 2010; Bridge et al. 2013), and the availability of an unprecedented volume of data through citizen science projects (Newson et al. 2016; Turnhout et al. 2016). Geolocators are now a well-established and widely used tool for the study of bird migration (Lisovski et al. 2020), but costs are still prohibitive for often-underfunded tropical ornithologists. Moreover, mass limitations, mainly due to battery weight, make the use of such technology a great challenge to study the migratory habits of small species. On the other hand, the increased popularity of birdwatching and citizen science projects in tropical countries provides a novel and inexpensive tool to study birds' migratory habits (Lees et al. 2020). In this study, we used citizen science data to provide a comprehensive overview of the migratory habits of Lined Seedeaters (Sporophila lineola).
The Lined Seedeater is a small (~10 g) granivorous songbird that inhabits a variety of open habitats, including disturbed areas, throughout much of South America (Sick 1997; Ridgely and Tudor 2009). Lined Seedeaters are sexually dimorphic in colour (adult males are black-and-white, females are dull) and socially monogamous (Ridgely and Tudor 2009; Ferreira and Lopes 2017). Although the species is common and widespread over the continent, the migratory habits of Lined Seedeaters, as of many intra-tropical migrants, remain poorly understood (Jahn et al. 2017; Somenzari et al. 2018).
Inferences about the migration of Lined Seedeaters have been based on anecdotal field observations, personal perceptions, and museum specimens (Schwartz 1975; Vielliard 1987; Silva 1995; D'Angelo Neto and Vasconcelos 2007). Those reports, summarised by Jaramillo and Kirwan (2020), have suggested that there are two breeding populations of the species, which are purportedly diagnosable by voice: one in north-eastern Brazil and another in the south-central portion of South America, ranging from south-eastern Bolivia, Paraguay and northern Argentina to south-eastern Brazil. The population in north-eastern Brazil breeds from January to June (Sales 1989; Sales and Major 2001), and Silva (1995) suggested that it migrates through the eastern portion of the Brazilian states of Pará and Amapá to its wintering grounds in the savannahs of the Guianas and the llanos of Venezuela and Colombia. In contrast, the population that breeds in south-central South America remains on its breeding grounds from November to April (Marcondes-Machado 1997; D'Angelo Neto and Vasconcelos 2007; Oliveira et al. 2010; Ferreira and Lopes 2017), and Silva (1995) suggested that it migrates through central South America to central and western Amazonia, where it winters.
However, the migratory pattern summarised above is overly simplistic. Lined Seedeaters have been observed year-round in central-western Amazonia, though it is not clear if these records refer to an independent resident population or to individuals from different populations that visit the region at different times of the year (Somenzari et al. 2018; Jaramillo and Kirwan 2020). Areta and Almirón (2009) also reported, based on their personal perceptions, that vocalisations of the purported south-central South America population are not homogeneous and that the idea of two vocally well-marked populations of the Lined Seedeater might be incomplete. Moreover, the migratory status of the birds found in the Chaco of northern Argentina and Paraguay remains unclear, with some authors considering them as resident (Short 1975; Capurro and Bucher 1988). Finally, the breeding and migratory statuses of the birds found in the Yungas of southern Bolivia and northern Argentina, as well as in the Argentine Monte and Paraná Flooded Savanna (Barnett and Pearman 2001; Areta and Almirón 2009), are not known. Thus, basic questions about the migratory habits of Lined Seedeaters remain to be answered.
It has been suggested that the migratory movements of the Lined Seedeater and several other species of Sporophila are tuned to the effects of wet and dry seasons on grass seed production. This is because members of Sporophila are specialised stem-gleaners, i.e. they eat seeds on the stalks, only occasionally foraging in the reservoir of seeds on the ground (Remsen and Hunn 1979; Silva 1995). Seed production in grasslands is regulated by rainfall regimes (Klink 1996; Baskin and Baskin 2014), and rainfall is therefore expected to predict the seasonal movements of Lined Seedeater populations.
In this study, we used available citizen science data, and data from published literature to: (i) identify potential breeding grounds of the Lined Seedeater; (ii) investigate the likely migratory schedule of each population; (iii) identify potential wintering grounds; and (iv) test if the likely migratory movements of the species can be predicted by rainfall patterns, assuming that rainfall is a predictor of grass seed availability (Crowley and Garnet 1999).
To identify the potential breeding grounds of the Lined Seedeater, we compiled breeding data (e.g. museum specimens with enlarged gonads, nests, eggs, fledglings, parental care behaviour) from the published literature, citizen science projects, and egg collections. We conducted a literature review in Google Scholar (https://scholar.google.com) and in the Biodiversity Heritage Library (https://www.biodiversitylibrary.org) using 'Sporophila lineola' as keyword, going through all references that we could access (~1000 literature sources). We also searched for breeding records in WikiAves (http://www.wikiaves.com) and eBird (https://ebird.org), as well as in the online database of the Western Foundation of Vertebrate Zoology, Camarillo, USA, which houses the largest collection of bird eggs and nests in the world (https://collections.wfvz.org). All searches for breeding records were performed in February 2020. To identify the potential geographical breeding grounds, we used a K-means clustering method on the geographical coordinates of all breeding records that we could find. To find the optimal number of clusters, we used the 'Elbow method' implemented with the function 'fviz_nbclust' from the R-package 'factoextra' (Kassambara and Mundt 2019). The clustering was performed using the core 'kmeans' R function. The clusters were evaluated using silhouette values calculated with the function 'silhouette' from the R-package 'cluster' (Maechler 2019). A silhouette value above zero signifies that a record is well classified; the closer that value is to one, the better the classification.
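The clustering workflow described above (K-means, an elbow check on the within-cluster sum of squares, and silhouette values) can be illustrated with a minimal standard-library Python sketch. It is not the R pipeline used in the study: the coordinates are hypothetical, distances are plain Euclidean distances in degrees, and the deterministic farthest-point seeding is an assumption made here for reproducibility (R's 'kmeans' uses random starts).

```python
import math

def farthest_point_init(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point farthest from its nearest already-chosen centre."""
    centres = [points[0]]
    while len(centres) < k:
        centres.append(max(points, key=lambda p: min(math.dist(p, c) for c in centres)))
    return centres

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm on (lon, lat) tuples; returns (labels, centres)."""
    centres = farthest_point_init(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centre.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centres[c])) for p in points]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centre moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centres

def within_ss(points, labels, centres):
    """Total within-cluster sum of squared distances (the elbow criterion)."""
    return sum(math.dist(p, centres[lab]) ** 2 for p, lab in zip(points, labels))

def silhouette_values(points, labels):
    """Per-point silhouette s = (b - a) / max(a, b); 0 for singleton clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    values = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1 or len(clusters) < 2:
            values.append(0.0)
            continue
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for other_lab, other in clusters.items() if other_lab != lab)
        denom = max(a, b)
        values.append((b - a) / denom if denom else 0.0)
    return values

# Hypothetical (lon, lat) coordinates for illustration only, not the study's records.
records = [(-38.0, -7.0), (-39.0, -8.0), (-37.5, -6.5),
           (-44.0, -20.0), (-45.0, -21.0), (-43.5, -19.5), (-46.0, -22.0),
           (-60.0, -24.0), (-61.0, -25.0), (-59.5, -23.5)]
wss_by_k = {k: within_ss(records, *kmeans(records, k)) for k in range(1, 6)}
labels, centres = kmeans(records, 3)
mean_silhouette = sum(silhouette_values(records, labels)) / len(records)
```

Plotting `wss_by_k` against k and looking for the point where the decrease flattens corresponds to the elbow heuristic; the per-point silhouette values play the role of the cluster evaluation step.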
To investigate the influence of rainfall on the migratory schedule of Lined Seedeaters, we used all records available for the species in two major citizen science projects: eBird and WikiAves. eBird has global coverage, while WikiAves covers only the Brazilian territory. Both platforms have a review system for all entries to reduce identification errors (see https://support.ebird.org/en/support/solutions/articles/48000795278-the-ebird-data-quality-and-review-process and https://www.wikiaves.com.br/wiki/wikiaves:regras, respectively). WikiAves data was obtained in February 2020 and eBird data in August 2020. eBird data was processed in RStudio using the R-package 'auk' (Strimas-Mackey et al. 2018). Only unique eBird observations of Lined Seedeaters were included, i.e. any pseudoreplicates caused by multiple observers sharing the same checklist were removed. An important caveat is that WikiAves lacks such a filtering system; hence, we filtered its data manually, removing duplicates based on identical date, location, sex and age (i.e. leaving no two records of Lined Seedeaters of the same sex and age at the same location on the same date), which reduced the original data by 10.7%.
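The duplicate-removal rule for WikiAves records (identical date, location, sex and age) can be sketched as follows. The authors describe filtering manually, so this is only an illustration of the rule; the field names and example records are hypothetical.

```python
def drop_duplicates(records):
    """Keep the first record for each (date, location, sex, age) combination.

    `records` is a list of dicts; the key names used here are assumptions,
    not the actual WikiAves export schema.
    """
    seen = set()
    unique = []
    for rec in records:
        key = (rec["date"], rec["location"], rec["sex"], rec["age"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

# Hypothetical example records.
raw = [
    {"date": "2019-12-01", "location": "Florestal/MG", "sex": "male", "age": "adult"},
    {"date": "2019-12-01", "location": "Florestal/MG", "sex": "male", "age": "adult"},
    {"date": "2019-12-01", "location": "Florestal/MG", "sex": "female", "age": "adult"},
    {"date": "2019-12-02", "location": "Florestal/MG", "sex": "male", "age": "adult"},
]
filtered = drop_duplicates(raw)
removed_fraction = 1 - len(filtered) / len(raw)
```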
For better visualisation of the seasonal migratory pattern of Lined Seedeaters, we mapped its occurrences throughout South America during each month. For that, we used a grid of square cells (1° Lat × 1° Long) to calculate, for each month of the year, the proportion of records of Lined Seedeaters available for that grid cell (i.e. number of records of Lined Seedeaters in a given grid cell/total number of records of Lined Seedeaters in that month).
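As an illustration of the gridding step, the sketch below assigns each record to a 1° × 1° cell and divides the cell count by the total number of records in that month. The record layout (month, longitude, latitude) and the example values are hypothetical, and cells are indexed here by the floor of their coordinates, which is an assumption about how the grid is anchored.

```python
import math
from collections import Counter

def monthly_cell_proportions(records, cell_size=1.0):
    """Return {(month, (cell_x, cell_y)): proportion of that month's records}.

    `records` is an iterable of (month, lon, lat) tuples.
    """
    cell_counts = Counter()
    month_totals = Counter()
    for month, lon, lat in records:
        cell = (math.floor(lon / cell_size), math.floor(lat / cell_size))
        cell_counts[(month, cell)] += 1
        month_totals[month] += 1
    return {key: n / month_totals[key[0]] for key, n in cell_counts.items()}

# Hypothetical records: three in January, one in July.
example = [(1, -44.2, -20.7), (1, -44.9, -20.1), (1, -38.5, -7.3), (7, -61.3, 5.2)]
proportions = monthly_cell_proportions(example)
```

By construction, the proportions for any given month sum to one, so the maps compare the relative concentration of records between cells rather than raw counts.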
A potential caveat of using citizen science records is the accuracy of species identification. To reduce the possibility of incorporating erroneous records in our analysis, LEL checked all ~8000 photographs deposited in WikiAves and Macaulay Library (which hosts vouchered records from eBird users), removing records of misidentified birds. Nevertheless, females and young males of most Sporophila have similar plumage and are often nondescript, making it difficult to distinguish them at the species level, in contrast to adult males, which are coloured and patterned (Ridgely and Tudor 2009). Our extensive field experience with the Lined Seedeater confirms that not every brown bird (especially young birds or those with heavily worn plumage) can be conclusively identified at the species level. Therefore, birdwatchers are unlikely to report this species based solely on a sighting of brownish seedeaters without typically coloured males present. This caveat is inherent to citizen science data. Moreover, given that eBird users rarely report the sex of the birds recorded, removing records without evidence of a male present excluded 95.9% of eBird records, and 99.7% of records from countries other than Brazil came from eBird. Thus, to be able to discuss potential movements to important breeding grounds outside Brazil, we present the analyses considering records of all individuals (including females and other brown birds) in the main text, and the same analyses restricted to records specified as males or as pairs/groups of males and females in the supplementary material (see Table S1).
The analyses including all records were qualitatively similar to the analyses using only records that included males. Including all records therefore did not change the interpretation, but added information on the occurrence of the species outside Brazilian territory (see Supplementary Material).

Statistical analysis
To analyse the influence of rainfall on the migratory dynamic of Lined Seedeaters, we used the citizen science data and average monthly precipitation data from WorldClim (https://www.worldclim.org) from 1960 to 2018. We downloaded the historical monthly precipitation data for each year between 1960 and 2018, and calculated the average monthly precipitation across all the years. Given that Lined Seedeaters are granivorous, we expect their migratory movements would follow seed production across ecosystems, which happens with a time lag after the onset of rains (Morellato et al. 2013). Therefore, to assess whether rainfall can predict the monthly distribution of the Lined Seedeater across South America, we calculated the proportion of precipitation occurring in the last quarter in each grid cell (2.5° × 2.5° resolution) in each month by summing the values (mean monthly precipitation) for the last quarter and dividing them by the total annual precipitation, and used the resulting value as the main fixed predictor (Crowley et al. 1999).
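The rainfall predictor can be sketched as below. The text does not define 'last quarter' precisely, so this sketch assumes it means the three-month window ending with the focal month (wrapping around the year); the function name and the example precipitation values are hypothetical.

```python
def last_quarter_proportion(monthly_precip, month, window=3):
    """Share of annual precipitation falling in the `window` months ending
    with `month` (1 = January ... 12 = December).

    `monthly_precip` holds 12 long-term mean monthly values for one grid cell.
    The window definition is an assumption made for this illustration.
    """
    if len(monthly_precip) != 12:
        raise ValueError("expected 12 monthly values")
    annual_total = sum(monthly_precip)
    if annual_total == 0:
        return 0.0
    # Walk backwards from the focal month, wrapping December -> January.
    recent = [monthly_precip[(month - 1 - k) % 12] for k in range(window)]
    return sum(recent) / annual_total

# Hypothetical cell with all its rain in January-March (values in mm).
wet_start = [100, 100, 100, 0, 0, 0, 0, 0, 0, 0, 0, 0]
march_share = last_quarter_proportion(wet_start, 3)
january_share = last_quarter_proportion(wet_start, 1)
july_share = last_quarter_proportion(wet_start, 7)
```

Expressing the predictor as a share of the annual total makes cells with very different absolute rainfall comparable, which appears to be the motivation for dividing by total annual precipitation.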
Although birdwatching is a relatively accessible activity, it is still biased by socioeconomic factors such as, but not limited to, education and income. Books, binoculars, scopes, and transportation are expensive commodities that only part of the South American population can afford as a form of leisure. Thus, not surprisingly, birdwatching tends to be concentrated in economic centres and in tourist destinations with good infrastructure for access, resulting in a few hotspots on the continent. To account for this potential bias in citizen science data, we used the number of observations of all species in each grid cell in each month as a fixed predictor, serving as a proxy for differences in sampling effort between cells (adapted from Fletcher et al. 2005). This was calculated by creating a raster grid (2.5° × 2.5° resolution) and summing the number of all records per month inside each grid cell, using the 'rasterize' function from the 'raster' R-package (Hijmans et al. 2015). As the data for each grid cell comprised one observation for each month, grid cell ID was considered a grouping variable and was included as a random effect to account for non-independence between the observations within a given cell. We constructed a negative binomial model with a 'log' link, implemented with the function 'glmer.nb' from the R-package 'lme4' (Bates et al. 2015), in which the number of observations of the Lined Seedeater per grid cell per month was the response variable and the proportion of precipitation occurring in the last quarter per grid cell was the main fixed effect.
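The structure of the model described above can be written out as follows. The notation is introduced here for illustration and is not taken from the original text: y_{im} is the number of Lined Seedeater records in grid cell i and month m, rain_{im} is the proportion of precipitation in the last quarter, effort_{im} is the number of records of all species, u_i is the random intercept for grid cell i, and θ is the negative binomial dispersion parameter. Whether the predictors were rescaled or transformed before fitting is not stated, so none is shown.

```latex
\begin{align}
y_{im} &\sim \mathrm{NegBin}(\mu_{im}, \theta), \\
\log \mu_{im} &= \beta_0 + \beta_1\,\mathrm{rain}_{im} + \beta_2\,\mathrm{effort}_{im} + u_i, \\
u_i &\sim \mathcal{N}(0, \sigma_u^2).
\end{align}
```

Under the log link, a positive β_1 means that the expected number of records in a cell increases multiplicatively with the share of rain in the last quarter, after adjusting for sampling effort.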
The dataset containing information for all the records of all the species (30,511,951 records) was created by combining the data for all species downloaded from eBird for each country where the Lined Seedeater occurs (27,453,420 records) and all the available records for all species from WikiAves (3,058,531 records). We considered only the grid cells where Lined Seedeaters occurred in at least one month and excluded all the grid cells in which they were not registered. We used 95% confidence intervals to determine the significance of a relationship between the predictors and the response; if the 95% confidence interval did not overlap zero, the relationship was accepted as significant. The 95% confidence intervals were calculated using the function 'confint.merMod' from the 'lme4' R-package (Bates et al. 2015) with the 'Wald' method. To evaluate the model, we calculated marginal and conditional pseudo-R² adapted for generalised mixed effect models (Nakagawa and Schielzeth 2013; Johnson 2014; Nakagawa et al. 2017), implemented with the function 'r.squaredGLMM' from the 'MuMIn' R-package (MuMIn 2020).
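The significance rule above rests on Wald confidence intervals, which take the form estimate ± z × standard error. A minimal Python sketch of that rule follows; the function names and the numeric inputs are hypothetical and are not values from the fitted model.

```python
from statistics import NormalDist

def wald_ci(estimate, std_error, level=0.95):
    """Two-sided Wald interval: estimate +/- z * SE, with z from the normal quantile."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * std_error, estimate + z * std_error

def excludes_zero(interval):
    """True when the interval lies entirely above or entirely below zero."""
    lower, upper = interval
    return lower > 0 or upper < 0

# Hypothetical coefficients on the link (log) scale.
clear_effect = wald_ci(1.0, 0.5)    # roughly (0.02, 1.98)
unclear_effect = wald_ci(0.5, 0.5)  # roughly (-0.48, 1.48)
```

Wald intervals are quick to compute but rely on a normal approximation for the estimator, which is why they are usually reported on the link scale rather than the response scale.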

Results
Overall, our search resulted in a total of 221 breeding records of Lined Seedeaters (see Supplementary Material), 39 of which were obtained from literature (24 published studies), 161 from WikiAves, 17 from eBird/Macaulay Library, and four from the collection of the Western Foundation of Vertebrate Zoology. The cluster analyses indicated that the optimum number of clusters based on the data provided was three, suggesting that Lined Seedeaters potentially have at least three main breeding grounds in South America (Figure 1), which could correspond to three distinct populations: (i) one in the Caatinga, a multifaceted biome of north-eastern Brazil primarily composed of thorn scrub forests (n = 7 records), (ii) one in the Cerrado (a large biome characterised as a tropical savannah that encompasses the central part of Brazil) and Atlantic Forest (a tropical rainforest that originally covered most of the Brazilian Atlantic coast) of south-eastern Brazil (n = 181), and (iii) one in the Chaco, a subtropical complex of forest and savannah of northern Argentina and north-western Paraguay (n = 22). The within-cluster sums of squares were 49.5, 1,803.9 and 269.9 for ranges (i), (ii) and (iii), respectively. The compactness of the clusters, calculated as the between-cluster sum of squares divided by the total sum of squares, was 76.9%, suggesting high similarity between the members of each cluster. The silhouette values per record ranged between 0.08 and 0.84, and the average silhouette values for ranges (i), (ii) and (iii) were 0.79, 0.71 and 0.74, respectively, suggesting adequate classification.
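The compactness measure relies on the identity total SS = within SS + between SS. The short sketch below applies it to the reported figures; the implied total and between-cluster sums of squares are back-of-envelope values derived here from the rounded 76.9%, not numbers reported in the study.

```python
# Reported within-cluster sums of squares for ranges (i), (ii) and (iii).
within_ss = [49.5, 1803.9, 269.9]
reported_compactness = 0.769  # between SS / total SS, as reported (rounded)

total_within = sum(within_ss)  # about 2123.3

# Since total = within + between, within / total = 1 - compactness.
implied_total_ss = total_within / (1 - reported_compactness)
implied_between_ss = implied_total_ss - total_within

def compactness(total_ss, within_values):
    """Between-cluster SS as a share of total SS."""
    return (total_ss - sum(within_values)) / total_ss
```

On these figures the within-cluster variation accounts for the remaining 23.1% of the total, most of it contributed by the large Cerrado/Atlantic Forest cluster.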
We found two isolated historical breeding records (Snethlage 1935) in the Amazon region of Northern Brazil, a complex biome that encompasses the Amazon river basin and includes dense tropical rainforest and other ecoregions, where the species is no longer known to breed (Figure 1). These records are based on two nests with eggs collected by Emilie Snethlage in February 1917 in Santa Isabel do Pará, state of Pará, Brazil (these nests are currently housed in the Museu Paraense Emílio Goeldi, Belém, Brazil; M.Â. Marini pers. comm.). The nearest modern breeding record of the Lined Seedeater we obtained is a single nest photographed in 2013 in Araguatins, state of Tocantins (https://www.wikiaves.com/1201575), 500 km south of the historical Pará records, in the transition area between the Amazon and the Cerrado. These three odd breeding records are almost 700 km from the Caatinga breeding grounds. Two other isolated records are available for the northern border of the Brazilian Pantanal wetlands. The Pantanal region, an extensive complex of wetlands, is a common destination for birdwatchers and a base for numerous research projects; having only two records thus suggests that the Pantanal is not a common breeding ground for the species. We excluded these isolated records in the Amazon and the Pantanal from the cluster analyses.
Overall, we gathered 14,871 observations of Lined Seedeaters from the two citizen science projects, 7,838 from eBird and 7,033 from WikiAves. After filtering the data to retain only those records that correspond to male Lined Seedeaters or to male-female pairs, the dataset contained 5,632 records, 247 from eBird and 5,385 from WikiAves (see Supplementary Material). These records indicate that the three populations associated with the suggested breeding grounds are migratory (Figure 2). The population that breeds in the Caatinga stays on its breeding ground mostly from February to May. The population that breeds in the Cerrado/Atlantic Forest remains on its breeding ground mostly from the end of November to April. The population found in the Chaco remains on its breeding ground mostly from November to February.
From June to October, most records available are limited to the northern part of South America, with even a few isolated records from Central America (Figures S1 and S2). Our results suggest that Lined Seedeaters, after breeding south of the Amazon, migrate to wintering grounds in the northern part of South America (Figures S1 and S2). However, the current data available do not allow us to suggest the potential boundaries of their wintering grounds, which seem to cover a wide geographical range, as well as a wide range of habitats, including xeric shrublands and dry forests (coastal Venezuela), grasslands and savannas (Llanos and Guianan Savannas) and even grassy habitats in the Amazon floodplain (Amazon 'várzeas').
Additionally, the results indicate that rainfall seems to predict the migratory latitudinal and longitudinal movements of the species (Table 1). Our findings show that the abundance of the Lined Seedeater (i.e. number of records of Lined Seedeaters per grid cell) had a positive relationship with rain (estimate 5.262, 95% CIs 4.271, 6.252; p-value <0.001), even when controlling for the sampling bias (i.e. total number of records of all species per grid cell).

Discussion
We used citizen science data to reveal the seasonal distribution of Lined Seedeaters across the Neotropics, corroborating the idea that citizen science projects are powerful and inexpensive tools to study ecological dynamics of bird species that are distributed over large areas (Lees 2016; Schubert et al. 2019; Jahn et al. 2020). Our findings suggest that the previous concept of two breeding populations of Lined Seedeaters (Vielliard 1987; Silva 1995) is incomplete, and that there are at least three main breeding grounds of Lined Seedeaters across South America. It is not clear if the Lined Seedeater breeds in the Yungas forests (a tropical to subtropical broadleaf forest situated between Peru and northern Argentina) of the foothills of the Andes (the longest continental mountain range in the world). The westernmost breeding records of the Chaco population are only a few kilometres from the Andean foothills, where many sightings of the species have been recorded. Therefore, though this is not confirmed, Lined Seedeaters are likely to also breed in the Southern Yungas, especially in its lower and drier parts, locally known as 'piedmont forests', but further studies are needed to investigate if the individuals in this range are part of the population that breeds in the Chaco.
It remains unclear how isolated the three potential breeding areas that we identified are from one another. For example, we located a few isolated breeding records in the intervening area between the Chaco and the Cerrado/Atlantic Forest breeding grounds, in the Brazilian states of Mato Grosso and Mato Grosso do Sul. The scarce records of the species for the state of Bahia (Northeast Brazil) (Figures S1 and S2), midway between the Cerrado/Atlantic Forest and the Caatinga breeding grounds, are highly seasonal, with a marked peak in abundance that coincides with the breeding season of Lined Seedeaters reported for Minas Gerais (Ferreira and Lopes 2017). Therefore, small and isolated populations probably do breed in suitable intervening areas between the three breeding grounds identified here. Population genetic studies are needed to confirm this idea.
Lined Seedeaters are intra-tropical migrants that execute regular and predictable long-distance movements. Populations from the three breeding areas identified here migrate to the northern portion of South America, where they winter, following Silva's (1995) prediction. These three populations, however, show some variation in their breeding and migratory schedules, which is supported by detailed reproductive studies conducted in the Caatinga (breeding from January to June, Sales 1989; Sales and Major 2001) and in the Cerrado/Atlantic Forest (breeding from November to April, Marcondes-Machado 1997; Oliveira et al. 2010; Ferreira and Lopes 2017). Although we found no strong evidence of any year-round resident population of Lined Seedeaters, the drivers of the apparent year-round presence of Lined Seedeaters in central and western Amazonia remain unclear. Longitudinal studies following the populations in these areas, preferably of marked individuals, are needed to evaluate their status. There are scarce records of Lined Seedeaters on their breeding grounds during the non-breeding period, as well as in northern South America during the breeding season. Those records could potentially be attributed to birds that failed to migrate (e.g. in poor body condition), vagrants, escapees from captivity, misdated records, or even identification mistakes, rather than year-round residents.
Table 1. Influence of rainfall on the migratory dynamic of Lined Seedeaters Sporophila lineola. Negative binomial generalised linear mixed model documenting variables affecting the abundance of Lined Seedeaters (number of observations) per grid cell, with the proportion of precipitation occurring in the last quarter ('rain') in each grid cell as the main fixed predictor, and the number of records of all the species per grid cell per month (N points of all species) as a covariate. Grid identity was used as a random effect. This output includes all records of Lined Seedeaters in eBird and WikiAves.
Our findings confirmed that the species is also migratory throughout its northern range, where a marked fluctuation in the abundance of Lined Seedeaters is also observed (Figure 2). Lined Seedeaters are more common in northern South America from May to November, with a slight peak in abundance during July. In Suriname, flocks of Lined Seedeaters suddenly arrive at the end of June and July, feeding in open grass fields, with some flocks observed as late as October (Haverschmidt 1968). Adult males were rarely observed in these flocks, which were predominantly composed of brown birds in non-breeding condition with a noteworthy volume of accumulated fat stores (Haverschmidt 1968). In the Venezuelan Llanos, Lined Seedeaters are assumed to be seasonally common, with flocks of up to 100 birds recorded in the savannas and bushes from July to December (Thomas 1979). In the Gran Sabana of south-eastern Venezuela, the species was also recorded in open savannas and cleared areas from June to December, with a marked peak in abundance during July (Sharpe et al. 2001; Crease 2009).
The study of the migratory patterns of species in the genus Sporophila is rather complex, as Areta (2012) and Silva (1999) have pointed out: (i) the wintering range of most species of the genus is poorly sampled; (ii) identifying young males and females of many species of Sporophila in the field is usually not feasible; and (iii) a single location may harbour distinct populations, each one with a distinct migratory pattern. Additionally, we highlight two further difficulties: (iv) many species of seedeaters, including the Lined Seedeater, are popular in the pet market and in animal trafficking in South America (Soares et al. 2020), and captive individuals that have been released in an ill-advised manner or have escaped from captivity may lead to odd geographical records; and (v) in the specific case of Lined Seedeaters, adult males are quite similar to adult males of Lesson's Seedeater S. bouvronides, and females of these species are indistinguishable (Ridgely and Tudor 2009). These two species are syntopic throughout much of northern South America, with records of Lesson's Seedeater breeding at the same time and place where Lined Seedeaters winter in Venezuela (Schwartz 1975).
The use of a large pool of data available through citizen science projects allowed us to produce the first comprehensive study on the migration of Lined Seedeaters. Although the use of citizen science data has its weaknesses, such as observer bias, observer reliability, and non-independence of sampling, citizen science projects that provide a review system and that are based on vouchered records can minimise these problems. Citizen science data is not error-free and needs to be dealt with carefully (see the guidelines proposed by Areta and Juhant 2019), because of identification mistakes (which are probably more common in eBird, given that most records are not vouchered), or misdating (which is probably more common in WikiAves, since it uses the metadata associated with the image as the date of the record). In the present study, in which we included all the records (females, males, and juveniles) in the analyses, the cases of misidentification may have increased the error in our analyses given that females and juveniles are very difficult to identify in the field. Moreover, records of Lined Seedeaters on their wintering grounds in northern South America are still comparatively scarce (WikiAves, which accounted for half of the records available for the species, only covers the Brazilian territory, and is highly biased to extra-Amazonian Brazil). However, the immense volume of data available allows us to have confidence in our findings, even accounting for these sampling issues.

Data availability
The data used in this project is openly available through the WikiAves and eBird portals. Climate data is available through the Global Climate Data (WorldClim).