Genome wide analyses reveal the population distinctiveness of the ‘Nera del Mela’ sheep

Abstract Italy has a long history of sheep breeding and counts several local populations that may represent a unique source of genetic diversity. Among these, Nera del Mela is a sheep genetic resource historically reared in Sicily but not officially recognised as a breed. In this study, we genotyped 36 individuals of Nera del Mela using the OvineSNP50K array, in order to estimate the genetic diversity and evaluate the population structure and relatedness with other Italian sheep breeds. Genetic diversity indices, and inbreeding estimated from runs of homozygosity (FROH) revealed a moderate level of variability. Runs of homozygosity islands mapped candidate genes involved in the adaptation to local environment and immune response. Population genetic analyses using different approaches highlighted the hypothesis that this sheep possesses a defined genetic structure, especially if compared with other recognised breeds, despite the influence of other populations such as the Sicilian breeds. Overall, our findings represent a starting point for the possible official acknowledgement of this population, for the creation of a conservation plan, and thus for preserving this genomic heritage. HIGHLIGHTS Nera de Mela sheep can be considered as a reservoir of genetic diversity. The results indicated a clear genetic differentiation from other populations and moderate level of genetic variability. Our findings represent a starting point for the creation of conservation plans.


Introduction
Among livestock species, the sheep genetic resources have played a major role in the economic development of the Mediterranean area (Ben Sassi-Zaidy et al. 2022).Italy has a long history of sheep breeding and its wide environmental diversity together with socioeconomic factors have led to the formation of a large number of populations, many of which have undergone drastic reductions in numbers and are listed as critically endangered (Ciani et al. 2014).Nonetheless, these local populations and their relative products represent an important resource for the cultural and economic heritage of the breeding areas in which they evolved (Scintu and Piredda 2007).Moreover, local populations, being often well adapted to marginal areas, represent a reservoir of genetic diversity important to counteract all the negative effects related to climate and environmental changes (Boettcher et al. 2014).Therefore, the genetic characterisation of these genetic resources is desirable to establish the basis for adequate management and conservation plans (Mastrangelo, Portolano, et al. 2017;Persichilli et al. 2021;Tolone et al. 2022).
The Nera del Mela sheep are reared in a limited geographical area of Sicily (Italy) and takes its name from the homonymous valley on the western side of the Peloritani mountains.The population counts about 400 phenotypically homogeneous animals, small in size and with black fleece (Figure 1).Along with inbreeding events and adaptation to the local environment, this sheep has evolved over the decades and now it is reared for its close integration with the Sicilian production system.However, no information on the genetic diversity and population structure of this local animal genetic resource is available to date.
The characterisation, conservation, and sustainable management of local livestock genetic resources is an urgent challenging task to preserve their precious contribution to the gene pool reservoir (FAO 2007).On these premises, the project Co.Ri.A.L-'Conservation of Local Animal Resources' was developed to evaluate the conservation status of those genetically uncharacterised populations of different species, historically reared in the region but not officially recognised as breeds, such as the Nera del Mela sheep.
The availability of the ovine genome-wide single nucleotide polymorphism (SNP) panels allowed providing background information on genome structure in local and cosmopolite breeds (e.g.Kijas et al. 2012;Ciani et al. 2020;Deniskova et al. 2021;Wanjala et al. 2023).In light of the above, a medium-density SNP array panel was used in this study to estimate the genetic diversity and to evaluate the population structure of Nera del Mela.Furthermore, the SNP data of Italian sheep breeds (Ciani et al. 2014) were included in the analyses to investigate the genetic relationships and divergence of this local sheep from the other genetic resources.

Sampling and genotyping
A total of 36 Nera del Mela (hereafter referred to as NER) blood samples were collected from different farms.Animals were chosen on the basis of the information provided by farmers in order to collect unrelated individuals.DNA was extracted from blood using the commercial Illustra blood genomic Prep Mini Spin kit (GE Healthcare, Little Chalfont, UK).The 36 samples were genotyped using the Illumina Ovine SNP50K BeadChip array, which contains 53,516 SNPs spanning the entire ovine genome (Illumina, San Diego, California, USA).

Data management
Chromosomal coordinates and SNPs names of genotype data were updated using the OAR4.0 version of the assembled sheep genome.
A total of 53,446 markers distributed all over the genome were updated for the NER dataset.In order to better explore the genetic relationships, the genotype data of NER were collated with those of 19 Italian sheep breeds distributed North-South along the peninsula and genotyped with Illumina Ovine SNP50K BeadChip array (Ciani et al. 2014), thus obtaining a final dataset consisting of 487 individuals and 46,134 common SNPs (ITA dataset).Moreover, to investigate in detail the relationships between NER and Sicilian breeds, a second dataset of 5 populations (SIC) was generated.The Sarda breed (SAR) was included in the SIC dataset due to its likely contribution to the genomic structure of the Sicilian breeds (Portolano 1987).Table 1 provides population names and acronyms, number of individuals per populations and information on the datasets.
The software PLINK ver.1.9 (Chang et al. 2015) was used to filter data and perform quality control.In detail, after removing the unmapped SNPs and markers on sexual chromosomes, the following parameters were used to perform quality control: a minor allele frequency !0.01, a genotype call rate for an SNP !0.95 and an individual call rate !0.98, resulting in 44,516 SNPs and 36 individuals for NER, 44,989 SNPs and 473 individuals for ITA, and 44,896 SNPs and 132 individuals for SIC.

Genetic diversity and ROH analysis in Nera del
Mela sheep NER data were investigated for observed (H O ) and expected (H E ) heterozygosity, inbreeding coefficient (F IS ) and average minor allele frequency (MAF) using PLINK ver.1.9 (Chang et al. 2015).The contemporary effective population size (Ne) was also estimated using NEESTIMATOR ver.2.1 (Do et al. 2014), following the random mating option, within the linkage disequilibrium method proposed by Waples and Do (2010).
The sliding windows method, implemented in the R package detectRUNS ver.0.9.6 (Biscarini et al. 2018), was used to investigate the pattern of runs of homozygosity (ROH) in NER.The following parameters were used for ROH detection: (i) the minimum number of SNPs included in a ROH was 30; (ii) the number of missing or opposite genotypes were set to zero; (iii) the maximum gap between consecutive SNPs was set to 250 kb; (iv) the minimum ROH length was set to 1 Mb; (v) sliding window of 50 SNPs for ROH; (vi) no missing or opposite genotypes were allowed in the window; (vii) the minimum density of one SNP every 100 kb; (viii) the threshold to call an SNP within an ROH was set to 0.05.ROH segments were placed into five classes of length using the nomenclature of Kirin et al. (2010) and Feren cakovi c et al. ( 2013): 1-2, 2-4, 4-8, 8-16 and >16 Mbp.The mean number of ROH per individual (N ROH ) and chromosome (NC ROH ), as well as the average length of ROH in Mbp per individual (LROH) and chromosome (LCROH), were calculated.In addition, the total length of the genome covered by ROH was evaluated for each individual and divided by the total autosomal genome length covered by SNPs ($2.4 Gb) in order to estimate the average population genomic inbreeding coefficient (F ROH ).
Rich homozygous regions (ROH islands) recurring in NER, were identified by calculating the standard normal z-score from all the SNPs-within-ROH incidence and deriving the p-values: only the SNPs within the top 0.5% were considering to constitute ROH islands.The genomic coordinates of ROH islands were examined through NCBI Genome Data Viewer (Rangwala et al. 2021) according to the Assembly OAR_v4.0 (GCF_000298735.2),while the QTLs associated to the genomic regions of interest were investigated in animalQTLdb (Hu et al. 2022).

Genetic relationship and population structure
Population relationships and structure were studied on merged datasets, prior removing SNPs in high linkage disequilibrium by using the indep-pairwise (50 10 0.2) function in PLINK ver.1.9 (Chang et al. 2015).Multidimensional scaling (MDS) analysis on pairwise identity-by-state distances was performed by PLINK ver.1.9 (Chang et al. 2015) on ITA and SIC datasets.ARLEQUIN ver.3.5.2.2 (Excoffier and Lischer 2010) was used to estimate the Italian between-populations variance using pairwise F ST , which was visualised through a heatmap using the R package ggplot2 (Wickham Finally, the analysis of genomic structure on ITA and SIC was performed by the software ADMIXTURE ver.1.3.0(Alexander et al. 2009) using the unsupervised modelbased clustering algorithm.The outcomes of structure analysis were plotted through the R package BITE ver.1.2.0008 (Milanesi et al. 2017).

Genetic diversity and ROH analyses
The results describing the genetic diversity and ROH parameters of NER sheep are summarised in

Genetic relationship and structure
Pairwise F ST values calculated among all Italian sheep populations ranged from 0.113 (SAB-DEL) to 0.024 (PIN-BAG), identifying SAB and DEL as the most distant breeds (Figure 2).NER showed the lowest values towards the southern breeds, especially COM (0.039), BAG (0.036) and PIN (0.028), whereas the highest value was observed versus SAB (0.081).
To examine the genetic relationships within and between individuals and populations, we used the MDS based on pairwise IBS distances plotted in Figure 3.The results distributed all the populations in accordance with their geographical origin.In particular, the first dimension (C1, 8.96% of the observed variation) distinguished the southern and insular from the northern sheep populations and closely clustered the Sicilian breeds (COM, PIN and VDB) together with NER.The Neighbor-Joining tree based on ASD (Figure 4) assigned the individuals in accordance with their population of origin, with few exceptions regarding some samples of BAG, BIE, SAM, and PIN breeds.Only one individual was not assigned to the NER population that had a common phylogenetic root with the other closely clustered Sicilian breeds.
To provide further information on the genetic connections of NER with the breeds of the same breeding area, the relationship analyses were carried out on the SIC dataset.The MDS plot (Figure 5) showed widespread clusters for VDB and SAR and their distinctness from the other populations and highlighted an evident genetic closeness between PIN and COM.The results indicated a moderate genetic isolation of NER that formed a defined cluster, but also a partial overlap of some of its individuals with PIN, underlining their reciprocal proximity.The Neighbor-Joining tree based on ASD (Figure 6), further detailed the genetic relationships among populations and confirmed the results of the MDS described above.
Results of admixture analysis indicated that the most probable number of inferred populations was K ¼ 15 (Supplementary Figure 1).A graphic representation of the Bayesian clustering among the ITA sheep populations is shown in Figure 7.At K ¼ 2, northern and southern breeds are clearly separated.As the number of Ks increased, populations were progressively assigned to different clusters and many exhibited shared ancestral components.Admixture among southern populations was clearly visible up to K ¼ 6, particularly evident in the Sicilian sheep group (NER, PIN, COM and VDB).From K ¼ 7, the NER sheep clustered apart from all other Italian populations, with some individuals showing ancestral components shared with PIN along the whole range of K analysed.This clustering trend was confirmed by the results of the Admixture analysis performed on SIC dataset (Supplementary Figure 2).The first breed to be differentiated from the others was VDB (K ¼ 2).The NER sheep tended to show its own distinct cluster starting from K ¼ 3, while K ¼ 4 showed some NER individuals with a PIN-like pattern.At K ¼ 5, the shared genetic components between COM and PIN, and an internal variability for NER are underlined.

Discussion
The Italian breeding system has various genetic resources that, although part of the local biodiversity, are often not recognised as official breeds.The rusticity and adaptability make these populations essential for a sustainable production in response to climate change (Ronchi and Nardone 2003;Vastola 2015).It is important to investigate the characteristics that make these animal resources valuable, in order to provide compelling reasons for institutions to recognise their importance and thus to promote funding to support their breeding.

Genetic diversity and ROH analysis
The observed and expected heterozygosity values nearly overlapped, assuming an unremarkable inbreeding status, and revealing a moderate intrapopulation diversity.The Nera del Mela showed similar values to the other Italian sheep populations (Ciani et al. 2014;Persichilli et al. 2021).On the contrary they were slightly lower than those reported by Mastrangelo et al. (2014) for the three autochthonous Sicilian breeds (Comisana, Pinzirita and Valle del Belice).Similarly to Nera del Mela, other Italian breeds also showed a negative F IS value, including Comisana (À0.030) and Sarda (À0.020) (Ciani et al. 2014).The low inbreeding was also confirmed by the F ROH value that differed from the results of previous studies (Mastrangelo et al. 2018;Ciani et al. 2020;Persichilli et al. 2021) in which the authors reported higher values for several sheep populations.The effective population size of Nera del Mela was lower than the values estimated in the Italian breeds (Persichilli et al. 2021).Conversely, comparable low results were found in Barbaresca (Ne ¼ 25.0) (Mastrangelo, Portolano, et al. 2017) and Pagliarola (Ne ¼ 20.0) (Persichilli et al. 2021), which, like Nera del Mela, are endangered populations that have experienced a dramatic contraction of the breeding animals.Although the Nera del Mela has a low estimated and real population size, did not show signals of low levels of genetic diversity, and this could be attributed to possible admixture events which, due to the high percentage of short and medium ROH segments, seem related to past inbreeding (Yoshida et al. 2020).

Runs of homozygosity islands
ROH analysis in livestock species has been widely used to identify genomic regions fixed in populations and therefore potentially under selection, with the possibility of mapping genes potentially linked to populationspecific traits (e.g.Marras et al. 2015;Zhang et al. 2015;Mastrangelo, Tolone, et al. 2017).In Nera del Mela sheep, the chromosome richest in ROH and ROH islands was OAR1, unlike what was found in other  1.
The ROH island in OAR1, spanning the region 85.70-102.01Mb, identified genes involved in the immune response against pathogens and pests.In particular, the CD53 gene has been related to nematode infection pathways in sheep (Rafeie et al. 2023), CD101, IGSF3 and VTCN1 have been reported as genes linked to parasite resistance in Australian sheep (Kalaldeh et al. 2015), and KCNC4 gene has been reported in promoting potassium voltage-gated channel for immune functions in indigenous cattle breeds (Xu et al. 2018).The annotation of this region also reported key genes involved in sheep's resistance and adaptation to harsh environments.In particular, MAN1A2, GOLPH3L, RAP1A are oxidative stress-related genes that have been shown to be overexpressed due to heat or cold stress (Lu et al. 2019;Guo et al. 2022;Paim et al. 2022).OAR1:85.70-102.01 also mapped the GNAI3 gene, which appears to influence adaptation to hot and arid environments, such as the thermo-tolerance reported for the Barki and Boer goats (Kim et al. 2016).Genes related to production traits were also identified, such as AMPD1 that has been reported as a discriminating gene for pasture-fed sheep due to increased umami taste in meat (Luo et al. 2019).The OAR1:154.96-163.54island reported the EPHA3 gene that has been showed to have a role in the wool structure in sheep (Kang et al. 2013), function showed also by the GNG11 gene, found in OAR4:2.03-16.70island (Li et al. 2020).Genes associated with fertility and reproduction were also found within the island OAR4:2.03-16.70.In the ovine species, ZPBP and COL1A2 genes are assumed to be functionally essential for spermatogenesis (Xu et al. 2021;Fu et al. 2022), while TAC1 might be involved in the control of reproductive seasonality (Li et al. 2018).The KRIT1 and VWC2 genes appeared to contribute to high fecundity in Small Tail Han sheep (Miao et al. 2020).Genes, previously associated with adaptive capacity in sheep, have also been annotated in this genomic region.The GNGT1 has been considered as a candidate gene responsible for the adaptation as it is involved in the regulation of hypoxia (Wanjala et al. 2023).Similarly, PDK4 is a glycolysis-related gene, which significantly increased its expression with increasing altitude in Tibetan sheep (Wen et al. 2021).The island in OAR6 spanning 22.74-28.96Mb mapped the DNAJB14 gene  1.
that has been associated with heat tolerance in Asian Mufloun and Shal sheep (Chen et al. 2021).Finally, the ROH island on chromosome  showed EPB41L1 which is a gene linked to milk quality in sheep as reported in Chios breed (Tsartsianidou et al. 2021).

Genetic relationship and structure
The different population analyses, aimed at investigating the genetic relationships and the genomic structure within and between populations, gave an unambiguous picture regarding the genetics of the Nera del Mela sheep.This population, with a reduced consistency and unclear origin, is characterised by small size and uniform black fleece, morphological characteristics found also in individuals scattered in the herds of other Sicilian breeds.Breeders, in fact, report the practice of mating animals with black fleece from various farms spread throughout Sicily.Decades of genomic isolation and adaptation to environmental conditions have helped to fix this distinctive trait in this sheep population, in association with a good response to the local adaptation.
The measure of genetic differentiation between populations (F ST ) highlighted proximity of the Nera del Mela to the Comisana and Pinzirita breeds, as well as to other southern Italian breeds such as Leccese and Bagnolese.On the other end, the results showed the   Sarda Black as the most genetically distant sheep from the Nera del Mela, despite their morphological similarities (small body size and black fleece).The multidimensional scaling plot clearly showed the Italian populations distributed in the metric space according to their geographical breeding location, as previously reported (Ciani et al. 2014).The condensation in the first two components of the genomic variance between individuals makes us appreciate the close relationships within the cluster of Sicilian sheep and the degree of dispersion of the Nera del Mela among them.The Neighbor-Joining algorithm implemented on pair-wise shared allele distances, which are a measure of the genetic proximity between each pair of individuals, is able to describe accurate phylogeny and population stratification even when the size of a sample is small (Gao and Martin 2009).The phylogenetic relationships showed an almost perfect correspondence between individuals and the populations they belong to.Moreover, the graph showed a clear clusterization between populations originating from the same geographic area and confirmed the mutual proximity of the Sicilian sheep.However, the results of genomic structure, obtained through the model-based clustering approach, distinguished Nera del Mela, indicating clear genetic differences compared with other Italian sheep populations.In fact, the population diverged from the other Italian sheep breeds and is recognised by a distinct cluster at very low K value (K ¼ 7).
To explore in detail the relationship of Nera del Mela with the other Sicilian breeds, further analyses were carried out only on the sheep reared in Sicily.The multidimensional scaling analysis confirmed the  1.
differentiation between Nera del Mela and Sicilian breeds that could be hypothesised as its potential ancestors.Compared to Comisana and Pinzirita, individuals of Nera del Mela sheep are relatively more dispersed within the population cluster, a typical feature of those animal resources with higher genetic diversity or admixture.The Neighbor-Joining tree underlined the distinctiveness of the Nera del Mela and confirmed the results of the MDS described above.Within the Nera del Mela cluster, some individuals showed pattern of admixture, with genomic components resembling that of the Pinzirita and Comisana breeds.This evidence of admixture is possibly to be found in a distant past of common origin.Finally, the genetic similarity between Comisana and Pinzirita could be explained considering the geographical proximity of their breeding area, their similar management, and thus the possibility of gene flow between them (Mastrangelo et al. 2014).
All the results underline the hypothesis that Nera del Mela population possesses a defined genetic structure, despite the absence of a herd book and the possible genomic influence of other breeds, as reported for other local animal genetic resources (Tolone et al. 2022).This differentiation is probably the consequence of the combined effects of genetic drift, small population size, founder effects and reproductive isolation.

Conclusions
This study reported the first genetic characterisation of Nera del Mela sheep population performed using genome-wide SNP data.Despite the evidence of admixture with other Sicilian breeds, a clearly differentiated structure can be attributed to Nera del Mela, which suggests an original genomic pattern worthy of safeguarding and adequate selective plans.
The presence of a few sheep with black fleece has always characterised the flocks of different breeds, in particular those breeds with scattered black spots and pigmentation, such as the Pinzirita.Assuming a selection starting from the different Sicilian breeds for the character of the black fleece, supported by the information gathered from Sicilian breeders, the admixture of the Nera del Mela with the other Sicilian sheep resources, and in particular with the Pinzirita, would be explained.
The search for the genes annotated in the recurrent segments of homozygosity in Nera del Mela highlighted functions related to the adaptation to the local environment, while less for traits linked to milk production.Our genomic characterisation represents the starting point for the possible acknowledgment of the breed status of the Nera del Mela.Further investigations of the genomic characteristics by increasing the number of subjects analysed, as well as a detailed survey of the morphological traits are necessary to provide the peculiarities of this animal genetic resource.

Figure 2 .
Figure 2. Pairwise F ST values estimated between the 20 Italian sheep populations (ITA dataset).For full definition of the dataset, see Table1.

Figure 3 .
Figure 3. Multidimensional scaling analysis on the genotype data of the 20 Italian sheep populations (ITA dataset).For full definition of the dataset, see Table 1.

Figure 4 .
Figure 4. Neighbor-Joining Tree based on Allele Sharing Distances between the individuals of the 20 Italian populations of the ITA dataset.For full definition of the dataset, see Table 1.

Figure 5 .
Figure5.Multidimensional scaling analysis on the genotype data of the 5 Sicilian populations (SIC dataset).For full definition of the dataset, see Table1.

Figure 6 .
Figure 6.Neighbor-Joining Tree based on Allele Sharing Distances between the individuals of the 5 Sicilian populations (SIC dataset).For full definition of the dataset, see Table 1.

Figure 7 .
Figure 7. Circle plot of the ancestral clusters (K ¼ 2-20) inferred by the Admixture analysis on the genotype data of the 20 Italian sheep populations (ITA dataset).For full definition of the dataset, see Table1.

Table 2 .
The diversity indices revealed a moderate level of variability, with close values between observed (H O ) and expected (H E ) heterozygosity (0.376 ± 0.171 and 0.370 ± 0.130, respectively), and a relatively high mean MAF value.This result was confirmed by the negative average F IS (À0.017 ± 0.068) and the low average F ROH (0.027 ± 0.053).The estimated contemporary effective population size (Ne) was 38.1.The analysis of runs of homozygosity pattern gave a total of 356 ROHs in 28 out of the 36 individuals.The N ROH was 9.89 ± 16.49, while NC ROH reported a mean value of 13.

Table 2 .
Genetic diversity indices and run of homozygosity parameters for Nera del Mela sheep.IS : inbreeding coefficient based on observed and expected heterozygosity; F ROH : inbreeding coefficient based on run of homozygosity (ROH); MAF: average minor allele frequency; Ne: contemporary effective population size; N ROH : mean number of ROH per individual; NC ROH : mean number of ROH per chromosome; L ROH : average length of ROH in Mb per individual; LC O : observed heterozygosity; H E : expected heterozygosity; F ROH : average length of ROH in Mb per chromosome; sd: standard deviation.