Genetic diversity and population structure of bread wheat varieties grown in Bulgaria based on microsatellite and phenotypic analyses

Abstract Determining the genetic diversity and population structure of the modern hexaploid wheat varieties currently grown in Bulgaria is fundamental for selection of genotypes with desirable traits resilient under climatic fluctuations and development of successful crop improvement programmes. In this study, simple sequence repeat (SSR) markers were used to characterize a population of 117 modern wheat varieties (Triticum aestivum L.) from Bulgaria and several Western, Central and Eastern European countries. The genetic diversity was higher in the Western and Central European varieties than in the Bulgarian and the remaining Eastern European ones. Model-based population structure analysis defined 2 sub-populations (K = 2) dividing the Central and Western European varieties from the Bulgarian ones. Subsequent genetic structure analysis at K = 3 revealed an additional separation of the Bulgarian varieties in two distinct sub-populations. The phenotypic diversity among the varieties was evaluated in the fields of Dobrudzha Agricultural Institute, G. Toshevo in North-Eastern Bulgaria for three consecutive years. The distribution of the varieties in the biplot analysis in terms of grain yield and its components revealed differences in their adaptation to the agro-climatic conditions of North Bulgaria according to their geographical origin. These results and the prevalence of specific SSR alleles in the sub-populations suggest distinct adaptive mechanisms to specific agro-ecological regions. The data will be of interest for both breeders and farmers and could serve as a basis for wheat improvement programmes and further association mapping for important agronomic traits expressed under different environmental conditions. Supplemental data for this article is available online at https://doi.org/10.1080/13102818.2021.1996274 .


Introduction
Wheat is the most important cereal crop, with 772 million tonnes produced in 2017 [1] and one of the major staple food crops over the world after rice. It occupies a central place in human nutrition providing 20% of the daily protein and food calories [2]. Wheat based foods provide a range of essential and beneficial components to the human diet, including protein, B vitamins, dietary fibres and phytochemicals [3].
Among wheat crops, bread wheat (Triticum aestivum L.) is the most important because of its pivotal role in world food security. The Green Revolution has allowed an increase in wheat production and development of high-yielding varieties all over the world and in Europe. However, during the last 20 years, the wheat yield in Europe has faced stagnation [4]. The necessity for increasing wheat production is of a high priority to reach a global food security under the deepening negative impact of climate changes and the growing human population, which is expected to reach over 9 billion by 2050 [5].
Bread wheat has accumulated huge genetic diversity during its evolution from einkorn. However, during the last several decades the genetic diversity in the present-day wheat has decreased basically due to the repeated cultivation of landraces for desirable traits, adaptation to particular agro-climatic conditions, the development and use of uniform varieties to meet the requirements of the food industry [6,7]. Narrowing of the genetic diversity has become a major concern for future genetic progress. Genetic diversity is crucial for the adaptability and survival of wheat species against infectious pests and diseases [8] and climate fluctuations, which are expected to become a major constraint for food security in the future. In order to avoid this, strategies to characterize and protect genetic diversity at both national and regional level are required. The reduction of genetic diversity encouraged the use of genetic resources in wheat breeding programmes.
The characterization of wheat varieties at molecular and phenotypic level is essential for preserving the existing genetic diversity, which would allow selection of varieties well adapted to the specific agro-climatic and soil conditions in order to meet the challenges imposed by the changing climatic conditions. However, the morphological markers are of limited number and the expression of quantitative traits is strongly affected by the environment.
The progress in molecular genetics technologies and the development of the Next Generation Sequencing (NGS) during the last few decades benefitted our understanding of the wheat genome organization and evolution, and provided reliable approaches for fast and efficient breeding. With the expansion of novel technologies, a range of polymerase chain reaction (pCR)-based DNA markers (RApD, AFLp, SSR, EST-SSR, ISSR, IRAp, RBIp, TD, SNp, etc.) were developed in wheat, which allowed their utilization in genotypic identification, gene diversity analysis, qTL mapping, etc. in different Triticeae species [9][10][11][12][13]. Molecular marker techniques vary in data generation efficiency and the genome area covered in the study [6]. The choice of a marker type for a study depends on the crop species and the goals of experiments. Among the molecular markers used in wheat studies in recent years, SSR (simple sequence repeats or microsatellites) and SNp (single nucleotide polymorphism) markers are the most applicable due to their co-dominant nature and wide coverage across the genome.
SSR markers are widely distributed throughout the cereal genomes and can be found in both coding and non-coding regions [14,15]. Although their number is lower than SNp markers, they continue to be widely used in various research projects. Nowadays, SSR databases are publicly available for various crops including wheat [16]. Moreover, many studies have claimed SSR markers to be a versatile tool in plant breeding programmes [10] and they are the more efficient choice in genetic diversity studies than SNps due to their faster rate of mutation and the elevated levels of polymorphism, which can be found with a small number of highly polymorphic markers [17].
SSR markers have been widely used for analysis of population structure, genetic diversity and relationships among wheat genotypes which are of major importance for the development of appropriate breeding plans [22]. The population structure information allows utilization of the natural diversity for detection of genes/qTLs of agronomic importance using the existing genetic technologies [30]. In the last few decades genetic diversity studies have made huge progress from simple detection of various morphological to molecular traits variation on DNA level [31]. Determining the existing genetic diversity is of a high importance for selection of superior genotypes for further crop enhancement programmes [32].
Bulgaria is one of the major producers of bread wheat in South-Eastern Europe. According to the Black Sea Agriculture Markets Analyst, Refinitiv (https://www. r e f i n i t i v. c o m / p e r s p e c t i v e s / t r a d i n g / w h a t -a r e-the-prospects-for-this-seasons-black-sea-wheat/), the wheat production of the Black Sea countries like Russia, ukraine, Romania and Bulgaria accounts for 1/3 of the global wheat exports.
Improving the yield capacity and the quality of grain of bread wheat is a major task at present because of the unpredictable climatic changes in recent years. Hence, uncovering the molecular basis of complex adaptation of wheat including tolerance to abiotic and biotic stresses and selection of germplasm with better performance under unfavourable meteorological conditions is a primary aim of breeders to enhance the crop production in Bulgaria.
This study aimed to determine the genetic diversity and population structure of the bread wheat varieties currently grown in Bulgaria using SSR markers and phenotypic data to reveal the potential of the genotypes for further use in breeding programmes for sustainable agricultural production.

Plant material
The plant material consisted of a germplasm collection of 117 modern winter wheat (Triticum aestivum L.) varieties currently grown in Bulgaria. The Bulgarian wheat is presented by 79 wheat varieties developed in Dobrudzha Agricultural Institute -G. Toshevo (DAI), Institute of plant Genetic Resources -Sadovo (IpGR), and two Bulgarian Seed Houses (Supplemental material  Table S1). The foreign germplasm is presented by the varieties from several European union Member States and breeding companies: France (17), Austria (5), Serbia (6), Croatia (1), Germany (2), the Czech Republic (1), Syngenta (3) and KWS (3).

Phenotype recording
Field experiments were conducted in three crop seasons (2017-2020) in the fields of Dobrudzha Agricultural Institute -G. Toshevo, North-Eastern Bulgaria. The region is typical for wheat production. Sowing was done up to the middle of October of each year. Field trials were organized in plots of 10 m 2 each using complete randomized design with two replications. Ten plants per variety were used for the analysis before harvesting. The following traits were evaluated and measured each cropping season: Days-to-heading (HD), plant height (pH), Thousand kernel weight (TKW), Grain yield (Gy), Hectolitre weight (Hectl) and Number of grains per ear (NGE).

DNA extraction
DNA samples were extracted from a 100 mg bulked fresh leaf tissue of 10 individual plants/genotype using a Gene All (Exgene TM plant SV mini) (Tribioscience, Seoul, Korea). The concentration and the purity of DNA samples were determined with a NanoDrop spectrophotometer (Thermo Fisher Scientific, Wilmington, DE, uSA).  [36]. Xbarc primers were extended with generic non-complementary nucleotide sequences tagF 5 -A C G A C G T T G T A A A A -3 ′ a n d t a g R 5-CATTAAGTTCCCATTA-3′, respectively, at their 5′ ends as described in Hayden et al. [37]. The SSR amplification was carried out according to Roder et al. [33] and Hyden et al. [37] using forward primers labelled with the FAM, ATTO565 (pET) or ATTO550 (NED) to allow direct detection of alleles on the automated capillary sequencer (ABI3130, Thermo Fisher Scientific, Wilmington, DE, uSA). All primers were synthesized by Microsynth (Microsynth AG, Balgach, Switzerland).
pCR was performed on a Veriti96 Thermal Cycler (Thermo Fisher Scientific, Wilmington, DE, uSA) using the pCR conditions described for each marker type [33,37]. Electrophoresis and visualization of SSR alleles was performed on an ABI3130 DNA analyzer. A standardized multi-pooling procedure was used to prepare SSR products for electrophoresis. After pCR, a 3-fold (for Barc SSRs) or 10-fold (for the remaining SSRs) initial dilution of the pCR products was performed. Additional dilutions (up to 1/25-1/75× or 500-750×) were done depending of the intensity of the signal after the subsequent mixing of the labelled products, respectively FAM: pET: NED (Barc SSRs) or FAM: pET (Xgwm, Xwmc and Xfca). Nine mixed pools were used according to the length of the pCR products. Subsequently, the diluted products were mixed with labelled internal standards GeneScan™ 500 LIZ™ dye Size Standard (Thermo Fisher Scientific, Wilmington, DE, uSA) and formamide, denatured and electrophoresed on an ABI3130 DNA analyzer. SSR allele sizing was performed with the Gene Mapper v4.0 software program ( Thermo Fisher Scientific, Wilmington, DE, uSA).

Data analysis
Allele number, frequency of alleles, observed heterozygosity (Ho), expected heterozygosity (He) and polymorphic Information Content (pIC) were determined using power Marker v. 3.25 [38].
The Nei 83D A distance matrix [39] was used to construct a phylogenetic tree with the unweighted pair group method with arithmetic mean (upGMA) module of power Marker v3.25. The resulting tree was visualized and annotated by using the Evolview v3 webserver [40].
The genetic structure among the genotypes was studied using two approaches: a model-based method implemented with STRuCTuRE 2.3 [41] and discriminant analysis of principal components (DApC) [42]. The model-based analysis was run hypothesizing from one to ten distinct populations (K = 1 to K = 10), using an admixture model with correlated allele frequencies with 100,000 burn-in phases and 1,000,000 MCMC iterations in 10 independent runs. The most probable number of populations was determined using the ΔK method [43] as implemented in Structure Harvester v. 6.93 [44]. The visualization of the resulting clusters was done using pophelper package in R version 4.1.1 (https://www.r-project.org/). All basic statistics of genetic diversity and the level of differentiation between populations defined in the model-based analysis were evaluated in R version 4.1.1 (https:// www.r-project.org/).The degree of differentiation between the resulting genetic clusters was evaluated using pairwise Jost's D [45] and analysis of molecular variance (AMOVA) in R version 4.1.1.
We examined how the genotypes from the populations established by the model-based method are clustered according to their performance in the six quantitative phenotypic traits described above, using principal component analysis (pCA). The pCA was done using the best unbiased linear predictions (BLups) across the three experimental seasons. The BLup values were extracted from the model: where μ is the overall mean, g i is the random effect of the ith genotype, r j is the random effect of the jth replication, s k is the random effect of the kth season, (gs) ik is the random effect of the interaction between ith genotype and kth season and ε ijk is the random error of the model [46]. All phenotypic data analysis and visualization were performed in R version 4.0.5 (https://www.r-project.org/).

Genetic diversity analysis
To explore the genetic diversity and population structure, we investigated the patterns of molecular diversity with 22 SSR markers in the wheat germplasm collection consisting of 117 accessions, including 79 modern Bulgarian varieties and 38 foreign varieties that were also phenotypically evaluated during 3 subsequent crop seasons (2017-2020).
The markers were selected to cover all 7 chromosomes of wheat with a minimum of 2 markers per chromosome. The diversity pattern of the 22 SSR loci across the whole set of accessions revealed a total of 179 distinct alleles ranging from 4 in locus Xgwm458-1DL to 15 in locus Xcfa2086-2AL with a mean of 8.14 alleles per locus (Table 1). Several genotypes showed null alleles in 5 out of 22 loci tested, which was confirmed by additional pCR. The mean pIC for all 22 SSR markers was 0.6240 with values ranging from 0.3841 for marker Xbarc273 on chromosome 6DL to 0.8074 for the marker Xcfa2086 on chromosome 2AL. A few additional markers like Xgwm11, Xbarc170 and Xwmc808 on chromosomes 1BL, 4A and 3BS, respectively, were also characterized with high pIC. The genetic diversity index (He or GD) ranged from 0.3941 (Xbarc273-6DL) to 0.8168 (Xcfa2086-2AL) with a mean of 0.6580. The average observed heterozygosity (Ho) across all 22 loci was 0.1853 with the highest values of 0.6496 in locus Xgwm619-2BL, and 0.5043 in locus Xbarc330-5AS. The lowest heterozygosity was recorded in locus Xgwm458-1DL (0.0256).
The B-genome had the highest mean number of alleles (8.75) followed by the A genome (8.33), whereas the D genome had the lowest number (6.8). The SSR markers used showed different levels of gene diversity between the wheat genomes. Тhe highest pIC and GD values were observed in the A genome (0.6537 and 0.6910) compared to B (0.6445 and 0.6768) and D (0.5375 and 0.5687) genomes, respectively (Table 1).
Comparative analysis of gene diversity among the germplasm pools from different geographical regions in Bulgaria (North and South Bulgaria) and Europe (Western, Central and Eastern Europe) ( Table 2 and Supplemental material Table S2) revealed the highest mean GD in Western and Central European wheat (0.6240) and the lowest in Eastern European wheat (0.4896). The bread wheat varieties originating from North Bulgaria showed a higher mean number of alleles and GD value (5.91 and 0.5817, respectively) compared to those from South Bulgaria (4.82 and 0.5032). There was difference in the level of GD among the studied SSR loci in both regions in Bulgaria. The highest level of GD in the panel of wheat varieties bred in North Bulgaria was observed in loci Xcfa2086-2AL (0.8106), Xgwm11-1BL (0.7824) and Xbarc170-4A (0.7696). In contrast, the highest GD in the wheat panel from South Bulgaria was found in the loci Xgwm155-3AL (0.7067), Xbarc330-5AS (0.6828) and Xwmc808-1BL (0.6643).

Population structure
The presence of a genetic structure among the genotypes in the studied collection was inferred using a model-based approach implemented in the STRuCTuRE software. The analysis of the results showed two distinct peaks: a higher one at K = 2 and a lower one at K = 3 (Figure 1(A)). The resulting sub-populations at K = 2 divided the varieties in a pronounced geographical manner. The first sub-population (Sp1) included 34 varieties, the majority of which originated from Western and Central Europe: 22 from France, 4 from Austria, 2 from Germany and 1 from the Czech Republic. The remaining five varieties were from the Balkans: 3 Serbian and 2 Bulgarian. The second sub-population (Sp2), consisting of 79 genotypes, was composed mainly of varieties originating from Bulgaria. Seventy-five out of the 79 Bulgarian genotypes were grouped in Sp2. It also included three Serbian varieties and one Austrian. There were four genotypes that were not assigned to any of the described sub-populations. The admixed group consisted of 1 French, 1 Croatian and 2 Bulgarian varieties.  At K = 3 the geographical division of the collection was even more pronounced as the Bulgarian varieties were sub-divided according to the geographic location of the breeding centres where they were selected. The first sub-population (Sp1) consisted of 38 varieties, 36 of which developed in the North Bulgarian breeding centres (DAI, G. Toshevo and pesticid EOOD, Shumen), one in the South Bulgarian breeding centres (IpGR, Sadovo and Seed House, Sadovo) and one Austrian variety (Figure 1(B)). The second sub-population (Sp2) included exclusively Western European varieties. Twenty-eight out of 33 varieties in Sp2 were from Western and Central Europe (22 French To analyze how the genetic variance was distributed across the hierarchical levels of the grouping derived from the genetic structure analysis we performed an AMOVA, considering the sub-populations described above as populations. The analyses were done by excluding the genotypes that failed to group to any of the main sub-populations inferred by the structure analysis. When considering two main populations (K = 2), 56.3% of the genetic variance was distributed among the varieties within populations, 25.6% within the varieties and 18.1% between populations. When considering three main populations (K = 3), 19.6% of the genetic variance was partitioned among the populations, 52.8% among varieties and 25.5% within varieties. The differentiation among the populations was more pronounced at K = 3. Higher pairwise values of Jost's index were observed between the Sp2 and Sp1 (0.41), and between Sp2 and Sp3 (0.40), than between Sp1 and Sp3 (0.22).
The presence of three distinct sub-populations was also confirmed by the DApC analysis. The resulting DApC inferred groups showed a high level of agreement with those formed at K = 3 in the model-based population structure analysis (Figure 2(B)). The above described sub-populations were completely preserved, with the only exceptions being the dislocation of variety Renesansa (RS) from Sp3 to group 1, the latter corresponding to Sp1 from the structure analysis, and the classification of the admixed genotypes across the three groups formed in the DApC analysis. Group 3 and group 1 were located in the centre of quadrants two and three, respectively and group 2 was situated in the right part between quadrants one and four on the DApC graph (Figure 2(A)). The most contributing alleles for the first axis were X g w m 2 6 1 . 1 9 2 b p , X g w m 2 6 1 . 1 9 6 b p a n d Xcfa2086.271 bp, and for the second axis

Phylogenetic analysis
The phylogenetic tree (Figure 4) delineated the 117 varieties into 2 main clusters in agreement with the observed 2 main sub-populations inferred by the model-based approach at K = 2 in the genetic structure analysis. The first main cluster included mostly Western and Central European varieties with origin from France (22), Austria (4) and Germany (2) as well as a few Eastern European varieties from Serbia (3), South Bulgaria (1) and the Czech Republic (1). The second main cluster incorporated about 2/3 of the analyzed varieties, including all Bulgarian varieties, with the exception of Nikolay (No 24), and 6 foreign ones with origin from Serbia, Croatia, France and Austria (Supplemental material Table S1). This cluster is subdivided into several sub-clusters. The observed subdivision generally conforms to the sub-populations determined by the model-based approach at K = 3. The varieties originating from South Bulgaria were grouped close to each other, whereas those from North Bulgaria are distributed in several sub-clusters, some of which consisting only of genotypes developed in DAI, G. Toshevo. Three varieties from Serbia (Renesansa, Ilina and Simonida) clustered together with the varieties of IpGR, Sadovo, whereas the varieties Midas (Austria) and Andelka (Croatia) are grouped with the DAI varieties.

Phenotypic data
To analyze if there are patterns in the phenotypic performance of the genotypes classified in the two clusters resulting from the model-based method described above, we performed a pCA analysis with six quantitative phenotypic traits. The analysis was done using BLupS extracted from a linear mixed model, with all independent variables taken as random. All of the used predictors in the model had a highly significant effect upon the studied traits except 'replication' (Supplemental material Table S3). The first two axes from the pCA explained 64% of the variance ( Figure 5). Grain yield (Gy) and hectolitre weight (Hctl) had the highest contributions to the first axis, and plant height (pH) and days to heading (DH) were the most contributing variables for the second axis. The genotypes from Cluster 1 were situated mainly in quadrants one and four, except for the variety Nikolay (No24), which was located in the lower left part of quadrant two. The genotypes situated in quadrant four had a Gy above average (the ones situated closer to the ordinate axis) or below average (those situated to the extreme right of the graph); the most common feature of these varieties was the low values (below average) of pH, TKW and Hclt. The genotypes in quadrant one had Gy below the average, lower NGE than the average and higher or close to average DH. The genotypes from Cluster 2 were distributed predominantly in quadrants two and three, but 13 of them were scattered in the remaining two quadrants. Those in quadrants two and three had higher than the average Gy, TKW and Hctl. The genotypes in quadrant two also had a higher pH.

Discussion
Genetic diversity is vital for successful crop improvement programmes and food security because it underlies adaptation of crops to environmental changes and ecosystem resilience. However, many studies in bread wheat have claimed that modern breeding has led to narrowing of the genetic diversity in wheat in the last few decades [8,[47][48][49]. Continuous assessment of genetic diversity to identify sources of new genetic variation in the wheat germplasm pool is critical for the development and selection of high-performing varieties that have the capacity to adapt to specific agro-ecosystems and overcome the ever-changing climate conditions [22,50].
In this study, we described genetic and phenotypic diversity and marker-based population structure of modern bread wheat (T. aestivum L.) varieties grown in Bulgaria.

Genetic diversity and SSR marker distribution
This study showed a significant level of genetic variation in the studied panel of 117 winter bread wheat currently grown in Bulgaria. A total of 179 alleles were identified by 22 polymorphic SSR markers with an average of 8.14 alleles per locus. The results are similar to the findings described in other studies. Belete et al. [22] reported higher mean number of alleles (10.06) for 52 bread wheat accessions in a study aimed at identifying parental genotypes appropriate for drought tolerance breeding. A lower number of alleles per locus was reported by Zhang et al. [51] (7.2 alleles/locus) who analyzed 205 elite wheat breeding lines from the major winter wheat breeding programmes in the uSA using 245 SSRs and by Chen et al. [52] (5.05 allele/ locus) in 90 Chinese winter wheat accessions analyzed at 269 SSR loci. Landjeva et al. [48] also identified a lower number of alleles (6.8 alleles/locus) in a set of 91 Bulgarian winter wheat (T. aestivum L.) varieties released in the twentieth century using 19 SSR markers and one secalin-specific marker for rye chromosome arm 1RS.
Allelic diversity depends on the genetic composition of the population and it is an important tool to discern the genetic attributes associated with specific traits. The polymorphic information content (pIC) across the tested loci observed in our study is a confirmation that the selected SSR markers are highly polymorphic and appropriate for genetic diversity studies. At the genome level, both A and B genomes had higher values than the D genome for both GD and pIC. The lower number of polymorphic markers on the D Figure 5. pca biplot of six quantitative traits and the distribution of the genotypes at K = 2. the cluster memberships are coded in colours and the country of origin in shapes. gy -grain yield, Dh -days to heading, hctl -hectolitre weight, nge -number of grains per ear, ph -plant height, tKW -thousand kernel weight.
genome could be explained with the lower frequency of recombination events in the D genome of hexaploid wheat leading to its lower diversity. The latest originated from the A. tauchii, whereas the A and B genomes derived from its tetraploid ancestors which are characterized with larger proportion of genetic diversity [53].
The average GD (0.6580) of the studied wheat panel presented by the national and foreign varieties registered and currently used in Bulgaria is higher than that reported in previous studies for the Bulgarian, Belgian and South-Eastern European wheat [47,48]. However, in the present study it was higher in the Western and Central European varieties than that in the Bulgarian and the rest of the Eastern European ones. This is in contrast to the report by Todorovska et al. [47] on the GD values for the Bulgarian germplasm, consisting mostly of DAI, G. Toshevo varieties and lines, and Belgian germplasm collection (0.5834 vs 0.5273). However, the GD value for DAI germplasm collection (0.5834) reported by Todorovska et al. [47] is nearly identical with the GD value for the Northern Bulgarian wheat sub-population (0.5817) observed in the present study.
Our study showed that the Bulgarian wheat varieties bred in North and South Bulgaria are characterized by different allele number and GD across the studied microsatellite loci. The highest difference in respect to the number of alleles and GD values between both sets of wheat varieties was observed in locus Xcfa2086-2AL. Higher GD values in the Northern Bulgarian wheat were calculated for several other loci like Xgwm11-1B, Xgwm619-7AS, Xbarc170-4AL. On the other hand, different loci like Xgwm680-6BS and Xgwm513-4BS contributed to the higher level of GD in wheat from South Bulgaria (Supplemental material  Table S2). This could be explained by the distinct selection pressure provoked by the difference in the agro-ecological zones with distinct climatic patterns requiring different alleles or genes for local adaptation.

Population structure and differentiation of wheat varieties
Many studies have highlighted the importance of investigating the genetic structure of a population for effective exploitation of genetic diversity and broadening the genetic basis of modern varieties by selection of genotypes with desirable traits in further breeding programmes [22,54]. Analysis of the STRuCTuRE software output by the approach described by Evanno et al. [43] suggested that the most probable number of sub-populations in the studied collection of modern wheat cultivars is two. The distribution of the varieties across the resulting sub-populations showed a pronounced geographic pattern, separating those from Central and Western Europe from the majority of the Bulgarian and some of the varieties from Serbia. The analysis of the sub-populations at K = 3, prompted by the second peak on the Evanno graph, revealed a subdivision of the varieties originating from North and South Bulgaria into two distinct groups. A potential genetic structure consisting of three sub-populations was further supported by DApC and the AMOVA results, the latter showing a higher between-subpopulation variance at K = 3. Geographical patterns in the genetic structure of wheat were reported by other authors [4,55] studying panels of European wheat varieties. Le Couviour et al. [55] also observed a finer geographical separation with increasing the value of K from 2 up to 4. Our results are also in accordance with results of Tehseen et al. [56] reporting a better geographical clustering in the DApC analysis. It should be noted that in contrast to our results, Aleksandrov et al. [57], studying a panel of 179 Bulgarian bread wheat (Bulwheat population) consisting of 128 modern semi-dwarf cultivars and 51 old germplasm with tall stature using SNp markers, inferred a genetic structure influenced mainly by the temporal factor and only to a limited extend by a geographical pattern.
The applied distance-based cluster approach ( Figure  4) is in correspondence with the sub-populations of the studied panel of wheat with different geographical origins defined by the model-based approach at K = 2 and K = 3. The observed sub-clustering within the 2 main clusters reveals the effectiveness of the microsatellite markers in the assessment of genetic differentiation. It also clearly reflects the difference in the strategies for wheat improvement adopted in both breeding centres and seed houses located in the 2 eco-geographical regions (North and South Bulgaria) characterizing with continental, transitive-continental and Mediterranean climatic conditions determined by the positioning of Bulgaria at the meeting point of the Continental, Oceanic and Mediterranean air masses combined with the barrier effect of the Balkan mountain range.
The primary determinant for the group separation observed in our study could be the regional adaptation as reported in several studies on different wheat classes and traits [58,59] as well as the use of different genetic sources and breeding selection methods directed towards improvement of specific traits determined by the particular environmental conditions and market requirements. Many studies [4,47,52,60] have shown that some chromosome regions can contribute to population subgroup separation. Among the loci encoding for important agronomic traits, those responsible for plant height (Rht), day-length insensitivity (ppd1) and flowering time (Vrn) have a major impact on the genetic separation of global wheat population [4]. Our study showed that the observed allele variation at locus Xgwm 261(2DS), which is closely located to the dwarfing gene Rht8, contributed to the genetic separation of the studied wheat population. The allele of 192 bp contributing to the reduced plant height prevails in the South-Eastern European wheat [4,47]. In our study this allele was found mostly in Bulgarian and Serbian varieties belonging to Cluster 1 and 3 in the structure analysis at K = 3. In contrast, almost all foreign wheat varieties with origin from Western and Central Europe belonging to Cluster 2 carried the allele of 174 bp.
The climatic barrier formed by the Balkan mountain range determines different agro-climatic zones in northern and southern Bulgaria. As a result, the breeding of wheat in North Bulgaria is directed towards development of more cold tolerant varieties, whereas in South Bulgaria it is mostly targeted at the development of drought tolerant ones. The grouping of varieties of IpGR, Sadovo in close proximity within a distinct sub-cluster of the second main cluster provides evidence for narrowing the genetic basis of the modern Southern Bulgarian varieties as it has been reported by Aleksandrov et al. [57]. Several drought tolerant varieties like Gines (No16), Katya (No4) and Gizda (No27) of IRGR-Sadovo were grouped next to each other forming a branch within one of the sub-clusters, which is a confirmation of the results from the physiological analysis on the drought tolerance at the early stage performed by Vassileva et al. [61]. The observed grouping of the varieties of DAI, G. Toshevo in several sub-clusters that surrounded the sub-cluster formed mainly of varieties developed in IpGR, Sadovo reflects the higher diversity pattern in this sub-population as a result of the use of more diverse genetic sources for crop improvement in the last 30 years with origin from Romania, Russia, ukraine, CIMMyT and ICARDA.
Such grouping is in correspondence with the breeding programmes in the two agro-ecological regions in Bulgaria and is determined by the predominance of specific alleles in some loci related to abiotic stress conditions (cold, drought) and productivity. Our study revealed that several microsatellite loci significantly contributed to this separation like Xgwm631-7AS, located near to qTL for pH [62] and grain yield [63], Xcfa2086-2AL located near to the qTLs for some agronomic traits (days to heading, ear peduncle length, grain yield) expressed under different environmental conditions in hexaploid and durum wheat [64,65], for TKW like Xgwm513-4BS [66] as well as qTL for biofortification in wheat [67] and to the meta qTL region associated with root architecture of bread wheat [68] on group 3 chromosomes and to qTL for protein content and glutamine synthetase activity [69].

Phenotypic diversity
In addition to the molecular markers, we used six quantitative phenotypic traits to analyze the genetic diversity in the studied wheat collection. According to the results of the fitted mixed model there was a substantial genetic variability in terms of the studied traits demonstrated by values of the genotypic component of variance. The clustering analysis separated the two sub-populations inferred by the model-based approach relatively well along the first composite axis according to the grain yield and its components. These results present the main trends in the realization of the genetic potential of the varieties from the working collection. The established differences are a result primarily of the selection pressure in the breeding process. It is determined by the priorities of the respective programmes and by the limiting factors of the targeted ecological and geographic regions. The varieties from Austria and Germany were with the longest time to heading, followed by the ones from Croatia and France. The variation of the latter group was within a wide range. The variety Avenue headed earliest, and Sorrial, Toskani, Solveig and Renan, latest. Considerable variability according to this trait was found between the accessions from Bulgaria and Serbia. The field observations showed they had similar phenological development. Among the new Bulgarian bread wheat varieties included in the study, Kalina was with the earliest date to heading, whereas Dragana was with the latest.
A considerable variation in the plant height among the groups of varieties with different geographical origin was observed. The variety from Croatia was with the shortest stem. Even its maximum value was much below the mean height of the rest of the groups. A low value of this trait was also observed for the varieties from France. The varieties from Austria were with the highest plant height. The variation in this trait was more strongly pronounced in the group of the Bulgarian varieties. The plant height of the majority of them was within a wide range (82 − 109 cm). Some varieties, such as Medeya, Geya 1, Kalina and yoana, were with the shortest stem, whereas Sadovo 552 was with the tallest one.
Significant variation was also observed in the structural components of yield; it was the highest with regard to the number of grains per ear. The Bulgarian varieties had the highest predicted values for the number of grains per ear and for absolute weight (TKW), which compensated for their lower number of productive tillers. There was a similar trend in the group of the Serbian varieties. The varieties from Germany had a balanced combination of these traits, and to a lesser degree, the ones of Austria, which were characterized by lower absolute grain weight.
When comparing the range of variation in productivity, we found that in each group there were varieties with very high genetic potential realized under the conditions of Dobrudzha region. The lowest mean yield was observed in some Eastern (Croatia) and Western European (Austria) varieties. This was not incidental since the predominant part of them were from the group of wheats with good quality and a comparison is inappropriate. The variation in the varieties from Germany was due to a similar reason. The group of French varieties was presented by the highest number of medium wheat types and medium wheat types with increased dough strength. During the three crop seasons of evaluation in the Dobrudzha region (North Bulgaria), the Bulgarian wheat varieties that produced the highest yields were pryaspa, Koprinka, Neda, Kiara and Dragana, whereas among the foreign varieties, Basmati and Santorin.

Conclusions
The present study revealed the extent of genetic diversity and the presence of genetic structures using SSR markers in a panel of 117 modern winter wheat varieties with origin from Western, Central and Eastern European countries which are currently used in both breeding programmes and grain production in Bulgaria. The genetic diversity (GD) was higher in Western and Central European wheat varieties in comparison to Bulgarian and Eastern European ones. Within the Bulgarian population, GD was slightly higher for the wheat sub-population from North Bulgaria in comparison to that of South Bulgaria. Both, the distance-based cluster analysis and the model-based structure analysis revealed an eco-geographical pattern of distribution of the studied modern winter varieties. The clustering of the studied varieties in the biplot analysis according to the phenotypic traits reflects the differences in adaptation to the agro-climatic conditions of North Bulgaria determined by the observed eco-geographical pattern of origin. The study provides information for selection of appropriate parents for further breeding purposes and is a basis for association mapping for important agronomic traits expressed under different environmental conditions.

Disclosure statement
No potential conflict of interest was reported by the authors.

Data availability statement
All data that support the findings reported in this study are available from the corresponding author upon reasonable request.