Molecular characterisation of the oldest domesticated Turkish einkorn wheat landraces with simple sequence repeat (SSR) markers

Abstract Einkorn (Triticum monococcum L. ssp. monococcum) is an ancient diploid wheat species with many useful traits and is used as a model for wheat gene discovery. In this research, a total of 41 diploid and tetraploid wheat accessions were tested using simple sequence repeat (SSR) markers. A total of 33 genotypes of T. monococcum along with four genotypes each of the tetraploid wheats T. dicoccon and T. durum were used as plant material. The analysis used 10 polymorphic markers, which yielded a total of 41 alleles (an average of 4.1 alleles per locus), to explore the level of genetic variation. Several diversity measures, namely the effective number of alleles (Ne), gene diversity (h), Polymorphic Information Content (PIC) and Shannon’s information index (I), were calculated for the 10 ‘A’ genome wheat SSR markers. The results showed narrow variation among einkorn genotypes, and Analysis of Molecular Variance (AMOVA) attributed 66% of the variation to differences among populations. The structure analysis divided the whole germplasm into two populations. A dendrogram constructed with the unweighted pair group method with arithmetic averages (UPGMA) to determine genetic similarities separated tetraploid wheat from the other genotypes/accessions. Principal Coordinate Analysis (PCoA) supported the UPGMA and structure clustering by differentiating the diploid and tetraploid wheat. These findings will help in understanding the genetic relationships among these wheat accessions and in using them in future breeding programs.


Introduction
Diploid wheat (Triticum monococcum L., 2n = 14), also known as Einkorn or Siyez in Turkey, and tetraploid wheat (T. dicoccon) are the two oldest hulled wheat species. They were first domesticated in Anatolia (present-day Turkey), and both are considered to be among the ancestors of modern wheat. They are also regarded as a bridge between cultivated and wild wheat species [1]. Both einkorn and emmer wheat populations are still cultivated in rural areas of several provinces in Turkey.
Siyez was domesticated from the wild progenitor T. boeoticum [2]. It originated in the Neolithic period, ~10,000 years ago [3], in the Karaca mountains in the Tigris-Euphrates valley of the Fertile Crescent, in the south-east of modern-day Turkey [4][5][6]. This area is considered the main area of domestication, from which the species later spread to the Caucasus, Turkmenistan, the Middle East, the Balkans, Central Germany and the Mediterranean region of Europe (Italy, Spain) [6,7].
Tetraploid wheat (T. dicoccum), named Gernik in Turkey, was domesticated from the wild Emmer progenitor (T. dicoccoides Koern.) and was the first domesticated wheat species [8]. Tetraploid Emmer wheat was domesticated at a highly debated site, the upper Jordan River valley, either in the same era as T. monococcum or slightly later [9].
Studies have shown that Turkish populations of T. dicoccoides wheat are morphologically and phenologically similar in population structure to the populations mentioned above, as verified by allozyme-based similarity tests [10,11]. T. dicoccum (Gernik) populations were first found at the Cayonu excavation [1].
The end products from T. monococcum and T. dicoccon can be considered organic and healthy foods because of their high nutritional value, resistance to a number of pests and diseases, and lodging and drought resistance, with acceptable yields on poor soils where other wheat species fail to grow [12,13]. Besides being a good source of many essential alleles for wheat, T. monococcum also serves as a source of variation present only in this species.
Unlike T. dicoccum, it also provides genes for resistance to leaf rust and for adaptation to low-input agriculture, with high-quality traits for the uptake of Zn, Fe, Mn, Cu, Mg, P and microelements, as well as genes responsible for the formation of carotenoids, tocols, conjugated phenolics, alkylresorcinols and phytosterols, together with rare glutenin subunits (HMW-GS) [14][15][16][17][18][19][20]. It also provides genes that prevent pre-harvest sprouting [21] and increase zinc uptake efficiency [22][23][24]. The variation in numerous traits of diploid Siyez wheat is therefore worth exploiting for genetic improvement [25][26][27], because the species has remained almost untouched by conscious human selection from the Bronze Age to the present and has not suffered a reduction of its diversity during domestication [28]. Wheat breeders around the world are continuously working to improve grain yield together with quality and resistance to various biotic and abiotic stresses [16,29]. Estimating genetic variation and modes of inheritance is crucial for starting productive wheat breeding [30].
The T. monococcum genome is under-represented in hexaploid wheat. Exploiting the genetic diversity of the T. monococcum genome could provide a novel source of additional traits for the breeding and genetic improvement of tetraploid and hexaploid varieties [26,31].
In line with the above, there is a need to identify and compare genetic variation in Siyez populations at its first centre of diversity (Karaca Mountains) and in the first areas of its spread within Turkey, such as the provinces of Kastamonu, Bolu, Sinop, Balikesir, Bilecik and Cankiri [32]. Knowledge of the distribution and genetic variation of endemic Siyez wheat populations is fundamental for breeders seeking to enrich the genetic base of modern wheat [33,34], as these populations might provide useful sources of genetic variability for several beneficial traits [29]. Therefore, knowledge of the genetic diversity in a germplasm collection will significantly benefit wheat breeding [26].
Germplasm characterization is considered a prerequisite for breeding because it provides novel information that can be used in future breeding activities [35,36]. Molecular markers provide opportunities to detect genetic diversity precisely among cultivated and wild wheat species with different ploidy levels [37][38][39]. Studies have shown that highly polymorphic Simple Sequence Repeats (SSRs), or microsatellite markers, with their multi-allelic nature and genome specificity, are the most suitable markers for studying genetic diversity and evolution in wheat populations [40][41][42]. The genetic variability of diploid and tetraploid wheat from the Karaca mountains and of the many varieties growing in diverse regions of Turkey has not been well investigated to date [43,44]. Therefore, this study aimed to identify the molecular characteristics of Gernik (T. dicoccum, tetraploid) and Siyez (T. monococcum, diploid) wheat genotypes of diverse origin in Turkey.

Plant material
A total of 32 Siyez (T. monococcum) landraces from 11 geographically different provinces (Karabük, Samsun, Eskişehir, Çankırı, Kastamonu, Sinop, Aksaray, Nevşehir, Gaziantep, Kars and Kayseri), along with 4 landraces of Gernik (Triticum dicoccum), were collected from south and north-western Turkey. Four commercial cultivars of Triticum durum were obtained from the Central Field Crops Research Institute, Ankara, Turkey, and registered in the National Gene Bank of the same Institute. These accessions were used to assess the phylogenetic relationships among them (Table 1 and Figure 1).

SSR markers analysis
The seeds of each wheat accession were sown in pots under greenhouse conditions. Genomic DNA was extracted from a bulk of 5 young, fresh leaves (three-week-old seedlings) using a Roche Magna Lyser homogenizer and following the protocol of the 'Biotecon Foodproof D.N.A. Isolation Kit'. The DNA concentrations were quantified with a NanoDrop (NanoDrop™ OneC) and diluted to 20 ng/μl for further use in PCR.
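The dilution to the 20 ng/μl working concentration follows the usual C1·V1 = C2·V2 relation. The short Python sketch below illustrates the arithmetic only; the stock reading of 100 ng/μl and the 50 μl final volume are hypothetical values chosen for illustration and are not taken from this study.

```python
def dilution_volumes(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (stock volume, diluent volume) in microlitres from C1*V1 = C2*V2."""
    if stock_ng_per_ul < target_ng_per_ul:
        raise ValueError("stock is already more dilute than the target")
    stock_volume = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return stock_volume, final_volume_ul - stock_volume


# Hypothetical NanoDrop reading (100 ng/ul) diluted to the 20 ng/ul working stock.
stock_ul, diluent_ul = dilution_volumes(100.0, 20.0, 50.0)
print(f"{stock_ul:.1f} ul DNA + {diluent_ul:.1f} ul diluent")
```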
Ten SSR markers from the chromosomes of the wheat A genome, described by Röder et al. [42], were used for polymerase chain reaction (PCR) amplification to assess the genetic diversity among the genotypes. These primers were selected because they produce strong bands with very high discrimination and polymorphism (Table 2). Each PCR amplification, in a total volume of 20 μL, contained 20 ng/μL of genomic DNA template from the wheat accessions, PCR Master Mix (2×) used according to the instruction manual, and 2 μmol/L of each primer. The PCR conditions were an initial denaturation at 94 °C for 3 min, followed by 35 cycles of denaturation at 94 °C for 1 min and annealing at 50-65 °C (depending on the SSR marker) for 1 min, ending with an extension at 72 °C for 7 min. The PCR fragments were separated by electrophoresis for 1.5 h in a 3% (w/v) gel of Nusieve 3:1 agarose and two-thirds MetaPhor agarose (Cambrex Bio Science, Rockland, ME, USA), using 1 × TBE buffer containing 1 mL RedSafe Nucleic Acid Staining Solution for detection of the fragments. After electrophoresis, the gels were visualized and photographed under a UV imager (Kodak GelLogic 200 Image System) (Figure 2).

Data analysis
Only strong, clear and reproducible amplification products were considered for analysis. Each informative allele was scored individually as 1 for presence and 0 for absence. To reveal the genetic variation among diploid and tetraploid wheats, diversity indices including the effective number of alleles (Ne), Shannon's Information Index (I) and gene diversity (h) were calculated for each locus using GenAlEx v6.5 [45]. The same software was also used for the analysis of molecular variance (AMOVA) and principal coordinate analysis (PCoA). The mean polymorphism information content (PIC) of each selected primer was calculated as described by Roldan-Ruiz et al. [46]. Jaccard's coefficient was used to calculate the genetic similarities (GS) for pair-wise comparison of genotypes based on SSR data [48]. A similarity matrix was generated according to Simple Match (SM) coefficients [49]. The data in the similarity matrix were used for cluster analysis with UPGMA (Unweighted Pair-Group Method with Arithmetic averages), and the dendrograms were created using the SAHN module of NTSYS-PC 2.02e software [47]. The cophenetic correlation coefficient (r) between the observed distances and the dendrogram was 0.75, indicating a good fit.
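The per-locus indices and the two similarity coefficients named above can be illustrated with a minimal Python sketch. It uses the standard textbook definitions (Ne = 1/Σp², h = 1 − Σp², I = −Σp ln p, Jaccard and simple matching on 0/1 scores); how GenAlEx and NTSYS-PC handle these internally may differ in detail, and the allele frequencies and band profiles are hypothetical, not data from this study. PIC is left out here because the formula of Roldan-Ruiz et al. is cited but not restated in the text.

```python
import math


def diversity_indices(freqs):
    """Return (Ne, h, I) for one locus from allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    ne = 1.0 / homozygosity                                   # effective number of alleles
    h = 1.0 - homozygosity                                    # gene diversity
    shannon = -sum(p * math.log(p) for p in freqs if p > 0)   # Shannon's index
    return ne, h, shannon


def jaccard(a, b):
    """Shared bands divided by bands present in either genotype (0-0 ignored)."""
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return both / either if either else 0.0


def simple_matching(a, b):
    """Matching scores (1-1 and 0-0) divided by all scored alleles."""
    return sum(1 for x, y in zip(a, b) if x == y) / len(a)


# Hypothetical data: one locus with three alleles, and two 0/1 band profiles.
ne, h, shannon = diversity_indices([0.5, 0.25, 0.25])
g1 = [1, 0, 1, 1, 0, 0]
g2 = [1, 0, 0, 1, 0, 1]
print(round(ne, 3), round(h, 3), round(shannon, 3))
print(jaccard(g1, g2), round(simple_matching(g1, g2), 3))
```

The two coefficients differ only in whether shared absences (0-0) count as agreement, which is why they can give different similarity values for the same pair of genotypes.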
Model-based Bayesian clustering was performed in STRUCTURE 2.3.4 [50]. We followed the criteria suggested by Evanno et al. [51] and plotted the number of clusters (K) against ΔK, a statistic based on the log probability of the data relative to its standard deviation, to determine a suitable number of clusters (K; the number of subpopulations) in the STRUCTURE analysis.

Results
To explore the level of variation, diversity indices were also calculated on a population basis (Table 4). T. monococcum was the most diverse population, with the highest values for the calculated diversity indices: polymorphic loci (6.98), number of effective alleles (0.243), expected heterozygosity (0.156) and unbiased expected heterozygosity (0.158). T. durum showed the least variation.
To explore the relationships among the wheat populations, the genetic distance was also calculated; the maximum genetic distance was found between T. monococcum and T. durum (Table 5).
Analysis of molecular variance (AMOVA) was performed to partition the variation, and the results showed that most of the variation (66%) in the studied germplasm occurred among populations (Table 6).
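How a one-level AMOVA partitions variation into among- and within-population components can be sketched in Python as follows. This is an illustrative implementation of the general sums-of-squares approach applied to 0/1 band profiles, not a reproduction of the GenAlEx routine; the squared distance (number of mismatching scores), the truncation of a negative among-population component to zero, and the toy profiles are all assumptions made for the example.

```python
def amova(populations):
    """One-level AMOVA on 0/1 profiles: returns (% among, % within, PhiPT)."""
    def sq_dist(a, b):
        # Assumed squared distance for binary data: number of mismatching scores.
        return sum(1 for x, y in zip(a, b) if x != y)

    def sum_sq(items):
        # Sum of squares = (sum of pairwise squared distances) / group size.
        n = len(items)
        total = sum(sq_dist(items[i], items[j])
                    for i in range(n) for j in range(i + 1, n))
        return total / n

    everyone = [g for pop in populations for g in pop]
    n_total, n_pops = len(everyone), len(populations)
    if n_pops < 2 or n_total <= n_pops:
        raise ValueError("need at least two populations and replicated sampling")
    ss_within = sum(sum_sq(pop) for pop in populations)
    ss_among = sum_sq(everyone) - ss_within
    ms_within = ss_within / (n_total - n_pops)
    ms_among = ss_among / (n_pops - 1)
    # n0 corrects the among-population mean square for unequal population sizes.
    n0 = (n_total - sum(len(p) ** 2 for p in populations) / n_total) / (n_pops - 1)
    var_among = max((ms_among - ms_within) / n0, 0.0)  # negative estimates set to 0
    var_within = ms_within
    total_var = var_among + var_within
    if total_var == 0:
        raise ValueError("no variation in the data")
    return (100 * var_among / total_var,
            100 * var_within / total_var,
            var_among / total_var)


# Hypothetical profiles for two populations (not data from this study).
example = [
    [(0, 0, 0, 1), (0, 0, 0, 0), (0, 0, 1, 0)],
    [(1, 1, 1, 1), (1, 1, 0, 1)],
]
among_pct, within_pct, phi_pt = amova(example)
print(round(among_pct, 1), round(within_pct, 1), round(phi_pt, 3))
```

A large among-population percentage, as in this toy case, arises when profiles differ far more between groups than within them.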
The UPGMA dendrogram based on the genetic similarity coefficient grouped all genotypes into two main clusters at a similarity index of 0.15 (Figure 3), corresponding to the diploid and tetraploid wheat species according to A genome phylogenetics. Cluster I grouped all tetraploid wheats, including the T. dicoccum and T. durum genotypes. Cluster II grouped all genotypes belonging to T. monococcum.
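The UPGMA procedure and the cophenetic correlation used to judge the fit of a dendrogram can be sketched in plain Python. The code below is a minimal illustration of the algorithm, not the SAHN module of NTSYS-PC; the distance matrix (taken as 1 − similarity) for four genotypes is hypothetical.

```python
import math
from itertools import combinations


def upgma(dist):
    """Cluster a symmetric distance matrix; return the cophenetic distances."""
    n = len(dist)
    clusters = {i: [i] for i in range(n)}            # cluster id -> member indices
    d = {(i, j): dist[i][j] for i, j in combinations(range(n), 2)}
    cophenetic = {}
    next_id = n
    while len(clusters) > 1:
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        # Every pair split across the two merged clusters joins at this height.
        for i in clusters[a]:
            for j in clusters[b]:
                cophenetic[(min(i, j), max(i, j))] = height
        size_a, size_b = len(clusters[a]), len(clusters[b])
        merged = clusters.pop(a) + clusters.pop(b)
        new_d = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            # Size-weighted mean equals the plain average over all member pairs.
            new_d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = merged
        next_id += 1
    return cophenetic


def cophenetic_correlation(dist, cophenetic):
    """Pearson correlation between observed and dendrogram-implied distances."""
    pairs = sorted(cophenetic)
    x = [dist[i][j] for i, j in pairs]
    y = [cophenetic[p] for p in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return cov / math.sqrt(sxx * syy)


# Hypothetical distances: genotypes 0-1 and 2-3 form two tight pairs.
D = [
    [0.00, 0.10, 0.80, 0.90],
    [0.10, 0.00, 0.85, 0.80],
    [0.80, 0.85, 0.00, 0.20],
    [0.90, 0.80, 0.20, 0.00],
]
coph = upgma(D)
r = cophenetic_correlation(D, coph)
print(round(coph[(0, 2)], 4), round(r, 3))
```

A value of r close to 1 means the dendrogram distorts the original pairwise distances very little.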
To identify the most likely number of populations (ΔK), the output files from STRUCTURE were uploaded to the online software Structure Harvester, which confirmed the presence of two populations (K = 2; Figure 4). All 33 accessions of T. monococcum were grouped in Cluster I (red), while T. dicoccon and T. durum were present in Cluster II (green) (Figure 5). PCoA was performed to confirm the results of UPGMA and STRUCTURE, and the results showed that the Einkorn genotypes formed a separate group, while T. durum and T. dicoccum were positioned close to each other (Figure 6).
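The ΔK criterion used to choose the number of populations can be illustrated with a short Python sketch. It computes ΔK as the absolute second-order difference of the mean ln P(D) across successive K, divided by the standard deviation of ln P(D) at K. Using the per-K mean in this way is a simplifying assumption of the sketch, and the ln P(D) values are invented for illustration; they are not the STRUCTURE output of this study.

```python
from statistics import mean, stdev


def delta_k(lnp_runs):
    """lnp_runs maps K -> list of ln P(D) values from replicate runs.

    Assumes consecutive K values; Delta K is undefined for the smallest
    and largest K and when replicate runs at a given K are identical.
    """
    ks = sorted(lnp_runs)
    result = {}
    for prev_k, k, next_k in zip(ks, ks[1:], ks[2:]):
        second_diff = abs(mean(lnp_runs[next_k])
                          - 2 * mean(lnp_runs[k])
                          + mean(lnp_runs[prev_k]))
        result[k] = second_diff / stdev(lnp_runs[k])
    return result


# Hypothetical replicate ln P(D) values for K = 1..4.
runs = {
    1: [-1000, -1002, -998],
    2: [-800, -801, -799],
    3: [-790, -795, -785],
    4: [-785, -790, -780],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

The K with the sharpest peak in ΔK is taken as the most likely number of clusters; here the toy values peak at K = 2.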

Discussion
T. monococcum (Siyez) from the Karacadağ region and T. dicoccum (Gernik) from the Kars region are considered the most ancient wheats first domesticated in Turkey [1]. The genetic diversity of the gene pool in this region, the first centre of wheat domestication, is essential for deciphering the chain of events during adaptation to unfavourable biotic stresses such as diseases and pests [55]. These biotic stresses are known to lower grain quality traits in the breeding of modern cultivars [56,57]. This study investigated the inter- and intra-genotypic polymorphism among einkorn, T. dicoccum and T. durum genotypes by comparing their loci with A genome-based markers. There are very few studies on Turkish hulled einkorn and emmer wheat at the molecular level [9,55]. The microsatellite markers selected in this study are codominant markers characterized by high polymorphism and wide distribution, with the advantages of high sensitivity and reproducibility. These markers are considered powerful and informative tools in many plant species, including wheat, because they detect variability accurately in genetic diversity estimates [58][59][60][61][62]. The 10 most polymorphic SSR markers were applied for the clustering of genotypes in this research, as in Gurcan et al. [55], who also used the 10 most polymorphic SSR markers, including Xgwm 312. All selected markers had PIC values of 0.69-0.94, with an average of 0.86. These results showed a higher PIC value for A genome-derived markers than the value of 0.71 noted by Ahmed et al. [58]. Previous studies show that a marker with PIC > 0.5 has maximum diversity, indicating high allelic diversity among germplasms, whereas a marker with PIC < 0.25 has minimum diversity [63,64]. Thus, the average PIC value of 0.86 indicates a sufficient amount of genetic variability and diversity among accessions [65].
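The PIC thresholds discussed above can be illustrated with a small Python sketch. It uses the widely cited allele-frequency formula PIC = 1 − Σp_i² − Σ_{i<j} 2p_i²p_j², which is an assumption of this example and not necessarily the calculation used in this study (which follows Roldan-Ruiz et al.); the label "moderate" for values between the two thresholds and the allele frequencies are likewise hypothetical.

```python
from itertools import combinations


def pic(freqs):
    """PIC of one marker from its allele frequencies (assumed to sum to 1)."""
    sum_sq = sum(p * p for p in freqs)
    cross = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - sum_sq - cross


def informativeness(value):
    """Classify a PIC value with the 0.5 and 0.25 thresholds cited in the text."""
    if value > 0.5:
        return "high"
    if value < 0.25:
        return "low"
    return "moderate"  # label for the middle band is an assumption


# Hypothetical marker with four equally frequent alleles.
value = pic([0.25, 0.25, 0.25, 0.25])
print(round(value, 6), informativeness(value))
```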
This study demonstrated narrow genetic diversity within the T. monococcum and T. dicoccum accessions, with values close to 0.4 and 0.3, respectively. The A genomes of T. monococcum and T. dicoccum are significantly different and have the least similarity, in agreement with Korzun et al. [66] and Jing et al. [26].
The results confirmed that these plant species, grown over a number of years in different regions, did not show significant genetic changes compared to their parent, which is in agreement with Zohary and Hopf [34]. Various diversity indices were calculated to explore the level of genetic diversity in diploid and tetraploid wheat. Einkorn wheat showed higher values for all diversity indices than tetraploid wheat. Moreover, the genetic distance among diploid and tetraploid wheat was also calculated, and the minimum genetic distance was found between the tetraploid wheats (T. dicoccum and T. durum). Analysis of molecular variance (AMOVA) was performed by considering within- and between-population components. The AMOVA results revealed higher variation (66%) among the populations and a lower level of variation within the populations. Because the diploid and tetraploid wheats were genetically distant from each other, this result reflects the low genetic similarity between the populations.
To explore the genetic relationships among the three wheat species, UPGMA-based clustering was performed. It clearly separated the diploid and tetraploid genotypes into two groups, and the members of each group were relatively closely related to each other. T. durum and T. dicoccum were grouped in the same cluster (Cluster I), reflecting their genetic similarity as tetraploid wheat species. A total of eight genotypes belonging to T. dicoccum and T. durum were present in Cluster I. This cluster was further divided into two subgroups. Subgroup I consisted of three samples: two Gernik belonging to T. dicoccum and one, Emin bey, belonging to T. durum. Subgroup II comprised Kunduru, Kiziltan and C-1252, belonging to T. durum, together with Gernik (Kars) and Gernik (Sinop), belonging to T. dicoccum.
In this study, a total of 33 einkorn genotypes were present in Cluster II, which was further subdivided into five subgroups. Subgroup I was the largest subgroup in this cluster. All 33 einkorn genotypes used in this study belonged to six different groups: Çankırı, ESK-Collect (Eskisehir), Karabuk, and Kars Kastamonu, Samsun, Sinop region. Furthermore, 22 of these 33 genotypes could be accommodated in two subgroups with a similarity of 0.74. Our results therefore show that these genotypes are highly similar to each other. Maximum genetic similarity was observed for Siyez 26-Siyez 35 and Siyez 5-Siyez 6, from the Sinop and Samsun regions, respectively. Indeed, these two regions are geographical neighbours.
STRUCTURE software was used to investigate the population structure of the three wheat species, and the whole germplasm was grouped into two populations. The STRUCTURE algorithm supported the findings of UPGMA by clearly separating the diploid and tetraploid genotypes. All 33 einkorn genotypes were present in Cluster I, while T. dicoccon and T. durum were clustered together in Cluster II (Figure 5).
Principal Coordinate Analysis (PCoA) graphically represents the similarity/dissimilarity between individuals or populations. In this study, PCoA was performed to check the clustering of the UPGMA and STRUCTURE algorithms. PCoA differentiated the diploid and tetraploid wheat and supported the findings of both algorithms (Figure 6). All einkorn genotypes formed a separate group, while T. dicoccum and T. durum were positioned close to each other. The genetic distance among the populations also revealed a low genetic distance (0.061) between T. dicoccum and T. durum. Because the tetraploid wheats were genetically more similar to each other than to einkorn, they were placed close to each other by all three clustering approaches (UPGMA, STRUCTURE, PCoA).
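The way PCoA places genotypes along an axis from a distance matrix can be sketched in Python. The example below double-centres the squared distances (Gower's transformation) and extracts only the first principal coordinate by power iteration; this is a deliberate simplification relative to the full eigen-decomposition performed by packages such as GenAlEx, and the distances are hypothetical.

```python
import math


def first_principal_coordinate(dist, iterations=200):
    """Return (coordinates on axis 1, eigenvalue) for a distance matrix."""
    n = len(dist)
    # Gower double-centring of -0.5 * squared distances.
    a = [[-0.5 * dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in a]
    grand = sum(row) / n
    b = [[a[i][j] - row[i] - row[j] + grand for j in range(n)] for i in range(n)]
    # Power iteration for the dominant eigenvector of the centred matrix.
    v = [float(i + 1) for i in range(n)]  # arbitrary non-constant start vector
    eigenvalue = 0.0
    for _ in range(iterations):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("start vector lies in the null space; no axis found")
        v = [x / norm for x in w]
        eigenvalue = norm  # equals the dominant eigenvalue once v has converged
    return [math.sqrt(eigenvalue) * x for x in v], eigenvalue


# Hypothetical distances: two tight pairs (0-1 and 2-3) that are far apart.
positions = [0.0, 1.0, 10.0, 11.0]
D = [[abs(p - q) for q in positions] for p in positions]
coords, eig = first_principal_coordinate(D)
print([round(c, 3) for c in coords], round(eig, 3))
```

In this toy case the first axis alone separates the two pairs, which mirrors how a single dominant axis can split two well-differentiated groups.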

Conclusions
This study comprehensively explored the genetic diversity in diploid and tetraploid wheat. Einkorn wheat showed a higher level of genetic diversity for the calculated diversity indices, confirming its potential for wheat breeding. The AMOVA results revealed higher genetic variation among the two populations than within them. All three clustering approaches differentiated the diploid and tetraploid wheat based on their genetic makeup. We believe that the findings of this study will help in understanding the genetic relationships among various wheat species.

Data availability
All data that support the findings reported in this study are available from the corresponding author upon request.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This study was funded by the Biodiversity for Food and Nutrition (BFN) Project.