Evaluation of genetic variability and relatedness among eight Centaurea species through CAAT-box derived polymorphism (CBDP) and start codon targeted polymorphism (SCoT) markers

Abstract
Centaurea is a highly valuable genus of medicinal plants showing high diversification levels, especially within the Mediterranean basin, and is still traditionally recognized as a complicated taxon. So far, few studies utilizing molecular markers have been conducted on Centaurea spp. to better resolve its phylogeny and accurately assess its genetic diversity. Here, two functional marker systems, start codon targeted (SCoT) polymorphism and CAAT box-derived polymorphism (CBDP), were implemented to assess the genetic diversity among eight wild Centaurea species in Egypt. Seventeen SCoT and 19 CBDP primers generated 197 and 179 bands, respectively. Of these, 158 (80.2%) and 131 (73.18%) were polymorphic amplicons, with an average of 9.29 and 6.89 polymorphic amplicons per primer, respectively. SCoT primers exhibited higher values than CBDP primers for percentage of polymorphism, percentage of heterozygosity, effective multiplex ratio, discriminating power and resolving power. In contrast, no significant differences were observed between SCoT and CBDP in polymorphism information content, marker index and mean heterozygosity. The UPGMA (unweighted pair group method with arithmetic mean) dendrogram of the combined data classified the Centaurea species into two major clades; the first comprised two species, whereas the second grouped five species. C. pallescens remained separate as the most diverged among the eight Centaurea species. Principal component analysis (PCA) revealed results highly similar to those obtained from the cluster analysis. Ultimately, our results represent the first report utilising two gene-targeting marker systems as powerful techniques for assessing the genetic variability and relatedness among these eight valuable Centaurea species.


Introduction
Centaurea L. is the fourth largest genus in the Asteraceae family, including more than 500 species worldwide, particularly in the Mediterranean region [1]. Approximately 17 species of the genus Centaurea grow in Egypt [2,3]. Centaurea species are of strong interest in phytochemical and biological research due to their medicinal value. They are among the most popular plants in folk medicine for the treatment of cancer and microbial infections. In addition, Centaurea spp. have been shown to have anti-diabetic, diuretic, antimalarial, anti-inflammatory, anti-pyretic, analgesic, anti-platelet, wound healing, anti-ulcerogenic, hepatoprotective, anti-plasmodial, cytotoxic, anti-proteasomal, antioxidant, anti-bacterial, anti-fungal, and antirheumatic properties [4,5]. Several reports have shown the presence of compounds belonging primarily to the classes of sesquiterpene lactones, flavonoids and lignans. Centaurea has also been the subject of several phytochemical characterizations because of its active constituents, particularly sesquiterpene lactones [6,7] and flavonoids [8]. The curative properties of Centaurea species may be due to their wealth of bitter crystalline unsaturated lactones [9].
The species Centaurea alexandrina, Centaurea calcitrapa, Centaurea eryngioides, Centaurea glomerata, Centaurea pallescens, Centaurea pumilio, Centaurea scoparia and Centaurea sinaica are of great biological importance. C. alexandrina extract was suggested to have potential anti-inflammatory, analgesic, hypoglycemic and anticancer activities and was found to be active against Pseudomonas aeruginosa, a pathogen associated with chest infection [10,11]. C. calcitrapa was reported to have hypoglycemic and antimicrobial activity, and its seed extracts showed a potential ability for milk clotting [12]. C. eryngioides strongly inhibits the growth of the malaria parasite [13]. C. glomerata and C. pallescens have been documented to have antioxidant activity [14,15]. C. pumilio extract has antimicrobial activity against strains that cause skin infection, suggesting its potential to maintain healthy skin [16]. Also, extracts from C. scoparia and C. sinaica have potential antitumor activity against human carcinoma cell lines [17,18].
The genus Centaurea still has many taxonomical problems. The systematics and the phylogenetic analysis of Centaurea have changed dramatically during the past two decades [1,19]. To date, molecular analyses of the genus Centaurea have not yielded a well-resolved phylogeny. Accordingly, it is essential to use modern molecular marker systems to improve the taxonomy of the genus. In the last few years, novel molecular marker techniques have been developed and used in various plant genetic studies [20]. Many new and promising alternative markers have emerged, called gene-targeting markers or functional markers [21].
The Start Codon Targeted (SCoT) technique is a modern DNA-marker system that targets the conserved sequences surrounding the ATG start codon. It employs a single primer as both the forward and the reverse primer [22]. As the region flanking the ATG start codon is highly conserved across a wide range of plant species, the SCoT marker system is valuable for generating DNA markers in a vast range of plant species [23][24][25]. Recently, a gene-targeted marker system called CAAT box-derived polymorphism (CBDP) has emerged as one of the latest and easiest-to-implement functional marker techniques. The CBDP marker system is designed to specifically target the CAAT box region of the promoters upstream of genes in plant genomes [26]. The CAAT box has a distinct pattern of nucleotides with the consensus sequence GGCCAATCT, is located approximately 80 bp upstream of the start codon of genes, and is a key player during the transcription of eukaryotic genes. Furthermore, the CBDP system is beneficial for several downstream applications in plant molecular genetics, given that recombination between marker and gene is lower than with Random Amplified Polymorphic DNA (RAPD), Simple Sequence Repeat (SSR) or Inter Simple Sequence Repeat (ISSR) markers [27].
Accordingly, our study focused on investigating the genetic diversity and genetic relationships among eight medicinal herbs belonging to the genus Centaurea, namely C. alexandrina, C. calcitrapa, C. eryngioides, C. glomerata, C. pallescens, C. pumilio, C. scoparia and C. lipii (Volutaria lippii), using SCoT-PCR together with CBDP-PCR as gene-targeting marker systems. Combining the two marker systems is pivotal to identify, classify and authenticate these valuable species.

Plant materials
Young fresh leaves from eight Centaurea species (three independent individuals for each species) were collected during the flowering phase in May 2018 from six locations in Egypt (Table 1). A specimen of each collected Centaurea sample in the present study was deposited in the National Research Center's herbarium. The Centaurea species were taxonomically identified by Prof. Dr. Ibrahim El-Garf (Professor at the Botany Department, Faculty of Science, Cairo University, Cairo, Egypt).

Plant DNA extraction
Genomic DNA was extracted from fresh parts (100 mg) of the eight collected Centaurea species using a DNeasy Plant Mini Kit (QIAGEN, Santa Clarita, CA) following the manufacturer's protocol. The DNA was quantified with the Qubit 3.0 Fluorometer (Life Technologies) according to the manufacturer's instructions. The quality of the extracted DNA was determined from the A260/280 ratio as an indicator of DNA purity, and the DNA samples were also electrophoresed on a 0.7% agarose gel to check DNA integrity.

SCoT-PCR amplification
The SCoT-PCR analysis was conducted according to the procedure described by [24]. A set of 17 SCoT primers was initially screened against the eight Centaurea species (Table 2). The amplified fragments were electrophoresed in a 1.5% agarose gel and photographed using a Gel Doc XR+ Gel Documentation system (Bio-Rad, USA).

CBDP-PCR amplification
Nineteen CBDP primers (Table 3), used in the present study, were designed according to [26]. The amplified fragments were electrophoresed in a 1.5% agarose gel and photographed using a Gel Doc XR+ Gel Documentation system (Bio-Rad, USA).

Data analysis
For SCoT and CBDP data analysis, the amplified bands were scored manually. The bands were scored as absent (0) or present (1) to create the binary data matrix. A similarity matrix was constructed according to the Jaccard similarity coefficient (Jaccard, 1901). For the SCoT and CBDP marker systems and the combined data (SCoT + CBDP), dendrograms were constructed by cluster analysis using the unweighted pair group method with arithmetic mean (UPGMA). To evaluate the efficiency of the SCoT and CBDP primers, seven informative indices were calculated as described by [24].
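The scoring, similarity and clustering steps described above can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names (`jaccard`, `upgma`), the species labels and the toy 0/1 band profiles are all hypothetical, and the clustering is a plain stdlib implementation of UPGMA on distances defined as 1 minus the Jaccard similarity.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two 0/1 band profiles of equal length."""
    shared = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    either = sum(1 for x, y in zip(a, b) if x == 1 or y == 1)
    return shared / either if either else 0.0


def upgma(names, profiles):
    """Return the UPGMA merge order as (cluster_a, cluster_b, distance) tuples."""
    # Pairwise distances between the original profiles: 1 - Jaccard similarity.
    dist = {
        frozenset((i, j)): 1.0 - jaccard(profiles[i], profiles[j])
        for i, j in combinations(range(len(names)), 2)
    }

    def avg(c1, c2):
        # UPGMA: unweighted mean of all pairwise distances between two clusters.
        total = sum(dist[frozenset((i, j))] for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    clusters = [(i,) for i in range(len(names))]
    merges = []
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2), key=lambda pair: avg(*pair))
        merges.append((
            tuple(names[i] for i in c1),
            tuple(names[i] for i in c2),
            round(avg(c1, c2), 4),
        ))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Hypothetical toy data: four taxa scored for six bands (1 = present, 0 = absent).
names = ["sp1", "sp2", "sp3", "sp4"]
profiles = [
    [1, 1, 0, 1, 0, 1],
    [1, 1, 0, 1, 1, 1],
    [0, 1, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0],
]
merges = upgma(names, profiles)
for left, right, d in merges:
    print(left, right, d)
```

In this toy matrix the two most similar profiles (sp1 and sp2, sharing four of five bands) are joined first, which mirrors how the most similar species pair appears as the tightest branch of a dendrogram.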
To determine the efficiency of applied primers, the following parameters were calculated: Expected

SCoT analysis
Seventeen SCoT primers were screened to assess the eight Centaurea species' genetic relationships; all the tested SCoT primers produced reproducible and scorable bands in all species. The SCoT primers generated 197 scorable bands with an average of 11.58 bands per primer (Table 2).
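The percentage of polymorphism and the mean number of polymorphic amplicons per primer follow from simple ratios of the band counts. A minimal sketch using the counts reported in the abstract (197 bands, 158 polymorphic, 17 SCoT primers; 179 bands, 131 polymorphic, 19 CBDP primers); the function name `polymorphism_summary` is hypothetical:

```python
def polymorphism_summary(total_bands, polymorphic_bands, n_primers):
    """Summarise band counts for one marker system."""
    return {
        # Share of all scored bands that are polymorphic, in percent.
        "percent_polymorphism": round(100 * polymorphic_bands / total_bands, 2),
        # Mean number of polymorphic amplicons produced per primer.
        "polymorphic_per_primer": round(polymorphic_bands / n_primers, 2),
    }


scot = polymorphism_summary(total_bands=197, polymorphic_bands=158, n_primers=17)
cbdp = polymorphism_summary(total_bands=179, polymorphic_bands=131, n_primers=19)
print(scot)  # {'percent_polymorphism': 80.2, 'polymorphic_per_primer': 9.29}
print(cbdp)  # {'percent_polymorphism': 73.18, 'polymorphic_per_primer': 6.89}
```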

CBDP analysis
Nineteen CBDP primers were used to assess the genetic relationships among the eight Centaurea species. All the utilized CBDP primers generated reproducible patterns with clearly scorable bands in all species. The CBDP primers yielded 179 scorable amplicons, out of which 131 (73.18%) were polymorphic. Additionally, the resolving power values were found to be between 0.250 (primers CAAT-14 and CAAT-15) and 8.750 (primer CAAT-8) (Table 3).
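For orientation, resolving power is commonly computed per primer as the sum of band informativeness values, Rp = Σ Ib with Ib = 1 − 2 × |0.5 − p|, where p is the proportion of genotypes carrying the band. The sketch below assumes this widely used definition; the study cites [24] for its indices, so the exact formula applied there is an assumption here, and the function names and band patterns are hypothetical.

```python
def band_informativeness(band):
    """Ib = 1 - 2 * |0.5 - p|, with p the fraction of genotypes showing the band."""
    p = sum(band) / len(band)
    return 1 - 2 * abs(0.5 - p)


def resolving_power(bands):
    """Rp of one primer: sum of Ib over all bands it amplifies."""
    return sum(band_informativeness(band) for band in bands)


# Hypothetical primer with one band present in only one of eight species.
single_rare_band = [[1, 0, 0, 0, 0, 0, 0, 0]]

# Hypothetical primer with three bands: present in 4/8, 2/8 and 8/8 species.
three_bands = [
    [1, 1, 1, 1, 0, 0, 0, 0],  # Ib = 1.0 (most informative)
    [1, 1, 0, 0, 0, 0, 0, 0],  # Ib = 0.5
    [1, 1, 1, 1, 1, 1, 1, 1],  # Ib = 0.0 (monomorphic band)
]

print(resolving_power(single_rare_band))  # 0.25
print(resolving_power(three_bands))       # 1.5
```

Under this definition and with eight genotypes, every band contributes a multiple of 0.25, so a value of 0.250 corresponds to a primer whose only informative band is present in (or absent from) a single species.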

Analysis of molecular phylogeny and genetic similarities
The phylogenetic trees based on UPGMA analysis of the SCoT, CBDP and combined data were constructed for the eight Centaurea species (Figure 1). For the SCoT marker system, the dendrogram comprised one major cluster (grouping seven of the eight Centaurea species) in addition to another cluster that includes only C. pallescens (the most diverged among the eight Centaurea species). The major cluster was further divided into two subclusters; the first subcluster included only C. eryngioides, whereas the second subcluster was subsequently divided into two groups. The first group comprised four species (C. pumilio, C. alexandrina, C. scoparia and C. lipii), while the second group involved two species (C. glomerata and C. calcitrapa) (Figure 1A). Principal component analysis (PCA) of the SCoT data gave results highly consistent with the grouping obtained from the SCoT dendrogram (Figure 2A). The SCoT PCA plot revealed a clustering topology comparable to that obtained by the cluster analysis except for C. pumilio, which was closer to C. lipii than to C. alexandrina.
For the CBDP marker system, the dendrogram was divided into two main clusters; the first cluster involved two species (C. alexandrina and C. calcitrapa). The second cluster grouped the remaining six Centaurea species and was subdivided into two subclusters: the first subcluster comprised two species (C. eryngioides and C. lipii), while the second subcluster grouped four species (C. pumilio with C. scoparia, as the most genetically similar, C. pallescens and C. glomerata) (Figure 1B). Additionally, the PCA of the CBDP data revealed a high degree of consistency with the CBDP dendrogram's topology (Figure 2B). The CBDP PCA plot showed that the clustering of the Centaurea species remained the same as that obtained by the dendrogram analysis except for C. pumilio, which was closer to C. pallescens than to C. scoparia.
Furthermore, the SCoT and CBDP scored data were combined to reach more comprehensive genome coverage and to resolve the relationships among the eight Centaurea species in more depth. The topology of the combined dendrogram was similar to the SCoT dendrogram with some variations. The combined dendrogram comprised one major cluster gathering seven of the eight Centaurea species. The second cluster included only C. pallescens, the most diverged among the eight Centaurea species. Again, the major cluster was subdivided into two subclusters; the first comprised two species (C. alexandrina and C. calcitrapa), while the second subcluster grouped five species (C. pumilio with C. scoparia, as the most genetically similar; C. eryngioides, C. lipii and C. glomerata) (Figure 1C). The genetic similarities based on Jaccard's coefficient showed that the highest value was identified between C. scoparia and C. pumilio, whereas the lowest similarity value was detected between C. eryngioides and C. calcitrapa (Table 4). Furthermore, the PCA of the combined data exhibited a high degree of consistency with the dendrogram topology of the combined data (Figure 2C).

Discussion
The genus Centaurea is a typical example of complex development and problematic classification. Long ago, Centaurea was defined as a polyphyletic genus, and molecular studies showed that some clades, including the former section Centaurea, had to be excluded in order to make the genus monophyletic [34]. Recent studies identify three subgenera, namely Lopholoma, Cyanus and Centaurea, with the latter being the most species-rich and divided into several sections [34]. Species of the genus Centaurea are distributed throughout the Mediterranean and are characterized by a basic chromosome number of x = 9 [34].
Although the genus Centaurea exhibits high diversity levels in its bioactive compounds, particularly essential oils, such as triterpenes, sesquiterpenes, flavonoids and lignans [6,35], this medicinally valuable genus still has numerous morphology-based taxonomical issues. Therefore, to evaluate the genetic relationships and assess the level of polymorphism among the eight Centaurea species (C. alexandrina, C. calcitrapa, C. eryngioides, C. glomerata, C. pallescens, C. pumilio, C. scoparia and C. lipii (Volutaria lippii)), two efficient functional marker systems (SCoT and CBDP) were implemented.
Remarkably, during the last decade, few molecular studies have been published investigating the genetic diversity and relationships among Centaurea spp., and these focused on the ITS barcode or microsatellite markers for authentication. Recently, a study conducted by Dogan et al. [36] concluded that ISSR is a powerful tool for resolving the genetic relationships within problematic taxonomical entities, including the Centaurea species C. ptosimopappoides and C. straminicephala. Moreover, Yildirim et al. [37] studied the genetic relatedness among 16 Centaurea species in the eastern Anatolia region of Turkey. They concluded that the RAPD marker system gives results consistent with fatty acid methyl ester (FAME) profiles [37]. López-Pujol et al. [38] used the SSR marker system and demonstrated a lack of correlation between genetic-based and morphology-based classification due to the allopatric diversification of the genus Centaurea.
These few studies, however, did not include any of the Centaurea species found in Egypt and used less reproducible marker systems (RAPD, SSR, ISSR etc.) [36][37][38][39]. With the development of many improved marker techniques, these systems are no longer preferred in plant genetic diversity studies. New functional marker systems have proved their ability to provide good reproducibility and increased resolving power compared to the traditional random marker systems. For that reason, we implemented two functional marker systems (SCoT and CBDP) to investigate the genetic relationships and the diversity among eight Egyptian wild Centaurea spp.
Our study observed higher levels of percentage of polymorphism, discriminating power and resolving power in SCoT compared to the CBDP system, while suggesting an advanced discriminatory capacity of both marker systems. Based on the resolving power values (R) of the primers SCoT-6, SCoT-7 and SCoT-8, these primers could identify more than 50 Centaurea genotypes according

Table 4. Genetic similarities between the eight Centaurea species based on Jaccard's similarity coefficient of combined data (SCoT + CBDP). Species (in table order): C. glomerata, C. calcitrapa, C. scoparia, C. pumilio, C. alexandrina, C. lipii, C. eryngioides, C. pallescens.

Although both SCoT and CBDP markers can be effectively regarded as markers of choice for studying genetic diversity in Centaurea, it seems that SCoT can perform better than CBDP markers in revealing the polymorphism among the studied Centaurea spp. Our results are in line with previous reports in which SCoT markers showed a higher total number of bands, number of polymorphic bands, and D and R values than CBDP markers when applied in Triticum urartu [40] and Aegilops triuncialis [41]. Other reports revealed that CBDP could perform a little better than the SCoT marker system in Simmondsia chinensis [42] and Andrographis paniculata [43].
Ultimately, to the best of our knowledge, this is the first study focusing on the application of functional marker systems such as SCoT and CBDP to assess the genetic diversity and analyze the genetic relationships among this collection of wild Egyptian Centaurea species.

Conclusions
Our results revealed a high level of genetic diversity among the eight Egyptian Centaurea species. These findings were supported by various statistical analyses such as UPGMA cluster analysis, PCA and Jaccard's similarity analysis, which showed a high divergence among the studied species. Hence, we conclude that the SCoT and CBDP marker systems combined can be used efficiently to evaluate genetic diversity, especially in taxonomically complex and problematic genera such as Centaurea.

Disclosure statement
The authors report no conflict of interest.

Data availability statement
All data that support the findings reported in this study are available from the corresponding author upon reasonable request.