Genetic diversity of Bulgarian representatives of genus Carduus L. (Asteraceae) as revealed by variability in sequences of internal transcribed spacers region

ABSTRACT The biodiversity of genus Carduus in Bulgaria is grossly under-studied with modern methods, which leads to unclear status of species and hampers the assessment of their habitats and conservation status. In this study, we used the variability of internal transcribed spacers (ITS) to investigate the biodiversity of Carduus species collected from different floristic regions of Bulgaria. Thirty-three samples were processed. Among the studied species, C. acanthoides exhibited the highest ITS variability (eight single nucleotide polymorphisms were found). High ITS variability was also found in C. crispus and C. hamulosus, demonstrating that some of the local Carduus species possesses unique genetic diversity, which can serve as source for future divergence of new forms and subspecies. For each species, unique nucleotides suitable as molecular taxonomy markers were identified. However, C. nutans and C. thoermeri displayed completely identical ITS sequences, which is in agreement with all ITS sequences of these species collected elsewhere in the world and deposited in the NCBI database. After careful analyses of both molecular and morphological data, we propose to restore C. nutans and C. thoermeri as one species with two subspecies. Namely, Carduus nutans subsp. nutans and Carduus nutans subsp. thoermeri. We recommend diameter of capitulum, length of peduncle, width of peduncle and width of bract as diagnostic features of taxonomic significance for distinguishing the two subspecies.

from 1800 to 2800 m above sea level, in six floristic regions in Bulgaria. Carduus rhodopaeus ( = C. adpessus subsp. rhodopaeus) is a perennial Bulgarian endemic plant, rarely distributed in the Rhodope Mountains in dry rocky places (800-2000 m altitude), with conservation status 'Endangered' [9]. Carduus thracicus is a perennial Balkan endemic plant (Bulgaria, Turkey-in-Europe), distributed in dry and waste places in Bulgaria with conservation status 'Vulnerable' [10]. It is very similar to C. hamulosus, from which it differs mainly in its smaller capitula [6].
Our understanding of Carduus biodiversity in the region and their local or regional distribution can be greatly enhanced by a study based on a combination of classical morphological approaches with modern molecular techniques. The present study covers the 14 Bulgarian species of the genus and combines morphological taxonomy with molecular taxonomy based on variability of the two internal transcribed spacers ITS1 and ITS2. ITS1/2 regions are among the most popular markers for studying phylogenetic inference at the generic and infrageneric levels in plants and animal species since 1994. The sequence data have provided insights into the biodiversity of closely related species, the phylogenetic history, polyploid ancestry, genome relationships, historical introgression and other evolutionary questions [11,12]. Susanna et al. [13] demonstrated the selective power of ITS1/2 molecular markers in studying the systematics of tribe Cardueae to which genus Carduus belongs. For this purpose, they used sequences of the nuclear ribosomal gene cluster starting from the 3' end of the gene encoding 18S rRNA -Internal Transcribed Spacer 1 (ITS1)the gene for 5,8S rRNA -ITS2and the 5' end of the gene encoding 26S rRNA. Due to the complexity of the tribe taxonomy, they also used two chloroplastic markers (trnL-trnF and matK) to achieve more appropriate outgrouping [13].
Twenty-four accessions of the ITS1/2 region isolated from different Carduus species are available at the NCBI (National Center of Biotechnology Information) database, which offers excellent opportunity to confirm independently the identity of Bulgarian species by matching their ITS sequences with those annotated at the NCBI database and, on the other hand, to compare the biodiversity of the Bulgarian species with those used for molecular studies in other parts of the world.

Plant material
The plant material was collected from natural habitats in Bulgaria during the vegetative seasons in 2010-2013.
Thirty-three samples from all Carduus Bulgarian representatives collected from different floristic regions were used in this study (Table 1).
In order to minimize the harm on the Carduus populations, samples for DNA extraction were collected by taking a single leaf per plant from 10 randomly selected plants at each location. All leaves of one species collected from one location were stored together on dry silica gel in a refrigerator bag. The chosen method for sampling allowed us to collect samples from the same habitat more than once but at different years. In total, 33 samples for DNA extraction were collected.

Taxonomic determination of species by morphological features
Species identification by morphological features was done at the Department of Botany of the University of Plovdiv 'Paisii Hilendareski', according to Flora Europaea [6] and determination keys of Stoyanov et al. [14], Delipavlov and Cheshmedzhiev [8]. Ten morphological features with taxonomic significance were used to distinguish species: number and shape of leaf lobes; length of leaf spine; diameter of capitulum; length of peduncle; width of peduncle; width of bract; length of bract spine; length of corolla; indumentum. Fifty measurements on fresh plant material were taken for any morphological trait in each of the surveyed locations. For floristic reasons, 25 individual whole plants were collected and deposited as Voucher specimens in the Herbarium at the Agricultural University-Plovdiv, Bulgaria (Herbarium SOA).

DNA extraction
Upon delivery in the laboratory, the leaf samples were frozen with liquid nitrogen and ground to fine powder in pre-cooled mortar and pestle. One hundred milligrams of the plant material were used for DNA extraction by DNeasy plant mini kit (Qiagen cat. no 69,104) following the original protocol.
The absorption at 260 nm was used to determine the concentrations of the isolated DNA samples, and the ratios A 260 /A 280 and A 260 /A 230 , to determine the presence of contaminations like proteins, polyphenolic compounds, sugars and lipids. The average amounts of isolated DNA were 250-300 ng and contaminations were present in negligible amounts.

Primers
We used two ITS primers from the primer Set Ribosomal primers (University of British Columbia, Nucleic Acid-Protein Service Unit, NAPS Unit, www.michaelsmith.ubc.ca/ services/NAPS/Primer ç Sets/) namely: 18S Fw1: 5'-CGACTA-TATGGAAYTGTGGCG ¡3' and 26S Rev1 5'-CGCAGTC-GAAAGCACAAGTAG ¡3'. Because the production of this primer set was discontinued by the UBC-NAPS Unit, the primers were ordered from Metabion International AG (Martinsried, Germany) and upon arrival were dissolved in DNase-free water to 10 mmol final concentration.

Polymerase chain reaction (PCR)
Six independent PCR runs were performed for each sample. Approximately 150 ng DNA template was taken for each reaction and mixed with 1 mL of each primer (10 mmol/L), 25 mL PCR master mix (Fermentas, cat no K0171) and 22 mL DNase-free water (supplied with the master mix kit) in a 250-mL PCR tube. The PCR tubes were placea in a TC-512 THERMAL CYCLER (Techne, Cole-Parmer, Beacon Road, Stone, Staffordshire, ST15 OSA, UK) PCR apparatus and the PCR amplification was carried-out by using the following program: initial DNA melting at 94 C for 5 min; next 35 cycles of 94 C -1 min; 55 C -1 min 30 s; 72 C -2 min 30 s and final extension at 72 C for 6 min. The PCR products were mixed with 6.5 mL of loading dye (Fermentas #R0611), loaded onto 1% agarose gel containing 0.5 mg/ mL ethidium bromide (final concentration) covered with 0.5X Tris-borate-EDTA (TBE) buffer and separated by applying 7 V/cm. The size of the products was determined by comparison with a DNA ladder (Fermentas GeneRu-ler#SM0311). The PCR products were visualized by ultraviolet (UV) light.

PCR product isolation, cloning and sequencing
The PCR products were isolated from the agarose by QIAquick Gel Extraction Kit (Qiagen, cat no 28,704)  following the original protocol. The concentration of the PCR products was determined spectro-photometrically and 2-4 mL were used for A/T cloning. QIAGEN PCR Cloning Kit (cat no 231,124) was used to clone the PCR products, according to the original protocol. The ligation reactions were mixed with 250 mL freshly prepared competent bacterial (Escherichia coli-TOP 10 -Invitrogen) cells. The plasmids containing PCR products were isolated by QIAprep Spin Miniprep Kit (cat no 27,104) following the original protocol. The isolated plasmids were dissolved in 50 mL buffer (10 mmol/L Tris¢Cl, pH 8.5) and sent for sequencing to GATC -Biotech AG (Cologne, Germany).

Data analysis
Online nucleotide BLAST (Basic Local Alignment Search Tool) analyses in the NCBI database were performed using the nblast algorithm of Altschul et al. [15]. The multiple alignments of obtained sequences, phylogenetic and molecular evolutionary analyses were conducted using MEGA version 6 [16]. The number of haplotypes, haplotype diversity and nucleotide diversity were calculated using DNA SP 5.10.01 software [17]. The software package Statistica v. 7.0 [18] was used for the statistical processing of morphological data.

Results and discussion
The PCR products obtained from the studied samples had an average size of 650 bp (Figure 1), whereas the size of the sequences annotated in NCBI varied between 735 and 220 bp. These differences are due to the primers combinations used. The NCBI accessions with smaller size usually contain sequences of ITS1 only. They were excluded from our further analyses. Next, the sequences of our samples were compared with those having sizes between 640 and 735 bp annotated in NCBI by other authors. The comparison demonstrated that sequences from NCBI contain longer fragments from 26S rRNA, while our sequences contained 126 bp longer fragments from 18S rRNA. Neither gene fragments encoding 18S nor 26S rRNA had any variability and, hence, no taxonomy or phylogenic value. Therefore, they were trimmed and the size of all sequences was equalized to 524 bp. The sequences of Bulgarian representatives of genus Carduus were deposited in the NCBI database under accession numbers KT363903-KT363919. The isolated ITS1/2 sequences were next subjected to online analyses in the NCBI database using the nblast algorithm of Altschul et al. [15]. The object of these analyses was to confirm independently the identity of wild Bulgarian Carduus species by matching their ITS1/2 sequences with those annotated at the NCBI database. Blast analyses revealed 98% similarity of Bulgarian species C. pycnocephalus and C. personata to those from the same species annotated in NCBI under accessions EF123105, AF319057 and AF319111 (C. pycnocephalus) and KM262846 (C. personata). The ITS1/2 sequences from Bulgarian C. nutans and C. thoermeri were 100% identical with accessions AY780401, EF543521, KC603920, HQ540426, AF443678 and JX867642 (C. nutans). The match between Bulgarian C. acanthoides and accession EF123106, JX867641 (C. acanthoides) was slightly lower -96%. The lowest similarity, 92%, was observed between Bulgarian C. crispus and accession EF010530 (C. crispus from Maryland, USA), while C. crispus isolates from South Korea and China (GU188570, AY914813) showed 98 and 96% similarity, respectively.
Next, all six replicas of each Bulgarian sample were merged together to form a single consensus sequence for phylogenic analyses. Analyses were done by MEGA6 software using the Maximum Likelihood algorithm. The joined phylogram built from ITS1/2 sequences of Bulgarian species and those annotated in NCBI (Figure 2) demonstrated satisfactory clustering by species and confirmed the accepted taxonomy scheme. The only exception was that C. thoermeri and C. nutans clustered together, which is expected due to 100% identity of their sequences. C. crispus and C. personata formed a separate branch containing a subcluster formed by C. personata samples. A relatively higher difference found in Bulgarian C. crispus is well illustrated in Figure 2, where the sequences isolated by us formed a separate subcluster but still remained in the same cluster with the C. crispus accessions annotated in the NCBI database.
Incorporation of the other eight species, which are not annotated by other authors in NCBI, did not change significantly the clustering illustrated in Figure 2, but led to formation of new clusters and subclusters specific for some of the Bulgarian Carduus representatives (Figure 3). All C. pycnocephalus sequences formed an individual cluster separated from all other species. The Balkan endemic C. armatus was shown to group with C. acicularis but they formed well distinct clusters. The other Balkan endemic, C. thracicus, formed a separate branch from the group of clusters accommodating C. hamulosus and C. acanthoides as distinct clusters and the joint cluster of C. thoermeri and C. nutans. The other endemic species, C. rhodopaeus, also formed an individual branch.
Interestingly, three species: C. candicans, C. carduelis and C. kerneri, displayed grouping in a joint cluster. A subtree was built only by these species (Figure 4), which demonstrated that C. candicans formed actually a separate subcluster, while the sequences of C. carduelis and C. kerneri are 100% identical, just like these of C. thoermeri and C. nutans.
For revision of the species taxonomy status, apart from ITS1/2 sequences as a molecular marker, profound analyses of morphological features with taxonomic importance are needed. C. carduelis and C. kerneri are rare species and we had only two specimens of the former and a single specimen of the latter, which is not sufficient for statistical analyses. Unlike them, C. thoermeri and C. nutans, are common species and we were able to record the variability of taxonomically important morphological features in 30 individual plants per species per habitat.
Leaf lobes had palmate shape in C. nutans, whereas triangular in C. thoermeri. The diameter of capitula, the  length and width of peduncle, the width of bract of C. nutans were considerably smaller (P > 0.001) than those of C. thoermeri ( Table 2).
The number of leaf lobes, the length of bract-spine and the length of corolla varied in the habitats, and in some habitats there was no statistically significant difference between C. nutans and C. thoermeri ( Table 2). The statistically processed morphological data are summarised in Table 2.
Both ITS1 and ITS2 regions are removed during processing of primary rRNA transcript and have no functions. Therefore, they are not subject to selective pressure and evolve according to the Kimura neutral model, accumulating mutations randomly. Such sequences change uniformly in time and can be used for phylogenic analyses to study the divergence rates between closely related species and between isolated populations within a species [19,20].
For 12 of the studied species, unique nucleotides suitable for molecular taxonomy markers were identified and summarized in Table 3. These findings provide opportunity for designing unique primers with the power to discriminate single species by PCR reactions without sequencing. The other two species also possess unique sequence features, but more complex sets of primers will be needed for their identification. This is important because some, but not all Carduus species are used by pharmaceutical companies for preparation of medicinal products. Such primers can serve for the purpose of quality control on the raw materials, which are usually supplied in dried and milled form.
Among the studied species, the Bulgarian representatives of C. acanthoides exhibited the highest ITS variability: eight SNPs were found at positions 38, 43, 79, 82, 116, 147, 149 and 158. The samples from both localities, despite the geographic distance between them, displayed similar types of SNPs at the same positions. This finding is an indication for high genetic polymorphism within the populations of Bulgarian C. acanthoides. The relative frequency of nucleotide substitutions varied at the different positions (Table 3). Since we used mixed samples, it is not possible at this stage to draw conclusions about allele abundance among individual plants but frequency variations may correlate with segregation in the progeny. SNPs were also found in two other Bulgarian representatives of genus Carduus, C. crispus and C.    (Table 4). In C. crispus, the positions and frequency of SNPs were identical and constant during the two vegetative seasons, probably due to the fact that both samples originate from the same population. In contrast, the three studied C. hamulosus populations had differences in the positions and frequency of SNPs (Table 4), which indicates high genetic diversity between these populations. These findings demonstrate that this local Carduus species possesses unique genetic diversity, which can serve as source for future divergence of new forms and subspecies. The finding is not surprisingthe Balkan Peninsula was the largest of the three refuges for plants (and other species) in Europe during the glacial period. After the melting of glaciers, that refuge became a center of biodiversity for plant species, some of which later colonized Europe [21,22]. This is the reason why the Balkan flora is rich in genetic sources and endemic plant species that have survived the glacial age.
In contrast, C. nutans demonstrated complete identity of ITS1/2 sequences both among sequences isolated from Bulgarian accessions and those annotated in NCBI. The fact that the ITS sequences of C. nutans isolated from Bulgaria and other places in Europe and North America are identical, confirms the weedy, invasive nature of this species [4,5]. It was obviously spread relatively recently world-wide by human activities and the time since has not been sufficient for accumulation of different mutations in distant populations even in nonfunctional regions like ITS.
Interestingly, the ITS1/2 sequences of C. nutans were identical with these of C. thoermeri. Earlier, Carduus nutans was considered as a complex group of taxonomically difficult plants [6,7]. In many earlier studies, they were considered subspecies of Carduus nutans-ssp. nutans and ssp. leiophyllus [14,23,24]. In contrast, based on morphological and biochemical features, Flora Europaea [6] considers them two different species: C. nutans and C. thoermeri. Exploring the karyotype of Bulgarian flowering plants, Kuzmanov et al. [25,26] reported that the C. thoermeri 2n karyotype is equal to 16. The same karyotype (2n = 16) occurs in C. nutans [27][28][29]. Our analyses of both molecular and morphological data, gave us sufficient evidence to propose restoration of C. nutans and C. thoermeri as one species with two subspecies, Table 3. List of unique nucleotides in ITS1/2 sequences in different Carduus species. Species namely, Carduus nutans subsp. nutans and Carduus nutans subsp. thoermeri. As diagnostic features of taxonomic significance for distinguishing the two subspecies, we recommend the diameter of capitulum, the length of peduncle, the width of peduncle and the width of bract.

Conclusions
For the first time, ITS1/2 sequences of Balkan endemic species were deposited at NCBI. We believe these data can support work of other researchers studying the origin, colonization ways and evolution of Carduus species. The variation of the morphological features used in the taxonomy of the genus supports molecular genetic data that C. nutans and C. thoermeri are not two separate species. Therefore, we propose the taxonomic status of the two species to be revised accepting again that there is one species, C. nutans, with two subspecies: C. nutans L. subsp. nutans and C. nutans L. subsp. thoermeri. We recommend, as morphological diagnostic features of taxonomic significance for distinguishing the two subspecies, to be used the capitulum diameter, peduncle length, peduncle width and bract width. More specimens, however, are needed before a similar taxonomic revision could be eventually proposed about the other two Carduus species, C. carduelis and C. kerneri subsp. austro-orientalis.