Identification of seeds of Phelipanche ramosa, Phelipanche mutelii and Orobanche cumana in the soils from different agricultural regions in Bulgaria by molecular markers

Abstract Broomrapes are root holoparasites from Orobanchaceae family parasitizing other flowering plants. Several of the most aggressive broomrape species are widely spread in Bulgaria often causing serious yield losses of important crops. It is important to be able to detect the presence of broomrape seeds in the soil. In the present study, we combined a proven PCR-based assay for detection of broomrape seeds with methods to distinguish species based on nuclear ITS sequences to identify broomrape seeds isolated from soil samples and to study the population structure and the intraspecific variations within the three species. Fifty-six soil samples from 23 different regions in Bulgaria were studied. Based on molecular and bioinformatics analyses we found that 22 samples contained seeds of Orobanche cernua var. cumana and six samples, seeds of Phelipanche ramosa. Phylogenic and bioinformatics analyses surprisingly revealed that the isolated sequences from supposedly P. mutelii seeds diverge from those annotated in NCBI by other authors on 16 different nucleotide positions and formed two distant branches of the phylogenic tree. On the other hand, the isolated sequences were almost identical with P. rosmarina. Tajimas’ D-test revealed that O. cumana populations are currently stable. Regarding the Phelipanche representatives, based on the D-test we can hypothesize that P. mutelii/P. rosmarina populations are in a period of active expansion, while P. ramosa populations are contracting. All isolated sequences were deposited in NCBI Genbank database.


Introduction
Obligate root parasitic plants of the family Orobanchaceae (broomrapes) have lost upon their evolution the ability to perform photosynthesis and now survive by parasitizing the roots of other flowering plants [1,2]. A number of broomrape species parasitize on important crop plants causing serious yield losses [2][3][4][5][6].
Broomrapes are a serious problem for Bulgarian agriculture because several of the most aggressive broomrape species are widely spread in Bulgaria including: Phelipanche ramosa L. (formerly Orobanche ramosa); Phelipanche mutelii Schultz and Orobanche cumana Wallr. For example, more than 36,000 ha of agricultural land in Bulgaria are infected with P. ramosa and P. mutelii. Thus, they are a serious problem in regions producing tobacco, tomato, potato, cabbage and rapeseed, where those two species cause serious reduction (up to 40%) in crop yields [2,7]. Heavy yield losses in sunflower production are attributed to parasitism by O. cumana [7][8][9].
There are several mechanisms that ensure tighter coordination between the developmental stages the of parasites' life cycle and that of the host plants. The first critical step is germination and it is regulated by chemicals emitted from the roots of prospective host plants [9,10]. With some exceptions like sorgoleone, the chemical signals belong to the so-called strigolactones, which also play role in communication and attraction of beneficial arbuscular mycorrhizal fungi [11][12][13]. Parasites take advantage of this system and use it to recognize prospective hosts [13,14].
Once attached to a host and fully grown, each broomrape plant produces hundreds of thousands of seeds which remain viable in the soils for decades until the right host is available. Therefore, it is of critical importance to be able to detect the presence of seeds of parasitic weeds in the soil. Osterbauer and Rehms [15] developed a method based on polymerase chain reaction (PCR) with primers capable of amplifying internal transcribed spacers (ITS) of the nuclear ribosomal DNA only of small broomrape. The PCRbased assay is sensitive enough to detect a single broomrape seed [15].
Schneeweiss et al. [16], on the other hand, developed methods to distinguish species and study phylogenic relations among holoparasitic broomrapes (Orobanchaceae) based on nuclear ITS sequences. His work was further extended by Piwowarczyk et al. [17], allowing the authors to propose a number of taxonomic revisions in Central European broomrape representatives.
Our current research is focused on using tolls of metabolomics and genomics (next-generation sequencing) to study the complex interactions between host crops, broomrapes and beneficial microbial flora. For this purpose, soil samples were collected from different agricultural fields in Bulgaria. Apart from studying the microbial flora, the soil samples were analysed for presence of broomrape seeds to determine which species they belonged to and what their viability was.
In the present study, we combined the PCR method of Osterbauer and Rehms [15] with sequencing of nuclear ITS sequences to identify broomrape seeds isolated from soil samples and to study the population structure and intraspecific variations within the species.

Collection of broomrape seeds
The broomrape seeds in the soil were harvested by processing of 100 g soil samples collected from different agricultural fields in Bulgaria. The seeds were separated from the soil by washing through a system of sieves according to Singh et al. [18].
Preliminary determinations of species were based on seed micromorphology according to Teryokhin et al. [19] and Stoyanov [20].

Vitality tests
The test was carried out by the method of Mauromicale et al. [21], using TTC (2,3,5-tryphenyltetrazolium chloride). Isolated seeds were decolorized with 6% NaClO for 10 min, washed with distilled water and treated with solution containing 1% of TTC. Seeds were incubated in the dark for 72 h at 25-35 C and observed under a stereo microscope. Live seeds had a pink-red colouration.

Germination tests
Aliquots of the isolated broomrape seeds were surface sterilized by 2% active chlorine and incubated between two glass fibre filter paper disks for two weeks for preconditioning according to Mangnus et al. [22]. After preconditioning, the "sandwiches" were treated with a synthetic germination stimulant -GR24 (0.4 mg/L) and were incubated for another week. Sterile nano-pure water was used as negative controls. Seeds were observed under a stereo microscope to count their total number and the number of germinated seeds. The germination percentage for each seed sample was determined in five replicas and statistically processed by ANOVA. P-value <0.05 was considered as a statistically significant difference.

DNA extraction
Seeds were grinded to powder in liquid nitrogen and total DNA was isolated using the DNA extraction kit of Analytik Jena following the manufacturer's instructions.

Primer design
The Vector NTI 11.0 software was used to perform multiple alignments of the ITS rDNA sequences annotated in NCBI database for each of the three species and to design specific of primers for the nuclear ribosomal RNA cistron including the 3'end of 18S rRNA through ITS1, 5.8S rRNA ITS2 and the 5' end of 26S rRNA. The sequences used in the primer design for O.

PCR amplification of ribosomal markers
About 150 ng DNA from each sample was placed in a separate 250 lL PCR tube and mixed with 1 lL of forward and reverse primers (10 mmol.L À1 concentration), 25 lL green PCR master mix and diluted to a final volume of 50 lL with DNase-free water. PCR reactions were performed using the following conditions: initial DNA denaturation at 94 C for 5 min; 35 cycles of melting at 94 C for 1 min; annealing of the primers at 57 C for 1 min; extension at 72 C for 2 min 30 s. The PCR products were loaded onto 1.5% agarose gelcontaining 0.5 mg/mL ethidium bromide in 0.5X TBE buffer and separated for 55 min applying 130 V electrical currency The PCR products were visualized by UV light and the bands with a size of about 650 bp (determined by comparison with 1 Kb GeneRuler of Thermo Fisher Scientific, Inc.) were isolated from the agarose by QIAquick Gel Extraction Kit following the original protocol. Purified PCR products were sent for direct sequencing by Sanger dideoxy method in GATC Ltd., Cologne, Germany.

Bioinformatics analyses
Initially, the obtained sequences were subjected to on-line nucleotide BLAST (Basic Local Alignment Search Tool) analyses in the NCBI Genebank database using the nblast algorithm of Altschul et al. [23] in order to confirm the species independently of the morphological analyses. Next multiple alignments of the obtained sequences, phylogenetic and molecular evolutionary analyses were conducted using MEGA version 7. Maximum likelihood was used, applying the general time reversal model and a uniform rate of substitution. Phylogeny testbootstrap method by 500 replications [24]. In addition, the types of nucleotide substitutions and Tajima's Neutrality Test (Tajima D-test) were used to analyse the trends in the evolutionary pressure and whether certain populations expand or shrink their range [25,26].

Seed collection and testing
Soil samples were collected from 23 different regions in Bulgaria. Fifty-six soil samples were processed and seeds were collected. Preliminary morphological determination revealed that 22 samples contained seeds of O. cumana. Twenty-six samples contained P. mutelii seeds and eight ones, P. ramosa seeds. The locations and potential hosts are presented in Table 1.
The average viability of the seeds of O. cumana was 78% ± 5%, while the viability of P. ramosa and P. mutelii seeds was over 87% ± 3.5%. Germination tests in general were with agreement with these results. Although the percentage of germinated O. cumana seeds after treatment with GR24 was about 50% (Figure 1), it is noteworthy that other authors have reported that the germination stimulants exuded by sunflower are sesquiterpene lactones, which differ in their chemical structure from strigolactones and GR24 [27,28].

Isolation and sequencing of ITS1/2 region
Two sets of forward/reverse primers were designed for O. cumana and for P. ramosa/P. mutelii ( Table 2). The external sets were used for isolation of the ribosomal gene cluster in three replicas per sample, while the internal sets were used for sequencing. The PCR products were sequenced in both directions. The two sequences of each product were overlapped to obtain a consensus sequence. In case of differences between the sequence obtained from the reaction with the forward primer and with the reverse primer in a PCR product, both sequencing reactions were repeated.

Bioinformatics analyses
The nblast comparison between the sequences that we obtained from the O. cumana seeds with those annotated in the NCBI Genbank database resulted in 100% identity with e-value 0 with the species deposited as Orobanche cernua var. cumana. The identity with Orobanche cernua var. cernua was 99% with evalue 0 and the difference was in a single C/T transition located in ITS2 (relative position 423). The phylogenic tree based on the studied sequences completely confirmed the results from the nblast analyses ( Figure  2). Therefore, the sequences isolated by us were deposited in NCBI Genbank as genomic DNA sequences containing a partial 18S rRNA gene, ITS1, 5.8S rRNA gene, ITS2 and a partial 26S rRNA gene isolated from Bulgarian samples. Based on bioinformatics analyses, the sequences were accepted under accession numbers from MK024256.1 to MK024277.1 and named Orobanche cernua var. cumana.
Similarly, the sequences isolated from seeds of P. ramosa and P. mutelii were deposited in the NCBI Genbank database under accession numbers from MK026645.1 to MK026670.1 for P. mutelii and from MK024283.1 to MK024290.1 for P. ramosa.
Six of the sequences isolated from seeds determined morphologically as P. ramosa, namely samples from Markovo (fields 1 and 2), Kozarsko (fields 1 and 2), Razlog and Proslav, were between 99% and 100% (e-value 0) identical with P. ramosa deposited by other authors in NCBI Genbank. The other two sequences, however, which were isolated from seeds collected in the Gotse Delchev region, were 100% identical (e-value 0) to the 26 sequences isolated from seeds determined morphologically as P. mutelii (Figure 3).
Surprisingly, the bioinformatics analyses revealed significant discrepancies between morphological determination and molecular data regarding P. mutelii seeds. The isolated sequences diverged from those annotated in NCBI by other authors on 16 different positions, except position 581 ( Table 3). The phylogenic analyses placed the two sets of sequences into two distant branches of the phylogenic tree ( Figure 3). The ones closest to the sequences isolated by us (99% similarities, e-value 0) were representatives of Phelipanche rosmarina (Beck) Banfi, Galasso and Soldanoonly two different nucleotides (Table 3). Therefore, we can hypothesize that the 28 seed samples may be from P. rosmarina. To test this hypothesis, Table 1. List of the studied species, the location of their collection and the short names of the samples as they appear in phylogenic trees. more studies are needed with a combination of classic taxonomy and more molecular markers. However, the taxonomic status of P. rosmarina is subject of debate. The species was first described by Beck-Mannagetta in 1930, yet some researches argue that it could be a form of P. mutelii [29,30]. Other botanists have described species as Orobanche rumseiana Pujadas and Fraga, O. mariana Pujadas and O. pseudorosmarina Pujadas and Mucoz Garm. Sanchez Pedraja et al. [31] reported these three species as synonyms of Phelipanche rosmarina (Beck) Banfi, Galasso and Soldano, since the small variations differentiating these species (lobes shape, indumentum, etc.) could be influenced by the habitat. Due to taxonomy uncertainties, the collected seeds will be excluded from our current research.
Tajima's Neutrality Test (Tajima D-test) distinguishes between DNA sequences evolving randomly and such that are evolving under a non-random process, including directional selection or balancing selection that can be connected with population expansion or contraction [23,24]. Negative D-values relate to an excess of low frequency polymorphisms and indicate population size expansion, while positive D-values relate to low levels of both low-and high-frequency polymorphisms, indicating a decrease in population size [23,24]. A rough rule of significance is that values greater than þ2 or less than -2 are likely to be significant [23].
For O. cumana, the D value was -0.44; for P. ramosa þ 2.44 and for P. mutelii/P. rosmarina -4.3. These values indicate that O. cumana is not currently Note: The seeds were preconditioned for 14 days before treatment. Portions of the seeds of each species were treated with nano pure, sterile water in order to serve as negative controls. This study demonstrated that comparison of the sequences of studied specimens with sequences from reference databases can provide information about the species identity of the specimens, independent of the morphological features. It can also provide more precise results than those solely based on determination keys. Primers for other broomrape species can also be designed for future studies of their seeds in soils not only from agricultural fields, but also in wild habitats. This could be very helpful for taxonomists, because the flowering stems of broomrapes that are suitable for determination by the methods of classic taxonomy emerge over soil for a very short period, whereas seeds are always available in soil samples.  Table 1. Four sequences annotated in NCBI by other authors were incorporated in the tree as reference samples. Figure 3. Phylogenetic tree based on P. ramosa and P. mutelii/P.rosmarina ITS1/2 sequences annotated in NCBI and the sequences isolated from the Bulgarian representatives. Maximum likelihood was used, applying the general time reversal model and a uniform rate of substitution. Phylogeny testbootstrap method by 500 replications. The samples are listed in Table 1. Ten sequences annotated in NCBI by other authors were incorporated in the tree for comparison.

Conclusions
The present study confirmed that the PCR-based approach is a powerful tool for detection of broomrape seeds in different soils. However, in combination with Sanger sequencing of the PCR products, it provides much more informative results. The frequency and the types of single nucleotide polymorphisms can provide information about the trends in the evolution of the population, the gene flow and the changes in the population size. On the other hand, sequence analyses must be further combined with studies based on morphological determination keys in order to determine by two independent methods whether P. rosmarina is really present in the Bulgarian flora and which plants serve as its hosts. Table 3. Comparative list of unique SNPs in ITS1/2 sequences of P. mutelii, P. rosmarina and our samples.  12  22  26  37  38  58  81  160  172  425  434  463  464  478  494  566  581  P. mutelii (Bulgaria)  A  T  A  T  A  T  T  T  G  C  G  --T  C  C  C  P. rosmarina (NCBI)  A  T  A  T  A  T  T  T  A  C  G  --T  C  C  T  P. mutelii (NCBI)  T  C  C  C  C  C  C  A  C  T  T  G  T  G  T  T  C