SSR and morphological traits based fingerprints and DNA barcodes for varietal identification in rice

Abstract Varietal identification has attained prime importance at a global level, particularly in the context of plant variety protection. In this study, 46 morphological traits were characterized to establish distinctness among 18 rice varieties. Of the 46 traits, 25 did not display any variation among the varieties studied. Significant variation was observed for 21 traits, which are therefore more informative for identification and characterization. Distinctness, uniformity and stability (DUS)-based fingerprints specific to the studied varieties were developed from these variations. A total of 175 simple sequence repeat (SSR) markers were screened for polymorphism; 53 hypervariable polymorphic SSR markers were used for fingerprinting of the rice varieties. A total of 151 alleles were detected with these 53 SSR markers, with amplicon sizes ranging from 120 to 750 bp and 2–4 alleles per locus. The average number of alleles observed was 2.84 per locus. The PIC values of the 53 SSRs ranged from 0.03 (RM 12983) to 0.64 (RM 11449) with an average of 0.40. The allelic variations of the 53 SSR markers were translated into DNA barcode profiles by separating the allele sizes at each SSR locus. Multiplex polymerase chain reaction (PCR) with five markers (RM 21392, RM 1388, RM 6699, RM 25754 and RM 16913) was used to identify rice varieties based on the band positions. This study classifies and identifies rice varieties and is an important reference for testing the authenticity and varietal purity of other rice varieties in the future. Supplemental data for this article is available online at https://doi.org/10.1080/13102818.2021.1987324 .


Introduction
Seed is the most important agricultural input: it raises the productivity, production and income of the farmer and determines the performance and efficacy of other inputs. Quality seed alone contributes about 15-20% to total production, depending on the crop, and this contribution can rise to 45% with efficient management of other inputs [1]. The authenticity of the variety and the quality and purity of the seed are extremely important for the production of crops and vegetables [2]. Timely production of adequate quantities of quality seed with good genetic potential, suited to the agro-climatic conditions and made accessible to farmers at a nominal price, is crucial for agricultural production to reach its full potential [3]. In India, the private sector is engaged in large-scale rice seed production and marketing of publicly bred cultivars. The Protection of Plant Varieties and Farmers' Rights (PPV & FR) Act, the regulatory mechanism in force in India for the protection and registration of crop varieties, uses protocols similar to those of UPOV for rice, but with relevant modifications in the morphological descriptors of indica rice [4].
The ever-increasing number of improved varieties has narrowed the genetic base and complicates varietal identification, since morphological descriptors are of limited use when very closely related cultivars are to be distinguished from one another. Unambiguous identification of varieties is exceedingly important for the registration and certification of newly released and notified varieties, to curb the supply of spurious seed and to prevent private companies and seed production agencies from selling the same variety under different names [5]. The rise of systems for protecting plant breeders' rights across the globe creates an immediate need for data that can distinguish one variety from another [3]. During seed production and multiplication, plant breeders, farmers, certification agencies, seed testing laboratories and the seed industry should know the specific morphological traits of a variety so that it can be identified at different growth stages of the crop. Distinctness, uniformity and stability (DUS) testing based on essential morphological traits, and the use of biochemical markers for varietal identification, are subject to environmental influence [6]. Polymerase chain reaction (PCR)-based molecular markers, especially simple sequence repeats (SSRs), are quick, reliable and environmentally neutral tools for varietal profiling and purity analysis of crop varieties [7] and are useful for developing unambiguous DNA fingerprints of cultivars [8,9]. SSR fingerprints are inherent in genomes and are not affected by internal or external environmental factors, including growth and development time [10]. Both SSR and DUS based fingerprinting of rice varieties give information about phylogenetic relationships and aid rice breeders in varietal registration and protection of intellectual property rights.
Very limited data are available for varieties that are in the seed multiplication chain. The combination of morphological traits and DNA fingerprinting of cultivars is very important for identifying adulterated seed at the field and laboratory level. Against this backdrop, the present study was conducted to assess morphological (DUS) traits for varietal identification under field conditions and to develop SSR based DNA barcodes/molecular fingerprints for varietal identification of elite rice varieties (that are in the seed chain) of the Telangana and Andhra Pradesh states of India.

Plant material
The experimental material for the present study comprised a total of 18 rice varieties which are grown commercially in the Telangana and Andhra Pradesh states in India, enjoying high market demand and presence in the regular seed chain. Details of the varieties along with the names of the centers where the selected varieties were developed are provided in Table 1. The seeds of the selected varieties were obtained from the respective research stations and were raised at the Seed Research and Technology Center (SRTC), Professor Jayashankar Telangana State Agricultural University, Hyderabad, India, during kharif, 2018.

Experiment layout and morphological DUS characters
Thirty-day-old seedlings of each variety were transplanted in 4 rows of 4 m length with a spacing of 30 cm between rows and 20 cm between plants (as per the DUS guidelines given by the PPV and FR Act [11]) in a randomized block design with three replications. All the necessary precautions and management practices were adopted to maintain a uniform plant population. Observations on 46 morphological traits were recorded at the appropriate crop stage (Table S2, supplemental material).

Genomic DNA isolation and quantification
Genomic DNA was extracted using a cetyl-trimethyl-ammonium bromide (CTAB) method as described by Murray and Thompson [12]. Five grams of leaf tissue were crushed to a very fine powder in DNA extraction buffer (100 mmol/L Tris-HCl, 20 mmol/L ethylenediaminetetraacetic acid (EDTA), 1.4 mol/L NaCl, 2% CTAB, 1% polyvinyl pyrrolidone and 0.1% β-mercaptoethanol, pH 8.0) and incubated at 60 °C for 1 h. Phenol:chloroform:isoamyl alcohol (25:24:1) was added to the mixture, which was centrifuged (Sigma Aldrich 1-15) at 11,000 rpm for 15 min at 4 °C. The upper aqueous phase was transferred to a fresh centrifuge tube and the phenol:chloroform:isoamyl alcohol step was repeated. The DNA pellet was washed twice with cold 70% ethanol, air dried and dissolved in 100 μL Tris-EDTA buffer. The quality of the DNA was checked on a 1.5% agarose gel stained with ethidium bromide (EtBr) and the DNA was quantified with a NanoDrop spectrophotometer. DNA was diluted with autoclaved Milli-Q water to a working concentration of 20 ng/μL and used for SSR analysis.

PCR conditions and allele size determination
Polymerase chain reaction (PCR) was carried out in a 10 μL reaction volume containing 2 μL of template DNA (10 ng), 5 μL of 10× PCR master mix (Takara PCR mix), 2 μL of sterile deionized water and 0.5 μL each of the forward and reverse primers. All PCR reactions were performed in an Agilent thermal cycler with a temperature cycling profile of initial denaturation at 94 °C for 5 min, followed by 35 cycles of denaturation at 94 °C for 45 s, primer annealing between 53 °C and 60 °C (according to the optimal temperature of the primers) for 45 s and extension at 72 °C for 45 s, with a final extension step at 72 °C for 10 min. Electrophoresis was carried out on a 1.5% agarose gel alongside a 100 bp DNA ladder (New England Biolabs (NEB)) for 2 h to 2 h 30 min in 0.5× Tris-borate-EDTA (TBE) buffer. The amplified fragments were then visualized under a UV transilluminator and their sizes were estimated with a Bio-Rad Molecular Imager Gel Doc XR System using the 100 bp DNA ladder (NEB) as a size standard.
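The reaction set-up and cycling profile above can be checked with simple arithmetic. The following is a minimal sketch, assuming that 0.5 μL is the volume of each primer (forward and reverse) and ignoring ramp times between temperature steps; the variable and function names are illustrative only.

```python
# Component volumes in microlitres, as listed in the text.
# Assumption: 0.5 uL refers to each primer separately.
components = {
    "template DNA": 2.0,
    "PCR master mix": 5.0,
    "sterile deionized water": 2.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
}
total_volume = sum(components.values())


def cycling_hold_minutes(initial_s=300, cycles=35, step_s=(45, 45, 45), final_s=600):
    """Total programmed hold time in minutes, excluding ramping between steps."""
    return (initial_s + cycles * sum(step_s) + final_s) / 60


print(total_volume)            # 10.0
print(cycling_hold_minutes())  # 93.75
```

Under these assumptions the listed components add up to the stated 10 μL, and the programmed hold time is a little over an hour and a half per run.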

Microsatellite markers and DNA barcoding/ fingerprinting
Molecular characterization of the 18 rice varieties was done using 175 hypervariable microsatellite markers selected from the microsatellite database (http://www.gramene.org/markers/microsat/) and distributed across all 12 chromosomes of rice. The sequences and details of the primers used are listed in Table S3. Out of these 175 SSR markers, 53 (30.28%) were polymorphic among the varieties studied and were used to generate DNA fingerprints. The position and distribution of the total markers and the polymorphic markers used in the study are presented in Figures S1 and S2. The DNA barcode for each variety was developed from the allelic variation data by arranging the allele size data of all the primers from the lowest to the highest [13][14][15][16].
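The barcode construction described above (allele sizes of all markers arranged from lowest to highest and placed on a linear scale) can be sketched as follows. This is an illustrative sketch only: the marker names, allele sizes, the 100-800 bp scale and the text rendering are assumptions made for the example, not the procedure or data of the study.

```python
def barcode(allele_sizes, lo=100, hi=800, width=71):
    """Return (sorted allele sizes, one-line text barcode) for one variety.

    allele_sizes: mapping of marker name -> allele size in bp.
    Each allele is drawn as '|' at a column proportional to its size.
    """
    sizes = sorted(allele_sizes.values())
    line = [" "] * width
    for size in sizes:
        clamped = min(max(size, lo), hi)  # keep out-of-range sizes on the scale
        column = (clamped - lo) * (width - 1) // (hi - lo)
        line[column] = "|"
    return sizes, "".join(line)


# Hypothetical allele calls for one variety at three made-up markers.
variety_calls = {"marker_1": 150, "marker_2": 120, "marker_3": 400}
sizes, bar = barcode(variety_calls)
print(sizes)
print(bar)
```

With the default scale each text column spans 10 bp, so alleles closer than that collapse into one bar; a real plot would use a finer scale or a graphics library.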

Data scoring and analysis
To evaluate the polymorphism status of the markers, the polymorphism information content (PIC) of each SSR marker was calculated according to the formula [17]:

$$\mathrm{PIC}_j = 1 - \sum_{i=1}^{n} P_i^2$$

where i is the i-th allele of the j-th marker, n is the number of alleles of the j-th marker and $P_i$ is the frequency of the i-th allele. This formula indicates how many alleles a marker has and how evenly those alleles are distributed. The frequency is the number of times a particular allele appeared in the 18 varieties divided by the total number of DNA bands generated in the whole population.

[4,19,20]. The frequency distributions observed among the 21 differentiating traits are shown in Table 2. These traits were found to be more advantageous in the characterization of the studied varieties. Seed and plant characters are major components of cultivar identification because they provide dependable data. A set of morphological traits/DUS descriptors is essential to distinguish the varieties from one another [21][22][23]. Several earlier workers also used DUS descriptors/traits for the characterization of genotypes in the studied crop [17,24,25,26]. During the vegetative stage, variation was observed for six traits among the varieties. Based on the absence of anthocyanin coloration of the collar and [22] emphasized the importance of this character during the grain maturity and reproductive stages for characterization of the varieties.

Fingerprinting using DUS traits
Based on the variation observed in DUS/morphological traits among the varieties studied, DUS fingerprints were developed using 21 informative traits (Figure 1). The DUS fingerprints are most useful for identifying the varieties at the field level at specific crop growth stages. To further differentiate the other medium slender varieties from the popular, consumer-preferred variety BPT 5204 (popularly known as Sambha Mahsuri) (Figure 2), and to differentiate the long slender varieties from the popular varieties Tellahamsa and MTU 1010 (Figure 3), DUS fingerprints were developed using 20 differentiating traits. The trait density of pubescence of lemma was not useful for differentiating the medium and long slender varieties, but it can be used to differentiate RNR 15048 and RNR 2465 (medium) from the other varieties. The developed DUS fingerprints can serve as a reference database for comparing the varieties. DUS fingerprints resembling barcodes are unique, and these data can be integrated into a nationally coordinated database to increase the precision of identification of individual varieties with absolute certainty.
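The selection of informative traits and the resulting trait-state fingerprints can be illustrated with a short sketch. The variety names, traits and states below are hypothetical placeholders, not data from this study; the code only shows the idea of keeping traits that vary among varieties and using their states as a fingerprint.

```python
def informative_traits(scores):
    """Return the sorted traits that show more than one state among varieties.

    scores: {variety: {trait: state}}
    """
    traits = set().union(*(states.keys() for states in scores.values()))
    return sorted(
        t for t in traits
        if len({states.get(t) for states in scores.values()}) > 1
    )


def fingerprints(scores, traits):
    """Map each variety to the tuple of its states for the chosen traits."""
    return {v: tuple(states[t] for t in traits) for v, states in scores.items()}


# Hypothetical trait scores for three placeholder varieties.
scores = {
    "Variety A": {"leaf pubescence": "weak", "awns": "absent", "grain shape": "medium slender"},
    "Variety B": {"leaf pubescence": "strong", "awns": "absent", "grain shape": "medium slender"},
    "Variety C": {"leaf pubescence": "weak", "awns": "absent", "grain shape": "long slender"},
}
traits = informative_traits(scores)
prints = fingerprints(scores, traits)
print(traits)
print(len(set(prints.values())) == len(scores))  # True when every fingerprint is unique
```

In this toy data the monomorphic trait is dropped and the two remaining traits are enough to give each placeholder variety a distinct fingerprint.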

Allelic variations and polymorphism information content (PIC) of SSR markers
A total of 175 SSR markers located across the 12 chromosomes (the marker distribution across chromosomes is shown in Figure S1) were employed for fingerprinting the chosen varieties. Based on their polymorphic status, 53 hypervariable polymorphic SSR markers (the marker distribution across chromosomes is shown in Figure S2) were found to be informative for discriminating the varieties and were used for fingerprinting. Table 3. A total of 22 SSR markers produced PIC values of over 0.5 and these markers were more useful for fingerprinting. However, a few markers had low PIC values and low discriminating power (RM 12983, RM 13209, RM 14698, RM 16816, RM 18600, RM 18516, RM 27689). The PIC value reflects the diversity of alleles and their frequency among genotypes. Markers with higher PIC values are useful in gene mapping, molecular breeding and germplasm evaluation [31], and markers with a PIC value of more than 0.5 are considered informative for genetic studies measuring the polymorphism of a marker locus [32]. Vanisri et al. [15] reported that the PIC values ranged from 0.14 to 0.99 for medium slender varieties and from 0.23 to 0.98 for long slender varieties of rice. Samal et al. [33] reported that the PIC [34], Bhave [35] and Ishaq et al. [36] among the different crops studied.
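The PIC calculation can be sketched in a few lines. This is a minimal sketch assuming the commonly used form PIC = 1 − Σ p_i², with one scored band per variety per marker; the allele calls are hypothetical, and the exact frequency definition used in the study may differ from this simplification.

```python
from collections import Counter


def pic(alleles):
    """PIC for one marker, computed as 1 - sum(p_i ** 2).

    alleles: one allele size (bp) per variety; p_i is the share of
    all scored bands that carry allele i.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


# Hypothetical allele calls for 18 varieties at two made-up markers.
rare_allele_marker = [150] * 17 + [160]              # one variety differs
even_marker = [120] * 6 + [130] * 6 + [140] * 6      # three equally common alleles

print(round(pic(rare_allele_marker), 3))  # 0.105
print(round(pic(even_marker), 3))         # 0.667
```

The two toy markers show the behaviour described in the text: a marker whose alleles are evenly distributed scores well above 0.5, while a marker dominated by a single allele scores low.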

Unique markers/alleles for varieties
Seventeen of the 53 polymorphic markers exhibited unique/specific alleles for nine varieties (Table 4, Figure 4 and Figure S3) and can distinguish those varieties from the others; hence, they can serve as molecular IDs for those nine varieties.
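Finding variety-specific alleles amounts to looking for allele sizes that occur in exactly one variety at a given marker. The sketch below illustrates this with hypothetical marker names, variety labels and allele sizes; missing data and null alleles are not handled.

```python
from collections import defaultdict


def unique_alleles(calls):
    """Return {variety: [(marker, allele_size), ...]} for alleles seen in one variety only.

    calls: {marker: {variety: allele_size_in_bp}}
    """
    result = defaultdict(list)
    for marker, by_variety in calls.items():
        carriers = defaultdict(list)
        for variety, size in by_variety.items():
            carriers[size].append(variety)
        for size, varieties in carriers.items():
            if len(varieties) == 1:  # allele carried by a single variety
                result[varieties[0]].append((marker, size))
    return dict(result)


# Hypothetical allele calls for three placeholder varieties.
calls = {
    "marker_1": {"A": 150, "B": 150, "C": 170},
    "marker_2": {"A": 200, "B": 220, "C": 220},
    "marker_3": {"A": 300, "B": 300, "C": 300},
}
print(unique_alleles(calls))
```

In this toy data, varieties A and C each carry one private allele, while B has none and would need a combination of markers to be identified.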

DNA/molecular barcodes/fingerprinting
The allelic variations of the 53 SSR markers were converted to DNA barcode profiles by separating the allele sizes at each SSR locus and sorting the allele size data from the lowest to the highest. These allele size bars were then drawn to a linear scale for all of the analyzed varieties (Figure 5). Vanisri et al. [15] developed DNA barcodes for 14 visually similar medium-grained rice varieties and eight long slender grain varieties from the unique patterns of SSR polymorphism in the allelic variation data. DNA barcodes based on SSR marker data were developed successfully for eggplant by Lakshmana Reddy et al. [14] and for guava by Kanupriya et al. [13] and Chaitanya et al. [37]. These barcode profiles allow easy detection of genotypic differences, thereby helping to identify an individual with absolute certainty by acting as a reference or standard DNA barcode library [13], and are a useful tool for the protection of intellectual property rights and the resolution of commercial disputes. SSR fingerprints were constructed from different primer combinations, which provides useful methodological guidance for constructing a standard DNA fingerprint database and for mapping analysis of rice varieties in the future. Future molecular characterization using polyacrylamide gel electrophoresis (PAGE) would be more accurate and reliable for the identification of rice varieties. The application of microsatellite markers as descriptors for varietal protection and fingerprinting has become more prominent since their acceptance by the USDA Plant Variety Protection Office [38].

Development of variety-specific multiplex assays
A rapid DNA fingerprinting protocol for rice has been designed and verified at Tamil Nadu Agricultural University (TNAU), Coimbatore, in which commercial seed lots are analysed through multiplex PCR involving two markers. Multiplex PCR amplifies multiple targets with several primer pairs simultaneously in a single reaction, saving time and effort. One set of multiplex assays with five markers (RM 21392, RM 1388, RM 6699, RM 25754 and RM 16913) was developed which can differentiate all 18 varieties. Similar multiplex assays were developed earlier by Vanisri et al. [15] to identify 14 medium slender and 8 long slender grain rice varieties. The identity code, code key and DNA barcode for the identified multiplex assay are provided in Figure 6 and Tables S4 and S5. To our knowledge, this is the first attempt to develop both DUS and SSR based barcodes/fingerprints for the rice varieties which are commercially grown (that are in the continuous seed chain) in the Telangana and Andhra Pradesh states of India.
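Whether a marker panel differentiates all varieties can be checked by confirming that every variety has a distinct combined allele profile. The sketch below does this and also searches exhaustively for the smallest such panel. It uses hypothetical data and is not the selection procedure used in this study; it also ignores practical multiplexing constraints such as compatible annealing temperatures and non-overlapping amplicon sizes.

```python
from itertools import combinations


def distinguishes_all(calls, markers):
    """True if the combined allele profile over `markers` is unique for every variety.

    calls: {marker: {variety: allele_size_in_bp}}, complete for all varieties.
    """
    varieties = list(next(iter(calls.values())))
    profiles = {tuple(calls[m][v] for m in markers) for v in varieties}
    return len(profiles) == len(varieties)


def smallest_panel(calls):
    """Exhaustive search for the smallest distinguishing marker set (exponential cost)."""
    markers = sorted(calls)
    for k in range(1, len(markers) + 1):
        for combo in combinations(markers, k):
            if distinguishes_all(calls, combo):
                return combo
    return None


# Hypothetical allele calls: four placeholder varieties, three made-up markers.
calls = {
    "M1": {"A": 100, "B": 100, "C": 110, "D": 110},
    "M2": {"A": 200, "B": 210, "C": 200, "D": 210},
    "M3": {"A": 300, "B": 300, "C": 300, "D": 310},
}
print(smallest_panel(calls))  # ('M1', 'M2')
```

No single marker separates all four placeholder varieties here, but two markers together do. For 53 markers and 18 varieties an exhaustive search over small panel sizes is still feasible, although a greedy heuristic would scale better.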

Conclusions
Among the 46 morphological traits studied, 21 characters showed variation among the 18 varieties and these characters were more useful in characterizing the varieties. DUS based barcodes/fingerprints specific to the varieties were developed based on the variations observed. We used 53 polymorphic microsatellite markers to generate a unique fingerprinting profile for each variety that may act as a reference molecular barcode/ID for accurate identification, with a visual representation of the number and size of alleles that allows easy detection of genotypic differences between the analyzed cultivars. The multiplex assay with a set of 5 markers (RM 21392, RM 1388, RM 6699, RM 25754 and RM 16913) can help in identifying all the varieties in a single PCR reaction. Further validation of the identified DUS/DNA barcodes in single plants/populations of the varieties will be immensely useful for future variety identification and for resolving adulteration disputes in seed lots.