Identification of genetic diversity among some promising lines of rice under drought stress using SSR markers

Fourteen rice lines and one cultivar were investigated for morphological traits and genetic diversity under normal and drought stress conditions. Mean squares of years, environments and lines × environments were significant for all traits under observation, except for days to heading and grain yield/plant in years, days to heading in environments, and days to heading and harvest index in lines × environments. The 10 SSR markers, covering seven chromosomes (1, 2, 4, 5, 6, 8 and 9), generated polymorphic alleles, with a total of 72 alleles detected with 9 markers across the fifteen rice lines. Genetic diversity values for all SSR markers varied from 0.94 to 1.00 with an average of 0.98, and the PIC values ranged from 0.83 to 0.99 with an average of 0.94. The promising lines of the present investigation can be utilized in future plant breeding programs.


Introduction
Drought is among the most critical abiotic stresses influencing global food security and has been a triggering factor of the great famines in history. The world's fresh water resources are limited, while global food demand is likely to increase owing to population growth, which may further increase the water demand to grow crops [1]. The level of drought depends on several factors, i.e. rainfall pattern, evaporation, soil water holding capacity and plant water demand [2]. Rice ensures food security for more than half of the world's population and covers one-fifth of the total cultivated land under cereals [3,4]. Rice, being the most diversified monocot, is cultivated under various eco-geographical conditions and is sensitive to various abiotic stresses, i.e. drought, salinity, cold and heavy metals; of these, drought is among the most devastating at any stage of rice crop production. Drought is becoming alarming in several regions across the world, requiring the attention of researchers, the farming community and governmental organizations [5]. In Egypt, developed rice varieties cannot perform well under water shortage, which is one of the most limiting factors in more than 30% of paddy fields [6]. Moreover, Egypt has restricted the area under rice cultivation because of the crop's huge water demand. Thus, drought-tolerant genotypes are urgently needed to maintain food security and alleviate poverty [7]. In this context, recombination breeding can play a major role in the accumulation of minor genes for grain yield and related attributes under drought stress. It is imperative to know the genetic diversity among germplasm before proceeding with any breeding programme [8]. Obtaining heterosis in F1, greater variation in F2 and valuable transgressive segregants in subsequent generations requires crossing among divergent parents, and breeding for drought-tolerant germplasm is no exception [9].
Drought tolerance is a complex trait controlled by many genes; therefore, its genetic improvement is a big challenge [10]. Drought tolerance can be assessed by scoring traits that are related to water deficit and by which tolerant and susceptible genotypes can be discriminated. These traits can be morphological, physiological, yield-related, etc. The estimation of genetic diversity is one of the most crucial elements for the success of any crop improvement programme: it helps to establish genetic relationships during the collection of germplasm, to choose parental combinations for segregating populations with higher genetic variability, to select elite recombinants and to introgress desirable genes into elite cultivars [11,12]. Genetic diversity can be estimated through morphological and biochemical parameters along with DNA markers. DNA marker techniques can reveal the variation existing at the genetic level, providing a more efficient, reliable and direct approach for germplasm conservation, management and characterization with little or no environmental influence. Various DNA markers, i.e. Restriction Fragment Length Polymorphism (RFLP), Random Amplified Polymorphic DNA (RAPD), Simple Sequence Repeats (SSRs), Amplified Fragment Length Polymorphism (AFLP) and Single Nucleotide Polymorphisms (SNPs), have been used to assess the genetic diversity of rice cultivars throughout the world [13]. However, SSRs are among the most reliable and widely utilized markers in plant sciences owing to their reproducibility, cost effectiveness, mono-locus nature and ease of analysis [14].
SSR markers can detect a high level of allelic diversity and have therefore been extensively utilized to detect genetic variation among rice subspecies [15]. These markers are valuable for detecting genetic polymorphism and can differentiate germplasm, including closely related breeding lines with the same parental background [16]. The multi-allelic nature and high polymorphism of SSR markers help to establish relationships among individuals even with a small number of markers [17,18]. Because of the low heritability of grain yield under stress and the lack of an effective trait selection index related to drought tolerance, it is very important to find molecular markers associated with water stress tolerance in rice. Several SSR markers linked to drought tolerance traits in rice have been reported by researchers working in cereals [18,19,20]. Although rice germplasm characterization and diversity analysis have been carried out by several workers, variability studies of the commonly grown landraces and cultivars are limited. The present study was undertaken with fourteen advanced lines in comparison with one rice cultivar to assess their genetic and morphological diversity. Molecular characterization using microsatellite markers covering the entire twelve chromosomes of the rice genome and phenotypic characterization for various grain yield parameters were carried out. The present study focused on three major objectives: genetic differentiation of advanced rice lines by establishing specific DNA markers associated with drought tolerance using SSR markers; morphological characterization of the lines for grain and agronomic parameters under normal and drought stress conditions; and grouping of the lines according to their genetic relationship for various traits.

Plant materials
A set of fourteen locally bred advanced lines along with one rice cultivar, representing a wide array of diversity for several agronomic and physiological attributes, was investigated under both normal and drought stress conditions during the successive 2017 and 2018 rice growing seasons. The pedigree and features of the lines studied are presented in Table 1. The advanced lines were screened to identify germplasm with drought tolerance under the water shortage conditions prevailing in the Egyptian agricultural production system.

Experimental layout and agronomic practices
Plants were grown at the Rice Research and Training Center, Sakha Research Station Farm, Agricultural Research Center, Egypt. The nursery was sown on April 15th and seedlings were transplanted to the field after 30 days. Individual seedlings of each genotype were transplanted in 5 rows spaced 20 cm apart, with 20 cm between hills within a row. The experiment was conducted under two water regimes, i.e. normal irrigated and water stress (drought) conditions, using a randomized complete block design with three replications per treatment. Drought stress was imposed from 15 days after transplanting until harvest by irrigating at 12-day intervals. Soil moisture content was determined gravimetrically in soil samples taken from consecutive 15 cm layers down to a depth of 60 cm. Further soil samples were collected just before each irrigation and 48 h after irrigation. Field capacity, wilting point and bulk density were determined according to Klute [21] to a depth of 60 cm. The amount of water applied at each irrigation was determined on the basis of raising the soil moisture content to field capacity plus 10% as a leaching requirement, and was measured using a flow meter. The amount of irrigation water applied was 8330 m³ per hectare compared with 13090 m³ per hectare under continuous flooding. Irrigation water applied was also calculated according to the equation of Michael [22]. Nitrogen fertilizer was applied in three splits as top dressing, whereas phosphorus and potash were applied in full dose at the time of sowing. Insect and weed control were applied periodically as required. All other practices followed the recommended package.
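The two irrigation volumes reported above can be turned into a relative water saving with simple arithmetic. The following is a minimal sketch, not part of the original analysis; the function names are illustrative, and the gravimetric moisture helper uses the standard dry-mass definition with hypothetical sample masses.

```python
# Irrigation volumes quoted in the text (m^3 per hectare).
APPLIED_M3_PER_HA = 8330     # water applied in this experiment
FLOODED_M3_PER_HA = 13090    # continuous flooding, used as the reference


def water_saving_percent(applied, reference):
    """Percentage of irrigation water saved relative to the reference regime."""
    return (reference - applied) / reference * 100.0


def gravimetric_moisture_percent(wet_mass_g, dry_mass_g):
    """Soil moisture on a dry-mass basis: water mass / oven-dry soil mass * 100."""
    return (wet_mass_g - dry_mass_g) / dry_mass_g * 100.0


saving = water_saving_percent(APPLIED_M3_PER_HA, FLOODED_M3_PER_HA)
print(f"Water saved relative to continuous flooding: {saving:.1f}%")

# Hypothetical soil sample: 120 g fresh, 100 g after oven drying.
print(f"Gravimetric moisture: {gravimetric_moisture_percent(120.0, 100.0):.1f}%")
```

With the reported volumes, the saving works out to roughly 36% of the water used under continuous flooding.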

Statistical analyses
A combined analysis was performed over the two years to test the interaction of the different genetic components with years, treated as two different environmental conditions. Before the combined analysis was computed, a test of homogeneity of error variances was carried out following Bartlett [18], and the error variances of the experiments were found to be homogeneous.
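The homogeneity check can be illustrated with Bartlett's chi-square statistic computed directly from the error mean squares of the individual analyses. This is a sketch only: the error mean squares and degrees of freedom below are hypothetical values chosen for illustration, not results from this study, and the software actually used by the authors is not stated.

```python
import math


def bartlett_statistic(error_ms, error_df):
    """Bartlett's statistic for k error mean squares with their degrees of freedom.

    Under homogeneity it follows approximately a chi-square distribution
    with k - 1 degrees of freedom.
    """
    k = len(error_ms)
    total_df = sum(error_df)
    # Pooled error variance, weighted by degrees of freedom.
    pooled = sum(df * ms for ms, df in zip(error_ms, error_df)) / total_df
    m = total_df * math.log(pooled) - sum(
        df * math.log(ms) for ms, df in zip(error_ms, error_df)
    )
    # Bartlett's small-sample correction factor.
    c = 1.0 + (sum(1.0 / df for df in error_df) - 1.0 / total_df) / (3.0 * (k - 1))
    return m / c


# Hypothetical error mean squares for one trait in two years (28 error df each).
stat = bartlett_statistic([4.2, 5.1], [28, 28])
CHI2_CRIT_DF1_P05 = 3.841  # chi-square critical value, 1 df, alpha = 0.05
print(f"Bartlett statistic = {stat:.3f}")
print("homogeneous" if stat < CHI2_CRIT_DF1_P05 else "heterogeneous")
```

A statistic below the critical value means the error variances can be treated as homogeneous, which is the condition for pooling the years in a combined analysis.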

Genomic DNA extraction, SSR markers and PCR amplification
Genomic DNA was extracted from the healthy portion of young leaves harvested from 21-day-old plants using a modified CTAB mini-preparation method [23]. Ten SSR markers related to drought tolerance traits were used. The sequences of the primer pairs were obtained from the web database (http://www.gramene.org) and from published material for the traits investigated. Primer names, repeat motifs, chromosome numbers and related traits are shown in Table 2. PCR amplification was carried out using the DNA of the lines and cultivar under investigation.

Electrophoretic separation and visualization of amplified products
Five µl of PCR-amplified product was loaded into each well of a 3% agarose gel supplemented with ethidium bromide. 1X TAE was used as the running buffer and a 100 bp DNA ladder (0.5 µg/µl, Fermentas) was used to estimate the molecular size of the amplified fragments. Electrophoresis was conducted at 60 V for 2 h. Gels were then visualized and photographed using a Biometra gel documentation unit (BioDoc, Biometra, Germany).

SSR data analysis
The amplified bands were scored for each SSR marker based on the presence or absence of bands, generating a binary data matrix of 1 and 0 for each marker. The number of effective alleles per locus (Aep) was calculated according to Weir (1989) as $A_{ep} = 1/(1 - H_e)$, where $H_e$ is the genetic diversity per locus. Genetic diversity was calculated according to Nei (1973) as $H_e = 1 - \sum_{i=1}^{n} P_i^2$, where $P_i$ is the frequency of the ith allele. Polymorphic information content (PIC) values were calculated for each SSR marker using the formula described by Botstein et al. [20] as follows:

$$\mathrm{PIC} = 1 - \sum_{i=1}^{n} P_i^2 - \sum_{i=1}^{n-1} \sum_{j=i+1}^{n} 2 P_i^2 P_j^2$$

where $P_i$ and $P_j$ are the frequencies of the ith and jth alleles of a given marker, respectively, and n is the number of different alleles. The data matrix was then analyzed using PAST, Ver. 1.90 [28]. Genetic similarity was calculated from the data matrix based on Jaccard's similarity coefficients, and a dendrogram displaying the relationships among the fourteen rice lines and one cultivar was constructed using the Un-weighted Pair Group Method with Arithmetic Mean (UPGMA).
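The indices described above can be sketched in a few lines of Python. This is an illustrative implementation under stated assumptions, not the PAST workflow used in the study: the allele frequencies and band profiles are hypothetical, and the function names are chosen here for readability.

```python
from itertools import combinations


def gene_diversity(freqs):
    """Nei's gene diversity: He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def effective_alleles(freqs):
    """Effective number of alleles: Aep = 1 / (1 - He)."""
    return 1.0 / (1.0 - gene_diversity(freqs))


def pic(freqs):
    """Botstein PIC: 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    cross = sum(2.0 * (p * q) ** 2 for p, q in combinations(freqs, 2))
    return gene_diversity(freqs) - cross


def jaccard(a, b):
    """Jaccard similarity of two 0/1 band profiles: shared bands / bands in either."""
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return shared / either if either else 0.0


# Hypothetical locus with four equally frequent alleles.
freqs = [0.25, 0.25, 0.25, 0.25]
print(gene_diversity(freqs))     # 0.75
print(effective_alleles(freqs))  # 4.0
print(pic(freqs))                # 0.703125

# Hypothetical band profiles (1 = band present) for two lines at five allele positions.
line_a = [1, 1, 0, 1, 0]
line_b = [1, 0, 0, 1, 1]
print(jaccard(line_a, line_b))   # 0.5
```

For equally frequent alleles, the effective number of alleles equals the observed number, which is a convenient sanity check on any implementation.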

Results and discussion
The analyses of variance are presented in Table 3. Mean squares of years were significant for all the studied traits except days to heading and grain yield/plant, indicating wide variation among the investigated lines. The variation among lines showed that drought had a variable influence on the lines under investigation; this variation is a positive sign, as it helps in the selection of elite germplasm for future hybridization programmes. Environmental mean squares were highly significant for all studied attributes except days to heading, indicating that the environments differed significantly. Mean squares of lines × environments were highly significant for the studied traits except days to heading and harvest index, showing that the lines were influenced by the environment and ranked differently under drought than under normal conditions. Mean squares due to genotype × year interactions were non-significant for all traits under investigation. Mean squares of lines were highly significant and exceeded the mean squares of the lines × years interaction, which helps to identify the most superior lines. The lines with superior results hold immense potential for developing cultivars with a better drought stress response. Genotype × environment × year mean squares were non-significant for all the studied traits, indicating that the performance of each genotype within an environment did not deviate significantly from one year to another. The significant differences among rice lines in the present study showed the presence of genetic variability in the germplasm studied, providing an opportunity for yield improvement. Grain yield and other features displayed stability across the growing seasons, whereas the genotype × environment interaction was significant and the differences among lines were obvious (Table 3). These findings show that selection for all studied traits could be effective in a breeding programme.

Performance across environments
The mean performances of the studied lines in the combined data over environments are presented in Table 4. For days to heading, the cultivar GZ8710 (Sakha107) was the earliest (82.22 days), compared with 95.50 days for the line GZ9865-2-1-1-2. These findings are in agreement with those reported by [29,30] and strongly indicate that the lines showing better performance for days to heading may perform better under drought stress conditions. Leaf rolling, which reflects drought tolerance during the growth period, was significantly lowest in the line GZ9865-2-1-1-1 and greatest in GZ9865-2-1-1-2 (2.50), indicating that all lines had tolerance to water shortage stress. Flag leaf area, an integral component of photosynthesis, assimilation and transpiration, was significantly highest in the line GZ9865-2-1-1-2 (23.25 cm²), whereas the lowest value was observed for GZ9730-1-1-3-2 (13.40 cm²). There was phenotypic variation in plant height, suggesting differences in growth rate among the lines studied; furthermore, the desired dwarf stature was obtained from the line GZ 8452-4-1-1-1 (90.
Table 3. Days to heading (DTH, day), leaf rolling (LR), flag leaf area (FLA, cm²), plant height (PH, cm), number of panicles/plant (NP), relative water content (RWC), 100-grain weight (HGW, g), sterility percentage (SP, %), harvest index (HI, %) and grain yield (GY, g). *, ** significant and highly significant at probability 0.05 and 0.01, respectively.
The highest mean relative water content, with significant differences, was obtained by the line GZ9917-8-4-2-2 (84.70%), while the lowest resulted from GZ9724-11-2-1-2 (67.00%). With respect to 100-grain weight, the cultivar GZ8710 (Sakha107) gave the highest mean value (2.82 g) and GZ9865-2-1-1-2 the lowest (2.26 g).
The most desirable mean value of sterility percentage was exhibited by the cultivar GZ8710 (Sakha107), which gave the lowest mean value of 6.33%, whereas the highest values were produced by GZ8452-4-1-1-1 and GZ9865-2-1-1-1 (9.20% and 9.28%, respectively). For harvest index, the line GZ8714-7-1-1-2 gave the highest mean value (43.25%), whereas the lowest value (36.15%) was obtained from the line GZ9730-1-1-1-1. With respect to grain yield/plant, the most desirable mean value was recorded for the line GZ9865-2-1-1-2 (36.88 g/plant), whereas the line GZ8452-4-1-1-1 gave the lowest (26.97 g/plant). Therefore, it can be assumed that the line GZ9865-2-1-1-2 carries a genetic combination that withstands drought conditions. Genotypes that perform significantly well under both environmental conditions can be selected with confidence for future breeding programmes or released as cultivars for particular environmental conditions.

Number of alleles and allelic diversity
The fifteen rice genotypes used in the present study were screened for DNA polymorphism using SSR markers, which offer great potential for generating large numbers of markers evenly distributed throughout the genome and have been used efficiently as reliable and reproducible genetic markers. A total of 10 SSR primer pairs with known map positions covering the whole rice genome were used to screen the set of fifteen selected indica, japonica and tropical-japonica rice lines with different levels and mechanisms of drought tolerance. The 10 SSR markers, spread over seven chromosomes (1, 2, 4, 5, 6, 8 and 9), generated polymorphic alleles. The data in Table 2 show that a total of 72 alleles were detected at the loci of the 9 markers across the fifteen rice lines. The number of alleles per locus generated by each marker varied from 2 to 13 with an average of 7.20. The effective number of alleles per locus totalled 24.53 and ranged from 1.20 to 4.30 with an average of 2.45. The highest effective numbers of alleles per locus were observed for RM276 (4.30), RM246 (4.00), RM242 (2.55) and RM164 (2.50). In other studies, low numbers of alleles per locus were obtained by [31] (3.33) and [32] (2.5), whereas a high number of alleles per locus was obtained by [33] (8.57). In the study of [34], the number of alleles per locus ranged from 2 to 5 with an average of 2.9, and the polymorphic information content value per locus ranged from 0.059 (RM537) to 0.755 (RM252) with an average of 0.475. Fifty-six polymorphic SSR markers covering eleven rice chromosomes were recorded with an average of 3.02 alleles per locus, and the average polymorphism information content was 0.47 [35]. Bashier et al. [36] found that eighteen of the 19 primers tested amplified SSR markers, generating 569 alleles that ranged between 13 and 113 alleles per marker.
The number of polymorphic alleles ranged from 3 to 12 per locus with an average of 7.1 alleles per locus [37]. There was a significant positive correlation between the number of alleles detected at a locus and the number of repeats within the targeted microsatellite DNA (r = 0.58**). Thus, the larger the repeat number in the microsatellite DNA, the larger the number of alleles detected. It was also reported that the dinucleotide repeat motif (GA) displayed a high level of variation among rice lines [38]. On the other hand, Sajib et al. [31] reported no correlation between the number of alleles detected and the number of SSR repeats. These findings show that genotypes with allelic diversity are suitable germplasm for the development of drought-tolerant cultivars. Therefore, the use of molecular markers to select germplasm possessing genes and genomic regions that control target traits can fast-track the progress of breeding for drought-tolerant rice, because molecular markers are transmitted faithfully from generation to generation and are not subject to environmental influences.
Days to heading (DTH, day), leaf rolling (LR), flag leaf area (FLA, cm²), plant height (PH, cm), number of panicles/plant (NP), relative water content (RWC), 100-grain weight (HGW, g), sterility percentage (SP, %), harvest index (HI, %) and grain yield (GY, g). N, D and C are normal, water regime and combined data, respectively.

Gene diversity (HE)
The gene diversity or heterozygosity (HE) of a locus is defined as the probability that an individual is heterozygous for the locus in the population [39]. Higher values of this measure tend to be more informative because they reflect more allelic variation. As shown in Table 2, the HE values for all SSR markers used in this study varied from 0.94 to 1.00 with an average of 0.98. Genetic diversity was also examined by Fasahat et al. [40], Islam et al. [41] and El-Wahsh et al. [42]; however, the present investigation displayed more allelic diversity among the studied genotypes. The highest HE values were observed for the markers RM164 (1.00) and RM223, RM242, RM246, RM263, RM276 and RM518 (0.99 each). These markers can be utilized in further studies to uncover the underlying genetic diversity among genotypes for selection under drought stress conditions.

Polymorphic information content value (PIC)
The PIC value describes the usefulness of a marker for detecting polymorphism within the population under investigation; it depends on the number of detectable alleles and their frequency distribution and provides an estimate of the discriminating power of the marker [43]. As shown in Table 2, the PIC values for the SSR markers used in this study varied from 0.83 to 0.99 with an average of 0.94. These results are supported by the findings of Sajib et al. [31], who reported great variation in PIC values for all tested SSR loci (from 0.14 to 0.71 with an average of 0.48). Higher average PIC values were reported by [44] (0.57) and Ram et al. [45] (0.707). Moreover, Bashier et al. [36] found that the alleles produced PIC values of 0.51-0.99 per marker. According to the classification of Botstein et al. [46], there were 19 highly informative markers (PIC > 0.50), 21 informative markers (0.25 < PIC < 0.50) and three slightly informative markers (PIC < 0.25). The highest PIC values were observed for RM164 (0.99), RM223 (0.98), RM246 (0.97) and RM518 (0.97). A highly significant correlation was found between PIC values and the number of amplified alleles detected per locus (r = 0.88**), as shown in Table 5. Significant correlations were also found between PIC and the effective number of alleles (r = 0.65**) and between PIC and gene diversity (r = 1.00**). Moreover, Figure 1 shows the PCR-amplified fragments produced by the most polymorphic markers in the current study, RM518, RM276, RM289 and RM242. These markers revealed the highest PIC values, ranging from 0.92 to 0.97, as well as the highest number of alleles, ranging from 2 to 3 alleles per locus, suggesting that they could be used for molecular characterization of large numbers of rice lines rather than mapping populations for drought tolerance.
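The informativeness classes attributed to Botstein et al. can be applied mechanically to a set of PIC values. The sketch below uses the four highest PIC values quoted above; the function name is illustrative, and the handling of values falling exactly on a threshold (0.25 or 0.50) is an assumption made here, since the text does not specify it.

```python
def informativeness(pic_value):
    """Classify a marker by PIC using the thresholds 0.50 and 0.25."""
    if pic_value > 0.50:
        return "highly informative"
    if pic_value > 0.25:
        return "informative"
    # Assumption: a value exactly at a threshold falls into the lower class.
    return "slightly informative"


# Highest PIC values reported in the text for this study.
pic_values = {"RM164": 0.99, "RM223": 0.98, "RM246": 0.97, "RM518": 0.97}

classes = {marker: informativeness(value) for marker, value in pic_values.items()}
for marker, label in classes.items():
    print(f"{marker}: PIC = {pic_values[marker]:.2f} -> {label}")
```

All four markers fall in the highly informative class, consistent with the reported PIC range of 0.83 to 0.99 for this marker set.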

Similarity
As indicated in Table 6, the highest similarity coefficient was found among GZ9794 In a separate study, Chakravarthi and Naravaneni [4] reported a low similarity coefficient between japonica-type and indica-type lines, and Kanawapee et al. [47] reported a relatively high level of similarity between closely related lines. Moreover, Youssef et al. [48] and Ramadan et al. [49] reported slightly divergent results, with similarity coefficients of rice landraces fluctuating from 0.76 to 0.93 at a genetic correlation level of 0.78. Ninety accessions of rice landraces were divided into five groups based on analysis of genetic relationships [37].

Cluster analysis
The genetic relationships among the rice lines are presented in a dendrogram based on informative microsatellite alleles (Figure 2). All lines clearly grouped into two major clusters at 67% similarity based on Jaccard's similarity index. The first cluster represents the indica/japonica rice, while the second cluster represents the japonica rice. Within the main indica/japonica cluster, GZ9917-8-4-2-2, GZ9865-2-1-1-2 and GZ9724-11-2-1-2 each formed a separate cluster. GZ9730-1-1-1-1, GZ9730-1-1-1-2 and GZ9730-1-1-3-2 grouped into one cluster at 81%, and GZ9781-3-2-2-6, GZ9792-13-1-1-2, GZ9794-15-1-1-1, GZ9854-1-2-2-4 and GZ9865-2-1-1-1 formed one group at 85%. Within the second main cluster, lines grouped into two sub-clusters, B1 and B2, at about 74% and 81% similarity. Sub-cluster B1 included the japonica rice line GZ8714-7-1-1-2, while sub-cluster B2 included the genotypes GZ8452-4-1-1-1, GZ8452-6-1-3-2 and GZ8710 (Sakha107). El-Malky et al. [33] reported the ability of SSR markers to divide varieties into two groups, one including the indica varieties and the other the japonica varieties. Zeng et al. [35] found that all lines clearly grouped into two major branches in the dendrogram with less than 10% similarity based on Jaccard's similarity index, one branch representing the subspecies japonica and the other representing the subspecies indica or hybrids between japonica and indica rice. These findings are in line with those reported by Youssef et al. [48] and Ramadan et al. [49], although the present study displayed vast variation within the genotypes under investigation. Moreover, Anupam et al. [34] reported that cluster analysis based on 30 simple sequence repeat markers revealed 5 clusters and indicated the presence of variability within the rice accessions.
Table 6. Similarity coefficient among studied lines based on SSR markers.
Anh et al. [35] reported that UPGMA clustering generated four clusters, with genetic similarities ranging from 0.52 to 0.91. The variation among groups was 34%, while the variation among individuals within groups was 66%. Our study indicated a high level of diversity among the fifteen rice genotypes, which is expected to play an important role in rice genetic improvement. Although some genotypes of different origin did not cluster exactly according to their phylogeny, they could be grouped in the same cluster, probably because of similar yield potential, morphology, tolerance to drought and similarity at the genome level [50].
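To make the clustering step concrete, UPGMA can be sketched as repeated merging of the two clusters with the smallest average pairwise distance, where distance is taken as 1 minus the Jaccard similarity. The code below is a plain-Python illustration, not the PAST implementation used in the study; the four line names and their distances are hypothetical.

```python
def upgma(names, dist):
    """Average-linkage (UPGMA) clustering.

    dist maps frozenset({a, b}) to the distance between leaves a and b.
    Returns the merge history as (cluster_1, cluster_2, distance) tuples.
    """
    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Unweighted average over all leaf pairs between the two clusters.
                total = sum(
                    dist[frozenset((a, b))]
                    for a in clusters[i]
                    for b in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical Jaccard distances (1 - similarity) among four lines A-D.
pairs = {
    ("A", "B"): 0.15, ("C", "D"): 0.19, ("A", "C"): 0.40,
    ("A", "D"): 0.30, ("B", "C"): 0.35, ("B", "D"): 0.25,
}
dist = {frozenset(pair): value for pair, value in pairs.items()}

merges = upgma(["A", "B", "C", "D"], dist)
for left, right, d in merges:
    print(f"{left} + {right} joined at similarity {1 - d:.3f}")
```

Reading the merge history from first to last gives the dendrogram from its tips to its root, so the similarity of the final merge is the level at which the two major clusters separate.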

Conclusion
Drought is among the most challenging phenomena that reduce final crop yield and ultimately threaten food security. Rice, being the major staple crop, is also greatly affected by drought stress. Genetic diversity is an integral component for pooling desirable genotypes from different backgrounds. The significant differences among the studied lines indicated genetic variability that could be helpful for selection during the breeding process. Moreover, the present study clearly indicated that SSR markers are useful in assessing genetic diversity. The 10 SSR markers successfully classified the fifteen rice genotypes. The SSR markers RM164, RM223, RM246, RM518 and RM242 helped in the establishment of five groups based on molecular weight. A basic molecular allelic dataset was created that could distinguish drought-tolerant and drought-sensitive rice genotypes. The set of markers could also be utilized for studying polymorphism and assessing hybridity when crossing the genotypes, and might assist in marker-assisted selection. The current genetic diversity analysis clearly differentiated the genotypes into separate groups comprising indica and japonica/indica types. This would further assist hybridization as well as the identification of potential donors in marker-assisted selection because of their tolerance to drought. An effective breeding strategy could be formulated for designing high-yielding, drought-tolerant varieties for the farming community of Egypt.