Analysis of genomic regions for crude protein and fractions of protein using a recombinant inbred population in Rice (Oryza sativa L.)

The improvement of grain seed storage proteins (SSPs) is integral to rice breeding for superior nutritional quality. In the present investigation, a total of 44 QTLs for crude protein content (PC), glutelin (GLU), globulin (GLO), albumin (ALB) and prolamin (PRO) were detected on all 12 chromosomes, with a single QTL explaining 3.70–12.47% of phenotypic variation. The majority of the detected QTLs were located in the RM7158-RM3414 region, which includes the Waxy (Wx) gene, along with qPC1. The highly significant positive correlations showed that the traits under investigation are related to each other. Three QTLs (qPC6, qPC7 and qGLU6) showed a relatively higher rate of phenotypic contribution, suggesting an influence of environmental conditions. Epistatic QTL pairs were also detected for GLO and PRO, whereas main-effect QTLs (M-QTLs) were the primary genetic basis for PC, GLU and ALB. The outcome of the present study will help to uncover the genetic foundation of protein and protein fractions for future grain nutritional quality improvement programmes.


Introduction
Rice, being an important source of caloric intake, ensures food security for billions of people around the globe [1]. Improving quality along with yield is a major focus for rice breeders worldwide, especially in Asian rice-growing countries [2]. Grain quality is complex and is the sum of different attributes, including cooking and eating, nutritional, milling and appearance quality [3]. Quality improvement ultimately uplifts human nutrition and health; however, approximately one-third of humans on earth suffer from an inadequate supply of protein. Inferior protein quality and the unavailability of essential vitamins and micronutrients may cause some diseases [4]. The rice grain is composed of more than 90% protein and starch [5,6]. In addition, protein and fractions of protein, known as seed storage proteins (SSPs), are key elements explaining the nutritional quality aspects [7]. Rice holds lower protein content than other cereals, but net utilization of protein is higher owing to the most significant consumers [8]. Similarly, the demand for rice of high nutritional quality is estimated to rise over time [9].
The rough endoplasmic reticulum (ER) is the key site for the synthesis of SSPs, which are translocated to the ER lumen and further transferred to discrete intracellular compartments of the plant's endomembrane system [10]. Rice contains a relatively balanced amino acid composition, and the SSPs are fractionated into albumins (ALB), globulins (GLO), prolamins (PRO) and glutelins (GLU) according to differences in solubility [11]. Rice SSPs have the second-highest content of lysine, the limiting amino acid, after oats [12]. GLU, an alkali-soluble protein, makes up 80% of SSPs and is found in the milled fraction [13]. It is accumulated in irregularly shaped protein bodies II (PB-II) derived from the protein storage vacuole and globulin, as a 57 kDa precursor [14]. The GLUs are further divided into the subfamilies GluA, GluB, GluC and GluD, depending on similarity in amino acid sequence [15], and have high nutritional value for the human diet [16]. GLU is a major SSP of the rice grain, and any modification in it may have a significant influence on grain quality. PRO, an equally distributed alcohol-soluble protein, makes up less than 5% of SSPs and is stored in spherical PB-I. PRO is categorized into three types based on molecular mass: 10 kDa prolamin (RP10), 13 kDa prolamin (RM1, RM2, RM4, and RM9), and 16 kDa prolamin (RP16) [15]. Water-soluble ALB and salt-soluble GLO are primarily concentrated in the embryo and the outer aleurone layer of the endosperm. The major portion of these protein fractions is lost during the polishing process [17]. Moreover, ALB and GLO are also known to include allergenic proteins. Hypoallergenic rice protein may be helpful for patients who are allergic to these proteins. In addition, a low-glutelin diet is recommended for patients suffering from diabetes and kidney stones [18]. Therefore, more consideration must be given to protein quality and concentration in any rice breeding programme.
The rice SSPs are polygenic and a mixture of highly polymorphic polypeptides, so mutation in some of the structural genes may have little influence on grain protein content [19]. Over the past decades, intensive efforts have been undertaken to understand the genetic basis of crude protein content (PC) in rice [20,21]. These studies have shown QTLs of significant influence located on chromosomes 1 and 6. A significant gene, qPC1, encoding a putative amino acid transporter, OsAAP6, which functions as a positive regulator of PC, was cloned and functionally characterized [21]. GLU is encoded by 15 genes, and past studies have revealed that 57H mutants have a high concentration of 57 kDa pro-glutelin with a floury/opaque endosperm phenotype [22]. Several genes have been isolated from 57H mutants; however, only gpa3, Osvpe1, and OsRab5a have been successfully cloned [23]. The PRO fraction is encoded by 34 gene copies, whereas only a few loci have been successfully cloned and functionally characterized [24]. Zhang et al. [16] reported 16 QTLs for PC and other fractions of protein.
The SSPs are responsible for determining the nutritional value of rice grains. However, it is also well known that PC influences cooking quality via protein-starch interaction and can hamper starch gelatinization; therefore, any disorder in the protein's structure increases the viscosity of rice meal during cooking [25]. Thus, the present study investigated the physico-chemical properties of a recombinant inbred line (RIL) population of 193 lines, derived from an inter-sub-specific cross between the Japonica rice Nipponbare (NIP) and the Indica super rice YK17, to identify QTLs associated with crude protein and fractions of protein. The detected QTLs were analysed further to understand epistatic effects and their interaction with the environment. The results of the present study display the genetic architecture of SSPs in rice, which may help breeders further improve nutritional quality by marker-assisted selection.

Plant materials and field experiments
A recombinant inbred line (RIL) population, containing 193 lines, was developed by the single-seed descent (SSD) method from a cross between the genome-sequenced Japonica rice NIP and the Indica super rice YK17 as parents (  HN 2017 field trial, seeds were sown in November, and seedlings were transplanted in December. Each plot consisted of three rows of 21 plants at a spacing of 25 cm (between rows) by 20 cm (within rows). The field trials were arranged in a randomized complete block design (RCBD) with three replications. Irrigation, fertilizer application and other management measures followed standard field production practices. At maturity, each plot of RILs was harvested in bulk and dried naturally. The dried rice grains were stored for three months at room temperature before the evaluation of physico-chemical properties.

Quality trait evaluation
From each individual line, filled grains were used to evaluate grain quality. Hulls were removed from 125 g of grains using a Satake testing husker (THU-35A, Satake Engineering, Japan), and the grains were de-branned with a McGill No. 2 mill (Seedburo Equipment, U.S.A.). Milled rice flour samples were obtained by grinding milled rice grains to pass through a 0.42 mm screen on a Udy cyclone mill (Cyclotec 1093 sample mill, Tecator, Sweden). The milled flour samples were sieved through a 100-mesh sieve to obtain a uniform granule size. The following standard procedures were followed for the traits under investigation.

Crude protein and fractions of protein
The PC was measured using the micro-Kjeldahl pretreatment method [26]. Rice protein fractions were prepared from rice flour following the process of [27] with minor modifications. For each protein fraction, 1.5 g of milled rice flour was weighed separately, with three replicates per line. For GLU, 0.1 M NaOH was used as the extraction buffer, and the samples were stirred with 10 ml of extraction buffer in an ice-water bath for 2 hr. For GLO, ALB and PRO, the extraction buffers were 0.5 M NaCl, ddH2O and 70% n-propanol, respectively, and the samples were stirred with 10 ml of extraction buffer at room temperature for 4 hr. The procedure was repeated three times. The extracts were stored in the freezer for further analysis. Extracts were separated from residues by centrifugation (10,000 rcf, 10 min). The PC of the extracts was measured by the semi-micro-Kjeldahl method. The PC and fractions of protein were converted to ammonium nitrogen by sulphuric acid digestion, and the absorbance of the blue product of the reaction with sodium salicylate and hypochlorous acid was measured at a wavelength of 660 nm. The nitrogen contents were measured with a Rapid Flow Auto Analyzer (AA3, SEAL, Germany). A conversion factor of 5.95 was used to calculate PC from the measured nitrogen content of milled rice flour.
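The final conversion step can be illustrated with a minimal sketch. Only the factor of 5.95 comes from the text; the function name and the example nitrogen value are hypothetical and serve illustration only.

```python
def protein_from_nitrogen(nitrogen_pct, factor=5.95):
    """Convert nitrogen content (% of flour weight) to crude protein (%).

    The default factor of 5.95 is the nitrogen-to-protein conversion
    factor stated in the text for milled rice flour.
    """
    if nitrogen_pct < 0:
        raise ValueError("nitrogen content cannot be negative")
    return nitrogen_pct * factor


# Hypothetical nitrogen reading of 1.40% (not a value from this study).
example_nitrogen = 1.40
example_protein = protein_from_nitrogen(example_nitrogen)
print(f"{example_nitrogen:.2f}% N -> {example_protein:.2f}% crude protein")
```

Under this assumption, a flour sample with 1.40% nitrogen would correspond to about 8.33% crude protein.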

Linkage map construction
A previously constructed linkage map of the RIL population was used in the present investigation. DNA was extracted from the young leaves of rice seedlings by the sodium dodecyl sulphate method. For linkage map construction, 163 simple sequence repeat (SSR) markers showing clear polymorphism between Nipponbare and YK17 were used. The markers were selected from the public database. MAPMAKER/Exp V 3.0 [28] was used to construct the linkage map, and the recombination rate was converted into genetic distance (cM) using the Kosambi function. Map-Draw V2.1 [29] was used to draw the linkage map based on the obtained linkage data. The map spanned a total of 1479.40 cM on all 12 chromosomes, with an average interval of 9.08 cM between adjacent markers (Table 1).
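As an illustration of the Kosambi conversion mentioned above, the following sketch implements the standard Kosambi mapping function and its inverse. It is not the MAPMAKER/Exp implementation; the function names and the example recombination fraction are hypothetical, while the map length and marker count are taken from the text.

```python
import math


def kosambi_cm(r):
    """Map distance in cM from a recombination fraction r (0 <= r < 0.5),
    using the Kosambi function d = 25 * ln((1 + 2r) / (1 - 2r))."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm):
    """Inverse Kosambi function: recombination fraction from distance in cM."""
    return 0.5 * math.tanh(d_cm / 50.0)


# Hypothetical recombination fraction of 0.10 between two markers.
print(f"r = 0.10 -> {kosambi_cm(0.10):.2f} cM")

# Average interval reported in the text: total map length / number of markers.
print(f"average interval = {1479.40 / 163:.2f} cM")
```

For small recombination fractions the Kosambi distance is close to 100 × r, and it grows faster than r as r approaches 0.5 because the function accounts for crossover interference.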

QTL analysis
QTLs controlling PC and fractions of proteins were mapped using Windows QTL Cartographer Version 2.5 (WinQTLCart 2.5) [30] with the composite interval mapping (CIM), and a LOD value of 2.5 was set as the threshold for the detection of putative QTLs. QTLs with epistatic effects and QTL-by-environment interaction (QEs) effects were analysed using QTL Network-2.1 [31] with the mixed-model-based composite interval mapping (MCIM).
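To make the LOD threshold concrete, the sketch below relates a LOD score to the likelihood-ratio (LR) statistic using the general identity LR = 2 ln(10) × LOD. It does not reproduce WinQTLCart output; the function names are hypothetical, and only the threshold of 2.5 comes from the text.

```python
import math


def lod_to_lr(lod):
    """Likelihood-ratio statistic equivalent to a LOD score.

    LOD = log10(L1 / L0) and LR = 2 * ln(L1 / L0), so LR = 2 * ln(10) * LOD.
    """
    return 2.0 * math.log(10.0) * lod


def is_putative_qtl(lod, threshold=2.5):
    """Apply the LOD threshold of 2.5 stated in the text."""
    return lod >= threshold


# At the threshold, the QTL model is 10**2.5 times more likely than the null.
print(f"LOD 2.5 -> LR = {lod_to_lr(2.5):.2f}, odds = {10 ** 2.5:.1f}")
```

A LOD of 2.5 therefore corresponds to an LR statistic of about 11.51, meaning the data are roughly 316 times more likely under the QTL model than under the no-QTL model.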

Statistical analysis
All data were analysed using SPSS 22.0 and Excel 2016.

Phenotypic variation in parents and the RILs for PC and fractions of protein
Over three rice growing seasons, the variation in parents and RIL population was significant and broad. Normal frequency distribution of phenotypic data was observed for PC, GLU, GLO, ALB and PRO, indicating the influence of several genes in controlling traits under observation (Table 2 and Figure 2).

Correlation analysis of traits
The correlation coefficients among all traits under investigation ranged from −0.0793 to 0.5526 (Table 3). Significant to highly significant correlations were observed between PC and the fractions of protein, except for non-significant correlations between PC and ALB and between PC and PRO. A highly significant correlation was observed between PC and GLU in all three environments (HZ 2017, HN 2017 and HZ 2018). GLU exhibited a significant to highly significant positive correlation with ALB, GLO and PRO in all three growing seasons.
Similarly, ALB showed a significant to highly positive significant correlation with GLO and PRO. Meanwhile, GLO showed a highly positive significant to significant

Mapping of QTLs
A total of 44 QTLs related to PC, GLU, GLO, ALB and PRO were detected with WinQTLCart 2.5 based on CIM using the phenotypic performance of the RIL population (Table 4). The detected QTLs were distributed on all 12 chromosomes, with a single QTL explaining 3.70–12.47% of phenotypic variation. Among the 44 detected QTLs, 3 QTLs for GLO were detected on chromosome 1, whereas 3 QTLs for PC, 3 QTLs for GLU and 3 QTLs for ALB were detected on chromosome 6 across the three growing seasons (Figure 3).

QTLs for protein content (PC)
The

QTLs for glutelin content (GLU)
Eight and qGLU10 were only significant in HZ 2018. The QTLs, qGLU4, qGLU8, qGLU5 and qGLU10 explained 6.14%, 9.79%, 7.93% and 5.36% phenotypic variance, respectively. The positive additive effect of qGLU4 and qGLU10 was contributed by the allele of Japonica NIP, whereas qGLU8 and qGLU5 were decreased by the allele of Indica YK17.

QTLs for albumin content (ALB)
A total of nine QTLs associated with ALB were detected under all three population growing seasons, with the explained phenotypic variation by individual QTL ranged from 5.53% to 10 and qALB9 in HN 2017, and qALB4 in HZ 2018 explained phenotypic variation of 9.91%, 8.31%, 6.50% and 6.19%, respectively. The positive additive effect of qALB9 and qALB4 was contributed from the allele of Japonica NIP, whereas the allele for qALB10 and qALB5 came from Indica YK17.

QTLs for prolamin content (PRO)
Nine QTLs were detected under all three (HZ 2017, HN 2017 and HZ 2018) population growing seasons, with reduced qPRO1, −0.04% and −0.03%, respectively. The QTL, qPRO6 was detected under HN 2017 and HZ 2018 with the explained phenotypic variation of 6.85% and 11.40%, respectively. The allele from Indica YK17 reduced qPRO6, −0.04% and −0.05%, respectively. Moreover, QTL, qPRO8 was detected under HN 2017 and HZ 2018 with the explained phenotypic variation of 5.76% and 7.60%, respectively. The allele from Indica YK17 came in HN 2017, whereas, the positive additive effect was contributed by Japonica NIP in HZ 2018. Other QTLs, qPRO4 and qPRO12 in HZ 2017 and qPRO8 in HZ 2018 contributed phenotypic variation of 7.80%, 7.85% and 8.84%, respectively. The allele came from Indica YK17 for all these three QTLs.

Detection of QTLs with additive × environment and epistasis interactions
To understand the genetic architecture of protein content and fractions of protein, the digenic epistatic effects of PC, GLU, GLO, ALB and PRO were estimated. Three QTLs (qPC6, qPC7 and qGLU6) were detected by the joint analysis of PC, GLU, GLO, ALB and PRO across all three growing seasons, with non-significant additive × environment interaction (Table 5). Their phenotypic contribution rate was relatively high. This suggested that the QTL expression of PC and GLU was influenced by environmental conditions. Epistatic interaction also plays an important role in determining rice grain quality. Therefore, to better understand the genetic components of these attributes, five locus bi-allelic epistatic interactions were estimated (Table 6). GLO and PRO together showed two pairs of epistatic loci, whereas PC, GLU and ALB exhibited non-significant epistatic effects, indicating that M-QTLs control PC, GLU and ALB. GLO revealed one pair of epistatic loci, which explained 1.66% of phenotypic variation. One pair of epistatic loci detected for PRO accounted for 7.41% of the phenotypic variation.

Discussion
Rice meets a large share of the caloric and nutritional demand of approximately half of the world population [32]. Improvement in rice grain protein quality is possible only by regulating SSP contents. The genetic architecture of SSPs is highly complex and multigenic, involving a mixture of highly polymorphic polypeptides. Mutation in some structural genes has little or no effect on the overall protein composition of the grain [17,19]. Similarly, the significance of major and minor genes influenced by epistatic and environmental interactions is also noteworthy. Therefore, the combined utilization of conventional and molecular breeding techniques, e.g. marker-assisted selection (MAS), may be an effective and reliable approach to harnessing the SSP concentration in rice grain [16,33]. Moreover, contemporary studies designed to identify QTLs for rice grain protein content have elaborated the genetic pillars of protein content [34]. The interrelationship among fractions of SSPs is complicated and requires comprehensive research.
In the present investigation, PC, GLU, GLO, ALB and PRO were analysed to detect M-QTLs, epistatic QTLs and QE associations across three growing seasons. PC is an integral component for improving the nutritional quality and palatability of cooked rice [25]. The synergistic relationship of amylose with PC means that rice with less than 7% protein is good in taste [35]. The QTL qPC6, detected repeatedly at the marker position near the Wx locus on chromosome 6, confirmed the results derived from several other mapping populations [20,36]. Similarly, qPC1, a major QTL for PC on chromosome 1, was detected in the HN 2017 and HZ 2018 growing seasons, suggesting the influence of qPC1 on rice grain protein content. However, qPC7, detected in two growing seasons (HZ 2017 and HN 2017), needs further investigation, and it could be a potential gene controlling PC along with qPC1 and qPC6. GLU is the highest in concentration and contains more essential amino acids, especially lysine, for human consumption compared to other fractions of protein. A QTL, qGLU6, was detected repeatedly in the same genomic region as qPC6, with a negative additive effect contributed by Indica YK17. Moreover, a highly significant correlation (r = 0.5526**) between PC and GLU was observed in the RIL population. These results indicated that PC and GLU might share the same genetic mechanism. GLO is concentrated in the embryo and the outer aleurone layer of the endosperm, and a major portion of this protein is removed during milling. Only a few studies have been conducted to clone and characterize the genes controlling GLO [24]. It is noteworthy that qGLO1 was detected as a major gene for GLO in all three growing seasons within the same marker interval. It can be assumed that the qGLO1 globulin gene might be under a regulatory mechanism different from those of GLU, PRO and ALB, due to its unique genetic organization.
The ALB protein, like GLO, is associated with allergenic proteins. The QTL qALB6 was found to be a major determinant of ALB in all three growing seasons, within the same marker interval as qPC6 and qGLU6. Chen et al. [34] reported the influence of the Wx gene in controlling ALB. They found that the 3.3 kb Wx pre-mRNA is positively correlated with ALB, providing new insights into the genetic basis of rice grain quality. In addition, qPRO6 was detected for the alcohol-soluble PRO protein; it shares the same genomic region as qPC6, qGLU6 and qALB6, indicating an influence of the Wx gene region on PRO. These results indicated the simultaneous influence of major genes on the fractions of protein, whereas the highly significant positive correlations among the fractions of protein (GLU, GLO, ALB and PRO) indicated a partially common genetic mechanism, which is consistent with the fact that these traits are related [37]. The non-significant correlations of ALB and PRO with PC contradict the findings reported by Zhang et al. [16]. In other studies, it was found that ALB was negatively correlated with PC, and there was no correlation between GLO and PC [16,38]. The discrepancy in the present results may be due to differences in the germplasm evaluated, experiment location, and environmental influence. However, the QTLs for PC, GLO and PRO were observed on different chromosomes, which could be helpful for fine mapping the underlying alleles controlling these attributes. These identified loci for PC, GLO and PRO can be further introgressed into elite cultivars via marker-assisted selection (MAS) to meet nutritional and other specific objectives. In the present investigation, the genetic basis for protein and fractions of protein (GLU, GLO, ALB and PRO) was obtained by analysing QEs and epistatic QTLs.
The QE interactions showed relatively high phenotypic contribution rates, suggesting that the QTL expression of PC and GLU was influenced by environmental conditions, whereas for GLO, ALB and PRO the influence of M-QTLs was dominant over QEs. Epistatic interactions were observed for GLO and PRO but not for PC, GLU and ALB. The summed effect of the detected epistatic QTLs for GLO and PRO was higher than that of the M-QTLs, suggesting the influence of the epistatic effect of one QTL over another QTL.
In conclusion, better grain quality could be achieved by regulating the SSP content. The present investigation revealed the significance of minor QTLs, epistatic QTLs, QE interactions and M-QTLs for rice grain quality improvement programmes. The genetic mechanism of quantitative regulation for PC and fractions of protein in rice is inter-correlated, which is evident from the co-localization of QTLs responsible for these SSPs. Future genetics-based studies, such as fine mapping of novel QTLs, interaction studies of genes and searching for non-environment-specific QTLs, are required to elucidate the genetic mechanism of quantitative regulation and to obtain DNA markers tightly linked to the desirable QTLs to facilitate MAS in rice breeding for high nutritional quality.