Gene and expression analysis of the hexamerin family proteins from the grasshopper, Locusta migratoria (Orthoptera: Acridoidea)

ABSTRACT Hexamerins are large hemolymph proteins that have been found in all insect species studied so far. According to current studies, hexamerins are not only storage proteins that provide amino acids and energy, but may also transport hormones such as ecdysteroids. The cDNAs of the hexamerin family genes from the grasshopper Locusta migratoria were cloned using RT-PCR and RACE methods. The biological functions of LmiHx5 were studied by RNA interference (RNAi), and qRT-PCR was performed to detect LmiHx transcripts after RNAi. By comparison with previously obtained transcription data, four distinct family members (LmiHx1, LmiHx2, LmiHx4 and LmiHx5) were identified. Sequence alignments showed that the amino acid sequence identity among LmiHx family members was 32%–43% and that the encoded proteins have the same three hemocyanin domains: hemocyanin-C, hemocyanin-N and hemocyanin-M. Gene expression patterns in different tissues and at different developmental stages showed that the LmiHx family genes are widely expressed in both female and male adults and at all developmental stages. Expression fluctuated cyclically with the nymphal molts. In addition, the LmiHx2 mRNA expression level was highest in the fat body. LmiHx5 mRNA levels decreased substantially at 12, 24 and 48 h after dsLmiHx5 injection compared to the negative controls, and LmiHx5 was almost undetectable at 96 h after dsRNA injection. The relative expression levels of the other hexamerin family members increased to varying degrees after LmiHx5 RNAi. We speculate that members of the hexamerin gene family compensate functionally for one another.


Introduction
Hexamerins are large hemolymph proteins that have been found in all insect species studied so far and that share highly conserved protein domains and amino acid sequences with the hemocyanins of a variety of arthropods. Both belong to the hemocyanin superfamily [1]. This protein superfamily includes hemocyanins, phenoloxidases, pseudo-hemocyanins, hexamerin receptors, and hexamerins [2][3][4][5]. Phenoloxidases are copper-containing enzymes involved in the melanin pathway [6]. Pseudo-hemocyanins (cryptocyanins) probably serve as storage proteins [7,8], and dipteran hexamerin receptors mediate the uptake of hexamerins by the larval fat body [9]. In insects, hexamerins are accumulated to serve as sources of amino acids during metamorphosis, reproduction and periods when food is unavailable and demand for amino acids is high [10,11].
Hexamerins have been shown to be synthesized in the fat body and to be exported into the hemolymph where they either accumulate or are taken up by the fat body and sequestered in storage granules in the cytoplasm [12]. When amino acids are needed, these stored proteins are broken down by the fat body so that the amino acids become accessible to support the insect's metabolic needs.
At first, hexamerins were thought to act solely as storage proteins that provide amino acids and energy during non-feeding periods [13,14]. However, hexamerins may also transport hormones such as ecdysteroids [15] and juvenile hormone (JH) [16], similar to the vertebrate serum proteins. In termites, the effects of JH on caste differentiation have been known for several decades [17]. It has recently been demonstrated that JH induces the differentiation of the soldier caste from workers when its titres reach high enough levels [18,19]. Hexamerins have been reported from many insects, including Diptera and Lepidoptera [7,20] and Hymenoptera [21][22][23]. Within the orthopterans, hexamerins display a remarkable diversity [14]. Typically, orthopterans possess distinct types of hexamerins that differ in terms of amino acid composition, evolutionary history and probably function. In L. migratoria, a high-molecular-weight protein (Mr 500,000) in the hemolymph was isolated, characterized, and identified as larval hemolymph protein (LHP) [24]. Larval storage protein 1 (LSP1) and persistent storage protein (PSP) were identified in 1996 [16]. In Calliptamus italicus and Acrida cinerea, cloning and expression analysis of the hexamerin subunit type 2 gene was reported [25,26]. The persistent storage protein of Schistocerca americana (saPSP) was identified in 2003 [27]. In addition, there is evidence that hexamerins are involved in the immune response [28,29]. Surprisingly, hexamerins may also stimulate stem-cell proliferation [30,31] and may play an intracellular role in transcription regulation [23].
L. migratoria belongs to the order Orthoptera, superfamily Acridoidea. The insect is an agricultural pest that has developed resistance to many chemical pesticides. Studying the physiology of this grasshopper is therefore important for controlling locust infestations. In this paper, we cloned the cDNA sequences of hexamerin subunits from L. migratoria. The expression of hexamerin subunits at different developmental stages and in adult tissues was then investigated by quantitative real-time polymerase chain reaction (qRT-PCR). Through RNAi, we analysed the relationship between different members of the hexamerin family. Our aim was to further understand the role of hexamerins throughout the grasshopper life cycle.

Materials
The grasshopper L. migratoria was collected from Shigatse, Xizang, by Professor Daochuan Zhang et al. in September 2013. All material consisted of live specimens. Some were rapidly frozen in liquid nitrogen and stored at −80 °C after removal of the digestive tract; others were reared following Gillespie's methods [32]. The insects were held in a growth chamber at 30 °C and 75% relative humidity with an L14:D10 photoperiod. The L. migratoria eggs produced by the adults were incubated in a growth chamber at 28 ± 2 °C after diapause. After hatching, insects were fed wheat seedlings ad libitum. Nymphal instars were identified by the development of the wings; nymphs become adults after five molts. Plasmid pMD19-T was obtained from TaKaRa (China). Escherichia coli strain DH5α was obtained from TransGen Biotech (China). Restriction enzymes and T4 DNA ligase were obtained from TaKaRa (China).

cDNA synthesis
Total RNA was extracted using RNAiso Plus (TaKaRa Biotechnology). RNA purity and concentration were determined by agarose gel electrophoresis and a UV spectrophotometer, respectively. RNA was stored at −80 °C. The first cDNA strand was synthesized with the TransScript™ First-Strand cDNA Synthesis SuperMix (TransGen Biotech), using 500 ng of total RNA per 10 μL reaction. The mixture was incubated at 42 °C for 30 min and then at 85 °C for 5 min to inactivate the enzyme. The resulting cDNA was stored at −20 °C.

Rapid amplification of cDNA ends (RACE)
To obtain the full-length cDNA of the LmiHx family genes, RACE was performed using the 3'-Full RACE Core Set with PrimeScript™ RTase and the 5'-Full RACE Kit with TAP (TaKaRa, Dalian, China) according to the manufacturer's instructions. The gene-specific primers for 5' RACE and 3' RACE were designed from the internal amplification product (Table 1). All RACE products were purified with the TaKaRa Agarose Gel DNA Purification Kit (TaKaRa, Dalian, China) and sequenced as described above.

Sequence analysis
Fragment assembly of the cDNA nucleotide sequences was performed using DNAMAN software. The complete amino acid sequences of the different subunits were compared with sequences in the GenBank database using the BLAST program (available from the National Center for Biotechnology Information (NCBI), US National Institutes of Health). Homology analysis of the amino acid sequences was performed with ClustalX v1.81. Storage domains were predicted with the CD Search program available from the NCBI. Signal peptide locations were predicted with SignalP 4.1 (available from the Centre for Biological Sequence Analysis). The physicochemical properties of the proteins were predicted and analysed with the online software ExPASy (http://web.expasy.org/protparam/). Protein secondary structure was analysed with the SABLE online software (http://sable.cchmc.org/). Protein tertiary structure was analysed with the SWISS-MODEL server.
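To illustrate what a pairwise identity figure of the kind reported from the alignment means, the following minimal Python sketch counts identical residues over the aligned, non-gap columns of two sequences. It is not the ClustalX procedure itself: the function name, the choice of denominator (columns where neither sequence has a gap) and the toy sequences are assumptions made for illustration, and other tools may define the denominator differently.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: gaps are written as '-' and columns containing a gap in
    either sequence are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1

    return 100.0 * matches / compared if compared else 0.0


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not LmiHx sequences).
    seq_1 = "ERL-RDP"
    seq_2 = "ERLARNP"
    print(f"identity: {percent_identity(seq_1, seq_2):.1f}%")
```

In this toy case five of the six ungapped columns match, so the identity is about 83%.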

Comparative real-time quantitative PCR analysis
Nymphs were collected from the first, second, third, fourth and late fifth instars. Prothorax, abdomen, legs, fat body, ovary or testis, midgut, and hemolymph from adult females and males were isolated separately. Total RNA was extracted from these samples using the Trizol purification method. The extracted RNA was then used as a template for cDNA synthesis using the TransScript™ First-Strand cDNA Synthesis SuperMix (TransGen Biotech, Beijing, China). qRT-PCR primer pairs based on the full-length sequences were designed with the program Primer Premier 5. The levels of mRNA encoding the LmiHx gene family were normalized to the level of β-actin mRNA obtained with β-actin-specific primers [33]. The sequences of the primers used in real-time qPCR are listed in Table 2. Three independent biological replicates of each cDNA were run using SYBR® Premix Ex Taq™ II (Tli RNaseH Plus) (TaKaRa, Dalian, China) in a LightCycler® 1.5 (Roche, Germany). Each 20 μL PCR reaction mixture contained 500 nmol of primer and 100 ng of cDNA. A reaction without template was used as the control. Thermal-cycling conditions were an initial 95 °C for 30 s, followed by 40 cycles of 95 °C for 5 s, 59 °C for 30 s, and 72 °C for 15 s, and then a final melting curve step from 59 to 95 °C at a ramp rate of 0.05 °C/s. The data were analysed according to the $2^{-\Delta\Delta C_t}$ method described by Livak and Schmittgen [34]. Each experiment was repeated at least three times.
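The relative quantification described above can be sketched in a few lines of Python. This is a minimal illustration of the standard Livak calculation with β-actin as the reference gene, not the authors' analysis script; the function names and all Ct values below are hypothetical.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean Ct of the target gene minus mean Ct of the reference gene."""
    return mean(target_cts) - mean(reference_cts)


def relative_expression(target_sample, ref_sample, target_calibrator, ref_calibrator):
    """Relative expression by the 2^(-ddCt) method.

    ddCt = (Ct_target - Ct_ref) in the sample
         - (Ct_target - Ct_ref) in the calibrator (control) group.
    """
    ddct = delta_ct(target_sample, ref_sample) - delta_ct(target_calibrator, ref_calibrator)
    return 2.0 ** (-ddct)


if __name__ == "__main__":
    # Hypothetical Ct values from three replicates (illustration only).
    hx_sample = [26.9, 27.0, 27.1]      # target gene, sample of interest
    actin_sample = [18.0, 18.0, 18.0]   # beta-actin, sample of interest
    hx_control = [24.9, 25.0, 25.1]     # target gene, calibrator
    actin_control = [18.0, 18.0, 18.0]  # beta-actin, calibrator

    value = relative_expression(hx_sample, actin_sample, hx_control, actin_control)
    print(f"relative expression: {value:.2f}")
```

With these made-up values the target gene's Ct rises by two cycles relative to the calibrator while β-actin is unchanged, so the relative expression is about 0.25.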

RNA interference of LmiHx5
Based on the sequencing result for LmiHx5, specific primers containing restriction enzyme sites were designed and synthesized. The sequences are RNAi-LmiHx5-F1: CCCAAGCTTCTCAAGGCCAAAAGCACTG and RNAi-LmiHx5-R1: CCGCTCGAGCGTCCATTACCCTCCACAT (the underlined parts are restriction enzyme sites). The cDNA reverse-transcribed from total RNA of second-instar insects was used as the template. We injected 3.0 μg of dsLmiHx5 into each healthy second-instar locust nymph. The control groups were injected with an equivalent amount of dsGFP. The nymphs were then reared at 30 °C with an L14:D10 photoperiod on an artificial diet. The insects were observed at 12, 24, 48, 72, and 96 h after treatment. Each time, three to five active nymphs were randomly selected and stored at −80 °C for subsequent RNA extraction. The housekeeping gene β-actin was used as a reference. qRT-PCR was performed as described above. The specific expression primers are shown in Table 2. We used the $2^{-\Delta\Delta C_t}$ method to analyse the relative expression of LmiHx5.
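As a quick sanity check on the primer design, the two primer sequences can be scanned for restriction recognition sequences. The sketch below is only an illustration: the text does not name the enzymes, so the assumption that the sites are HindIII (AAGCTT) and XhoI (CTCGAG) is an inference from the primer sequences, and the helper name is hypothetical.

```python
# Recognition sequences assumed for illustration; the enzymes are not
# named in the text.
RECOGNITION_SITES = {
    "HindIII": "AAGCTT",
    "XhoI": "CTCGAG",
}

# Primer sequences as given in the Methods.
PRIMERS = {
    "RNAi-LmiHx5-F1": "CCCAAGCTTCTCAAGGCCAAAAGCACTG",
    "RNAi-LmiHx5-R1": "CCGCTCGAGCGTCCATTACCCTCCACAT",
}


def find_sites(sequence: str) -> dict:
    """Return {enzyme: 0-based start position} for each site found in the sequence."""
    sequence = sequence.upper()
    return {
        enzyme: sequence.find(site)
        for enzyme, site in RECOGNITION_SITES.items()
        if site in sequence
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, find_sites(seq))
```

Under these assumptions each primer carries one site immediately after a three-base 5' overhang, which is the usual layout for primers intended for directional cloning.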

Results and discussion
Cloning and sequence analysis of LmiHx family genes
Six hexamerin family genes identified from the grasshopper L. migratoria were named LmiHxL, LmiHx1, LmiHx2, LmiHx3, LmiHx4 and LmiHx5, and their cDNA sequences were cloned by RT-PCR and RACE. After sequencing, the results were checked by local BLAST against the previously obtained transcription data. The alignments showed that LmiHxL and LmiHx1 had the same sequence, as did LmiHx2 and LmiHx3. Four members of the LmiHx family were therefore confirmed: LmiHx1, LmiHx2, LmiHx4 and LmiHx5. Sequence analysis showed that the open reading frames (ORFs) of the LmiHx family genes were 1974-2058 bp long and encoded 657-685 amino acids. This result is consistent with previous research on insect storage hexamerins [13]. Upstream of the 3' poly(A) tail there is a typical eukaryotic polyadenylation signal, AATAAA. Homology analysis of the amino acid sequences of LmiHx1, LmiHx2, LmiHx4, and LmiHx5 with ClustalX showed that their identity is between 32% and 43%. The comparison indicated that the putative amino acid sequences of LmiHx1, LmiHx2, LmiHx4, and LmiHx5 not only contained four signature motifs of the hemocyanin family (ERL, RDP, RLNH, and GFP) (Figure 1, blue boxes), but also had signature motifs of hexamerin (AK/SDF/YD/ETFYK, YY/FT/L/YEDI/VGL, and TM/QLRDPVYY/K) (Figure 1, black boxes) [35]. The red boxes in Figure 1 (RxxR) represent protease cleavage sites [36]. The deduced amino acid sequences were analysed with the NCBI CD Search program. The results showed that LmiHx1, LmiHx2, LmiHx4, and LmiHx5 all have the three domains of the arthropod hemocyanin superfamily: a specific site Hemocyanin-C, a specific site Hemocyanin-N and a non-specific site Hemocyanin-M (Figure 2) [37,38].
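The reported ORF lengths and amino acid counts can be checked against each other with simple arithmetic: an ORF of n bp holds n/3 codons, and if the ORF length includes the stop codon the protein has one residue fewer. The short Python sketch below makes this explicit. The function name is hypothetical, and the assumption that the stated ORF lengths include the stop codon is ours, made because it reproduces the stated residue counts.

```python
def orf_to_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of encoded amino acids for an ORF of the given length in bp.

    Assumption: when includes_stop is True, the final codon is a stop
    codon and does not contribute a residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    for length in (1974, 2058):
        print(f"{length} bp -> {orf_to_residues(length)} amino acids")
```

Under that assumption, 1974 bp corresponds to 657 residues and 2058 bp to 685 residues, matching the range given in the text.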
The physicochemical properties of LmiHx1, LmiHx2, LmiHx4, and LmiHx5 were predicted with the online software ExPASy and are shown in Table 3.
The secondary structure of the LmiHx protein family was predicted with the online protein prediction tool SABLE. The results showed that the secondary structures of the LmiHx1, LmiHx2, LmiHx4, and LmiHx5 proteins consist of α-helices, β-sheets, and random coils. In the SWISS-MODEL analysis, the tertiary structures of the LmiHx1 and LmiHx2 proteins were modelled on the homologous Bombyx mori SP2/SP3 structure and showed 32.56% and 35.44% similarity with the template protein, respectively. The modelled regions span amino acids 8-630 and 6-647, respectively. Both were described as arylphorin (Figure 3(A,B)). The tertiary structure of the LmiHx4 protein was modelled on the Antheraea pernyi aromatic protein crystal structure and showed 30.24% similarity with the template protein. The modelled region spans amino acids 3-659 (Figure 3(C)). The template for LmiHx5 is the same as that for LmiHx1 and showed 26.32% sequence similarity. The modelled region spans amino acids 3-659 (Figure 3(D)).

Expression analysis of the LmiHx family genes
The transcript levels of the LmiHx family genes at different developmental stages and in different tissues were analysed by qRT-PCR. LmiHx family genes were expressed in L. migratoria at different levels throughout the life cycle as well as in the adult tissues (thorax, abdomen, hindleg, fat body, ovary or testis, midgut, and hemolymph). The expression levels of all LmiHx family genes increased in the first and second instars and then declined over subsequent days in every instar. However, the expression levels of the different family genes differed significantly within the same instar. This pattern of higher expression in the first and second instars followed by a decrease is consistent with research on hexamerin protein genes from Plutella xylostella [39]. In Blatta orientalis, the concentration of two kinds of aromatic protein reached its maximum at molting and then decreased [40]. Transcripts of the LmiHx family were present at lower levels during embryogenesis and in the third and fourth instars. There were no significant differences among these three stages. The expression of the LmiHx family genes fluctuates cyclically within each nymphal instar. This phenomenon also exists in other insects. The cyclical fluctuation could result from periodic regulation of hexamerin expression by ecdysteroids, although the regulatory mechanism is still not clear [41]. At least 98% of hexamerin protein was utilized during eclosion [42]. Overall, the relative expression level of LmiHx2 was the highest and that of LmiHx5 the lowest. During the juvenile stages, the expression level of LmiHx1 peaked on the 10th day of the first instar. Expression levels were also relatively high on the 8th day of the second and fifth instars. The expression level of LmiHx2 peaked on the 6th day of the second instar and was higher than that of the other members. The expression level of LmiHx4 peaked on the 8th day of the fifth instar (Figure 4).
As shown in Figure 5(A,B), all four LmiHx family genes were expressed in the different tissues, with higher levels in females than in males. It has been suggested that hexamerin protein supports female reproduction and egg development by enhancing the pool of sulphur-containing amino acids used for vitellogenesis [43]. The LmiHx2 expression level was significantly higher than that of the other family members, especially in the fat body, where it reached its maximum. This pattern supports the statement that the fat body is the main synthesis and storage site of hexamerin proteins [13,44,45]. The LmiHx family genes were expressed at the lowest levels in both male and female midgut. LmiHx2 transcripts were highly expressed in male fat body, hemolymph, and abdomen. Furthermore, the gene was expressed in the hemolymph, which is attributed to the open circulatory system that allows hexamerin protein to be delivered with the body fluids and provide energy to various tissues and organs [44]. The LmiHx2 expression level was extremely high in female fat body and was also high in female prothorax. LmiHx2 is a key member of the LmiHx family. There were no obvious differences between the expression levels of LmiHx1, LmiHx4, and LmiHx5 in females. The expression level of LmiHx5 in the different tissues was the lowest.

Expression analysis of the LmiHx family genes after LmiHx5 RNAi
The expression patterns of the LmiHx family genes were analysed at the transcript level after LmiHx5 RNAi. The expression of LmiHx5 was reduced remarkably at 12 h after injection of dsLmiHx5 (p = 0.0494), to 35% of that in the control group (injection of dsGFP), and it was almost undetectable at 96 h after dsRNA injection, at only 1.96% of the control group (p = 0.0011). The interference efficiency was significant (Figure 6(A)). The expression of the other LmiHx family members was analysed by qRT-PCR after LmiHx5 RNAi. The results showed that the expression of LmiHx1, LmiHx2, and LmiHx4 all increased to different degrees after dsRNA injection. The expression level of LmiHx1 increased remarkably at 12 h after dsRNA injection, to 3.387 times that of the control group (p = 0.0459), and reached 7.815 times that of the control group at 96 h after dsRNA injection (p = 0.0146) (Figure 6(B)). The expression of LmiHx2 increased remarkably at 48 h after dsRNA injection, to 1.732 times that of the control group (p = 0.0176), and at 96 h it reached 3.673 times that of the control group (p = 0.0006) (Figure 6(C)). The expression of LmiHx4 increased remarkably at 12 h after dsRNA injection, to 1.313 times that of the control group (p = 0.0285), and at 72 h after dsRNA injection it reached 7.685 times that of the control group (p = 0.0095) (Figure 6(D)). The expression of other family members with similar functions was thus up-regulated to different degrees after LmiHx5 RNAi. This study suggests that there is a functional compensatory effect between hexamerin family genes. More research is needed to reveal the exact and more detailed functions of the hexamerin family genes in the grasshopper.
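The interference efficiency implied by the LmiHx5 figures above can be made explicit with a short calculation: expression at a given fraction of the dsGFP control corresponds to a knockdown of one minus that fraction. The Python sketch below uses the two values reported in the text (35% and 1.96% of control); the function name is hypothetical and the calculation is a plain restatement of those numbers, not an additional result.

```python
def knockdown_percent(fraction_of_control: float) -> float:
    """Percent reduction in expression relative to the control group.

    fraction_of_control is the relative expression of the treated group,
    with the control group set to 1.0.
    """
    if fraction_of_control < 0:
        raise ValueError("relative expression cannot be negative")
    return (1.0 - fraction_of_control) * 100.0


if __name__ == "__main__":
    # Relative LmiHx5 expression after dsLmiHx5 injection, as reported in the text.
    reported = {"12 h": 0.35, "96 h": 0.0196}
    for time_point, fraction in reported.items():
        print(f"{time_point}: {knockdown_percent(fraction):.2f}% knockdown")
```

This gives a knockdown of about 65% at 12 h and about 98% at 96 h. The same function returns a negative value for up-regulated genes such as LmiHx1, for which a fold change is the more natural way to report the result.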

Conclusions
In this study, four members of the LmiHx family were confirmed and their full-length cDNA clones were obtained. Analysis of the LmiHx gene family sequences showed that they contain the three conserved structural domains of the arthropod hemocyanin superfamily. LmiHx family genes were expressed in L. migratoria at different levels throughout the life cycle, showing a rise followed by a decline. LmiHx family genes were also expressed at a higher level in females than in males. The expression of LmiHx1, LmiHx2, and LmiHx4 increased to different degrees after LmiHx5 RNAi, which suggests a functional compensatory effect between hexamerin family genes.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This project was funded by the National Natural Science