In silico prediction of neuropeptides from the neural ganglia of Pacific abalone Haliotis discus hannai (Mollusca: Gastropoda)

Abstract
Neuropeptides play a central role in the regulation of reproduction, growth, development, and various other physiological processes. In our study, 26 neuropeptide precursors were identified from the pleuropedal and cerebral ganglia of Pacific abalone using transcriptome analysis. Of them, two (neuromacin and neuroglian) were potentially novel and two (myomodulin and FMRFamide) were involved in the reproductive regulation of mollusks. A BLAST search indicated that most of these neuropeptides exhibited the highest homologies with neuropeptides of other protostomian species. Myomodulin, FMRFamide, and molluscan insulin-related peptides shared 87%, 72%, and 88% sequence identities with homologous peptides of the tropical gastropod mollusk, Haliotis asinina. In silico analysis revealed that these identified neuropeptide precursors were likely to be extracellular secreted proteins. Alignment of multiple cerebrin, FFamide, and insulin-related peptide sequences illustrated that most of their biologically active residues were highly conserved with other invertebrate homologous neuropeptide residues. Three-dimensional (3D) structures of adipokinetic hormone, allatostatin (A, B, and C), allatotropin, cerebrin, clionin, conopressin, elevenin, and neuromacin precursors in H. discus hannai exhibited a helix-loop-helix structure. Based on phylogenetic analysis, buccalin A1 and A2, allatostatin B, and FCAP neuropeptide precursors of Pacific abalone formed a clade with neuropeptide precursors of other gastropod and bivalve mollusks. This study provides a novel insight for further functional studies of abalone and other gastropod mollusks.


Introduction
Neuropeptides are ubiquitous neuron-secreted peptides that can act as hormones, neurotransmitters, and neuromodulators (Jékely 2013). They are the largest and most diverse class of endogenous proteinaceous molecules within the nervous system (Burbach 2011). These intercellular signaling molecules are derived from prepropeptides that are typically produced by protein synthesis within neurons or neurosecretory cells (Nässel & Larhammar 2013). Neuropeptides and peptide hormones are widely distributed throughout the nervous and endocrine systems of protostomes and deuterostomes. The biological activity of neuropeptides for regulating target cell functions is mainly mediated by G protein-coupled receptors (GPCRs) or through the phospholipase-phosphatidylinositol pathway involving a different group of second messengers (Strand 2003; Tanaka et al. 2003). Neuropeptides are synthesized as a larger, inactive precursor molecule that requires proteolytic processing to generate active neuropeptides, which then act on target cells by interacting with specific signal-transducing membrane receptors (Zupanc 1996). Neuropeptides are crucial for regulating a plethora of cellular and physiological functions involved in feeding, digestion, excretion, circulation, and reproduction (Nässel & Larhammar 2013). These endogenous active substances play key roles in the regulation, formation, maturation, and release of gametes in both vertebrates and invertebrates (Hartenstein 2006).
Neuropeptides have been studied extensively. The earliest studies identified and characterized the neuropeptides oxytocin and vasopressin (Du Vigneaud et al. 1953; Popenoe & du Vigneaud 1954). Two peptides of the vasopressin-oxytocin family have been identified and characterized in marine gastropod mollusks (Cruz et al. 1987). In addition, neuropeptides have been identified from Lottia gigantea (Veenstra 2010), Deroceras reticulatum (Ahn et al. 2017), Charonia tritonis (Bose et al. 2017), and Patinopecten yessoensis (Zhang et al. 2018), all of which provide valuable information for understanding the mechanisms of neuroendocrine regulation in mollusks.
The abalone H. discus hannai is an ecologically and economically important marine species that is naturally distributed in Korea, Japan, China, and Taiwan. Several recent studies have identified a number of reproductive and metabolic genes in H. discus hannai (Mendoza-Porras et al. 2014; Kim et al. 2017). Additionally, neuropeptide precursors that influence sexual maturation in female Pacific abalone have been recently reported by Kim et al. (2019). In the present study, neuropeptide transcripts were identified and characterized from the pleuropedal and cerebral ganglia of H. discus hannai to provide valuable resources for future neuroendocrine research in abalone species.

Sample collection
In July 2018, two-year-old adult female abalone (H. discus hannai; 10.5 cm in shell length and 148.2 g in body weight) were collected from Jindo Island (South Korea) and transferred to the laboratory in the Department of Fisheries Science, Chonnam National University. The abalone were euthanized by immersion in an anesthetic solution (ethyl 3-aminobenzoate methane sulfonate, MS-222) to obtain neural ganglia tissues (pleuropedal ganglion, cerebral ganglion). After collection, samples were immediately frozen in liquid N2 and stored at −80°C for subsequent use.

RNA extraction and transcriptome analysis
Total RNA was extracted from the ganglion tissues using an RNeasy mini kit (Qiagen, Hilden, Germany) and then treated with RNase-free DNase (Promega, Madison, WI, USA) to eliminate genomic DNA according to the manufacturer's instructions. RNA concentration and purity were determined using a NanoDrop® NP 1000 spectrophotometer, and RNA integrity was evaluated by agarose gel electrophoresis. A cDNA library was constructed using a TruSeq Standard mRNA Sample Preparation Kit (Illumina, CA, USA) following the manufacturer's recommendations. It was then subjected to paired-end sequencing (2 × 100 bp) to produce raw reads on an Illumina HiSeq 2500 platform (Illumina, CA, USA).
Raw reads were filtered using an in-house Perl script to remove low-quality reads that contained more than 10% skipped bases (marked as 'N's) or base quality scores less than 20. Filtered reads were mapped to reference genomes of related species (abalone and other molluscan species) using the TopHat aligner (Trapnell 2009).
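The read-filtering rule described above can be illustrated with a minimal Python sketch. It is not the authors' Perl script. The function names are hypothetical, and two points are assumptions: that the quality criterion applies to the mean Phred score of a read, and that qualities are Phred+33 encoded.

```python
def phred33(qual_string):
    """Decode a Phred+33 quality string into integer scores (assumed encoding)."""
    return [ord(c) - 33 for c in qual_string]


def passes_filter(seq, quals, max_n_frac=0.10, min_mean_q=20):
    """Return True if a read has at most 10% N bases and a mean quality of at least 20.

    Using the mean quality is an assumption; the text only states
    'base quality scores less than 20'.
    """
    if not seq or not quals:
        return False
    n_frac = seq.upper().count("N") / len(seq)
    mean_q = sum(quals) / len(quals)
    return n_frac <= max_n_frac and mean_q >= min_mean_q


def filter_pairs(pairs):
    """Keep a read pair only when both mates pass the filter.

    pairs: iterable of ((seq1, qual_string1), (seq2, qual_string2)).
    """
    kept = []
    for (s1, q1), (s2, q2) in pairs:
        if passes_filter(s1, phred33(q1)) and passes_filter(s2, phred33(q2)):
            kept.append(((s1, q1), (s2, q2)))
    return kept
```

Dropping a pair when either mate fails keeps the two read files synchronized, which paired-end aligners generally require.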

Sequence analysis
The amino acid sequences of neuropeptides or peptide hormones of mollusks and other invertebrates were used as queries for BLAST analysis to search for neuropeptide transcripts of H. discus hannai. The open reading frame (ORF) of each candidate transcript was translated, and the deduced amino acid sequence was used as a BLASTp query to identify neuropeptide orthologues. In silico analysis of the identified neuropeptides was performed using multiple online software tools. Basic Local Alignment Search Tool (BLASTP) (http://www.ncbi.nlm.nih.gov/BLAST/) was used to determine the homologies of H. discus hannai neuropeptide transcripts to neuropeptide transcripts of other species. Signal peptides were predicted using SignalP 4.1 (www.cbs.dtu.dk/services/SignalP/). Prepropeptide cleavage sites were predicted with NeuroPred (http://stagbeetle.animal.uiuc.edu/cgi-bin/neuropred.py) (Southey et al. 2006). Cysteine residues of disulfide bridges were predicted with CYSPRED (Fariselli et al. 1999). Primary structures and subcellular localization of proteins were detected using ProtParam (http://expasy.org/tools/protparam.html) and Protcomp (http://linux1.softberry.com/berry.phtml), respectively. Multiple alignments of deduced amino acid sequences of neuropeptides were conducted using Clustal Omega (Sievers et al. 2011).
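To make the idea of prepropeptide processing concrete, the following Python sketch applies a simple rule: cleave at runs of two or more basic residues (K or R) and treat a C-terminal Gly as an amidation donor. This is a deliberately simplified illustration, not the statistical model used by NeuroPred. The function name, the `signal_len` parameter, and the toy precursor are hypothetical.

```python
import re

# Two or more consecutive basic residues (KR, RR, KK, RK, ...) are treated
# as a cleavage site. Monobasic sites are ignored in this simplified rule.
BASIC_RUN = re.compile(r"[KR]{2,}")


def predict_peptides(precursor, signal_len=0):
    """Return (peptide, amidated) tuples from a precursor sequence.

    signal_len: number of N-terminal residues to discard as signal peptide
    (for example, a length predicted by a tool such as SignalP).
    """
    propeptide = precursor[signal_len:].upper()
    peptides = []
    for fragment in BASIC_RUN.split(propeptide):
        if not fragment:
            continue
        # A C-terminal glycine is assumed to be converted into an amide group.
        if len(fragment) > 1 and fragment.endswith("G"):
            peptides.append((fragment[:-1], True))
        else:
            peptides.append((fragment, False))
    return peptides


# Toy precursor (hypothetical, not an abalone sequence).
toy = "SPFMRFGKRNDFLRFGRRAQW"
print(predict_peptides(toy))
```

The single R inside FMRF is not cleaved because it is not part of a basic run, which is why tetrapeptides of this kind survive processing in the sketch.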

Phylogenetic analysis and three-dimensional protein structure
Homologous neuropeptide precursor sequences from invertebrates were retrieved to construct phylogenetic trees. The neighbor-joining algorithm was used to build phylogenetic trees using MEGA software (version 6.06) (Tamura et al. 2013). Numbers reported on branches were obtained using a bootstrap analysis conducted on 1000 replicates to evaluate the significance of nodes. 3D structures of neuropeptide precursors were predicted using the I-TASSER server (Yang et al. 2015). Visualization of the predicted 3D structure was performed using Chimera software (https://www.cgl.ucsf.edu/chimera/).
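The bootstrap principle behind the branch support values can be sketched in Python as follows. Alignment columns are resampled with replacement, and the fraction of replicates that recover a given relationship is reported. To stay short, the sketch does not build neighbor-joining trees as MEGA does. Instead it asks a simpler question, namely how often one sequence is the nearest neighbor of another by p-distance. All names and the toy alignment are hypothetical.

```python
import random


def p_distance(a, b):
    """Proportion of differing residues over columns without gaps."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 1.0  # no comparable columns: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def resample_columns(alignment, rng):
    """Draw alignment columns with replacement (one bootstrap replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


def nearest_neighbor_support(alignment, query, partner, replicates=1000, seed=0):
    """Fraction of replicates in which `partner` is the closest sequence to `query`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(replicates):
        rep = resample_columns(alignment, rng)
        others = [name for name in rep if name != query]
        nearest = min(others, key=lambda name: p_distance(rep[query], rep[name]))
        hits += nearest == partner
    return hits / replicates


# Toy alignment (hypothetical sequences of equal length).
toy = {
    "abalone": "ACDEFGHIKL",
    "snail": "ACDEFGHIKL",
    "outgroup": "WWWWWWWWWW",
}
print(nearest_neighbor_support(toy, "abalone", "snail"))
```

In a full analysis each replicate would be turned into a tree, and the support of a node would be the share of replicate trees that contain the same clade.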

Results and discussion
Transcriptome sequencing and assembly
The ganglia transcriptome of H. discus hannai produced a total of 34,580,812 paired-end reads. After the quality filtering step, 33,548,676 clean reads were obtained, representing 97.01% recovery of all sequencing reads. Finally, the ganglia transcriptome was assembled into 41,287 transcripts with an average length of 958.71 bp. A total of 11,819 sequences were annotated either by Trinotate analysis or by BLASTn against the nt database with E values below 10^-15.

Identification of neuropeptide precursors
Based on NCBI BLAST search, 26 cDNAs encoding full or partial neuropeptide transcripts were identified from the pleuropedal and cerebral ganglia of Pacific abalone, H. discus hannai (Table I). These identified genes were homologous to previously identified neuropeptide precursors of other molluscan species.
Myomodulin, FMRFamide, and molluscan insulin-related peptide exhibited 87%, 72%, and 88% identities with homologous peptides of H. asinina. A catalogue of the neuropeptide precursors constructed in this study and similar characterizations of neuropeptides from other molluscan species are shown in Supplementary file 1.
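As an illustration of how a percent identity value of this kind is obtained, the Python sketch below counts identical columns in a pairwise alignment and divides by the alignment length, gap columns included, which is the convention BLAST uses when it reports identities. The function name and the toy alignment are hypothetical, and the sequences are not taken from this study.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must be rows of the same pairwise alignment, so they have
    equal length, with '-' marking gaps. Gap columns count in the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy alignment: 4 identical columns out of 6.
print(round(percent_identity("ACDE-F", "ACDQGF"), 2))
```

Because the denominator is the alignment length rather than the full precursor length, identity values from local alignments describe only the aligned region.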

Adipokinetic hormone (AKH)
AKH is a small neuropeptide synthesized and stored by neurosecretory cells. It plays a pivotal role in the mobilization of both carbohydrates and lipids. This neuropeptide was first reported as an insect metabolic neuropeptide involved in the mobilization of energy substrates during physiological activities (Gäde 2009). In mollusks, AKH has been identified in Crassostrea gigas (Dubos et al. 2017) and D. reticulatum (Ahn et al. 2017). An 84-residue AKH precursor protein was identified that had a predicted N-terminal signal peptide and an AKH amidated peptide (pQVSFSTNWGSamide). A similar AKH mature peptide has been reported in C. gigas (Dubos et al. 2017). For the AKH precursor protein, its 3D structure was characterized by two alpha helices separated by a loop (helix-loop-helix structure, HLH).

Allatostatin A or buccalin
Allatostatin A is an L-amide, biologically active neuropeptide that functions in muscle contraction. This neuropeptide was first isolated from Aplysia californica (Miller et al. 1993). It has also been reported in L. gigantea (Veenstra 2010), C. gigas (Stewart et al. 2014), D. reticulatum (Ahn et al. 2017), and P. yessoensis (Zhang et al. 2018). In H. discus hannai, two buccalin transcripts, allatostatin A1 and A2, were identified that encode 24- and 30-residue signal peptides and 3 and 5 copies of the mature peptide, respectively. The open reading frames (ORFs) of the allatostatin A1 and A2 cDNA sequences encoded deduced proteins of 114 and 129 amino acids with theoretical molecular masses of 12.60 and 14.06 kDa, respectively. The peptide precursor contains GxLamide at the C-terminal end, which is in agreement with previous reports on allatostatin A in D. reticulatum and P. yessoensis (Ahn et al. 2017; Zhang et al. 2018). Phylogenetic analysis revealed that buccalin precursors A1 and A2 of H. discus hannai were most closely related to D. reticulatum buccalin A1 and C. virginica buccalin-like neuropeptide precursors, respectively (Figure 1). The 3D structure of allatostatin A1 revealed multiple alpha helices separated by a loop. The long helix contained 11 amino acids, and the N-terminal and C-terminal alpha helices contained 4 and 7 amino acids, respectively.
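A theoretical molecular mass of this kind is the sum of the average residue masses plus one water molecule. The Python sketch below shows that calculation using standard average residue masses. It is an illustration of the principle rather than a reimplementation of ProtParam, and the function names are hypothetical. As a rough plausibility check, 12.60 kDa over 114 residues and 14.06 kDa over 129 residues both correspond to about 110 Da per residue, which is typical for proteins.

```python
# Average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER = 18.01528  # added once for the free N- and C-termini


def average_mass(sequence):
    """Average molecular mass in daltons of an unmodified peptide chain."""
    total = WATER
    for aa in sequence.upper():
        if aa not in RESIDUE_MASS:
            raise ValueError(f"unsupported residue: {aa!r}")
        total += RESIDUE_MASS[aa]
    return total


def mass_kda(sequence):
    """Same mass expressed in kilodaltons."""
    return average_mass(sequence) / 1000.0
```

The sketch ignores post-translational modifications such as amidation or pyroglutamate formation, so masses of mature peptides would need small corrections.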

Allatostatin B
The first molluscan allatostatin B was isolated from the ganglia of the African giant snail (Achatina fulica). This neuropeptide is also called WWamide due to the presence of an N-terminal Trp (W) and a C-terminal Trp (W)-amide. In H. discus hannai, an allatostatin B precursor was identified. It was found to encode a 25-residue signal peptide followed by four putative allatostatin B peptides that contained conserved W residues at both ends, consistent with the results of previous studies (Ahn et al. 2017; Zhang et al. 2018).
A phylogenetic tree of allatostatin B was constructed, showing that allatostatin B of H. discus hannai clustered strongly with that of M. yessoensis (Figure 1). The 3D structure of allatostatin B was characterized by multiple alpha helices separated by loops, forming a helix-loop-helix structure.

Allatostatin C
Allatostatin C is a vertebrate somatostatin homolog. It was first discovered in the moth, Manduca sexta (Veenstra 2009). It is characterized by two conserved Cys residues surrounding a hexapeptide. Here, we report an allatostatin C preprohormone that is 101 amino acids in length. It contains one copy of the mature peptide. Its cysteine residues at positions Cys-91 and Cys-98 are highly conserved with those of other homologs of invertebrates. 3D structures revealed multiple alpha helices exhibiting a typical HLH structure. The long loop was formed by 24 amino acids and the numbers of amino acids involved in the N-terminal and C-terminal helices were 6 and 8, respectively.
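The defining feature described above, two Cys residues flanking a hexapeptide, corresponds to the pattern C-x(6)-C, and the reported positions are consistent with it because 98 − 91 − 1 = 6 intervening residues. A minimal Python sketch of a motif scan follows. The function name and the toy sequence are hypothetical, and the toy sequence is not the abalone precursor.

```python
import re

# Lookahead so that overlapping occurrences are also reported.
CX6C = re.compile(r"(?=C.{6}C)")


def find_cx6c(sequence):
    """Return 1-based (first_cys, second_cys) positions of every C-x(6)-C motif."""
    hits = []
    for match in CX6C.finditer(sequence.upper()):
        first = match.start() + 1  # convert 0-based index to 1-based position
        hits.append((first, first + 7))
    return hits


# Toy 101-residue sequence with Cys placed at positions 91 and 98.
toy = "A" * 90 + "C" + "GNFWHS" + "C" + "AAA"
print(find_cx6c(toy))
```

A scan like this only flags candidate motifs. Whether the two cysteines actually form a disulfide bridge still has to be predicted or tested separately.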

Allatotropin
Allatotropin is an amidated tridecapeptide neurohormone that was originally described as a regulator of juvenile hormone production. It was isolated from the nervous system of the lepidopteran M. sexta (Kataoka et al. 1989). The encoded allatotropin peptide precursor identified in H. discus hannai shared 50% and 46% sequence identity with the allatotropin peptide precursors of C. tritonis and D. reticulatum, respectively. In the predicted secondary structure of the H. discus hannai allatotropin precursor, multiple alpha helices were observed, forming an HLH structure.

Bradykinin-like neuropeptide
The presence of the bradykinin neuropeptide has been reported in the left upper quadrant cells of A. californica. These cells can regulate some renal functions (Wickham & DesGroseillers 1991). A bradykinin-like transcript encoding a putative 14.78 kDa precursor peptide containing a signal peptide (19-aa) and potential proteolytic cleavage sites was identified in H. discus hannai. It could generate smaller mature peptides. A BLAST search indicated that this protein shared 31% sequence identity with the bradykinin-like neuropeptide precursor of A. californica.

Cerebrin
Cerebrin was first identified from the cerebral ganglia of A. californica. This protein was found to be involved in feeding behavior (Li & Floyd 2001). A single copy of a predicted 17-residue C-terminally amidated peptide was identified from the H. discus hannai transcriptome. Sequence alignment of bioactive peptides revealed that Gly, Asp, and two Asn residues at positions 70, 73, 68, and 77, respectively, were highly conserved among different mollusks (Figure 2). The 3D configuration of this precursor revealed multiple alpha helices possessing the typical HLH structure. The N-terminal and C-terminal helices contained 3 and 9 amino acids, respectively.

Clionin
A potential novel neuropeptide was identified from the transcriptome analysis of H. discus hannai. Its amino acid sequence shared 94% identity with that of the clionin precursor of Tridachiella diomedea. Its open reading frame encoded a deduced peptide sequence of 104 amino acids with a molecular weight and isoelectric point of 11.79 kDa and 5.89, respectively. An NH2-terminal signal peptide (24-residue) was predicted from this transcript (Table I). Two cysteine residues at positions 46 and 64 are likely to form a disulfide bond. In contrast, three putative disulfide bonds were detected in Sepia officinalis (Zatylny-Gaudin et al. 2016). In silico analysis (Protcomp, http://linux1.softberry.com/berry.phtml) revealed that this neuropeptide precursor might be an extracellular secretory protein. For clionin, the 3D structure was characterized by multiple alpha helices separated by a loop.

Conopressin
Conopressin is a nonapeptide amide that causes contraction of the vas deferens in the pond snail, Lymnaea stagnalis (Van Kesteren et al. 1992). This neuropeptide was previously isolated from Conus geographus. It shared sequence identities with those of vasopressin-oxytocin peptides of vertebrates (Cruz et al. 1987). The conopressin gene of the Pacific abalone encodes a 25-residue signal peptide followed by one bioactive peptide (FIRNCPPGamide), and this is in agreement with the previous report (Ahn et al. 2017). An NCBI BLAST search indicated that this precursor protein shared 43% and 31% sequence identities with sequences of Octopus bimaculoides and L. stagnalis precursors, respectively. The HLH structure was formed by 7-aa residues of the N-terminal and 4-aa residues of the C-terminal.

Elevenin
The elevenin peptide was first reported in bag cells and abdominal ganglion neurons L11 in A. californica (Taussig et al. 1984). A 120-residue elevenin precursor was detected in H. discus hannai. It had a predicted N-terminal signal peptide (26-residue) and a mature peptide with two conserved cysteine residues. Its 3D predicted model showed the classical HLH structure. The long loop was formed by 17-aa adjacent to the C-terminal tail.

Enterin
Enterins are nonapeptides and decapeptides that carry an FVamide sequence at the C terminus. A total of 35 copies of 20 different enterin peptides have been isolated from the gut and central nervous system of A. californica (Furukawa et al. 2001). The enterin precursor of H. discus hannai had a 247-aa region encoding a signal peptide and five copies of the mature peptide that shared HSFVamide sequences at the C-terminal end. In contrast, the structure of the enterin precursor deduced from cDNA cloning in L. gigantea exhibited eight copies of the peptide, seven of which possessed an HRFVamide at the C-terminal end (Veenstra 2010).

Feeding circuit-activating peptide (FCAP)
Feeding circuit-activating neuropeptide (FCAP) was first identified and characterized from the cerebral-buccal connective tissue of A. californica, encoding multiple copies of eight different FCAPs (Sweedler et al. 2002). In H. discus hannai, one full-length precursor was identified. It comprised 107 aa with a 24-residue signal peptide and 3 copies of the FCAP-like peptide. Two FCAP precursors containing 7 putative FCAPs have been reported in D. reticulatum (Ahn et al. 2017). Phylogenetic analysis revealed that the FCAP precursor of H. discus hannai was closely grouped with FCAP precursor 2 of D. reticulatum (Figure 3).

FFamide
FFamide has been reported in some mollusks, including gastropods. It was first identified in L. stagnalis (Li et al. 1995) and has since been reported in L. gigantea (Veenstra 2010), Theba pisana (Adamson et al. 2015), and the cephalopod S. officinalis (Zatylny-Gaudin et al. 2016). This amide peptide is crucial for modulating the transfer rate of semen within the vas deferens. A single copy of a C-terminally amidated FFamide peptide (GLRPGMNSLFFamide) was predicted in the H. discus hannai transcriptome. Sequence alignment of bioactive peptides revealed that H. discus hannai and D. reticulatum FFamide-1 shared the same residues (Figure 4). The characteristic Leu/Met and Arg/Asn substitutions were observed in FFamide of M. yessoensis (Zhang et al. 2018).

FMRFamide
The tetrapeptide FMRFamide is widely distributed throughout the animal kingdom. The molluscan FMRFamide precursor participates in a variety of physiological mechanisms such as heart activity, amylase secretion, feeding, and reproduction (Jakobs & Schipp 1992; Favrel et al. 1994; Santama et al. 1994). It is also involved in regulating egg capsule secretion and oocyte transport in the oviduct (Henry et al. 1999). In this study, an FMRFamide precursor encoding a 28-residue signal peptide and 7 copies of FMRF tetrapeptides flanked by multiple cleavage sites was found. A BLAST search revealed that this precursor exhibited 73% and 66% sequence identities with sequences of H. asinina and C. gigas FMRF neuropeptide precursors, respectively.
Figure 3. A phylogenetic tree of feeding circuit-activating neuropeptide precursor (FCAP) was constructed using the neighbor-joining method to determine the evolutionary relationship among the FCAP precursors of invertebrates.

Insulin-related peptide
Numerous insulin-like peptides have been identified in mollusks, especially in gastropods (Smit et al. 1988; Floyd et al. 1999; Veenstra 2010; Ahn et al. 2017) and bivalves (Cherif-Feildel et al. 2018). These neuropeptides are involved in growth and glucose metabolism (Ebberink et al. 1989). They also play a role in mediating germinal cell proliferation and maturation (Monnier & Bride 1995). Three insulin-like precursors (Insulin-like 2A, 2B, and MIP-related peptide) were identified in H. discus hannai. The Insulin-like 2A, Insulin-like 2B, and MIP-related peptide precursors contained 142, 124, and 118 amino acid residues, respectively, with a predicted N-terminal signal peptide and two insulin-like domains, A and B (Figure 5). Alignment of multiple sequences illustrated that the characteristic cysteine residues of these identified insulin-like precursors were highly conserved with other molluscan insulin-like peptides.

Mytilus inhibitory peptides (MIPs)/PxFVamide
Mytilus inhibitory peptides (MIPs) have been identified from the pedal ganglia of Mytilus edulis (Hirata et al. 1988). These peptides have a potent inhibitory effect on retractor muscle contractions in these animals (Tetsuya et al. 1992). A MIP transcript encoding a precursor of 371-aa, with a 21-aa signal peptide and 16 copies of mature peptide processed from multiple mono- or dibasic cleavage sites, was identified in H. discus hannai.

Myomodulin
Myomodulin is considered to be an important neural cotransmitter in mollusks that plays a role in innervating the male reproductive system of L. stagnalis (De Lange et al. 1998; Koene 2010). This neuropeptide was initially identified in A. californica and L. stagnalis. It has been reported in many gastropod and bivalve mollusks. In the present study, two full-length myomodulin transcripts encoding multiple copies of myomodulin-like peptides possessing a consensus MLRLamide at the C-terminal end (Miller et al. 1993; Kellett et al. 1996) were found. The myomodulin precursor of H. discus hannai encodes 16 different myomodulin peptides with 7 copies of the PMNMLRLamide, while seven different peptides with 7 copies of the PMNMLRLamide in H. asinina have been reported (York et al. 2012).

Neuromacin-like protein
Neuromacin is a cysteine-rich antimicrobial peptide that is only expressed in the nervous system (Schikorski et al. 2008). It exhibits potent activity against bacteria. A molluscan neuromacin-like peptide has been observed in the central nervous system of A. californica (Moroz et al. 2006). In H. discus hannai, the neuromacin-like protein is an 80-residue precursor protein comprising a predicted N-terminal signal peptide (19-residue), two copies of the mature peptide, and six cysteine residues (residues 21, 28, 47, 57, 74, and 76) that are likely to form disulfide bridges. BLAST analysis revealed that this precursor shared 59% sequence identity with a neuromacin-like protein isolated from Stylophora pistillata. Its predicted 3D structure exhibited an HLH structure possessing long N- and C-termini (Figure 6).

Neuroglian
Neuroglian is an extracellular cell adhesion molecule that plays an essential role in synaptic stability (Harris et al. 2016). In the present study, a 259-amino acid residue neuroglian precursor comprising a predicted N-terminal signal peptide (21-residue), two copies of the mature peptide, and four cysteine residues that might form intramolecular disulfide bridges was found. A potential N-linked glycosylation site was detected in this precursor.

PRQFVamide
PRQFVamide was identified from the CNS and gut of A. californica. This peptide is crucial for regulating the feeding system (Furukawa et al. 2003). Its precursor in A. californica is composed of 33 copies of PRQFVamide and four related pentapeptides with an N-terminal signal peptide, and it is highly expressed within the abdominal ganglion (Furukawa et al. 2003). In H. discus hannai, a PRQFVamide precursor comprising a 25-residue signal peptide and 3 copies of PRQFVamide separated by multiple dibasic cleavage sites was identified. This precursor shared 83% and 48% amino acid sequence identities with those of D. reticulatum and A. californica precursors, respectively.

Conclusion
In the present study, we sequenced and analyzed 26 neuropeptide precursors from the Pacific abalone. Of them, two (neuromacin and neuroglian) were potentially novel transcripts. These neuropeptide transcripts shared the highest sequence similarities with those of other bivalve and gastropod mollusks. This study will aid future physiological and behavioral studies of abalone species.

Disclosure statement
No potential conflict of interest was reported by the authors.