Cloning, purification, kinetic and anion inhibition studies of a recombinant β-carbonic anhydrase from the Atlantic salmon parasite platyhelminth Gyrodactylus salaris

Abstract A β-class carbonic anhydrase (CA, EC 4.2.1.1) was cloned from the genome of the Monogenean platyhelminth Gyrodactylus salaris, a parasite of Atlantic salmon. The new enzyme, GsaCAβ, has significant catalytic activity for the physiological reaction CO2 + H2O ⇋ HCO3− + H+, with a kcat of 1.1 × 10⁵ s⁻¹ and a kcat/Km of 7.58 × 10⁶ M⁻¹ s⁻¹. This activity was inhibited by acetazolamide (KI of 0.46 µM), a sulphonamide in clinical use, as well as by selected inorganic anions and small molecules. Most tested anions inhibited GsaCAβ at millimolar concentrations, but sulfamide (KI of 81 µM), N,N-diethyldithiocarbamate (KI of 67 µM) and sulphamic acid (KI of 6.2 µM) showed rather efficient inhibitory action. There are currently very few non-toxic agents effective in combating this parasite. GsaCAβ is therefore proposed as a new drug target for which effective inhibitors can be designed.


Introduction
Gyrodactylus salaris is a flatworm (platyhelminth) parasite belonging to the monogeneans, which are hermaphrodite ectoparasites found on the gills, fins, or skin of fish 1,2. They do not need an intermediate host to infect a range of fish species, some of which are commercially important, such as the Atlantic salmon (Salmo salar) and related species 3,4. The presence of this parasitic pathogen has been reported in 19 countries across Europe, and it has already caused catastrophic losses of Atlantic salmon, mainly in Norway starting in the 1970s and in Russian Karelia since 1992 1-4. This small (0.5 mm) parasite attacks the host by attaching its anterior end to the fish through secretions from the cephalic glands, and then releasing a digestive solution rich in proteolytic enzymes which dissolves the fish skin, inducing the formation of large wounds which favour secondary infections 5. A variety of inorganic and organic compounds, among which are salt (NaCl), hypochlorite, permanganate, aluminium salts, praziquantel, levamisole, mebendazole, toltrazuril, etc., have been tested for efficacy against a broad spectrum of monogenean species, including G. salaris, but only trichlorfon and dichlorvos (Figure 1) showed some efficacy 6,7. However, both compounds act as irreversible organophosphoric acetylcholinesterase inhibitors and are thus rather toxic to all vertebrates, not only to fish 8.
Novel potential drug targets present in the proteome of this parasite have recently been explored. These include excretory/secretory proteins involved in host invasion and colonisation, such as a flatworm protease 9. However, no pharmacological inhibitors have been reported so far.
were supported by genome data from the University of Oulu. A BLAST search at NCBI 28 suggested that the sequence is close to full length, and therefore the existing sequence was extended only by an ATG codon at the beginning of the transcript and two stop codons (TAA TAG) at the end. This β-CA sequence was inserted into a pBVboost vector construct for the production of recombinant protein (Figure 2). The construct was obtained from GeneArt (Invitrogen, Regensburg, Germany). The sequence of the β-CA was modified accordingly to produce the protein in bacterial (Escherichia coli) cells.
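The extension step described above can be sketched in a few lines. This is only an illustration: the function name `extend_orf` and the toy fragment are hypothetical, and the sketch assumes the partial coding sequence is already in frame, which the text does not state explicitly.

```python
def extend_orf(partial_cds: str) -> str:
    """Add a start codon and two stop codons (TAA TAG) to a partial coding sequence.

    Assumption: the partial sequence is in frame, so its length must be a
    multiple of three.
    """
    seq = "".join(partial_cds.split()).upper()  # drop whitespace, normalise case
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "ATG" + seq + "TAA" + "TAG"


# Toy 9-nucleotide fragment (hypothetical, not the GsaCAβ sequence).
toy_fragment = "catggtgct"
construct = extend_orf(toy_fragment)
print(construct)
```

The extended construct is nine nucleotides longer than the input: three for the start codon and six for the two stop codons.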

Transformation of plasmid vector into BL21 cells
The freeze-dried plasmid carrying the β-CA construct, supplied by GeneArt, was reconstituted according to the manufacturer's instructions. The BL21 Star™ (DE3) cells (Invitrogen, Carlsbad, CA, USA) were stored at −80 °C and thawed on ice. After thawing the competent cells, 25 µl of the cell suspension and 1 µl of the reconstituted plasmid were transferred into a 1.5 ml centrifuge tube. The suspension was incubated on ice for 30 min. Heat shock was performed by keeping the tube in 42 °C water for 30 s, after which the tube was transferred immediately onto ice for 2 min. Next, 125 µl of SOC Medium (Invitrogen, Carlsbad, CA, USA) was added to the microcentrifuge tube containing the transformed cells, and the tube was incubated at 37 °C for 1 h with shaking (200 rpm). The agar plates containing gentamycin were kept at 37 °C before the transformation. A volume of 20 µl or 50 µl of the cell suspension described above was spread onto each plate, and the plates were incubated overnight at 37 °C. A 5 ml preculture was prepared by inoculating single colonies from the growth plates into LB medium containing gentamycin (ratio 1:1000) and incubating overnight at 37 °C with constant shaking at 200 rpm.

Production of GsaCAβ recombinant protein
Protein production was carried out according to a pO-stat fed-batch protocol as described earlier, with some modifications 29. The fermentation medium contained no glycerol, as the cell line used did not require it. The culture was induced with 1 mM IPTG 12 h after the start of the fermentation, and the temperature was decreased to 25 °C at the time of induction. Cell growth was stopped 12 h after induction at an OD of 34 (A600). The cells were collected by centrifugation and the weight of the cell pellet was recorded. The fermentation was carried out at the Protein Services core facility of Tampere University (https://www.tuni.fi/en/research/protein-services). The cell pellet was suspended in 150 ml of binding buffer containing 50 mM Na2HPO4, 0.5 M NaCl, 50 mM imidazole, and 10% glycerol (pH 8.0). The cell pellet was lysed by sonication (5 min, 30 s off, 20 s on) in the lysing buffer and centrifuged at 20 000 × g for 15 min. The suspension was homogenised using an EmulsiFlex-C3 (AVESTIN, Ottawa, Canada) homogeniser. The lysate was centrifuged at 13 000 × g for 15 min at 4 °C, and the clear supernatant was mixed with HisPur™ Ni-NTA Resin (Thermo Fisher Scientific, Waltham, MA, USA) and bound to the resin for 2 h at room temperature on a magnetic stirrer. The resin was washed with the binding buffer and collected onto an empty column with an EMD Millipore™ vacuum filtering flask (Merck, Kenilworth, NJ, USA) and filter paper. The protein was eluted from the resin using 50 mM Na2HPO4, 0.5 M NaCl, 350 mM imidazole, and 10% glycerol (pH 7.0). The protein was repurified with TALON® Superflow™ cobalt resin (GE Healthcare, Chicago, IL, USA). The eluted protein fractions were diluted with binding buffer (50 mM Na2HPO4, 0.5 M NaCl, and 10% glycerol, pH 8.0) so that the imidazole concentration was below 10 mM. The protein binding and elution were performed as described above.
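The dilution before the second purification step follows from a simple mass balance. The sketch below assumes that the diluent contains no imidazole (as in the buffer composition listed for this step) and uses a hypothetical eluate volume of 1 ml; the function name `diluent_volume` is illustrative.

```python
def diluent_volume(eluate_ml: float, c_start_mm: float, c_target_mm: float) -> float:
    """Volume of imidazole-free buffer needed to reach a target concentration.

    Uses C1 * V1 = C2 * (V1 + V_diluent), solved for V_diluent.
    """
    if not 0 < c_target_mm < c_start_mm:
        raise ValueError("target must be positive and below the starting concentration")
    return eluate_ml * (c_start_mm / c_target_mm - 1.0)


# 350 mM imidazole in the elution buffer, 10 mM as the upper limit.
needed_ml = diluent_volume(eluate_ml=1.0, c_start_mm=350.0, c_target_mm=10.0)
print(f"Add more than {needed_ml:.0f} ml of buffer per ml of eluate")
```

Since the concentration has to end up below 10 mM, the computed volume is a lower bound: each millilitre of eluate needs more than a 35-fold total dilution.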
The purity of the protein was determined by gel electrophoresis (SDS-PAGE), and the gel was visualised with PageBlue Protein staining solution (Thermo Fisher Scientific, Waltham, MA, USA). Protein fractions were pooled and concentrated according to the protocol (https://store.repligen.com/products/floatalyzer) (8-10 kDa). Buffer exchange into 50 mM TRIS (pH 7.5) was done using the same centrifugal concentrators. The His-tag was cleaved from the purified protein with the Thrombin CleanCleave Kit (Sigma-Aldrich, Saint Louis, MO, USA), according to the manufacturer's manual.

CA activity and inhibition measurements
An Applied Photophysics stopped-flow instrument was used for assaying the CA-catalysed CO2 hydration activity 30. Phenol red at a concentration of 0.2 mM was used as the pH indicator, working at the absorbance maximum of 557 nm, with 10 mM TRIS (pH 8.3) as buffer and 10 mM NaClO4 to maintain constant ionic strength, following the initial rates of the CA-catalysed CO2 hydration reaction for a period of 10-100 s. The CO2 concentrations ranged from 1.7 to 17 mM for the determination of the kinetic parameters and inhibition constants. For each inhibitor, at least six traces of the initial 5-10% of the reaction were used for determining the initial velocity. The uncatalysed rates were determined in the same manner and subtracted from the total observed rates. Stock solutions of inhibitors (10-20 mM) were prepared in distilled-deionised water, and dilutions up to 0.01 mM were made thereafter with the assay buffer. Inhibitor and enzyme solutions were preincubated together for 15 min at room temperature prior to the assay, in order to allow for the formation of the enzyme-inhibitor complex. The inhibition constants 31-37 represent the mean from at least three different determinations. The GsaCAβ concentration in the assay system was 14.3 nM.
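To put the assay conditions in context, the reported kinetic constants can be combined with the enzyme concentration given above. The sketch below assumes simple Michaelis-Menten kinetics, which is implied by the reporting of kcat and kcat/Km but is not a description of the authors' fitting procedure. The Km and Vmax values are derived here rather than quoted from the text, and all function and variable names are illustrative.

```python
K_CAT = 1.1e5           # s^-1, reported turnover number
KCAT_OVER_KM = 7.58e6   # M^-1 s^-1, reported catalytic efficiency
ENZYME_M = 14.3e-9      # M, enzyme concentration in the assay (14.3 nM)

# Derived quantities (not stated in the text).
K_M = K_CAT / KCAT_OVER_KM   # Michaelis constant in M
V_MAX = K_CAT * ENZYME_M     # maximal catalysed rate in M s^-1


def catalysed_rate(observed_rate: float, uncatalysed_rate: float) -> float:
    """Subtract the uncatalysed CO2 hydration rate from the total observed rate."""
    return observed_rate - uncatalysed_rate


def michaelis_menten_rate(co2_molar: float) -> float:
    """Expected initial catalysed rate (M s^-1) at a given CO2 concentration."""
    return V_MAX * co2_molar / (K_M + co2_molar)


print(f"Km   = {K_M * 1e3:.1f} mM")
print(f"Vmax = {V_MAX:.3e} M/s")
for co2_mm in (1.7, 17.0):  # limits of the CO2 range used in the assay
    rate = michaelis_menten_rate(co2_mm * 1e-3)
    print(f"[CO2] = {co2_mm:4.1f} mM -> v0 = {rate:.3e} M/s")
```

Under these assumptions the derived Km of about 14.5 mM lies within the 1.7-17 mM CO2 range used in the assay, so the substrate range samples both sides of the Km.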

Reagents
Anions and small molecules were commercially sold reagents of the highest available purity from Sigma-Aldrich (Milan, Italy). The purity of tested compounds was higher than 99%.

Phylogenetic analysis
A BLAST search was performed on the UniProt webserver (https://www.uniprot.org/blast/) with the novel GsaCAβ sequence as a query and all settings as default. The top 250 closely related sequences, and their annotations (species, phyla), were taken for further analysis.
The 250 β-CAs were clustered to 80% similarity with the "cluster_fast" algorithm of the USEARCH tool (version 11.0.667) 38, yielding 111 sequences representing the cluster centroids. To this list the novel G. salaris β-CA was added, and a custom Python script was then used to filter out sequences that did not contain both canonical β-CA amino acid motifs (CxDxR and HxxC), resulting in a total of 104 β-CA sequences, which were then aligned with Muscle (version 5.1) using all default settings 39.
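The motif filter can be illustrated with regular expressions. This is a minimal sketch and not the authors' script (which is available in their repository): the function names and toy sequences are hypothetical, and the sketch does not enforce any order or spacing between the two motifs.

```python
import re

# Canonical beta-CA motifs, with "x" standing for any residue.
MOTIF_CXDXR = re.compile(r"C.D.R")
MOTIF_HXXC = re.compile(r"H..C")


def has_both_motifs(sequence: str) -> bool:
    """Return True if the protein sequence contains both CxDxR and HxxC."""
    seq = sequence.upper()
    return bool(MOTIF_CXDXR.search(seq)) and bool(MOTIF_HXXC.search(seq))


def filter_sequences(records: dict) -> dict:
    """Keep only records (name -> sequence) that contain both motifs."""
    return {name: seq for name, seq in records.items() if has_both_motifs(seq)}


# Toy sequences (hypothetical, for illustration only).
toy_records = {
    "seq_with_both": "MKLACADSRVPGGHAACGLL",
    "seq_missing_hxxc": "MKLACADSRVPGGLL",
    "seq_missing_cxdxr": "MKLAGGHAACGLL",
}
kept = filter_sequences(toy_records)
print(sorted(kept))
```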
The alignment was reduced to a total of 62 conserved amino acid residues, which were identified using GBlocks (version 0.91b) 40. Model testing was performed to identify the best evolutionary model for analysis of the target sequences using ModelFinder 41, with the best model determined to be "LG+I+G4". A maximum likelihood phylogenetic analysis was performed using the IQTree software (version 2.0.3) 42, with parameters set to "-alrt 100000 -bb 100000 -nt AUTO -m LG+I+G4" and all other options run as default. A consensus tree was generated from the 100 000 bootstrap replicates, with a final log-likelihood value of −4717.575. The tree was then visualised using a custom Python script utilising the ETE Toolkit Python library 43.
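For readers who want to reproduce the tree search, the parameters quoted above can be assembled into a command line. The sketch below only builds and prints the command and does not run it. The executable name `iqtree2` and the input file name `alignment.fasta` are assumptions; the `-s` option is the standard IQ-TREE flag for the input alignment, and the remaining options are taken from the text.

```python
import shlex


def iqtree_command(alignment_path: str, binary: str = "iqtree2") -> list:
    """Build an IQ-TREE command using the parameters quoted in the text."""
    return [
        binary,
        "-s", alignment_path,   # input alignment (file name is an assumption)
        "-m", "LG+I+G4",        # substitution model selected by ModelFinder
        "-alrt", "100000",      # SH-aLRT replicates
        "-bb", "100000",        # ultrafast bootstrap replicates
        "-nt", "AUTO",          # let IQ-TREE choose the number of threads
    ]


cmd = iqtree_command("alignment.fasta")
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` would execute the analysis on a machine where IQ-TREE is installed.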
The code and workflow used to perform these analyses are provided at https://github.com/thirtysix/Aspatwar.Gsalaris_BCA (Supplemental data).

Multiple sequence alignment
Our protein sequence of GsaCAβ was used as a query in BLAST searches 28 at NCBI (https://blast.ncbi.nlm.nih.gov/Blast.cgi). Searches were limited to metazoa, excluding vertebrates, in the nr database with word size 3 and scoring matrix BLOSUM45, otherwise with default parameters. The results were filtered for at least 90% query coverage. Seven homologs were selected based on taxonomical diversity and model organism status from the results with an E value cut-off of 1e-28 (top 242 hits). Details of all the hits are given in https://bit.ly/3JNRb7i (Supplemental data).
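The two numerical filters described above (query coverage and E value) are straightforward to apply to a table of hits. The following sketch uses a hypothetical `Hit` record and made-up accessions; the final choice of seven homologs by taxonomical diversity and model organism status was a manual step and is not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str          # hypothetical identifiers, for illustration only
    query_coverage: float   # percent of the query covered by the alignment
    evalue: float


def filter_hits(hits, min_coverage=90.0, max_evalue=1e-28):
    """Keep hits with at least 90% query coverage and E value at or below 1e-28."""
    return [
        hit for hit in hits
        if hit.query_coverage >= min_coverage and hit.evalue <= max_evalue
    ]


toy_hits = [
    Hit("HYPO_0001", 98.0, 3e-60),
    Hit("HYPO_0002", 85.0, 1e-50),   # fails the coverage filter
    Hit("HYPO_0003", 95.0, 2e-20),   # fails the E value cut-off
    Hit("HYPO_0004", 90.0, 1e-28),   # passes both filters at the limits
]
passed = filter_hits(toy_hits)
print([hit.accession for hit in passed])
```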
Sequences were aligned with Clustal Omega 47 at EBI (https://www.ebi.ac.uk/Tools/msa/clustalo/) with number of combined iterations = 3, otherwise default parameters. ESPript 3 48 at https://espript.ibcp.fr/ESPript/cgi-bin/ESPript.cgi was used in visualising the sequence alignment result. Our AlphaFold model for GsaCAβ (see below) was used for the display of secondary structures above the alignment. The threshold for boxing nearly conserved residues was set to 80%.

Molecular modelling
All operations with 3D protein structure models and molecular visualisation were performed using ChimeraX (daily build 1.4.dev202202030703), developed by the UCSF Resource for Biocomputing, Visualisation, and Informatics (San Francisco, California, USA), supported in part by the National Institutes of Health 49.
A 3D model of GsaCAβ was created with AlphaFold 50 using the ChimeraX interface to submit the prediction to run at Google Colab. This model was compared to a crystallographic structure of the pea β-CA, PDB 1EKJ 51, by 3D superimposition using the MatchMaker tool of ChimeraX with an iteration cut-off of 1.5 Å for pruning residue pairs in fitting.

Results and discussion
Sequence analysis of GsaCAβ, comparison with other β-CAs, and recombinant protein production
The partial GsaCAβ transcript sequence discovered in this study, of 687 nucleotides, has been submitted to the European Nucleotide Archive  the active site of β-CAs, in which the cysteines and histidine coordinate the catalytic zinc ion, are conserved in GsaCAβ (shown by triangles in Figure 3). It is notable that the sequence identity is nearly constant between GsaCAβ and β-CAs from widely different groups of animals, as seen in Tables 1 and 2. Analysis of the GsaCAβ sequence with TMHMM 2.0 suggests that no transmembrane helices are present. TargetP 2.0 predicts a cytoplasmic ("other") localisation, from among the choices of secreted, mitochondrial, or other. DeepLoc 1.0, on the other hand, predicts mitochondrial localisation from among 10 different localisations with a likelihood of 0.77. Unlike TargetP 2.0, DeepLoc does not depend solely on N-terminal sequences in its inference. Furthermore, the authors of TargetP 2.0 state that for non-plant proteins, the most common confusion is between mitochondrial targeting peptides and no targeting peptides 45. The same article also notes that the second residue in the sequence has a markedly strong predictive value for metazoan mitochondrial targeting peptides. Considering all this, and the fact that our protein sequence is incomplete at the N-terminus, we prefer to accept the prediction of DeepLoc and tentatively suggest a mitochondrial localisation for GsaCAβ. This would also be consistent with the mitochondrial localisation observed or predicted for many other metazoan β-carbonic anhydrases (BCAs) 20,52,53.
The sequence comparisons carried out here confirm that GsaCAβ belongs to the β-CA enzyme family and could be a potential target for developing suitable inhibitors for control of this parasite in fish culture farms.
The recombinant GsaCAβ protein was produced in E. coli cells. The purified recombinant protein showed a single band close to the expected size on SDS-PAGE (Figure 4, measured MW 26.0 kDa).

Phylogeny of 131 β-CA sequences
Phylogenetic analysis was performed using the GsaCAβ sequence together with 130 related β-CA sequences identified via BLAST search. Among all sequences included there were 17 distinct phyla represented, 5 of which came from bacteria (46 sequences) and 12 from animalia (85 sequences). Within animalia, the largest two groups were nematoda (29 sequences) and arthropoda (28 sequences). The most closely related sequence to the novel GsaCAβ was from sea cucumber (A0A2G8LGE8), and together these occurred within a larger clade of 13 total sequences, which included 8 other platyhelminthes. Figure 5 shows the final phylogenetic tree.
Figure 5. A total of 130 β-CA sequences identified by BLAST search of the UniProt database were combined with the GsaCAβ sequence and used to perform maximum likelihood phylogenetic analysis (IQTree version 2.0.3) 42 to identify potential evolutionary relationships. Colours represent the phyla of the species from which each sequence originates. G. salaris is marked with an asterisk. Dot sizes in the internal nodes of the tree indicate the bootstrap support level of each node.

Anion inhibition studies of GsaCAβ
A panel of anions and small molecules known to interact with CAs 69 was chosen for testing as inhibitors of GsaCAβ (Table 3).
As seen from the data in Table 4, where the inhibition of the human α-class isoforms hCA I and II is also included for comparison, many inorganic/organic anions and small molecules, such as sulfamide and sulphamic acid, inhibited GsaCAβ. Inhibition presumably occurs through coordination of the inhibitor to the metal ion in the active site, as with other CAs previously investigated for their interaction with this type of inhibitor 13-17,54,55,70,71. Inhibition data with the clinically used sulphonamide acetazolamide (5-acetamido-1,3,4-thiadiazole-2-sulphonamide) are also provided. The following should be noted regarding the inhibition data of Table 4:
i. Azide, for example, is a rather efficient hCA I inhibitor, with a KI of 1.2 µM, and has been thoroughly characterised by diverse techniques, including X-ray crystallography 14-17.
ii. Most of the investigated anions showed a millimolar affinity for GsaCAβ, which is the typical inhibitory profile for this type of compound. Indeed, KI values in the range of 1.9-91 mM were measured for the following anions: fluoride, chloride, bromide, cyanate, thiocyanate, nitrite, bisulphite, bisulphide, tellurate, perosmate, divanadate, perrhenate, perruthenate, selenocyanate, imidosulfonate, and trithiocarbonate. Many of these anions are known for their propensity to complex with metal ions, and they also act as inhibitors of other CAs, including hCA I and II (Table 4).
iii. Sub-millimolar inhibition of GsaCAβ was observed for cyanide, stannate, peroxydisulfate, and fluorosulfonate, which showed KI values in the range of 0.77-0.92 mM.
iv. The most effective GsaCAβ inhibitors were sulfamide, sulphamic acid (presumably acting as sulfamate 72) and N,N-diethyldithiocarbamate, which showed KI values in the range of 6.2-81 µM. These inhibitors incorporate two well-known zinc-binding groups (ZBGs) present in many efficient CA inhibitors. The first is the sulfamoyl moiety (present in sulfamide and sulphamic acid 72), which has been demonstrated by crystallography to coordinate the zinc ion of the CA active site 72. The same inhibition mechanism was thereafter observed for dithiocarbamates and their derivatives 73-75, which incorporate the CS2− ZBG. The fact that simple derivatives possessing no organic scaffold (sulfamide and sulphamic acid) or a very small and compact scaffold (N,N-diethyldithiocarbamate) inhibit GsaCAβ quite efficiently prompts us to hypothesise that it might be possible to develop more efficient and selective inhibitors for this enzyme, with potential use as antiparasitic agents.
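The potency classes used in the discussion above can be made explicit by putting all inhibition constants on one scale. In the sketch below the four KI values are those reported in this work, whereas the band boundaries (1 µM, 100 µM and 1 mM) and the function name `potency_band` are assumptions chosen for illustration and are not definitions taken from the text.

```python
# Reported KI values against GsaCAβ, expressed in micromolar.
KI_MICROMOLAR = {
    "acetazolamide": 0.46,
    "sulphamic acid": 6.2,
    "N,N-diethyldithiocarbamate": 67.0,
    "sulfamide": 81.0,
}


def potency_band(ki_um: float) -> str:
    """Assign a KI (in µM) to a potency band; the boundaries are assumptions."""
    if ki_um < 1.0:
        return "sub-micromolar"
    if ki_um < 100.0:
        return "micromolar"
    if ki_um < 1000.0:
        return "sub-millimolar"
    return "millimolar"


# Rank from most to least potent (lowest KI first).
ranking = sorted(KI_MICROMOLAR, key=KI_MICROMOLAR.get)
for name in ranking:
    print(f"{name:28s} {KI_MICROMOLAR[name]:6.2f} µM  {potency_band(KI_MICROMOLAR[name])}")

# End points of the ranges quoted for the weaker anion inhibitors.
print(potency_band(770.0), potency_band(920.0))      # 0.77 and 0.92 mM
print(potency_band(1900.0), potency_band(91000.0))   # 1.9 and 91 mM
```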

Molecular model of GsaCAβ
We created a predicted structural model of GsaCAβ using AlphaFold 50. The model is highly similar to crystallographic structures of other β-CAs. AlphaFold evaluates the per-residue confidence score (pLDDT, between 0 and 100) to be higher than 90 ("very high confidence") for 65.1% of the residues and higher than 70 ("confident") for an additional 26.2% of the residues. The N-terminal region 1-30 contains only residues with pLDDT <90.
Counting from H31 (just before β-strand β1) to the end of the sequence, 75.3% of the residues are of very high confidence and 19.2% are confident. The only regions with low-confidence residues (50 < pLDDT < 70) are the N-terminus (8 of the first 10 residues) and the irregular helix at 67 to 78. All pLDDT values in our model can be found at https://bit.ly/3qIMIv6 (Supplemental data).
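The confidence percentages quoted above come from binning the per-residue pLDDT scores. The sketch below shows one way to compute such a summary; the toy score list is hypothetical (the actual values are in the Supplemental data), and the treatment of scores lying exactly on a boundary is an assumption.

```python
def plddt_summary(scores):
    """Return the percentage of residues in each pLDDT confidence band."""
    n = len(scores)
    if n == 0:
        raise ValueError("no scores given")
    very_high = sum(1 for s in scores if s > 90)
    confident = sum(1 for s in scores if 70 < s <= 90)
    low = sum(1 for s in scores if 50 < s <= 70)
    very_low = n - very_high - confident - low
    counts = {
        "very high (>90)": very_high,
        "confident (70-90)": confident,
        "low (50-70)": low,
        "very low (<=50)": very_low,
    }
    return {band: 100.0 * count / n for band, count in counts.items()}


# Toy per-residue scores (hypothetical).
toy_scores = [95.0, 92.0, 85.0, 60.0, 40.0]
summary = plddt_summary(toy_scores)
for band, percent in summary.items():
    print(f"{band:18s} {percent:5.1f}%")
```

Restricting the input list to residues from H31 onwards would give the second set of percentages in the same way.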
In Figure 6 our model (yellow) is superimposed with pea β-CA (sky blue), with excellent fit over the core secondary structures and the loops of the catalytic site. The models of Figure 6 were superimposed with a strict iteration cut-off of 1.5 Å. There were 97 residue pairs left within the cut-off distance, with an RMSD of 0.858 Å, indicating excellent fit. Note that this RMSD is not a proper measure of the overall similarity of the two models, just between the subsets of the best-fitting residues. These structurally constant residues cover the parallel β-sheet formed of β-strands β2, β1, β3 and β7; α-helices α5 and α8 (on top of the β-sheet in Figure 6); the α-helix α6 extending after the HxxC motif; and the loops at the tips of β-strands β1 and β3 which form the catalytic site. Refer to Figure 3 for the numbering and locations of secondary structure elements.
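The caveat about the pruned RMSD can be illustrated numerically. The sketch below is a simplification: MatchMaker refits the structures iteratively while pruning, whereas this example applies a single 1.5 Å filter to a fixed list of inter-residue distances. The distances and function names are hypothetical.

```python
import math


def rmsd(distances):
    """Root-mean-square of a list of pairwise distances (in Å)."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def pruned_rmsd(distances, cutoff=1.5):
    """RMSD over the residue pairs within the cut-off, plus their count."""
    kept = [d for d in distances if d <= cutoff]
    return rmsd(kept), len(kept)


# Toy distances between matched residue pairs after superposition (hypothetical).
toy_distances = [0.5, 0.8, 1.0, 4.0, 6.0]

overall = rmsd(toy_distances)
pruned, n_kept = pruned_rmsd(toy_distances)
print(f"all pairs:    RMSD = {overall:.3f} Å over {len(toy_distances)} pairs")
print(f"pruned pairs: RMSD = {pruned:.3f} Å over {n_kept} pairs")
```

Because poorly fitting pairs are discarded before the value is computed, the pruned RMSD is always at or below the cut-off and says nothing about the regions that were removed.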
The region 150-168 represents an insertion relative to the pea β-CA sequence. This region is seen near the top of Figure 6, predicted to form a β-hairpin-like structure containing β-strands β5 and β6. Despite the lack of similar loops in β-CA models in PDB, this region is mainly evaluated to be of high confidence.
To our knowledge, this is the second 3D model for any platyhelminth β-CA to be made freely accessible, the first one being the β-CA of Schistosoma mansoni in the AlphaFold database. Our model is available as a PDB file at https://bit.ly/3tPaNT2.

Table 4. Anion inhibition data of the β-CA from G. salaris and human isoforms hCA I and hCA II as determined by stopped-flow CO2 hydrase assay.
a Mean from 3 different assays, by a stopped-flow technique (errors were in the range of ± 5-10% of the reported values); b Sodium salts, except sulfamide and phenylboronic acid.

Conclusions
We report here the cloning and characterisation of a β-class CA identified in the genome of the Monogenean platyhelminth Gyrodactylus salaris, a parasite of Atlantic salmon and other economically relevant aquaculture fish species. Sequence analysis and successful modelling of the protein confirm its membership in the β-CA class. Sequence comparisons and the observed enzymatic activity also suggest that the N- and C-terminal sequences missing from our incomplete sequence are insignificant. This new enzyme, GsaCAβ, showed a low but significant catalytic activity for the physiological CO2 hydration reaction, with a kcat of 1.1 × 10⁵ s⁻¹ and a kcat/Km of 7.58 × 10⁶ M⁻¹ s⁻¹. This activity was inhibited by acetazolamide (KI of 0.46 µM), a sulphonamide in clinical use, as well as by some inorganic anions and small molecules. Most investigated anions (fluoride, chloride, bromide, cyanate, thiocyanate, nitrite, bisulphite, bisulphide, tellurate, perosmate, divanadate, perrhenate, perruthenate, selenocyanate, imidosulfonate, and trithiocarbonate) were millimolar GsaCAβ inhibitors. Cyanide, stannate, peroxydisulfate, and fluorosulfonate showed sub-millimolar KI values of 0.77-0.92 mM. Sulfamide (KI of 81 µM), N,N-diethyldithiocarbamate (KI of 67 µM) and sulphamic acid (KI of 6.2 µM) were the most efficient GsaCAβ inhibitors. Given that there are very few non-toxic agents effective in combating this parasite, GsaCAβ is proposed as a new antiparasitic drug target for which effective inhibitors could be designed.

Disclosure statement
CT Supuran is Editor-in-Chief of the Journal of Enzyme Inhibition and Medicinal Chemistry. He was not involved in the assessment, peer review, or decision-making process of this paper. The authors have no relevant affiliations or financial involvement with any organisation or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.