An overview on the recently discovered iota-carbonic anhydrases

Abstract Carbonic anhydrases (CAs, EC 4.2.1.1) have been studied for decades and are classified as a superfamily of enzymes which includes, to date, eight gene families or classes indicated with the Greek letters α, β, γ, δ, ζ, η, θ, ι. This versatile enzyme superfamily is involved in multiple physiological processes, catalysing a reaction fundamental to all living organisms, the reversible hydration of carbon dioxide to bicarbonate and a proton. Recently, the ι-CA (LCIP63) from the diatom Thalassiosira pseudonana and a bacterial ι-CA (BteCAι) identified in the genome of Burkholderia territorii were characterised. The recombinant BteCAι was observed to act as an excellent catalyst for the physiological reaction. Very recently, the discovery of novel ι-CAs (COG4337) in the eukaryotic microalga Bigelowiella natans and the cyanobacterium Anabaena sp. PCC7120 has brought to light an unexpected feature of this ancient superfamily: these ι-CAs were catalytically active without a metal ion cofactor, unlike the previously reported ι-CAs as well as all other CAs investigated so far. This review reports the investigations on ι-CAs carried out over the last three years, highlighting their peculiar features and hypothesising that this new CA family may show catalytic activity without the need for metal ions.


Introduction
CAs represent a superfamily of enzymes considered among the most versatile on the planet. They are ubiquitous in the Archaea, Bacteria, and Eukarya domains, where they accelerate a reaction fundamental to all living organisms, the reversible hydration of carbon dioxide (CO₂) to bicarbonate (HCO₃⁻) and a proton (H⁺), according to the following chemical reaction: CO₂ + H₂O ⇌ HCO₃⁻ + H⁺ [1-7]. The spontaneous, reversible CO₂ hydration reaction occurs at a rate of 0.15 s⁻¹, which cannot meet the rapid demand for CO₂ and HCO₃⁻ of the complex sequences of metabolic pathways that allow organisms to grow and reproduce, maintain their structures, and respond to environmental changes [6,7]. CA activity raises the turnover of the CO₂ hydration reaction to kcat values in the range of 10⁴-10⁶ s⁻¹, several orders of magnitude above the uncatalysed reaction, making this superfamily of enzymes among the fastest biocatalysts known in nature [7].
CAs are involved in multiple physiological processes in all organisms in which they are present, such as respiration, photosynthesis, CO₂ and bicarbonate transport, pH and CO₂ homeostasis, electrolyte secretion in various tissues and organs, bone resorption, and calcification [8]. For example, carbon dioxide is a by-product of sugar and fat breakdown and must be removed from mammalian cells [9]. In peripheral tissues, the CO₂ produced by cellular aerobic metabolism leaves the cells and enters the bloodstream along a pressure gradient. Approximately 90% of the CO₂ flows into red blood cells and is converted to bicarbonate by CAs [9]. The bicarbonate produced leaves the red blood cells via an anion exchanger (AE) protein and is transported through the bloodstream to the lungs. At the alveolar level, the concentration of CO₂ is lower than in peripheral tissues, whereas the concentration of bicarbonate is higher, and bicarbonate is pumped into the red blood cells [9]. There, through the reverse reaction catalysed by CA, bicarbonate is converted into water and CO₂. The CO₂ produced in this way is released into the bloodstream and, passing through the alveolar walls, is exhaled. Several biosynthetic processes also use bicarbonate as a substrate for carboxylation reactions: among others, gluconeogenesis, lipogenesis, and ureagenesis rely on pyruvate carboxylase (PC), acetyl-CoA carboxylase (ACC), and carbamoyl-phosphate synthetase I and II, respectively [8,10-12]. The bicarbonate is produced in a CA-dependent manner [10-12].
In plants, CO₂ is stored as bicarbonate ions. In both terrestrial and aquatic plants, CA converts HCO₃⁻ ions to CO₂, which is concentrated in the proximity of the enzyme RuBisCO (ribulose bisphosphate carboxylase/oxygenase) present in the stroma of the chloroplasts [13-15]. As a result, the RuBisCO carboxylation reaction is enhanced, whereas its oxygenation reaction is suppressed.
Eukaryotic unicellular photosynthetic organisms have evolved diverse carbon concentrating mechanisms (CCMs) to increase the CO₂ concentration in the proximity of RuBisCO up to 1000-fold over the low CO₂ levels present in the environment. In algae, the main component of the CCM is the pyrenoid [16,17]. In cyanobacteria, the equivalent of the pyrenoid is the carboxysome. Carboxysomes are composed of RuBisCO, CAs, active bicarbonate transporters, and structural envelope proteins [18]. The structure of the carboxysome envelope prevents the escape of CO₂ from these organelles. Another notable biological phenomenon in which CAs are involved is coral calcification [19-23]. Calcium in seawater reacts with the HCO₃⁻ produced by coral CAs to form calcium carbonate and protons, which are extruded. CaCO₃ is thereafter deposited and generates the hard outer surface of corals [19].
In bacteria, the CA-catalysed reaction is the only known pathway to obtain and rapidly balance endogenous levels of CO₂, H₂CO₃ (carbonic acid), HCO₃⁻, and CO₃²⁻ (carbonate) [7,24-26]. CO₂ enters and leaves the bacterial cell by passive diffusion, while bicarbonate is imported directly into the cell through bicarbonate transporters [27]. Gram-negative bacteria have a CA in their periplasmic space to avoid the loss of CO₂ through diffusion. This enzyme rapidly converts the CO₂ generated by bacterial metabolism, and that coming from the atmosphere, into bicarbonate. HCO₃⁻ is thereafter pumped into the cytoplasm by bicarbonate transporters and, there, converted into CO₂ by cytoplasmic CAs belonging to the β- and/or γ-CA classes [15,24,27]. Thus, the bicarbonate transporters and bacterial CA enzymes provide the CO₂ and HCO₃⁻ that sustain bacterial metabolism [15,24,27].
The spontaneous interconversion of CO₂ and H₂O into HCO₃⁻ and H⁺ cannot supply CO₂ and HCO₃⁻ quickly enough for bacterial metabolism, as already mentioned, since the reaction rate is too low at physiological pH.
From these examples, the versatility of CAs is readily apparent: they are metabolic enzymes involved in many physiological processes indispensable for the life cycle of most living organisms [7,25].
The CA superfamily includes, to date, eight gene families or classes indicated with letters of the Greek alphabet (α, β, γ, δ, ζ, η, θ, ι) [1-5]. The distribution of the CA classes among the organisms investigated is rather varied: except for mammals, which encode only α-CAs, most organisms possess representatives of two or more genetic families. The genome of mammals encodes only the α-CA class, of which 15 isoforms have been identified [8,28-31]. In plants, α- and β-CAs have been recognised [32]. The α-, β-, γ-, and ι-CA classes are present in Bacteria, Archaea, and cyanobacteria [5-7,32-34]. Marine diatoms encode α-, δ-, ζ-, θ-, and ι-CAs [35-37]. In protozoa, α-, β-, and η-CAs have been detected. The recently discovered η-CA class probably has a pivotal role in the de novo purine/pyrimidine biosynthetic pathways of these organisms [38]. In the fungal kingdom, the typical class is represented by β-CAs, and most fungi encode at least one β-CA [39-41]. In contrast, most filamentous ascomycetes contain multiple β-CA genes and, in some of them, genes encoding α-CAs can also be found [39-41].
The eight CA classes are phylogenetically unrelated and can thus be classified as non-homologous isofunctional enzymes that catalyse the same reaction [1-7]. This is an example of convergent evolution: the CA classes show low similarity in their primary and possibly tertiary structures because they evolved in different biological contexts, yet they catalyse the same reaction, with the active site residues showing a rather similar geometry. CAs are metalloenzymes whose catalytic site contains a metal ion cofactor necessary for catalysis [5-7,34]. Usually, the Zn²⁺ ion cofactor is coordinated by three amino acid residues, which may be three His residues in the α-, γ-, δ- and, probably, θ-classes; one His and two Cys residues in the β- and ζ-CAs; and two His and one Gln residues in the η-class [42]. The fourth ligand is a water molecule/hydroxide ion acting as the nucleophile in the catalytic cycle [5,6,28,34,43,44]. Some CA classes can also coordinate metal ions other than Zn²⁺, such as Co²⁺, Cd²⁺, Fe²⁺, and Mn²⁺. As described in the literature, α-, β-, δ-, η- and, perhaps, θ-CAs use Zn²⁺ as the ion cofactor; γ-CAs use Fe²⁺, although they can also coordinate Zn²⁺ or Co²⁺ [31,45-51]. The ζ-CAs are active with either Cd²⁺ or Zn²⁺ incorporated into the same apoprotein and are defined as cambialistic enzymes [52-54]. From a structural point of view, the representatives of each CA class show a fold and structure different from those of the other classes. α-CAs are usually active as monomers or dimers; β-CAs are active only as dimers, tetramers, or octamers. The γ-CAs must be trimers to accomplish their catalytic function [46,47,50,55]. γ-CA monomers are characterised by a tandemly repeated hexapeptide, which is crucial for the left-handed fold of the trimeric β-helix structures [56]. The X-ray structure of the θ-CAs proved to be very similar to that of some β-CAs [57].
The crystal structure of ζ-CA showed three slightly different active sites on the same polypeptide chain [54]. No information is available so far on the structural organisation of δ- and η-CAs. Intriguingly, α-, η-, θ-, and ι-CAs were reported to catalyse the hydrolysis of esters/thioesters, while no esterase activity was detected for the other CA families [28,58,59].
2. The most recently discovered class, the ι-CAs

LCIP63 and BteCAι
In 2019, Gontero et al. discovered the ι-CAs (acronym LCIP63) by exploring the genome of the diatom Thalassiosira pseudonana [59]. LCIP63 was reported to prefer Mn²⁺ over Zn²⁺ as the ion cofactor, to be localised in the chloroplast, and to be expressed only at low CO₂ concentrations, supporting its primary role in the diatom CCM [59]. These authors also reported LCIP63 homologs in the genomes of other diatoms and algae, bacteria, and archaea. Most of the LCIP63 homologs identified in bacteria have been annotated in the data bank as members of the SgcJ/EcaC oxidoreductase family, of unknown function [59]. In 2020, Capasso et al. demonstrated that the recombinant bacterial ι-CA (acronym BteCAι) identified in the genome of Burkholderia territorii is an excellent catalyst for the hydration of CO₂ to bicarbonate and protons, with a kcat of 3.0 × 10⁵ s⁻¹ and a kcat/KM of 3.9 × 10⁷ M⁻¹ s⁻¹ [60]. Addition of Zn²⁺ or Ca²⁺ to the culture media used for enzyme expression in E. coli yielded catalytically active enzyme. In contrast, when Mn²⁺ was added, either no enzyme activity was present or the enzyme was found to contain zinc, probably from traces of this ion present as an impurity in the reagents used [60]. The protein proved sensitive to inhibition by substituted benzene-sulphonamides and clinically licensed sulfonamide-, sulfamate-, and sulfamide-type drugs, which are among the most investigated CA inhibitors (CAIs) [61]. The BteCAι inhibition profile showed several benzene-sulphonamides with an inhibition constant lower than 100 nM [61]. In addition to sulphonamides and their bioisosteres, anions and small molecules (another group of CAIs) were investigated as BteCAι inhibitors [62]. The best inhibitors were sulphamic acid, stannate, phenylarsonic acid, phenylboronic acid, and sulfamide (KI values of 6.2-94 µM), whereas diethyldithiocarbamate, tellurate, selenate, bicarbonate, and cyanate were submillimolar inhibitors (KI values of 0.71-0.94 mM). The halides (except iodide), thiocyanate, nitrite, nitrate, carbonate, bisulphite, sulphate, hydrogensulfide, peroxydisulfate, selenocyanate, fluorosulfonate, and trithiocarbonate showed KI values in the range of 3.1-9.3 mM [62]. These findings prompted us to propose that BteCAι is probably a Zn²⁺-containing enzyme [60] and not a Mn²⁺-containing enzyme as reported for the diatom ι-CAs [59].
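The two kinetic constants quoted above for BteCAι fix a third quantity that the text does not state explicitly, the Michaelis constant KM = kcat / (kcat/KM), which comes out at about 7.7 mM. The following is a minimal sketch of that arithmetic, assuming simple Michaelis-Menten behaviour; the function names and the CO₂ concentrations in the loop are illustrative choices, not values from the cited studies.

```python
"""Back-of-the-envelope Michaelis-Menten numbers for BteCAι (constants quoted in the text)."""

K_CAT = 3.0e5          # s^-1, turnover number reported for BteCAι
K_CAT_OVER_KM = 3.9e7  # M^-1 s^-1, catalytic efficiency reported for BteCAι


def michaelis_constant(k_cat: float, k_cat_over_km: float) -> float:
    """Return the K_M (in M) implied by k_cat and k_cat/K_M."""
    return k_cat / k_cat_over_km


def turnover_rate(substrate_molar: float, k_cat: float, k_m: float) -> float:
    """Michaelis-Menten rate per enzyme molecule, v/[E] = k_cat*[S]/(K_M + [S]), in s^-1."""
    return k_cat * substrate_molar / (k_m + substrate_molar)


if __name__ == "__main__":
    k_m = michaelis_constant(K_CAT, K_CAT_OVER_KM)
    print(f"Implied K_M = {k_m * 1e3:.1f} mM")
    # Hypothetical CO2 concentrations, chosen only to illustrate the saturation curve.
    for co2_mm in (0.1, 1.0, 10.0):
        rate = turnover_rate(co2_mm * 1e-3, K_CAT, k_m)
        print(f"[CO2] = {co2_mm:5.1f} mM -> v/[E] = {rate:.3g} s^-1")
```

At substrate concentrations well below the implied KM the rate scales with kcat/KM, which is why this ratio, rather than kcat alone, is the usual basis for comparing CAs.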

Primary structure features of i-CAs
LCIP63 and the homologs identified as bacterial ι-CAs (like BteCAι) show a primary sequence that differs completely from that of any previously identified CA class [59,60]. For example, LCIP63 displays at its N-terminus an endoplasmic reticulum signal peptide (22 amino acid residues) and a chloroplast signal peptide (34 amino acid residues) [59]. It is a multidomain protein with four, three, or two repeated domains, each homologous to the calcium/calmodulin-dependent protein kinase II association domain (CaMKII-AD) [59]. The CaMKII-AD belongs to the NTF2-like protein superfamily, a group of proteins sharing a common fold identified for the first time in the structure of rat NTF2 (nuclear transport factor 2) [63]. Generally, the polypeptide chain of the bacterial ι-CAs presents a pre-sequence of 19 or more amino acid residues at the N-terminus and contains one or two repeated domains. The amino acid sequence is homologous to a group of proteins annotated as the SgcJ/EcaC oxidoreductase family, of unknown function. These proteins share a common structure with the NTF2-like superfamily, having a hydrophobic pocket that could constitute a putative substrate-binding or catalytic site. Figure 1 reports the multialignment of ι-CA amino acid sequences from different species. It is evident that the ι-CA sequences lack the conserved residues essential for the catalytic mechanism of all other known CAs, such as the three histidine ligands (of α-, γ-, and δ-CAs), the two histidines and one glutamine (of η-CAs), or the one histidine and two cysteines (of β-, ζ-, and θ-CAs). However, all the amino acid sequences analysed and classified as ι-CAs (LCIP63-ιCAs and SgcJ/EcaC-ιCAs) remarkably contain, in the C-terminal domain, a consensus motif with the residues (H)HHSS, which seems to be a specific feature of ι-CAs (Figure 1).
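A simple way to screen candidate sequences for the (H)HHSS consensus described above is a regular-expression search. The sketch below assumes the parenthesised H denotes an optional leading histidine; the function name and the example sequences in the usage note are hypothetical and are not taken from Table 1.

```python
import re
from typing import List, Tuple

# (H)HHSS read as: an optional leading His followed by the invariant HHSS stretch.
IOTA_CA_MOTIF = re.compile(r"H?HHSS")


def find_motif(sequence: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched text) for every (H)HHSS hit."""
    return [
        (match.start() + 1, match.group())
        for match in IOTA_CA_MOTIF.finditer(sequence.upper())
    ]
```

For instance, `find_motif("MKTAHHHSSLV")` reports a hit starting at residue 5, whereas a sequence containing only `HSS` returns an empty list. A motif hit alone does not establish CA activity; as the text stresses for the NTF2-like fold, function has to be confirmed biochemically.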

ι-CAs (COG4337) with no metal ions within the catalytic site
Recently, a Japanese group identified novel CAs (acronym COG4337) encoded by the genomes of the eukaryotic microalga Bigelowiella natans and the cyanobacterium Anabaena sp. PCC7120 [64]. COG4337 homologs from eukaryotic organisms proved to be multidomain proteins formed of up to five domains, while the prokaryotic homologous genes encode only a single domain of about 160 amino acid residues [64]. The Bigelowiella natans and Anabaena sp. PCC7120 CAs (indicated here as BnaCA and AspCA) showed the HHSS consensus characterising the ι-CAs mentioned above (Figure 1). They showed CO₂ hydration activity, which was quantified in Wilbur-Anderson units (WAU). The enzyme activities of BnaCA and AspCA were 7 and 37 times lower, respectively, than that obtained for a mammalian CA, but in the same range as those of the θ-CA from Phaeodactylum tricornutum and the ι-CA from T. pseudonana [64]. Moreover, and this was a great surprise, both BnaCA and AspCA proved to be catalytically active without a metal ion cofactor, unlike the other reported ι-CAs as well as any other CA investigated so far [64].
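The text does not spell out how the units are computed. As a hedged illustration, the commonly used definition of the Wilbur-Anderson unit is assumed here: WAU = (T0 - T) / T, where T0 and T are the times taken for a fixed pH drop in the uncatalysed and enzyme-catalysed assay, respectively. The function name and the timings in the usage note are hypothetical and are not data from Hirakawa et al.

```python
def wilbur_anderson_units(t_blank_s: float, t_sample_s: float) -> float:
    """Wilbur-Anderson units from assay times (assumed definition: (T0 - T) / T).

    t_blank_s  -- time for the pH drop without enzyme (T0), in seconds
    t_sample_s -- time for the same pH drop with enzyme (T), in seconds
    """
    if t_blank_s <= 0 or t_sample_s <= 0:
        raise ValueError("assay times must be positive")
    return (t_blank_s - t_sample_s) / t_sample_s
```

With a hypothetical blank time of 60 s and a sample time of 20 s, the function returns 2.0 units. Activities are then usually normalised per milligram of protein before enzymes are compared, as in the fold differences quoted above.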

Phylogenetic analysis
From the amino acid alignment of the two metal-free ι-CAs with the ι-CAs from other species, it is readily apparent that the main residues of the catalytic pocket of the two enzymes identified by Hirakawa et al. are completely conserved in all the polypeptide chains considered in this paper (Figure 1). A distinctive feature of the metal-free ι-CAs is the presence of an insertion absent in all the other, presumably metal-containing, ι-CAs (Figure 1). The analysis of the hallmarks present in the amino acid sequences is far from exhaustive, as it does not consider all the amino acid substitutions that differentiate the novel metal-free CAs from the other ι-CAs. Hence, we constructed a most parsimonious tree to better investigate the relationships of the novel ι-CAs identified by Hirakawa et al. with ι-CAs from other species, such as diatoms and bacteria (Figure 2). Table 1 presents the information needed to identify the amino acid sequences used in the phylogenetic analysis. The two metal-free CAs appear closely associated with each other, as shown in the dendrogram in Figure 2. BnaCA and AspCA clustered in a branch distinct from all the other (presumably) metal-containing ι-CAs identified in diatoms and in the bacterial species mentioned above. Thus, BnaCA and AspCA have several features typical of other ι-CAs but, in other respects, do not appear closely related to any of the metal-containing ι-CAs. For this reason, they were annotated as a new subclass of the ι-CAs [64]. Moreover, Del Prete et al. demonstrated that ι-CAs clustered in a group closely associated with the bacterial γ-CAs [60]. Probably, during evolution, a ι-CA originated from an ancestral γ-CA and developed a structured catalytic pocket that took over the CO₂ hydration function, making the CO₂ hydratase reaction possible without the metal ion cofactor, as proposed by Hirakawa et al. [64].
Figure 1. Multialignment of the ι-CA amino acid sequences from different species (bacteria, cyanobacteria, diatoms, and algae). The putative residues of the catalytic triad (T106, Y124, S199) are shown in red; H197, the putative proton shuttle residue, in light blue; and the other residues of the catalytic pocket in green (see Hirakawa et al. [64]). The putative motif (H)HHSS is the typical consensus sequence characterising the ι-CAs. The asterisk (*) indicates identity at all aligned positions. The multialignment was performed with MUSCLE, version 3.1. See Table 1 for the identification of the amino acid sequences used in the multialignment. The residue numbering refers to the AspCA enzyme.
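The asterisk convention used in the Figure 1 alignment (identity at all aligned positions) can be reproduced from any set of pre-aligned, equal-length sequences. The sketch below is only an illustration of that convention; the function name is hypothetical, gap characters are assumed to be written as "-", and the aligned fragments in the usage note are invented rather than taken from the figure.

```python
from typing import List


def identity_line(aligned: List[str]) -> str:
    """Return a string with '*' under every column that is identical in all sequences.

    Columns containing a gap ('-') are never marked as identical.
    """
    if not aligned:
        return ""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must all have the same length")
    marks = []
    for column in zip(*aligned):
        first = column[0]
        conserved = first != "-" and all(residue == first for residue in column)
        marks.append("*" if conserved else " ")
    return "".join(marks)
```

For the invented fragments `["HHSSA", "HHSST", "HHSS-"]`, the first four columns are marked and the last is left blank. Such a per-column check only flags strict identity; it says nothing about conservative substitutions, which is one reason the text turns to a phylogenetic tree.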

Three-dimensional structure analysis
X-ray crystallographic structures were obtained for COG4337 proteins in the presence of bicarbonate and the anion inhibitor iodide [64]. Figure 3 shows the three-dimensional structures of hNTF2 and CaMKII, two proteins belonging to the NTF2-like superfamily (Figure 3(A,B), respectively); the three-dimensional structures of AspCA and BnaCA (the novel ι-CAs identified by Hirakawa et al. [64]) obtained in the presence of bicarbonate and iodide anion (Figure 3(C,D), respectively); the crystal structure of an SgcJ/EcaC oxidoreductase identified in the genome of Xanthomonas campestris (Figure 3(E)); and the BteCAι homology model (Figure 3(F)), generated with the fully automated protein homology modelling server SWISS-MODEL (https://swissmodel.expasy.org) using the homologous enzyme from X. campestris as the template structure. Here, we want to stress that protonography [65], a biochemical technique used to identify the activity and the oligomeric state of CAs on SDS-PAGE, demonstrated that BteCAι can be present as a dimer, as shown by the homology model obtained [60].
Interestingly, all the proteins reported in Figure 3 are homodimers with a very high degree of structural homology and belong to the NTF2-like family (Figure 3). The NTF2-like superfamily includes proteins widely found in both prokaryotic and eukaryotic organisms, which possess highly versatile roles [63]. These proteins can perform a broad range of functions because their three-dimensional fold forms a channel that can accommodate different molecular species [63]. The NTF2-like family represents a classic example of divergent evolution, in which proteins have similar general structures but diverge significantly in their functions [63], which can often be determined only from the biochemical analysis of the proteins. Our groups in fact recently demonstrated, with the aid of a stopped-flow spectrophotometer, that the bacterial amino acid sequence annotated as SgcJ/EcaC oxidoreductase and characterised by the motif (H)HHSS is a CA, which acts as a good catalyst for the CO₂ hydration reaction [60]. Intriguingly, the homology model of BteCAι has a shape very similar to the crystal structures of AspCA and BnaCA, obtained with a bicarbonate molecule or an iodide ion, respectively, located inside the cone-shaped barrel. The most crucial evidence from the AspCA and BnaCA structures is that no electron density corresponding to metals was detected in the cavity of either enzyme, confirming that the catalytic activity of these two enzymes does not depend on the presence of a metal ion (Figure 4). Indeed, the residues of the catalytic pocket involved in the binding of bicarbonate and iodide are highlighted in Figure 4. We want to stress that the model of BteCAι presented in 2020 did not allow a zinc ion to be placed at the interface of the two monomers such that the metal would be coordinated by two histidines of one monomer and one histidine of the other [60]. However, it is also true that the catalytic activity of BteCAι was observed upon adding Zn²⁺, as described by Del Prete et al. [60]. On the other hand, zinc may have a structural rather than a catalytic function in the bacterial ι-CAs. Figure 5 reports the binding pockets of the metal-free ι-CAs, the ι-CA from B. territorii, and CaMKII (NTF2-like superfamily).
Table 1. Organisms, acronyms, and accession numbers of the amino acid sequences used in the phylogenetic analysis of ι-CAs.
Figure 2. Phylogenetic analysis of ι-CAs from various organisms. The dendrogram was constructed using the ι-CA amino acid sequences reported in Table 1.
The three-dimensional arrangement of AspCA and BnaCA revealed a cone-shaped catalytic site whose cavity is formed by hydrophilic (Thr, Ser, His, Lys, and Tyr) and hydrophobic (Trp and Phe) residues (Figure 4). Point mutation experiments highlighted the essential amino acids of the COG4337 catalytic pocket, and a putative catalytic mechanism for the CO₂ hydration reaction has been proposed for these metal-free CAs (Figure 4). In the known metallo-CAs, such as the human isoforms hCA I and hCA II, the initial step of the reaction involves the deprotonation of H₂O in the active site to generate the nucleophilic OH⁻ ion, which attacks CO₂, producing HCO₃⁻; the gatekeeper residues (Thr199 and Glu106) accept a hydrogen bond from the zinc-bound water, while the proton shuttle residue (His64) transfers the protons away from the active site. In the metal-free CAs BnaCA and AspCA, the hydroxyl groups of Thr106, Tyr124, and Ser199, present in the catalytic site, were proposed to be involved in the deprotonation of the active site water. The function of the proton shuttle is probably mediated by His197 or by the Tyr positioned on the molecular surface of the enzyme (Figure 4). Hirakawa et al. assumed that the CO₂ binding site could be the hydrophobic part of the catalytic pocket. Interestingly, the metal-free ι-CAs probably do not catalyse the reverse reaction, the dehydration of bicarbonate to CO₂ [64], which might be a peculiar feature of these enzymes, as all other known metallo-CAs catalyse both the CO₂ hydration and the bicarbonate dehydration reactions [2,7,8].
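For comparison with the metal-free proposal, the metallo-CA steps described above can be written out as a scheme. This is a simplified sketch of the textbook zinc-hydroxide mechanism, with E standing for the enzyme; it is not reproduced from the cited papers, and the proton release in the last step is the one assigned in the text to the shuttle residue His64.

```latex
\begin{align}
\mathrm{E{-}Zn^{2+}{-}OH^{-}} + \mathrm{CO_2}
  &\rightleftharpoons \mathrm{E{-}Zn^{2+}{-}HCO_3^{-}} \\
\mathrm{E{-}Zn^{2+}{-}HCO_3^{-}} + \mathrm{H_2O}
  &\rightleftharpoons \mathrm{E{-}Zn^{2+}{-}OH_2} + \mathrm{HCO_3^{-}} \\
\mathrm{E{-}Zn^{2+}{-}OH_2}
  &\rightleftharpoons \mathrm{E{-}Zn^{2+}{-}OH^{-}} + \mathrm{H^{+}}
\end{align}
```

Summing the three steps returns the overall reaction CO₂ + H₂O ⇌ HCO₃⁻ + H⁺. In the mechanism proposed for BnaCA and AspCA, the role played here by the zinc-bound hydroxide would instead fall to a water molecule activated by the Thr106, Tyr124, and Ser199 hydroxyl groups.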

Conclusions
The recent report of a metal-free CA belonging to the ι-class [64], together with a very interesting proposal for the catalytic mechanism by which these enzymes hydrate CO₂, prompted us to investigate in detail the phylogenetic relationships and the primary, secondary, and tertiary structures of the two other ι-CAs characterised in detail: the presumably manganese-containing enzyme from a diatom (T. pseudonana) [59] and the presumably zinc-containing bacterial enzyme from Burkholderia territorii [60]. The analysis also included many such sequences from other organisms, which have not yet been characterised in detail. It was thus observed that ι-CAs possess considerable structural homology with the NTF2-like family of proteins, which have a great variety of functions and physiological roles [63]. Furthermore, the residues proposed to be involved in the CO₂ hydration reaction of the metal-free ι-CAs were observed to be conserved in all sequences of such enzymes present in diatoms and bacteria. This, coupled with the fact that computational techniques did not allow us to position zinc ions in the model of BteCAι, although we had determined the presence of one mole of Zn²⁺ per polypeptide chain of this protein (using atomic absorption spectrophotometry) [60], prompts us to hypothesise that in all ι-CAs the metal ions probably have a structural rather than a catalytic function (although no metal ions were present in the X-ray crystal structure of AspCA). It is also possible that the zinc or manganese ions reported to be necessary for the catalytic activity of some ι-CAs are an artefact due to the ubiquity of some metal ions, present in traces in most reagents, solvents, glassware, etc. However, we wish to stress that the results reported by Jensen et al. [59] and Del Prete et al. [60] remain valid even if those enzymes are metal-free CAs. In fact, Hirakawa et al. [64] also reported adducts of AspCA with anion inhibitors (iodide and bicarbonate), and such an inhibitory effect was also reported with various anions (inorganic and organic) for the presumably metal-containing ι-CAs [59-62]. Thus, future work is needed to establish whether all ι-CAs are metal free, or whether some of them may use manganese or zinc ions within their active site and the subclass reported by Hirakawa et al. [64] is just a minority of such enzymes.
The authors have no relevant affiliations or financial involvement with any organisation or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript. This includes employment, consultancies, honoraria, stock ownership or options, expert testimony, grants or patents received or pending, or royalties.