The AAA+ superfamily: a review of the structural and mechanistic principles of these molecular machines

Abstract
ATPases associated with diverse cellular activities (AAA+ proteins) are a superfamily of proteins found throughout all domains of life. The hallmark of this family is a conserved AAA+ domain responsible for a diverse range of cellular activities. Typically, AAA+ proteins transduce chemical energy from the hydrolysis of ATP into mechanical energy through conformational change, which can drive a variety of biological processes. AAA+ proteins operate in a variety of cellular contexts with diverse functions including disassembly of SNARE proteins, protein quality control, DNA replication, ribosome assembly, and viral replication. This breadth of function both illustrates the importance of AAA+ proteins in health and disease and emphasizes the importance of understanding conserved mechanisms of chemo-mechanical energy transduction. This review is divided into three major portions. First, the core AAA+ fold is presented. Next, the seven different clades of AAA+ proteins are described, along with structural details and reclassification pertaining to proteins in each clade. Finally, two well-known AAA+ proteins, NSF and its close relative p97, are reviewed in detail.


Introduction
Chemo-mechanical energy transduction, in which the energy liberated upon nucleotide hydrolysis is harnessed to perform work, is an essential biochemical feature in all living organisms and viruses. Typically, the energy of the β-γ bond in triphosphate derivatives is released upon hydrolysis and subsequently used to drive biological processes. P-loop NTPases are a class of proteins that bind and hydrolyze nucleotides (Saraste et al. 1990). P-loop NTPases encompass broad protein superfamilies such as kinases, GTPases, and AAA+ proteins, and are thought to predate the last universal common ancestor (LUCA) (Kyrpides et al. 1999; Leipe et al. 2002; Iyer et al. 2004). Common to all P-loop NTPases is the presence of two key motifs, the Walker A motif (GxxxxGK(T/S), where x is any amino acid; also referred to as the Walker-A P-loop element) and the Walker B motif (hhhhD(D/E), where h is any hydrophobic amino acid) (Walker et al. 1982). The Walker A motif primarily binds nucleotide, and the Walker B motif drives hydrolysis by coordinating Mg2+ (D) and water (D/E) (Story and Steitz 1992). The P-loop NTPases are divided into two diverse subclasses, the kinase-GTPase and ASCE (additional strand catalytic element) subclasses. This review concerns the AAA+ proteins, which belong to the ASCE subclass (Leipe et al. 2002, 2003; Iyer et al. 2004) and in many cases perform mechanical work to remodel their substrates.
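The two consensus patterns quoted above translate directly into regular expressions. The following is a minimal Python sketch, not a validated motif finder: the hydrophobic residue set used for "h", the function name find_motifs, and the toy sequence are all assumptions made for illustration, and real AAA+ proteins often deviate from the strict consensus.

```python
import re

# Residues treated as hydrophobic ("h") in the Walker B pattern.
# This particular set is an assumption; definitions vary between authors.
HYDROPHOBIC = "AVILMFWC"

# Walker A: GxxxxGK(T/S), where x is any amino acid.
WALKER_A = re.compile(r"G.{4}GK[TS]")
# Walker B: hhhhD(D/E), where h is any hydrophobic amino acid.
WALKER_B = re.compile(rf"[{HYDROPHOBIC}]{{4}}D[DE]")


def find_motifs(sequence: str) -> dict:
    """Return 0-based start positions of Walker A and Walker B matches."""
    seq = sequence.upper()
    return {
        "walker_a": [m.start() for m in WALKER_A.finditer(seq)],
        "walker_b": [m.start() for m in WALKER_B.finditer(seq)],
    }


# Made-up toy sequence (not a real protein) with one copy of each motif.
toy = "MSTPGPPGTGKTLLAKEEILFIDEIDAL"
print(find_motifs(toy))  # {'walker_a': [4], 'walker_b': [18]}
```

A strict pattern match like this only flags candidate sites; as noted in the text, the AAA+ fold is conserved even where sequence conservation is low, so sequence patterns alone cannot establish membership in the superfamily.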
The AAA+ superfamily was first proposed following sequence comparisons of several yeast proteins (e.g. Sec18, an ortholog of NSF, and Cdc48, an ortholog of p97) that cluster together based on a conserved domain (Fröhlich et al. 1991; Kunau et al. 1993). Subsequent work revealed that this AAA+ domain is shared among many more proteins (Confalonieri and Duguet 1995; Beyer 1997; Patel and Latterich 1998). Key studies of the D2 AAA+ domain of N-ethylmaleimide-sensitive factor (NSF) (Lenzen et al. 1998; Yu et al. 1998) and the δ′ subunit from the bacterial clamp loader (Guenther et al. 1997) produced the first high-resolution structures of the AAA+ domain (Figure 1(a)). These structures revealed that, despite little to no sequence conservation, the AAA+ domain maintains a strikingly conserved fold and that these diverse proteins represent a unique protein superfamily (Neuwald et al. 1999; Iyer et al. 2004). The superfamily of AAA+ proteins is involved in a variety of biochemical systems in all domains of life and viruses (Neuwald et al. 1999; Iyer et al. 2004; Burroughs et al. 2007; Wendler et al. 2012; Hilbert et al. 2015; Miller and Enemark 2016; Puchades et al. 2020; Seraphim and Houry 2020). For example, in DNA replication (clade 1), a sliding clamp encircles DNA and associates with DNA polymerase at the replication fork to prevent it from falling off; this process is coordinated by a class of AAA+ proteins called clamp loaders (Kelch et al. 2012). A separate class of AAA+ proteins (clade 2) called initiators assists in recognizing replication origins and allows for assembly of the replication machinery on the DNA (Duderstadt and Berger 2008). AAA+ domains loaded in tandem accomplish the unraveling of folded proteins and even the disassembly of protein complexes, such as ubiquitin-tagged proteins (clade 3) (DeLaBarre and Brunger 2003; Banerjee et al. 2016; Bodnar et al. 2018; Cooney et al. 2019; Twomey et al. 2019; Pan et al. 2021) and SNARE protein complexes (Zhao et al. 2015; White et al. 2018), respectively. AAA+ proteins are encoded within viral genomes to assist with their replication (clade 4) (Shen et al. 2005; Enemark and Joshua-Tor 2006; Santosh et al. 2020). AAA+ proteins can also assist in the untangling of protein aggregates prior to destruction by a proteasomal complex (Yedidi et al. 2017).
AAA+ proteins are the translocation engine that drives protein degradation in several different protease complexes and secretion systems (clade 5) (Sauer and Baker 2011; Livneh et al. 2016; Ho et al. 2018). In addition to DNA polymerase clamp loaders, some members of the AAA+ family bind to and regulate RNA polymerases (clade 6) (Joly et al. 2012). Dynein, a widely studied cytoskeletal protein that moves along microtubules in cells by coupling ATP hydrolysis to large conformational changes (clade 7) (Reck-Peterson et al. 2018), is also a AAA+ protein (Roberts et al. 2009).

AAA+ core domain
This AAA+ core domain consists of an αβα "sandwich" topology (Figure 1). The central β sheet (β5-β1-β4-β3-β2) is flanked on both sides by α-helices. Linearly, the core domain is ordered as N′-α0-β1-α1-β2-α2-β3-α3-β4-α4-β5-(α-helical bundle, 3 or 4 helices)-C′ (Figure 1(b)). The aforementioned Walker A motif is after strand 1, and the Walker B motif is in strand 3. Strand 4 contains sensor-1, a set of conserved polar residues which mediate ATP hydrolysis in coordination with the Walker B motif. Catalytically inactive AAA+ domains such as the NSF D2 domain lack functional arginine fingers and thus hydrolyze ATP very slowly (Morgan et al. 1994; May et al. 2001). The arginine finger(s) element is generally between α4 and β5. These residue(s) coordinate ATP from neighboring AAA+ domains and may be involved in communication of the nucleotide state between subunits. The region spanning between β4 and β5 is referred to as the second region of homology (SRH), Box VII motif, or the SRC motif (Neuwald et al. 1999; Ogura et al. 2004) (Figure 1(b)). A distinguishing feature of the AAA+ core domain that separates it from other NTPases is the presence of multiple insertions. The β2 strand insertion is only present in the ASCE subclass (hence the name additional strand catalytic element) (Iyer et al. 2004). The α0 helix and a conserved region upstream of this helix are only present in the AAA+ protein superfamily. The α0 helix typically contains a conserved glycine and a conserved polar residue that defines the N-terminal portion of this helix (Iyer et al. 2004). The C-terminal α-helical bundle, which usually has three or four α helices, is also unique to the AAA+ core domain. Within the α-helical bundle is sensor-2, a conserved motif that typically contains arginine (or alanine in the classical AAA+ protein clade (Wendler et al. 2012), see below), which interacts with ATP and can change conformation depending on whether ATP or ADP is bound. It has been speculated that this bundle may transmit the free energy of ATP hydrolysis to substrate (Iyer et al. 2004). In some cases, the sensor-2 element can act as a trans-acting element and interact with a bound ATP molecule from a different AAA+ domain (e.g. MCM (Miller et al. 2014)).
Figure 1. […] (Guenther et al. 1997) and NSF (purple, PDB: 1NSF) are overlaid. (b) Cartoon schematic of the AAA+ core domain. α-helices in the large subdomain are colored green, β-sheets are colored orange, and α-helices in the small subdomain are colored yellow. Some AAA+ proteins have either three or four helices in their small subdomain; the fourth helix is colored purple to indicate this. (c) The core AAA+ domain of the ATPγS-bound E. coli clamp loader complex (PDB: 1XXH (Kazmirski et al. 2004)) from the gene HolA is shown. The core β-sheets are colored orange, the surrounding regions of the large subdomain are green, and the small subdomain has the first three helices colored yellow, while the fourth helix is colored purple.
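Because the linear order of secondary-structure elements differs from the spatial order of strands in the central sheet, it can help to hold both orderings side by side. The sketch below is a minimal illustration in Python that encodes only what is stated above; the names CORE_ORDER, SHEET_ORDER, sheet_neighbors, and elements_between are hypothetical helpers, not part of any existing tool.

```python
# Linear (N- to C-terminal) order of core elements, as listed in the text.
# The C-terminal bundle has three or four helices and is kept as one entry.
CORE_ORDER = ["α0", "β1", "α1", "β2", "α2", "β3", "α3", "β4", "α4", "β5",
              "α-helical bundle"]

# Spatial order of strands across the central β sheet, as listed in the text.
SHEET_ORDER = ["β5", "β1", "β4", "β3", "β2"]


def sheet_neighbors(strand):
    """Return the (left, right) spatial neighbors of a strand; None at an edge."""
    i = SHEET_ORDER.index(strand)
    left = SHEET_ORDER[i - 1] if i > 0 else None
    right = SHEET_ORDER[i + 1] if i < len(SHEET_ORDER) - 1 else None
    return left, right


def elements_between(start, end):
    """Return the elements lying strictly between two elements in linear order."""
    i, j = CORE_ORDER.index(start), CORE_ORDER.index(end)
    return CORE_ORDER[i + 1:j]


print(sheet_neighbors("β1"))         # ('β5', 'β4')
print(sheet_neighbors("β2"))         # ('β3', None): β2 sits at the sheet edge
print(elements_between("β4", "β5"))  # ['α4']: the span called the SRH above
```

The example shows, for instance, that β1 and β5 are adjacent in space although they are far apart in sequence, and that the inserted β2 strand lies at one edge of the sheet.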

Common mechanistic features of AAA+ proteins
The defining feature of a majority of AAA+ proteins is their ability to oligomerize and form ring- or spiral-like complexes (Vale 2000). These complexes can act as molecular screws, unwinders, or threaders. Inter-protomer interaction is mediated by a conserved arginine finger(s) as stated above. This element coordinates the triphosphate nucleotide bound to an adjacent AAA+ protomer. This interaction elicits inter-protomer joining and enables the detection of the nucleotide state of neighboring protomers. It may thus play a role in passing information through the complex from protomer to protomer in order to coordinate the broader activity of the oligomer. Generally speaking, however, principles governing the oligomerization of AAA+ proteins, inter-protomer communication, and the collective motions connected to function remain to be fully elucidated.
Another common theme throughout AAA+ proteins is stimulated ATPase activity upon substrate binding. For example, NSF shows limited ATPase activity until it forms a complex with two other components (reviewed thoroughly below). p97 shows stimulated ATPase activity only when associated with a substrate that it can unfold (Blythe et al. 2017). Mutations to the DNA-binding residues of DnaC severely impaired ATPase activity compared to the wild type (Arias-Palomo et al. 2019). PAN, the proteasomal ATPase complex in archaea homologous to a portion of the 26S eukaryotic complex, shows increased ATPase activity in the presence of an ssrA-tagged substrate. Coupling elevated ATPase activity to substrate binding likely allows AAA+ protein complexes to conserve ATP.
Domains N-terminal to the AAA+ domain ("N-domains") are diverse and primarily function in substrate engagement in many AAA+ families. The N-domains of AAA+ proteins typically vary across different families, owing in part to the many different types of substrates that AAA+ proteins interact with. The N-terminal domain of DnaC has two α helices that bind to DnaB and trap it in an open ring conformation (Chodavarapu et al. 2016). Meiotic AAA+ families VPS4, katanin, fidgetin, and spastin all share a similar N-terminal microtubule-binding domain (reviewed in the clade 3 section) that mediates substrate binding (Monroe and Hill 2016). The Lon protease N-domain has a specific binding site for a degron signal from a substrate (Wohlever et al. 2014). NSF's N-domains do not bind substrate directly, but instead bind the adaptor proteins SNAPs/Sec17, which in turn bind the SNARE substrate meant for disassembly (Zhao et al. 2015). The protein p97/Cdc48 is a close relative of the NSF/Sec18 family and shares a similar N-domain that also engages with an adaptor/substrate complex (Cooney et al. 2019; Twomey et al. 2019; Pan et al. 2021).

The seven clades of AAA+ proteins
Nearly all AAA+ proteins share the common elements in the AAA+ core domain (Figure 1(b,c)). The proteins that share this common blueprint can be further subdivided into seven different clades, each with unique structural elements (Iyer et al. 2004) (Figure 2). This section will cover the phylogeny of known AAA+ proteins within each clade and focus on reviewing structural details of proteins in each clade (Table 1). It also includes the categorization of proteins within clade 3, the classical AAA+ proteins.

Clade 1: Clamp loader clade
Clade 1 does not deviate from the AAA+ core domain structurally. The arginine finger is immediately followed by a cysteine residue (Iyer et al. 2004). DNA replication in all cells requires the use of clamps, ring-shaped proteins that keep the DNA polymerase closely bound to the DNA (Hedglin et al. 2013). The clamp completely encircles the double-stranded DNA to serve as a sliding platform upon which DNA polymerase and other replication factors attach and associate, in addition to the DNA itself (Fernandez-Leiro et al. 2015). This complex of clamp and factors translocates along with the DNA and can even bypass DNA-protein crosslinks on the translocation strand (Sparks et al. 2019). Association of DNA polymerase with a clamp increases both the rate of nucleotide incorporation and the processivity (i.e. the number of nucleotides added in a single association), underscoring its essential function in DNA replication in all three domains of life (Maki and Kornberg 1985; O'Donnell and Kornberg 1985; Mok and Marians 1987; McInerney et al. 2007; Kelch et al. 2012). Equally important to the process of replication is its counterpart, the clamp loader, an AAA+ protein that fastens this ring to the DNA (Duderstadt and Berger 2008). The clamp loader is a molecular switch controlled by ATP binding and hydrolysis (Goedken et al. 2004). ATP-bound clamp loader has a high affinity for the clamp, and when bound to ATP (Pietroni et al. 1997; Turner et al. 1999), DNA, and the clamp, its ATPase activity is stimulated (Jarvis et al. 1989; Ason et al. 2000). The affinity of the clamp loader for ADP is lower, so following hydrolysis, ADP is released, and the clamp and DNA are ejected (Kelch 2016).
Deviating from the "traditional" homohexameric arrangement of AAA+ proteins, clamp loaders are heteropentameric with a larger gap between the first and fifth subunit (Figure 3) (Jeruzalmi et al. 2001). This suggests that DNA is loaded through this gap. Following the previous revision of notation (Kelch et al. 2012), subunits are referred to as A through E.
Within clade 1, there are three major phylogenetic families: the bacterial family, the RFC family, and the WHIP family (Iyer et al. 2004). The T4 bacteriophage clamp loader (Figure 3) is a well-studied member of clade 1. It is closest to members of the eukaryotic RFC family in structure and sequence (Kelch et al. 2012), although it is difficult to place phylogenetically because the T4 bacteriophage is composed of genes from both eukaryotic and bacterial sources. The protein gp62 occupies the A position, and four identical gp44 subunits occupy positions B, C, D, and E. gp62 binds two clamp subunits, unlike gp44, which binds one (Kelch et al. 2011). High-throughput mutagenesis of the T4 bacteriophage clamp loader revealed that regions not involved in catalysis or binding could tolerate mutations (Subramanian et al. 2021). The exception to this trend was a mutationally sensitive glutamine residue (Gln 118) that is spatially distant from both catalytic and interfacial sites. Hydrogen bonds formed by this residue appear to rigidly fasten two helices in the AAA+ core domain and connect DNA bound in the central channel of the clamp loader to the nucleotide at the catalytic site. This glutamine-mediated hydrogen bonding network is present in AAA+ proteins outside of clade 1 and may thus be an important general feature of these proteins, as it appears to link domains responsible for ATP binding and/or hydrolysis across neighboring protomers (Subramanian et al. 2021).
Figure 2. Schematic of the seven clades. The coloring scheme for the core domain is the same as above. Modifications to the core fold by insertions in the clade are graphically noted. Cylindrical-shaped insertions represent α-helices and two-sided arrows connected with a black connector represent β-hairpins. Clade 2 insertions are colored red, clade 3 insertions are colored purple, clades 4-7 insertions are colored blue, clades 6 and 7 specific insertions are colored yellow, and clade 7 specific insertions are colored gray.
The bacterial family contains a zinc cluster insertion downstream of the β1 strand (Guenther et al. 1997; Iyer et al. 2004) and consists of two lineages that likely arose through ancient gene duplication, exemplified by the E. coli proteins HolA (Dong et al. 1993) (δ subunit) and DnaX (Kodaira et al. 1983; Mullin et al. 1983) (γ and τ subunits)/HolB (Carter et al. 1993; Dong et al. 1993) (δ′ subunit). Positions A and E are occupied by δ and δ′ subunits, respectively, and lack ATPase activity. Positions B, C, and D are occupied by γ or τ subunits. The γ subunit is a truncated version of the τ subunit that is typically generated by Programmed −1 Ribosomal Frameshifting (−1 PRF) (Blinkowa and Walker 1990; Flower and McHenry 1990; Tsuchihashi and Kornberg 1990), the most well-known translational mechanism among the recoding phenomena (Baranov et al. 2015; Atkins et al. 2016; Khan et al. 2020) that is employed in many viruses (Jacks and Varmus 1985; Jacks et al. 1988; Moomau et al. 2016; Kendra et al. 2017) and bacteria (Meydan et al. 2017) but has not yet been discovered in cellular genes in vertebrates (Khan et al. 2019).
The RFC family (replication factor C) is found in archaea and eukaryotes with two lineages, which also may have arisen through ancient gene duplication (Iyer et al. 2004). In the RFC family, eukaryotic proteins RFC1-5 occupy positions A through E. RFC1 occupies position A (active ATPase) and has an extra A′ domain that bridges the gap between the A and E domains (Bowman et al. 2004). The other proteins sit as follows: RFC4 at B, RFC3 at C, RFC2 at D, and RFC5 at E (which lacks ATPase activity). The archaeal RFC family has one unique, active ATPase at the A position, and four identical ATPase subunits at the B, C, D, and E positions (Miyata et al. 2005). The collar region is a separate domain that C-terminally organizes into a planar ring. Mapping common cancer mutations onto human RFC showed a significant accumulation of mutations in the collar region (Gaubitz et al. 2020), which is thought to seed oligomerization.
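The subunit assignments described above for the T4, E. coli, and eukaryotic RFC clamp loaders can be collected into one lookup table. This is a minimal Python sketch that records only what the text states: the names Subunit, CLAMP_LOADERS, and positions_lacking_atpase are hypothetical, and an activity value of None means the text does not say whether that position hydrolyzes ATP.

```python
from typing import NamedTuple, Optional


class Subunit(NamedTuple):
    name: str
    atpase_active: Optional[bool]  # None = not stated in the text


# Positions A through E follow the notation of Kelch et al. (2012).
CLAMP_LOADERS = {
    "T4 bacteriophage": {
        "A": Subunit("gp62", None),
        "B": Subunit("gp44", None),
        "C": Subunit("gp44", None),
        "D": Subunit("gp44", None),
        "E": Subunit("gp44", None),
    },
    "E. coli": {
        "A": Subunit("δ", False),
        "B": Subunit("γ or τ", None),
        "C": Subunit("γ or τ", None),
        "D": Subunit("γ or τ", None),
        "E": Subunit("δ′", False),
    },
    "eukaryotic RFC": {
        "A": Subunit("RFC1", True),
        "B": Subunit("RFC4", None),
        "C": Subunit("RFC3", None),
        "D": Subunit("RFC2", None),
        "E": Subunit("RFC5", False),
    },
}


def positions_lacking_atpase(loader: str) -> list:
    """Positions explicitly described as lacking ATPase activity."""
    return [pos for pos, sub in CLAMP_LOADERS[loader].items()
            if sub.atpase_active is False]


print(positions_lacking_atpase("E. coli"))         # ['A', 'E']
print(positions_lacking_atpase("eukaryotic RFC"))  # ['E']
```

Laid out this way, the shared pentameric A-to-E architecture is visible alongside the family-specific differences in which end positions are catalytically inactive.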
The WHIP (Werner helicase interacting protein) family is present in eukaryotes and most prokaryotes (Iyer et al. 2004). WHIP family proteins contain a ubiquitin-binding zinc finger (UBZ) domain (Yoshimura et al. 2017). Gel-filtration chromatography suggests that human Werner helicase interacting protein 1 oligomerizes into an octamer, but this has not been confirmed by structural studies (Tsurimoto et al. 2005). The Werner helicase, its interaction partner, is mutated in patients with Werner's Syndrome, a disorder characterized by premature aging (Yoshimura et al. 2017).

Clade 2: Initiator clade
Clade 2 is characterized by the insertion of an α-helical element after β2 and before α2 (Iyer et al. 2004; Duderstadt and Berger 2008) (Figure 2).
Replication of DNA in all cells is initiated at the origin of replication. AAA+ initiators can form a spiral corkscrew around the double-stranded DNA, which then recruits helicases and other proteins to either begin the replicative process or to perform other roles in the assembly of the full replication complex on the origin of replication. Bacteria and archaea frequently employ a single origin of replication in the genome, whereas eukaryotes can have anywhere from hundreds to tens of thousands of origins (Leonard and Méchali 2013).
There are two families within the initiator clade, the bacterial DnaA family and the archaeo-eukaryotic CDC6/ORC family (Figure 4) (Iyer et al. 2004). The bacterial DnaA family is comprised of two orthologous lineages, DnaA and DnaC. DnaA is present in all known bacteria and forms an oligomeric complex around the origin of replication, which then recruits DnaB, a helicase, to unwind the DNA. DnaA (and its archaeo-eukaryotic counterpart ORC) forms an open, spiral hetero-oligomer around the DNA (Figure 4). It has a C-terminal domain IV that mediates its interaction with the DNA itself. Many DnaA monomers form a right-handed filamentous structure composed of a tetrameric asymmetric unit.
DnaC (also called DnaI in Gram-positive bacteria) is thought to have arisen through duplication of DnaA and is not widely present in bacteria. Six DnaC protomers assist with the loading of DnaB helicase onto the DNA by "cracking" open the DnaB hexamer (with its extended N-terminal domain) (Wahle et al. 1989; Arias-Palomo et al. 2019; Nagata et al. 2020). Upon DNA binding, DnaC hydrolyzes ATP, allowing DnaB to rearrange into a closed ring formation around the DNA.
The ORC/CDC6 family is present in all eukaryotes and almost all archaea. In eukaryotes, AAA+ initiator proteins are involved with first forming the ORC complex on the DNA and then forming the pre-replication complex after the recruitment of the additional AAA+ initiator CDC6 and, subsequently, the clade 6 AAA+ mini-chromosome maintenance complex (MCM) of MCM2-7 (Fragkos et al. 2015).
In the ORC family, five different subunits of Orc (Orc1-5 in S. cerevisiae) form a spiral around the DNA (Figure 4(b,c)) (Li et al. 2018). Orc6 does not directly interact with the DNA but contacts Orc3, Orc2, and Orc5. Orc6 also has little homology to Orc1-5 and its function varies among eukaryotic organisms. In the spiral around the DNA, the order is Orc1, Orc4, Orc5, Orc3, and Orc2 with a gap between Orc1 and Orc2. This gap is partially filled by a winged helix domain (WHD) of the Orc1-5 proteins, which is a nucleic acid binding structural element. After the ORC complex is formed, Cdc6 likely docks onto the complex by passing through this gap. Only Orc1, Orc4, and Orc5 contain functional ATPase AAA+ domains, while Orc2 and Orc3 do not. Mutations to human Orc proteins cause a wide range of diseases such as Meier-Gorlin syndrome (Jackson et al. 2014). Archaea also employ the ORC/CDC6 family. Archaeal proteins with homology to eukaryotic Cdc6 and Orc1 are called either Orc1 or Cdc6 proteins, despite harboring homology to both (Arora et al. 2014). Archaeal Orc1/Cdc6 proteins form monomers or complexes at the origin of replication (Ausiannikava and Allers 2017).
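The consequence of the gap between Orc1 and Orc2 can be made concrete by counting subunit interfaces in an open spiral versus a closed ring. The following Python sketch is illustrative only: the function interfaces and the constant names are hypothetical, and it encodes nothing beyond the subunit order and ATPase assignments given above.

```python
# Order of subunits along the ORC spiral, as stated in the text.
ORC_SPIRAL = ["Orc1", "Orc4", "Orc5", "Orc3", "Orc2"]

# Subunits described in the text as having functional ATPase AAA+ domains.
ORC_FUNCTIONAL_ATPASE = {"Orc1", "Orc4", "Orc5"}


def interfaces(subunits, closed):
    """List adjacent subunit pairs; a closed ring adds the last-to-first pair."""
    pairs = list(zip(subunits, subunits[1:]))
    if closed and len(subunits) > 2:
        pairs.append((subunits[-1], subunits[0]))
    return pairs


# The ORC spiral is open (gap between Orc1 and Orc2), so it has four interfaces.
open_pairs = interfaces(ORC_SPIRAL, closed=False)
print(open_pairs)
# [('Orc1', 'Orc4'), ('Orc4', 'Orc5'), ('Orc5', 'Orc3'), ('Orc3', 'Orc2')]

# A hypothetical closed pentamer of the same subunits would have five.
print(len(interfaces(ORC_SPIRAL, closed=True)))  # 5

# Subunits lacking a functional ATPase domain sit together at one end.
print([s for s in ORC_SPIRAL if s not in ORC_FUNCTIONAL_ATPASE])  # ['Orc3', 'Orc2']
```

The missing Orc2-Orc1 pair in the open case corresponds to the gap through which Cdc6 is proposed to dock.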
The additional α-helix characteristic of clade 2 (Figure 2), also called an initiator-specific motif (ISM), differs between bacteria and archaea-eukaryotes as well. In DnaA, the ISM coordinates the DNA (Duderstadt et al. 2011) and forces the neighboring DnaA subunits out of plane with respect to each other to form a right-handed helix. In the archaeo-eukaryotic ORC/CDC6 family, the ORC ISM also typically contacts the DNA (Li et al. 2018) but does not promote a radical spiraling of subunits around the DNA due to its topology (Costa et al. 2013).

Clade 3: Classical clade
A flexible linker and short α-helix are inserted after β2 and before α2 (Figure 2). This flexible linker is typically referred to as pore loop 1. Pore loop 1 typically contains one or more aromatic residues that interact with the substrate. There is a less conserved pore loop 2 portion in some classical AAA+ proteins, which also binds to substrate in the pore. Members of this clade lack a conserved arginine at sensor-2 and typically have an alanine instead. There is often a conserved glycine N-terminal to the arginine finger and there are typically two conserved arginine residues in the SRH instead of the single residue seen in other clades (Iyer et al. 2004).
The classical clade encompasses the widest breadth of AAA+ proteins. Members of this clade usually form homo-oligomeric complexes that process and remodel nucleic acids, proteins, and protein complexes (YME1 in Figure 5 and NSF in Figure 10). With regard to protein remodeling, these roles often involve key aspects of disaggregation and degradation. A wide variety of families that occupy diverse functions in this clade exist. These families have differing, seemingly modular arrangements of multiple AAA+ domains and/or accessory domains.
While there is much diversity within clade 3, a common thread among the classic AAA+ proteins with niche functions is the presence of a conserved pore loop 1 (Puchades et al. 2020) (Figure 2). The pore loop 1 typically contains an aromatic residue that acts like the teeth of a gear to exert axial force on the substrate through the hexamer pore in a somewhat nonspecific manner. A less-conserved pore loop 2 present in some proteins has also been implicated in both substrate recognition and in forming a second "conveyor belt" of residues to further aid in the translocation of substrate.
Due to the breadth of functional diversity, the classical clade is typically subdivided into two different categories: type I and type II (Saffert et al. 2017; Banchenko et al. 2019). Sometimes, a third "AAA+ protease" category is employed as well. Here, we summarize the definitions and categories of these types within the classical clade to provide a straightforward classification scheme. The type I category will include classical clade AAA+ proteins with only a single AAA+ domain. The type II category will include classical AAA+ proteins with two (or more, if found to exist in the classical clade) AAA+ domains within the same protein chain. The presence of accessory or proteasomal domains will not impact the classification of these classical AAA+ proteins.
Figure 4. […] (Li et al. 2018), trimmed to only include AAA+-containing ORC1-5. Orc1, Orc2, Orc3, Orc4, and Orc5 are colored green, teal, purple, peach, and yellow, respectively. The DNA substrate, centered in the pore, is shown with a ball and stick representation. (c) Side view of the ORC complex.
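The classification rule just stated depends only on the number of AAA+ domains in a single protein chain. A minimal Python sketch of that rule follows; the class ClassicalProtein and the function classify are hypothetical names, and the example domain compositions are taken from the descriptions given in this review rather than from any database.

```python
from dataclasses import dataclass, field


@dataclass
class ClassicalProtein:
    name: str
    n_aaa_domains: int  # AAA+ domains within one protein chain
    other_domains: list = field(default_factory=list)  # accessory or protease domains


def classify(protein: ClassicalProtein) -> str:
    """Apply the type I / type II rule for classical clade AAA+ proteins."""
    if protein.n_aaa_domains < 1:
        raise ValueError("a classical AAA+ protein needs at least one AAA+ domain")
    # Accessory and protease domains are deliberately ignored by the rule.
    return "type I" if protein.n_aaa_domains == 1 else "type II"


examples = [
    ClassicalProtein("FtsH", 1, ["metalloprotease"]),
    ClassicalProtein("ATAD1", 1, ["transmembrane domain", "linker domain"]),
    ClassicalProtein("ATAD2", 2, ["N-domain", "bromodomain"]),
    ClassicalProtein("Cdc48", 2, ["N-domain"]),
]
for p in examples:
    print(p.name, classify(p))
# FtsH type I
# ATAD1 type I
# ATAD2 type II
# Cdc48 type II
```

Note that ATAD2 counts as type II under this rule even though one of its two AAA+ domains is described as nonfunctional, because the rule counts domains rather than catalytic activity.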
Type I classical AAA+ proteins include the eukaryotic katanin, fidgetin, spastin, and VPS4 protein families (VPS4 is found in some archaeal phyla as well (Iyer et al. 2004)). These four families, within the Type I ATPases, are also referred to as Meiotic AAA+ proteins (Monroe and Hill 2016). The N-terminal domains of katanin, spastin, and fidgetin function in the recognition of tubulin polymers and help to recruit other enzymes involved in cytoskeletal remodeling. VPS4 disassembles ESCRT-III polymers, assisting in the membrane-remodeling ESCRT pathway, and is stimulated to hydrolyze ATP in the presence of ESCRT-III (Merrill and Hanson 2010). VPS4's catalytic activity is directly responsible for membrane scission (Schöneberg et al. 2018). All four families share an N-terminal Microtubule Interacting and Trafficking (MIT) domain. C-terminal to the MIT domain and N-terminal to the AAA+ domain, spastin contains a microtubule-binding domain (MTBD) necessary for microtubule binding. Katanin functions in microtubule severing (Lindeboom et al. 2013) and is a heterodimer of two unrelated proteins, p60 (AAA+ domain-containing) and p80. p80 enhances the in vitro ATPase and microtubule severing activity of p60. Fidgetin also has the ability to cleave microtubules. Only eukaryotic VPS4, but not archaeal VPS4 or other members of the Type I ATPase category, has a unique insertion into the AAA+ domain called the β-domain. The β-domain packs against the AAA+ β-sheet in an antiparallel orientation and binds a VPS4 activator. Type I ATPases oligomerize into hexamers around their substrate. High-resolution structures of hexameric spastin, katanin, and VPS4 confirm this (Caillat et al. 2015; Monroe et al. 2017; Sun et al. 2017; Zehr et al. 2017; Sandate et al. 2019). Mutations in spastin and katanin are associated with Hereditary Spastic Paraplegia, a disease showing abnormal microtubule arrangement and amount (Ghosh et al. 2012).
These protein families are considered drug targets (Cupido et al. 2019). The eukaryotic Bcs1 is another Type I AAA+ protein, with an N-terminal domain that anchors it to the inner membrane of the mitochondria. Bcs1 is essential for the transport of RIP1 across the inner membrane and assembly into complex III of the electron transport chain (Wagener et al. 2011). Surprisingly, Bcs1 has been observed in a heptameric state in several structural studies (Kater et al. 2020; Tang et al. 2020).
The TIP49 family is an archaeo-eukaryotic family involved in transcription that is stimulated by DNA (Iyer et al. 2004). TIP49 family proteins have a large insertion between the Walker A and B motifs, containing an oligosaccharide/oligonucleotide binding (OB) domain (Petukhov et al. 2012). Confusingly, the RuvB-like 1 (RUVBL1, also known as pontin) and RuvB-like 2 (RUVBL2, also known as reptin) eukaryotic proteins are classified in the TIP49 family and lack the PS1βH element that characterizes clade 5 AAA+ proteins, even though the actual RuvB protein belongs to clade 5 (discussed below). RUVBL1 and RUVBL2 have been implicated in a wide variety of molecular processes (Mao and Houry 2017). RUVBL1 and RUVBL2 form a heterohexamer with alternating RUVBL1 and RUVBL2 units in the larger human INO80 chromatin complex (Aramayo et al. 2018). Activities of the two include transcriptional regulation, chromatin remodeling, DNA damage signaling and repair, assembly of macromolecular complexes (such as γTuRC (Zimmermann et al. 2020)), cell cycle, and motility. These proteins are overexpressed in many cancer types, which is perhaps unsurprising given their role in so many cellular processes involved in cellular replication; identification of inhibitors is ongoing (Nano et al. 2020). Because members of the TIP49 family are classical AAA+ proteins with a single AAA+ domain, they are here designated as Type I ATPases.
The prokaryo-eukaryotic AFG1 was originally discovered as an AAA+ domain-containing protein with homology to NSF/Sec18, Cdc48/p97, and other AAA+ proteins (Lee and Wickner 1992). For this reason, the AFG1 family has typically been placed close to the NSF and p97 families. It likely made its way to eukaryotes via the proto-mitochondrion. Loss of functional Afg1 leads to progressive mitochondrial failure and impaired oxidative stress tolerance in eukaryotes (Germany et al. 2018). Despite a lack of structural studies, it appears that AFG1 only contains one AAA+ domain and is thus more appropriately suited for the Type I category.
Type I AAA+ proteins can also have proteases fused to them. For example, the FtsH family of proteases has a metalloprotease fused C-terminally to a AAA+ domain (Iyer et al. 2004). This family is pan-bacterial and present in some eukaryotes via proto-mitochondrion transfer. Examples of proteins in this family include YME1 and AFG3L2 (Puchades et al. 2019). YME1 forms a spiral-shaped hexameric AAA+ ring that passes substrate into a hexameric protease for degradation. Four subunits appear to engage the substrate, while the top and bottom protomers along the spiral are partially or completely disengaged (Figure 5) (Puchades et al. 2017). AFG3L2 also forms a spiral when engaged to substrate (Puchades et al. 2019) and has specificity for certain degron sequences (Ding et al. 2018). Mutations in YME1 and AFG3L2, both of which are involved in the maintenance of the mitochondrial proteome, can cause autosomal dominant spinocerebellar ataxias (Di Bella et al. 2010).
Type I AAA+ proteins can also function in proteasomal processing without being fused to a protease, such as the RPT family. The base of the eukaryotic 26S proteasome consists of a ring of Rpt1-2-6-3-4-5, all AAA+ proteins (Beckwith et al. 2013). The archaeal proteasome also employs a AAA+ ring system, PAN, with a similar functional purpose (Majumder et al. 2019). This AAA+ ring is the engine of translocation in the 26S proteasome (Dong et al. 2019). We direct the reader to thorough reviews on the 26S proteasome (Livneh et al. 2016; Bard et al. 2018).
Type II ATPases are AAAþ proteins that contain tandem AAAþ domains which arose through duplication or fusion events. Families in this category include the eukaryotic NSF, archaeo-eukaryotic CDC48, eukaryotic Drg1, eukaryotic Rix7, eukaryotic Pex1/6, the eukaryotic ATAD, and the ClpAB families (N-terminal domain). The NSF family is most likely derived from the CDC48 family as eukaryotic vesicle trafficking evolved and grew in complexity (Iyer et al. 2004). NSF is thoroughly reviewed below.
The Cdc48 family, members of which are composed of an N-domain and two AAA+ domains like NSF, is involved in extracting and unfolding peptides targeted by ubiquitination. It appears to actively use both ATPase rings in the threading and unfolding of substrate (Figure 11) (Bodnar et al. 2018). Mutations in p97, the human protein in the CDC48 family, cause a variety of diseases such as multisystem proteinopathy (Tang and Xia 2016).
The Drg1, Rix7, and Rea1/Midasin (clade 7, discussed below) protein families are all involved in different maturation steps of the pre-60S ribosomal particles (Prattes et al. 2019). Drg1 is a eukaryotic type II AAA+ protein family. It appears to have branched off from the Cdc48 family, as it shows similarity in its N-domain, D1, and D2 domains. Structure prediction suggests that the Drg1 N-domain folds into two subdomains, as do those of Cdc48 and other related classical AAA+ proteins (Prattes et al. 2019).
Rix7 is a eukaryotic type II AAA+ protein that also appears to have branched off from the Cdc48 family earlier in eukaryotic evolution, but unlike Drg1, its N-terminal domain is distinct from that of Cdc48 (Prattes et al. 2019). Members of the Rix7 family have a 40-50 amino acid insertion after α7 in D1 and a 10-35 amino acid insertion after α7 in D2 (Prattes et al. 2019). Rix7 was found to adopt an asymmetric spiral in complex with an unknown protein substrate; both D1 and D2 were found engaged with the substrate, but no clear N-terminal domain density was present.
The Pex1/6 family forms a heterohexameric complex of Pex1 and Pex6 that assists in peroxisome biogenesis. The D2 domains of Pex1 and Pex6 are strongly conserved, in contrast to their partially functional or nonfunctional D1 domains. Pex1/6 forms a unique, somewhat triangular hexamer due to the asymmetric arrangement of the N-terminal domains of Pex1 and Pex6 (Blok et al. 2015;Ciniawsky et al. 2015;Gardner et al. 2015).
The ATAD family (ATPase family AAA domain-containing proteins 1 through 5) is a protein family containing the AAA+ domain that is conserved in eukaryotes (Cattaneo et al. 2014). ATAD1 (Msp1 in yeast) consists of a transmembrane domain, a linker domain, and the AAA+ domain (Wang et al. 2020) and is thus considered a type I classical AAA+ protein. ATAD2 (Yta7 in yeast) is comprised of an N-domain, one functional AAA+ domain (D1), one nonfunctional AAA+ domain (D2), and a C-terminal bromodomain (Cho et al. 2019). ATAD2 is classified as a type II classical AAA+ protein due to the presence of two AAA+ domains. Thus, the ATAD family contains both type I and type II classical AAA+ proteins.
Msp1, ortholog of ATAD1, has been shown to processively thread substrate through its central pore (Castanzo et al. 2020). It contains a transmembrane domain and primarily functions in the quality control of proteins anchored in membranes by removing them and initiating their degradation (Okreglak & Walter 2014). Structural analysis reveals two sets of pore loops (pore loops 1 and 2) that directly engage the substrate. A pore loop 3 exists which does not contact the substrate directly but instead functions in stabilizing the interactions with the substrate (Wang et al. 2020).
ATAD2 primarily functions in the context of chromatin dynamics, where it acts as a histone chaperone that regulates the interaction of histones with DNA. The D2 domain is functionally inactive and lacks both Walker A and B motifs. The bromodomain (BRD) functions in the recognition of acetylated lysines, a common modification of the N-termini of histones (Fujisawa and Filippakopoulos 2017). In humans, ATAD2 has been identified as an oncogene overexpressed in multiple cancers, and efforts have been made to target the ATAD2 BRD with small-molecule inhibitors (Hussain et al. 2018). Interestingly, cryo-EM structural analysis of a yeast ortholog bound to a substrate (Cho et al. 2019) revealed that the complex is able to oligomerize in the absence of nucleotide, unlike many other AAA+ oligomers. This is due to the unique insertion of a helix-turn-helix "knob" that reaches across to a "linker arm" on a neighboring protomer. The affinity of the chaperone for substrate does not change in the presence of nucleotide, but the deposition of chaperone on DNA is nevertheless dependent on ATP hydrolysis.
The ClpAB ATPase family is composed of tandem AAA+ domains. The N-terminal AAA+ domain is from the classical clade, but the C-terminal domain is related to clade 5 (HCLR). ClpAB proteins are found in bacteria and eukaryotes; the eukaryotic members likely arose through acquisition from the proto-mitochondrion (Iyer et al. 2004). The ClpAB family is discussed in the clade 5 section of the review.

Clades 4-7: pre-sensor-1 β-hairpin (PS1βH) superclade
All four subsequent clades share a conserved insertion between α3 and β4, the strand containing the sensor-1 motif. This insert forms a β-hairpin that projects out of the AAA+ core (Figure 2).

Clade 4: superfamily III (SF3) helicase clade
In addition to the PS1βH insertion, the C-terminal α-helical bundle is replaced with a unique arrangement of C-terminal elements.
The PS1βH is present in the central channel of the oligomer and interacts with a nucleic acid substrate (a representative example is shown in Figure 6). The SF3 helicases are found in RNA and DNA viruses but not in cellular genomes, aside from viral remnants. These initiators form closed hexameric rings around the DNA. In the SV40 initiation complex, His513 and Phe459 of the β-hairpin, along with Lys512 and Lys516, interact with and help separate the dsDNA. These and other residues allow for proper DNA melting and processive unwinding (Shen et al. 2005). Similarly, the E1 protein of papillomavirus is a clade 4 helicase. It engages DNA through Lys506, which coordinates the ssDNA phosphate, the main-chain amide of His507 (both residues are found in the β-hairpin insert), and several other residues (Figure 6) (Enemark and Joshua-Tor 2006). Interestingly, the REP68 complex in adeno-associated virus forms a heptameric AAA+ ring structure (Santosh et al. 2020). Under different conditions, depending on the presence of substrate and the type of nucleotide, the AAA+ ring (referred to as the SF3 helicase domain) could transition into a hexameric form as well.

Clade 5: HCLR clade
The HCLR clade is comprised of AAA+ proteins that act as chaperones and proteases. It contains four major families: the HslU, ClpAB (C-terminal domain), Lon, and RuvB families. The HslU (heat shock locus U) family (also called the Hsp100 or Clp family) is a bacterial family that has two orthologous lineages, HslU and ClpX. Both HslU and ClpX proteins have also made their way into eukaryotic organisms, likely via acquisition from the proto-mitochondrion. HslU (sometimes called ClpY) forms a hexamer and coordinates itself with proteolytic HslV (also hexameric, sometimes called ClpQ) to form a bacterial protease (Sousa et al. 2000). Two HslU hexamers sandwich two HslV hexamers to form a cylindrical hetero-24mer (Bochtler et al. 2000). ClpX forms a hexamer as well (Glynn et al. 2009) and associates with its conjugate protease ClpP (Wang et al. 1997) to form the ClpXP protease (Gatsogiannis et al. 2019; Fei et al. 2020; Ripstein et al. 2020), which recognizes and degrades proteins that have a degradation signal in a multistep binding and engagement process (Saunders et al. 2020). Some of these degradation signals are added to the C-terminus of an incomplete protein from a stalled ribosome, which directs the incomplete protein to ClpXP for degradation (Flynn et al. 2001). ClpX on its own forms an asymmetric homohexamer similar to HslU but docks onto ClpP, which forms a homoheptameric complex. ClpX binds substrate via its pore-1, pore-2, and RKH loops (Figure 7). The RKH loop is part of the β-hairpin loop. ClpX docks into ClpP using flexible IGF loops that fit into hydrophobic pockets of ClpP. Structures showing ClpXP forming a "double capped" complex, with two ClpX hexamers flanking a double-ringed heptameric ClpP, have also been reported, analogous to HslUV. ClpXP is a principal player in cellular homeostasis (Bhandari et al. 2018) and has recently been leveraged as a therapeutic target against cancer (Ishizawa et al. 2019).
Mitochondrial ClpX activates 5-aminolevulinic acid synthase not by proteolysis but by remodeling the peptide substrate, demonstrating that ClpX can function independently of ClpP as an unfoldase (Kardon et al. 2020).
The second tandem AAA+ domain of the ClpAB proteins (C-terminal domain) and other related proteins make up another bacterio-eukaryotic family. Unlike ClpX, ClpA is a double-ringed AAA+ protein that associates with ClpP but still maintains a hexamer-heptamer mismatch interaction. ClpA is found in gram-negative bacteria, while its close AAA+ relative ClpC, a functional ortholog, is found in gram-positive bacteria and cyanobacteria (Hamon et al. 2015). The first domain of ClpA is referred to as D1 and the second domain is referred to as D2, a common naming convention applied to AAA+ proteins with two connected ATPase domains. ClpP consists of two heptameric rings stacked together. Together, ClpA and ClpP form either a ClpAP 20-mer or 26-mer, with ClpA docking with ClpP via flexible IGL loops on one or both axial faces of ClpP. ClpA's IGL loops fill 6 out of the 7 hydrophobic pockets of ClpP. Pocket switching of the IGL loops appears to be associated with ClpA rotation, suggesting that ClpA uses ClpP as a surface to push against and rotate about during translocation (Lopez et al. 2020). Like ClpA, ClpC also contains tandem AAA+ domains, with the first belonging to the classical clade and the second belonging to the HCLR clade. It also associates with ClpP to form a proteolytic complex but additionally requires the adaptor protein MecA (Wang et al. 2011). ClpE is also a double-ringed AAA+ protein closely related to ClpA and ClpC and presumably functions in a similar manner to form a ClpEP proteolytic complex (Kress et al. 2009). ClpL, a close relative of ClpC, is found mostly in gram-positive bacteria and requires no other proteins to function as a chaperone. Curiously, it forms a tetradecameric complex of two heptameric rings with unfoldase activity (Kim et al. 2020). ClpD, another close relative of ClpC, is found in bacteria and eukaryotes, specifically in plants, where it localizes to the chloroplast and presumably also associates with ClpP (Singh and Grover 2010).
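The subunit counts quoted for HslUV (hetero-24mer) and ClpAP (20-mer or 26-mer) follow directly from the ring stoichiometries given in the text. As a minimal sketch, the arithmetic can be written out as below; the helper name `complex_size` and the tuple layout are illustrative choices made here, not part of any cited work.

```python
# Protomers per ring, as stated in the text.
HEXAMER = 6
HEPTAMER = 7

def complex_size(rings):
    """Total protomer count for rings given as (name, protomers_per_ring, ring_count)."""
    return sum(per_ring * count for _, per_ring, count in rings)

# HslUV: two HslU hexamers sandwich two HslV hexamers.
hsluv = [("HslU", HEXAMER, 2), ("HslV", HEXAMER, 2)]

# ClpAP: ClpP is two stacked heptameric rings; a ClpA hexamer caps one or both faces.
clpap_single = [("ClpA", HEXAMER, 1), ("ClpP", HEPTAMER, 2)]
clpap_double = [("ClpA", HEXAMER, 2), ("ClpP", HEPTAMER, 2)]

print(complex_size(hsluv))         # 24
print(complex_size(clpap_single))  # 20
print(complex_size(clpap_double))  # 26

# Symmetry mismatch at one ClpA-ClpP interface: six IGL loops, seven pockets.
empty_pockets = HEPTAMER - HEXAMER
print(empty_pockets)               # 1
```

The single unfilled pocket at each interface is the consequence of the hexamer-heptamer mismatch described above.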
The directionality of substrate loading (i.e. N-terminus or C-terminus inserted into the pore) is key to determining the efficiency of energy coupling to unfolding (Olivares et al. 2017). Single-molecule optical trap experiments for both ClpAP and ClpXP suggested that titin is unfolded more quickly when the N-terminus is loaded than when the C-terminus is loaded (controlled by the location of the degron signal) (Olivares et al. 2017) and that local mechanical stability at the position proximal to the enzymatic complex is rate limiting (Cordova et al. 2014; Olivares et al. 2017). Despite this similarity in energy coupling with regard to N-terminal or C-terminal loading, the double-ringed ClpA unfolded some substrates faster than the single-ringed ClpX while taking smaller steps (1-2 nm, as opposed to 1-4 nm for ClpX).
ClpB, with ortholog Hsp104/Skd3 in eukaryotes, is a disaggregase that associates with DnaK/Hsp70 to unravel proteins (Deville et al. 2019). It consists of two AAA+ domains, with an M-domain between them. ClpB does not associate with ClpP and does not form a proteolytic complex. The coiled-coil M-domain is thought to regulate ClpB activity by encircling the first AAA+ domain. The first AAA+ domain has a tyrosine residue on its pore loop (because the first AAA+ domain belongs to the classical clade) and a secondary, less-conserved loop that appears to interact with substrate as well. Although the second AAA+ domain seems to be more important to disaggregase activity than the first, its PS1βH does not engage substrate; instead, a small tyrosine-containing motif interrupts the α-helical element between β2 and β3 and interacts with substrate. As with many AAA+ proteins, mutation of ClpB leads to many diseases, ranging from brain atrophy (Wortmann et al. 2015) to aciduria (Kiykim et al. 2016). It has been suggested that Hsp104 could be repurposed to combat neurodegenerative disease caused by protein aggregation (March et al. 2019), given that the disaggregase activity of Skd3 (ClpB/Hsp104 in humans) is critical in maintaining mitochondrial proteostasis (Cupo and Shorter 2020).
Torsin proteins are also members of the ClpAB family because they share homology with the second tandem AAA+ domain of the ClpAB family (Iyer et al. 2004). Torsin proteins are primarily limited to metazoans (Rose et al. 2015), suggesting that they evolved rapidly from eukaryotic ClpAB into their own subfamily (Iyer et al. 2004). TorsinA is ubiquitously expressed in all cell types, and deletion(s) in TorsinA are associated with early-onset torsion dystonia (EOTD) (Rose et al. 2015). Torsin localizes to the ER membrane and has been shown to affect lipid content in Drosophila (Grillet et al. 2016). Torsin ATPase activity, unlike that of other AAA+ proteins, relies on the cofactors LAP1 and LULL1 (Goodchild and   (Sosa et al. 2014; Demircioglu et al. 2016). Additionally, TorsinA has been found to form helical filaments (Demircioglu et al. 2019), reminiscent of the DnaA right-handed superhelix.
Lon proteins comprise the third major family in the HCLR clade. There are two major lineages within this family, the bacterial Lon lineage (which made its way into eukaryotes via mitochondria) and the archaeal Lon lineage (Iyer et al. 2004). The bacterial Lon proteins consist of a AAA+ domain with an N-terminal domain that functions in the recognition of substrate and a C-terminal protease domain (Vieux et al. 2013). As in other AAA+ proteins, four Lon protomers bind the substrate, while the top and bottom protomers do not directly bind the substrate and are separated by a gap, also called the seam region (Zhang et al. 2020). When not bound to substrate, bacterial Lon adopts a "locked," left-handed spiral that is ADP bound. When bound to substrate, Lon adopts a "closed," right-handed spiral along the translocating peptide. The PS1βH serves to stabilize the substrate-engaged organization of the pore loops, playing a critical role in substrate processing (Shin et al. 2020).
The RuvB family associates with double-stranded DNA and assists in recombination in bacteria. RuvB, in concert with RuvA and RuvC, processes Holliday junctions. RuvB appears to form a hexamer that surrounds the DNA. The beta hairpin of RuvB interacts with RuvA (Yamada et al. 2002).

Clade 6: H2 insert clade
In addition to the PS1βH, there is another β-hairpin insertion in α2.
There are two families that make up clade 6: the archaeo-prokaryotic-eukaryotic McrB family and a prokaryotic family of σ54-related transcription factors (Iyer et al. 2004). The McrB family uses GTP instead of ATP, unlike most AAA+ proteins. Its members are sporadically distributed throughout bacterial, archaeal, and animal phylogeny. In bacteria and archaea, McrB is encoded on a mobile operon that encodes a modification-dependent restriction enzyme system. The N-terminal domain of McrB forms a DNA-binding domain that precedes the AAA+ domain. McrB then associates with the nucleolytic McrC to form the McrBC complex (Figure 8(a,b) (Muley et al. 2008). This protein is required for the process of axonal elongation; it facilitates interactions between microtubules and neurofilaments, and it exhibits 3′ to 5′ helicase activity and exonuclease activity in vitro (Ishiguro et al. 2002).
Proteins from the σ54-related transcription factor family assist in transcription by the σ54 RNA polymerase, which is used to transcribe genes involved in the stress response. Proteins in this family have diversified with various N-terminal fusions connected to the ATPase; these domains allow them to respond to small molecules in bacteria by triggering a conformational change of σ54. NtrC, one such member of this family, controls the transcription of nitrogen-related genes. σ54 requires NtrC to transition from a closed complex to an open complex (Soules et al. 2020). NtrC1, a protein in the family, binds to σ54 through a highly conserved GAFTGA motif centered in the α2 insertion (Bush and Dixon 2012). NtrC1 is heptameric (Lee et al. 2003; Chen et al. 2010) under a variety of conditions, with and without mutation to the Walker B site (Figure 8(c,d)). Moreover, the heptameric NtrC1 binds σ54. It is not clear if the heptameric state bound to substrate is the active form or a locked form prior to activation. When ATP is bound, a series of conformational changes are associated with the binding of the arginine finger to the γ-phosphate group. Notably, this leads to stabilization of the loop containing the GAFTGA motif used for engagement of RNA polymerase.

Clade 7: PS-II insert clade
In addition to the β-hairpin insertion in α2, an additional α-helix in the C-terminal α-helical bundle causes a repositioning of the sensor-2 motif (Figure 1(b) and 9(b,c)).
Clade 7 is made up of the following families: the MCM family, the MoxR family, the chelatase/YifB family, and the dynein/midasin/mysterin family. The archaeo-eukaryotic MCM family has been studied extensively in the context of DNA replication. In both archaea and eukaryotes, MCM is a hexameric ring that serves as the helicase loaded by the clade 2 proteins Orc1/Cdc6 or Cdc6, respectively (Duderstadt and Berger 2008). The helicase hydrolyzes ATP to perform strand separation at the replication fork, subsequently allowing the replication machinery to continue. In archaea (e.g. Sulfolobus solfataricus), the MCM ring is a homohexamer of six identical MCM proteins and is able to unwind DNA without forming a larger complex in vitro (Figure 9(a)) (Meagher et al. 2019). The N-terminal domain of an archaeal MCM subunit binds DNA. The C-terminal domain contains a helix-turn-helix motif, a common DNA-binding motif, and the PS-II AAA+ domain lies between these two domains (Figure 9(a-c)). Two MCM spiral staircase-shaped hexamers form a double ring in a head-to-head configuration. Both hairpins, the H2 insert and the PS1βH, project into the central channel and bind the DNA; one protomer interacts with two nucleotides at a time, the same ratio observed for DnaB. The PS1βH interaction with the DNA is comparable to how papillomavirus E1 from clade 4 interacts with DNA. The PS1βH A431 amide interacts with one phosphate of the ssDNA, while the PS1βH K430 forms an ionic interaction with the phosphate immediately 3′. The H2 V377 amide and the side chain of T369 also bind phosphates in the same manner.
The eukaryotic MCM ring consists of six related proteins, MCM2-7 (Li et al. 2015). In eukaryotes, the active replicative helicase consists of Cdc45, MCM2-7, and GINS, which together form the CMG complex. CMG unwinds duplex DNA in the 3′ to 5′ direction. MCM2-7 also forms a double head-to-head configuration of hexameric rings, mediated by the zinc fingers in the N-domains. The eukaryotic MCM proteins differ from the single archaeal MCM protein in that each subunit has N- and C-terminal extensions and insertions. These extensions contribute to inter-hexamer interactions, such as between the zinc finger of MCM2 of one hexamer and the β-turn of MCM6 of another, the N-terminal domain of MCM5, an N-terminal insertion of MCM7, and lastly, the N-terminal domains of MCM3 and MCM7. The H2I and PS1βH also arrange themselves in a spiral fashion in the eukaryotic variant.
The MoxR family is well-represented in most bacteria and archaea, where its members are thought to function as chaperones for metabolic complexes (Iyer et al. 2004). For example, in P. denitrificans, norQ codes for a MoxR ATPase that works in inserting an iron cofactor into an oxide reductase (Snider et al. 2006). CbbQ, found in some chemoautotrophic bacteria, forms a hexameric complex to which the CbbO adaptor protein binds (Tsai et al. 2020). RavA is another well-characterized protein in the MoxR family that works in concert with the von Willebrand factor A (VWA) domain-containing protein ViaA (Jessop et al. 2020). RavA is also known to bind to lysine decarboxylase LdcI (El Bakkouri et al. 2010; Kandiah et al. 2016; Jessop et al. 2020). RavA is comprised of a AAA+ domain, a triple helical domain, and a LARA domain (LdcI associating domain of RavA). RavA forms a cage with the LdcI decamer to form a larger complex. RavA on its own forms a hexamer, and five of these hexamers arrange themselves around two LdcI decamers to form the RavA-LdcI cage. The LARA domain mediates the interaction between RavA and LdcI.
The YifB/chelatase family is found exclusively in bacteria. The YifB arm of this family is fused to a Lon protease domain N-terminally and has an insertion of a zinc cluster in the fourth β-strand of the AAA+ core domain (Iyer et al. 2004). Members of the chelatase family have two AAA+ domains. In the ChlD protein, the first is active, and the second is inactive, similar to the arrangement of NSF's D1 and D2 domains (Adams & Reid 2013). Other magnesium chelatases, such as BchI from R. capsulatus and ChlI from Synechocystis, can form both hexameric and heptameric rings (Lundqvist et al. 2010).
Finally, the dynein/midasin/mysterin family is unique in that all six tandem AAA+ domains are encoded on a single polypeptide, instead of forming from six identical or similar protomers (Figure 9(d)). Dynein, extensively studied, is found in eukaryotes and associates with microtubules to transport macromolecular complexes around the cell. The linear sequence of dynein from N to C terminus consists of a tail region, an N-terminal sequence, the first four AAA+ domains, a microtubule-binding domain (MTBD), the fifth and sixth AAA+ domains, and then a C-terminal sequence (Roberts et al. 2009). The "head" of dynein is composed of a pseudo-hexameric arrangement of the six AAA+ domains, and the MTBD is at the end of a long stalk that points away from the pseudo-hexameric ring. Dynein functions as a dimer of these tailed, pseudo-hexameric ATPase rings, which are in turn connected by the tail domains and several light and intermediate chains. Inactivation of the first AAA+ domain renders dynein completely immobile, inactivation of the third dramatically reduces the velocity of dynein (Cho et al. 2008), and inactivation of the fourth domain renders dynein immotile. The first four AAA+ domains can bind nucleotide, but the fifth and sixth cannot. A comparison of structures of dynein bound to either ADP or ADP-vanadate suggests that closure and opening of the hexameric ring leads to a steric clash with the linker, thus generating movement (Schmidt et al. 2015).
Figure 9 (partial caption). Core β-sheets are colored orange, the surrounding area of the large subdomain is colored green, and the small subdomain is colored yellow (first three helices) and purple (fourth helix). The PS1βH is colored blue, the helix-2 insert is colored yellow, and the C-terminal helix insertion is colored gray. (c) Front view of core AAA+ MCM domain. (d) Side view of Dynein chain (PDB: 4RH7 (Schmidt et al. 2015)). Dynein has all six of its AAA+ domains and the rest of its domains linked in a single polypeptide. AAA1 is colored red, AAA2 is colored blue, AAA3 is colored orange, AAA4 is colored magenta, AAA5 is colored yellow, AAA6 is colored gray, the stalk is colored pink, and intervening regions are colored green.
Midasin or Rea1 mechanically removes ribosomal assembly factors to promote the maturation of the pre-60S complex, which eventually leads to its export to the cytosol (Sosnowski et al. 2018). The N to C arrangement of Rea1 is as follows: N-terminal domain, the six AAA+ domains, a larger tail linker region, and a tail C-terminus that includes a Metal Ion Dependent Adhesion Site (MIDAS), which interacts with the substrates that are removed from the pre-60S particles. Rea1 has several tail conformations that, like dynein, could produce force to dislodge the factors. The first and last AAA+ domains lack the Walker B motif required to hydrolyze ATP and are thus nonfunctional.
Mysterin, or RNF213 (RING finger protein 213), is a large, 500-600 kDa eukaryotic protein that consists of six linked AAA+ domains, as well as an N-arm and a multidomain E3 domain responsible for ubiquitin ligase activity (Ahel et al. 2020). Mutations in mysterin significantly increase the risk for Moyamoya disease, a cerebrovascular disease (Morito et al. 2014). It is targeted to lipid droplets, where it plays a role in lipid metabolism and other biological processes (Sugihara et al. 2019). Only the third and fourth AAA+ domains are catalytically competent, while the second AAA+ domain binds ATP but lacks a Walker B motif and is thus unable to hydrolyze ATP (Ahel et al. 2020).
The Pch2/Trip13 family is a AAA+ family that has no discernible placement in any of the seven clades (Ye et al. 2015). It lacks a clear β-hairpin insertion before the fourth β-strand after the third α-helix, disqualifying it from clades 4-7. It also lacks the synapomorphic features of an "RC" finger from clade 1 members, or a long α-helical insertion as in clade 2 members. It also has a functional sensor 2 motif, unlike the classical clade 3 members that have a mutated alanine at the position instead of a conserved polar residue. Based on phylogenetic and structural analysis (Ye et al. 2015), we postulate that it diverged from a common ancestor of clade 3 and clades 4-7. This eukaryotic protein family is a key regulator of spindle assembly. TRIP13, with its adaptor protein p31comet, loads substrate protein MAD2 in a spiral conformation (Alfieri et al. 2018) similar to other AAA+ proteins (Zhao et al. 2015; Puchades et al. 2017; White et al. 2018; Fei et al. 2020; Ripstein et al. 2020; Saunders et al. 2020).
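The exclusion argument for Pch2/Trip13 can be read as a small decision procedure over the structural features named in this review. The sketch below is a deliberate simplification: the boolean feature names, the `AAAFeatures` class, the `assign_clade` function, and the order of the checks are all assumptions made here for illustration, and real clade assignment rests on phylogenetic and structural analysis rather than on a handful of flags.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AAAFeatures:
    """Hypothetical yes/no summary of the clade-defining features named in the text."""
    ps1_beta_hairpin: bool = False            # insertion before beta-4 (clades 4-7)
    h2_insert: bool = False                   # beta-hairpin inserted in alpha-2 (clades 6-7)
    ps2_insert: bool = False                  # extra C-terminal helix repositioning sensor 2 (clade 7)
    c_terminal_bundle_replaced: bool = False  # unique C-terminal elements (clade 4)
    rc_finger: bool = False                   # "RC" finger (clade 1)
    long_helical_insertion: bool = False      # long alpha-helical insertion (clade 2)
    functional_sensor2: bool = True           # replaced by alanine in classical clade 3

def assign_clade(f: AAAFeatures) -> Optional[int]:
    """Return a clade number, or None when no clade-defining feature matches."""
    if f.ps1_beta_hairpin:
        if f.h2_insert:
            return 7 if f.ps2_insert else 6
        return 4 if f.c_terminal_bundle_replaced else 5
    if f.rc_finger:
        return 1
    if f.long_helical_insertion:
        return 2
    if not f.functional_sensor2:
        return 3
    return None

# Pch2/Trip13 as described: no insertions, no RC finger, functional sensor 2.
trip13 = AAAFeatures()
print(assign_clade(trip13))  # None

# An MCM-like profile: PS1 beta-hairpin, H2 insert, and PS-II insert.
mcm_like = AAAFeatures(ps1_beta_hairpin=True, h2_insert=True, ps2_insert=True)
print(assign_clade(mcm_like))  # 7
```

Returning `None` for the Pch2/Trip13 profile mirrors the statement that the family has no clear placement in any of the seven clades.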

N-ethylmaleimide sensitive factor (NSF) as a model system for the AAA+ superfamily
NSF (also known as Sec18 in yeast (Novick et al. 1980) and comatose (Siddiqi & Benzer 1976; Dellinger et al. 2000) in Drosophila) is a protein found in all eukaryotic cells. Early studies identified conditional mutants of the gene responsible for membrane transport defects at restrictive temperatures (Novick et al. 1980). In 1988, Rothman and colleagues discovered a protein that was essential for vesicular transport in eukaryotic cells (Malhotra et al. 1988). This vesicle transport activity was ablated by the addition of the chemical N-ethylmaleimide (NEM), leading to the naming of the protein as N-ethylmaleimide-sensitive factor (NSF). Rothman and colleagues demonstrated inhibition of transport by either titrating in more NEM or by using an antibody targeting NSF. Subsequently, SNARE proteins were identified as substrates for NSF in a process requiring the adapter protein SNAP (Sec17p in yeast) (Weidman et al. 1989).
SNARE proteins are essential factors required for vesicle fusion in a variety of cellular contexts. For example, they play a key role in fusion during intravesicular transport, such as cargo transport through the Golgi apparatus (Malsam and Söllner 2011), and secretion, as in the case of neurotransmitter release (Südhof 2013; Rothman 2014). Mechanistically, this is accomplished by the association of SNARE proteins on opposing membranes to form the trans SNARE complex (Sutton et al. 1998; Weber et al. 1998). Fusion is then achieved as the SNARE proteins more tightly associate to form a helical bundle. Following fusion, the SNARE proteins are found in a stable complex anchored to the same membrane, the cis SNARE complex. NSF maintains a supply of individual SNAREs by disassembly of these post-fusion cis SNARE complexes (Söllner et al. 1993). Moreover, NSF is involved in quality control, as it promotes the formation of fusogenic SNARE complexes by disassembly of the syntaxin-SNAP-25 binary complex as well as off-pathway complexes (Ma et al. 2013; Lai et al. 2017; Brunger et al. 2018).
NSF engages a variety of SNARE complexes through adaptor proteins called soluble NSF-attachment proteins (SNAPs). NSF, multiple SNAPs, and a SNARE complex form a super-complex (often referred to as the 20S complex), the starting state for SNARE complex disassembly (Malhotra et al. 1988; Whiteheart et al. 1992; Söllner et al. 1993; Mayer et al. 1996; Hanson et al. 1997). NSF's ability to disassemble post-fusion SNARE complexes is dependent on the hydrolysis of ATP (Nagiec et al. 1995). There is only one copy of NSF in most eukaryotic organisms, and null mutations known to date are largely lethal (Bayless et al. 2018).
NSF is comprised of three domains, an N-domain, the first AAA+ domain (D1), and the second AAA+ domain (D2) (Figure 10). These domains are discrete structural entities, as was first recognized by limited proteolysis studies (Tagaya et al. 1993). NSF may have branched off from the archaeo-eukaryotic CDC48 family, which also consists of an N-domain, a D1, and a D2 domain. The structure of the D2 domain was among the first two structures of the "classical" clade 3 AAA+ fold (Figure 1(c)). Moreover, the crystal structures of the NSF D2 domain provided the first insights into the oligomeric arrangements of AAA+ domains by revealing the structural basis of the formation of the NSF D2 hexameric ring (Lenzen et al. 1998). The biochemical properties of the NSF AAA+ domains point to diverging functions. The D1 domain of NSF binds ATP weakly in comparison to the D2 domain. For the D1 domain, the K_D for ATP is approximately 15-20 μM, while in the case of the D2 domain, the K_D is approximately 30-40 nM (Matveeva et al. 1997). Although both the D1 and D2 domains hydrolyze ATP, the D2 domain does so very slowly (Morgan et al. 1994). When nucleotide is entirely stripped from NSF, it reverts to a monomeric state, and upon reconstitution with ATP, it re-forms a hexamer (Zhao et al. 2015). Together, these observations suggest that the D2 domain is primarily responsible for oligomerization, while the D1 domain actively processes substrate over successive rounds of hydrolysis. The order of the domains also matters, as swapping the positions of the D1 and D2 domains completely destroys the activity of the enzymatic complex (Whiteheart et al. 1994). Mutations to the Walker A or B motif of the D1 domain (K266A and E329Q) leave the complex catalytically dead (Matveeva et al. 1997). As mentioned above, the Walker A motif is involved in binding nucleotide, whereas the Walker B motif plays a role in hydrolysis.
The D1 Walker B mutant (E329Q) is still able to form the NSF/SNAP/SNARE complex at WT levels, while the D1 Walker A mutant (K266A) forms only ~10% of the complex compared to WT. Mutation of the D2 Walker B site (D604Q) still allows for NSF/SNAP/SNARE formation and disassembly activity, which is in line with the hypothesis that the D1 domain provides the majority of the ATPase activity required for function (although it is theoretically possible that the ATPase activity of D2 may be stimulated by substrate). Interestingly, mutation of the D2 Walker A site (K549A) still allows for NSF/SNAP/SNARE complex formation and intermediate amounts of activity.
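To give a feel for the roughly three orders of magnitude separating the reported K_D values of D1 and D2, the sketch below evaluates a simple one-site binding isotherm, fraction bound = [ATP] / (K_D + [ATP]). Several things here are assumptions rather than results from the cited studies: the independent single-site model itself, the function name `fraction_bound`, and the 1 μM free ATP concentration, which was chosen only to make the contrast visible.

```python
def fraction_bound(ligand_um, kd_um):
    """One-site binding isotherm: fraction of sites occupied at a free ligand concentration."""
    return ligand_um / (kd_um + ligand_um)

# Reported K_D ranges, converted to micromolar.
KD_D1_UM = (15.0, 20.0)    # D1: approximately 15-20 uM
KD_D2_UM = (0.030, 0.040)  # D2: approximately 30-40 nM

ATP_UM = 1.0  # hypothetical free ATP concentration, for illustration only

for name, kd_range in (("D1", KD_D1_UM), ("D2", KD_D2_UM)):
    occupancies = [fraction_bound(ATP_UM, kd) for kd in kd_range]
    print(name, [f"{x:.2f}" for x in occupancies])
# D1 ['0.06', '0.05']
# D2 ['0.97', '0.96']

# Fold difference in affinity implied by the two K_D ranges.
print(round(KD_D1_UM[0] / KD_D2_UM[1]), round(KD_D1_UM[1] / KD_D2_UM[0]))
# 375 667
```

Under these assumptions D2 is nearly saturated while D1 is mostly empty, which is consistent with the proposal that D2 stays nucleotide-bound to hold the hexamer together while D1 cycles through binding and hydrolysis.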
The N-domain interacts with the SNAP adaptor proteins to bind substrate, following another key trend in AAA+ N-domains. The N-domain is composed of two different sub-domains, A and B, joined by a linker (Figure 10) (May et al. 1999; Yu et al. 1999). The A sub-domain is a double-ψ β-barrel (DPBB) (Castillo et al. 1999), appropriately named for its two interlocking ψ loops. The first ψ loop lies between the first and second β-strands, and the second ψ loop is between the fourth and fifth β-strands. The DPBB of the A sub-domain varies from the canonical structure only in that the first ψ loop lies between the first and second β-strands, instead of the canonical second and third. The B sub-domain is an α/β roll, with one primary α-helix surrounded by four β-strands. The core of the roll between the α-helix and the four β-strands is stabilized by a row of conserved phenylalanine and tyrosine side chains, forming an aromatic ladder. The overall charge of the N-domain is positive, and it has an extremely positive "groove" which serves as a point of electrostatic interaction with the negatively charged interface of the C-terminal region of the SNAP protein (Zhao et al. 2015). Deletion of the N-domain prevents formation of the NSF/SNAP/SNARE complex. The recombinant N-domain alone does not bind SNAP/SNARE either (Nagiec et al. 1995), but when fused to a single D1 or D2 domain it is able to form a complex with SNAP, suggesting that the N-domain in the context of a AAA+ domain is the minimum requirement for SNAP binding. The N-domain of NSF is similar to that of the CDC48/p97 family (Iyer et al. 2004).
The oligomeric state of NSF is primarily hexameric, as supported by solution studies (Fleming et al. 1998), structural studies of the D2 domain (Lenzen et al. 1998; Yu et al. 1998), and, more recently, cryo-EM structures of NSF (Zhao et al. 2015). The number of SNAP molecules involved varies from 2 to 4 (Shah et al. 2015; Zhao et al. 2015; White et al. 2018).
Structure determination of full-length NSF and the NSF/SNAP/SNARE complex proved extremely challenging. Crystallization of full-length NSF as well as the NSF/SNAP/SNARE complex has been unsuccessful to date, and early cryo-EM studies resulted in low-resolution maps that either did not reveal the SNAREs/SNAPs or showed artificially symmetry-averaged densities for them (Furst et al. 2003; Chang et al. 2012). The first near-atomic resolution cryo-EM structures of both apo NSF and the NSF/SNAP/SNARE complex were determined in 2015 (Zhao et al. 2015). The structure of the NSF/SNAP/SNARE complex revealed the engagement of NSF with αSNAP and a highly truncated neuronal SNARE complex composed of syntaxin-1a, synaptobrevin-2, and SNAP-25a in the presence of the non-hydrolysable nucleotide analog AMPPNP. This success was due to an extensive purification scheme: NSF was ectopically expressed in E. coli, reduced to the monomeric state through repeated buffer exchange and removal of nucleotide, and then reassembled during size exclusion chromatography with re-addition of nucleotide. With this purification scheme, hexameric NSF of exceptional homogeneity and purity can be prepared in the presence of a variety of nucleotides and nucleotide analogues. Moreover, the disassembly activity of this highly purified NSF sample is much higher than that obtained by previously published methods. Advances in cryo-EM methodology and the availability of direct electron detectors also played a key role (Bammes et al. 2012). Subsequently, an even higher-resolution structure of the NSF/SNAP/SNARE complex was determined by cryo-EM with αSNAP and a near-full-length neuronal SNARE complex, in the presence of ATP under non-hydrolyzing conditions (White et al. 2018) (Figure 10). This success was due to slight truncation of the SNARE complex at the C-terminal end, which reduced aggregation and association of particles on EM grids, and to the use of a more advanced electron microscope.
Critically, this cryo-EM structure approached near-atomic resolution and for the first time revealed the engagement of 17 N-terminal residues of SNAP-25a by the NSF D1 ring (White et al. 2018). This structure was among the very first to reveal any interactions between a substrate and a clade 3 AAA+ protein.
Together, these cryo-EM structures (Zhao et al. 2015; White et al. 2018) provided many new insights about NSF and the NSF/SNAP/SNARE complex. The fold of the AAA+ domain is conventional for both D1 and D2, with the notable exception of a bent α2 helix in the D1 domain. This is not the case for the D2 domain and a close relative, p97, which both have the classical straight helix. In ATP-bound NSF (ATP-NSF), all six N-domains are facing upwards, opposite to the direction of substrate translocation, whereas in ADP-bound NSF (ADP-NSF), four are up and two are facing downward, hugging the side of the double-ringed hexamer (Zhao et al. 2015). ATP-NSF has a symmetric D2 ring and an asymmetric D1 ring, with a split between the first and last protomer both in the plane of the ring and along the pore axis. ADP-NSF is more planar than ATP-NSF but maintains a larger gap between the first and last protomer. The α7 helix is also significantly translated between ATP-NSF and ADP-NSF, likely due to the difference in nucleotide state in the D1 ring.
As noted previously, NSF requires the SNAPs to mediate interactions with various SNARE complexes; the SNAPs and their yeast homolog Sec17 are composed of an extensive twisted sheet of α-helical hairpins and a C-terminal helical bundle (Rice and Brunger 1999) that interacts with the NSF N-domain (Figure 10). A hydrophobic loop in the N-terminal region of SNAP might serve as a membrane attachment site, as disassembly activity is greatly enhanced in the presence of membranes, an effect mediated by this N-terminal region of SNAP (Winter et al. 2009). The SNAPs directly interact with the SNARE complex via primarily electrostatic interactions (Zhao et al. 2015; White et al. 2018). The SNARE complex has a highly conserved characteristic pattern of positive and negative surface charges (conserved across all species and localizations investigated so far) with which the SNAPs interact; a minimum of two SNAPs interface with this pattern through their own complementary charged surfaces in all structures observed to date, with two additional SNAPs possibly binding as well. The presence of the SNAP-25a linker may preclude binding of an additional two αSNAPs as observed in the NSF/SNAP/SNARE structure lacking this fragment (Zhao et al. 2015; White et al. 2018). Biochemical data highlight the importance of the charged SNARE complex surface in disassembly; NSF is unable to disassemble a four-helix bundle that does not have the characteristic surface charge distribution of SNARE complexes. Interestingly, NSF is capable of disassembling a "double" SNARE complex created by linking two SNARE complexes in a head-to-tail fashion.
The position of the N-domains relative to the D1 ring is likely related to substrate recognition. N-domains in the "up" position interact with αSNAP electrostatically, typically with two N-domains associating with negative residues in the C-terminal domain of αSNAP. This series of SNAP-SNARE and SNAP-NSF interactions positions the N-terminal end of the SNARE complex proximal to the D1 pore. This pore is formed primarily by protomers A-F of the D1 ring, which are arranged to form an asymmetric staircase configuration with an average rotation of 57.5 ± 1.0° around the pore axis (Figure 10). The N-terminal residues of one of the neuronal SNAREs, SNAP-25a, are gripped by a spiraling array of tyrosine residues on pore loop 1 of NSF D1 protomers A-E (White et al. 2018); this structure was determined in the presence of ATP and EDTA (i.e. under non-hydrolyzing conditions). Each tyrosine intercalates between every other amino acid of the substrate by engaging in a hydrogen bonding interaction with SNAP-25a's backbone carbonyl, as well as apparently nonspecific packing interactions with the substrate sidechains themselves (Figure 10(d)).
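The geometric consequence of the staircase can be made explicit with a short back-of-the-envelope calculation. It assumes (this is our reading, not a statement from the structures themselves) that the 57.5° figure is the mean rotation between adjacent protomers along A-F and that each of the five steps takes the mean value; the symbols are introduced here purely for illustration.

```latex
\theta_{A \to F} = 5 \times 57.5^\circ = 287.5^\circ, \qquad
\theta_{F \to A} = 360^\circ - 287.5^\circ = 72.5^\circ \;>\; 60^\circ
```

Under these assumptions the seam between protomers F and A is wider than the 60° spacing of a planar, six-fold symmetric ring, which is consistent with the split between the first and last protomer described for the D1 ring.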
The available cryo-EM structures of NSF/SNAP/SNARE complexes so far have not revealed any density inside the D2 pore. Moreover, the register of the N-terminal residues of SNAP-25 is uniquely determined by the position of the SNARE complex (White et al. 2018) (Figure 10(b)), precluding extension of this chain into the D2 pore. Of course, it cannot be ruled out that substrate may enter the D2 pore under hydrolyzing conditions.
The density corresponding to the D1 F protomer is poor, indicative of high mobility and thus disengagement from the rest of the D1 ring. The overall configuration of the D1 ring with respect to substrate is consistent with other classical clade 3 AAA+ proteins such as YME1, Vps4, and many others (Caillat et al. 2015; Monroe and Hill 2016; Monroe et al. 2017; Puchades et al. 2017; Sun et al. 2017); apical aromatic residues at pore loop 1 intercalate between every other amino acid of the substrate. However, the NSF D2 pore loops lack a conserved aromatic residue, suggestive of a minimal role in the active processing of substrate and casting further doubt on the possibility of substrate threading through the D2 pore. Furthermore, despite the asymmetry of the D1 ring, the D2 ring remains largely symmetric in the currently available cryo-EM structures and probably serves as a rigid platform, with conformational change of the D1-D2 linker accommodating the asymmetric arrangement of the D1 domains.
The structural details of the SNARE complex disassembly process remain largely unknown. Based on the structures of the substrate-loaded complex (Zhao et al. 2015; White et al. 2018), we speculated that pulling and subsequent translocation of the SNARE protein (the N-terminal region of SNAP-25 in the available structures) engaged by the D1 ring leads to destabilization of the ternary complex, possibly through the disruption of the N-terminal portion of the SNARE bundle. This process could be further driven by the motion of the N-domains and associated αSNAP molecules; we note that the αSNAPs themselves wrap around the SNARE complex in the opposite direction of the twist in the ternary SNARE complex. Following destabilization of the SNARE complex, the SNAP-25 N-terminus could be released through a split in the D1 and/or D2 rings of NSF, as suggested by the mobility of the F-protomer, while the other SNAREs could simply dissociate.
The energetics and dynamics of the SNARE complex disassembly process have been studied as well. Single-molecule experiments show that NSF unravels the SNARE ternary complex in a single, fast (<10 ms) step and appears to require only the 12 ATP (Ryu et al. 2015), although the temporal resolution of these experiments precluded observation of possible intermediate steps during SNARE complex disassembly. This observation is in contrast to other systems, where sequential steps have been observed for AAA+ proteins involved in degradation/disaggregation, which occurs on the timescale of seconds (Sen et al. 2013; Rodriguez-Aliaga et al. 2016; Olivares et al. 2017). Furthermore, magnetic tweezer-assisted unfolding of all four SNARE motifs is simultaneous on a millisecond timescale (Kim et al. 2021). ATP binding to NSF hexamers is random and negatively cooperative, while in the NSF/SNAP/SNARE complex it is synchronous and cooperative (Kim et al. 2021). Finally, substrate binding is associated with an 8.4-fold increase in the rate of ATP hydrolysis (Kim et al. 2021).
The gap in knowledge in understanding how NSF (and related AAA+ complexes) performs its explosive disassembly of SNARE proteins necessitates additional studies. Single-molecule experiments performed at even higher time resolutions would be worthwhile in determining intermediate conformations. These single-molecule studies could then help guide structural studies to capture these intermediate conformations and help delineate principles in AAA+ action on substrate.

The p97/Cdc48 family
A close relative of the NSF protein family worth noting is the p97/Cdc48 family, also a type II, clade 3 AAA+ protein. p97 (also called VCP, valosin-containing protein) typically refers to the mammalian ortholog, and Cdc48 refers to the yeast ortholog. p97 has been associated with many cellular functions that span many different biochemical processes (van den Boom and Meyer 2018), and is perhaps best known for its central role in the ubiquitin-proteasome system (UPS). Through the aid of adaptor proteins, p97 binds ubiquitinated proteins and assists in their degradation at the proteasome. This is highly important for alleviating proteotoxic and organelle stress where proteins become damaged or misfolded. The unfolding activity of p97 prevents the mass accumulation of these proteins in cellular sub-compartments. Studies have also linked p97 to the lysosome pathway and macroautophagy (van den Boom and Meyer 2018). p97 also helps to liberate proteins that are sequestered in macromolecular complexes or the membrane; these proteins are then able to become functional, so the function of p97 is not solely related to protein degradation. The loss of p97 is lethal in mice (Müller et al. 2007), and it is upregulated in certain cancers (Deshaies 2014). Because of its key role in several processes, its dysregulation is implicated in several diseases (Meyer and Weihl 2014).
Like NSF, p97 contains an N-domain followed by a D1 and D2 domain (DeLaBarre and Brunger 2003; Davies et al. 2008; Banerjee et al. 2016). Six subunits form a homohexameric oligomer that maintains interprotomer contacts through the traditional mode of arginine fingers. Corkscrew-like changes of the D1 and D2 domains along with upward motions of the N-domains are observed in the presence of ATPγS (Banerjee et al. 2016). Additionally, D2 Walker A or B mutations ablated activity much more dramatically than D1 domain mutations (Song et al. 2003). At increased temperatures, however, mutations to D1 reduced activity, whereas D2 mutants were insensitive to temperature change. Thus, the roles of the D1 and D2 domains appear to be reversed compared to NSF, as the D1 domain is primarily responsible for oligomerization while the D2 domain appears to carry out the bulk of the hydrolysis activity. Indeed, mutation of the Walker B motif in D1 of Cdc48 eliminated ATPase activity in the absence of substrate and reduced activity when the substrate and cofactor are present, while mutation of the Walker B motif in D2 of Cdc48 led to reduced ATPase activity without substrate and even further reduced activity upon addition of substrate and cofactor (Bodnar and Rapoport 2017). Substrate stimulates ATPase activity in D2 while suppressing activity in D1. Moreover, substrate is completely translocated by both D1 and D2 rings as revealed by EM structures of human p97 (Cooney et al. 2019; Twomey et al. 2019; Pan et al. 2021). In contrast, available structures have thus far not revealed any substrate binding to the D2 domain of NSF.
The N-terminal domain of this protein family, as briefly mentioned above, interacts with adaptor proteins that help to coordinate engagement of substrate. For example, the Ufd1-Npl4 (UN) heterodimer interacts with p97 (Bodnar et al. 2018). Ufd1 interacts with Npl4 through an unstructured protein segment. Two short SHP motifs in Ufd1 mediate its interaction with Cdc48. Npl4 interacts with the N-domain of Cdc48 via a UBX-like domain and uses two zinc fingers to orient itself above the D1 ring. Alternatively, Cdc48 can bind other mutually exclusive adaptor proteins such as Shp1 (Cooney et al. 2019).
The UN heterodimer binds polyubiquitinated proteins and forms a "tower" on top of the D1 ring, yielding a processive complex with Cdc48/p97 (Twomey et al. 2019; Pan et al. 2021) (Figure 11). In all three cryo-EM structures (Cooney et al. 2019; Twomey et al. 2019; Pan et al. 2021), the substrate traverses both the D1 and the D2 pores. However, at variance with NSF, the Cdc48/p97 D1 pores lack the conserved tyrosine "ladder" that is a hallmark of the substrate-NSF interactions (White et al. 2018) as well as other substrate-AAA+ interactions known to date (Puchades et al. 2017). Methionine residues in D1 are involved in substrate binding (Twomey et al. 2019). In contrast, the Cdc48/p97 D2 pore contains conserved tyrosine residues, and the substrate is bound by the D2 pore in a fashion similar to NSF D1 (Figure 11). Pan et al. used a mutation in the N-domain that makes the p97 preparation more homogeneous, and, importantly, let hydrolysis proceed for 5 min at room temperature before flash freezing the sample for cryo-EM. Critically, Pan et al. (2021) found that the nucleotide states in D1 and D2 are not coupled. Coordination of hydrolysis between D1 and D2 appears to be asynchronous, given that in an active state the D1 and D2 domains in the same chain can bind different nucleotides. Moreover, two distinct states were identified: an open state with all six D2 pore loops engaged with substrate, and a closed state with only five engaged (i.e. the F D2 domain does not interact with substrate). Within a single D1 or D2 ring, however, it appears that hydrolysis could be sequential, as an intersubunit signaling (ISS) motif appears to insert itself into neighboring nucleotide-binding sites for several consecutive protomers; this could inhibit uncoordinated ATP hydrolysis (Pan et al. 2021). This ISS motif is conserved in several AAA+ proteins in different clades, suggesting that it could be widely important in different complexes (Chang et al. 2017).
The authors, therefore, suggest that the D2 domain is likely exerting the force on the substrate by a power-stroke-like mechanism, although a stochastic Brownian ratchet mechanism cannot be entirely ruled out (see discussion below).

Mechanistic models of AAA+ engagement
A mechanistic explanation of the action of different members of the AAA+ protein superfamily has been a key goal ever since the discovery of its first members (Ogura and Wilkinson 2001; Harrison 2004). NSF and other AAA+ proteins have been used as model systems to tease apart the mechanism of oligomerized AAA+ action on substrate through a variety of methods. Despite remarkable progress, particularly with respect to understanding the biological contexts of AAA+ protein action, many questions remain. Indeed, progress in understanding the mechanistic pathway preceding, during, and following nucleotide hydrolysis (the reaction coordinate of AAA+ protein substrate processing) remains, at best, limited in most cases. Furthermore, to what degree might a given AAA+ protein reaction coordinate be conserved across members of the family? The models proposed so far can at least be subdivided into two groups based on the pattern of ATPase domain firing, that is, either sequential or stochastic hydrolysis of ATP by the protomers in any oligomeric arrangement of AAA+ proteins.

The sequential model
The sequential model (also known as the hand-over-hand model) posits that each protomer hydrolyzes ATP one at a time in a sequential fashion. One protomer hydrolyzes ATP to ADP. The conformational change that this protomer undergoes during hydrolysis and/or with the release of ADP is transmitted to its neighboring protomer, which is then primed to hydrolyze its own ATP. In this manner, hydrolysis proceeds around the ring, either continuing to cycle several times until it has completed its full action on the substrate (processive) or ending after one cycle (non-processive).
The sequential model is supported primarily by the structures and biophysical data of many different AAA+ proteins engaged with protein substrate (Han et al. 2017; Puchades et al. 2017; de la Peña et al. 2018; White et al. 2018; Deville et al. 2019; Ripstein et al. 2020). Some AAA+ proteins that engage with substrate form a spiral arrangement, suggesting that protomers that have already fired might transition between the top and bottom of the spiral. AAA+ proteins with two catalytically active rings show a spiral engagement of both rings (Gates et al. 2017), while AAA+ proteins with a single active and single inactive domain show one ring as spiral and the other as predominantly planar (Zhao et al. 2015; White et al. 2018). Most structures of AAA+ proteins engaging substrate are of unfoldases, but presumably this sequential mechanism would hold true for helicases that must unwind large portions of DNA (Enemark and Joshua-Tor 2006).
The clamp loader clade (clade 1) differs from the other clades in that it typically forms a heteropentameric assembly. Despite this, it follows many of the same local conformational patterns other AAA+ proteins show when bound to nucleotide (Kelch 2016). Upon hydrolysis, the large and small subdomains move toward each other. This conformational change could presumably promote ATP hydrolysis at the next, sequential protomer. For clamp loaders, sequential activation of the protomers would be non-processive. ATP hydrolysis would lead to clamp and DNA release. In essence, the clamp loaders would thus be an "on-off" switch.

The stochastic model
In contrast, the stochastic model proposes that protomers fire independently of each other, regardless of position, to exert force on the substrate (Fei et al. 2020). This model is supported by studies of synthetically linked ClpX protomers and synthetically linked HslU protomers. In both cases, the entire hexamer is present as a single fusion protein; these proteins function in vitro even with mutations to the ATP binding/hydrolysis sites of some but not all protomers (Martin et al. 2005; Cordova et al. 2014; Baytshtok et al. 2017). If the protomers were forced to act in sequential fashion, then an inactive protomer unable to hydrolyze ATP would presumably stall the cycle of hydrolysis, leading to a completely inactive enzyme. On the contrary, a stochastic mechanism could tolerate some inactive protomers, albeit with reduced processivity. However, mixing of unlinked, active ClpB protomers with catalytically dead ClpB protomers abolished disassembly activity after the addition of only 1-2 dead protomers, contrary to results seen with synthetically linked Clp family protomers (Martin et al. 2005; Cordova et al. 2014; Deville et al. 2019). Mixing of inactive ClpA protomers resulted in inactivation of complex activity with four or more dead subunits.
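The logic of the dead-protomer argument can be illustrated with a deliberately simple toy model. The sketch below is not taken from any of the cited studies: the function names, the one-event-per-tick bookkeeping, and the strict stalling rule are all assumptions made for illustration, and real enzymes may bypass or skip inactive subunits.

```python
import random


def sequential_events(active, ticks):
    """Strict sequential firing: protomer i+1 fires only after protomer i.

    A dead protomer can never fire, so the cycle stalls once it is reached.
    """
    events, position = 0, 0
    for _ in range(ticks):
        if not active[position]:
            break  # the ring is stuck at a dead protomer
        events += 1
        position = (position + 1) % len(active)
    return events


def stochastic_events(active, ticks, seed=0):
    """Independent firing: at each tick one protomer is picked at random."""
    rng = random.Random(seed)
    events = 0
    for _ in range(ticks):
        if active[rng.randrange(len(active))]:
            events += 1  # a dead protomer simply wastes the tick
    return events


wild_type = [True] * 6
one_dead = [True, True, True, False, True, True]  # hypothetical mixed hexamer

for label, ring in (("wild type", wild_type), ("one dead", one_dead)):
    print(label, sequential_events(ring, 600), stochastic_events(ring, 600))
```

In this toy picture a single dead protomer halts the strictly sequential ring after a few events, whereas the stochastic ring keeps firing at roughly five-sixths of the wild-type rate. This is the qualitative contrast that the linked-hexamer and subunit-mixing experiments probe.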
Single-molecule data also seem to suggest that Clp family ATPases process substrate in step sizes larger than the two amino acids that would be expected under the sequential model (Maillard et al. 2011; Sen et al. 2013; Avellaneda et al. 2020). Under a stochastic model, however, multiple protomers could fire at once and drive larger step sizes consistent with the single-molecule data (Qu et al. 2011; Sen et al. 2013; Rodriguez-Aliaga et al. 2016). Recent high-resolution structures consistently show ADP only at protomers at the top or bottom of the staircase. If protomers could fire randomly, one would instead expect a random distribution of ATP and ADP around the ring. Such a distribution was seen in a crystal structure of HslU, where three AMPPNP ligands were found bound to alternating subunits (Bochtler et al. 2000; Ogura and Wilkinson 2001). Atomic force microscopy on Abo1, without substrate, revealed random transitions from a planar to an asymmetric D1 ring in the presence of ATP, suggestive of a stochastic model of activation (Cho et al. 2019). Under the stochastic model, clamp loaders would presumably hydrolyze ATP randomly and release their clamp and DNA once a sufficient number of protomers had fired.

Firing mechanism
We have discussed the two main models, involving either sequential or stochastic hydrolysis of ATP by the AAA+ protomers. An equally important question relates to the mode of transfer of energy to the substrate. Sequential models often assume that the protomers fire in a "power-stroke"-like manner, whereas stochastic models involve two potential firing mechanisms: one in which the protomers power stroke stochastically and one in which the AAA+ complex acts as a Brownian ratchet. However, these different firing mechanisms do not necessarily depend on the sequential or stochastic nature of the model.
The power-stroke model dictates that the AAA+ protein irreversibly performs action on substrate in one single step (Hwang and Karplus 2019). Biophysical studies of linked ClpX measuring the force it exerts on substrate reveal that each protomer hydrolyzes ATP with a maximum efficiency of about 35%, which is in the range of other power-stroke motors (Maillard et al. 2011; Rodriguez-Aliaga et al. 2016). Additionally, in optical tweezer experiments, phosphate release appears to coincide with a burst phase of disassembly (Rodriguez-Aliaga et al. 2016). Combination studies using single-molecule magnetic tweezers and single-molecule fluorescence suggest that NSF disassembles SNAREs in a rapid disassembly step coupled directly to ATP hydrolysis (Ryu et al. 2015).
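To give a sense of scale for the efficiency figure, one can write the thermodynamic efficiency as the ratio of mechanical work output to the free energy of ATP hydrolysis. The numerical value used for the free energy below (about 50 kJ/mol, or roughly 83 pN nm per molecule) is a textbook order-of-magnitude assumption for cellular conditions and is not taken from the cited studies; the symbols are likewise introduced here for illustration only.

```latex
\eta = \frac{W_{\text{out}}}{\Delta G_{\text{ATP}}}, \qquad
W_{\text{out}} \approx 0.35 \times 83\ \text{pN nm} \approx 29\ \text{pN nm}
```

Under that assumption, an efficiency of about 35% would correspond to a few tens of pN nm of mechanical work per ATP hydrolyzed.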
Conversely, the Brownian ratchet model assumes that the AAA+ protein uses energy to rectify or bias the diffusive motion of the substrate (Hwang and Karplus 2019). FRET studies of ClpB pore loops show that they undergo large conformational fluctuations between discrete states on the microsecond timescale (Mazal et al. 2020). These microsecond transitions between the different FRET states are consistent with motions of more than 10 Å and are presumably associated with the translocation of substrate peptide by more than two residues. This observation implies translocation of substrate on a timescale faster than expected, as many residues could translocate in between pore loop transitions. The pore loops of ClpB are thus proposed to serve as "ratchets" for the substrate, which fluctuates in the pore via Brownian motion. While substrate motion through the AAA+ pore is random in this model, the stochastic hydrolysis of ATP causes the pore loops responsible for engagement with the substrate to act as a "pawl" and prevent the substrate from moving backwards through the pore. As such, each subsequent stochastic ATP hydrolysis event does not directly translocate the substrate through the pore of the AAA+ complex, but instead ensures the unidirectional motion of the substrate. We conclude that the current data do not rule out either the stochastic or the sequential model, or either firing mechanism. It is also possible that all these mechanisms exist simultaneously, or that they are specific to particular members of the AAA+ family.
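The rectification idea behind the Brownian ratchet can be sketched with a one-dimensional random walk. This is a minimal illustration and not a model of ClpB: the function name, the two-residue pawl spacing, and the rule that the pawl advances automatically whenever the substrate passes a new notch (standing in for a hydrolysis-driven re-gripping by the pore loops) are all assumptions made here.

```python
import random


def simulate(steps, ratchet, spacing=2, seed=1):
    """Unbiased 1D random walk of a substrate position (in residues).

    With ratchet=True a pawl sits at the highest multiple of `spacing`
    reached so far and blocks any move that would slip below it.
    Returns the final (position, pawl) pair.
    """
    rng = random.Random(seed)
    position, pawl = 0, 0
    for _ in range(steps):
        move = rng.choice((-1, 1))  # thermal kick, no directional bias
        if ratchet and position + move < pawl:
            continue  # backward slip blocked by the pawl
        position += move
        if ratchet:
            # the pawl only ever moves forward, one notch at a time
            pawl = max(pawl, (position // spacing) * spacing)
    return position, pawl


print("free diffusion:", simulate(2000, ratchet=False))
print("ratcheted:     ", simulate(2000, ratchet=True))
```

Every individual move is unbiased, so no step acts as a power stroke; net forward transport arises only because backward excursions are blocked. Energy input is needed in the real system to reset the pawl, which is the role the text assigns to stochastic ATP hydrolysis.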

Future directions
A plethora of existing biophysical and structural data pertaining to AAA+ proteins has been reviewed here. However, many aspects of the underlying molecular mechanisms remain unclear after decades of research. Capturing transition state intermediates of these AAA+ proteins during active translocation of substrate by cryo-EM would be highly informative in discerning between the sequential and stochastic models, along with their details. Cryo-EM also enables the search for new drugs targeting AAA+ proteins; new workflows will drive the discovery of drug candidates that target malfunctioning AAA+ proteins responsible for disease and enable the development of novel antibiotics that target bacterial AAA+ proteins. Additionally, cryo-electron tomography (cryoET) could be employed to capture unique conformational or oligomeric states of AAA+ proteins in situ, providing valuable insight into the structural heterogeneity of these molecular machines.