Circumventing the packaging limit of AAV-mediated gene replacement therapy for neurological disorders

ABSTRACT
Introduction: Gene therapy provides the exciting opportunity of a curative single treatment for devastating diseases, eradicating the need for chronic medication. Adeno-associated viruses (AAVs) are among the most attractive vector carriers for gene replacement in vivo. Yet, despite the success of recent AAV-based clinical trials, the clinical use of these vectors has been limited. For instance, the AAV packaging capacity is restricted to ~4.7 kb, making it a substantial challenge to deliver large gene products.
Areas covered: In this review, we explore established and emerging strategies that circumvent the packaging limit of AAVs to make them effective vehicles for gene replacement therapy of monogenic disorders, with a particular focus on diseases affecting the nervous system. We report historical references, design remarks, as well as strengths and weaknesses of these approaches. We additionally discuss examples of neurological disorders for which such strategies have been attempted.
Expert opinion: The field of AAV-gene therapy has experienced enormous advancements in the last decade. However, there is still ample space for improvement aimed at overcoming existing challenges that are slowing down the progressive trajectory of this field.


AAVs and their role in gene therapy
Adeno-associated viruses (AAVs) belong to the Parvoviridae family (from the Latin 'small viruses'), comprising a group of small-sized non-enveloped viruses with an icosahedral capsid carrying a single-stranded DNA genome [1]. The AAV capsid measures ~250 Å in diameter and is composed of three proteins (VP1, VP2 and VP3) organized in a 1:1:10 ratio to form a T = 1 icosahedron of 60 subunits assembled in 20 triangular faces (Figure 1(a), left) [2]. This highly symmetrical design follows the principle of 'genetic economy,' first described by Watson and Crick in 1956 [3] and subsequently conceptualized by Caspar and Klug in 1962 [4]. According to their observations, viral proteins are arranged in a repeated configuration to ensure maximal non-covalent interactions while generating a complete coat with sufficient volume to accommodate the viral genetic material. Hence, the genome of icosahedral viruses encodes a small number of capsid subunits that are geometrically organized as multiples of 60 to build composite shells around their nucleic acids. Both the AAV capsid and genome are very small. Particularly, the AAV genome measures 4.7 kb and consists of two open-reading frames (rep, cap) flanked by T-shaped palindromic inverted terminal repeats (ITRs) of 145 base pairs. While the ITRs allow the replication and packaging of the AAV genome, rep encodes the four proteins required for replication (Rep40, Rep52, Rep68 and Rep78), and cap generates VP1, VP2 and VP3 proteins [5].
Following their serendipitous discovery as contaminants in adenovirus preparations in the 1960s [8], AAVs were soon taken into consideration as potential vehicles for gene therapeutic applications [9]. In fact, because gene replacement therapy aims at treating diseases via targeted delivery of functional genetic material, AAVs show a combination of properties that immediately made them promising candidates for this task. For instance, AAVs are non-pathogenic and exhibit low immunogenicity in humans [10], with AAV toxicity being predominantly linked to the administration of high vector doses leading to liver side effects [11]. One major advantage of AAVs is that they naturally display a broad tissue tropism, a property primarily influenced by their capsid. At least eleven natural serotype capsids have been identified [12], and researchers have been mixing capsids and genomes from different serotypes to modulate tropism according to their needs (typically combining the AAV2 genome with other capsids) [13]. AAV2, AAV5 and AAV8 capsids have been utilized for early studies targeting the nervous system; however, AAV9 has become predominant in the last decade due to its wider biodistribution [14]. Additionally, AAV9 has been proven to cross the blood-brain barrier (BBB) in mice, non-human primates and human trials [15], suggesting the possibility to target the nervous system via systemic delivery [16][17][18], which would be desirable for disorders extending beyond a localized neuroanatomical region. Last but not least, AAVs can target both dividing and non-dividing cells [19], which is particularly interesting in the context of diseases of the nervous system, considering that neurons are post-mitotic.
Unfortunately, the characteristic small size of AAVs has limited the application of these vectors in the context of gene replacement therapy. Recombinant AAVs for gene therapy maintain their ITRs while being deprived of the core viral rep-cap genes, which are replaced by (i) a promoter, (ii) the therapeutic transgene sequence, and (iii) a poly-adenylation signal (Figure 1(a), right) [20]. This means that genes with coding sequences exceeding ~4 kb represent an oversized cargo for AAV vectors. Importantly, a number of monogenic diseases, including several disorders of the nervous system, would be excluded from the benefits of gene replacement therapy based on this limitation. Therefore, several strategies have been aimed at modifying oversized transgenes in order to adapt them to AAV-mediated delivery. Here, we review these different approaches and explore emerging alternatives. We additionally highlight the challenges linked to targeting the unique microenvironment of the nervous system, and provide insights into the research efforts currently directed at treating a number of neurological disorders via these methods.
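The size constraint described above can be illustrated with a simple base-pair budget. The following is a minimal sketch, assuming that the ~4.7 kb limit includes both 145 bp ITRs; the promoter and poly-adenylation signal lengths in the usage example are hypothetical placeholders rather than values from any cited study, and the function names are introduced here for illustration only.

```python
# Rough base-pair budget for a recombinant AAV genome.
# Values taken from the text: ~4.7 kb capacity, 145 bp per ITR.
# Assumption: the 4.7 kb limit counts both ITRs.
AAV_CAPACITY_BP = 4700
ITR_BP = 145


def cargo_room(promoter_bp: int, polya_bp: int,
               capacity_bp: int = AAV_CAPACITY_BP,
               itr_bp: int = ITR_BP) -> int:
    """Base pairs left for the coding sequence after ITRs and regulatory elements."""
    return capacity_bp - 2 * itr_bp - promoter_bp - polya_bp


def fits_single_vector(cds_bp: int, promoter_bp: int, polya_bp: int) -> bool:
    """True if the coding sequence fits alongside the chosen promoter and polyA."""
    return cds_bp <= cargo_room(promoter_bp, polya_bp)


# Hypothetical element sizes: 500 bp promoter, 200 bp polyA signal.
room = cargo_room(promoter_bp=500, polya_bp=200)
print(room)                                   # 3710
print(fits_single_vector(6600, 500, 200))     # False (e.g. a 6.6 kb CDS)
```

With these placeholder sizes the room left for a coding sequence falls below 4 kb, which is consistent with the ~4 kb threshold quoted above; shorter regulatory elements would shift the budget accordingly.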

Dual/triple vector strategies
Following the lead of a previous report [21], in 2008 Allocca and colleagues speculated that different AAV capsids might differ in their ability to tolerate large payloads [22]. To test this hypothesis, the team assessed the packaging capacity of 8 AAV serotypes (AAV1-9, excluding AAV6, which exhibits ~99.2% sequence homology with the AAV1 capsid [23]) using genome sequences of up to 8.9 kb. Interestingly, they observed that AAV5 capsids were able to deliver the investigated cargoes and correct pathology in vivo. However, this so-called 'oversized' vector strategy did not survive long. Later experiments, published simultaneously in 2010 by three independent groups, revealed that packaged genomes hardly ever exceeded the postulated packaging limit and actually showed heterogeneous size due to truncation [24][25][26]. Additionally, vector yields were at least 10-fold lower than usual, suggesting intrinsic problems with this method. To reconcile the discrepancy between the physical AAV capsid limit and the successful biological outcome linked to transducing putatively large transgenes, it was proposed that complete genomes may be generated from fragmented genomes by DNA recombination events occurring in target cells [27]. These findings marked the intensification of research efforts aimed at generating optimized systems for intracellular split genome reassembly. In the so-called 'dual/triple vector approaches,' the transgene is split into either two or three separate sequences delivered by independent AAVs and allowed to reassemble to produce the full-length transgene upon co-infection [28]. These approaches have been intensely studied, particularly in the field of inherited retinal disorders [29], producing a valuable dataset that has demonstrated their validity while highlighting important limitations to be overcome.
For instance, a general issue of dual/triple AAV vectors is the potential production of aberrant protein products from (i) erroneous reassembly and/or (ii) the untoward expression of individual split transgenes. In principle, the absence of a stop codon/polyA sequence in the 5ʹ split transgene should not produce any stable product. Similarly, the lack of a promoter/start codon in the 3ʹ split transgene should prevent any expression. Yet, putative upstream- or downstream-only products have been detected when testing dual vectors separately [30,31], suggesting that cryptic sites may be inadvertently used. An additional limitation is that the reduced efficiency of reassembly does not allow dual vectors to reach the transgene expression levels typically observed with single AAV vectors [29]. However, it could be argued that this is not an issue but rather an advantage for proteins that are normally produced at low levels. An overview of dual vector strategies, including historical references, design remarks, as well as pros and cons, is outlined below.

Concatamerization/trans-splicing strategy
Since natural AAVs are Dependoviruses, they require coinfection with a helper virus, e.g. Adenovirus or Herpesvirus, to enter the lytic cycle and establish a productive infection in the host [32]. In the absence of helper viruses, natural AAVs establish latent infections, whereby their genome persists in target cells predominantly by integrating in a Rep-dependent manner within AAVS1, a specific site on human chromosome 19 [33]. Recombinant AAVs lack the Rep protein and are, therefore, incapable of site-specific integration. In this case, persistence has been attributed to episome formation by concatamerization. This was initially observed in 1998, when the Engelhardt lab reported that DNA from recombinant AAVs delivered to muscle tissue had the tendency to circularize in a head-to-tail configuration and form stable multimers, or concatamers, that persisted as episomes in transduced cells correlating with long-term transgene expression [34]. Concatamerization was found to be driven by the sequence homology of the ITRs, and early attempts to overcome the AAV packaging limit stemmed from these fascinating observations. In the year 2000, three works reported the reconstitution of split transgenes by intermolecular junction of two separate vectors via their ITRs [35][36][37]. Because re-joining the split transgene into a single DNA molecule via this method would lead to its disruption by intervening ITR elements, splicing sequences were included to specifically excise the ITRs during mRNA processing, thus yielding an intact transcript encoding a functional full-length protein (Figure 2(a)). Due to conceptual similarities to the biological process [38], this method was also designated as 'trans-splicing.' Unfortunately, subsequent studies highlighted the possibility for AAV genomes to concatamerize in the incorrect orientation [39], which could be at least partly counteracted using heterologous ITRs (derived from different AAV serotypes) [40]. 
Importantly, it was also observed that particular care should be taken in selecting the splitting point and splicing signals, as efficient trans-splicing relies on the chosen sequences, making it crucial to meticulously optimize vector design on a case-by-case basis [41].

Overlapping strategy
In the 'overlapping' strategy, the two delivery vectors contain split transgenes sharing an overlapping fragment that facilitates reassembly of the full-length coding sequence via homologous recombination (Figure 2(b)). Here, the nature and length of the overlap region seem to represent critical elements for reassembly and may need extensive optimization. In fact, an excessively short overlapping region may not provide sufficient space for the DNA polymerase to physically bind, while an excessively long overlapping region could be more prone to form obstructive secondary structures [42]. However, because the overlapping approach does not require the insertion of additional DNA sequences, it is conceptually considered to be the simplest of the dual strategies. This strategy was shown to work particularly well in muscle tissue, ameliorating phenotypes linked to Duchenne muscular dystrophy (DMD) [43][44][45] and dysferlinopathies [46][47][48]. Conversely, overlapping vectors were not as successful in terminally differentiated photoreceptors [29,49], suggesting that some targets may be intrinsically more refractory to this approach. The mechanisms leading to the improved performance of overlapping vectors in muscle tissue are not fully understood. Typically, homologous recombination is a prominent DNA repair pathway in rapidly dividing cells [50]. For instance, proliferating myoblasts exhibit a robust DNA recombination machinery [51]. Although a certain level of homology-directed repair has been observed in non-dividing cells, such as neurons [52], DNA repair pathways other than homologous recombination are more active in these long-lived postmitotic cells [53]. Such moderate activity could explain the described difficulties in achieving transgene reconstitution via overlapping in the nervous system.
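The design logic of overlapping vectors can be pictured with a toy string model. The sketch below is purely illustrative: it treats the coding sequence as a text string, and the function names, the example sequence, the split position and the overlap length are all hypothetical. It does not model the biology of homologous recombination, only the bookkeeping of two halves that share a region of homology.

```python
def split_with_overlap(cds: str, split_at: int, overlap: int) -> tuple[str, str]:
    """Split a sequence into 5' and 3' halves sharing `overlap` bases around `split_at`."""
    if overlap <= 0:
        raise ValueError("overlap must be positive")
    start = split_at - overlap // 2
    end = start + overlap
    if start < 0 or end > len(cds):
        raise ValueError("overlap window falls outside the sequence")
    # The 5' half ends with the shared window; the 3' half begins with it.
    return cds[:end], cds[start:]


def reassemble(five_prime: str, three_prime: str, overlap: int) -> str:
    """Join the halves through their shared window, keeping a single copy of it."""
    if five_prime[-overlap:] != three_prime[:overlap]:
        raise ValueError("halves do not share the expected overlap")
    return five_prime + three_prime[overlap:]


# Hypothetical 39-base sequence, split near the middle with an 8-base overlap.
cds = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
five, three = split_with_overlap(cds, split_at=20, overlap=8)
assert reassemble(five, three, overlap=8) == cds
```

Note that each half carries its own copy of the shared window, so the overlap consumes cargo space in both vectors; this is one reason why the overlap length is a design trade-off.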

Hybrid strategy
The 'hybrid' strategy combines the winning features of the concatamerization and overlapping approaches within the same system. As such, hybrid vectors contain splicing signals as well as a highly recombinogenic sequence acting as an overlapping region (Figure 2(c)). The hybrid dual AAV strategy was initially proposed by Ghosh and colleagues in 2008 [54] and found to significantly improve transgene reassembly in a variety of studies thereafter [29,49]. This was not surprising, considering that the presence of a highly recombinogenic sequence should favour reassembly in the correct orientation, and that reassembly may still occur via concatamerization/trans-splicing in the absence of recombination. One important advantage of this approach is that it employs universal overlapping fragments (e.g. a 270 bp sequence derived from human alkaline phosphatase [55] or the 77 bp AK sequence of the F1 phage [30]) that have been thoroughly assessed for recombination efficiency, rather than requiring a sophisticated analysis of recombinogenic regions for each transgene of interest. However, the splitting point and splicing sites still need to be accurately determined.

Protein trans-splicing strategy
Another way to reassemble split transgenes is via protein trans-splicing (Figure 2(d)). This mechanism differs from the ones described above, as reassembly occurs at the protein level rather than involving DNA rearrangements. In fact, protein trans-splicing consists of joining two separate polypeptides into one through a peptide bond [56]. This process is mediated by intervening protein segments, or 'inteins,' endowed with self-catalytic activity. Inteins, which revolutionized the field of protein engineering earning the label of 'nature's gift to protein chemists' [57], were discovered in 1990 within the genome of Saccharomyces cerevisiae [58,59], and subsequently identified in a plethora of other unicellular organisms. At present, this technology finds applications in multiple disciplines [60,61]. In dual AAV vectors for gene therapy, one intein is split into two separate segments (i.e. 'split inteins'), each linked to one half of the transgenic protein.
When split inteins find each other, they undergo spontaneous self-excision while mediating ligation of the two protein halves, as long as the first residue of the 3ʹ half is a Cys, Ser, or Thr [62]. With this, the oversized protein is reconstituted without leaving any trace. The earliest report of protein trans-splicing in the context of gene therapy dates back to 2008, when Li et al. demonstrated effective reconstitution of the dystrophin mini-gene [63]. Recently, the Auricchio lab explored the potential of protein trans-splicing for inherited retinal disorders, showing that this technology achieves superior reassembly efficiencies compared to other dual AAV vector strategies [64]. A number of non-cross-reacting natural and artificial split inteins are available to date [65], and their size ranges between 129 and >1000 residues [56]. Although some inteins are rather small, one shortcoming of this strategy is that these elements will still take up space and reduce the room available for the transgene. Consequently, some transgenes may need to be broken down into more than two parts for reassembly.

Single vector strategies
In spite of successful stories in the pre-clinical context, there are important limitations and concerns linked to the use of dual/triple vector strategies in the clinic. As anticipated above, the generation of full-length transgenes occurs at the mercy of reassembly events that can be favored by means of molecular expedients but not fully controlled. The efficiency of transgene reconstitution depends on (i) the simultaneous infection of the target cell with both therapeutic AAVs and (ii) the competence of the cellular molecular machinery mediating reassembly, which may differ from cell type to cell type as well as from individual to individual. As a result, dual strategies typically suffer from reduced efficiency rates. Additionally, the formation of heterogeneous reassembly products or truncated species cannot be excluded, posing severe safety issues around these platforms. Lastly, the production costs linked to these vectors would be double/triple, drastically increasing the already prohibitive prices of gene therapy [66]. For these reasons, the ability to deliver transgenes within a single AAV vector is of major interest to the scientific community involved in gene therapy programs.
Figure 2. Dual strategies split the transgene into two separate portions that are delivered simultaneously, and transgene reassembly occurs upon co-infection following various mechanisms: the two transgene fragments can be re-joined via concatamerization driven by the ITRs; the intervening ITRs are then removed by splicing (A). Reassembly can also be guided by homologous recombination when the transgene halves share a region of homology (B). The hybrid strategy combines both features, increasing the chances of transgene reconstitution (C). Transgene fragments can also be expressed as independent polypeptides, which will be re-joined in a scarless manner in the presence of split-inteins (D). Finally, the full-length transgene can be modified to retain only the most crucial protein domains, thus producing a functional miniature gene fitting within one single AAV vector (E). ITR = Inverted terminal repeat, CDS = Coding sequence, SD = Splicing donor, SA = Splicing acceptor, pA = Polyadenylation tail, HR = Highly recombinogenic sequence.

Mini-gene strategy
One fascinating approach to solving the problem of fitting oversized transgenes into small AAV capsids is to replace the full-length coding sequence of the gene in question with a trimmed version encoding a truncated, but functional protein (Figure 2(e)). This idea was first put forward by the Davies lab upon discovery of the Δ17-48 mini-dystrophin natural protein variant in 1990 [67]. The group found that this in-frame deletion gave rise to a smaller but mostly functional dystrophin protein, associated with a much milder disease course. Several years of follow-up studies led to the development of optimized versions of the original mini-dystrophin gene, which have shown great promise for the treatment of DMD and have reached clinical testing [68]. On the wave of this initial success, a handful of additional mini-genes have been reported. These have been either linked to the identification of naturally occurring protein variants associated with late-onset moderate pathology [69] or artificially designed [70,71]. However, the mini-gene approach poses multiple challenges, as demonstrated by the small number of studies reporting successful stories. Setting up a mini-gene strategy may be daunting when prior knowledge on the target protein is lacking, especially if no truncated variants have been identified as an ideal starting point. Yet, with newly available scientific resources, such as the artificial intelligence program for protein structure prediction AlphaFold [72], the rational generation of artificial mini-genes is emerging as a more realistic possibility. A successful mini-gene strategy requires a thorough understanding of the structure-function correlation of each protein domain, suggesting the need for a highly interdisciplinary workflow combining protein chemistry, structural biology, bioinformatics and molecular biology methods. One major limitation is that truncated proteins do not generally retain the whole functional spectrum of the original full-length protein.
It is arguable, though, that even a partial restoration of protein functions would represent an excellent achievement in the absence of better and/or curative alternatives. Interestingly, it has been shown that in certain contexts, the delivery of truncated proteins can rescue the defective/dysfunctional endogenous protein by a phenomenon called 'transcomplementation.' This was shown, for instance, for cystic fibrosis, where the functionality of the Cystic Fibrosis Transmembrane conductance Regulator (CFTR) protein affected by the F508del mutation could be restored following AAV-mediated administration of truncated CFTR variants as a result of a biomolecular interaction that stabilized the endogenous protein otherwise destined to rapid degradation [73].

Challenges and successes in treating neurological disorders
The nervous system is a network of specialized cells involved in signal acquisition, processing and transmission. Structurally, it is classified into two components: (i) the central nervous system (CNS), comprising the brain, spinal cord, and retina; (ii) the peripheral nervous system (PNS), consisting of nerves (cranial, spinal and peripheral) connecting the CNS to the rest of the body as well as to the surrounding environment. Importantly, other cell types than neurons play a critical role in maintaining the architecture and homeostasis of the nervous system. These cells, collectively referred to as glia (from the Greek 'glue'), support, protect and/or nourish neurons [74]. Particularly, oligodendrocytes and Schwann cells generate the myelin sheath that insulates nerve axons enabling impulse conduction [75]. The microglia, a population of CNS-resident macrophages, patrols the CNS actively releasing signalling molecules involved in the crosstalk among the different cell populations composing the brain [76]. Finally, astrocytes provide axon guidance and synaptic support via the uptake, recycling and release of neurotransmitters; they additionally preserve osmolarity, protect neurons from oxidative stress, and participate in the formation and regulation of the BBB [77].
The BBB is what confers on the CNS its unique characteristics. It is an interface composed of endothelial cells, pericytes and astrocytes that tightly oversees the exchange of ions, molecules and cells between the CNS vasculature and the rest of the circulation [78]. As a result, the CNS microenvironment is shielded by a robust physical barrier, which makes the delivery of large particles (including AAV vectors) to the CNS a relevant challenge. Additionally, the post-mitotic nature of neurons prevents them from being harvested for ex vivo modification followed by re-infusion, as is commonly done for hematological disorders [79]. Therefore, gene replacement therapy for neurological disorders must be performed in vivo, raising the issue of the most appropriate route of administration for optimal biodistribution. Gene therapy approaches can vary significantly depending on the target disease. In some neurological disorders, delivering functional transgenes to a subset of cells is enough to gain therapeutic potential. For example, the eye can benefit from direct ocular AAV delivery, making gene therapy relatively easier to target. The brain, on the other hand, is a considerably larger, more complex and safeguarded structure. AAV injections into the brain parenchyma can mechanically bypass the BBB, but they require invasive neurosurgery and present substantial risks for patients. Serotype vectors with robust neuronal tropism and volumetric spread, such as AAV9, AAV8, and AAV1, are used for this type of administration [80], and could certainly aid with neurological illnesses that predominantly affect a specific brain region (e.g. Parkinson's disease [81]). However, many other conditions would necessitate widespread therapeutic gene transfer throughout the entire CNS. Because certain serotype vectors, such as AAV9, can cross the BBB to transduce both neurons and glia, systemic delivery via intravenous infusion appears to be an appealing option due to its simplicity.
Yet, intra-cisterna magna injection and lumbar puncture have shown a wider biodistribution of AAV9 within the brain and spinal cord, despite the procedural risks and moderate toxicity linked to these methods [82]. The optimal level of transgene expression is another important aspect to consider when designing a gene therapeutic approach. In fact, supraphysiological expression of target proteins could be hazardous, as previously shown in the context of Rett syndrome [83], GM2 gangliosidosis [84], and Spinal Muscular Atrophy [85]. Last but not least, due to the non-proliferative nature of terminally differentiated neurons, any neural tissue lost during disease progression will not be regenerated. Hence, timing is of the essence when approaching disorders of the nervous system. An overview of the considerations surrounding neurological disorders using AAVs is provided in Figure 3.
To date, only two AAV-based products have been approved by the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA) in the field of neurological disorders (Figure 3). Voretigene Neparvovec, also known as Luxturna, received authorization in 2017 (by the FDA) and in 2018 (by the EMA) for the treatment of Leber Congenital Amaurosis (LCA), an inherited retinal disease linked to mutations in the RPE65 gene [86]. Here, clinical efficacy has been supported by recent publications demonstrating patients' improvement in light sensitivity, visual field, and navigational ability under dim lighting conditions for up to 4 years post-treatment [87,88]. In 2019, Onasemnogene Abeparvovec was commercialized under the brand name of Zolgensma to treat children with the lethal motor neuron disease Spinal Muscular Atrophy (SMA) [89]. In spite of its hefty price tag of about $2.1 million in the USA and €1.9 million in Europe, this product is being increasingly welcomed by families desperate to provide a therapeutic opportunity to their severely ill infants [90]. Zolgensma has led to increased probability of survival and improved motor skills in SMA type 1 infants when compared to historical controls [91]. Additionally, a follow-up study has revealed a favorable safety profile of the therapy 6.2 years after dosing, as well as providing evidence of sustained clinical durability [92]. These successes have undoubtedly set an example for the development of other AAV-based gene products, although most experimental vectors are still far from entering clinical testing. This particularly applies to the treatment of neurological diseases caused by mutations in oversized genes, which present one additional degree of complexity when compared to LCA and SMA. Among the different types of neurological disorders, inherited retinal diseases and, more recently, genetic diseases causing hearing loss, have reported the most success, facilitated by the easy accessibility of these target tissues. In fact, the visual and auditory systems represent an approachable interface between the external environment and the CNS. Additionally, there are lower risks of immunogenicity when targeting AAV vectors towards relatively immune-privileged sites, such as the subretinal space and the cochlea [93,94]. Conversely, more complex neurological disorders with systemic involvement have required more intense efforts. Below, we discuss the most successful examples of investigational gene therapies targeting disorders of the nervous system caused by mutations in large genes, which are conveniently summarized in Figure 4.
Figure 3. Considerations for tackling neurological disorders using AAVs. Reported is a summary of the details to consider when approaching neurological disorders using recombinant AAVs as delivery vectors for gene replacement therapy. Advantages/positive aspects are highlighted in green, while the red color code labels disadvantages/aspects of concern. A timeline illustrating the chronology of AAV-based therapeutics targeting neurological disorders approved to date is also included. AAV = Adeno-associated virus, BBB = Blood-brain barrier, CNS = Central nervous system, CSF = Cerebrospinal fluid, ICM = intra-cisterna magna, IP = intraparenchyma, LCA = Leber congenital amaurosis, LP = lumbar puncture, SMA = Spinal muscular atrophy. Figure created with BioRender.com under academic license.

Usher 1B
Usher 1B (USH1, MIM 276900) is an autosomal recessive disease characterized by deafness and blindness resulting from mutations in the MYO7A gene, whose coding sequence spans 6.6 kb [95] and produces a myosin isoform predominantly expressed in cochlear hair cells, photoreceptors and retinal pigmented epithelium [96]. The development of a gene therapy for Usher 1B has been largely hampered by difficulties in identifying robust phenotypes in animal models, especially with respect to retinal dysfunction and degeneration, which is currently untreatable. However, recent advances have led to the identification of reliable readouts of retinal function in Myo7a-null mice, which has rapidly accelerated the gene therapy research output. For instance, subretinal injection of single oversized AAV2/2 and AAV2/5 vectors has shown retinal transduction and phenotype correction, indicating that fragmented genomes could reconstitute the full-length cDNA post-infection [97,98]. To overcome issues of clinical translatability linked to the generation of heterogeneous genomes from oversized vectors, dual AAV2/8 overlapping, trans-splicing and hybrid vectors have been developed and found to induce successful recombination of the transgene halves [99]. Based on these promising results, a consortium has recently been established to evaluate the feasibility of dual AAV strategies for the treatment of Usher syndrome 1B in the clinic (UshTher, https://www.ushther.eu/). In fact, one ambitious goal of UshTher is to assess the potential of dual AAVs in humans for the first time.

Stargardt disease
Stargardt disease (STGD1, MIM 248200) is a common form of hereditary recessive macular dystrophy caused by mutations in the ABCA4 gene, which encodes the retinal-specific ABCA4 protein involved in transporting retinoids from photoreceptors to the retinal pigment epithelium. Losing this protein leads to a progressive bilateral decline in central vision that begins in adolescence, accompanied by lipofuscin deposits around the macula. Because of the large size of the ABCA4 coding sequence (6.8 kb), dual AAV strategies have been explored to deliver the ABCA4 gene product for gene replacement therapy [100]. Subretinal administration of AAV2/8 using trans-splicing and hybrid vector strategies has shown efficient retinal pigment epithelium and photoreceptor transduction [29,30,101]. Intein vectors have also been used, particularly targeting mice, pigs and human retinal organoids. Although some of these studies have revealed amelioration of retinal phenotypes in STGD1 mouse models [29,64,102], the low transduction efficiencies as well as the aberrant production of truncated proteins from single AAV vectors have been two major limitations [103,104]. Including degradation sequences in the 5′-half-containing vector has successfully minimized the untoward generation of truncated proteins [30], achieving improved features that may prove beneficial for the clinical application of dual AAVs to target this disease.

Leber congenital amaurosis 10
Leber congenital amaurosis 10 (LCA10, MIM 611755) is a severe retinal dystrophy, causing blindness/severe visual impairment at birth or during the first months of life. LCA10 is caused by mutations in the CEP290 gene, which encodes a protein required for the correct localization of ciliary and phototransduction proteins in retinal photoreceptor cells. The impossibility of packaging the long CEP290 cDNA (7.4 kb) into a single AAV vector has delayed the development of gene replacement strategies for LCA10. However, because CEP290 is a multi-domain protein, mini-genes designed to maintain essential functional properties have been evaluated. For instance, Zhang and colleagues have identified a human CEP290 domain (miniCEP290 580−1180) that could partially delay photoreceptor loss after subretinal administration of AAV8 in a mouse model of the disease [70]. Another study has favored a 'transcomplementation' strategy, reporting that the Cep290 rd16 mutation could be complemented in trans by a C-terminal CEP290 fragment delivered via AAV8-subretinal injection, which improved visual behavior in test mice [105]. Finally, the use of inteins has also been explored at the experimental level: AAV2/8-CEP290 protein trans-splicing vectors have achieved reconstitution of the large CEP290 protein in the mouse retina, showing ameliorated retinal phenotypes [64]. However, human studies have not been reported to date.

Usher 1D
Usher 1D (USH1D, MIM 601067) is a severe form of autosomal recessive blindness-deafness caused by mutations in CDH23 (cDNA, 10.1 kb), which encodes cadherin-related family member 23, a protein thought to be involved in stereocilia organization and hair bundle formation. Importantly, the cDNA of CDH23 exceeds the packaging limit of both single and dual AAVs. However, triple AAV vectors can increase the genome capacity to up to 14 kb, thus expanding the possibilities for full-length reconstitution of large transgenes [28]. As shown by Maddalena and colleagues, subretinal administration of triple AAV vectors could reconstitute the full-length CDH23 in the mouse retina, although protein expression was found to be weak [106]. The therapeutic effect of this approach remains unknown, as no retinal phenotype was investigated in this study.
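The size reasoning used throughout this section (single, dual or triple vectors depending on cDNA length) can be summarized in a short sketch. The single-vector limit (~4.7 kb) and the triple-vector capacity (up to 14 kb) are taken from the text; the dual-vector capacity of ~9 kb is an assumption made here for illustration (roughly twice the single-vector limit). The function name `min_vectors` and the table `CDNA_KB` are hypothetical helpers, and the comparison ignores the space taken up by promoters, polyadenylation signals and recombinogenic or splicing elements, so real designs have less room than these nominal figures suggest.

```python
# Nominal genome capacities in kb.
SINGLE_KB = 4.7   # from the text (~4.7 kb packaging limit)
DUAL_KB = 9.0     # ASSUMPTION: roughly twice the single-vector limit
TRIPLE_KB = 14.0  # from the text (triple AAV vectors, up to 14 kb)

# cDNA sizes (kb) as quoted in this section.
CDNA_KB = {
    "TSC1": 3.5,
    "TSC2": 5.4,
    "OTOF": 6.0,
    "ABCA4": 6.8,
    "CEP290": 7.4,
    "NF1": 8.5,
    "ATM": 9.1,
    "CDH23": 10.1,
    "ALMS1": 12.5,
}


def min_vectors(cdna_kb):
    """Return the smallest number of AAV vectors (1 to 3) whose nominal
    capacity covers a cDNA of the given size, or None if even the triple
    capacity is exceeded. Regulatory elements are NOT accounted for."""
    for n_vectors, capacity in ((1, SINGLE_KB), (2, DUAL_KB), (3, TRIPLE_KB)):
        if cdna_kb <= capacity:
            return n_vectors
    return None


if __name__ == "__main__":
    # List genes in ascending order of cDNA size.
    for gene, size in sorted(CDNA_KB.items(), key=lambda item: item[1]):
        print(f"{gene}: {size} kb -> {min_vectors(size)} vector(s)")
```

Under these assumptions, CDH23 (10.1 kb) and ALMS1 (12.5 kb) fall in the triple-vector range, in line with the studies cited above, whereas a 150 kb payload such as that of the HSV/AAV hybrid amplicon lies outside all three AAV formats. This is a rough triage only: it says nothing about reconstitution efficiency, which the cited studies identify as the practical bottleneck.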

Alstrom syndrome
Alstrom syndrome (ALMS, MIM 203800) is an autosomal recessive disease, manifesting with a broad spectrum of symptoms, including obesity, insulin resistance, retinal dystrophy and hearing loss. ALMS is caused by mutations in ALMS1 (cDNA, 12.5 kb), a large gene encoding for a protein thought to play key roles in ciliary function, intracellular trafficking, and adipocyte differentiation [107]. Maddalena et al. have reported attempting subretinal injections of triple AAV2/8-ALMS1 vectors in Alms1−/− mice without observing any stable phenotype amelioration [106]. The observed lack of efficacy could be, at least in part, explained by the low levels of transgenic ALMS1. However, the same study showed higher transduction efficiency in the pig than in the mouse retina, suggesting that the combined AAV vector regimen might achieve improved co-infection rates in the subretinal space of larger animals. Further studies will be required to address this observation.

Figure 4. Overview of oversized transgenes for AAV-mediated gene replacement therapy of diseases affecting the nervous system and attempted delivery strategies. Disorders are grouped by target organ/system. Within each section, disorders are further organized in ascending order according to CDS size. Reference studies are cited. A non-comprehensive list of substantial unmet medical needs is also reported: for these diseases, no experimental gene therapy treatment has been published to date. CDS = coding sequence, OS = oversized, TS = trans-splicing, OL = overlapping, Hyb = hybrid, PTS = protein trans-splicing. * = This strategy relies on the use of an HSV-1 plasmid-based vector (up to 150 kb of capacity) carrying the full-length ATM CDS flanked by AAV ITRs. Because HSV-1 amplicons are not particularly stable, AAV sequences were exploited to permit integration of the large transgene packaged into HSV-1 virions within the AAVS1 site. ** = Note that this is not a mini-gene derived from LAMA2 but rather a functional substitute able to structurally compensate for Laminin-α2 deficiency. Images in the table were created with BioRender.com under academic license.

Non-syndromic deafness
Non-syndromic deafness (DFNB9, MIM 601071) is one of the most frequent recessive forms of congenital deafness and it is caused by mutations in the OTOF gene (cDNA, 6 kb) producing the otoferlin protein, essential for inner hair cell exocytosis and vesicle replenishment. Dual AAV vector approaches have conceptually shown that production of the full-length otoferlin protein can be effectively achieved. For example, Al-Moyed and colleagues have employed AAV2-OTOF hybrid vectors for cochlear delivery, demonstrating sustained correction of deafness in Otof−/− mice [108]. Interestingly, when this therapy was administered before hearing onset in mice, it was found to prevent deafness altogether. Another study performed cochlear injections of dual AAV6 trans-splicing or hybrid vectors in Otof−/− mice, achieving 31-37% otoferlin protein levels in the inner hair cells compared to wild-type, which was sufficient to partially restore auditory function in deaf mice [109]. These studies show great promise for the use of dual AAV vectors for gene replacement therapy. An additional approach considered in the context of DFNB9 was an AAV8-mediated mini-gene strategy published by Tertrais et al. However, this method could only partially restore the exocytotic component of inner hair cells, without achieving hearing restoration in mice [110].

Tuberous sclerosis complex
Tuberous sclerosis complex (TSC, MIM 191100 & 613254) is an autosomal dominant multi-system disorder characterized by the appearance of benign tumors in multiple organs (i.e. brain, skin, heart, kidneys, lungs), accompanied by CNS manifestations such as epilepsy, learning deficits and behavioral problems. TSC is caused by loss-of-function mutations in either of the tumor-suppressor genes TSC1 or TSC2, encoding hamartin and tuberin, which normally inhibit mTORC1-mediated cell growth and proliferation. Whereas hamartin's cDNA (3.5 kb) fits into a single AAV vector, tuberin's cDNA is too large (5.4 kb). Recently, a mini-gene approach showed that intravenous injection of AAV9-mini-TSC2 in a TSC2-mouse model greatly extended survival, while reducing brain pathology [111]. This is the first description of an AAV-based gene replacement therapy for TSC type 2, whose further development will be particularly beneficial to TSC type 2 patients currently subjected to a regimen consisting of periodic administration of therapeutic drugs (e.g. rapamycin analogs), as gene replacement therapy would require a single AAV injection.

Neurofibromatosis type 1
Neurofibromatosis type 1 (NF1, MIM 162200) is an autosomal dominant disorder associated with increased risk of tumor growth along nerves throughout the body. NF1 is caused by loss-of-function mutations in the NF1 tumor suppressor gene, which lead to hyperactivation of signaling pathways of cell proliferation and survival. Although the full-length NF1 cDNA is too large for packaging into a single AAV vector (8.5 kb), one of its encoded domains, called 'GTPase activating protein-related domain (GRD),' is thought to be sufficient to deactivate cellular proliferation pathways, and due to its small size (~1 kb), it can be contained within a single AAV vector. Bai and colleagues used a panel of AAV vectors to express various GRD constructs in malignant peripheral nerve sheath tumors as well as in human Schwann cells [112]. This led them to demonstrate that several AAV serotypes could inhibit proliferation (i.e. the Ras pathway) in test cells. In vivo work will be required to obtain full proof of concept for this gene replacement strategy.

Ataxia-telangiectasia
Ataxia-telangiectasia (AT, MIM 208900) is a devastating genetic syndrome involving neurodegeneration, immunodeficiency and cancer predisposition due to mutations in the ATM gene (cDNA, 9.1 kb). The encoded protein kinase, ataxia telangiectasia mutated (ATM), is a master regulator of DNA double-strand breaks and stress responses. In 2008, Cortez and colleagues generated a hybrid amplicon vector in which the ATM cDNA was flanked by AAV ITRs while packaged in Herpes simplex virus 1 (HSV-1) virions [113]. This construct, which also contained the AAV rep gene, could mediate the insertion of the large ATM cDNA into the AAVS1 site of human chromosome 19. Interestingly, this HSV/AAV hybrid vector ensured functional expression of ATM cDNA in AT human cells as well as in AAVS1-transgenic Atm−/− mouse cells. Despite this proof-of-principle, this approach harbors important limitations, such as (i) the potential contamination by helper viruses during production, (ii) the inability of the AAVS1 site to support transgene expression in some cell types, and (iii) the fact that integration in the AAVS1 site disrupts the PPP1R12C gene, which encodes a protein whose function is still unclear. Nonetheless, this 'targeted integration' approach, which can deliver very large transgenes (up to 150 kb) [114], remains an interesting tool worth further exploration.

LAMA2-related congenital muscular dystrophy
LAMA2-related congenital muscular dystrophy, also termed muscular dystrophy type 1A (MDC1A, MIM 607855), is the most common and fatal form of early-onset inherited muscular dystrophy, characterized by hypotonia, delayed motor development and relentless muscle weakening. Importantly, patients also present neurological abnormalities, including lissencephaly and agyria, intellectual disability, and epilepsy [115]. Currently, there is no treatment for this debilitating disease. Mutations in the LAMA2 gene lead to deficiency of laminin α2, a subunit of laminin 2 (merosin). Unfortunately, the length of the laminin α2 cDNA (>9 kb) precludes its packaging into a single AAV vector. A suggested solution to this limitation has been the use of a condensed version of agrin, which is a functional substitute sharing some fundamental properties with laminin α2 (i.e. α-dystroglycan binding, basement membrane-sarcolemma linkage) and thus able to structurally compensate for laminin-α2 deficiency. In 2005, the Xiao lab showed that systemic delivery of AAV1/2-mini-agrin vectors into different muscles of an MDC1A-mouse model quadrupled its lifespan, improving whole-body growth and motility [116]. More recently, the same group upgraded to the AAV9 serotype and showed that AAV9-mini-agrin vectors additionally ameliorated peripheral neuropathy and cognition, thus offering superior therapeutic effects compared to the AAV1 serotype [117]. These studies show that the AAV vector-mediated delivery of mini-agrin is among the most promising experimental approaches for LAMA2-related congenital muscular dystrophy. Further optimization of mini-agrin constructs (e.g. by the use of muscle-specific promoters) will be needed to improve their therapeutic potential.

Duchenne muscular dystrophy
Duchenne muscular dystrophy (DMD, MIM 310200) is an X-linked disease characterized by progressive muscle wasting due to the lack of dystrophin, a protein instrumental for the structural integrity of muscle tissue. Evidence also suggests that intellectual disability is a consistent feature of DMD; in fact, dystrophin is also expressed in cortical neurons [118]. As highlighted in the paragraphs above, research on a gene replacement therapy for DMD has been intense. Yet, the journey to therapy development has been tortuous and in constant evolution. Gene replacement therapy for DMD relies on the delivery of mini or micro-dystrophin genes well below the size of the 14 kb-long dystrophin cDNA. Early research conducted in murine and canine models of DMD [119–121] paved the way for clinical investigation. However, the first mini-dystrophin clinical trial (NCT00428935), completed in 2010, failed, most plausibly because of immune responses developed against the transgene, which was delivered in the biceps muscle using the AAV2.5 vector [122]. Here, mini-dystrophin expression was driven by the strong cytomegalovirus promoter, which may have amplified immune reactions. To overcome these limitations, a revised paradigm currently considers the use of muscle-specific promoters to prevent leaky expression, as well as systemic vector delivery for broader biodistribution via AAV9 and/or AAVrh74, which efficiently target both striated muscle and the CNS [123–125]. Very recently, three pharma companies have taken over the assessment of DMD gene therapy for human use: NCT03368742 (Solid Biosciences, LLC), NCT03769116 (Sarepta Therapeutics, Inc.), NCT03362502 (Pfizer). Preliminary data from two of these trials showed well-tolerated treatment, sustained protein production in muscles, and improved motor function; a handful of serious adverse events were reported within 4 weeks post-dosing, but patients had fully recovered by their latest medical examination (https://mdaconference.org/node/1152 & https://musculardystrophynews.com/pf-06939926/).

Conclusion
Recombinant AAVs have emerged as the platform of choice to deliver gene therapeutic products. However, their modest packaging capacity still constitutes a major setback for the treatment of diseases linked to mutations in large genes. Here, we reviewed the different strategies that have been explored in the last decade to circumvent this limitation, including dual/triple vector strategies (concatamerization, overlapping, hybrid, protein trans-splicing) and single vector strategies (mini-genes). Despite promising results from several studies, only a handful of products generated through these strategies have made it to clinical testing so far, suggesting the need to further improve current methods.

Expert opinion
Most of the research on gene replacement therapy for disorders of the nervous system has been focused on treating inherited retinal diseases. In this context, localized subretinal delivery of therapeutic transgenes has demonstrated efficient photoreceptor engagement using AAV5, 7, 8 and 9 [126]. These achievements have been facilitated by the easy accessibility of this target tissue as well as its circumscribed nature, as opposed to other conditions affecting the nervous system more broadly. As is the case for the eye, the inner ear is a confined compartment of the nervous system that has become the target of increasing gene therapy research efforts for oversized transgenes [127]. Yet, complex neurodegenerative disorders with systemic abnormalities, such as Autosomal Recessive Spastic Ataxia of Charlevoix-Saguenay (ARSACS) [128] and certain forms of Hereditary Spastic Paraplegia (HSP, including SPG11 and SPG15-HSPs) [129], would also benefit from a gene therapeutic approach developed from any of the methods discussed in this review, especially considering the ability of the recombinant AAV genome to persist as an episome in post-mitotic cells, such as neurons, which ensures long-term expression even after a single administration [130–132]. The list of neurological disorders lacking published gene therapeutic options is still long (Figure 4, 'unmet medical needs'). However, the portfolio of strategies aimed at correcting diseases caused by mutations in large genes is slowly but tangibly expanding.
While the strategies described above focus on adapting the cargo to fit the limited size of the AAV capsid, a fascinating possibility to solve the size-limit problem would be to modify the AAV capsid to achieve higher capacity. This approach carries high risks, as the capsid sites that could afford changes without affecting titers, yield and efficacy are very limited. However, systematic screens to mutagenize AAV capsids and evaluate whether amino acid changes may improve the packaging abilities of AAVs have been conducted [22]. Similarly, protein libraries and directed evolution are being combined in an effort to generate novel AAVs endowed with larger packaging ability [133]. Researchers are additionally assessing a series of AAV capsid variants carrying extra lysine and arginine residues within the capsid lumen, which, due to their positive charge, should induce condensation of the DNA payload and indirectly increase vector capacity by cargo compaction [134]. Last but not least, some of the most recent experiments have been inspired by the size polymorphism observed in natural viral capsids. Suggesting that the T = 1 geometry of AAV capsids could be manipulated by means of rational design, Ding and Gradinaru disclosed the generation of engineered 'eXtra Large AAV capsids' (XL-AAVs) [135]. These particles were reported to range between 35 nm and 70 nm in size and to be able to package genomes from multiple serotypes. A thorough characterization of XL-AAVs is ongoing, and upcoming experiments will inform on the feasibility of this potentially groundbreaking approach. If successful, this system would provide a revolutionary universal solution for the delivery of oversized transgenes using a single AAV vector, thus extending the benefits of gene therapy to a much larger number of conditions.
Undoubtedly, the future holds exciting perspectives for this field, suggesting that more and more disorders linked to mutations in large genes may soon benefit from AAV-mediated gene replacement therapies.

Declaration of interests
The authors have no other relevant affiliations or financial involvement with any organization or entity with a financial interest in or financial conflict with the subject matter or materials discussed in the manuscript apart from those disclosed.

Reviewer disclosures
Peer reviewers on this manuscript have no relevant financial relationships or otherwise to disclose.