Rare genetic variants prioritize molecular pathways for semaphorin interactions in Alzheimer’s disease patients

Abstract Contemporary genetic methods have not yet solved the ‘missing heritability’ problem of complex diseases such as Alzheimer’s disease (AD). The impact of rare or less common variation on human complex diseases and traits has to date barely been investigated. In this study, rare population variants detected using whole-exome sequencing were employed to examine how molecular pathways are prioritized in four groups: AD patients, frontotemporal dementia (FTD) patients, young and healthy individuals, and centenarians. The set of prioritized genes in AD patients was associated with Semaphorin interactions pathways, in contrast to the results for the other groups. We identified rare pathogenic variants, likely pathogenic variants and variants of unknown significance in these prioritized genes in AD patients. The results of this study offer evidence that semaphorin pathways play a role in AD genetic etiology.


Introduction
Pathological changes in brain structure may commence twenty years before the clinical manifestations of Alzheimer's disease (AD) [1][2][3][4]. At the onset of the clinical phase, which can last years, symptoms of mild cognitive impairment (MCI) develop as a result of synapse loss [5]. Neuron loss continues as the disease progresses, resulting in a 15-40% loss in hippocampal volume and in inflammatory processes in the hippocampus and cerebral cortex [6]. These neurodegenerative processes have severe clinical manifestations, e.g. impaired memory, cognition and language, and in due course lead to dementia [7,8].
Some of the pathological alterations that occur in the brain are associated with proteopathies. According to the 'amyloidogenic hypothesis' the accumulation of amyloid plaques in the extracellular matrix triggers a cascade of pathological processes leading to neuronal dysfunction and cell death [9]. Complex molecular interactions among different immune responses and disorders of regulatory pathways in synapses have been established as causes for AD [10,11].
Studies on AD patients have revealed a large number of pathogenic, fully penetrant mutations in the genes APP, PSEN1 and PSEN2 that cause autosomal dominant forms of AD. The ε4 allele of the APOE gene is a major genetic risk factor for both early- and late-onset AD [12]. Mutations in the MAPT and TREM2 genes have also been confirmed as risk factors [13]. Massively parallel resequencing techniques have established at least 21 additional loci associated with AD [14,15].
The SORL1 gene encodes the Sortilin-related receptor LR11/SorLA, which is involved in the control of amyloid beta peptide production. Studies have observed an increased risk of developing late-onset and autosomal dominant early-onset AD associated with common and rare SORL1 variants [16]. These variants are associated with AD as they likely disrupt the processing of the APP protein [17,18]. A study on a Belgian cohort of AD patients found several ABCA7 loss-of-function mutations at higher frequency in patients than in a control group, and one frameshift mutation (glu709fs) was found to segregate with disease in a family with autosomal dominant inheritance of Alzheimer's disease [19].
The methods that are widely used in the postgenomic era, i.e. case/control genome-wide association studies and whole genome/exome sequencing of families, have not solved the problem of 'missing heritability', i.e. a large number of variants predisposing to complex diseases remain undiscovered. It has been proposed for several polygenic traits that rare variants contribute a substantial proportion of this 'missing heritability' [20,21]. The role of rare and low-frequency genetic variants (MAF < 1%) in complex diseases is, however, hard to establish, as they usually remain statistically undetected in association studies [22][23][24].
The impact of rare or less common gene variants on complex human diseases and traits therefore remains unclear to date. Targeted sequencing performed on Chinese AD families found a number of rare coding variants in risk genes for neurodegenerative diseases to be significantly associated with AD, including rare variants in the PLD3, LRRK2, ABCA7, TREM2 and FUS genes [25]. A search for low-frequency coding variants with large effects on late-onset AD in risk families identified a rare variant in the PLD3 gene which increased the risk for late-onset AD twofold, along with evidence that the PLD3 gene influences APP processing [26]. Genome-wide searches have established rare variants in relatively few genes, e.g. TREM2, AKAP9, UNC5C, ZNF655, IGHG3 and CASP7, to be associated with AD [27].
Studying individual genes, however, cannot reveal the complex interactions among them. It is not yet clear how rare coding variants alter protein structure and function and trigger the pathogenic mechanisms of neurodegenerative disorders. We speculate that the effect of common variants is combined with that of rare variants in molecular pathways and alters biological processes. Determining the role of Mendelian and non-Mendelian diseases in molecular pathogenesis is also hindered by a lack of functional or family history data. In silico predictions, the well-established American College of Medical Genetics and Genomics (ACMG) criteria for variant classification, and allele frequency considerations are tools that are used to determine variant pathogenicity [28]. The ACMG criteria include: population frequencies of variants in genome databases, computational and in silico predictions, functional data, segregation analysis in family pedigrees, allelic data and the patient's phenotype.
The aim of the present study was to explore the role of rare variants, obtained by whole-exome sequencing (WES), in biological processes and molecular pathways leading to AD pathogenesis. Our analysis is based on comparing the WES results of AD patients with those of frontotemporal dementia (FTD) patients, a centenarian group, and young and healthy controls.

Ethics statement
This study was approved by the ethics committee of the Medical University of Sofia, adhering to national and international legislation.

Subjects
AD and FTD patients had been diagnosed by a team of clinical neurologists using standard diagnostic approaches. Compiling a control group for this study was complicated by the fact that the age of AD onset varies widely, and that it is difficult to predict the risk of healthy individuals developing the disease at a later age. We therefore chose to use two types of controls, i.e. young and healthy individuals and Bulgarian centenarians. Individuals with no AD symptoms were sequenced in the centenarian group, after they were examined by a clinical team that included gerontologists in order to assess their physical and mental status. A questionnaire was devised to obtain information about their lifestyle, medical history, neurological status, movement independence, cardiovascular disease, cancer, diabetes, etc. The control group consisted of young healthy individuals, free of acute or chronic diseases.

DNA extraction and sequencing
Whole-exome sequencing was performed on four DNA pools composed of equimolar amounts of DNA extracted from: 66 AD patients, 70 FTD patients, 32 Bulgarian centenarians (100-106 years old) and 61 young and healthy individuals (25-30 years old). DNA was extracted using the QIAamp DNA Blood Mini Kit (Qiagen). The AD and FTD patients' pools were whole-exome sequenced using the Agilent SureSelect Human All Exon V6 kit (Agilent Technologies, CA, USA), whereas the centenarian and young control pools were sequenced using BGI v4 chemistry on a BGISEQ-500 platform (BGI Genomics) at a mean 250× coverage. Such high coverage is required for pool-seq sequencing to ensure that alleles with low frequency are also detected [29].
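The relationship between pool size and coverage can be illustrated with a short calculation. The sketch below is not part of the study pipeline; it assumes diploid autosomal loci, perfectly equimolar pooling and no sequencing error, and it applies the stated mean 250× coverage to all four pools purely for illustration. The function name `singleton_expectations` is hypothetical.

```python
# Illustrative arithmetic only: expected read support for an allele carried
# by a single heterozygous individual in each pool.
# Assumptions (not from the study): diploid autosomal locus, perfectly
# equimolar pooling, no sequencing error, 250x coverage for every pool.

POOLS = {"AD": 66, "FTD": 70, "centenarians": 32, "young controls": 61}
MEAN_COVERAGE = 250


def singleton_expectations(n_individuals, coverage):
    """Return (chromosomes in pool, singleton allele frequency, expected reads)."""
    chromosomes = 2 * n_individuals          # two copies per diploid individual
    frequency = 1 / chromosomes              # one copy of the allele in the pool
    expected_reads = coverage * frequency    # mean number of reads carrying it
    return chromosomes, frequency, expected_reads


if __name__ == "__main__":
    for name, size in POOLS.items():
        chrom, freq, reads = singleton_expectations(size, MEAN_COVERAGE)
        print(f"{name}: {chrom} chromosomes, "
              f"singleton frequency {freq:.4f}, "
              f"expected supporting reads {reads:.2f}")
```

Under these assumptions an allele present once in the 66-patient AD pool has an in-pool frequency of 1/132 and would be supported by roughly two reads on average at 250×, which is why lower coverage would leave such alleles largely undetected.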

Bioinformatics analysis
Following the 'best practice' recommendations for pool-seq data, we initially filtered all pools as follows: total depth of coverage ≥ 30, mapping quality ≥ 60, supporting base quality ≥ 30, number of reads per MAF ≥ 2, fraction of observations supporting an alternate allele ≥ 0.0001. For further analyses we selected exonic and splicing variants, frameshift/non-frameshift insertions, non-synonymous SNVs, startloss, stopgain and stoploss variants. As we based our analysis on variants with very low population frequencies, we selected variants with European population frequencies of MAF ≤ 0.001 (non-Finnish European genomes [30]) that were nevertheless enriched in the AD and FTD pools (MAF ≥ 0.01). From the centenarian and young control pools, we selected variants with MAF ≤ 0.01 and number of reads per MAF ≥ 4. The VarSome database was used to classify variants according to their clinical significance [31]. The Reactome online platform [32] was used to establish and graphically visualize molecular pathways that might be affected by the set of genes carrying rare variants.
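The filtering criteria for the patient pools can be expressed as a simple predicate. The following is a minimal sketch rather than the pipeline actually used: the `PoolVariant` record, its field names and the effect labels are hypothetical, and "number of reads per MAF" is interpreted here as the number of reads supporting the minor allele, which is an assumption.

```python
from dataclasses import dataclass


@dataclass
class PoolVariant:
    """Hypothetical record for one variant call in a sequenced pool."""
    depth: int              # total depth of coverage at the site
    mapping_quality: float  # mapping quality of the supporting reads
    base_quality: float     # base quality of the supporting bases
    alt_reads: int          # reads supporting the minor allele (assumed meaning)
    alt_fraction: float     # in-pool fraction of reads with the alternate allele
    effect: str             # annotated consequence (labels below are illustrative)
    population_maf: float   # MAF in the reference population (non-Finnish Europeans)


# Illustrative labels for the variant classes retained in the text.
KEPT_EFFECTS = {
    "exonic", "splicing", "frameshift_insertion", "nonframeshift_insertion",
    "nonsynonymous_SNV", "startloss", "stopgain", "stoploss",
}


def passes_quality(v: PoolVariant) -> bool:
    """Initial quality filter applied to all pools."""
    return (v.depth >= 30 and v.mapping_quality >= 60
            and v.base_quality >= 30 and v.alt_reads >= 2
            and v.alt_fraction >= 0.0001)


def selected_in_patient_pool(v: PoolVariant) -> bool:
    """Rare in the reference population but enriched in an AD or FTD pool."""
    return (passes_quality(v) and v.effect in KEPT_EFFECTS
            and v.population_maf <= 0.001 and v.alt_fraction >= 0.01)


if __name__ == "__main__":
    example = PoolVariant(depth=240, mapping_quality=60, base_quality=35,
                          alt_reads=5, alt_fraction=0.02,
                          effect="nonsynonymous_SNV", population_maf=0.0004)
    print(selected_in_patient_pool(example))
```

Keeping the quality filter separate from the frequency-based selection mirrors the two-stage description above and would allow the same quality predicate to be reused for the control pools with their own thresholds.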

Results
Whole-exome sequencing established 303,135-368,090 variants in the four pools analyzed. The number of selected genes with rare population variants was 2299 in the AD patients' pool, 2221 in the FTD patients' pool, 2685 in the centenarian pool and 2605 in the control group of young and healthy individuals. The prioritized pathways related to developmental biology for Alzheimer's patients contrasted with those of the other analyzed clinical and control groups (Figure 1).
The seven genes carrying pathogenic variants and VUSs were input into the Reactome platform in an effort to obtain a more accurate picture of the molecular pathways associated with the Semaphorin interactions pathway in AD patients (Figure 2 and Table 1).

Discussion
The aim of the present study was to examine if and how rare variants impact gene prioritization of molecular pathways in AD patients, FTD patients, centenarians with no AD symptoms and young and healthy individuals. Our results suggest that in AD patients the enriched genes are associated with Semaphorin interactions, molecular pathways that form part of nervous system development.
Changes in neural connectivity and loss of synaptic contacts are an indication of neurodegenerative disease pathogenesis. It has been suggested that aberrant semaphorin expression or function may result in altered neuronal connectivity or synaptic function associated with a number of developmental and degenerative neural disorders [33]. Such a role for semaphorins is most strongly supported by experimental data for AD and ALS (amyotrophic lateral sclerosis). Compelling evidence for semaphorins playing a role in AD is also provided by the discovery of a multiprotein complex from brain tissue of AD patients that contains the phosphorylated microtubule-associated proteins 1B, Sema3A, CRMP-2, plexinA1 and -A2 [34]. Semas and their related receptors have been implicated in developmental and adult-onset nervous system diseases including AD [35]. Growing axons are highly adaptable at the growing point, which senses environmental guidance cues and responds by undergoing cytoskeletal changes. Semaphorins are highly conserved families of guidance molecules that guide axons [36]. Semaphorins signal via receptor complexes that include other proteins such as plexins. Interactions between different subfamilies of plexins and semaphorins show differential specificity and trigger different sets of biological functions [37]. Our results show that AD patients carry the pathogenic variant rs104894002 in the gene TREM2, previously established in another study of AD patients [38], along with two rare VUSs: rs201258314 and rs374851046. TREM2 gene mutations have been shown to increase the risk of neurodegenerative conditions such as AD, ALS and Parkinson's disease [27]. The TREM2 protein is part of a complex network that includes PLXNA1 and the semaphorin SEMA6D (Figure 3). Additional studies are needed in order to elucidate which rare pathogenic variants and rare VUSs play a role in AD genetic etiology.
It is conceivable that such variants in the TREM2 gene, in interaction with PLXNA1 and SEMA6D variants, are implicated in molecular pathways associated with AD pathogenesis. The PLXND1 gene carries the likely pathogenic variant rs746887877 along with 6 VUSs. This gene plays an important role in cell-cell signaling, in regulating the migration of many cell types and in ensuring the specificity of synapse formation. Plexin-D1 is a cell surface receptor for the proteins SEMA4A, SEMA3C, SEMA3F, SEMA3E and SEMA3G (Figure 4). Enriched rare VUSs are found in semaphorin genes associated with PLXND1, e.g. SEMA4A, SEMA3C, SEMA3F and SEMA3G. It can be speculated that the likely pathogenic variant and an array of VUSs in the PLXND1 gene, along with the VUSs in the semaphorin genes, have the potential to disrupt crucial biological processes in the brain, leading to AD.
It is worth mentioning that microRNAs (miRNAs) play a key role in growth, differentiation and cell death by suppressing one or more target genes. The activity of miRNAs in the brain is affected by various factors such as electromagnetic fields, which may diminish the significance of AD-associated variants carried by the subject. There is evidence that long-term exposure to certain radiofrequencies alters miRNA expression in the brain and may lead to adverse effects such as the development of neurodegenerative diseases, e.g. AD, but this evidence is inconclusive and more studies should be devoted to this topic [39,40].

Conclusions
Our results suggest differential molecular pathway prioritization in AD patients compared to FTD patients, centenarians with no AD symptoms and young and healthy individuals. The presence of rare pathogenic variants, likely pathogenic variants and variants of unknown significance in genes associated with semaphorin interactions in AD patients suggests the involvement of these pathways in the complex picture of genetic disturbances in the molecular pathogenesis of Alzheimer's disease.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work is funded by the National Science Fund of Bulgaria, project KP-06-N33/5 from 13.12.2019.

Data availability statement
Data are available upon reasonable request from the authors.