Identification of potential key pathways, genes and circulating markers in the development of intracranial aneurysm based on weighted gene co-expression network analysis

Abstract Background Intracranial aneurysm (IA) is a disease resulted from weak brain control, characterized by local expansion or dilation of brain artery. This study aimed to construct a gene co-expression network by Weighted Gene Correlation Network Analysis (WGCNA) to explore the potential key pathways and genes for the development of IA. Method Six IA-related gene expression data sets were downloaded from the Gene Expression Omnibus (GEO) database for identifying differentially expressed genes (DEGs). WGCNA was used to identify modules associated with IA. Functional enrichment analysis was used to explore the potential biological functions. ROC analysis was used to find markers for predicting IA. Results Purple, greenyellow and yellow modules were significantly associated with unruptured intracranial aneurysms, while blue and turquoise modules were significantly associated with ruptured intracranial aneurysms. Functional modules significantly related to IA were enriched in Ribosome, Glutathione metabolism, cAMP signalling pathway, Lysosome, Glycosaminoglycan degradation and other pathways. CD163, FCEREG, FPR1, ITGAM, NLRC4, PDG, and TYROBP were up-regulated ruptured intracranial aneurysms and serum, these genes were potential circulating markers for predicting IA rupture. Conclusions Potential IA-related key pathways, genes and circulating markers were identified for predicting IA rupture by WGCNA analysis.

Intracranial aneurysm; key pathways; hub genes; circulating markers; ruptured intracranial aneurysm; unruptured intracranial aneurysm Introdution Intracranial aneurysm (IA) is a disease resulted from weak brain control, characterized by local expansion or dilation of brain artery; typically, cystic aneurysm is the most commonly seen, which is the leading cause inducing spontaneous subarachnoid haemorrhage (SAH) [1]. The overall prevalence rate of unruptured intracranial aneurysm (UIA) among populations with no common disease is 3.2%, with the average age of 50 years old [2,3]. In recent years, IA has been increasingly detected due to the frequent application of imaging technique in clinic [4,5]. Although some IAs remain stable and unbroken, part of them may encounter rupture and endanger life [6]. Previous studies showed the mortality of SAH induced by ruptured intracranial aneurysm (RIA) is approximately 30-40%, moreover, half of survivors may suffer disability [7,8]. Consequently, there is still an urgent need to investigate the pathogenesis of IA for prevention and treatment for this disease, and a reliable circulating marker for predicting RIA can help physicians to prevent the severe consequences induced by RIA.
The Weighted Gene Co-expression Network Analysis (WGCNA) has been verified as an effective method to identify the co-expression modules and hub genes [9][10][11][12][13][14][15][16]. Through WGCNA, Genes with the similar expression pattern into a module can be clustered and the correlation of the module with the phenotype can be analyzed [17,18]. In the present study, WGCNA was employed to identify UIA-related and RIArelated modules for exploring potential IA-related key pathways and hub genes. In addition, some hub genes were also differentially expressed in blood and may be potential circulating markers for predicting RIA.

Data collection and pre-processing
In this study, all data were derived from the Gene Expression Omnibus(GEO) database (PMID: 23193258), and processed using the Rv3.6.0 (https://www.r-project.org/). Six IA-related data sets were downloaded from the database, including GSE6551, GSE13353, GSE26969, GSE75436, GSE106520, and GSE36791. Among them, the GSE6551, GSE13353, GSE26969, and GSE75436 datasets were raw data from the GPL570 platform. These datasets covered 33 IA wall tissue samples and 27 normal arterial wall controls; The gene expression profile data were performed standardized processing using the justRMA method in affy package [19]; The impute package was use to fill the missing values, and the ComBat function in sva package [20] was employed to remove batch effects. The combined data set included 15 RIA samples, 27 UIA samples and 18 control samples. Additionally, the serum gene expression profile data set of GSE106520 [21] was based on GPL16791 platform, including 16 UIAs and 16 controls. The blood gene expression profile data set of GSE36791 data set was based on GPL10558 platform, covering 43 RIAs and 18 controls. The normalize Between Arrays method was applied for the normalization of these two data sets respectively. The two data sets were used to verify the differential expression of hub genes in serum and find potential circulating markers for IA. The workflow of the present study was in Figure 1.

Differentially expressed gene analysis and clustering analysis
After the above-mentioned pre-processing of the several data sets, the limma package was used to screening DEGs between IAs and controls. The DEGs between UIAs and controls, the DEGs between RIAs and controls were screened in the combined data set, respectively. p Value <.05 considered to be significant. Bidirectional hierarchical clustering analysis using the expression values of DEGs based on the Euclidean distance was performed, and the results were presented as the heat maps.
The weighted gene co-expression network analysis WGCNA is a comprehensive set of R functions used to perform analysis of various aspects of weighted correlation networks [18]. WGCNA was performed using the common DEGs between UIA compared to control and RIA compared to control. The aim of WGCNA was to identify the correlation with gene modules and the three phenotypes (control, UIA, and RIA). First, weighted value of person coefficient was used to calculate the correlation any two genes. Then, the hierarchical clustering tree was constructed through the Pearson coefficients between genes, in which the different branches indicated different gene modules, while different colours stood for different modules. The result of correlation with gene modules and the three phenotypes were showed as a heat map. A module with high correlation with the phenotype is the candidate module for further investigation.

Enrichment analysis for gene modules
Gene Ontology (GO) is a database constructed by the Association for Gene Ontology. GO annotations can be classified as three categories, including molecular function (MF), biological process (BP), and cellular components (CC) [22]. Kyoto Encyclopaedia of Genes and Genomes (KEGG) is a comprehensive database integrating the genomic, chemical and systemic functional information, among which, the KEGG Pathway is specifically used to store gene pathway information among different species [23,24]. To explore the biological functions of the candidate gene modules, enrichment analyses including GO and KEGG pathways enrichment analysis were performed using clusterProfiler package [25]. p Value < .05 and q value < 0.05 were set as threshold.

Identification of the Hub genes and circulating markers
Through WGCNA, gene significance (GS) is defined as the correlation a gene with phenotype. The module membership (MM): MM(i) ¼ cor(x i, ME) is defined to measure the importance of a gene in the module. In the present study, a gene with GS > 0.2 and MM > 0.8 was defined as hub gene among the candidate gene modules. In order to find a reliable biomarker for predicting IA, the expression of hub gene was investigated whether the differential expression in blood tissue was consistent with the differential expression in arterial wall tissue. Furthermore, hub genes with consistent differential expression were carried out receiver operating characteristic curve (ROC) analysis to identify the potential circulating markers to diagnose UIA and predict RIA. The ROC analysis was performed using pROC package [26]. Through WGCNA analysis, key pathways, genes related to IA and potential circulating markers were identified that may predict aneurysm rupture.

DEG and clustering analysis
According to the p value < .05, we found that 783 genes were significantly up-regulated and 1097 genes were significantly down-regulated in UIA (Figure 2(A)). While 711 genes were significantly up-regulated and 1020 genes were significantly down-regulated in RIA (Figure 2(B)). The most up-regulated 50 genes and the most down-regulated 50 genes were used to perform cluster analysis, the results showed that the expression patterns of these 100 DEGs can basically distinguish IA and normal control (Figure 2(C,D)).

The results of WGCNA
In the present study, 2559 common DEGs in RIA and UIA were used to perform WGCNA (Figure 3(A)). Prior to WGCNA, we had carried out sample clustering analysis, in which two outlier samples were discovered and eliminated (Figure 3(B)).
With regard to the WGCNA parameter setting, R 2 ¼ 0.85, b ¼7, the minimum gene number in gene module was 30, and the cutting height was set at 0.75. Finally, 12 modules (Figure 3(C)) were identified and each module was assigned with a unique colour (Figure 3(D)). Three modules (purple, greenyellow, and yellow) were markedly correlated with the UIA, while two modules (blue and turquoise) were evidently related to RIA (Figure 3(E)). The genes within modules were significantly correlated with the modules (Figure 4).

Enrichment analysis
In order to explore the biological functions of these five candidate gene modules, GO and KEGG pathway enrichment analysis were performed. Enrichment results ( Figure 5) in GO biological process (BP) showed that, the formation of IA might be related to neutrophil degranulation, neutrophil activation involved in immune response, neutrophil mediated immunity, and neutrophil activation. In addition, the results of KEGG pathway enrichment analysis revealed that, modules related to UIA were involved in the pathways such as Glutathione metabolism, Biosynthesis of amino acids, cAMP signalling pathway, Ras signalling pathway, cGMP-PKG signalling pathway, Glycosaminoglycan degradation and Carbon metabolism. While modules related to RIA were involved in the pathways such as Ribosome, Spliceosome, Lysosome, Phagosome, Toll-like receptor signalling pathway, Osteoclast differentiation and NOD-like receptor signalling pathway. These results indicate that the mechanism of IA formation and rupture may be different.

Identification of the hub genes and IA biomarkers
Among the above five modules, 101 hub genes were identified according to GS >0.2 and MM >0.8. The blood gene expression profile of GSE106520 indicated that some hub genes were also differentially expressed in blood in IAs. Figure 6(A) is the heat map for the expression of the 101 hub genes in these three data sets. The expression of 24 hub genes in blood were consistent with the differential expression of artery wall tissues in RIA (Figure 6(B)). While, in the blood gene expression profile GSE36791 from UIA, these hub genes were not significant differential expression, except ALOX5 ( Figure 6(C)). The ROC curves showed that CD163, FCEREG, FPR1, ITGAM, NLRC4, PDG and TYROBP1 were potential circulating markers for predicting RIA with high AUC values ( Figure 6(D)). ROC curve for ALOX5 indicated that the diagnostic value for UIA may be not well (Figure 6(E)).

Discussion
Multiple surveys have indicated the morbidity of IA is estimated to be as high as 10%. Generally, medium and small IAs are asymptomatic prior to rupture. However, patients with IA will encounter severe SAH in the case of IA rupture [27,28], while SAH is a devastating subtype of stroke with high risk of neurological deficit and even death [29,30]. Therefore, to identify the key potential pathways, genes and circulating markers in the development of IA is very meaningful. In the present study, the DEGs related to IA were identified for carrying out WGCNA. The gene modules correlated with IA formation and rupture were also identified using WGCNA. According to the results, a total of 12 gene modules were identified. Interestingly, we found that three modules (purple, greenyellow and yellow) were markedly correlated with UIA, while the blue and turquoise modules were evidently correlated with RIA. Such results suggested that IA formation and progress might be resulted from different functional modules.
In order to prevent the progression of IA, the mechanism of IA rupture should be further studied. In our study, five interested gene modules were elected to carry out the functional enrichment analysis according to the correlation between phenotype and disease. And the result of enrichment analysis shows that modules related to UIA and RIA were involved in different biological process and pathway, which demonstrated that the mechanism of IA formation and rupture may be different. Previous studies have shown, IA is related to immune response, regulation of immune response, lysome, cGMP PKG signalling pathway, Toll-like receptor signalling and lysosome pathway [31][32][33], which were also confirmed in our study. Recent studies have demonstrated, Lysosomes not only play a central role in cell decomposition, but also participate in metabolism, membrane repair and cell death [34]. And in our study, lysosome pathway was enriched in module related to RIA, which suggests the degradation of the components of the unruptured aneurysm wall may be a process leading to rupture. In addition, many studies showed that IA is also closely related to immune/inflammatory response [31,32,35,36]. Inflammation is one of the crucial players in the pathophysiology of IA, inflammation and immune response could potentially contribute to the formation of IA [37]. However, the specific mechanism of immune/ inflammation in the formation and rupture of IA is not clear. Additionally, we also found Ribosome, Glutathione metabolism, cAMP signalling pathway and Glycosaminoglycan degradation were probably related to the formation and rupture of IA, but more detailed mechanism of IA rupture needs to be further explored.
Moreover, we identified 101 hub genes from five interested gene modules, which subsequently were used to ROC curve analysis. Then the results of ROC curves showed that CD163, FCEREG, FPR1, ITGAM, NLRC4, PDG and TYROBP were potential circulating markers for predicting RIA. Among them, CD163 is a scavenger receptor for hemoglobin-haptoglobin complex phagocytosis [38], it is highly expressed in degenerated and ruptured IA walls [39]. The upregulation of FPR1 can induce the synthesis of neutrophils mediated by Mac-1 through fpr related Gi protein and b-arrestin signalling pathway, thus aggravating the development of abdominal aortic anerysm (AAA) [40], which suggest it may also participate in the development of IA. In addition, NLRC4 has been found to play an important role in human autoinflammation, TYROBP was also studied in inflammatory diseases [41], but whether these two genes were related to inflammation of IA rupture needs further study. The identification of these genes can help IA patients to predict whether there is a danger of rupture in aneurysms, and remind doctors to intervene as soon as possible, which is of great significance.
Though our study provided new insights in IA, the potential key pathways and genes were identified based on bioinformatics method and need to be validated in molecular experiment. Therefore, it is not clear whether these genes are causal or merely markers for UIA or RIA.

Conclusion
There has no reliable circulating markers to precisely diagnose IA at present. In the present study, the hub genes related to IA formation and progress were identified, their predictive value for IA were explored. Gene CD163, FCEREG, FPR1, ITGAM, NLRC4, PDG, and TYROBP are up-regulated in the both arterial wall tissues and blood tissues, and these gene may potentially serve as the circulating markers for predicting RIA.