Potential therapeutic target genes for systemic lupus erythematosus: a bioinformatics analysis

ABSTRACT Systemic lupus erythematosus (SLE) is a chronic autoimmune disease involving multiple organs. However, the underlying etiology and mechanisms remain unclear. This study was performed to identify potential therapeutic targets for SLE using bioinformatics methods. First, 584 differentially expressed genes were identified based on the GSE61635 dataset. Tissue-specific analysis, enrichment analysis, and protein-protein interaction (PPI) network analysis were then conducted in succession. Furthermore, ELISA was performed to confirm the expression levels of key genes in control and SLE blood samples. The findings revealed that tissue-specific expression of markers of the hematological system (25.5%, 28/110) varied significantly. CCL2, MMP9, and RSAD2 expression was markedly increased in the SLE samples compared with controls. In conclusion, the identified key genes (CCL2, MMP9, and RSAD2) may serve as therapeutic targets for the treatment of SLE.


Introduction
Systemic lupus erythematosus (SLE) is a polysystemic autoimmune disease involving multiple organs [1]. Epidemiological studies have suggested that the 10-year survival rate of patients with SLE is 90%, and 25% of deaths are caused by thrombotic events or concurrent infections [2][3][4]. Decades of research have revealed that genetic, immune, and environmental factors participate in the pathogenesis of SLE [5][6][7][8]. However, the precise pathogenic mechanisms underlying SLE remain to be fully elucidated. Currently, there is no cure for SLE, and the treatment mostly relies on nonsteroidal anti-inflammatory drugs (NSAIDs) and immunosuppressants to relieve symptoms.
The Gene Expression Omnibus (GEO) database contains gene expression profiles generated predominantly using DNA microarray technology [9,10]. This study aimed to explore the potential hub genes and underlying mechanisms in SLE using bioinformatics methods. Raw data from microarray analyses conducted on SLE samples and healthy controls were downloaded from the GEO database. Enrichment analysis, the BioGPS database, the STRING database, and protein-protein interaction (PPI) network analysis were used to identify key genes. Validation of the selected key genes provides a basis for novel insights into the mechanisms underlying SLE and for new approaches to SLE therapeutic intervention.

Data source
Microarray dataset GSE61635 was obtained from the Gene Expression Omnibus (GEO) database (www.ncbi.nlm.nih.gov/geo/). It is based on the GPL570 platform (HG-U133_Plus_2) and comprises 99 SLE blood samples and 30 healthy control samples.

Data processing
Raw data were processed and analyzed using R (version 4.0.2). The median value of each sample was normalized between arrays using the limma package, with background correction. Robust multichip average (RMA) expression measures were then generated, and perfect-match intensities from the raw data were log-transformed. Genes with FDR <0.05 and |log2 fold change (FC)| >1 were considered differentially expressed genes (DEGs) [11]. DEGs were plotted as volcano plots and a heatmap using the ggplot2 and pheatmap R packages, respectively.
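The DEG thresholds above can be illustrated with a minimal Python sketch. It assumes that the FDR is a Benjamini-Hochberg adjusted p-value (the text does not name the adjustment method), and the gene names and values are toy placeholders, not values from GSE61635. The functions `benjamini_hochberg` and `select_degs` are hypothetical helpers written for this illustration only.

```python
from typing import Dict, List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def select_degs(
    log2fc: Dict[str, float],
    pvalues: Dict[str, float],
    fdr_cutoff: float = 0.05,
    lfc_cutoff: float = 1.0,
) -> Dict[str, str]:
    """Keep genes with FDR < fdr_cutoff and |log2 FC| > lfc_cutoff."""
    genes = list(log2fc)
    fdr = benjamini_hochberg([pvalues[g] for g in genes])
    selected = {}
    for gene, q in zip(genes, fdr):
        if q < fdr_cutoff and abs(log2fc[gene]) > lfc_cutoff:
            selected[gene] = "up" if log2fc[gene] > 0 else "down"
    return selected


# Toy input (hypothetical genes and values).
toy_log2fc = {"GENE_A": 2.4, "GENE_B": -1.6, "GENE_C": 0.3, "GENE_D": 1.2}
toy_pvalues = {"GENE_A": 0.0001, "GENE_B": 0.004, "GENE_C": 0.0005, "GENE_D": 0.2}
toy_degs = select_degs(toy_log2fc, toy_pvalues)
print(toy_degs)
```

In this toy input, GENE_C is excluded by the fold-change cutoff despite a small adjusted p-value, and GENE_D is excluded by the FDR cutoff despite a sufficient fold change, which shows that both criteria must hold at once.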

Tissue-specific gene expression analysis
The tissues in which a gene is specifically expressed provide information about its function. To screen for tissue-specific DEGs, the BioGPS database (http://biogps.org/#goto=welcome) was used [12]. Transcripts mapped to a single tissue were considered highly tissue-specific if both of the following criteria were met: (a) median expression > 30 times the median expression of all other tissues; (b) the highest expression level was at least threefold higher than the second-highest expression level. GO [13] and KEGG [14] pathway enrichment analyses of the DEGs were performed using the DAVID 6.8 online database (http://david.abcc.ncifcrf.gov/) [15]. Statistical significance was set at P < 0.05.
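The two tissue-specificity criteria can be sketched as a simple filter. This is one possible reading, not the BioGPS implementation: it assumes that criterion (a) compares the expression value in the top tissue with the median across all remaining tissues, and that each tissue is summarized by a single expression value. The function `specific_tissue` and the tissue profiles are hypothetical.

```python
from statistics import median
from typing import Dict, Optional


def specific_tissue(
    expr: Dict[str, float],
    median_fold: float = 30.0,
    runner_up_fold: float = 3.0,
) -> Optional[str]:
    """Return the tissue a transcript is specific to, or None.

    expr maps tissue name -> expression value for one transcript.
    """
    if len(expr) < 2:
        return None
    ranked = sorted(expr, key=lambda t: expr[t], reverse=True)
    top, second = ranked[0], ranked[1]
    others_median = median([expr[t] for t in ranked[1:]])
    # (a) top tissue exceeds 30x the median of all other tissues
    passes_a = expr[top] > median_fold * others_median
    # (b) top tissue is at least 3x the second-highest tissue
    passes_b = expr[top] >= runner_up_fold * expr[second]
    return top if (passes_a and passes_b) else None


# Toy profiles (hypothetical values).
profile_specific = {"whole_blood": 900.0, "liver": 20.0, "lung": 25.0,
                    "kidney": 15.0, "brain": 10.0}
profile_shared = {"whole_blood": 900.0, "liver": 400.0, "lung": 25.0,
                  "kidney": 15.0}

print(specific_tissue(profile_specific))
print(specific_tissue(profile_shared))
```

The second profile passes criterion (a) but fails criterion (b) because a second tissue expresses the transcript at a comparable level, so it is not assigned to any single tissue.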

Identification of key genes
STRING (https://string-db.org/) was used to construct the PPI network [16], with the confidence score set at ≥0.4. Cytoscape v3.7.2 and the CytoHubba plugin (version 0.1) were used to visualize the PPI network and identify hub genes. The top 20 hub genes were obtained by ranking nodes with the closeness algorithm. A Venn diagram was then drawn to identify the key genes shared between the hub genes and the tissue-specific genes.
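As an illustration of the hub-gene step, the sketch below ranks nodes of a small undirected network by closeness and intersects the top-ranked nodes with a tissue-specific gene set, which is what the Venn diagram expresses. It assumes a closeness score defined as the sum of reciprocal shortest-path distances; the exact formula applied by the plugin is not stated in the text. The network, node names, and gene set are toy placeholders.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def build_graph(edges: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected adjacency map from an edge list."""
    graph: Dict[str, Set[str]] = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def closeness(graph: Dict[str, Set[str]], source: str) -> float:
    """Sum of 1/distance from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    return sum(1.0 / d for node, d in dist.items() if node != source)


def top_hubs(graph: Dict[str, Set[str]], n: int) -> List[str]:
    """Return the n nodes with the highest closeness (ties broken by name)."""
    scores = {node: closeness(graph, node) for node in graph}
    return sorted(scores, key=lambda g: (-scores[g], g))[:n]


# Toy network (hypothetical nodes and interactions).
toy_graph = build_graph([
    ("NODE_A", "NODE_B"), ("NODE_A", "NODE_C"), ("NODE_A", "NODE_D"),
    ("NODE_B", "NODE_C"), ("NODE_D", "NODE_E"),
])
toy_hubs = top_hubs(toy_graph, 2)
toy_tissue_specific = {"NODE_D", "NODE_E", "NODE_X"}
key_genes = set(toy_hubs) & toy_tissue_specific
print(toy_hubs, key_genes)
```

In the study itself the ranking was cut at the top 20 nodes rather than the top 2 used here for brevity.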

ELISA
The experimental protocol was approved by the Ethics Committee of the Second Affiliated Hospital of Nanchang University in compliance with the Declaration of Helsinki. SLE and normal subjects were informed orally of the study content. Two milliliters of blood was collected from each subject and anticoagulated with EDTA. Serum samples were collected by centrifuging the blood samples at 2000 rpm for 10 min at 4°C. All ELISA kits (CCL2, MMP9, GATA1, and RSAD2) were used according to the manufacturer's instructions (MEIMIAN, Jiangsu Biological Industrial Co., Ltd., China).

Statistical analysis
A minimum of three replicates were performed for each experiment, and data are presented as the mean ± SD. Statistical analyses were performed using GraphPad Prism 8 (GraphPad Software, San Diego, USA). Comparisons between groups were performed using an unpaired t-test. Statistical significance was set at p< 0.05.
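The group comparison can be sketched as follows. This minimal example computes the test statistic and degrees of freedom of an unpaired t-test under the assumption of equal variances (the pooled form); the text does not state whether a correction for unequal variances was applied. The values are toy placeholders, not ELISA measurements, and converting t to a p-value additionally requires the t distribution, which is omitted here.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def pooled_t_statistic(
    group_a: Sequence[float], group_b: Sequence[float]
) -> Tuple[float, int]:
    """Return (t, degrees of freedom) for an equal-variance unpaired t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    standard_error = sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    t = (mean(group_b) - mean(group_a)) / standard_error
    return t, df


# Toy measurements (hypothetical values, arbitrary units).
toy_control = [1.0, 2.0, 3.0]
toy_sle = [4.0, 5.0, 6.0]
t_value, dof = pooled_t_statistic(toy_control, toy_sle)
print(round(t_value, 4), dof)
```

A positive t here means the second group has the higher mean, matching the direction of comparison reported for the SLE group relative to controls.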

Results
To explore potential therapeutic targets for SLE, bioinformatics methods were used to identify DEGs. We next performed tissue-specific gene expression analysis and enrichment analysis and constructed a PPI network. Finally, the selected key genes were verified using ELISA. These results may inform the targeted therapy of SLE and enrich our understanding of its pathogenesis.

Differential expression analysis
In total, 99 patients with SLE and 30 normal subjects were included in this analysis. Microarray data of the GSE61635 dataset were standardized (Figure 1). With the cutoffs set at FDR <0.05 and |log2 (FC)| >1, 584 DEGs were identified (Figure 2). The 19 most significantly differentially expressed genes between the two groups were extracted using cutoffs of |log2 (FC)| >3 and FDR <0.005 (Figure 2(a)).

Enrichment analysis of DEGs
GO analysis was conducted using the DAVID software. Enriched GO terms were divided into three categories: biological process (BP), cellular component (CC), and molecular function (MF). As shown in Table 2 and Figure 3, the DEGs were mainly enriched in the 'Type I interferon signalling pathway' and 'Defence response to virus' in the BP category. CC analysis indicated that the DEGs were mainly enriched in 'hemoglobin complex' and 'cortical cytoskeleton'. In terms of MF, the DEGs were most enriched in 'metalloendopeptidase activity' and '2′-5′-oligoadenylate synthetase activity'. KEGG pathway enrichment analysis revealed that the DEGs were mainly enriched in 'influenza A', 'measles', and 'porphyrin and chlorophyll metabolism' (Table 3 and Figure 4).
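To show what an enrichment p-value measures, the sketch below computes a one-sided hypergeometric test for the overlap between a DEG list and a single annotation term. This is the plain, unmodified test; DAVID reports a more conservative modified Fisher exact p-value (the EASE score), so values from this sketch would not reproduce the tables. All counts are toy placeholders, and `enrichment_pvalue` is a hypothetical helper.

```python
from math import comb


def enrichment_pvalue(
    overlap: int, deg_count: int, term_size: int, background: int
) -> float:
    """P(X >= overlap) when deg_count genes are drawn from a background
    of which term_size genes carry the annotation."""
    total = comb(background, deg_count)
    upper = min(deg_count, term_size)
    tail = 0
    for k in range(overlap, upper + 1):
        # Ways to pick k annotated genes and (deg_count - k) unannotated ones.
        tail += comb(term_size, k) * comb(background - term_size, deg_count - k)
    return tail / total


# Toy counts (hypothetical): 20 background genes, 5 annotated to the term,
# 4 DEGs, 3 of which carry the annotation.
toy_p = enrichment_pvalue(overlap=3, deg_count=4, term_size=5, background=20)
print(toy_p)
```

With these toy counts the term would pass the P < 0.05 threshold used in the study, because drawing three or more annotated genes among four by chance is unlikely when only a quarter of the background carries the annotation.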

Validation of gene expression
Expression of four key genes (CCL2, MMP9, GATA1, and RSAD2) was verified using ELISA in control and SLE subjects. The ELISA results showed that the levels of CCL2, MMP9, and RSAD2 in the SLE group were significantly increased (Figure 7).

Discussion
Previously, a significant number of genes have been shown to correlate with SLE [17,18]. However, current therapies for SLE have limited efficacy and increase susceptibility to secondary outcomes [19]. In this study, 584 DEGs were obtained from the selected dataset GSE61635. Enrichment analysis showed that the DEGs were primarily involved in the hemoglobin complex, immune response, and metalloendopeptidase activity. In contrast to previous studies, we conducted a tissue-specific analysis of differential gene expression, which could potentially allow for the development of more effective and targeted therapeutics [20,21]. The results suggested that 110 DEGs were involved in the hematological system, urinary/genital system, neurologic and digestive system, respiratory and skin/skeletal muscle system, immune system, endocrine system, and circulatory system. Furthermore, four key genes shared between the hub genes and the tissue-specific genes were identified: CCL2, MMP9, GATA1, and RSAD2. ELISA validation showed that the levels of CCL2, MMP9, and RSAD2 were significantly increased in the SLE group.
Chemokines are a family of small peptides that are involved in cell trafficking and inflammatory responses [22][23][24]. Currently, approximately 50 different chemokines have been identified, most of which belong to the CC and CXC families [25]. Monocyte chemoattractant protein-1 (MCP-1 or CCL2), a prototype of the CC subfamily, plays a crucial role in inflammatory processes [26,27]. CCL2/MCP-1 is significantly correlated with SLE, and CCL2 levels are significantly reduced after treatment [28,29]. Moreover, it has been demonstrated that CCL2/MCP-1 is strongly associated with atherosclerosis and cardiovascular diseases (CVD) in patients with SLE [30,31].
Matrix metalloproteinases (MMPs), also known as matrixins, are extracellular matrix (ECM)-degrading enzymes [32]. MMP-9, an extracellular proteinase, is involved in various pathophysiological processes, such as ECM remodeling, inflammatory response, and immune response [33]. Multiple cytokines play crucial roles in upregulating the expression of MMP-9 in response to inflammation [34]. MMP-9 thus appears passively as a downstream product of the inflammatory response. In addition, it exerts positive feedback on many pro-inflammatory factors (IL-1β and IL-8), which are important 'regulators' of the inflammatory response [35]. Prior studies have shown that MMP-9 plays a significant role in chronic autoimmune diseases, such as SLE, by activating the inflammatory response [36,37]. MMP-9 degrades components of the vascular basement membrane, which increases endothelial cell permeability and helps inflammatory cells invade the vascular wall and induce the inflammation associated with the pathogenesis of SLE [38,39].
RSAD2, an interferon-inducible gene, is involved in the innate immune response against viruses [40,41]. RSAD2 activates the immune response and has been associated with multiple autoimmune diseases, such as RA, SLE, and AS [42,43]. Doedens et al. [44] found that SLE is closely linked to IFN dysregulation. A study performed by Sezin et al. [42] showed that RSAD2 is a hub gene in the pathogenesis of SLE.
There are several limitations to this study. First, it was performed at a single center in China; therefore, the results warrant further validation in other populations. Second, only one dataset was utilized in this study, and future studies will be required to validate these findings in other datasets. Further large-scale validation studies and investigations of the molecular mechanisms of SLE should be performed to explore the roles of these genes.

Conclusion
In conclusion, the present investigation demonstrates that CCL2, MMP9, and RSAD2 are linked to the initiation and development of SLE. These genes and the related pathways may serve as novel therapeutic targets for SLE. Large-scale, multicenter research is needed to further validate these findings.

Author statement
Yun Yu: Statistical Analysis, Manuscript Preparation. Liang Liu: Data Collection, Statistical Analysis. Long-long Hu, Junpei Li, Ling-ling Yu, Ling-juan Zhu and Qian Liang: Data Interpretation. Jing-an Rao and Rong-wei Zhang: Data Collection. Hui-hui Bao and Xiao-shu Cheng: Study Design and Manuscript revision.

Data availability
The GSE61635 dataset analysed in this study was downloaded from the NCBI GEO database.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Ethical approval
The study was conducted in accordance with the Declaration of Helsinki and was approved by the Medical Ethics Committee of the Second Affiliated Hospital of Nanchang University.