Identification of FCER1G as a key gene in multiple myeloma based on weighted gene co-expression network analysis

ABSTRACT Purpose: Although the prognosis of multiple myeloma (MM) has remarkably improved with the emerge of novel agents, it remains incurable and relapses inevitably. The molecular mechanisms of MM have not been well-studied. Herein, this study aimed to identify key genes in MM. Materials and Methods: The GSE39754 dataset was used to screen differentially expressed genes (DEGs) and construct a co-expression network. Hub nodes were identified in the protein and protein interaction (PPI) network. Datasets GSE13591 and GSE2658 were used to validate hub genes. Moreover, function and gene set enrichment analyses were performed to elucidate the molecular pathogenesis of MM. Results: In this study, 11 genes were found to be hub genes in the co-expression network, among which four genes (CD68, FCER1G, PLAUR and LCP2) were also identified as hub nodes. In the test dataset GSE13591, CD68 and FCER1G were significantly downregulated in MM. Besides, the areas under the curve (AUCs) of CD68 and FCER1G were greater than 0.8 in both the training dataset and the test dataset. Our results also confirmed that FCER1G highly expressed patients had remarkably longer survival times in MM. Function and pathway enrichment analyses suggested that hub genes were associated with epithelial mesenchymal transition, TNF-α signaling via NF-κB and inflammatory response. GSEA in our study indicated that FCER1G participated in NK cell mediated cytotoxicity and the NOD-like receptor signaling pathway. Conclusion: Our study identified FCER1G as a key gene in MM, providing a novel biomarker and potential molecular mechanisms of MM for further studies.


Introduction
Multiple myeloma (MM) is one of the most common hematologic malignancies, characterized by increased abnormal clonal plasma cells in the bone marrow [1].It is estimated that the number of new cases attributed to MM was almost 1.5 times that of deaths in 2020 [2].In recent years, with the application of autologous hematopoietic stem cell transplantation (auto-HSCT) in young patients and the development of novel drugs, the outcomes in MM patients have markedly improved [3][4][5].Nevertheless, MM remains incurable and fatal, with the majority of patients failed to escape of disease recurrence and aggravation [6,7].Therefore, it is urgently needed to elucidate the specific mechanisms contributing to disease progression and identify novel biomarkers of prognostic prediction and therapeutic improvements in MM patients.
The high heterogeneity of tumor cells is a hallmark of genome and transcriptome in multiple myeloma, in this context, DNA microarrays and RNA sequencing are able to measure the expression level of thousands of genes and even entire transcriptome.Accumulating evidence associates abnormal mRNA expression with the occurrence, progression and prognosis of multiple myeloma.With the rapid developments of high-throughput technologies and bioinformatics, a large number of gene expression spectrums have been widely employed to identify certain disease-related biomarkers [8].A growing body of research has been carried out to identify potential mechanisms of MM on the basis of microarray data profiles [9][10][11][12].Weighted gene co-expression network analysis (WGCNA) serves as an effective data mining approach to focusing on the interconnection between genes, which could explore the huge and complex relationships among genes and identify genes with high connectivity, thereby providing promising targets related to clinical traits of interest [13].In the present study, we aimed to explore network-centric genes related to clinical phenotypes of MM by constructing WGCNA on the basis of data from Gene Expression Omnibus (GEO) database.The expected results may provide a novel insight into our understanding of carcinogenesis in MM and potential biomarkers for its prognosis and treatment.

Data collection
By searching GEO database (https://www.ncbi.nlm.nih.gov/geo/), we obtained expression data with corresponding information of MM patients.The GSE39754 dataset performed on Affymetrix Human Exon 1.0 ST Array was selected as the training set used for the construction of the co-expression network in our study, which included 170 newly diagnosed MM patients and 6 healthy donors [14].GSE13591 was taken as one of independent test datasets, including 133 MM patients and 5 normal donors [15].The GSE2658 dataset contained the microarray data of CD138selected plasma cells from bone marrow of 559 newly diagnosed MM patients [16], which was selected as another test dataset.

Data preprocessing
Preprocessing and normalizing the downloaded raw data were accomplished through the 'affy' package in R [17].The quality of microarray data was evaluated by sample clustering on the basis of Pearson's correlation matrices and average linkage between different samples.

Differentially expressed genes (DEGs) identification
The 'limma' package in R [18] was utilized to screen DEGs between samples from healthy donors and MM patients in GSE39754.The |log2-fold change (FC)| > 1 combined with false discovery rate (FDR) < 0.05 was taken as the threshold value.Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment and gene ontology (GO) analyses were performed by 'clus-terProfiler' package in the R environment.p.adjust <0.05 and q value <0.2 was set as the cut-off value.

WGCNA construction
WGCNA was conducted using 'WGCNA (version 1.6.9)'package in R, which could show the connection between gene modules and clinical phenotypes by clustering genes with similar expression patterns [19].Samples were clustered and the outliers were removed from our study.First, we utilized the soft threshold selection analysis to determine an appropriate soft thresholding power, ensuring that genes were interacted by scale-free distribution [20].Then, the adjacency was transformed into the topological overlap matrix (TOM).Subsequently, different modules could be identified through the dynamics cut tree algorithms by detecting groups of highly correlated genes, followed by the use of 1-TOM dissimilarity with a minimum size of 30 cut-off [21,22].After that, a cutline for module dendrogram was chosen to merge similar modules (dissimilarity degree of MES < 0.25).

Significant modules and hub genes identification
Module eigengene (ME) was deemed as the dominant component of one gene module, module significance (MS) represents the average value of gene expression levels (gene significance, GS) in each gene module, which were two important parameters in identification of clinically significant modules.Module membership indicates the Pearson's correlation between ME and gene expression profile in a given module.In our study, hub genes were identified as genes with high absolute value of module membership (|cor.geneModu-leMembership|> 0.8) [23].Since most proteins function by interacting with other proteins, constructing a protein-protein interaction (PPI) network may help us to better identify genes or proteins of great significance.We uploaded genes into the STRING database (https:// string-db.org/)[24] and visualized by Cytoscape software, the PPI network was constructed with the confidence of > 0.4.Hub nodes were deemed as genes possessing connectivity degree of ≥ 20.Those genes in both the PPI network and the co-expression network were considered to be 'real' hub genes for validation.

Hub genes validation
The expression levels of hub genes were validated by the test dataset of GSE13591.GSE2658 was selected as another independent test dataset to perform survival analyses, in which 559 MM samples were dichotomized into two subgroups according to the medium value of gene expression levels.Furthermore, the 'survival' package in R was applied to implement log-rank tests and generate Kaplan-Meier survival curve [25].To measure the specificity and accuracy of hub genes, receiver operator characteristic (ROC) analyses were performed in the training dataset of GSE39754 and the test dataset of GSE13591 [26].

Function and pathway enrichment analysis
To explore the functional roles of the above hub genes, function and pathway enrichment analysis has been carried out by Metascape (https://metascape.org)[27] with the following ontology sources: Hallmark Gene Sets, Reactome Gene Sets, KEGG Pathway, Canonical Pathways, GO Biological Processes and CORUM.Terms with a minimum count of 3, p-value < 0.01 as well as an enrichment factor (the ratio between observed counts and the counts expected by chance) > 1.5 were categorized into clusters on the basis of membership similarities.

Gene set enrichment analysis (GSEA)
In GSE13591, 133 MM patients were grouped into two subgroups (high and low) according to the median value of the expression levels of genes.The collection of annotated gene sets of c2.cp.kegg.v7.5.1.symbols.gmt in Molecular Signatures Database (MSigDB, https://www.gsea-msigdb.org/gsea/msigdb/index.jsp) was selected as a reference gene set.FDR < 0.25 and NOM p-value < 0.05 was chosen as the cut-off criteria [28].

Weighted co-expression network construction and key module identification
Our study utilized sample clustering to detect microarray outliers.With a height cut of 40,000, three samples (GSM978565, GSM978566 and GSM978567) were deemed as outliers and removed.Consequently, 162 MM samples with complete information were included in co-expression analysis.Those genes with similar expression patterns were divided into the same module by using the 'WGCNA' package in R (Figure 2 (A)).We selected β = 5 (scale free R 2 = 0.92) as the soft-threshold to construct a scale-free network (Figure 2(B-E)).At a height cut of 0.75, 11 modules were eventually achieved (Figure 3(A)).Two methods were measured to assess the relevance of the key module to disease state.Based on the heatmap of correlation between MEs and clinical traits (Figure 3(B)), we found that the pink module had a significantly negative correlation with disease state of clinical samples (P = 9e-16, R 2 = −0.56).Moreover, the pink module also possessed the highest MS (Figure 3(C)).For each gene, module membership and GS in pink module were calculated to draw the scatterplot (cor = 0.71, p = 3.3e-15, Figure 3(D)).Therefore, the pink module in association with disease state was identified as the clinically significant module that contained 91 genes and extracted for subsequent analyses.

Hub genes identification
Eleven genes with high connectivity in pink module were identified to be hub genes in our study (Table 1).A PPI network was constructed and visualized by Cytoscape (Figure 4(A)).Under the confidence level of > 0.4 and connectivity degree of ≥ 20, there were 20 genes identified as hub nodes in the PPI network (Table S2).The four genes (CD68, FCER1G, PLAUR and LCP2) in both the co-expression network and the PPI network were identified as 'real' hub genes to facilitate further validation (Table 1, Figure 4(B)).

Hub genes validation
In the test dataset GSE13591, we observed that CD68 and FCER1G were also significantly downregulated in MM samples compared with normal group (CD68 P < 0.0001; FCER1G P = 0.0001, Figure 5(A-D)).To evaluate the prognostic value of the four candidate hub genes, survival analyses were performed in GSE2658.Our results confirmed that patients who expressed lower level of FCER1G had significantly shorter survival times (P = 0.002, Figure 5(F)), however, the other three hub genes were not significantly correlated with survival time in MM patients (Figure 5(E, G and  H)).Moreover, ROC analyses were performed to assess the capacity of hub genes to distinguish MM patients from healthy people in GSE39754 and GSE13591, respectively.As a result, the areas under the curve (AUCs) of four hub genes were all greater than 0.7, among which the AUCs of CD68 and FCER1G were greater than 0.8 in both the training dataset and the test dataset (Figure 6).

Function and pathway enrichment analysis
To elucidate the biological functions of hub genes in the pink module, functional enrichment analyses were performed by Metascape.Our results showed that these genes exhibited significantly enriched pathways and processes related to TNF-α signaling via NF-κB, regulation of cell activation, epithelialmesenchymal transition, neutrophil degranulation and inflammatory response (Figure 7).

Gene set enrichment analysis (GSEA)
To learn more about the biological functions and pathways in association with hub genes in MM, GSEA was conducted to search KEGG pathways enriched in samples with highly expressed CD68 and FCER1G, respectively.As a result, 26 gene sets were enriched in FCER1G highly expressed samples, however, none gene sets were found to be enriched in CD68 highly expressed samples under the threshold value of FDR < 0.25 and NOM p-value < 0.05.Six representative pathways involved in FCER1G highly expressed samples were presented, including chemokine signaling pathway, complement and coagulation cascades, hematopoietic cell lineage, leukocyte transendothelial migration, natural killer (NK) cell-mediated cytotoxicity and the NOD-like receptor signaling pathway (Figure 8).

Discussion
MM is the second most common haematological malignancy in high-income countries that produce monoclonal immunoglobulin [3].In almost all patients, MM always begins as smoldering multiple myeloma (SMM) or monoclonal gammopathy of undetermined significance (MGUS) [29].As is known to all, immunomodulatory agents and proteasome inhibitors are still the cornerstones of treatment nowadays.Additionally, CD38-targeting antibody is becoming an important component of relapse or even first-line therapies.In recent years, there were numerous novel drugs being found in treatment of relapsed MM, including carfilzomib, selinexor and belantamab mafodotin [1].Although the current treatments are highly effective in controlling the disease, a large number of patients still suffer from disease refractory and relapse.Accordingly, it is of great significance to identify novel targets involved in the progression and prognosis of MM.The WGCNA algorithm, which is possible to find highly related gene modules and link them to clinical phenotypes, has been widely applied to screen core genes in various types of cancer, such as cholangiocarcinoma, ovarian cancer, hepatocellular carcinoma, gastric cancer and lymphoma [30][31][32][33][34][35], as well as in more and more non-neoplastic diseases [36][37][38][39][40].However, it is rarely utilized in MM so far [41][42][43].
In the present study, a co-expression network was constructed based on the microarray of MM (GSE39754).The key module pink was found to be highly correlated with disease state in MM, in which 11 genes were identified as hub genes.Then, combined with the construction of PPI network, four genes (CD68, FCER1G, PLAUR and LCP2) were ultimately considered as candidate hub genes for further analyses.Among the four hub genes, we found the expression levels of CD68 and FCER1G were also significantly lower in MM patients validated in the test dataset.In addition, as determined by AUCs of ROC curves, CD68 and FCER1G exhibited excellent diagnostic efficacy for MM patients and healthy people in the training set as well as the test dataset.Subsequently, further validation of the hub genes  was carried out through survival analyses.As a result, significantly better survival outcomes were observed in FCER1G highly expressed patients, indicating that FCER1G might serve as a potential biomarker for prognosis in MM.FCER1G, also known as Fc receptor-gamma (FcRγ), is located on chromosome 1q23.3and encodes the γ subunit of the fragment crystallizable (Fc) region (Fc R) of immunoglobulin E (IgE).It is recognized that the combination of the Fc of immunoglobulins and the Fc R on immune cells is a key step in the activation of cellular effector functions via the antigen-antibody binding reaction [44,45].Previous studies have shown that FCER1G was involved in the development of kinds of diseases, such as glioma, diabetic kidney disease, endometrial cancer and osteosarcoma [46][47][48][49].Recently, a class of αβ T-cell receptor-positive and FCER1G-expressing T cells with high cytotoxic potential was identified as a class of tumor-specific T cells, and FCER1G was a defining marker for the lineage, indicating that FCER1G may play a crucial role in T cell-mediated antitumor immune response [50].Moreover, FCER1G was reported to be a hub gene participating in the cytokine storm syndrome in COVID-19 patients [51].It has been reported that high expression level of FCER1G was related to T cell suppression and macrophage infiltration, besides, FCER1G and CD68 low expressed patients had the longest overall survival in clear cell renal cell carcinoma, which were two hub genes also identified in our study [52].In the viral infection process, FCER1G was demonstrated to be a regulator of NK cell-dependent CD8 + T-cell functions [53].Similarly, GSEA in our study revealed that high expression of FCER1G was related to NK cell -mediated cytotoxicity.However, the knowledge regarding the function and role of FCER1G in MM is limited.Fu et al. reported that FCER1G expression decreased with the progression of MM and its high level may become a good prognostic factor in MM patients [54], which were consistent with our results.The difference is that Fu's research was aimed at exploring the role of FCER1G in MM through GEO datasets, however, our study was aimed at identifying key genes in MM and ultimately identified FCER1G as a key gene by WGCNA.That is to say, the research purposes of the two were essentially different.Another study also identified FCER1G as a significant gene in MM, however, the method used was different from our study [55].
To get further insight into the underlying mechanisms of FCER1G in MM, function and pathway enrichment analyses were performed.Our results showed that FCER1G was involved in NK cell-mediated cytotoxicity, epithelial-mesenchymal transition and leukocyte transendothelial migration, indicating that FCER1G may play a key role in the tumorigenesis and progression.As mentioned above, previous studies have revealed that FCER1G-expressing NK cells could respond to the over activation of CD8+ T cells and was related to T cell suppression [52,53].Therefore, it is presumable that FCER1G may be  relevant to the activity of NK cells for regulation of CD8 + T cell response in the development and progression of MM, and it provided insight into the underlying mechanisms of FCER1G expression interaction with tumor immunity in MM.Furthermore, our study found that FCER1G participated in the NOD-like receptor signaling pathway.However, the specific regulatory mechanism needs to be further studied.
In summary, this is the first study to identify FCER1G as a key gene in MM by performing WGCNA.We demonstrated that FCER1G was negatively associated with the progression of MM and might be a favorable biomarker in MM patients.Further experimental evidence is necessary to validate our findings obtained in this study.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Figure 1 .
Figure 1.DEGs identification and function analysis.(A) Volcano plot and (B) Differential ordering diagram visualizing DEGs in GSE39754.The red dots represent upregulated genes, the blue dots represent downregulated genes.GO and KEGG analysis of upregulated DEGs (C, D) and downregulated DEGs (E, F).In figure C and E, the size of the nodes is proportional to the number of genes, the different color of the nodes represents different ontologies, including biological process (BP), cellular component (CC), molecular function (MF) and Kyoto Encyclopedia of Genes and Genomes (KEGG).In figure D and F, the left half circle includes color modules of genes, the color represents logFC value, the right half circle includes color modules of terms, the size represents the number of genes.DEG, differentially expressed genes.

Figure 2 .
Figure 2. Gene cluster analysis.(A) Sample dendrogram and trait heatmap.The branches of the dendrogram correspond to clustered samples.(B, C) Construction of the scale-free network with a suitable soft-thresholding power (β).(D) Histogram of connectivity distribution when β = 5. (E) Checking the scale free topology when β = 5.

Figure 3 .
Figure 3. Identification of modules associated with clinical traits of MM. (A) Cluster dendrogram of genes based on dissimilarity measure (1-TOM).(B) Heatmap of the correlation between module eigengenes and clinical traits of MM. (C) Distribution of average gene significance and errors in the modules.(D) The relationship between the module membership and gene significance in the pink module.MM, multiple myeloma; TOM, topological overlap matrix.

Figure 4 .
Figure 4. Hub genes identification.(A) Protein-protein interaction (PPI) network visualized by cytoHubba.Nodes represent genes in the pink module.Node size represents connectivity of the gene measured by degree.(B) Venn diagram of the hub genes and hub nodes.

Figure 7 .
Figure 7. Function and pathway enrichment analysis of hub genes in the pink module.(A) Bar graph of enriched terms of genes colored by P-values.(B) Network of enriched terms colored by P-value.The size of the nodes is proportional to the number of genes.The color depth of the nodes is proportional to the size of P-value.Terms with more genes tend to have more significant P-values.

Figure 8 .
Figure 8. Gene set enrichment analysis.Six significant and representative functional gene sets enriched in MM samples with highly expressed FCER1G were presented.MM, multiple myeloma.

Table 1 .
Hub genes in the pink module related with disease state.