Elevated Ras related GTP binding B (RRAGB) expression predicts poor overall survival and constructs a prognostic nomogram for colon adenocarcinoma

ABSTRACT Currently, no articles have explored the roles of the RRAGB gene in the occurrence and development of cancer. Through mining of The Cancer Genome Atlas (TCGA) data, we found that this gene might be a novel prognostic predictor for colon adenocarcinoma (COAD). Hence, this study was carried out to explore its roles in COAD and its associations with immunity. The RRAGB single-gene expression matrix and corresponding clinical information were extracted from the TCGA database. Univariate/multivariate Cox regression analyses and gene set enrichment analysis (GSEA) were used to identify independent prognostic factors and RRAGB-related pathways, respectively. Relationships between RRAGB and immunity were also analyzed. Boxplot and K-M survival analysis indicated that RRAGB was not only differentially expressed in COAD (P < 0.05), but also significantly associated with overall survival (OS; P < 0.05). Univariate and multivariate Cox hazard regression analyses indicated that RRAGB could serve as an independent prognostic factor for COAD (both P < 0.05). GSEA identified five signaling pathways significantly enriched in the high-RRAGB expression phenotype. Moreover, an RRAGB-based nomogram was successfully constructed and displayed satisfactory performance. In addition, RRAGB expression was found to be significantly associated with microsatellite instability (MSI), tumor mutational burden (TMB) and immunity. Our results revealed that RRAGB could be a prognostic biomarker for COAD in terms of OS and was markedly related to MSI, TMB, and immunity. We also constructed an RRAGB-based nomogram with satisfactory performance. Further research should be carried out to validate our findings.


Introduction
Colorectal cancer (CRC), one of the most common tumors of the digestive system, accounted for approximately 8% of newly diagnosed cancer cases (about 149,500) and 9% of estimated cancer deaths (about 52,980) in the United States in 2021 [1]. Colon adenocarcinoma (COAD), the most common histological subtype of colon cancer, mainly arises in the intestinal mucosa and spreads to adjacent organs [2]. Although the 5-year survival rate for early-stage colon cancer patients undergoing radical resection is more than 90%, most patients are diagnosed at an advanced or metastatic stage, at which the 5-year survival rate drops to 10% [3]. Currently, surgery, radiotherapy, chemotherapy and targeted therapies are available for the clinical treatment of COAD, and considerable advances have been achieved in these therapeutic strategies [4]. However, patients' prognosis is still far from satisfactory due to late diagnosis, rapid progression and a high frequency of metastasis [5,6]. Hence, there is an urgent need to explore the molecular mechanisms of COAD and to identify novel biomarkers for survival evaluation and targeted treatment [7].
Ras-related GTP binding B (RRAGB), also known as RAGB or bA465E19.1, belongs to the large family of Ras-homologous GTPases and encodes proteins that act as cellular switches operated by GTP-exchange factors and by factors stimulating their intrinsic GTPase activity [8,9]. Thanks to great improvements in high-throughput sequencing technology and the growing number of public databases, more and more transcriptome and clinical data can be explored by means of bioinformatics analysis. On this basis, a growing number of prognostic biomarkers or signatures have been established in different cancers [10][11][12]. As for RRAGB, Shi et al. constructed a signature based on CHMP4C, FOXO1 and RRAGB that effectively predicted the prognosis of cervical cancer patients [13], and Xie et al. constructed a six-gene model including RRAGB to predict the overall survival (OS) of non-small-cell lung cancer patients [14]. However, little is known about the roles of the RRAGB gene in the occurrence and development of COAD. In this article, we aimed to explore the roles of RRAGB in COAD and to characterize its associations with immunity, hoping to provide a novel candidate gene to improve the prognosis and survival rates of COAD patients.

Single gene matrix mining from the TCGA database
RNA-sequencing FPKM data and corresponding clinical information of 39 normal and 398 COAD tumor tissues were extracted from the official website of The Cancer Genome Atlas (TCGA; https://tcga-data.nci.nih.gov/tcga/). R version 3.5.1 software (https://www.r-project.org/) was used to standardize the RRAGB RNA-sequencing data [15], and RRAGB mRNA expression was then matched with the clinical records to obtain a single-gene expression matrix with clinical information for each sample ID. OS was the primary outcome of this study, and we further analyzed the associations between RRAGB expression and clinical data. In addition, we used the 'limma' package (http://www.bioconductor.org/packages/release/bioc/html/limma.html) to identify differentially expressed genes (DEGs), with cutoff criteria of adjusted P-value (FDR) < 0.05 and |log2 FC (fold change)| ≥ 1 [16].
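The DEG cutoff described above can be sketched in a few lines. This is only an illustration of the filtering rule (FDR < 0.05 and |log2 FC| ≥ 1), not the 'limma' workflow itself; the function names, gene labels and numeric values are hypothetical and are not results from this study.

```python
def is_deg(log2_fc, fdr, fc_cutoff=1.0, fdr_cutoff=0.05):
    """Return True when a gene passes both cutoffs stated in the text."""
    return fdr < fdr_cutoff and abs(log2_fc) >= fc_cutoff


def filter_degs(rows):
    """rows: iterable of (gene, log2_fc, fdr); returns (gene, direction) for passing genes."""
    result = []
    for gene, log2_fc, fdr in rows:
        if is_deg(log2_fc, fdr):
            result.append((gene, "up" if log2_fc > 0 else "down"))
    return result


# Hypothetical values for illustration only.
example_rows = [
    ("GENE_A", 1.8, 0.001),   # passes: large fold change, small FDR
    ("GENE_B", -1.0, 0.04),   # passes: |log2 FC| equals the cutoff
    ("GENE_C", 0.6, 0.0001),  # fails: fold change too small
    ("GENE_D", 2.3, 0.20),    # fails: FDR too large
]
degs = filter_degs(example_rows)
print(degs)
```

Because the fold-change criterion is stated as "≥ 1" and the FDR criterion as "< 0.05", the sketch treats the first bound as inclusive and the second as strict.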

Gene set enrichment analysis (GSEA)
As a reference, the 'c2.cp.kegg.v6.2.symbols.gmt' gene set was obtained from the Molecular Signatures Database (MSigDB) (http://software.broadinstitute.org/gsea/msigdb) [17]. We performed GSEA to compare the high-RRAGB and low-RRAGB expression groups, with at least 1000 permutations for each analysis [18]. In this way, significantly enriched biological pathways were identified, with thresholds of normalized enrichment score (NES) > 1.5 and nominal P value < 0.05.
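For readers unfamiliar with the statistic behind GSEA, the following is a minimal sketch of an unweighted running-sum enrichment score, together with the NES and nominal P cutoffs given above. It is a simplification for illustration: the GSEA software weights hits by their ranking metric and derives NES and P values from permutations, neither of which is reproduced here. The function names and gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum statistic: step up on a hit, down on a miss,
    and return the deviation from zero with the largest magnitude."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def passes_threshold(nes, nominal_p, nes_cutoff=1.5, p_cutoff=0.05):
    """Apply the cutoffs stated in the text to an already computed NES and P value."""
    return nes > nes_cutoff and nominal_p < p_cutoff


# Hypothetical ranked list (most up-regulated in the high-expression group first).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
es_top = enrichment_score(ranked, {"g1", "g2"})      # set concentrated at the top
es_bottom = enrichment_score(ranked, {"g5", "g6"})   # set concentrated at the bottom
print(es_top, es_bottom)
```

A gene set concentrated at the top of the ranking gives a positive score and one concentrated at the bottom gives a negative score, which is why only pathways with a high positive NES are reported for the high-expression phenotype.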

Protein-protein interaction (PPI) network and the Human Protein Atlas (HPA) database
The PPI network analysis was carried out with the online STRING website (https://string-db.org/) to explore the potential relationships between RRAGB and other genes [19]. We used the online HPA database (http://www.proteinatlas.org/) to validate RRAGB protein expression in COAD through immunohistochemical staining with antibody HPA003734 [20].

Statistical analysis
All statistical analyses were conducted using R version 3.5.1 software (https://www.r-project.org/). The 'limma' package (http://www.bioconductor.org/packages/release/bioc/html/limma.html) was used to compare RRAGB expression between normal and cancer tissues by Student's t test [26]. The Wilcoxon signed-rank test [27] and logistic regression [28] were performed to estimate the associations between RRAGB expression and clinicopathological variables. Based on the median expression of RRAGB, COAD patients were divided into high-risk and low-risk groups. Survival analysis was conducted by the Kaplan-Meier (K-M) method with the log-rank test [29]. Receiver operating characteristic (ROC) curves and the corresponding area under the curve (AUC) values were calculated with the R 'survivalROC' package (https://cran.r-project.org/web/packages/survivalROC/index.html). Univariate and multivariate Cox hazard regression analyses were applied to identify independent prognostic factors. Furthermore, a nomogram was built and visualized with the R 'rms' package (https://cran.r-project.org/web/packages/rms/index.html) to predict 1-, 3-, and 5-year survival probabilities from the individual predictors. All P values were two-sided, and P < 0.05 was regarded as statistically significant.
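The median split and the Kaplan-Meier estimate described above can be sketched as follows. This is a plain-Python illustration rather than the R code used in the study; the function names and the toy follow-up data are hypothetical, the log-rank test is not included, and assigning samples exactly at the median to the low group is an assumption, since the text does not say how ties are handled.

```python
from statistics import median


def split_by_median(expression):
    """Label each sample 'high' if its expression exceeds the cohort median, else 'low'."""
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    events: 1 = death observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical toy data for illustration only.
groups = split_by_median([1.0, 4.0, 2.0, 3.0])
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print(groups)
print(curve)
```

In the toy example the survival estimate steps down only at times 2, 3 and 5, where deaths occur; censored subjects leave the risk set without changing the estimate.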

Results
Through TCGA data mining, we aimed to explore the roles of RRAGB in COAD and to characterize its associations with immunity, hoping to provide a novel candidate gene to improve the prognosis and survival rates of COAD patients. To our knowledge, this is the first study to comprehensively explore the expression of RRAGB and its impact on COAD. Our results indicated that RRAGB could serve as an independent prognostic factor for COAD patients and that it was significantly associated with MSI, TMB and immunity. GSEA identified five RRAGB-related signaling pathways, and an RRAGB-based nomogram was constructed to guide prognosis assessment of COAD patients. Taken together, RRAGB might be a novel prognostic predictor for COAD.

RRAGB expression levels in COAD from TCGA and HPA database
The RRAGB mRNA expression levels in different cancer types are detailed in Figure 1(a); RRAGB was differentially expressed in 14 cancer types including COAD (all P < 0.05). We then selected COAD for further analysis. The boxplot (N = 39; T = 398) and pairwise boxplot (N = 39; T = 39) confirmed that RRAGB mRNA was up-regulated in COAD tumor tissues compared with normal tissues (both P < 0.05; Figure 1(b,c)). K-M survival analysis, with patients grouped by median RRAGB expression in COAD, indicated that the low-RRAGB group had a much longer OS than the high-RRAGB group (P = 0.016; Figure 1(d)). Moreover, the 1-, 3-, and 5-year AUCs of the ROC curves for RRAGB were 0.589, 0.711 and 0.726, respectively (Figure 1(e)). As displayed in Figure 1(f,g), immunohistochemical staining from the HPA database indicated that RRAGB was not detected in normal colon tissues and that its expression was medium in COAD tumor tissues. Moreover, we also used the GSE44076 dataset to validate the expression of RRAGB in normal and COAD tumor samples (P = 1.92e-07; N = 148; T = 98; Supplement Figure S1).

Associations between RRAGB expression and clinicopathologic variables
The Wilcoxon signed-rank test and logistic regression were performed to estimate the associations between RRAGB expression and seven clinicopathological variables: age, gender, race, stage, T, N and M. However, no significant associations were observed (all P > 0.05, Figure 2).

Univariate and multivariate cox hazard regression analyses
Univariate and multivariate Cox hazard regression analyses were applied to identify independent prognostic factors among RRAGB, age, gender, race, stage, T, N and M. Univariate Cox hazard regression analysis showed that gender, stage, T, N, M and RRAGB were all significantly related to OS (all P < 0.05; Figure 3(a) and Table 1). Multivariate Cox hazard regression analysis showed that stage, M and RRAGB were markedly linked to OS (all P < 0.05; Figure 3(b) and Table 1). In other words, stage, M and RRAGB were all independent prognostic factors for COAD.

Construction of RRAGB based nomogram
A nomogram was constructed with the R 'rms' package to visualize the associations between eight variables (RRAGB, age, gender, race, stage, T, N and M) and the 1-, 3-, and 5-year OS probabilities (Figure 4(a)). As summarized in Table 2, the C-index and the 1-, 3-, and 5-year AUCs of the RRAGB-based nomogram were 0.872, 0.748, 0.799 and 0.791, respectively, showing a moderate prediction accuracy. As displayed in Figure 4(b-d), the 1-, 3-, and 5-year calibration curves showed agreement between the nomogram-predicted and observed survival, indicating satisfactory performance for this RRAGB-based nomogram.
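As an aid to interpreting the C-index reported above, the following is a minimal sketch of Harrell's concordance index computed by pairwise comparison. It is illustrative only: the study obtained its value from R, the function name and toy data are hypothetical, and pairs with tied survival times are simply skipped here, which is a simplification.

```python
def concordance_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the subject who died earlier
    had the higher predicted risk (ties in risk count as one half).

    events: 1 = death observed, 0 = censored.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when subject i has an observed event
            # strictly before subject j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data for illustration only.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
c_perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
c_partial = concordance_index(times, events, [0.9, 0.1, 0.4, 0.7])
print(c_perfect, c_partial)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by predicted risk.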

RRAGB-related signaling pathways according to GSEA
GSEA was performed to compare the high-RRAGB and low-RRAGB expression groups and to identify significantly enriched biological pathways, based on the 'c2.cp.kegg.v6.2.symbols.gmt' gene set from the MSigDB. Our results indicated that the high-RRAGB expression phenotype was significantly associated with Basal transcription factors, Ubiquitin mediated proteolysis, and the Insulin, Wnt and Erbb signaling pathways, with thresholds of NES > 1.5 and nominal P value < 0.05 (Figure 5 and Table 3).

Associations between RRAGB and PPI network, MSI, TNB, TMB in COAD
A PPI network analysis was carried out with the online STRING website (https://string-db.org/) to explore the potential relationships between RRAGB and other genes in COAD (Figure 6(a)). Using Pearson's method, we calculated the correlations between RRAGB expression and MSI, TNB and TMB with the Sangerbox website tools (http://www.sangerbox.com/tool). Radar maps indicated that RRAGB expression was significantly linked to MSI (P = 1.8e-05) and TMB (P = 0.00079) in COAD, whereas it was not associated with TNB (P = 0.16).

Associations between RRAGB and tumor immune infiltration, SCNA, tumor microenvironment in COAD
Using the Sangerbox website tools, we calculated the correlations between RRAGB expression and tumor immune infiltration by Spearman's method. RRAGB mRNA expression was markedly related to the infiltration of B cells, CD4+ T cells and macrophages (Figure 7(a)). The SCNA module provided the correlations between tumor immune infiltration levels in COAD and different somatic copy number alterations of RRAGB by the Wilcoxon rank-sum test (Figure 7(b)). By Pearson's method, RRAGB mRNA expression was significantly associated with ImmuneScore and StromalScore (both P < 0.05), whereas it was not linked to ESTIMATEScore (P = 0.540, Figure 7(c)).
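Because both Pearson's and Spearman's methods are used in these analyses, a short sketch of how the two coefficients differ may help. The code below is a plain-Python illustration with hypothetical function names and toy data; the study itself relied on the Sangerbox tools, whose implementation is not described in the text.

```python
import math


def pearson(x, y):
    """Pearson correlation: covariance scaled by the two standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical monotonic but non-linear relationship.
x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]
r_pearson = pearson(x, y)
r_spearman = spearman(x, y)
print(r_pearson, r_spearman)
```

For a monotonic but non-linear relationship such as this one, Spearman's coefficient reaches 1 while Pearson's stays below 1, which is one reason rank-based correlation is often preferred for skewed quantities such as immune infiltration estimates.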

Correlations between RRAGB and immune checkpoint molecules, immune cells pathway, mismatch repair genes in COAD
Using Pearson's method, we calculated the correlations between RRAGB expression and immune checkpoint molecules, immune cell pathways and mismatch repair genes with the Sangerbox website tools. As for immune checkpoint molecules, RRAGB mRNA expression was significantly associated with TNFSF4, TNFSF9, TNFSF18, TMIGD2, TNFRSF14 and TNFRSF18 in COAD (all P < 0.05, Figure 8(a)). In terms of immune cell pathways, RRAGB mRNA expression was markedly related to Type 17 T helper cells, Type 2 T helper cells, neutrophils, monocytes, memory B cells, CD56dim natural killer cells, activated dendritic cells and others in COAD (all P < 0.05, Figure 8(b)). Co-expression analysis of RRAGB and mismatch repair genes indicated that RRAGB was significantly linked to MLH1, MSH2, MSH6, PMS2 and EPCAM in COAD (all P < 0.001, Figure 8(c)).

Discussion
Currently, no studies have reported the definite roles of RRAGB in human cancers, and this is, to our knowledge, the first study to comprehensively explore the expression of RRAGB and its impact on COAD. In this article, our results indicated that RRAGB was not only differentially expressed in COAD, but also significantly associated with OS. Further univariate and multivariate Cox hazard regression analyses indicated that stage, M and RRAGB could all serve as independent prognostic factors for COAD. In previous research, Bunte et al. found that RRAGB was more prevalent in periodontitis patients than in healthy controls and might be associated with the clinical manifestations of periodontitis [30]. Shi et al. constructed a signature based on CHMP4C, FOXO1 and RRAGB that effectively predicted the prognosis of cervical cancer patients [13]. Xie et al. also constructed a six-gene model including RRAGB, which could function as an independent prognostic factor to predict the OS of non-small-cell lung cancer patients [14]. In this article, our results indicated that the high-RRAGB expression phenotype was significantly associated with Basal transcription factors, Ubiquitin-mediated proteolysis, and the Insulin, Wnt and Erbb signaling pathways. As a useful tool, GSEA has been applied in various studies to compare high- and low-gene expression groups and to identify significantly enriched biological pathways [31]. Shuai et al. found that eight signaling pathways (apoptosis, cell cycle, ErbB, MAPK, mTOR, Notch, p53 and TGF-β pathways) were most significantly enriched in the high-CDCA8 phenotype, according to GSEA [11]. Liu et al. revealed that ZNF132 could participate in multiple biological pathways, including the regulation of glycolysis and the cell cycle, based on the results of GSEA [32]. Consistent with our results, all three of these signaling pathways have been reported to be markedly related to colon cancer [33][34][35].
As reported in previous articles, the Wnt and ERBB signaling pathways play important roles in cancer [36,37]. Yan et al. reported that LINC00261 could regulate miR-324-3p and the Wnt signaling pathway to repress colon cancer progression [38]. Wu et al. showed that the tankyrase 1 inhibitor XAV939 could increase chemosensitivity in colon cancer cells by inhibiting the Wnt signaling pathway [39]. As for the ERBB signaling pathway, He et al. reported that YAP could form an autocrine loop with the ERBB pathway to regulate the initiation and progression of ovarian cancer [40]. Grimont revealed that SOX9 could regulate the ERBB signaling pathway in the development of pancreatic cancer [41].
Because they reduce statistical predictive models to a single numerical estimate of probability, nomograms have been widely used for cancer prognosis and clinical decision-making [42]. Wang et al. found that a nomogram based on serum cystatin C was useful for evaluating the probability of acute kidney injury (AKI) and helping to avoid its occurrence [43]. Liu et al. established a TP53-associated nomogram that exhibited excellent efficacy in predicting the OS of pancreatic cancer patients [44]. In this article, we also constructed a nomogram to visualize the associations between eight variables (RRAGB, age, gender, race, stage, T, N and M) and the 1-, 3-, and 5-year OS probabilities. When evaluated by C-index, ROC and calibration curves, our RRAGB-based nomogram displayed satisfactory performance.
As reported in previous studies, TMB, TNB and MSI can serve as biomarkers related to the efficacy of immune checkpoint inhibitors and to survival prognosis [45,46]. Therefore, we calculated the correlations between RRAGB expression and MSI, TNB and TMB by Pearson's method, and the results indicated that RRAGB expression was significantly linked to MSI and TMB in COAD, whereas it was not associated with TNB. The dynamic characteristics of tumor immune infiltration, the tumor microenvironment, immune checkpoint molecules and immune cell pathways are vital for immunotherapy and play essential roles in tumorigenesis and progression [47][48][49]. In our article, we calculated the correlations between RRAGB expression and tumor immune infiltration by Spearman's method and found that RRAGB mRNA expression was markedly related to the infiltration of B cells, CD4+ T cells and macrophages. By Pearson's method, RRAGB mRNA expression was significantly associated with ImmuneScore and StromalScore, whereas it was not linked to ESTIMATEScore. In terms of immune checkpoint molecules and immune cell pathways, RRAGB mRNA expression was significantly associated with TNFSF4, TNFSF9, TNFSF18, TMIGD2, TNFRSF14, TNFRSF18, Type 17 T helper cells, Type 2 T helper cells, neutrophils, monocytes, memory B cells, CD56dim natural killer cells, activated dendritic cells and others in COAD by Pearson's method. All of these findings indicated strong associations between RRAGB and immunity.
The strength of this article is that, to our knowledge, it is the first to comprehensively explore the expression of RRAGB and its impact on COAD. Univariate and multivariate Cox hazard regression analyses indicated that RRAGB could serve as an independent prognostic factor for COAD, and GSEA identified RRAGB-related signaling pathways. RRAGB was shown to be significantly associated with MSI, TMB and immunity, and we also constructed an RRAGB-based nomogram with satisfactory performance. There were several limitations as well. Firstly, clinical information was collected only from the TCGA dataset, which was limited and insufficient; in particular, treatment information was absent in TCGA. Secondly, we mainly focused on the bioinformatics analysis of RRAGB, without experimental validation. We will further carry out in vivo and in vitro experiments to verify our results.

Conclusions
Overall, our results revealed that RRAGB could be a prognostic biomarker for COAD in terms of OS. Univariate and multivariate Cox hazard regression analyses indicated that RRAGB could serve as an independent prognostic factor for COAD, and GSEA identified RRAGB-related signaling pathways. Moreover, RRAGB was found to be significantly associated with MSI, TMB, and immunity. Last but not least, we also constructed an RRAGB-based nomogram with satisfactory performance. Further research with larger sample sizes and in vivo and in vitro experiments is required to verify our findings.

Highlights
• To our knowledge, this is the first study to comprehensively explore the expression of RRAGB and its impact on COAD.
• RRAGB could serve as an independent prognostic factor for predicting the prognosis of COAD patients.
• RRAGB was significantly associated with MSI, TMB and immunity.
• GSEA identified five RRAGB-related signaling pathways.
• An RRAGB-based nomogram was also constructed to guide the prognosis of COAD patients.