A hypoxia related long non-coding RNA signature could accurately predict survival outcomes in patients with bladder cancer

ABSTRACT Hypoxia plays a significant role in tumor progression. This study aimed to develop a hypoxia-related long noncoding RNA (lncRNA) signature for predicting survival outcomes of patients with bladder cancer (BC). The transcriptome and clinicopathologic data were downloaded from The Cancer Genome Atlas (TCGA) database. Univariate Cox regression analysis and Lasso regression analysis were used to screened lncRNAs. Ten lncRNAs were screened out and included into the hypoxia lncRNA signature. The risk score based on hypoxia lncRNA signature could accurately predict the survival outcomes of BC patients. Immune infiltration analysis showed that six types of immune cells had significant different infiltration. Tumor mutation burden (TMB) analysis showed that the risk scores between the wild types and the mutation types of TP53, FGFR3, and RB1 were significantly different. Gene Set Enrichment Analysis (GSEA) showed that cancer-associated pathways belonged to the high risk groups and immune-related signal pathways were enriched into the low risk group. Then, we constructed a predictive model with the risk score, age, and clinical stage, which showed a robust prognostic performance. An lncRNA-mRNA coexpression network was constructed, which contained 62 lncRNA-mRNA links among 10 lncRNAs and 40 related mRNAs. In summary, the hypoxia lncRNA signature could accurately predict prognosis, chemotherapy and immunotherapy response in patients with BC and was relevant to clinicopathologic parameters and immune cell infiltration.


Introduction
Bladder cancer (BC) is the second most common cancer in urinary and male reproductive system, and has a clear male predominance [1,2]. More than 90% of cases were urothelial carcinoma and 25% of patients with BC were presented with muscle invasive disease [3]. Although minimally invasive technology, neoadjuvant chemotherapy, and radiotherapy were introduced to treat BC, the survival rate of patients does not improve in the past 30 years [1,4]. Moreover, the staging system based on pathological parameters still requires improvement to predict patients' survival accurately [5]. Recently, an increasing number of researches have focused on the molecular biomarkers [3]. The molecular subtype's classes of BC could stratify responses to treatment and predict the survival outcomes effectively, which might be incorporated into clinical management in the future [6,7].
LncRNAs, a subtype of noncoding RNA family, have more than 200 nucleotides in their transcripts and play an indispensable roles in regulating initiation and progression of tumors [5,18]. To date, lots of studies have investigated hypoxia-associated lncRNAs and revealed their important regulatory roles in cancer cells [19][20][21][22][23]. It was reported that lncRNA-RMRP, lncRNA SNHG3, lncRNA GAS6-AS2, lncRNA CCAT1, and lncRNA GClnc1 could promote proliferation, migration, and invasion of BC cells [24][25][26][27][28]. Chen once reported that lncRNA LNMAT1 in cytoplasm promoted lymphatic invasion in BC. Meanwhile, lymphatic endothelial cell internalizing lncRNA LNMAT1 in exosomes could improve tube formation and migration in vitro [29,30]. It is due to the significance of lncRNA that Qian once conducted a meta-analysis to analyze the association between lncRNA expression and survival, and demonstrated that lncRNA could serve as diagnostic and prognostic biomarkers in BC.
Whether we could build a hypoxia-associated lncRNA signature to predict the prognosis of BC patients more accurately compared with traditional clinicopathologic parameters, and to provide some clinical guidance for chemotherapy and immunotherapy has become our goal of the study. Herein, we have successfully constructed a hypoxia-related lncRNA signature and investigated its performance and relationships with other clinicopathological variables in BC. Interestingly, the hypoxia related lncRNA signature could accurately predict the chemotherapy and immunotherapy response in patients with different risk scores, which may contribute to decision-making in the management of BC.

Data collection
The transcriptome and clinical data of patients were downloaded from the TCGA database (https://portal.gdc.cancer.gov/). Patients with survival time less than 30 days were excluded. Finally, 396 cases were incorporated into our study. The Ensembl human genome browser (http://grch37.ensembl.org/index.html) was used to classify and annotate lncRNAs and proteincoding genes. Moreover, hallmark genes of hypoxia (total 200 genes) were downloaded from the hallmark gene sets of the Molecular Signatures Database (https://www.gsea-msigdb. org/gsea/msigdb/index.jsp). Since data of participants were acquired from the public database, written informed consent and approval of the ethics committee were waived.

Identification of hypoxia associated LncRNAs
The 'limma' package was used in the R software to calculate Pearson correlation coefficients, which were employed to analyze the correlation between hypoxia hallmark genes and lncRNAs in TCGA dataset. The hypoxia-related lncRNAs with absolute value of correlation coefficient greater than 0.3 (|R|>0.3) and P value less than 0.05 (P < 0.05) were screened out and used to construct the hypoxia-related signature [5].

Construction of the prognostic-related hypoxia LncRNA signature
All hypoxia-related lncRNAs were screened using univariate Cox regression analysis, to identify prognosis associated lncRNAs with P-value < 0.05. Then, the screened prognosis-associated lncRNAs were used to make Gene Ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis with 'clusterProfiler' package and 'enrichplot' package in R software. Moreover, the screened prognosis-related hypoxia lncRNAs were incorporated into Lasso regression model [31], in which penalties were applied to all prognosis-associated hypoxia lncRNAs for preventing overfitting effects of the model. Penalty parameter (λ) for the model was determined by 10-fold crossvalidation following the criteria that error was within 1 standard error of the minimum. After that, the selected lncRNAs constituted the hypoxia signature and could generate risk scores in multivariate Cox regression model with the following formula:

Evaluation of the hypoxia LncRNA signature
All BC patients were divided into two groups in the light of the mean risk score, and the Kaplan-Meier method was used to compare overall survival (OS) of patients in different risk groups [32]. Principle component analysis (PCA) was used to evaluate the distribution of genes expression in patients with different risk levels. Stratified survival analysis was performed to examine the accuracy and stability of the hypoxia lncRNAs signature. Risk scores were compared in subgroups stratified by clinicopathological parameters to explore potential relationships between the signature and clinicopathological parameters.

Immune infiltration analysis, tumor mutational burden analysis, and gene set enrichment analysis
The CIBERSORT tool was used to estimate the contents of 22 human immune cells in each BC patient. Next, we compared the infiltrating immune cells in the high-and the low-risk groups and identified the significantly different immune infiltrating cells [11]. The tumor mutational data were downloaded from TCGA and the maftools package were used to analyze the mutational data in both the highand the low-risk groups. TMB was calculated with the tumor specific mutation genes [33]. We listed the top mutational genes and compared the risk scores in mutational-and wild-type cohorts.
We uploaded RNA-seq profiles to GSEA to investigate that differentially expressed generelated signaling pathways in the high-risk group and the low-risk group. The enriched set were screened based on a FDR < 0.25 and P < 0.05 after 1000 permutations [32].

Correlation of the hypoxia signature with clinical parameters
The risk score and other clinical parameters in the TCGA dataset were incorporated into univariate Cox regression and multivariate Cox regression to evaluate whether the risk score was an independent prognostic predictor, and then ROC curves were used to calculate the predictive accuracy of the risk score and other clinicopathological parameters [33].

Development of a predictive nomogram based on clinical parameters and the risk score
Age, gender, AJCC-stage, grade, and hypoxiarelated risk score were incorporated into the univariate Cox regression analysis and multivariate Cox regression analysis, and then ROC curves were used to evaluate the predictive accuracy of the risk score and other clinicopathological parameters. After that, we picked up the independent predictive factors with P < 0.05 and AJCC-stage to build a Cox regression model and presented it with a nomogram to facilitate clinical practice [33]. The area under curve (AUC), Brier scores, and calibration plots were used to assess the discrimination and calibration of the model in 1, 3, and 5 years. The simple bootstrap strategy was used to validate the model internally [32,33].

Construction of the LncRNA-mRNA coexpression network and function enrichment analysis
Pearson correlation analysis was used to explore hypoxia lncRNAs correlated mRNAs with absolute value of correlation coefficient greater than 0.3 (|R| > 0.3) and P value less than 0.05 (P < 0.05) [5]. Next, the Cytoscape software (Version 3.7.2) (https://cytoscape.org/) was applied to construct and visualize the lncRNA-mRNA coexpression network.
The hypoxia lncRNA-associated mRNAs were uploaded to DAVID database (https://david.ncifcrf.gov/home.jsp) to make GO enrichment analysis and KEGG pathway analysis. P < 0.05 was deemed as statistically significant [5].

Prediction of chemotherapy response
Public pharmacogenomics database Genomics of Drug Sensitivity in Cancer (GDSC) was used to evaluate and predict the chemotherapy response of BC patients in different risk groups in TCGA database [34]. The half-maximal inhibitory concentration (IC50), a significant predictor of chemosensitivity, was compared between the high and low risk groups, and P < 0.05 was considered statistically significant.

Prediction of immunotherapy response
TMB, Programmed cell death 1 ligand 1 and 2 (PD-L1 and PD-L2), and microsatellite instable (MSI) in tumor tissue were deemed as potent biomarkers for predicting immunotherapy response [35]. Therefore, we first analyzed the TMB in both risk groups and explored the correlation between risk score and TMB. Then, the expression of CD274 (PD-L1) and PDCD1LG2 (PD-L2) in both different risk groups were calculated and compared at the transcriptional level. Moreover, we also calculated and compared the transcriptional expression of significant mismatch repair genes in tumor samples, including MLH1, MSH2, MSH6, and PMS2.
Finally, we used The Cancer Immunome Atlas (TCIA) database (https://tcia.at/) to generate the immunophenoscore (IPS) in each sample, which was a superior predictor of response to anticytotoxic T lymphocyte antigen-4 (CTLA-4) and anti-programmed cell death protein 1 (PD-1) [36], and then the IPS in different risk groups were compared to explore the relationships between risk scores and IPS. P < 0.05 was consideredstatistically significant.

Statistical analysis
All statistical analyses were performed by using the R software (Version 4.0.2) (https://www.r-project. org/) and GraphPad Prism 8. Quantitative data in two groups were compared using Student t-test and quantitative data in three or more groups were compared with one-way analysis of variance (ANOVA) or Welch's test. P < 0.05 was considered statistically significant.

Results
BC is a common cancer in urinary and male reproductive system. The rapid proliferation of tumor cells and the aberrant vascularization in tumor stroma resulted in hypoxia status in tumor microenvironment, which could enhance the invasion of malignancies and had closely relationships with treatment resistance. Along with the wide application of next-generation sequence (NGS) in clinical practice, an increasing number of biomarkeroriented studies have become more and more popular. Herein, we aimed to build a hypoxia-associated lncRNA signature to predict the prognosis of BC patients more accurately compared with traditional clinicopathologic parameters, and to be helpful for decision-making, especially for those patients required chemotherapy and immunotherapy.

Selection of hypoxia-related LncRNA and construction of hypoxia LncRNA signature
The flowchart showed the main procedures of our study (Graphical Abstract). The basic characteristics of BC patients in the TCGA database were presented in Table 1. We classified the RNA sequencing data and extracted 14,142 lncRNAs in TCGA database. Meanwhile, a total of 200 hypoxia-related genes were acquired from Molecular Signature Database, among which 195 genes were expressed in BC samples (Supplementary Table 1). Pearson correlation analysis were applied to analyze the correlation between hypoxia related genes and lncRNAs, and 1423 hypoxia related lncRNAs were identified with the criteria that |R| > 0.3 and P < 0.05 (Supplementary Table 2). We first analyzed 1423 hypoxia-associated lncRNAs with GO enrichment and found that these lncRNAs were relevant to catabolic or metabolic process, which were in accordance with hypoxia-related biological process (Supplementary Figure 1). Then, KEGG enrichment was applied to explore the associated enriched pathways, but no significant pathways were enriched due to the lack of noncoding RNA information in enriched pathways. Univariate Cox regression analysis were used to select prognosis related lncRNAs among the 1423 hypoxia-related lncRNAs and 248 prognosis-related hypoxia lncRNAs were screened out (Supplementary Table 3). In order to further screen variables and prevent overfitting, lasso regression analysis was used and finally 10 prognosis-related hypoxia lncRNAs were selected to construct the hypoxia lncRNA signature (Figures 1a  and 1b). Among them, eight lncRNAs (AL031775.1, USP30-AS1, AC024060.1, AL162586.1, AP003352.1, PSMB8-AS1, AC016957.2, and STAG3L5P-PVRIG2P-PILRB) were deemed as protective predictors with hazard ratio < 1. While AC105942.1 and MAFG-DT were regard as harmful with hazard ratio > 1 (Figure 1c).

Evaluation of the hypoxia LncRNA signature
All 10 prognosis-related hypoxia lncRNAs were incorporated into multivariate Cox regression analysis to generate risk scores in the light of the formula described in Method and Materials. The patients with different risk scores were divided into highrisk and low-risk groups using the mean risk score as a cutoff value in the TCGA database. Kaplan-Meier analysis showed that patients in the low-risk group had better OS than that in the high-risk group, which demonstrated the excellent discrimination of this signature (Figure 2a). Principle component analysis based on gene expression showed two significantly distinct distribution between patients in highand low-risk groups ( Figure 2b). As for patients with lower risk scores, they usually had lower mortality rates and longer survival time compared with ones with higher risk scores (Figures 2c and 2d). Furthermore, the expressions of AC105942.1 and MAFG-DT increased notably with the increment of risk scores, while the expressions of AL031775.1, USP30-AS1, AC024060.1, AL162586.1, AP003352.1, PSMB8-AS1, AC016957.2, and STAG3L5P-PVRIG2P-PILRB decreased distinctly, which were in accordance with hazard ratio of these predictors (Figure 2e).

Relationship between hypoxia LncRNA signature and clinicopathological parameters
First, in order to further verify accuracy of this hypoxia lncRNA signature, patients in the TCGA database were stratified according to age (≥ 60 y and < 60 y), gender (female and male), AJCC stage (I+ II and III+IV), T stage (T1-T2 and T3-T4), N stage (N0 and N1-3), M stage (M0 and  The prognosis-related hypoxia lncRNAs screened by univariate Cox regression were incorporated into Lasso regression analysis. Penalty parameter (λ) for the model was determined by 10-fold cross-validation following the criteria that error is within 1 standard error of the minimum. (c) Ten prognosisrelated hypoxia genes were incorporated into multivariate regression model and used to generate hypoxia-associated risk scores. (P = 3.052e−03), metastasis-free subgroup (P = 9.086e−06), high AJCC-stage (P = 7.127e −03), low AJCC-stage (P = 7.776e−04), and high pathological grade (P = 2.224e−11) (Figure 3b-l), while the difference of OS time was not significant in subgroups of age ≤ 60 (P = 1.286e−01), metastasis subgroup (P = 2.802e−01), and low pathological grade (P = 1e+00) (Figure 3a and Supplementary Figure 2). Secondly, we compared the risk scores in different subgroups stratified by clinicopathological parameters and found that the risk scores were significantly higher in subgroups of advanced T stage (T3 and T4), nodule-metastasis (N1-N3), metastasis (M1), high pathological grade, and advanced AJCC stage (Stage III and Stage IV) (Figure 4c-g). It suggested that the risk score based on the hypoxia lncRNA signature had a correlation with pathologic parameters. Moreover, we observed that the difference of the risk score in different age subgroups and gender subgroups were not significant (Figures 4a  and 4b).

Tumor mutational burden analysis
The maftool package in R software was employed to summarize and analyze the mutational data in the TCGA database. In order to compare different mutational genes, the top 20 mutational genes were listed in high-and low-risk groups (Figures 5a and 5b). We found that TP53, TTN, ARID1A, MUC16, KMT2D, SYNE1, PIK3CA, KDM6A, MACF1, HMCN1, RYR2, KMT2C, OBSCN, and FAT4 were the most frequent mutational genes in both risk groups. RB1, EP300, CREBBP, AKAP9, ERCC2, and SYNE2 belonged to top 20 frequent mutational genes in the highrisk group, while FGFR3, ATM, STAG2, SPTAN1,   Stage III and Stage IV). Moreover, the difference of the risk score in different age subgroups and gender subgroups was not significant (a and b). *P < 0.05; **P < 0.01; *** P < 0.001; **** P < 0.0001; ns: not significant. The risk scores between the wild type and the mutation type of the top frequent mutational genes were compared (c). The vioplots showed that 22 immune cells content in the high-risk and low-risk groups (d). *P < 0.05; **P < 0.01; *** P < 0.001; ns: not significant. DNAH11, and FLG were parts of the top 20 frequent mutational genes in the low-risk group. Furthermore, the risk scores between the wild type and the mutation type of the top frequent mutational genes were compared. The risk scores in the mutation types of TP53 and RB1 were significantly higher than that in the wild types, while the risk scores in the wild type of EGFR3 were higher than that in the mutational type (Figure 5c).

Relationship between hypoxia signature and immune cells infiltration
Twenty-two immune cells types were evaluated in the TCGA database, and six sorts of immune cells types were significantly different between the highrisk group and the low-risk group, including CD8 + T cells, activated memory CD4 + T cells, follicular helper T cells, regulatory T cells, eosinophils, and neutrophils ( Figure 5d). The contents of CD8 + T cells, activated memory CD4 + T cells, follicular helper T cells, and regulatory T cells were significantly higher in the low-risk group, while the contents of eosinophils and neutrophils were much higher in the high-risk group.

Gene set enrichment analysis
We collected the top 50 KEGG molecular pathways that differentially expressed genes enriched to in high-risk groups. Interestingly, some cancer-associated pathways were enriched into the high risk groups, including acute myeloid leukemia, basal cell carcinoma, BC, chronic myeloid leukemia, colorectal cancer, endometrial cancer, ERBB signaling pathway, MAPK signaling pathway, glioma, melanoma, nonsmall cell lung cancer, P53 signaling pathway, pancreatic cancer, pathways in cancer, prostate cancer, renal cell carcinoma, small cell lung cancer, TGF-β signaling pathway, thyroid cancer, and WNT signaling pathway (Figure 6a). Moreover, there were only six irrelevant KEGG pathways with FDR < 0.25 and P < 0.05 enriched into the low-risk group, so we made a Gene Ontology (GO) enrichment analysis and collected the top 50 GO molecular pathways in low risk group. Some immunerelated signal pathways were enriched into the low-risk group, which suggested that activation of immune regulation and response related pathways might contribute to better OS (Figure 6b).

The hypoxia related LncRNA signature is an independent prognostic factor
Age, gender, clinical stage, T stage, N stage, and the risk score were all incorporated into univariate Cox regression and multivariate Cox regression to screen out the independent prognostic factors. Due to the lack of metastatic data in more than half of patients, we ignored metastasis as an independent prognostic factor to minimize data loss in the TCGA database.
Age, clinical stage, T stage, N stage, and the risk score based on hypoxia lncRNA signature were all prognostic-associated factors with P < 0.05 in univariate Cox regression (Figure 7a). Multivariate Cox regression analysis showed that age and the hypoxia lncRNA-related risk score were independent prognostic factors ( Figure 7b). The multivariate ROC showed that AUC of the risk score, age, gender, clinical stage,  (Figure 7c-e). It suggested that compared with traditional pathological prognostic factors, the hypoxia lncRNA-related risk scores had a preeminent prognostic performance.

Prognostic nomogram construction
We constructed a prognostic model presented with a nomogram, including the independent prognostic factors (age and the risk score) and traditional clinical stage in multivariable Cox regression analysis ( Figure  8a).  [16.4;22.3] in the whole TCGA database (training set), respectively (Figure 8b-d) [13.5;25.4] in internal validation set (Figure 8e-g).

Construction of the LncRNA-mRNA coexpression network and functional enrichment analysis
We used Pearson correlation analysis to screen out 40 hypoxia lncRNA-related mRNAs with |R|>0.3 and P < 0.05. An lncRNA-mRNA coexpression network were constructed, which contained 62 lncRNA-mRNA links among 10 lncRNAs and 40 related mRNAs (Figure 9a and Supplementary  Table 4). Then, we summarized and presented eight lncRNA-mRNA links with correlation coefficient larger than 0.4 (Figure 9b-i). As shown in the Figure 9, all seven lncRNA-mRNA links were positive correlation except AC024060.1 negatively correlated with ANXA2. The Sankey plot showed the relationships between 40 mRNAs and 10 hypoxia lncRNAs and clearly classified lncRNAs in the light of their biological functions (protect or risk) (Figure 9j). Top 10 GO enrichment terms were presented in Figure 9k and KEGG pathway analysis showed that metabolism and biosynthesis pathways were enriched, which were in concordance with hypoxia and reprogramming (Figure 9l).

Prediction of chemotherapy response
We used the 'pRRophetic' package to explore the GDSC database, and to investigate whether the risk score based on hypoxia lncRNAs could predict chemotherapy response in different risk groups. Gemcitabine (G), cisplatin (C), methotrexate (M), vinblastine (V), and doxorubicin (A) were selected to evaluate chemotherapy response in both risk groups, for they constituted fundamental GC or MVAC protocol in muscle-invasive BC [32]. Moreover, camptothecin, docetaxel, and thapsigargin were also used to evaluate response, for thapsigargin could induce endoplasmic reticulum stress-associated apoptosis in BC and camptothecin and docetaxel were widely used for intravesical chemotherapy in nonmuscle-invasive BC [37]. Interestingly, IC50 of camptothecin ( Figure 10A), vinblastine ( Figure 10B), and methotrexate ( Figure 10C) in the high-risk group were significantly higher than that in the low-risk group, which meant that the chemotherapy response rates of these drugs were lower in the high-risk   group. On the contrary, IC50 of cisplatin ( Figure 10D), docetaxel ( Figure 10E), and thapsigargin ( Figure 10F) were lower in the high-risk group, which indicated that the application of these drugs could be more beneficial for patients with higher-risk scores. Furthermore, the differences of IC50 in Gemcitabine and doxorubicin in both groups were not significant.

Prediction of immunotherapy response
TMB, PD-L1/PD-L2, and MSI in tumor tissue were deemed as potent biomarkers for predicting immunotherapy response [33,35]. We first analyzed the expression of CD274 (PD-L1) and PDCD1LG2 (PD-L2) in both different risk groups and found that PD-L1 and PD-L2 were slightly higher in the high-risk group, but the difference was not statistically significant ( Figures 11A and  11b). Then, we calculated and compared the transcriptional expression of significant mismatch repair genes in tumor samples and found that all four mismatch repair genes (MLH1, MSH2, MSH6, and PMS2) expressed significantly higher in the high-risk group ( Figure 11C-F), which signified that the microsatellites might be more stable in the high-risk group. Moreover, we also explored the relationship between the risk score and TMB, for an increasing number of studies showed that higher TMB predicted a better immunotherapy response [33]. Along with the increment of risk scores, the TMB decreased slightly with R = −0.075 and P = 0.13 ( Figure 11G). The Kaplan-Meier curves showed that patients with higher TMB and low-risk scores tended to have the best OS and those with lower TMB and high-risk scores usually had worst survival probabilities ( Figure 11H). Finally, we used the TCIA database to generate the IPS in each sample, which was a superior predictor of response to anti-CTLA-4 and anti-PD-1, and then the IPS of anti-CTLA-4 ( Figure 11I), anti-PD-1 ( Figure 11J), and anti-  , and PMS2 (f), expressed significantly higher in the high-risk group. Moreover, with the increment of risk scores, TMB decreased slightly with R = −0.075 and P = 0.13 (g) and the Kaplan-Meier curves showed that patients with higher TMB and low-risk scores tended to have the best overall survival and those with low TMB and high risk scores usually had worst survival probabilities (h). The IPS of anti-CTLA -4 (i), anti-PD-1(j), and anti-(CTLA-4 plus PD-1) (k) in the high-risk group was significant lower than that in the low-risk group, predicting that patients with higher risk scores had a worse immunotherapy response.
(CTLA-4 plus PD-1) ( Figure 11K) in the high-risk group was significant lower than that in the lowrisk group, which powerfully predicted that patients with higher risk scores had a worse immunotherapy response. Taken together, the all biomarkers above predicted that patients with higher risk scores tended to have worse immunotherapy response.

Discussion
BC ranks the 11th among all diagnosed cancers in the world, which caused more than 16,000 deaths in the United State and have a clear male predominance [2,3,38]. While the 5-year survival has improved for many other malignancies, such as melanoma and hepatocellular carcinoma, it has remained largely unchanged in BC [1]. To date, an increasing number of researches have focused on hypoxia, a hallmark of tumor microenvironment, for it is closely related to reprogramming, invasion, and metastasis of tumor [8][9][10]. Hypoxia-related lncRNAs have been studied to seek out potential therapeutic targets. It was reported that lncRNA-RMRP, lncRNA SNHG3, lncRNA GAS6-AS2, lncRNA CCAT1, and lncRNA GClnc1 could promote proliferation, migration, and invasion in BC [24][25][26][27][28].
It is due to the significant roles of hypoxiarelated lncRNAs that we first constructed a signature with hypoxia lncRNA to predict patients' survival outcomes in BC. We used AC105942.1, AL031775.1, USP30-AS1, AC024060.1, AL162586.1, AP003352.1, PSMB8-AS1, AC016957.2, STAG3L5P-PVRIG2P-PILRB, and MAFG-DT to construct a hypoxia signature with a robust performance. The AUC of our signature in 1, 3, and 5 years were 0.733, 0.808, and 0.818, respectively, which were much higher than the AUC of AJCC-stage, T stage, and N stage ( Figure 7). Then, we constructed a predictive model with risk score, age, and stage in TCGA database and presented it with a nomogram. 1000 times simple bootstraps were used to validate the nomogram internally. The Concordance index and AUC were used to evaluate the model's discrimination, and calibration plot and Brier scores were applied to estimate the calibration of this model, which all demonstrated the model's robust predict performance ( Figure 8).
We further verified the accuracy of this hypoxia lncRNA signature in subgroups stratified by clinicopathological parameters and found that, to a large extent, it could still distinguish the survival difference for patients in the high-and low risk groups ( Figure 3). As for the subgroups of age ≤ 60 (p = 1.286e−01), metastasis subgroup (p = 2.802e −01) and low pathological grade (p = 1e+00), limited sample size in the subgroups might result in an underestimated accuracy of the hypoxia lncRNA signature to distinguish the patients with different risk scores. Moreover, we compared the risk scores in different subgroups stratified by clinical parameters. It was worth noting that the subgroups with higher pathological stages or grades usually had higher risk scores, which illustrated that our molecular signature closely related to pathological parameters and had the potential to be an important prognostic indicator (Figure 4).
Immune cells infiltration and immune checkpoint in tumor tissue played a significant role in promoting or prevent the proliferation, invasion, and migration of cancer cells, so that the immunotherapy have become a new management in cancer, such as urothelial carcinoma [8,12]. To explore the relationships between the hypoxia lncRNA signature and immune cells infiltration, we compared the immune cells content of different risk score groups and found that the contents of CD8 + T cells, activated memory CD4 + T cells, follicular helper T cells, and regulatory T cells were significantly higher in the low-risk group, while the contents of eosinophils and neutrophils were much higher in the high risk group. Recent researches reported that CD4 + T lymphocytes and CD8 + T lymphocytes infiltrating in tumor could improve the efficacy of tumor-targeted vaccine or adoptive immune cell therapy, which had been proved effective in patients with melanoma, head and neck cancer, breast cancer, and lung cancer [44][45][46]. Moreover, Jóźwicki once reported that BC patients with decreased infiltration of CD4 + T cells had shorter OS and CD4 + T cells were an important prognostic index of BC [47]. Follicular helper T cells signature signified organized antitumor immunity and could robustly predicted better survival or preoperative response to chemotherapy [48]. All the three immune infiltrating cells discussed above played significant roles in immune-related antitumor effect and were significantly infiltrated into tumor microenvironment of patients with low risk scores, which also demonstrated close relationship between hypoxia lncRNA and infiltrating immune cells. It was believed that regulatory T cells (Tregs) were a type of immunosuppressive cells and could potentially regulate invasiveness in BC [49]. Unexpectedly, Treg cells significantly infiltrated into tumor microenvironment of patients with low-risk scores according to the findings of our study. Whether there was a potential certain relationship between hypoxia level and Treg cells in microenvironment of BC remained unclear. Similarly, eosinophils and neutrophils infiltrating in tumor microenvironment were seldomly studied in BC.
It was demonstrated that TMB was closely related to the immunotherapy response [50,51]. The more the mutational genes existed in tumor cells, the more the mutation-associated RNA and protein might be generated, which could be recognized and targeted by immune system [52]. In our study, we listed the top 20 mutational genes in the low-and high-risk groups, and then compared risk scores between the wild types and the mutation types of top mutational genes. We found that the risk scores in the wild type of FGFR3 were significantly higher than that in the mutation type. While the risk score in the mutational type of TP53 and RB1 was higher than that in the wild type ( Figure 5). FGFR3 was a carcinogenic driver and the mutation, activation, and overexpression of FGFR3 was common in BC [53,54]. Ahmad once explored the frequency of FGFR3 mutation in Indian BC patients and found that FGFR3 mutations were more common in earlier pathological stage and low-grade tumors [55]. TP53 and RB1 played a role of suppressor genes in all cancer, which encoded p53 and rb1 protein involved in regulating numerous target genes. Mutations in TP53 and RB1 were frequently observed and were closely related with poor prognosis of patients with BC [7,56,57]. Taken together, the effects caused by mutation of TP53, RB1, and FGFR3 were in accordance with the risks predicted by hypoxia risk scores in our study. Furthermore, we enriched different genes in high-and low-risk groups and found that cancer associated pathway were enriched in the high-risk group, while the immune-related pathway belonged to the low-risk group. The cancerassociated pathways were enriched in the highrisk group, which might imply poor prognosis of patients with high score, and the immune response associated pathways in the low-risk groups usually suggested an improved prognosis. Similarly, hypoxia lncRNA-correlated mRNA functional enrichment analysis showed that metabolism and biosynthesis pathways were enriched, which were in concordance with hypoxia and reprogramming.
Along with the rapid development of drug delivery vehicles, an increasing number of ingredients were used in clinical anticancer, including chemical drugs, immune drugs, and even some fungal-derived materials [58]. Finally, we explored the signature's predictive ability of chemotherapy and immunotherapy response and found that patients with higher risk scores tended to have lower response rates of camptothecin, vinblastine, and methotrexate. Conversely, application of cisplatin, docetaxel and thapsigargin could be more beneficial for patients with higher risk scores. Next, TMB, PD-L1 /PD-L2, MSI, and IPS were employed to evaluate patients with different risk scores, and it showed that patients with higher risk scores tended to have worse immunotherapy response. Taken together, our hypoxia-associated lncRNA signature could accurately predict chemotherapy and immunotherapy response in patients with BC, which may be helpful for clinical medical decision-making. Since development and validation of the signature in our study was based on TCGA database and TCIA database, it has not been verified in large-scale clinical samples. However, we believe that with the wide application of NGS and intelligent equipment in clinic practice, noncoding RNA detection based on tumor samples will become more and more feasible [59].

Conclusions
This is the first hypoxia lncRNA-related signature in BC, which could accurately predict the overall survival in patients with BC compared with the traditional pathological parameters. Moreover, the molecular signature has close relationships with clinicalpathological parameters, some certain infiltrating immune cells, and mutational genes in tumors. More verifications are required in future to validate the stability and practicability of the present signature.

Research Highlights
(1) A hypoxia 10-lncRNA signature was established to predict BC patients' prognosis. (2) The signature could predict chemotherapy and immunotherapy response accurately. (3) Functional enrichment analysis revealed potential effects of 10 lncRNAs in BC.