Identification of hsa_circ_0002024 as a prognostic competing endogenous RNA (ceRNA) through the hsa_miR_129-5p/Anti-Silencing Function 1B Histone Chaperone (ASF1B) axis in renal cell carcinoma

ABSTRACT We aimed to identify novel circular RNAs (circRNAs) as prognostic competing endogenous RNAs (ceRNAs) to serve as genetic biomarkers and therapeutic targets for renal cell carcinoma (RCC). High-throughput sequencing data of circRNAs from Gene Expression Omnibus (GEO) and of microRNAs (miRNAs) and messenger RNAs (mRNAs) from The Cancer Genome Atlas (TCGA) were retrieved to identify differentially expressed RNAs (DERNAs). DEmRNAs were subjected to weighted gene coexpression network analysis (WGCNA) to identify prognostic DEmRNA (proDEmRNA) modules. Overlapping DEcircRNA-DEmiRNA and DEmiRNA-proDEmRNA interactions among the TargetScan, miRanda and RNAhybrid databases were constructed and identified. The circRNA-miRNA-mRNA ceRNA network was constructed using mutual DEmiRNAs in two interaction networks as nodes. mRNAs validated as significantly overexpressed in RCC by Oncomine, Gene Expression Profiling Interactive Analysis (GEPIA) and quantitative polymerase chain reaction (q-PCR), along with the correlative miRNAs, were used for survival analysis. Finally, a ceRNA network with 13 upregulated circRNAs, 8 downregulated miRNAs and 21 upregulated mRNAs was constructed, in which Anti-Silencing Function 1B Histone Chaperone (ASF1B) and Forkhead Box M1 (FOXM1) were considered significant by Oncomine, GEPIA and q-PCR. Survival analysis showed that ASF1B, FOXM1 and hsa_miR_1254 were significantly negatively correlated but hsa_miR_129-5p was positively correlated with overall survival time. Exploration of the ceRNA network revealed the prognostic hsa_circ_0002024/hsa_miR_129-5p/ASF1B axis. Therefore, hsa_circ_0002024 was identified as a prognostic ceRNA that might sponge hsa_miR_129-5p to regulate ASF1B and affect RCC prognosis. However, further validation is needed.


Introduction
Currently, renal cell carcinoma (RCC) is one of the most commonly diagnosed uro-oncological diseases, second only to bladder cancer [1]. RCC can be histologically classified into three major types: clear cell RCC (~80%), papillary RCC (10-15%) and chromophobe RCC (~5%) [2]. Approximately 3-5% of RCCs are familial hereditary, and up to 92% of clear cell RCCs exhibit inactivation of the Von Hippel-Lindau (VHL) gene [3,4]. Several syndromes, including VHL syndrome, hereditary clear cell RCC syndrome, etc., have been reported to increase the risk of RCC; however, the genetic association remains poorly characterized [1]. Since the 1970s, the morbidity of kidney disease has been increasing worldwide, and more than 90% of these deaths are attributed to RCC [5,6]. Although multimodal therapeutic approaches such as surgery, chemotherapy, radiotherapy and targeted therapy are available, the prognosis of RCC remains poor primarily due to the delay in diagnosis and high incidence of metastasis and recurrence [7,8]. Moreover, most patients with RCC ultimately develop drug resistance, even to targeted drugs [9]. As RCC is a histologically heterogeneous, genetically complex and prognostically poor malignant tumor, exploring the molecular mechanism of RCC to discover novel genetic biomarkers and therapeutic targets to allow its early detection and improve its prognosis is critical.
Recently, a new RNA crosstalk mechanism, named a competing endogenous RNA (ceRNA) network, has been a popular topic in cancer research. The ceRNA hypothesis, which states that messenger RNAs (mRNAs) and noncoding RNAs can communicate with each other via microRNAs (miRNAs), was first proposed by Salmena et al. in 2011 [10], and its role in cancer was further demonstrated by Karreth et al. in 2013 [11]. Typically, noncoding RNAs can interact with miRNAs to block their negative regulatory effects on mRNA expression and therefore affect the disease phenotype. As an emerging cancer biomarker and target, circular RNAs (circRNAs) can also serve as ceRNAs to regulate and control cancer progression in humans [12,13]. CircRNAs are long noncoding RNAs generated in a covalently closed-loop structure from introns, exons, untranslated regions or intergenic areas in the genome [14]. Studies have demonstrated that several circRNAs can affect the initiation and development of RCC by sponging miRNAs to regulate mRNA expression; however, the specific mechanism remains unclear [15].
In the present study, we performed a systematic study combining bioinformatics analysis using the Gene Expression Omnibus (GEO), The Cancer Genome Atlas (TCGA), Oncomine, and Gene Expression Profiling Interactive Analysis (GEPIA) databases and experimental validation by quantitative polymerase chain reaction (q-PCR) of RCC cells compared to normal kidney cells. Studies have shown that ceRNA network construction and weighted gene coexpression network analysis (WGCNA) can be used to identify RNA crosstalk networks and prognostic gene modules [16,17]. By using these approaches as our principal methods, we aimed to construct a prognostic circRNA-miRNA-mRNA ceRNA network and identify prognostic circRNA-miRNA-mRNA axes. Furthermore, multiple validation analyses were performed to identify novel circRNAs as diagnostic biomarkers and therapeutic targets for RCC.
The flow chart of the present study is shown in Figure 1.

Preprocessing of RNA sequencing data and collection of clinical information
We searched GEO (www.ncbi.nlm.nih.gov/geo/) using the keyword 'renal cell carcinoma AND circ*' to select high-throughput circRNA sequencing datasets of RCC published on or before 11 March 2021. The Sequence Read Archive (SRA) files and clinical information of the selected datasets were downloaded from SRA Run Selector (https://www.ncbi.nlm.nih. gov/Traces/study) for further analysis in the Linux operating system. Paired-end SRA files were divided into two single-end fastq compressed files using the Fastq-dump function in sra-tools software (version 2.10.0) [18]. After adapter trimming using Trim Galore (version 0.6.4) (www.bioinformatics.babra ham.ac.uk/projects/trim_galore) and removal of lowquality reads (N base % > 5% or Q20 < 80%), the filtered reads were aligned to the hg19 reference genome/transcriptome from the UCSC Genome Browser (genome.ucsc.edu) with the BWA-MEM function in BWA software (version 0.7.17) [19]. CircRNAs were identified with CIRI software [20], annotated in the CircBase database (www.circbase.org) and finally entered into an expression matrix.
Data on miRNAs and mRNAs were retrieved from the kidney chromophobe (KICH), kidney renal clear cell carcinoma (KIRC) and kidney renal papillary cell carcinoma (KIRP) projects in the TCGA database (cancergenome.nih.gov) and analyzed with R software (version 3.6.1) [21]. We used the TCGAbiolinks package (version 2.15.3) [22] to download the high-throughput sequencing counts of miRNAs and mRNAs as well as the relevant clinical information. The miRNA and mRNA expression matrices from the above three projects were merged.

Differential expression analysis of circRNAs, miRNAs and mRNAs
EdgeR [23] is a package in R that can be used to identify differential expression in count-based expression data using an overdispersed Poisson model and an empirical Bayes method. It was applied for normalization and differential expression analysis of circRNAs, miRNAs and mRNAs between RCC tissues and normal kidney tissues. We filtered out RNAs with an expression count < 1, and the counts for duplicate RNAs were averaged. RNAs with a |Log (fold change (FC))|>2 and statistical p-value < 0.05 were considered differentially expressed RNAs (DERNAs) and included differentially expressed circRNAs (DEcircRNAs), differentially expressed miRNAs (DEmiRNAs) and differentially expressed mRNAs (DEmRNAs), which were further visualized in volcano plots using the ggplot2 package [24].

WGCNA of the DEmRNAs
WGCNA [25] is an algorithm used for the identification, summarization, membership measurement and related analysis of correlated gene modules (gene modules to gene modules or gene modules to external sample traits) and has been widely used to identify relevant genes with prognostic value in many cancers. Due to the limited quantity of DEcircRNAs and DEmiRNAs, we analyzed only DEmRNAs in RCC tumor samples using the WGCNA package in R language to evaluate gene interactions and identify coexpression modules. We calculated Pearson correlation coefficients to demonstrate the influence of the soft-thresholding power value on the scale independence and mean connectivity and subsequently chose a softthresholding power with a corresponding scale-free topology fit index reaching 0.95 and a corresponding maximum mean connectivity. By transforming the adjacency matrix into a topology matrix, applying the static tree cut method and setting the minimum number of genes in a module to 20, we identified coexpression gene modules and differentiated the modules by colors. Finally, the clinical information related to five prognostic factors in RCC, including tumor grade, T stage, N stage, M stage and survival time, was used to determine module-trait relationships by calculating the Pearson correlation coefficient, and the data were visualized in a heat map. With the cutoff criterion of p < 0.05, modules significantly positively related to tumor malignancy (grade and stage) and negatively related to survival time were considered prognostic DEmRNA (proDEmRNA) modules.

Gene ontology (GO) annotation analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of the proDEmRNAs
Genes in the proDEmRNA modules were subjected to GO [26] annotation analysis and KEGG [27] pathway enrichment analysis using the clusterProfiler package [28] in R. GO annotations are classified in three components, namely, biological process (BP), cellular component (CC) and molecular function (MF), and GO terms with an adjusted p < 0.05 and a gene count >10 were considered significant. The cutoff criterion was set as an adjusted p < 0.05 in KEGG analysis to identify the pathways significantly enriched with the DEmRNAs.

Construction of the circRNA-miRNA-mRNA ceRNA network
TargetScan (www.targetscan.org), miRanda (www. miranda.org) and RNAhybrid (bibiserv.cebitec. uni-bielefeld.de/rnahybrid/) were used to explore the network of circRNAs, miRNAs and mRNAs. We used the local tools of the three databases to explore DEcircRNA-DEmiRNA and DEmiRNA-proDEmRNA interactions, and subsequently, the Venn web tool (bioinformatics.psb.ugent.be/webtools/Venn) was applied to identify the overlapping interactions from the three databases. Finally, using mutual DEmiRNAs in two interaction networks as nodes and considering the typical ceRNA regulation method in which a circRNA sponges an miRNA to negatively regulate the miRNA and in turn promote the expression of the target mRNA (i.e., only upregulated circRNAs, downregulated miRNAs and upregulated mRNAs were preserved in the network), we constructed a circRNA-miRNA-mRNA ceRNA network and visualized it with Cytoscape software (version 3.6.1) [29].

Validation of prognostic markers with Oncomine and GEPIA
To further verify the prognostic significance of the ceRNA network, we conducted comparative expression analysis of the mRNAs in Oncomine (www.oncomine.org) [30] and GEPIA (gepia.cancer-pku.cn/index.html) [31]. As Oncomine provides integrated gene expression analysis data of multiple datasets, we input 'renal cell carcinoma vs. normal analysis' in the filter section and selected datasets comparing mRNA expression between RCC and normal kidney tissues. The expression of prognostic mRNAs was compared between cancer tissues and normal tissues across the above datasets, and the comparison data with median ranks and combined p-values were automatically generated. The expression of prognostic mRNAs validated in Oncomine was further compared (TCGA RCC tissue vs. TCGA normal kidney tissue + Genotype-Tissue Expression (GTEx) project normal kidney tissues) with the cutoff criteria of FC > 1.5 and p-value < 0.01 in the GEPIA database for clear cell RCC, papillary RCC and chromophobe RCC, separately.

Cell culture
Three RCC cell lines (A498, 786-O and ACHN) and 1 normal kidney cell line (293 T) were purchased from the Chinese Academy of Sciences Shanghai Branch (China). RCC and normal kidney cells were cultured in different culture media (293 T cells in DMEM (HyClone), 786-O cells in RPMI-1640 medium (HyClone), and A498 and ACHN cells in MEM (HyClone)). All media were supplemented with 10% fetal bovine serum (Gibco, Invitrogen, USA) and cultured at 37°C in 5% CO 2 .

RNA isolation and q-PCR
Total RNA from 4 cell lines was isolated with TRIzol reagent (Invitrogen, USA), and cDNA was synthesized with an Evo M-MLV RT Kit with gDNA Clean for qPCR (Accurate Biotechnology, China). The expression of prognostic mRNAs validated in Oncomine and GEPIA was evaluated by a SYBR Green qPCR assay (Accurate Biotechnology, China) in an ABI 7500 Real-Time PCR System (Thermo Fisher Scientific, USA). The PCR primers used were as follows: Anti-Silencing Function 1B Histone Chaperone (ASF1B) forward, GACCTGGAG TGGAAGATCATTT; ASF1B reverse, GCCTGAAAGACAAACATGTGTC; Forkhead Box M1 (FOXM1) forward, GATCTGCGAGA TTTTGGTACAC; FOXM1 reverse, CTGCAGAAGA AAGAGGAGCTAT.

Survival analysis
Survival analysis was performed on TCGA data for patients stratified by the expression levels of the mRNAs verified as significant by Oncomine, GEPIA and q-PCR, along with the correlative miRNAs in the ceRNA network, using the survival package (version 3.1-11) [32]. Typically, mRNAs are cancer promoters and are negatively correlated with survival outcomes in the ceRNA network, while miRNAs have the opposite relationship. Therefore, we set p < 0.05 as the significance criterion to identify negative prognostic mRNAs and positive prognostic miRNAs. CircRNAs that sponged positive prognostic miRNAs to upregulate negative prognostic mRNAs in the ceRNA network were considered negative prognostic factors.

Results
We analyzed data from GEO and TCGA by differential expression analysis and WGCNA to identify DEcircRNAs, DEmiRNAs, and proDEmRNA modules, from which a circRNA-miRNA-mRNA ceRNA network was constructed. Via comparative mRNA expression analysis of the Oncomine and GEPIA databases, q-PCR in RCC and normal kidney cell lines, and survival analysis in TCGA, we validated a prognostic circRNA-miRNA-mRNA axis and identified a novel circRNA as a prognostic ceRNA in RCC.

WGCNA, GO annotation analysis and KEGG pathway analysis
A total of 889 RCC patients with 3807 identified DEmRNAs were included for WGCNA. After calculation of the soft-thresholding power, a threshold power of 3 was found to correspond with a scale-free topology fit index reaching 0.95 and the maximum mean connectivity and was therefore set as the cutoff threshold (Figure 3a). By applying the cutoff threshold  GSM2912685  63  male  \  GSM2912686  64  male  \  GSM2912687  53  female  \  GSM2912688  60  male  \  GSM2912689  53  female  \  GSM2912690  60  male  \  GSM2912691  61  male  \  renal  cell  carcinoma   GSM2912692  63  male  T1N0M0  GSM2912693  64  male  T1N0M0  GSM2912694  53  female  T1bN0M0  GSM2912695  60  male  T1bN0M0  GSM2912696  53  female  Unknow  GSM2912697  60  male  T1N0M0  GSM2912698  61  male  T1N0M0 and performing WGCNA, we identified 11 gene coexpression modules with more than 20 genes each (Figure 3b). Module-trait relationships were identified and visualized; the red module, with 50 genes, was considered a proDEmRNA module due to its significant positive correlation with tumor malignancy (p Grade = 1e-26, p T stage = 3e-22, p N stage = 9e-13, p M stage = 4e-25) and negative correlation with survival time (p Survival time = 5e-6) (Figure 3c). The proDEmRNAs were highly related to the cell cycle pathway in both the GO and KEGG analyses. In the GO analysis, although none of the significantly enriched MF terms contained more than 10 genes, the proDEmRNAs were enriched mainly in 18 BP terms related to the cell cycle (nuclear division, organelle fission, chromosome segregation, etc.) and classified in three relative CC terms (chromosome, spindle and microtubule) (Figure 4(a,b)). The cell cycle was also the most enriched pathway in the KEGG analysis (Figure 4c).

Comparative mRNA expression, q-PCR and survival analyses
Fifteen comparative analysis datasets were found in Oncomine, and the expression of 21 mRNAs was assessed in these datasets. Eighteen mRNAs showed no statistically significant difference between RCC and normal kidney samples, while three -ASF1B (p = 0.012), Ribonucleotide Reductase Regulatory Subunit M2 (RRM2) (p = 0.002) and FOXM1 (p = 0.024) -were considered significant (Figure 7). The expression data of the three significant mRNAs was further presented as boxplots in GEPIA, which showed that only ASF1B and FOXM1 were significantly overexpressed in all 3 subtypes of RCC (Figure 8). Figure 9 confirms the higher mRNA abundance Survival curves for patients stratified by the expression levels of two significant mRNAs and four correlative miRNAs (hsa_miR_129-5p, hsa_miR_193a-5p, hsa_miR_1254 and hsa_miR_4433a-5p) in the ceRNA network were constructed with R software. The survival analyses for mRNAs included 889 RCC patients and the survival analyses of miRNAs involving 860 RCC patients were conducted. While both mRNAs were significantly negatively correlated with overall survival time (all p < 0.001), hsa_miR_129-5p was the only positive prognostic miRNA (p = 0.0212) ( Figure 10). Therefore, we explored the circRNA-miRNA-mRNA interactions in the ceRNA network and identified hsa_circ_0002024 as a negative prognostic factor that acts by sponging and suppressing hsa_miR_129-5p to promote ASF1B expression in RCC. The prognostic hsa_-circ_0002024/hsa_miR_129-5p/ASF1B axis is marked with a dashed ellipse in Figure 6.

Discussion
As mRNAs encode proteins that participate in various BPs, any factor that interferes with the normal expression of mRNAs can possibly cause abnormal cell proliferation and differentiation and eventually lead to carcinogenesis. MiRNAs bind to specific 3 -untranslated regions (3 -UTRs) of mRNA transcripts to cause mRNA degradation and regulate downstream signaling pathways [33,34]. As a newly discovered RNA species, circRNAs are considered to have multiple functions, such as protein translation, participating in circRNA-protein interactions and, most importantly, acting as ceRNAs [35]. The structural stability of the closed loop of circRNAs provides natural resistance to exoribonucleases, which  makes circRNAs highly stable in the cytoplasm [36]. Given their natural stability, circRNAs have been described as reliable potential regulators in multiple cancers and could possibly serve as a promising biomarker and novel therapeutic target [37].
This systematic study combining bioinformatics analysis and experimental validation was performed to identify novel prognostic circRNAs as diagnostic biomarkers and therapeutic targets for RCC. We used differential expression analysis to identify DERNAs and then applied WGCNA to identify the red proDEmRNA module. A ceRNA network was constructed among the DEcircRNAs, DEmiRNAs and red module, in which two mRNAs, ASF1B and FOXM1, were validated as significant by Oncomine, GEPIA and q-PCR. The two validated mRNAs, along with the four correlative miRNAs (hsa_miR_129-5p, hsa_miR_193a-5p, hsa_miR_1254 and hsa_miR_4433a-5p) in the ceRNA network, were used for survival analysis to identify the positive survival-related mRNAs  (ASF1B and FOXM1) and negative survivalrelated miRNA (hsa_miR_129-5p). Based on the interactions in the ceRNA network, the hsa_-circ_0002024/hsa_miR_129-5p/ASF1B axis was identified; thus, hsa_circ_0002024 was revealed to be the prognostic ceRNA in RCC.
ASF1B, a subtype of ASF1, encodes a histone H3-H4 chaperone protein, which is the substrate of the tousled-like kinase family of cell cycleregulated kinases and may catalyze the assembly and disassembly of the nucleosome structure of chromatin. When the nucleosome structure of chromatin is not appropriately modulated, diseases such as cancers occur [48,49]. Studies have demonstrated that ASF1 regulates chromatin function and promotes cancer development, especially the ASF1B subtype, which has been reported as a promoter of multiple cancers [50]. Both ASF1B and hsa_miR_129-5p have been demonstrated to contribute to the same cancers, for example, breast cancer [51,52], prostate cancer [53,54] and RCC [50,55], although no interactions have been established. However, both Zhou et al. [50] and Chiang et al. [55] found that ASF1B and hsa_miR_129-5p were involved in AKT signal transduction pathway activation in RCC. The AKT signal transduction pathway regulates many cellular processes, such as survival, proliferation, growth, metabolism, angiogenesis and metastasis [56], and its hyperactivation has been abundantly demonstrated to be involved in the initiation, progression, and drug resistance of many cancers; thus, it is a therapeutic target in cancer [57]. Collectively considering the evidence that the AKT pathway plays a critical role in malignant tumors with the results of the present study, we can hypothesize that hsa_circ_0002024 sponges hsa_miR_129-5p to regulate ASF1B and increase the occurrence, metastasis and fatality rate of RCC via the AKT pathway.
This study has several limitations, such as methodological bias, data heterogeneity, experimental simplicity and lack of in vivo experimental validation. These limitations contribute to the differences in the results and impact the reliability of this study.

Conclusion
In summary, we identified hsa_circ_0002024 as a novel diagnostic biomarker and therapeutic target ceRNA. Hypothetically, hsa_circ_0002024 can sponge hsa_miR_129-5p to impact its binding to ASF1B, thereby resulting in overexpression of ASF1B and eventually leading to cell cycle dysregulation and an aberrant nucleosome structure in chromatin. These events play a role in the occurrence and development of RCC, possibly via the AKT signal transduction pathway. However, further biological studies are necessary to verify our research findings.