Integrated bioinformatics analysis of miRNA expression in Ewing sarcoma and potential regulatory effects of miR-21 via targeting ALCAM/CD166

Abstract MicroRNAs (miRNAs) play essential functions in pathogenesis of Ewing sarcoma (ES). However, the molecular mechanisms responsible for ES occurrence and development through the regulation of miRNAs remain largely unknown. This study is aimed to explore the differential expressed miRNAs and mRNAs that play vital roles in ES. GSE80201 miRNA and GSE68776 mRNA microarray dataset were selected to carry out a series of bioinformatics analysis such as GEO 2R, gene ontology, pathway enrichment analysis, Venn analysis and PPI network construction to predict hub genes. Furthermore, using quantitative real-time PCR, RNA interference and luciferase reporter assay we demonstrated that activated leukocyte cell adhesion molecule (ALCAM/CD166) is a direct target of miR-21-3p in human ES cell lines. Our results suggest that the miR-21/CD166 axis has the potential to serve as both diagnostic markers and therapeutic targets for ES.


Introduction
Ewing sarcoma (ES), the second most common primary bone cancer, is a soft malignant tumour that mainly affects children and adolescents with poor outcomes [1,2]. Up to 40% of the patients will eventually relapse after initial treatment [3]. The symptoms are unremarkable in early stages, high metastasis and recurrence rate have become the main obstacles in the treatment of ES [4]. Further understanding of the mechanisms responsible for metastasis and recurrence will lead to the development of more effective therapy.
MicroRNAs (miRNAs) are 18-22 nucleotide non-coding RNAs that inhibit protein translation and promote mRNA degradation through complementary binding the 3 0 -untranslated region (3 0 -UTR) of their target mRNAs [5]. It has been shown that miRNAs are involved in regulating multiple physiological processes as well as oncogenesis [6,7]. miRNAs influence tumour development by regulating tumour cell proliferation, growth, differentiation, metabolism and apoptosis, therefore, miRNAs can serve as both diagnostic marker and therapeutic targets for ES [8]. Although miRNAs have been implicated in the pathogenesis of ES, the underlying molecular pathways that link miRNAs to the development ES remains undefined.
In our study, we combined bioinformatics analysis and research techniques of molecular biology to identify vital miRNAs and target genes involved in the prevention and treatment of ES. GEO2R analysis was used to identify differential expressed miRNAs (DEMs) and mRNAs between ES and normal tissues. Then, we used target gene prediction tools to screen out target genes corresponding to the vital DEMs. In addition, functions of differential expressed genes (DEGs) were annotated and pathway analysis was carried out. Furthermore, Venn analysis was adopted to select coexpressed DEGs between predicted target genes with DEMs, PPI networks of co-expressed DEGs were constructed using GeneMANIA. Finally, a series of confirmatory experiments were employed to identify ALCAM/CD166 as a direct target of miR-21, which functions as a tumour suppressor for in ES.

Differential expression analysis
GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/) is a web application [10] that allows users to perform sophisticated R-based analysis of GEO data to detect and analyse DEMs and identify up-and down-regulated DEGs between ES samples and normal controls. ES samples were allocated to test group and MSCs samples were allocated to control group in both GSE80201 and GSE68776 datasets. p values were adjusted using by the Benjamini and Hochberg test, p < 0.05 and |Log FC| ! 1 were considered as the cutoff values.
Target gene prediction of differential expressed miRNAs TargetScan (www.targetscan.org) and miRDB (http://mirdb. org) were used to accomplish top 10 up-or down-regulated miRNAs target genes prediction. Then, miRNA-predicted target gene networks were constructed by Cytoscape, one of the most popular open-source software tools that offer researchers a versatile and interactive visualization interface, to visualize their interactions.

Gene ontology and pathway enrichment analysis
The significantly up-and down-regulated DEGs (p < .05 and FC ! 2) were further analysed by GO and KEGG in order to identify the DEGs at the functional level using the Database for Annotation, Visualization and Integrated Discovery (DAVID, https://david.ncifcrf.gov/), an online bioinformatics resource website that aims to provide tools for the functional interpretation of large lists of genes or proteins [11]. We collected top 10 enrichment score (Àlog10 p value) in each analysis as the cutoff point.

Venn analysis
The DEGs (p < 0.05 and |Log FC| ! 1) of the gene expression profile and predicted miRNA target genes were imported into Functional Enrichment analysis tool (FunRich), which provides information on functional enrichment and interaction network of genes and proteins [12]. Then, we used Venn to analyse function to obtain the co-expression genes.

PPI network construction
The protein-protein interaction (PPI) networks of co-expressed DEGs was generated using the GeneMANIA (http://www.gene mania.org/), which is a web-based database tool for the prediction of gene functions on the basis of multiple networks derive d from different genomic or proteomic data/sources, such as Bio-GRID, GEO, I2D and Pathway Commons.

Cell lines and cell culture
The human Ewing sarcoma cell lines A673 SK-ES-1 RD-ES and the hMSC cell line was obtained from the American Type Culture Collection (ATCC). A673 SK-ES-1 RD-ES cell lines were cultured in high glucose (Dulbecco minimum essential medium) (DMEM; GIBCO), and hMSC cell lines were cultured in low glucose DMEM (GIBCO) mixed with 10% fetal bovine serum (GIBCO) and 1% antibiotics (penicillin and streptomycin). All cell lines demonstrated above were cultured in 37 C and 5% CO 2 incubator.

RNA extraction and quantitative real-time polymerase chain reaction (qRT-PCR)
Total RNA was extracted using TRIzol (Invitrogen) following the manufacturer's instructions. The hsa-miR-21-3p complementary DNAs (cDNAs) were synthesized using All-in-one miRNA First-Strand cDNA Synthesis Kit (GeneCopoeia,  Rockville, MD, USA). To quantitate the mRNA expression of ALCAM, total RNA was reversely transcribed using All-in-one First-Strand cDNA Synthesis kit (GeneCopoeia). qRT-PCR was performed using SYBR Green qPCR Master Mix (Thermo Fisher Scientific). The expression level of ALCAM and hsa-miR-21-3p were normalized to the endogenous control of human glyceraldehyde-3-phosphate dehydrogenase (GAPDH) and U6 separately as the endogenous control. The expression levels of related RNAs were evaluated using the 2 À DDCt method. All the primers were displayed in Table 2.
Opti-MEM was changed to DMEM 12-h after transfection. Then, we extracted total RNA and performed reverse transcription and qRT-PCR per manufacturer's instructions.

Vector construction and luciferase reporter analysis
To determine whether ALCAM was regulated by miR-21 through targeting its 3 0 -UTR, the reporter gene recombinant plasmids were first constructed, the wild type or mutant type ALCAM 3 0 -URT were cloned into the Kpnl and HindIII sites of the pGL6-miR vector (Beyotime, China). About 2 Â 10 4 293 T cells were seeded into each well of 24-well plates for 24 h before transfection. Firefly luciferase reporter gene assay kit (Beyotime) was adopted for the luciferase reporter analysis following the manufacturer's protocols, pGL6-ALCAM-wt and pGL6-ALCAM-mut were co-transfected with miR-21mimics or miR-21 mimics-control into 293 T cells. The maximum luminescence wavelength of luciferin catalysed by firefly luciferin is 560 nm.

Data analysis
Each experiment was repeated at least three times. Foldchanges !2 and p values <0.05 in miRNA and gene microarray data were regarded as significantly differentially expressed. The volcano map and frequency distribution maps of DEGs were conducted using GraphPad Prism 7.0. p Values <0.05 were considered statistically significant.

Differential expressed miRNAs and genes
The differential expression profile of miRNAs (DEMs) or mRNAs was generated using GEO2R tool. For the data set GSE80201, based on the thresholds of Fold-changes! 2 and p values <0.05, a total of 59 DEMs were obtained in the ES group compared with the control group, including 24 upregulated and 35 down-regulated miRNAs. Volcano plots revealed that miRNAs were differentially expressed between ES patients and control subjects ( Figure 2). For the dataset GSE68776, 4786 DEGs, including 2000 up-regulated and 2786 down-regulated differential expressed genes (DEGs), were collected (Table 3).

Target gene prediction
The potential target genes were identified using online bioinformatics database miRDB and TargetScan. Ten most upand down-regulated miRNA-target gene networks were constructed using Cytoscape separately. There were 135 predicted target genes in the network of 10 most down-regulated miRNAs with high target score ( Figure 3A).
Meanwhile, 60 predicted target genes in the network of 10 most up-regulated miRNAs were collected with high target rank ( Figure 3B).

GO and pathway enrichment analysis of DEGs
Functional annotations of up-and down-regulated DEGs were applied using DAVID. Top ten biological processes and pathways with p < 0.05 were selected. Enriched processes of up-regulated DEGs were significantly associated with cellcell adhesion and cell adhesion in ES ( Figure 4A), suggesting that up-regulated DEGs are primarily involved in enhancing ES tumour cells adhere to normal tissue. Down-regulated DEGs mainly played a role in taste process ( Figure 4B). KEGG analysis of DEGs was also performed using DAVID. The pathway analysis (p < .05) revealed that up-regulated DEGs were mainly associated with focal adhesion, protein processing in endoplasmic reticulum, ECM-receptor interaction, proteoglycans in cancer and cancer signalling pathway, indicating up-regulated DEGs are involved in various cancer pathogenesis in ES ( Figure 5). However, the significantly enriched pathway of down-regulated DEGs was taste transduction, which seemed not relative to ES function.

The co-expressed DEGs
Up-and down-regulated DEGs were collected from predicted target genes and GSE68776 gene expression profile separately using Funrich. According to Venn analysis, 2 co-expressed down-regulated DEGs ( Figure 6A) and 20 co-expressed upregulated DEGs ( Figure 6B) were found and presented in Table 4.    PPI network of co-expressed DEGs and interested genes confirmation PPI networks of 20 up-and 2 down-regulated co-expressed genes were constructed using GeneMANIA online data sets. The genes related to the up-regulated DEG ALCAM were associated with guanyl nucleotide binding ( Figure 7A), while the genes related to down-regulated DEGs were associated with dystrophin-associated glycoprotein complex, muscle construction and muscle system process ( Figure 7B). Activated leukocyte cell adhesion molecule (ALCAM), also known as CD166 (cluster of differentiation 166) [13] has caught our attention because it was an up-regulated DEG of the mRNA microarray dataset that is also the predicted target gene of miR-21, which was the top 10 down-regulated miRNAs of the ES miRNA microarray dataset. Furthermore, both of them served vital roles in many kinds of tumours. The predicted binding sites of 3 0 -UTR using miRDB and TargetScan are displayed in Figure 8.

Quantitative RT-PCR validation and RNA interference
We validated the expression of miR-21 using qRT-PCR in three human Ewing sarcoma cell lines (A673, SK-ES-1, RD-ES) and the human MSC cell line. The expression of miR-21 was downregulated significantly in three human Ewing sarcoma cell lines compared with the hMSC cell line (Figure 9). The siRNAs were used to knockdown the expression of miR-21 and then the expression of ALCAM was analysed in each ES cell line. The expression of miR-21 was significantly decreased, whereas the expression of ALCAM was found significantly up-regulated in cells transfected with miR-21 inhibitor. On the contrary, miR-21 was up-regulated and the expression of ALCAM was   found significantly down-regulated when the cells were transfected with miR-21 mimics (Figure 10).

miR-21 targets ALCAM/CD166 gene in human ES cell lines
To prove AlCAM/CD166 was the target of miR-21 in human Ewing sarcoma cells, we performed luciferase assay. After the co-transfection with miR-21 mimics and wild type ALCAM reporter gene recombinant plasmids, the luciferase activity was significantly repressed compared with the mutant groups in 293 T cells (Figure 11).

Discussion
Ewing sarcoma is a high malignant aggressive tumour of the bone and soft tissue affecting children and young adults, which have very poor prognosis [14]. Although many genes have been reported to play a role in the development of ES, the molecular mechanisms remain largely unknown.
Generally, miRNAs can affect tumour cells growth, differentiation and apoptosis by repressing the expression of multiple target genes [15]. Jedlicka et al. identified a group of miRNAs that negatively regulate the expression of multiple pro-oncogenic components of the IGF pathway in ES [16].
Stamenkovic et al. showed that deregulation of miRNA-143 or miRNA-145 is involved in t cancer stem cell (CSC) selfrenewal and tumour maintenance by targeting TARBP2 pathway [17]. Moreover, Zhang et al. suggested that miR-638 may inhibit tumour growth by suppressing the activity of VEGFA [18]. Nonetheless, the identification of specific miRNAs that play important roles in tumour development remains a complex and difficult task. With the development of new bioinformatics and molecular biology tools and gene chips, we are now in a better position to analyse a large number of samples to screen out vital DEMs of ES.
In this study, we employed microarray and bioinformatics tools to detect and analyze differential expressed miRNAs/ mRNAs in Ewing sarcoma. A total of 59 DEMs and 4798 DEGs were identified between ES and hMSC samples through GES80201 and GSE68776 from GEO database using GEO2R online tool. Top 10 most up-and down-regulated miRNAs were performed target gene prediction using miRDB and TargetScan. Then, 22 co-expressed DEGs were selected via Venn analysis, which is more convincing with more evidence that plays significant role of ES.
In order to provide an in-depth analysis of DEGs function in ES, we first performed GO and KEGG analysis. Gene ontology (GO) is a comprehensive resource that collects a large number of gene annotation terms, including biological process (BP), cellular component (CC) and molecular function (MF) [19]. Kyoto Encyclopedia of Genes and Genomes (KEGG) is an integrated database resource that provides biological interpretation of genome sequences and other high-throughput data [20]. The up-regulated DEGs were mostly enriched in the BP term associated with cell adhesion. In addition, pathway enrichment analysis showed that down-regulated DEGs were mainly associated with metabolic pathways, focal adhesion, protein processing in endoplasmic reticulum, proteoglycans in cancer, endocytosis and cancer signalling pathways. According to the previous study, focal adhesion signalling pathway could regulate cell behaviour and influence tumour cell survival, which could serve as potential cancer targets [21]. In order to understand the potential function of DEGs in ES, we separately analysed the up-regulated and down-regulated DEGs using Venn.
Interestingly, one DEG named ALCAM is found not only in the 22 co-expressed DEGs group as mention above but is also enriched in up-regulated BP term cell adhesion, suggesting that ALCAM may act as a significant molecular in ES. PPI networks also revealed that co-expressed up-regulated gene function was associated with guanyl nucleotide binding, which was consistent with the result of previous analysis showing ALCAM and miR-21 binding site consisted of 3 guanines among the seven binding sites. It is convenient and accurate enough using GeneMANIA to predict gene functions with the comprehensive datasets including organism-specific functional genomics datasets.
Our results of qRT-PCR validation showed the expression of miR-21 was lower in ES cell lines compared with hMSCs. Furthermore, the predicted miR-21 target gene ALCAM was up-regulated nearly three to seven times when the cells were transfected with miR-21 inhibitor. On the contrary, the expression of ALCAM was down-regulated almost two times when the cells were transfected with miR-21 mimics. Finally, luciferase reporter analysis further proved that miR-21 regulates ALCAM expression directly.
Previous studies of miRNA profiling experiments showed that miR-21 was found differentially expressed in human tumours [22,23]. Medina   oncogene, which indicated certain tumours are oncomiRs dependent [24]. We now provide further evidence to support the role of miR-21 as a biological marker for ES. Previous studies indicated ALCAM expression correlated positively with cancer progression in various types of cancers [25,26] such as lung cancer, gastric cancer, pancreatic cancer and prostate cancer [27][28][29][30]. Thus, over-expression of ALCAM may result in high metastasis and poor prognosis in ES. However, the mechanisms whereby ALCAM influencing tumour cells proliferation and metastasis in ES need to be confirmed in future studies.
Overall, we identified 59 miRNAs and 4786 mRNAs differentially expressed in ES. Moreover, we found ten most up-and down-regulated miRNAs and their 22 co-expressed mRNAs were more relative to ES pathological process, which is worthy to continue further research. According to bioinformatics analysis and experimental verification of qRT-PCR and luciferase reporter assay, we for the first time identified hsa-miR-21-3p as the potential regulator of ALCAM/CD166. miR-21/CD166 signalling axis may be adopted as novel biomarkers in the diagnosis and molecular therapy for Ewing sarcoma.