Two immune-enhanced molecular subtypes differ in inflammation, checkpoint signaling and outcome of advanced head and neck squamous cell carcinoma

ABSTRACT The immune environment of primary tumor is associated with the clinical response and benefit of immunotherapy. This study aims to investigate the intratumoral immune profile and its clinical relevance in advanced head and neck squamous cell carcinoma (HNSCC). Gene expression profiles of 401 HNSCCs at stage III-IVB from two cohorts (The Cancer Genome Atlas, TCGA, n = 203; the Leipzig Head and Neck Group, LHNG, n = 198) were involved in this analysis. Based on the global immune-related genes, four gene expression subtypes (C1-4) were identified in HNSCCs. Overall, subtypes C2 and C3 showed upregulation of immune profiles and increased tumor lymphocyte infiltration, exhibiting an enhanced immune microenvironment (EIME). However, the two EIME subtypes revealed differences in immune markers and clinical features. Subtype C2 showed higher expression of macrophage signature, whereas subtype C3 was more associated with B cell infiltration. T cell and NK cell infiltration was not different between C2 and C3 subtypes. The subtype C2 tumors were characterized by inflammation compared with subtype C3. Although the checkpoint receptors PD1 and CTLA4 expressed equally between the EIME subtypes, their ligands (PD-L1/PD-L2, CD86/CD80) were significantly upregulated in subtype C2 compared with C3. HPV-positive tumors were predominantly enriched in subtype C3 but not in C2. Furthermore, patients in subtype C2 had a worse outcome than those in C3. In summary, two immune-enhanced subtypes with different immune characteristics and clinical features were identified in advanced HNSCC. The different immune microenvironments among HNSCC subgroups may provide new insights into the strategy of immunotherapy.


Introduction
Head and neck squamous cell carcinoma (HNSCC) is the eighth most common malignancy for males in the United States. 1 Stage III/IV tumors are the major subgroups for HNSCC. The prognosis of patients at stage III/IV is dissatisfactory, with 5-year relapse-free survival (RFS) and overall survival (OS) of 40% and 60%, respectively. 2 Surgical resection followed by adjuvant irradiation or chemoradiotherapy is the common treatment for resectable HNSCC at late stage. 2 Postoperative radiotherapy and chemoradiotherapy improve locoregional control initially, but with a long-term local and distant failure rate of 50% and 15%, respectively. 3 Therefore, novel therapeutic strategies in monotherapy or in combination with traditional treatments are required to improve outcome of primary HNSCC at late stage.
Recently, much attention has been paid to immunotherapy in the field of cancer treatment. The monoclonal antibodies (mAb) targeting the immune checkpoint pathways have been approved for the treatment of metastatic melanoma and non-small cell lung cancer. 4 HNSCC is characterized by several alterations in the immune system, making it possible that these patients may benefit from immunotherapy. Indeed, the anti-PD1 mAb (pembrolizumab and nivolumab) has been demonstrated to improve clinical outcome for platinum refractory recurrent/metastatic HNSCC. 5,6 Other ongoing clinical trials are designed to evaluate the efficacy of pembrolizumab in biomarker-unselected or platinum-and cetuximab-refractory recurrent/metastatic HNSCC with promising results. 7,8 However, little is known about the usefulness of immunotherapy in primary advanced HNSCC. We propose that a better understanding of the immune microenvironment of the primary tumors may provide new insights into the immunotherapy of HNSCC.
Immune processes play critical roles in carcinogenesis and progression of solid tumors. It is considered that the nascent transformed cells can be initially eliminated by the host immune system based on both innate and adaptive immunity. 9,10 The destroyed cells release various tumor antigens that further stimulate an adaptive immunity, e.g., activated T cells and B cells. In a later stage, multiple mechanisms have been developed by cancer cells to escape from the immune surveillance. For example, cancer cells secrete soluble cytokines or chemokines, which can recruit suppressive cells such as regulatory T cells (Tregs) and myeloid-derived suppressor cells into the tumor microenvironment. The costimulatory or coinhibitory signals are co-opted by tumors in regulating the activation of T cells. The PD1 signaling contributes to a suppressive effect on T cells. The ligands CD86/CD80 can act as either costimulatory or coinhibitory effectors, depending on their ligation to CD28 or CTLA4 respectively. 11 Upregulation of these checkpoint molecules on cancer cells or stromal cells leads to an immunosuppressive tumor microenvironment. 12 Immunotherapy agents targeting coinhibitory receptors (PD1 and CTLA4) can increase immune response by inhibiting the immunosuppressive mechanisms, and provide excellent therapeutic benefit in several cancers. 4,12 The expression pattern and potential clinical relevance of immune checkpoint molecules in HNSCC remain unclear so far.
HPV infection has been established as a cause of HNSCC, especially for tumors of the oropharynx (approximately 45.8% of oropharyngeal cancer were HPV positive). 13 HPV-positive tumors are associated with high responsiveness to chemotherapy and chemoradiotherapy and have improved outcomes in HNSCC patients. 14 A recent study showed that the HPV-positive tumors were enriched in the HNSCC subtype with high immune response. 15 For the platinum refractory recurrent/metastatic HNSCC, HPV-positive status is associated with a favorable outcome when receiving anti-PD1 mAb treatment. 16 The relationship between immune landscape and HPV status must be uncovered for primary III/IV HNSCC.
This study aimed to investigate the overall immune landscape and its clinical relevance in HNSCC at stage III-IVB. From The Cancer Genome Atlas (TCGA) cohort, we identified four consensus molecular subtypes of tumors based on the gene expression of global immune genes. The four subtypes were further validated in an external cohort from the Leipzig Head and Neck Group (LHNG). Two subtypes were characterized as upregulation of a series of immune markers. Moreover, the two subtypes also showed remarkable differences in terms of HPV status and expression of checkpoint genes as well as patient outcomes.

Identification of molecular subtypes of HNSCC based on immune profiles
The gene expression profiles of 1703 immune-related genes were used to identify the HNSCC subtypes in TCGA cohort. All tumor samples were firstly grouped into k (k D 2, 3, 4, 5, 6) different subtypes using Consensus Cluster Plus. According to the cumulative distribution function curves of the consensus score, the optimal division was achieved when k D 4 ( Fig. 1A-B, supplementary Fig. 1). The four clusters of samples were separated from each other on the two-dimensional scaling plotting (Fig. 1C). In addition, SigClust analysis also demonstrated that the consensus clusters (k D 4) were significant in all pairwise comparisons (supplementary Table 2). Thus, the HNSCC tumors were finally classified into four molecular subtypes based on the global immune gene expression (C1, n D 59; C2, n D 42; C3, n D 41; C4, n D 61).
The distributions of conventional clinical variables among the four molecular subtypes are summarized in Table 1. The association between anatomic tumor site and four subtypes was significant (x 2 test, FDR < 0.001). Tumors of oropharynx were enriched in C3 subtype, the laryngeal tumors were more likely to be characterized as C3 and C4 subtypes. A significant HPVpositive rate was observed in C3 subtype (47.4%, 9/19), compared with other subtypes (C1, 0/9; C2, 1/10; C4, 3/16) (x 2 test, FDR D 0.01). In addition, subtype C3 tended to have higher histological grade that other subtypes (x 2 test, FDR D 0.025). All other clinical parameters did not achieve statistical significance (Table 1).

Two subtypes were characterized as enhanced immune microenvironment
The upregulated genes in each subtype compared with other subtypes were calculated by unpaired t-test (FDR < 0.01, fold change > 2). Of the 1,706 immune genes, 57 genes, 402 genes, 377 genes, and 39 genes were significantly upregulated in subtypes C1, C2, C3, and C4, respectively (Fig. 1D). Notably, there were 226 genes overlapped between subtypes C2 and C3. However, only a few genes overlapped between other pairs of subtypes (Fig. 1D). We selected the top 100 upregulated genes of each subtype as the featured genes as previously described 17 (Fig. 1E). Again, subtypes C2 and C3 shared a highly similar gene expression pattern with 37 overlapped featured genes. The remarkably large number of immune genes upregulated in subtypes C2 and C3 indicates an enhanced immune profile in the two subtypes.
We also examined the expression signatures of 24 different types of immune cells. 18 Overall, most of the immune signatures were upregulated in subtypes C2 and C3 than in subtypes C1 and C4 ( Fig. 2A, upper panel). The upregulated pattern was observed for signatures of both innate and adaptive immunity. This was consistent with the observation of a relatively higher intratumoral lymphocyte infiltration in C2 and C3 subtypes (Supplementary Fig. 2A). Further analysis indicated a positive correlation between most of the immune signatures and lymphocyte infiltration ( Supplementary Fig. 2B). Collectively, the high expression of immune profiles and increased infiltrates suggested an enhanced immune microenvironment (EIME) in subtypes C2 and C3.
Differential levels of specific immune cell markers as well as checkpoint signalings between subtypes C2 and C3 Although subtypes C2 and C3 showed high immune levels compared with subtypes C1 and C4, they were significantly different in the consensus cluster analysis (Fig. 1B-C). There was no expression difference between subtypes C2 and C3 for several immune markers, such as T cells, Tregs, and NK cells (Fig. 2B, Supplementary Fig. 3). The expression signatures of B cells, T helper cells and central memory T cells (Tcm) were significantly higher in subtype C3 than those in subtype C2 ( Fig. 2B, Supplementary Fig. 3). On the other hand, subtype C2 showed upregulated signatures of macrophages, neutrophils, Th1 cells, and immature dendritic cells (iDC), compared with subtype C3 (Fig. 2B, Supplementary Fig. 3). The immune profiles were also validated by using the MCP-counter (Supplementary Fig. 4). Many of the immune patterns by MCPcounter were consistent with that described above, such as B cells, T cells and NK cells. We also found that the monocytic lineage were significantly upregulated in subtype C2 compared with C3. However, the difference of neutrophils was not significant between subtype C2 and C3 by MCP-counter (Supplementary Figure 4).
We then investigated the expression of immune checkpoint genes, which play a critical role in T cell regulation. Similar to the immune cell signatures, most of the checkpoint genes were upregulated in the two subtypes with EIME (C2 and C3) ( Fig. 2A, lower panel). However, some of the checkpoint molecules showed differential expression between the two subtypes. The immune checkpoint receptors, such as PD1 and CTLA4, were equally expressed in tumors between subtypes C2 and C3 (Fig. 3). However, the corresponding ligands (PD-L1/PD-L2 for PD1 and CD86/ CD80 for CTLA4) were significantly higher in subtype C2 compared with subtype C3 (Fig. 3). B7H3 and B7H4 were recently identified as checkpoint molecules. We found that B7H3 was dramatically downregulated while B7H4 was upregulated in subtype C3, compared with subtype C2 (Fig. 3).

Validation of four molecular subtypes in the LHNG cohort
The four subtypes identified in TCGA cohort were further validated in an external cohort of LHNG, using a nearest-centroid classifier (Fig. 4A). The four subtypes classified by our method were strongly correlated with the sample clusters identified by Wichmann et al 15 Table 2). The previously identified immune response cluster was remarkably enriched in our subtype C3 (45/50, 90%), but not in other subtypes (C1, 0/68; C2, 6/28; C4, 11/52). Moreover, subtype C3 were enriched for HPV-positive tumors ( test, FDR < 0.001) ( Table 2). The four subtypes in the LHNG cohort were also associated with T stage (x 2 test, FDR D 0.039) and N stage (x 2 test, FDR D 0.056) ( Table 2). Consistent with TCGA cohort, two subtypes (C2 and C3) in the LHNG cohort were characterized as high expression of various immune signatures (Fig. 4B). However, subtype C2 was preferentially associated with macrophages, neutrophils, iDCs and Th1 cells (Fig. 4B, Supplementary Fig. 5). Subtype C3  Fig. 6). Moreover, the MCP-counter score of monocytic lineage was significantly higher in subtype C2 than that in C3 (Supplementary Figure 6). These observations in the LHNG cohort were greatly consistent with TCGA cohort. Boxplots indicate 5%, 25%, 50%, 75%, and 95%, respectively. Comparisons between subtypes were performed by Student's t-test. All p values were corrected by the Bonferroni method. The adjusted value of p < 0.05 was considered significant.
In the LHNG cohort, the checkpoint receptors CTLA4 and PD1 were equally expressed in subtypes C2 and C3, with higher levels compared with subtypes C1 and C4 (Fig. 4C). However, the ligands CD86/CD80 and PD-L1 were significantly upregulated in subtype C2 compared with subtype C3 (Fig. 4C). The expression value of PD-L2, another ligand for PD1, was not detected in the LHNG dataset. In addition, B7H3 was significantly decreased, but B7H4 was increased in the subtype C3 compared with C2 (Fig. 4C). Again, these findings were consistent with those in TCGA cohort.

Expression pattern of non-immune stromal cell signatures and inflammatory chemokines
We examined the expression of two non-immune stromal cells, endothelial cells and fibroblasts, by using the MCP-counter. The expression score of endothelial cells were upregulated in subtype C2 compare with C3 in the LHNG cohort, but the significance was not achieved in TCGA cohort (Fig. 5, left column). The markers of fibroblasts were significantly higher in subtype C2 than C3 in both cohorts (Fig. 5, middle column). In addition, we investigated the inflammatory chemokine signature identified from melanoma in our HNSCC subtypes. Overall, the expression score of the inflammatory signature was enhanced in the two EIME subtypes compared with C1 and C4 subtypes in both cohorts. Furthermore, a higher expression of inflammatory chemokines was observed in subtype C2 than in C3 (Fig. 5, right column).
Angiogenesis related genes were reported as prognostic factors in colorectal cancers with high expression of lymphocyte-derived cytotoxic protein. 19 We then studied three angiogenesis related genes VEGFA, PDGFA and FGF1 across the four molecular subtypes. There was no remarkable difference for the angiogenesis related genes between subtypes C2 and C3, except that the expression of PDGFA was upregulated in subtype C2 than C3 in the LHNG cohort (Supplementary Figure 7).

Prognostic significance of the four molecular subtypes
On the basis of differential expression of immune profiles, we sought to know whether the four subtypes reflect distinct clinical outcome of patients. In TCGA cohort, Kaplan-Meier curves showed significantly different OS and RFS rates of patients in the four subtypes (log-rank test, OS, p D 0.006; RFS, p D 0.0003, Fig. 6A). Patients in subtype C3 demonstrated the best outcome out of the four subtypes based on both OS and RFS. Particularly, for the two EIME subtypes, Figure 3. Differential expression of checkpoint molecules in HNSCC subtypes in TCGA cohort. The gene expression value of eight checkpoint molecules were presented across four subtypes. Boxplots indicate 5%, 25%, 50%, 75%, and 95%, respectively. Comparisons between subtypes were performed by Student t-test. All p values were corrected by the Bonferroni method. The adjusted value of p < 0.05 was considered significant.
subtype C2 was associated with a trend toward poorer patient outcome (log-rank test, OS, p D 0.09; RFS, p D 0.051) than subtype C3.
For the LHNG cohort, the OS rates of patients in the four subtypes were significantly different (log-rank test, OS, p D 0.002, Fig. 6B). For the two EIME subtypes, patients in subtype C3 had remarkably better OS than those in C2 (log-rank test, p D 0.005). The PFS results indicated a similar trend with that of OS, but the significance was not achieved.

Discussion
In this study, we identified four gene expression subtypes based on global immune genes in stage III-IVB HNSCCs in TCGA cohort. The subtypes were also validated in an external cohort of LHNG. Two of the four subtypes showed significant higher expression of a series of immune markers. However, the two high-immune subtypes were different in terms of specific immune markers (immune cell types, inflammatory chemokines The subtypes (columns) were assigned by the centroids of TCGA cohort. Heat map indicates relative gene expression value, with yellow for high expression and cyan for low expression. B and C. The expression value of immune signatures (B) and checkpoint molecules (C) among subtypes in the LHNG cohort. Boxplots indicate 5%, 25%, 50%, 75%, and 95%, respectively. Comparisons between subtypes were performed by Student's t-test. Multiple testing was corrected by the Bonferroni method. The adjusted value of p < 0.05 was considered significant. and checkpoint molecules) and clinical features (HPV infection and patient outcome). These findings may provide some new insight into the strategy of immunotherapy of primary advanced HNSCCs.
Previous studies have reported some molecular subtypes of HNSCCs mainly based on the genome wide profiles. 15,20,21 In our analysis, we focused only on the global immune profiles, which could provide more details about the immune landscape of HNSCC. Among the four molecular subtypes, the overall immune profiles of subtypes C2 and C3 were significantly higher than that of subtypes C1 and C4. The two subtypes also showed increased tumor lymphocyte infiltration rate, which was positively correlated with expression signatures of several types of immune cells. These findings suggest that subtypes C2 and C3 were associated with an enhanced immune status in the tumor microenvironment, which was termed as EIME. Although subtypes C1 and C4 were characterized with lower expression of immune profiles, they were different from each other depending on several aspects. Firstly, subtype C1 and C4 were separated from each other on the 2-dimentional plotting based on global immune profiles (Fig. 1C). Secondly, there was no overlapping of genes upregulated in the two subtypes ( Fig. 1D-E). Thirdly, a larger proportion of larynx tumors were enriched in C4 than C1 (TCGA cohort, 32.8% vs 10.2%, Table 1; LHNG cohort, 23.1% vs 11.8%, Table 2). A higher proportion of 'G1' grade tumors were observed in subtype C1 than C4 (18.6% vs 8.2%, Table 1). Moreover, subtype C1 exhibited an higher enrichment in mesenchymal subtype than C4 (55.9% vs 11.5%), but subtype C4 were more related with the classical subtype than C1 (32.7% vs 1.5%) ( Table 2). Wichmann et al 15 reported a molecular subtype of 'immune response', which was significantly overlapped with our C3 subtype. Recently, an inflamed phenotype was identified in HNSCCs using a chemokine signature of metastatic melanoma. 22 The chemokine signature was upregulated in our EIME subtypes, with an even higher expression in C2 subtype (Fig. 5, right column). Both the 'immune response' subtype and the inflamed phenotype were reported to be enriched in HPVpositive tumors. 15,22 However, in our EIME subtypes, only subtype C3 but not subtype C2 showed a strong correlation with HPV infection. These results indicated that two distinct immune phenotypes exist in HNSCC, which may reflect HPVdependent or -independent mechanisms, respectively.
We found that the two EIME subtypes were quite different in terms of specific immune cell types. Subtype C2 was more likely to be correlated with markers of cells involved in innate immunity, such as macrophages, iDCs, and monocytes. The gene expression signatures of adaptive immune response cells, such as B cells, T helper cells as well as Tcm, were significantly higher in subtype C3 than subtype C2. The higher expression of adaptive immunity-related markers in subtype C3 may be partially due to the enrichment of HPV-positive tumors in subtype C3. A high level of adaptive immune response was associated with HPV infection in HNSCC. 23 These data again suggest that HNSCC contains at least two immune environments, which may reflect innate immunity and adaptive immunity respectively. Immunity type and specific immune cell markers may provide more details about the immune microenvironment of HNSCC.
Inflammation is common in tumor microenvironment, and contributes to tumorigenesis and cancer progression. 24 The proinflammatory cytokines and chemokines can be secreted by cancer cells or stromal cells, such as macrophages and fibroblasts. [24][25][26] We found that the inflammatory chemokine signature was significantly higher in subtype C2 than in C3. Consistently, the expression of markers for macrophages and fibroblasts was upregulated in subtype C2 in both cohorts. Moreover, we found a higher content of monocytic lineage infiltration in subtype C2 than C3. fibroblasts (middle column), and the expression value of chemokine signature (right column) were presented across four HNSCC subtypes. The data of TCGA cohort (upper row) and the LHNG cohort (lower row) was shown respectively. Boxplots indicate 5%, 25%, 50%, 75%, and 95%, respectively. Comparisons between subtypes were performed by Student t-test. All p values were corrected by the Bonferroni method. The adjusted value of p < 0.05 was considered significant. Previous studies reported that the cancer associated fibroblasts and macrophages were associated with unfavourable prognosis in HNSCC patients. 25,[27][28][29] A recent study found that high count of pretreatment peripheral monocytes was an independent factor for poor prognosis in HNSCCs. 30 All together, the inflammatory phenotype with higher infiltration of macrophages, fibroblasts and monocytes, may partially contribute to the bad prognosis of patients in subtype C2 than those in C3. Otherwise, the inflammatory microenvironment may also contribute to immunosuppression, because of that the expression of PD-L1 was correlated with an inflammatory phenotype in HNSCC. 22,31 Indeed, we found that the immune checkpoint signalings differ between the two EIME subtypes. Previous studies have identified HNSCC subtype with high immune response along with upregulation of checkpoint molecules. 22,32 Consistently, the overall expression of checkpoint molecules was elevated in our EIME subtypes (Fig. 2A). The checkpoint receptors are expressed on activated immune cells to prevent overabundant immune response. 16 We found that PD1 and CTLA4 were equally expressed in subtypes C2 and C3, which was consistent with the similar infiltration rate of lymphocytes. However, subtype C2 was preferentially associated with higher expression of the ligands (PD-L1/PD-L2 for PD1 and CD86/CD80 for CTLA4). The ligation of CD86/CD80 to the costimulatory receptor CD28 trigers the activation of T cells. The subsequent upregulation of CTLA4 on activated T cells competitively bind to CD86/CD80 to prevent overabundant immune response. 11 Previous studies have suggested that CD86 acts as a tumor suppressor expressed on non-cancer cells, such as macrophages and NK cells. [33][34][35][36] In our results, the expression signature of macrophage was significantly upregulated in subtype C2 compared with C3. We found a high correlation between CD86/ CD80 expression and the macrophage signature (Supplementary Figure 8). Therefore, it is possible that the upregulation of CD86/CD80 in subtype C2 is a sign of higher macrophage infiltration. However, the elevated CTLA4 may preferentially bind to CD86/CD80 with a higher affinity than CD28, 11 resulting in an immunosuppressive environment. The correlation of PD-L1/PD-L2 with macrophage signature was somewhat weaker (Supplementary Figure 8), suggesting that PD-L1/PD-L2 may express on both immune cells and cancer cells. In fact, overexpression of PD-L1 and/or PD-L2 on tumor cells was observed in many cancers including HNSCC. 37,38 These observations collectively suggest a more immunosuppressive microenvironment existing in subtype C2. This notion was supported by survival analysis that showed a worse prognosis of patients in subtype C2 than those in subtype C3. Therefore, we presumed that the patients in subtype C2 may derive more clinical benefit from checkpoint pathway blockade.
Th1 cells regulate cytotoxic immune response and contribute to tumor suppression. Previous studies reported that higher expression of Th1 related genes in tumor microenvironment was correlated with better prognosis in cancer patients. 18,39 Surprisingly, we found that the Th1 signature was significantly upregulated in the C2 subtype with a bad outcome. However, the biological mechanism underlying this observation is unclear and needs further investigation. The cytotoxic CD8C T cells suppress tumor growth and improve patient survival in many cancers. 40 However, it seems to remain a debate on the prognostic value of CD8C T cell infiltration in HNSCC. CD8C T cell infiltration was responsible for the antitumor effect of cancer vaccine against HNSCC. 40 High infiltration of tumors by CD8C T cells correlated with a favourable prognosis in HNSCCs following definitive chemoradiotherapy. 41 Although HPVC HNSCCs were highly infiltrated by T cells, the high content of CD8C cells could not predict a better outcome in these patients. 41,42 Another study showed that the prognostic value of CD8C infiltration was only observed in HNSCCs at low risk, but not found in the late-stage subgroup. 43 In our results, the CD8C T cell signature were upregulated in subtype C3 compared with C2 in the LHNG cohort, but the significance was not achieved in TCGA cohort. These data suggested that the prognostic value of CD8C infiltration in HNSCCs is far more complex than previously expected. The HPV status, tumor stage, and the localization of infiltration in tumors may affect the prognostic value of CD8C T cells. [41][42][43] The four molecular subtypes were correlated with distinct patient outcome in both cohorts. For the two EIME subtypes, prognosis of patients in subtype C3 was consistently better than those in subtype C2 in both cohorts. However, the non-EIME subtypes (especially subtype C4) showed different prognostic results between the two cohorts. One potential reason is that the distributions of tumor sites for the two cohorts are quite different (Supplementary Table 1). The TCGA cohort contains 64.5% (131/203) of tumors of the oral cavity, whereas the proportion is only 27.8% (55/198) in the LHNG cohort. When considering only the non-oral cavity tumors, the survival patterns of the four subtypes were more similar between the two cohorts (Supplementary Figure 9). Another explanation is that the patients in the two cohorts were underwent different treatments, which affects clinical outcome significantly.
In conclusion, using the gene expression profile of global immune genes, we identified two EIME subtypes in primary advanced HNSCC. The two subtypes were distinct in terms of HPV status, immunity features, and immune checkpoint molecules as well as patient outcomes. These findings of the intratumoral immune microenvironment may shed new light on the strategy of immunotherapy in advanced HNSCC.

Patients and clinical characteristics
The gene expression profiles and clinical information of 528 primary HNSCCs by surgery resection were obtained from TCGA. This study analyzed 203 patients from this entire cohort with the following criteria: a) without any neoadjuvant therapy; b) at pTNM stage of III-IVB; c) receiving postoperative radiotherapy or chemoradiotherapy before any recurrence or metastasis occurs; d) with follow-up information; e) with gene expression profiles of the primary tumor. Clinical characteristics of TCGA cohort were summarized in Supplementary  Table 1. HPV status was determined by p16 testing or HPV in situ hybridization. OS and RFS (locoregional or distant recurrence after surgery treatment) were available for TCGA cohort.
The external validation cohort is a subset of 270 HNSCCs collected by the LHNG. 15 Only 198 primary HNSCCs at stage III-IVB were included in our analysis. HPV status of the LHNG cohort is determined by HPV DNA genotyping, as well as DNA and RNA test of HPV16 as described previously. 15 OS and progression-free survival (PFS) information was available for the LHNG cohort. Clinical characteristics of this cohort were summarized in Supplementary Table 1. There was no difference in most of the clinical parameters between TCGA cohort and the LHNG cohort. There was one exception that a larger proportion of oropharyngeal tumors and smaller ratio of oral cavity tumors were enrolled in the LHNG cohort (Supplementary Table 1).

Gene expression data processing
For TCGA cohort, the RNA-seq data was profiled by the nextgeneration sequencing platform of Illumina Ò . The fragments per kilobase of gene per million fragments mapped with upper quartile normalization (FPKM-uq) were downloaded from TCGA Data Portal. Gene annotation was performed with the Ensembl database. The gene expression value was log 2 -transformed for subsequent analysis. The gene expression profiles of LHNG cohort were analyzed by Illumina Ò HT12 v4 Expression Bead-Chips. 15 The normalized expression data and corresponding clinical information were downloaded from Gene Expression Omnibus (GEO) (GSE65858). Probe annotations of BeadChips were obtained from the GEO database. Entrez Gene IDs were used for mapping gene expression data between the two cohorts.

Identification of molecular subtypes of HNSCC by immune genes
The gene expression profile of global immune genes in TCGA cohort was used to identify the HNSCC subtypes. Immunerelated genes were extracted from the Gene Ontology (GO) database by searching the immune/inflammation/defense related GO terms, resulting in 139 GO terms with 1,939 genes. The genes with no expression (FPKM D 0) in more than 80% of samples were excluded. The remaining 1,706 immune genes were used for Consensus Cluster Plus analysis. 44 Gene expression data were median centered. The clustering program was performed with 1,000 iterations, by sampling 80% of samples at each iteration. The optimal cluster number was determined by cumulative distribution function curves of the consensus score. Pairwise comparisons among identified subtypes were performed by SigClust analysis. 45 Bonferroni correction was used for multiple testing.
For each subtype, the featured genes were calculated by comparing the samples in this subtype with the remained samples using Student's t-test. We selected the top 100 upregulated genes in each subtype to distinguish the different molecular subtypes, as suggested in a previous study. 17 False discovery rate (FDR) was calculated using the Benjamini-Hochberg method.

Immune profiles in HNSCC molecular subtypes
The expression signatures over 24 types of immune cells were obtained from a previous study by Bindea et al. 18 A signature with 12 chemokines identified in metastatic melanoma was recently used to identify an inflamed phenotype of HNSCCs. 22 This chemokine signature was also analyzed in our study. Expression score of a given signature in each tumor sample was calculated by averaging the z-normalized expression value of the signature genes. The immune profiles were also validated by the MCP-counter, an estimator for abundance of tissueinfiltrating cells by gene expression profiles. 46 The MCPcounter score was calculated by using the R package 'MCPcounter'. 46 The costimulatory or coinhibitory molecules were analyzed across the molecular subtypes. Immune signature scores and expression value of checkpoint genes were compared between different HNSCC subtypes by Student's t-test. The association between subtypes and the tumor lymphocyte infiltration rate (%) was assessed using the Mann-Whitney U test. Bonferroni correction was used for multiple testing.

Validation the molecular subtypes in LHNG cohort
To validate the molecular subtypes identified in TCGA cohort, the cancer samples in the LHNG cohort were classified by the top 100 featured genes. Gene expression data were z-normalized for both cohorts. The centroid of each subtype in TCGA cohort was calculated by averaging the inner-subtype expression data. For a profile of each sample in the LHNG cohort, the Pearson correlations with the centroids of TCGA cohort were calculated, and the subtype with the highest correlation was assigned to this sample.

Other statistical methods
The association between conventional clinical variables and molecular subtypes was tested by chi-square test or Fisher's exact test. Multiple testing was corrected by the Benjamini-Hochberg's FDR. Kaplan-Meier curves and log-rank test were employed to compare the 5-year OS and RFS rates of different molecular subtypes. All statistical tests were two-sided. All statistical analyses were performed by R software.