High tumor amplification burden is associated with TP53 mutations in the pan-cancer setting

ABSTRACT Next-generation sequencing data is fundamentally changing the clinical management of patients with cancer. The most frequent genomic alterations in malignancy are mutations and amplifications, with a subset of tumors having multiple amplifications – “amplificators”. We sought to understand the molecular correlates of high tumor amplification burden in a pan-cancer context. Using both national registries and a single-institution dataset, our results demonstrate that cancers with TP53 mutations (as compared to those with wild-type TP53) exhibited significantly higher tumor amplification burden across all datasets. Amplifications, generally associated with overexpression, may be potentially actionable secondary consequences of TP53 mutations.


Background
Advances in next-generation sequencing (NGS) techniques have begun to revolutionize our fundamental understanding of disease, especially cancer. The identification of specific genomic anomalies has enabled the development of molecular and immune marker-specific drugs as treatment options for various cancers. [1][2][3][4][5] For instance, some patients with high tumor mutational burden appear more responsive to immune checkpoint blockade. 1,6 However, the underlying biology driving certain genomic alteration patterns remains to be elucidated. As an example, gene amplification refers to an increase in the number of copies of a specific gene and is a prominent manifestation of genomic instability in mammalian cells. 7,8 Gene amplifications are often present in cancer cells and can be the cause of RNA or protein overexpression. 8,9 The occurrence of gene amplifications in early stages of cancer and the amplification of multiple genes in some tumors may suggest an underlying genomic etiology. 10,11 While the mechanisms behind gene amplification have not been empirically determined, they are generally understood to be the result of DNA double-stranded breaks, impaired DNA replication, or dysfunction in the DNA repair machinery. 10 Interestingly, we have observed a group of patients, dubbed 'amplificators', who have large numbers of gene amplifications, with or without concomitant large numbers of deleterious mutations.
In this study, we reviewed the medical records of 1,891 patients seen at the University of California, San Diego (UCSD) Moores Center for Personalized Cancer Therapy, and additionally explored 7,246 tumor samples from The Cancer Genome Atlas (TCGA). We show an association between TP53 mutations and a high number of oncogenic gene amplifications. TP53 is a tumor suppressor, designated the "guardian of the genome" because of its crucial role in maintaining genomic integrity (Figure 1). [11][12][13][14][15][16] Although TP53 alterations are considered difficult to drug, their secondary effects, such as amplifications, might be important in that the resultant overexpression levels may be actionable.

Results and discussion
Exploratory analysis was performed to identify patients deemed to have higher proportions of gene amplifications. To define the "amplificator" phenotype, we examined samples in the top 10% of tumor amplification burden, across all samples, based on two different gene sets. The two gene sets were the whole genome (whole genome sequencing, WGS) and a panel of 315 common oncogenes from the FoundationOne CDx gene panel by Foundation Medicine (FM) (https://www.foundationmedicine.com). A total of 7,246 patient samples were included from TCGA, together with 1,891 patients treated at Moores Cancer Center at UCSD and sequenced by FM.
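The top-10% definition above can be illustrated with a minimal sketch. It assumes a nearest-rank 90th percentile and treats samples strictly above that cutoff as amplificators; the source does not state which percentile convention or tie rule was used, and the function name and sample counts are hypothetical.

```python
def flag_amplificators(burdens, top_percent=10):
    """Return (cutoff, flags) for a list of per-sample amplification counts.

    Assumption: nearest-rank percentile, and a sample is an "amplificator"
    only if its count is strictly greater than the cutoff.
    """
    if not burdens:
        return None, []
    ranked = sorted(burdens)
    n = len(ranked)
    # Nearest-rank position of the (100 - top_percent)th percentile,
    # computed with integer ceiling division to avoid float rounding.
    rank = -(-n * (100 - top_percent) // 100)
    cutoff = ranked[rank - 1]
    flags = [count > cutoff for count in burdens]
    return cutoff, flags


# Hypothetical amplification counts for ten samples.
example_burdens = [0, 0, 1, 1, 2, 2, 3, 4, 9, 15]
cutoff, flags = flag_amplificators(example_burdens)
```

With real data, ties at the cutoff can make the flagged group smaller than exactly 10% of samples, so the tie rule should be fixed before any group comparison.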
In Table 1 Additional analysis compared the frequency of mutated genes in samples with amplification burden in the top 10% of samples vs. those in the bottom 90% using WGS or the FM panel (Table 2). In the TCGA dataset, alterations in TP53, BRAF, and KRAS were significantly associated with tumor amplification burden in both the FM panel and WGS. TP53 alterations were associated with increased amplifications, while BRAF and KRAS alterations were associated with decreased amplifications (all p < .01). In the UCSD dataset analyzed with the FM panel, only TP53 alterations (but not BRAF or KRAS alterations) were found to be significantly associated with increased tumor amplification burden (perhaps because this dataset was smaller and WGS was not available) (all p < .01 for significance).

Conclusions
The tumor suppressor TP53 has long been implicated in the development of diverse cancers. It is the most commonly mutated gene in cancer and has diverse functions important to oncogenesis. [12][13][14][15][16][17] Unfortunately, TP53 alterations are considered difficult to target from a therapeutic standpoint. [12][13][14][15][16][17] Groups have, however, reported an increase in the expression of vascular endothelial growth factor (VEGF) in a pan-cancer analysis, as well as improved outcomes in patients who receive VEGF/VEGFR inhibitors when their tumors harbor deleterious TP53 alterations, suggesting VEGF/VEGFR inhibition as a possible therapeutic proxy for targeting harmful TP53 alterations. 13,15,17 Importantly, mutations in the TP53 tumor suppressor increase genomic instability, corroborating the reputation of TP53 as the "guardian of the genome." 12 Interestingly, TP53 also likely plays a role in transcriptional regulation. 18 This inherent process of upregulating and downregulating various aspects of mRNA production may directly impact the tumor suppression function of this protein. 18,19 It may be that TP53 impacts mRNA expression through many TP53-dependent pathways that directly impact transcriptional regulation and via indirect transcriptional regulation (such as by virtue of secondary amplifications) as well as through posttranscriptional regulation. 18 Our data suggest that a subset of cancers have high tumor amplification burden, and these tumors are significantly more likely to bear TP53 mutations than those with lower tumor amplification burden. In contrast, BRAF and KRAS alterations correlated with decreased tumor amplification burden in the TCGA dataset. A limitation of our findings is that it is unclear why BRAF and KRAS alterations would correlate with a decreased tumor amplification burden.
It is also unclear why specific tumor types such as breast cancer and ovarian serous carcinomas are especially likely to have an amplificator phenotype, though the latter could be due to the fact that high-grade ovarian serous carcinomas demonstrate TP53 anomalies in about 90% of cases. 20 TP53 mutations may correlate with high tumor amplification burden because these mutations impair genomic stability, as evidenced by the association of loss-of-function TP53 mutations with an increased mutation rate. 21 Since amplifications (which generally [but not always] cause overexpression) 22 may be pharmacologically tractable in some cases, targeting them may be an indirect way to impact the consequences of TP53 mutation-related genomic instability.

Methods
Two distinct datasets were used for the statistical analysis. The first dataset was retrieved from the publicly available repository, The Cancer Genome Atlas (TCGA) (https://portal.gdc.cancer.gov/), which is a cohort of sequenced cancer samples from patients. Our second dataset was composed of a cohort of patients who had been treated and sequenced using the FoundationOne CDx gene panel (Foundation Medicine, Inc., Cambridge, MA) (https://corpsite.foundationmedicine.com/genomic-testing) (Supplemental Table S1) at UCSD. All studies were conducted under the auspices of an Internal Review Board (IRB) Committee-approved protocol (NCT02478931) and any investigational trials for which the patient gave consent. Data collected and reviewed retrospectively for this study from the UCSD cohort included genomic information from sequencing results detailing amplifications and mutations present (with gene localization) across all cancer types and using the genes in the FM panel. Data collected from TCGA included demographic information such as age (years), sex, primary cancer diagnosis, number of amplifications, and mutation status for all of the FM panel genes. Within the TCGA dataset, patient samples from all cancer types were queried.
Table notes: * In the TCGA data, a total of 7,246 samples that had copy number variation (CNV) and mutation data were curated from 11,245 possible TCGA samples across all cancer cohorts. 1 The phenotype "WGS amplificator" corresponds to tumors presenting a high number of amplifications considering the whole genome (top 10% amplification burden, within the whole genome): All p < 0.0001.
Descriptive statistics were tabulated to describe patient sample information, comparing amplificators to non-amplificators in both the TCGA cohort and our institutional dataset. For the TCGA cohort, statistical summaries were stratified into sub-groups based on amplificator vs. non-amplificator phenotype, as determined by both WGS and the FM gene panel (Supplemental Tables S2 and S3). Descriptive information included the number of patients matching the criteria for the amplificator or non-amplificator phenotype, the total number of TP53-mutated samples present in each sub-group, the average number of WGS amplifications per sample, and the average number of FM gene amplifications per sample. Similar analysis was conducted in the UCSD cohort; however, amplificators were determined solely based on the number of amplifications present in the FM gene panel, as WGS was not utilized for these patients. Additionally, for both cohorts, a second analysis was conducted to describe summary statistics based on TP53 mutation status. Patient samples were stratified into TP53 mutant and TP53 wild-type sub-groups. Within these two sub-groups, the average numbers of WGS amplifications and FM-panel amplifications per sample were determined and compared using Student's t-tests. Similar analysis was conducted for the UCSD dataset, comparing only the number of FM panel amplifications per sample in the TP53 mutant and wild-type sub-groups.
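As an illustration of the comparison of mean amplification counts between TP53 mutant and wild-type sub-groups, the sketch below computes a two-sample Student's t statistic. It assumes the classical equal-variance (pooled) form, which the source does not specify, and the amplification counts are hypothetical. A p-value would then be read from a t distribution with the returned degrees of freedom.

```python
import math
from statistics import mean, variance


def students_t(group_a, group_b):
    """Two-sample Student's t statistic with pooled variance.

    Assumption: equal variances in both groups (classical Student's test).
    Returns (t_statistic, degrees_of_freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two samples")
    df = n_a + n_b - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("both groups have zero variance")
    return (mean(group_a) - mean(group_b)) / std_err, df


# Hypothetical per-sample amplification counts.
tp53_mutant = [4, 6, 8, 10]
tp53_wild_type = [1, 2, 3, 2]
t_stat, dof = students_t(tp53_mutant, tp53_wild_type)
```

Amplification counts are typically right-skewed, so with small groups a rank-based test might be a reasonable sensitivity check alongside the t-test.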
Further analysis was performed to assess the number of mutations in common cancer genes (Supplemental Table S1) across samples with the top 10% of amplifications against samples with the bottom 90% of amplifications. The top four genes reported in our analysis were TP53, BRAF, KRAS, and GATA3. Odds ratios and Bonferroni-adjusted p-values were then calculated to compare mutation burden between the two amplification sub-groups. This analysis was performed using both the WGS and FM information for the TCGA dataset and only FM-panel genes for the UCSD dataset.
Table notes: Table S1). *** Using the UCSD dataset, which consists of 1,891 cancer samples, the 90th percentile for number of amplifications (FM panel) was calculated to be 9. **** Refers to number of samples with designated gene mutation/total samples in that subgroup. For instance, within TCGA database, 727 samples were in the top 10% for amplification burden ("amplificators"); of these 727 samples, 449 (65.7%) had a TP53 mutation.
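The odds-ratio and Bonferroni steps can be sketched as follows. This is a minimal illustration under stated assumptions: the 2x2 counts and raw p-values are hypothetical, the test used to obtain the raw p-values is not specified in the source, and the correction factor is assumed to be the number of genes tested.

```python
import math


def odds_ratio(mut_amp, wt_amp, mut_non, wt_non):
    """Odds of a gene being mutated in amplificators vs. non-amplificators.

    mut_amp / wt_amp: mutated / wild-type samples among amplificators.
    mut_non / wt_non: mutated / wild-type samples among non-amplificators.
    """
    if wt_amp == 0 or mut_non == 0:
        return math.inf  # ratio is undefined; reported as infinite here
    return (mut_amp * wt_non) / (wt_amp * mut_non)


def bonferroni(p_values):
    """Multiply each raw p-value by the number of tests, capped at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]


# Hypothetical 2x2 table for one gene.
example_or = odds_ratio(mut_amp=60, wt_amp=40, mut_non=300, wt_non=600)

# Hypothetical raw p-values for three genes tested together.
adjusted = bonferroni([0.001, 0.02, 0.5])
```

An odds ratio above 1 indicates that the mutation is enriched among amplificators, and a value below 1 indicates depletion, matching the direction reported for TP53 versus BRAF and KRAS.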
For the TCGA dataset, copy numbers were measured using whole-genome microarray. Gene-level focal copy number variations (CNVs) were normalized using data from all TCGA cohorts ("pan-cancer" data set) and estimated using the GISTIC2 threshold method, 23 where the values −2, −1, 0, 1, and 2 represented homozygous deletion, single-copy deletion, diploid normal copy, low-level amplification, and high-level amplification, respectively. Only high-level amplification (+2) was considered for this analysis.
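The counting rule for the TCGA data can be illustrated with a short sketch: given thresholded gene-level calls for one sample, only genes coded +2 contribute to the amplification burden. The dictionary layout, function name, and example calls are hypothetical and are not the format of any particular TCGA file.

```python
# Meaning of the thresholded copy-number codes described in the text.
GISTIC_LABELS = {
    -2: "homozygous deletion",
    -1: "single-copy deletion",
    0: "diploid normal copy",
    1: "low-level amplification",
    2: "high-level amplification",
}
HIGH_LEVEL_AMPLIFICATION = 2


def amplification_burden(gene_calls):
    """Count genes with a high-level amplification call (+2) in one sample."""
    for gene, call in gene_calls.items():
        if call not in GISTIC_LABELS:
            raise ValueError(f"unexpected call {call!r} for gene {gene}")
    return sum(1 for call in gene_calls.values()
               if call == HIGH_LEVEL_AMPLIFICATION)


# Hypothetical thresholded calls for a single sample.
sample_calls = {"ERBB2": 2, "MYC": 2, "CCND1": 1, "CDKN2A": -2, "TP53": 0}
burden = amplification_burden(sample_calls)
```

Low-level gains (+1) are deliberately excluded, so `CCND1` in this example does not add to the burden.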
For the UCSD dataset, copy numbers were measured using gene-panel capture sequencing (FoundationOne CDx test, Foundation Medicine, Inc.) and gene-level amplifications were reported when the number of copies exceeded 6.
All statistical analysis was conducted using a combination of Microsoft Excel version 16.42 (Microsoft Corporation, Redmond, Washington, USA) and R version 3.6.1 (R Foundation for Statistical Computing, Vienna, Austria).