Initial investigation of using Norwegian health data for the purpose of external comparator arms - an example for non-small cell lung cancer

Abstract
Background: Using real-world (RW) data from registries to mimic or substitute a comparator arm of clinical trials is increasingly investigated. The feasibility of using a data source for this purpose depends on how completely the source captures the wide spectrum of a disease's characteristics and relevant endpoints, which could allow for proper matching of important variables.
Materials and methods: This is a Norwegian study using data from three population-based registries: the Cancer Registry of Norway (CRN), the National Patient Registry (NPR), and the Norwegian Prescribed Drug Registry (LMR). We assessed whether the registries contained the information necessary for selecting a RW cohort of patients with characteristics that mimicked the inclusion and exclusion criteria from the control arm of the phase 3 KeyNote-407 (KN-407) trial on patients with stage IV, squamous NSCLC. We did this on both an aggregated and an individual data level. We also described the survival in the RW-cohorts and compared it with the survival in the control arm of the KN-407 trial.
Results: Using aggregated data from the CRN allowed us to find the patients based on some clinically relevant inclusion criteria, but only to a limited extent could we apply the exclusion criteria of KN-407. When we used individual data from the CRN, NPR and LMR, we could create a patient cohort that shared more criteria corresponding to the eligibility criteria from KN-407, including exclusion criteria. Compared to the 11.3 months (95% CI 9.5, 14.8) median survival in the control arm of KN-407, both RW-cohorts had a shorter median survival: 7.07 months (95% CI 6.7, 9.5) (individual) and 8.0 months (95% CI 5.8, 10.5) (aggregated).
Conclusion: Even if we demonstrated that the registries contain clinically relevant information necessary to mimic the eligibility criteria of a selected RCT, the survival is shorter for RW patients compared to their control arm counterpart in the RCT.


Background
In the current era of precision medicine, single-arm trials (SATs) are becoming more common, because randomized controlled trials (RCTs) might be unethical due to an unmet medical need, no defined standard treatment and/or too small patient groups [1][2][3]. A growing scientific field is how to use real-world data (RWD), e.g., registries and electronic health records, to create the standard of care arm and use that as an external comparator (EC) to SATs. As an example, in a proof-of-concept study, Carrigan et al. (2019) investigated how closely contemporaneous ECs derived from the US-based Flatiron Health database mirrored the overall survival observed in the control arms from non-small cell lung cancer (NSCLC) trials. For 10 of the 11 analyses they conducted, hazard ratio estimates using ECs were similar to those from the original RCTs, and they indicated that if ECs can consistently replicate the results from RCT data, they might serve as meaningful comparators to SATs [4]. While Carrigan et al. (2019) used curated electronic health records, Jemielita et al. (2021) used RWD from Swedish national healthcare registries to replicate the comparator arm of two previously published oncology RCTs. They concluded that the sources lacked information on relevant variables to fully match the comparator arm in the RCTs, and that overall survival was shorter in the RWD compared to the RCTs [5]. The latter underscores the importance of more research on how to maximize the usage of national health registries, especially in the context of external control arms.
To potentially substitute a control arm in clinical trials, it is important that the RWD source has sufficiently captured the wide spectrum of the disease characteristics, treatments, and relevant endpoints. In 2019/2020, the population-based Cancer Registry of Norway (CRN) led a unique public-private partnership project in Norway, called INSPIRE (INcreaSe PharmaceutIcal REporting). This was initiated to collect data automatically and electronically on cancer medication from the hospital systems to the CRN, retrospectively back to 2008 [6]. With the INSPIRE data in place, Norway has an overview of hospital-administered cancer medications linked to detailed clinicopathological features, in one registry. Few, if any, publications have used Norwegian data as a source for creating ECs, thus we know little about whether these sources are fit for this purpose. Both Carrigan et al. (2019) and Jemielita et al. (2021) focused more on investigating if the endpoints were the "same" in RWD as in the RCTs. We focused on whether the CRN (including the INSPIRE data) alone, and the CRN plus the National Patient Registry (NPR) and the Norwegian Prescribed Drug Registry (LMR), contain the necessary information to mimic inclusion and exclusion criteria from a recent phase 3 RCT, the KeyNote-407 trial (KN-407), on patients with stage IV, squamous NSCLC, for the proper identification of a real-world (RW) cohort. We asked two questions: can we find the patients, and how does the survival in the RW-cohorts that apply as many eligibility criteria as possible compare to the survival in the control arm of KN-407?

Data sources
This is a study using nationwide data from three population-based registries in Norway. The unique, 11-digit personal identification number (PIN) that each citizen living in Norway has allows for linkage between them. We used data on both an aggregated and an individual level and created two RW-cohorts: an aggregated RW-cohort (data only from the CRN) and an individual-linked RW-cohort (with data from the CRN, NPR and the LMR). Reporting and interpretation of these data are done by the authors. No endorsement of any statements made in this paper by the CRN, NPR, or LMR is intended, and none should be inferred.

The Cancer Registry of Norway
The Cancer Registry of Norway (CRN) was established in 1951, and all health institutions involved in cancer care are required by law to report all cases of malignant neoplasm. The main sources of information for the CRN are clinical notifications from the oncologist, pathology reports and death certificates. It has 12 Clinical Registries for specific cancers (including lung cancer), which provide a comprehensive overview of cancer-specific diagnostics, treatment modalities (surgery, radiation, and medical treatment) and follow-up [7].

The National Patient Registry
The National Patient Registry (NPR) has data from 2008 onwards and holds information on all patients who have been referred to or have received specialized healthcare at any specialist healthcare service in Norway, including private institutions. It contains ICD-10 codes for diagnostic data and includes administrative, demographic, and reimbursement information, in addition to information on surgical procedures, medical treatment and radiological procedures [8].

The Norwegian Prescribed Drug Registry
The Norwegian Prescribed Drug Registry (LMR) contains information on dispensed drugs (by prescription) from pharmacies. In summary, it contains information about the person receiving the drug(s), the prescriber, the pharmacy, and the product (name, dose, ATC code, generic name) [9]. In 2021, LMR replaced the Norwegian Prescription Database (NorPD), which had collected information since 2004.

Design and study selection
Patient selection from the CRN was based on the eligibility criteria of the KN-407 trial [10]. See the full list of eligibility criteria for KN-407 in the Supplementary material, Table S1. This is a phase 3 RCT comparing first-line (1L) treatment with pembrolizumab + chemotherapy to chemotherapy alone, in patients with stage IV squamous NSCLC. We assessed, one by one, the eligibility criteria in the trial against the availability of the "corresponding" information in our data sources. The overall aim was that our cohorts should be made up of patients that have as many characteristics as possible in common with the patients making up the control arm of KN-407. Note that we added "1L treatment with any regime containing chemotherapy" as an eligibility criterion to the RW-cohorts, to mimic the control arm of KN-407. With few exceptions, laboratory values are not reported to the CRN and are thus not taken into consideration in this study.

Aggregated RW-cohort
We reviewed the eligibility criteria of KN-407 and compared them with the information registered in the CRN that is available for aggregated data extraction. The aggregated RW-cohort was made up of patients diagnosed between 2014 and 2016. Follow-up was defined as the time from diagnosis (from 1 January 2014 onwards) to death or emigration, or up to 2 May 2022 for patients who were alive at that date. For the estimation of relative survival, the CRN uses a method based on the age-standardized Pohar-Perme method. For a detailed description of other statistical methods and requirements set out by the CRN, please see the Cancer in Norway report, Statistical methods [11]. The use of aggregated data is not in scope for regional ethics approval.

Individual RW-cohort
The patients who had a diagnosis of lung cancer were identified using the CRN. The individual CRN data were then linked with data from the NPR and the LMR based on the PIN. Patient-level data were de-identified to guarantee the protection of individual patient integrity. The individual RW-cohort was made up of patients diagnosed between 2015 and 2021, who were followed from diagnosis to death or emigration, or up to 31 December 2021 for patients who were alive at that date. Ethical approval was obtained from the Regional Ethical Committee, ID number 108024. Median survival estimates with 95% confidence intervals were computed, as well as descriptive statistics.
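The linkage step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names (`pin`, `icd10`, `atc`, `pid`), the example values and the salted-hash pseudonymization are hypothetical and are not the procedure used by the registries or in this study.

```python
import hashlib


def pseudonymize(pin: str, salt: str) -> str:
    """Replace a PIN with a short salted SHA-256 digest (illustrative only)."""
    return hashlib.sha256((salt + pin).encode("utf-8")).hexdigest()[:16]


def link_registries(crn, npr, lmr, salt):
    """Attach NPR diagnoses and LMR dispensings to each CRN case via the PIN.

    CRN cases define the cohort; NPR/LMR rows without a CRN match are ignored.
    The PIN itself is dropped from the linked output.
    """
    npr_by_pin = {}
    for row in npr:
        npr_by_pin.setdefault(row["pin"], []).append(row["icd10"])
    lmr_by_pin = {}
    for row in lmr:
        lmr_by_pin.setdefault(row["pin"], []).append(row["atc"])

    linked = []
    for case in crn:
        pin = case["pin"]
        record = {key: value for key, value in case.items() if key != "pin"}
        record["pid"] = pseudonymize(pin, salt)
        record["npr_icd10"] = npr_by_pin.get(pin, [])
        record["lmr_atc"] = lmr_by_pin.get(pin, [])
        linked.append(record)
    return linked


# Fictional example rows (hypothetical PINs and codes).
crn = [
    {"pin": "01010112345", "icd10": "C34", "stage": "IV"},
    {"pin": "02020254321", "icd10": "C34", "stage": "IV"},
]
npr = [
    {"pin": "01010112345", "icd10": "I50"},
    {"pin": "03030399999", "icd10": "J44"},  # no CRN case, so not linked
]
lmr = [{"pin": "02020254321", "atc": "H02AB06"}]

linked = link_registries(crn, npr, lmr, salt="demo-salt")
```

In this sketch the CRN acts as the index registry, mirroring the text: patients are first identified in the CRN, and the other two sources only contribute additional variables for those patients.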

Results
For a detailed description of how we assessed the correspondence between the trial eligibility criteria and the RW data sources, see the Supplementary Material, Table S1.

Aggregated RW-cohort
By using only aggregated data from the CRN, we were able to identify relevant patients in terms of sex, age, stage, cancer-specific ICD-code, histology, treatment, and survival outcome. Table 1 presents the eligibility criteria that made up the aggregated RW-cohort.
Regarding criterion number 5, "Information on systemic cancer treatment available", we found that the medication (INSPIRE) data reported to the CRN for 2014-2016 were incomplete, and approximately 43% of patients were excluded because of this. The CRN does not have information on comorbidities, nor on any other pre-existing (non-cancer) conditions or co-medications for those. We were therefore unable to exclude patients from the cohort based on the "non-cancer" exclusion criteria of KN-407.

Individual RW-cohort
Compared to aggregated data, we were able to select a patient cohort that shared more criteria corresponding to the eligibility criteria from KN-407, by using individual CRN data linked with data from the NPR and LMR. In addition to the criteria from the aggregated cohort, we could select ECOG status and exclude patients based on variables suggesting the presence of comorbidities. The eligibility criteria for the individual RW-cohort are summarized in Table 2.
Table 3 compares the baseline characteristics of the aggregated and individual RW-cohorts with those of the control arm in KN-407. The aggregated cohort was the smallest, with 76 patients. Patients were included regardless of ECOG status due to low data quality in 2014-2016, and 47.4% had ECOG 2, unknown or missing ECOG status. A total of 119 patients made up the individual RW-cohort, of which the majority had an ECOG status = 1 (66.4%), which was comparable to the 68% of patients having ECOG status = 1 in the control arm of KN-407. Although PD-L1 score was not an inclusion criterion in KN-407, we demonstrate the opportunity for PD-L1 stratification as in the KN-407 trial. A higher proportion of patients in the KN-407 control arm had a high PD-L1 score, compared to the RW-cohort (in addition, the RW-cohort had a larger proportion of patients with missing PD-L1 information). It was not standard practice to measure PD-L1 scores in 2014-2016, thus we do not have that information for the aggregated cohort. The median age in the individual RW-cohort was five years higher than in the control arm of KN-407 (70 years vs. 65 years), while it was 68 years in the aggregated cohort. The proportion of patients under 65 years old was lower in the RW-cohorts than in the KN-407 control arm (31.8% and 22.7% vs. 45.2%, respectively). All cohorts were predominantly male (72.4%, 73.9% and 83.9% in the aggregated cohort, the individual cohort and the KN-407 control arm, respectively).

Survival of the RW-cohorts compared to the survival in the control arm of KN-407
A secondary objective was to describe the survival in the two RW-cohorts and compare them to each other, and to the control arm of KN-407. Figure 1 shows the median survival of the respective cohorts. Compared to the control arm of KN-407 (11.3 months), both the RW-cohorts had a shorter median survival. The RW-cohort derived from the linked, individual data had a shorter median survival (7.07 months) than the RW-cohort derived from the aggregated data (8.0 months). This is a naïve comparison, and interpretations must be done with caution.
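To make the notion of median survival concrete, the following minimal Python sketch computes a median from right-censored follow-up data. The text does not name the estimator, so the Kaplan-Meier product-limit estimator is an assumption here; the function name `km_median` and the follow-up times are made up for illustration and do not reproduce the study data or Figure 1. Confidence intervals are deliberately left out.

```python
def km_median(times, events):
    """Kaplan-Meier median survival for right-censored data.

    times  : follow-up time per patient (e.g. months from diagnosis)
    events : 1 if the patient died at that time, 0 if censored
             (alive at end of follow-up, or emigrated)
    Returns the smallest observed time at which the estimated survival
    probability is 0.5 or lower, or None if it never drops that far.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            if survival <= 0.5:
                return t
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= leaving
    return None


# Hypothetical follow-up data in months (not from the study).
months = [2, 4, 5, 7, 7, 9, 12, 15]
died = [1, 1, 0, 1, 1, 1, 0, 1]
median_months = km_median(months, died)
```

Because censored patients leave the risk set without counting as deaths, two cohorts with different follow-up windows can still be summarized on the same scale, which is what the naïve comparison above relies on.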

Discussion
The aims of this study were to apply the eligibility criteria of the KN-407 trial to the Norwegian population-based registries CRN, NPR and LMR and to investigate if we could create RW-cohorts of squamous NSCLC patients who had been treated with chemotherapy, to mimic the cohort in the trial control arm. We also wanted to assess how the RW survival compared to the survival in the control arm of KN-407. First, using aggregated data from the CRN for this purpose allowed us to select patients based on some clinically relevant inclusion criteria, but only to a limited extent could we apply the exclusion criteria of KN-407 to the CRN. When we used individual data from three linked registries, we could select a patient cohort that shared more criteria corresponding to the eligibility criteria from KN-407, including exclusion criteria. Second, both RW-cohorts displayed a shorter survival than the control arm of KN-407. In summary, even if we can demonstrate that the registries contain clinically relevant information necessary to mimic the eligibility criteria of a selected RCT, the survival is shorter for RW patients compared to their control arm counterpart in the RCT. In the following, we discuss some key considerations of our findings.
Using aggregated data from the CRN for this purpose conforms to the simplest design that may be suggested for use in practice, but it is a realistic scenario for secondary use of data. The general conception is that aggregated data have limited applicability, but we find it meaningful to investigate the extent of information that can be extracted with this method, which could be done quickly as these data are publicly available without the need for ethical approval. Even though we showed on a group level that we can create a cohort of patients that shares many clinical characteristics with those in the study arm of KN-407, we do not know the characteristics of each individual and we could not exclude patients based on other pre-existing (non-cancer) conditions. It is also not possible to follow the treatment trajectory of each patient. Thus, we argue that it is not sufficient to use this method if the aim is to act as a formal external control arm to a clinical trial. Somewhat surprisingly, our data showed that the survival of the aggregated cohort was longer (8.0 months) than the survival in the individual cohort (7.07 months). Our data do not support any causal explanation for this, but we highlight some plausible explanations. A considerable proportion of patients (approximately 43%) were excluded from the aggregated cohort due to the incomplete medication data in the period 2014-2016. We believe that more patients could have been identified in the aggregated cohort if the medication data had been complete in this period. This could have introduced a skewed patient population in our aggregated cohort. In addition, it is known that ECOG performance status is one of the most important independent prognostic factors in multiple tumors, including advanced NSCLC [12,13], and ECOG performance status of 0-1 was a requirement in KN-407. However, the baseline description in Table 3 shows that the aggregated cohort had 47.4% of patients with ECOG 2, unknown or missing, thus we do not know if the aggregated cohort has a high proportion of patients with favorable ECOG status. Both the lower median age (68 years vs. 70 years) and the higher proportion of patients younger than 65 years (31.8% vs. 22.7%) in the aggregated vs. the individual RW-cohort, respectively, could also imply favorable survival data for the aggregated cohort. We cannot conclude that these are the sole reasons for the longer survival seen in the aggregated cohort vs. the individual cohort. Both patient numbers and survival differences are relatively small, so we should be careful not to over-interpret the underlying reasons.
In all aspects, the individual RW-cohort should theoretically be more similar to the control arm of KN-407, since we could use both ECOG status (the individual cohort was selected from the period 2015-2021) and co-morbidities as exclusion variables. But again, even if they shared similar clinical characteristics, the survival was lower in the individual RW-cohort compared to the KN-407 control arm (7.07 months vs. 11.3 months). Taking the baseline characteristics into account, the RW-cohort was both older (70 years vs. 65 years) and had a lower proportion of patients with high PD-L1 tumor proportion score (including missing PD-L1 information), compared to the KN-407 control arm. To our knowledge, there is limited documentation on survival outcomes of RW patients mimicking as many trial eligibility criteria as possible, but when Cramer-van der Welle et al. (2018) assessed the difference between outcomes in metastatic NSCLC clinical trials and the "real world", they showed that the survival of patients treated with chemotherapy or targeted therapy in RW practice was nearly one quarter shorter than for patients included in clinical trials [14]. We also know that 31.7% of the patients in the control arm of KN-407 crossed over to receive pembrolizumab after the occurrence of disease progression [10], which is also likely to contribute to the longer survival seen in the trial, compared to the individual RW-cohort. Although we cannot conclude that it was due to disease progression, a proportion of the individual RW-cohort comparable to that in KN-407 was treated with PD-1/PD-L1 inhibitors as 2L treatment (data not shown).
All patients in the RW-cohorts were treated with chemotherapy as their 1L treatment, and the most common chemotherapy regimen in Norway was a combination of vinorelbine + carboplatin (VC). This differed from the chemotherapy regimen used in the control arm of KN-407, which was carboplatin + paclitaxel or nab-paclitaxel [10]. To our knowledge, no randomized study has compared the efficacy of all combinations of chemotherapy regimens for squamous NSCLC, and the treatment guidelines state that physicians often must choose one chemotherapy regimen over another based on other factors, including drug schedule and adverse events [15]. Features such as dosing, frequency, combinations, and treatment sequence were not assessed in our study. Few studies have investigated the RW efficacy of VC, but a study from 2008 showed that patients with stage IIIB and IV NSCLC treated with VC had a median survival of 7.3 months, which suggests that our survival data are consistent with earlier findings [16].
We argue that for a trial like KN-407, the CRN in combination with NPR and LMR has sufficient information on the crucial, clinically relevant variables to select a patient population with stage IV, squamous NSCLC treated with chemotherapy. However, a study like KN-407 is not the type of trial that would require an external comparator (EC) in the future, as this is an RCT with a large patient population. Interestingly, one out of the 11 analyses that Carrigan et al. (2019) conducted had a discordant result. The authors proposed that the likely explanation was that the EHR data used for creating the EC lacked information on a prognostic biomarker (mesenchymal-to-epithelial transition (MET)). This led the EHR-cohort to have fewer MET-positive patients compared with the MET-positive-enriched RCT, thereby skewing the EC to have a longer OS than the RCT population it was trying to replicate [4]. The future use of ECs in oncology is evolving, and Mishra-Kalyani et al. (2022) discuss that, with formal analysis, it might be possible that comparative effectiveness between an intervention arm and an external control arm would provide the primary evidence to support regulatory approval in oncology [17]. Regardless of the future area of use, if there are key issues with the comprehensiveness and validity of the data source, the source will fail in its attempt to create an EC.
On a side note, estimates for clinical trial participation for adults with cancer in the US are 5% or lower [18]. One of the largest external validity studies on RCTs was done by Yi Tan et al. (2022), who demonstrated that overly stringent RCT exclusion criteria do not appropriately account for the heterogeneity of characteristics observed in RW-populations [19]. The American Society of Clinical Oncology (ASCO), Friends of Cancer Research, and the US Food and Drug Administration advocate modernizing the criteria related to comorbidities that are used to exclude patients from cancer clinical trials, to improve clinical trial participation [18]. This emphasizes the need for RWD sources to be comprehensive, as it becomes even more obvious that RWE and RCTs should be considered mutually complementary.
The present study is descriptive, focusing on the qualitative aspects of the availability of data variables in the registries, and it has some limitations. Our approach applies to the stage IV NSCLC disease setting, and it may not generalize to other cancer types in the registries. Importantly, even though we were able to select patient groups who shared many traits with the control arm of KN-407, the RW-cohorts had shorter survival than the control arm of KN-407. The baseline characteristics showed that the groups differed in terms of age, ECOG status and PD-L1 tumor proportion score, which likely influenced the outcome. We cannot exclude that the difference is also due to fundamental differences between the populations that are unknown. An important note is also that our RW-cohort consisted of all patients matching as many of the eligibility criteria as possible. We did not perform any adjustments to make the patients comparable on a 1:1 individual basis, e.g., with propensity score matching, since this study did not have access to the individual data from KN-407. If one wants to formally assess the comparability between a control arm and an EC from an RWD source, one should apply relevant statistical methods to address any bias influencing the treatment effect estimate.
In conclusion, even if the RW-cohorts have clinical characteristics similar to those in the control arm of KN-407, we did not end up with the same survival as in the published study. As we have discussed, this might be due to reasons other than that the registries are unfit for this purpose. We have limited knowledge about the use of Norwegian health data in this setting, and this study highlights important factors to consider when moving forward in this field. The comprehensiveness and full population coverage of the CRN, linked to other nationwide registries in Norway, should attract the interest of academia, regulators, and the global pharmaceutical industry, to further evaluate whether its RWD can be used for the purpose of creating external comparator arms.

Disclosure statement
SB is employed by Merck Norway. Since 2021, she has held a three-year industrial PhD position, during which she is released from any mandatory work for Merck. The industrial PhD project is publicly financed by the Norwegian Research Council, grant number 321291. She is enrolled as a PhD student at the Medical Faculty at the University of Oslo, where her research focuses on the use of real-world data and specifically how Norwegian lung cancer registry data can be used as an external control arm for clinical trials. She is supervised by AH MD, PhD and ST MD, PhD.
ST MD, PhD is the external supervisor of SB. He is an external consultant to Merck Norway and a founder of the company NordicRWE.
SBB MSc is employed by NordicRWE.
AH MD, PhD is the internal supervisor of SB. In association with research/clinical studies, she has received financial support and/or study drug from AstraZeneca, Roche, Novartis, Incyte, Eli Lilly, Ultimovacs and BMS. Advisory board/advice: AstraZeneca, BMS, Janssen, MSD, Pfizer, Roche, Takeda, Sanofi, Bayer, Eli Lilly, Abbvie. All payments to institution.
The funding bodies had no role in the data collection and analysis and were not involved in the interpretation of results, writing, revision, or approval of the manuscript.

Table 1. Eligibility criteria (inclusion and exclusion combined) of the RW-cohort.
Defining the aggregated RW-cohort:
1. Adult patients (18 years and older) diagnosed from 2014-2016, in Norway
2. Squamous cell carcinoma, NSCLC (ICD10 C33-34)
3. Histologically verified
4. Stage IV/metastatic
5. Information on systemic cancer treatment available
6. 1L treatment with any regime containing chemotherapy
7. Excluding patients with any other malignancy with regional disease or more advanced, including unknown status before primary lung cancer diagnosis
8. Excluding patients with active brain metastasis
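The criteria in Table 1 can be read as a single conjunctive filter. The Python sketch below illustrates that reading only; the `Case` record, its field names and the example values are hypothetical and do not reflect the CRN data model or how the aggregated extraction was actually performed.

```python
from dataclasses import dataclass, replace
from datetime import date


@dataclass(frozen=True)
class Case:
    """Hypothetical one-row-per-patient record; all field names are assumptions."""
    age: int
    diagnosis_date: date
    icd10: str
    histology: str
    histologically_verified: bool
    stage: str
    treatment_info_available: bool
    first_line_has_chemo: bool
    prior_advanced_malignancy: bool
    active_brain_metastasis: bool


def eligible(c: Case) -> bool:
    """Apply criteria 1-8 of Table 1; every criterion must hold."""
    return (
        c.age >= 18                                                   # 1
        and date(2014, 1, 1) <= c.diagnosis_date <= date(2016, 12, 31)
        and c.icd10[:3] in {"C33", "C34"}                             # 2
        and c.histology == "squamous"
        and c.histologically_verified                                 # 3
        and c.stage == "IV"                                           # 4
        and c.treatment_info_available                                # 5
        and c.first_line_has_chemo                                    # 6
        and not c.prior_advanced_malignancy                           # 7
        and not c.active_brain_metastasis                             # 8
    )


# A fictional patient who meets every criterion, plus three variants.
base = Case(
    age=68,
    diagnosis_date=date(2015, 6, 1),
    icd10="C34.1",
    histology="squamous",
    histologically_verified=True,
    stage="IV",
    treatment_info_available=True,
    first_line_has_chemo=True,
    prior_advanced_malignancy=False,
    active_brain_metastasis=False,
)
candidates = [
    base,
    replace(base, stage="III"),
    replace(base, treatment_info_available=False),
    replace(base, diagnosis_date=date(2017, 1, 15)),
]
cohort = [c for c in candidates if eligible(c)]
```

Writing the criteria as one explicit predicate makes it easy to see which exclusions depend on data the source lacks: in this sketch, a missing treatment record (criterion 5) removes a patient just as a clinical exclusion would.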

Table 2. Eligibility criteria (inclusion and exclusion) of the individual RW-cohort.

Table 3. Selected (based on available data) baseline characteristics of the aggregated and individual RW-cohorts compared with the corresponding characteristics of the control arm in KN-407.