A validation of register-derived diagnoses of interstitial lung disease in patients with inflammatory arthritis: data from the NOR-DMARD study

Objective: There is a lack of knowledge concerning the validity of the interstitial lung disease (ILD) diagnoses used in epidemiological studies on rheumatic diseases. This paper seeks to verify register-derived ILD diagnoses using chest computed tomography (CT) and medical records as a gold standard.
Method: The Norwegian Anti-Rheumatic Drug Register (NOR-DMARD) is a multicentre prospective observational study of patients with inflammatory arthritis who start treatment with disease-modifying anti-rheumatic drugs. NOR-DMARD is linked to the Norwegian Patient Registry (NPR) and Cause of Death Registry. We searched registers for ILD coded by ICD-10 J84 or J99 among patients with rheumatoid arthritis, psoriatic arthritis, or spondyloarthritis. We extracted chest CT reports and medical records from participating hospitals. Two expert thoracic radiologists scored examinations to confirm the ILD diagnosis. We also searched medical records to find justifications for the diagnosis following multidisciplinary evaluations. We calculated the positive predictive values (PPVs) for ILD across subsets.
Results: We identified 71 cases with an ILD diagnosis. CT examinations were available in 65/71 patients (91.5%), of whom ILD was confirmed on CT in 29/65 (44.6%). In a further 10 patients, medical records confirmed the diagnosis, giving a total of 39/71 verified cases. The PPV of a register-derived ILD diagnosis was thus 54.9%. In a subset of patients who had received an ILD code at two or more time-points and had a CT scan taken within a relevant period, the PPV was 72.2%.
Conclusion: The validity of register-based diagnoses of ILD must be carefully considered in epidemiological studies.

Interstitial lung disease (ILD) is the most common pulmonary manifestation in rheumatoid arthritis (RA) and is associated with poor survival (1).
The diagnosis of ILD in the clinic relies on high-resolution computed tomography (HRCT) which, to date, is the gold standard for the diagnosis (2-4). A multidisciplinary discussion involving clinicians from the fields of radiology, pulmonology, pathology, and rheumatology is also important in the diagnostic process (4,5). The estimated prevalence of ILD varies between studies and according to the data sources (5) and methods applied. Studies using data derived from hospital registers or Medicare have calculated the prevalence of ILD in RA populations to be approximately 2% (6,7), while studies based on data from chest HRCT and/or pulmonary function testing have, in general, found a higher prevalence in the range of 4-15% (8,9). Little is known concerning the risk of ILD in patients with psoriatic arthritis (PsA) or spondyloarthritis (SpA), although case reports and retrospective studies of lung involvement in PsA populations indicate a lower risk than in RA (10,11).
Register linkage studies, where information on patients in a cohort is retrieved from a register of hospital discharge data and/or a cause of death register, are increasingly popular in epidemiology, and are particularly relevant in studies of infrequent events. However, knowledge of the validity of register-derived ILD diagnoses in patients with inflammatory arthritis is limited, challenging the interpretation of results from such studies (12,13).
In this study, we sought to verify a register-derived diagnosis of ILD in patients with inflammatory arthritis, by evaluating the presence of ILD on chest computed tomography (CT) images or seeking confirmation from medical records. We also aimed to examine the positive predictive value (PPV) of the ILD diagnosis in subsets of the cohort.

Patient cohort
The Norwegian Anti-Rheumatic Drug Register (NOR-DMARD) is a multicentre prospective observational longitudinal study, which was established in 2000. The register includes patients with a diagnosis of inflammatory arthritis who start treatment with a disease-modifying anti-rheumatic drug (DMARD) (14). The inclusion criteria were revised in 2012, and only patients who start a biological DMARD have been included since that year (15).

Data from national registers
NOR-DMARD was linked to the Norwegian Patient Registry (NPR) and Norwegian Cause of Death Registry for patients included in the period from 1 January 2008 to 31 December 2018. The former records all International Classification of Diseases, 10th revision (ICD-10) diagnoses given at any hospital admission, whereas the latter records ICD-10 diagnoses given on death certificates. By searching both registers, we identified patients who had been diagnosed with ILD at one or more time-points. The ICD-10 codes used were J84, or J99 with an additional diagnosis of inflammatory arthritis. J84 is defined as 'Other interstitial pulmonary disease' and J99 as 'Respiratory disorders in diseases classified elsewhere'. We also extracted patients who had been diagnosed with ICD-10 J70, 'Respiratory conditions due to other external agents', which include drug-induced ILD (16).
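The code-based selection described above could be sketched as follows. This is only an illustration, not the study's actual extraction procedure: the record layout (pairs of patient identifier and ICD-10 code), the function name `patients_with_ild_code`, and the arthritis code prefixes in the usage example are assumptions, and the sketch assumes that the "additional diagnosis of inflammatory arthritis" may appear anywhere in the same patient's register history.

```python
from collections import defaultdict


def patients_with_ild_code(records, arthritis_prefixes):
    """Return patient IDs with J84, or with J99 plus an arthritis code.

    records: iterable of (patient_id, icd10_code) pairs (hypothetical layout).
    arthritis_prefixes: ICD-10 prefixes the caller treats as inflammatory
        arthritis (an assumption; the source does not list them).
    """
    codes_by_patient = defaultdict(set)
    for patient_id, code in records:
        codes_by_patient[patient_id].add(code.strip().upper())

    prefixes = tuple(arthritis_prefixes)
    selected = set()
    for patient_id, codes in codes_by_patient.items():
        has_j84 = any(c.startswith("J84") for c in codes)
        has_j99 = any(c.startswith("J99") for c in codes)
        has_arthritis = any(c.startswith(prefixes) for c in codes)
        # J84 qualifies on its own; J99 only together with an arthritis code.
        if has_j84 or (has_j99 and has_arthritis):
            selected.add(patient_id)
    return selected


# Toy records for illustration only (not study data).
toy_records = [
    ("a", "J84.1"),
    ("b", "J99.0"), ("b", "M05.1"),
    ("c", "J99.0"),
    ("d", "J70.2"),
]
ild_patients = patients_with_ild_code(toy_records, ("M05", "M06"))
```

In this toy example, patient "c" is not selected because the J99 code is not accompanied by an arthritis code, and patient "d" is not selected because J70 was handled separately in the study.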

Assessment of ILD on chest CT
All available chest CT reports from patients who had received an ILD ICD-10 code were extracted. The CT examinations were reviewed by two experienced chest radiologists (FA and TMA). FA scored all chest CT scans for the presence and extent of ground glass opacities, consolidations, reticular pattern, nodules, and interlobal interstitial thickening to confirm ILD (17). The ILD pattern on chest CT was described when possible (18,19). TMA reviewed the chest CT examinations again to confirm findings in cases of uncertainty or negative findings and to ensure agreement with FA. Although both radiologists were aware that all patients had received at least one ILD diagnosis in a patient register, they were blinded to information extracted from medical records and to the details of the ILD diagnosis given. The extracted chest CT examinations were classified as relevant to the register-derived ILD diagnosis if they had been performed within a predefined time frame of 3 months prior to, or within 1 year after, the first ILD diagnosis. CT examinations obtained outside the relevant time frame could also be considered to confirm the diagnosis.
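The predefined time frame can be illustrated with a minimal sketch. The function names, the use of whole calendar months, and the inclusive boundaries are assumptions made for illustration; the source does not specify how the window edges were handled.

```python
import calendar
from datetime import date


def shift_months(d: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping the day if needed."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def ct_is_relevant(ct_date: date, first_ild_date: date) -> bool:
    """True if the CT falls 3 months before to 1 year after the first ILD code.

    Boundaries are treated as inclusive (an assumption).
    """
    window_start = shift_months(first_ild_date, -3)
    window_end = shift_months(first_ild_date, 12)
    return window_start <= ct_date <= window_end


# Hypothetical dates for illustration only.
first_code = date(2015, 5, 31)
inside = ct_is_relevant(date(2015, 3, 10), first_code)
too_early = ct_is_relevant(date(2015, 2, 27), first_code)
```

A CT classified as not relevant by such a rule could still be used to confirm the diagnosis, as stated above; the rule only determines which examinations count as relevant to the first register diagnosis.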

Assessment of ILD in medical records
The coauthors conducted a manual search of the medical records at relevant hospitals for patients who had a register-derived diagnosis of ILD, but who did not have a confirmed ILD on chest CT. The search focused on medical records from the period of 1 year prior to the first diagnosis and the subsequent years. Information regarding the justification for the register ILD diagnosis was extracted when available. We extracted information on the patient's medical history, frequency of diagnosis, change in diagnosis over time, which specialist made the diagnosis (pulmonologist, rheumatologist, other), and whether the diagnosis had been confirmed by multidisciplinary discussions and/or evaluations.

ILD verification
A register-derived diagnosis of ILD was considered to be verified if it was confirmed on chest CT. We also considered diagnoses as verified if the reason was apparent in the medical records and confirmed by a multidisciplinary discussion and/or a pulmonologist and/or a rheumatologist on more than one occasion.

Validity of a register-derived ILD diagnosis
We explored the PPV of register-derived ILD diagnoses in selected cohort subsets. The examined subsets were selected based on the literature and expert opinion. The applied selection criteria are shown in Figure 1.

Ethics
The study has been approved by the Regional Ethical Committee of South-Eastern Norway and by the data protection officers at each hospital. To ensure study feasibility, only patients from NOR-DMARD centres with five or more cases of ILD were included in this study.

Statistics
We compared key demographic variables across groups using the Student's t-test, Mann-Whitney U-test, and chi-squared test, as appropriate. The assumptions for the Student's t-test were checked by visual inspection of the distribution and Levene's test. The PPVs were calculated as the number of patients with a verified diagnosis divided by the number of patients identified in the category, expressed as a percentage. All statistical analyses were performed in Stata 17 (StataCorp, College Station, TX, USA).
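As a worked illustration of the PPV definition, the following sketch reproduces the two headline proportions reported in the abstract (39/71 and 29/65). The study's analyses were run in Stata; this Python snippet and the function name `ppv_percent` are only illustrative.

```python
def ppv_percent(n_verified: int, n_identified: int) -> float:
    """PPV as a percentage: verified cases / register-identified cases."""
    if n_identified <= 0:
        raise ValueError("n_identified must be positive")
    if not 0 <= n_verified <= n_identified:
        raise ValueError("n_verified must lie between 0 and n_identified")
    return 100.0 * n_verified / n_identified


# Counts taken from the abstract of this paper.
overall_ppv = ppv_percent(39, 71)   # verified on CT or in medical records
ct_only_ppv = ppv_percent(29, 65)   # confirmed on CT among those with CT

print(f"Overall PPV: {overall_ppv:.1f}%")
print(f"CT-confirmed share: {ct_only_ppv:.1f}%")
```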

ILD in patients with inflammatory arthritis
We linked 4548 individual patients (2617 from Diakonhjemmet Hospital, 1165 from University Hospital of Northern Norway, and 766 from Lillehammer Hospital for Rheumatic Diseases) to the NPR and the Death Register. The numbers of patients with RA, PsA, SpA, and unspecified inflammatory arthritis were 1541 (33.9%), 930 (20.5%), 1445 (31.8%), and 632 (13.9%), respectively.
We identified 71 individual cases with an ICD-10 J84 or J99 ILD diagnosis: 61 cases in patients with RA, five in PsA, and five in SpA patients. The average age at the time of first ILD diagnosis was 60.6 ± 11.5 years (mean ± sd) and 39 (54.9%) were female. Biopsies had been performed in three patients, and the ILD diagnosis was subsequently rejected in two of these.
Pulmonary function tests were performed in 42 patients and a forced vital capacity < 80% was found in 14 (33%) of these patients.
Baseline demographics are presented in Table 1.

Verification of ILD (ICD-10 J84 or J99) on chest CT
Chest CT examinations were available in 65 patients, of which 53 had been performed within the prespecified relevant time frame. The ILD diagnosis was confirmed by CT in 26 of the 53 patients for whom a CT examination within the prespecified time frame was available. Furthermore, the ILD diagnosis was confirmed by a radiologist in three of the 12 patients who had chest CT available only from time-points outside the predefined time frame. In these three patients, the CT examination had been performed more than 3 months prior to the first ILD diagnosis.
Of the patients who were not diagnosed with ILD by chest CT, several had findings indicative of ILD, such as sparse ground glass opacities or interstitial thickening, but not to an extent sufficient to justify an ILD diagnosis.
We thus verified the ILD diagnoses on chest CT in 29 out of the 65 patients (44.6%) who had received an ICD-10 J84 or J99 diagnosis and had available chest CT reports. Table 2 lists the ILD patterns on these 29 chest CT examinations.

Verification of ILD (ICD-10 J84 or J99) in medical records
We succeeded in retrieving the medical records of 40 out of the 42 patients for whom ILD (ICD-10 J84 or J99) was not confirmed by chest CT examinations (Table 2). Of these, 17 had received ILD diagnoses while under investigation for respiratory symptoms such as dyspnoea and persistent coughing. Another four patients were reported to have pulmonary fibrosis due to other specified causes such as inorganic dust or radiation, and five patients were identified as having possible acute pneumonitis related to treatment for arthritis, such as methotrexate (2) and leflunomide (2). In six patients, no reason for the ILD diagnosis was identified in the medical records (1). Finally, in 10 patients, the ILD diagnosis was confirmed by several medical specialists. The ILD diagnoses were thus verified in 10 patients based on information from the medical records. The diagnosis of ILD (ICD-10 J84 or J99) was thus confirmed in 39 out of 71 patients (54.9%) who had a register-derived ILD diagnosis. Twenty-nine of the confirmed cases were diagnosed as J84, while 10 were diagnosed as J99. We found 35 cases of confirmed ILD in patients with RA, two cases in patients diagnosed with PsA, and two cases in patients diagnosed with SpA.
Comparing the 35 patients with RA and verified ILD with the 26 RA patients with a register-derived non-verified ILD, we found a significantly higher age in the patients with verified ILD (mean ± sd 65.8 ± 8.7 years vs 57.6 ± 10.4 years, p < 0.05) and a non-significantly higher proportion of males [17 (48.6%) vs 10 (38.5%), p = 0.45]. Disease duration at the first registered ILD event was non-significantly shorter in patients with a verified ILD diagnosis (mean ± sd 9.8 ± 7.8 years vs 14.2 ± 11.7 years, p = 0.48).

PPV of ILD diagnoses across cohort subsets
We explored the PPV of register-derived ICD-10 codes for ILD across subsets of the total cohort. The subset selections are presented in supplementary Figure S1 and the PPV in each subset is presented in Figure 1.
The highest PPV in any subset was 5/5 (100%), as the diagnosis was confirmed in the five patients with RA who had been diagnosed with ICD-10 J99 at two or more time-points with a chest CT scan performed within the prespecified relevant period. The greatest PPV in RA patients with ICD-10 J84 was 18/25 (72%), in the subset of patients with at least two ILD diagnoses and a chest CT scan taken within the prespecified relevant period. However, by applying the selection criteria, several patients with verified ILD were disregarded. For example, only 23 out of the 35 RA patients with verified ILD were selected according to the final criteria.
When the analysis was not restricted to patients with available CT examinations, the PPV for patients with at least two ILD diagnoses was 60%, and for RA patients this increased to 67% for ICD-10 J84 and 75% for ICD-10 J99. The PPV was not improved by selecting patients with ILD as a first diagnosis, by selecting patients with three or more ILD diagnoses, or by applying a criterion of ≥ 30 days between two ILD diagnoses.
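The trade-off between a higher PPV and the loss of verified cases can be illustrated with a small sketch. The `Case` fields, the function `evaluate_subset`, and the toy records are hypothetical and do not represent study data; the selection rule simply mirrors the criteria named above (a minimum number of ILD codes and, optionally, a CT within the relevant period).

```python
from dataclasses import dataclass


@dataclass
class Case:
    n_ild_codes: int       # number of time-points with an ILD code
    has_relevant_ct: bool  # CT taken within the prespecified period
    verified: bool         # ILD verified on CT or in medical records


def evaluate_subset(cases, min_codes=2, require_relevant_ct=True):
    """Return subset size, PPV (%), and share of verified cases retained (%)."""
    selected = [
        c for c in cases
        if c.n_ild_codes >= min_codes
        and (c.has_relevant_ct or not require_relevant_ct)
    ]
    verified_selected = sum(c.verified for c in selected)
    verified_total = sum(c.verified for c in cases)
    ppv = 100 * verified_selected / len(selected) if selected else None
    retained = 100 * verified_selected / verified_total if verified_total else None
    return {"n_selected": len(selected), "ppv": ppv, "verified_retained": retained}


# Toy records for illustration only (not study data).
toy_cases = [
    Case(3, True, True),
    Case(2, True, True),
    Case(2, True, False),
    Case(1, True, True),
    Case(1, False, False),
    Case(2, False, True),
]
strict = evaluate_subset(toy_cases)
no_ct_rule = evaluate_subset(toy_cases, require_relevant_ct=False)
```

Applied to the figures reported above for RA, the same reasoning gives 23 of 35 verified cases retained (about 66%) under the final criteria.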

Verification of ICD-10 J70
Nine patients had register-derived diagnoses of ICD-10 J70. Of these, five were diagnosed with J70.2 (Acute drug-induced ILD) and four with J70.4 (Drug-induced ILD, unspecified). In three patients we found evidence of ILD on chest CT. The CT findings reported were non-specific interstitial pneumonia, ground glass opacities, reticular pattern, and possible drug reactions. The medical record confirmed the diagnosis of drug-induced ILD in an additional three patients. In those cases, the clinician suspected that the patient's symptoms were caused by methotrexate. In summary, the ILD diagnosis was verified in six out of nine patients with ICD-10 J70.
The distribution of diagnoses across types of inflammatory arthritis is presented in supplementary Table 1.

Discussion
In this study of patients with inflammatory arthritis and a register-extracted ILD diagnosis, we verified 54.9% of ICD-10 J84 or J99 diagnoses by combining verification on chest CT and in medical records. However, the PPV increased to 100% for RA patients with J99 in a subset of patients with at least two ILD diagnoses and a CT scan taken within a relevant time frame. The trade-off for increasing the PPV was that a considerable proportion of verified ILD cases was then excluded from the final cohort.
The validity of register-derived ILD diagnoses has been explored in claims data, but, to our knowledge, the diagnosis has not been verified in the rich linkage data that are available from the Nordic countries. Although a variety of methods has been used to verify the ILD diagnoses in studies seeking to validate register ILD diagnoses, the majority reviewed medical records without retrieving CT examinations. In a study by Cho et al, medical records describing chest CT or lung biopsy were used as the gold standard to validate eight different algorithms using a combination of ICD-9 diagnoses and procedural codes in Medicare data (20). The algorithm with the highest PPV was that requiring three or more ILD diagnoses coded by a specialist rheumatologist or pulmonologist. In these patients, the PPV was 72.4%, which is very similar to the 72% we found in RA patients diagnosed with J84 who had at least two diagnoses of ILD in the NPR and/or death register. We did not find an increased PPV by restricting the subset to three or more diagnoses. Adding procedural codes to the algorithm did not increase the PPV in the study by Cho et al (20).
Alternative approaches have been taken by others. Herrinton et al validated the ILD diagnosis by examining chest CT reports, but not medical records. After a detailed review of the CT reports they found a PPV of 63% in incident cases in patients with codes for ILD on ICD-9 or later versions (21). In contrast, England et al validated ILD diagnoses from the Veterans Affairs Rheumatoid Arthritis (VARA) register by reviewing medical records. They reported a PPV of approximately 80%, depending on the criteria used. They found the highest PPV after applying an algorithm that required at least 30 days between two diagnoses of ILD. They also investigated specificity and sensitivity by selecting a random sample of patients who were not registered with an ILD diagnosis, and found that increases in PPV and specificity were accompanied by a decline in sensitivity (22). Meehan et al studied Medicare linkage data and reported a PPV of 86% for outpatient cases of ILD confirmed in the medical records of RA patients with a preceding chest CT scan, where a second ILD diagnosis had been given within a year (23). In contrast to England and Meehan, we did not find that adding a requirement of time between each ILD diagnosis increased the PPV.
Our results also show that the RA patients with verified ILD were significantly older than those with a non-verified diagnosis and had a non-significantly shorter disease duration. As mentioned earlier, there is also a trade-off between increasing the PPV through applying a more restrictive subset selection and the resultant non-identification of patients with a true ILD diagnosis. We found a total of 35 verified cases of ICD-10 J84 or J99 in patients with RA, but 12 of these were not included in the final subset, which required at least two ILD diagnoses in the register and a chest CT scan taken within the relevant period.
In this study, we verified six out of nine cases of ICD-10 J70 by a combination of chest CT and medical records. We have found no other studies that have attempted to verify register-derived diagnoses of ICD-10 J70 (respiratory conditions due to other external agents, which include drug-induced ILD). The diagnosis is often given by a combination of radiographic findings, medical history, and/or clinical findings (24). In the three cases verified by medical records, methotrexate was the suspected cause of the pneumonitis. Prior pulmonary disease is a risk factor for methotrexate-associated pneumonitis (24) but, unfortunately, we lack information regarding prior comorbidities.
The NOR-DMARD is a clinical study that collects data at predefined time-points following the initiation of DMARD therapy. We therefore lack clinical data from the time of the ILD diagnosis, which is a weakness of this research. Also, the data presented in this study cannot be used to estimate the prevalence of ILD in patients with arthritis because the inclusion criteria for the NOR-DMARD study changed during the course of the study. In addition, we cannot exclude missing cases of ILD due to non-coding of the diagnosis. Another weakness is that CT examinations were not found for all patients, and we cannot exclude the possibility that the CT scan may have been taken at a hospital that did not participate in the study, although there was no evidence of this in the medical records. We also did not have biopsy results or pulmonary function tests for a large number of patients. We have used hospital medical records to verify the ILD diagnoses in some patients. However, Samhouri et al noted, in their study from the Rochester cohort, that only 27% of RA patients who developed ILD in the period 1999-2014 had documentation in their clinical notes (9). The strengths of this paper are the heterogeneous real-life patient population, the extraction and re-examination of chest CT reports, and the hand searching of medical records.
Few patients had been discussed at a designated multidisciplinary panel meeting, but in our study ILD diagnoses had been repeated on several occasions by a pulmonologist and/or other medical specialists, which is also a strength of the study. We have to bear in mind that some of our patients were identified as far back as 2008.

Conclusion
Among inflammatory arthritis patients who had received an ICD-10 J84 or J99 diagnosis on at least two occasions and had undergone chest CT within 3 months before or 1 year after the first ILD diagnosis, ILD was confirmed in 72.2% of the cases. This study indicates that register-derived diagnoses may be used with caution in outcome studies.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Table 1. Distribution of demographics and register diagnoses in the study.
Data are shown as n (% of total in group). ILD, interstitial lung disease; UIP, usual interstitial pneumonia; NSIP, non-specific interstitial pneumonia; unspecified fibrosis, fibrosis which cannot be classified as UIP or NSIP.