Measuring medication adherence in asthma: Development of a novel self-report tool

Objective: This study presents the development and validation of MIS-A (Medication Intake Survey-Asthma), a new self-report instrument measuring key adherence properties during long-term asthma treatment. Design: Within a longitudinal asthma cohort study in France and the United Kingdom, adult patients and caregivers of children responded to computer-assisted telephone interviews. Main outcome measures: Scores for distinct adherence properties (taking adherence, correct dosing, therapeutic coverage, drug holidays, overuse) and composite measures were computed for several time intervals. We examined distributions, longitudinal variation, associations between adherence scores and concordance with adherence calculated from medication prescribing or dispensing records. Results: Nine hundred and two participants reported on adherence to 4481 medications on 4140 occasions. About 59.47% and 70.36% reported < 100% taking adherence in the last week and month, respectively; 42.76% had a drug holiday of > 1 week in the last 4 months. Adherence varied within patients during the follow-up (intra-class correlation = .41–.71). Correlations between adherence scores were moderate to strong (ρ = .51–.85, p ≤ .001), except for medication overuse (ρ = .04–.19, p ≤ .05). Four-month taking adherence was associated with dispensing adherence, but not with prescribing adherence (ρ = .33, p < .001; and .12, p = .26). Conclusion: MIS-A is a promising, easy-to-use self-report tool that can accurately capture different adherence properties over a long time period.


Background
Assessing medication adherence with reliable and valid tools is essential for research, as well as for developing and implementing effective adherence support interventions for chronic conditions (Nieuwlaat et al., 2014;Stirratt et al., 2015). Although numerous methods are available, the quest for optimal adherence assessment is still ongoing (Lehmann et al., 2013). Electronic monitoring (EM) is valued for its high granularity (it time-stamps each use of the monitoring device preceding medication intake), validity (minimal systematic error due to its temporal proximity to medication intake) and reliability (low random error if medication is taken from the device) (Lehmann et al., 2013;Williams, Amico, Bova, & Womack, 2012). Raw EM data can be used flexibly to assess various adherence properties for different time windows (Blaschke, Osterberg, Vrijens, & Urquhart, 2012;Vrijens & Goetghebeur, 1997). Yet, collection and analysis of EM data are resource intensive and often raise practical difficulties (Bova et al., 2005;Chan et al., 2013).
In contrast, self-reports of medication use are easier to collect and analyse, but tend to suffer from substantial ceiling effects (Stirratt et al., 2015). Moreover, self-report measures either collect high-resolution data over very short time periods (e.g. the last 2-3 days) and thus cannot detect long-term patterns (Chesney et al., 2000), or request respondents to estimate their adherence over longer or non-specified time intervals resulting in low granularity (Kerr et al., 2008;Mora et al., 2011;Morisky, Ang, Krousel-Wood, & Ward, 2008). Several widely-used tools also suffer from concept contamination (Gagné & Godin, 2005), as they measure a mix of behaviours and determinants, which they combine in one overall score (Nguyen, Caze, & Cottrell, 2014). A self-report adherence tool that captures patients' actual adherence behaviour (i.e. whether they take medication as prescribed) with optimal granularity, over a long time interval, and with acceptable ceiling effects has yet to be developed.
This study aimed to apply state-of-the-art methodology (Gagné & Godin, 2005;Stirratt et al., 2015;Williams et al., 2012) to develop a reliable and valid adherence self-report tool that capitalises on the strengths of both EM (i.e. sufficiently detailed data to estimate key adherence patterns over various time periods) and self-report (i.e. easy to administer on a large scale), while attempting to address the main limitations of currently available tools.

Methodological framework
Recall bias and social desirability are the main barriers to accurate reporting of deviations from a prescribed medication regimen. Respondents find it difficult to access information about (habitual) behaviours and events, particularly over/after longer time intervals. They also tend to adjust their answers to meet the inquirer's presumed expectations, particularly if they perceive the question as having personally relevant consequences. Thus, self-reported adherence often shows little variance and is usually higher, with substantial ceiling effects, than when assessed through EM or medication prescribing or dispensing records (Bender et al., 2000; Garber, Nau, Erickson, Aikens, & Lawrence, 2004).
To improve recall, methodologists advise supporting respondents by identifying personal events that took place in the relevant time intervals (Belli, Smith, Andreski, & Agrawal, 2007), using familiar and clear wording, specifying the time intervals of reference and requesting less detail over longer intervals (Gagné & Godin, 2005). Social desirability is reduced by various techniques such as normalising behaviour, ensuring confidentiality or obtaining commitment for accurate reporting (Gagné & Godin, 2005).
Moreover, interviewer-administered measures can facilitate the response process, for example by correcting any biased perceptions on the consequences of reporting non-adherence (Williams et al., 2012). Investigating via cognitive interviewing how respondents understand questions and generate answers by accessing and selecting relevant information has been proven useful in adherence measurement for improving question comprehension and relevance in different respondent groups and contexts (Wilson et al., 2013). Applying such methods in questionnaire design can improve psychometric properties and reduce ceiling effects (Stirratt et al., 2015;Wilson, Carter, & Berg, 2009).
Self-reports have also been criticised for the limitations imposed by recall bias on collecting high-granularity data on longer time intervals. EM records of individual medication intake events allow for the examination of each event individually, and the computation of various aggregate scores over various time windows, such as correct dosing, taking adherence or timing adherence (Demonceau et al., 2013). However, maximum granularity might not be necessary for all purposes and it might depend on which adherence component (initiation, implementation or discontinuation; Vrijens et al., 2012) is examined. Individual events are central when investigating medication (re-)initiation or discontinuation, or short-term effects of dose omission or mistiming (Blaschke et al., 2012). Medication implementation is most commonly investigated via weekly or monthly scores (Wilson et al., 2009), and fewer studies examine it as a series of individual events or distinct adherence properties such as correct dosing or drug holidays (e.g. De Geest et al., 2006). The relevance of these properties for clinical outcomes is an empirical question; it may be condition- and medication-specific, and may depend on the temporal dynamics of both adherence and health outcomes. Although current self-reports do not do this, in principle self-reports can also collect data of sufficient granularity and accuracy by focusing directly on aggregate estimates of distinct adherence properties. Such a tool would allow empirical investigation of the impact of different adherence properties, as well as diagnosis and intervention in clinical practice if certain adherence patterns prove relevant.
Self-report is often described as unreliable, yet this may be partly due to not using appropriate methods for testing reliability (Voils, Hoyle, Thorpe, Maciejewski, & Yancy, 2011). Reliability represents the degree to which a measure is free of random measurement error and is commonly tested via internal consistency and test-retest reliability. The former assesses error due to item content and assumes all questions tap into the same construct, while the latter estimates error due to context of measure administration assuming temporal stability of the construct assessed (Viswanathan, 2005). Unsurprisingly, measures that conceptualise adherence as a latent patient characteristic perform well on these tests (Morisky et al., 2008; Reynolds et al., 2007; Thompson, Kulkarni, & Sergejew, 2000), while measures that assess the quantity of medication taken over a period of time compared to the quantity prescribed perform more poorly (Jerant, DiMatteo, Arnsten, Moore-Hill, & Franks, 2008). This is because, when conceptualised quantitatively in terms of patients' dosing histories, adherence is a dynamic process and can consist of different patterns at successive time points (Blaschke et al., 2012). Therefore, self-report tools that are theoretically consistent with this conceptualisation require a different approach to reliability testing, for example by examining associations between overlapping reports for the same time interval from different questions vs. non-overlapping reports.
The low concordance of self-reported adherence with other adherence measures, such as pharmacy dispensing records and EM, has cast doubt on the validity of self-reports (Garber et al., 2004). Full concordance is, however, not expected as all methods measure indirectly complementary aspects of medication use and show varying degrees of subjectivity and construct-irrelevant variance (Berg & Arnsten, 2006;Lehmann et al., 2013;Williams et al., 2012). For example, irrelevant actions for EM are device use without taking medication, or using medication from another supply (e.g. 'pocket-dosing', Bova et al., 2005;de Bruin, 2013). Similarly, dispensing medication at the pharmacy is not always followed by immediate use of that supply nor does it immediately precede medication use, for example because patients still have medication available from prior dispensations (Williams et al., 2012). Besides pharmacy records, medication prescribing by patients' doctors is sometimes used as a measure for medication adherence. Prescribing is even more distal from actual medication use than pharmacy dispensing, and is not always followed by a corresponding pharmacy dispensing event. Therefore, prescribing records would be expected to show lower concordance with self-reports or EM than pharmacy dispensing data. Concordance between these different measures can be increased by computing adherence scores for the same behaviour and time interval (Wilson et al., 2009), and may depend on context (e.g. type of respondent; Bender et al., 2000) and tool characteristics (Garber et al., 2004). Thus, validity tests for self-reported adherence need to consider these conceptual constraints, and include an exploration of the possible influencing factors. Substantial variations in concordance depending on factors such as respondent characteristics would indicate bias; the opposite would support validity.

Study objectives
Asthma is a chronic airways disease which usually requires daily use of inhaled corticosteroids (ICs) to control underlying inflammation and prevent exacerbations. Empirical evidence shows that adherence to ICs is commonly low, which can have a detrimental impact on asthma-related health outcomes (Engelkes, Janssens, de Jongste, Sturkenboom, & Verhamme, 2015). Yet, the low quality and standardization of available measures hinder progress on understanding the role of adherence in asthma, and improving adherence measurement is a methodological priority in this field. Therefore, within a prospective cohort study on asthma (ASTRO-LAB; Van Ganse et al., 2015), we developed the Medication Intake Survey-Asthma (MIS-A), a new tool for assessing self-reported medication adherence. MIS-A is delivered via computer-assisted telephone interviews (CATIs), asks about different complementary adherence properties (taking adherence, therapeutic coverage, correct dosing, drug holidays, overdosing), and applies strategies for reducing social desirability and improving recall. We report its development and examine its psychometric properties by answering the following research questions: (RQ1) Does MIS-A adequately capture non-adherence? (RQ2) Are the different MIS-A scores non-redundant? (RQ3) Are MIS-A scores reliable? and (RQ4) Are MIS-A scores valid?

Study design and participants
The design of the ASTRO-LAB asthma cohort study is reported elsewhere (Van Ganse et al., 2015). Briefly, patients aged 6-40 years with ≥ 6 months of prescribed coverage of daily controller inhalers (ICs and/or long-acting beta-agonists, LABA) in the past 12 months were selected from primary care or pharmacy records in France and the United Kingdom (UK). After informed consent and enrolment procedures, participants were followed up for a maximum of 24 months via monthly text messages, CATIs and online surveys. CATIs were conducted every 4 months (regular CATIs) and when patients reported recent asthma-related exacerbations (AEs) in their monthly text message replies (AE CATIs). Recent occurrence of AEs was also probed during regular CATIs. Adults and teenagers (≥12 years), and children (6-12 years) through caregivers, reported on prescribed medication type and dosage, and on medication use before the regular CATI and before an AE. Socio-demographics (gender, age, country, primary care practice identifier) were collected at enrolment from patient records. Electronic medication dispensing claims were accessed for French participants (Moulis et al., 2015) and medication prescribing data by primary care providers were accessed for British participants (Blak, Thompson, Dattani, & Bourke, 2011). The resulting data-set had a multilevel structure: participants could have multiple asthma medications that were nested within CATI reports, multiple CATI reports were nested within patients, and multiple patients were nested within primary care practices.

Development of the medication-related CATI questions
The CATI script was developed for adult English participants based on prior literature on self-reported and EM adherence, and qualitative interviews with patients, caregivers and health care professionals on asthma self-management in France (13 interviews) and in the UK (26 interviews), and was revised iteratively within the project team.
It was translated to French by a specialised company using forward translation, independent back translation, review with investigator input, and independent proofreading. It was then adapted for caregivers reporting adherence for their children. The resulting scripts were pretested with 5 adult patients and 4 caregivers via cognitive interviews. Implementation in the online tool included extensive pretesting. Interviewers were trained using a study manual, group sessions and feedback on their own performance. They were also asked to provide feedback on their first interviews to the investigators. Improvements after pre-testing were implemented identically in the different CATI versions.
The MIS-A script is available as Supplementary Online Material 1 (SOM 1). Essentially, MIS-A is a count-based recall measure (Williams et al., 2012) similar to the AIDS Clinical Trials Group questionnaire (Reynolds et al., 2007); MIS-A further develops the quantitative estimation of adherence over longer time periods, consistent with EM-based aggregate scores of medication implementation. We aimed for a minimum number of complementary questions that would cover the type of adherence data that one collects when using EM, adapted for recall capacities. Four months were selected as the maximum recall period because of the current study design, but this could be adapted to e.g. 2 or 3 months for other studies.
Patients were first asked the names of their currently prescribed medications. For each medication pre-labelled as daily controller inhaler, detailed questions followed on prescription start, daily dosage recommendations and adherence. The latter inquired about: (Q1) number of inhalations used a day before; (Q2) number of days with no use in the past 7 days; (Q3) number of days with perfect adherence in the past 7 days (i.e. use according to prescribed dosage); (Q4) number of days with no use in the past 4 weeks; (Q5) number of weeks of treatment interruption in the past 4 months; and (Q6) medication overuse in the past 4 months. If medications were prescribed < 4 months before the CATI, only the questions regarding the more recent period were asked (e.g. if prescribing happened two weeks before a CATI, only Q1 to Q3 applied).
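The rule for restricting questions to recently prescribed medications can be illustrated with a minimal sketch. The recall windows in days (28 for 4 weeks, 120 for 4 months), the comparison rule and the function name are assumptions made for illustration only; the actual CATI routing is defined in the script in SOM 1.

```python
# Hypothetical recall windows (in days) for each MIS-A question.
# 28 and 120 are illustrative roundings of "4 weeks" and "4 months".
RECALL_WINDOW_DAYS = {"Q1": 1, "Q2": 7, "Q3": 7, "Q4": 28, "Q5": 120, "Q6": 120}


def applicable_questions(days_since_prescription):
    """Return the questions whose recall window fits within the prescription period.

    Assumed rule: a question is asked only if the medication has been
    prescribed for at least as long as the question's recall window.
    """
    return [
        question
        for question, window in RECALL_WINDOW_DAYS.items()
        if window <= days_since_prescription
    ]


# A medication prescribed two weeks before the CATI: only Q1 to Q3 apply.
print(applicable_questions(14))
```

Under these assumptions, a medication prescribed two weeks before the interview yields Q1 to Q3 only, in line with the example given above.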
Question development and pretesting aimed to increase recall by asking more detailed information for shorter time intervals and estimates of more memorable events (such as drug holidays, and overuse) for longer time intervals, ordering the questions chronologically, and collecting factual information (e.g. number of inhalations/days, number of days without doses last week) rather than global adherence estimates (e.g. percentages or perceived quality of implementation). It also aimed to facilitate natural and parsimonious conversation, hence it included conditional questions where possible (e.g. ask Q3 only if Q2 < 7 days), and used collected information to tailor subsequent questions (e.g. the medication name). To further facilitate recall, interviewers were instructed to introduce the MIS-A questions by clarifying the time interval targeted (past 4 months) and identifying any public or personal events within this interval with the respondent in free conversation (Belli et al., 2007). Moreover, interviewers were trained to guide respondents to choose a distraction-free time interval and location for the interview, use prompting to support the recall if respondents reported difficulties, and reminders were included in the script for interviewers to use these techniques.
Proceeding to a next screen was conditioned by completion of current questions, and interviewers were instructed to use 'don't know' options only after providing adequate recall support. To reduce social desirability, medication use questions were preceded by an introduction normalising non-adherence, and several questions included normalising words. Every CATI started with a refresher regarding response anonymity and independence of interviewer from the patient's health care provider. Interviewers were trained to probe and address any concern about the study conduct before starting the CATI, and to conduct interviews in a neutral, non-judgemental, supportive manner. Implementation of these recommendations was monitored during data collection.

Data analysis
Data management and analysis were performed in R (R Core Team, 2013). We selected from the ASTRO-LAB database CATI reports that referred to inhalers prescribed for daily use at the time of the CATI, thus excluding treatments ended recently, prescribed for as-needed use, or with recording errors. Data preparation included adjusting for any inconsistencies between responses to MIS-A items, taking reports on shorter intervals as reference (e.g. if a respondent reported no medication use a day before and 7 days of adherent use in the last week, the latter was adjusted to 6 days). Single items were used to compute scores for specific properties of medication implementation (see Table 1). We used established definitions of taking adherence (percentage of prescribed doses taken), and correct dosing (percentage of treatment days when the patient took the correct number of doses; Demonceau et al., 2013). To adapt to self-report on longer time intervals, drug holidays were defined as 7 or more consecutive days with no use (as interruptions of one week or more were identified during pre-testing as easier to remember) and therapeutic coverage estimated the proportion of days with active medication (as it was not possible to calculate intervals based on clinical data for each medication as in Detry et al., 1994). Overuse was defined as the proportion of medication taken over the prescribed quantity. Taking adherence was also computed over 1 week, 4 weeks and 4 months as composite scores considering the complementary information provided by Q1-Q5 (algorithm presented in SOM 2).
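The single-item scores and the consistency adjustment described above can be sketched as follows. This is an illustration in Python rather than the study's R code; the function names are hypothetical, and the formulas are a plain reading of the definitions in the text, not the scoring rules of Table 1 or the composite algorithm in SOM 2.

```python
def reconcile_perfect_days(q1_inhalations, q3_perfect_days):
    """Use the shorter-interval report as reference.

    If no inhalation was reported for the day before, at most 6 of the
    last 7 days can have been days of perfect adherence.
    """
    if q1_inhalations == 0 and q3_perfect_days > 6:
        return 6
    return q3_perfect_days


def one_day_taking_adherence(q1_inhalations, prescribed_per_day):
    """Percentage of prescribed doses taken the day before (can exceed 100)."""
    return 100.0 * q1_inhalations / prescribed_per_day


def therapeutic_coverage(days_without_use, window_days):
    """Percentage of days in the window on which the medication was used."""
    return 100.0 * (window_days - days_without_use) / window_days


def correct_dosing(perfect_days, window_days=7):
    """Percentage of days in the window with use exactly as prescribed."""
    return 100.0 * perfect_days / window_days


def has_drug_holiday(weeks_interrupted):
    """Drug holiday: at least one interruption of a week or more."""
    return weeks_interrupted >= 1


# Example: no use yesterday, but 7 'perfect' days reported for the last week.
q3 = reconcile_perfect_days(q1_inhalations=0, q3_perfect_days=7)
print(q3, round(correct_dosing(q3), 2), round(therapeutic_coverage(1, 7), 2))
```

In this example the inconsistent report of 7 perfect days is capped at 6, giving a correct dosing score of 6/7 for the week.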
Characteristics of patients and medications were summarised descriptively. To investigate whether MIS-A captures non-adherence (RQ1), we examined ceiling effects in comparison to other asthma studies, and longitudinal variation in adherence scores. Descriptive statistics were calculated for scores and compared to estimates from published asthma studies using other self-report tools. To test whether MIS-A can detect variation at within- and between-patient and between-practice levels, variation in the four taking adherence scores was examined via three-level linear mixed-effects models (LMM; reports nested within patients within practices; maximum likelihood estimation) on a subsample of patients with long-term use of ICs-based medication. Thus, for this analysis, we excluded patients with ≥ 1 reports with no daily medication (not prescribed, ended recently or prescribed as needed), other asthma controllers (e.g. tiotropium), ≥ 1 reports with no ICs-based controllers (only daily LABA prescribed) and insufficient follow-up (< 2 reports). For reports with > 1 medication, average scores were computed. Unconditional means models were performed to assess the proportion of variance at different levels via intra-class correlation coefficients (ICC); a cut-off of .05 was considered as indicating substantial variance to capture differences in adherence (Heck, Thomas, & Tabata, 2013).
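The variance partitioning behind the ICCs can be sketched as below, assuming the three variance components (practice, patient, residual) have already been estimated from an unconditional means model. The function names and the numerical values are illustrative assumptions, not estimates from this study.

```python
def variance_proportions(var_practice, var_patient, var_residual):
    """Proportion of total variance at each level of a three-level model.

    Inputs are variance components from an unconditional means model
    (reports nested within patients within practices).
    """
    total = var_practice + var_patient + var_residual
    return {
        "between_practice": var_practice / total,
        "between_patient": var_patient / total,
        "within_patient": var_residual / total,
    }


def is_substantial(icc, cutoff=0.05):
    """Apply the .05 cut-off used in the text to a single ICC."""
    return icc >= cutoff


# Illustrative (made-up) variance components.
proportions = variance_proportions(var_practice=2.0, var_patient=8.0, var_residual=10.0)
for level, icc in proportions.items():
    print(level, round(icc, 2), is_substantial(icc))
```

The three proportions sum to one, so a large within-patient share necessarily implies smaller shares at the patient and practice levels.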
To investigate whether MIS-A scores are non-redundant (RQ2) we examined bivariate correlations (Spearman's ρ) at medication level, compared to a collinearity threshold of .80 (Field, 2005). As the time intervals of some scores overlapped partially, correlations were also performed after adjusting scores to refer to non-overlapping intervals (e.g. adjusted Q2 referred to last week excluding a day before, which Q2 shared with Q1). No tests of structural validity were performed, as the questions were not hypothesised to reflect latent dimensions of adherence.
The reliability of MIS-A (RQ3) was examined by comparing correlations between adjusted and non-adjusted scores. We expected that scores would be sensitive to adjustment, i.e. adjusting scores to exclude temporal overlap between items would result in lower correlations between adherence scores. This sensitivity to adjustment was taken to reflect reliability of reports while accounting for temporal variation in behaviour, in other words show that answers to different questions referring to the same time interval are consistent, while reports on the same occasion but for non-overlapping time intervals are less similar but still moderately to strongly associated.
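The correlation analysis and the adjustment for overlapping intervals can be illustrated with a short sketch. Spearman's ρ is computed here as the Pearson correlation of average ranks; the adjustment function is a hypothetical reading of the Q2 example (last week excluding the day before, which Q1 covers) and is not the study's actual adjustment code.

```python
def average_ranks(values):
    """Rank values from 1..n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def adjusted_week_coverage(q1_inhalations, q2_days_without_use):
    """One-week coverage (%) over the 6 days not covered by Q1 (assumed rule)."""
    day_before_unused = 1 if q1_inhalations == 0 else 0
    unused_other_days = max(0, q2_days_without_use - day_before_unused)
    return 100.0 * (6 - unused_other_days) / 6


COLLINEARITY_THRESHOLD = 0.80  # threshold cited in the text (Field, 2005)
rho = spearman_rho([100, 50, 0, 100, 50], [100, 71, 14, 86, 43])
print(round(rho, 2), abs(rho) > COLLINEARITY_THRESHOLD)
```

Comparing ρ for the unadjusted and adjusted versions of a score pair then shows how much of an association is attributable to the shared time interval.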
To test the convergent and discriminant validity of MIS-A scores (RQ4), CATI data were linked to French dispensing records via a probabilistic method or with UK prescribing records via patient identification numbers. We selected patient records with a single type of medication matching between data sources and available data for a two-year period around the patient's first CATI to obtain comparable scores. Medication dispensing- and prescribing-based adherence estimates were computed for 4 months before the first CATI using a Continuous Medication Availability (CMA) algorithm. This algorithm takes into account the timing of dispensing/prescribing events, the medication supply available at the beginning of the time interval, and banking of new medication until current supply is used as directed (described in Vollmer et al., 2012). We tested the concordance between MIS-A 4-month taking adherence and dispensing/prescribing-based adherence for the same interval via Spearman's correlations, Wilcoxon rank sum tests, concordance correlation coefficients (CCC) and Bland-Altman plots, as recommended for comparison of adherence scores (El Alili, Vrijens, Demonceau, Evers, & Hiligsmann, 2016). Medium-sized correlations with dispensing-based estimates and comparatively lower correlations with more distal prescription-based estimates were considered to support convergent and discriminant validity, respectively. To explore possible influences on convergent validity (RQ4), we performed linear multiple regression models with the absolute difference between MIS-A and dispensing-based adherence as dependent variable, and type of report (patient or parent) and patient gender and age as relevant predictor variables available in the CATI data-set. We considered weak/non-significant effects as indicating that convergent validity does not differ depending on these characteristics.
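Two of the concordance statistics named above can be sketched briefly. The CCC below follows Lin's standard formula with population (1/n) moments, and the Bland-Altman summary uses the conventional bias ± 1.96 SD limits of agreement; whether the study used exactly these variants is an assumption, and the data are made up for illustration.

```python
from statistics import mean, stdev


def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient using 1/n moments."""
    n = len(x)
    mx, my = mean(x), mean(y)
    var_x = sum((a - mx) ** 2 for a in x) / n
    var_y = sum((b - my) ** 2 for b in y) / n
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * cov_xy / (var_x + var_y + (mx - my) ** 2)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) of agreement for x minus y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)  # sample SD of the paired differences
    return bias, bias - spread, bias + spread


# Made-up adherence percentages: self-report vs. records-based estimate.
self_report = [100.0, 85.0, 70.0, 95.0, 60.0]
records = [80.0, 75.0, 65.0, 70.0, 55.0]
print(round(concordance_correlation(self_report, records), 2))
print(tuple(round(v, 1) for v in bland_altman(self_report, records)))
```

Unlike a rank correlation, the CCC penalises systematic differences between the two measures, which is why it complements Spearman's ρ when self-report tends to exceed records-based estimates.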
For a third line of evidence on RQ1, the differences between MIS-A- and dispensing/prescribing-based adherence were compared to previous studies. Smaller differences would indicate a higher ability to detect non-adherence consistent with dispensing/prescribing records. To examine RQ2 in more detail, the MIS-A scores were used to predict dispensing-based four-month adherence in linear multiple regression models. Significant effects of individual scores were interpreted as supporting non-redundancy, i.e. different adherence properties might explain unique variance in related measures.

Results
From 1051 patients enrolled in the study in France and the UK, 4196 regular four-monthly CATIs and 163 post-AE CATIs were conducted with 934 participants (1 to 10 CATIs per participant); 117 enrolled participants could not be reached for CATI. Adherence reports on 3920 medications from 3634 regular CATIs, and 164 medications from 148 post-AE CATIs, met inclusion criteria. These reports were provided by 902 participants from 80 UK primary care practices and 243 French general practitioners. Of the regular CATIs selected, 358 also included an AE report, followed by adherence reports on 397 medications (flow chart available in SOM 2). Thus, the total sample included adherence reports on 4481 medications prior to 4140 occasions (AE or CATI). Participant and medication characteristics are presented in Table 2. A report could include more than one medication: one medication was reported on 3813 occasions (92.10%), and a maximum of 3 medications on 14 occasions.

Descriptive statistics of MIS-A scores
Descriptive statistics of MIS-A scores are presented in Table 3 (see SOM 2 for distribution plots). Taking adherence 1 day before the report (Q1) showed that 34.06% of medications were underused, of which 19.39% were completely unused, and 3.73% were overused. A week before (Q2 and Q3), 39.70% were unused for ≥ 1 day and 12.65% were completely unused; 40.40% were used as prescribed the entire week. Therapeutic coverage 1 month before (Q4) indicated that 55.02% were unused for ≥ 1 day (8.97% completely unused). Four months before, drug holidays (Q5) were reported for 28.67%, and overuse reports (Q6) showed that 18.93% were overused with > 1 inhalation. Median taking adherence for 1 week, 1 month and 4 months (CS1, CS2 and CS3) was 85.71%. Ceiling effects (% of respondents with 100% adherence) were 24.5-61% for Q1 to Q5, and 23.5-40.5% for the composite measures. By comparison, self-reports of one-week adherence collected via a single survey question in an asthma trial (Patel et al., 2013) resulted in 60-70% reports of 100% adherence. In a daily diary study (Jentzsch, Camargos, Colosimo, & Bousquet, 2009), self/parent reports resulted in a mean adherence of 97.9% in young people with asthma. MIS-A scores are by comparison better distributed, suggesting that the tool is better able to detect non-adherence (RQ1).

Variation of MIS-A taking adherence scores at between-practice, between-patient and within-patient levels
The subsample for linear mixed models included 3272 reports from 631 patients (in 67 UK primary care practices and 214 French general practitioners; flow chart in SOM 2). Table 4 summarises unconditional means models of the taking adherence scores. All 4 models indicate that a substantial proportion of variance was present between patients (ICC = .24−.52) and between practices (UK primary care centres or French general practitioners; ICC = .05−.10). For RQ1, this suggests that MIS-A taking adherence scores can detect variation at these two levels. Moreover, the remaining within-patient variance (ICC = .41−.71) indicates that patients had different adherence levels during the follow-up. In other words, patients with high adherence in some CATIs were likely to report suboptimal adherence at other times, and vice versa.

Bivariate associations and reliability of MIS-A scores
Spearman's ρ correlations between MIS-A scores (including adjusted scores for non-overlapping time intervals) are presented in Table 5. Q1 to Q5 scores showed associations of large effect size (ρ = .51−.85). Q6 (4-month overuse) was weakly associated with the other items, except Q1 (ρ = .19). Only one-week and four-week therapeutic coverage (Q2 and Q4) showed correlations > .80, suggesting collinearity. All other associations were below this threshold, indicating non-redundancy of individual items (RQ2). Composite scores were associated with each other > .80, and were strongly associated with the first 5 questions (ρ = .59−.97). All correlations were considerably reduced (ρ = .46−.73) when scores were adjusted to exclude overlap in time intervals but remained moderate to strong (see SOM 2 for scatterplots), suggesting reliability (RQ3).

Comparison of MIS-A scores with prescribing- or dispensing-based adherence (CMA)
Summary statistics, correlations, paired differences and concordance tests for therapeutic coverage estimates are shown in Table 6 for patients with matching data regarding the type of medication used (Bland-Altman plots in Figure 1). There was a moderate positive correlation with dispensing and a comparatively smaller non-significant positive correlation with prescribing CMA (ρ = .33 and .12 respectively). Hence MIS-A scores showed convergent and discriminant validity (RQ4). The differences between MIS-A vs. dispensing or prescribing CMA scores were 15 and 7%. In the above-mentioned study by Jentzsch et al. (2009), dispensing estimates were 27.9% lower than self/parent reports. We could not find other published studies where both self-reported and dispensing/prescribing adherence are expressed as percentages, to allow comparison. MIS-A scores showed comparatively smaller differences, which supports its ability to detect non-adherence consistent with alternative measures (RQ1).

Predicting differences between MIS-A-and dispensing-based adherence
The results of the linear multiple regression predicting differences between MIS-A and dispensing-based adherence (in absolute values) are shown in Table 7. The type of interviewee and patient's age and gender did not explain a significant amount of variance in concordance between self-report and dispensing-based adherence (F(3, 298) = 2.27, p = .08; adjusted R² = .01). Differences were significantly lower for older patients, yet the effect size was small. Thus, no substantial influences on convergent validity were found among the available variables (RQ4).
Notes: Q1 = 1-day taking adherence; Q2 = 1-week therapeutic coverage; Q3 = 1-week correct dosing; Q4 = 1-month therapeutic coverage; Q5 = 4-month (no) drug holidays; Q2adj = 1-week therapeutic coverage, excluding the day before; Q3adj = 1-week correct dosing when in use, excluding a day before; Q4adj = 1-month therapeutic coverage, excluding 1 week before; Q5adj = 4-month (no) drug holidays, excluding the month before; Q6 = 4-month overuse; CS1 = 1-week taking adherence; CS2 = 1-month taking adherence; CS3 = 4-month taking adherence.

Predicting dispensing-based adherence from MIS-A scores
Table 8 presents Spearman's ρ correlations and linear multiple regression models predicting dispensing-based adherence (CMA scores) from MIS-A scores (excluding composite scores and Q3 due to collinearity). All scores showed significant correlations of similar effect size with CMA scores. Higher one-day taking adherence (Q1) and four-week therapeutic coverage (Q4) had unique contributions to predicting higher CMA in both models; other scores did not contribute to explaining additional variance (F(5, 296) = 12.21, p < .001 and F(2, 299) = 29.06, p < .001, adjusted R² = .16 for both models). These results suggest that Q1 and Q4 were complementary in relation to dispensing-based adherence (RQ2), while others could be considered redundant in this respect.

Discussion
Self-report remains the most accessible and practical adherence measurement method. Advances in theory and methodology of adherence measurement and psychometrics were applied in this study to develop a next-generation adherence self-report that captures key properties of adherence usually only captured with EM, combines high granularity of data over short time periods with estimations of adherence over longer periods, and addresses recall and social desirability bias. MIS-A properties were evaluated in a large, longitudinal sample of patients with persistent asthma in two countries. Our results show that (1) MIS-A was able to detect non-adherence with low ceiling effects and substantial within-person variance; (2) it targeted related yet distinct adherence properties; (3) reports were reliable; and (4) MIS-A four-month taking adherence demonstrated convergent and discriminant validity. Hence, MIS-A is a promising, easy-to-use self-report tool that can accurately capture different adherence properties over a long time period.
MIS-A showed more variation in scores compared to other self-report measures in asthma (RQ1). First, adherence levels and ceiling effects in our study were lower compared to prior asthma studies using self-report measures (Jentzsch et al., 2009;Patel et al., 2013). The higher levels of non-adherence identified also indicate that MIS-A has higher validity, as self-reports are characterised by high specificity (Stirratt et al., 2015). Second, multilevel unconditional means models showed that MIS-A was able to capture variation in adherence within patients, between patients and between practices. These results are consistent with studies showing long-term variation in ICS use in asthma based on electronic health care data (e.g. Laforest et al., 2016). Third, four-month taking adherence estimates were more similar to dispensing- and prescribing-based estimates than in the only other similar comparison we could find (Jentzsch et al., 2009).
These results suggest that respondents were able to remember and were comfortable with reporting suboptimal adherence when it occurred.
The associations between the MIS-A scores indicate that they largely offer complementary information on a variable behaviour (RQ2). Among these scores, one-day taking adherence and therapeutic coverage (one- or four-week) were predictive of four-month dispensing-based adherence. Overuse was less common and not associated with underuse reports, suggesting that patients who underuse medication do not necessarily also overuse inhalers in the same four-month time period. Composite taking adherence scores were strongly associated with underuse scores, as they were computed based on this information; they would need to be used separately in further analyses depending on the study aims.
We propose that adherence is better conceptualised as a dynamic behaviour, in line with recent consensus on defining adherence to medications as a process. Thus, common reliability and validity tests (e.g. factor analysis, Cronbach's α) are not applicable. Such tests have been reported for other measures (e.g. Mora et al., 2011;Reynolds et al., 2007), and assume that questions are equivalent indicators of a stable latent dimension. Recent variance decomposition methods that estimate reliability in longitudinal data using generalizability theory (Cranford et al., 2006) also assume the existence of a latent construct. This does not apply to count-based measures such as MIS-A, as scores refer to different time intervals in a variable behaviour. Hence, we adopted another approach to reliability testing, based on sensitivity to adjustment: inter-item correlations were lower, but still moderate to strong, after the overlap in time interval between items was excluded. These analyses suggest that the MIS-A is a reliable self-reported adherence tool.
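The sensitivity-to-adjustment idea can be sketched in Python as follows. The sketch assumes, purely for illustration, that each respondent has a seven-day record of covered days (day 1 = the day before the interview), that one-day adherence is day 1, that one-week coverage is the share of covered days, and that the adjusted score drops day 1 from the weekly window. The records and the scoring formulas are hypothetical and do not reproduce the MIS-A scoring rules.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

# Hypothetical records: 1 = medication taken that day, 0 = not taken.
# Index 0 is the day before the interview, index 6 is seven days before.
records = [
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0, 0, 0],
    [0, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1, 0],
]

# One-day score: the day before the interview only.
q1 = [days[0] for days in records]
# One-week score: share of covered days, including the day before.
q2 = [sum(days) / 7 for days in records]
# Adjusted one-week score: the same window with the overlapping day removed.
q2adj = [sum(days[1:]) / 6 for days in records]

r_raw = pearson(q1, q2)     # inflated by the shared day
r_adj = pearson(q1, q2adj)  # association between non-overlapping intervals
print(f"raw r = {r_raw:.2f}, adjusted r = {r_adj:.2f}")
```

In this toy data the adjusted correlation is lower than the raw one, because the one-day score no longer contributes to the weekly score; the drop shows how much of the raw association came from the shared time interval.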
Concordance with four-month dispensing-based adherence was moderate, in line with other reports in the literature (Garber et al., 2004). For example, medium positive associations (ρ = .21−.26) were found between dispensing-based adherence and three other self-report measures in a heterogeneous sample of long-term medication users (Cook, Wade, Martin, & Perri, 2005), as well as between claims and self-reported ICS adherence (r = .35) in adults with asthma (Erickson, Coombs, Kirking, & Azimi, 2001). The comparatively higher association with dispensing than with prescribing supports convergent vs. discriminant validity, as prescribing is more distal than dispensing relative to people actually using the medication. Prescribing-based adherence may nevertheless be relevant clinically and related to dispensing (Mabotuwana, Warren, Harrison, & Kenealy, 2009;Taylor, Chen, & Smith, 2014). Differences between MIS-A and dispensing estimates were not higher when parents reported for their children compared with adult self-reports, and not influenced by participant's gender and age, suggesting that validity is not influenced by these characteristics. Other contextual influences (for example parents' perceived social desirability or involvement in their child's medication administration) might play a role and would need to be explored further.
The present study and tool have some limitations. First, MIS-A focuses on medication implementation, as it was intended to assess patients with ongoing long-term medication; hence, exact dates of treatment initiation and discontinuation, the other two components of the recent consensus-based medication adherence taxonomy, are not precisely captured. The practical applications of this taxonomy are under development, and most adherence measures target implementation (Nguyen et al., 2014). In MIS-A, 0% four-month taking adherence could be coded as either non-initiation or non-persistence; future versions could include preliminary questions on time of initiation and whether and when discontinuation occurred. Second, the MIS-A items were chosen in the context of asthma and ASTRO-LAB, and other adherence properties possibly relevant for other long-term conditions were excluded, e.g. timing adherence for assessing the exact times of medication ingestion (Demonceau et al., 2013). Adaptations to other conditions would need to consider the potential relevance of other adherence properties. Third, the MIS-A validity testing was limited by the data available. Hence, although our study provides strong support for its validity, MIS-A would benefit from further validation in studies that collect EM data, or dispensing and prescribing data from the same sample. Fourth, the impact of adherence on health outcomes is currently under investigation in ASTRO-LAB, which would also test MIS-A's criterion-related validity (Berg & Arnsten, 2006). Should this impact prove significant, MIS-A would also represent a valuable tool in clinical consultations for investigating individual causes of worsening asthma.
These first validation results suggest that MIS-A is an easy, inexpensive, reliable and valid self-report method for assessing specific adherence properties over long time periods. MIS-A is ready to use for measuring adherence to asthma controllers via CATI, and can be adapted as a self-administered questionnaire, or for other types of medications or chronic conditions. It can also be employed as an adherence diagnosis tool in general practice or community pharmacies, either for routine monitoring of selected patients (e.g. with severe asthma) or retrospectively after an AE to identify preceding events. Should MIS-A prove valid in such contexts, interventions could be developed to target specifically the adherence properties linked with worsening health status.
More broadly, the development of MIS-A illustrates how self-report, when carefully designed and used, is able to produce rich and valid information on implementation patterns of long-term treatment. Further improvements in adherence self-report need to be explored and tested for different administration contexts, respondent characteristics and in different chronic conditions.

Disclosure statement
No potential conflict of interest was reported by the authors.