Understanding health behaviours in context: A systematic review and meta-analysis of ecological momentary assessment studies of five key health behaviours

ABSTRACT Ecological Momentary Assessment (EMA) involves repeated, real-time sampling of health behaviours in context. We present the state-of-knowledge in EMA research focused on five key health behaviours (physical activity and sedentary behaviour, dietary behaviour, alcohol consumption, tobacco smoking, sexual health), summarising theoretical (e.g., psychological and contextual predictors) and methodological aspects (e.g., study characteristics, EMA adherence). We searched Ovid MEDLINE, Embase, PsycINFO and Web of Science until February 2021. We included studies focused on any of the aforementioned health behaviours in adult, non-clinical populations that assessed ≥1 psychological/contextual predictor and reported a predictor-behaviour association. A narrative synthesis and random-effects meta-analyses of EMA adherence were conducted. We included 633 studies. The median study duration was 14 days. The most frequently assessed predictors were ‘negative feeling states’ (21%) and ‘motivation and goals’ (16.5%). The pooled percentage of EMA adherence was high at 81.4% (95% CI = 80.0%, 82.8%, k = 348) and did not differ by target behaviour but was somewhat higher in student (vs. general population) samples, when EMAs were delivered via mobile phones/smartphones (vs. handheld devices), and when event contingent (vs. fixed) sampling was used. This review showcases how the EMA method has been applied to improve understanding and prediction of health behaviours in context.


Introduction
Andy Warhol wrote: 'They always say time changes things, but you actually have to change them yourself.' This holds true for changing key health behaviours: increasing exercise and reducing time spent sitting, eating healthily, drinking less alcohol, stopping smoking, and having safe sex. In the health psychology domain, researchers have traditionally relied on one-off assessments of psychological constructs (e.g., motivation, self-efficacy), contextual factors (e.g., weather), and health behaviours. Researchers typically ask participants if they want to change their health behaviour(s), if they feel confident to do so, and if they perceive any specific change barriers or facilitators. We also tend to ask participants to retrospectively recall the average frequency of their health behaviour(s) over longer time periods (e.g., 'On average, how many times per week do you exercise?'). In the early 1980s, a method referred to as experience sampling (or Ecological Momentary Assessment; EMA) was introduced (Larson et al., 1980), which involves repeated (often technology-mediated), real-time measurements of cognitions, emotions, environmental contexts, and behaviours in people's daily lives (Stone & Shiffman, 1994).
This new method has revolutionised health psychological research: through relying on real-time (as opposed to retrospective) assessments of variables of interest, findings from EMA studies have provided a more precise and reliable understanding of how health behaviours unfold over time and in context, also mitigating methodological issues such as recall bias (Reichert et al., 2020). For example, participants are better at recalling emotions and cognitions at hourly or daily compared with weekly or monthly retrospective reports (Shiffman et al., 2007). In the four decades since its inception, the EMA method has been applied across many health behaviours. However, no review has synthesised findings from the many available EMA studies, summarising key theoretical and methodological aspects. Although prior systematic reviews have summarised methodological aspects of EMA studies (Cain et al., 2009; Colombo et al., 2019; de Vries et al., 2021; Degroote et al., 2020; Heron et al., 2017; Jones et al., 2019; Schembre et al., 2018; Wen et al., 2017), to the best of our knowledge, no available review has summarised psychological and contextual predictors of health behaviours measured via EMAs or compared methodological aspects across EMA studies focused on five key public health behaviours, which are leading causes of morbidity and premature mortality globally (Murray et al., 2020), including: physical activity and sedentary behaviour, dietary behaviour, tobacco smoking, alcohol consumption, and sexual health behaviour. We aimed to fill this gap by synthesising findings from EMA studies conducted across these five key health behaviours of interest.

Theoretical considerations in EMA studies: studying dynamic health behaviour change within persons
To predict and explain health behaviours and inform the development of effective behaviour change interventions, theories of health behaviours must apply to individuals. However, most studies that aim to test or build health psychology theory are designed in such a way that they can only explain why people are different from one another (i.e., they capture between-person differences). Ergodic processes are those that are identical for groups and individuals, with the mean and variance of the process (e.g., motivation to exercise) remaining consistent over time. Inferences made from group-level estimates of psychological processes can only be validly applied to understanding individuals if the process of interest is ergodic. However, evidence from EMA studies shows that the ergodicity assumption rarely holds for psychological processes (Fisher et al., 2018). Calls have therefore been made to focus research efforts on both group- and individual-level change processes (Chevance, Perski, et al., 2020; Fisher et al., 2018; Hekler et al., 2019). For example, EMA studies have been used to capture the co-occurrence of psychological and/or contextual variables and health behaviours ('synchronicity'; e.g., positive affect while eating), antecedents and consequences of health behaviours ('sequentiality'; e.g., the lagged effect of intentions on physical activity), and critical fluctuations in psychological or contextual variables and health behaviours ('stability' or 'instability'), with a focus on individual-level change processes (Dunton, 2017).
In addition, one cannot consider behaviour change without considering time. Few health psychology theories explicitly refer to time in their conceptualisation of change processes (Scholz, 2019), such as specifying the timeframe within which change in a psychological or contextual variable is expected to lead to change to the target behaviour, and with what magnitude. This only scratches the surface of the importance of time for the understanding and prediction of health behaviour change: both the psychological or contextual variable and the behaviour are likely to have their own group- and individual-level variances (i.e., 'stability' or 'instability' over time) and covariances that may or may not be systematically associated with time of day, week, month, or year.
To test any clearly articulated health behaviour change theory, study designs that can reliably capture the dynamics of psychological and behavioural processes at the within-person level are required, followed by the use of a statistical or computational approach that robustly operationalises the theoretical model (Collins, 2006). EMA studies are well-suited for capturing such dynamics as these allow researchers to flexibly schedule real-time assessments at different temporal frequencies (e.g., daily, hourly). Study designs and prompting schedules vary across EMA studies, with the latter being triggered by time (e.g., fixed, random, quasi-random or stratified prompts) or event occurrence (e.g., after having smoked a cigarette). Due to recent technological advances, the dynamics of health behaviours can also be captured using passive and continuous sensing with portable and/or wearable devices, which can be used to trigger event-based assessments when some predefined threshold is reached (Ebner-Priemer et al., 2013; Giurgiu et al., 2020).
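As an illustration of a time-triggered schedule, the following minimal Python sketch generates stratified random prompts for a single day. It assumes that 'stratified' means dividing the sampling window into equal blocks and drawing one random prompt time within each block; the function name, window and number of prompts are hypothetical and are not taken from any included study.

```python
import random
from datetime import datetime, timedelta


def stratified_random_prompts(day_start, day_end, n_prompts, seed=None):
    """Return one random prompt time per equal-length block of the window.

    Assumption: 'stratified' sampling splits the window into n_prompts
    equal strata and draws a single uniformly random time in each.
    """
    rng = random.Random(seed)
    stratum_length = (day_end - day_start) / n_prompts
    prompts = []
    for i in range(n_prompts):
        stratum_start = day_start + i * stratum_length
        # rng.random() lies in [0, 1), so the prompt stays inside its stratum.
        offset_seconds = rng.random() * stratum_length.total_seconds()
        prompts.append(stratum_start + timedelta(seconds=offset_seconds))
    return prompts


if __name__ == "__main__":
    # Hypothetical example: four prompts between 09:00 and 21:00.
    start = datetime(2021, 3, 1, 9, 0)
    end = datetime(2021, 3, 1, 21, 0)
    for prompt in stratified_random_prompts(start, end, n_prompts=4, seed=1):
        print(prompt.strftime("%H:%M"))
```

A fixed schedule would instead return the same clock times every day, whereas an event contingent design has no schedule at all because the participant initiates each report.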
However, to the best of our knowledge, no available review has summarised what psychological and contextual predictors have been examined, and at what sampling frequency, in EMA studies of the five key health behaviours of interest.

Methodological considerations in EMA studies: prompting schedules, incentives and adherence
Their theoretical benefits notwithstanding, EMA studies bring key methodological challenges for participants and researchers, including the burden associated with some prompting schedules (potentially leading to low adherence) (Reichert et al., 2020), a limited number of validated instruments for measuring state-like (i.e., dynamically fluctuating) psychological and contextual variables and health behaviours, and the requirement for researchers to master relatively sophisticated statistical modelling techniques, including multilevel/hierarchical regression models (Bolger & Laurenceau, 2013).
Although systematic reviews of EMA studies focusing on specific health behaviours are available, we lack a comprehensive summary of theoretical (e.g., psychological and contextual predictors) and methodological aspects (e.g., study designs, frequency of EMAs, incentives, EMA adherence) of EMA studies across key health behaviours, including physical activity and sedentary behaviour, dietary behaviour, tobacco smoking, alcohol consumption, and sexual health behaviour. The extent to which such theoretical and methodological aspects differ by target health behaviour remains an empirical question. Such information is useful for health psychology researchers planning the design of future EMA studies, identifying knowledge gaps, and providing a summary of best practice across research contexts, settings, and health behaviours. Although we acknowledge that our list of target health behaviours could beneficially be expanded to, for example, include medication adherence, healthcare seeking behaviour, and sleep, we were mindful when designing the review protocol that a large number of studies would likely be in scope and therefore opted to impose a boundary to only include key public health behaviours which are known to account for a considerable proportion of mortality and morbidity globally (Murray et al., 2020).

The present study
The present systematic review and meta-analysis therefore aimed to showcase the current state-of-knowledge in EMA health behaviour research and identify knowledge gaps by summarising theoretical (e.g., psychological and contextual predictors) and methodological aspects (e.g., study settings, study designs, sample characteristics, study durations, frequency of EMAs, EMA prompting strategies, adherence to EMAs, incentive structures) of EMA studies across five key health behaviours.

Study design
This review adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) checklist (Moher et al., 2009) and the American Psychological Association's Meta-Analysis Reporting Standards (Cooper, 2020). A protocol was pre-registered on the Open Science Framework (www.osf.io/cmnvw) and on the international Prospective Register of Systematic Reviews (www.crd.york.ac.uk/prospero/display_record.php?ID=CRD42020168314). In addition, the review protocol has been published (Kwasnicka et al., 2021).

Inclusion criteria
This review focused on the following five key health behaviours in healthy adults (i.e., non-clinical populations) aged 18+ years: (1) physical activity and sedentary behaviour, including the interruption of sitting time; (2) dietary behaviour, including snacking and fruit and vegetable consumption; (3) alcohol consumption, including binge drinking; (4) tobacco smoking, including cigarette, cigar and pipe smoking; (5) sexual health behaviour, including contraceptive and condom use.
To limit the review scope and as several available reviews have focused on EMA studies in specific clinical populations (e.g., borderline personality disorder, psychotic disorder, binge eating, bulimia nervosa, schizophrenia, chronic pain), we opted to include only non-clinical populations. Studies that recruited individuals with overweight or obesity were judged as non-clinical and therefore included, given that 39% of adults globally meet criteria for overweight or obesity, with most Western countries averaging above 50% (World Health Organisation, 2021). We included studies that involved individuals with a diagnosed mental or physical health condition, providing that they were not specifically recruited into the study based on a mental or physical health condition. We also included studies in which a behavioural or pharmacological intervention was delivered, providing that participants were asked to complete free-living EMAs.
To the authors' knowledge, there is no consensus definition of EMAs; therefore, we opted for an inclusive approach and included studies with repeated (i.e., two or more) within-day, daily or weekly assessments of psychological or contextual predictors and behaviours. We reasoned that the frequency of the EMAs needed to plausibly match how the target behaviour (and psychological and contextual predictors) theoretically or empirically unfolds over time (e.g., daily assessments of steps, weekly assessments of gym class attendance if the class is undertaken only once a week). To be included, studies needed to assess the target behaviour and at least one psychological or contextual variable through EMAs, and to have reported at least one within- or between-person predictor-behaviour association. In this review, we defined psychological variables as emergent properties of a distributed network of neurons, including cognitions (e.g., beliefs, attitudes, goals), emotions (e.g., negative affect, cravings) and processes operating on these (e.g., self-regulation, learning), which are linked to behaviour (Fried, 2017). We defined contextual variables as any potential environmental (i.e., social or physical) influences on behaviour, including the presence of other people, weather, or the availability of unhealthy foods/tobacco/alcohol. The psychological and contextual variables were closely assessed by the reviewers as to their suitability for inclusion/exclusion in the review. Studies reporting associations between behaviours and psychological consequences (e.g., the association of physical activity and positive affect) were included providing that they also reported at least one predictor-behaviour association (e.g., the association of positive affect with physical activity).
Studies were included if they used self-report or physiological measures of psychological or contextual predictors (e.g., cortisol or heart rate variability to capture stress) or behaviours (e.g., accelerometer data to capture physical activity). No restrictions on geographical location or publication date were applied.

Exclusion criteria
Studies only focusing on purchasing behaviours were excluded if they did not include any other relevant behaviour. Studies not published in English or where no full text could be obtained were not included. Behaviour-behaviour associations (e.g., the relationship between physical activity and eating behaviour) were not considered in this review.

Electronic searches
We searched Ovid MEDLINE, Embase, PsycINFO and Web of Science (see the Supplementary Materials for the full search strategy). Terms were searched for in titles and abstracts as free text or index terms (e.g., Medical Subject Headings), as appropriate. We combined two groups of terms, the first with terms relevant to EMAs and within-person study designs; the second with terms relevant to the five key health behaviours. Electronic and hand searches were conducted in January 2020 and updated on February 28th, 2021. The search was restricted to human studies written in English that were published in peer-reviewed journals.

Searching for other sources
Reference lists of available systematic reviews of EMA studies were hand searched and expertise within the review team was used to identify additional articles of interest.

Selection of studies
Identified articles were merged using Covidence (www.covidence.org) and duplicate records were removed. Three reviewers (OP, JK, DKw) independently screened titles and abstracts (yes, maybe, no) against the pre-specified inclusion criteria. Authors were e-mailed to request access to full texts which could not be obtained electronically. Full texts were independently screened by two reviewers from the author team (yes, no). Discrepancies were resolved by three reviewers (OP, JK, DKw), consulting the other team members if needed. We did not calculate inter-rater reliability. In line with the PRISMA guideline, reasons for exclusion were recorded at the full text stage and are listed in Figure 1.

Data extraction and management
A data extraction form was developed in Microsoft Excel by the review leads in collaboration with the other team members. Data were extracted by one reviewer from the author team, with 20% of studies double checked for accuracy and completeness by a second reviewer from the author team. Discrepancies were resolved by the two reviewers involved in the data extraction and checking, consulting the other team members if needed. We did not calculate inter-rater reliability. Data were extracted on study description, health behaviour(s), participant characteristics, study design, EMA characteristics and adherence, and psychological and contextual predictors (see Kwasnicka et al., 2021, for further details). EMA adherence was defined as the average percentage of EMAs completed (numerator) out of the available EMAs (denominator) across the study sample (Kwasnicka et al., 2021).
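The adherence definition can be read in two ways, and the minimal Python sketch below shows both under stated assumptions: averaging each participant's own percentage, or pooling completed and available EMAs across the sample. The function names and the example counts are hypothetical; the source does not specify which reading individual studies applied.

```python
def mean_participant_adherence(completed, available):
    """Average of each participant's percentage of completed EMAs."""
    if len(completed) != len(available):
        raise ValueError("completed and available must have the same length")
    # Participants with no available EMAs are skipped to avoid dividing by zero.
    percentages = [100.0 * c / a for c, a in zip(completed, available) if a > 0]
    if not percentages:
        raise ValueError("no participant had any available EMAs")
    return sum(percentages) / len(percentages)


def pooled_adherence(completed, available):
    """Total completed EMAs as a percentage of total available EMAs."""
    total_available = sum(available)
    if total_available == 0:
        raise ValueError("no EMAs were available")
    return 100.0 * sum(completed) / total_available


if __name__ == "__main__":
    # Hypothetical sample: two participants with different numbers of prompts.
    completed = [40, 10]
    available = [50, 20]
    print(mean_participant_adherence(completed, available))
    print(pooled_adherence(completed, available))
```

The two readings coincide when every participant receives the same number of prompts and diverge otherwise, which is worth bearing in mind when comparing adherence figures across studies.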

Quality appraisal
As a specific quality appraisal tool for EMA studies is currently not available, we devised a bespoke tool specifically for the purposes of this review based on previous literature (including the CREMAS checklist) (Liao et al., 2016; Stone & Shiffman, 2002). The quality appraisal tool was piloted by the review team and included the following four criteria: (1) rationale for the EMA design; (2) whether an a priori power analysis had been conducted; (3) adherence to the EMAs; and (4) treatment of missingness (see the Supplementary Materials, Table S1). In line with the Effective Public Health Practice Project quality assessment tool (Armijo-Olivo et al., 2012), we rated each of the four criteria as 'Strong', 'Moderate' or 'Weak'. As each criterion refers to a different aspect of study quality, we did not produce an overall study quality rating for each study. The quality appraisal was performed by one reviewer from the author team, with 20% of studies double checked by a second reviewer from the author team. Discrepancies were resolved by the two reviewers involved in the data extraction and checking, consulting the other team members if needed. We did not calculate inter-rater reliability.

Data synthesis
A narrative (descriptive) synthesis was conducted to summarise the theoretical and methodological aspects of the EMA studies, first across all included studies and next split by target behaviour.
To aid interpretation and prior to summarising the psychological and contextual predictors assessed, we coded the identified constructs against the following higher-order categories, developed by three reviewers (OP, JK, DKw) based on the Theoretical Domains Framework (Atkins et al., 2017; Michie et al., 2005). The Theoretical Domains Framework (TDF) was developed through consensus methodology, with a view to integrating the many available behaviour change theories and theoretical constructs into a single framework, thus making theory more accessible to researchers and practitioners (Atkins et al., 2017; Michie et al., 2005). We used the following TDF-based, higher-order categories: 'feeling states - unspecified', 'positive feeling states', 'negative feeling states', 'momentary trait manifestations and physical states', 'motivation and goals', 'beliefs about capabilities', 'beliefs about consequences', 'behavioural regulation', 'memory, attention and decision processes', 'social influences', 'environmental context and physical/environmental resources' and 'nature of the behaviour' (see 'Data statement' for a link to the dictionary used). The psychological and contextual variables identified across the included studies were coded by one reviewer (OP) and double checked by two reviewers (DKw and JK). Discrepancies were resolved through discussion among three reviewers (OP, JK and DKw). Following an identical procedure, the identified funders were coded against the following higher-order categories: research/government, society, charity, university/health institution, industry or no funding reported (see 'Data statement' for a link to the dictionary used).
Second, although we did not systematically extract information on overlapping samples across included studies at the time of data extraction, we returned to the dataset to identify such samples using the following approach: (i) two reviewers (DP and FN) flagged studies with identical sample sizes and identical sample mean ages; and (ii) checked the author list for overlaps in co-authorship. Where (i) and (ii) were satisfied, studies were coded as having an overlapping sample. Where an overlap in co-authorship was not identified, the article full texts were further checked. Next, the 'General Comments' column in the data extraction sheet (used by reviewers to highlight any queries) was screened for any mention of overlapping samples, and where this was the case, this was confirmed by checking if the samples in the articles were the same or a subsample of each other. Finally, where the first approach brought up sample sizes and mean ages that were very close but not identical, the articles were further screened to check for overlapping samples. Studies with overlapping samples were excluded prior to the meta-analysis, keeping the earliest record of a study using each sample.
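Steps (i) and (ii) of this procedure lend themselves to a simple automated first pass. The Python sketch below is a hypothetical illustration only: the record fields (`id`, `n`, `mean_age`, `authors`) are assumed names rather than columns from the review's extraction sheet, and the later manual steps (full-text checks, screening the comments column, near-identical samples) are not covered.

```python
from itertools import combinations


def flag_overlapping_samples(studies):
    """Flag pairs of studies that meet criteria (i) and (ii).

    (i)  identical sample size and identical sample mean age;
    (ii) at least one shared co-author.
    Each study is a dict with hypothetical keys: id, n, mean_age, authors.
    """
    flagged = []
    for first, second in combinations(studies, 2):
        same_sample = (
            first["n"] == second["n"]
            and first["mean_age"] == second["mean_age"]
        )
        shared_authors = set(first["authors"]) & set(second["authors"])
        if same_sample and shared_authors:
            flagged.append((first["id"], second["id"]))
    return flagged


if __name__ == "__main__":
    # Hypothetical records, not taken from the review's dataset.
    studies = [
        {"id": "A2015", "n": 120, "mean_age": 34.2, "authors": ["Smith", "Lee"]},
        {"id": "B2017", "n": 120, "mean_age": 34.2, "authors": ["Lee", "Khan"]},
        {"id": "C2018", "n": 120, "mean_age": 34.2, "authors": ["Garcia"]},
    ]
    print(flag_overlapping_samples(studies))
```

Pairs that match on sample characteristics but not on authorship (such as the third record above) are left unflagged here, mirroring the point at which the procedure described above moved to checking full texts.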
We then conducted a series of uni- and multivariable random-effects meta-analyses to estimate the pooled percentage adherence across included studies and to examine whether adherence varies depending on study setting, study population, whether an incentive(s) was provided, target behaviour, EMA delivery mode, EMA sampling frequency, EMA sampling method, whether an adherence cut-off was applied, year of publication, or study duration (in days), with some moderator levels collapsed due to low cell counts (see the Supplementary Materials, Table S2, for the moderator coding). Studies with missing data on any of the moderator variables were excluded from the uni- and multivariable meta-analyses. We did not have pre-specified hypotheses regarding potential moderators of EMA adherence; all variables were entered simultaneously into a multivariable random-effects model. Analyses were conducted in RStudio using the metafor package and with the estimator set to restricted maximum-likelihood (Viechtbauer, 2010). To aid interpretation, we did not apply any transformations of the raw percentages prior to meta-analysis. The I² statistic was used to quantify the between-study heterogeneity, but we did not deem it useful to assess the potential for publication bias via, for example, Egger's test given EMA researchers often apply adherence cut-offs for inclusion (which was already captured descriptively). Due to the large number of included studies, forest plots for each target behaviour were produced.
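To make the pooling step concrete, the following Python sketch computes a random-effects pooled estimate, its 95% confidence interval and a Q-based I² from study-level adherence percentages and their sampling variances. Note the assumptions: it uses the closed-form DerSimonian-Laird estimator of the between-study variance rather than the restricted maximum-likelihood estimator used in the review, it omits moderators, and the inputs in the example are hypothetical. It is an illustration of the general technique, not a reproduction of the reported analysis.

```python
import math


def random_effects_pool(estimates, variances):
    """Pool study estimates with a DerSimonian-Laird random-effects model.

    estimates: study-level adherence percentages (untransformed).
    variances: their sampling variances (assumed known and positive).
    """
    k = len(estimates)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and Cochran's Q.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    df = k - 1

    # DerSimonian-Laird estimate of between-study variance, truncated at 0.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random-effects weights add tau2 to each study's sampling variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum_re
    se = math.sqrt(1.0 / sum_re)

    # Q-based I-squared: share of total variability beyond sampling error.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci_low": pooled - 1.96 * se,
        "ci_high": pooled + 1.96 * se,
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical adherence percentages and sampling variances.
    print(random_effects_pool([70.0, 80.0, 90.0], [4.0, 4.0, 4.0]))
```

Because the raw percentages are pooled without transformation, the confidence interval is not constrained to the 0-100 range; this is rarely a practical concern when the pooled value sits well inside that range.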
We had specified in the pre-registered study protocol that we aimed to synthesise predictor-behaviour associations using random-effects meta-analyses, grouped by target behaviour (Kwasnicka et al., 2021). However, due to the length of the present review and the desire to describe predictor-behaviour associations in more depth, we opted instead to present such results as part of smaller, behaviour-specific sub-reviews (e.g., https://osf.io/49uqf/; https://osf.io/p2b65/), which are currently in progress.

Results
After removing duplicates, 15,733 records were identified, with 1,078 studies carried forward to the full text screening. A total of 633 studies were included in the narrative synthesis, with 348 studies included in the meta-analysis to examine moderators of EMA adherence (see Figure 1). Table 1 summarises the study characteristics of the included studies. Most studies focused on physical activity (187/633; 29.5%), followed by alcohol (175/633; 27.6%), smoking (139/633; 22.0%), dietary behaviour (111/633; 17.5%) and sexual health behaviour (21/633; 3.3%). Most studies were conducted in the United States (441/633; 70.1%), followed by Germany (32/633; 5.1%) and Australia (31/633; 4.9%; see Table 1 and Supplementary Materials, Figure S1). With the exception of the studies focused on sexual health behaviour, there appeared to be an increasing trend in the number of studies published over time (see Supplementary Materials, Figure S2).

Study characteristics
Studies primarily received funding from research/government organisations (407/633; 64.3%). Just over one fifth of studies did not report any specific funding received (138/633; 21.8%).
In a subsequent, multivariable random-effects meta-analysis with moderators entered (k = 348), study population, EMA delivery mode, EMA sampling method, EMA device ownership and year of publication were significant moderators of EMA adherence (see Table 3). Specifically, greater adherence was observed in studies with student (vs. general) population samples, mobile phone/smartphone (vs. handheld device) EMA delivery, and event contingent (vs. fixed) EMA sampling. Reduced adherence was observed in studies with all/majority (vs. none) of participants using their own device and random (vs. fixed) EMA sampling. Since the first EMA publication included in the meta-analysis in 1987, for every decade until 2021, adherence decreased by 3.1%.

Discussion
This systematic review and meta-analysis summarises the state-of-the-art in EMA studies conducted in non-clinical populations and across five key health behaviours. We identified 633 studies that investigated psychological and/or contextual predictors of the health behaviours of interest, with most studies focused on physical activity or alcohol consumption. The number of EMA studies across all (except for sexual health) behaviours of interest appears to have increased over time; this likely reflects popularisation of the EMA method and elevated technological progress that facilitates real-time data collection (Gibbons, 2017).

Study characteristics
Most of the included studies were conducted in the US, with a large proportion of participants having a university degree and identifying as White ethnicity. This aligns with research showing that much of our psychological science is based on what has been described as WEIRD populations (i.e., Western, Educated, Industrialised, Rich and Democratic) (Henrich et al., 2010). However, the included EMA studies reported a relatively equal gender split and most studies recruited participants from the general population rather than student cohorts (although the latter was also common). Most included studies applied observational designs, suggesting that interventional designs are currently less common in EMA research. In addition, within the few identified interventional studies, most tested interventions in which allocation occurred between rather than within participants, suggesting that the latter design remains rare, as highlighted in reviews of N-of-1 studies (which typically harness EMAs) (Kwasnicka et al., 2019). Recently, researchers have demonstrated the potential of EMAs for first exploring participants' behavioural patterns in context, followed by interventions tailored to the most important predictors identified in the observational phase (Kwasnicka et al., 2020).
EMAs were primarily delivered via technological tools, such as handheld devices or mobile phones/smartphones. More than half of studies provided all participants with a study specific EMA device, such as a handheld device or activity monitor. The most commonly used EMA sampling frequency was daily, and the most commonly used EMA sampling method was 'multiple' (e.g., a combination of at least two sampling methods such as event and signal contingent prompts).

Psychological and contextual predictors of the five key health behaviours
The most frequently assessed psychological and contextual variables fit into the higher-order categories 'negative feeling states' and 'motivation and goals'; however, this varied by target behaviour. For instance, the studies focused on sexual health behaviours primarily captured 'social influences' and 'motivation and goals'. Our review also highlights that some construct domains from the Theoretical Domains Framework (Atkins et al., 2017) have been relatively understudied (e.g., 'memory, attention, and decision processes'). Further planned behaviour-specific sub-reviews and meta-analyses will examine in depth the ways in which the identified constructs have been assessed for each health behaviour and pool data on predictor-behaviour associations to understand their relative importance (e.g., https://osf.io/49uqf/; https://osf.io/p2b65/). Our database of included EMA studies is openly available and we encourage other researchers to explore how different psychological and contextual variables have been assessed across the five health behaviours. Just over 40% of psychological and contextual predictors were assessed with multiple (rather than single) items and just over a third were reported to have been measured with items for which there was a precedent. The Experience Sampling Methodology (ESM) Item Repository (https://www.esmitemrepositoryinfo.com/) and working group were established to progress EMA methodology and help researchers identify relevant EMA items. The repository includes a searchable database which allows researchers to identify if a given item has been used in a previous EMA study and future aims include psychometrically validating items in the repository.

Moderators of EMA adherence
In the meta-analysis of moderators of EMA adherence, the pooled percentage adherence was high at around 80% and comparable across the five target behaviours. This is similar to numbers reported in previous reviews of EMA studies, which have ranged from 71.6% to 79.0% (Cain et al., 2009; Colombo et al., 2019; de Vries et al., 2021; Degroote et al., 2020; Heron et al., 2017; Jones et al., 2019; Schembre et al., 2018; Wen et al., 2017). However, substantial between-study heterogeneity was detected in our review.
Most studies reported providing some type of incentive for participation or data completion (e.g., flat payment based on study completion, payment per EMA, course credit). However, in the meta-analysis, there was no significant association between the receipt of an incentive and adhering better to the study protocol (vs. no incentive), which stands in contrast to other studies reporting that financial incentives in particular are associated with greater adherence (Giles et al., 2014). Possibly, adherence rate in EMA studies is not primarily related to extrinsic factors (e.g., financial incentives) as participants might be motivated due to intrinsic factors such as their interest in the real-life examination of their health behaviours. Similarly, studies that recruited students reported significantly greater EMA adherence, which may be related to students' increased motivation to contribute to science (Jang, 2008).
Studies in which EMAs were delivered via mobile phones/smartphones reported significantly greater adherence than those using handheld devices, suggesting that phones are suitable for answering EMA prompts, as participants are used to carrying smartphones with them throughout the day (Statista, 2021). However, studies in which all or the majority of participants used their own device to respond to EMAs reported significantly lower adherence than when using a device provided by the research team. This may be interpreted to suggest that adding objects to participants' environment (i.e., a dedicated study phone), an unintended behaviour change technique (Michie et al., 2013), may act as a method for increasing study adherence. It is also possible that other apps on participants' own devices generated similar alerts, which may have interfered with their engagement with the EMA alerts.
We note that researchers also need to consider the environmental impact of buying new electronic devices for each EMA study (Chevance, Hekler, et al., 2020), which is often driven by incompatibilities between new EMA software and the operating systems in older smartphones. However, if we are aiming to achieve sustainability in EMA research, we need to take into consideration that data collection devices and the energy that they use to run are limited. We need to carefully weigh costs and benefits of using technology, including when to purchase new (as opposed to recycling old) devices. Reusing devices across studies and opting for energy saving devices/functionalities where possible (e.g., traditional short message service; SMS) (Dondyk et al., 2015) are some of the potential solutions for making EMA research more environmentally friendly.
In addition, year of publication was a significant moderator of adherence, such that the reported adherence to EMA schedules has reduced over time, on average by 3.1% per decade since 1987. It is possible that methodological advances have made it more straightforward to accurately detect adherence, with fewer opportunities to backfill EMAs when these are prompted by digital technologies (e.g., smartphones). As a further explanation, people's digital environment (e.g., the frequency of notifications from multiple apps) has changed in recent years, potentially reducing attention to EMA prompts.
Study duration and sampling frequency were not significant moderators of EMA adherence. However, studies that used event contingent sampling reported significantly greater EMA adherence and those using random prompts reported significantly reduced adherence compared with fixed sampling (e.g., every evening). The former may simply be explained by participants reporting 'in the moment' (e.g., when smoking a cigarette) making it close to impossible to assess if the participant reported all occurrences of the behaviour; therefore, adherence rates are inflated in studies applying this type of sampling method. The latter may be explained by participants being unable to anticipate prompts, meaning they may be busy at times of randomly sent prompts.
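The denominator problem described above can be made concrete with a small sketch. The function names and the numbers are hypothetical and are not taken from any included study; the point is only that event contingent adherence is typically computed against the events a participant chose to report, whereas the number of events that actually occurred is unobserved.

```python
def signal_contingent_adherence(prompts_sent, prompts_answered):
    """Adherence when the researcher controls, and therefore knows, the denominator."""
    if prompts_sent <= 0:
        raise ValueError("prompts_sent must be positive")
    return prompts_answered / prompts_sent


def apparent_event_adherence(reports_initiated, reports_completed):
    """Adherence as it can be computed in an event contingent study.

    The denominator is the number of self-initiated reports, not the number
    of events that occurred, so unreported events are invisible.
    """
    if reports_initiated <= 0:
        raise ValueError("reports_initiated must be positive")
    return reports_completed / reports_initiated


def true_event_adherence(events_occurred, reports_completed):
    """Adherence against the true event count, which is usually unknown."""
    if events_occurred <= 0:
        raise ValueError("events_occurred must be positive")
    return reports_completed / events_occurred


if __name__ == "__main__":
    # Hypothetical participant: 10 cigarettes smoked, 6 reports started and finished.
    print(apparent_event_adherence(reports_initiated=6, reports_completed=6))
    print(true_event_adherence(events_occurred=10, reports_completed=6))
    # Hypothetical signal contingent comparison: 50 prompts sent, 40 answered.
    print(signal_contingent_adherence(prompts_sent=50, prompts_answered=40))
```

In this hypothetical case, the apparent adherence is perfect even though four of the ten events went unreported, which is why adherence figures from event contingent designs should be compared with those from signal contingent designs only with caution.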

Quality appraisal
Most included studies did not provide an a priori power analysis to justify sample sizes at the within- or between-person level. This is similar to other psychology domains: for example, a recent review in the psychopathology domain found that only 2% of included studies reported a power calculation (Trull & Ebner-Priemer, 2020). Conducting sample size calculations for EMA studies is complex and requires various parameters to be estimated which can be difficult to know in advance without access to pilot data or previous studies that fully report model outputs. The latter is often absent, with random effects commonly omitted from papers and supplementary materials. Tutorials for how to conduct power analyses for EMA studies have been published (Bolger et al., 2012; Lafit et al., 2021); however, their use appears rather limited. In addition to the above issues relating to uncertainties about model parameters, off-the-shelf power analysis tools for EMA studies are not widely available in popular statistical software (but see, for example, Green & MacLeod, 2016 and Lafit et al., 2021 for available tools). Therefore, researchers often rely on 'rules of thumb' when making decisions about the sample size in EMA studies.
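As a rough illustration of why these parameters matter, the sketch below estimates power by simulation for a within-person association. It is a deliberately simplified stand-in, not the multilevel approach set out in the cited tutorials: it uses a two-stage summary method (one ordinary least squares slope per participant, followed by a test of the mean slope against zero with a normal critical value of 1.96). All parameter names and default values (`beta`, `sd_slope`, `sd_resid`, `n_sims`) are hypothetical and would need to come from pilot data or fully reported prior studies.

```python
import math
import random
import statistics


def person_slope(xs, ys):
    """Ordinary least squares slope of y on x for one participant."""
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def simulate_power(n_people, n_obs, beta, sd_slope=0.2, sd_resid=1.0,
                   n_sims=200, seed=1):
    """Estimate power to detect an average within-person slope `beta`.

    n_people : number of participants (between-person sample size)
    n_obs    : completed EMAs per participant (within-person sample size)
    sd_slope : assumed between-person SD of the slope (random effect)
    sd_resid : assumed residual SD of the momentary outcome
    """
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        slopes = []
        for _ in range(n_people):
            # Each participant has their own slope around the average effect.
            slope_i = beta + rng.gauss(0, sd_slope)
            xs = [rng.gauss(0, 1) for _ in range(n_obs)]
            ys = [slope_i * x + rng.gauss(0, sd_resid) for x in xs]
            slopes.append(person_slope(xs, ys))
        # Second stage: test the mean of the per-person slopes against zero.
        se = statistics.stdev(slopes) / math.sqrt(n_people)
        if abs(statistics.fmean(slopes) / se) > 1.96:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    for n_people in (20, 40, 80):
        print(n_people, simulate_power(n_people=n_people, n_obs=20, beta=0.15))
```

Varying `sd_slope` and `sd_resid` in such a simulation shows how strongly the required sample size depends on random-effect and residual variances, which is why their omission from published papers makes a priori power analysis difficult.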
Most included studies did not interrogate reasons for EMA missingness or control for missing mechanisms in their analyses. Although some missing data are inevitable in EMA studies, the statistical techniques used to analyse clustered data require that data are missing at random or missing completely at random for these to be 'ignorable' within the analyses (Little & Rubin, 2019). Where data are missing not at random, both the process of interest and the process of missingness must be simultaneously modelled (Black et al., 2012). Researchers have, for example, used innovative methods such as unobtrusive 'eavesdropping' to understand factors associated with missed EMAs (Sun et al., 2020).
We strongly encourage EMA researchers to increase the methodological rigour and transparency of EMA research. We echo our colleagues' call (Kirtley et al., 2021) for greater use of study pre-registrations, using a template for EMA research to register both prospective studies and secondary analyses of available data. In order to progress dynamic theory building and make the most of EMA data, we also strongly encourage data sharing (e.g., via the Open Science Framework) and the sharing of questionnaire items (e.g., via the ESM Item Repository; https://www.esmitemrepositoryinfo.com/).

Strengths
First, a key strength of this review is the comprehensive summary of the application of the EMA method since its inception and across five key health behaviours. Second, we provided an overview of psychological and contextual predictors examined across EMA studies, highlighting differences in focus across the five health behaviours and identifying gaps for future research. Third, we summarised moderators of EMA adherence. Fourth, there is currently no consensus on how to reliably determine the quality of EMA studies. We therefore opted to design a bespoke quality appraisal tool, drawing on available checklists. Although this was useful for the purposes of our review, the tool requires further optimisation prior to wider use. Other research teams are in the process of developing more comprehensive frameworks and quality assessment tools that can be used in future reviews of EMA studies (although these remain unpublished). Fifth, this review was conducted by an international team of researchers, with team members collaborating online throughout the research process. Sixth, we closely followed the principles of Open Science, including study pre-registration; publication of the review protocol; documentation of design and analytic decisions; and sharing the analytic code, procedures, and the underlying dataset for transparency and reuse (McKiernan et al., 2016). The authors strongly encourage other EMA researchers to use and update the electronic searches and the database of EMA studies.

Limitations
First, some of the included studies are likely to have used overlapping samples. As we did not have the resources to contact study authors, we attempted to identify articles using the same dataset by checking sample sizes and author names and, where identified, removed studies with overlapping samples prior to conducting the meta-analysis. However, we may not have identified all such studies, thus potentially biasing the pooled estimates. The results should therefore be interpreted with caution.
Second, although our review provided an overview of theoretical and methodological aspects of EMA studies, we did not attempt to quantify potential reactivity effects (i.e., whether repeatedly responding to EMAs may lead to behaviour change) (Wilding et al., 2016). However, this has been explicitly studied in extant EMA reviews (König, Allmeta, et al., 2021).
Third, this review focused solely on non-clinical populations. We acknowledge that there is a large number of EMA studies conducted in clinical populations.
Fourth, we focused on five key health behaviours (due to their relationships with morbidity and mortality) and presented the results stratified by target behaviour. However, we acknowledge that some of the health behaviours of interest can usefully be split into further sub-behaviours (e.g., 'movement behaviour' tends to be split into physical activity and sedentary behaviour, 'dietary behaviour' tends to be split into several categories, including fruit and vegetable consumption, sugary beverage consumption, etc.), which are expected a priori to be differently associated with psychological and contextual variables. This will be further explored in a series of behaviour-specific sub-reviews, which will look at such questions in more depth.
Fifth, since initiating this project, a few similar reviews focusing on EMA adherence have been published (Ottenstein & Werner, 2021; Wrzus & Neubauer, 2022). The present review is unique in that it is the first to consider both the theoretical aspects of EMA studies (e.g., the psychological and contextual predictors assessed) and study quality across key health behaviours. Although many of the results presented here align with those in extant reviews (e.g., EMA adherence), and did not differ markedly by the target health behaviour, these remained empirical questions prior to the present review.
Sixth, and related to the above limitation, our updated search was conducted in February 2021 and many relevant EMA studies have likely been published since. The database of included studies, the search strategy and all relevant study materials are published open source and we strongly encourage other researchers to update the search and to make further use of the extracted data.
Seventh, due to the already wide scope of the current review, we did not search the grey literature (e.g., PhD theses, pre-prints, other unpublished sources). Additional relevant studies may therefore have been missed.
Finally, due to the cost of EMA data collection (e.g., participant burden, researcher time), researchers often collect data on many variables within a single study and subsequently use different variable sets for different papers. Therefore, it is plausible that the number of variables reported in the included studies did not correspond to the actual number of variables assessed. We strongly encourage EMA researchers to publish study protocols and fully anonymised datasets.

Wider implications and avenues for future research
Future EMA research would benefit from harnessing advancements in sensor technology to detect health behaviours and contexts/locations (e.g., using geo-location, ambient light, biomarkers such as cortisol or glucose) (Reichert et al., 2020) and applying novel methods such as micro-EMAs to reduce participant burden, increase EMA adherence and increase the precision of EMAs (Ponnada et al., 2021). In addition, we note that most EMA studies reviewed here relied on 'rules of thumb' to guide key study design decisions (e.g., study duration, assessment frequency).
Event and signal contingent designs serve different purposes in data collection. However, event contingent sampling is associated with greater EMA adherence due to limited opportunities to estimate the 'true' denominator (i.e., the actual event rate is unknown) and this leads to inflated adherence rates reported in studies using event contingent EMAs.
EMA studies allow researchers to test theories within individuals over time, and to build dynamic behaviour change theories. However, we note that few studies explicitly tested behaviour change theories or used EMAs to develop and validate dynamic theories (Hall & Fong, 2007). Drawing on recent developments in sensor technology, natural language processing, and pattern recognition (Naylor, 2018), we are now at an opportune time to design EMA studies that facilitate understanding of individuals in context and then devise interventions that enhance health behaviour change and maintenance.
Future EMA studies should consider, where appropriate, moving beyond observation to intervene at the within-person level, for example by deploying 'just-in-time adaptive interventions' (JITAIs). JITAIs can be defined as interventions providing the right type and amount of support, at the right time, by adapting intervention delivery to an individual's changing psychological and contextual states (Nahum-Shani et al., 2016). Dynamic interventions such as JITAIs also have the potential to inform how psychological and contextual factors co-vary with health behaviours through their attempts at modification, and therefore present an exciting avenue for future research and theory development.

Conclusions
This systematic review and meta-analysis of EMA studies conducted across five key health behaviours found that studies have largely focused on capturing negative feeling states and motivation and goals. Participants' adherence to EMAs was high (around 80%) and did not differ by target behaviour but was higher in student (vs. general population) samples, when EMAs were delivered via mobile phones (particularly when using a study-provided phone), and when event contingent sampling was used (although this is likely due to artificially inflated adherence rates in such studies). The quality of future EMA studies could be improved by conducting a priori power analyses and better accounting for EMA missingness. Future work harnessing EMAs may benefit from moving from understanding and predicting behavioural patterns to designing dynamically tailored interventions and building dynamic health behaviour change theories.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
Olga Perski and Dimitra Kale receive salary support from Cancer Research UK (C1417/A22962). Daniel Powell is funded by the Scottish Government's Rural and Environment Science and Analytical Services (RESAS) and by the School of Medicine, Medical Sciences, and Nutrition (SMMSN) at the University of Aberdeen. Felix Naughton's salary is covered by the Faculty of Medicine and Health Sciences at the University of East Anglia. Dominika Kwasnicka's work is carried out within the HOMING program of the Foundation for Polish Science co-financed by the European Union under the European Regional Development Fund (grant number POIR.04.04.00-00-5CF3/18-00; HOMING 5/2018) and she is also funded by the NHMRC CRE in Digital Technology to Transform Chronic Disease Outcomes, Australia.

Author contributions
DKw, OP, DP and FN conceived the project. DKw and OP are the project leads and coordinators. All authors have made conceptual contributions to the project design and procedures. All authors have contributed to the data extraction. OP conducted the statistical analyses and wrote the first draft of the manuscript. All authors have read, edited, and approved the final version of the manuscript.