Multi-level temporal autoregressive modelling of daily activity satisfaction using GPS-integrated activity diary data

ABSTRACT In this research, we match web-based activity diary data with daily mobility information recorded by GPS trackers for a sample of 709 residents in a 7-day survey in Beijing in 2012 to investigate activity satisfaction. Given the complications arising from the irregular time intervals of GPS-integrated diary data and the associated complex dependency structure, a direct application of standard (spatial) panel data econometric approaches is inappropriate. This study develops a multi-level temporal autoregressive modelling approach to analyse such data, which conceptualises time as continuous and examines sequential correlations via a time or space-time weights matrix. Moreover, we manage to simultaneously model individual heterogeneity through the inclusion of individual random effects, which can be treated flexibly either as independent or dependent. Bayesian Markov chain Monte Carlo (MCMC) algorithms are developed for model implementation. Positive sequential correlations and individual heterogeneity effects are both found to be statistically significant. Geographical contextual characteristics of sites where activities take place are significantly associated with daily activity satisfaction, controlling for a range of situational characteristics and individual socio-demographic attributes. Apart from the conceivable urban planning and development implications of our study, we demonstrate a novel statistical methodology for analysing semantic GPS trajectory data in general.


Introduction
The study of human mobility has increasingly resorted to high spatio-temporal resolution data such as GPS trajectories with the fast development of location tracking technologies. Exploiting the fine granular representation of individual daily mobility offered by such data, recent research has involved integrating trajectories with additional contextual datainformation beyond geo-coordinates and time stamps of movementto create semantic trajectories (Spaccapietra et al. 2008, Yan et al. 2013. A plethora of methods have been proposed for trajectory semantic enrichment processes, usually depending on application contexts and data availability (Parent et al. 2013). For instance, Grinberger and Shoval (2015) detailed a process to construct semantic trajectories by coupling raw GPS traces with residents' digital activity diaries, whereby trajectory segments were characterised by activity information. They also demonstrated the usefulness of such semantic trajectories in producing high-level knowledge on urban dynamics and spatial structure. Siła-Nowicka et al. (2016) augmented GPS trajectories with urban point of interest (POI) data to identify residents' daily activities and travel modes, and to explore the effects of residential location on daily travel mode choices. Whilst methodological advances revolve around new techniques of annotating and segmenting GPS trajectories with external data (Grinberger and Shoval 2015), innovative spatio-temporal visualisation tools (eg Demšar and Virrantaus 2010), and improved data mining approaches (eg Dodge et al. 2012, Yang andGidófalvi 2017), the development of generic statistical models to properly analyse semantic GPS trajectories is rather under-researched. Nonetheless, the importance of statistical models in discovering and drawing inferences on relationships between human activity and contextual factors is beyond doubt.
Treatment of time and scale has been well-recognised as great challenges in the quantitative analysis of GPS trajectory or movement data (eg Long and Nelson 2013, Kwan and Neutens 2014, Purves et al. 2014. From the statistical modelling perspective, trajectory segmentshomogeneous sub-trajectories defined by certain criteria such as activities or travels in residents' daily mobility (eg Grinberger andShoval 2015, Siła-Nowicka et al. 2016)are irregularly spaced on the time axis with unequal time intervals, and the number of segments per trajectory usually differ. In contrast, standard panel data and spatial panel data techniques (eg Elhorst 2014, Hsiao 2014 are developed based on data with structured, regularly-spaced and discrete time units, and thus it would be problematic to use them to model sequential or temporal correlations between trajectory segments. The scale issue refers to the choice of analysis units at different levels of aggregation influencing data analysis results and statistical inferences (eg Dungan et al. 2002, Haining 2003. For instance, semantic GPS trajectory data can be analysed at scales of trajectories and trajectory segments. However, as well-established in the statistics literature, a separate modelling of data with a multi-level (or multi-scale) structure could lead to unreliable estimates of the relationships under interest and incorrect inferences (Raudenbush andBryk 2002, Goldstein 2011). Therefore, we need a proper methodology to deal with the multiple-scale nature of trajectory data.
This study extends the statistical analysis tools for investigating semantic trajectories by proposing a novel Bayesian multi-level temporal autoregressive model that deals with the issues of time and scale. The methodology simultaneously investigates various attributes of trajectory segments and trajectories under an extended multi-level modelling framework. In addition, time is conceptualised as continuous in our proposal, respecting the fine temporal granularity of GPS trajectory data, and more importantly, enabling a simple way to characterise the sequential correlations between trajectory segments via a temporal weights matrix (detailed below). A continuous time statistical modelling approach has been applied to longitudinal data analysis when measures of an outcome variable are irregularly spaced over time and the number of measurement varies between subjects or individuals (eg Taylor et al. 1994, Diggle 2002. Linear or generalised linear mixed models are commonly adopted for such data, in which the temporal or sequential correlation among measurements or outcomes is specified through a structured residual correlation matrix (eg Diggle 2002, Goldstein 2011. Correlations among residuals of each individual are formulated usually based on temporal distances separating measurements and a time decay function such as an exponential or Gaussian kernel (eg Steele 2008). Advanced stochastic process approaches, such as an integrated Ornstein-Uhlenbeck process derived from a stochastic differential equation, have also been devised to form the temporal correlation structure of model residuals (Taylor et al. 1994, Diggle et al. 2014, Asar et al. 2016, Hughes et al. 2017. Despite the flexibility and mathematical rigour offered by these approaches, a key concern is that the temporal correlation among measurements or outcomes are captured through model residuals, which makes them inappropriate to explore substantive interactions among outcomes, ie the impacts of precedent outcomes on the current one. Such inquiries, however, appeal to social science researchers. The methodology developed in this study treats temporal correlations among outcomes substantively and produces estimates on the strength of how precedent outcomes affect the current one. Moreover, it allows for potential interactions among individuals, which is not modelled in the above studies. Lastly, space or spatial displacement can also be taken into account when analysing temporal correlations in our methodology. The motivation of such methodological development lies in our research interest in understanding residents' subjectively experienced well-being associated with their daily activities. Subjective well-being can be understood as the levels of pleasure individuals experienced from their daily activities and measured by activity satisfaction (eg Kahneman et al. 2004, Schwanen andWang 2014). Characterising urban residents' satisfaction trajectories associated with daily activities and understanding the role of geographical contexts where activities are conducted in shaping experiences of activities has great potential for benefiting urban planning and management policies aiming to improve individuals' quality of life. We utilise a unique GPS-integrated activity diary dataset, in which both GPS trajectories of each participant and detailed information on each activity conducted were recorded for a period of seven days, to provide insight into the nexus between activity participation, geographical context and well-being. Similar to the trajectory enrichment process outlined in Grinberger and Shoval (2015), daily trajectories of participants were annotated and segmented by using additional activity diary data. The resulting trajectory segments represent different activities, and drawing on the locational information of trajectory segments, a range of urban form characteristics of the place or context where activities took place are extracted. To deal with the scale issue discussed above, the data is analysed simultaneously at two scales or levels: the individual scale (ie a participant's whole activity sequences) and the activity scale (or trajectory segments).
Three types of structural effects are highlighted specifically for our daily activity satisfaction study. The first relates to the sequential correlations between activities. The satisfaction level of the current activity might be influenced by that of the precedent activities, with the intensity of influence attenuated by time and space. In other words, satisfaction or pleasure experienced from an earlier activity might be carried over to subsequent activities. Second, individual heterogeneity effects on activity satisfaction are expected due to differences between individuals in terms of socio-demographic and biological characteristics. Third, there might be interaction effects on activity satisfaction between family members, as the arrangement of certain activities (eg shopping) is often implemented by members of a household together. The developed Bayesian multi-level temporal autoregressive model allows for an examination of these effects simultaneously. Bayesian Markov chain Monte Carlo (MCMC) algorithms for implementing the methodology are developed and the computer codes are made available for potential users via the Supplementary online Materials of the paper.
The contributions of this paper lie in two aspects. Foremost, we offer a novel statistical tool to model trajectory data (eg attributes of trajectory segments), taking into account sequential correlations within a trajectory and heterogeneities between trajectories simultaneously. It benefits studies that aim to explore and draw statistical inferences on the influences of geographical contexts on human mobility behaviours or outcomes. Second, we contribute to the substantive subjective well-being literature by examining situational variabilities in daily activity satisfaction and providing insights into how geographical context affects individuals' experiences of daily activities. The remainder of this paper is organised as follows. Section 2 describes the proposed multi-level temporal autoregressive model and the estimation strategy. In Section 3, data and variables are described. Section 4 reports and interprets model estimation results. Finally, we conclude with a summary of findings and discussions on potential limitations of the paper as well as future development.

Modelling sequential correlation and individual heterogeneity
We first propose a two-level autoregressive model with independent individual random effects to capture the sequential correlation and individual heterogeneity effects on daily activity satisfaction. Denote y ki as the reported satisfaction level of k-th activity for individual i, and t k as the time when k-th activity takes place. Time is treated as continuous, and thus it is convenient to characterise temporal decay in the correlations between activities k and k 0 via an exponential function (eg Steele 2008, Browne andGoldstein 2010), where t start k and t end k 0 represent the starting time of k-th activity and the ending time of k 0 -th activity. t o is a temporal threshold parameter, which will be discussed later.
Equation (1) is employed to form elements of a temporal weights matrix T specifying how activities are correlated over time, which can be considered as a temporal analogue of a spatial weights matrix. For individual i, T i has the following form, which is a lower-triangular square matrix of order N i where N i is the number of activities conducted by individual i and varies between individuals. In a general setting where the analysis units are trajectory segments that nest into individual trajectories, T i measures the temporal correlation structure among sequential segments. With a row-normalised T i and y i ¼ ½y 1;i ; . . . ; y N i ;i , the impacts on satisfaction of the current activity from preceding activities are measured by T i y i , which can be understood as a continuous time lag operator similar to the spatial lag operator in spatial econometrics models (Anselin 1988, LeSage andPace 2009). A further concern rises when the temporal separation (Δt k;k 0 ) between activities k and k 0 also results in a spatial displacement. This is the case when activities take place in different locations, ie people might need to travel some distances to conduct the next activity. It is not unreasonable to assume that the spatial distance between two activity sites might attenuate the temporal correlation of the two activities. To address this issue, the temporal correlation between activities k and k 0 is further weighted by the associated spatial distance. More specifically, an exponential decay function is used to calculate influences of spatial distances on the temporal correlations of activities: s k;k 0 ¼ expðÀd k;k 0 =d i max Þ where d k;k 0 is the spatial distance between activities k and k 0 and d i max is the maximum distance among activities conducted by individual i. Thus, the modified temporal correlation between activities k and k 0 is c Ã k;k 0 ¼ c k;k 0 Â s k;k 0 , which further forms elements of the updated T Ã i . Similar approach has been employed to deal with correlations between observations from different domains such as space, time, and geographical context (eg Huang et al. 2010, Harris et al. 2013. Denoting S i as the spatial weights matrix based on s k;k 0 , we have T Ã i ¼ T i S i where represents an element-wise matrix multiplication operation (Searle and Khuri 2017). For notational simplicity, T i is used hereafter.
A multi-level temporal autoregressive model with independent individual heterogeneity effects is specified as, In Equation (3), T i k is the k-th row of T i ; x ki denotes activity-level independent variables such as situational activity characteristics (eg duration and companionship); z i denotes individual-level covariates. β and γ are two vectors of regression coefficients to estimate. ρ measures the strength of sequential correlations between activities and is referred to as a temporal autoregressive parameter for simplicity. Individual heterogeneity is modelled via the term u i , which is assumed to follow a Normal distribution Nð0; σ 2 Þ. It captures effects on activity satisfaction from unobserved individual characteristics and levels up (or down) the average satisfaction of individual i for positive (or negative) u i . The independence assumption on u i might be restrictive and will be relaxed later.
Re-writing Equation (3) in a succinct matrix form gives, where T is a block-diagonal matrix of order N ¼ P n i¼1 N i with T i forming each block; i is a column vector of ones; Δ is a random effect matrix of order N Â n, linking individual random effects on activity satisfaction and bridging the two data scales. To illustrate sequential correlation and individual heterogeneity effects on activity satisfaction implied by the model, we consider a hypothetical individual with four sequential activities. Rearranging Equation (4) and replacing xβ þ zγ with a linear predictor η (the sum of effects of observed independent variables) give, Specifying T i as in Equation (2), ðI 4 À ρT i Þ À1 is obtained as, A few features implied by the model are worth mentioning. First, the satisfaction of the current activity is linked to the satisfaction levels of the preceding activities via both linear predictors and individual random effects. The linear predictors of the preceding activities affect the current activity satisfaction in a way that decreases with increasing powers of the temporal autoregressive parameter (ρ) and the time (or space and time) gaps separating them. That said, η i1 and η i2 both directly affect y i3 weighted by ρ respectively, while η i1 also indirectly affects y i3 through its direct impact on y i2 and the direct impact of η i2 on y i3 weighted by ρ 2 . This correlation structure differs from the firstorder autoregressive model of (spatial) panel data where direct impacts of observations separated by two or more temporal units are assumed to be zero (eg Parent and LeSage 2012, Elhorst 2014, Hsiao 2014. We note that direct impacts between activities separated by a large time interval could approach zero, as indicated in Equation (1). Second, Equation (7) also determines the interpretation of regression coefficients. Consider a variable x p and its coefficient β p in Equation (6). The marginal effect of x p on y i is ðI 4 À ρT i Þ À1 I 4 β p . From Equation (7), it is clear that the a one unit increase in x p for each activity will lead to a different total effect (row-wise summing of ðI 4 À ρT i Þ À1 β p ). If a scalar summary on marginal total effects was desirable, one could calculate the average of observation-wise marginal total effects (LeSage and Pace 2009, Elhorst 2010). The diagonal entries of the matrix are all equal to β p , which can be thought of as the marginal direct effect of a one unit change in x p .

Modelling the dependency of individual random effects
In the multi-level modelling literature, discussions on the dependency of higher-level random effects are rather limited (eg Browne and Goldstein 2010, Dong and Harris 2015). An interesting feature of our data is the collection of all adults' GPS trajectories and daily activities for a proportion of the families that participated in the study, which allows for exploring potential interaction or correlation effects between family members. We treat random effects u i and u j of participants i and j as dependent if they are from a family, and independent, otherwise. Dependencies in the individual random effects are specified via an individual-scale n Â n connection matrix W (w ij ¼ 1 if individuals i and j are from a family; w ij ¼ 0, otherwise). Following the convention of the spatial econometrics literature (Anselin 1988, LeSage and Pace 2009), individual random effects u are postulated to be an autoregressive model, The parameter ϕ measures the extent to which individual-level random effects are correlated within a family. The resulting covariance matrix for u is σ 2 ðA 0 AÞ À1 where A ¼ I n À ϕW. Re-arranging Equation (8) and inserting it into Equation (5) lead to our final model,

Model estimation
Bayesian Markov chain Monte Carlo (MCMC) method is employed for model estimation.
Quasi-maximum likelihood estimation (QML) approaches have often been used for spatial and spatio-temporal econometric models (eg Lee andYu 2010, Elhorst 2014). However, the issue of local optima in maximising the concentrated log-likelihood function might seriously impact model parameter estimation, which could be avoided in the MCMC approach by directly sampling from posterior distributions of parameters (LeSage and Pace 2009, Parent and LeSage 2012). Bayesian MCMC approach is based on the joint posterior distribution of all model parameters, which is the product of data likelihood, denoted by f ðYj:Þ, and prior densities for model parameters, denoted by pð:Þ in Equation (10), where δ ¼ ½β; γ and the formed new model design matrixX ¼ ½X; Z. The prior distributions for unknown parameters (δ; ρ; ϕ; σ 2 e ; σ 2 ) are assumed to be independent. More specifically, pðδÞ follows a multivariate Normal distribution with mean M 0 and variance matrix T 0 , NðM 0 ; T 0 Þ. Uniform prior distributions are assigned for the two autoregressive parameters ρ and ϕ over (−1, 1). Inverse gamma (IG) distributions are used for σ 2 e and σ 2 : pðσ 2 e Þ,IGðc 0 ; d 0 Þ and pðσ 2 Þ,IGða 0 ; b 0 Þ. Following Gelman et al. (2014), the density function of a non-normalised IG with the shape parameter a and scale parameter β is pðxÞ / x ÀaÀ1 expðÀβ=xÞ.
The likelihood function for the two models proposed (Equations (5) and (9)) is expressed as, where B ¼ I N À ρT. We note that the determinant of B is equal to one because it is a lower-triangular matrix with diagonal entries of ones. Based on the likelihood function and prior distributions, we can derive the posterior distribution for each model parameter. The posterior distribution for regression coefficients pðδjY; u; ρ; ϕ; σ 2 ; σ 2 e Þ is also a multivariate Normal distribution, NðM δ ; AE δ Þ with The posterior distribution for individual random effects u is a multivariate Normal distribution, NðM u ; AE u Þ with The posterior distributions for σ 2 and σ 2 e are IGða ; b Þ and IGðc e ; d e Þ where and The posterior distribution for the temporal autoregressive parameter (ρ) is a Normal distribution, NðM ρ ; σ 2 ρ Þ with Unlike parameters ðδ; u; σ 2 ; σ 2 e ; ρÞ, the conditional posterior distribution of ϕ is not a standard density function, thus Gibbs samplers cannot be directly applied Gelman et al. (2014). The posterior conditional distribution of ϕ is expressed as, The Metropolis-Hastings (M-H) sampling method could be selected for updating ϕ. However, the M-H algorithm might not be efficient as it requires a large number of MCMC iterations and a careful choice of tuning parameters, especially in the presence of large data sets. We, instead, use an inversion sampling algorithm to update ϕ. The same approach has been widely used in Bayesian spatial econometric models (eg LeSage and Pace 2009, Harris 2015, Dong et al. 2016).
In short, there are two steps involved in this sampling approach. In the first step, the log-posterior density function of ϕ, logf ðϕÞ, is empirically evaluated using updated values of ðδ ðkÞ ; u ðkÞ ; ρ ðkÞ ; ðσ 2 e Þ ðkÞ ; ðσ 2 Þ ðkÞ Þ in the k-th MCMC iteration. logf ðϕÞ is expressed as, logf ðϕÞ ¼ log I n À ϕW j jþ ðu 0ðkÞ u ðkÞ À 2ϕu 0ðkÞ Wu ðkÞ þ ϕ 2 u 0ðkÞ W 0 Wu ðkÞ Þ=2ðσ 2 Þ ðkÞ þ C: where C is a constant. In the second step, we numerically integrate logf ðϕÞ on ϕ over the range of (−1, 1), calculate the empirical cumulative distribution, and update ϕ ðkÞ . Note, the updating of ϕ is not needed for the first type of multilevel temporal autoregressive model with independent individual random effects.
The above MCMC samplers are coded by using the R language and available in the Supplementary online Materials of the paper. Two computational aspects are worth mentioning. First, as ðI N À ρTÞ and ðI n À ϕWÞ are sparse matrices with majority entries of zero, the use of sparse matrix operation routines offered in the R Matrix package (Bates and Maechler, 2017) greatly reduces computational cost. Second, pre-calculating the log-determinant of ðI n À ϕWÞ for ϕ over the feasible range of (−1, 1) before starting the MCMC iterations is also important for saving computing time.

Data and variables
Our main data sources consist of daily GPS trajectories and activity diaries of urban residents living or working in the Shangdi-Qinghe area (Jiedao) of Beijing. As shown in Figure 1, the study area is located adjacent to the fifth ring road of Beijing in the north, about 16 km away from the city centre. Using a stratified random sampling approach, 709 respondents living or working in 23 neighbourhoods located in the study area were selected to participate in a seven-day survey from October to December in 2012 via eight waves (Ta et al. 2016). GPS tracking devices with a positional accuracy of about 15 m were used to record participants' movement every 30 s for seven consecutive days. An interactive survey website consisting of two main components was developed: an activity diary component to collect detailed information on participants' daily activities, and a questionnaire to collect socio-demographic and economic characteristics of participants.
In this study, we work with complete daily trajectories, which is defined by using the criterion that the time gaps between any two sequential location records are all less than or equal to 10 min unless a participant is at home (information available from the diary data). Missing locations in a time period less than 10 min are assumed to be equally spaced between the start and end points of that period for simplicity. Based on a spatial threshold of 50 m and a temporal threshold of 10 min (time to conduct a meaningful activity), each participant's daily movement trajectory was characterised as a sequence of consecutive stop and movement episodes, similar to the concept of syntactic trajectories in Grinberger and Shoval (2015). These trajectory episodes were then displayed on the website when a participant was filling in the activity diary at the end of each day to reduce recall bias (Kahneman et al. 2004). Both trajectory and activity diary data were passed on to a server and stored. As a quality control measure, we checked the completeness of the filled activity diary of each participant on the following day, and if an activity diary was incomplete, a text message would be sent to notify the participant of the issue. The stop episodes of trajectories were then matched with the reported activities and annotated with the attributes of matched activities and the socio-demographics of the corresponding participant. A match is achieved if the discrepancies in the start and end times between a stop episode and a reported activity period are both less than five minutes. In some cases when more than one activities are conducted in the same place (eg most often at home), a stop episode is further divided into a few episodes based on reported activity periods and annotated accordingly. Only matched activity episodes with complete information on key socio-demographic variables are included in the analysis. We further exclude respondents who are full-time students or unemployed as their daily activity arrangement could differ substantively from that of other groups of people (Shen et al. 2013). The final data includes 19,358 activities conducted by 494 participants in two to seven consecutive days (with a mode of five days). About 36% of the participants are dual earners of a family.
Our dependent variable is activity satisfaction, which is measured on a five-point Likert scale from being very dissatisfied (1) to being very satisfied (5). The mean activity satisfaction level is 3.77 with a standard deviation of 1.09 in our data ( Table 1). The independent variables are broadly divided into several categories: situational activity characteristics, geographical context or activity space attributes, and individual-level socio-demographics. Although the impacts on general life satisfaction of individual socio-demographics have been extensively discussed (eg Diener et al. 1999, Kahneman et al. 2004, Ma et al. 2017, few studies have examined the potential link between activity satisfaction and characteristics of real-time geographical contexts where activities take place (Schwanen and Wang 2014).
To address this gap in the subjective well-being literature, the focus of our empirical investigation is on extending understandings of how situational urban environment, measured by fine spatial resolution urban form characteristics, affects activity satisfaction. More specifically, urban form was measured in three dimensions: land-use mix, density, and dominant function (Cervero and Kockelman 1997). These land-use characteristics were measured at the land parcel scale (an average size of about 0.03), which is the finest resolution land-use data publicly available in Beijing. Geographical extents of land parcels were delineated based on road networks by Beijing Institute of City Planning. Land-use mix and function are extracted based on points of interest (POI) data. In short, the function of a land parcel is inferred based on the dominant POI category in that parcel while land-use mix represents the diversity of POI categories, calculated by an entropy measure. Density is measured by the total floor areas in each land parcel. We refer to Liu and Long (2016) for a detailed description of the development of land-use characteristic variables. A standard GIS overlay operation was applied to the annotated trajectory episodes data and the spatial polygon land-use data to extract real-time urban environment characteristics of residents' daily activities. Based on the upper and lower quantiles of land-use density and mix variables, a set of binary variables were generated to represent high, medium and low levels of density and mix. This is useful to explore potential non-linear land-use density and mix impacts on activity satisfaction, and to alleviate potential correlations between the two variables if treated as continuous. Our activity-scale measurement of urban environment takes into account participants' daily mobility and thus tackling the uncertain geographic context problem (Kwan 2012) in the estimates of geographical contextual effects on subjective well-being.
The second set of independent variables is about activity situational characteristics such as type, duration and companionship. Respondents' daily activities were originally coded into 19 specific categories in the web diary. To achieve a concise model specification, daily activities are divided into four broad categories: working or subsistence, maintenance, social and leisure, and others (eg Krizek 2003), as shown in Table 1. For instance, maintenance activities include sleeping, preparing food, eating, grocery shopping, and family obligation activities such as child care and shopping; social and leisure activities include socialising, social networking, exercising, while others include posting, banking, hospital visits and so on. Activity duration is the time that an activity lasts, which is treated as a continuous variable and differs greatly between activities (Table 1). Activity companionship is defined as a binary variablewhether an activity is conducted alone or with companions. To explore potential travel impacts on activity satisfaction, travel time from a preceding activity to the current activity and mode choices for the journey are included in the model.
The final set of independent variables are key individual socio-demographics. The linear and quadratic terms of age are included in the model to capture possible nonlinear age impacts on activity satisfaction. Income was added to the model via a series of dummy variables (Table 1). Migrants (residents without Beijing hukou status) have been shown to be associated with lower global life satisfaction than local residents of Beijing (Dong et al. 2016, Ma et al. 2017. It is interesting to test whether migrants also tend to report lower satisfaction with daily activities. Family structure, represented by the variable child presence, was also incorporated in our model following previous studies (eg Ma et al. 2018). Descriptive statistics of each variable included in our analysis are provided in Table 1. The dependent variable was transformed to a standard Normal distribution while the independent variables including activity duration and travel time were log-transformed.

Model estimation results
As discussed above, the time threshold (or bandwidth) parameter t o is required when forming the temporal weights matrix T i or the spatially adjusted weights matrix T Ã i (a space-time weights matrix for simplicity). Usually it was specified a priori or, in few cases, estimated along with other model parameters. Appealing as it sounds to calibrate t o from data, issues exist including additional computational cost and, more critically, great difficulties in distinguishing the estimation of t o and the temporal autoregressive parameter ρ (eg Banerjee et al. 2014). The study adopts an alternative approach: selecting the value of t o that yields the best model fit from a finite set of possible values. Deviance information criterion (DIC, Spiegelhalter et al. 2002), the common model fit index in Bayesian inference that penalises model complexity, was used for model comparison and selection of t o . Smaller values of DIC indicate better model fits.
Twelve discrete values of t o over a range of ½0:5; 6 were selected. Four models, treating individual random effects either as independent or dependent and with a time weights matrix T i or a space-time weights matrix T Ã i , were estimated using the above MCMC samplers for each value of t o . Statistical inferences were based on two MCMC chains, each of which consisted of 10,000 iterations with a burn-in period of 5,000. Convergence of samplers was checked by visual inspection of trace plots of parameters and the Brooks-Gelman-Rubin scale reduction statistics (Brooks and Gelman 1998). The relationship between DIC and t o is illustrated in Figure 2. A clear result is the superior model fit offered by the model treating individual random effects as dependent (ie family interaction or correlation effects considered) and with a spacetime weights matrix. For this preferred model specification, the optimal t o is 2.5 h. It is also noticeable that optimal values of t o are slightly different between model specifications. Moreover, it appears that models with a space-time weights matrix perform better than do models with a pure time weights matrix. This highlights the importance of taking into account both temporal and spatial distances between daily activities when exploring residents' subjective experiences of activities.
Regression coefficient estimates and the associated 95% credible intervals from the multi-level temporal autoregressive models with dependent individual random effects and a space-time weights matrix are reported in Table 2. As a comparison, estimation results from a multi-level temporal autoregressive models with a time weights matrix are also reported. Before proceeding to interpreting covariate effects on activity satisfaction, we discuss the estimates on structural model parameters. First, positive sequential or temporal correlations between activity satisfactions were found, indicated by the statistical significance of the temporal autoregressive parameter ρ in both models. Putting the magnitude of temporal autocorrelation in perspective, for two activities separated by one hour, about 10.4% (0:256 Â exp À1=2:5 f gÂexp À0:5 f g) of the satisfaction level of an activity would be carried over to the following activity, on average. Second, family correlation effects on activity satisfaction were identified, as indicated by the statistical significance of the autoregressive parameter ϕ at the 95% credible interval. A plausible explanation of the within-family correlations of daily activity satisfactions is the interacting decision process between family members in terms of daily activity arrangement. Third, σ 2 is related to the magnitude of unobservable individual heterogeneity effect on activity satisfaction. In a standard multi-level model, the importance of individual heterogeneity effects can be quantified by the variance partitioning coefficient (σ 2 =ðσ 2 þ σ 2 e Þ, Goldstein 2011). This is not valid any more in our models with sequential correlations and dependent individual random effects, a similar issue found in the multilevel spatial econometric models (Dong and Harris 2015, Ma et al. 2017. To approximate the variance partitioning coefficient, we calculated marginal variances of the posterior residuals at the activity and individual scales. This yields an estimate on the variance partitioning coefficient of 0.44, indicating that about 44% of unexplained variances of activity satisfaction is due to unobservable individual heterogeneity effects. Overall, estimates on model structure parameters demonstrate complex dependencies in residents' daily activity satisfaction, arising from both sequential correlations and individual heterogeneity effects. Turning to the estimates on regression coefficients, we note that they represent the direct effects of independent variables on activity satisfaction. If a scalar summary of the total marginal effect of a variable X k was desired, one could calculate the average of row-sums of the matrix ðI N À ρTÞ À1 I N δ k by plugging in respective parameter estimates. Table 2 shows that most situational activity characteristics and geographical contextual variables are statistically significantly associated with activity satisfaction. In terms of activity type, working is the least enjoyable activity comparing to other three activity types, ceteris paribus, which corroborates previous findings (Kahneman et al. 2004). Activity duration appears to be associated with satisfaction in a non-linear way as the coefficient of the quadratic duration term is statistically significant at the 5% significance level. Activity companionship makes a difference on satisfactionactivities conducted with companions tend to be more enjoyable than those conducted alone, holding other variables constant.
In terms of geographical contextual or activity space characteristics, land-use mix and urban function are statistically significantly associated with activity satisfaction. More specifically, conducting daily activities in places with low land-use mix tends to lower activity satisfaction, ceteris paribus. The difference in experiences of activities conducted in places with high or medium levels of land-use mix is, however, insignificant controlling for other variables. Urban functions of places where residents arrange activities also matter. Activities taking place in green space or residential areas are associated with higher levels of satisfaction comparing to those conducted in places with other urban functions, everything else being equal. Overall, these results suggest that characteristics of geographical contexts where activities take place are associated with residents' subjective well-being of activities. This is in contrast with the findings in Schwanen and Wang (2014) that show no statistically significant relationships between activity space characteristics and satisfaction. A plausible explanation is that the relatively coarse-scale measurement of activity space attributes in Schwanen and Wang (2014) might not be able to characterise sites of activities accurately. In addition, the temporal autocorrelation in subjective well-being and individual heterogeneity effects are not properly captured in their model. Longer time spent on travel to conduct an activity is associated with lower activity satisfaction, holding other variables constant. However, travel mode choices do not appear to be correlated with activity satisfaction. Implications on urban planning would involve the promotion of mixed land uses to reduce travelling efforts in residents' daily lives.
With respect to individual-scale variables, the linear term of age is statistically associated with activity satisfaction while the quadratic term is not. This, to some extent, departs from previous findings that age tends to be correlated with global life satisfaction non-linearly (eg Diener et al. 1999, Ma et al. 2017. The association between income and activity satisfaction was not statistically significant after controlling for the activity-level covariate effects on satisfaction. The lack of associations between the two key individual socio-demographic variables and daily activity satisfaction corroborates the argument that momentary satisfaction or affection relies more on situational circumstances and individual biological characteristics (part of the unobservable individual-scale effects) than life circumstance variables (Kahneman and Deaton 2010). Migrants, residents living in the city but without Beijing household registration (hukou), tend to report lower daily activity satisfaction than local residents. This might reflect the relatively limited space-time accessibility to quality opportunities for migrants due to them being subject to institutional constraints on access to affordable housing and social welfare systems such as education (eg Kwan 1999, Ma et al. 2017).

Conclusions
GPS trajectory or movement data have been increasingly explored to understand human mobility at fine spatio-temporal granularity. Integrating trajectory data with external data sources such as urban built environment and activity diaries to form semantic trajectories has been demonstrated to be of great value to promote knowledge on human activity and environment interactions (Grinberger andShoval 2015, Siła-Nowicka et al. 2016). Nonetheless, dealing with the issues of time and scale for (semantic) GPS trajectory or movement data in a unified statistical model poses great challenges. This paper has proposed a novel multi-level temporal autoregressive model that deals with the two issues. In the model, time was conceptualised as continuous, allowing for a simple and intuitive way (ie a temporal weights matrix) to model sequential or temporal correlations between trajectory segments (daily activities in this study). Space is also introduced into the model, adjusting the temporal weights matrix on the basis of spatial distances between locations of activities. The scale issue is tackled by a simultaneous analysis of GPS trajectory data at different scales via a multi-level modelling approach.
The developed methodology is demonstrated by an empirical examination of residents' daily activity satisfaction in Beijing, with a particular focus on how geographical contexts affect experiences of daily activities. Daily GPS trajectories of residents are annotated by using reported diary data, enabling rich characteristics to be extracted for trajectory episodes (activities in the study) and insights into relationships between activity satisfaction and situational urban environment at fine spatial resolution to be generated. A few key findings and the associated implications are worth highlighting. First, significant temporal or sequential correlations in daily activity satisfaction are found in models with a time or space-time weights matrix. It implies that treating activities or trajectory segments as independent in statistical models will be inappropriate. Individual heterogeneity effects account for a relatively large proportion of the variability in residents' daily activity satisfaction, signifying the importance of individual heterogeneity in subjective wellbeing studies. Second, geographical contextual characteristics of sites where activities take place make a difference in experiences of activitiesactivities conducted in places with the higher land-use mix, lower density, and in green space tend to be associated with higher satisfaction levels. This, however, does not indicates that causal geographical contextual effects on activity satisfaction can be drawn due to the possible selection effect, ie people might intentionally choose sites with desirable characteristics for certain activities to pursue high levels of satisfaction. A pursuit of causal geographical contextual effect identification would require a simultaneous modelling of residents' activity site choices and activity satisfaction (eg Steele 2008, Goldstein 2011, which is left for future research. Third, activity situational characteristics including types, duration and companionship are statistically significantly associated with satisfaction levels. Lastly, most of individual life circumstance variables are not significant correlates of daily activity satisfaction, with the exception of the migrant status. Despite illustrated by examining residents' activity satisfaction, the developed multilevel temporal autoregressive model is suitable to explore other characteristics of activities or trajectory segments. It also supplements the spatio-temporal statistics or econometrics literature (eg Elhorst 2014) by offering a useful tool to deal with data with irregular time intervals. Nevertheless, there are some limitations associated with the methodology. The first concerns the assumption that the temporal autoregressive parameter (ρ) is equal among individuals. This equal constraint is made for practical computational concerns, but it might be violated as it is possible that the strength of sequential correlations in daily activity satisfaction varies between individuals with distinct characteristics such as personalities. The second limitation regards the treatment of activity satisfaction as a continuous outcome variable. Activity satisfaction is, however, measured on a Likert scale, thus being an ordinal variable in nature (eg Dong et al. 2018). Ideally, it would be modelled as an ordinal response variable. These two extensions to the developed methodology are our next step of research. Thirdly, socio-demographic characteristics of activity space were not directly captured in our empirical satisfaction models due to the lack of data at fineresolution spatial scale in the study area. However, as suggested in the spatial econometrics literature (LeSage and Pace 2009), the inclusion of a lagged dependent variable in our equation might capture the effect of these variables. Lastly, the formalisation of spatiotemporal relationships is based on distances between activities, but it would be extended to also consider movement speed in our future development such that the potential effect of travel congestion can be captured.
Yiming Wang is a senior lecturer at the School for Policy Studies, University of Bristol. His research involves analysing economic and social policies through the application of both quantitative and qualitative methods.