Skewness and Staging: Does the Floor Effect Induce Bias in Multilevel AR(1) Models?

Abstract

Multilevel autoregressive models are popular choices for the analysis of intensive longitudinal data in psychology. Empirical studies have found a positive correlation between the autoregressive parameters of affective time series and between-person measures of psychopathology, a phenomenon known as the staging effect. However, it has been argued that such findings may represent a statistical artifact: Although common models assume normal error distributions, empirical data (for instance, measurements of negative affect among healthy individuals) often exhibit the floor effect, that is, response distributions with high skewness, low mean, and low variability. In this paper, we investigated whether, and to what extent, the floor effect leads to erroneous conclusions by means of a simulation study. We describe three dynamic models which have meaningful substantive interpretations and can produce floor-effect data. We simulate multilevel data from these models, varying skewness independently of individuals' autoregressive parameters, while also varying the number of time points and cases. Analyzing these data with the standard multilevel AR(1) model, we found that positive bias only occurs when modeling with random residual variance, whereas modeling with fixed residual variance leads to negative bias. We discuss the implications of our study for data collection and modeling choices.

and then iteratively repeat the above step of substituting the previous value into the equation, for X_{T−2}, X_{T−3}, . . ., X_1, which results in

X_T = c(1 + φ + φ² + · · · + φ^{T−2}) + φ^{T−1} X_1 + Σ_{i=0}^{T−2} φ^i ε_{T−i},   (S2)

in which the first element is the sum of a geometric sequence with starting term c and common ratio φ. To get the mean of the time series, we calculate the expected value of X_T; given that the Gaussian innovation term has a mean of zero (E[ε_t] = 0), we have

E[X_T] = c (1 − φ^{T−1})/(1 − φ) + φ^{T−1} E[X_1].   (S3)

For an infinitely long sequence, given the limited range of φ (|φ| < 1), the marginal expectation of the AR(1) process becomes its marginal mean, also referred to as its stationary mean, that is,

μ = lim_{T→∞} E[X_T] = c/(1 − φ),

which is finite and time-invariant (proving mean stationarity), and is equal to the expression in Equation 3.
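As a numerical sanity check of the stationary mean c/(1 − φ), the AR(1) recursion can be simulated directly. The following sketch uses NumPy; all parameter values are arbitrary illustrative choices, not values from the paper.

```python
import numpy as np

# Illustrative check (not from the paper): simulate one AR(1) series
# X_t = c + phi * X_{t-1} + eps_t and compare its sample mean with the
# stationary mean c / (1 - phi). All parameter values are arbitrary.
rng = np.random.default_rng(1)
c, phi, sigma_eps, T = 0.5, 0.6, 1.0, 200_000

eps = rng.normal(0.0, sigma_eps, T)
x = np.empty(T)
x[0] = c / (1 - phi)              # start at the stationary mean
for t in range(1, T):
    x[t] = c + phi * x[t - 1] + eps[t]

stationary_mean = c / (1 - phi)   # the asymptotic limit derived above
sample_mean = x.mean()
```

With a long series, the sample mean settles close to c/(1 − φ) regardless of the starting value.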

Marginal distribution
To derive the marginal distribution of the AR(1) process, similar to Equation 6, we should first center it around its marginal expectation (treating the first observation X_1 as fixed):

X_T − E[X_T] = Σ_{i=0}^{T−2} φ^i ε_{T−i}.

Furthermore, as we know, if two independent Gaussian random variables Y_a and Y_b are normally distributed with mean zero and common variance σ²_Y (i.e., Y_a, Y_b ∼ N(0, σ²_Y)), the distribution of their weighted sum Y_sum = aY_a + bY_b would again be Gaussian and is characterized by

Y_sum ∼ N(0, (a² + b²) σ²_Y).

By generalizing this for X_T, X_{T−1}, . . ., X_1, and following the same logic as of Equation S2, we will have

X_T − E[X_T] ∼ N(0, σ²_ε Σ_{i=0}^{T−2} φ^{2i}).

By adding the expected value of X_T (Equation S3) to X_T, the marginal distribution will be

X_T ∼ N(E[X_T], σ²_ε (1 − φ^{2(T−1)})/(1 − φ²)).

Finally, if the sequence is infinitely long (T → ∞), given that |φ| < 1, the stationary distribution of X_t becomes

X_t ∼ N(c/(1 − φ), σ²_ε/(1 − φ²)),

which is equal to the expression in Equation 4, and entails that the variance is σ² = σ²_ε/(1 − φ²) and the skewness is zero (γ = 0). Given that the variance of the AR(1) model converges to a finite, time-independent value, we may again conclude stationarity of variance.
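The stationary variance σ²_ε/(1 − φ²) and the zero marginal skewness can likewise be checked by simulation. This is an illustrative sketch with arbitrary parameter values, using NumPy.

```python
import numpy as np

# Illustrative check: the stationary AR(1) variance should equal
# sigma_eps**2 / (1 - phi**2) and the marginal skewness should be zero.
# Parameter values are arbitrary.
rng = np.random.default_rng(2)
phi, sigma_eps, T = 0.6, 1.0, 1_000_000

eps = rng.normal(0.0, sigma_eps, T)
x = np.empty(T)
x[0] = 0.0
for t in range(1, T):
    x[t] = phi * x[t - 1] + eps[t]

theo_var = sigma_eps**2 / (1 - phi**2)
xc = x - x.mean()
sample_var = np.mean(xc**2)
sample_skew = np.mean(xc**3) / sample_var**1.5
```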

Autocorrelation function
To derive the ACF of the AR(1) model, we first calculate its autocovariance function, and then transform it into the ACF by dividing it by the variance (gunes, 2019). We calculate the ACF asymptotically (for infinitely long time series) and make use of the stationarity assumption of the AR(1) model. Let ζ(l) be the lag-l autocovariance of X_t. For l = 1, this entails

ζ(1) = Cov(X_t, X_{t−1}) = Cov(c + φX_{t−1} + ε_t, X_{t−1}) = φ Cov(X_{t−1}, X_{t−1}) = φ ζ(0),

in which Cov(c, X_{t−1}) = 0 because the covariance between a random variable and a constant is always zero, and Cov(ε_t, X_{t−1}) = 0 because ε_t, by definition, is independent of X_{t−1}. Similarly, for l = 2 we have

ζ(2) = Cov(X_t, X_{t−2}) = φ Cov(X_{t−1}, X_{t−2}) = φ ζ(1) = φ² ζ(0).

By iteratively repeating the steps taken in Equations S10 and S11, we reach the general formula

ζ(l) = φ^l ζ(0).

The autocorrelation function of the AR(1) model can thus be calculated by dividing ζ(l) by the variance of X_t (i.e., ζ(0)), which yields ρ(l) = φ^l for l ≥ 0 (Equation 5), showing an exponential decay.
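The exponential decay ρ(l) = φ^l can be verified against the empirical ACF of a simulated series. A minimal sketch, with arbitrary parameter values, using NumPy:

```python
import numpy as np

# Illustrative check: the empirical ACF of an AR(1) series should decay
# as rho(l) = phi**l. Parameter values are arbitrary.
rng = np.random.default_rng(3)
phi, T = 0.7, 400_000

eps = rng.normal(size=T)
x = np.empty(T)
x[0] = 0.0
for t in range(1, T):
    x[t] = phi * x[t - 1] + eps[t]

xc = x - x.mean()
var = xc @ xc / T

def sample_acf(l):
    # lag-l sample autocorrelation
    return (xc[l:] @ xc[:-l]) / T / var

acf_errors = [abs(sample_acf(l) - phi**l) for l in (1, 2, 3)]
```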

The χ 2 AR(1) model
The χ²AR(1) process is a special case of the AR(1) process with gamma-distributed residuals, as a χ² distribution with ν degrees of freedom is equivalent to a gamma distribution with shape parameter α = ν/2 and scale parameter λ = 2, that is, χ²(ν) ∼ Γ(ν/2, 2). Thus, here we prove the properties for the general case where a_t ∼ Γ(α, λ) and include an intercept in the model. We call such a process the ΓAR(1) model, which has the following form:

X_t = c + φ X_{t−1} + a_t,   a_t ∼ Γ(α, λ).

Including the intercept c makes the physical interpretation of the model harder, and it has been ignored in the literature because it does not have practical applications. However, we keep it in the model to derive more general formulae and make the model more comparable to the AR(1) model. As we will see, c only appears in the marginal mean of the ΓAR(1) model and only changes the location of the marginal distribution. In this section, instead of deriving the expressions for a finite time series and then calculating their asymptotic limits, we make use of the stationarity assumption for |φ| < 1 (importantly, that the moments of the time series do not change over time (Tufto, 2021), thus they asymptotically converge to specific values), which may be proven by taking the same steps as in the previous section on the asymptotic properties of the AR(1) model.
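The ΓAR(1) recursion is straightforward to simulate, which also provides a check on the stationary mean (c + αλ)/(1 − φ) derived in the next subsection. This sketch uses NumPy's gamma sampler with shape α and scale λ; all parameter values are arbitrary.

```python
import numpy as np

# Illustrative simulation of the GammaAR(1) process
# X_t = c + phi * X_{t-1} + a_t with a_t ~ Gamma(alpha, lam)
# (shape alpha, scale lam). With c = 0, alpha = nu/2, lam = 2 this is
# the chi-square AR(1) model. Parameter values are arbitrary.
rng = np.random.default_rng(4)
c, phi, alpha, lam, T = 0.0, 0.5, 1.5, 2.0, 300_000   # nu = 3

a = rng.gamma(alpha, lam, T)
x = np.empty(T)
x[0] = (c + alpha * lam) / (1 - phi)   # start at the stationary mean
for t in range(1, T):
    x[t] = c + phi * x[t - 1] + a[t]

stationary_mean = (c + alpha * lam) / (1 - phi)
sample_mean = x.mean()
```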

Marginal properties
To derive the marginal properties of a ΓAR(1) model, we write X_t recursively based on its previous values X_{t−1}, X_{t−2}, . . ., X_1, following the steps taken in Equation S2. However, given that the weighted sum of gamma-distributed random variables does not have a closed-form distribution (Di Salvo, 2008), it is not possible to derive an analytical expression for the marginal distribution of the ΓAR(1) model (Tiku et al., 1999). Thus, we only derive the relevant moments of the process, namely, the mean, variance, and skewness.

Marginal mean
We calculate the stationary mean of the ΓAR(1) process using the Law of Total Expectation, which states that the expected value of a random variable A is equal to the expected value of its conditional expectation given another random variable B, that is,

E[A] = E[ E[A | B] ],

and make use of Equation S26:

E[X_t] = E[ E[X_t | X_{t−1}] ] = E[c + φX_{t−1} + αλ] = c + φ E[X_{t−1}] + αλ.

Using the stationarity assumption (E[X_t] = E[X_{t−1}] = μ) and solving for μ yields

μ = (c + αλ)/(1 − φ).   (S15)

Substituting c = 0, α = ν/2, and λ = 2 in Equation S15 yields the marginal mean of the χ²AR(1) model of Equation 10.

Marginal variance
In a similar manner, to calculate the marginal variance, we make use of the Law of Total Variance, which determines the variance of a random variable A based on another random variable B:

Var(A) = Var(E[A | B]) + E[Var(A | B)].

Applied to X_t and X_{t−1}, this gives

Var(X_t) = Var(E[X_t | X_{t−1}]) (I) + E[Var(X_t | X_{t−1})] (II),   (S16)

in which I involves the conditional expectation of X_t, which is equal to c + φX_{t−1} + αλ, and II involves the conditional variance of the process, which is equal to the variance of a_t, because

Var(X_t | X_{t−1}) = Var(c + φX_{t−1} + a_t | X_{t−1}) = Var(a_t) = αλ².

Thus, Equation S16 simplifies to

Var(X_t) = φ² Var(X_{t−1}) + αλ²,

which, under the stationarity assumption, yields

σ² = αλ²/(1 − φ²).   (S19)

Substituting α = ν/2 and λ = 2 in Equation S19 yields the marginal variance of the χ²AR(1) model of Equation 11.

Marginal skewness
Following Tufto (2021), we make use of the Law of Total Cumulance (Brillinger, 1969) to calculate the stationary third cumulant (κ₃) of X_t and make use of the results from earlier. Because the conditional variance of X_t is constant, the covariance term of the law vanishes and two terms remain:

κ₃(X_t) = E[µ₃(X_t | X_{t−1})] (I) + µ₃(E[X_t | X_{t−1}]) (II).

Following similar steps as in Equation S17, we may show that X_t | X_{t−1} and a_t have the same third central moment (µ₃(X_t | X_{t−1}) = µ₃(a_t)). Thus, to calculate I, we may make use of the cumulant generating function of a_t. Specifically, given that the moment generating function of the gamma distribution is M_a(s) = (1 − λs)^{−α}, the cumulant generating function is K_a(s) = log M_a(s) = −α log(1 − λs). Thus we can calculate µ₃(X_t | X_{t−1}) by taking the third derivative of K_a(s) at zero:

µ₃(a_t) = K_a'''(0) = 2αλ³.

By substituting the conditional expectation of X_t in II and using the stationarity assumption, we have

κ₃ = 2αλ³ + φ³ κ₃,

which, solving for κ₃, yields

κ₃ = 2αλ³/(1 − φ³).

Finally, the skewness of X_t can be calculated by dividing κ₃ by σ³, which yields

γ = 2(1 − φ²)^{3/2} / ( √α (1 − φ³) ).   (S25)

By substituting α = ν/2 in Equation S25, we get the marginal skewness of the χ²AR(1) model of Equation 12.
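Both the stationary variance αλ²/(1 − φ²) and the skewness 2(1 − φ²)^{3/2}/(√α (1 − φ³)) can be checked numerically. A sketch with arbitrary parameter values, using NumPy:

```python
import numpy as np

# Illustrative check of the stationary GammaAR(1) variance and skewness:
# sigma2 = alpha * lam**2 / (1 - phi**2) and
# gamma = 2 * (1 - phi**2)**1.5 / (sqrt(alpha) * (1 - phi**3)).
# Parameter values are arbitrary.
rng = np.random.default_rng(5)
phi, alpha, lam, T = 0.5, 1.5, 2.0, 2_000_000

a = rng.gamma(alpha, lam, T)
x = np.empty(T)
x[0] = alpha * lam / (1 - phi)
for t in range(1, T):
    x[t] = phi * x[t - 1] + a[t]

theo_var = alpha * lam**2 / (1 - phi**2)
theo_skew = 2 * (1 - phi**2) ** 1.5 / (np.sqrt(alpha) * (1 - phi**3))

xc = x - x.mean()
sample_var = np.mean(xc**2)
sample_skew = np.mean(xc**3) / sample_var**1.5
```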

Conditional expectation
The conditional expectation of the ΓAR(1) model, which is its deterministic part given the previous observation, may be easily derived via

E[X_t | X_{t−1}] = c + φ X_{t−1} + E[a_t] = c + φ X_{t−1} + αλ,

which, for c = 0, α = ν/2, and λ = 2, is equal to the conditional expectation of the χ²AR(1) model in Equation 9.

Autocorrelation function
To calculate the ACF of the ΓAR(1) model, like that of the AR(1) model, we should derive its lag-l autocovariance function ζ(l), for which we start with ζ(1):

ζ(1) = Cov(X_t, X_{t−1}) = Cov(c + φX_{t−1} + a_t, X_{t−1}) = φ Cov(X_{t−1}, X_{t−1}) = φ ζ(0).

As we can see, the lag-1 autocovariance of the ΓAR(1) model is identical to that of the AR(1) model and independent of the residual distribution. Thus, following the same steps taken for the AR(1) process, the ACF of the ΓAR(1) model is ρ(l) = φ^l for l ≥ 0.

Marginal properties
It can be shown that if the first observation of a BinAR(1) process is taken from a binomial distribution with success probability θ and size k (i.e., X_1 ∼ Binom(θ, k)), later observations of the process follow the same distribution (i.e., X_t ∼ Binom(θ, k) for t = 2, 3, . . .) (see, e.g., Al-Osh & Alzaid, 1991). One should, however, show how θ is related to the survival and revival probabilities α and β. To do so, McKenzie (1985) constructs the BinAR(1) model starting with its marginal distribution, which is considered to be X_t ∼ Binom(θ, k), and argues that for any survival probability 0 ≤ α ≤ 1, a BinAR(1) model of a form similar to Equation 13 exists if we define β = (1 − α)θ/(1 − θ), which results in an autocorrelation parameter of φ = α − β. To ensure that 0 ≤ β ≤ 1 (such that it may be considered a probability), there must be a constraint on the values of θ given φ (Weiß & Kim, 2013), which is

max(0, −φ/(1 − φ)) ≤ θ ≤ min(1, 1/(1 − φ)).

Given this constraint, we may plot the admissible area of the BinAR(1) model in the φ-θ parameter plane, which is shown in Figure S3. In this figure, the α and β corresponding to any permissible pair of φ and θ are plotted with contour lines, and the plot is colored based on the marginal skewness (with conventional thresholds of |γ| ≤ 0.5, 0.5 < |γ| ≤ 1, and |γ| > 1) resulting from the θ parameter. As we can see, α and β can freely take any values from 0 to 1, and as long as the autoregressive parameter is non-negative (φ ≥ 0), θ can also take any values from 0 to 1. Given that the BinAR(1) model has a binomial marginal distribution X_t ∼ Binom(θ, k) wherein θ = β/(1 − (α − β)), its mean, variance, and skewness are easily computed using Equations 16-18.
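The relation θ = β/(1 − (α − β)) can be checked by simulating the BinAR(1) process as a sum of binomial "survivor" and "revival" draws. This is a sketch with arbitrary parameter values, using NumPy; the survivor/revival construction follows the model description in the main text.

```python
import numpy as np

# Illustrative simulation of the BinAR(1) model, X_t = S_t + R_t with
# S_t ~ Binom(X_{t-1}, alpha) ("survivors") and
# R_t ~ Binom(k - X_{t-1}, beta) ("revivals"); the marginal mean should
# be k * theta with theta = beta / (1 - (alpha - beta)).
# Parameter values are arbitrary.
rng = np.random.default_rng(6)
k, alpha, beta, T = 7, 0.6, 0.1, 200_000
theta = beta / (1 - (alpha - beta))

x = np.empty(T, dtype=int)
x[0] = rng.binomial(k, theta)
for t in range(1, T):
    s = rng.binomial(x[t - 1], alpha)      # survivors
    r = rng.binomial(k - x[t - 1], beta)   # revivals
    x[t] = s + r

theo_mean = k * theta
sample_mean = x.mean()
```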

Equivalent Markov model
As shown by Al-Osh and Alzaid (1991) and Weiß (2009), the BinAR(1) model can also be expressed as a special, parsimonious case of a first-order Markov model: The 0-k integer values of the BinAR(1) process can be thought of as k + 1 distinct states, such that the probability of being in a specific state at any given time depends solely on the state the person was in at the previous occasion and a 1-step-ahead transition probability. Particularly, the transition probability of going from X_{t−1} = u to X_t = v (for u, v = 0, . . ., k) in the BinAR(1) model with parameters α and β can be expressed as the convolution of the survivor and revival distributions,

p_{v|u} = Σ_{m=0}^{u} [C(u, m) α^m (1 − α)^{u−m}] [C(k − u, v − m) β^{v−m} (1 − β)^{(k−u)−(v−m)}],

where terms with v − m < 0 or v − m > k − u are zero. To provide an impression of what the matrix of transition probabilities looks like for BinAR(1) models, we include the 2 × 2 transition matrix T_1 for a model with k = 1, and the 3 × 3 matrix T_2 for a model with k = 2 (rows indexing the current state u, columns the next state v):

T_1 = [ 1 − β    β ]
      [ 1 − α    α ]

T_2 = [ (1−β)²        2β(1−β)           β² ]
      [ (1−α)(1−β)    α(1−β) + (1−α)β   αβ ]
      [ (1−α)²        2α(1−α)           α² ]

These show that the elements of the matrix of transition probabilities are functions of only two parameters, namely, α and β.
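The transition matrix can be built numerically from the survivor/revival convolution, and its stationary distribution checked against Binom(θ, k). A sketch with arbitrary parameter values; rows index the current state u, columns the next state v (an assumed convention for this illustration).

```python
import numpy as np
from math import comb

# Illustrative construction of the BinAR(1) transition matrix for k = 2.
# p(v|u) is the convolution of Binom(u, alpha) survivors and
# Binom(k - u, beta) revivals. Rows index the current state u, columns
# the next state v. Parameter values are arbitrary.
k, alpha, beta = 2, 0.6, 0.1

def binom_pmf(x, n, p):
    if x < 0 or x > n:
        return 0.0
    return comb(n, x) * p**x * (1 - p) ** (n - x)

T_mat = np.zeros((k + 1, k + 1))
for u in range(k + 1):
    for v in range(k + 1):
        T_mat[u, v] = sum(
            binom_pmf(m, u, alpha) * binom_pmf(v - m, k - u, beta)
            for m in range(u + 1)
        )

row_sums = T_mat.sum(axis=1)           # each row should sum to 1

# The stationary distribution of the chain is Binom(theta, k).
theta = beta / (1 - (alpha - beta))
pi = np.array([binom_pmf(v, k, theta) for v in range(k + 1)])
pi_next = pi @ T_mat                   # should reproduce pi exactly
```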

Autocorrelation function
Following the derivations of the previous models, let ζ(l) denote the lag-l autocovariance of X_t, thus ζ(1) = Cov(X_t, X_{t−1}) and ζ(0) = Cov(X_t, X_t) = Var(X_t). We start by calculating the lag-1 autocovariance of X_t, that is,

ζ(1) = Cov(X_t, X_{t−1}) = Cov(S_t + R_t, X_{t−1}) = Cov(S_t, X_{t−1}) + Cov(R_t, X_{t−1}).

To calculate the first term, we use the definition of covariance and use the Law of Total Expectation (Ziddletwix, 2021), which yields

Cov(S_t, X_{t−1}) = E[S_t X_{t−1}] − E[S_t] E[X_{t−1}] = α E[X_{t−1}²] − α E[X_{t−1}]² = α Var(X_{t−1}) = α ζ(0).

Note that given S_t has a binomial distribution with size X_{t−1} and probability of success α, its expectation given X_{t−1} is equal to the mean of a binomially distributed random variable, which is X_{t−1} α. To calculate the second term, we use the Law of Total Expectation for R_t, whose conditional expectation given X_{t−1} is (k − X_{t−1})β, which entails

Cov(R_t, X_{t−1}) = −β Var(X_{t−1}) = −β ζ(0).

Putting Equations S32-S34 together, we have

ζ(1) = (α − β) ζ(0) = φ ζ(0).

The PoDAR(1) model
As explained in the main text, the properties of the DAR(1) model are independent of its marginal distribution, whether bounded (like the binomial and beta-binomial distributions) or unbounded (like the Poisson, geometric, or negative binomial distributions), as long as it is discrete-valued. Thus, in this section, we discuss the properties of this model in its general form (with an arbitrary marginal distribution X_t ∼ Π). However, when presenting its equivalent Markov model, we also introduce an upper-bounded version of the PoDAR(1) model with a right-truncated Poisson marginal distribution, which is more suitable for scales with an upper bound and whose Markov model can be expressed as a transition matrix.

Marginal properties
The DAR(1) model is characterized by τ and Π, which are the persistence probability and the distribution of Z_t. The main marginal property of the DAR(1) model is that it has the same marginal distribution as Z_t, that is, X_t ∼ Z_t ∼ Π, independent of τ. To show this, let us assume that the first observation of the time series follows the distribution of Z_t, that is, X_1 ∼ Π. Given the formulation of Equation 20, we may see that X_2, a random variable, is a mixture of two other random variables, X_1 and Z_2:

X_2 = P_2 X_1 + (1 − P_2) Z_2.   (S37)

We know that if A and B are two random variables with probability mass functions f_A(i) and f_B(i), a mixture C of A and B with weights w and 1 − w has the probability mass function

f_C(i) = w f_A(i) + (1 − w) f_B(i).

If A and B are identically distributed with f_A = f_B = f, then f_C = f, which entails that C has the same distribution as A (C ∼ A).
Applying this to Equation S37, given that X_1 ∼ Z_2 ∼ Π and the weights (P_2 and 1 − P_2) always add up to 1 regardless of the value of τ, we may conclude that X_2 ∼ Π. By repeating this for the other measurement occasions t = 3, 4, . . ., we can show that X_t ∼ Π. Given that in the PoDAR(1) model Z_t ∼ Poisson(λ), its marginal mean, variance, and skewness are readily calculated from the formulas of the Poisson distribution (Equation 23).
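The marginal-distribution argument can be checked by simulating a PoDAR(1) series and comparing its sample mean and variance with the Poisson values. A sketch with arbitrary parameter values, using NumPy:

```python
import numpy as np

# Illustrative simulation of the PoDAR(1) model,
# X_t = P_t * X_{t-1} + (1 - P_t) * Z_t with P_t ~ Bernoulli(tau) and
# Z_t ~ Poisson(lam); the marginal mean and variance should both equal
# lam, as for a Poisson distribution. Parameter values are arbitrary.
rng = np.random.default_rng(8)
tau, lam, T = 0.6, 2.0, 300_000

x = np.empty(T, dtype=int)
x[0] = rng.poisson(lam)
for t in range(1, T):
    if rng.random() < tau:      # no disruption: keep the previous value
        x[t] = x[t - 1]
    else:                       # disruption: redraw from Poisson(lam)
        x[t] = rng.poisson(lam)

sample_mean = x.mean()
sample_var = x.var()
```

Note that the persistence probability τ affects how long values are held, but not the marginal distribution.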

Equivalent Markov model
It can be easily shown that the DAR(1) model of Equation 19 is a special case of a first-order Markov process by enumerating the different ways the person may have a value of v at measurement occasion t (X_t = v) (Jacobs & Lewis, 1978). Assuming that at the previous occasion the person had a value of u (X_{t−1} = u), the person may have the same value of distress at occasion t (v = u) in two ways: Either (a) no disruption takes place between t − 1 and t (P_t = 1, with probability τ), thus the person keeps the same level of distress (X_t = X_{t−1}); or (b) a disruption happens (P_t = 0, with probability 1 − τ), but with a probability of π_u the person ends up experiencing the same level of distress on occasion t (X_t = Z_t = u). Because P_t is independent of X_{t−1} and Z_t, the total probability of keeping the same level of distress is τ + (1 − τ)π_u. Conversely, the only way that a person can have a different level of distress on occasion t (v ≠ u) is that a disruption happens (P_t = 0, with probability 1 − τ) and they experience another level of distress, with probability π_v. Given the independencies, this probability will be (1 − τ)π_v. Consequently, the transition probability of going from u at t − 1 to v at t, denoted by p_{v|u}, can be expressed as

p_{v|u} = τ δ_{uv} + (1 − τ) π_v,   (S39)

where δ_{uv} = 1 if u = v and 0 otherwise. In case Π is not upper-bounded, as in the PoDAR(1) model, the process will be a countable-state infinite Markov process (with a countably infinite number of states v = 0, 1, . . .), which is hard to express by a transition matrix. However, one may impose an upper bound of k on the process, for instance, on the grounds of substantive or measurement considerations. In that case, another marginal distribution can be used for Π, and the resulting DAR(1) model may be characterized by an ordinary lag-1 Markov transition matrix T, which can be written using Ω (a k × k matrix whose columns are Π) and I (the identity matrix of size k) as follows (McKenzie, 2003):

T = (1 − τ) Ω + τ I.

By applying Bayes' rule to Equation S39, one can easily show that the marginal distribution of the DAR(1) model is equal to Π (and is independent of τ).
As an example, and following the rationale of using the Poisson distribution for stress units reaching the person at a constant rate of λ, we may use the right-truncated Poisson distribution (Suaiee, 2013) as an upper-bounded marginal distribution for Π, which has the following probability mass function:

π_x = ψ_k λ^x / x!,   x = 0, 1, . . ., k,   with ψ_k = ( Σ_{i=0}^{k} λ^i / i! )^{−1}.   (S40)

There are closed-form expressions for the moments of the right-truncated Poisson distribution; however, they are quite complex (see, e.g., Suaiee, 2013, pp. 24-25). If k is much larger than λ, that is, if the upper limit of the scale is chosen conservatively high, the term ψ_k in Equation S40 can be approximated, using the power series expansion of e^λ, by e^{−λ}, thus

π_x ≈ e^{−λ} λ^x / x!,

which is similar to the probability mass function of the non-truncated Poisson distribution (Equation 22). As a result, when k ≫ λ, we may conveniently approximate the mean, variance, and skewness of the right-truncated PoDAR(1) model with

μ ≈ λ,   σ² ≈ λ,   γ ≈ 1/√λ.

To provide an impression of what the matrix of transition probabilities resulting from Equation S39 looks like, consider the transition matrix of a right-truncated PoDAR(1) model with k = 7, which is fully characterized by λ and τ.
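Such a transition matrix can be assembled directly from T = (1 − τ)Ω + τI. A sketch with arbitrary parameter values, using NumPy; following the column convention above, columns index the current state, so the matrix is column-stochastic, and states run from 0 to k.

```python
import numpy as np
from math import factorial

# Illustrative construction of the transition matrix of a right-truncated
# PoDAR(1) model, T = (1 - tau) * Omega + tau * I, where every column of
# Omega is the truncated Poisson pmf Pi (so T is column-stochastic).
# Parameter values are arbitrary.
k, lam, tau = 7, 2.0, 0.5

weights = np.array([lam**v / factorial(v) for v in range(k + 1)])
pi = weights / weights.sum()               # right-truncated Poisson pmf

Omega = np.tile(pi[:, None], (1, k + 1))   # each column equals pi
T_mat = (1 - tau) * Omega + tau * np.eye(k + 1)

col_sums = T_mat.sum(axis=0)               # each column should sum to 1
pi_next = T_mat @ pi                       # stationarity: T @ pi == pi
```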

Autocorrelation function
To calculate the ACF of the DAR(1) model, like those of the other models, we first derive the lag-l autocovariance function ζ(l). Using Equation 20, we start with ζ(1) and make use of the independencies between P_t, Z_t, and X_{t−1}:

ζ(1) = Cov(X_t, X_{t−1}) = Cov(P_t X_{t−1} + (1 − P_t)Z_t, X_{t−1}) = Cov(P_t X_{t−1}, X_{t−1}) + Cov((1 − P_t)Z_t, X_{t−1}).   (S44)

Because P_t is independent of X_{t−1}, the first term equals E[P_t] Var(X_{t−1}) = τ ζ(0), and because Z_t and P_t are both independent of X_{t−1}, the second term is zero. Thus

ζ(1) = τ ζ(0).

This shows that the lag-1 autocovariance of the DAR(1) model is independent of the marginal distribution Π and is identical to that of an AR(1) model with φ = τ. As for the other DGMs, we may follow the same steps taken for the AR(1) model to reach the ACF of the DAR(1) model, which is ρ(l) = τ^l for l ≥ 0.
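The decay ρ(l) = τ^l can be confirmed on a simulated PoDAR(1) series. A sketch with arbitrary parameter values, using NumPy:

```python
import numpy as np

# Illustrative check: the empirical ACF of a DAR(1) series should decay
# as rho(l) = tau**l regardless of the marginal distribution (Poisson
# here). Parameter values are arbitrary.
rng = np.random.default_rng(10)
tau, lam, T = 0.5, 3.0, 400_000

x = np.empty(T)
x[0] = rng.poisson(lam)
for t in range(1, T):
    x[t] = x[t - 1] if rng.random() < tau else rng.poisson(lam)

xc = x - x.mean()
var = xc @ xc / T
acf_errors = [abs((xc[l:] @ xc[:-l]) / T / var - tau**l) for l in (1, 2, 3)]
```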

Figure S3. The admissible ranges of the BinAR(1) model parameters with k = 7. The contour lines show values of α and β corresponding to any given admissible θ and φ pairs. The area is shaded based on the marginal skewness implied by θ, with conventional thresholds of |γ| ≤ 0.5, 0.5 < |γ| ≤ 1, and |γ| > 1, respectively, for negligible, moderate, and high skewness.