Approximate leave-future-out cross-validation for Bayesian time series models

One of the common goals of time series analysis is to use the observed series to inform predictions for future observations. In the absence of any actual new data to predict, cross-validation can be used to estimate a model's future predictive accuracy, for instance, for the purpose of model comparison or selection. As exact cross-validation for Bayesian models is often computationally expensive, approximate cross-validation methods have been developed; most notably methods for leave-one-out cross-validation (LOO-CV). If the actual prediction task is to predict the future given the past, LOO-CV provides an overly optimistic estimate as the information from future observations is available to influence predictions of the past. To tackle the prediction task properly and account for the time series structure, we can use leave-future-out cross-validation (LFO-CV). Like exact LOO-CV, exact LFO-CV requires refitting the model many times to different subsets of the data. Using Pareto smoothed importance sampling, we propose a method for approximating exact LFO-CV that drastically reduces the computational costs while also providing informative diagnostics about the quality of the approximation.


Introduction
A time series is a set of observations, each one being recorded at a specific time (Brockwell et al., 2002). In statistics, a wide range of time series models has been developed, which find application in nearly all empirical sciences (e.g., see Brockwell et al., 2002; Hamilton, 1994). One common goal of a time series analysis is to use the observed series to inform predictions for future observations. When working with discrete time, in which the time points form a discrete set, we will refer to the task of predicting a sequence of M future observations as M-step-ahead prediction (M-SAP). Once we have fit a Bayesian model and can sample from the posterior predictive distribution, it is straightforward to generate predictions as far into the future as we want. It is also straightforward to evaluate the M-SAP performance of a time series model by comparing the predictions to the observed sequence of M future data points once they become available.
It is common that we would like to estimate the future predictive performance before we can collect the future observations. If we have many competing models we may also need to first decide which of the models (or which combination of the models) we should rely on for predictions (Geisser and Eddy, 1979;Hoeting et al., 1999;Vehtari and Lampinen, 2002;Ando and Tsay, 2010;Vehtari and Ojanen, 2012). In the absence of new data with which to evaluate predictive performance, one general approach for evaluating a model's predictive accuracy is cross-validation. When doing cross-validation, the data is split into two subsets. We fit the statistical model based on the first subset and then evaluate its predictive accuracy for the second subset.
We may do this once or many times, each time leaving out another subset.
If there is no time ordering in the data or if the focus is to assess the non-time-dependent part of the model, we can use leave-one-out cross-validation (LOO-CV). For a data set with N observations, we refit the model N times, each time leaving out one of the N observations and assessing how well the model predicts the left-out observation. Due to the higher number of required refits, exact LOO-CV is computationally expensive in particular when performing full Bayesian inference where refitting the model means estimating a new posterior distribution rather than a point estimate. However, we may approximate exact LOO-CV using the Pareto smoothed importance sampling algorithm (PSIS; Vehtari et al., 2017b,a). PSIS-LOO-CV requires only a single fit of the full model and comes with diagnostics for assessing the validity of the approximation.
If there is time ordering in the data and we are interested in the predictive performance to new future time points, leaving out only one observation at a time allows information from the future to influence predictions of the past (i.e., times t + 1, t + 2, . . . should not be used to predict time t). To apply the idea of cross-validation to the M -SAP case, we can use leave-future-out cross-validation (LFO-CV). LFO-CV does not refer to one particular prediction task but rather to various possible cross-validation approaches that all involve some form of prediction of future time points. Like exact LOO-CV, exact LFO-CV requires refitting the model many times to different subsets of the data, which is computationally expensive, in particular when performing full Bayesian inference.
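To make the difference between the two schemes concrete, the following minimal sketch enumerates the training and test indices used by LOO-CV and by LFO-CV of M-SAP. The function names `loo_splits` and `lfo_splits` are illustrative only and not part of any package. Indices are 1-based to match the notation of this paper, and `l` plays the role of the minimum number of observations L required before making predictions.

```python
def loo_splits(n):
    """Yield (train, test) index lists for leave-one-out CV (1-based)."""
    for i in range(1, n + 1):
        train = [j for j in range(1, n + 1) if j != i]
        yield train, [i]


def lfo_splits(n, m, l):
    """Yield (train, test) index lists for leave-future-out CV of M-SAP.

    For each i in {l + 1, ..., n - m + 1} the model sees only y_1..y_{i-1}
    and predicts the m observations y_i..y_{i+m-1}.
    """
    for i in range(l + 1, n - m + 2):
        yield list(range(1, i)), list(range(i, i + m))


if __name__ == "__main__":
    n = 6
    # LOO: the training set for i = 3 still contains the future points 4, 5, 6.
    print(list(loo_splits(n))[2])
    # LFO (M = 2, L = 2): training sets never contain future points.
    for train, test in lfo_splits(n, m=2, l=2):
        print(train, test)
```

In the LOO split for the third observation, the later observations remain in the training set, which is exactly the leakage from the future that LFO-CV avoids.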
In this paper, we extend the ideas from PSIS-LOO-CV and present PSIS-LFO-CV, an algorithm that typically only requires refitting the time-series model a small number of times. This will make LFO-CV tractable for many more realistic applications than previously possible, including time series model averaging using stacking of predictive distributions (Yao et al., 2018).
The structure of the paper is as follows. In Section 2, we introduce the idea and various forms of M-step-ahead predictions and how to approximate them using PSIS. In Section 3, we evaluate the accuracy of the approximation using extensive simulations. Then, in Section 4, we provide two real world case studies: one analyzing the change in level of Lake Huron, and the other examining the annual date of the cherry blossoms in Kyoto, Japan, with the time series starting in the 9th century. We end with a discussion of the usefulness and limitations of the approach in Section 5.

M-step-ahead predictions
Assume we have a time series of observations $y = (y_1, y_2, \ldots, y_N)$ and let $L$ be the minimum number of observations from the series that we will require before making predictions for future data. Depending on the application and how informative the data are, it may not be possible to make reasonable predictions for $y_i$ based on $(y_1, \ldots, y_{i-1})$ until $i$ is large enough so that we can learn enough about the time series to predict future observations. Setting $L = 10$, for example, means that we will only assess predictive performance starting with observation $y_{11}$, so that we always have at least 10 previous observations to condition on.
In order to assess M-SAP performance we would like to compute the predictive densities
$$p(y_{i+1:M} \mid y_{1:i}) = p(y_i, \ldots, y_{i+M-1} \mid y_1, \ldots, y_{i-1}) \tag{1}$$
for each $i \in \{L+1, \ldots, N-M+1\}$, where we use $y_{i+1:M} = (y_i, \ldots, y_{i+M-1})$ and $y_{1:i} = (y_1, \ldots, y_{i-1})$ to shorten the notation. As a global measure of predictive accuracy, we can use the expected log predictive density (ELPD; Vehtari et al., 2017b), which, for M-SAP, can be defined as
$$\mathrm{ELPD} = \sum_{i=L+1}^{N-M+1} \int p_t(\tilde{y}_{i+1:M}) \, \log p(\tilde{y}_{i+1:M} \mid y_{1:i}) \, d\tilde{y}_{i+1:M}. \tag{2}$$
The distribution $p_t(\tilde{y}_{i+1:M})$ describes the true data generating process for new data $\tilde{y}_{i+1:M}$. As these true data generating processes are unknown, we approximate the ELPD using LFO-CV, which leads to
$$\mathrm{ELPD}_{\mathrm{LFO}} = \sum_{i=L+1}^{N-M+1} \log p(y_{i+1:M} \mid y_{1:i}). \tag{3}$$
The quantities $p(y_{i+1:M} \mid y_{1:i})$ can be computed with the help of the posterior distribution $p(\theta \mid y_{1:i})$ of the parameters $\theta$ conditional on only the first $i-1$ observations of the time series:
$$p(y_{i+1:M} \mid y_{1:i}) = \int p(y_{i+1:M} \mid y_{1:i}, \theta) \, p(\theta \mid y_{1:i}) \, d\theta. \tag{4}$$
For factorizable models, the response values are conditionally independent given the parameters, and the likelihood can be written in the factorized form
$$p(y \mid \theta) = \prod_{n=1}^{N} p(y_n \mid \theta).$$
In this case, $p(y_{i+1:M} \mid y_{1:i}, \theta)$ reduces to
$$p(y_{i+1:M} \mid \theta) = \prod_{j=i}^{i+M-1} p(y_j \mid \theta)$$
due to the assumption of conditional independence between $y_{i+1:M}$ and $y_{1:i}$ given $\theta$. Cross-validation for non-factorizable models, which does not make this assumption, is discussed in Bürkner et al. (2018).
In practice, we will not be able to directly solve the integral in (4), but instead have to use Monte Carlo methods to approximate it. Having obtained $S$ random draws $(\theta^{(1)}_{1:i}, \ldots, \theta^{(S)}_{1:i})$ from the posterior distribution $p(\theta \mid y_{1:i})$, we can estimate $p(y_{i+1:M} \mid y_{1:i})$ as
$$p(y_{i+1:M} \mid y_{1:i}) \approx \frac{1}{S} \sum_{s=1}^{S} p(y_{i+1:M} \mid y_{1:i}, \theta^{(s)}_{1:i}),$$
which further simplifies for factorizable models as shown above.
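As a minimal sketch of this Monte Carlo step, the function below averages the predictive density over posterior draws on the log scale, using the log-sum-exp trick for numerical stability. It assumes a factorizable model, so that the joint log density of the M predicted observations is the sum of their pointwise log-likelihoods. The names `log_mean_exp` and `log_predictive_density` are hypothetical and chosen for illustration only.

```python
import math


def log_mean_exp(values):
    """Numerically stable log of the mean of exp(values)."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values) / len(values))


def log_predictive_density(log_lik_draws):
    """Monte Carlo estimate of the log predictive density of M observations.

    log_lik_draws[s][j] holds log p(y_j | theta^(s)) for the j-th predicted
    observation under posterior draw s. Summing over j relies on the
    factorization assumption stated in the lead-in.
    """
    joint = [sum(row) for row in log_lik_draws]
    return log_mean_exp(joint)
```

Summing such terms over all values of i under consideration gives an estimate of the LFO-CV ELPD.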

Approximate M-step-ahead predictions
The above equations include the posterior distributions from many different fits of the model to different subsets of the data. To obtain the predictive density $p(y_{i+1:M} \mid y_{1:i})$, a model is fit to only the first $i-1$ data points, and we will need to do this for every value of $i$ under consideration (i.e., all $i \in \{L+1, \ldots, N-M+1\}$).
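The cost of the exact procedure follows directly from the size of this index set. The small helper below (the name `n_exact_fits` is illustrative) counts the model fits that exact LFO-CV of M-SAP would need.

```python
def n_exact_fits(n, m, l):
    """Number of model fits needed by exact LFO-CV of M-SAP.

    One fit per i in {l + 1, ..., n - m + 1}.
    """
    return max(0, n - m + 1 - l)


if __name__ == "__main__":
    # Settings of the Lake Huron case study (N = 98, L = 20).
    print(n_exact_fits(98, 1, 20))  # 78
    print(n_exact_fits(98, 4, 20))  # 75
```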
Below, we will present a new algorithm to reduce the number of models that need to be fit for the purpose of obtaining each of the densities $p(y_{i+1:M} \mid y_{1:i})$. This algorithm relies centrally on Pareto smoothed importance sampling (Vehtari et al., 2017b,a), which we will briefly review next.

Pareto smoothed importance sampling
In general, importance sampling is a technique to compute expectations with respect to some target distribution using an approximating proposal distribution that is easier to draw samples from than the actual target.
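If $f(\theta)$ is the target and $g(\theta)$ is the proposal distribution, we can write any expectation $\mathbb{E}_f[h(\theta)]$ of some function $h(\theta)$ with respect to $f(\theta)$ as
$$\mathbb{E}_f[h(\theta)] = \int h(\theta) \, f(\theta) \, d\theta = \frac{\int h(\theta) \, r(\theta) \, g(\theta) \, d\theta}{\int r(\theta) \, g(\theta) \, d\theta}$$
with importance ratios
$$r(\theta) = \frac{f(\theta)}{g(\theta)}.$$
Accordingly, if $\theta^{(s)}$ are $S$ random draws from $g(\theta)$, we can approximate
$$\mathbb{E}_f[h(\theta)] \approx \frac{\sum_{s=1}^{S} h(\theta^{(s)}) \, r(\theta^{(s)})}{\sum_{s=1}^{S} r(\theta^{(s)})},$$
provided that we can compute the raw importance ratios $r(\theta^{(s)})$ up to some multiplicative constant. We see that the raw importance ratios serve as weights of the corresponding random draws in the approximation of the quantity of interest. The main problem with this approach is that the raw importance ratios tend to have high or infinite variance and, as such, results computed on their basis can be highly unstable.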
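The self-normalized estimator described above can be sketched in a few lines. This is plain importance sampling without the Pareto smoothing step, and the Gaussian target and proposal in the usage example are arbitrary choices made only for illustration. The function name `snis_expectation` is hypothetical.

```python
import math
import random


def snis_expectation(h, draws, log_ratios):
    """Self-normalized importance sampling estimate of E_f[h(theta)].

    log_ratios[s] = log r(theta^(s)); an unknown additive constant is
    harmless because it cancels in the normalization.
    """
    top = max(log_ratios)
    weights = [math.exp(lr - top) for lr in log_ratios]
    total = sum(weights)
    return sum(w * h(t) for w, t in zip(weights, draws)) / total


if __name__ == "__main__":
    rng = random.Random(1)
    # Proposal g = N(0, 2^2), target f = N(1, 1); both are illustrative.
    draws = [rng.gauss(0.0, 2.0) for _ in range(20000)]
    # log f - log g, dropping constants that do not depend on theta.
    log_r = [-0.5 * (t - 1.0) ** 2 + 0.5 * (t / 2.0) ** 2 for t in draws]
    # Should be close to 1, the mean of the target.
    print(snis_expectation(lambda t: t, draws, log_r))
```

Because the proposal here has heavier tails than the target, the ratios are bounded and the estimate is stable. The instability discussed in the text arises in the opposite situation.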
In order to stabilize those computations, one solution is to regularize the largest raw importance ratios using the corresponding quantiles of a generalized Pareto distribution fitted to the largest raw importance ratios. This procedure is called Pareto smoothed importance sampling (PSIS; Vehtari et al., 2017b,a) and has been demonstrated to have a lower error and faster convergence rate than other commonly used regularization techniques (Vehtari et al., 2017a). In addition, PSIS comes with a useful diagnostic to evaluate the goodness of the importance sampling approximation. The shape parameter k of the generalized Pareto distribution fitted to the largest importance ratios provides information about the number of existing moments of the weight distribution and of the actual importance sampling estimate. When k < 0.5, the weight distribution has finite variance and, as a result of the central limit theorem, the convergence of the importance sampling estimate with increasing number of draws will be fast. This implies that approximate LOO-CV via PSIS is highly accurate for k < 0.5 (Vehtari et al., 2017a). For 0.5 ≤ k < 1, a generalized central limit theorem holds, but the convergence rate drops quickly as k increases (Vehtari et al., 2017a). In practice, PSIS has been shown to be relatively robust for k < 0.7 (Vehtari et al., 2017b,a). As such, the default threshold is set to 0.7 when performing PSIS-LOO-CV (Vehtari et al., 2017b).
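The link between k and the number of existing moments can be stated compactly. The following is a standard property of the generalized Pareto distribution, written here for a shape parameter k > 0; the symbol α for the moment order is introduced only for this illustration.

```latex
\mathbb{E}\left[r^{\alpha}\right] < \infty \iff \alpha < \frac{1}{k},
\qquad\text{so that}\qquad
\begin{cases}
k < \tfrac{1}{2}: & \text{mean and variance are finite},\\
\tfrac{1}{2} \le k < 1: & \text{the mean is finite, the variance is infinite},\\
k \ge 1: & \text{the mean is not finite}.
\end{cases}
```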

PSIS applied to M-step-ahead predictions
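We now come back to our task of performing M-step-ahead predictions in time-series models. Starting with $i = N - M + 1$, we approximate each $p(y_{i+1:M} \mid y_{1:i})$ via
$$p(y_{i+1:M} \mid y_{1:i}) \approx \frac{\sum_{s=1}^{S} w_i^{(s)} \, p(y_{i+1:M} \mid \theta^{(s)})}{\sum_{s=1}^{S} w_i^{(s)}},$$
where $w_i^{(s)}$ are the PSIS weights and $\theta^{(s)}$ are draws from the posterior distribution based on all observations.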
To obtain $w_i^{(s)}$, we first compute the raw importance ratios
$$r_i^{(s)} = r_i(\theta^{(s)}) \propto \frac{\prod_{j \in J \setminus J_i} p(y_j \mid \theta^{(s)})}{\prod_{j \in J} p(y_j \mid \theta^{(s)})} = \frac{1}{\prod_{j \in J_i} p(y_j \mid \theta^{(s)})}$$
with $J = \{1, \ldots, N\}$, and then stabilize them using PSIS as described above. The index set $J_i$ contains all the indices of observations which are part of the actually fitted model but not of the model whose predictive performance we are trying to approximate. That is, for the starting value $i = N - M + 1$, we have
$$J_i = \{i, \ldots, N\}.$$
This approach to computing importance ratios is a generalization of the approach used in PSIS-LOO-CV, where only a single observation is left out at a time and thus $J_i = \{i\}$ for all $i$.
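On the log scale, the raw importance ratios are simply the negative sum of the pointwise log-likelihoods of the observations in the index set J_i. The sketch below assumes a factorizable likelihood and a matrix of pointwise log-likelihood values evaluated at the draws of the fitted model. The function name `log_importance_ratios` and the data layout are assumptions made for this illustration.

```python
def log_importance_ratios(log_lik, left_out):
    """Raw log importance ratios, up to an additive constant.

    log_lik[s][j - 1] = log p(y_j | theta^(s)) under the fitted model,
    left_out is the index set J_i (1-based indices).
    """
    return [-sum(row[j - 1] for j in left_out) for row in log_lik]


if __name__ == "__main__":
    # Two posterior draws, N = 5 observations (made-up numbers).
    log_lik = [[-1.0, -2.0, -3.0, -4.0, -5.0],
               [-0.5, -0.5, -0.5, -0.5, -0.5]]
    # J_i = {4, 5}: observations in the fit but not in the target model.
    print(log_importance_ratios(log_lik, [4, 5]))  # [9.0, 1.0]
```

These raw values would then be passed to a Pareto smoothing routine, which is not reproduced here.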
Starting from $i = N - M + 1$, we gradually decrease $i$ by 1 (i.e., we move backwards in time) and repeat the process. At some observation $i$, the variability of the importance ratios $r_i^{(s)}$ will become too large and importance sampling fails. We will refer to this particular value of $i$ as $i_1$. To identify the value of $i_1$, we check for which value of $i$ the estimated shape parameter $k$ of the generalized Pareto distribution first crosses a certain threshold $\tau$ (Vehtari et al., 2017a). Only then do we refit the model using only observations before $i_1$ and then restart the process. Until the next refit, we have $J_i = \{i, \ldots, i_1 - 1\}$ for $i < i_1$, as the refitted model only contains the observations up to index $N_1 = i_1 - 1$. An illustration of the procedure described above is shown in Figure 1.
In some cases we may only need to refit once and in other cases we will find a value i 2 that requires a second refitting, maybe an i 3 that requires a third refitting, and so on. We repeat the refitting as many times as is required (only if k > τ ) until we arrive at i = L + 1. Recall that L is the minimum number of observations we have deemed acceptable for making predictions (setting L = 0 means predicting only based on the prior).
A detailed description of the algorithm in the form of pseudo code is provided in Appendix A. If the data contains multiple independent time-series, the above described algorithm should be applied to each of these time-series, separately, and the obtained ELPD values can be summed up afterwards.
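The backward pass with threshold-triggered refits can be sketched as follows. This is a simplified illustration under stated assumptions, not the pseudo code of Appendix A: the four callables `fit`, `log_lik`, `pareto_k` and `elpd_term` are hypothetical placeholders for model fitting, pointwise log-likelihood evaluation, the Pareto k estimate, and the importance-weighted log predictive density, respectively.

```python
def psis_lfo(n, m, l, tau, fit, log_lik, pareto_k, elpd_term):
    """Sketch of approximate LFO-CV of M-SAP with refits triggered by k > tau.

    Hypothetical callables supplied by the user:
      fit(n_obs)                   -> posterior draws using y_1..y_{n_obs}
      log_lik(draws, j)            -> [log p(y_j | theta^(s)) for each draw]
      pareto_k(log_ratios)         -> estimated Pareto shape parameter k
      elpd_term(draws, log_r, i)   -> weighted log predictive density at i
    Returns the summed ELPD estimate and the list of refit points.
    """
    n_fit = n                      # last observation in the current fit
    draws = fit(n_fit)
    elpd, refits = 0.0, []
    for i in range(n - m + 1, l, -1):          # move backwards in time
        # J_i = {i, ..., n_fit}: in the current fit, not in the target model.
        log_r = [0.0] * len(draws)
        for j in range(i, n_fit + 1):
            for s, ll in enumerate(log_lik(draws, j)):
                log_r[s] -= ll
        if pareto_k(log_r) > tau:
            # Importance sampling is unreliable: refit on y_1..y_{i-1}.
            n_fit = i - 1
            draws = fit(n_fit)
            refits.append(i)
            log_r = [0.0] * len(draws)         # equal weights after a refit
        elpd += elpd_term(draws, log_r, i)
    return elpd, refits
```

For clarity the sketch recomputes the log ratios from scratch at every step. Since J_i grows by one index per step, an implementation could instead update them incrementally.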
The threshold τ is crucial to the accuracy and speed of the proposed algorithm. If τ is too large, we need fewer refits and thus achieve higher speed, but accuracy is likely to suffer. If τ is too small, we get high accuracy but so many refits that speed will drop noticeably. When performing exact cross-validation of Bayesian models, almost all of the computational time is spent fitting models, while the time needed to do predictions is negligible in comparison. That is, a reduction of the number of refits basically implies a proportional reduction in the overall computation time. In the present paper, we can expect an appropriate threshold to lie somewhere in the range 0.5 ≤ τ ≤ 0.7. It is unlikely to be as high as the τ = 0.7 used for PSIS-LOO-CV, as the errors are more dependent in PSIS-LFO-CV.
If there is a large error when leaving out the ith observation, then there is likely to be a large error when leaving out the subsequently processed observations i − 1, i − 2, . . . as well, until a refit is performed. That is, highly influential observations with high k are likely to have stronger effects on the total estimate in LFO-CV than in LOO-CV. We will come back to the issue of setting appropriate thresholds in Section 3.
An alternative to the LFO-CV approach discussed above is to exclude only the block of future values that directly follow the observations to be predicted while retaining all of the more distant future values. We will discuss this approach in Appendix B.

Simulations
To evaluate the goodness of the approximation of PSIS-LFO-CV, we performed a simulation study by systematically varying the following conditions: The number M of future observations to be predicted took on values of M = 1 and M = 4. The threshold τ of the Pareto k estimates was varied between τ = 0.5 and τ = 0.7 in steps of 0.1. In addition, we evaluated six different data generating models with linear and/or quadratic terms and/or autoregressive terms of order 2 (see Figure 2). In all conditions, the time series consisted of N = 200 observations and the minimal number of observations to make predictions was set to [...] An autoregressive model of order $p$ can be defined as
$$y_i = \eta_i + \sum_{k=1}^{p} \varphi_k \, y_{i-k} + \varepsilon_i,$$
where $\eta_i$ is the linear predictor for the ith observation, $\varphi_k$ are the autoregressive parameters and $\varepsilon_i$ are pairwise independent errors, which are usually assumed to be normally distributed with equal variance $\sigma^2$.
The model implies a recursive formula that allows for computing the right-hand side of the above equation for observation i based on the values of the equations for previous observations. Thus, by definition, responses of AR models are not conditionally independent. However, they are still factorizable, that is, we may write down a separate likelihood contribution per observation (see Bürkner et al., 2018, for more discussion on factorizability of statistical models).
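To illustrate what a separate likelihood contribution per observation means for such a model, the sketch below evaluates the conditional log-likelihood of each observation given its predecessors. It assumes a Gaussian AR(p) model of the form y_i = η_i + Σ_k φ_k y_{i−k} + ε_i with ε_i ~ N(0, σ²), and it skips the first p observations because their lags are unavailable, which is only one of several possible conventions. The function name is hypothetical.

```python
import math


def ar_pointwise_log_lik(y, eta, phi, sigma):
    """Per-observation conditional log-likelihood of a Gaussian AR(p) model.

    y and eta are sequences of equal length, phi holds the p autoregressive
    coefficients (phi[0] multiplies the lag-1 value), sigma is the error SD.
    Returns one value for each observation from index p onwards (0-based).
    """
    p = len(phi)
    out = []
    for i in range(p, len(y)):
        # Conditional mean given the p preceding observations.
        mean = eta[i] + sum(phi[k] * y[i - 1 - k] for k in range(p))
        z = (y[i] - mean) / sigma
        out.append(-0.5 * z * z - math.log(sigma) - 0.5 * math.log(2 * math.pi))
    return out
```

Each returned value depends on earlier responses through the conditional mean, so the observations are not conditionally independent, yet the joint log-likelihood is still a sum of pointwise terms.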
In addition to exact and approximate LFO-CV, we also compute approximate LOO-CV for comparison. This is not because we think LOO-CV is a generally appropriate approach for time-series models, but because, in the absence of any approximate LFO-CV method, researchers may have used approximate LOO-CV for time-series models in the past simply because it was available. As such, demonstrating that LOO-CV is a biased estimate of LFO-CV underlines the importance of our newly developed methods for approximate LFO-CV.
All simulations were done in R (R Core Team, 2018) using the brms package (Bürkner, 2017, 2018) together with the probabilistic programming language Stan (Carpenter et al., 2017) for the model fitting, the loo package (Vehtari et al., 2017b) for the PSIS computation, and several tidyverse packages (Wickham,

Results
Results of the 1-SAP simulations are visualized in Figure 3. Comparing the columns of Figure 3, it is clearly visible that the accuracy of the PSIS approximation increases with decreasing τ , up to almost perfect accuracy for τ = 0.5. At the same time, the proportion of observations at which refitting the model was required increased substantially with decreasing τ (see Table 1). Using τ = 0.6 induced a slight positive bias in PSIS-LFO-CV, but also reduced the number of required refits by roughly 30%. Another 30% reduction in the number of refits was achieved by using τ = 0.7 but at the cost of disproportionally increasing the positive bias in PSIS-LFO-CV. As expected, LOO-CV is a biased estimate of the 1-SAP performance for all non-constant models in particular those with a trend in the time-series (see light-blue histograms in Figure 3).
Results of the 4-SAP simulations are visualized in Figure 4. Comparing the columns of Figure 4, it is clearly visible that the accuracy of the PSIS approximation increases with decreasing τ, up to almost perfect accuracy for τ = 0.5. At the same time, the proportion of observations at which refitting the model was required increased substantially with decreasing τ (see Table 1). In light of the corresponding 1-SAP results (see above), this is not surprising, as the procedure for determining the necessity of a refit is independent of M (see Section 2.1). Using τ = 0.6 again induced a slight positive bias in PSIS-LFO-CV, but also reduced the number of required refits by roughly 30%. Another 30% reduction in the number of refits was achieved by using τ = 0.7, but at the cost of disproportionally increasing the positive bias in PSIS-LFO-CV. PSIS-LOO-CV is not displayed in Figure 4, as the number of observations predicted at each step (4 vs. 1) renders 4-SAP LFO-CV and LOO-CV incomparable.

Annual measurements of the level of Lake Huron
To illustrate the application of PSIS-LFO-CV for estimating expected M-SAP performance, we will fit a model for 98 annual measurements of the water level (in feet) of Lake Huron from the years 1875-1972. This data set is found in the datasets R package, which is installed automatically with R (R Core Team, 2018).
Figure 5: Water level in Lake Huron (x-axis: Year; y-axis: Water Level (ft)). Black points are observed data. The blue line represents mean predictions of an AR(4) model with 90% prediction intervals shown in gray.
The time series shows rather strong autocorrelation of the level as well as some trend towards lower levels for later points in time. We fit an AR(4) model and display the model implied predictions along with the observed values in Figure 5.
Based on this data and model, we will illustrate the use of PSIS-LFO-CV to provide estimates of 1-SAP and 4-SAP leaving out all future values. To allow for reasonable predictions of future values, we will require at least L = 20 historical observations (20 years) to make predictions. Further, we set a threshold of τ = 0.6 for the Pareto k estimates, above which we consider refitting necessary. Our fully reproducible analysis of this case study can be found on GitHub (https://github.com/paul-buerkner/LFO-CV-paper).
We start by computing exact and PSIS-approximated LFO-CV of 1-SAP. We compute $\mathrm{ELPD}_{\mathrm{exact}} = -93.5$ and $\mathrm{ELPD}_{\mathrm{approx}} = -93.5$, which are almost identical. Not only is the overall ELPD estimated accurately but also each of the pointwise ELPD contributions (see the left panel of Figure 6). In comparison, PSIS-LOO-CV returns $\mathrm{ELPD}_{\mathrm{loo}} = -89.0$ and thus overestimates the predictive performance, which coincides with our simulation results for stationary autoregressive models (see fourth row of Figure 3). Plotting the Pareto k estimates reveals that the model had to be refit 8 times, out of a total of N − L = 78 predicted observations (see Figure 7). On average, this means one refit every 9.8 observations, which implies a drastic speed increase as compared to exact LFO-CV.
Performing LFO-CV of 4-SAP, we compute $\mathrm{ELPD}_{\mathrm{exact}} = -535.5$ and $\mathrm{ELPD}_{\mathrm{approx}} = -535.5$, which are again almost identical. In general, for increasing M, the approximation will tend to become more variable around the true value in absolute ELPD units, as the ELPD increment of each observation will be based on more and more observations (see also Section 3). For this example, we see some differences in the pointwise ELPD contributions of specific observations which were hard to predict accurately by the model (see the right panel of Figure 6). However, these differences cancel out in the overall ELPD estimate. Since, for constant threshold τ, the importance weights are the same independent of M, the Pareto k estimates are also the same in 4-SAP as in 1-SAP.

Annual date of the cherry blossoms in Japan
The cherry blossom in Japan is a famous natural phenomenon occurring once every year during spring. As climate changes so does the annual date of the cherry blossom (Aono and Kazui, 2008;Aono and Saito, 2010).
In this case study, we are going to predict the annual date of the cherry blossom using an approximate Gaussian process model (Solin and Särkkä, 2014; Riutort Mayol et al., 2019) to provide flexible non-linear smoothing of the time series. A visualisation of both the data and the fitted model is provided in Figure 8. While the time series appears rather stable across earlier centuries, with substantial variation across consecutive years, there are some clearly visible trends in the data. In particular in more recent years, the cherry blossom tended to happen much earlier than before, presumably as a result of climate change (Aono and Kazui, 2008; Aono and Saito, 2010).
Based on this data and model, we will illustrate the use of PSIS-LFO-CV to provide estimates of 1-SAP and 4-SAP leaving out all future values. To allow for reasonable predictions of future values, we will require at least L = 100 historical observations (100 years) to make predictions. Further, we set a threshold of τ = 0.6 for the Pareto k estimates, above which we consider refitting necessary. Our fully reproducible analysis of this case study can be found on GitHub (https://github.com/paul-buerkner/LFO-CV-paper).
We start by computing exact and PSIS-approximated LFO-CV of 1-SAP. We compute $\mathrm{ELPD}_{\mathrm{exact}} = -2345.7$ and $\mathrm{ELPD}_{\mathrm{approx}} = -2345.1$, which are highly similar. PSIS-LFO-CV slightly overestimates the predictive performance for τ = 0.6, which is in line with our simulation results (see Section 3). However, as the difference is so small, it may also just be random error. As shown in the left panel of Figure 9, the pointwise ELPD contributions are highly accurate, with no outliers, indicating that our approximation has worked consistently well across observations. PSIS-LFO-CV clearly performs better than PSIS-LOO-CV, for which we obtain $\mathrm{ELPD}_{\mathrm{loo}} = -2340.3$ and thus an overestimation of the predictive performance. Plotting the Pareto k estimates reveals that the model had to be refit 35 times, out of a total of N − L = 727 predicted observations (see Figure 10). On average, this means one refit every 20.8 observations, which implies a drastic speed increase as compared to exact LFO-CV.
Performing LFO-CV of 4-SAP, we compute $\mathrm{ELPD}_{\mathrm{exact}} = -9348.3$ and $\mathrm{ELPD}_{\mathrm{approx}} = -9345.0$, which are again similar but not as close as the corresponding 1-SAP results. This is to be expected as the uncertainty of PSIS-LFO-CV increases for increasing M (see Section 3). As displayed in the right panel of Figure 9, the pointwise ELPD contributions are highly accurate, with no outliers, indicating that our approximation has worked consistently well across observations. Since, for constant threshold τ, the importance weights are the same independent of M, the Pareto k estimates are also the same in 4-SAP as in 1-SAP.

Conclusion
In the present paper, we proposed and evaluated a new method for approximate cross-validation of time-series models, which we called PSIS-LFO-CV. It follows the common task of time-series models to [...] However, for reasons discussed in Appendix B, we do not recommend using block-M-SAP in practice.
For a set of common time-series models, we established via simulations that PSIS-LFO-CV is an almost unbiased approximation of exact LFO-CV if we choose the threshold τ of the Pareto k estimates to be not larger than τ = 0.6. As the number of required model refits, and thus the computational time, increases with decreasing τ , we currently see τ = 0.6 as a good default when performing PSIS-LFO-CV. This is noticeably smaller than the recommended threshold for PSIS-LOO-CV of τ = 0.7, because, in PSIS-LFO-CV, the errors are dependent as highly influential observations also influence the approximation in the following iterations before refit, thus having a stronger influence on the overall accuracy than in PSIS-LOO-CV.
Lastly, we want to briefly note that LFO-CV can also be used to compute marginal likelihoods. Using basic rules of conditional probability, we can factorize the log marginal likelihood as
$$\log p(y) = \sum_{i=1}^{N} \log p(y_i \mid y_{1:i}).$$
This is nothing else than the ELPD of 1-SAP if we set L = 0, that is, if we choose to predict all observations using their respective past (the very first observation is only predicted from the prior). As such, marginal likelihoods may be approximated using PSIS-LFO-CV. Although this approach is unlikely to be more efficient than methods specialized to compute marginal likelihoods such as bridge sampling (Meng and Wong, 1996; Meng and Schilling, 2002; Gronau et al., 2017), it may be a noteworthy option if, for some reason, other methods fail.
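The identity between the log marginal likelihood and the sum of one-step-ahead log predictive densities can be checked numerically in a model where everything is available in closed form. The sketch below uses an illustrative conjugate model that is not part of this paper: y_i ~ N(μ, 1) with prior μ ~ N(0, 1). It uses exact predictive densities rather than PSIS, so it demonstrates only the factorization itself.

```python
import math


def log_normal_pdf(x, mean, var):
    """Log density of a normal distribution with given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_marginal_sequential(y):
    """Sum of exact one-step-ahead log predictive densities (L = 0)."""
    total, running_sum = 0.0, 0.0
    for n_past, obs in enumerate(y):
        # Posterior of mu given n_past observations: N(sum / (n + 1), 1 / (n + 1)).
        post_mean = running_sum / (n_past + 1)
        post_var = 1.0 / (n_past + 1)
        # Predictive variance = observation variance + posterior variance.
        total += log_normal_pdf(obs, post_mean, 1.0 + post_var)
        running_sum += obs
    return total


def log_marginal_closed_form(y):
    """log p(y) with mu integrated out: y ~ N(0, I + 1 1^T)."""
    n, s = len(y), sum(y)
    quad = sum(v * v for v in y) - s * s / (n + 1)
    return -0.5 * (n * math.log(2 * math.pi) + math.log(n + 1) + quad)


if __name__ == "__main__":
    y = [0.3, -1.2, 2.0, 0.5]
    # The two values agree up to floating point error.
    print(log_marginal_sequential(y), log_marginal_closed_form(y))
```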

Acknowledgments
We thank Daniel Simpson, Shira Mitchell, and Måns Magnusson for helpful comments and discussions on earlier versions of this paper.

Simulations
In the simulation of block-M -SAP, we use the same conditions as for ordinary M -SAP, but instead of leaving out all future values, we left out a block of only B = 10 future values.
Results of the block-1-SAP simulations are shown in Figure 12. PSIS-LFO-CV provides an almost unbiased estimate of the corresponding exact LFO-CV for all investigated conditions, that is, regardless of the threshold τ or the data generating model. The number of required refits was not only much smaller than when leaving out all future values, but practically approached zero for most conditions (see Table 2). PSIS-LOO-CV also has small bias, but higher variance than PSIS-LFO-CV. This is plausible given that LOO-CV and LFO-CV of block-1-SAP only differ in whether they include the relatively few observations in the block when fitting the approximating model.
The accuracy of PSIS-LFO-CV for block-4-SAP is highly variable when applied to autoregressive models (see Figure 13), something that is also visible in block-1-SAP although to a smaller degree. This seems to be a counter-intuitive result given that predictions should be more certain in the block version as more observations are available to inform the model. However, it can be explained as follows. In autoregressive models, predictions of future observations directly depend on past observations, that is, predictions are not conditionally independent. This becomes a problem when dealing with observations that are missing in the approximating model right after the block of left-out observations, since the directly preceding observations are part of the block and thus have to be treated as missing values (for details see Section 6). This implies a disproportionally high variability in the predictions of observations right after the block in autoregressive models, which then naturally propagates into higher variability of the PSIS-LFO-CV approximations.

Annual measurements of the level of Lake Huron
In the following, we discuss the application of block-LFO-CV on our case study about annual measurements of the level of Lake Huron (see Section 4.1). It is not entirely clear how stationary the time-series is as it may have a slight negative trend across time (see Figure 5). However, the AR(4) model we are using assumes stationarity and it is appropriate to also use block-LFO-CV for this example, at least for illustration. We choose to leave out a block of B = 10 future values as the dependency of an AR(4) model will not reach that far into the future. That is, we will include all observations after this block when re-fitting the model.
Approximate LFO-CV of block-1-SAP reveals ELPD exact = -88.5 and ELPD approx = -88.5, which are almost identical. Plotting the Pareto k estimates reveals that the model had to be refit 2 times, out of a total of N − L = 78 predicted observations (see Figure 14). On average, this means one refit every 39.0 observations, which again implies a drastic speed increase as compared to exact LFO-CV. What is more, we needed even fewer refits than in non-block LFO-CV, an observation we already made in our simulation in Section 3.
Performing LFO-CV of block-4-SAP, we compute $\mathrm{ELPD}_{\mathrm{exact}} = -489.0$ and $\mathrm{ELPD}_{\mathrm{approx}} = -484.3$, which are similar but not quite as close as in the 1-SAP case. Since AR models fall in the class of conditionally dependent models, predicting observations right after the left-out block may be quite difficult, as shown in Section 3. However, for the present data set, the PSIS approximations of block-LFO-CV seem to have worked reasonably well.

Conclusion
Among other things, our simulations indicated that the accuracy of PSIS-approximated block-M-SAP is highly variable for conditionally dependent models such as autoregressive models. Together with the fact that block-M-SAP is only theoretically reasonable for stationary time series, as the future will always be informative for non-stationary ones, this leaves PSIS-approximated block-M-SAP in a difficult spot. It appears to be a theoretically reasonable and empirically accurate choice only for conditionally independent models fit to stationary time series. If the time series is not too long and the corresponding model not too complex, so that a few more refits are acceptable, it might thus be more consistent and safer to just use PSIS-LFO-CV of M-SAP rather than trying to approximate block-M-SAP at all.