Case study: shipping trend estimation and prediction via multiscale variance stabilisation. Applied Statistics

Shipping and shipping services are a key industry of great impor- tance to the economy and the wider European Union.


Trends in shipping transactions
Ship owning and management are vital activities in the shipping industry which carries out a variety of operations and supports a wide community of intermediary organisations, such as marine insurance brokers and ship chandlers. Many ship owners currently outsource tasks to third party management companies which helps control expenditure and avoids disruption of lengthy voyages. Such outsourcing has caused ship owning and management to become increasingly distinct and specialised activities.
Shipping is a key contributor to the Balance of Payments for Cyprus. In the European Union (EU) Cyprus is a top performer in ship management that employs about 55,000 sailors that originate from many countries and manages more than 2000 ocean-going ships equating to about 48 million tons, Cyprus Shipping Chamber [11]. Cyprus also has the third-largest ship registry in the EU with more than 1000 ships, in terms of tonnage.

Trend estimation from shipping credit transaction flow time series
This article describes analysis of a monthly credit transaction flow time series of resident shipping companies with non-residents. The series was recorded from January 2002 to December 2014 and consists of n = 156 observations denominated in Euros. The aim of our analysis is to estimate the transaction flow trend and to forecast it. Shipping transaction flows are challenging as such series exhibit extreme non-stationarity. Most time series methods are designed to work on time series, X t , that are stationary which means that they possess constant mean, constant variance and an autocovariance, γ (k) = cov(X t , X t+k ) that only depends on the lag k. All the computations performed for this article were carried out using the R language and various packages, see R Development Core Team [57]. Figure 1 plots the credit flow time series which shows extreme changes in variance as well as an increasing trend. For example, dividing the series into thirds, the mean of the first, second and third part of the series is approximately e43M, e45M and e53M, respectively, demonstrating the increasing trend. Likewise, the standard deviation of the prices on these three intervals is approximately e14M, e20M and e48M.
The challenge here is to estimate the trend in the presence of such extreme variance changes. Trend and prediction techniques are used for performance monitoring and influencing expectations about future revenue and expenditure often in the financial planning process [15]. Normally, due to the variability of shipping markets, these analyses are also merged with market intelligence and expert opinion. Forecasting is of vital importance so that future weaknesses can be identified sufficiently early to enable corrective action to be taken. For this article we denote μ t = E(X t ) > 0 and σ 2 t = var(X t ) to be the finite mean and variance of X t and, since we are not assuming stationarity, we do not expect μ t nor σ 2 t to be constant functions of t. It is not immediately obvious, but close examination of Figure 1 indicates that the variance of the series is, approximately, an increasing function of the mean. We will shed more light on this in Section 2.2 and Figure 2 in particular. Such a relationship can be modelled by the assumption that is a non-decreasing function of x. For example, if X t had a marginal (stationary) Poisson distribution with mean λ then μ t = λ and since σ 2 t = λ also we have h(x) = x, see Fryzlewicz et al. [22] for more description and examples.

Time series and multiscale methods
There is a vast literature on how wavelets, and multiscale methods more generally can be applied to time series analysis. See Nason and von Sachs [49] for a review, Percival and Walden [54] for a comprehensive exposition or [1,9,10,35,53,60] as a selection of more  recent work. Comparison of techniques for stationary and non-stationary time series can be found in Priestley [55] or Nason [44]. For time series that are non-stationary in more than just the mean the book by Priestley [55] is a comprehensive treatment whereas Dahlhaus [12] is a more recent review focusing on locally stationary time series. For time series, a key early step is to assess whether a time series is stationary or whether there is evidence of non-stationarity. Tests of non-stationarity, which examine the constancy of a (potentially) time-varying spectrum, include [51,52,56,65]; those that look at Fourier ordinate correlations such as [16,29,31] or those that measure the distance between a time-varying spectral estimate and its 'closest' stationary spectrum such as [14,27]. Other, more recent work, has investigated the use of wavelet bases, rather than Fourier, to test stationarity such as [6] or [46], or using Walsh basis as suggested by Jin et al. [34] or by using wavelet packets as in Cardinali and Nason [7]. Historically, in statistical time series at least, the archetypal model has been the Cramer-Fourier model with the amplitude function modified to exhibit variation over time, and Dahlhaus [12] provides the essential recent review into the specification, modelling and forecasting with such models. Alternatively, a model using wavelets instead of Fourier basis functions was introduced by Nason et al. [50] as the locally stationary wavelet processes. These, and their associated estimation methods were further improved and extended by Fryzlewicz [19], Van Bellegem and von Sachs [62], Fryzlewicz [20], Fryzlewicz and Nason [24], Triantafyllopoulos and Nason [61], Van Bellegem and von Sachs [63], Fryzlewicz and Ombao [25], for forecasting in Fryzlewicz et al. [26], Xie et al. [66] and for extensions to 2D processes and texture modelling, see [17] or [28] or [59] and references therein.
Recently, wavelet models for spectral estimation and long-memory (Hurst) parameter estimation have been developed for irregularly-spaced time series (or regularly spaced series subject to missing values) using second-generation wavelets otherwise known as lifting, see [36][37][38]. These methods appear to be particularly promising as their performance seems to match or exceed that of regular wavelets on regularly spaced data for reasons that are not yet fully understood.
Another area of time series that wavelets have influenced is that of multiscale variance stabilisation. This is the main topic of this paper and so we briefly review the relevant literature below in Section 2.2.
The next section describes how our case study estimates trend in such a volatile series and Section 3 extends our techniques to forecast future values.

Trend estimation in a variance-changing series
There are many well-known methods for trend estimation in a time series X t of the form X t = m t + t where { t } is a weakly stationary time series. Methods such as curve-fitting, moving averages or differencing are explained by standard references such as Section 1.4 of [5] or Section 2.5 of [8]. Other techniques for trend estimation for series with locally stationary errors also exist, see [64], for example.
There are many possible ways that we could model the credit flows shown in Figure 1. As the variance changes quite so dramatically one popular approach might be to apply some form of variance stabilisation. A quick and likely transform might be the logarithm or the square root, but these are special cases of the well-known class of the Box-Cox transformation, [3], which we apply next.

Box-Cox transformation of credit flow
Given the credit flow series, X t , we can form the Box-Cox transformed version, Y t , by To apply this transformation for our data we use the BoxCox.lambda function from the forecast package, [33], which implements the method of Guerrero [30]. The optimal Box-Cox transformation parameter was estimated to beλ ≈ −0.283 which, when applied, results in the transformed series shown in plot (a) of Figure 3. Haar-Fisz transform preserves the mean after transformation. Hence, without modification, a plot of the data-driven Haar-Fisz transformed time series would have a vertical axis presented on a scale involving very large numbers, which would be hard to read. To improve readability of the vertical axis of plot (d) we have subtracted off the mean of the series resulting in a range of values from 0 to about 10.

Data-driven Haar-Fisz transformation of credit flow
The data-driven Haar-Fisz transform is a recent method for variance stabilisation [21,22,43] which is itself a development of the Haar-Fisz transformation for Poissondistributed data introduced by Fryzlewicz [19] and described by Fryzlewicz and Nason [23]. A theoretical analysis of a likelihood-based analysis of multiscale variance stabilisation techniques such as Haar-Fisz or the multiscale Box-Cox method due to Zhang et al. [67] can be found in Nason [47].
Briefly, the Haar-Fisz transform combines the Haar wavelet transform with the variance stabilising Fisz [18] transform as we now describe briefly. The Haar wavelet transform takes a time series of observations, c J,t = X t for t = 1, . . . , n and recursively applies the following operations: Here we assume n = 2 J but the idea can be extended to arbitrary n, see [45] for further information.
The Haar-Fisz variance stabilisation is carried out in the wavelet domain by manipulating the mother and father coefficients by forming f j, = d j, /h 1/2 (c j, ) for all coefficients j, . For h known it can be shown that the f j, coefficients have constant variance, asymptotic normality and, where the time series is smooth enough, zero mean. The f j, coefficients can then take the place of the d j, coefficients and these coefficients are inverted to obtain a transformed series possessing trend plus zero mean noise with near-constant variance and approximate normality, see [22] for technical details.
Naturally, h is not known in practice and needs to be estimated. Fryzlewicz et al. [22] suggest fitting the model to the finest scale coefficients and estimating h by least-squares isotonic regression so that the variance estimated is non-decreasing in the mean.
The package DDHFm in Motakis et al. [42] performs this by computing the forward and inverse data-driven Haar-Fisz transform via the functions ddhft.np.2 and ddhft.np.inv, respectively. These return R composite objects: the transformed data in the hft component, the calculated finest mean and variance coefficients in the components mu and sigma respectively and the estimated isotonic regression in the sigma2 component.
For the credit flow data the black dots in Figure 2 show the absolute value of the mother coefficients plotted against the fathers and the least-squares non-decreasing regression is shown as the solid line. The non-decreasing regression has an approximate overall slope of one which translates into a mean-variance relationship of h(μ) = μ 2 which suggests that the credit flow data may possess a marginal distribution consistent with a Gamma distribution.

Practical issues to deal with non-dyadic length of data
By now the astute reader may have realised that the DDHFm data-driven Haar-Fisz software operates with data sets that are of dyadic length (that is, when n = 2 J for some positive integer J). Our shipping credit flow data has n = 156 which is not dyadic. There are several ways to address this issue. Possibly, one could rewrite the software, so that it makes use of a wavelet transform for non-dyadic data. This is both time consuming, pernickety and not adopted here. Simpler methods include truncation (in this case deleting 28 observations, reducing the 156 to the next smallest power of two, 128), zero padding (adding one hundred zeroes to extend it to the next largest power of two (256) or, the method we adopt here, symmetric end reflection. We do not favour truncation as this would result in a loss of information, nearly 20% of the data. Zero padding at the beginning of the series would also be feasible here, especially since the size of the observations near to the beginning is small.
For symmetric end reflection we augment the series by reflecting the first 100 observations about the first observation. So, in the new series, the first observation is x 100 , the next is x 99 incrementally counting down to x 2 , x 1 and then followed by x 1 , x 2 , . . ., the original series. In R this operation can be carried out in one line by cflow.augment < -c(rev(cflow[1:100]), cflow) which generates the augmented series, cflow.augment, from the original series, cflow. The augmented series can then be subjected to data-driven Haar-Fisz transformation and then the last 156 values of the transformed series can be extracted as the transformed version of the original series. The benefit of symmetric end reflection is that no discontinuity is generated by the reflection which means that no large wavelet coefficient is artificially generated near to the reflection point as might happen with zero padding. More details on the symmetric end reflection and the other methods of handling boundaries and non-dyadic data sets can be found in Nason and Silverman [48] or Section 4.11 of [54].

Smoothing the credit flow data
After stabilising the variance by using the data-driven Haar-Fisz transform, resulting in the transformed series shown in Figure 3(d), we then applied a smoothing-spline smoother (the function smooth.spline in R). Then, on inverting the smoothed transformed   Figure 1). Red line is data-driven Haar-Fisz stabilised estimate with 50% (dark blue) and 95% (light blue) approximate confidence intervals. Green line is standard smoothing spline estimate of credit flows.
series we obtained our smoothed estimate of the credit flow. The trend estimated via this method is shown in Figure 3 as a red line and nicely follows the overall bulk of the data. We can also apply a smoothing spline directly to the highly variable data without transformation: this results in the constant green line in Figure 3 which is an obviously poor estimate of the trend. Clearly, the untransformed smoothing spline method had difficulty in choosing a suitable smoothing parameter in this case of extreme variability change. Figure 3 also shows approximate 50% and 95% pointwise confidence intervals for the credit flow trend. Interestingly, and not surprisingly, the variance of the trend estimate is slightly higher to the right of the plot, but due to the stability of the estimation procedure, the increase in variance is not markedly different across the plot.
The confidence intervals were obtained by bootstrap simulation in the transformed domain (using simulations from a simple ARMA error model added to the trend), inverting the transform and assuming approximate normality of the pointwise estimates. This is akin to the 'simulating from the (wavelet) posterior' mentioned in Barber et al. [2] or, more generally, Section 7.6 of [13], where we have a model for the transformed series, generate bootstrap samples from that model, invert each sample, and form pointwise confidence intervals from the samples thus generated in the original domain.
A similar trend plot could have been generated in a similar way for the logtransformation, but in view of the improved stabilisation of the data-driven Haar-Fisz transform observed in Figure 3(d) this seems unnecessary.

Forecasting
We next consider forecasting: an important task for budgeting and planning, see page 455 of [4]. Forecasting credit flows here enables decision makers to make use of information to lessen uncertainty and monitor volatile shipping environments, see page 701 of [58]. Forecasts are often discussed at board meetings and are usually supplemented with market information and expert opinion to form considered views about the current state of the industry. For budgeting, forecasts are used to assess likely expected revenues, expenses for future years and to help allocate financial resources. For strategic planning, forecasts are used to ascertain industry prospects and to help construct business plans with detailed policies. Frequently, forecasts form part of trade union negotiations and are also important within government, whose policies influence the industry's prospects.
To continue the previous analysis, we applied the exponential smoothing state space modelling function ets from the forecast package [33] to a large and contiguous training set of the transformed credit flow data (156−h observations) and then forecasted h steps ahead. Then, using the inverse transformation, we inverted the forecasts, and their nominal 95% prediction intervals, back to the original data domain for assessment. We compared the forecasts with the known truth both in terms of mean-squared prediction error and by computing empirical coverage of the forecasts. The ets methodology was  Figure 1). Solid blue line shows the forecasts from one to h = 12 steps ahead and the blue polygon shows the (nominal) 95% prediction interval. selected because of its successful automatic general-purpose forecasting nature, see [32], although other alternatives might include the Box-Jenkins ARIMA methodology, or used in combination [39]. Other, more tailored computationally intensive smoothing methods [40] could be used to further improve forecasting performance. Figure 4 shows four forecasts made for h = 12 steps ahead using each of the four transformations. Of these, the data-driven Haar-Fisz method clearly has a prediction interval that covers most of the extreme values at the right-hand end of the plot. The Box-Cox and Log transforms' intervals are small and the square-root transformation not far behind. Clearly, the data-driven Haar-Fisz method could not have anticipated the huge values that were about to occur but it has obviously picked up on the increasing variance in the series prior to these. Table 1 shows the mean-squared prediction errors for the various forecasting horizons h ranging from three to 24. In six out of eight cases the data-driven Haar-Fisz transformed forecaster is either the best of within 2% of the best error. It is, however, not good for h = 3,9 and, for the latter, considerably worse than the other three methods. Table 2 shows the empirical coverage of the nominal 95% prediction intervals: the data-driven Haar-Fisz method has the highest (or equal highest) coverage in six of the eight cases.

Discussion
The shipping credit flow problem is a particularly good exemplar for trend estimation and prediction with the data-driven Haar-Fisz transform because (i) the change in variance observed in the series is extreme (ii) existing powerful transformations singularly fail to stabilise the variance and (iii) the data-driven Haar-Fisz transform performs the stabilisation well and, further, gives interesting information as to the nature of the mean-variance relationship as shown by Figure 2.
Of course, there are several other paths that could have been followed to estimate trend. We could have fitted other time series trend models to the transformed data other than the smoothing spline (although this technique is well-understood and worked well), or fitted models other than ARIMA as part of our bootstrap confidence interval generator. There are alternative methods for handling the non-dyadic length of the data set and many other forecasting techniques that could have been tried. However, the aim of this case study is to clearly and directly set down a set of reasonable steps, with model justification at each step, to obtain a good estimator of the trend.
Further the techniques and analysis in this paper have been of direct use and genuinely useful in trend estimation and prediction for the credit flow (and other) shipping-related time series that are assessed by organisations such as the Central Bank of Cyprus as described by Michis and Nason [41].
The supplementary material contains the full R code of our analyses.