Skip to Main Content

This article presents a new estimation method for the parameters of a time series model. We consider here composite Gaussian processes that are the sum of independent Gaussian processes which, in turn, explain an important aspect of the time series, as is the case in engineering and natural sciences. The proposed estimation method offers an alternative to classical estimation based on the likelihood, that is straightforward to implement and often the only feasible estimation method with complex models. The estimator furnishes results as the optimization of a criterion based on a standardized distance between the sample wavelet variances (WV) estimates and the model-based WV. Indeed, the WV provides a decomposition of the variance process through different scales, so that they contain the information about different features of the stochastic model. We derive the asymptotic properties of the proposed estimator for inference and perform a simulation study to compare our estimator to the MLE and the LSE with different models. We also set sufficient conditions on composite models for our estimator to be consistent, that are easy to verify. We use the new estimator to estimate the stochastic error's parameters of the sum of three first order Gauss–Markov processes by means of a sample of over 800, 000 issued from gyroscopes that compose inertial navigation systems. Supplementary materials for this article are available online.

1. INTRODUCTION

Let , be the model associated to the univariate Gaussian time series that is stationary or nonstationary but with stationary backward differences of order d, and let {yt , t = 1, …, T} be the corresponding observed outcome. In this article, we propose a new estimation method for the parameter's vector θ that is based on the matching between the empirical and model-based wavelet variances (WV). We also set the conditions on under which our estimator is consistent. In particular, we consider for the empirical WV the Maximal Overlap Discrete Wavelet Transform (MODWT) WV estimator (Greenhall 1991; Percival and Guttorp 1994) based on Haar wavelet filters for which d = 1. In this case, we show that for a model made of the sum of independent Gaussian white-noise (WN), drift, quantization noise (QN), random walk (RW), and a finite number of autoregressive models of order 1 (AR(1)), our estimator is consistent (see Corollaries 2 and 3 in the supplementary materials, Section F). This model encompasses most of the stochastic models used in engineering and natural sciences applications.

The processes can be represented without loss of generality, by a state-space model of the form with measurements where x t is a q × 1 system state vector at time t, Φ is a q × q coefficient or state-transition matrix from t to t + 1, w t is a q × 1 multivariate WN vector, that is, , u t is a q × 1 deterministic input vector, yt is the one-dimensional output variable, h is a 1 × q design vector which maps the state vector x t into the output yt , and vt is a one-dimensional noise such that .

The books by Harvey (1989) and Durbin and Koopman (2001) contain extensive accounts of state-space models and their applications. For linear and/or Gaussian state-space models, the Maximum Likelihood Estimator (MLE) is a natural choice for the estimation of the model's parameters. The Kalman filter, MCMC, or other simulation smoother, together with the specification of a prior to start the latent process, are typically used for computing predictors of the state-variables and one-step-ahead predictors of the observations which are then used in an Expectation-Maximization (EM) algorithm of Dempster, Laird, and Rubing (1977) to compute the MLE (see also, e.g., Shumway and Stoffer 1982). The MLE is based on the density , with θ the p × 1 vector of parameters containing the unknown elements of Φ , Q, h and possibly also u t + 1 together with σ2 WN. The maximization step can be very complex and finding the MLE is not always a simple task. Moreover, even if new smoothers are regularly proposed (see, e.g., Durbin and Koopman 2002; McCausland, Miller, and Pelletier 2011) that improve the computational efficiency, the task becomes even more challenging when the observed process size is large and the model is, for example, a mixture of three first-order Gauss–Markov (GM) random processes (a reparameterization of AR(1) processes) with different parameters, a model that is used in the case study Section 5. We also found in a simulation exercise in Section 4 that our estimator, unlike the classical MLE or the least squares estimator (LSE), is importantly less biased when the roots of an AR(1) process lie near the unit circle. However, a formal and more general treatment of the unit root problem is left for future research.

An important useful tool for understanding underlying features of a process is the Power Spectral Density (PSD), which for stationary processes, is given by for |f| ⩽ fN = 1/(2Δt) (fN is the Nyquist frequency) with being the Auto-Covariance Function (ACF) of {Yt } such that . The WV are related to the PSD since they decompose the variance process as thoroughly described in Percival and Walden (2000) (see also Section 2). The WV can be estimated from the sample {yt , t = 1, …, T} using, for example, the maximal-overlap discrete wavelet transform estimator defined in Serroukh, Walden, and Percival (2000) (see also Percival 1995 and Equation (6) next), and for the matching of these empirical estimates to the WV implied by the model, we propose an optimization criterion based on a standardized distance between both WV estimates, as is done, for example, with the Generalized Method of Moments (GMM) (Hansen 1982). We call the new estimator the Generalized Method of Wavelet Moments (GMWM) estimator. We provide, among others, sufficient conditions on the model for the GMWM estimator to be consistent and asymptotically normal. Moreover, it can be computed using simulations as is done with indirect inference (Gourieroux, Monfort, and Renault 1993; Smith 1993; Gallant and Tauchen 1996), so that it is very straightforward to implement in practice.

While in this article we use wavelets (through their variances) for parametric estimation, wavelets have been used in many different statistical problems mainly for nonparametric estimation. To cite only a few, wavelets have been used in nonparametric density estimation (see, e.g., Doukhan and Leon 1990; Kerkyacharian and Picard 1992; Walter 1992; Donoho et al. 1996), inverse problems (see, e.g., Donoho 1995), nonparametric regression or curve fitting (see, e.g., Donoho and Johnstone 1994 and Donoho et al. 1995) also with correlated errors (see, e.g., Johnstone and Silverman 1997), or with locally stationary noise (see, e.g., Von Sachs and Macgibbon 2000).

This article is organized as follows. In Section 2, we introduce the WV and explain how it is used mainly in the engineering literature. In Section 3, we present the estimator based on the WV and derive its statistical properties. A simulation study is then presented in Section 4 that compares the finite sample performance of the GMWM estimator to the MLE and the LSE. In Section 5, we apply the new methodology to estimate parameters for the stochastic error model in inertial sensors with a real dataset of size 833, 685.

2. THE WAVELET VARIANCE

Basically, as pointed out by Percival and Guttorp (1994), the WV can be interpreted as the variance of a process after it has been subject to an approximate bandpass filter. WV can be built using wavelet coefficients issued from a modified Discrete Wavelet Transform (DWT) (Mallat 1999; Percival and Walden 2000) called the Maximal Overlap DWT (MODWT); see Greenhall (1991), Percival and Guttorp (1994). The wavelet coefficients are built using wavelet filters which for j = 1 and for the MODWT satisfy where for l < 0 and lL 1, L 1 is the length of , and m is a nonzero integer. Let also be the transfer function of . To obtain the jth-level wavelet filters of length Lj = (2 j − 1)(L 1 − 1) + 1, one computes the inverse discrete Fourier Transform of The MODWT filter is actually a rescaled version of the DWT filter h j, l , that is, . Filtering an infinite sequence using the wavelet filters yields the MODWT wavelet coefficients We define the WV at dyadic scales τ j = 2 j − 1, as the variances of , that is

Notice that the WV are assumed not to depend on time. The condition for this property to hold is that the integration order d for the series {Yt } has to be stationary such that dL 1/2 and is based on a Daubechies (Daubechies 1992) wavelet filter; see Percival and Walden (2000), chapter 8. This is because Daubechies wavelet filters of width L 1 contain an embedded backward difference filter of order L 1/2. In such a case, the series of wavelet coefficients is stationary with PSD , | · | denoting the modulus. Hence (see Serroukh, Walden, and Percival 2000) There is therefore an implicit link between the WV and the parameters of the data generating model . We exploit this connection when defining an estimator for θ , namely by matching a sample estimate of the WV ν2 j ) together with the model-based expression of the WV given by the left side of Equa- tion (5). For WV based on Haar wavelet filters (see Equation (9) next) and for the WN, RW, Drift, QN, AR(1), ARMA(1,1), and ARIMA(0,1,1) models, the integral in Equation (5) is solved and given in the supplementary materials (see Section A), based on the results of Zhang (2008). WV for other models can be computed using the same methodology.

For a finite (observed) process {yt ; t = 1, …, T}, the MODWT is given by with and Mj = TLj + 1, is a consistent estimator for ν2 j ); see Serroukh, Walden, and Percival (2000) who also show that under suitable conditions (see also Theorem 3 in the supplementary materials), is asymptotically normal with mean 0 and variance Equation (7) can be estimated by means of and the asymptotic properties of Equation (8) are given in Percival and Walden (2000), p. 312.

A particular choice for the wavelet filter is the Haar wavelet filter whose first DWT filter (j = 1) is with length L 1 = 2. If the process is stationary with backward differences of order d > 1, then other wavelet filters such as Daubechies wavelet filters can be used (Daubechies 1992). When the WV is evaluated with Haar wavelet filters, it is actually equal to half the Allan variance (AV) (Allan 1966). In 1998, the IEEE Standard 1293 (1998) introduced the AV technique as a noise identification method which can be used for determining the characteristics of underlying stochastic processes affecting signals. AV and WV are often used in engineering disciplines and physical sciences as a graphical approach for model building purposes. In the supplementary materials (see Section B), we provide an example of the graphical use of WV. For a more detailed description, see, for example, Percival and Walden (2000).

When a graphical representation is suitable, with Haar WV, the parameters of stochastic processes are sometimes estimated by means of linear regression on preidentified linear regions in a log–log plot of scale τ j versus WV ν2 j ). For example, this approach has been used for over 30 years as a standard routine measure of frequency stability in lasers (Fukuda, Tachikawa, and Kinoshita 2003) or atomic clocks (Allan 1987). More recently, the WV has also been used with optical sensors (Kebabian, Herndon, and Freedman 2005), various types of gas monitoring spectrometers (Werle, Mücke, and Slemr 1993; Bowling et al. 2003; Skrínskı et al. 2009), sonic anemometer-thermometers (Loescher et al. 2005), inertial sensors (El-Sheimy, Hou, and Niu 2008; Guerrier 2009), and radio-astronomical instrumentation (Schieder and Kramer 2001). The WV was also used in Percival and Guttorp (1994) to analyze vertical ocean shear measurements. In Fadel et al. (2004), it was employed to study the variability in heart-beat intervals. In Whitcher (2004), discrete wavelet packet transforms are used to estimate one of the parameters of a seasonal long memory process for the analysis of atmospheric and economic time series. In Gebber, Orer, and Barman (2006), WV is exploited to study nerve activities. WV has also been applied in Earth orientation metrology in Gambis (2002) and with other types of (geo)physical data. However, the linear regression on identified linear regions of the WV plots provides reasonably estimated parameters only for a limited number of processes and is often biased (Stebler et al. 2011). In our simulated example presented in Figure 4, the graphical estimation of first-order GM process parameters mixed with other processes like a WN is not suitable.

In the following section, we propose instead a criterion based on a standardized distance between sample and model-based WV that provides consistent estimators of the model's parameters for a wide range of models.

3. GMWM Estimator

We propose to estimate the model's parameters using an estimator which combines on the one hand the WV and on the other hand the Generalized Least Squares (GLS) principle, using the relationship given in Equation (5). More precisely, we propose to find such that the WV implied by the model, say φ ( θ ), matches the empirical WV, say , and solves the following GLS optimization problem in which Ω , a positive-definite weighting matrix, is chosen in a suitable manner (see next). Equation (10) defines the GMWM estimator. is a binding function between θ and such that , and and are two estimators.

In Theorem 3 given in the supplementary materials (see Section C), we use the results of Giraitis and Taqqu (1997) on limit theorems for bivariate Appell polynomials to set the conditions for to be asymptotically multivariate normal. In particular, . This result generalizes the results of Serroukh, Walden, and Percival (2000) to the multivariate case, while Serroukh and Walden (2000a, 2000b) generalize the results of Serroukh, Walden, and Percival (2000) to wavelet covariances.

In Theorem 1 next, we state the conditions on , , and φ ( θ ) (hence on F θ ) under which is a consistent estimator of θ . For that, we follow the methodology used for GMM (see, e.g., Harris and Mátyás 1999) and use the results of Komunjer (2012). Let φ ( θ ) = ( φ 1 ( θ ), φ 2 ( θ )) with dim( φ 1 ( θ )) = p (and dim( φ 2 ( θ )) = Jp), and let also (i.e., the same split as with φ ( θ )). Consider the following conditions:

1.

If Ω is estimated by , then , that is, the weighting matrix converges to a population weighting matrix.

2.

is a consistent estimator of the WV .

3.

Let θ θ with θ being an open subset of . Then, there exists a bijective, twice continuous differentiable function , such that , k ( γ ) = [kj j )] j = 1, …, p , |(∂/∂ γ ) k ( γ )| = ∏ p j = 1(∂/∂γ j )kj j ) > 0 (i.e., the determinant of the Jacobian matrix is nonnegative) and if p = 2, (∂/∂γ j )kj j ) > 0, ∀j (i.e., the Jacobian matrix is positive-definite definite).

4.

g 1 ( θ ) is twice continuously differentiable.

5.

|(∂/∂ θ ) g 1 ( θ )| is nonnegative.

6.

whenever || θ || → ∞.

7.

For every , the equation has countably many (possibly zero) solutions in .

8.

For p = 2, (∂/∂ θ ) g 1 ( θ ) is positive definite, and for p > 2, the set of points θ s for which rank[(∂/∂ θ ) g 1 ( θ )] < p − 1 is bounded.

Theorem 1.

For Ω satisfying (C.1), under (C.2), for g ( θ ) or − g ( θ ) satisfying (C.4)–(C.8), and if there exists a function f satisfying (C.3), then defined in Equation (10) is consistent.

The proof is given in the supplementary materials (see Section D). Condition (C.3) defines a reparameterization from θ θ to . Hence, in practice, for a given model F θ one has first to check that a function k = [kj ] j = 1, …, p exists that satisfies (C.3). For variance parameters 0 < σ2, the kj 's are the logarithm, and for other restricted parameters with restrictions of the form a < θ j < b, one can use the logit function on (θ j a)/(ba), that is, kj j ) = (ba)exp (γ j )((1 + exp (γ j )) + a.

Condition (C.3) is convenient but actually too restrictive. The following corollary actually extends the conditions of Theorem 1.

Corollary 1.

If the bijective, twice continuous differentiable function k does not satisfy the conditions on the Jacobian matrix in (C.3), then defined in (10) is consistent under conditions (C.1), (C.2), and (C.4)–(C.8) in which (∂/∂ θ ) g 1 ( θ ) is replaced by (∂/∂ θ ) g 1 ( θ )(∂/∂ γ ) k ( γ ).

The proof follows directly from the proof of Theorem 1.

In Theorem 2 below, we provide the consistency conditions of the GMWM when F θ is made of the sum of L independent processes. Let their respective WV be φ(l) j ( θ (l)) and partition θ as θ = [ θ (l)] l = 1, …, L , we need the following set of conditions:

1.

The functions are linearly independent with all linear combinations of the other , ll′, l, l′ = 1, …, L, and for all τ j , that is, one cannot find such that .

2.

For all j, φ(l) j ( θ (l)) = φ j (l)( θ (l) ) if and only if θ (l) = θ (l) for all l.

3.

Each element of , where depends on τ j for at least L − 1 of the l = 1, …, L.

Theorem 2.

Let F θ be the stochastic process associated to made of the sum of L independent processes with respective WV , be the GMWM estimator of θ , then for Ω satisfying (C.1), satisfying (C.2), and satisfying conditions (C.9)–(C.11), is a consistent estimator of θ .

The proof is given in the supplementary materials (see Section E). Condition (C.10) can be verified for each subprocess WV using the conditions of Theorem 1. We do that in the supplementary materials (see Section F) for some of the processes presented in Table 3 (see Section A), and for the composite processes treated in this article, we show the consistency using the conditions in Theorem 2.

Obviously, the number of scales J should be Jp but at the same time, as discussed in Section 4, a too large J can introduce variability in the estimator. The study of a possible optimal value for J is out of the scope of the present article. However, we may notice that in Guerrier et al. (2013), the authors investigated through simulation studies different choices for the number of scales J for mixtures of WN, QN, drift, RW, and AR(1) models, and found out that from a given number of scales, the variability of the GMWM does not really improve. This suggests that one could choose a number J such that J = min (J, a · p), with a a bounded (positive) integer.

Then, under the conditions of Theorem 1 (or possibly Corollary 1) or Theorem 2 as well as Theorem 3, we have that is asymptotically normal, with asymptotic covariance matrix given by where and where D = ∂ φ ( θ )/∂ θ T , and is the asymptotic covariance matrix of with elements given in (A-3) (see Section C). When Ω = I, then . The most efficient estimator is obtained by choosing , leading then to . In practice, the matrix D is computed at .

The estimation of the WV covariance σ2 kl of given in (A-3) is in general not straightforward. In the supplementary materials (see Section G), we show that under the assumption of a Gaussian process for Yt , a suitable estimator is given by where Mkl = min (Ml , Mk ) (see Equation (6)). Alternatively, when the process is not Gaussian or when the sample size is very large as is the case with the dataset analyzed in Section 4 so that the computation of Equation (13) is infeasible, one can use a parametric bootstrap to estimate . Q samples of size T are simulated from on which Q WV and , q = 1, …, Q, are computed and σ2 kl is estimated by their empirical covariance.

When J > p, that is, the number of WV is greater than the dimension of the parameter vector θ , one can assess the goodness of fit of the model F θ to the data by testing the hypotheses , using the χ2-test statistic which is asymptotically χ2 Jp under H 0 (see Hansen 1982). It will be used to assess the fit of the postulated model in the case study in Section 5. The investigation of the finite sample properties of Equation (14) is left for future research.

As a possible extension of the GMWM when analytical expressions for φ ( θ ) in Equation (10) are too complicated to compute, one can resort to simulations to compute φ ( θ ) and hence place the GMWM in the framework of indirect inference (Gourieroux, Monfort, and Renault 1993; Smith 1993; Gallant and Tauchen 1996). Basically, given a sample of observations {yt , t = 1, …, T} and a hypothetical model , is defined as the WV estimated from the sample using Equation (6). Let also with being the WV estimates computed on simulated series {y ⋆(r) t ( θ ), t = 1, …, T}, r = 1, …, R. Then, and are used in Equation (10) to obtain an estimate of , whose properties are described in, for example, Gourieroux, Monfort, and Renault (1993). In particular, for R sufficiently large, . In that case, B can be computed numerically.

A step-by-step algorithm for computing the GMWM estimator (10) given a dataset of size T and a choice for F θ , is given by the following steps:

1.

Compute the empirical WV for a suitable number of scales using Equation (6).

2.

Construct the matrix Ω . We recommend using for Ω the inverse of a diagonal matrix whose elements are given in Equation (8) for the considered scales. This choice ensures that condition (C.1) is satisfied.

3.

Compute the model-based WV φ ( θ ).

4.

Minimize the quadratic form defined in Equation (10) in θ to obtain the GMWM estimator .

5.

Compute confidence intervals for by parametric bootstrap or using Equation (11) with θ replaced by .

Hence, the method is rather easy to implement in practice. Assuming that we have a time series, say yt of length T and a model F θ made of the sum of processes satisfying Corollary 3, the estimation of θ starts with the computation (only once) of the empirical WV using Equation (6) and their estimated variances using Equation (8) (that are used in Ω ). The model-based WV is then analytically built using the sum of the model-based WV (in Table 3) of the independent subprocesses in F θ . Given all these ingredients, the quadratic form defined in Equation (10) is then optimized. This optimization problem is very simple. Confidence intervals for are, however, generally computed by parametric bootstrap. This task is quite easy to implement, but of course implies simulating data and repeating several times Steps 1–4. This can take in practice about a minute with massive datasets such as the one considered in the case study section.

If a simulation-based approach (indirect inference) is preferred, then Step 3 is not necessary and Step 4 is replaced by the following steps:

1.

Choose a sufficiently large value for R (R = 100 is usually a sensible choice)

2.

Minimize the quadratic form defined in Equation (10) in which φ ( θ ) is replaced by φ ( θ ), each element being computed using Equation (6) in which the observations are replaced by a simulated sample from F θ (for a given current value of θ ) of size RT.

Step 5 can be then used for inference, for sufficiently large R (see, e.g., Genton and Ronchetti 2003).

4. SIMULATION STUDY

This section is dedicated to the finite sample performance evaluation of the GMWM estimator compared to the MLE and the LSE. For the GMWM estimator, one has to make a choice about in Equation (10). The choice is not straightforward in practice, as it is the case for the moment-based estimator GMM or Indirect-Inference-based estimators. Asymptotically, it is well known that the optimal choice for is any matrix proportional to . This does not mean that this choice is better in finite samples, especially because the asymptotic covariance matrix needs to be estimated. A simple choice is the identity matrix, and in our case an intermediate solution is the inverse of the diagonal matrix made of the variance estimates of the MODWT using Equation (8). Indeed, in our simulations, we found that choosing may lead in some cases to numerical instability, probably due to the estimation of the covariances using Equation (13). Anyway, asymptotically these three choices for Ω lead to a consistent GMWM (provided the other conditions are also satisfied). When optimizing Equation (10), we use a quasi-Newton optimization method. For the MLE, we use the EM algorithm together with the Kalman smoother (EM-KF) as proposed by Shumway and Stoffer (1982) (see also Holmes 2010).

An important engineering application is sensor calibration in which the behavior of the errors affecting signals has to be well understood and modeled. In particular, gyroscopes and accelerometers which are part of inertial navigation systems used for space, aeronautical, ground, and underwater applications are subject to random errors that largely affect the quality of the positioning and navigation solution over time. A widely accepted model for the error behavior of an inertial sensor is given by the following state-space model where Δt is the time interval between two consecutive measurements, ut = ωΔt, with q = σ2 GM(1 − e − 2βΔt ), and . The observed series is hence made up of the sum of a first-order GM process with correlation time β− 1 and variance σ2 GM, a drift with slope ω, and a WN with variance σ2 WN. Even with such a relatively simple model, the estimation task is nontrivial.

For our simulation study, we consider simulated processes {yt t = 1, …T} from model (15) as well as an AR(1) with autocorrelation parameter near the boundary value of 1. From model (15), we actually generate three types of samples which correspond to three different (sub)models. In Model 1, we set σ2 WN = ω = 0, hence we have only a first-order GM, in Model 2, we set ω = 0, hence we remove the drift only from the complete model (15) (Model 3). We generate 100 processes of size T = 6000 with Δt = 1 and θ = (σ2 WN, σGM 2, β, ω) = (4, 16, 0.05, 0.005) at the complete model. For the submodels, the parameters are constrained accordingly and they are not estimated.

For both the GMWM (simulation based or not) and the EM-KF, the initial values for the optimizations were set to , which is relatively far away from the true simulation values. We found that the choice for the starting values is not a serious issue for the computation of the GMWM. The root mean-squared errors (RMSE) as well as the relative RMSE (relative to the true parameter value) are presented in Table 1 for Models 1–3, and for the GMWM and the EM-KF. We also tried a simulation-based version of the GMWM with R = 100 and found out that the RMSE are of the same order as for the GMWM with analytical WV (results not presented here). The sample WV were computed for J = 12(< log (6000)/log (2) ≈ 12.55) scales for all models. The results show that for the smaller models 1 and 2, the RMSE is smaller for the EM-KF than for the GMWM estimator, while the RMSE of the EM-KF explodes for the complete model 3. This last feature is actually well known with models with a drift component. For example, Hinrichsen and Holmes (2009) consider a multivariate state-space model composed of a drift with unknown rate and a random walk process to model the growth of an ecological population with observations that are subject to measurement error. They actually use a two-step estimation procedure by which the drift is first estimated by linear regression and then removed from the state equation leading to a simpler model that is well estimated by means of the EM-KF (see also Stebler et al. 2011).

Table 1 RMSE and relative RMSE (R-RMSE) of the GMWM and EM-KF estimators for 100 simulated processes of size T = 6000 from model (15) (Model 3) with Δt = 1 and and from submodels with (Model 1) and with ω = 0 (Model 2)

When the EM-KF behaves well (models 1 and 2), it has a better performance in terms of RMSE than the GMWM estimator. However, one can further improve the efficiency of the latter by decreasing the number of scales J at which the WV are estimated. Indeed, in this example, J = 12 scales are used to estimate two or three parameters, and if more scales are added (supposing a larger T), this only introduces more variability in the GMWM estimator. An optimal choice (in terms of GMWM estimator's efficiency) of the scales and their number is beyond the scope of the present article. The number of scales is obviously a function of the number of parameters p, but their choice (among the possible ones) depends on the model that is considered. For example, with model (15) without the drift element (i.e., Model 2), according to the WV graph in Figure 4, if the last three to four scales are ignored, then the WV are still able to capture information about the other model components. Actually, removing the last four scales improves the efficiency of the GMWM estimator in the simulation study for Model 2 (results not shown here).

An important feature of the GMWM estimator is that it can be used for models for which the MLE (or other classical estimators) is seriously biased. For example, maximum likelihood estimation of ARMA processes, and/or least squares estimation (based on the Yule–Walker equations) of AR processes, when the roots lie near the unit circle, provide estimators that can be seriously biased (see, e.g., Andrews 1993). With the GMWM estimator with Haar coefficients for the WV, we do not have this boundary problem, since, for example, with an AR(1) process, with autocorrelation parameter ρ, when the latter tends to the value of 1, we get a random walk which is a nonstationary process. However, the wavelet coefficients W j, t actually are backward differences of order at least 1, so that the problem of nonstationarity disappears. Note also that taking first differences of a stationary AR(1) process is known to lead to the problem of overdifferencing, since the resulting series is not invertible (it becomes an ARMA(1,1)) which has more parameters than the original one. Consequently, its parameters will be difficult to estimate, and it will tend to degrade the quality of forecasts, so differencing is not always a suitable method to overcome the estimation problem.

As an illustration, we simulated 1000 AR(1) processes of size T = 5000 with ρ ranging from 0.8 to 1 (and with residual variance σ2 = 4) and estimated the LSE based on the Yule–Walker equations together with the GMWM (J = 12). Figure 1 presents the relative RMSE of the LSE and GMWM estimator for both parameters. The GMWM estimator of ρ is less precise than the LSE, except when ρ is close to 1. For σ2, the RMSE of both estimators are quite similar for ρ smaller than 0.9 but for higher values the bias and variance of the LSE explodes showing that there is a boundary effect that affects the LSE. The MLE provides RMSE that are sensibly smaller than the LSE (not presented here) but for values of ρ larger than 0.95 the MLE is difficult if not impossible to compute.

Figure 1 Relative RMSE based on 1000 simulations from AR(1) processes with ρ ranging from 0.8 to 1 and with residual variance σ2 = 4, of the LSE relative to the GMWM estimator of ρ (“”) and of σ2 (“o”).

5. CASE STUDY: INERTIAL SENSORS

An inertial navigation system exploits observation from usually an orthogonal triad of gyroscopes and accelerometers measuring angular rates and specific forces, respectively. Once initialized, the navigation is a dead-reckoning process (i.e., the solution at one epoch is computed from current observations as well as from the solution of the previous epoch) in which gyroscope signals (after suppressing Earth rotation) are integrated once to yield orientation, and accelerometers (after accounting for gravity and apparent forces) are integrated twice to get the velocity, and finally the position in three-dimensional space. The advantage of inertial navigation is in the autonomy (i.e., no external infrastructure is needed) and the large sensor bandwidth (unlike satellite positioning system), while its weakness lies in the time-dependent error behavior (due to the integration process). Indeed, the sensor signals are corrupted by random errors, making the resulting positioning and attitude (or orientation) error increase rapidly with time. To bound the error growth, inertial sensors are combined with other sensors (e.g., GPS, odometer, and altimeter) through optimal (e.g., Kalman) filtering. Such combined navigation systems are not only the base for manned and unmanned vehicle (e.g., spacecrafts, aircrafts, cars, and robots) trajectory and orientation determination, but also for pedestrian and indoor localization devices. The gyroscope and accelerometer error behavior is modeled through state augmentation in the system where the augmented states account for sensor errors usually modeled as stochastic processes. This requires that the parameters of the stochastic processes in the state-space model are carefully set up prior to filtering. The quality of this setup (or estimation) strongly influences the quality of the predicted trajectory. Traditionally, the Allan variance with ad hoc graphical estimation (see, e.g., IEEE Standard 1293 1998), or the KF-Self-Tuning approach (see Niu et al. 2007; Waegli 2009; Stebler et al. 2012), or even the MLE on state-space models (with simple models) (Stebler et al. 2011) are used to both determine the error's model structure and to estimate their parameters. The GMWM estimator offers an alternative approach that is especially suitable when the stochastic error process is rather complex.

In this section, we consider the angular rate signal issued from a real Micro-Electro-Mechanical System (MEMS) gyroscope where the spectral structure of errors is often complex. Such sensors have a great potential in navigation due to ergonomic (smaller and lighter equipment) and economical considerations (El-Sheimy and Niu 2007). Therefore, MEMS sensors are increasingly used in a very wide range of applications such as 3D input devices, robotic, virtual reality, vehicle stability control and so forth, and research focuses on modeling and compensating for their large and variable measurement errors. To validate the use of the GMWM for modeling the MEMS’ errors, we first analyze the signal produced in static conditions at a frequency of 100 Hz (833, 685 measurements). We provide GMWM estimates on a rather complex model that cannot be estimated by an alternative standard method. To show the impact of the importance of model structure and estimation precision of its parameters in positioning, we then present an emulation study in which the trajectory of a helicopter performing airborne laser scanning is used and for which a nearly perfect (i.e., reference) trajectory can be computed. The emulated measurements along the real trajectory are corrupted with an error signal observed in static condition, and the three methods (AV, the KF-Self-Tuning Approach, and GMWM) are used to model the error signal. The predicted trajectories based on the corrupted measurements and estimated error model, together with GPS measurements, are then compared to determine the positioning error.

After removing the mean in the signal made of 833, 685 measurements produced in static conditions at a frequency of 100 Hz, the time-varying part of the sensor errors is directly available and is presented in the time domain in Figure 3 (upper panel), together with the Haar WV of this process with 95% confidence intervals (lower panel). The WV computed on the original signal give an indication of the underlying stochastic processes that are summed up to build up a composite process. A possible model is a mixture of three different first-order GM processes , with q = σ2 GM(1 − e − 2βΔt ) and Δt = 0.01 s, the sampling interval. In such a case, the parameters to be estimated are θ = {σ2 1, β1, σ2 2, β2, σ2 3, β3} and the GMWM is identifiable since a first order-Markov random process is a one-to-one reparameterization of an AR(1) process whose sum allows identifiable GMWM (see Corollaries 2 and 3). The GMWM estimates of the parameter set θ and its corresponding 95% confidence intervals using the WV covariances estimated by means of a parametric bootstrap (400 samples) are given in Table 2. The suitability of the estimated model for the data at hand can be judged graphically by a matching of the empirical WV and the parametric WV using the estimates in Table 2, as it is done in Figure 3 (lower panel). It can be seen that the estimated process matches the observed one very well across WV scales, yielding an indication of a suitable fit. Moreover, one can also compute the goodness of fit test statistic in Equation (14) for competing models and test to what extent the chosen one yields a suitable fit. For that, we also considered a single GM process and the sum of two GM processes. The test statistics and corresponding p-values are 384363.9 (on 17 df) and p-value < 10− 5 for the single GM process, 84787.01 (on 15 df) and p-value < 10− 5 for a sum of two GM processes, and 16.81637 (on 13 df) and p-value = 0.208 for a sum of three GM processes, respectively, clearly indicating that the last model is suitable. The corresponding estimated WV are drawn in Figure 3 (lower panel).

Table 2 Estimated parameters with associated 95% confidence intervals for the mixture of three first-order GM random processes with the Gyroscope signal data

It should be noted that the sum of three first-order GM random processes can be reparameterized as an ARMA(3,2) process (see, e.g., Granger and Morris 1976; Terasvirta 1977), so that one could, in principle, estimate the latter instead of the former. However, one should then be able to invert the estimated ARMA since very often, and in particular in the example at hand, a sum of first-order GM random processes is the one that we believe explains the real underlying process, which for gyroscopes reflect the short- and long-term correlated noise due to quantization and mechanical thermal noise phenomena present in MEMS sensors (see Kittel 1958; Drexler 1992). To recover the sum of GM processes’ parameters from an estimated ARMA process, and also their standard errors, several conditions need to be satisfied. First, the roots of the processes must lie outside the unit circle. Second, the Jacobian matrix of the transformation between the two parameterizations must be invertible to apply the delta method (Rao 1973; Benichou and Gail 1989). With the inertial sensor data, the estimated processes have roots that are near the unit circle, and moreover, the Jacobian matrix of the transformation evaluated at the estimated parameters is not invertible. In that case at least, estimating an ARMA process and converting the estimated model and inference to the sum of first-order GM processes is infeasible.

To illustrate the impact of the importance of model structure and estimation precision in device positioning, we also performed an emulation study that was presented in Stebler et al. (2012). Using extremely reliable equipments, the trajectory of a helicopter performing airborne laser scanning can be precisely determined. From this trajectory, we computed the theoretical perfect inertial measurements (i.e., accelerations and angular rotations) that should be observed along this trajectory. These perfect measurements were then corrupted with a real static error signal acquired such as the one analyzed above. Then, these “pseudo” measurements were used together with the real GPS measurements to compute a trajectory. At one point, we introduced relatively short artificial gaps in GPS observations of 1 min duration (note that, in practice, such gaps are very common) and compared the deviations of the trajectories obtained with different inertial error models estimated from the data. The two first estimated models are based on the AV and on the KF-Self-Tuning approach. A detailed discussion on these methods can be found for example in Stebler et al. (2012). The GMWM was also used as an alternative estimator and model building approach. The trajectories are depicted in Figure 2 (adapted from Stebler et al. 2012) along with the “true” trajectory. It can be seen that the GMWM-based model (black-dashed line) limits significantly the error growth during the GPS-signal outage compared with the other two benchmark methods which diverged from the “true” trajectory by several thousand of meters! The poor performances of the standard methods explains the recent explosion of the research conducted to determine the stochastic modeling of MEMS-type inertial sensors.

Figure 2 Comparison between a reference trajectory (black dotted line) issued from a mapping flight in which a GPS outage was introduced, with estimated error model based on the AV (light-gray line), the KF-Self-Tunning (dark gray line), and the GMWM (black dashed line).

Figure 3 Gyroscope-observed error process (top panel) and graphical comparison (log–log scale) between the Haar WV (line “o”) computed from the observed signal and the analytical signal using the estimated parameters of, respectively, the sum of three GM processes (line “”), the sum of two GM processes (line “”), and one GM process (line “”).

6. CONCLUDING REMARKS

In this article, we present a new estimator for the parameters of composite stochastic processes which is shown to be consistent for the class made of the sum of independent white-noise, drift, quantization noise, random walk, and k < ∞ AR(1). As demonstrated in Stebler et al. (2012), the GMWM has many advantages over existing alternative methods for applications in engineering or natural sciences. The enlargement of the class of models for which the GMWM is consistent involves verifying the conditions provided in Theorem 2 and more generally in Theorem 1 (with Corollary 1) and is left for future research. Also, one could use in principle another consistent estimator for the Haar WV, such as the one proposed in Nason, von Sachs, and Kroisandt (2000).

As an anonymous referee pointed out, one could wonder why to use wavelets instead of making inference directly on the spectral density for which a similar estimation procedure could be used. We believe that our approach is more suitable for the following reasons. First, inference on the PSD would make the optimization of a least-squares-type measure (between the empirical and model-based PSD) more difficult to solve when the PSD has large variability over very narrow frequency bands. As shown in Percival and Walden (2000), the wavelet coefficients at scale τ j are associated with frequencies in the interval [1/2 j + 1, 1/2 j ] and relation (5) can be approximated by This means that the WV summarizes the information in the PSD using just one value per octave frequency band. Thus, it is particularly useful when the PSD is relatively featureless within each octave band. In the case of the widely used pure power law processes (SY (f)∝|f|α), for example from Equation (16) one gets ν2 Y j )∝τ− α − 1 j , meaning that no information is lost in using the PSD summary given by the WV. Second, the computation of empirical WV is more straightforward than nonparametric PSD; for example, the periodogram is an inconsistent estimator of SY (f) and can be badly biased even for large samples sizes (because of frequency leakage effects) so that more sophisticated PSD estimators and/or smoothing techniques such as prewhitening or tapering should be used. Third, the PSD of two important processes in sensor error models, namely the drift and the random walk, cannot be distinguished (both have slope of − 2 in a log–log representation of the PSD). Finally, the MODWT on which the WV computation is based requires a number of multiplications of order Tlog 2 T, which is the same order as the widely used fast Fourier transform algorithm, so that the use of WV does not increase the computational burden.

SUPPLEMENTARY MATERIALS

The proofs and some additional results are given in the supplementary materials. This document is organized as follow. In Section A we provide the analytical version of the Haar WV for the processes considered in this article. In Section B we explain de graphical use of the WV for model building purposes. Theorem 3 which demonstrates the multivariate normality of the WV is given in Section C. The proofs of Theorems 1 and 2 are given in Sections D and E. Corollaries 2 and 3 which provides some additional results regarding the consistency of composite stochastic processes are given in Section F. Finally, in Section G we explain the construction of the covariance estimator of the WV presented given (13).

NOTES

A C++ program for computing the GMWM for the models treated in this article is available upon request from the authors.

Supplemental material

Supplementary Material

Download Zip (380 KB)

Acknowledgments

Stéphane Guerrier and Maria-Pia Victoria-Feser are partially supported by a Swiss National Fund Grant (no. 100018-131906). Yannick Stebler was partially supported by the European FP7-GALILEO-2007-GSA-1 grand CLOSE-SEARCH as well as by the project no. 103010.2 of the Swiss Commission for Technology and Innovation (CTI). The authors are grateful to two anonymous reviewers for their comment that helped to improve the content and the presentation of the article.

    REFERENCES

  • Allan, D. 1966. “Statistics of Atomic Frequency Standards,”. Proceedings of the IEEE, 54: 221230.  [Crossref], [Web of Science ®][Google Scholar]
  • Allan, D. 1987. “Time and Frequency (Time-Domain) Characterization, Estimation, and Prediction of Precision Clocks and Oscillators,”. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 34: 647654.  [Google Scholar]
  • Andrews, D. W. 1993. “Exactly Median-Unbiased Estimation of First Order Autoregressive/Unit Root Models,”. Econometrica, 61: 139165.  [Crossref], [Web of Science ®][Google Scholar]
  • Benichou, J. and Gail, M. H. 1989. “A Delta Method for Implicitly Defined Random Variables,”. The American Statistician, 43: 4144.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Bowling, D. R., Sargent, S. D., Tanner, B. D. and Ehleringer, J. R. 2003. “Tunable Diode Laser Absorption Spectroscopy for Stable Isotope Studies of Ecosystem–Atmosphere CO2 Exchange,”. Agricultural and Forest Meteorology, 118: 119.  [Crossref], [Web of Science ®][Google Scholar]
  • Daubechies, I. 1992. Ten Lectures on Wavelets, Philadelphia: SIAM.  [Crossref][Google Scholar]
  • Dempster, A. P., Laird, N. M. and Rubing, D. B. 1977. “Maximum Likelihood From Incomplete Data via the EM Algorithm,”. Journal of the Royal Statistical Society, Series B, 39: 138.  [Crossref][Google Scholar]
  • Donoho, D. L. 1995. Nonlinear Solution of Linear Inverse Problems by Wavelet-Vaguelette Decomposition. Applied and Computational Harmonic Analysis, 2: 101126.  [Crossref], [Web of Science ®][Google Scholar]
  • Donoho, D. L. and Johnstone, I. M. 1994. “Ideal Spatial Adaptation by Wavelet Shrinkage,”. Biometrika, 81: 425455.  [Crossref], [Web of Science ®][Google Scholar]
  • Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D. 1995. “Wavelet Shrinkage: Asymptopia?,”. Journal of the Royal Statistical Society, Series B, 57: 301369.  [Google Scholar]
  • Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D. 1996. “Density Estimation by Wavelet Thresholding,”. The Annals of Statistics, 24: 508539.  [Crossref], [Web of Science ®][Google Scholar]
  • Doukhan, P. and Leon, J. 1990. “Déviation Quadratique D’Estimateur De Densité par Projection Orthogonale,”. Comptes Rendus de l’Académie des Sciences, Paris (A), 310: 424430.  [Google Scholar]
  • Drexler, K. E. 1992. Nanosystems—Molecular Machinery, Manufacturing, and Computation, New York: Wiley.  [Google Scholar]
  • Durbin, J. and Koopman, S. J. 2001. Time Series Analysis by State Space Methods, Oxford Statistical Series 24, Oxford, , UK: Oxford University Press.  [Google Scholar]
  • Durbin, J. and Koopman, S. J. 2002. “A Simple and Efficient Simulation Smoother for State Space Time Series Analysis,”. Biometrika, 89: 603615.  [Crossref], [Web of Science ®][Google Scholar]
  • El-Sheimy, N. and Niu, X. 2007. “The Promise of MEMS to the Navigation Community,”. Inside GNSS, 2: 4656.  [Google Scholar]
  • El-Sheimy, N., Hou, H. and Niu, X. 2008. “Analysis and Modeling of Inertial Sensors Using Allan Variance,”. IEEE Transactions on Instrumentation and Measurement, 57: 140149.  [Google Scholar]
  • Fadel, P. J., Orer, H. S., Barman, S. M., Vongpatanasin, W., Victor, R. G. and Gebber, G. L. 2004. “Fractal Properties of Human Muscle Sympathetic Nerve Activity,”. American Journal of Physiology—Heart and Circulatory Physiology, 286: 11761184.  [Google Scholar]
  • Fukuda, K., Tachikawa, M. and Kinoshita, M. 2003. “Allan Variance Measurements of Diode Laser Frequency-Stabilized With a Thin Vapor Cell,”. Applied Physics B: Lasers and Optics, 77: 823827.  [Crossref], [Web of Science ®][Google Scholar]
  • Gallant, A. R. and Tauchen, G. 1996. “Which Moments to Match?,”. Econometric Theory, 12: 657681.  [Crossref], [Web of Science ®][Google Scholar]
  • Gambis, D. 2002. “Allan Variance in Earth Rotation Time Series Analysis,”. Advances in Space Research, 30: 207212.  [Google Scholar]
  • Gebber, G. L., Orer, H. S. and Barman, S. M. 2006. “Fractal Noises and Motions in Time Series of Presympathetic and Sympathetic Neural Activities,”. Journal of Neurophysiology, 95: 11761184.  [Google Scholar]
  • Genton, M. G. and Ronchetti, E. 2003. “Robust Indirect Inference,”. Journal of the American Statistical Association, 98: 6776.  [Taylor & Francis Online][Google Scholar]
  • Giraitis, L. and Taqqu, M. S. 1997. “Limit Theorems for Bivariate Appell Polynomials. Part i: Central Limit Theorems,”. Probability Theory and Related Fields, 107: 359381.  [Google Scholar]
  • Gourieroux, C., Monfort, A. and Renault, E. 1993. “Indirect Inference,”. Journal of Applied Econometrics, 8: S85S118.  [Google Scholar]
  • Granger, C. W. and Morris, M. J. 1976. “Time Series Modelling and Interpretation,”. Journal of the Royal Statistical Society, Series A, 139: 246257.  [Crossref], [Web of Science ®][Google Scholar]
  • Greenhall, C. A. 1991. “Recipes for Degrees of Freedom of Frequency Stability Estimators,”. In IEEE Transactions on Instrumentation and Measurements (vol. 40), 994999. IEEE.  [Google Scholar]
  • Guerrier, S. September 22–25, 2009. “Improving Accuracy With Multiple Sensors: Study of Redundant MEMS-IMU/GPS Configurations,”. In 22nd International Meeting of the Satellite Division of The Institute of Navigation, 31143121. Savannnah, GA, , USA: The Institute of Navigation.  [Google Scholar]
  • Guerrier, S., Stebler, Y., Skaloud, J. and Victoria-Feser, M.-P. 2013. “Limits of the Allan Variance and Optimal Tuning of Wavelet Variance Based Estimators,”. Working Paper, Submitted [Google Scholar]
  • Hansen, L. P. 1982. “Large Sample Properties of Generalized Method of Moments Estimators,”. Econometrica: Journal of the Econometric Society, 50: 10291054.  [Crossref], [Web of Science ®][Google Scholar]
  • Harris, D. and Mátyás, L. 1999. “Introduction to the Generalized Method of Moments Estimation,”. In Generalized Method of Moments Estimation, 330. Themes in Modern Econometrics, László Mátyás, ed., Cambridge University Press.  [Google Scholar]
  • Harvey, A. C. 1989. Forecasting, Structural Time Series Models and the Kalman Filter, Cambridge: Cambridge University Press.  [Google Scholar]
  • Hinrichsen, R. A. and Holmes, E. E. 2009. Spatial Ecology Chapter, Using Multivariate State-Space Models to Study Spatial Structure and Dynamics, Mathematical and Computational Biology Series, 145164. Boca Raton, FL: CRC/Chapman & Hall.  [Google Scholar]
  • Holmes, E. E. 2010. “Derivation of the EM Algorithm for Constrained and Unconstrained Multivariate Autoregressive State-Space (MARSS) Models,”. Technical Report, Northwest Fisheries Science Center, Seattle, WA: NOAA Fisheries [Google Scholar]
  • IEEE Standard 1293. 1998. “IEEE Standard Specification Format Guide and Test Procedure for Linear, Single-Axis, Non-Gyroscopic Accelerometers,”. Technical Report, IEEE [Google Scholar]
  • Johnstone, I. M. and Silverman, B. W. 1997. “Wavelet Threshold Estimators for Data With Correlated Noise,”. Journal of the Royal Statistical Society, Series B, 2: 319351.  [Google Scholar]
  • Kebabian, P. L., Herndon, S. C. and Freedman, A. 2005. “Detection of Nitrogen Dioxide by Cavity Attenuated Phase Shift Spectroscopy,”. Analytical Chemistry, 77: 724728.  [Google Scholar]
  • Kerkyacharian, G. and Picard, D. 1992. “Density Estimation in Besov Spaces,”. Statistics and Probability Letters, 13: 1524.  [Crossref], [Web of Science ®][Google Scholar]
  • Kittel, C. 1958. Elementary Statistical Physics, New York: Wiley.  [Google Scholar]
  • Komunjer, I. 2012. “Global Identification in Nonlinear Models With Moment Restrictions,”. Econometric Theory, 28: 719729.  [Google Scholar]
  • Loescher, H. W., Ocheltree, T., Tanner, B., Swiatek, E., Dano, B., Wong, J., Zimmerman, G., Campbell, J., Stock, C. and Jacobsen, L. 2005. “Comparison of Temperature and Wind Statistics in Contrasting Environments Among Different Sonic Anemometer-Thermometers,”. Agricultural and Forest Meteorology, 133: 119139.  [Google Scholar]
  • Mallat, S. 1999. A Wavelet Tour of Signal Processing, New York: Academic Press.  [Google Scholar]
  • McCausland, W. J., Miller, S. and Pelletier, D. 2011. “Simulation Smoothing for State Space Models: A Computational Efficiency Analysis,”. Computational Statistics and Data Analysis, 55: 199212.  [Google Scholar]
  • Nason, G. P., von Sachs, R. and Kroisandt, G. 2000. “Wavelet Processes and Adaptive of the Evolutionary Wavelet Spectrum,”. Journal of the Royal Statistical Society, Series B, 62: 271292.  [Crossref][Google Scholar]
  • Niu, X., Nasser, S., Goodall, C. and El-Sheimy, N. 2007. “A Universal Approach for Processing any MEMS Inertial Sensor Configuration for Land-Vehicle Navigation,”. Journal of Navigation, 60: 233246.  [Google Scholar]
  • Percival, D. B. 1995. “On Estimation of the Wavelet Variance,”. Biometrika, 82: 619631.  [Crossref], [Web of Science ®][Google Scholar]
  • Percival, D. B. and Guttorp, P. 1994. “Long-Memory Processes, the Allan Variance and Wavelets,”. Wavelets in Geophysics, 4: 325344.  [Google Scholar]
  • Percival, D. B. and Walden, A. T. 2000. Wavelet Methods for Time Series Analysis, Cambridge Series in Statistical and Probabilistic Mathematics, New York: Cambridge University Press.  [Google Scholar]
  • Rao, C. R. 1973. Linear Statistical Inferentcea and Its Applications (2nd ed.), New York: Wiley.  [Google Scholar]
  • Schieder, R. and Kramer, C. 2001. “Optimization of Heterodyne Observations Using Allan Variance Measurements,”. Astronomy and Astrophysics, 373: 746756.  [Google Scholar]
  • Serroukh, A. and Walden, A. T. 2000a. “Wavelet Scale Analysis of Bivariate Time Series I: Motivation and Estimation,”. Journal of Nonparametric Statistics, 13: 136.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Serroukh, A. and Walden, A. T. 2000b. “Wavelet Scale Analysis of Bivariate Time Series II: Statistical Properties for Linear Processes,”. Journal of Nonparametric Statistics, 13: 3756.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Serroukh, A., Walden, A. T. and Percival, D. B. 2000. “Statistical Properties and Uses of the Wavelet Variance Estimator for the Scale Analysis of Time Series,”. Journal of the American Statistical Association, 95: 184196.  [Taylor & Francis Online][Google Scholar]
  • Shumway, R. H. and Stoffer, D. S. 1982. “An Approach to Time Series Smoothing and Forecasting Using the EM Algorithm,”. Journal of Time Series Analysis, 3: 253264.  [Crossref][Google Scholar]
  • Skrínskı, J., Janecková, R., Grigorová, E., Strizík, M., Kubát, L., Herecová, P., Nevrlı, V., Zelinger, Z. and Civis, S. 2009. “Allan Variance for Optimal Signal Averaging-Monitoring by Diode-Laser and CO2 Laser Photo-Acoustic Spectroscopy,”. Journal of Molecular Spectroscopy, 256: 99101.  [Google Scholar]
  • Smith, A. A. 1993. “Estimating Nonlinear Time Series Models Using Simulated Vector Autoregressions,”. Journal of Applied Econometrics, 8: S63S84.  [Google Scholar]
  • Stebler, Y., Guerrier, S., Skaloud, J. and Victoria-Feser, M.-P. 2011. “Constrained Expectation-Maximization Algorithm for Stochastic Inertial Error Modeling: Study of Feasibility,”. Measurement Science and Technology, 22: 112.  [Google Scholar]
  • Stebler, Y., Guerrier, S., Skaloud, J. and Victoria-Feser, M.-P. 2012. “A Framework for Inertial Sensor Calibration Using Complex Stochastic Error Models,”. In Position Location and Navigation Symposium (PLANS), 2012 IEEE/ION, 849861. IEEE.  [Google Scholar]
  • Terasvirta, T. 1977. “The Invertibility of Sums of Discrete MA and ARMA Processes,”. Scandinavian Journal of Statistics, 4: 165170.  [Google Scholar]
  • Von Sachs, R. and Macgibbon, B. 2000. “Non-Parametric Curve Estimation by Wavelet Thresholding With Locally Stationary Errors,”. Scandinavian Journal of Statistics, 27: 475499.  [Crossref], [Web of Science ®][Google Scholar]
  • Waegli, A. 2009. Trajectory Determinationand Analysis in Sports by Satellite and Inertial Navigation, PhD thesis, Switzerland: Ecole Polytechnique Fédérale de Lausanne, Available at http://library.epfl.ch/theses/?nr=4288, http://library.epfl.ch/theses/?nr=4288 [Google Scholar]
  • Walter, G. G. 1992. “Approximation of Delta Function by Wavelets,”. Journal of Approximation Theory, 71: 329343.  [Crossref], [Web of Science ®][Google Scholar]
  • Werle, P., Mücke, R. and Slemr, F. 1993. “The Limits of Signal Averaging in Atmospheric Trace-Gas Monitoring by Tunable Diode-Laser Absorption Spectroscopy (TDLAS),”. Applied Physics B: Lasers and Optics, 57: 131139. 1993 [Crossref], [Web of Science ®][Google Scholar]
  • Whitcher, B. 2004. “Wavelet-Based Estimation for Seasonal Long-Memory Processes,”. Technometrics, 46: 225238.  [Taylor & Francis Online], [Web of Science ®][Google Scholar]
  • Zhang, N. F. 2008. “Allan Variance of Time Series Models for Measurement Data,”. Metrologia, 45: 549561.  [Crossref][Google Scholar]