The Rational SPDE Approach for Gaussian Random Fields With General Smoothness

Abstract A popular approach for modeling and inference in spatial statistics is to represent Gaussian random fields as solutions to stochastic partial differential equations (SPDEs) of the form , where is Gaussian white noise, L is a second-order differential operator, and is a parameter that determines the smoothness of u. However, this approach has been limited to the case , which excludes several important models and makes it necessary to keep β fixed during inference. We propose a new method, the rational SPDE approach, which in spatial dimension is applicable for any , and thus remedies the mentioned limitation. The presented scheme combines a finite element discretization with a rational approximation of the function to approximate u. For the resulting approximation, an explicit rate of convergence to u in mean-square sense is derived. Furthermore, we show that our method has the same computational benefits as in the restricted case . Several numerical experiments and a statistical application are used to illustrate the accuracy of the method, and to show that it facilitates likelihood-based inference for all model parameters including β. Supplementary materials for this article are available online.


Introduction
One of the main challenges in spatial statistics is to handle large data sets. A reason for this is that the computational cost for likelihood evaluation and spatial prediction is in general cubic in the number N of observations of a Gaussian random field. A tremendous amount of research has been devoted to coping with this problem and various methods have been suggested (see Heaton et al., 2018, for a recent review).
A common approach is to define an approximation u h of a Gaussian random field u on a spatial domain D via a basis expansion, where ϕ j : D → R are fixed basis functions and u = (u 1 , . . . , u n ) ∼ N(0, Σ u ) are stochastic weights. The computational effort can then be reduced by choosing n N . However, methods based on such low-rank approximations tend to remove fine-scale variations of the process. For this reason, methods which instead exploit sparsity for reducing the computational cost have gained popularity in recent years. One can construct sparse approximations either of the covariance matrix of the measurements (Furrer et al., 2006), or of the inverse of the covariance matrix (Datta et al., 2016). Alternatively, one can let the precision matrix Σ −1 u of the weights in (1.1) be sparse, as in the stochastic partial differential equation (SPDE) approach by Lindgren et al. (2011), where usually n ≈ N . To increase the accuracy further, several combinations of the methods mentioned above have been considered (e.g., Sang and Huang, 2012) and multiresolution approximations of the process have been exploited (Nychka et al., 2015;Katzfuss, 2017). However, theoretical error bounds have not been derived for most of these methods, which necessitates tuning these approximations for each specific model.
In this work we propose a new class of approximations, whose members we refer to as rational stochastic partial differential equation approximations or rational approximations for short. Our approach is similar to some of the above methods in the sense that an expansion (1.1) with compactly supported basis functions is exploited. The main novelty is that neither the covariance matrix Σ u nor the precision matrix Σ −1 u of the weights u is assumed to be sparse. The covariance matrix is instead a product Σ u = PQ −1 P , where P and Q are sparse matrices and the sparsity pattern of P is a subset of that of Q. We show that the resulting approximation facilitates inference and prediction at the same computational cost as a Markov approximation with Σ −1 u = Q, and at a higher accuracy. For the theoretical framework of our approach, we consider a Gaussian random field on a bounded domain D ⊂ R d which can be represented as the solution u to the SPDE where W is Gaussian white noise on D, and L β is a fractional power of a secondorder differential operator L which determines the covariance structure of u. Our rational approximations are based on two components: (i) a finite element method (FEM) with continuous and piecewise polynomial basis functions {ϕ j } n j=1 , and (ii) a rational approximation of the function x −β . We explain how to perform these two steps in practice in order to explicitly compute the matrices P and Q. Furthermore, we derive an upper bound for the strong mean-square error of the rational approximation. This bound provides an explicit rate of convergence in terms of the mesh size of the finite element discretization, which facilitates tuning the approximation without empirical tests for each specific model.
Examples of random fields which can be expressed as solutions to SPDEs of the form (1.2) include approximations of Gaussian Matérn fields (Matérn, 1960). Specifically, if D := R d a zero-mean Gaussian Matérn field can be viewed as a solution u to where ∆ denotes the Laplacian (Whittle, 1963). The constant parameters κ, τ > 0 determine the practical correlation range and the variance of u. The exponent β defines the smoothness parameter ν of the Matérn covariance function via the relation 2β = ν + d/2 and, thus, the differentiability of the field. For applications, variance, range and differentiability typically are the most important properties of a Gaussian field. For this reason, the Matérn model is highly popular in spatial statistics and has become the standard choice for Gaussian process priors in machine learning (Rasmussen and Williams, 2006). Since (1.3) is a special case of (1.2) we believe that the outcomes of this work will be of great relevance for many statistical applications, see also §7.
In contrast to covariance-based models, the SPDE approach additionally has the advantage that it allows for a number of generalizations of stationary Matérn fields including (i) non-stationary fields generated by non-stationary differential operators (e.g., Fuglstad et al., 2015), (ii) fields on more general domains such as the sphere (e.g., Lindgren et al., 2011), and (iii) non-Gaussian models (e.g., Wallin and Bolin, 2015). Lindgren et al. (2011) showed that, if 2β ∈ N, one can construct accurate approximations of the form (1.1) for Gaussian Matérn fields on bounded domains D R d , such that Σ −1 u is sparse. To this end, (1.3) is considered on D and the differential operator κ 2 − ∆ is augmented with appropriate boundary conditions. The resulting SPDE is then solved approximately by means of a FEM. Due to the implementation in the R-INLA software, this approach has become widely used, see (Bakka et al., 2018) for a comprehensive list of recent applications.
However, the restriction 2β ∈ N implies a significant limitation for the flexibility of the method. In particular, it is therefore not directly applicable to the important special case of exponential covariance (ν = 1/2) on R 2 , where β = 3/4. In addition, restricting the value of β complicates estimating the smoothness of the process from data. In fact, β typically is fixed when the method is used in practice, since identifying the value of 2β ∈ N with the highest likelihood requires a separate estimation of all the other parameters in the model for each value of β. A common justification for fixing β is to argue that it is not practicable to estimate the smoothness of a random field from data. However, there are certainly applications for which it is feasible to estimate the smoothness. We provide an example of this in §7. Furthermore, having the correct smoothness of the model is particularly important for interpolation, and the fact that the Matérn model allows for estimating the smoothness from data was the main reason for why Stein (1999) recommended the model.
The rational SPDE approach presented in this work facilitates an estimation of β from data by providing an approximation of u which is computable for all fractional powers β > d/4 (i.e., ν > 0), where d ∈ N is the dimension of the spatial domain D ⊂ R d . It thus enables to include this parameter in likelihoodbased (or Bayesian) parameter estimation for both stationary and non-stationary models. Although the SPDE approach has been considered in the non-fractional case also for non-stationary models, Lindgren et al. (2011) showed convergence of the approximation only for the stationary model (1.3). Our analysis in §3 closes this gap since we consider the general model (1.2) which covers the non-stationary case and several other previously proposed generalizations of the Matérn model.
The structure of this article is as follows: We briefly review existing methods for the SPDE approach in the fractional case in §2. In §3 the rational SPDE approximation is introduced and a result on its strong convergence is stated. The procedure of applying the rational SPDE approach to statistical inference is addressed in §4. §5 contains numerical experiments which illustrate the accuracy of the proposed method. The identifiability of the parameters in the Matérn SPDE model (1.3) is discussed in §6, where we derive necessary and sufficient conditions for equivalence of the induced Gaussian measures. In §7 we present an application to climate data, where we consider fractional and non-fractional models for both stationary and non-stationary covariances. We conclude with a discussion in §8. Finally, the supplementary material contains four appendices providing details about (A) the finite element discretization, (B) the convergence analysis, (C) a comparison with the quadrature method by Bolin et al. (2018a), and (D) the equivalence of Gaussian measures. The method developed in this work has been implemented in the R (R Core Team, 2017) package rSPDE (Bolin, 2019).

The SPDE approach in the fractional case until now
A reason for why the approach by Lindgren et al. (2011) only works for integer values of 2β is given by Rozanov (1977), who showed that a Gaussian random field on R d is Markov if and only if its spectral density can be written as the reciprocal of a polynomial, Since the spectrum of a Gaussian Matérn field is the precision matrix Q will therefore not be sparse unless 2β ∈ N. For 2β / ∈ N, Lindgren et al. (2011) suggested to compute a Markov approximation by choosing m = 2β and selecting the coefficients b = (b 1 , . . . , b m ) so that the deviation between the spectral densities R d w(k)(S(k) − S(k)) 2 dk is minimized. For this measure of deviation, w is some suitable weight function which should be chosen to get a good approximation of the covariance function. For the method to be useful in practice, the coefficients b j should be given explicitly in terms of the parameters κ and ν. Because of this, Lindgren et al. (2011) proposed a weight function that enables an analytical evaluation of the integral, where θ > 0 is a tuning parameter. By differentiating this integral with respect to the parameters and setting the differentials equal to zero, a system of linear equations is obtained, which can be solved to find the coefficients b. The resulting approximation depends strongly on θ, and one could use numerical optimization to find a good value of θ for a specific value of β, or use the choice θ = 2β − 2β , which approximately minimizes the maximal distance between the covariance functions (Lindgren et al., 2011). This method was used for the comparison in (Heaton et al., 2018), and we will use it as a baseline method when analyzing the accuracy of the rational SPDE approximations in later sections.
Another Markov approximation based on the spectral density was proposed by Roininen et al. (2018). These Markov approximations may be sufficient in certain applications; however, any approach based on the spectral density or the covariance function is difficult to generalize to models on more general domains than R d , nonstationary models, or non-Gaussian models. Thus, such methods cannot be used if the full potential of the SPDE approach should be kept for fractional values of β.
There is a rich literature on methods for solving deterministic fractional PDEs (e.g., Bonito and Pasciak, 2015;Gavrilyuk et al., 2004;Jin et al., 2015;Nochetto et al., 2015), and some of the methods that have been proposed could be used to compute approximations of the solution to the SPDE (1.3). However, any deterministic problem becomes more sophisticated when randomness is included. Even methods developed specifically for sampling solutions to SPDEs like (1.3) may be difficult to use directly for statistical applications, when likelihood evaluations, spatial predictions or posterior sampling are needed. For instance, it has been unclear if the sampling approach by Bolin et al. (2018a), which is based on a quadrature approximation for an integral representation of the fractional inverse L −β , could be used for statistical inference. In Appendix C we show that it can be viewed as a (computationally less efficient) version of the rational SPDE approximations developed in this work. Consequently, the results in §4 on how to use the rational SPDE approach for inference apply also to that method. In §5.1 we compare the performance of the two methods in practice within the scope of a numerical experiment.

Rational approximations for fractional SPDEs
In this section we propose an explicit scheme for approximating solutions to a class of SPDEs including (1.3). Specifically, in §3.1- §3.2 we introduce the fractional order equation of interest as well as its finite element discretization. In §3.3 we propose a non-fractional equation, whose solution after specification of certain coefficients approximates the random field of interest. For this approximation, we provide a rigorous error bound in §3.4. Finally, in §3.5 we address the computation of the coefficients in the rational approximation.
3.1. The fractional order equation. With the objective of allowing for more general Gaussian random fields than the Matérn class, we consider the fractional order equation ( , is an open, bounded, convex polytope, with closure D, and W is Gaussian white noise in L 2 (D). Here and below, L 2 (D) is the Lebesgue space of square-integrable real-valued functions, which is equipped with the inner product (w, v) L2(D) := D w(s)v(s) ds. The Sobolev space of order k ∈ N is denoted by H k (D) := {w ∈ L 2 (D) : D γ w ∈ L 2 (D) ∀ |γ| ≤ k} and H 1 0 (D) is the subspace of H 1 (D) containing functions with vanishing trace. We assume that the operator L : D(L) ⊂ L 2 (D) → L 2 (D) is a linear second-order differential operator in divergence form, and uniformly positive definite, i.e., . Furthermore, if Neumann boundary conditions are imposed, then ess inf s∈D κ(s) ≥ κ 0 > 0 holds. If I.-II. are satisfied, the differential operator L in (3.1) induces a symmetric, continuous and coercive bilinear form a L on V , and its domain is given by D(L) = H 2 (D) ∩ V . Furthermore, Weyl's law (see, e.g., Davies, 1995, Thm. 6.3.1) shows that the eigenvalues {λ j } j∈N of the elliptic differential operator L in (3.1), in nondecreasing order, satisfy the spectral asymptotics Thus, existence and uniqueness of the solution u to (1.2) readily follow from Lemma 2.1 and Proposition 2.3 of Bolin et al. (2018a). We formulate this as a proposition.
Proposition 3.1. Let L be given by (3.1) where H and κ satisfy the assumptions I.-II. above and assume β > d/4. Then (1.2) has a unique solution u in L 2 (Ω; L 2 (D)).
The assumptions I.-II. on the differential operator L are satisfied, e.g., by the Matérn operator L = κ 2 − ∆, in which case the condition β > d/4 on the fractional exponent in (1.2) corresponds to a positive smoothness parameter ν, i.e., to a nondegenerate field. Moreover, the equation (1.2) as considered in our work includes several previously proposed non-fractional non-stationary models as special cases, such as the non-stationary Matérn models by Lindgren et al. (2011), the models with locally varying anisotropy by Fuglstad et al. (2015), and the barrier models by Bakka et al. (2019). Thus, Proposition 3.1 shows existence and uniqueness of the fractional versions of all these models, which can be treated in practice by using the results of the following sections.
3.2. The discrete model. In order to discretize the problem, we assume that V h ⊂ V is a finite element space with continuous piecewise linear basis functions {ϕ j } n h j=1 defined with respect to a triangulation T h of the domain D indexed by the mesh width h := max T ∈T h h T , where h T := diam(T ) is the diameter of the element T ∈ T h . Furthermore, the family (T h ) h∈(0,1) of triangulations inducing the finite-dimensional subspaces (V h ) h∈(0,1) of V is supposed to be quasi-uniform, i.e., there exist constants C 1 , C 2 > 0 such that ρ T ≥ C 1 h T and h T ≥ C 2 h for all T ∈ T h and h ∈ (0, 1). Here, ρ T > 0 is radius of largest ball inscribed in T ∈ T h .
The discrete operator L h : We then consider the following SPDE on the finite-dimensional state space V h , of V h which is orthonormal in L 2 (D) and ξ j ∼ N(0, 1) i.i.d. for j = 1, . . . , n h .
We note that the assumptions I.-II. from §3.1 on the functions H and κ combined with the convexity of D imply that the operator L in (3.1) is H 2 (D)-regular, i.e., if f ∈ L 2 (D), then the weak solution u ∈ V to Lu = f satisfies u ∈ H 2 (D) ∩ V , see, e.g., (Grisvard, 2011, Thm. 3.2.1.2) for the case of Dirichlet boundary conditions. By combining this observation with the spectral asymptotics (3.3) we see that the assumptions in Lemmata 3.1 and 3.2 of Bolin et al. (2018a) are satisfied (since then, in their notation, r = s = q = 2 and α = 2/d) and we obtain an error estimate for the finite element approximation . Furthermore, since their derivation requires only that β > d/4, we can formulate this result for all such values of β in the following proposition.
Proposition 3.2. Suppose that β > d/4 and that L is given by (3.1) where H and κ satisfy the assumptions I.-II. from §3.1. Let u, u h be the solutions to (1.2) and (3.4), respectively. Then, there exists a constant C > 0 such that, for sufficiently small h, 3.3. The rational approximation. Proposition 3.2 shows that the mean-square error between u and u h in L 2 (D) converges to zero as h → 0. It remains to describe how an approximation of the random field u h with values in the finite-dimensional state space V h can be constructed.
For β ∈ N one can use, e.g., the iterated finite element method presented in Appendix A to compute u h in (3.4) directly. In the following, we construct approximations of u h if β ∈ N is a fractional exponent. For this purpose, we aim at finding a non-fractional equation such that u R h,m is a good approximation of u h , and where the operator P j,h := p j (L h ) is defined in terms of a polynomial p j of degree m j ∈ N 0 , for j ∈ { , r}. Since the so-defined operators P ,h , P r,h commute, this will lead to a nested SPDE model of the form In practice, we first choose a degree m ∈ N and then set In this case, the solution u R m of (3.7) has the same smoothness as the solution v of the non-fractional equation L β v = W, if β ≥ 1, and as v in Lv = W, if β < 1. Note that, for fixed h, the degree m controls the accuracy of the approximation u R h,m . We now turn to the problem of defining the non-fractional operators P ,h and P r,h in (3.5). In order to compute u h in (3.4) directly, one would have to apply the discrete fractional inverse L −β h to the noise term W h on the right-hand side. Therefore, a first idea would be to approximate the function x −β on the spectrum of L h by a rational function r and to use r(L h )W h as an approximation of u h . This is, in essence, the approach proposed by Harizanov et al. (2018) to find optimal solvers for the problem L β x = f , where L is a sparse symmetric positive definite matrix. However, the spectra of L and of L h as h → 0 (considered as operators on L 2 (D)) are unbounded and, thus, it would be necessary to normalize the spectrum of L h for every h, since it is not feasible to construct the rational approximation r on an unbounded interval. We aim at an approximation where in practice the choice of p and p r can be made independent of L h and h. Thus, we pursue another idea.
In contrast to the differential operator L in (3.1), its inverse L −1 : L 2 (D) → L 2 (D) is compact and, thus, the spectra of L −1 and of L −1 h are bounded subsets of the intervals J := 0, λ −1 j=0 b j x j are polynomials of degree m and m + 1, respectively, and use r(x) :=r(x)x m β as an approximation for f . This construction leads (after expanding the fraction with x m ) to a rational approximation pr p of x −β , The operators P ,h , P r,h in (3.5) are defined accordingly, (3.10) Their continuous counterparts in (3.7) are P := p (L) and P r := p r (L). We note that, for (3.8) to hold, any choice m 2 ∈ {0, 1, . . . , m + m β } would have been permissible for the polynomial degree of q 2 , if m is the degree of q 1 . The reason for setting m 2 = m+1 is that this is the maximal choice which is universally applicable for all values of β > d/4. In the following we refer to u R h,m in (3.5) with P ,h , P r,h defined by (3.10) as the rational SPDE approximation of degree m. We emphasize that this approximation relies (besides the finite element discretization) only on the rational approximation of the functionf . In particular, no information about the operator L except for a lower bound of the eigenvalues is needed. In the Matérn case, we have L = κ 2 − ∆ (with certain boundary conditions) and an obvious lower bound of the eigenvalues is therefore given by κ 2 .
3.4. An error bound for the rational approximation. In this subsection we justify the approach proposed in §3.2- §3.3 by providing an upper bound for the strong mean-square error u − u R h,m L2(Ω;L2(D)) . Here u and u R h,m are the solutions of (1.2) and (3.5) and the rational approximation u R h,m is constructed as described in §3.3, assuming thatr =r h is the L ∞ -best rational approximation off (x) = x β−m β on the interval J h for each h. This means thatr h minimizes the error in the supremum norm on J h among all rational approximations of the chosen degrees in numerator and denominator. How such approximations can be computed is discussed in §3.5.
The theoretical analysis presented in Appendix B results in the following theorem, showing strong convergence of the rational approximation u R h,m to the exact solution u.
Theorem 3.3. Suppose that β > d/4 and that L is given by (3.1) where H and κ satisfy the assumptions I.-II. from §3.1. Let u, u R h,m be the solutions to (1.2) and (3.5), respectively. Then, there is a constant C > 0, independent of h, m, such that, for sufficiently small h, Remark 3.4. In order to calibrate the accuracy of the rational approximation with the finite element error, one can choose m ∈ N such that e −2π √ |β−m β |m ∝ h 2 max{β, 1} . The strong rate of mean-square convergence is then min{2β − d/2, 2}.
Remark 3.5. If the functions H and κ of the operator L in (3.1) are smooth, H ∈ C ∞ (D) d×d and κ ∈ C ∞ (D) (as, e.g., in the Matérn case) and if the domain D has a smooth boundary, the higher-order strong mean-square convergence rate min{2β − d/2, p + 1} can be proven for a finite element method with continuous basis functions which are piecewise polynomial of degree at most p ∈ N. Thus, for β > 1, finite elements with p > 1 may be meaningful.
3.5. Computing the coefficients of the rational approximation. As explained in §3.3, the coefficients {c i } m i=0 and {b j } m+1 j=0 needed for defining the operators P ,h , P r,h in (3.10) are obtained from a rational approximationr =r h of f (x) = x β−m β on J h . For each h, this approximation can, e.g., be computed with the second Remez algorithm (Remez, 1934), which generates the coefficients of the L ∞ -best approximation. The error analysis for the resulting approximation u R h,m in (3.5) was performed in §3.4. Despite the theoretical benefit of generating the L ∞ -best approximation, the Remez algorithm is often unstable in computations and, therefore, we use a different method in our simulations. However, versions of the Remez scheme were used, e.g., by Harizanov et al. (2018).
A simpler and computationally more stable way of choosing the rational approximation is, for instance, the Clenshaw-Lord Chebyshev-Padé algorithm (Baker and Graves-Morris, 1996). To further improve the stability of the method, we will rescale the operator L so that its eigenvalues are bounded from below by one, which for the Matérn case corresponds to reformulating the SPDE (1.3) as (Id−κ −2 ∆) β ( τ u) = W and using L = Id − κ −2 ∆, where Id denotes the identity on L 2 (D) and τ := κ 2β τ .
In order to avoid computing a different rational approximationr for each finite element mesh width h, in practice we compute the approximationr only once on the interval J * := [δ, 1], where δ ∈ (0, 1) should ideally be chosen such that J h ⊂ J * for all considered mesh sizes h. For the numerical experiments later, we will use δ = 10 −(5+m)/2 when computing rational approximations of order m, which gives acceptable results for all values of β. As an example, the coefficients computed with the Clenshaw-Lord Chebyshev-Padé algorithm on J * for the case of exponential covariance on R 2 are shown in Table 1.

Computational aspects of the rational approximation
In the non-fractional case, the sparsity of the precision matrix for the weights u in (1.1) facilitates fast computation of samples, likelihoods, and other quantities of interest for statistical inference. The purpose of this section is to show that the rational SPDE approximation proposed in §3 preserves these good computational properties.
The representation (3.6) shows that u R h,m can be seen as a Markov random field x h,m , transformed by the operator P r,h . Solving this latent model as explained in Appendix A, yields an approximation of the form (1.1), where Σ u = P r Q −1 P r . Here P , P r ∈ R n h ×n h correspond to the discrete operators P ,h and P r,h in (3.10), respectively. The matrix Q := P C −1 P is sparse if the mass matrix C with respect to the finite element basis {ϕ j } n h j=1 is replaced by the diagonal lumped mass matrix C, see Appendix A. By defining x ∼ N(0, Q −1 ), we have u = P r x, which is a transformed Gaussian Markov random field (GMRF). Choosing x as a latent variable instead of u thus enables us to use all computational methods, which are available for GMRFs (see Rue and Held, 2005), also for the rational SPDE approximation.
As an illustration, we consider the following hierarchical model, with a latent field u which is a rational approximation of (1.2), where u is observed under i.i.d. Gaussian measurement noise ε i ∼ N(0, σ 2 ). Given that one can treat this case, one can easily adapt the method to be used for inference in combination with MCMC or INLA (Rue et al., 2009) for models with more sophisticated likelihoods.
Defining the matrix A with elements A ij = ϕ j (s i ) and the vector y = (y 1 , . . . , y N ) gives us the discretized model In this way, the problem has been reduced to a standard latent GMRF model and a sparse Cholesky factorization of Q can be used for sampling x from N(0, Q −1 ) as well as to evaluate its log-density log π x (x). Samples of u can then be obtained from samples of x via u = P r x. For evaluating the log-density of u, log π u (u), the relation log π u (u) = log π x (P −1 r u) can be exploited. Furthermore, the posterior distribution of x is given by x|y ∼ N µ x|y , Q −1 x|y , where µ x|y = σ −2 Q −1 x|y P r A y and Q x|y = Q + σ −2 P r A AP r is a sparse matrix. Thus, simulations from the distribution of x|y, and evaluations of the corresponding log-density log π x|y (x), can be performed efficiently via a sparse Cholesky factorization of Q x|y . Finally, the marginal data log-likelihood is proportional to We therefore conclude that all computations needed for statistical inference can be facilitated by sparse Cholesky factorizations of P and Q x|y .
Remark 4.1. From the specific form of the matrices P and P r addressed in Appendix A, we can infer that the number of non-zero elements in Q x|y for a rational SPDE approximation of degree m will be the same as the number of non-zero elements in Q x|y for the standard (non-fractional) SPDE approach with β = m + m β . Thus, also the computational cost will be comparable for these two cases.
Remark 4.2. The matrix Q x|y can be ill-conditioned for m > 1 if a FEM approximation with piecewise linear basis functions is used. The numerical stability for large values of m can likely be improved by increasing the polynomial degree of the FEM basis functions, see also Remark 3.5.

Numerical experiments
5.1. The Matérn covariance on R 2 . As a first test, we investigate the performance of the rational SPDE approach for Gaussian Matérn fields, without including the finite element discretization in space.
The spectral density S of the solution to (1.3) on R 2 is given by (2.1), whereas the spectral density for the non-discretized rational SPDE approximation u R m in (3.7) is (5.1) We compute the coefficients as described in §3.5. To this end, we apply an implementation of the Clenshaw-Lord Chebyshev-Padé algorithm provided by the Matlab package Chebfun (Driscoll et al., 2014). By performing a partial fraction decomposition of (5.1), expanding the square, transforming to polar coordinates, and using the equality we are able to compute the corresponding covariance function C R (h) analytically.
Here, J 0 is a Bessel function of the first kind and K 0 is a modified Bessel function of the second kind. To measure the accuracy of the approximation, we compare C R (h) to the true Matérn covariance function C(h) for different values of ν, where κ = √ 8ν is chosen such that the practical correlation range r = √ 8ν/κ equals one in all cases.
To put the accuracy of the rational approximation in context, the Markov approximation by Lindgren et al. (2011) and the quadrature method by Bolin et al. (2018a) are also shown. For the quadrature method, K = 12 quadrature nodes are used, which results in an approximation with the same computational cost as a rational approximation of degree m = 11, see Appendix C. Figure 1 shows the Here, C a is the covariance function obtained by the respective approximation method. Already for m = 3, the rational approximation performs better than both the Markov approximation and the quadrature approximation for all values of ν. It also decreases the error for the case of an exponential covariance by several orders of magnitude.
All methods are exact when ν = 1, since this is the non-fractional case. The Markov and rational methods show errors decreasing to zero as ν = 1, whereas the error of the quadrature method has a singularity at ν = 1. The performance of the quadrature method can be improved (although not the behavior near ν = 1) by increasing the number of quadrature nodes, see Appendix C. This is reasonable if the method is needed only for sampling from the model, but implementing this method for statistical applications, which require kriging or likelihood evaluations, is not feasible since the computational costs then are comparable to the standard SPDE approach with β = K.
Finally, it should be noted that the Markov method also is exact at ν = 2 (β = 1.5) since the spectrum of the process then is the reciprocal of a polynomial. The rational and quadrature methods cannot exploit this fact, since these approximations are based on the corresponding differential operator instead of the spectral density. This is the prize that has to be paid in order to formulate a method which works not only for the stationary Matérn fields but also for non-stationary and non-Gaussian models.  (34) 18 (57) 6.3 (18) 11 (35) 18 (53) Table 2. Covariance errors (×100) and computing times in seconds (×100) for sampling from the rational SPDE approximation u (with β = 3/4) and, in parentheses, for evaluating log |Q x|y |. These values are also given for the standard SPDE approach with β = 2, 3, 4.
for Matérn fields with arbitrary smoothness. However, as for the standard SPDE approach, we need to discretize the problem in order to be able to use the method in practice, e.g., for inference. This induces an additional error source, which means that one should balance the two errors by choosing the degree m of the rational approximation appropriately with respect to the FEM error. A calibration based on the theoretical results has been suggested in Remark 3.4. In this section we address this issue in practice and investigate the computational cost of the rational SPDE approximation. As a test case, we compute approximations of a Gaussian Matérn field with unit variance and practical correlation range r = 0.1 on the unit square in R 2 . We assume homogeneous Neumann boundary conditions for the Matérn operator κ 2 − ∆ in (1.3). For the discretization, we use a FEM with a nodal basis of continuous piecewise linear functions with respect to a mesh induced by a Delaunay triangulation of a regular lattice on the domain, with a total of n h nodes. We consider three different meshes with n h = 57 2 , 85 2 , 115 2 , which corresponds to h ≈ r/4, r/6, r/8.
In order to measure the accuracy, we compute the covariances between the midpoint of the domains * and all other nodes in the lattice {s j } n h j=1 for the Matérn field and the rational SPDE approximations and calculate the error similarly to the L 2 -error in §5.1, where Σ u = P r P −1 CP − P r is the covariance matrix of u, see Appendix A. As a consequence of imposing boundary conditions, the error of the covariance is larger close to the boundary of the domain. However, we compare this error to the error of the non-fractional SPDE approach, which has the same boundary effects. As measures of the computational cost, we consider the time it takes to sample u and to evaluate log |Q x|y | for the model (4.2) with σ = 1, when y is a vector of noisy observations of the latent field at 1000 locations, drawn at random in the domain (a similar computation time is needed to evaluate µ x|y ).
The results for rational SPDE approximations of different degrees for the case β = 3/4 (exponential covariance) are shown in Table 2. Furthermore, we perform the same experiment when the standard (non-fractional) SPDE approach is used for β = 2, 3, 4. As previously mentioned in Remark 4.1, the computational cost of the rational SPDE approximation of degree m should be comparable to the standard SPDE approach with β = m + 1. Table 2 validates this claim. One can also note that the errors of the rational SPDE approximations are similar to those of the standard SPDE approach, and that the reduction in error when increasing from m = 2 to m = 3 is small for all cases, indicating that the error induced by the rational approximation is small compared to the FEM error, even for a low degree m. This is also the reason for why, in particular in the pre-asymptotic region, one can in practice choose the degree m smaller than the value suggested in Remark 3.4, which gives m ≈ 6, 7, 8 for β = 3/4 and the three considered finite element meshes.

Likelihood-based inference of Matérn parameters
The computationally efficient evaluation of the likelihood of the rational SPDE approximation facilitates likelihood-based inference for all parameters of the Matérn model, including ν which until now had to be fixed when using the SPDE approach. In this section we first discuss the identifiability of the model parameters and then investigate the accuracy of this approach within the scope of a simulation study.
6.1. Parameter identifiability. A common reason for fixing the smoothness in Gaussian Matérn models is the result by Zhang (2004) which shows that all three Matérn parameters cannot be estimated consistently under infill asymptotics. More precisely, for a fixed smoothness parameter ν, one cannot estimate both the variance of the field, φ 2 , and the scale parameter, κ, consistently. However, the quantity φ 2 κ 2ν can be estimated consistently. The derivation of this result relies on the equivalence of Gaussian measures corresponding to Matérn fields (Zhang, 2004, Theorem 2). The following theorem provides the analogous result for the Gaussian measures induced by the class of random fields specified via (1.3) on a bounded domain. Its proof can be found in Appendix D.
Note that, for D := R d , the parameter τ is related to the variance of the Gaussian random field via φ 2 = Γ(ν)(τ 2 Γ(2β)(4π) d/2 κ 2ν ) −1 . Thus, τ −2 ∝ φ 2 κ 2ν , which means that Theorem 6.1 is in accordance with the result by Zhang (2004). Since the Gaussian measures induced by the operators L 1 = τ (κ 1 + ∆) β and L 2 = τ (κ 2 + ∆) β are equivalent, we will not be able to consistently estimate κ under infill asymptotics. Yet, Theorem 6.1 suggests that it is possible to estimate τ and β consistently. In fact, with Theorem 6.1 available, it is straightforward to show that τ can be estimated consistently for a fixed ν by exploiting the same arguments as in the proof of (Zhang, 2004, Theorem 3 article to show that both ν and τ can be estimated consistently which would also extend the results by Zhang (2004).
6.2. Simulation study. To numerically investigate the accuracy of likelihoodbased parameter estimation using the rational SPDE approach, we again assume homogeneous Neumann boundary conditions for the Matérn operator in (1.3) and consider the standard latent model (4.1) from §4. We take the unit square as the domain of interest, set σ 2 = 0.1, ν = 0.5 and choose κ and τ so that the latent field has variance φ 2 = 1 and practical correlation range r = 0.2. For the FEM, we take a mesh based on a regular lattice on the domain, extended by twice the correlation range in each direction to reduce boundary effects, yielding a mesh with approximately 3500 nodes. As a first test case, we use simulated data from the discretized model. We simulate 50 replicates of the latent field, each with corresponding noisy observations at 1000 measurement locations drawn at random in the domain. This results in a total of 50000 observations, which we use to estimate the parameters of the model. We draw initial values for the parameters at random and then numerically optimize the likelihood of the model with the function fminunc in Matlab. This procedure is repeated 100 times, each time with a new simulated data set.
As a second test case, we repeat the simulation study, but this time we simulate the data from a Gaussian Matérn field with an exponential covariance function instead of from the discretized model. For the estimation, we compute the rational SPDE approximation for the same finite element mesh as in the first test case. To investigate the effect of the mesh resolution on the parameter estimates, we also estimate the parameters using a uniformly refined mesh with twice as many nodes. The average computation time for evaluating the likelihood is approximately 0.16s for the coarse mesh and 0.4s for the fine mesh. This computation time is affine with respect to the number of replicates, and with only one replicate it is 0.09s for the coarse mesh and 0.2s for the fine mesh.
The results of the parameter estimation can be seen in Table 3, where the true parameter values are shown together with the mean and standard deviations of the 100 estimates for each case. Notably, we are able to estimate all parameters accurately in the first case. For the second case, the finite element discretization seems to induce a small bias, especially for the nugget estimate (σ 2 ) that depends on the resolution of the mesh. The bias in the nugget estimate is not surprising since the increased nugget compensates for the FEM error. The bias could be decreased by choosing the mesh more carefully, also taking the measurement locations into account. In practice, however, this bias will not be of great importance, since the optimal nugget for the discretized model should be used.
It should be noted that there are several other methods for decreasing the computational cost of likelihood-based inference for stationary Matérn models. The major advantage of the rational SPDE approach is that it is directly applicable to more complicated non-stationary models, which we will use in the next section when analyzing real data.

Application
In this section we illustrate for the example of a climate reanalysis data set how the rational SPDE approach can be used for spatial modeling.
Climate reanalysis data is generated by combining a climate model with observations in order to obtain a description of the recent climate. We use reanalysis data generated with the Experimental Climate Prediction Center Regional Spectral Model (ECPC-RSM) which was originally prepared for the North American Regional Climate Change Assessment Program (NARCCAP) by means of NCEP/DOE Reanalysis (Mearns et al., 2014(Mearns et al., , 2009. As variable we consider average summer precipitation over the conterminous U.S. for a 26 year period from 1979 to 2004. The average value for each grid cell and year is computed as the average of the corresponding daily values for the days in June, July, and August. In order to obtain data which can be modelled by a Gaussian distribution, we follow Genton and Kleiber (2015) and transform the data by taking the cube root. We then subtract the mean over the 26 years from each grid cell so that we can assume that the data has zero mean and focus on the correlation structure of the residuals. The resulting residuals for the year 1979 are shown in Figure 2.
The 4112 observed residuals for each year are modeled as independent realizations of a zero-mean Gaussian random field with a nugget effect. That is, the measurement Y ij at spatial location s i for year j is modeled as Y ij = u j (s i ) + ε ij , where ε ij ∼ N(0, σ 2 ) are independent, and {u j (s)} j are independent realizations of a zero-mean Gaussian random field u(s). The analysis of Genton and Kleiber (2015) revealed that an exponential covariance model is suitable for a subset of this data set. Because of this, a natural first choice is to use a stationary Matérn model (1.3), either with β = 0.75 (exponential covariance) or with a general β which we estimate from the data. However, since we have data for a larger spatial region than Genton and Kleiber (2015), one would suspect that a non-stationary model for u(s) might be needed. The standard non-stationary model for the SPDE approach, as first suggested by Lindgren et al. (2011) and used in many applications since then, is where β = 1 is fixed. Until now, it has not been possible to use the model (7.1) with fractional smoothness. Therefore, our main question is now: What is more important for this data-the fractional smoothness β or the non-stationary parameters? We thus consider four different SPDE models for u(s). Two of them are non-fractional models, where β = 1 is fixed, and for the other two (fractional) models, we estimate the fractional order β jointly with the other parameters from the data. For both cases, we consider stationary Matérn and non-stationary models, where the latter are formulated via (7.1) with and the same model is used for τ (s). Here, ψ 1 j ( s) := sin(jπ s), ψ 2 j ( s) := cos(jπ s), ψ a (s) is the altitude at location s, and s = ( s 1 , s 2 ) denotes the spatial coordinate after rescaling so that the observational domain is mapped to the unit square. Thus, log κ(s) and log τ (s) are modelled by the altitude covariate and 16 additional Fourier basis functions to capture large-scale trends in the parameters. The altitude covariate and the eight Fourier basis functions ψ k 1 ( s 1 )ψ j ( s 2 ) : j, k, = 1, 2 are shown in Figure 3. We discretize each model with respect to the finite element mesh shown in Figure 2, assuming homogeneous Neumann boundary conditions. The mesh has 5021 nodes and was computed using R-INLA . For the fractional models, we set m = 1 in the rational approximation and, for each model, the model parameters are estimated by numerical optimization of the log-likelihood as described in §4.
The log-likelihood values for the four models can be seen in Table 4. The parameter estimates for the stationary non-fractional (β = ν = 1) model are κ = 0.67,   Table 4. Model-dependent results for (i) the log-likelihood, (ii) the pseudo cross-validation scores (RMSE, CRPS, LS, each ×100) averaged over ten replicates, and (iii) the computational time for one evaluation of the likelihood averaged over 100 computations. τ = 5.44, and σ = 0.014, which implies a standard deviation φ = 0.077 and a practical range ρ = 4.21. The estimates for the fractional model are κ = 0.20, τ = 10.58, σ = 0.012, and β = 0.72, corresponding to φ = 0.081, ρ = 9.21, and a smoothness parameter ν = 0.44. We note that the fractional model has a longer correlation range. This is likely to be caused by the non-fractional model underestimating the range ρ in order to compensate for the wrong local behavior of the covariance function induced by the smoothness parameter ν = 1. Figure 4 shows the estimated marginal standard deviation φ(s) for the two nonstationary models (computed using the estimates of the parameters for κ(s) and τ (s)) and 0.7 contours of the correlation function for selected locations in the domain. The estimate of β for the non-stationary fractional model is 0.723. Also for the non-stationary models, we observe a slightly longer practical correlation range ρ(s) for the fractional model.
To investigate the predictive accuracy of the models, a pseudo cross-validation study is performed. We choose 10% of the spatial observation locations at random, and use the corresponding observations for each year to predict the values at the remaining locations. The accuracy of the four models is measured by the root mean square error (RMSE), the average continuous ranked probability score (CRPS), and the average log-score (LS). This procedure is repeated ten times, where in each iteration new locations are chosen at random to base the predictions on. The average scores for the ten iterations are shown in Table 4. Recall that low RMSE and CRPS values resp. high LS values correspond to good scores.
We observe that the predictive performance of the non-stationary non-fractional (β = 1) model is similar to the stationary fractional model in terms of CRPS, and actually worse in terms of RMSE. This clearly indicates that the data should be analyzed by a fractional model. Although the non-stationary fractional model has a better performance in terms of CRPS and LS than the stationary fractional model, the difference is quite small given that the non-stationary model has 38 parameters, compared to 4 for the stationary model. Thus, the fractional smoothness seems to be the most important aspect for this data. The fact that the rational SPDE approach allows us to make these comparisons and to verify the smoothness parameter, for stationary and non-stationary models, is one of its most important features.

Discussion
We have introduced the rational SPDE approach providing a new type of computationally efficient approximations for a class of Gaussian random fields. These are based on an extension of the SPDE approach by Lindgren et al. (2011) to models with general second-order differential operators of arbitrary order β > d/4 For these approximations, explicit rates of strong convergence have been derived and we have shown how to calibrate the degree of the rational approximation with the mesh size of the FEM to achieve these rates. The results can also be combined with the results in (Bolin et al., 2018b) to obtain explicit rates of weak convergence (convergence of functionals of the random field).
Our approach can, e.g., be used to approximate stationary Matérn fields with general smoothness, and it is also directly applicable to more complicated nonstationary models, where the covariance function may be unknown. A general fractional order of the differential operator opens up for new applications of the SPDE approach, such as to Gaussian fields with exponential covariances on R 2 . For the Matérn model and its extensions, it furthermore facilitates likelihood-based (or Bayesian) inference of all model parameters. The specific structure of the approximation then in turn enables a combination with INLA or MCMC in situations where the Gaussian model is a part of a more complicated non-Gaussian hierarchical model.
We have illustrated the rational SPDE approach for stationary and non-stationary Matérn models. A topic for future research is to apply the approach to other random field models in statistics which are difficult to approximate by GMRFs, such as to models with long-range dependence (Lilly et al., 2017) based on the fractional Brownian motion. Another topic for future research is to modify the fractional SPDE approach by replacing the FEM basis by a multiresolution basis and to compare this approach to other multiresolution approaches such as (Katzfuss, 2017). Finally, it is also of interest to extend the method to non-Gaussian versions of the SPDE-based Matérn models (Wallin and Bolin, 2015), since the Markov approximation considered by Wallin and Bolin (2015) is only computable under the restrictive requirement β ∈ N.

Appendix A. Iterated finite element method
The rational approximation u R h,m of the solution u to (1.2) introduced in §3.3 is defined in terms of the discrete operators P ,h = p (L h ) and P r,h = p r (L h ) via (3.5). Since the differential operator L in (3.1) is of second order, their continuous counterparts P = p (L) and P r = p r (L) in (3.7) are differential operators of order 2(m + m β ) and 2m, respectively. Using a standard Galerkin approach for solving (3.7) would therefore require finite element basis functions {ϕ j } in the Sobolev space H m+m β (D), which are difficult to construct in more than one space dimension. This can be avoided by using a modified version of the iterated Hilbert space approximation method by Lindgren et al. (2011), and in this section we give the details of this procedure.
Recall from §3.2 that V h ⊂ V is a finite element space with continuous piecewise linear basis functions {ϕ j } n h j=1 defined with respect to a regular triangulation T h of the domain D with mesh width h := max T ∈T h diam(T ).
For computing the finite element approximation, we start by factorizing the polynomials q 1 and q 2 in the rational approximationr off (x) = x β−m β in terms of their roots, We use these expressions to reformulate (3.9) as where, again, we have expanded the fraction with x m . This representation shows that we can equivalently define the rational SPDE approximation u R h,m as the solution to (3.5) with P ,h , P r,h redefined as where Id h denotes the identity on V h . We use the formulation of (3.5) as a system outlined in (3.6): First we solve P ,h x h,m = W h and we then compute u R h,m = P r,h x h,m . To this end, we define the functions x k ∈ L 2 (Ω; V h ) for k ∈ {1, . . . , m + m β } iteratively by noting that x m+m β = x h,m . By recalling the bilinear form a L from (3.2) and expanding x k = n h j=1 x kj ϕ j with respect to the finite element basis, we find that the stochastic weights x k = (x k1 , . . . , x kn h ) satisfy where each of these equations hold for i = 1, . . . , n h . Recall from §3.2 that W h is white noise in V h . This entails the distribution where C is the mass matrix with elements C ij = (ϕ j , ϕ i ) L2(D) and, therefore, for every k ∈ {1, ..., m + m β }. Here, the matrix P ,k is defined by where L k := k j=1 I − r 2j C −1 L , with identity matrix I ∈ R n h ×n h , and the entries of L are given by cf. (3.1)-(3.2). In particular, the weights x of x h,m have distribution Note also that for the Matérn case, i.e., L = κ 2 − ∆, we have L = κ 2 C + G, where G is the stiffness matrix with elements G ij = (∇ϕ j , ∇ϕ i ) L2(D) .
To calculate the final approximation u R h,m = P r,h x h,m , we apply a similar iterative procedure. Let u 1 , . . . , u m be defined by x h,m = u m and the weights u k of u k can be obtained from the weights of x h,m via u k = P r,k x, where By (A.1), the distribution of the weights u of the final rational approximation u R h,m is thus given by u ∼ N 0, P r P −1 CP − P r , where P r := P r,m .
To obtain sparse matrices P and P r , we approximate the mass matrix C by a diagonal matrix C with diagonal elements C ii = n h j=1 C ij . The effect of this "mass lumping" was motivated theoretically by Lindgren et al. (2011), and was empirically shown to be small by Bolin and Lindgren (2013).

Appendix B. Convergence analysis
In this section we give the details of the convergence result stated in Theorem 3.3. As mentioned in §3.4, we chooser =r h as the L ∞ -best rational approximation of f (x) = x β−m β on the interval J h for each h. We furthermore assume that the operator L in (3.1) is normalized such that λ 1 ≥ 1 and, thus, J h ⊂ J ⊂ [0, 1].
Recall that Proposition 3.2 provides a bound for u − u h L2(Ω;L2(D)) . Therefore, it remains is to estimate the strong error between u R h,m and u h induced by the rational approximation of f (x) = x β . To this end, recall the construction of the rational approximation u R h,m from §3.3: We first decomposed f as f (x) =f (x)x m β , wheref (x) = x β−m β , and then used a rational approximationr = q1 q2 off on the interval J h = λ −1 n h ,h , λ −1 1,h with q 1 ∈ P m (J h ) and q 2 ∈ P m+1 (J h ) to define the approximation r(x) :=r(x)x m β of f . Here, P m (J h ) denotes the set of polynomials q : J h → R of degree deg(q) = m. In the following, we assume thatr =r h is the best rational approximation off of this form, i.e., For the analysis, we treat the two cases β ∈ (0, 1) and β ≥ 1 separately. If β ≥ 1, thenβ := β − m β ∈ [0, 1). Thus, ifr * denotes the best rational approximation off on the interval [0, 1], we find (Stahl, 2003, Theorem 1) where the constantĈ > 0 is continuous inβ and independent of h and the degree m. Since x m β ≤ 1 for all x ∈ J h , we obtain for r h (x) :=r h (x)x m β the same bound, If β ∈ (0, 1), thenβ ∈ (−1, 0) and we let r be the best approximation of f (x) := x |β| on [0, 1]. A rational approximation of f on the different interval where the constant C > 0 depends only on |β|.
Finally, we use again the estimate x m β ≤ 1 on J h to derive By (B.1) and (B.2), we can bound the last term by By (Strang and Fix, 2008, Theorem 6.1) we have λ n h ,h λ n h n 2/d h , for sufficiently small h ∈ (0, 1), where the last bound follows from the Weyl asymptotic (3.3). Finally, n h h −d by quasi-uniformity of the triangulation T h . Thus, we conclude which combined with Proposition 3.2 proves Theorem 3.3.
Appendix C. A comparison to the quadrature approach Bolin et al. (2018a) proposed another method which can be applied to simulate the solution u to (1.2) numerically. The approach therein is to express the discretized equation (3.4) h u h = f can be solved by using non-fractional methods, the focus was on the fractional case β ∈ (0, 1) when constructing the approximative solution. From the Dunford-Taylor calculus (Yosida, 1995, §IX.11) one has in this case the following representation of the discrete inverse, (2015) introduced a quadrature approximation Q β h,k of this integral after a change of variables λ = e −2y and based on an equidistant grid for y with step size k > 0, i.e., Exponential convergence of order O e −π 2 /(2k) of the operator Q β h,k to the discrete fractional inverse L −β h was proven for K − := π 2 4βk 2 and K + := π 2 4(1−β)k 2 . By calibrating the number of quadrature nodes with the number of basis functions in the FEM, an explicit rate of convergence for the strong error of the approximation u Q h,k = Q β h,k W h was derived (Bolin et al., 2018a, Theorem 2.10). Motivated by the asymptotic convergence of the method, it was suggested to choose k ≤ − π 2 4β ln(h) in order to balance the errors induced by the quadrature and by a FEM of mesh size h (Bolin et al., 2018a, Table 1). This corresponds to a total number of K = K − + K + + 1 > 4β ln(h) 2 π 2 (1−β) quadrature nodes. The analogous result for the degree m of the approximation u R h,m is given in Remark 3.4, suggesting the lower bound m ≥ ln(h) 2 π 2 (1−β) , i.e., K = 4βm asymptotically. Furthermore, if we let c j := e 2yj and P Q ,h := K + j=−K − c −β j (Id h + c j L h ) , P Q r,h := 2k sin(πβ) π

Bonito and Pasciak
we find that the quadrature-based approximation u Q h,k can equivalently be defined as the solution to the non-fractional SPDE P Q ,h u Q h,k = P Q r,h W h in D. (C.1) Remark C.1. A comparison of (C.1) with (3.5) illustrates that u Q h,k can be seen as a rational approximation of degree K − + K + , where the specific choice of the coefficients is implied by the quadrature. In combination with the remark above that K = 4βm quadrature nodes are needed to balance the errors, this shows that the computational cost for achieving a given accuracy with the rational approximation from §3.3 is lower than with the quadrature method, since β > d/4.

Appendix D. Parameter identifiability
This section contains the proof of Theorem 6.1. For the proof, we will use the Feldman-Hájek theorem which we restate here from (Da Prato and Zabczyk, 2014, Theorem 2.25) for convenience. Proof of Theorem 6.1. Since the two Gaussian measures have the same mean, we only have to verify conditions I. and III. of Theorem D.1.
We first prove that condition I. can hold only if β 1 = β 2 . To this end, we use the equivalence of condition I. with the existence of two constants c , c > 0 such that see, e.g., (Stuart, 2010, Lemma 6.15), where in our case Let λ ∆ j , j ∈ N, denote the positive eigenvalues (in nondecreasing order, counting multiplicity) of the Dirichlet or Neumann Laplacian −∆ : D(∆) → L 2 (D), where the type of homogeneous boundary conditions is the same as for L 1 and L 2 . By Weyl's law (3.3), there exist constants c,C > 0 such that Furthermore, we let {e j } j∈N denote a system of eigenfunctions corresponding to λ ∆ j j∈N which is orthonormal in L 2 (D). Now assume that β 2 > β 1 and let j 0 ∈ N be sufficiently large such that κ 2 1 < Cj 2/d 0 . Then, we have (κ 2 2 + λ ∆ j ) 2β2 (κ 2 1 + λ ∆ j ) 2β1 > c 2β2 (2C) 2β1 j 4(β2−β1)/d ∀j ∈ N, j ≥ j 0 .