Lower bound estimation of the maximum allowable initial error and its numerical calculation

ABSTRACT In the numerical prediction of weather or climate events, uncertainty in the initial values and/or the prediction model leads to uncertainty in the forecast results. Because the true state is unavailable, studies on this problem mainly focus on three subproblems of predictability: the lower bound of the maximum predictable time, the upper bound of the prediction error, and the lower bound of the maximum allowable initial error. Addressing the lower bound estimation of the maximum allowable initial error, this study first illustrates the shortcoming of the existing estimation, and then presents a new estimation based on the initial observation precision and proves it theoretically. Furthermore, the new lower bound estimations for both the two-dimensional Ikeda model and the Lorenz96 model are obtained by using the CNOP (conditional nonlinear optimal perturbation) method and a PSO (particle swarm optimization) algorithm, and the precision of the estimations is also analyzed. In addition, the estimations yielded by the existing and new formulas are compared; the results show that the estimations produced by the existing formula are often incorrect.


Introduction
In numerical weather and climate prediction, the prediction result is uncertain because of uncertainty in the initial values and the model. Accordingly, Lorenz (1975) classified predictability problems into two types: the uncertainty of the forecast results caused by the initial condition error, and that caused by the model error. Numerical weather prediction is essentially an initial and boundary value problem of partial differential equations (Kalnay 2002), and initial values are generally provided by observations or background values. The initial error is inevitable because of the limited precision of observing instruments, discretization errors of the model, data loss in the initial value pretreatment, and so on. In addition, the absence of descriptions of some physical processes and errors in certain parameters make the model inaccurate and imperfect when describing atmospheric or oceanic movements, which gives rise to model error. Mu, Duan, and Wang (2002) divided predictability problems into three subproblems: problems associated with the maximum predictable time, the prediction error, and the maximum allowable initial error (and parameter error). In actual numerical weather prediction the true atmospheric state is not available; however, although there is an error between the initial value and the true value, the quality of observations can be controlled within a certain range of precision. Taking these factors into account, Mu, Duan, and Wang (2002) further reduced the three subproblems to the lower bound of the maximum predictable time, the upper bound of the prediction error, and the lower bound of the maximum allowable initial error. Although the estimations for these three subproblems are only lower or upper bounds, they provide important guidance for actual numerical weather prediction.
In studying the three reduced predictability subproblems, the programming of numerical experiments has always been an important issue because the related numerical results are indispensable. In previous research on the three predictability problems, some numerical experiments were based on the filtering method (Mu, Duan, and Wang 2002), i.e., dividing the feasible region into an equally spaced cubic mesh and evaluating the objective function at each mesh point. However, this method is only applicable to theoretical studies of simple models. As the division interval decreases, the amount of calculation increases sharply (Duan and Luo 2010), so the method cannot be realized in complex models. Conditional Nonlinear Optimal Perturbation (CNOP) is an initial perturbation that satisfies given constraints and has the largest nonlinear evolution at the prediction moment. For predictability problems, CNOP is the initial error that leads to the maximum prediction error at the prediction moment (Mu and Duan 2003). Considering the physical meaning of CNOP, Duan and Luo (2010) first applied the CNOP method to solve the upper and lower bounds. Further, Zheng et al. (2017) gave a lower bound of the maximum predictable time and an upper bound of the prediction error by combining a particle swarm optimization (PSO) algorithm with CNOP.
CONTACT ZHENG Qin qinzheng@mail.iap.ac.cn
There are many studies on the first and second subproblems, but few on the third. In particular, the existing lower bound estimation of the maximum allowable initial error guarantees that the prediction error of the initial analysis stays within the allowable prediction error at the given prediction time, but it is not a correct lower bound. This study presents a new lower bound estimation and validates it theoretically and numerically.
The paper is organized as follows: Section 2 describes the three predictability problems, giving a lower bound estimation of the maximum allowable initial error and its proof. Section 3 investigates the definition of CNOP and the two forecast models. In Section 4 we compare the performances of the two estimations in the two-dimensional Ikeda model and Lorenz96 model. A conclusion and discussion are presented in Section 5.

Three problems of predictability
This paper assumes that the model is perfect; we focus on the initial error and ignore parameter errors. Denote $M_t$ as the propagator that propagates the state from the initial time to time $t$, and let $u_0^t$ and $u_t^t$ be the true values of the state at the initial time and at time $t$, respectively; then $u_t^t = M_t(u_0^t)$. The norm $\|\cdot\|_A$ is the 2-norm in this paper.
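Problem 1. Assume that the initial analysis $u_0^a$ is known and $M_t$ is the propagator that propagates the state from the initial time 0 to time $t$. For any given prediction error $\varepsilon > 0$, we call $T$ an allowable prediction time if
$$\|M_T(u_0^a) - u_T^t\|_A \le \varepsilon. \tag{1}$$
Mu, Duan, and Wang (2002) defined the maximum predictable time:
$$T_\varepsilon = \max\{\tau \mid \|M_t(u_0^a) - u_t^t\|_A \le \varepsilon,\ 0 \le t \le \tau\}, \tag{2}$$
where $\tau$ is an allowable prediction time.
Since the true value cannot be obtained exactly, it is impossible to obtain the exact value of $T_\varepsilon$ by solving this nonlinear optimization problem (Mu, Duan, and Wang 2002). When the initial analysis meets the quality control
$$\|u_0^t - u_0^a\|_A \le \delta, \tag{3}$$
a lower bound estimation of the maximum predictable time is given as (Mu, Duan, and Wang 2002)
$$T_l = \min_{\|\delta u_0\|_A \le \delta}\{T_{\delta u_0} \mid T_{\delta u_0} = \max \tau,\ \|M_t(u_0^a + \delta u_0) - M_t(u_0^a)\|_A \le \varepsilon,\ 0 \le t \le \tau\}, \tag{4}$$
where $\delta u_0$ is the initial perturbation.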
Problem 2. The meanings of $u_0^a$, $M_T$, and $u_T^t$ are the same as in Problem 1. The prediction error of $u_0^a$ at the given prediction moment $T$ is
$$E = \|M_T(u_0^a) - u_T^t\|_A. \tag{5}$$
As in the above problem, it is impossible to obtain the exact value of $E$. When Equation (3) holds, an upper bound of the prediction error ($E_u$) was given by Mu, Duan, and Wang (2002):
$$E_u = \max_{\delta u_0 \in B_\delta} \|M_T(u_0^a + \delta u_0) - M_T(u_0^a)\|_A, \tag{6}$$
where $B_\delta$ is a sphere with center at 0 and radius $\delta$.
Problem 3. Assuming the initial analysis $u_0^a$ is known, for a given prediction moment $T > 0$ and allowable prediction error $\varepsilon$, the maximum allowable initial error is
$$\delta_{\max} = \max\{\delta \mid \|M_T(u_0) - M_T(u_0^t)\|_A \le \varepsilon,\ u_0 \in B(u_0^t, \delta)\}, \tag{7}$$
where $B(u, r)$ denotes the sphere with center at $u$ and radius $r$. As in the above problems, the exact value of $\delta_{\max}$ cannot be obtained because the true state is unknown. If we know more information about the errors of $u_0^a$, a useful estimation can be derived. Suppose that the initial analysis satisfies
$$\|u_0^t - u_0^a\|_A \le \sigma, \tag{8}$$
where $\sigma$ is the given constraint; then Mu, Duan, and Wang (2002) gave a lower bound estimation of the maximum allowable initial error:
$$\bar{\delta}_{\max} = \max\{\delta \mid \|M_T(u_0) - M_T(u_0^a)\|_A \le \varepsilon,\ u_0 \in B(u_0^a, \delta)\}, \tag{9}$$
and pointed out that $\bar{\delta}_{\max} \le \delta_{\max}$. There are two points to note about $\bar{\delta}_{\max}$:
(1) If $\sigma \le \bar{\delta}_{\max}$, i.e., the error of the initial value is within the range of $\bar{\delta}_{\max}$, then
$$\|M_T(u_0^a) - M_T(u_0^t)\|_A \le \varepsilon,$$
which means that the prediction error of the initial analysis at prediction moment $T$ does not exceed the allowable prediction error.
(2) $\bar{\delta}_{\max}$ is not a lower bound of $\delta_{\max}$, i.e., Equation (9) is not correct, even when $\sigma \le \bar{\delta}_{\max}$. In fact, there is no relation of inclusion between the sphere with its center at $u_0^a$ and radius $\bar{\delta}_{\max}$ and the one with its center at $u_0^t$ and radius $\delta_{\max}$ (Figure 1). We only know that $u_0^t \in B(u_0^a, \bar{\delta}_{\max})$ when $\sigma \le \bar{\delta}_{\max}$. Equation (9) is correct only if $B(u_0^a, \bar{\delta}_{\max}) \subset B(u_0^t, \delta_{\max})$, and this cannot be concluded from Equations (7), (8), and (9). We cannot even tell whether $u_0^a \in B(u_0^t, \delta_{\max})$. The results of the numerical experiments will be given in Section 4.
We give a lower bound of $\delta_{\max}$ according to the definition of the maximum allowable initial error:
$$\delta^1_{\max} = \max\{\delta \mid \|M_T(u_0) - M_T(u_0^*)\|_A \le \varepsilon,\ u_0 \in B(u_0^a, \delta),\ u_0^* \in B(u_0, \delta)\}. \tag{10}$$
If $\sigma \le \delta^1_{\max}$, then $u_0^t \in B(u_0^a, \delta^1_{\max})$. According to the expression of $\delta^1_{\max}$, for any $u_0^* \in B(u_0^t, \delta^1_{\max})$, $\|M_T(u_0^t) - M_T(u_0^*)\|_A \le \varepsilon$, and hence $\delta^1_{\max}$ is a lower bound of $\delta_{\max}$. However, Equation (10) requires searching all the points around every point that is around $u_0^a$. The calculation cost is so high that algorithms are difficult to design.
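Considering Equation (10), every $u_0^*$ satisfies $u_0^* \in B(u_0^a, 2\delta)$. Let $u_0^*$ satisfy $\|M_T(u_0^*) - M_T(u_0^a)\|_A \le \varepsilon/2$; then, by the triangle inequality of the norm, for any $u_0 \in B(u_0^a, \delta)$ that satisfies the same bound,
$$\|M_T(u_0) - M_T(u_0^*)\|_A \le \|M_T(u_0) - M_T(u_0^a)\|_A + \|M_T(u_0^a) - M_T(u_0^*)\|_A \le \varepsilon.$$
Therefore, we give another lower bound estimation, $\tilde{\delta}_{\max}$: if $\sigma \le \tilde{\delta}_{\max}$, then
$$\tilde{\delta}_{\max} = \max\{\delta \mid \|M_T(u_0) - M_T(u_0^a)\|_A \le \varepsilon/2,\ u_0 \in B(u_0^a, 2\delta)\} \le \delta_{\max},$$
where $B(u_0^a, 2\delta)$ is a sphere with center at $u_0^a$ and radius $2\delta$.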
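Proof: From Figure 2, for any $u_0 \in B(u_0^a, 2\tilde{\delta}_{\max})$,
$$\|M_T(u_0) - M_T(u_0^a)\|_A \le \varepsilon/2. \tag{13}$$
Since $\|u_0^t - u_0^a\|_A \le \sigma \le \tilde{\delta}_{\max}$, we have $u_0^t \in B(u_0^a, \tilde{\delta}_{\max}) \subset B(u_0^a, 2\tilde{\delta}_{\max})$. Draw the inscribed circle $B(u_0^t, \delta')$ of $B(u_0^a, 2\tilde{\delta}_{\max})$ with $u_0^t$ as its center. Because $\|u_0^t - u_0^a\|_A \le \sigma \le \tilde{\delta}_{\max}$, it follows that $\tilde{\delta}_{\max} \le \delta'$. Combining Equation (13) with the triangle inequality of the norm, for any $u_0 \in B(u_0^t, \delta')$,
$$\|M_T(u_0) - M_T(u_0^t)\|_A \le \|M_T(u_0) - M_T(u_0^a)\|_A + \|M_T(u_0^a) - M_T(u_0^t)\|_A \le \varepsilon. \tag{14}$$
Combining Equation (7) and Equation (14), we obtain $\tilde{\delta}_{\max} \le \delta' \le \delta_{\max}$. Without much difficulty, we can also prove that $\tilde{\delta}_{\max} \le \delta^1_{\max}$. Although $\tilde{\delta}_{\max}$ is no larger than $\delta^1_{\max}$, it is easy to solve with existing optimization algorithms.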
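The new estimate can be approximated numerically by a one-dimensional search over the radius. The following is a minimal sketch, not the authors' implementation: it assumes a two-dimensional state, replaces the inner maximization by a brute-force polar grid (in the spirit of the filtering method), and uses bisection on the radius. The function names `max_error_in_disc` and `new_lower_bound` and all grid sizes are hypothetical. Because a grid can miss the true maximum, the returned value may slightly overestimate the bound.

```python
import numpy as np

def max_error_in_disc(model, u_a, radius, n_r=20, n_theta=72):
    """Approximate max ||M(u) - M(u_a)|| over the disc B(u_a, radius) on a polar grid."""
    base = model(u_a)
    worst = 0.0
    for r in np.linspace(0.0, radius, n_r):
        for theta in np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False):
            u = u_a + r * np.array([np.cos(theta), np.sin(theta)])
            worst = max(worst, float(np.linalg.norm(model(u) - base)))
    return worst

def new_lower_bound(model, u_a, eps, delta_hi, tol=1e-6):
    """Largest delta in [0, delta_hi] whose disc B(u_a, 2*delta) keeps the error <= eps/2.

    Bisection is used because the maximum over nested discs is
    non-decreasing in delta (up to grid effects).
    """
    if max_error_in_disc(model, u_a, 2.0 * delta_hi) <= eps / 2.0:
        return delta_hi
    lo, hi = 0.0, delta_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if max_error_in_disc(model, u_a, 2.0 * mid) <= eps / 2.0:
            lo = mid   # radius mid is still admissible
        else:
            hi = mid   # radius mid violates the eps/2 constraint
    return lo

if __name__ == "__main__":
    # Toy linear propagator M(u) = 2u, for which the exact answer is eps / 8.
    linear = lambda u: 2.0 * u
    print(new_lower_bound(linear, np.array([0.3, -0.2]), eps=1.0, delta_hi=1.0))
```

For a real model the grid search would be replaced by an optimizer such as PSO, with the grid kept only as a cross-check.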

Related concepts and the forecast model
Nonlinear model

Ikeda model
The Ikeda model was first proposed by Ikeda (1979). The description of the model in the next two paragraphs parallels that of Li, Zheng, and Zhou (2016).
The two-dimensional Ikeda model is adopted as the prediction model:
$$x_{n+1} = 1 + \mu(x_n \cos t_n - y_n \sin t_n), \qquad y_{n+1} = \mu(x_n \sin t_n + y_n \cos t_n), \tag{16}$$
$$t_n = a - \frac{b}{1 + x_n^2 + y_n^2}, \tag{17}$$
where $0 \le \mu \le 1$, $a = 0.4$, and $b = 6$. Equation (16) contains trigonometric functions, and Equation (17) is a fraction whose denominator includes two quadratic components. Thus, the two-dimensional Ikeda model has fairly strong nonlinearity.
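As an illustration, the Ikeda map can be coded in a few lines. This is a minimal sketch using $a = 0.4$ and $b = 6$ from the text; the value `mu=0.9` in `ikeda_propagate` is an assumption chosen only for the example (the text states only $0 \le \mu \le 1$), and the function names are hypothetical.

```python
import math

def ikeda_step(x, y, mu, a=0.4, b=6.0):
    """One iteration of the two-dimensional Ikeda map."""
    t = a - b / (1.0 + x * x + y * y)          # state-dependent rotation angle
    x_next = 1.0 + mu * (x * math.cos(t) - y * math.sin(t))
    y_next = mu * (x * math.sin(t) + y * math.cos(t))
    return x_next, y_next

def ikeda_propagate(x0, y0, n_steps, mu=0.9):
    """Apply the map n_steps times; mu=0.9 is an assumed example value."""
    x, y = x0, y0
    for _ in range(n_steps):
        x, y = ikeda_step(x, y, mu)
    return x, y

if __name__ == "__main__":
    print(ikeda_propagate(0.5, 0.5, n_steps=10))
```

Each step rotates the state by an angle that depends on its distance from the origin, scales it by $\mu$, and shifts it by one unit in $x$, which is the source of the strong nonlinearity noted above.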

Lorenz96 model
The description of the model in the next three paragraphs parallels that of Wang (2007).
The Lorenz96 model is a new dynamic model derived and simplified from the dynamic model of Lorenz (1996). The model still has the characteristics of the Lorenz model and is very sensitive to the initial value.
The governing equation of the model is
$$\frac{dX_i}{dt} = (X_{i+1} - X_{i-2})X_{i-1} - X_i + F,$$
where $i = 1, 2, \cdots, N$ ($N = 40$), $X_i$ is the state of the system, and $F$ is a forcing constant; $F = 8$ is a common value known to cause chaotic behavior. The model is solved using a fourth-order Runge-Kutta scheme with a time step $dt$ of 0.05 dimensionless units, equivalent to a time scale of 6 h in the real integration model.
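The integration described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: it assumes cyclic indexing ($X_{i+N} = X_i$), which is the usual convention for Lorenz96 but is not stated in the text, and the function names are hypothetical.

```python
import numpy as np

def lorenz96_tendency(x, forcing=8.0):
    """dX_i/dt = (X_{i+1} - X_{i-2}) X_{i-1} - X_i + F with cyclic indices (assumed)."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, dt=0.05, forcing=8.0):
    """One classical fourth-order Runge-Kutta step."""
    k1 = lorenz96_tendency(x, forcing)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1, forcing)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2, forcing)
    k4 = lorenz96_tendency(x + dt * k3, forcing)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def integrate(x0, n_steps, dt=0.05, forcing=8.0):
    """Propagate the state n_steps forward; with dt = 0.05 ~ 6 h, 4 steps ~ 1 day."""
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        x = rk4_step(x, dt, forcing)
    return x

if __name__ == "__main__":
    x0 = np.full(40, 8.0)
    x0[19] += 0.01                    # small perturbation of the rest state
    print(integrate(x0, n_steps=12))  # 12 steps, about three days
```

With the stated correspondence of 0.05 time units to 6 h, a three-day forecast corresponds to 12 steps.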

CNOP
CNOP was proposed by Mu and Duan (2003) to study numerical weather and climate predictability; it denotes the initial perturbation that produces the maximum prediction error under a given constraint. The formulation follows Mu and Duan (2003). Denote $M$ as a propagator and the initial value $U_0$ as a state vector, and integrate $U_0$ using $M$ from the initial time to time $t$. For a given norm $\|\cdot\|_A$, we call an initial perturbation $\delta U_0^*$ a CNOP if and only if
$$J(\delta U_0^*) = \max_{\|\delta U_0\|_A \le \delta} J(\delta U_0), \qquad J(\delta U_0) = \|M_t(U_0 + \delta U_0) - M_t(U_0)\|_A,$$
where $\delta$ is the given constraint on the norm $\|\cdot\|_A$. The three predictability problems of Section 2 can therefore be solved by solving for the CNOP. Regarding the choice of algorithm, Zheng et al. (2017) compared the advantages and disadvantages of a traditional gradient algorithm and PSO in finding the CNOP and concluded that the PSO method can capture the CNOP more accurately.
To ensure that the obtained CNOP is accurate when calculating the three predictability problems, we first search for the CNOP with PSO to obtain the solution of the predictability problem, and then use the filtering method to verify it, which effectively shortens the calculation time. Duan and Luo (2010) gave the calculation algorithm and its flow chart.
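To make the constrained maximization concrete, the following is a minimal sketch of a basic global-best PSO that searches for a CNOP of the Ikeda map. It is not the algorithm of Zheng et al. (2017): the inertia and acceleration coefficients (0.7, 1.5, 1.5), the swarm size, the projection onto the constraint ball, and the choice `mu=0.9` are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

def ikeda(u, n_steps, mu=0.9, a=0.4, b=6.0):
    """Propagate a 2-D state with the Ikeda map (mu=0.9 is an assumed value)."""
    x, y = u
    for _ in range(n_steps):
        t = a - b / (1.0 + x * x + y * y)
        x, y = (1.0 + mu * (x * np.cos(t) - y * np.sin(t)),
                mu * (x * np.sin(t) + y * np.cos(t)))
    return np.array([x, y])

def project_to_ball(p, radius):
    """Scale p back onto the ball ||p|| <= radius if it lies outside."""
    norm = np.linalg.norm(p)
    return p if norm <= radius else p * (radius / norm)

def cnop_pso(model, u0, radius, n_particles=30, n_iter=100, seed=0):
    """Maximize J(p) = ||model(u0 + p) - model(u0)|| subject to ||p|| <= radius."""
    rng = np.random.default_rng(seed)
    base = model(u0)
    cost = lambda p: float(np.linalg.norm(model(u0 + p) - base))
    init = rng.uniform(-radius, radius, (n_particles, len(u0)))
    pos = np.array([project_to_ball(p, radius) for p in init])
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), np.array([cost(p) for p in pos])
    best = int(np.argmax(pbest_val))
    g, g_val = pbest[best].copy(), pbest_val[best]
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (g - pos)
        pos = np.array([project_to_ball(p, radius) for p in pos + vel])
        vals = np.array([cost(p) for p in pos])
        improved = vals > pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
        best = int(np.argmax(pbest_val))
        if pbest_val[best] > g_val:
            g, g_val = pbest[best].copy(), pbest_val[best]
    return g, g_val

if __name__ == "__main__":
    u0 = np.array([0.5, 0.5])
    delta_u, error = cnop_pso(lambda u: ikeda(u, 5), u0, radius=0.01)
    print(delta_u, error)
```

The returned value is the largest prediction error found, so it is an approximation of the upper bound $E_u$ from below; a grid check over the same ball is one way to verify it.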
To show that $\bar{\delta}_{\max}$ is not a lower bound estimation of $\delta_{\max}$, we randomly select some values within the sphere centered at the initial value with radius $\bar{\delta}_{\max}$ as simulated true values, and solve the maximum allowable initial error of each of them. Figure 3 shows the results of ($\delta_{\max} - \bar{\delta}_{\max}$) and ($\delta_{\max} - \tilde{\delta}_{\max}$).
None of the cases below the blue line satisfy Equation (11). It is clear from Figure 3 that $\bar{\delta}_{\max}$ is not a lower bound estimation of $\delta_{\max}$. The maximum allowable initial errors of the simulated true values are all greater than $\tilde{\delta}_{\max}$, where $\tilde{\delta}_{\max} = 0.03154$. In the 200 experiments, $\delta_{\max}$ is in the range of 0.1450-0.2066 and its average value is 0.1718, while $\tilde{\delta}_{\max}$ is 18.36% of its average value.
For the Lorenz96 model, we use the following method to select the initial value. First, a set of data is given. Then, we spin it up for 4000 days and randomly select the result of one day, which is spun up for another 365 days. Finally, we use the result of the 365th day as the initial value. When calculating the maximum allowable initial errors of the simulated true values, we set the prediction error to 3.0 and the prediction time to three days. When testing and verifying $\bar{\delta}_{\max}$ and $\tilde{\delta}_{\max}$, the true values are simulated in the same way as for the Ikeda model. Figure 4 shows the result.
None of the cases below the blue line (x-axis) satisfy Equation (11). The maximum allowable initial errors of the simulated true values are all greater than $\tilde{\delta}_{\max}$, where $\tilde{\delta}_{\max} = 0.08570$. In the 100 experiments, $\delta_{\max}$ is in the range of 0.3224-0.3775 and its average value is 0.3498, while $\tilde{\delta}_{\max}$ is 24.50% of its average value.
Figure 4. Results of (a) ($\delta_{\max} - \bar{\delta}_{\max}$) and (b) ($\delta_{\max} - \tilde{\delta}_{\max}$).
It is clear from Figures 3 and 4 that $\bar{\delta}_{\max}$ is not a lower bound estimation of $\delta_{\max}$, whereas $\tilde{\delta}_{\max}$ is. Although $\bar{\delta}_{\max}$ is not a lower bound estimation of $\delta_{\max}$, it still plays an important role in practical applications: for every initial value we can calculate its $\bar{\delta}_{\max}$ and compare it with the accuracy $\sigma$ of the analysis value, and if $\bar{\delta}_{\max} \ge \sigma$ we can be sure that the prediction error of the initial value is less than $\varepsilon$. It is important to note, however, that the computation time and memory will increase because this must be done for every initial value or observation.
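As a quick arithmetic check, the percentages quoted for the two models follow directly from the reported estimates and averages (only values stated in the text are used):

$$\frac{0.03154}{0.1718} \approx 0.1836 \ \ (\text{Ikeda}), \qquad \frac{0.08570}{0.3498} \approx 0.2450 \ \ (\text{Lorenz96}).$$

In both models the new estimate is therefore roughly one fifth to one quarter of the average maximum allowable initial error of the simulated true values.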

Discussion and conclusion
Addressing the third subproblem of predictability, we analyze the deficiencies of the existing lower bound estimation of the maximum allowable initial error. According to the definition of the maximum allowable initial error, a new lower bound estimation based on the initial observation precision is presented and proven. The numerical calculation of the new estimation is carried out by using the CNOP method and a PSO algorithm for both the Ikeda and Lorenz96 models. The estimations yielded by the existing and new formulas are compared to demonstrate the shortcomings of the existing estimation.
The results show that $\bar{\delta}_{\max}$ is not a lower bound estimation of $\delta_{\max}$, but $\tilde{\delta}_{\max}$ is. Although $\bar{\delta}_{\max}$ is not a lower bound estimation of $\delta_{\max}$, it still plays an important role in theoretical research and practical applications. It can estimate the range of the maximum allowable initial error, although the computation time and memory will increase because it must be calculated for every initial state. As for $\tilde{\delta}_{\max}$, the conditions added during the scaling process are stronger, which results in the estimation being much less than the maximum allowable initial error. Combined with the test experiments, we think one reason is that when the prediction time is not very long, the nonlinear effect is not very strong, and the linear effect still plays a large role.
Subsequent work should try to make the estimation more accurate, study its relationship with the maximum allowable initial error in different and more strongly nonlinear models, and obtain the lower bound of the maximum allowable initial error more accurately. Also, considering the high-dimensional characteristics of existing climate or weather models, it is worth exploring whether a PSO algorithm can perform well in high-dimensional optimization problems.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported by the National Natural Science Foundation of China (Grant No. 41331174).