Ensemble transform sensitivity for adaptive observations: a general formulation and its practicable implementation

ABSTRACT The Ensemble Transformation (ET) method and its variant, the ET-based Sensitivity (ETS) method, have been used in adaptive observation studies. However, the solution for the ensemble transformation matrix in the ET and ETS methods is not unique. A general mathematical formulation for the ensemble transformation matrix is derived, and from it a generalized equation for the ETS method. It is proved that the previous ETS formulation is a special implementation of the newly derived general formulation. Another practicable implementation of the general ETS formulation, which avoids calculating the inverse of some matrices, is proposed. This ETS implementation produced physically reasonable statistical sensitivity regions for improving 1–3 day weather forecasts over the eastern U.S.A. and the Beijing region, China.


Introduction
Adaptive observations have been employed in field campaigns to increase the forecast accuracy of high-impact weather events. One of the key factors in adaptive observations is guidance on the data-sensitive regions where additional observations should be collected a few days ahead of the high-impact weather events. In addition to several adjoint-based methods, such as the singular vector method (SVs; Palmer et al. 1998; Buizza and Montani 1999) and the conditional nonlinear optimal perturbation method (Mu, Zhou, and Wang 2009; Wang, Mu, and Huang 2011; Yu et al. 2017), ensemble-based methods, such as the ensemble transformation method (ET; Bishop and Toth 1999, hereafter BT1999) and the ensemble transform Kalman filter method (ETKF; Bishop, Etherton, and Majumdar 2001), are also widely applied in field campaigns (Chang, Zheng, and Raeder 2013; Xie et al. 2013). Compared to the adjoint-based methods, the ensemble-based methods do not require adjoint models and are also less computationally demanding (Ancell and Hakim 2007; Ito and Wu 2013).
Among the ensemble-based methods, the ET method proposed by Bishop and Toth (1999) provides a practicable approach for adaptive observations. Recently, an ET-based sensitivity (ETS) method was proposed by Zhang et al. (2016, hereafter Z2016) to specify sensitive areas for adaptive observations. The ETS method calculates the gradient of the forecast error variance reduction with respect to the analysis error variance reduction. It produces results similar to ET at a lower computational cost because only a single transformation matrix calculation is required. In this letter, it is shown that the solution for the ensemble transformation matrix in the ET and ETS methods is not unique, and thus the previous studies (BT1999 and Z2016) may provide a suboptimal solution. A general mathematical formulation of the ensemble transformation matrix and a general formulation of the ETS method are derived and described. The general ETS formulation provides a basis for further investigating other practical solutions and the uncertainties caused by choosing different transformation matrices.

This paper is organized as follows. In section 2, a precise review of the ET method and a general formulation of the transform matrix are presented. In section 3, a general ETS formulation is described, and it is shown that the ETS formulation in Z2016 is a special implementation of the general solution. Another, more practicable implementation of the general ETS formulation, which has the merit of avoiding a matrix inversion, is then introduced, and its application to some cases is presented. A summary is provided in the final section.

First, a precise description of the ET method (BT1999) is presented before introducing the ETS method. Assume that an ensemble of forecast perturbations (ensemble forecasts minus the ensemble mean) over a period is available.
Let $X^a$ denote the ensemble perturbations at a future sensitivity analysis time before assimilating a set of adaptive observations, $X^v$ the ensemble perturbations at a future verification time, and $Y^a$ the ensemble perturbations after assimilating the adaptive observations. The idea of ensemble transformation is to represent $Y^a$ by introducing a transformation matrix C, such that

$$Y^a = X^a C \tag{1}$$

Here it is assumed that $Y^a$ is a linear transformation of $X^a$. $Y^a (Y^a)^T$ represents the analysis error covariance matrix after assimilating the adaptive observations under evaluation. C is a $K \times K$ matrix, and $X^a$, $X^v$, and $Y^a$ are $M \times K$ matrices, where M is the dimension of an ensemble state vector and K is the number of ensemble members.
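The shapes involved in the transformation can be checked with a minimal NumPy sketch. The sizes, the random data, and the scaled-identity C below are illustrative assumptions, not values from the text; the helper names `perturbations` and `transform` are hypothetical.

```python
import numpy as np

def perturbations(members):
    """Return ensemble perturbations (members minus the ensemble mean).

    members: array of shape (M, K), one column per ensemble member.
    """
    return members - members.mean(axis=1, keepdims=True)

def transform(X_a, C):
    """Apply a K x K transformation matrix C to M x K perturbations."""
    return X_a @ C

rng = np.random.default_rng(0)
M, K = 6, 4                       # hypothetical state size and ensemble size
X_a = perturbations(rng.normal(size=(M, K)))
C = 0.5 * np.eye(K)               # illustrative choice: halve every perturbation
Y_a = transform(X_a, C)           # still M x K
cov_after = Y_a @ Y_a.T           # M x M, equal to X_a C C^T X_a^T
```

With this particular C, every perturbation is halved, so the covariance built from the transformed perturbations is one quarter of the original one. Any ensemble-size normalization is left out of the sketch.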
The transformation matrix C is also used to predict the ensemble-based forecast error covariance at the verification time, $P^v$ (Equation (2)). The C in Equations (1) and (2) is required to approximate a known guess analysis error covariance A after assimilating the set of adaptive observations at the sensitivity analysis time (Equation (3)). A solution for C is given in BT1999 (Equation (4)) and is used to predict the forecast error covariance defined by Equation (2). To estimate the impact of the set of adaptive observations, two experiments are required: a control experiment that does not assimilate that set of observations and a sensitivity experiment that does. Let $A^a_{ctr}$ be the guess error covariance for the control experiment and $A^a_{sen}$ the guess error covariance for the sensitivity experiment, both at the sensitivity analysis time; the C matrices associated with the two experiments then follow from Equation (4) with A replaced by $A^a_{ctr}$ and $A^a_{sen}$, respectively. In real applications, the inverse of the guess analysis error covariance is difficult to obtain. Following BT1999 and Majumdar et al. (2002), only the diagonal elements of the analysis error covariance are considered to obtain the transformation matrix (Equations (5)-(6)).
Here $D^a_{ctr}$ and $D^a_{sen}$ are diagonal matrices that contain only the diagonal elements of the analysis error covariance matrices $A^a_{ctr}$ and $A^a_{sen}$. Given the ensemble transformation matrices defined in Equations (5)-(6), the impact of the targeted data is evaluated by calculating the difference between the two prediction error covariances, $P_{diff}$ (Equation (7)). The ensemble transformation matrix therefore needs to be calculated twice to evaluate the impact of one set of targeted observations. If thousands of potential observation locations or flight patterns for adaptive observations need to be evaluated, then thousands of calculations of the ensemble transformation matrix are required.
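The cost argument can be made concrete with a small sketch of one ET evaluation. It assumes that the BT1999-type solution takes the form $C C^T = [(X^a)^T D^{-1} X^a]^{-1}$ for a diagonal guess covariance D, which is an assumption about equations not reproduced here. The function names, sizes, and data are hypothetical. A pseudo-inverse replaces the inverse because mean-removed perturbations make the K x K matrix singular.

```python
import numpy as np

def cct_diagonal(X_a, d):
    """C C^T for a diagonal guess covariance diag(d).

    Assumed form: C C^T = [X_a^T diag(d)^-1 X_a]^-1, evaluated with a
    pseudo-inverse because mean-removed perturbations are rank deficient.
    """
    return np.linalg.pinv(X_a.T @ (X_a / d[:, None]), rcond=1e-10)

def et_signal(X_a, X_v, d_ctr, d_sen):
    """Trace of P_ctr - P_sen: predicted forecast error variance reduction."""
    # Two transformation-matrix calculations per candidate observation set.
    diff = cct_diagonal(X_a, d_ctr) - cct_diagonal(X_a, d_sen)
    return np.trace(X_v @ diff @ X_v.T)

rng = np.random.default_rng(1)
M, K = 8, 5                                   # hypothetical sizes
raw_a, raw_v = rng.normal(size=(M, K)), rng.normal(size=(M, K))
X_a = raw_a - raw_a.mean(axis=1, keepdims=True)
X_v = raw_v - raw_v.mean(axis=1, keepdims=True)
d_ctr = np.ones(M)                            # control analysis error variances
d_sen = d_ctr.copy()
d_sen[2] *= 0.5      # hypothetical observation halves the variance at point 2
signal = et_signal(X_a, X_v, d_ctr, d_sen)
```

Each call to `et_signal` builds two transformation products, one per experiment, so screening many candidate locations repeats that work for every candidate.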

A general solution of ensemble transformation matrix C
BT1999 described a solution for C (Equation (4)), which was used in the ETS formulation in Z2016. A careful investigation of Equation (3) shows that the solution for the ensemble transform matrix is not unique, because $X^a$ is a non-square matrix and therefore not invertible. The previous applications (BT1999 and Z2016) may thus provide a suboptimal, rather than optimal, solution for targeted observations. A general solution for the ensemble transform matrix is described in this section. Since the ensemble perturbation matrix $X^a$ can include different atmospheric variables, a symmetric $M \times M$ matrix W is introduced to weight each element of the perturbation ensemble. Multiplying Equation (3) by $(X^a)^T W$ from the left and by $W X^a$ from the right gives Equation (8). Assuming that the inverse of the symmetric square matrix $(X^a)^T W X^a$ exists, Equation (8) can be solved for the transformation, which gives Equation (9), the general formulation of the ET matrix. Different choices of the weighting matrix W evidently lead to different ensemble transformation matrices. In particular, if W is equal to the inverse of the guess analysis error covariance matrix A, $W = A^{-1}$, the resulting solution is exactly the same as Equation (4) derived in BT1999. In other words, the ensemble transform matrix defined in BT1999 is a special implementation of the general solution, Equation (9).
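The algebra described in this section can be sketched as follows. The first line is an assumed form of Equation (3), with any ensemble-size normalization absorbed into $X^a$; the remaining lines follow from it by the left and right multiplications and the inversion described in the text. The numbering of the original equations is not reproduced.

```latex
\begin{align}
A &= X^a C C^{T} (X^a)^{T}
  && \text{(assumed form of Equation (3))} \\
(X^a)^{T} W A\, W X^a
  &= \left[(X^a)^{T} W X^a\right] C C^{T} \left[(X^a)^{T} W X^a\right]
  && \text{(multiply by } (X^a)^{T} W \text{ and } W X^a\text{)} \\
C C^{T}
  &= \left[(X^a)^{T} W X^a\right]^{-1} (X^a)^{T} W A\, W X^a
     \left[(X^a)^{T} W X^a\right]^{-1}
  && \text{(general solution)} \\
C C^{T}
  &= \left[(X^a)^{T} A^{-1} X^a\right]^{-1}
  && \text{(special case } W = A^{-1}\text{)}
\end{align}
```

In the last line the middle factor becomes $(X^a)^T A^{-1} A A^{-1} X^a = (X^a)^T A^{-1} X^a$, which cancels one of the two inverse factors.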

A general ETS formulation and its simplified implementation

A general ETS formulation
To objectively measure the error reduction caused by assimilating a set of adaptive observations, a scalar response function is usually defined from the forecast error reduction given by Equation (7). Mathematically, the response function J is the sum of the diagonal elements (the trace) of $P_{diff}$. The vectors a and b, with elements $a_{ii}$ and $b_{ii}$ ($i = 1, \ldots, M$), denote the variance parts of the analysis error matrices $A^a_{ctr}$ and $A^a_{sen}$. The vector $\beta$, with elements $\beta_l$ ($l = 1, \ldots, M$), measures the analysis error reduction ratio caused by assimilating the observations under evaluation. The maximum value of $\beta_l$ is 1.0, which means that the analysis error variance $b_{ll}$ becomes 0 after assimilating those observations.

The transformation matrix can then be rewritten as a function of $\beta$, $C(\beta)$, and J can be written in terms of $C(\beta)$ and a matrix Z with row vectors $z_i$. The idea of the ensemble transform sensitivity (ETS) method is to use the gradient of the forecast error variance with respect to the analysis error variance to determine data-sensitive regions. The gradient of J with respect to the analysis error variance reduction ratio $\beta$ is

$$\frac{\partial J}{\partial \beta} = \left(\frac{\partial J}{\partial \beta_1}, \ldots, \frac{\partial J}{\partial \beta_l}, \ldots, \frac{\partial J}{\partial \beta_M}\right)^T = \frac{1}{K}\left\{\sum_i \left[z_i \frac{\partial\, C(\beta) C(\beta)^T}{\partial \beta_1} z_i^T\right], \ldots, \sum_i \left[z_i \frac{\partial\, C(\beta) C(\beta)^T}{\partial \beta_l} z_i^T\right], \ldots, \sum_i \left[z_i \frac{\partial\, C(\beta) C(\beta)^T}{\partial \beta_M} z_i^T\right]\right\}^T$$

Using the general ensemble transformation matrix, Equations (9) and (11), the general ETS formulation is obtained as Equation (15). Provided that the weighting matrix W and the analysis error matrix do not depend on each other, the ETS signal is proportional to the amplitude of the analysis error variance, because $\partial D^a_{diff}(\beta) / \partial \beta_l = a_{ll}$. In other words, ETS tends to identify regions of larger analysis error variance as data-sensitive areas. This feature is physically meaningful because targeted observations need to be placed in regions where the analysis errors have room to be reduced: the larger the analysis errors in a region, the more room there is to reduce them.
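The proportionality to the analysis error variance can be illustrated numerically. The sketch below is not the paper's Equation (15); it assumes a weighting matrix W that does not depend on $\beta$, a difference matrix $D^a_{diff}(\beta) = \mathrm{diag}(a_l \beta_l)$ (consistent with the derivative quoted in the text), a 1/K normalization of J, and rows of $X^v$ in the role of the $z_i$. The names `ets_gradient`, `response`, `G`, and `B` are hypothetical, and a pseudo-inverse stands in for the inverse.

```python
import numpy as np

def ets_gradient(X_a, X_v, W, a):
    """Gradient of J with respect to beta for a beta-independent W.

    X_a: (M, K) perturbations at the sensitivity analysis time.
    X_v: (M, K) perturbations at the verification time.
    W:   (M, M) symmetric weighting matrix.
    a:   (M,) control analysis error variances.
    """
    K = X_a.shape[1]
    G = np.linalg.pinv(X_a.T @ W @ X_a, rcond=1e-10)   # [X_a^T W X_a]^-1
    B = W @ X_a @ G @ X_v.T
    # dJ/dbeta_l = a_l * sum_i B[l, i]^2 / K
    return a * np.sum(B**2, axis=1) / K

def response(X_a, X_v, W, a, beta):
    """J(beta) = trace(X_v [C C^T]_diff X_v^T) / K with D_diff = diag(a * beta)."""
    K = X_a.shape[1]
    G = np.linalg.pinv(X_a.T @ W @ X_a, rcond=1e-10)
    cct_diff = G @ X_a.T @ W @ np.diag(a * beta) @ W @ X_a @ G
    return np.trace(X_v @ cct_diff @ X_v.T) / K

rng = np.random.default_rng(2)
M, K = 7, 5                                   # hypothetical sizes
raw_a, raw_v = rng.normal(size=(M, K)), rng.normal(size=(M, K))
X_a = raw_a - raw_a.mean(axis=1, keepdims=True)
X_v = raw_v - raw_v.mean(axis=1, keepdims=True)
W = np.diag(rng.uniform(0.5, 2.0, size=M))    # illustrative diagonal weights
a = rng.uniform(0.5, 2.0, size=M)             # illustrative error variances
grad = ets_gradient(X_a, X_v, W, a)
```

Under these assumptions J is linear in $\beta$, so the gradient is obtained from a single matrix product instead of one transformation per candidate location, and each component carries the factor $a_l$ explicitly.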

Relation to the ETS formulation in Z2016
Here we show that the ETS formulation in Z2016 is a special implementation of the general ETS formulation; another practicable implementation is then proposed. When W is equal to $A^{-1}$, Equation (15) reduces to Equation (16), which is the same as the ETS equation described in Z2016 (Equation (15)). This proves that the ETS equation in Z2016 is a special implementation of the general ETS formulation, Equation (15).

A new practicable implementation of the general ETS formulation
In the general ETS formulation, Equation (15), the inverse of the matrix $(X^a)^T W X^a$ needs to be calculated. Here we discuss another implementation in which the weighting matrix is chosen such that $(X^a)^T W X^a = KI$, so that no inverse needs to be calculated; the ETS is then given by Equation (17). In practice, it is problematic to obtain W when $X^a$ includes only a limited number of forecast samples. Thus, only the diagonal elements (error variances) of the matrix $X^a (X^a)^T / K$ are used to approximate the diagonal elements of W, as in BT1999 and Z2016. This gives Equation (18), where $W_d$ denotes the matrix that includes only the diagonal elements of W. Equation (18) is the new practicable implementation of the general ETS formulation proposed in this paper.
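A minimal sketch of an inverse-free sensitivity calculation is given below. It is not necessarily a term-by-term copy of Equation (18): it assumes that $(X^a)^T W X^a$ is replaced by $KI$, that W is replaced by a diagonal $W_d$ supplied as a vector, and that J carries a 1/K normalization. The function name `ets_simplified`, the choice `w_diag = 1 / a`, and the sizes and data are hypothetical.

```python
import numpy as np

def ets_simplified(X_a, X_v, w_diag, a):
    """Inverse-free sensitivity signal, one value per state element.

    X_a, X_v: (M, K) perturbations at analysis and verification times.
    w_diag:   (M,) diagonal of the weighting matrix W_d.
    a:        (M,) analysis error variances.
    """
    K = X_a.shape[1]
    # W_d X_a (I / K) X_v^T: the K x K inverse is replaced by I / K.
    B = (w_diag[:, None] * X_a) @ X_v.T / K
    return a * np.sum(B**2, axis=1) / K

rng = np.random.default_rng(3)
M, K = 9, 6                                   # hypothetical sizes
raw_a, raw_v = rng.normal(size=(M, K)), rng.normal(size=(M, K))
X_a = raw_a - raw_a.mean(axis=1, keepdims=True)
X_v = raw_v - raw_v.mean(axis=1, keepdims=True)
a = np.mean(X_a**2, axis=1)      # ensemble variance as analysis error variance
w_diag = 1.0 / a                 # hypothetical diagonal weights
signal = ets_simplified(X_a, X_v, w_diag, a)
target = int(np.argmax(signal))  # index of the most sensitive state element
```

Because only elementwise scaling and one matrix product are involved, the cost grows with M and K without any matrix inversion, which is the practical advantage the text attributes to this implementation.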

Illustration of ETS
Here the sensitivity regions identified by the ETS formulation (Equation (18)) are illustrated. Forecasts at NCEP are used as inputs. To reduce sampling noise, time-lagged ensembles with 60 and 100 members are used for the U.S.A. and Beijing experiments, respectively. The total energy norm is defined as in Z2016 but includes only wind and temperature at three pressure levels: 200, 500, and 700 hPa. The 3-month calculations cover the whole summer period (June, July, and August) of 2017. Instead of using a constant guess analysis error matrix as in Z2016, the analysis error matrix is approximated by the ensemble spread at the sensitivity analysis time, which is expected to capture some flow-dependent structures of the error covariance.

Figure 1 shows the 3-month averaged sensitivity regions for improving forecasts over Beijing at different lead times. Beijing lies ahead of the pressure ridge at 500 hPa. The sensitivity region with maximum total energy is located within the verification region at 0-day lead time and moves upstream with longer lead times. The main signal at 2-day lead time is located over Mongolia, where conventional meteorological observations are sparse. This indicates that more remote-sensing data, such as satellite radiances, might help to improve 2-day wind and temperature forecasts over the Beijing region. An advanced data assimilation system that can spread observation information from China to improve the analysis over Mongolia might also be helpful.

Figure 2 shows the 3-month averaged sensitivity regions for improving forecasts over the eastern U.S.A. at different lead times. The eastern U.S.A. is dominated by a 500-hPa trough. The sensitivity region with maximum total energy is located within the verification region at 0-day lead time and moves to the northwestern upstream region with longer lead times. The main signals are located over North America, which is rich in conventional observations. This indicates that a regional forecasting system has a good chance of making accurate forecasts over the eastern U.S.A. up to 2 days ahead in summer, provided those data are well quality-screened and assimilated.

Summary
Ensemble-based sensitivity analysis methods are computationally efficient provided that an ensemble of forecasts is already available. The ET method proposed by Bishop and Toth (1999, hereafter BT1999) and its variant, the ET-based sensitivity (ETS) method (Zhang et al. 2016), have been used to specify sensitive areas for adaptive observations. The ETS method increases computational efficiency by calculating the gradient of the forecast error variance reduction with respect to the analysis error variance reduction, which requires only one calculation of the ensemble transformation matrix.
In this paper, it is noted that the solution for the ensemble transformation matrix in the ET and ETS methods is not unique. A general mathematical formulation for the ensemble transformation matrix is derived, and then a general mathematical formulation of the ETS is described. It is proved that the ET formulation in BT1999 and the ETS formulation in Z2016 are special implementations of the general ET and ETS formulations.
Under some conditions, it is shown that the ETS signal is proportional to the amplitude of the analysis error variance. In other words, ETS tends to identify regions of larger analysis error variance as data-sensitive areas. This feature is physically meaningful because targeted observations should be placed in regions where the analysis errors have room to be reduced: the larger the analysis errors in a region, the more room there is to reduce them. Another, more efficient implementation of the ETS formulation, which inherits the above feature and avoids calculating the inverse of some matrices, is proposed. In the practical implementation of this efficient ETS, the flow-dependent ensemble spread is used to represent the analysis error variance, whereas Z2016 used constant analysis error variances that do not vary with location.

Figure 1. Three-month averaged sensitivity regions (shaded) to improve forecasts over Beijing with different lead times: (a) 2 days, (b) 1 day, and (c) 0 day. The three-month averaged ensemble mean geopotential height at 500 hPa is denoted by black contours. The verification region is shown by the rectangle. The signal is rescaled with the maximum value of the total energy in the domain.

Figure 2. Three-month averaged sensitivity regions (shaded) to improve forecasts over eastern regions of the U.S.A. with different lead times: (a) 2 days, (b) 1 day, and (c) 0 day. The three-month averaged ensemble mean geopotential height at 500 hPa is denoted by black contours. The verification region is shown by the rectangle. The signal is rescaled with the maximum value of the total energy in the domain.
The three-month averaged sensitivity regions for improving forecasts during summer 2017 over Beijing, China, and the eastern U.S.A., with forecast lead times of up to two days, are physically reasonable and are located upstream of the verification regions. Moreover, the general formulation provides an opportunity to investigate how different ET and ETS implementations affect the sensitive areas identified for adaptive observations and/or observation network optimization. This can be a subject of future studies.

Disclosure statement
No potential conflict of interest was reported by the authors.