Precipitation data assimilation in WRFDA 4D-Var: implementation and application to convection-permitting forecasts over United States

Precipitation data assimilation has been developed in the Weather Research and Forecasting model data assimilation system (WRFDA) using four-dimensional variational (4D-Var) approach. Unlike other conventional data, precipitation is an integral quantity and it is not included as a control variable in WRFDA. A simplified Kessler scheme is used in tangent linear and adjoint model. Precipitation data are directly assimilated in WRFDA 4D-Var, and the assimilation of precipitation will have feedback to all the control variables via the constraint of the linearized physics package. Firstly, we present single observation experiments to exhibit how dynamic, thermodynamic and moisture fields are adjusted by assimilating rainfall information. Then, the National Centers for Environmental Prediction Stage IV precipitation data are assimilated for a heavy rainfall case on 9 June 2010 at a convection-permitting model setting (4-km). Finally, we conducted one-week experiments to further validate the robustness of the results for precipitation assimilation. Results show that precipitation assimilation has a positive impact on model fields, particularly on the low-level humidity. For the impact on precipitation forecasts, it indicates that precipitation assimilation reduces spin-up time efficiently, removes false alarms and produces model forecast precipitation closer to the observations through the changes in temperature, moisture and wind imposed in the analyses. The impact from precipitation assimilation persists up to three hours on average.


Introduction
In the past few decades, several approaches for assimilating precipitation observations in Numerical Weather Prediction (NWP) models have been developed to improve the model initial states and subsequent short-range forecasts. The first approach is an initialization scheme (or reverse scheme), such as dynamical initialization (Fiorino and Warner, 1981), physical initialization (Krishnamurti et al. 1991) and cumulus convection initialization (Donner, 1988;Kasahara et al., 1992). The basic premise of these methods is to restructure the moisture and temperature fields so that they are consistent with observed precipitation rates. The studies mentioned above all showed that an improved forecast could be obtained after incorporation of the observed rainfall rate. However, the initialization scheme is an indirect way of using precipitation data, and the shortcoming of the conditions, like initialization methods, are also dynamically inconsistent (Zou and Kuo, 1996;Bao and Ronald, 1997).
The third approach is to use variational data assimilation, such as 1D-Var, 2D-Var and 4D-Var. Hou et al. (2004) explored an 1D variational continuous assimilation (VCA) algorithm for assimilating tropical rainfall data using moisture and temperature time tendency corrections as the control variable. A 2D-Var technique (the second dimension being time) based on 12h integrations of the single column version of the European Centre for Medium-Range Weather Forecasts (ECMWF) model was explored by Lopez et al. (2006) to assimilate rain-gauge measurements and total column water vapour retrievals from the Global Positioning System. Zupanski and Mesinger (1995) demonstrated the technical feasibility of assimilating precipitation data using 4D-Var, and showed an improvement of precipitation forecasts in the mid-latitudes using a regional National Meteorological Center (NMC) Eta forecast model and its incomplete adjoint model. In addition, studies (Zou and Kuo, 1996;Guo et al., 2000;Xiao et al., 2000;Zhang et al., 2003;Xu et al., 2006) indicated that precipitation assimilation using the fifthgeneration Mesoscale Model (MM5) 4D-Var leads to a reduction in spin-up time, improves the moisture distributions in model initial conditions and improves the skill of short-range forecasts. Compared to the initialization scheme (such as physical initialization mentioned above), 4D-Var can assimilate precipitation data directly and the major advantage of 4D-Var is that it uses the model dynamics to adjust the model variables to the observed precipitation (Tsuyuki, 1997). The challenge for precipitation assimilation in 4D-Var is to include parameterization schemes of moist processes into tangent linear model and corresponding adjoint model, for these processes are strongly non-linear and discontinuous (Zupanski and Mesinger, 1995;Zou and Kuo, 1996).
Some operational weather services also assimilate precipitation data operationally using 4D-Var method to improve precipitation forecasts (Tsuyuki et al., 2002;Koizumi et al., 2005;Lopez and Bauer, 2007;Lopez, 2011). Tsuyuki et al. (2002) and Koizumi et al. (2005) reported that the Japan Meteorological Agency (JMA) implemented a mesoscale four-dimensional variational assimilation system (Meso 4D-Var) to assimilate precipitation data. The JMA Meso 4D-Var adopted an incremental approach (Courtier et al., 1994) to reduce the computational cost (Tsuyuki et al., 2002). To effectively assimilate the precipitation data, the full-physics non-linear model and a reduced physics adjoint model are used for inner-loop forward and backward integrations (Tsuyuki et al., 2002). They noted the precipitation forecast throughout the 18-h forecast time have been improved, although Meso 4D-Var does not include the adjoint codes of cumulus parameterization. Lopez and Bauer (2007) employed 1D+4D-Var approach to assimilate the National Center for Environmental Prediction(NCEP) Stage IV precipitation estimates at ECMWF. In the first step, temperature and specific humidity profiles are retrieved from surface rainfall rate observations via a 1D-Var approach. In the second step, the total column water vapour derived from the 1D-Var humidity profiles is assimilated in 4D-Var. This means that the rainfall rate information is first converted into humidity information before being used in 4D-Var. Lopez (2011) reported that the direct 4D-Var assimilation of NCEP Stage IV data has been developed and tested in ECMWF's global Integrated Forecasting System. One of the major restrictions of assimilating precipitation data pointed by Lopez (2011) is 'zero rain' issue: wherever the model background has no precipitation while observations have, and, conversely, wherever there is no precipitation in the observations, while the model background is rainy, and this issue restricts precipitation observation usage to the situations where both model background and observations are rainy.
The last approach is to use Ensemble Kalman Filter (EnKF) method to assimilate precipitation. This method does not require linearization of the model as required in variational methods. Several experiments of precipitation assimilation using EnKFs have already been conducted with encouraging results (Miyoshi and Aranami, 2006;Zupanski et al., 2011;Lien et al., 2013). Although 4D-Var and EnKF solve similar estimation problems, they are built around different constraints and thus have different strengths and weaknesses (Skachko et al., 2014).
In this study, we use 4D-Var to assimilate NCEP Stage IV precipitation directly. Different from Lopez (2011) who assimilated precipitation in a global 4D-Var system and presented the impact on US domain and global atmospheric forecast scores, we use a regional data assimilation system and emphasize the impact on a heavy rainfall event with high resolution as well as one week averaged results. The data assimilation system of the Weather Research and Forecasting model (WRFDA) (Huang et al. (2009);Barker et al., 2012;Zhang et al., 2013) is employed. The tangent linear and adjoint codes of the WRF model (called WRFPLUS) used here have been redeveloped by Zhang et al. (2013), which include simplified surface friction and simplified microphysics and cumulus parameterization schemes. The objectives of this paper are, (1) to illustrate how the assimilated precipitation affect the model's dynamic and microphysical fields, (2) explore the impact of precipitation assimilation on analyses and subsequent forecasts using a heavy rainfall case and oneweek averaged results.
The outline of the paper is as follows: WRFDA 4D-Var system and technical details related to precipitation assimilation are described in Section 2. The NCEP Stage IV precipitation data used in assimilation experiments are described in Section 3. Model configurations and experimental design are presented in Section 4. The case description is given in Section 5. In Section 6, we present single observation experiments to exhibit how dynamic, thermodynamic and moisture fields are adjusted by assimilating precipitation data, and then, precipitation assimilation is applied to a heavy rainfall event. One-week experiments from 9 to 15 June 2010 have also been conducted to further validate the robustness of the results on precipitation assimilation. The results are evaluated using the Model Evaluation Tool (MET) developed at the Developmental Testbed Center (DTC). The conclusion is given in Section 7.

WRFDA 4D-Var
The WRFDA 4D-Var employs the incremental method (Courtier et al., 1994) which is commonly used in operational systems. Huang et al. (2009) described the formulation of the WRFDA 4D-Var system and presented preliminary results from real-data 4D-Var experiments. Simplified large-scale condensation, simplified Kessler scheme and cumulus parameterization schemes are included in WRFPLUS (i.e. WRF TLM and ADM) , which makes it possible to add the function of assimilating precipitation data in WRFDA 4D-Var.

A brief description of WRFDA 4D-Var
The objective of 4D-Var is to find the optimal estimate of the true atmospheric state at the analysis time by iteratively minimizing a cost function, where J b , J o , and J c are the background, observation and balancing cost functions, respectively. In J c , a digital filter is included in WRF 4D-Var to remove high-frequency waves in the analysis state. The background and observation cost function term J b and J o are and In the background term J b , x n is the analysis vector of model prognostic variables at the nth outer loop, x b is the background. B is a covariance matrix for background error. In the observation term, K is the total number of time slots in which observations are available. H is the linearized observation operator which transforms variables from gridded analysis space to observation space. M is the tangent linear model. x n−1 is the analysis vector from the previous outer loop. d k is the innovation vector given by d k = H k M k (x n−1 ) − y k . H and M are the non-linear observation operator and simplified WRF non-linear model, respectively, and y is observations. R is the observation error covariance matrix. The conjugate gradient method is used to minimize the cost function.
Control variables used here are u-wind, v-wind, temperature T , surface pressure Ps and pseudo-relative humidity, and additional two cloud control variables are: cloud water and rainwater. The use of u-wind, v-wind as the momentum control variables is available in WRFDA version 3.7 and later. Precipitation is not included as a control variable. The model trajectory of grid scale and sub-grid cumulus precipitation is obtained from nonlinear WRF model. The sum of the model simulated grid scale and sub-grid precipitation is linearly interpolated from the model grid to observation location and compared with the observations to generate the innovations. For each iteration of inner loop minimization, the gradient of the cost function with respect to the control variables at the beginning of the assimilation time window is obtained with the computational cost of one forward integration of the tangent linear model and one backward integration of the adjoint model.
It should be noticed that the assimilation of precipitation observations is different from conventional data assimilation in the following ways. Firstly, rainfall observations are total amounts, which should be compared to the sum of grid-scale and subgridscale precipitation from a model simulation. Secondly, precipitation is not included as a control variable in WRFDA, while the assimilation of precipitation observations will have feedback to all the control variables via the constraint of the linearized version of the physics package. Moreover, the model precipitation is an integral quantity and the same duration of model precipitation should be compared to the observations (Zupanski and Mesinger, 1995;Zou and Kuo, 1996).
When dealing with a new type of observations, it is very important to estimate the observation error. In previous studies, many authors assigned the observed precipitation error empirically based on the source of the observation, the length of the precipitation accumulation period and data preprocessing method. For example, Zupanski and Mesinger (1995) used an observed precipitation error of only 0.001 mm for 24-h accumulated precipitation which is derived from simulated observations. Zou and Kuo (1996) used 0.045 mm for 3-h accumulated precipitation. Guo et al. (2000) used an error of 3 mm for 1h accumulated precipitation. According to a series of experiments, the precipitation error is empirically assigned a value of 1 mm for 1-h accumulated precipitation in this study.

Microphysics in WRFPLUS
There are two microphysics schemes available in WRFPLUS: simplified Kessler scheme (Wang et al., 2013b) and simplified large-scale condensation. The standard Kessler scheme (Kessler, 1969) includes water vapour, cloud water and rainwater. It is a simple warm cloud scheme, in which microphysical processes included are: the production, fall and evaporation of rain; the accretion and auto-conversion of cloud water; and the production of cloud water from condensation (Skamarock et al., 2008). To simplify the Kessler scheme and include it into TLM and ADM, Wang et al. (2013b) eliminated large derivatives of the rainwater evaporation and the rainwater fall speed with respect to rainwater mixing ratio by setting a threshold for rainwater mixing ratio. The simplified Kessler scheme was successfully used in previous studies of radar data assimilation (Sun and Wang, 2013;Wang et al., 2013b). Another microphysics scheme in WRFPLUS is the simplified large-scale condensation scheme. The basic idea of the scheme is to remove super saturation as precipitation, add the latent heat to the thermodynamic equation and moisture is also adjusted. In this study, the simplified Kessler scheme is used for the convective scales precipitation assimilation.

Adjoint and gradient check
The check procedure of WRFDA 4D-Var TLM and ADM  follows the method of Navon et al. (1992). The correctness of the WRF TLM with simplified microphysics and cumulus parameterization scheme is tested using the formula, where F and G denote the non-linear and tangent linear models, the term α is a small scalar. x 0 is the initial model state vector.
For the values of α which are small but not too close to the machine zero, h is a vector of unit length in an arbitrary direction (usually, h is taken to be a vector in the gradient direction), the above value is expected to be close to 1. A summer rainfall case over the CONUS domain is selected and the gradient check is performed by assimilating NCEP Stage IV precipitation and conventional data simultaneously in 1-h time window. Table 1 shows the gradient check results that obtain six satisfactory digits and indicates a good approximation of the gradient. The accuracy of the adjoint can be checked by, where x is the TLM input, y = L(x) is the TLM output, L and L * are the TLM and ADM, respectively. , denote an inner product. We obtained left-hand side 1.54772958977293E+03 and righthand side 1.54772958977292E+03. The two terms are almost

Observations
The set of precipitation observations to be assimilated in this study is NCEP Stage IV precipitation data, which is based on the multi-sensor hourly/6-hourly NCEP Stage III analyses produced by the 12 River Forecast Centers (RFCs) (Fulton et al., 1998;Lin and Mitchell, 2005) in continental United States (CONUS) and benefits from the RFCs manual quality control step. The Stage IV analysis is used at NCEP as input for precipitation assimilation in the North American Mesoscale Model (NAM), as part of the forcing for the land surface model and as verification data source for precipitation forecast (Lin and Mitchell, 2005). Hourly, 6hourly and 24-hourly precipitation data are available on the National Center forAtmospheric Research (NCAR) Cooperative Distributed Interactive Atmospheric Catalog System (CODIAC) (http://data.eol.ucar.edu/codiac/dss/id=21.093).
Conventional data used in this study include land surface, marine surface, radiosonde, pibal and aircraft reports from the Global Telecommunications System (GTS) which originated from a wide variety of sources, and were downloaded from http:// rda.ucar.edu/datasets/ds337.0/.

Forecast model
The Advanced Research Weather Research and Forecasting model (ARW-WRF) (Skamarock et al., 2008) version 3.8 is used as the forecast model. The integration domain of the model covers most of South Central United States (Fig. 1). The horizontal grid spacing is 4 km (541 × 541 grid points), and there are 45 vertical levels with the model top at 30 hPa. The time step is 15 s. The Thompson microphysics scheme (Thompson et al., 2008), the Rapid Radiative Transfer Model for Global Climate Models (RRTMG) (Mlawer et al., 1997;Iacono et al., 2008) longwave and shortwave schemes, the Noah land surface model (Chen and Dudhia, 2001) and the Yonsei University (YSU) boundary layer parameterization (Hong et al., 2006) are used. Cumulus parameterization is turned off.

Data assimilation configurations
The CV7 option (Wang et al., 2013a) is used and the control variables with CV7 in the WRFDA are, u-wind, v-wind, T , Ps and pseudo-relative humidity. The multivariate correlations between control variables are not taken into account in the CV7 option . However, flow dependence and multivariate correlation of forecast errors are implicitly included in 4D-Var through the time integration of the linearized version of the WRF model (Gustafsson, 2007). Using of u-and v-wind as control variables in WRFDA were introduced by Wang et al. (2013a) and studied in detail by Sun et al. (2016) for high-resolution data assimilation. Their results (Wang et al., 2013a;Sun et al., 2016) indicate that use of u-and vwind as control variables are able to further increase precipitation forecast skills compared to use of stream function and velocity potential at convective scale forecasts. For control variables with the CV7 option in WRFDA, same as the default CV5 option, the horizontal correlation of background error covariance (BEC) is represented by recursive filters and the vertical component of the BEC is diagonalized using empirical orthogonal functions (EOFs) (Barker et al., 2003). The BEC are estimated with the National Meteorological Center (NMC) method (Parrish and Derber, 1992) by taking differences between forecasts of different lengths valid at common times. Differences of 24-and 12-h WRF forecasts of the analysis variables valid at the same time for 60 pairs at either 0000 and 1200 UTC over June 2010 were used to compute the BECs. Figure 2 shows the domain averaged horizontal length scales for u, v, temperature and pseudo-relative humidity. The horizontal length scale describes to what extent the horizontal background covariance can spread observation information. The length scale value for u and v are around 40 km for the first EOF mode, while temperature and relative humidity length scale are around 25 km for the first EOF mode, then decreases to about grid spacing around 30th EOF mode, which means that they have more local features represented by a small radius of influence. There are increases in the relative humidity length scales for higher modes (around 10) as also shown in Chen et al. (2013), due to the very small quantity of moisture represented by the higher modes of relative humidity. As length scales determine the spread of analysis increment introduced by the observation, we applied a scaling factor 0.4 to the length scale in precipitation assimilation. As pointed out by Barker et al. (2003), background errors will vary between each application and should ideally be tuned for each domain, which is time-consuming, but important, work. The adjustment of length scale is commonly used for dense data in convective scale data assimilation, such as radar data (Li et al., 2012;Tong et al., 2016). Li et al. (2012) demonstrated reducing horizontal correlation scale in the static covariance derived from the NMC method by a factor of 0.3 produced much more realistic wind increments than the default scale. Tong et al. (2016) reported radar data assimilation with the length scale reduced by a factor of 0.25, and the specific value is based on a series of trial-anderror experiments.

Experimental design
In single observation tests, three experiments are carried out using different precipitation quantities. 6-h forecast initialized from 0.5 • NCEP Global Forecast System (GFS) analysis at 1800 UTC on 8 June 2010 is used as the background. 1-h time window, which is one-side window starting from 0 to 1-h, is used and the time of the observation is assumed at the end of the time window. One outer loop is used. For the first two experiments (SOBS1 and SOBS2), observation location is at 37.95 • N, 95.13 • W and background precipitation is around 56 mm. The first experiment SOBS1 assumes the observation is larger than the background precipitation and innovation is 10 mm, while the second experiment SOBS2 assumes the observation is smaller than the background precipitation and innovation is −10 mm. For the third experiment SOBS3, observation location is at 38.5 • N, 95.13 • W and background precipitation is around 3.2 mm. We assume innovation is −3.2 mm, i.e. zero rain observation is assimilated.
In a real case study, experiments GTS and GTS+RAIN are conducted. Two outer loops are employed and 30 iterations for every outer loop are carried out for the minimization of the cost function in the inner loop. In GTS, the background is the same as single observation tests and all conventional observations are assimilated. The analysis after the GTS assimilation is used as the background for the experiment GTS+RAIN that assimilates precipitation data. GTS and precipitation data are assimilated in separate steps and the length scale is tuned (as Section 4.2 mentioned) when assimilating precipitation. This method, which assimilates broadly distributed coarse-resolution observations in the first step and then assimilate locally distributed highresolution observations in the second step, is found to be advantageous for analysing convective storms (Gao et al., 2013;Li et al., 2015;Tong et al., 2016). The 1-h time window is chosen for precipitation assimilation following Wang et al. (2012) who suggested that, for simplified liquid-only microphysics scheme, a short time window (less than 1-h) is preferable in 4D-Var to reduce the model error and nonlinearity and thus improve the validity of tangent linear model. Because the shortest accumulated NCEP Stage IV data, we can get is hourly data, 1-h time window is used to assimilate one hour accumulated precipitation. Original 4 km NCEP Stage IV gridded data are thinned to 12 km resolution in the assimilation to reduce observation error correlations, which are not considered in the observation error covariance matrix. For precipitation processes mainly on Great Plains (Fig. 1), there is no observation rejection according to orography. Zero rain value is also assimilated. Observations with innovation greater than 50 times observation error will be rejected. Fig. 3. Geopotential height (blue solid contours every 20 gpm), temperature (shaded contours), relative humidity (red line, greater than 60% is shown in 500 hPa, and greater than 80% is shown in 850 hPa) and wind (in knots) at 0000 UTC 9 June 2010 at (a) 500 hPa and (b) 850 hPa. Non-colour areas indicate the 850 hPa surface was underground. We extended the real case study to 1-week (9 June-15 June) experiments to demonstrate the robustness of the improvement after the implementation of precipitation assimilation. There is no cycling assimilation for all the experiments. The short-term forecast from GTS analysis is not used as the background for the next assimilation. Note that less observations are used in our regional data assimilation experiments (e.g. no satellite radiance data) than global operational system such as NCEP GFS, cycling assimilation will lead to systematic bias due to insufficient constraint from observations. For every 0000, 0600, 1200 and 1800 UTC analysis, an ARW-WRF 6 h-forecast is used as the background for experiments GTS and GTS+RAIN, then, 12 h forecasts are conducted using analysis from two experiments.

Case description
The period of study is 9-15 June 2010, when several mesoscale convective systems occurred in the Central and Southern United States that produced heavy to extreme precipitation amounts (Higgins et al., 2011;Clark et al., 2012;Schumacher et al., 2013;Schumacher and Clark, 2014;Dahl and Xue, 2016).
The single observation tests and a real case study are performed using the simulation of the heavy rainfall event on 9 June 2010. Figure 3 shows 500 and 850 hPa geopotential height, temperature, wind and relative humidity at 0000 UTC 9 June 2010 from the NCEP analysis. At 500 hPa (Fig. 3a), a deep upper level trough over south Kansas and north-east Oklahoma gradually strengthened as it shifted eastwards. Over southern Texas, an upper level cutoff low moved slowly north-eastwards. At 850 hPa (Fig. 3b), wind convergence zone is located over eastern Kansas into western Missouri. On the south, strong lowlevel jet and moisture transport from the western Gulf of Mexico progressed northwards across eastern Texas (Fig. 3b). A rainfall band extended from south-eastern Kansas to Illinois, where the maximum 6-h accumulated precipitation (valid at 0600 UTC 9 June 2010) exceeds 200 mm in NCEP Stage IV (Fig. 4). At the same time, mesoscale convective systems over Texas developed. Low-level moisture convergence, vertical wind shear, relatively cold air (centre of cut-off low over Texas) all favoured the slowmoving convective systems in the following days.

Single observation tests
An efficient way to illustrate the sensitivity of the 4D-Var system to the precipitation observations and how precipitation assimilation affects wind, temperature and moisture fields is to conduct single observation experiments. In this section, we discuss the results of single observation tests by conducting experiments using a summer rainfall case.
WRF 1-h accumulated rainfall forecast is shown in Fig. 5. The black mark is precipitation observation location for single observation tests. Model precipitation at this location is about 56 mm. Figure 6 displays the vertical cross section of temperature and moisture increments as a result of assimilating single rainfall observation from experiment SOBS1, which assumes precipitation is 10 mm larger than model predicted value. Figure  6a shows that warming appears above 400 hPa over the observation location, which corresponds to the latent heat released by condensation. The cooling appearing below 400 hPa is caused by the evaporation of rain in Kessler scheme. Figure 6b indicates that the impact of precipitation assimilation on moisture analysis mainly occurred below 700 hPa and the maximum is around 850 hPa, with the analysis wetter than the background. Figure 7 shows horizontal cross sections of wind increments. The horizontal wind convergence around 850 hPa (Fig. 7a) and the divergence around 300 hPa (Fig. 7b) near observation location indicates the strengthening of the convection. It means that the convective system around the observation location is intensified by precipitation assimilation. Figure 7 also indicates 4D-Var produces analysis increments with a flow-dependent nature. The flow-dependent structure function in 4D-Var is important in the analysis of fast-evolving systems, such as convective storms (Huang et al., 2009).
In the experiment SOBS2, the assumed observation is 10 mm smaller than model precipitation. Figure 8 displays the vertical cross sections of temperature and moisture increments from SOBS2. Comparing with the results from SOBS1, the increments    of the two variables exhibit opposite effect in SOBS2. It can be seen that the temperature above 400 hPa becomes cooler and the moisture around 850 hPa gets drier than the background. Figure 9 shows horizontal cross sections of wind increments. The horizontal wind divergence around 850 hPa (Fig. 9a), as expected, and the convergence around 300 hPa (Fig. 9b) near observation location indicated the weakening of the convection. Although only precipitation data are assimilated, 4D-Var can spread the single observation information spatially, and the model's temperature, moisture, and wind fields can be adjusted by the model constraint.  The purpose of conducting the experiment SOBS3 is to show the results of assimilating the zero precipitation value. Past studies (Tsuyuki et al., 2002;Lopez and Bauer, 2007;Lopez, 2011) set a small threshold (0.5 mm/h, 0.1 mm/h and 0.001 mm/h, respectively) when using precipitation data, i.e. zero rain observation was not assimilated. Tsuyuki et al. (2002) noted that observations of 1-h precipitation less than 0.5 mm are not assimilated, for it was found that the quality of such observations was rather poor in snowfall cases. Lopez and Bauer (2007) assimilated in the 1D-Var only points where the background hourly precipitation rate is 0.1 mm/h, and points where the observed rainfall value is lower than 0.1 mm/h are rejected prior to 1D-Var to avoid the possible development of a dry bias after 1D-Var due to this one-sided screening. Lopez (2011) restrict the 4D-Var assimilation of NCEP stage IV data to locations where precipitation is simultaneously higher than a small threshold (0.001 mm/h) Fig. 11. Same as Fig. 7, but for the experiment SOBS3, which observation location is at 38.5 • N, 95.13 • W. in the model background and in the observations. In this study, the background zero rain issue is discussed in Section 6.2 and zero rain observation assimilation at rainy model background is displayed in the experiment SOBS3. Figure 10 shows the vertical cross sections of temperature and moisture increments from SOBS3. The pattern is similar to the experiment SOBS2 (Fig. 8). Figure 11 shows horizontal cross sections of wind increments. It indicates that the convection is weakened by assimilating zero rain value. In radar data assimilation, some studies (Aksoy et al., 2009;Wattrelot et al., 2014) confirmed that assimilating the no-rain observation (i.e. reflectivity observations with values small enough to indicate the absence of precipitation) suppressed spurious convection.

A real case study
To better investigate the impact of precipitation assimilation on precipitation forecast, we focus on a convective event on 9 June 2010. Case description is given in Section 5. For precipitation assimilation, Lopez (2011) mentioned that one of the major unsolved issues is when model background has no precipitation, a rainy observation will be unable to produce any increment as a result of the absence of sensitivity in the adjoint moist physics. The model zero rain issue is also unsolved in WRFDA 4D-Var. However, we can mitigate the model zero rain issue through multiple outer loops. Due to computer resource limitations, only two outer loops are used in this study. Figure  12 shows non-linear model predicted precipitation initialized from the first outer loop analysis is improved in pattern and intensity compared to using first guess (Fig. 5), meanwhile, the precipitation coverage is slightly expanded. It indicated that, in first outer loop, although the model grid points which have no rain cannot get increments from rainy observation, the second guess is improved by the adjustment of microphysics and dynamic fields through assimilating precipitation at rainy model grids, and then points on the model grid with zero rain after the first outer loop may now contain nonzero values for the second outer loop.
The depiction of the humidity field in the initial state is crucial for precipitation forecasts. Figure 13 gives the moisture increment at 850 hPa for the experiments GTS and GTS+RAIN to demonstrate how precipitation assimilation has an effect on the moisture field. The distribution of 1-h accumulated precipitation assimilated in GTS+RAIN is mainly over south-east Kansas and west central Missouri (Fig. 15a). In Fig. 13a, moisture increment in GTS at 850 hPa is around −3-3 g/kg and the large increments are over north Texas where the analysis gets drier than the background, while small increments (around 1 g/kg) are found in south-east Kansas and western central Missouri. In GTS+RAIN (Fig. 13b), additional moisture increments are around a convergence line, resulting from precipitation assimilation. It indicates that GTS+RAIN can obtain small-scale structures associated with convection. The positive impact on moisture field in precipitation assimilation is very important, for it subsequently influences the latent heat release, alters the thermodynamic and dynamic structure of the atmosphere, and makes improvements to precipitation forecasts. Figure 14 shows temperature increment in GTS and GTS+RAIN. Similar to the moisture increment, GTS produces very small increment in temperature near convergence line over south-east Kansas and north Missouri. In contrast to GTS, there is a strong cooling area over south-east Kansas near the convergence line in GTS+RAIN. The convection is strengthened by assimilated precipitation.
In Fig. 15, the forecasts of 1-3 h hourly accumulated precipitation from 0000 to 0300 UTC 9 June 2010 for the two experiments are compared with NCEP Stage IV precipitation data. In the bottom two rows, the locations of observed fea- tures are outlined in blue and the features predicted by the GTS and GTS+RAIN are overlaid and colour filled for precipitation greater than 5 mm. The observations (top row) show that a distinct rain band stretches from south-eastern Kansas to Missouri, with 1-h accumulated precipitation reaching 80 mm in southeastern Kansas. The rain band moved to Illinois in next hour, and the southern end of the rain band remained just east of Kansas for several hours. Figure 15b and d shows GTS cannot capture the characteristic of 1-h precipitation distribution. Comparing with GTS, GTS+RAIN has a very good spatial overlap with the observations in 1-h forecast ( Fig. 15c and e) and efficiently reduces spin-up time. For 2-h and 3-h forecasts, although GTS+RAIN overestimated precipitation on Kansas ( Fig. 15h and m), it still matched observation very well (Fig. 15j and o). The improvement can be explained by Figs. 13b and 14b. By the adjustment of microphysics and dynamic fields through assimilating precipitation, GTS+RAIN outperformed GTS in precipitation forecasts. In Fig. 16, the forecasts of 4-6 h hourly accumulated precipitation from 0400 to 0600 UTC 9 June 2010 for the two experiments are compared with NCEP Stage IV precipitation data. Stage IV rainfall pattern over Kansas shifted from northeast-south-west oriented to north-west-south-east oriented from 1 to 6-h. It shows a V-shaped pattern over Kansas, Missouri and Illinois. GTS cannot capture the V-shape characteristic, and moreover, it produces false alarms in northern Missouri. Com-pared to GTS, precipitation assimilation experiments improved rainfall over northern Missouri and reduced the false alarms. Overall GTS+RAIN produces a slightly better rainfall pattern than GTS.

1-week results
The previous results investigated the improved analysis and forecasts from GTS+RAIN. In this section, statistical results over 1-week, total 28 sets of 12 h forecasts for each experiment, are provided to further validate the robustness of the results from GTS and GTS+RAIN. We employed MET verification package (Brown et al., 2009), which developed by DTC at NCAR. The neighbourhood-based fractions skill score (FSS) (Roberts and Lean, 2008) is used as the precipitation verification metric. Figure 17 shows aggregate FSS of hourly accumulated precipitation over 1-week for the two experiments at 1, 5 and 10 mm/hr thresholds. The FSS is computed with a radius of influence of 20 km and 95% confidence intervals were added. In the first hour, GTS FSS decreases with increasing precipitation threshold. GTS+RAIN FSS outperformed GTS and almost kept the same value for the three thresholds at 1-h forecast. The confidence interval also indicates that the improvement are statistically distinct. The improvement in GTS+RAIN was sustained to the 6-h forecast but it is not significant after 3-   h. For 7-12-h forecasts, the FSSs for two experiments almost overlapped, thus, the benefit of precipitation assimilation disappeared beyond 6-h forecast. Therefore, 3-hourly (or more frequent) cycling run is suggested in future work. To preserve the impact, ensemble-based assimilation methods may be needed in precipitation assimilation. Johnson et al. (2015) demonstrated the advantage from radar data assimilation persists for about 5-h using EnKF, but only about 1-h using 3D-Var averaged from 10 diverse cases.
In order to evaluate how model variables are affected by the precipitation assimilation, aggregated RMSEs verified against radiosonde observations over one-week for relative humidity, Fig. 18. Vertical profiles of the RMSEs aggregated over 1-week results for GTS and GTS+RAIN at 6 h forecast. The horizontal bars represent the confidence intervals at the 95% confidence level using bootstrap resampling method with 10,000 resamples. (a) relative humidity, (b) temperature and (c) wind speed. temperature and wind speed on different pressure levels for two experiments at 6-h forecast in Fig. 18. In Fig. 18a, GTS+RAIN has small improvement for relative humidity below 700 hPa. It is consistent with the results in Section 6.1, where the maximum of moisture increments are around 750 hPa. The temperature RMSE for two experiments (Fig. 18b) are almost overlapped except over 300 hPa because of sparseness of observation. For wind speed (Fig. 18c), RMSE is reduced around 500 hPa in GTS+RAIN and in other levels they are very close. It is not surprised to see the minimal impact produced by GTS+RAIN in model's fields, which might be due to sparseness of observation used in verification and localized convection.

Conclusion
In this study, precipitation data assimilation is implemented in WRFDA 4D-Var and applied to a 4-km convection-permitting model configuration. Single observation tests, a real case study, and 1-week experiments are conducted. 6-h forecast initialized from NCEP GFS analysis is used as the background. The experiment GTS assimilates all conventional observations and no satellite data is used. The analysis after the GTS assimilation is used as the background for the experiment GTS+RAIN that assimilates precipitation data. GTS and precipitation data are assimilated in separate steps and the length scale is tuned when assimilating precipitation. For 1-week experiments, there are in total 28 sets of 12-h forecasts in each experiment for verification. The NCEP Stage IV precipitation is also used in precipitation verification for it is the only precipitation data we can get in study period.
Firstly, single precipitation observation experiments indicate that the convection around the observation location is intensified (weakened) by assimilating stronger (weaker) precipitation quantity. Although only precipitation data are assimilated, 4D-Var can spread the single observation information spatially, and the model's temperature, wind, and moisture fields can be adjusted by the model constraints. We also conducted an experiment to assimilate zero rain observation. The convection is suppressed by assimilating dry rain value.
Then, precipitation assimilation is applied to a real case study. Results shows that GTS gets weak increments in temperature and moisture fields over heavy rainfall region, while GTS+RAIN gets strong increments and obtains small scale information associated with convection. As expected, forecasts in GTS+RAIN reproduce better precipitation patterns, reduce false alarms, and get closer quantities to the observations than GTS. We conduct 1week experiments to further validate the robustness of the results on precipitation assimilation. On average, GTS+RAIN produces higher FSS comparing to GTS. The impact of precipitation assimilation is mostly up to 3-h. This paper shows the preliminary results of precipitation assimilation and they are encouraging. In future work, the further improvement can be made. Firstly, the current formula of WRFDA 4D-Var assumes unbiased and Gaussian-distributed precipitation observation errors. However, the innovation statistics show deviation from Gaussian behaviour. Non-Gaussian error distribution will be considered in future improvement. Secondly, precipitation observation error and bias removal can be adjusted according to more thorough statistics. Thirdly, Johnson et al. (2015) showed flow dependent BECs are crucial for radar data assimilation, therefore, ensemble based assimilation method might also be introduced in precipitation assimilation.
Finally, cold processes will be considered in future work. Wu et al. (2000) demonstrated that there are strong limitations of assimilating radar data in 4D-Var when applied to cold processes, therefore, it is necessary to extend experiments to cold processes to investigate the model behaviour by including ice phase microphysics in tangent linear and adjoint model.