Evaluation of the subseasonal forecast skill of surface soil moisture in the S2S database

ABSTRACT Based on the reforecasts from five models of the Subseasonal to Seasonal (S2S) Prediction project, the S2S prediction skill of surface soil moisture (SM) over East Asia during May–September is evaluated against ERA-Interim. Results show that, in the majority of models, good prediction skill of SM generally extends to forecast lead times of 5–10 days over southern and northeastern China. Over the Tibetan Plateau and northwestern China, only the ECMWF model has good prediction skill 20 days in advance. Generally, prediction skill tends to be better over wet regions than over dry regions. In terms of the seasonal variation of SM prediction skill, some differences are noticed among the models, but most of them show good prediction skill during September. Furthermore, the significant positive correlation between the prediction skill of SM and the ENSO index indicates that ENSO modulates the S2S prediction of SM. When there is an El Niño (a La Niña) event, the SM prediction skill over eastern China tends to be high (low). Through evaluation of the S2S prediction skill of SM in these models, it is found that the prediction skill of SM is lower than that of most atmospheric variables in S2S forecasts. Therefore, more attention needs to be given to the S2S forecasting of land processes.


Introduction
Accurate subseasonal to seasonal (S2S) climate prediction is crucial for policy decisions (Brunet et al. 2010; White et al. 2017). However, limited S2S prediction skill based on both dynamical and statistical models makes it difficult to apply S2S prediction products to climate services. In order to improve S2S prediction skill, the S2S Prediction project was sponsored by the World Meteorological Organization in 2013 (Vitart et al. 2015) and provided a comprehensive database including real-time forecasts and reforecasts on subseasonal time scales. Based on the S2S products, an increasing number of studies have evaluated the prediction skill of S2S climate prediction (e.g. Vitart 2017; Jie et al. 2017; Zhou et al. 2018; Zeng and Yuan 2018).
Soil moisture (SM) is a key variable of land surface processes. Compared to atmospheric variables, SM varies relatively slowly, and anomalous SM can persist for a longer time (Entin et al. 2000). After a rainfall event, soil retains water for days or even months (McColl et al. 2017), and soil water exhibits evident variations on subseasonal time scales (Li et al. 2015). SM can significantly affect the energy balance, the hydrological cycle, and even the atmospheric general circulation (Koster and Suarez 2000; Seneviratne et al. 2010; Mei and Wang 2011). Moreover, SM is an important predictor for atmospheric variables in the midlatitude Northern Hemisphere during boreal summer (Liu, Yu, and Mishra 2016; Koster et al. 2017). It has been noted that the initialization of SM in numerical models can improve the performance of S2S forecasts (Koster et al. 2011). Therefore, the improvement of SM prediction could potentially improve climate prediction, especially on S2S scales.
In order to learn about the performance of current operational models to capture the SM variability on S2S scales, the S2S prediction skill of SM during the warm season (May to September) over East Asia was evaluated in this study. Section 2 describes the data and methods. The S2S forecast skill of SM is evaluated in Section 3. A summary is provided in Section 4.

Data and methods
Six-hourly SM data at depths of 0-7 cm and 7-28 cm, with a horizontal resolution of 1.5° × 1.5°, during 1979-2017, were obtained from the ERA-Interim reanalysis dataset (http://apps.ecmwf.int). ERA-Interim SM generally agrees with satellite and microwave observations (Rahmani, Golian, and Brocca 2016). In this study, surface SM for the 0-20 cm layer was estimated by interpolating the six-hourly SM at 0-7 cm and 7-28 cm with distance-based weights, and the result was then averaged into daily data. Furthermore, in order to evaluate the prediction skill against different climate backgrounds, especially El Niño-Southern Oscillation (ENSO), the ENSO index during December-January of 1979-2016 was obtained from the US National Weather Service/Climate Prediction Center (http://www.cpc.ncep.noaa.gov). Although the evaluation of SM prediction skill was performed for May-September, the winter ENSO index was used because ENSO events are strongest during winter and their effects can persist from winter to summer.
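The construction of the 0-20 cm daily SM can be illustrated with a minimal sketch. The exact weights used in the study are not given beyond "weighted by distance", so the thickness weights below (7 cm from the first layer, 13 cm from the second) are an assumption, and the function names are hypothetical.

```python
import numpy as np


def surface_sm_0_20cm(sm_0_7, sm_7_28):
    """Combine two ERA-Interim layers into a 0-20 cm estimate.

    Assumption (not stated in the paper): each layer is weighted by the
    thickness it contributes to the top 20 cm, i.e. 7 cm and 13 cm.
    """
    w_top = 7.0 / 20.0
    w_second = 13.0 / 20.0
    top = np.asarray(sm_0_7, dtype=float)
    second = np.asarray(sm_7_28, dtype=float)
    return w_top * top + w_second * second


def daily_mean(six_hourly):
    """Average six-hourly values into daily means.

    The first axis is time and must hold four values per day.
    """
    x = np.asarray(six_hourly, dtype=float)
    if x.shape[0] % 4 != 0:
        raise ValueError("time axis must contain four steps per day")
    return x.reshape(-1, 4, *x.shape[1:]).mean(axis=1)
```

For example, `daily_mean(surface_sm_0_20cm(layer1, layer2))` would give daily 0-20 cm values when `layer1` and `layer2` are six-hourly arrays with time as the first axis.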
The S2S reforecasts of SM were provided by five operational centers participating in the S2S project (http://www.s2sprediction.net/; Table 1): the Australian Bureau of Meteorology (BoM), the China Meteorological Administration (CMA), the European Centre for Medium-Range Weather Forecasts (ECMWF), the Hydrometeorological Centre of Russia (HMCR), and the US National Centers for Environmental Prediction (NCEP). Table 1 shows the land surface models (LSMs), initial conditions, and horizontal resolutions applied in the five models. In the five models, the LSMs used are the bucket model (Manabe and Holloway 1975), Noah LSM 2.7.1 (Ek et al. 2003), the Hydrology Tiled ECMWF Scheme for Surface Exchanges over Land (HTESSEL; Balsamo, Beljaars, and Scipal 2009; Balsamo et al. 2015), BCC_AVIM2 (Wu et al. 2013, 2014), and the Interactions between Soil, Biosphere, and Atmosphere scheme (ISBA; Noilhan and Planton 1989), for BoM, NCEP, ECMWF, CMA, and HMCR, respectively. The land surface variables in BoM are initialized through nudging the atmosphere model to the ERA-40 reanalysis (Hudson et al. 2011). NCEP SM is initialized by the CFSR and the Global Land Data Assimilation System (GLDAS) analyses (Saha, Moorthi, and Pan 2010). The SM of the ECMWF LSM is initialized by ERA-Interim soil reanalysis. The SM from the CMA model is not directly initialized but produced during the coupling processes of the CMA model through nudging the near-surface atmospheric forcing fields (Liu, Wu, and Yang 2017). The SM in the HMCR model is initialized through quasi-assimilation with the ERA-Interim reanalysis. The horizontal resolutions of the models range from 0.5° × 0.5° to 2° × 2°.
The forecast times, ensemble sizes, reforecast frequencies, and reforecast periods of the five models are also presented in Table 1. Forecast times range from 44 to 62 days for the five models. Ensemble sizes range from 4 to 33, and reforecast frequencies from once a week to once a day. Reforecast periods range from 20 to 32 years, and the common period of the reforecasts is 1999-2010. To evaluate the models on an equal footing, reforecasts were evaluated for the period 1999-2010. However, the prediction skill during different ENSO years was analyzed over the entire reforecast period of each model to enlarge the sample size. The horizontal resolutions of the archived data are 2.5° × 2.5° for BoM and 1.5° × 1.5° for the rest of the models. Before evaluation, the seasonal cycles and long-term trends of both the observations and reforecasts were removed. After that, the correlation between prediction and observation was used as the prediction skill of the SM at 0-20 cm over East Asia during May-September. Note that ERA-Interim is used to assess the prediction skill of the five models because of the lack of station observations of SM. Although the same LSM (HTESSEL) is used for ERA-Interim and the ECMWF S2S reforecast, their SM results are different. The LSM in ERA-Interim is forced by ERA-Interim reanalysis data and the precipitation data of the Global Precipitation Climatology Project (Balsamo, Beljaars, and Scipal 2009; Balsamo et al. 2015), while the LSM in the ECMWF S2S reforecast is coupled with the ECMWF atmospheric model. Therefore, the ERA-Interim SM is close to the observation, and it agrees well with satellite and microwave observations (Rahmani, Golian, and Brocca 2016). The SM in the ECMWF S2S reforecast is produced by the model itself, and its agreement with observation depends on the model performance.
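The preprocessing and skill measure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the data layout (years by calendar days, for one lead time and one location or regional mean), the use of a per-calendar-day mean as the seasonal cycle, and a least-squares linear trend across years are all assumptions, and the function names are hypothetical.

```python
import numpy as np


def remove_cycle_and_trend(x):
    """Remove the seasonal cycle and a linear trend across years.

    x has shape (n_years, n_days). Assumption: the seasonal cycle is the
    multi-year mean for each calendar day, and the long-term trend is a
    least-squares line fitted across years for each calendar day.
    """
    x = np.asarray(x, dtype=float)
    years = np.arange(x.shape[0], dtype=float)
    years -= years.mean()
    # Subtracting the multi-year mean removes the seasonal cycle.
    anom = x - x.mean(axis=0, keepdims=True)
    # Least-squares slope per calendar day, then subtract the fitted line.
    slope = (years[:, None] * anom).sum(axis=0) / (years ** 2).sum()
    return anom - slope[None, :] * years[:, None]


def correlation_skill(forecast, reference):
    """Pearson correlation between forecast and reference anomalies."""
    f = remove_cycle_and_trend(forecast).ravel()
    r = remove_cycle_and_trend(reference).ravel()
    return float(np.corrcoef(f, r)[0, 1])
```

Because both the seasonal cycle and the trend are removed before correlating, a model that only reproduces the climatological drying or wetting through the season earns no credit; the skill reflects anomalies alone.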

SM prediction skill and its relationship with ENSO
East Asia was divided into four regions by the percentile of the climatology of ERA-Interim SM (0-20 cm) during May-September of 1979-2017 (Figure 1(a)). The topography was also a factor in the division. The results show that SM decreases gradually from south to north over East Asia. Therefore, Region 1 (Reg1), mainly in southern China (21°-35°N, 103°-122°E), is the area where the soil water content is high; Region 2 (Reg2), in northeastern China (35°-54°N, 103°-132°E), is a transition area where the soil varies from wet to dry; Region 3 (Reg3) is the Tibetan Plateau, with the highest topography in East Asia (27°-38°N, 76°-103°E); and Region 4 (Reg4), in northwestern China (38°-49°N, 75°-103°E), has the lowest SM in East Asia.
The prediction skill of SM over Regs1-4 in BoM, CMA, ECMWF, HMCR, and NCEP is shown in Figure 1(a-d). Prediction skill is defined as the correlation coefficient between prediction and observation. The correlation coefficient is calculated based on the time series of regionally averaged SM for all the forecast cases. Prediction skill greater than 0.5 is regarded as good. In Figure 1, the y-axis (x-axis) is the prediction skill (forecast day). In Reg1 (Figure 1(a)), good prediction skill is found in the ECMWF and NCEP models during the first 10 forecast days. The CMA model has good skill only during the first three forecast days, while the rest of the models have no good skill. In Reg2 (Figure 1(b)), ECMWF, NCEP, and CMA have good skill before about forecast-days 10, 9, and 5, respectively. BoM and HMCR have no good skill. In Regs3 and 4 (Figure 1(c) and (d)), only ECMWF has good skill during the first 20 forecast days, whereas the rest of the models have no good skill. Generally, the ECMWF model has the best prediction skill among the five models. In most of the models, prediction skill is better over wet regions than dry regions, except for the ECMWF model.
Figure 2 shows the spatial distribution of prediction skill on forecast-day 10 for the five models. In BoM (Figure 2(a)), prediction skill is higher over western Indochina, eastern China, and northeastern China than over the rest of East Asia. In CMA (Figure 2(b)), good prediction skill can be found over the western Tibetan Plateau and northeastern China. ECMWF (Figure 2(c)) has the best prediction skill of the five models, with good prediction skill over western Indochina, eastern China, northeastern China, the western Tibetan Plateau, western Mongolia, and India. In HMCR (Figure 2(d)), prediction skill is higher over central China and India than over the remaining parts of the map. In NCEP (Figure 2(e)), the distribution is quite similar to that of ECMWF, but the prediction skill is lower than that of ECMWF. The results shown in Figure 2 are consistent with those shown in Figure 1.
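The regional skill curves discussed for Figure 1 rest on two steps: averaging SM over a latitude-longitude box and finding how many lead days keep the skill above 0.5. A minimal sketch of both is given below. The cosine-latitude area weighting is an assumption (the text only says "regionally averaged"), no land mask is applied, and the function names are hypothetical.

```python
import numpy as np


def region_mean(field, lat, lon, lat_bounds, lon_bounds):
    """Area-weighted mean of field (..., nlat, nlon) inside a lat-lon box.

    Assumption: grid cells are weighted by cos(latitude).
    """
    field = np.asarray(field, dtype=float)
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    lat_sel = (lat >= lat_bounds[0]) & (lat <= lat_bounds[1])
    lon_sel = (lon >= lon_bounds[0]) & (lon <= lon_bounds[1])
    sub = field[..., lat_sel, :][..., lon_sel]
    weights = np.cos(np.deg2rad(lat[lat_sel]))[:, None] * np.ones(lon_sel.sum())
    return (sub * weights).sum(axis=(-2, -1)) / weights.sum()


def good_skill_horizon(skill_by_lead, threshold=0.5):
    """Number of consecutive lead days, counted from day 1, with skill above threshold."""
    skill = np.asarray(skill_by_lead, dtype=float)
    not_good = np.flatnonzero(~(skill > threshold))
    return int(not_good[0]) if not_good.size else int(skill.size)
```

With the Reg1 box from the text, a call would look like `region_mean(sm, lat, lon, (21, 35), (103, 122))`; applying `good_skill_horizon` to the resulting skill curve gives the lead time up to which the skill counts as good.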
Besides spatial distribution, the prediction skill of SM has seasonal variation. Figure 3 presents the temporal variation of SM prediction skill over the four regions in the five models. In BoM (Figure 3(a-d)), SM can be well predicted five days in advance during July in Reg1, 10 days during May in Reg2, and 25 days during May in Reg4. In Reg3, BoM has no good skill during May-September. CMA has good prediction skill 10 forecast-days in advance in Reg1 (Figure 3(e)). In Reg2 (Figure 3(f)), good prediction skill is found before forecast-day 5 (20) during May-June (August-September). In Reg3 (Reg4), SM can be well predicted 5 (20) days in advance during June-July and September (May and July-August) in Figure 3(g) (Figure 3(h)). ECMWF generally has good prediction skill before forecast-day 10-15 during May-August (Figure 3(i-l)). In Regs3 and 4 (Figure 3(k) and (l)), good prediction skill is found before forecast-day 20-25. In HMCR, good prediction skill can be found before forecast-day 5 (10) during June (August) in Reg1 (Reg3) in Figure 3(m) (Figure 3(o)). In Regs2 and 4 (Figure 3(n) and (p)), no good prediction skill can be found during May-September. In NCEP, good prediction skill is found before forecast-day 10 during May-September over Regs1 and 2 (Figure 3(q) and (r)). In Reg3 (Reg4), SM is well predicted 20 days in advance during August to September (May and September) in Figure 3(s) (Figure 3(t)). Overall, the seasonal variation of SM prediction skill differs considerably among the five models, which indicates that some uncertainties exist in the prediction skill of SM. However, a common feature can also be identified: good prediction skill tends to be found in September over most of the regions in most of the models.
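The seasonal variation of skill can be obtained by computing the correlation separately in consecutive 10-day bins of the May-September season. The sketch below assumes anomalies arranged as years by days for one region and one lead time, and pools all years and days within a bin; the pooling choice and the function name are assumptions rather than the documented procedure.

```python
import numpy as np


def binned_skill(fcst_anom, ref_anom, bin_len=10):
    """Correlation skill in consecutive bins of bin_len days.

    Both inputs have shape (n_years, n_days) and are assumed to be
    anomalies already (seasonal cycle and trend removed). All years and
    days inside a bin are pooled into one sample; the last bin may be
    shorter than bin_len.
    """
    f = np.asarray(fcst_anom, dtype=float)
    r = np.asarray(ref_anom, dtype=float)
    n_days = f.shape[1]
    skill = []
    for start in range(0, n_days, bin_len):
        fb = f[:, start:start + bin_len].ravel()
        rb = r[:, start:start + bin_len].ravel()
        skill.append(np.corrcoef(fb, rb)[0, 1])
    return np.array(skill)
```

Repeating the call for each forecast day gives a lead-time by season matrix of skill values of the kind shaded in Figure 3.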
The prediction skill related to ENSO was also evaluated because the S2S prediction skill of SM can be modulated by ENSO. The results are shown in Figure 4 in the form of the correlation coefficients between the annual prediction skill and the ENSO index at each grid point. The correlation for forecast-day 10 is shown because the correlation patterns between the ENSO index and prediction skill during the first 15 forecast days are very similar. Moreover, forecast-day 10 is the start of the S2S forecast. In BoM (Figure 4(a)), a region that generally features significant positive correlation extends from northern Indochina to Shandong Province of China. Correlations are also significantly positive over northern China and eastern Mongolia. In CMA (Figure 4(b)), positive correlation can be identified over northern India, the Indochina Peninsula, Shandong Province, and northern Xinjiang. In ECMWF (Figure 4(c)), correlations are generally significantly positive from northern Indochina to Shandong Province and northern China. In HMCR (Figure 4(d)), positive and significant correlations are found over southwestern China, part of the Tibetan Plateau, and a small part of northern China. The correlation pattern shown in NCEP (Figure 4(e)) is quite similar to those shown in BoM and ECMWF. In general, except for HMCR, the prediction skill of SM from southwestern China to Shandong Province is significantly correlated with the ENSO index in all the models. This indicates that SM prediction is better in El Niño years than in La Niña years.
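The grid-point correlation between annual prediction skill and the ENSO index can be sketched as below. The significance test used in the study is not stated, so the Student's t statistic with n - 2 degrees of freedom is an assumption, as are the array layout and the function names.

```python
import numpy as np


def skill_enso_correlation(annual_skill, enso_index):
    """Pearson correlation between skill and ENSO index at each grid point.

    annual_skill has shape (n_years, nlat, nlon); enso_index has shape
    (n_years,). Returns an array of shape (nlat, nlon).
    """
    s = np.asarray(annual_skill, dtype=float)
    e = np.asarray(enso_index, dtype=float)
    s_anom = s - s.mean(axis=0)
    e_anom = e - e.mean()
    # Sum over the year axis of e_anom * s_anom at every grid point.
    cov = np.tensordot(e_anom, s_anom, axes=(0, 0))
    denom = np.sqrt((e_anom ** 2).sum() * (s_anom ** 2).sum(axis=0))
    return cov / denom


def t_statistic(r, n_years):
    """t statistic for a correlation r from n_years samples (assumed test).

    Compare its absolute value with the critical t value for
    n_years - 2 degrees of freedom; requires |r| < 1.
    """
    r = np.asarray(r, dtype=float)
    return r * np.sqrt((n_years - 2) / (1.0 - r ** 2))
```

For instance, with 12 years of reforecasts there are 10 degrees of freedom, and a two-sided 5% test would use a critical t value of about 2.228, so a correlation of 0.6 (t of roughly 2.37) would pass under this assumed test.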

Summary
Using the SM at the depth of 0-20 cm derived from the S2S Prediction project database, the S2S prediction skill of SM in East Asia during May-September was evaluated against ERA-Interim reanalysis data. Most of the five models provide good prediction skill at 5-10 days in advance over southern China and northeastern China. Over the Tibetan Plateau and northwestern China, only the ECMWF model has good prediction skill at about 20 forecast-days in advance. However, observations are very limited over the Tibetan Plateau, and therefore prediction here may not be very reliable. Generally, good prediction skill is found over wet regions rather than dry regions in most of the models. This may relate to the long 'memory' of SM over wet regions. For the seasonal variation of SM prediction skill, some uncertainties can be noticed, but most of the models show good prediction skill during September. This may be attributable to the fact that the rainy season over East Asia usually takes place before September, which makes the soil wet; after that, rainfall reduces and, due to the memory of SM, the persistence of anomalous SM may lead to good prediction skill. Except for the HMCR model, all the S2S models show that the prediction skill of SM over eastern China is significantly and positively correlated with ENSO index, which means SM predictions are better during El Niño years than La Niña years. Comparing this study to others (e.g. Zhou et al. 2018), it is apparent that the prediction skill of SM is lower than that for most atmospheric variables in the S2S forecast. This indicates that more attention should be paid to the subseasonal forecasting of land processes.

Acknowledgments
Yang ZHOU acknowledges support from the Startup Foundation for Introducing Talent of NUIST. We thank Ms. Chuyan CHEN for proofreading the manuscript. Finally, we thank the reviewers for their insightful and constructive comments.

Disclosure statement
No potential conflict of interest was reported by the authors.

Figure 3. Temporal variation of the prediction skill of SM (shading) over each region (columns) in each model (rows). Prediction skill is calculated in 10-day bins during May-September (x-axis) at each forecast time (y-axis) to obtain the temporal variation.