Comparison of satellite-estimated and model-forecasted rainfall data during a deadly debris-flow event in Zhouqu, Northwest China

Abstract
Several rainfall products, including those estimated from satellite measurements and those forecasted via numerical weather modeling, are compared and analyzed for a severe debris-flow event in Zhouqu, Northwest China. The satellite products, namely the CPC MORPHing technique (CMORPH), TMPA-RT, and PERSIANN, are all retrieved in near real time at high temporal and spatial resolutions. The numerical weather model used for precipitation forecasting is WRF. The results show that all three satellite products can broadly reproduce the rainfall pattern, distribution, timing, scale, and extreme values of the event when compared with gauge data. Their temporal and spatial correlation coefficients with gauge data are as high as about 0.6, which is statistically significant at the 0.01 level. The forecasts modeled at different spatial resolutions do not perform as well as the satellite estimates, although their correlation coefficients are still statistically significant at the 0.05 level. The total rainfall and extreme value time series for the domain show that, from a grid-to-grid perspective, the passive microwave-based CMORPH and TRMM products are more accurate than the infrared-based PERSIANN, while PERSIANN performs very well from a general point of view, especially when considering the whole domain or the whole convective precipitation system. The forecasted data, especially those from the highest resolution model domain, represent the total or mean precipitation in the research domain very well, but the errors for extreme values are large. This study suggests that satellite-retrieved and model-forecasted rainfall data are a useful complement to gauge data, especially for areas without gauge stations and areas not covered by weather radars.


Introduction
At around midnight on 8 August 2010, local time, sudden heavy rain fell in the mountain area to the northeast of Zhouqu, Gansu Province, Northwest China, triggering a massive debris-flow event in the Sanyanyu and Luojiayu gullies. The debris flow was about 5000 m long, 300 m wide, averaged 5 m thick, and led to 1492 deaths and a reported 272 missing persons. It then flooded into the Bailongjiang River, jamming it to form a barrier lake, which flooded 1/3 of the area of the whole city, destroyed 233.4 acres of farmland, and damaged 5508 rooms, amongst other disastrous consequences and losses (according to the Ministry of Civil Affairs).
The cause of this disaster is very complicated. Several countermeasures against this type of event happening again in the future have been presented on the basis of its factors of influence, characteristics, and trends (Hu et al. 2010; Yan et al. 2010; Liu et al. 2011). However, under the same conditions, the likelihood of fatal flows occurring when it rains heavily remains high (Yu et al. 2010).
Field investigations and remote sensing images indicate that the debris flows were formed by upstream flooding, and that the flash flood was triggered by a severe convective weather process with heavy rainfall (Qu et al. 2010). However, because of the challenging terrain of the region, there are few routine weather stations (gauges), and the viewing angle and data coverage of radar are restricted. Fortunately, satellites cover almost all of Earth's surface, meaning that quantitative precipitation estimates (QPE) retrieved from their measurements can compensate for the disadvantages of gauges and radars. Numerical weather models also offer coverage for mountainous areas and, moreover, can provide real-time quantitative precipitation forecasts. The quality of both the satellite-derived and the numerical weather model-forecasted rainfall during this debris-flow event in Zhouqu is validated and discussed in this paper.

Data and methods
Routine observations of rainfall from weather stations, rainfall data retrieved from satellite instruments, and rainfall forecasted via numerical weather modeling are used in this paper.
The main research domain, located in the west of China, is domain d03 indicated in Figure 1(a); the terrain is shown in Figure 1(b). From Figure 1(b), it is clear that the terrain of this domain is complex, with elevation ranging from 200 to 4600 m. The black dots mark the gauges that supplied the data used in this paper; the one circled in red is the gauge at Zhouqu.
The study period is from 0000 UTC 6 August 2010 to 2300 UTC 9 August 2010. The debris-flow event happened at 1602 UTC 7 August 2010, after which it rained twice in the area. This triggered floods and the formation of barrier lakes, which led to additional problems with disastrous consequences.
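The study period and event time given above can be turned into time-step indices with a few lines of Python. The following is only a minimal sketch: the function names are hypothetical, and the six-hourly windows are assumed to be aligned to the start of the study period (0000 UTC), which the text does not state explicitly.

```python
from datetime import datetime

# Values taken from the text; everything else is illustrative.
START = datetime(2010, 8, 6, 0)      # 0000 UTC 6 August 2010
END = datetime(2010, 8, 9, 23)       # 2300 UTC 9 August 2010
EVENT = datetime(2010, 8, 7, 16, 2)  # 1602 UTC 7 August 2010
WINDOW_H = 6                         # six-hourly gauge interval


def n_hours(start, end):
    """Number of hourly steps from start to end, both ends inclusive."""
    return int((end - start).total_seconds() // 3600) + 1


def step_index(t, start=START, window_h=WINDOW_H):
    """1-based index of the window_h-hour window that contains time t.

    Assumes windows are aligned to `start` (an assumption, not from the paper).
    """
    hours = (t - start).total_seconds() / 3600.0
    return int(hours // window_h) + 1


def accumulate(hourly, window_h=WINDOW_H):
    """Sum an hourly series into consecutive window_h-hour totals."""
    n = len(hourly) // window_h
    return [sum(hourly[i * window_h:(i + 1) * window_h]) for i in range(n)]


total_hours = n_hours(START, END)                 # 96 hourly steps
six_hourly = accumulate([1.0] * total_hours)      # 16 six-hourly totals
event_step = step_index(EVENT)                    # 7 under the stated alignment
```

Under this alignment assumption, the 96 h study period contains 16 six-hourly steps, and the debris-flow event falls in the seventh of them.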

Data
The gauge data from the routine and automatic weather stations shown in Figure 1(b) are all six-hourly data with some missing values. The missing values were all eliminated for the following analyses. The distances between stations are unequal, varying from kilometers to hundreds of kilometers. The number of gauges near Zhouqu is quite small.
The satellite rainfall data used in this paper derive from three of the most well-known and representative high-quality products: TRMM-RT (RT stands for real-time), the CPC MORPHing technique (CMORPH), and PERSIANN. The TRMM product was developed mainly at NASA and comprises merged data from passive microwave (PMW) and infrared sensors (Huffman et al. 2007). CMORPH is from NOAA CPC and is based mainly on PMW sensors, combined with some infrared information (Joyce et al. 2004, 2010). PERSIANN is from the University of California, Irvine, and is developed mainly from infrared sensor data calibrated with PMW data (Hong et al. 2004, 2005). Table 1 shows the basic information for these products, including their temporal and spatial resolutions; the resolutions of CMORPH and PERSIANN are rather high, whereas those of TRMM are relatively low.
The numerical weather model used in this paper is WRF-ARW V3.2.1, a next-generation mesoscale numerical weather prediction system designed to serve both operational forecasting and atmospheric research needs and suitable for multiple scales ranging from meters to thousands of kilometers (Michalakes et al. 2005; Pattanayak and Mohanty 2008; Skamarock et al. 2008; Coniglio et al. 2010). A number of studies have evaluated WRF-simulated precipitation, snowfall, or wind speed in China, but only a few have examined WRF-forecasted rainfall in China, especially in mountain areas (Wang, Yu, and Song 2011; Wang, Yu, and Wang 2012; Wang and Wang 2013; Yu 2013; Yu, Sun, and Xiang 2013).
The WRF model was set up with three fully coupled domains (Figure 1(a)) and was forced with GFS real-time output every day at 0000 UTC, running 72 h to forecast three days' weather with hourly output. The horizontal spatial resolution of the outer domain (domain 01, d01) was approximately 36 × 36 km, and the grid-spacing ratio between each nested domain and its parent was 1:3. Thus, the horizontal spatial resolution of the innermost domain (domain 03, d03) was approximately 4 × 4 km, which is similar to that of the PERSIANN data. Domain 03 is the main research domain for this paper, covering an area of hundreds of kilometers around Zhouqu. Within this domain, the elevation decreases sharply from west to east, varying from 5227 to 244 m (Figure 1(b)), and the sharpest change is in the middle of the domain where Zhouqu is located. The weather stations in this domain are irregularly distributed, with high density in the northeastern plains and low density in the western mountains (Figure 1(b)).
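The nested grid spacings follow directly from the outer resolution and the 1:3 ratio. The short sketch below illustrates only that arithmetic; the function name is hypothetical and is not part of WRF or of the authors' configuration.

```python
def nested_resolutions(outer_km, ratio, n_domains):
    """Grid spacing (km) of each nested domain, outermost first.

    Each child domain is `ratio` times finer than its parent.
    """
    return [outer_km / ratio ** i for i in range(n_domains)]


# Values from the text: 36 km outer domain, 1:3 nesting, three domains.
d01, d02, d03 = nested_resolutions(36.0, 3, 3)

# A 72 h forecast with hourly output yields 72 output steps per run
# (not counting the initial time).
n_outputs = 72 // 1
```

With these inputs the three domains have spacings of 36, 12, and 4 km, matching the approximately 4 × 4 km stated for d03.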

Methods
Given the irregular distribution of the stations, the gauge data needed to be interpolated to regular grid boxes in order to be compared with the satellite data. Different methods of interpolation may lead to slightly different results (Accadia et al. 2003), but the methods adopted in this paper are all mature and have been tested many times. They include bi-linear interpolation, Cressman objective analysis (Cressman 1959; Vollbrecht and Stahnkejungheim 1978), and weighted area average interpolation.
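As an illustration of how station values can be spread onto a regular grid, the sketch below implements a simplified, single-pass form of Cressman weighting, in which a station at distance d from a grid point receives the weight (R² − d²)/(R² + d²) inside an influence radius R and zero outside it. This is not the authors' implementation: the classical scheme applies successive corrections to a first-guess field with decreasing radii, and the function names, planar distances, and choice of radius here are all assumptions.

```python
import math


def cressman_weight(d, radius):
    """Cressman weight for a station at distance d from a grid point."""
    if d >= radius:
        return 0.0
    r2, d2 = radius * radius, d * d
    return (r2 - d2) / (r2 + d2)


def cressman_grid(stations, grid_x, grid_y, radius):
    """Weighted average of station values on a regular grid.

    stations: iterable of (x, y, value); value None marks a missing report.
    Returns grid[j][i] for grid_y[j], grid_x[i]; None where no station
    lies within `radius`. Distances are planar, in the units of x and y
    (a simplifying assumption).
    """
    valid = [(x, y, v) for x, y, v in stations if v is not None]
    grid = []
    for gy in grid_y:
        row = []
        for gx in grid_x:
            wsum = 0.0
            vsum = 0.0
            for x, y, v in valid:
                w = cressman_weight(math.hypot(x - gx, y - gy), radius)
                wsum += w
                vsum += w * v
            row.append(vsum / wsum if wsum > 0.0 else None)
        grid.append(row)
    return grid


# Toy example: two reporting stations and one missing report.
demo_stations = [(0.0, 0.0, 10.0), (1.0, 0.0, 20.0), (5.0, 5.0, None)]
demo = cressman_grid(demo_stations, [0.0, 0.5, 10.0], [0.0], radius=2.0)
```

Grid points with no station inside the radius are left undefined rather than filled, which mirrors the practical difficulty of sparse gauge coverage in mountainous terrain.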
There are many methods available to evaluate and validate precipitation, among which the 'threat score' or 'critical success index' method is the most popular. However, this method can only determine the scales of different products. Since the third Precipitation Intercomparison Project, many new methods have been developed for comparing precipitation at high resolution, but none has stood out as superior to the others (Adler et al. 2001; Kubota et al. 2006; Ebert et al. 2007; Wernli et al. 2008; Jobard et al. 2011). Thus, traditional methods, including the Taylor diagram (Taylor 2001), correlation coefficients, and RMSE, are used in this paper for the quantitative assessment of the rainfall data.

Accumulated rainfall
Figure 2 shows the accumulated rainfall over 96 h of the different products at their original spatial resolutions (the gauge data were interpolated to 0.25° longitude × 0.25° latitude grids via the Cressman objective analysis method). Although both the temporal and spatial resolutions differ, and the data sources and rainfall estimation methods are also quite different, the basic spatial patterns of the accumulated rainfall look quite similar. All products show that it rains heavily in the middle and northwest parts of the domain (mountains) and little in the southeast part (plains), and all products show similar scales and ranges of accumulated rainfall.
[Figure 2 caption, beginning missing] ... Figure 1 from 0000 UTC 6 August 2010 to 2300 UTC 9 August 2010 for each rainfall product at their original spatial resolutions. Note: the gauge data were interpolated to a 0.25° longitude × 0.25° latitude grid using Cressman objective analysis.
In more specific terms, however, there are many differences between the products. The gauge data indicate four rainfall hotspots, among which the middle rainfall belt, which crosses the domain from north to south, is the largest in both area and rainfall amount. The other hotspots are much smaller in both area and scale. Besides these rainfall hotspots, there are two minor rainfall centers in the southeast and east in the gauge data. There are four rainfall hotspots in the CMORPH data too, albeit closely connected and barely distinguishable as separate. The values of each hotspot are higher than those in the gauge data, and the areas of all the hotspots except the middle one are also larger than in the gauge data. The distribution of the PERSIANN data is more similar to the gauge data than that of CMORPH, but the area of the middle hotspot is smaller than in the gauge data and the position of the northwest hotspot is a little to the east. Incidentally, no rain is indicated in the eastern region by the PERSIANN data, which is quite different from the gauge data and the other remote sensing data. The spatial resolution of both CMORPH and PERSIANN is much higher than that of the gauge data, meaning they can provide much more detailed information. The TRMM data have the same spatial resolution as the gridded gauge data, but reveal only one rainfall center that covers almost all of the four hotspots found in the gauge data. Like CMORPH, the accumulation values of TRMM are larger than in the gauge data, but the resolution is much lower than that of CMORPH. The rainfall patterns of the three different domains produced by the WRF model are basically the same, just with different spatial resolutions, and the detailed distribution is not identical to that of the gauge data. The rainfall forecasted in the southwest and northeast corners of the three domains is heavier than in the gauge data, and there is no rain at all in the southeast corner in the WRF-forecasted data.
In general, most of the rainfall products can capture the basic information in terms of the spatial pattern and scale of the observed rainfall, but they overestimate the rainfall in the western mountainous areas compared with the gauge data. It is unknown whether the rainfall is really overestimated or whether the difference simply reflects the lack of observations in the gauge data in those areas. The remote sensing data are more accurate than the model-forecasted data according to the accumulation results.

Spatiotemporal quantitative comparison
In order to obtain more precise comparison information, the different rainfall products are compared quantitatively in the following Sections 3.2.1 and 3.2.2 at a common temporal and spatial resolution. To facilitate the comparison, all of the data products were first regridded or interpolated to unified grids of 0.25° longitude × 0.25° latitude, and then accumulated to six-hourly data (the same as the gauge observation interval).

Taylor diagram
The RMSE measures the differences between the values predicted by a model or estimator and the values actually observed, and is a good measure of accuracy. The correlation coefficient measures the strength of the linear dependence (or pattern similarity) between two variables. Taylor (2001) introduced a single diagram that summarizes multiple aspects of model performance, including the RMSE and correlation coefficient. Figure 3 shows the Taylor diagram of the regridded data of all the products examined in the present paper.
More specifically, the Taylor diagram in Figure 3 compares the six-hourly satellite- and model-derived rainfall data with the gauge data. The radius of the sector is the normalized standard deviation (the standard deviation of the gauge data is treated as 1), and the angle of the sector indicates the correlation coefficient between the estimated or forecasted data and the gauge data. The distance between the point for the satellite or model data and the point for the gauge data is the RMSE. The diagram shows that the RMSEs of the remote sensing data are smaller than those of the WRF-forecasted data, which is quite reasonable. The RMSEs of CMORPH and PERSIANN are smaller than that of the TRMM data, indicating that higher resolution data are more accurate than lower resolution data. It also shows that the correlation coefficients of CMORPH and TRMM are higher than that of PERSIANN; however, their standard deviations are also higher than that of PERSIANN, meaning their positioning is better than that of PERSIANN while their scale errors are a little higher. The RMSEs of the WRF model-forecasted rainfall at different resolutions are much larger than those of the satellite data. The higher the resolution of the WRF model, the larger the RMSE. Although the standard deviation of the higher resolution model is closer to that of the gauge data, the correlation coefficient is lower, which indicates that a higher resolution can forecast the scale accurately but the positioning and spatial pattern are somewhat questionable in this area of complex terrain.

Mean and maximum time series
In order to compare the performance of all the products during different periods of the rainfall's lifespan, the maximum rainfall within the research domain at each time step is plotted in Figure 4. The gauge data time series in Figure 4 shows three local extreme values, at the 7th, 11th, and 15th time steps, and most of the other data sets also show these extremes. The 7th time step extreme is the largest of the whole period in the gauge data, which is also the case in all of the other products apart from TRMM. At the first extreme (the 7th time step), the CMORPH values are vastly higher than the gauge data, while TRMM and PERSIANN are rather close to the gauge data. Meanwhile, the performance of the higher resolution model data, WRFd02 and WRFd03, is rather good, even as good as TRMM and PERSIANN. At the 11th time step, only CMORPH, TRMM, PERSIANN, and WRFd03 are roughly on the same scale as the gauge data; the maximum values of the other two model data sets are too small, and all of the satellite and model data lag by one time step. CMORPH and TRMM perform best at the 15th time step, while PERSIANN produces values that are less than half of those in the gauge data. The model data do not perform well at this time step, as the maximum rainfall is much lower than in the gauge data. Generally, the CMORPH (despite producing much higher values than the gauge data at the 7th time step) and TRMM maximum time series are the best among the regridded data.

Funding
This work was supported by the National Natural Science Foundation of China [grant numbers 41421004 and 41210007]; and the International Innovation Team project of the Chinese Academy of Sciences entitled 'High Resolution Numerical Simulation of Regional Environment'.

Conclusion and discussion
From the above results, it is clear that all of the products, i.e. satellite-retrieved data and numerical model-forecasted data, can capture the main features of mesoscale convective system rainfall, including the distribution, scales, timing, total accumulation, and extreme values. In general, the performance of satellite-derived rainfall is better than that of model-forecasted data (and, among the satellite products, the microwave-retrieved data are better than the infrared data); and basically, the higher resolution model data are better than the lower resolution data. This paper reveals that real-time, high-resolution satellite data can reproduce observed rainfall, even in mountainous areas with high altitude and complicated terrain, although there are still considerable errors and uncertainties in both microwave- and infrared-based rainfall retrieval. One possible reason for the errors lies in the basic theory and rationale of the estimation. Another is that the algorithms involved are developed and calibrated mainly with data from the plains, where there is an abundance of gauge stations, while in mountainous areas the algorithms are poorly calibrated. To improve the performance of satellite-derived rainfall, better sensors, better algorithms, and more calibration are needed.
The WRF model, especially at high resolution, can to a certain extent forecast the basic features of rainfall, including the basic distribution, rough position, and time of rainfall; but for precise usage, such as monitoring rainfall-induced hazards, the model needs to be improved.

Disclosure statement
No potential conflict of interest was reported by the authors.