Identification and quantitative analysis of flash flood risks for small catchments in China: a new operational modelling approach

ABSTRACT Under the influence of multiple factors, such as extreme weather conditions and human activities, flash flood disasters occur frequently in China and are difficult to predict. Based on the national flash flood disaster investigation and assessment project, this article focuses on small (10–50 km²) catchments and extracts 83 candidate indicators covering rainfall, underlying surfaces, present social and economic conditions, flood control capacity, water-related (wading) engineering works, and monitoring and early-warning facilities. Dimension reduction is performed using a principal component analysis, yielding 10 core, independent indicators. A risk evaluation indicator system is proposed and used to establish the risk cube model, which is verified with data from 53,235 historical flash flood events in China from 1949 to 2018. The results show that flash flood risk identification based on small watersheds can effectively reflect the disaster response relationship between rainfall and the underlying surface. In addition, 91% of historical flash floods occurred in the identified high-, medium- and low-risk areas, and the occurrence density in high-risk areas was about double that in low-risk areas. These evaluation results provide data support for the accurate defence, forecasting and early warning of flash flood disasters in China.


KEYWORDS
Flash flood disasters; principal component analysis; risk identification; risk cube; small catchment

Introduction
China is affected by both a typical monsoon climate and special geological features, and the country is known to suffer from the most frequent flash floods in the world. With sudden extreme rainfall, complex geological and topographical conditions, and high population densities in hazard areas, flash flood defence and management in China face great challenges. Meeting them requires accurate early risk identification and detailed quantitative risk analysis to support emergency and crisis management decisions at both local and national scales (Guo et al., 2017).
A sustainable development approach for a flood retention area can include three types of strategies: Risk is an integration of danger, exposure and vulnerability, which includes physical, social, economic, and environmental factors and many other aspects. Risk identification and evaluation should consider many risk indexes together, such as rainfall intensity, flood depth and other spatiotemporal factors (Hong et al., 2018; Tehrany et al., 2017). In the field of flash flood risk assessment, many approaches have been applied to integrate risk identification and evaluation, e.g. the rainfall threshold statistical method (Wang & Wang, 2016), the index system method (Alilou et al., 2019), the spatial analysis method combined with remote sensing and GIS (Geographic Information System) (Kia et al., 2012), and the multicriteria analysis method (Farzaneh et al., 2012). The rainfall threshold statistical method is one of the classical approaches and has been widely applied in the assessment of flash flood risk. For instance, in a flash flood risk analysis of the Nel Caine area, Wieczorek and Glade (2005) extended the rainfall threshold value obtained from a local statistical analysis of historical records to evaluate the flash flood risk level of the entire study area.
With the progression of risk analysis methods, the establishment of an indicator system for risk assessment and evaluation has come to be considered the core of a flash flood risk analysis. A growing number of researchers hold that the indicator system of a flash flood risk evaluation should consist of many indexes that together represent the local and regional natural, social and economic characteristics. Elkhrachy (2015) established an integrated indicator system that consisted of many indexes, including river discharges, soil types, surface slope, river channel roughness, and the density of river networks, for a flash flood risk analysis of Najran City in the Kingdom of Saudi Arabia. Moreover, with modelling development, in 2012, the US Hydrologic Research Center (HRC) proposed a dynamic rainfall threshold model for flash flood assessment and embedded it into its national flash flood guidance system (FFGS; Totschnig et al., 2011). In the same year, by analysing the relationship between hourly rainfall intensity and flash flood occurrence, Fan et al. (2012) developed a modelling tool to forecast the flash flood risk in Jiangxi province, China. For a larger scale assessment, using spatial analysis techniques (GIS environment) and referencing nearly 300 years of data, including flood hazard records as well as natural and socio-economic information, Tan et al. (2004) established a regional flood risk map series for China. A similar technological approach was applied for a US natural disaster map produced from data covering 1975 to 1994 (Dennis, 1999). In Europe, especially in the Alps region, many physical and social indicators extracted from historical records from 1279 to 2002 were integrated to determine the spatiotemporal distribution of the flash flood risk in not only the Alps region, but also the mountainous areas of Italy (Guzzetti et al., 2005, 2008; Holub et al., 2012; Totschnig et al., 2011).
Under current conditions, most approaches for risk identification and analysis have been implemented at the regional scale. Some distribution analyses were able to provide results based on a series of computation grids that covered the targeted area. However, the resolution of those grid maps was relatively low. Very few evaluation and analysis methods have focused on small-sized catchments (up to 50 km²), and most such approaches have lacked the integration of the detailed physical properties of the area that are driving factors at that scale. To address the challenges of flash flood management for small-sized catchments (10–50 km²) in the hilly regions of China, a new modelling and operational approach is developed in this study to support the establishment of prevention and control strategies. Based on 70 years of flash flood data collected at the national scale in China, an evaluation indicator system has been constructed and used for a new risk assessment model, termed the "risk cube", to produce risk maps of the Chinese territory.

Materials and methods
To achieve a comprehensive and consistent view of the flash flood risks in China, flash flood data were collected from 30 provinces that included information on 305 cities and 2138 counties, and covered 7.55 million km² and more than 900 million inhabitants. The collected data covered the period from 2013 to 2016. This initiative was part of the "National flash flood disaster investigation and assessment" framework led by the central government. By classifying the data into various categories, an integrated evaluation indicator system was established and further used in an innovative model proposed in this study to quantitatively analyse the flash flood risk throughout China.

The national flash flood database
In the project called the "China national flash flood disaster investigation and assessment", flash flood data were collected that included meteorological, hydrological, geological, and social information for 535,858 small-sized catchments (10–50 km²). A national database was established using 13 data types, 55 table formats, 30 types of spatial productions, 14 media records and seven types of files. The size of this database reached 102 TB with low-resolution historical data. In addition, the updates have reached 120 TB per year with new high-resolution data. The data used for this study are listed in Appendix 1 and include eight categories and 23 data types.

Method
Referencing the United Nations (UN) evaluation method, the risk of flash flood (R) is a function of hazard (H) and vulnerability (V; Sven et al., 2012):

$$R = f(H, V)$$

Moreover, the Asian Disaster Reduction Center in Japan considers that the risk of flash flood is affected by three independent elements: hazard (H), exposure (E) and vulnerability (V; Asian Disaster Reduction Center, 2008):

$$R = f(H, E, V)$$

The difference between these two risk evaluation formulas is the expression of the risk elements. The UN formula treats vulnerability as a characteristic of a community system whose flash flood vulnerability includes physical, social, and economic variables. The Japanese method provides more detail by considering the exposure of the population and property under a certain level of hazard. Although the UN definition of vulnerability covers both the Japanese exposure and vulnerability elements, the Japanese expression is more easily understood. Hence, the evaluation of flash flood risk in China followed the Japanese expression that combines the three elements. The technical framework for this analysis is shown in Figure 1.

Indicator selection
Based on the information supported by the national flash flood database, the hazard of flash flood risk was evaluated based on the meteorological values (primarily rainfall) and geological data. The exposure was quantified based on the current social and economic conditions of the targeted area. The vulnerability was determined using the integrated consideration of the current flood defence capability, including the building situation and the status of the alert equipment. In total, 83 risk elements were extracted from the database and used to identify and analyse the flash flood risk in the small-sized catchments of China (Appendix 2).

Principal component analysis
Based on the 83 indexes collected from the 255,382 small catchments with 53,235 historical flash flood records, a principal component analysis (PCA) was applied to extract the primary independent elements that contribute to flash flood defence and management. PCA is a classical method of multivariate statistical analysis that uses an orthogonal transformation to reduce a data set to a few significant variables that represent most of the information it contains. Assuming the study focuses on p indexes, represented by $x_1, x_2, x_3, \ldots, x_p$, the p-dimensional random vector X is $X = (x_1, x_2, x_3, \ldots, x_p)$. Its mean value and covariance matrix are $\mu$ and $\Sigma$. Implementation then uses a linear transformation of X to a new comprehensive variable Y:

$$Y_i = u_{i1} x_1 + u_{i2} x_2 + \cdots + u_{ip} x_p, \qquad i = 1, 2, \ldots, m$$

where the $u_{ij}$ are the coefficients of the ith linear combination and $Y_1$ is the first principal component, which has the maximum variance of all linear combinations of the p elements. If the first component is not able to represent the data set, then the second component is considered, and so on. $Y = (Y_1, Y_2, \ldots, Y_m)$ is constructed using the first, second, and up to the mth principal components of the original index data set.
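The transformation described above can be illustrated with a minimal sketch. It assumes the indicators are standardised first (so the analysis works on the correlation matrix), which the text does not state explicitly; the function name `pca_components` and the synthetic data are hypothetical and stand in for the real indicator table.

```python
import numpy as np

def pca_components(X, n_components):
    """Return (scores, eigenvalues, eigenvectors) for the leading components.

    X is an (n_samples, p) array of indicator values.
    Assumption: indicators are standardised before the decomposition.
    """
    X = np.asarray(X, dtype=float)
    # Centre and scale each indicator (column).
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance of standardised data equals the correlation matrix.
    corr = np.cov(Z, rowvar=False)
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Each column of `scores` is one principal component Y_i.
    scores = Z @ eigvecs
    return scores, eigvals, eigvecs

# Synthetic stand-in data: 200 catchments, 4 indicators (hypothetical).
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=200),  # nearly duplicates indicator 0
    base[:, 1],
    rng.normal(size=200),
])
scores, eigvals, eigvecs = pca_components(X, n_components=2)
# Share of total variance carried by each retained component.
explained = eigvals / X.shape[1]
```

The retained components are mutually uncorrelated, and the variance of each one equals its eigenvalue, which is why the first component carries the largest share of the information.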
By defining a correlation coefficient of 0.6 as the threshold value, 10 principal components from three categories were extracted from the 83 indicators, as shown in Figure 2. The 10 major components obtained were primarily focused on the position of the villages regarding river networks, short-term heavy rainfall properties, concentration times, flood peak modulus, hydraulic infrastructures, house types, flood prevention capacity, and monitoring and early-warning facilities. The spatial distribution of these indicators is shown in Figure 3.
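One possible reading of the 0.6 threshold is that an indicator is retained when its correlation with a leading component (its loading) reaches 0.6 in absolute value. The sketch below follows that reading as an assumption; the function `select_by_loading`, the indicator names and the synthetic data are all hypothetical.

```python
import numpy as np

def select_by_loading(X, names, n_components, threshold=0.6):
    """Keep indicators whose absolute loading on any retained component
    is at least `threshold` (assumed interpretation of the 0.6 cut-off)."""
    X = np.asarray(X, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    # For correlation-matrix PCA, eigenvector * sqrt(eigenvalue) is the
    # correlation between each indicator and each component.
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    keep = np.abs(loadings).max(axis=1) >= threshold
    selected = [name for name, flag in zip(names, keep) if flag]
    return selected, loadings

# Synthetic example: two strongly related rainfall indicators and one
# unrelated indicator (names are illustrative only).
rng = np.random.default_rng(1)
a = rng.normal(size=500)
X = np.column_stack([a, a + 0.1 * rng.normal(size=500), rng.normal(size=500)])
names = ["rain_1h", "rain_3h", "unrelated"]
selected, loadings = select_by_loading(X, names, n_components=1)
print(selected)
```

Taking the absolute value of the loadings matters because the sign of an eigenvector is arbitrary.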

Establishment of the risk evaluation system
After the PCA, the 10 extracted indicators were integrated to construct the flash flood risk evaluation system (Figure 4). Regarding the weight setting procedure, the three first-level indicators were given equal weights. For the secondary indicator weights, the "meteorological element" was considered to make a greater contribution to the "hazard" than the "geological element". Compared to the "fragile house", the "monitoring and early warning facilities" had a relatively higher weight in the category of "vulnerability". For the third-level indicators, except for the "flood peak modulus" element, which had a higher weight than the "flood lag-time" element in the category "topography", all of the elements had equal weights in their respective categories.
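The weighting scheme can be sketched as a nested tree in which the global weight of an indicator is the product of the weights along its path. The equal first-level weights follow the text; the 0.6/0.4 splits are purely illustrative assumptions (the text only states which element weighs more), and the tree is a simplified, hypothetical subset of the system in Figure 4.

```python
def leaf_weights(tree, parent_weight=1.0, prefix=""):
    """Flatten {name: (local_weight, subtree_or_None)} into
    {path: global_weight}, multiplying weights down each branch."""
    out = {}
    for name, (weight, subtree) in tree.items():
        path = f"{prefix}/{name}" if prefix else name
        if subtree is None:
            out[path] = parent_weight * weight
        else:
            out.update(leaf_weights(subtree, parent_weight * weight, path))
    return out

third = 1.0 / 3.0  # three first-level indicators, equally weighted
tree = {
    "hazard": (third, {
        "meteorological": (0.6, None),  # assumed value, only "> geological" is known
        "geological": (0.4, None),      # assumed value
    }),
    "exposure": (third, None),
    "vulnerability": (third, {
        "monitoring_and_early_warning": (0.6, None),  # assumed value
        "fragile_house": (0.4, None),                 # assumed value
    }),
}
weights = leaf_weights(tree)
for path, w in weights.items():
    print(f"{path}: {w:.3f}")
```

Because the local weights at each level sum to one, the global weights of all leaves also sum to one.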

The risk cube model
By normalising the principal components, the following equations were used to evaluate the flash flood risk in the small-sized catchments of China: where $H_i$, $E_j$ and $V_k$ refer to the variable indicators; m, n and l refer to the quantities of variable items; m', n', and l' refer to the variable

Based on the established risk assessment index system, a risk cube model was developed to further evaluate the risk of flash flood disasters in the small-sized catchments of China. The three axes of the cube represent the hazard, exposure and vulnerability, respectively. The higher the value on a given axis, the higher the evaluated risk. Hazard, exposure and vulnerability were each ranked with three levels (high risk, medium risk and low risk; Figure 5; Table 1).
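The cube structure can be sketched as follows: each catchment receives a level on each of the three axes and therefore falls into one of 27 cells. The cut points (1/3 and 2/3 on a normalised score) and the rule that merges the three axis levels into one overall class are hypothetical placeholders, since the classification of Table 1 is not reproduced here.

```python
from itertools import product

def axis_level(score, cuts=(1.0 / 3.0, 2.0 / 3.0)):
    """Map a normalised score in [0, 1] to a level index:
    0 = low, 1 = medium, 2 = high. The cut points are assumptions."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be normalised to [0, 1]")
    return sum(score >= c for c in cuts)

def cube_cell(hazard, exposure, vulnerability):
    """Locate a catchment in the 3 x 3 x 3 risk cube."""
    return tuple(axis_level(s) for s in (hazard, exposure, vulnerability))

def overall_risk(cell):
    """Hypothetical aggregation rule: sum the three level indices."""
    total = sum(cell)  # ranges from 0 to 6
    if total >= 5:
        return "high"
    if total >= 3:
        return "medium"
    return "low"

# All 27 cells of the cube.
cells = list(product(range(3), repeat=3))

# Example catchment: high hazard, medium exposure, low vulnerability.
cell = cube_cell(0.9, 0.5, 0.1)
print(cell, overall_risk(cell))
```

Any monotone rule would preserve the property stated above, namely that a higher value on any axis never lowers the evaluated risk.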

Results and discussion
The flash flood risk identification and quantitative analysis results for the entire country of China are shown in Figure 6. The high-risk areas are shown in dark and the medium-risk areas in grey. The high-risk areas were primarily located in six regions of China:
• The Wuyi mountain area covers four provinces (Guangdong, Fujian, Zhejiang and Jiangxi) and is characterised by a high population density, high urbanisation level and abundant rainfall. It was identified as an area of highly frequent flash floods.
• The Hengduan mountain area covers three provinces (Sichuan, Chongqing and Yunnan) and is located at the intersection of the Tibetan Plateau and the Sichuan plain. Historically, this area has frequently suffered geological disasters, such as the 2008 earthquake.
• The Qinling-Daba mountain area covers five provinces (Shaanxi, Gansu, Henan, Sichuan and Chongqing). The Earth's crust in this region displays major fractures, and the terrain rises in the north and falls in the south, with steep mountains, short rivers and multiple rapids. These conditions favour the formation of flash flood disasters.
• The Taihang-Yanshan mountain area covers three provinces (Hebei, Shanxi, and Beijing) and is adjacent to the political centre of China, with a large population density and high urbanisation level.
• The Loess Plateau area covers three provinces (Gansu, northern Shaanxi and Inner Mongolia). Its morphology is inclined from the north-west to the south-east and is primarily covered by thick loess. The surface has been eroded intensively by river flow for a long period of time, gradually forming thousands of gullies and valleys with terrain fragmentation. As the western centre of the country it has a large population density, and it is prone to flash flood disasters.
• The Changbai mountain area covers three provinces (Liaoning, Jilin and Heilongjiang) and has many mountainous valleys with a dense stream network. In recent years, there have been frequent flash floods in this area.
Areas identified with medium and low risk levels are widely distributed in this region according to the meteorological, hydrological and social conditions.
After rechecking the data collected from the national flash flood database, it was determined that 91% of the historical flash flood hazards were recorded in the risk areas identified in this study. The densities of flash flood hazards in the high-, medium- and low-risk areas were 190, 119 and 98 events per 10,000 km², respectively.
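The comparison between risk classes follows directly from the reported densities. The short sketch below shows how such a density is computed and what ratio the reported values imply; the helper name `density_per_10k_km2` is hypothetical, and only the three reported densities are taken from the text.

```python
def density_per_10k_km2(n_events, area_km2):
    """Number of recorded events per 10,000 km2 of a risk class."""
    return n_events * 10_000 / area_km2

# Densities reported in the text (events per 10,000 km2).
reported = {"high": 190, "medium": 119, "low": 98}

ratio_high_to_low = reported["high"] / reported["low"]
ratio_high_to_medium = reported["high"] / reported["medium"]
print(f"high/low = {ratio_high_to_low:.2f}")
print(f"high/medium = {ratio_high_to_medium:.2f}")
```

The high-to-low ratio of about 1.94 is the basis for describing the occurrence density in high-risk areas as roughly double that in low-risk areas.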
With regard to flash flood disasters that have occurred in the different provinces of China (Figure 7), Sichuan Province, located in the southwest part of China, had the largest flash flood-affected area (355,494 km²) among all of the flash flood-affected provinces of China. In addition, 12% of its flash flood-affected area was identified as a high-risk area that contains 20% of its flash flood-affected population. Yunnan Province, which is close to the Tibetan Plateau, had the second largest flash flood-affected area (342,003 km²). However, the population that can be affected by a flash flood disaster in this province is clearly higher than that in any other flash flood-affected province of China. There are nearly 7,670,000 people under threat of serious flash floods (i.e. they live in high-risk areas), comprising nearly 18% of the total flash flood-affected population. Liaoning Province has the second highest population affected by flash flood disasters (nearly 31,420,000). Moreover, the flash flood-affected area of this province is merely 117,423 km², which makes it the most densely populated flash flood-affected area in China, at 268 people/km².
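The density quoted for Liaoning can be checked from the two figures given in the paragraph above. This is only an arithmetic check on the reported numbers, with hypothetical variable names.

```python
# Figures reported in the text for Liaoning Province.
affected_population = 31_420_000  # people (reported as "nearly")
affected_area_km2 = 117_423       # km2

people_per_km2 = affected_population / affected_area_km2
print(f"{people_per_km2:.1f} people/km2")
```

The result rounds to the 268 people/km² stated in the text.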
These results demonstrate the rationality and applicability of the proposed approach for flash flood risk identification and analysis. The risk assessment results in Figure 7 show the areas and populations under the different risk levels in the various provinces of China. The approach is promising for its ability to guide future decision-making processes and support the policymaking process for flash flood disaster reduction and early warning.

Conclusions
China is a country that has frequently suffered from flash flood disasters in recent years. Flash flood hazards in small catchments are characterised by their sudden occurrence, short foreseeable periods, and serious disaster losses, which present huge challenges for flood defence and management. Accurately identifying and quantitatively evaluating the flash flood risk of small catchments in China could help decision makers and even local residents to be aware of and prepared for flash flood hazards. This would then contribute to reducing losses caused by flood disasters. Based on the national flash flood database, an operational approach was proposed to identify and quantitatively analyse the flash flood risks in 535,858 small-sized (10–50 km²) catchments in China. The approach considered the primary variables of flash flood hazards (hazard, exposure and vulnerability) and associated 83 indicators collected from the national database to evaluate the risk of small catchments. PCA was used to extract the primary principal components for the analysis. Even though this method is quite simple, it was able to address the full situation of the Chinese territory. By using the risk cube approach, the risk levels of the small catchments in China were determined and are shown on a map at the scale of the entire country. With 53,235 historical flash flood records collected during the last 70 years, the conclusions can be summarised as follows:
(1) At the small catchment scale, the flash flood risk responded to hazard characteristics such as short-term heavy rainfall, the underlying surface of the watershed, and human activities. The risk evaluation proposed in this article considered the small catchment as the basic evaluation unit and then further analysed both its physical and its social properties.
(2) The results produced from the proposed approach showed that the high-risk areas in China are primarily concentrated along the coastal regions that frequently suffer from extreme rainfall events, such as typhoons and high-intensity rainfalls. Moreover, the evaluation of flash flood risk also considered human activities and socio-economic conditions. The same level of flash flood risk in areas with different populations will cause different losses. Under the current conditions, 54.63 million people live in high-risk flash flood areas in China.
(3) The proposed approach was validated using historical flash flood records. A total of 91% of historical flash floods occurred in the high-, medium- and low-risk areas identified by the approach. The results produced here will be applied to support flash flood disaster prevention strategies in China and will help define priorities going forward.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported by the National Key R&D Program of China [2019YFC1510603].

ORCID
Qiang Ma http://orcid.org/0000-0001-5572-0265

Appendix 1. A brief introduction to the national flash flood database (only the data related to this study are listed).

Historical records; design storm rainfall: within hourly, daily, monthly, and yearly time intervals
Hydrological data: observation of river discharge; observation of flood peak and peak time; calculated flood parameters (C_v, C_s); estimated flood lag-time, etc.: within hourly, daily, monthly, yearly time intervals

1                      Maximum value in 10 minutes of rainfall
2    C_v10min          Variation in 10 minutes of rainfall
3    P_10min,20%       Maximum value in 10 minutes of rainfall in a 5-year return rainfall event
4    K_P10min,20%      Modular of 10 minutes of rainfall in a 5-year return period rainfall event
5    P_10min,10%       Maximum value of 10 minutes of rainfall in a 10-year return rainfall event
6    K_P10min,10%      Modular of 10 minutes of rainfall in a 10-year return period rainfall event
7    P_10min,5%        Maximum value of 10 minutes of rainfall in a 20-year return rainfall event
8    K_P10min,5%       Modular of 10 minutes of rainfall in a 20-year return period rainfall event
9    P_10min,2%        Maximum value of 10 minutes of rainfall in a 50-year return rainfall event
10   K_P10min,2%       Modular of 10 minutes of rainfall in a 50-year return period rainfall event
11   P_10min,1%        Maximum value of 10 minutes of rainfall in a 100-year return rainfall event
12   K