A novel semantic segmentation approach based on U-Net, WU-Net, and U-Net++ deep learning for predicting areas sensitive to pluvial flood at tropical area

ABSTRACT Floods remain one of the most devastating weather-induced disasters worldwide, resulting in numerous fatalities each year and severely impacting socio-economic development and the environment. Therefore, the ability to predict flood-prone areas in advance is crucial for effective risk management. The objective of this research is to assess and compare three convolutional neural networks, U-Net, WU-Net, and U-Net++, for the spatial prediction of pluvial floods, with a case study in a tropical area in the north of Vietnam. They are relatively new convolutional algorithms based on U-shaped architectures. For this task, a geospatial database with 796 historical flood locations and 12 flood indicators was prepared. To train the models, binary cross-entropy was employed as the loss function and the adaptive moment estimation (ADAM) algorithm was used to optimize the model parameters, while the F1-score and classification accuracy (Acc) were used to assess the performance of the models. The results highlight the high performance of the three models, achieving an accuracy of 96.01%. The flood susceptibility maps derived from this research can support local authorities in decision-making and in the implementation of effective risk management strategies.


Introduction
Floods remain a pressing global concern due to their occurrence in diverse regions worldwide. A report by the United Nations in 2015 pointed out that the estimated average global losses reached 104 billion USD per year (Desai et al. 2015) and these losses continue to increase (Paprotny et al. 2018; Winsemius et al. 2015). Thus, floods present a great risk among the natural hazards to our society both at the economic level (Bui et al. 2020), affecting crops or infrastructures, and at the health level, with the potential to cause epidemics or even deaths (Nkwunonwo, Whitworth, and Baily 2020). The projected outlook indicates an exacerbation of flood problems in the future due to the adverse effects of global warming and climate change, which result in extreme rainfall events worldwide (Barnes et al. 2018; Konapala et al. 2020; Papalexiou and Montanari 2019; Schiermeier 2011; Troncoso et al. 2018; Xie et al. 2010). According to Rentschler, Salhab, and Jafino (2022), an estimated 1.81 billion people are directly exposed to 1-in-100-year floods, i.e. approximately 23% of the world population. In addition, around 68% of the population is expected to live in urban areas by 2050 (O'Donnell and Thorne 2020). Therefore, investing in research and innovation to advance flood prediction techniques and develop robust flood management strategies is of paramount importance.
A review of the literature shows that flood modeling and prediction is becoming a well-studied topic. In 2013, a strategic approach for flood studies was published by Sayers et al. (2013), in which UNESCO and the World Wide Fund for Nature (WWF), among others, collaborated. Extensive reviews of models and methods for flood prediction can be found in Mudashiru et al. (2021), Santiago-Collazo, Bilskie, and Hagen (2019) and Zounemat-Kermani et al. (2021). Essentially, flood prediction models can be categorized into three main groups: statistical analysis (McCuen 2016), rainfall-runoff models (Bennett et al. 2016), and 'on-off' classification models (Bui et al. 2016). The first group relies on conducting regression analysis of time series data at gauged stations. In contrast, the second group employs mathematical equations to simulate the spreading of floods and extrapolate their impact to surrounding areas. Both of these groups can yield highly accurate predictions. However, long time series data at gauged stations with return periods are required for the flood prediction to be reliable (Read and Vogel 2015). The last group represents a relatively novel approach to flood modeling that does not rely on measured data from gauged stations. Consequently, this approach does not incorporate the concept of return periods, which is a requirement in traditional modeling approaches. Instead, the concept of 'on-off' or binary classification is employed. Here, 'on' signifies flood locations that have occurred in the past or present, while 'off' refers to non-flood locations within the region under consideration (Tien Bui and Hoang 2017). This has proven to be a powerful approach, allowing the incorporation of a large amount of geo-environmental data (Nguyen et al. 2020). For example, the flash-flood event in October 2018 in Mallorca, Spain was studied through an integrated approach with rainfall radar images and meteorological, hydrological, geomorphological, damage and risk data analysis of the area (Estrany et al. 2019). In Martins et al. (2023), a Brazilian flash-flood event was analyzed with images and news of the event and social perception through surveys of residents of the area.
In this context, machine learning methods have proven to be key tools to help characterize flood susceptibility areas with promising results, e.g. support vector machines (Youssef et al. 2022), random forest (Hasanuzzaman et al. 2022), decision trees (Abedi et al. 2022), neural networks (Cui et al. 2023), and ensemble machine learning (Fang et al. 2022). In recent years, deep learning has emerged as a potent technique for flood susceptibility mapping, offering significant potential to enhance our comprehension and management of flood-prone regions. As the capabilities of flood prediction continue to advance through deep learning (Bui et al. 2020; Li and Hong 2023; Youssef et al. 2022), its application holds promise in enabling more precise assessments and facilitating the development of effective strategies to mitigate flood risks. However, despite the abundance of deep learning algorithms, their exploration in the context of flood susceptibility studies remains limited. Therefore, further research is necessary to draw reasonable conclusions and unlock the full potential of deep learning in this domain. This research aims to partly fill this gap in the literature by evaluating and comparing the potential application of U-Net, WU-Net, and U-Net++ for pluvial flood susceptibility in a tropical area.
The rest of the article is structured as follows: First, Section 2 introduces the previous studies of the deep learning methodologies in flood prediction problems. Then, Section 3 describes the study area. Afterwards, the developed flood prediction methodologies and the presentation and discussion of the obtained results are presented in Section 4 and in Section 5, respectively. Finally, Section 6 summarizes the conclusions obtained in this research.

Related works
In recent decades, artificial intelligence techniques have become an essential tool for managing natural disasters (Martínez-Álvarez and Morales-Esteban 2019). In this type of phenomenon, it is crucial to understand not only the phenomenon itself but also its relationship with exogenous factors. In Taromideh et al. (2022), traditional machine learning algorithms such as classification and regression trees, random forest, or support vector machines, among others, were implemented to create an urban flood-risk map of a city in Iran. The vulnerability and hazard of the areas were analyzed and significant parameters were discovered for each of them, such as population density or distance to rivers, respectively. In Hosseini et al. (2021), a decision-making trial and evaluation laboratory (DEMATEL) method, an analytical network process (ANP) and fuzzy methods were used to address the flood vulnerability of a watershed in a severely flooded area of Iran. The authors found that the two most important variables in the model were the land use and the distance to the stream.
Several spatial models of flood events are being analyzed to obtain accurate flood maps. In Rahman et al. (2021), the authors created multi-type flood maps, i.e. fluvial, flash, pluvial and surge floods, in Bangladesh using machine learning algorithms such as the locally weighted linear regression (LWLR) model. They included flood data, remote sensing images, topography, hydrogeology and environmental datasets. The same type of information was utilized by Baig et al. (2022) to create a flood probability map of the Koshi River basin in the Himalayas. Traditional machine learning algorithms with multiple kernel functions were developed.
Along with traditional machine learning methods, deep learning techniques are also an interesting approach to modeling flood problems. A state-of-the-art review of the application of deep learning models to flood mapping was published in 2022 (Bentivoglio et al. 2022). Convolutional neural networks (CNN) were identified as the leading and most accurate deep learning models considering all the reviewed research articles. CNN were trained in Ullah et al. (2022) to produce multi-hazard susceptibility maps, i.e. prediction of the probability of flash-floods, debris flows and landslides. The proposed CNN model performed better than several traditional machine learning algorithms with which it was compared. In Kabir et al. (2020), a CNN method for real-time fluvial flood forecast was presented demonstrating, once again, its superiority over other traditional machine learning methods.
Convolutional neural networks are composed of different layers that can be adapted depending on the objective of the problem. In Minaee et al. (2022), the authors stated that CNN are often involved in image segmentation tasks. They conducted a literature review in this field and defined semantic segmentation as a classification problem of pixels with semantic labels. In this survey, the authors mentioned the encoder-decoder technique as a class of image segmentation models. Some of the best known encoder-decoder models follow the u-shaped architecture, i.e. a contraction path for capturing context and a symmetric expansion path for localization.
All the convolutional neural networks used in our study to predict areas sensitive to pluvial flood follow u-shaped architectures. In particular, the three algorithms implemented in our research are: U-Net, WU-Net and U-Net++. They are defined in detail in Section 4.2. These three algorithms were originally used in biomedical applications. A survey of the different usages of the u-shaped networks in medical image segmentation was presented in Liu et al. (2020). The following paragraphs review current applications of these three algorithms in other fields.
U-Net was first proposed in 2015 with the goal of obtaining a better segmentation of biomedical images (Ronneberger, Fischer, and Brox 2015). Since then, this algorithm and its applications in different areas have evolved. For example, in Flood, Watson, and Collett (2019) U-Net was used to map the presence or absence of trees and shrubs in Australia. In Zhuang, Zhang, and Wang (2020) it was used to segment small-scale residential solar panels. Regarding flooding, U-Net, along with PSPNet and DeepLabV3 (two other encoder-decoder techniques), was developed by Andrew et al. (2023) for automatic flood mapping image segmentation. The U-Net algorithm performed better in terms of accuracy and F1-score than the other two deep learning algorithms. In Li and Demir (2023) U-Net was modified to capture water bodies during the 2019 Central US flooding. Their modification of the U-Net outperformed the traditional benchmark models. In Zhao et al. (2022) near-real-time urban flood mapping was accomplished using an urban-aware U-Net model.
At present, U-Net is being used as a base model to develop improved and more specific algorithms. WU-Net is one of these new versions of the traditional U-Net. WU-Net was introduced by Muralikrishnan, Kim, and Chaudhuri (2018). Since its introduction, WU-Net has mostly been used in the biomedical area. An application of the WU-Net to hyperspectral imagery simulated from the HyMap data over Munich, Germany, was conducted in Hong et al. (2019). To the best of our knowledge, there is no specific article in the literature in which a WU-Net model was applied to analyze a natural flood catastrophe.
The last deep learning algorithm implemented in our research work is U-Net++, which is a very recent algorithm (Zhou et al. 2020) that is also based on U-Net. Like the previously mentioned models, U-Net++ is mainly used in detection and segmentation of biomedical images. In Yang et al. (2020), the authors focused on a new application of this algorithm, in particular the detection of seismic faults using a 3D U-Net++ model. In Helleis et al. (2022) five different CNN architectures, including U-Net and U-Net++, were applied to Sentinel-1 data to map water and floods. The authors reported that all CNN algorithms performed similarly on the flooding dataset and that attention needed to be paid to arid or mountainous environments. They ended the article by emphasizing the need for further optimization of the CNN architectures. In Khan and Basalamah (2023) flood segmentation using aerial images from the 2017 hurricane landfall in Texas and Louisiana was carried out with several deep learning algorithms. Two of them were U-Net and U-Net++. U-Net++ outperformed U-Net by redesigning the skip connections, i.e. feature maps of the encoder part of the U-Net++ were pre-enriched.

Study area and flood data
This section presents the area of Vietnam studied in this research. More specifically, Section 3.1 details the spatial and environmental characteristics of the area. The specific flash-flood data used in this study are described in Section 3.2, focusing in Section 3.2.1 on the flood inventory maps and in Section 3.2.2 on the indicators considered.

Description of the study area
The study area is the Phu Tho province, located in the northwest region of Vietnam, approximately 70 km from Hanoi City (Figure 1). The province lies between latitudes of 20°55′0″ N and 21°43′0″ N and between longitudes of 104°48′0″ E and 105°27′0″ E, with an area and a population of 3534.6 km² and 1,463,726, respectively.
In terms of morphometry, Phu Tho has a very complex terrain, consisting of mountains and hills in the west and south regions and valleys and plains in the north and east regions, with elevations ranging widely from 30.2 to 1382.1 m a.m.s.l. It was noted that a significant percentage (51.6%) of high slope area (> 15°) may increase the possibility of flash-flood occurrences in this study area. The hydrological system in the study area consists of dense river and stream networks, which connect directly to three main rivers: the Thao river, the Da river, and the Lo river.
Climatically, Phu Tho is located in the tropical monsoon climate region, characterized by less rainfall and dry air in the northeast monsoon season (from April to October) and by heavy rainfall and hot weather in the southeast monsoon season (from November to March). Historical rainfall monitoring showed that the yearly maximum and minimum rainfall amounts in this area were approximately 3057.2 mm in 1980 and 1192.5 mm in 1977, respectively. The study area has an average temperature, humidity, and rainfall of 23 °C, 85% and 700 mm per year, respectively. More noticeably, the complexity of the terrain and recent rapid land cover transformation, coupled with heavy rainfall in a short time in high slope areas, may cause a high possibility of flash-flood occurrences in the study area.

Flood inventory map
Making a flash-flood inventory map is the first and essential step for mapping susceptibility (Costache et al. 2020; Hosseini et al. 2020; Razavi Termeh et al. 2018). A flash-flood inventory map presents a correlation between influencing factors and previous flash-flood events (Nguyen et al. 2020; Tien Bui et al. 2020). For this purpose, we collected 796 historical flash-flood locations that occurred between 2010 and 2019 from our field surveys, local authorities, newspapers, and the literature. These data were randomly divided into two sets, with 70% of the samples for training and 30% for testing the models. The locations and distributions of the flash-flood events are shown in Figure 1.

Flood indicators
The selection of the factors influencing flash-flood occurrences is an important stage in producing accurate and reliable flood susceptibility maps (Tehrany et al. 2018). The influencing factors are often selected based on the physical and statistical relationship between the previous flash-flood events and these factors. However, the occurrence of flash-flood events in a certain catchment is very complicated. Therefore, we selected influencing factors based on literature reviews, coupled with consideration of previous flash-flood characteristics and the hydrometeorological and geological conditions in this study area (Hosseini et al. 2020; Nguyen et al. 2020; Tien Bui et al. 2020). Accordingly, a total of twelve influencing factors were selected: land cover, NDVI, land use, lithology, soil type, stream density, rainfall, elevation, aspect, plan curvature, profile curvature, and slope.
Land use/Land cover (LULC): LULC plays an essential role in generating runoff processes and flash-flood occurrences in a certain catchment. In large areas, for instance, LULC may cause an increase or a decrease in soil moisture, influencing the soil infiltration capacity (Pishvaei et al. 2020). Meanwhile, the modification of LULC in hillslopes can change runoff characteristics such as flow paths, velocity, flow connectivity and times of concentration within a watershed (Rogger et al. 2017). Therefore, LULC is always considered an important factor in mapping flash-flood susceptibility (Costache and Tien Bui 2020; Hosseini et al. 2020; Yariyan et al. 2020). In the present study, the LULC map is obtained from Sentinel-1 C-band SAR data (Nguyen et al. 2020). There are eight land cover types, including water, urban and built-up, paddy rice, crops, grassland, orchard area, bare land, and forest (Figure 2(a)), in which a significant percentage of low vegetation density areas such as urban and built-up, grassland, and crops may have a strong influence on the generation of flash-floods. Similarly, the study area consists of nine land use types including built-up area, agricultural land, annual crop land, Pcland, Pdland, Ptland, water surface, NTRM, and unused hilly land (Figure 2(c)). It was noted that significant land use types related to human activities, such as intensive agricultural development and urbanization, may have a high potential for flash-flood occurrence.
Normalized Difference Vegetation Index (NDVI): NDVI presents characteristics of vegetation coverage, such as types of species, density, diversity and variation, that influence runoff flow processes (Kalisa et al. 2019). Many studies demonstrated the relationship between NDVI and flash-flood occurrences in which the higher NDVI areas showed a high probability of flash-flood occurrences compared to lower ones. Therefore, NDVI is often used as a crucial variable in predicting flash floods (Nguyen et al. 2020; Tien Bui et al. 2020). For the present study, Landsat-8 OLI imagery is used to prepare the NDVI map (Tien Bui et al. 2020). The NDVI value is estimated using Equation (1):
$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{RED}}{\mathrm{NIR} + \mathrm{RED}} \qquad (1)$$
where NIR and RED indicate surface reflectance of the near-infrared band and the red band, respectively. The NDVI in the selected area is presented in Figure 2(b).
Lithology: The lithological factor is an important variable in predicting flash-flood susceptibility because its characteristics affect the processes of infiltration, runoff generation and flood occurrence. For example, each type of rock differs in grain size composition, moisture content, plasticity, porosity, and volume, which influence its water holding capability, absorption capacity, infiltration, and storage capacity (Myslinska 1983). These hydraulic properties have a strong effect on increasing or decreasing the magnitude of water flow. For this study, the lithological map (Figure 2(d)) is prepared from the Phu Tho Geological and Mineral Resources Map at a scale of 1:50,000 obtained from the Vietnam Institute of Geosciences and Mineral Resources. Accordingly, there are eighteen lithological types in this study area, of which six lithologies (L2, L4, L5, L12, L13, and L18) cover more than 82.7% of the total area. A high density of previous flash-flood events is recorded in the L4 and L15 groups, indicating a high probability of flash-flood occurrences.
Soil type: Physical characteristics of the soil type influence the infiltration rates and, therefore, often regulate runoff generation and flash-flood processes taking place in a watershed (Nguyen et al. 2020; Tien Bui et al. 2020). Impermeable and permeable soil types have different capabilities of absorbing water, which can increase or reduce runoff flow and the concentration times of flow from a certain catchment or watershed to the outfall. For the present study, the soil map is created using the 1:50,000 scale soil texture map from the Vietnam Soil Map. A total of 14 soil types are identified in this study area, of which Fs, AS, and Fp account for approximately 89.6% of the total area (Figure 2(e)).
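As an illustration of the NDVI definition given for Equation (1), the ratio of the difference to the sum of the near-infrared and red reflectances, a minimal per-pixel sketch in Python could look as follows. The function name `ndvi`, the zero-denominator convention and the reflectance values are assumptions made for this example only; they are not taken from the processing chain used in the study.

```python
def ndvi(nir, red):
    """Return the NDVI of one pixel from NIR and red surface reflectance."""
    denominator = nir + red
    if denominator == 0:
        # Assumption: a pixel with no signal in either band is mapped to 0
        # instead of raising a division error.
        return 0.0
    return (nir - red) / denominator


# Hypothetical reflectance pairs: dense vegetation versus nearly bare ground.
for nir, red in [(0.45, 0.05), (0.20, 0.18)]:
    print(f"NIR={nir:.2f} RED={red:.2f} NDVI={ndvi(nir, red):.3f}")
```

Values close to 1 indicate dense green vegetation, while values near 0 indicate sparse or no vegetation.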
Stream density (SD): Streams or drains convey water flow from hillslopes to valleys and plains. Stream density (SD) is a representative factor showing the magnitude of flow concentration in a watershed. Previous studies also indicated that areas with a high density of streams are more prone to flash-flood occurrences (Nguyen et al. 2020; Tien Bui et al. 2020; Yariyan et al. 2020). Therefore, the SD is often considered a crucial influencing factor for mapping flash-flood susceptibility (Glenn et al. 2012). Stream density in this study area is prepared from a digital elevation model (DEM) with a resolution of 30 m × 30 m and the Vietnamese river network system database. As shown in Figure 2(f), relatively high stream density (> 2.5 km/km²) is located in small valleys along the streams and rivers, while more frequent historical flash-flood records are found in the high elevation areas. This fact indicates that the occurrence of flash-flood events in this study area depends not only on the SD but also on other influencing factors.
Rainfall: Rainfall is characterized by intensity, duration and frequency, which have a strong influence on runoff generation and flash-flood occurrences within a certain watershed. Although each rainfall event may have a different impact on flash-flood magnitude, flash-floods often take place immediately after heavy rainfall within a short time in steep slope areas (Nguyen et al. 2020; Tien Bui et al. 2020). Hence, rainfall is an essential influencing factor in flood prediction. For this study, we obtained the maximum 10-day rainfall during the last 5 years at 50 stations in and around the study area to generate the rainfall pattern map using the kriging interpolation method. Accordingly, the rainfall varied from 453.4 to 1162.5 mm (Figure 2(g)), with high rainfall intensity (> 750 mm) observed in the southwest and northeast regions, where many flash-flood records occurred previously.
Elevation: Elevation has a strong influence on the local and regional climate system, which controls the development of plants and vegetation (Moradi, Fattorini, and Oldeland 2020). Also, high elevation and wide derivatives produce more energy, speeding up flow generation and its movement from higher elevation areas to lowland areas (Tien Bui et al. 2020). The elevation map in this study area is created using a DEM with a grid size of 30 m × 30 m. High elevation areas are mainly located in the southwest region of the study area with a maximum of 1382.1 m, whilst low elevation areas extend from the northwest to southeast regions with a minimum of −93.1 m (Figure 2(h)).
Aspect: Aspect is an important influencing factor for flash-flood susceptibility mapping because it can affect the local climate, physiographic conditions, soil moisture content and vegetation growth in a watershed (Costache, Hong, and Wang 2019; Florinsky 2016b). As shown in Figure 2(i), the study area is dominantly covered by N, NE, E, and SE aspects, which have more previous flash-flood occurrences.
Plan curvature: Plan curvature is defined as the curvature perpendicular to the slope direction, which influences the convergence and divergence of flow across the surface (Florinsky 2016a) and is a crucial factor in flash-flood prediction (Costache, Hong, and Wang 2019; Nguyen et al. 2020). The plan curvature is often classified into concavity (positive), convexity (negative), and flat (zero). In this study, the plan curvature map is created by intersecting the horizontal plane and the surface on the DEM model with a resolution of 30 m × 30 m. As a result, approximately 85% of the total area in this province is covered by concave zones (Figure 2(j)).
Profile curvature: Profile curvature is parallel to the direction of the maximum slope, indicating the downhill or uphill rate of change of the gradient. Similar to plan curvature, the profile curvature has three classes: concavity (positive), convexity (negative), and flat (zero) (Florinsky 2016a). Profile curvature has a strong effect on accelerating or decelerating the flow across the surface within a certain watershed (Nguyen et al. 2020). Therefore, the combination of plan and profile curvature could help to understand the characteristics of flow across the surface in this study area. For this purpose, the profile curvature map is prepared based on the DEM with a resolution of 30 m × 30 m. The profile curvature in this area shows a wide variation ranging from −11.03 to 11.29, with approximately 82.6% of the total area covered by concave zones (Figure 2(k)). Most of the previous flash-flood records in this area are in plains, which have very low values, indicating a high potential for future flash-flood occurrences in these areas.
Slope: The slope is an extremely important flash-flood predictor because high slopes can speed up water flow velocity while low slopes may reduce the water flow downstream (Costache and Tien Bui 2020; Nguyen et al. 2020). In this study area, the slope shows a large range varying from 0 to 68.6 degrees (Figure 2(l)). High slope values are found in the mountainous areas of the southwest region, whilst low slope values are found in small valleys and plains with a high density of historical flash-floods. This fact indicates that future flash-flood events are more likely to occur in low slope areas than in high slope areas.

Proposed modeling approach based on convolutional networks for tropical cyclone-induced flood susceptible mapping
This Section presents the dataset, the three convolutional networks used in this research and the metrics used to evaluate the performance of the models and predicted results.

Flood database
The experiments in this research are carried out with a dataset of 1412 input images and their corresponding 1412 target images. Every image chip is made up of 32 × 32 pixels. As Figure 2 shows, every pixel of an input image has 12 flood indicators, i.e. input images are designed as rasters with 12 bands. On the other hand, target images are binary rasters where '1' means flood and '0' means non-flood. Therefore, in summary, one input image has dimensions of 32 × 32 × 12 and one target image has dimensions of 32 × 32 × 1. During the training phase of the convolutional networks, the dataset is split into a training set with 692 images (approximately 49% of the whole dataset) for training, a validation set with 296 images (approximately 21% of the whole dataset) for parameter tuning, and a test set with 424 images (the remaining 30%) to evaluate the models.
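The split sizes quoted above can be reproduced with a short sketch. The following Python example shuffles chip indices and cuts them into the three subsets; the function name `split_indices`, the fixed seed and the use of a plain random shuffle are assumptions for illustration, since the exact sampling procedure is not described in the text.

```python
import random


def split_indices(n_images=1412, n_train=692, n_val=296, seed=0):
    """Shuffle chip indices and cut them into train, validation and test."""
    indices = list(range(n_images))
    # Assumption: a seeded shuffle, so that the split is repeatable.
    random.Random(seed).shuffle(indices)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # whatever remains is the test set
    return train, val, test


train, val, test = split_indices()
for name, part in (("train", train), ("validation", val), ("test", test)):
    share = 100 * len(part) / 1412
    print(f"{name}: {len(part)} images ({share:.1f}%)")
```

With the sizes given in the text, the shares come out at roughly 49%, 21% and 30% of the 1412 chips.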
Once a model is successfully trained and tested, it is ready to predict the whole study area image. The whole study area is represented by an image of 2218 × 2961 × 12 pixels. This whole image is separated into image chips of 32 × 32 × 12, each with an identification number. Since the dimensions of the image to predict are not divisible by 32, the empty pixels needed to complete the partial chips are filled with −9, the value used for pixels outside the boundary of the original image. When the prediction is done, all predicted image chips of 32 × 32 × 1 are merged (according to their identification numbers) to obtain the prediction for the original full image. This is the final susceptibility map.
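The chipping, padding and merging steps described above can be sketched in plain Python for a single band. The helper names (`chip_grid`, `split_into_chips`, `merge_chips`) and the row-major numbering of the chips are assumptions for this example; the text only states that each chip carries an identification number and that missing pixels are filled with −9.

```python
import math

CHIP = 32   # chip side length in pixels, as stated in the text
FILL = -9   # value for pixels outside the original image, as stated in the text


def chip_grid(height, width, chip=CHIP):
    """Number of chip rows and columns needed to cover the image."""
    return math.ceil(height / chip), math.ceil(width / chip)


def split_into_chips(image, chip=CHIP, fill=FILL):
    """Cut a 2D list of rows into padded chips keyed by a row-major id."""
    height, width = len(image), len(image[0])
    n_rows, n_cols = chip_grid(height, width, chip)
    chips = {}
    for r in range(n_rows):
        for c in range(n_cols):
            tile = []
            for y in range(r * chip, (r + 1) * chip):
                row = []
                for x in range(c * chip, (c + 1) * chip):
                    inside = y < height and x < width
                    row.append(image[y][x] if inside else fill)
                tile.append(row)
            chips[r * n_cols + c] = tile  # hypothetical id scheme
    return chips


def merge_chips(chips, height, width, chip=CHIP):
    """Reassemble chips by id and drop the padded border pixels."""
    _, n_cols = chip_grid(height, width, chip)
    merged = [[None] * width for _ in range(height)]
    for chip_id, tile in chips.items():
        r, c = divmod(chip_id, n_cols)
        for dy, row in enumerate(tile):
            for dx, value in enumerate(row):
                y, x = r * chip + dy, c * chip + dx
                if y < height and x < width:
                    merged[y][x] = value
    return merged


# Small synthetic single-band image whose size is not divisible by 32.
small = [[y * 50 + x for x in range(50)] for y in range(70)]
chips = split_into_chips(small)
print(len(chips), merge_chips(chips, 70, 50) == small)
print(chip_grid(2218, 2961))
```

For the full 2218 × 2961 image, this scheme gives a grid of 70 × 93 chips regardless of which dimension is taken as the height. In practice the same logic would be applied to all 12 bands at once with an array library.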

Convolutional networks for flood modeling
Since the late 1980s, convolutional neural networks, also known as CNN, have been used in many visual tasks. Traditionally, when a CNN is applied in classification problems, the complete output of an input image is a single class label (Rawat and Wang 2017). Nevertheless, some visual tasks need localization, i.e. assignment of a class label to each pixel. In this section, the three convolutional networks selected to predict floods in our study area are presented.

U-Net model
U-Net (Ronneberger, Fischer, and Brox 2015) uses the fully convolutional network (Long, Shelhamer, and Darrell 2015) as its base architecture and applies some modifications with the objective of obtaining a fast and precise segmentation of images. The architecture of U-Net is divided into two paths: a contracting path and a symmetric expanding path, which are represented with a u-shaped architecture, hence its name. The total architecture has 23 convolutional layers: 22 in these two paths and 1 in the final layer to obtain the requested number of classes. During the first path, the context of the image is captured using a typical convolutional network, i.e. two convolutions followed by Rectified Linear Unit (ReLU) activations and a max-pooling operator, after which the number of feature channels is doubled. Afterwards, the U-Net starts the expansion path to enable precise localization, creating high-resolution segmentation maps. This second path is made up of up-convolutions, concatenations with the corresponding feature maps of the first path and, again, two convolutions followed by ReLU activations.
This architecture contains a special strategy, called the overlap-tile strategy, to obtain the most accurate prediction of the pixels at the borders of the images. Other characteristics of this network are the possibility of using data augmentation and the separation of touching objects of the same class. Both are interesting tasks in biomedical image segmentation as introduced in Ronneberger, Fischer, and Brox (2015). The energy function used in the final output segmentation map with probabilities (near to '1' for flood and to '0' for non-flood) is a pixel-wise soft-max combined with the cross-entropy loss function (Ronneberger, Fischer, and Brox 2015).
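To make the layer count and the two paths concrete, the sketch below traces the feature-map size, the channel count and the number of convolutional layers through the original U-Net of Ronneberger, Fischer, and Brox (2015), which uses unpadded 3 × 3 convolutions on a 572 × 572 input. The function name `trace_unet` is hypothetical. The configuration trained in this study on 32 × 32 × 12 chips is not specified at this level of detail and presumably differs (for instance in padding), so the example describes the reference design only.

```python
def trace_unet(input_size=572, base_channels=64, depth=4):
    """Trace size, channels and conv-layer count of an unpadded U-Net."""
    size, channels = input_size, base_channels
    conv_layers = 0
    skips = []

    # Contracting path: two unpadded 3x3 convolutions (each trims 2 pixels),
    # then 2x2 max-pooling; channels double for the next level.
    for _ in range(depth):
        size -= 4
        conv_layers += 2
        skips.append(size)
        assert size % 2 == 0, "pooling needs an even feature-map size"
        size //= 2
        channels *= 2

    # Bottom of the "U": two more 3x3 convolutions.
    size -= 4
    conv_layers += 2

    # Expanding path: 2x2 up-convolution halves the channels, the encoder
    # map is centre-cropped and concatenated, then two 3x3 convolutions.
    for skip_size in reversed(skips):
        size *= 2
        channels //= 2
        conv_layers += 1
        assert skip_size >= size, "skip map must be croppable to this size"
        size -= 4
        conv_layers += 2

    conv_layers += 1  # final 1x1 convolution mapping features to classes
    return size, channels, conv_layers


print(trace_unet())
```

With the default arguments, the count is 18 ordinary convolutions plus 4 up-convolutions in the two paths and one final 1 × 1 convolution, which matches the 23 layers mentioned in the text.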

WU-Net model
The WU-Net architecture (Muralikrishnan, Kim, and Chaudhuri 2018) is built upon the U-Net, adding new characteristics to increase its possibilities. Basically, WU-Net is made up of three u-shaped structures of U-Net linked in sequence, which create a w-shaped architecture followed by a u-shaped one. All u-shaped structures are also connected to each other (respecting their order), i.e. the concatenation exists not only in the high-resolution paths (as in U-Net) but also in the low-resolution ones. This added symmetry lets data wind back and forth in two directions. In this way, the convolutional filters of the later layers have a large effective field of view and the localization of segments is improved.
In addition, the weakly supervised model is trained in two phases to obtain more accurate results. The first training phase removes the final segmentation branches and adds a simple classifier layer. This first phase is trained with cross-entropy loss until accuracy is higher than 95%. The first training phase does not output any segmentation map, in contrast with the second phase, which trains the complete original network (with the segmentation branches and without the classification layer) and outputs a final segmentation map. This architecture is able to detect parts from weak shape-level tags and finds large consistent regions in shapes of differentiating parts.

U-Net++ model
U-Net++ was proposed as a new architecture to overcome two limitations of fully convolutional networks, U-Net and its variants for image segmentation. These limitations are that the optimal depth of the network is not known and that the design of the skip connections is restrictive. U-Net++ is an ensemble architecture made up of several U-Nets with different depths. This is an important improvement over the fixed depth of U-Net, which sometimes requires training the same model several times or building an inefficient ensemble of models with different depths.
In addition, the decoders of U-Net++ are designed so that the skip connections are densely joined, allowing flexible feature fusion in the decoders, unlike the restrictive connections of U-Net. U-Net++ also includes a scheme to prune a trained model, in which only one segmentation branch, and hence its output, is selected. The level of pruning is studied in each specific case by evaluating the performance of the model on the test set. The inference speed of U-Net++ is accelerated thanks to this pruning scheme. Besides the above-mentioned improvements, Zhou et al. (2020) report that the segmentation performance of U-Net++ is higher than that of U-Net. It also allows image features to be aggregated across the network both horizontally and vertically.
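The dense skip connections and the pruning scheme can be made concrete by enumerating the node graph. The sketch below assumes the usual indexing of U-Net++ nodes as (i, j), where i is the down-sampling level and j is the position along the skip pathway; it only lists which nodes feed each node and performs no convolution. The function names are illustrative.

```python
def unetpp_node_inputs(depth):
    """Map each node (i, j) to the nodes whose feature maps it receives."""
    inputs = {}
    for j in range(depth + 1):
        for i in range(depth + 1 - j):
            if j == 0:
                # Encoder backbone: fed by the level above (down-sampled).
                inputs[(i, j)] = [(i - 1, 0)] if i > 0 else []
            else:
                # Dense skip: all earlier nodes on the same level,
                # plus the up-sampled node from the level below.
                same_level = [(i, k) for k in range(j)]
                inputs[(i, j)] = same_level + [(i + 1, j - 1)]
    return inputs

def pruned_nodes(depth, keep_level):
    """Nodes still needed when only the output at (0, keep_level) is kept."""
    graph = unetpp_node_inputs(depth)
    return sorted(n for n in graph if n[0] + n[1] <= keep_level)

graph = unetpp_node_inputs(4)
print(graph[(0, 4)])         # inputs of the deepest output node
print(pruned_nodes(4, 2))    # sub-network kept after pruning to level 2
```

Pruning to a lower level keeps fewer nodes, which is why inference becomes faster at the cost of discarding the deeper decoders.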

Prediction performance evaluation
The performance of the models is evaluated differently depending on the phase. During the tuning phase, in which the optimal parameters are found, training and validation are evaluated with the accuracy and the loss value. These metrics are also used to evaluate the test data. However, the prediction over the whole study area cannot be assessed with accuracy and loss, as the real values (flood or non-flood) of this image are not known. In this case, the predicted results are shown and discussed using susceptibility maps.
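As an illustration of the two metrics, the following sketch computes the binary cross-entropy loss and the classification accuracy over a flattened list of pixels. The 0.5 decision threshold and the toy values are assumptions made for the example, not settings reported in this work.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between labels (0/1) and probabilities."""
    total = 0.0
    for t, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Share of pixels whose thresholded prediction matches the label."""
    hits = sum(1 for t, p in zip(y_true, y_prob)
               if (p >= threshold) == bool(t))
    return hits / len(y_true)

# Toy pixels (illustrative values only): 1 = flood, 0 = non-flood.
y_true = [1, 0, 1, 0]
y_prob = [0.9, 0.2, 0.4, 0.1]

loss_value = binary_cross_entropy(y_true, y_prob)
acc_value = accuracy(y_true, y_prob)
print(round(loss_value, 4), acc_value)
```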

Results and discussion
This section presents the results obtained from applying the three selected networks to the flood dataset. In particular, the selection of the parameters of each network and the training, validation and prediction results are discussed, including their susceptibility maps.

Model performance and comparison
The experiments in this research work are run on an NVIDIA TITAN V GPU with 12 GB of memory. A grid search strategy is used to tune the models and obtain the optimal network structure. The following parameters are tuned within these ranges (based on the literature review in Torres et al. 2021): number of initial filters (from 8 to 64), learning rate (0.001, 0.003, 0.01, 0.03, 0.1 and 0.3), batch size (from 10 to 30), number of epochs (from 1 to 10) and dropout rate to relieve overfitting in the training phase (from 0.0 to 0.5). The Adam algorithm is used to optimize the model parameters efficiently.
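A grid search over these ranges can be sketched as follows. The learning rates are those listed above; the individual values for the number of initial filters, the batch size and the dropout rate are assumptions, since only their ranges are given. The function `toy_evaluate` is a placeholder for training a model and returning its validation accuracy, and its scores are not results of this study.

```python
from itertools import product

search_space = {
    "initial_filters": [8, 16, 32, 64],             # assumed powers of two
    "learning_rate": [0.001, 0.003, 0.01, 0.03, 0.1, 0.3],
    "batch_size": [10, 20, 30],                     # assumed step of 10
    "epochs": list(range(1, 11)),
    "dropout": [0.0, 0.1, 0.2, 0.3, 0.4, 0.5],      # assumed step of 0.1
}

def grid(space):
    """Yield every combination of the search space as a dict."""
    keys = list(space)
    for values in product(*(space[k] for k in keys)):
        yield dict(zip(keys, values))

def grid_search(space, evaluate):
    """Return the configuration with the highest score and that score."""
    best_config, best_score = None, float("-inf")
    for config in grid(space):
        score = evaluate(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

def toy_evaluate(config):
    """Placeholder score; a real run would train and validate a model."""
    return -abs(config["learning_rate"] - 0.01) - abs(config["dropout"] - 0.2)

n_configs = sum(1 for _ in grid(search_space))
best_config, best_score = grid_search(search_space, toy_evaluate)
print(n_configs, best_config)
```

With these assumed value lists the grid contains 4320 configurations per model, which illustrates why the number of epochs and the chip size are kept small.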
After the tuning process, the models with the highest accuracy are selected; their parameters are as follows. The U-Net architecture with the most accurate performance on the validation set is a model trained with 16 initial filters, a learning rate of 0.01, a batch size of 30, 10 epochs and a dropout rate of 0.0. The selected WU-Net network has 16 initial filters, a learning rate of 0.01, a batch size of 30, 10 epochs and a dropout rate of 0.2. The selected U-Net++ architecture has 32 initial filters, a learning rate of 0.003, a batch size of 20, 7 epochs and a dropout rate of 0.2.
Table 1 shows the accuracy and loss results for the selected parameters of each model in the tuning phase. Table 2 shows the accuracy and loss results of evaluating the models with the above-mentioned selected parameters when they are trained on the training+validation set, which represents 70% of the data, and evaluated on the test set, which represents the remaining 30%. These results indicate an accurate performance of the models, considering that each model has to predict 32 × 32 × 1 localized pixels.

Flash-flood susceptibility map
The selected models, specified in the previous section, are used to predict the whole study area image. Firstly, as mentioned in Section 4.1, this image needs to be divided into image chips of 32 × 32 × 12, as these are the dimensions of the training and validation images. The models predict the possibility of flood by providing a probability value for each pixel in the interval [0,1], where '1' means flood and '0' means non-flood. These localized probabilities are represented in susceptibility maps. The flood susceptibility maps of the U-Net, WU-Net and U-Net++ models are shown in Figure 3(a-c), respectively.
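The division of the study area image into 32 × 32 × 12 chips can be sketched as below. The raster is represented as nested lists indexed as raster[row][column][band]; non-overlapping tiles and zero padding at the right and bottom edges are assumptions, as the handling of the borders is not specified here. The raster size is a toy value.

```python
def chip_origins(height, width, chip=32):
    """Top-left corners of non-overlapping chips covering the raster."""
    return [(r, c)
            for r in range(0, height, chip)
            for c in range(0, width, chip)]

def extract_chip(raster, row, col, chip=32, bands=12, fill=0.0):
    """Cut one chip x chip x bands block, padding beyond the edge with fill."""
    height, width = len(raster), len(raster[0])
    out = []
    for r in range(row, row + chip):
        line = []
        for c in range(col, col + chip):
            if r < height and c < width:
                line.append(list(raster[r][c]))
            else:
                line.append([fill] * bands)  # assumed zero padding
        out.append(line)
    return out

# Toy raster of 70 x 50 pixels with 12 indicator bands (illustrative only).
raster = [[[float(b) for b in range(12)] for _ in range(50)]
          for _ in range(70)]

origins = chip_origins(70, 50)
chips = [extract_chip(raster, r, c) for r, c in origins]
print(len(chips), len(chips[0]), len(chips[0][0]), len(chips[0][0][0]))
```

Each chip can then be passed to a trained model, and the 32 × 32 × 1 outputs can be placed back at the same origins to assemble the susceptibility map.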
In addition, to easily visualize the flood probability predicted by each model, we split the maps into four ranges (more than 20%, 40%, 60% and 80% of probability) taken from 0 to the maximum value of the probability in each case. The 80% figures represent the locations most susceptible to damage, i.e. locations where the highest predicted values (more than 80%) are reached. The 100% figure would correspond to the location where the maximum predicted value is reached. These susceptibility maps divided by probabilities are presented for U-Net in Figure 4, for WU-Net in Figure 6 and for U-Net++ in Figure 7. In the particular case of the U-Net network, some specific probability values are too high and so are not representative of the rest of the values predicted by the model. For this reason, the representation from 20% to 80% of probability is not very informative for U-Net; therefore, Figure 5 represents ranges from 5% to 20% to reveal similar behaviors. Even though the real flood probabilities of the study area are not known, when comparing the results of the models, it is clear that the physical areas with higher probability of flood are very similar. These areas are the nearest to the inland water areas (see Figure 2(a,c)) and are close to the soil type Fp (see Figure 2(e)).
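One way to read this splitting is as a set of binary masks whose cut-offs are fractions of the maximum predicted value of each map. The sketch below follows that reading, which is an interpretation of the description above; the probability values are toy numbers, not model outputs.

```python
def threshold_masks(prob_map, fractions=(0.2, 0.4, 0.6, 0.8)):
    """Binary masks of pixels exceeding each fraction of the map maximum."""
    peak = max(max(row) for row in prob_map)
    masks = {}
    for f in fractions:
        cutoff = f * peak  # cut-off relative to the maximum predicted value
        masks[f] = [[1 if p > cutoff else 0 for p in row]
                    for row in prob_map]
    return masks

def count_pixels(mask):
    """Number of pixels flagged in a binary mask."""
    return sum(sum(row) for row in mask)

# Toy 3 x 3 probability map (illustrative values only).
prob_map = [[0.05, 0.32, 0.50],
            [0.12, 0.45, 0.22],
            [0.00, 0.35, 0.15]]

masks = threshold_masks(prob_map)
counts = {f: count_pixels(m) for f, m in masks.items()}
print(counts)
```

Passing `fractions=(0.05, 0.10, 0.15, 0.20)` would give the lower ranges used when a few very high values dominate the map, as described for U-Net.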
These flood susceptibility maps are of great interest for evaluating areas prone to flood hazards. The division by probability ranges allows the different areas susceptible to damage to be ranked and identified. Mitigation strategies and management plans in the Phu Tho province can be drawn up based on these susceptibility maps. Locations with similar characteristics could benefit from the already trained u-shaped models to create their own susceptibility maps. Therefore, automatic flood map prediction could be implemented.

Concluding remarks
In this study, flood susceptibility prediction has been carried out for a particular province in northwest Vietnam. The province is Phu Tho, which has a very complex terrain in terms of mountains, hills, valleys, plains, variety of elevation and rapid transformation of land cover. In addition, in Phu Tho heavy rainfall in a short time usually occurs in high-slope areas. Predicting which areas of this province have a high possibility of flash-flood can support the management of flood risks.
The experimentation has been performed with 1412 images correctly labeled with '1' for flood and '0' for non-flood. Each image has been associated with 12 flood indicators. These indicators are factors influencing flash-flood events according to the literature. They match the factors of previous flash-floods and the hydrometeorological and geological conditions of Phu Tho. Three prediction models have been used: U-Net, WU-Net and U-Net++. These models are deep learning algorithms based on semantic segmentation. To our knowledge, they have never been used before to predict areas sensitive to pluvial flood, but they have demonstrated accurate performance in related study domains. The optimal parameters of each model have been found with a tuning process. Afterwards, the models have been trained, providing good loss and accuracy metrics on the test set. Finally, the prediction over the study area has been performed and each localized probability of flood has been represented in a susceptibility map. The three susceptibility maps from the U-Net, WU-Net and U-Net++ models have shown very similar areas as having the highest probability of flood.
Compared with the Phu Tho flood indicator maps, these areas correspond to inland water zones and are close to the Fp soil type.
Future work will focus on two aspects of current relevance. Firstly, explainable artificial intelligence (XAI) techniques will be used to understand more clearly the factors that most influence the different areas of the predicted susceptibility maps. Secondly, susceptibility maps will be created with multiple classes, going beyond the binary classification (flood or non-flood).

Figure 1. Location of the Phu Tho province and flooded locations.
(b). It varies widely from −0.365 to 0.895, indicating a large difference in vegetation cover. High NDVI values are located in the western region, which has high slopes, while low NDVI values are in the transition regions between mountains and plains along the Thao river.

Figure 2

Table 1. Accuracy and loss results for the selected parameters for each model.

Table 2. Accuracy and loss results obtained for each model in the evaluation phase.