Estimation of flood land use/land cover mapping by regional modelling of flood hazard at sub-basin level case study: Marand basin

Abstract Flood is one of the most dangerous environmental hazards that threaten the human lives and properties on a large scale. Identification of flood risk areas with the aim of optimal management of this area is very necessary. In this research, the estimation of Land use/Land cover (LULC) in the Marand basin was modelled on the Sub-basin level using remote sensing and geographic information systems (GIS). The runoff coefficient was determined using the LULC extracted from satellite images, slope map and soil hydrologic groups, and rainfall intensity. Then, peak runoff for each sub-basin was calculated. In the following, by using linear membership function in the fuzzy logic model, the integration of two layers of peak runoffs and the elevation line layers between 0 and 1 were transformed into fuzzy values. Afterward, by applying multiple overlaps of weights to each of these two layers and their results, classes of flood hazard distribution map were produced. By comparing the hazard map with the results of participatory geographic information system (PGIS) and entering this information into the confusing matrix, the collision accuracy of the map was 87.83%. Finally, by comparing this map with the land cover/use map, their flood extent was determined separately.


Introduction
Flood is one of the main global hazards causing massive damages to industries such as agriculture, fisheries, housing, and infrastructures that heavily affects social and economic activities (Chang et al. 2008). It occurs after heavy rainfall due to rapid accumulation and the runoff from upstream to downstream. In studies of global hazards, flood is one of the main hazards that has affected the lives of people around the globe with many deaths (Borga et al. 2011). Flood accounts for about 40% of the deaths from natural hazards and only in the last decade, it has killed about 100,000 people and affected life of 1.4 billion people (Ezemonye and Emeribe 2011). Managing and reducing the flood hazard, play an essential role in maintaining the physical and social environment of different societies and their economic development (Huang et al. 2008). Also, managing this risk is crucial for reducing damage and adapting the effects of climate change to land use change. The purpose of risk here is the same probability of causing harm or expected damage resulting from interactions that occur in human or natural hazards and vulnerable situations (ISDR 2004). Therefore, the assessment and reduction of flood risks should not be ignored. The prediction of sudden flooding is complex and difficult and this risk is often associated with secondary incidents such as mass movements and soil erosion (Grimaldi et al. 2013). The Marand basin is in the East Azarbayjan province, which has been the source of many floods since the last years and has caused a lot of damages to different parts of the plantation, gardens, livestock, residential areas, and infrastructures including roads, bridges, and so on. According to a survey carried out at the Office of Protected and Engineering of Rivers and Beaches in the Regional Water Organization of East Azarbayjan Province (RWOEA), it has been determined that 51 flood events occurred in different areas of the Marand basin from 2003 to 2016. The recorded figures and results from the recent floods revealed that most of the floodaffected areas are mountainous and lacking hydrological statistics which explains an assessment of this area that is vulnerable to flood risk should be made in terms of flood hazard and Land use/Land cover (LULC).
From the past, remote sensing technology and geographic information systems (GIS) have been applied to assess and model flood risk at a widespread level globally (Van Westen et al. 2003;Althuwaynee et al. 2014). In this study, in order to determine the flood regions in Marand basin in the sub-basin scale, modelling was carried out in GIS environment. To have an accurate and up-to-date land use map, satellite images with 10 m spatial resolution were used and classified. By this map, along with the slope layer and the map of the hydrological group of the soil, the runoff coefficient of the basin was calculated. Also, using the rational model, the peak runoff in each sub-basin was calculated. After fuzzing the DEM layers and peak runoff, flood hazard index (FHI) was calculated by the overlapping factor. Finally, the area of the flood LULC was determined.
To get acquainted with the latest flood modelling and evaluation methods, many studies have been made on library and internet resources. Since, the Marand is one of the basins lacking hydrologic statistics. For this purpose, we should seek methods that use empirical methods to obtain citation estimates for flood hazard modelling. A sample of the worldwide studies has been presented as follows: (Asumadu-Sarkodie et al. 2015), in a research aimed at managing flood risk in Ghana using a modified rational model and rainfall data, it was finally determined that with increasing land, runoff coefficient, and peak runoff rates would increase. Franci et al. (2016) conducted a study in the Nicosia basin of Cyprus aimed at providing a flood risk map using simultaneously high-resolution spatial resolution and also using multivariate analysis on the five flood factors, and after calculation, the FHI of the study area was classified into seven classes. Vojtek and Vojtekov a (2016) published their study on flood hazard and flood risk assessment at the local spatial scale. In another research, Armenakis et al. (2017) investigated flood risk assessment in urban areas based on spatial analytics and social factors.

Study area and data
The Marand basin in the northwest of Iran with a geographical location 10˚45 to 10˚46˚east and 15˚38 to 45˚38˚north is located in the East Azarbayjan province. The area of this basin is about 2030 square kilometres. According to Iran's Meteorological Organization, the average long-term rainfall in the region is about 395 mm and the average annual temperature in this region is 12.1 C. Its climate is Mediterranean with climatic features of semi-cold mountainous regions. The maximum and minimum altitudes of the area are 3267 and 972 metres respectively, and the average elevation is 1561 metres above free water level (Feizizadeh and Ghorbanzadeh 2017). The Marand Basin has four metropolitan areas and 183 villages with 245,000 inhabitants. The main economy of this region is based on agriculture and animal husbandry (East Azarbaijan Governorate, EAG 2017). Figure 1 shows the location of this area.

LULC map
Since, the use of accurate LULC maps can improve the results of flood risk management (Bhatt et al. 2014). In this study, satellite images of Sentinel 2 A with a spatial resolution of 10, 20, and 60 m for different bands were used to generate the LULC The image posted on https://scihub.copernicus.eu in August. The classification of satellite images based on spectral information has some limitations, therefore, other information resources should be used to increase the accuracy of classification (Chen et al. 2009). In this research, the object-oriented classification method was used. Before performing its various stages, the image was visualized using the semi-automatic classification plugin (SCP) of the QGIS-corrected atmospheric image to eliminate the negative atmospheric effects in the image.
In addition to spectral information, texture and shape information is also used to enhance the classification accuracy in object-oriented image processing (OBIA) (Blaschke 2014). Because in this image processing method, based on the spectral, physical and geometric parameters of ground-based phenomena recorded on the image, segmented and image processing units from pixels to image phenomena or segments are changed, and as a result, by comprehensive processing of this information, objects and phenomena of the real world can be extracted more accurately. The process and the result of this type of image processing can be divided into three sections: segmentation, fuzzy classification, and validation (Aksoy and Ercanoglu 2012). For the segmentation in the eCognition software for the parameters of Scale Parameter, Shape, and Compactness, values of 50, 0.1, and 0.5 were considered respectively and AND and OR functions were used for fuzzy classification. The study area was divided into seven general classes in terms of LULC including garden, plantation, built, rich rangeland, poor rangeland, salt-marsh, and water surface. The Kappa coefficient and the overall accuracy of the map in the classification were 0.91 and 93.5, respectively, which is an acceptable accuracy. Figure 2 shows the map of the LULC classification of the study area.

Soil texture map
Soil properties affect the evolution of runoff and should be considered in its calculations. Soil properties can be expressed in terms of a hydrologic factor, which is the minimum permeability velocity in the long-term wet state of the crude. All soil textures are divided into four groups based on the Natural Resources Conservation Service (NRCS) indicators, each group can be divided into subgroups, if necessary, more accurately. The penetration intensity in soil hydrological groups and other characteristics of each group are presented (Council 1999). Thus in this study, the soil map was prepared using aggregation and analysis of soil texture reports and maps from RWOEA, natural resources organization (NRO), and agriculture research centre of East Azarbayjan (ARCEA), which is shown in Figure 3.

Digital elevation model
In flood risk assessment studies, using a digital elevation model of that area is very important in demonstrating the floodgates' zones (Asare-Kyei et al. 2015). In this study, the digital elevation model (DEM) of the Aster TERRA satellite with a spatial resolution of 30 metres was used. According to Figure 4, the maps of the elevation classes in the study area are presented in five classes.

Slope map
To provide a slope map in this research, the 30-metre digital elevation model Aster was used. Totally, the area is divided into low slopes in plain areas and harsh areas in mountainous areas. According to Figure 5, the map of the area is classified in three classes and is represented by percentage.

Rainfall intensity
In the northwest region of the country, up to 2400 metres high rainfall increases while after this height, the severity of these rainfall decreases, so that rainfall of more than 12 mm is rarely seen at those heights. (Motaleb Faed 2007). In the northwest region of Iran, there is a high correlation between the values of DEM and ground stations, and the physiographic factors of Sahand, Sabalan, and Misho heights affect the extreme rainfall (Valizadeh Kamran and Nasiri Qalehini 2013). In this research, in order to determine the intensity of rainfall for each sub-basin, the time of concentration of each subbasin was determined based on kirpich method according to the Equation (1).
(1) In this equation; tc = Concentration time of the basin in terms of hours L = Length of the main waterway in km H = Difference in height between the highest and lowest points of the basin (Wijesinghe and Wijesekera 2011). The length of the main waterway is also determined by the Equation (2). In this equation; L = Length of the main waterway in km A = Area in km 2 (Tribhuvan and Sonar 2016). Based on the calculated values, it was found that the time for concentrating subbasins was about 1 hour. In the following, the data of recorded logistic data with a period of 11 years including 2004 to 2015 at regional stations including the cities of Jolfa, Marand, Varzaghan, and Shabestar the same period rainfall events extraction were recorded from the intensity curves of the recorded rainfall.
Due to the inappropriate geographic distribution of precipitation stations, especially in mountainous regions, algebraic interpolation method was used based on DEM values according to the results of researches on northwest rangelands in Iran. According to Table 1, the maximum rainfall intensity of 1 hour, along with the To determine the relationship between rainfall and DEM, linear regression was used in Excel software and the relationship (Garcia 2016) was determined according to Equation (3).
The result of this equation shows that in this area, with an increase in height, also rainfall increases.

Methods
The studies have shown that there are a total of four main techniques for producing a flood hazard map. These techniques include hydrological frequency analysis, hydraulic modelling, hydrological models, and a statistical method (El morjani 2011). In this research, the combination of hydrological model and statistical method using the remote sensing and GIS, was used to generate a flood hazard map at the study area.

Integration of the hydrological and statistical model in GIS
To prepare the mapping of flood hazard areas with high spatial resolution in the subbasin, it is necessary to combine the results of the two methods. Firstly, using the modified version of the hydrological model (Viessman and Lewis 1996), the amount of runoff is determined based on the intensity of rainfall, land use, and runoff coefficient of the basin. Then, the statistical method in the GIS environment is combined with the output of the hydrologic model, which is integrated with topography factors (DEM) to determine the severity of flood hazard in the study area. Finally, by defining the class for classification on the map of the risk of flood intensity, the FHI for flood risk areas is determined (Asare-Kyei et al. 2015).

Determination the peak runoff using rational model
This method is usually used in small basins and it is assumed that the rain is constant at all levels of the basin. The simple application of this method has led to widespread use in related studies, especially for urban hydrology and basin management. In this study, this model was used to calculate maximum runoff discharge, which is presented in accordance with the Equation (4).
In this equation, Qp is the rate of peak runoff according to cubic meter per second, C, is the runoff coefficient derived from an empirically pre-designed table and each of them are prepared based on the slope of the area, the type of land use, as well as the hydrological group Soil of the region, I, is the rain intensity in millimetres per hour, and A, is the area of drainage area in square kilometres (Needhidasan and Nallanathel 2013).

Determination of surface runoff coefficient
Surface runoff coefficient is the ratio of rainfall height that flows on the ground surface and depends on factors such as intensity of soil permeability, the density of LULC, rainfall intensity, and slope. According to Table 2, the runoff coefficient of different regions determined by overlapping each of these factors (County 2008).

Flood statistical modelling
A peak runoff map is combined with the DEM layer to prepare a flood hazard plan. However, it is necessary to be standardized before combining two informative layers of peak and height runoff into their units such as cubic metres per kilometre for peak runoff and metres for elevation. Using the theory of fuzzy theories, before the combination of layers, standardization is performed on comparable scales (Malczewski 1999). The standardization of these layers was done using the fuzzy membership function in Arc/GIS software. Regarding the direct and linear relationship between peak runoff and flood probability, using linear membership function, the peak runoff layer was re-classified between upper and lower thresholds. This means that the lowest amount of runoff is the zero-value peak, meaning no membership or low-flood probability, while the highest amount of peak runoff is the one-value meaning full membership or the highest probability of flood is given. Other peak runoff values are between zero and one degree. This is a reversal to the DEM layer because low-altitude areas are likely to be higher, but high altitude areas are less likely to occur in flood events. Therefore, low and high altitude areas determined with one and zero degrees of membership, respectively (Asare-Kyei et al. 2015).

Distribution mapping of flood hazard preparation
To map flood hazard severity at different heights, standardized layers of height and peak runoff are combined with linear weight composition method (Malczewski 2000). This was done using the FHI equation in the Arc/GIS and by specifying the weight of each layer. The total weight of all layers should be equal to 1. In this study, each layer of DEM and runoff peaks rated 0.5. Equation 5 is the flood risk index equation: In the above equation, each parameter redescribed as follow: FHI, Xi, i-th standardized layer, Wi, the weight of each i-th layer and n is the number of combined standard layers. Since the weight of the standard layers is between zero and one, the value of the output pixels of the flood risk map is also within this range. Then, to identify areas with high and low risk of flooding, classification occurs in Arc/GIS environment. In this study, statistical methods were used in two stages. In the first step, standard statistical methods were used to obtain an FHI. In the second stage, statistical methods were combined using the principles of PGIS to assess the outcome of  -Kyei et al. 2015). In the first step, using the rational model, the value of each of the factors affecting the peak flood flow was calculated. These effective factors include LULC, soil texture, slope map, DEM, and drainage area (El morjani 2011). Then, using statistical methods, runoff rates at different DEMs were determined before using standard methods for determining the flood hazard areas (Asare-Kyei et al. 2015). Figure 6 shows the flowchart of the research.

Results and discussion
To determine the sub-basins, the Aster DEM and the Arc Hyrdo tools were used in 58 sub-basins. Figure 7 shows the distribution map of sub-basins.

Runoff coefficient calculation
Calculation of the runoff coefficient was done in two main stages, including the preparation of layers and then applying the multiplication overlap of these layers using Figure 7. Map of the case study sub-basins. Source: Author GIS modelling. In order to accurately overlap the layers, data layer of LULC and soil texture map became rasterized, and then, along with a slope map, the pixel size of all of them adjusted at 10 Â 10. In the following, using the Boolean logic (0 and 1), each of the classes in this map is also separated from each other. Using this logic, the class is worth one and the remaining unclassified classes are marked with zero value. This means that in this research, seven Boolean maps for LULC map, four Boolean maps for soil texture map and three Bolin maps were also provided for the slope map.
To overlap the different layers to calculate the runoff coefficient of the basin, all produced maps derived from the application of Boolean logic entered the GIS. Considering that according to Table 2 it is determined that each pixel with three characteristics of the slope, the type of application and the specified hydrological group, what is the runoff coefficient, and also many states are defined for determining these coefficients, the GIS modelling was used to perform this calculation. After multiplying the defined states in their specified coefficient, for all states in the table, at each pixel, the runoff coefficient was calculated and the output was taken. Finally, Figure 8. The area runoff coefficient map. Source: Author by applying the aggregation of all produced outputs, the runoff coefficient of the whole basin was prepared as shown in Figure 8.
To determine the runoff coefficient of sub-basins separately and in terms of average in GIS, the average runoff coefficient for each sub-basin was calculated and the values of each sub-basin are shown in Figure 9.
To determine the peak runoff from rainfall intensity of 1 hour on each sub-basin, using the rational model, runoff coefficient layers, rainfall intensity, and area of each sub-basin were multiplied. The runoff rate for each sub-basin is given in Table 3.
Considering the correlation between runoff and DEM, the integration has been done using linear membership function of fuzzy logic model and peak runoff values as well as DEM distributed with phasing out between zero and one.
In the case of DEM layer, the lower the height, the more flooding occurs and the high DEMs are not flooded; therefore, in the fuzzy membership function, the low and high DEM regions are set to 1 and zero, respectively. Figure 10 shows the fuzzy map of the DEM layer between zero and 1.   Figure 10. The fuzzy map of DEM. Source: Author In the case of peak runoff, this is upside down, the higher peak runoff, the greater the flooding, and vice versa, the lower the peak runoff, the lower this rate, therefore, in the fusion membership function, the higher peak runoff, tends to number 1 and the less runoff tends to zero. Figure 11 shows the fuzzy map of peak runoff from 1 hour rainfall intensity between zero and 1.
To generate a flood hazard intensity map at different DEMs, height standardized and peak runoff layers were combined using linear weight composition method. This procedure was completed using the FHI equation in the GIS with specification the weight of each layer. Since, the total weight of all layers should be equal to one and there is a direct relationship between runoff and DEM, in this study, the weight of 0.5 was given to each one of DEM and peak runoff layers in the maximum rainfall intensity of 1 hour. Finally, the runoff layers and the fuzzy DEM of the area were multiplied by the coefficient and gathered together. After classifying the prepared map in five categories, the area with sub-basin levels divided into very high-risk, high-risk, moderate, low-risk, and very lowrisk classes. Figure 12 shows a flood risk map with 1 hour rainfall intensity.

Verification of produced map
In this study, the PGIS was used to validate the final maps of Marand's flood risk to determine the accuracy of the modelling performed. In this type of verification, local experts and authorities, using the science, experience, and historical memory of the floods occurred in different regions, as well as the natural characteristics of each subbasin, determine the risk class of each region (Craig et al. 2002). In the present study, with the cooperation of experts with a high level of work in Marand County and native area water administrations who have proper knowledge of the area, flood, and hazardous levels in the area, a field survey was conducted to determine the accuracy of the produced maps. In the field survey, the risk class was marked with a GPS device according to Figure 13.
Comparison of the recorded points using PGIS and the risk maps obtained using the confusion matrix is presented, and this matrix is illustrated in Table 4. According to the validation results of each classifications and the average of each five classes, it was found that the accuracy of the maps was 87.83%.
The purpose of this study was to assess the regional risk of flooding in the Marand basin in the Sub-basin level. In this study, using Sentinel 2A satellite imagery with a spatial resolution of 10 metres, the LULC types of the study area were carefully prepared. Since it is obvious that in order to obtain the higher accuracy of basin runoff factor, the higher precision of the image, as well as the prepared map of the soil  hydrologic groups and the map of the used slope, the closer results are to the reality. By isolating each of the abovementioned classified maps using Boolean logic, all layers entered GIS modelling and using the runoff coefficient given in the table, the runoff coefficient of each sub-basin was determined and finally, peak runoff for each sub-basin was calculated with 1 hour rainfall intensity at sub-basin level. Then, using the linear membership function in the fuzzy logic model, the DEM and peak runoff amounts were fuzzy. Also, by applying the multiplication of assigned weights for each of these layers according to the FHI, the final flood risk map of the study area was determined in five risk classes including very low-risk, low-risk, moderate, high-risk, and very high-risk. By comparing the results of the designated classes and the values recorded in field observations using the GPS device, the provided confusion matrix showed the accuracy of produced maps as 87.83%. In order to determine the flooding of each LULC, the flood hazard map was compared with the LULC map and according to Table 5, the risk level of each cover/use class was separately determined.
In the investigation and comparison of risk categories in the occurrence of Marand basin flood, with considering the high-risk and very high-risk classes, which subsequently cause the most damages to different types of LULC, it was found that the class of rich rangeland with consisting of 186.01 km 2 , it is exposed to the most vulnerable conditions. The area of LULC categories affected by flood risk is presented in Table 6.
The area of all classified materials is significant in flood hazard, so that totally 25.3% of the entire LULC area is in high and very high risk categories. Constructed lands including urban, rural, industrial, and industrial-productive centres, with a significant area of about 17.3 square kilometres, is at risk of flooding and vulnerability. Garden and plantation with a surface area of 123.8 square kilometres are at the risk of flooding as one of the most important sources of indigenous income. Every year they are heavily affected by the flood damages, especially their local economy. Also, poor and rich rangeland, which on the one hand, play an important role in terms of penetration of atmospheric depression and soil erosion control in the region, and on the other hand, it is a suitable rangeland for the livestock of the region native people, are at the risk of flooding and vulnerability. The salt-mersh areas are subject to flooding and leaching. In addition to the salinization of runoff and soil, gradual entry into underground water prevents their proper use. The results obtained in this research can be used to prioritize sub-basins with the aim of proper management of flood hazard underwater basins. Recognizing the high risk areas will increase the productivity of the costs and reduce the mortal and financial losses.

Conclusions
In this study, using the modelling of empirical estimation of peak runoff, the regional flood risk of each LULC in Marand basin has been evaluated. The study area was divided into five risk classes. By comparing maps of the flood hazard areas and LULC and their adaptation to the results of PGIS, high and high risk areas were identified. The use of empirical modelling and provision of its precise required data can greatly increase the accuracy of the region flooding. Utilization of native and regional managers' views can be highly effective to confirm the results of such research. It also seems that the efficiency of the current model and other models in this field, in areas without hydrologic statistics and with different climates, can correctly aim to determine the flood zones.

Disclosure statement
No potential conflict of interest was reported by the authors.