Synthetic aperture radar and optical remote sensing image fusion for flood monitoring in the Vietnam lower Mekong basin: a prototype application for the Vietnam Open Data Cube

ABSTRACT Flood monitoring systems are crucial for flood management and consequence mitigation in flood-prone regions. Different remote sensing techniques are increasingly used for this purpose. However, the different approaches suffer various limitations, including cloud and weather effects (optical data), and low spatial resolution and poor colour presentation (synthetic aperture radar data). This study fuses two data types (Landsat and Sentinel-1) to overcome these limitations and produce better quality images for a prototype flood application in the Vietnam Open Data Cube (VODC). Visual and quantitative evaluation of fused image quality revealed improvement in the images compared with the original scenes. Ground-truth data were used to develop the study flood extraction algorithm, and we found good agreement between our results and SERVIR Mekong (a joint initiative by the US Agency for International Development (USAID), National Aeronautics and Space Administration (NASA), Myanmar, Thailand, Cambodia, Laos and Vietnam) maps. While the algorithm is currently run on a personal computer (PC), it has clear potential to be developed for application on a big data system.


Introduction
Flood monitoring has been crucial for management and mitigation of impacts in flood-prone regions because it provides timely updates and sufficient information (Merkuryeva et al., 2015). Flood water mapping, which is critical to flood monitoring, involves analysis of the propagation of flood water based on remote sensing (RS) acquisitions of pre- and post-event image pairs (Rahman & Thakur, 2018). Synthetic aperture radar (SAR) images have been widely applied to flood studies because smooth-surfaced flood water has a dark appearance that can be clearly distinguished from other objects (Gan, Zunic, Kuo, & Strobl, 2012; Horritt, 2003), regardless of weather-related obstacles such as clouds (Javelle et al., 2002; Schlaffer, Matgen, Hollaus, & Wagner, 2015). Although several spaceborne SAR instruments provide fine spatial resolution and multi-polarization capabilities (Schlaffer, Chini, Giustarini, & Matgen, 2017), the coarse spatial resolution of freely available SAR datasets can limit the ability of users to classify floodplain segments into "flooded" and "non flooded" areas. In contrast, optical RS data provides a wide range of spectral bands at finer spatial resolution, but it is often affected by cloud and low light levels. For these reasons, and to overcome these limitations, SAR and optical RS imagery have been fused in some flood studies (Dey, Jia, & Fraser, 2008; Kyriou & Nikolakopoulos, 2017).
Data fusion involves the "combination of two or more different images to form a new image by using a certain algorithm" (Pohl & Van Genderen, 1998). Fusing different RS scenes can enhance cartographic object extraction, and improve spatial resolution (Ehlers, 1991; Mangolini, 1994). Hence, there is an increasing interest in data fusion of multisource RS acquisitions (Amarsaikhan & Douglas, 2004; Byun, Han, & Chae, 2015). Over time, the availability of earth observation data has improved and it now covers different portions of the electromagnetic spectrum at different spatial, temporal and spectral resolutions (Pohl & Van Genderen, 1998). This provides users with multiple data choices, but also creates additional challenges related to preserving the original spectral characteristics of the input image data (Ehlers, 2004) in the resulting fused images.
In recent years both radar and optical satellite constellations have been enhanced, and have become widely available via open access. For example, since October 2014 the Sentinel-1 (S1) mission of the European Space Agency (ESA) has provided a constellation of two SAR satellites that monitor the entire Earth surface every 6 days (Cian, Marconcini, & Ceccato, 2018), whilst Landsat provides a large archive of freely available optical images ranging in date from the 1970s (Landsat-1) to the current time (Landsat-8, with a revisit time of 16 days) via the U.S. Geological Survey (USGS) Earth Explorer website (Muriithi, 2016). Resources such as these have provided myriad potential opportunities for data users, including the Vietnam National Space Center (VNSC). They have enabled the VNSC to establish the Vietnam Open Data Cube (VODC) (http://datacube.vn), one of the aims of which is rapid flood mapping. The VNSC has already developed a water detection tool using the Landsat archive. The objective of this study is to develop a VODC application by fusing Landsat and Sentinel-1 images to overcome the limitations associated with the individual image types and enable more rapid mapping of water inundation.
To achieve our study objective, we first aim to fuse optical (Landsat 8) and SAR (Sentinel-1) images to form higher quality, finer resolution (from 30 m in Landsat 8 to 10 m in Sentinel-1) fused colour visualizations (original SAR images are greyscale), and to reduce the effects of clouds and cloud shadows in optical images. We also evaluate the accuracy of the IHS, Brovey, PCA and GS methods and the histogram analyses used to produce the fused images. Our second aim is to adapt the method of water extraction developed by Mueller et al. (2016) for rapid flood mapping as a prototype application for the VODC, because the method was validated using a large amount of ground truth data and is applicable to an ODC application.

Hong Ngu
Hong Ngu Town covers an area of 121.9 km² and is home to approximately 80,000 people. It is located near the Vietnam-Cambodia border (Figure 1), where there are two distinct hydrological seasons: the dry season from January to June, and the flood season from July to December. However, most flood water comes from the Mekong River basin rather than from local rains. The location of Hong Ngu means it is the first town in Vietnam to be affected by flood water flowing downstream from the Mekong River. Hence, flood data collected here is valuable for other Vietnam Lower Mekong regions in terms of flood monitoring and management.

Remote sensing data
Guided by flow rate information in lower Mekong regions during both the dry and flood seasons (Fan, He, & Wang, 2015), we collected Landsat 8 and Sentinel-1 images (Table 1) captured in March (dry season) when surface water can be considered as permanent water, and October (flood season) when surface water can be considered temporary. The Landsat 8 images originally contained 11 bands at different wavelengths. However, only three bands were needed to present flood water clearly and provide satisfactory spatial resolution in the resulting images: 3 (green), 4 (red) and 8 (panchromatic). Sentinel-1 datasets processed at Level-1 Ground Range Detected (GRD) do not include the phase information, but are projected to ground range using an Earth ellipsoid model, after speckle is filtered out. Of the four available bands in Sentinel-1 (Amplitude_VV, Amplitude_VH, Intensity_VV and Intensity_VH), we used the Amplitude_VV band after conducting a quick histogram analysis for water discrimination.

Ground truthing data
A field survey focusing on flooded and dry areas was conducted using GPS and mapping techniques to collect ground truth data. This was undertaken close to the date of flood season remote sensing data acquisition (starting on 4 October 2018). Three dominant area types were surveyed: two- and three-season rice fields, and urban areas (Figure 2).

Methodology
The proposed approach for monitoring flood water in the VODC is illustrated in Figure 3. The image fusion, accuracy assessment and flood water extraction algorithm are explained in detail in the following sub-sections. First, the program searches the VODC database for image pairs that meet three conditions: the acquisition dates of the pair must not be more than 2 weeks apart (based on flood status in the lower Mekong delta); the tile overlap must be greater than one third of the smaller image; and the cloud percentage of the overlap area must be less than 30%. Secondly, the images are pre-processed in the following steps: (1) metadata description; (2) radiometric calibration; (3) geometric calibration; and (4a) solar and atmospheric calibration (for Landsat-8) or (4b) speckle filtering (for Sentinel-1). Analysis ready data (ARD) are not commonly provided by data producers. The Committee on Earth Observation Satellites (CEOS) defines ARD as "satellite data that have been processed to a minimum set of requirements and organized into a form that allows immediate analysis without additional user effort" (Killough, 2016), and ARD should satisfy the four requirements in the image pre-processing procedure (Giuliani et al., 2017). Lastly, a user interface tool allows users to interact with the system to send their queries and download products.
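The three pair-selection conditions described above can be sketched as a simple filter. This is only an illustration under stated assumptions: the `PairCandidate` record, its field names and the `is_valid_pair` function are hypothetical and are not part of the VODC code base; only the thresholds (2 weeks, one third, 30%) come from the text.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class PairCandidate:
    """Hypothetical summary of one optical/SAR image pair."""
    optical_date: date                # Landsat acquisition date
    sar_date: date                    # Sentinel-1 acquisition date
    overlap_area_km2: float           # area shared by both tiles
    smaller_image_area_km2: float     # area of the smaller of the two tiles
    cloud_fraction_in_overlap: float  # 0.0 to 1.0, from the optical scene


def is_valid_pair(c, max_gap_days=14, min_overlap_ratio=1 / 3, max_cloud=0.30):
    """Return True if the pair meets the three conditions stated in the text."""
    gap_days = abs((c.optical_date - c.sar_date).days)
    overlap_ratio = c.overlap_area_km2 / c.smaller_image_area_km2
    return (
        gap_days <= max_gap_days             # not more than 2 weeks apart
        and overlap_ratio > min_overlap_ratio  # overlap > 1/3 of smaller image
        and c.cloud_fraction_in_overlap < max_cloud  # cloud cover < 30%
    )
```

Keeping the thresholds as keyword arguments makes it easy to relax them for regions where the flood regime differs from that of the lower Mekong delta.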

Image fusion
In the mid-1980s image fusion attracted significant attention from those researching image processing of remote sensing data. This technique is used to "integrate the geometric detail of a high-resolution panchromatic (Pan) image and the colour information of a low-resolution multispectral (MS) image to produce a high-resolution MS image" (Zhang, 2004) and hundreds of image fusion techniques have been developed. In this study, the Modified IHS, Brovey transformation, PCA, and Gram-Schmidt spectral sharpening methods are used to undertake image fusion, and the outputs are compared. We used bands 3, 5 and 8 of the Landsat-8 images and the Amplitude VV band of the Sentinel-1 datasets for all four fusion methods, which are included and documented in detail in the ENVI 5.3.0 package. These bands distinguish water surfaces most clearly. At wavelengths of 0.533-0.590 µm (band 3, green) the water surface reflects most strongly in the visible range. In contrast, open water absorbs light completely at wavelengths between 0.851 and 0.879 µm (band 5, near-infrared (NIR)), so it appears dark. Band 8 (PAN), with a higher resolution of 15 m, was chosen to preserve more spatial information. Finally, it is generally accepted that no significant differences occur among the different SAR polarizations when using Sentinel-1 for water detection, so the Amplitude VV band was used as an alternative to the Amplitude band.

Modified IHS approach
Amarsaikhan and Douglas (2004) stated that the Modified IHS approach is the most widely used of the data fusion techniques. A detailed description of the method can be found in Mather and Koch (2011). Briefly, the IHS method minimizes the limitations of the Red Green Blue (RGB) colour system, allowing more natural colours to be displayed. For this task, we transformed the original RGB images to the IHS colour system and replaced the intensity component with a SAR band before transforming back to the RGB colour system.
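The intensity substitution idea can be illustrated with a short NumPy sketch. It uses the additive "fast" form of a linear IHS transform, in which replacing the intensity is equivalent to adding the difference between the substituted band and the original intensity to every colour band. This is a swapped-in simplification, not the Modified IHS routine of ENVI; the function name `ihs_fuse` and the mean/standard deviation matching step are assumptions made for illustration, and both inputs are assumed to be co-registered on the same grid.

```python
import numpy as np


def ihs_fuse(rgb, sar):
    """Linear IHS intensity substitution (illustrative sketch).

    rgb: float array of shape (3, H, W), three optical bands
    sar: float array of shape (H, W), SAR band on the same grid
    """
    rgb = rgb.astype(np.float64)
    sar = sar.astype(np.float64)

    # Intensity of the linear IHS model is the mean of the three bands.
    intensity = rgb.mean(axis=0)

    # Match the SAR band to the intensity mean and spread so that the
    # substitution does not shift overall brightness (assumed step).
    sar_std = sar.std()
    if sar_std == 0:
        sar_matched = np.full_like(sar, intensity.mean())
    else:
        sar_matched = (sar - sar.mean()) / sar_std * intensity.std() + intensity.mean()

    # Substituting intensity and inverting the linear transform is the
    # same as adding (new intensity - old intensity) to each band.
    return rgb + (sar_matched - intensity)
```

Because the same offset is added to every band, differences between bands (which carry the hue and saturation information) are unchanged, while the spatial detail comes from the SAR band.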

Brovey transformation
The Brovey transformation (BT) method is a common remote sensing image fusion process with flexibility to transform all optical and SAR bands (Vijayaraj et al., 2004) as described in the following Equation (1):

$$XS\_SAR_i = \frac{XS_i}{XS_1 + XS_2 + \dots + XS_n} \times SAR_i \qquad (1)$$

where $XS_1, XS_2, \dots, XS_n$ and $SAR_1, SAR_2, \dots, SAR_n$ are bands of the optical and SAR images respectively, and $XS\_SAR_1, XS\_SAR_2, \dots, XS\_SAR_n$ are the fused bands. The main advantage of BT is that it preserves the spectral information of the optical images and the resolution of the SAR images, with the result that the fused images are sharpened.
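As a minimal sketch, the Brovey transformation in its commonly cited ratio form divides each optical band by the sum of the optical bands and multiplies the result by the SAR band. The function name `brovey_fuse`, the use of a single SAR band for all optical bands, and the small `eps` guard against division by zero are assumptions for illustration, not details taken from the ENVI implementation.

```python
import numpy as np


def brovey_fuse(optical, sar, eps=1e-12):
    """Brovey ratio fusion (illustrative sketch).

    optical: float array of shape (n_bands, H, W)
    sar:     float array of shape (H, W) on the same grid
    eps:     assumed guard value that avoids division by zero
    """
    optical = optical.astype(np.float64)
    sar = sar.astype(np.float64)

    # Per-pixel sum of all optical bands (the denominator of the ratio).
    total = optical.sum(axis=0)

    # Each fused band keeps its share of the optical signal, scaled by SAR.
    return optical / (total + eps) * sar
```

A consequence of this form is that the fused bands sum to the SAR value at every pixel, which is why the transformation sharpens the image but can alter its radiometry.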

Principal Component Analysis
Principal Component Analysis (PCA) is a statistical technique that identifies key variability among variables within a dataset, to reduce it to fewer dimensions or "components" of related variables that are uncorrelated with each other (Pohl & Van Genderen, 1998). In this study, PCA of the Landsat 8 and Sentinel-1 bands produced four principal components: PC1, PC2, PC3 (optical) and PC4 (SAR). In the data fusion process, we used PC1-3 as the lower spatial resolution multispectral (MS) images, and PC4 as the higher resolution image.
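To make the notion of uncorrelated components concrete, the following sketch computes the principal components of a stack of co-registered bands with NumPy. It shows only the generic PCA step; how the components are assigned to optical and SAR inputs in the ENVI workflow is not reproduced here, and the function name `principal_components` is hypothetical.

```python
import numpy as np


def principal_components(stack):
    """PCA of a band stack (illustrative sketch).

    stack: array of shape (n_bands, H, W)
    Returns (components, explained_variance_ratio), with components
    ordered from largest to smallest variance.
    """
    n_bands, height, width = stack.shape
    pixels = stack.reshape(n_bands, -1).astype(np.float64)

    # Centre each band, then compute the band-to-band covariance matrix.
    centred = pixels - pixels.mean(axis=1, keepdims=True)
    covariance = np.cov(centred)

    # eigh returns eigenvalues in ascending order, so reverse them.
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Project the centred pixels onto the eigenvectors.
    components = eigenvectors.T @ centred
    ratio = eigenvalues / eigenvalues.sum()
    return components.reshape(n_bands, height, width), ratio
```

For a four-band stack (three optical bands and one SAR band) this returns four component images whose mutual covariances are zero up to rounding error.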

Gram-Schmidt spectral sharpening
The Gram-Schmidt (GS) fusion technique is used to simulate high-resolution panchromatic (PAN) layers from lower-spatial MS bands with suitable weights (Laben & Brower, 2000). The GS transformation is applied to simulated PAN and MS images, where the simulated PAN raster layer is used as the first band.
Afterwards the high-resolution PAN image is swapped with the first band. Finally, the inverse GS image sharpening is used to form the pan-sharpened spectral bands (Kumar, Mukhopadhyay, & Ramachandra, 2009).

Evaluations
We first evaluated the quality of the fused images against originals subjectively by visually inspecting them for feature interpretations. Visually comparing fused images is a simple but effective approach for understanding the advantages and drawbacks of the fusion techniques used (Dahiya et al., 2013). We then undertook a more objective assessment with widely used indices to quantify quality improvements and determine which of the fusion techniques produced the best results.
Histogram statistics were examined to assess the preservation of spectral information, particularly for those images undergoing further processing (Dahiya et al., 2013), and to understand the pixel value frequency of individual bands. We then used the Bias ratio, the entropy difference index (EHD), the relative dimensionless global error in synthesis (ERGAS) index, and the correlation coefficient (CC) to quantify departure from optimal values. Detailed explanations of these evaluation methods are provided in Dimov, Kuhn, and Conrad (2016), Ehlers (1991), Fryskowska, Wojtkowska, Delis, and Grochala (2016), Gangkofner, Pradhan, and Holcomb (2007), and Kedzierski, Wilinska, Wierzbicki, Fryskowska, and Delis (2014).
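The four indices can be sketched with NumPy as follows. The definitions used here are common textbook forms and are assumptions: the cited references may normalise Bias or EHD differently, so the optimum values stated in the comments (0 for Bias, EHD and ERGAS; 1 for CC) apply to these definitions only. All function names are hypothetical.

```python
import numpy as np


def bias(reference, fused):
    """Difference of means; 0 is optimal under this definition."""
    return float(reference.mean() - fused.mean())


def correlation_coefficient(reference, fused):
    """Pearson correlation between two bands; 1 is optimal."""
    return float(np.corrcoef(reference.ravel(), fused.ravel())[0, 1])


def entropy(band, bins=256):
    """Shannon entropy (bits) of the band histogram."""
    hist, _ = np.histogram(band, bins=bins)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())


def entropy_difference(reference, fused, bins=256):
    """Entropy of fused minus entropy of reference; 0 is optimal."""
    return entropy(fused, bins) - entropy(reference, bins)


def ergas(reference, fused, resolution_ratio):
    """ERGAS for stacks of shape (n_bands, H, W); 0 is optimal.

    resolution_ratio: high-resolution pixel size divided by the
    low-resolution pixel size (for example 10 / 30).
    """
    rmse_sq = ((reference - fused) ** 2).mean(axis=(1, 2))
    mean_sq = reference.mean(axis=(1, 2)) ** 2
    return float(100.0 * resolution_ratio * np.sqrt((rmse_sq / mean_sq).mean()))
```

Reporting the indices per band, as well as averaged over bands, helps to reveal cases where one fused band departs from its reference more than the others.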

Flood extraction algorithm on Vietnam Open Data Cube
We adapted the water classification method of Mueller et al. (2016) and used it with our in-situ ground truthing data to undertake flood extraction. This approach includes two phases: (1) water surfaces and non-water areas are classified and assigned pixel values of 1 and 0 respectively; and (2) maps of water and non-water are combined to produce flood maps (Figure 4). This framework was originally based on the use of GS fusion images. When using IHS, Brovey and PCA fused images, the data values (DV) were adjusted based on histogram analyses.
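The two phases can be sketched as follows. This is a simplified illustration and not the classifier of Mueller et al. (2016): the single-band data value range used for the water test, the class codes and the function names are all hypothetical, and pixels that are wet only in the dry season are left as dry in this sketch.

```python
import numpy as np

# Hypothetical class codes for the output flood map.
DRY, PERMANENT_WATER, FLOOD_WATER = 0, 1, 2


def water_mask(band, low, high):
    """Phase 1: assign 1 to water and 0 to non-water.

    A pixel counts as water when its data value lies in [low, high];
    the range is an assumed stand-in for the histogram-derived rules.
    """
    return ((band >= low) & (band <= high)).astype(np.uint8)


def flood_map(dry_season_water, flood_season_water):
    """Phase 2: combine the two binary masks into a flood map."""
    dry_w = dry_season_water.astype(bool)
    flood_w = flood_season_water.astype(bool)

    out = np.full(dry_w.shape, DRY, dtype=np.uint8)
    out[dry_w & flood_w] = PERMANENT_WATER  # wet in both seasons
    out[~dry_w & flood_w] = FLOOD_WATER     # wet only in the flood season
    return out
```

For example, `flood_map(water_mask(dry_band, 0, 40), water_mask(flood_band, 0, 40))` would label each pixel as dry, permanent water or flood water, where the range 0 to 40 is a placeholder rather than a value from the study.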

Visual comparison of fused and original images
Image fusion resulted in a general visual improvement in the quality of both dry and flood season images (Figures 5 and 6 respectively) compared with the originals for the following reasons: 1. spatial resolution improved from 30 m (the resolution of the original Landsat 8 images) to 10 m; 2. the addition of colour, which is missing from the original black-and-white SAR grid; 3. the ability to distinguish urban areas which were not identifiable in the Landsat image, as evidenced by comparing the location of Hong Ngu Town in Figure 5(a,c,d); 4. the replacement of cloud cover and cloud shadows in the Landsat 8 image with interpretable pixels in the fused images, although noise created by cloud appears on the fused images where neither cloud nor noise was present on the original Sentinel-1 image (noise is represented by the red colour in Figure 5(b-f)).
Visual comparison of the images produced from the four fusion techniques (IHS, Brovey, PCA and GS) also revealed differences. The PCA image (Figure 6(e)) provided a clearer depiction of houses compared with the Brovey image (Figure 6(d)). In addition, the two-season (darker part) and three-season (blue part) rice fields are distinguishable (see the red circle on the IHS image in Figure 6(c)). The image underneath clouds was easily interpretable (orange colour in Figure 6(f)). We could have chosen cloud-free Landsat images for fusion in both the dry and flood seasons; instead, we used images with some cloud to test its effects on the fusion and flood extraction results.

Quantitative evaluation of fused image quality
Mean values for bands 1, 2 and 3 of the fused images showed minimal departure from optimum values for all evaluation indices and for each of the fusion techniques (Table 2). This indicated that all of the fused spectral bands were well generated. The PCA and GS techniques generally produced values closest to optimum for all indices, with the exception of the EHD index. EHD values for GS were higher than values produced for any of the other fusion techniques. Bias values were lowest for the PCA and GS fused image bands. However, the lower absolute EHD values indicated that the IHS and PCA fusion techniques generated the better results.

Histogram analysis of fused images
Overall, pixel value frequencies for the three fused bands were lower for the IHS and Brovey fusion techniques compared with PCA and GS, with the exception of two high peaks for band 1 at the value of 15 (Figure 7). The lack of sharp peaks in the PCA histogram indicates a balanced contrast between pixels, while the GS technique increased the contrast between pixel values (the sharp peak of band 1 around data value 195 may be due to the effect of cloud). Repeating this analysis with flood season images (Figure 8) revealed increased data values for the PCA and GS fusion techniques. However, higher data peaks were restricted to lower values for the IHS and Brovey techniques, and to higher values for the PCA and GS techniques.

Water and flood extraction
Inundated and dry areas were clearly visible when comparing the same locations in GS fused 10 m resolution images for the dry and flood seasons (Figure 9). Large dry areas (represented in yellow) in the dry season (25th of March) were inundated (represented in blue) in the flood season (15th of October), with the exception of residential areas, which were confirmed not to be inundated during field survey.
We found no significant difference between flood maps generated using outputs from the four fusion methods. In addition, we found good agreement between the VODC flood map and an inundation map downloaded from the Surface Water Mapping Tool (SWMT) of SERVIR Mekong for the same date of 15 October 2018. Some differences in permanent water are indicated between maps (c) and (d). It remains difficult to assess the source of this divergence without the algorithms and input data for the SWMT.

Comparing cloud effect on flood extraction results of the applied fusion
The four methods used in this study show different degrees of cloud cover effect (Figure 10). A comparison of Figure 10 (IHS and Brovey) with Figure 5(a) shows that some clouded areas were misclassified as permanent water by both the IHS and Brovey methods. In contrast, there was no or very little influence of cloud cover in the Landsat 8 images on the flood maps generated using the PCA and GS methods. The problems of cloud cover and cloud shadow could be solved by employing the Pixel Quality Assessment (PQA) introduced by Lewis et al. (2017) prior to fusion.

Cross comparisons of the fused flood maps
The fused flood maps were superimposed in pairs to show similarities in permanent water (in blue), flood water (in light blue) and dry areas (in grey), and differences in water transition from permanent to flood water (in red), and flood to permanent water (in yellow) (Figure 11). The most similar pairing was Brovey vs IHS in terms of flood, permanent water and cloud cover, followed by GS vs IHS, and GS vs PCA. More red and yellow areas were found in the Brovey vs PCA and IHS vs PCA pairings. In general, permanent water (the river) and transitions between permanent and flood water (red and yellow) are clearly mapped (Figure 11).
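A pairwise comparison of this kind can be sketched as a cross-tabulation of two class maps on the same grid. The class codes (0 for dry, 1 for permanent water, 2 for flood water) and the function name `cross_compare` are assumptions for illustration; the colours in Figure 11 correspond to particular cells of such a table.

```python
import numpy as np


def cross_compare(map_a, map_b, n_classes=3):
    """Cross-tabulate two class maps (illustrative sketch).

    Returns (agreement_fraction, table), where table[i, j] counts the
    pixels labelled i in map_a and j in map_b.
    """
    a = map_a.ravel().astype(np.int64)
    b = map_b.ravel().astype(np.int64)

    # Encode each (a, b) pair as a single index, then count occurrences.
    pair_codes = a * n_classes + b
    counts = np.bincount(pair_codes, minlength=n_classes * n_classes)
    table = counts.reshape(n_classes, n_classes)

    # Diagonal cells are pixels where both maps agree.
    agreement = np.trace(table) / table.sum()
    return float(agreement), table
```

Under the assumed codes, `table[1, 2]` counts pixels that are permanent water in the first map and flood water in the second, and `table[2, 1]` counts the reverse transition.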

Discussion and conclusion
The rise in uses of EO satellite data in terms of diversity of the electromagnetic spectrum, spatial, temporal and spectral resolutions is evident (Lewis et al., 2017; Pohl & Van Genderen, 1998; Solberg, Jain, & Taxt, 1994). The use of data fusion has increased considerably since the mid-1980s. Numerous different image fusion techniques have been developed for a multitude of purposes including improving the accuracy of image classification, the visual appeal of fused colour images, or simply for visualization (Zhang, 2004). It is now widely accepted that fusion methods improve spatial/spectral resolution of resultant images and improve image interpretability. However, much attention continues to be given to the improvement and development of new methods (learning-based for example) for RS data fusion (Belgiu & Stein, 2019; Ghahremani, Liu, Yuen, & Behera, 2019; Vargas, Arguello, & Tourneret, 2019) but less has been done to integrate optical and SAR images. This study makes contributions to that area of research.
Figure 9. Maps of VODC algorithm run on a PC in the dry season (a) and flood season (b). The flood map (c) is a combination of (a) and (b), where p-water is water of short permanence. It is compared with a flood map of the Surface Water Mapping Tool of SERVIR Mekong (d) produced for the same date. All images are at 10 x 10 m resolution.
Few studies have discussed the limitations of existing fusion techniques; nevertheless, disadvantages exist. First, colour distortion is a significant problem due to differing wavelengths in the input images. This is particularly the case for PAN images which are acquired by different sensors and which have wavelengths ranging from visible to the near infrared. This variation results in large changes in the grey values in the new PAN images (Zhang, 2004). Secondly, there are no automatic fusion solutions that achieve consistent outcomes. Therefore, fusion quality is frequently defended based on the operator's personal experience with the process. In our study, the GS method generated the best outcomes, in contrast to the IHS, PCA and Brovey approaches. In addition, the histogram analysis showed a wide and changing range of pixel data values, resulting in inconsistent colouring of the images, particularly with the PCA method, which created bright images (DV increased).
This study mapped flood areas as a prototype of the VODC, and most image fusing procedures were done manually. In the real application these steps would be automated by creating a batch mode routine within the ENVI Classic library. Alternatively, the statistics-based fusion method (Zhang, 2002), a new automatic fusion approach that conducts geo-referencing and resampling in one step (Zhang, 2004), could be used. The two major problems of colour distortion and operator/data dependency could be solved using this automatic technique. Without fusion, our flood algorithm could, with a modification, work directly on Landsat 7, Landsat 8 and Sentinel-1 data. However, the cloud effects would remain in the optical scenes and the better quality achieved in the fused images would not be gained.
Flood inundation mapping is a promising and suitable topic for ODCs since it requires time-series data and a robust computational platform. Big data systems meet these requirements. In the VODC system, the fusion is considered as a tool to generate better quality images and reduce the effects of cloud on optical RS images. Since there are a large number of image fusion methods available, it remains difficult to judge or choose the best method for particular uses. In addition, the accuracy of the final flood maps could be affected by band selection because the RS scenes contain more than three bands, and some fusion methods allow combining more than three bands. Therefore, we recommend the selection of bands performing optimally in classification of target geographic features based on physical reflection (scattering mechanism for SAR) or light absorption. Once optimal bands are chosen, other bands added to the fusion process should not significantly affect the final results.
In terms of flood map accuracy, visual agreement was found not only between the extracted and the field survey maps, but also with the SERVIR Mekong inundation map. Minor differences (mostly in cloud cover areas) were found between the fused flood maps, and the GS fusion method produced the image with the lowest cloud effects. However, simple zonal statistics (overlaying the GS flood map and the field survey map) found the total flooded area in the RS-based flood map to be 5.1% larger than that measured in the field. This difference could arise from several sources including the mismatched dates of the Sentinel-1 acquisition and the field survey, cloud effect impacts, and the reference water extracted from pre-flood images. Higher accuracy in flood maps might be achieved when the algorithm is applied in the VODC: since it has a larger data archive, better matching image pairs may be found to serve as inputs.