Drone-borne sensing of major and accessory pigments in algae using deep learning modeling

ABSTRACT Intensive algal blooms increasingly degrade inland water quality. Hence, this study aimed to analyze algal phenomena quantitatively and qualitatively using synoptic monitoring, algal pigment analysis, and a deep learning model. Water surface reflectance was measured using field monitoring and drone hyperspectral image sensing. The algal experiments conducted on the water samples provided data on major pigments (chlorophyll-a and phycocyanin), accessory pigments (lutein, fucoxanthin, and zeaxanthin), and absorption coefficients. Based on the reflectance and absorption coefficient spectral inputs, a one-dimensional convolutional neural network (1D-CNN) was developed to estimate the concentrations of the major and accessory pigments. The 1D-CNN modeled the periodic trends of chlorophyll-a, phycocyanin, lutein, fucoxanthin, and zeaxanthin in agreement with the observed ones, with R² values of 0.87, 0.71, 0.76, 0.78, and 0.74, respectively. In addition, major and accessory pigment maps developed by applying the trained 1D-CNN model to the processed drone hyperspectral image inputs successfully provided spatial information regarding the spots of interest. The model provided explicit algal biomass information using the estimated major pigments and implicit taxonomic information on green algae, diatoms, and cyanobacteria using the accessory pigments. Therefore, we provide strong evidence of the extendibility of deep learning models for analyzing various algal pigments to gain a better understanding of algal blooms.


Introduction
Severe algal blooms in inland water systems are a water quality concern all over the world (Aguilera et al. 2018; Hallegraeff 2003; Paerl et al. 2020). Eutrophication frequently occurs due to rapid urbanization (Luo et al. 2017) and global climate change (Michalak 2016). Excessive nutrients and unusually long warm periods trigger and accelerate explosive algal growth (Paerl, Hall, and Calandrino 2011). In addition, the decomposition of algae after a massive die-off depletes dissolved oxygen, causing the death of aquatic animals (Mhlanga et al. 2006). This vicious process impairs the quality of fresh water (Brooks et al. 2016). Thus, numerous studies emphasize that sophisticated monitoring of algal phenomena is required to manage intensive algal blooms in water supplies (Barruffa et al. 2021; Lekki et al. 2019; Smith et al. 2004; Zhou et al. 2021).
Synoptic monitoring using satellites, aircraft, and drones, which provides the detailed spatial distribution of the phenomena, can be performed to detect algal blooms (Ahn et al. 2006; Johansen et al. 2018; Stoyneva-Gärtner et al. 2020). Passive multi- and hyperspectral sensors on these platforms can be used to retrieve spatial and spectral information of algae. In particular, drones are versatile in terms of meteorological conditions and are cost-effective unmanned aerial vehicles that can be utilized for algal monitoring of inland waters (Jung et al. 2017; Kislik, Dronova, and Kelly 2018). In addition, hyperspectral image sensors allow the drone to capture fine-resolution images of the algae. Kim et al. (2018) evaluated the algal detection capability of fixed-wing drones by providing a high-resolution distribution of chlorophyll-a. Becker et al. (2019) successfully utilized a quadcopter drone with a hyperspectral spectroradiometer to monitor harmful algal blooms. In addition, such drone systems have the advantage of monitoring algal blooms in small and shallow water bodies such as tributaries (Cillero Castro et al. 2020; Pölönen et al. 2014; Su and Chou 2015; Xu et al. 2019).
Optical spectra, such as remote sensing reflectance and absorption spectra, can be utilized to estimate the algal biomass. Chlorophyll-a and phycocyanin are representative pigments used as quantitative indicators of harmful algae (Hunter et al. 2010). Bio-optical algorithms have been developed to estimate pigment levels using their apparent optical properties (Duan et al. 2010; Gilerson et al. 2010; Lesht, Barbiero, and Warren 2013; Liu et al. 2021) and inherent optical properties (Gons, Rijkeboer, and Ruddick 2002; Lee, Carder, and Arnone 2002; Li et al. 2013; Simis et al. 2007). Most previous studies focused on algal biomass quantity by estimating chlorophyll-a and phycocyanin. However, algae contain accessory pigments that represent their community composition (Yao et al. 2011), optical physiology (Arrigo et al. 2014), and predation pressure (Dolan et al. 2002). These pigments provide qualitative evidence of algae. That is, the monitoring of accessory pigments is important to better understand the overall algal phenomena. Richardson and Kruse (1999) attempted to monitor auxiliary pigments such as chlorophyll-a, -b, and -c and myxoxanthophyll using airborne hyperspectral imagery to classify algal species. Chase et al. (2017) developed a global algorithm to estimate ap-carotene, zeaxanthin, alloxanthin, and diadinoxanthin levels to identify the photoprotective physiology. Remote sensing reflectance and absorption spectra involve the spectral features of accessory pigments (Vijayan and Somayajula 2014; Morton 1975). However, the endmembers of minor pigments are difficult to trace because of the low concentrations of pigments in the complex optical spectra of inland water (Matthews and Bernard 2013; Richardson and Kruse 1999). Thus, alternative approaches are needed to estimate auxiliary pigments using remote sensing signal data.
Deep learning is an emerging technique that provides superior performance when dealing with complex and numerous datasets (Chen and Lin 2014). In particular, convolutional neural networks (CNNs) have been introduced as representative deep learning models that provide reliable performance for feature extraction from multi-dimensional data (Kalchbrenner, Grefenstette, and Blunsom 2014; Sainath et al. 2013; Sothe et al. 2020). In addition, the CNN model has been widely applied to signal-type data, such as spectral data. Ng et al. (2019) developed a one-dimensional CNN model in which both visible/near-infrared and mid-infrared spectra of soil were used as input to predict comprehensive soil properties. Yu et al. (2021) designed a one-dimensional CNN model to detect pesticide traces using visible/near-infrared spectral data of fruits. Moreover, deep learning techniques can provide information regarding the important features of the input data using an attention network module (Woo et al. 2018). Thus, these data-driven approaches are expected to account for the sophisticated features of accessory pigments from the spectral data, including remote sensing reflectance and absorption coefficients. In addition, the CNN model with the attention module identifies valuable optical bands when estimating multiple pigments. However, few studies have investigated algal events by focusing on major pigments and accessory pigments using a deep learning approach with optical spectra.
Therefore, the objective of this study was to evaluate the potential of deep learning models to estimate biomass pigments (i.e. chlorophyll-a and phycocyanin) and accessory pigments (i.e. lutein, fucoxanthin, and zeaxanthin). To accomplish this, we 1) monitored reflectance spectra using field measurements and drone-borne hyperspectral sensing, 2) analyzed the algal pigments using standard methods and high-performance liquid chromatography (HPLC), 3) designed a one-dimensional CNN model (1D-CNN) to estimate biomass indicators and auxiliary pigments, and 4) evaluated the estimation performance of the CNN model and the spatial distribution maps of the pigments using drone imagery.
Figure 1 shows the Chuso region (CR) of Daechung Lake (36° 28' 40.7″ N, 127° 28' 51.3″ E). The lake is impounded by an artificial dam for efficient water supply and flood control. The lake has a water volume of 1,490 × 10⁶ m³ with a surface area, length, and average water depth of 72.8 km², 86 m, and 20 m, respectively. CR is located in the upper part of the lake and is connected to the Sook Creek. This stream brings nutrients into the CR (Oh, Kim, and Cho 2015). In addition, the geological structure of CR includes numerous river bends that can increase the residence time of water. Accordingly, CR is vulnerable to algal bloom outbreaks. In particular, harmful cyanobacterial blooms are frequently observed in CR from summer to autumn (Shin et al. 2016). Microcystis is the dominant genus of the harmful bloom (Lee et al. 2007). Thus, the harmful algal phenomena of CR need to be understood.

Research overview
An overview of this study is presented in Figure 2. The reflectance and absorption coefficient data were collected from the field monitoring and experimental analysis (Figure 2a). These data were used as inputs for the 1D-CNN model with the attention module (Figure 2b). Subsequently, the 1D-CNN model was trained to estimate the levels of chlorophyll-a, phycocyanin, lutein, fucoxanthin, and zeaxanthin (Figure 2c). Moreover, to generate maps of the biomass and accessory pigments, reflectance data were used as input for another 1D-CNN model (Figure 2d). This model was trained to estimate the absorption coefficient spectra (Figure 2e). Then, the drone-borne reflectance map data were fed into the trained 1D-CNN model (Figure 2f). This resulted in the generation of drone-borne absorption coefficient maps (Figure 2g). The processes in Figure 2d-g allowed the drone-borne map data to be configured with the same dimensions as the original input by providing reflectance and absorption coefficient spectral information in each pixel (Figure 2h). Afterward, both maps of reflectance and absorption coefficients were supplied to the trained 1D-CNN model with the attention module (Figure 2i). Spatial distribution maps of chlorophyll-a, phycocyanin, lutein, fucoxanthin, and zeaxanthin were eventually produced (Figure 2j).

Field monitoring
In this study, field sampling was conducted in the CR from 2019 to 2020. Fifteen monitoring events were performed, yielding data from a total of 126 sampling points (Table 1). The sampling schedules were selected to reflect the lag, exponential, stationary, and lysis phases of algal growth (Figure 1). Furthermore, we determined each sampling day based on clear sky and low wind speed conditions. For CR monitoring, water samples were collected for algal pigment analysis. Radiometric measurements were performed to observe water-leaving radiance, downwelling sky radiance, and downwelling sky irradiance using an ASD hand-held visible/near-infrared spectroradiometer device (Malvern Panalytical Ltd., Malvern, UK). The measurement posture and angles were chosen to minimize the impact of sunlight on the water surface, with a zenith angle of 40° and an azimuth angle of 135° (Mobley 1999). Subsequently, the observed radiance and irradiance data were used to calculate the remote sensing reflectance using the following equation:
$$R_{rs}(\lambda) = \frac{L_w(\lambda) - 0.025\,L_{sky}(\lambda)}{E_d(\lambda)}$$
where $R_{rs}$ is the remote sensing reflectance at the water surface (sr⁻¹); $L_w$ is the water-leaving radiance (W m⁻² nm⁻¹ sr⁻¹); $L_{sky}$ is the downwelling sky radiance (W m⁻² nm⁻¹ sr⁻¹); 0.025 is the skylight correction constant based on clear sky and low wind speed (i.e. <5 m s⁻¹); $E_d$ is the downwelling sky irradiance (W m⁻² nm⁻¹); and $\lambda$ is the wavelength band (nm).
This study also built an empirical bio-optical algorithm to estimate major and auxiliary pigment concentrations for comparison with the estimation performance of the deep learning model. The band ratio algorithm was adopted by using the two remote sensing reflectance bands that were optically sensitive to each pigment (Duan, Ma, and Hu 2012).
The two-band ratio algorithm is defined by the following equation:
$$P_i \propto \frac{R_{rs}(\lambda_{1,i})}{R_{rs}(\lambda_{2,i})}$$
where $P_i$ is the estimated concentration of chlorophyll-a, phycocyanin, lutein, fucoxanthin, or zeaxanthin (mg m⁻³); $R_{rs}(\lambda_{1,i})$ is the reflectance band sensitive to the absorption, backscattering, and fluorescence of each pigment; and $R_{rs}(\lambda_{2,i})$ is the reflectance band that is sensitive to the individual pigment absorption.
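The reflectance calculation and the two-band ratio described above can be illustrated with a minimal sketch. It assumes NumPy arrays of per-band measurements; the function names, the nearest-band lookup, and the linear fit between ratio and concentration are illustrative assumptions, not the calibration used in this study.

```python
import numpy as np

def remote_sensing_reflectance(l_w, l_sky, e_d, rho=0.025):
    """R_rs = (L_w - rho * L_sky) / E_d, evaluated band by band.

    rho = 0.025 is the skylight correction constant quoted in the text.
    """
    l_w, l_sky, e_d = (np.asarray(x, dtype=float) for x in (l_w, l_sky, e_d))
    return (l_w - rho * l_sky) / e_d

def band_ratio(rrs, wavelengths, lam1, lam2):
    """Ratio of R_rs at the bands nearest to lam1 and lam2 (nm)."""
    rrs = np.asarray(rrs, dtype=float)
    wavelengths = np.asarray(wavelengths, dtype=float)
    i1 = int(np.argmin(np.abs(wavelengths - lam1)))
    i2 = int(np.argmin(np.abs(wavelengths - lam2)))
    return rrs[..., i1] / rrs[..., i2]

def fit_band_ratio_model(ratios, concentrations):
    """Assumed linear form: concentration ~ slope * ratio + intercept."""
    slope, intercept = np.polyfit(ratios, concentrations, deg=1)
    return slope, intercept
```

The band pair passed to `band_ratio` would be chosen per pigment; no specific wavelengths are implied here.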

Drone-borne sensing
Drone-borne monitoring was conducted concurrently with field monitoring to capture the optical scene of the CR. In this study, a hexacopter drone was employed (MATRICE M600 Pro, DJI Inc., China). The drone was equipped with a hyperspectral imaging sensor (Nano-Hyperspec, Headwall Photonics, MA, US). This sensor had a wavelength range of 400-1000 nm with a spectral resolution of 4 nm. In addition, the field of view of the hyperspectral sensor was 32.2°, and the spatial resolution was configured as 0.4 m. The drone monitoring procedures were as follows: 1) a dark reference was collected by covering the lens with its cap before flying; 2) a reference reflectance tarp with a reflectivity of 70% was placed; and 3) the drone was then hovered. Each flight comprised 15 min for monitoring and 5 min for return, and the altitude maintained was 150 m. After drone monitoring, the hyperspectral imagery of CR was retrieved with spectral information of the dark and tarp references. This study processed the raw hyperspectral images without atmospheric correction because the atmospheric interference was insignificant owing to the low flying altitude of 150 m. The raw images were processed using SpectralView software (Headwall Photonics Inc., MA, US). For radiometric calibration, the measured dark reference and the internal calibration variable file of the software were used, which included gains and offsets to transform digital numbers into the at-sensor radiance. Subsequently, the measured tarp data, internal tarp reference, and white reference data were used to calculate the water surface reflectance spectra in each image pixel from the radiance spectra by empirically comparing the measured and reference data. Then, the optically processed image was ortho-rectified based on the measured geometric information of the CR.
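The two calibration steps above (digital numbers to radiance, then radiance to reflectance) can be sketched as follows. This is only an illustration under stated assumptions: the linear gain/offset form and the single-target scaling against the 70% tarp are simplifications chosen here, and the function names are hypothetical; the actual processing was performed inside the SpectralView software.

```python
import numpy as np

def dn_to_radiance(dn, dark, gain, offset):
    """Assumed linear calibration: radiance = gain * (DN - dark) + offset.

    dn: (rows, cols, bands) raw digital numbers; dark, gain, offset: (bands,).
    """
    return gain * (np.asarray(dn, dtype=float) - dark) + offset

def radiance_to_reflectance(radiance, tarp_radiance, tarp_reflectivity=0.70):
    """Assumed single-target empirical scaling against the reference tarp.

    A pixel as bright as the tarp is assigned the tarp reflectivity.
    """
    return np.asarray(radiance, dtype=float) / tarp_radiance * tarp_reflectivity
```

Because NumPy broadcasts over the last axis, the same per-band vectors apply to every pixel of the image cube.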

Major pigment extraction
This study followed a standard experimental method for chlorophyll-a analysis (APHA (American Public Health Association) 2001). Water samples were filtered using a GF/F filter with a 0.7 μm pore size (Whatman Inc., Germany). An acetone:water solvent (90:10 v/v) was used to extract the pigment from the filtered samples. The samples were maintained in a dark room at 4°C for 24 h. Then, the samples were centrifuged at 500 × g at 20°C for 20 min to settle the filter debris. A Cary-5000 UV-Vis-NIR spectrophotometer (Agilent Inc., CA, US) measured the optical density of the supernatants. The chlorophyll-a concentration (Chl-a, mg m⁻³) was calculated from the measured optical density (OD), the supernatant volume ($S_n$, mL), and the filtered sample volume ($V_o$, L).
To obtain the phycocyanin concentration, the freezing and thawing method was employed (Bennett and Bogorad 1973). The concentrated water samples were homogenized using an Ultra-Sonicator (Sonictopia Inc., South Korea) to break up the algal scum. The homogeneous samples were centrifuged at 4000 rpm at 4°C for 15 min. Then, the supernatant was removed and the pellets were used for further analysis. Phosphate buffer (pH 7.0) was used to stabilize the samples during the freezing step at −20°C for 24 h, followed by thawing at room temperature. The cell volume changed because of these steps, thus breaking the algal cells and releasing the phycocyanin pigment only. To improve phycocyanin release, further physical disturbance was applied using a shaking incubator (N-BIOTEK Inc., South Korea) at 150 rpm for 15 min. Next, the samples were centrifuged at 4000 rpm and 4°C for 15 min. The optical density of the phycocyanin samples was measured using a Cary-5000 UV-Vis-NIR spectrophotometer. The phycocyanin concentration was calculated from the optical densities measured at 620 nm and 652 nm ($OD_{620}$ and $OD_{652}$, respectively).

Accessory pigment extraction
This study adopted HPLC (Agilent Technologies, Inc., CA, USA) to analyze the levels of lutein, fucoxanthin, and zeaxanthin. These auxiliary pigments represented the algal communities of green algae, diatoms, and cyanobacteria, respectively. HPLC samples were prepared by filtering 250 mL water samples using a GF/F filter (Whatman Inc., Germany). The filtered samples were frozen at −80°C prior to pigment analysis. Next, 10 mL of 95% acetone was used to extract the pigments from the filters, after which sonication was performed for 5 min to improve pigment extraction by breaking the algal cells. The solvent samples were kept in a dark room at −2°C for 24 h. The extracted samples were filtered again using a 0.2 μm syringe filter (PTFE) to remove the filter debris. The filtered sample (1 mL) was prepared for HPLC analysis by adding 400 μL of HPLC-grade water for water packing and 50 μL of canthaxanthin as the internal standard. In this study, the overall HPLC process followed that described in Zapata, Rodríguez, and Garrido (2000). A Waters Symmetry C8 column (150 mm × 4.6 mm, 3.5 μm particle size, 100 Å pore size) was connected to the HPLC as the stationary phase, which separated each pigment based on different retention times. In addition, methanol:acetonitrile:aqueous pyridine (50:25:25 v:v:v) was used as eluent A for the mobile phase. Eluent B was methanol:acetonitrile:acetone (20:60:20 v:v:v). The eluent solutions were prepared as HPLC grade, and the mixed solution was filtered. Subsequently, in the HPLC system, the sample injection volume was set to 100 μL, a photodiode array detector (PAD) was used, and the eluent flow rate was fixed at 1 mL min⁻¹. In addition, a gradient profile was used for the mobile phase composition of eluents A and B (Zapata, Rodríguez, and Garrido 2000). The chromatogram peaks for the accessory pigments were evaluated using standard pigments of lutein, fucoxanthin, and zeaxanthin.
These pigments were quantified using the equations developed by Woods (1997). The response factor was calculated using the following equation:
$$RF = \frac{C_b \times I_V}{A}$$
where $RF$ is the standard response factor (ng unit area⁻¹), $C_b$ is the standard pigment concentration (ng μL⁻¹), $I_V$ is the injection volume (μL), and $A$ is the integrated peak area. Based on this, the accessory pigment concentration can be calculated as follows:
$$C_i = \frac{A \times RF \times E_V \times D}{I_V \times S_V}$$
where $C_i$ is the concentration of a specific pigment, including lutein, fucoxanthin, and zeaxanthin (ng L⁻¹); $A$ is the integrated peak area; $RF$ is the standard response factor (ng unit area⁻¹); $I_V$ is the injection volume (mL); $E_V$ is the extraction volume (mL); $S_V$ is the sample filtration volume (L); $D$ is the dilution factor; $V_0$ is the total standard solution volume (μL); and $V_s$ is the total sample solution volume (μL).
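As a worked illustration of the response-factor quantification, the sketch below follows the units stated in the text (ng per unit area for the response factor, ng L⁻¹ for the result). The function names and the default dilution of 1.0 are assumptions for illustration; the example volumes in the usage note are the 100 μL injection, 10 mL extraction, and 250 mL filtration reported above.

```python
def response_factor(std_conc_ng_per_ul, inj_vol_ul, peak_area):
    """RF (ng per unit peak area) from a pigment standard run."""
    return std_conc_ng_per_ul * inj_vol_ul / peak_area

def pigment_concentration(peak_area, rf, inj_vol_ml, ext_vol_ml,
                          sample_vol_l, dilution=1.0):
    """Pigment concentration in ng per litre of filtered water.

    peak_area * rf gives the mass injected (ng); dividing by the injection
    volume and multiplying by the extraction volume scales it to the whole
    extract; dividing by the filtered volume refers it to the water sample.
    dilution = 1.0 is a placeholder assumption.
    """
    return peak_area * rf * ext_vol_ml * dilution / (inj_vol_ml * sample_vol_l)
```

For example, a hypothetical standard of 2 ng μL⁻¹ with a peak area of 1000 gives RF = 0.2, and a sample peak area of 500 then corresponds to 40,000 ng L⁻¹ (40 μg L⁻¹) with the volumes above.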

Absorption coefficient measurement
The absorption coefficient spectra of the algal samples were measured using a Cary-5000 UV-Vis-NIR spectrophotometer equipped with an integrating sphere accessory. This accessory had a dual-beam port of 25 mm. Filtered samples were attached to the ports to measure the reflectance and transmittance of the samples. Following this procedure, the absorption coefficients of pure phytoplankton and non-algal particles were measured (Tassan and Ferrari 1995). The absorption coefficient of the non-algal particles was measured by bleaching the algal pigments of the samples using a 5% NaClO solution. The absorption coefficient of pure phytoplankton, free of the non-algal particle signal, was obtained by subtracting the absorption coefficient of the non-algal particles from that of the phytoplankton, i.e. $a_{pp}(\lambda) = a_p(\lambda) - a_{np}(\lambda)$. The absorption coefficients were calculated following the equation proposed by Tassan and Ferrari (1995).
In this calculation, $a_p(\lambda)$ is the absorption coefficient of phytoplankton (m⁻¹), $a_{np}(\lambda)$ is the absorption coefficient of non-algal particles (m⁻¹), $a_{pp}(\lambda)$ is the absorption coefficient of pure phytoplankton (m⁻¹), $A$ is the filtered surface area (cm²), $V$ is the filtered water volume (mL), $\varepsilon$ is the scaling factor, $OD_f(\lambda)$ is the optical density of the filter sample, $OD_r(\lambda)$ is the optical density of the reference filter, and $OD_m$ is the minimum optical density of the absorption coefficient.

1D-CNN model
In this study, a CNN model was used to estimate the absorption coefficient and the levels of chlorophyll-a, phycocyanin, lutein, fucoxanthin, and zeaxanthin. CNN models are mainly composed of convolutional layers, which apply convolutional filters containing weights and biases to extract input features with element-wise multiplication (Kalchbrenner, Grefenstette, and Blunsom 2014). This study adopted a one-dimensional CNN model (1D-CNN) as it used one-dimensional filters with a size of 1 × s. The extracted feature output was estimated using the following equation:
$$o^{t}_{1,i} = f\left(\mathrm{Conv1D}\left(w^{t-1}_{1\times s},\, o^{t-1}_{1,i}\right) + B^{t}\right)$$
where $o^{t}_{1,i}$ is the output of layer $t$, $o^{t-1}_{1,i}$ is the input from layer $t-1$, $w^{t-1}_{1\times s}$ is the weight with a one-dimensional filter size $s$, $B^{t}$ is the bias of layer $t$, Conv1D indicates element-wise multiplication, and $f$ represents the activation function.
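To make the convolution operation concrete, a minimal NumPy sketch of one 1D convolutional layer is given below. It assumes "valid" padding, a stride of 1, and a ReLU activation by default; these choices and the array layout are assumptions for illustration and are not taken from the model configuration of this study.

```python
import numpy as np

def conv1d_layer(x, weights, bias, activation=lambda z: np.maximum(z, 0.0)):
    """One 1D convolutional layer: f(Conv1D(w, x) + B).

    x:       (length, in_channels) input spectrum or feature map
    weights: (s, in_channels, n_filters) filters of size 1 x s
    bias:    (n_filters,)
    Returns an array of shape (length - s + 1, n_filters).
    """
    s, _, n_filters = weights.shape
    out_len = x.shape[0] - s + 1
    out = np.empty((out_len, n_filters))
    for i in range(out_len):
        window = x[i:i + s]  # (s, in_channels) slice under the filter
        # Element-wise multiply the window with each filter and sum.
        out[i] = np.tensordot(window, weights, axes=([0, 1], [0, 1])) + bias
    return activation(out)
```

Sliding a filter of size 3 over a 4-band input therefore yields 2 output positions per filter.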
In addition to the convolutional layer operation, this study adopted batch normalization to promote stable model learning by regularizing the input features, a max-pooling layer to reduce the computational burden and extract the most prominent features, and a dropout layer to prevent model overfitting (Luo et al. 2018; Talman, Yli-Jyrä, and Tiedemann 2018; Srivastava et al. 2014). Moreover, multiple fully connected layers were applied in parallel to estimate the biomass and accessory pigment concentrations simultaneously.
A convolutional block attention module was applied to the 1D-CNN model structure to identify the spectral importance for the pigment estimations (Woo et al. 2018). This study applied a spatial attention module prior to the first convolutional layer, which emphasized important spectra and suppressed the insignificant bands. In the modified attention module, $A_w$ denotes the refined input feature weights, conv1D is the one-dimensional convolution layer, AvgP indicates the average-pooling of the input, MaxP indicates the max-pooling of the input, the results of AvgP and MaxP are concatenated, $w_{1\times 7}$ is a filter with a size of 1×7, $I_R$ denotes the refined input features, sig is the sigmoid activation function, and ReLU is the rectified linear unit activation function.
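The attention mechanism described above can be sketched in NumPy as follows. This is a simplified illustration in the spirit of the spatial attention of Woo et al. (2018), not the modified module of this study: it omits the additional convolution and ReLU step, and the array layout (bands × channels), zero padding, and function names are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def spectral_attention(x, kernel, bias=0.0):
    """Per-band attention weights from average- and max-pooled features.

    x:      (bands, channels) spectral input (channel layout is assumed)
    kernel: (7, 2) filter of size 1 x 7 over the two pooled maps
    Returns (weights, refined) with shapes (bands,) and (bands, channels).
    """
    avg = x.mean(axis=1)                       # AvgP across channels
    mx = x.max(axis=1)                         # MaxP across channels
    pooled = np.stack([avg, mx], axis=1)       # concatenation, (bands, 2)
    k = kernel.shape[0]
    pad = k // 2
    padded = np.pad(pooled, ((pad, pad), (0, 0)))  # zero padding keeps length
    scores = np.array([np.sum(padded[i:i + k] * kernel)
                       for i in range(x.shape[0])]) + bias
    weights = sigmoid(scores)                  # values in (0, 1) per band
    return weights, x * weights[:, None]       # re-weight every band
```

Bands with weights close to 1 pass through almost unchanged, whereas bands with weights close to 0 are suppressed before the first convolutional layer.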
The concatenation of the average- and max-pooling results allowed comprehensive spectral features to be considered. In addition, the added one-dimensional convolution layer recalibrated the input feature once more to improve the feature representation for important spectra. Figure 3 shows the overall 1D-CNN model structure with the attention module. This study used spectral input, including the observed surface reflectance and the absorption coefficient. This input was fed into the attention module to generate refined spectral features by assigning specific weight values to important spectral bands. Then, the recalibrated feature was fed to the convolutional layers. Three convolutional layers were used, with 32, 64, and 256 filters, respectively, and a filter size of 1×3. Batch normalization was performed after each convolutional layer. The max-pooling and dropout layers were used after the third convolutional layer. After convolutional feature extraction, three fully connected layers were adopted for estimating major and auxiliary pigments. Specifically, one fully connected layer estimated both chlorophyll-a and phycocyanin, another estimated both lutein and zeaxanthin, and the last one estimated fucoxanthin (Figure 3). This grouping was driven by the correlation between the pigments. In addition, each fully connected layer was linked to an individual loss function: one for the estimation of chlorophyll-a and phycocyanin, another for lutein and zeaxanthin, and the last for fucoxanthin. Subsequently, the total loss function was defined as the sum of the three loss functions. By minimizing the total loss function, the 1D-CNN model was expected to learn the common features of the grouped pigments and to suppress the features of the other pigment groups. Thus, a single training of the 1D-CNN model can provide two major algal pigments and three auxiliary pigments at once.
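The summed multi-output loss can be sketched as follows. The text does not state which loss function was attached to each fully connected layer, so the mean squared error used here is an assumption, as are the head names.

```python
import numpy as np

# Hypothetical head names for the three pigment groups described in the text.
HEADS = ("chla_phycocyanin", "lutein_zeaxanthin", "fucoxanthin")

def mse(pred, obs):
    """Mean squared error over all samples and pigments of one head."""
    pred, obs = np.asarray(pred, dtype=float), np.asarray(obs, dtype=float)
    return float(np.mean((pred - obs) ** 2))

def total_loss(pred, obs):
    """Sum of the per-head losses; pred and obs map head name -> array."""
    return sum(mse(pred[h], obs[h]) for h in HEADS)
```

Because the three terms are added without weights in this sketch, each pigment group contributes equally to the gradient of the shared convolutional layers.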
Furthermore, for the model training, we applied a learning rate of 0.001 and a batch size of 16. The layer compositions and hyperparameters were determined by repeated tuning to minimize the pigment training errors. This study allocated 70% and 30% of the data to model training and validation, respectively. That is, 88 and 38 samples were randomly assigned to training and validation, respectively.
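The 70/30 random allocation can be reproduced with a short sketch. The seed and the function name are arbitrary assumptions; only the fractions and the sample count of 126 come from the text.

```python
import numpy as np

def random_split(n_samples, train_fraction=0.7, seed=0):
    """Randomly partition sample indices into training and validation sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_train = int(n_samples * train_fraction)  # fraction rounded down
    return order[:n_train], order[n_train:]

train_idx, val_idx = random_split(126)
```

With 126 sampling points, rounding down 0.7 × 126 = 88.2 gives the 88 training and 38 validation samples reported above.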
Moreover, to generate spatial maps of the pigments, absorption coefficient maps were required. That is, another 1D-CNN model was designed to estimate the absorption coefficient using the observed reflectance spectra. In the model, three convolutional layers were adopted with filters of 16, 32, and 64 and a filter size of 1×3. After extraction of the convolutional features, max-pooling and dropout layers were applied. Then, the fully connected layers were used to estimate the 79 bands of the absorption coefficient spectra. The trained 1D-CNN model was applied to the drone reflectance map data to generate an absorption coefficient map. Subsequently, the drone input maps with reflectance and absorption spectra in each pixel were applied to the previously trained model with an attention module to produce spatial distribution maps of chlorophyll-a, phycocyanin, lutein, fucoxanthin, and zeaxanthin.
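The two-stage mapping procedure can be sketched as a pixel-wise pipeline. Here `absorption_model` and `pigment_model` are hypothetical callables standing in for the two trained 1D-CNN models, and the simple concatenation of reflectance and absorption bands is an assumed input layout.

```python
import numpy as np

def map_pigments(reflectance_cube, absorption_model, pigment_model):
    """Apply the two trained models to every pixel of a hyperspectral image.

    reflectance_cube: (rows, cols, bands) drone reflectance image
    absorption_model: callable, (n, bands) -> (n, absorption_bands)
    pigment_model:    callable, (n, bands + absorption_bands) -> (n, pigments)
    Returns an array of shape (rows, cols, pigments).
    """
    rows, cols, bands = reflectance_cube.shape
    pixels = reflectance_cube.reshape(-1, bands)       # one spectrum per row
    absorption = absorption_model(pixels)              # stage 1: absorption map
    stacked = np.concatenate([pixels, absorption], axis=1)
    pigments = pigment_model(stacked)                  # stage 2: pigment maps
    return pigments.reshape(rows, cols, -1)
```

Flattening the image to a list of spectra lets models trained on point samples be reused unchanged on the drone imagery.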

Performance evaluation
The performance evaluation of the 1D-CNN model was based on the coefficient of determination (R²), the root mean square error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE), and Willmott agreement index (WAI) values between the estimated and observed chlorophyll-a, phycocyanin, lutein, fucoxanthin, and zeaxanthin concentrations. The R², RMSE, MAE, and MAPE were calculated using the following equations:
$$R^2 = 1 - \frac{\sum_{k=1}^{n}\left(Y_{o,k} - Y_{e,k}\right)^2}{\sum_{k=1}^{n}\left(Y_{o,k} - \bar{Y}_o\right)^2}$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(Y_{e,k} - Y_{o,k}\right)^2}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{k=1}^{n}\left|Y_{e,k} - Y_{o,k}\right|$$
$$\mathrm{MAPE} = \frac{100}{n}\sum_{k=1}^{n}\left|\frac{Y_{e,k} - Y_{o,k}}{Y_{o,k}}\right|$$
where $Y_e$ is the estimated concentration of chlorophyll-a, phycocyanin, lutein, fucoxanthin, and zeaxanthin (mg m⁻³) or the estimated absorption coefficient (m⁻¹), $Y_o$ is the observed pigment concentration (mg m⁻³) or absorption coefficient (m⁻¹), $\bar{Y}_o$ is the average pigment observation (mg m⁻³), and $n$ is the total number of sampling points. The WAI was likewise computed from $Y_e$, $Y_o$, and $\bar{Y}_o$.
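The evaluation metrics can be computed with a short NumPy sketch. The R² here is taken as one minus the ratio of residual to total sum of squares, and the WAI follows the original index of agreement of Willmott (1981); both forms are assumptions, since the exact variants used in this study are not recoverable from the text.

```python
import numpy as np

def evaluate(y_est, y_obs):
    """Return R2, RMSE, MAE, MAPE (%), and WAI for paired estimates."""
    y_est = np.asarray(y_est, dtype=float)
    y_obs = np.asarray(y_obs, dtype=float)
    resid = y_est - y_obs
    mean_obs = y_obs.mean()
    sse = np.sum(resid ** 2)
    # Potential error term of the Willmott index (assumed original form).
    potential = np.sum((np.abs(y_est - mean_obs) + np.abs(y_obs - mean_obs)) ** 2)
    return {
        "R2": 1.0 - sse / np.sum((y_obs - mean_obs) ** 2),
        "RMSE": float(np.sqrt(np.mean(resid ** 2))),
        "MAE": float(np.mean(np.abs(resid))),
        "MAPE": float(100.0 * np.mean(np.abs(resid / y_obs))),  # needs y_obs != 0
        "WAI": 1.0 - sse / potential,
    }
```

Note that MAPE becomes very large when observed concentrations are close to zero, which is consistent with the large MAPE values that can arise for low-concentration pigments.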

Observed algal pigments
We identified variations in the concentrations of the major and accessory pigments of algae (Figure 4 and Table 2). In 2019, algal blooms were observed in August, whereas in 2020 blooms developed in September (Figure 4a and Table 3). Typically, blooming events can be observed in the summer when the water temperature is high. However, in 2020, intensive monsoon periods prolonged algal blooms until autumn. Nutrient sources were washed off by heavy rainfall, triggering relatively intensive blooming phenomena compared with the 2019 event (Figure 4). Overall, in these blooming periods, the maximum concentrations of chlorophyll-a and phycocyanin were 383.44 mg m⁻³ and 355.56 mg m⁻³, respectively (Table 2). In contrast, from September 2019 to March 2020 the dominant genera changed from cyanobacteria to diatoms, as indicated by the fucoxanthin concentration exceeding the zeaxanthin concentration (Figure 4b and Table 3).
The observed absorption coefficient contained spectral information in terms of the major and auxiliary pigments (Figure 5a). As the total algal concentration increased, the magnitude of the absorption coefficient spectra increased. In particular, the absorption coefficient peak between 660 nm and 680 nm represented chlorophyll-a, while the absorption peak between 610 nm and 630 nm indicated phycocyanin. The major pigment peaks increased as the pigment concentration increased. The accessory pigment peaks were located between 440 and 480 nm in the absorption coefficient spectra. The distinctive peak for each auxiliary pigment is difficult to recognize because of the low spectral signal from the low concentration of the pigments. However, the overall trend of the absorption spectra in these band ranges increased when the algal concentration increased. The unit of pigment concentrations is mg m⁻³.

Major and accessory pigment estimations of 1D-CNN model
The developed 1D-CNN model with absorption and reflectance spectral inputs was used to estimate the concentrations of chlorophyll-a, phycocyanin, lutein, fucoxanthin, and zeaxanthin. The overall model performance, including training and validation, is presented in Figures 6-7. For the major pigments, chlorophyll-a and phycocyanin both showed R² values of 0.97 and WAI of 0.85 and 0.84, respectively, as training results, whereas the validation results showed R² values of 0.87 and 0.71 and WAI of 0.74 and 0.76, respectively (Table 4). In addition, the accessory pigment estimations of the CNN model showed training accuracies with R² values higher than 0.93 and WAI higher than 0.79 for lutein, fucoxanthin, and zeaxanthin concentrations (Figure 7). The model validations for the minor pigments had R² values of 0.76, 0.78, and 0.74, and WAI of 0.71, 0.75, and 0.72, respectively (Table 4). Regarding the estimation error for major pigments, phycocyanin showed higher error values than chlorophyll-a in terms of RMSE, MAE, and MAPE, while, in the estimation of auxiliary pigments, fucoxanthin showed a relatively high error with respect to MAPE. Compared to the algal pigment estimation performance of the 1D-CNN, the bio-optical algorithms showed relatively low performance in terms of R² and RMSE values (Figures 8 and 9). With the bio-optical algorithms, the estimation of chlorophyll-a showed MAE, MAPE, and WAI over 12.71 mg m⁻³, 85.84%, and 0.17, respectively. Phycocyanin estimation had MAE, MAPE, and WAI over 24.11 mg m⁻³, 2708%, and 0.32, respectively. In addition, the band ratio algorithm provided MAE values over 0.50, 1.44, and 1.59 mg m⁻³, MAPE values over 294.44, 1263.46, and 511.48%, and WAI values over 0.50, 0.45, and 0.51 for lutein, fucoxanthin, and zeaxanthin, respectively.
The attention module of the 1D-CNN model emphasizes the importance of optical spectral bands for estimating major and minor pigment concentrations. The attention mechanism considers absorption and reflectance properties together using average-pooling and allows us to emphasize the absorption coefficient spectral signals from 440 nm to 700 nm (Figure 10a) and reflectance signals from 700 to 750 nm (Figure 10b) because of the max-pooling applications. The attention module assigned high weight values to the bands from 508 nm to 650 nm and 678 nm to 750 nm to estimate major and accessory pigments during model training (Figure 10). In contrast, the spectral bands between 400 and 436 nm were relatively suppressed compared to the other bands (Figure 10).

Spatial distribution of the algal pigments
In addition to drone-borne reflectance maps, we produced absorption coefficient maps because the map data were required to contain reflectance and absorption coefficient information in each pixel to generate spatial maps of biomass and accessory pigments. To do so, another 1D-CNN model was trained to estimate the absorption coefficient using the observed reflectance data. The validation results of the model are presented in Figure 5. The estimated absorption coefficient followed the spectral trends of the observed absorption coefficients (Figure 5a-b). Although the averaged spectral estimation had a magnitude difference, with RMSE, MAE, MAPE, and WAI values of 0.25 m⁻¹, 0.21 m⁻¹, 30.02%, and 0.71, respectively, the overall results showed a coefficient of determination of over 0.90 (Figure 5c-d).
After stacking the drone reflectance and absorption images, the trained 1D-CNN model for algal pigment estimation was applied to these image inputs to map the spatial distribution of chlorophyll-a, phycocyanin, lutein, fucoxanthin, and zeaxanthin. Figure 11 shows the concentration maps of the major pigments at specific locations in the CR. The estimated chlorophyll-a and phycocyanin levels followed the spatial patterns of the ground-truth RGB images generated from the drone hyperspectral imagery. The concentration levels of both pigments in the spots varied with the sampling period (Figure 11d, h, l, p, and t). In particular, the blooming spot on 14 September showed relatively high chlorophyll-a and phycocyanin concentrations compared to the other periods, similar to the trend observed in the ground truth data (Figures 4a and 11q-t). In addition to estimating the major pigments, the 1D-CNN model could generate spatial maps of lutein, fucoxanthin, and zeaxanthin. The specific spots and concentrations are shown in Figure 12. The spatial distribution of the accessory pigments followed the observed spatial trends (Figures 11 and 12). Based on the concentration levels, zeaxanthin made up a larger share of the pigment composition than lutein and fucoxanthin.
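The two-stage mapping procedure described above (reflectance to absorption coefficient, then stacked spectra to pigment concentrations, pixel by pixel) can be sketched as follows. The two model functions are hypothetical toy stand-ins for the trained 1D-CNN models, and the function names, pigment keys, and the tiny 2 x 2 x 3 cube are assumptions made only for illustration.

```python
def map_pigments(reflectance_cube, absorption_model, pigment_model):
    """Apply per-pixel spectral models to a cube indexed [row][col][band]."""
    n_rows = len(reflectance_cube)
    n_cols = len(reflectance_cube[0])
    pigment_maps = {}
    for r in range(n_rows):
        for c in range(n_cols):
            reflectance = reflectance_cube[r][c]
            # Stage 1: estimate the absorption spectrum from reflectance.
            absorption = absorption_model(reflectance)
            # Stage 2: stack both spectra and estimate the pigments.
            stacked = list(reflectance) + list(absorption)
            for name, value in pigment_model(stacked).items():
                if name not in pigment_maps:
                    pigment_maps[name] = [[None] * n_cols for _ in range(n_rows)]
                pigment_maps[name][r][c] = value
    return pigment_maps


def toy_absorption_model(reflectance):
    # Hypothetical placeholder, not a physical relationship.
    return [1.0 - value for value in reflectance]


def toy_pigment_model(stacked):
    # Hypothetical placeholder for the trained pigment estimator.
    half = len(stacked) // 2
    reflectance, absorption = stacked[:half], stacked[half:]
    return {
        "chlorophyll_a": 10.0 * sum(absorption) / len(absorption),
        "phycocyanin": 5.0 * max(reflectance),
    }


# Hypothetical 2 x 2 image with 3 spectral bands per pixel.
cube = [
    [[0.1, 0.2, 0.3], [0.2, 0.2, 0.2]],
    [[0.4, 0.1, 0.0], [0.0, 0.0, 0.0]],
]
maps = map_pigments(cube, toy_absorption_model, toy_pigment_model)
```

Because each pixel is processed independently from its own spectrum, a model trained on point samples can be applied to imagery of any size, which is the property a one-dimensional spectral model relies on for map generation.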

Observed algal pigment implication
High concentrations of phycocyanin and zeaxanthin were classified as harmful algal blooms, where cyanobacteria were the dominant genera (Figure 4). Ernst et al. (1992) classified cyanobacteria genera, such as Synechococcus, by analyzing the zeaxanthin pigment. Schagerl and Müller (2006) observed an increase in zeaxanthin concentrations in emerging cyanobacteria. Gardian et al. (2014) suggested that fucoxanthin is the major secondary light-harvesting pigment of diatoms, in addition to other minor pigments. The seasonal variation of lutein indicates that green algae were minor genera in the sampling periods, although the concentrations of diatoms increased when cyanobacteria decreased (Figure 4b). Kleinig (1969) used the lutein pigment as a taxonomic marker for investigating green algae. The temporal variation of the accessory pigments provided additional insights for analyzing the variation and succession of different algal species.

1D-CNN model performance with variable importance for algal pigment estimation
Compared to the estimation accuracy of chlorophyll-a, the 1D-CNN model exhibited relatively poor performance for phycocyanin estimation because of underestimation in a specific sampling period, resulting in high RMSE, MAE, and MAPE values (Figure 6c and Table 4). We expected the concentration level of the validation point to be covered by the concentration range of the model training. The high concentration variability of phycocyanin might lead the 1D-CNN model to fit the low-concentration variation more closely because low-concentration occasions outnumbered high-concentration occasions. This might result in the underestimation of high phycocyanin concentrations. However, the 1D-CNN model generally provided acceptable validation accuracy for phycocyanin estimation, with an R² value of 0.91 when the significantly underestimated point in Figure 6c was excluded. Previous studies have used optical spectra to apply semi-analytical models to estimate chlorophyll-a (Cannizzaro and Carder 2006; Liu et al. 2020) and phycocyanin (Li, Li, and Song 2015; Ogashawara and Li 2019), to apply machine learning models for estimating chlorophyll-a (Diouf and Seck 2019; Cao et al. 2020) and phycocyanin (Heddam, Sanikhani, and Kisi 2019; Smith et al. 2020), and to apply the 1D-CNN model for estimating chlorophyll-a (Maier, Keller, and Hinz 2021). However, the 1D-CNN model has rarely been studied for phycocyanin estimation. Thus, this study demonstrated the capability of the CNN model to estimate major pigments by applying optical spectra. In previous studies related to auxiliary pigment estimation, Chase et al. (2017) estimated the assemblage concentrations of four accessory pigments called photoprotective carotenoids using a bio-optical algorithm that resulted in an R² value of 0.77. Duppeti et al. (2017) applied partial least squares regression to estimate the total carotenoids of cultured algal samples using diffuse reflectance spectra, showing an R² value of 0.94. In contrast, Sun et al. (2021) estimated individual accessory pigments using absorption coefficients, leading to R² values of 0.65 for lutein estimation, 0.87 for fucoxanthin estimation, and 0.47 for zeaxanthin estimation. In this study, the bio-optical algorithms using reflectance bands were found to provide poor estimation performance for lutein, fucoxanthin, and zeaxanthin (Figure 9). This implies that the reflectance signal alone cannot represent the periodic variations of the auxiliary pigments; that is, the 1D-CNN model can potentially analyze the variations of the auxiliary pigments using both reflectance and absorption coefficient traits (Figure 7b, d, and e). The designed CNN model could be applied to simultaneously provide taxonomical information, such as green algae, diatoms, and cyanobacteria, to understand algal phenomena. However, the model performance still needs to be improved so that it better captures the variation at low concentrations, because the MAPE values were relatively high compared to the trends of RMSE and MAE. In particular, the high MAPE value of fucoxanthin was driven by overestimation in periods of substantially low concentration (Figure 7).
The absorption coefficient spectra contain information on algal pigment compositions and proportions that provide taxonomical information on algae (Bricaud et al. 2007). The reflectance signal that interacts with algal cells can help trace the optical proxies of algae (Fragoso et al. 2021). During the convolutional feature extraction, the weights of the convolutional layers were calculated based on the relationship between the spectral inputs and the variation in the algal pigment composition. The weighted convolutional features from this relationship contained implicit information on the major and minor pigments. If the relationship is insignificant because of the difficulty in capturing the trace signal of a low pigment concentration, the calculated weights of the 1D-CNN model cannot follow the variation of the pigments. In addition to the convolutional feature extraction, this study refined the trained weights using a combination of additional convolutional feature extraction and a sigmoid function, amplifying the valuable spectral bands of the absorption coefficient and reflectance spectra and suppressing the unimportant bands. In particular, the trained weights covered the specific absorption bands of chlorophyll-a (660-680 nm), lutein (445-478 nm), fucoxanthin (446-468 nm), zeaxanthin (450-481 nm) (Mantoura and Wright 1997), and phycocyanin (615-620 nm) (Patel et al. 2005). The high weights above 700 nm might be caused by the high concentration of algae, which can induce a dominant effect of optical scattering driven by a large number of phytoplankton cells (Gitelson et al. 2000). This might be due to the interference effect of organic matter, which can cause a low relationship between optical bands and algal concentrations (Li, Li, and Song 2015). 
Thus, the attention module was proven to help in model training by refining the weights and providing an implicit description of the important spectral bands for estimating the five different algal pigments.
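The band-weighting idea discussed above (pooling, an additional convolution, and a sigmoid that amplifies or suppresses bands) can be sketched in a few lines. This is a generic band-wise attention sketch and not the authors' exact module: the pooling across feature channels, the kernel values, the bias, and all function names are assumptions chosen for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation; expects an odd-length kernel."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def band_attention(features, avg_kernel, max_kernel, bias=0.0):
    """Weight each spectral band of `features`, indexed [channel][band]."""
    n_channels = len(features)
    n_bands = len(features[0])
    # Average- and max-pooling across channels give two band descriptors.
    avg_pool = [sum(ch[b] for ch in features) / n_channels for b in range(n_bands)]
    max_pool = [max(ch[b] for ch in features) for b in range(n_bands)]
    # An additional convolution over each descriptor, then a sigmoid,
    # yields one weight in (0, 1) per band.
    avg_resp = conv1d_same(avg_pool, avg_kernel)
    max_resp = conv1d_same(max_pool, max_kernel)
    weights = [sigmoid(a + m + bias) for a, m in zip(avg_resp, max_resp)]
    # Bands with large weights are kept; bands with small weights shrink.
    refined = [[w * v for w, v in zip(weights, ch)] for ch in features]
    return weights, refined


# Hypothetical feature maps: 2 channels x 5 bands, strong signal at band 2.
features = [
    [0.1, 0.2, 2.0, 0.3, 0.1],
    [0.0, 0.1, 1.5, 0.2, 0.0],
]
kernel = [0.25, 0.5, 0.25]  # assumed smoothing kernel, not a learned one
weights, refined = band_attention(features, kernel, kernel, bias=-1.0)
```

In a trained network the kernels and bias would be learned, so the per-band weights could be read out afterwards as an indication of which spectral regions the model relied on.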

Pigment map generation of 1D-CNN model
Relative differences between observed and estimated absorption spectra have also been reported in previous studies. Ioannou et al. (2011) used a neural network model with a reflectance band at 442 nm to estimate the absorption coefficient at 442 nm, obtaining an RMSE value of 1.13 m⁻¹ despite an R² value of 0.99. In addition, Pahlevan et al. (2021) applied a machine-learning model to estimate the absorption coefficient and obtained an estimation error between 20% and 30% in the band range from 400 nm to 690 nm; however, they successfully provided an algal concentration map using the estimated absorption coefficient. Thus, we applied the trained 1D-CNN model to drone reflectance images to generate absorption coefficient images. Pyo et al. (2019) demonstrated the capability of the 2D-CNN model to generate chlorophyll-a and phycocyanin maps using complex input images. However, this study showed that the 1D-CNN model could also provide the distribution of major and auxiliary algal pigments with simple spectral inputs. The trained 1D-CNN model provided acceptable accuracy, as shown by the spatial correspondence between the RGB images and the generated pigment distributions. In applications of data-driven models to remote sensing imagery, the accuracy of the spatial distribution has been assessed by visual verification between the actual and estimated imagery (Cao et al. 2020; Fan et al. 2021; Peterson, Sagan, and Sloan 2020) or by relying on the well-trained model without visual checking (Kim et al. 2014; Li et al. 2021; Saberioon et al. 2020). In particular, the high zeaxanthin concentration implied that cyanobacteria were dominant in these spots during the imaging period; that is, the concentration of zeaxanthin on 14 September indicated that an intensive cyanobacterial bloom occurred at the spot (Figure 12s-t). 
Therefore, the 1D-CNN model proposed here could be used to understand algal phenomena explicitly, in terms of biomass by estimating the major pigments, and implicitly, in terms of taxonomical information by analyzing the auxiliary pigments. However, further research is still needed to resolve the swath boundary issue of the drone-borne sensing (Figures 11e-g and 12e-g) because these borders could distort the spatial distribution and concentration levels of the algal pigments owing to the distinct difference in optical information between the observed regions. He et al. (2019) described the distortion of remote sensing images at the swath border, which is driven by limited sensor hardware and constrained swath width coverage.

Quantitative and qualitative analysis of algae using deep learning
This study showed that the deep learning model could analyze algal phenomena quantitatively and qualitatively by detecting the major and accessory pigments. Accurate estimation of algal biomass is critical for managing water resources. Moreover, estimation of minor pigments allowed us to obtain additional insights into algal blooms. In addition to taxonomical classification, further qualitative analyses, such as those of photophysiology and predation, should be performed. Diadinoxanthin and diatoxanthin are typical photoprotective secondary pigments that implicitly explain the variation in the light conditions of algal phenomena (Polimene et al. 2014). Pheophytin concentration increases as zooplankton grazing increases, potentially indicating algal migration for predator avoidance (Dini and Carpenter 1992). Thus, further studies are required to examine the performance of synoptic sensor imagery for estimating various auxiliary pigments. Although this study only preliminarily demonstrated the potential of a deep learning model for estimating the major and auxiliary algal pigments in a site-specific water body, the trained 1D-CNN model can be directly applied to other water bodies if their optical and pigment variabilities are similar to those of Daecheong Lake. Furthermore, when the spatial and temporal variations of the optical and pigment properties differ, the trained model can still be applied to other water bodies by using a fine-tuning approach. The use of a pre-trained model with fine-tuning can decrease the model building and training time and result in a relatively low generalization error compared with building and training the deep learning architecture from the beginning (Goodfellow, Bengio, and Courville 2017). However, more observation data from various water bodies still need to be collected to solidify the generalization performance of the model for estimating major and auxiliary pigments. 
Together, these efforts can lead to a sophisticated quantitative and qualitative understanding of algal blooms in inland waters. Additionally, a well-trained deep learning model can overcome a drawback of HPLC analysis, namely the high cost of collecting and analyzing samples (Graban et al. 2020). This novel approach can provide algal biomass and characteristics without consuming samples (Duppeti et al. 2017).

Figure 11. Spatial distribution maps and concentration levels of major pigments generated by the 1D-CNN model. The ground truth RGB image, chlorophyll-a map, phycocyanin map, and concentration level distributions observed on (a-d) 11 June 2020, (e-h) 6 July 2020, (i-l) 20 August 2020, (m-p) 4 September 2020, and (q-t) 14 September 2020.
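The fine-tuning approach mentioned in this section can be illustrated with a deliberately small sketch. It assumes one common variant in which the pre-trained feature extractor is kept frozen and only the final regression layer is updated by gradient descent on a few samples from a new water body. The feature function, the starting weights, and the data are hypothetical and do not come from the study.

```python
def frozen_features(spectrum):
    """Stand-in for pre-trained convolutional layers that stay fixed."""
    return [sum(spectrum) / len(spectrum), max(spectrum)]


def predict(head, spectrum):
    feats = frozen_features(spectrum)
    return head["bias"] + sum(w * x for w, x in zip(head["weights"], feats))


def mean_squared_error(head, spectra, targets):
    errors = [predict(head, s) - t for s, t in zip(spectra, targets)]
    return sum(err ** 2 for err in errors) / len(errors)


def fine_tune_head(head, spectra, targets, lr=0.05, epochs=200):
    """Update only the regression head, starting from pre-trained values."""
    weights, bias = list(head["weights"]), head["bias"]
    feats = [frozen_features(s) for s in spectra]
    n = len(feats)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(weights), 0.0
        for f, t in zip(feats, targets):
            err = bias + sum(w * x for w, x in zip(weights, f)) - t
            for i, x in enumerate(f):
                grad_w[i] += 2.0 * err * x / n
            grad_b += 2.0 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return {"weights": weights, "bias": bias}


# Hypothetical head from the source lake and samples from a new lake.
pretrained_head = {"weights": [10.0, 5.0], "bias": 0.0}
new_spectra = [[0.1, 0.2, 0.3], [0.2, 0.4, 0.6], [0.3, 0.5, 0.9], [0.05, 0.1, 0.1]]
new_targets = [5.0, 11.0, 16.0, 2.0]  # pigment concentrations, mg m^-3

error_before = mean_squared_error(pretrained_head, new_spectra, new_targets)
tuned_head = fine_tune_head(pretrained_head, new_spectra, new_targets)
error_after = mean_squared_error(tuned_head, new_spectra, new_targets)
```

Starting from pre-trained weights means that only a small number of parameters has to be adapted, which is why fine-tuning can reduce training time when few samples are available from the new site.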

Conclusion
In this study, we evaluated the developed 1D-CNN model for estimating the major algal pigments (i.e. chlorophyll-a and phycocyanin) and accessory pigments (i.e. lutein, fucoxanthin, and zeaxanthin). To do so, field and drone-borne monitoring was conducted to retrieve reflectance data of CR, and experimental analysis was performed to measure the pigment concentrations and absorption coefficients of algae. Then, a 1D-CNN model was designed to simultaneously estimate the major and minor pigments using reflectance and absorption spectra. The trends of the five algal pigments from the model agreed with the observed trends.
This result was also reflected in the spatial distribution of pigments in CR. Thus, the 1D-CNN model could be used to analyze algal phenomena quantitatively and qualitatively. Further improvement of the model performance is recommended to achieve a more reliable 1D-CNN model. In addition, future research should investigate the extendibility of the deep learning model to different water bodies and to various algal secondary pigments to obtain more insights into algal phenomena. However, the trace concentration levels and similar spectral properties of the auxiliary pigments remain a challenge that reduced the estimation performance.