Biophysical parameters mapping within the SPOT-5 Take 5 initiative

ABSTRACT Leaf area index (LAI) and Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) are important land biophysical parameters that enable monitoring and quantitative assessment of vegetation state. Remote sensing data from space can be used to estimate these parameters at regional and national scales. Satellite imagery with high temporal resolution is usually required to capture the main parameters of crop growth. In this paper, we assess the efficiency of using satellite images at high spatial and temporal resolution, acquired within the SPOT-5 Take 5 initiative, for mapping biophysical parameters.


Introduction
With the launch of a wide range of modern satellites, the role of satellite imagery in environmental monitoring tasks (Kussul, Shelestov, & Skakun, 2011) has increased significantly. Assessing and quantifying biophysical parameters of vegetation cover are extremely important problems when monitoring land cover and associated changes, identifying vegetation stress, and assessing crop production. Parameters such as leaf area index (LAI) and Fraction of Absorbed Photosynthetically Active Radiation (FAPAR) have been included in the list of terrestrial essential climate variables. They can be used to quantify crop state within agriculture monitoring tasks under the Global Agriculture Monitoring (GLAM) initiative (Becker-Reshef et al., 2010), and have already proved to be efficient for crop yield prediction (Kogan et al., 2013a, 2013b; Kolotii et al., 2015; Duveiller et al., 2013) and production estimation (Gallego et al., 2014; Kussul et al., 2012). In situ estimation of biophysical parameters is a time- and resource-consuming task (Weiss, Baret, Smith, Jonckheere, & Coppin, 2004), even with the aid of automation systems (Qu, Zhu, Han, Wang, & Ma, 2014). Remote sensing data from space are the only source of information that enables regular and consistent estimation of biophysical parameters at regional, national, and global scales (Camacho, Cernicharo, Lacaze, Baret, & Weiss, 2013; Kolotii et al., 2015; Morisette et al., 2006; Shelestov et al., 2015). Both optical and synthetic aperture radar imagery can be used to extract biophysical values. Currently, there are several operationally available products at coarse (1 km) spatial resolution: AVHRR (1981-present), SPOT-VEGETATION (1998-present), and MODIS (2000-present).
With the availability of high-resolution images acquired by Landsat-8, Sentinel-2, and SPOT series satellites, it is becoming important to provide corresponding products at high spatial resolution that will be consistent with those at coarse resolution.
In December of 2014, to prepare for the operational use of Sentinel-2 data, the European Space Agency launched the SPOT-5 Take 5 experiment (https://spot-take5.org/). In this experiment, SPOT-5 was used as a simulator of Sentinel-2 image time series. The experiment was aimed at developing new methods, services, and products ready for operational use with Sentinel-2 data. For 5 months in 2015 (08/04/2015-31/08/2015), SPOT-5 acquired data for 180 sites around the world. The Ukrainian Joint Experiment for Crop Assessment and Monitoring (JECAM) test site participated in this experiment.
Derivation of biophysical parameters from high-resolution satellite imagery has been an area of active research for the past several decades (Wiegand et al., 1992; Chen et al., 2002; Ganguly et al., 2012; Li et al., 2015; Walthall et al., 2004). In general, existing approaches can be divided into physically based and empirical ones. Physical models simulate the spectral response based on input biophysical parameter values, viewing geometry, land cover, and wavelength (Ganguly et al., 2012; Li et al., 2015). In order to derive biophysical values from remote sensing imagery, an inversion of the physical model is performed. For this, different methodologies have been applied, for example, look-up tables (Ganguly et al., 2012) and machine-learning algorithms (Walthall et al., 2004; Verrelst et al., 2012; Verrelst, Rivera, Moreno, & Camps-Valls, 2013; Fang & Liang, 2003). While physically based models can usually be applied at global or continental scale, they are computationally complex and require multiple additional datasets.
On the other hand, empirical models connect biophysical parameters with selected predictors (features), for example, surface reflectance (SR) or vegetation indices (VIs) derived from remote sensing data (Verrelst et al., 2012; Fensholt, Sandholt, & Rasmussen, 2004; Turner, Cohen, Kennedy, Fassnacht, & Briggs, 1999; Ali et al., 2015). Empirical models are quite easy to implement, data driven, and site specific, and are usually implemented and evaluated within a single vegetation season. One of the main problems associated with these models is the selection of the most informative features. Adding all possible features will increase the complexity of the model but will not necessarily make it more accurate and robust. In fact, adding irrelevant features might cause the model to overfit, so that it lacks robustness even at regional scale. Previous studies have addressed this issue through, for example, a standard statistical assessment of significance level in regression models, uncertainty analysis in Gaussian process kernel models (Verrelst et al., 2013), or permutation of input features and feature engineering (Fang & Liang, 2003).
In this paper, we propose a new approach for selecting features to build regional empirical models for biophysical parameter retrieval from remote sensing images using a machine-learning (ML) method, random forest (RF), and explore the efficiency of using satellite data at high spatial and temporal resolution for crop-specific biophysical parameter mapping. In particular, an RF algorithm (Breiman, 2001) is trained on all features, which in this study included top-of-atmosphere (TOA) reflectance and SR, as well as VIs. Random permutations of these input features are performed, and after such permutations, the performance of the RF is statistically evaluated. This forms a basis for estimating "feature importance". The advantage of this approach compared to other techniques is that it allows one to automatically select different permutations of features based on information gain (Strobl, Boulesteix, Kneib, Augustin, & Zeileis, 2008). In other words, features that are more informative will have a higher importance level.
Linear and exponential regression models that connect satellite-derived features with biophysical parameters are built using the most informative features. Performance of these models is evaluated in terms of the coefficient of determination (R-square) and the root mean square error (RMSE) between the observed and estimated variables. We also discuss how the derived feature importance correlates with these performance metrics.
The study is performed for the JECAM test site in Ukraine using a 3-year set of ground observations (2013-2015) and satellite images acquired by Landsat-8, and the results are cross-compared with those obtained using SPOT-5 imagery for 2015 (via participation in the SPOT-5 Take 5 experiment). The study particularly aims at addressing the following questions: Is there any variability of feature importance in time (inter- and intra-season)? Is the feature importance level different for different satellite sensors and different land cover types? Does feature importance depend on the number of ground observations? How does the derived feature importance correlate with the performance of empirical regression models? How efficiently can 10-m satellite data be used for crop-specific biophysical parameter mapping?

Study area
The study was performed on the JECAM test site in Ukraine, which was established in 2011 and covers the area of Kyiv Oblast. For biophysical parameter estimation from satellite remote sensing imagery, a subsite near Pshenichne village (center location: latitude +50.07997, longitude +30.23081) in Vasylkiv district was selected (Figure 1). This subregion represents an agriculture-intensive area with the following major crop types: maize, wheat, and soybeans. The climate in the region is humid continental with approximately 709 mm of annual precipitation. The landscape is mostly flat, with slopes ranging from 0% to 2%. The crop calendar is September-July for winter crops and April-October for spring and summer crops.

Data description
Ground data description
Ground measurements were collected for a 3 km-by-3 km segment in 2013-2014 (Figure 1(c)), and for the whole Vasylkiv district in 2015 to maximize the number of samples for particular crops (Figure 2). For collecting ground measurements to support satellite observations, we followed the Validation of Land European Remote sensing Instruments (VALERI) protocol, under which the measurements were made for elementary sampling units (ESUs) (Morisette et al., 2006). The size of an ESU was 20 m × 20 m to match the spatial resolution of the satellite imagery. A pseudo-regular sampling was used within each ESU, with 12-15 samples per ESU (Figure 3). A nondestructive method utilizing digital hemispherical photos (DHPs) was used for collecting samples of biophysical parameters inside each ESU. DHPs were acquired with a NIKON D70 camera. Hemispherical photos allow the calculation of LAI and FCOVER by measuring the gap fraction through an extreme wide-angle camera lens (i.e. 180°) (Weiss et al., 2004). Such a lens produces circular images that record the size, shape, and location of gaps, either looking upward from within a canopy or looking downward from above the canopy. Hemispherical images acquired during the field campaign were processed with the CAN-EYE software (http://www.avignon.inra.fr/can_eye) to derive LAI, FAPAR, and FCOVER. CAN-EYE is based on an RGB color classification of the image to discriminate vegetation elements from the background (i.e. gaps). This approach allows exploiting downward-looking photographs for short canopies (background = soil) as well as upward-looking photographs for tall canopies (background = sky).
CAN-EYE software simultaneously processes up to N = 16 images acquired over the same ESU. Note that the N images were acquired under similar illumination conditions to limit the variation of color dynamics between images.
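As a rough illustration of how a measured gap fraction relates to LAI, the sketch below applies the textbook Poisson (Beer-Lambert) canopy model at a view zenith angle of 57.5°, where the leaf projection function is approximately 0.5 regardless of leaf angle distribution. This is an assumption made for illustration only: the source does not describe the inversion that CAN-EYE performs internally, and the function name `effective_lai_57` is hypothetical.

```python
import math

def effective_lai_57(gap_fraction):
    """Effective LAI from the gap fraction measured near 57.5 deg zenith.

    Assumes the Poisson model P(theta) = exp(-G * LAI / cos(theta))
    with G ~= 0.5 at theta = 57.5 deg (illustrative assumption).
    """
    if not 0.0 < gap_fraction <= 1.0:
        raise ValueError("gap fraction must be in (0, 1]")
    theta = math.radians(57.5)
    return -math.log(gap_fraction) * math.cos(theta) / 0.5

# Hypothetical gap fraction: 20% of the ring at 57.5 deg is classified as gaps.
lai_example = effective_lai_57(0.2)
print(round(lai_example, 2))
```

A fully open view (gap fraction of 1) gives an LAI of zero, and the estimate grows logarithmically as the canopy closes, which is why small classification errors matter most for dense canopies.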
Ground measurements were collected from 2013 to 2015. Several field campaigns were conducted each year in order to estimate biophysical parameter values at different stages of crop growth. Tables 1 and 2 provide details on the dates of ground measurements and the number of ESUs collected during each field campaign.

Satellite data description
Landsat-8 and SPOT-5 imagery was acquired during 2013-2015 and 2015, respectively. Landsat-8 images at 30-m spatial resolution were obtained through the US Geological Survey's (USGS) EarthExplorer service (http://earthexplorer.usgs.gov), and further converted to TOA reflectance using the conversion coefficients in the Landsat-8 metadata file (Roy et al., 2014).
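The conversion to TOA reflectance can be sketched as follows, assuming the standard USGS rescaling for Landsat-8 Level-1 products: a band-specific multiplicative and additive coefficient (the `REFLECTANCE_MULT_BAND_x` and `REFLECTANCE_ADD_BAND_x` metadata fields) followed by a correction for the sun elevation angle (`SUN_ELEVATION`). The source does not list the exact steps used, and the numeric values below are hypothetical.

```python
import math

def toa_reflectance(dn, mult, add, sun_elevation_deg):
    """Convert a Landsat-8 digital number (DN) to TOA reflectance.

    mult, add: band-specific rescaling coefficients from the metadata file.
    sun_elevation_deg: scene-center sun elevation angle in degrees.
    """
    rho_uncorrected = mult * dn + add  # reflectance without sun-angle correction
    return rho_uncorrected / math.sin(math.radians(sun_elevation_deg))

# Hypothetical DN and coefficients, for illustration only.
rho = toa_reflectance(dn=10000, mult=2.0e-5, add=-0.1, sun_elevation_deg=30.0)
print(round(rho, 3))
```

The scene-center sun elevation is a simplification; a per-pixel solar angle would be slightly more accurate over large scenes.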
Clouded and shadowed pixels in the Landsat-8 imagery were detected using the Fmask algorithm (Zhu et al., 2012), and were excluded from the analysis. The normalized difference vegetation index (NDVI) and the ratio between the near-infrared (NIR) and red spectral bands were estimated and, along with Landsat-8 spectral bands 2-7, were used as features to correlate with biophysical parameters collected during ground surveys. SPOT-5 images at 10-m spatial resolution were acquired within the Take-5 initiative (Hagolle et al., 2015). We used an L2A reflectance product that provides satellite images with atmospheric and terrain corrections as well as the corresponding cloud and shadow masks. NDVI and the ratio between NIR and red SR values were estimated and, along with the SPOT-5 spectral bands (green, red, NIR, and shortwave infrared), were used as features to correlate with biophysical parameters collected during ground surveys. Dates of Landsat-8 and SPOT-5 image acquisitions are presented in Tables 1 and 2, respectively.
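The two derived features can be sketched as below, using the usual definitions NDVI = (NIR - red) / (NIR + red) and the simple ratio NIR / red. The function names, the dictionary layout, and the `is_clear` flag (standing in for the cloud and shadow masks) are illustrative assumptions, not part of the original processing chain.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    return (nir - red) / total if total != 0 else float("nan")

def nir_red_ratio(nir, red):
    """Simple ratio between NIR and red reflectance."""
    return nir / red if red != 0 else float("nan")

def pixel_features(bands, is_clear):
    """Return the feature dictionary for one pixel, or None if it is masked.

    bands: reflectance per band name; must contain 'nir' and 'red'.
    is_clear: False for pixels flagged as cloud or shadow.
    """
    if not is_clear:
        return None
    features = dict(bands)
    features["ndvi"] = ndvi(bands["nir"], bands["red"])
    features["nir_red"] = nir_red_ratio(bands["nir"], bands["red"])
    return features

# Hypothetical reflectance values for a dense green canopy.
example = pixel_features({"green": 0.08, "red": 0.05, "nir": 0.45, "swir": 0.20}, True)
print(example["ndvi"], example["nir_red"])
```

Because NDVI saturates towards 1 while the simple ratio keeps growing, the two features carry the same band information on different scales.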

Methodology
Random forests for feature selection
RF is a machine-learning algorithm that represents an ensemble of decision trees (DTs) (Breiman, 2001). A DT classifier or regression model is built from a set of data using the concept of information entropy. At each node of the tree, the attribute of the data that most effectively splits the set of samples into subsets enriched in one class or the other is selected. The criterion is the normalized information gain that results from choosing an attribute for splitting the data. The attribute with the highest normalized information gain is chosen to make the decision. The algorithm then recurses on the smaller subsets. One disadvantage of the DT classifier is its considerable sensitivity to the input dataset, so that a small change to the training data can result in a very different set of subsets (Bishop, 2006). In order to overcome the disadvantages of a single DT, an ensemble of DTs is used to form an RF. Each DT represents an independent expert (or weak classifier) in the RF and is trained on a different input dataset generated through a bagging procedure.
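The split criterion described above can be sketched for a single numeric attribute and a candidate threshold. The example interprets "normalized information gain" as the gain ratio (information gain divided by the entropy of the split itself), which is an assumption; regression trees, as used later in this study, typically rely on variance reduction instead. All names and data are illustrative.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels, threshold):
    """Information gain of the split 'value <= threshold', normalized by split entropy."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    gain = entropy(labels) - w_left * entropy(left) - w_right * entropy(right)
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return gain / split_info

# Hypothetical NDVI values labelled by canopy density.
ndvi_values = [0.2, 0.3, 0.7, 0.8]
classes = ["sparse", "sparse", "dense", "dense"]
best_threshold = max([0.25, 0.5, 0.75], key=lambda t: gain_ratio(ndvi_values, classes, t))
print(best_threshold, gain_ratio(ndvi_values, classes, best_threshold))
```

A tree-building routine would repeat this search over all attributes at each node and recurse on the two resulting subsets.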
RF can be used not only for building classification and regression models but also for assessing variable importance. Compared to other techniques, RFs are fast and computationally efficient, and can deal with situations when the number of samples is comparable to, or of the same order as, the number of predictors (features). Random permutations of the input features are performed for an RF, and the resulting loss of accuracy is estimated for each feature. In our case, RF is used as a regression model, and its performance is evaluated in terms of RMSE. We use a Z-score (the ratio between the average loss and its standard deviation) as a feature importance metric, as implemented in the Boruta wrapper (Kursa et al., 2010) around the R package randomForest (Liaw et al., 2002). The particular advantage of the Boruta algorithm is that it has an iterative nature, and accounts for the fluctuations of the mean accuracy loss among trees in the forest (Kursa et al., 2010).
Spectral reflectance values, NDVI, and the ratio between the NIR and red bands from Landsat-8 and SPOT-5 were used as features to estimate biophysical parameter values, namely LAI and FAPAR, with RF. The Boruta algorithm was run to quantitatively estimate the importance of each feature. These runs were performed separately for maize, winter wheat, soybeans, and all crop types.
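The idea of a permutation-based Z-score can be sketched as follows. This is a simplified stand-in, not the wrapper algorithm used in the study: it takes any fitted prediction function instead of an RF, and it computes the mean and standard deviation of the RMSE increase over repeated shuffles rather than over the trees of a forest. The function names and the toy data are assumptions.

```python
import random
import statistics

def rmse(y_true, y_pred):
    """Root mean square error between two equally long sequences."""
    return (sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def permutation_z_scores(predict, X, y, n_repeats=30, seed=0):
    """Z-score of the RMSE increase caused by shuffling each feature column.

    predict: function mapping one feature row (list) to a predicted value.
    X: list of feature rows (lists); y: list of observed values.
    """
    rng = random.Random(seed)
    baseline = rmse(y, [predict(row) for row in X])
    scores = []
    for j in range(len(X[0])):
        losses = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            losses.append(rmse(y, [predict(r) for r in X_perm]) - baseline)
        spread = statistics.stdev(losses)
        scores.append(statistics.mean(losses) / spread if spread > 0 else 0.0)
    return scores

# Toy data: the target depends on feature 0 only; feature 1 is irrelevant.
X = [[i / 10, (i * 7 % 10) / 10] for i in range(10)]
y = [2 * row[0] for row in X]
scores = permutation_z_scores(lambda row: 2 * row[0], X, y)
print(scores)
```

Shuffling the irrelevant feature leaves the error unchanged, so its score stays at zero, while shuffling the informative feature raises the RMSE consistently and yields a positive score.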

Regression models
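The most informative satellite-derived features and their combinations were used to build regression models that connected these features with biophysical parameters, i.e. LAI and FAPAR. The following models were considered: linear, exponential, and RF. The performance of these models was evaluated in terms of two metrics:

$$R^2 = 1 - \frac{\sum_i \left(y_{i,\mathrm{obs}} - y_{i,\mathrm{mod}}\right)^2}{\sum_i \left(y_{i,\mathrm{obs}} - \bar{y}_{\mathrm{obs}}\right)^2}, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_i \left(y_{i,\mathrm{obs}} - y_{i,\mathrm{mod}}\right)^2} \tag{1}$$

where $y_{i,\mathrm{obs}}$ and $y_{i,\mathrm{mod}}$ are the observed and modeled biophysical variables, respectively, and $\bar{y}_{\mathrm{obs}} = \frac{1}{n}\sum_i y_{i,\mathrm{obs}}$ is the averaged observed value.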
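The two metrics can be computed as in the sketch below, assuming the standard definitions of the coefficient of determination (one minus the ratio of residual to total sum of squares) and of RMSE. The sample values are hypothetical.

```python
import math

def r_squared(obs, mod):
    """Coefficient of determination between observed and modeled values."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - m) ** 2 for o, m in zip(obs, mod))  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)        # total sum of squares
    return 1.0 - ss_res / ss_tot

def rmse(obs, mod):
    """Root mean square error between observed and modeled values."""
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(obs, mod)) / len(obs))

# Hypothetical observed and modeled LAI values.
lai_obs = [1.0, 2.0, 3.0, 4.0]
lai_mod = [1.5, 2.0, 3.0, 3.5]
print(r_squared(lai_obs, lai_mod), rmse(lai_obs, lai_mod))
```

RMSE is expressed in the units of the variable itself, which makes it easier to compare across crops than R-square when the range of observed values differs.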
Performance metrics are estimated using a leave-one-out cross-validation concept. The model is iteratively built for ESUs from all but one campaign, which is reserved for validation. Performance metrics are estimated on the validation set. The procedure is repeated for all campaigns, and the performance metrics are averaged over the validation sets.

Results
[...] (Lavreniuk et al., 2015; Kussul et al., 2015). We also varied the number of training samples for each case in order to investigate the sensitivity of the feature importance metric to the number of input samples. The results obtained for Landsat-8 and SPOT-5 are presented in Tables 3-6. The most important features for both Landsat-8 and SPOT-5 for deriving LAI and FAPAR were the NIR spectral band and two of its combinations with the red band: NDVI and the NIR/RED ratio. Other spectral bands such as blue, green, and SWIR were less informative. This means that the variance of the output variables (LAI, FAPAR) did not depend on the values of these features.
These three most informative features (NIR, NDVI, NIR/RED) were used for building regression models to estimate LAI and FAPAR. The obtained performance metrics, namely R² and RMSE, are shown in Tables 7 and 8. For LAI, the best model in terms of RMSE was the exponential one with NDVI as the variable. The RMSE ranged from 0.46 to 0.61 for Landsat-8, and from 0.4 to 0.51 for SPOT-5, depending on the crop type. Linear and RF models yielded RMSE values 1.5-2 times worse than the exponential model. Also, the RF-derived importance was not always the highest for NDVI. For example, the NIR/RED ratio was the most important feature for Landsat-8 for deriving LAI for maize. However, the RMSE for the exponential model with NDVI was lower than the RMSE for the exponential model with NIR/RED (0.46 versus 0.63). This suggests that RF was not able to capture the exponential dependency among parameters when calculating feature importance, and that these potential dependencies should be constructed manually from a priori considerations.
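The comparison between linear and exponential models under campaign-wise cross-validation can be sketched as follows. Two assumptions are made for illustration: the exponential model LAI = a·exp(b·NDVI) is fitted by ordinary least squares on ln(LAI), whereas the source does not state the fitting method, and the data are synthetic, generated from an exact exponential law so the outcome is known in advance.

```python
import math

def fit_linear(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sum((xi - mx) ** 2 for xi in x)
    return my - b * mx, b

def fit_exponential(x, y):
    """Fit y = a * exp(b * x) via a linear fit on ln(y); requires y > 0."""
    ln_a, b = fit_linear(x, [math.log(yi) for yi in y])
    return math.exp(ln_a), b

def predict_linear(params, x):
    return params[0] + params[1] * x

def predict_exponential(params, x):
    return params[0] * math.exp(params[1] * x)

def rmse(obs, mod):
    return math.sqrt(sum((o - m) ** 2 for o, m in zip(obs, mod)) / len(obs))

def leave_one_campaign_out(samples, fit, predict):
    """samples: list of (campaign_id, x, y). Returns RMSE averaged over held-out campaigns."""
    errors = []
    for held_out in sorted({c for c, _, _ in samples}):
        train = [(x, y) for c, x, y in samples if c != held_out]
        test = [(x, y) for c, x, y in samples if c == held_out]
        params = fit([x for x, _ in train], [y for _, y in train])
        errors.append(rmse([y for _, y in test], [predict(params, x) for x, _ in test]))
    return sum(errors) / len(errors)

# Synthetic campaigns (hypothetical): LAI follows 0.1 * exp(4 * NDVI) exactly.
ndvi_by_campaign = {"A": [0.2, 0.3, 0.4], "B": [0.5, 0.6], "C": [0.7, 0.8, 0.9]}
samples = [(c, v, 0.1 * math.exp(4 * v)) for c, vals in ndvi_by_campaign.items() for v in vals]

exp_rmse = leave_one_campaign_out(samples, fit_exponential, predict_exponential)
lin_rmse = leave_one_campaign_out(samples, fit_linear, predict_linear)
print(exp_rmse, lin_rmse)
```

Fitting in log space weights relative rather than absolute errors, so a direct nonlinear least-squares fit could give somewhat different coefficients on noisy field data.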
In the case of FAPAR, linear models (with NDVI and NIR) yielded the same performance and outperformed the exponential one. The RMSE ranged from 0.1 to 0.13 for Landsat-8, and from 0.07 to 0.1 for SPOT-5.
In almost all cases, the parameter with the highest importance yielded the minimum RMSE in the linear regression models. This was not the case for the exponential model. The reason is that RF did not capture the potential exponential dependency, and the exponent of the features was not tested for importance within the RF framework.
The developed models were used to estimate the biophysical parameters, and the estimates were compared to the observed values.

Discussions and conclusions
In this paper, we quantitatively assessed the importance of features derived from satellite remote sensing images for building regional empirical models for biophysical parameter retrieval with the use of machine learning. The approach was based on the RF algorithm, which randomly permutes features to statistically estimate their influence on the resulting error, and was applied to the Landsat-8 and SPOT-5 imagery acquired for the JECAM test site in Ukraine.
The most important features for estimating biophysical parameters from satellite imagery all involved the NIR spectral band: the NIR band itself, the vegetation index NDVI, and the ratio between the NIR and red bands. Other spectral bands, such as blue, green, and SWIR for Landsat-8, and green and SWIR for SPOT-5, were not important. In other words, these features provided little information (in terms of entropy) for the estimation of biophysical parameters (LAI and FAPAR). These results suggest that including all spectral bands in the models to estimate biophysical parameters from satellite imagery will increase their complexity and the likelihood of overfitting but will not decrease the estimation error. These results were observed under varying input conditions, in particular:
• LAI and FAPAR. NIR, NDVI, and NIR/RED were equally important when estimating different biophysical parameters, namely LAI and FAPAR. This suggests that the same set of features can be used for extracting different biophysical parameters from satellite imagery, and agrees with results from Dahms, Seissiger, Conrad, and Borg (2016) and Xavier and Vettorazzi (2004).
• Different satellite sensors. Features involving NIR spectral bands were the most important for different satellite remote sensing sensors, in particular, Landsat-8 and SPOT-5. This suggests the possibility of interoperable application of satellite imagery and of building multi-mission models for extracting biophysical parameters.
• Crop types. The set of the most important features did not depend on crop type. The same set was important when building models for separate crops (maize, winter wheat, and soybeans) and for all crops together. With the use of crop maps and a machine-learning approach, it is possible to build crop-specific maps of biophysical parameters that are better in terms of error values.
• The dependence of LAI on NDVI is exponential, whereas the relation of FAPAR to NDVI is linear. Other satellite-derived parameters (separate bands) consistently produced large errors when building models for separate crops and at different stages of crop growth. The same results were obtained by Goswami, Gamon, Vargas, and Tweedie (2015) for the territory of Alaska, with completely different agroclimatic conditions.
• Number of input data. The set of important features remained the same when the number of samples for training the RF algorithm was decreased. This suggests the robustness of this approach in terms of downscaling and little influence of the data size. However, since the feature importance metric is statistical in nature, a minimum number of samples should be used. We estimated empirically that a minimum of 10-12 samples with 7 features is necessary to reliably estimate feature importance.
• Inter-season variability. Results obtained for different vegetation seasons (2013-2015) show that there is little variability in feature importance, suggesting that the same set of features can be used for building models.
These results will be further exploited for building multi-mission (Landsat-8, Sentinel-2) multi-season models for extracting biophysical parameters from satellite imagery.

Acknowledgment
This work was supported in part by European Commission (EC) under the Framework Program (FP7) Grant "Stimulating Innovation for Global Monitoring of Agriculture and its Impact on the Environment in support of GEOGLAM" (SIGMA) [number 603719]. SPOT-5 images from Take-5 initiative were downloaded from https://spot-take5.org.

Disclosure statement
No potential conflict of interest was reported by the authors.