Assimilation of NDVI data in a land surface – Vegetation model for leaf area index predictions in a tree-grass ecosystem

ABSTRACT Periodic observations of vegetation index, such as the normalized difference vegetation index (NDVI), can be used for data assimilation in heterogenous ecosystems. Indeed, the new Sentinel 2 Multispectral instrument and Landsat 8 Operational Land Imager sensor data are available at such high temporal and spatial resolutions that can be used to detect the patches of the main vegetation components (grass and trees) of heterogenous ecosystems, and capture their dynamics. We demonstrate the possibility to merge grass and tree NDVI observations and models, to optimally provide robust predictions of grass and tree leaf area index. The proposed assimilation approach assimilates NDVI data through the Ensemble Kalman filter (EnKF) and dynamically calibrates a key vegetation dynamic model parameter, the maintenance respiration coefficient (ma ). In the presence of large bias of the grass and tree ma base values, only the use of the proposed assimilation approach removes the large bias in the biomass balance, dynamically calibrating maintenance respiration coefficients, and corrects the model. The performance of a land surface – vegetation model was improved by assimilating observations of NDVI. The effective impact of the proposed assimilation approach on the evapotranspiration and CO2 uptake predictions in the heterogenous ecosystem is also demonstrated.

The leaf area index (LAI) is a key vegetation characteristic, driver of the vegetation productivity, and can be considered a key biophysical variable in vegetation models (Ewert 2004;Parker 2020;Wythers, Reich, and Turner 2003).Indeed, efforts have led to improvements in the estimate of key ecohydrological variables, such as the LAI, from remote sensors (Houborg and McCabe 2018;McCabe et al. 2017;Pettorelli et al. 2014).LAI mapping can be derived from observations of optical remote sensors operating in the visible and infrared bands (Broge and Leblanc 2001;Fang et al. 2019;Zheng and Moskal 2009).Indeed, vegetation indexes, e.g. the normalized difference vegetation index (NDVI), can be estimated from optical remote observations (Fang et al. 2019;Zheng and Moskal 2009), and NDVI is strictly related to LAI through empirical and physically-based models (Dong et al. 2019;Li et al. 2017;Verrelst et al. 2015;Wang et al. 2005).Nowadays, a wide variety of optical satellite remote sensors are available, at different spatial resolutions.They range, for instance, from the coarse spatial resolutions of AVHRR (1100 m) and MODIS (250-1000 m) to the fine spatial resolutions (10-30 m) of ASTER, Landsat 8 and Sentinel 2 (Gim et al. 2020;Li et al. 2017;Ngadze et al. 2020), and the highest spatial resolutions (≤ 5 m) of IKONS, QUICKBIRD and WorldView (Huang et al. 2018).Originally, the main limit of optical sensors at high spatial resolution was the low time resolution of their platforms (Gao et al. 2006;Hill, Quaife, and Williams 2011), but new platforms, such as Sentinel 2 (Attarzadeh et al. 2018; Attarzadeh and Amini 2019) and Landsat 8 (Li et al. 2017;Ngadze et al. 2020), provide observations, potentially, at high temporal resolutions (up to 5-10 days), and are freely available.Observations of vegetation index at such high temporal resolutions are essential for operative data assimilation approaches, being vegetation phenology highly sensitive to temporal resolution (Jin et al. 2018;Kross et al. 2011;Thayn and Price 2008).
The fine spatial and time resolutions of the remote sensing data allow the vegetation mapping in heterogeneous ecosystems (Gao et al. 2006;Hill, Quaife, and Williams 2011), characterized by contrasting plant functional types (PFTs, e.g.grass and trees) competing for water and energy (Baldocchi, Xu, and Kiang 2004;Detto et al. 2006;Fernandez, Mora, and Novo 2004;Ramirez-Sanz et al. 2000;Scholes and Archer 1997;Williams and Albertson 2004).In these ecosystems, spatial patterns of tree and grass covers are variable in space and time, depending on the water availability and incoming radiation (Sankaran et al. 2005), both of which respond to topographic, climatic and edaphic factors, and ultimately produce the tree-grass mosaic on the landscape (Breshears 2006;Moore and Heilman 2011;Villegas et al. 2014).For capturing the fine spatial land cover variability and its evolution in time, remote optical sensing observations need to be at fine spatial and time resolutions (Hill, Quaife, and Williams 2011;Olsoy et al. 2017), which could be achieved, for instance, through the combined use of Landsat 8 and Sentinel 2 data.The fine spatial resolutions of Landsat 8 and Sentinel 2 allow to capture the patches of grass and trees in heterogeneous ecosystems.
Land surface models (LSMs) have been developed to simulate land and atmosphere interactions and the soil moisture and thermal states, from the integration of mass and energy balance equations (e. g., Albertson and Kiely 2001;Famiglietti and Wood 1994;Montaldo and Albertson 2001;Noilhan and Planton 1989;Wigmosta, Lettenmaier, and Vail 1994).LSMs have been coupled with vegetation dynamic models (VDM) for modeling vegetation dynamics and their interactions with land surface processes (Arora 2002;Montaldo et al. 2005;Montaldo, Albertson, and Mancini 2008).The coupled LSM-VDMs are part of the ecohydrological models, which are able to integrate hydrological mechanisms and ecological patterns and processes (Chen et al. 2015;Fatichi, Pappas, and Ivanov 2016;Rodriguez-Iturbe 2000).In a coupled LSM-VDM, vegetation characteristics such as the leaf area index (LAI), which were often considered to be yearly invariant or seasonally variable according to pre-defined shapes in LSMs (Arora 2002;Jasper et al. 2004;Vanrheenen et al. 2004), become variable and are predicted by the VDM (Montaldo et al. 2005;Montaldo, Albertson, and Mancini 2008).
Data assimilation techniques guide the ecohydrological models with observations of certain state variables from satellite remote sensors (Crow and Wood 2003;Fang et al. 2019;Montaldo et al. 2001;Montaldo, Albertson, and Mancini 2007;Reichle et al. 2004;Wigneron et al. 1999).Little effort has been made to assimilate LAI from optical remote sensing data in LSM (Bonan et al. 2020) and LSM coupled with a crop growth model or VDM (Cheng et al. 2020;Huang et al. 2015;Kumar et al. 2019;Migliavacca et al. 2009;Peng et al. 2021;Tripathy et al. 2013); most research mainly used remote observations at the coarse spatial scale of MODIS (Demarty et al. 2007;Fox et al. 2018;Li et al. 2017;Ma et al. 2017;Migliavacca et al. 2009;Quaife et al. 2008).MODIS observations do not seem suitable for highly heterogeneous, water-limited ecosystems due to the coarse spatial resolution (>250 m), which would not allow a clear distinction of vegetation cover components (e.g.trees), which usually cover less than 50 m in these ecosystems.This study proposes the assimilation in the coupled LSM-VDM of observations from two optical satellites (Landsat 8 and Sentinel 2) with fine spatial and time resolutions, which could be appropriate for heterogeneous ecosystems.
Data assimilation accounts for both measurement and model errors optimally using, for instance, the Kalman filters (e.g.Reichle et al. 2002).The Kalman filter considers errors of the model from parameters, initial conditions and forcing (e.g.Dunne and Entekhabi 2005;Margulis et al. 2002;Montaldo, Albertson, and Mancini 2007).The Kalman filter should optimally assimilates measurements of state variables, such as provided by remote sensor data (e.g.soil moisture, leaf area index, surface temperature), to reduce the model errors.Besides Kalman filters, other data assimilation algorithms are variational assimilations (Jin et al. 2018), which assimilates all observations at once at their respective measurement times over a given period (Scholze et al. 2017), in contrast with the sequential assimilation approach used by Kalman filters and Particle filters, which assimilates observations subsequently at discrete model time steps (Jin et al. 2018).Reichle et al. (2002) suggested the use of the ensemble Kalman filter (EnKF) of Evensen (1994), because it is 'more robust and offers more flexibility in covariance modeling'.The EnKF is widely used in land data assimilation and hydrology (Crow 2003;Crow and Wood 2003;Evensen 2003;Margulis et al. 2002;Montaldo, Albertson, and Mancini 2007;Reichle et al. 2002;Reichle et al. 2019) due to its relative ease of implementation (Reichle et al. 2017).Some data assimilation efforts assimilated NDVI and LAI data in ecohydrological models using EnKF (Dong et al. 2013;Fox et al. 2018;Kumar et al. 2019;Li et al. 2017;Ling et al. 2019;Ma et al. 2017;Quaife et al. 2008).The Kalman filter assumes that the model errors are zero-mean and uncorrelated in time.In ecohydrologic modeling this requirement is frequently not respected, and the model becomes biased.Montaldo, Albertson, and Mancini (2007) assimilated soil moisture observations in a LSM using EnKF and developed an assimilation approach that dynamically calibrated a key soil water balance model parameter as a function of the persistent bias in soil moisture predictions.This approach was useful when model parameters largely differed from the calibrated values (Montaldo, Albertson, and Mancini 2007), and was applied using Sentinel 1 radar data by Montaldo et al. (2022).Lü et al. (2011aLü et al. ( , 2011b) ) followed the approach of Montaldo, Albertson, and Mancini (2007) assimilating soil moisture in two Chinese field sites, and Nie, Zhu, and Luo (2011) and Zhang et al. (2017) assimilating soil moisture using EnKF, and calibrated several parameters of a soil water balance model, by using field measurements and not using remote sensor observations.
Here, the performance of the EnKF for LAI predictions is addressed for reducing LSM-VDM biases due to parameter errors.Using the Montaldo, Albertson, and Mancini (2008) coupled LSM-VDM, we propose an approach for assimilating tree and grass NDVI from Landsat 8 and Sentinel 2 remote sensors in the LSM-VDM, which removes the model biases through a dynamic calibration of key VDM parameters.
The proposed multiscale assimilation approach was tested in a Sardinian field site, characterized by strong heterogeneity with wild olives randomly distributed in surrounding grass, and water-limited conditions typical of Mediterranean ecosystems (Detto et al. 2006).In the Sardinian field site, a micrometeorological eddy-covariance based tower has been operating (Montaldo, Albertson, and Mancini 2008;2020), and Landsat 8 and Sentinel 2 remote observations were used for data assimilation.Analogous solutions should be derivable for most other ecohydrological models, and using data of other remote sensors with similar or higher spatial and temporal resolutions.In summary, our objectives are to: 1) assimilate tree and grass NDVI observations of optical satellites at fine spatial resolution (Landsat 8 and Sentinel 2) in a coupled LSM-VDM over an heterogenous ecosystem; 2) dynamically calibrate key VDM parameters for LAI predictions, through a proposed approach based on the manipulation of the conservation equations of green and tree biomass; and 3) evaluate the effectiveness of the proposed assimilation approach for predictions of key land surface fluxes, such as evapotranspiration and CO 2 assimilation.

Methods
In this section we describe the proposed approach for assimilating NDVI data in a coupled LSM-VDM.The case study and data are then presented.

The assimilation approach
The proposed multiscale assimilation approach includes (Figure 1): 1) the LSM running at a fine time scale (e.g.half hourly), 2) the VDM that predicts LAI dynamics at a coarser time scale of (e.g.daily), 3) the NDVI observations from remote sensors at a larger time scale (e.g.weekly), which are assimilated through the EnKF, 4) the updating of model parameters at a further larger time scale (e.g. three weeks or more).
Below, each component of the proposed assimilation approach is described.

The coupled LSM-VDM
The ecohydrological model is a three-component coupled land surfacevegetation dynamic model.The VDM estimates the LAI evolution through time for two vegetation components (grass and trees), which are used by the LSM for computations of the energy exchanges between soil and vegetation (Figure 2).The details are given in Montaldo, Albertson, and Mancini (2008).Here, a summary of the main components are described.
2.1.1.1.The land surface model.The LSM predicts the dynamics of water and energy fluxes at the land surface on a half-hour time step (Figure 1).It includes three components of land surface: bare soil, grass and trees.The root zone supplies the bare soil and vegetation with soil moisture for evapotranspiration and controls the infiltration and runoff mechanisms.Equations for surface temperature and the components of the energy balance are applied separately for each land cover component, so that the model predicts the energy balance distinctly for each land cover component (Montaldo, Albertson, and Mancini 2008) (Table 1).
The soil water balance equation of the root zone is computed by where θ rz is the soil moisture of the root zone, d rz is the root zone depth, I bs is the infiltration rate on bare soil, I t and I gr are the throughfall rates infiltrating into the soil covered by trees and grass respectively, q D is the rate of drainage out of the bottom of the root zone, E bs is the rate of bare soil evaporation, E t and E g are the rates of transpiration of trees and grass respectively, f v,t is the fraction of tree cover, f v,g is the fraction of grass cover, and f bs is the fraction of bare soil (Montaldo, Albertson, and Mancini 2008).The throughfall rate is modeled through a balance equation of the intercepted water by the canopy reservoir, which storage capacity is a function of the LAI (= 0.2 LAI; Noilhan and Planton 1989) and produces throughfall when the reservoir is saturated (Montaldo and Albertson 2001;Noilhan and Planton 1989).An infiltration excess mechanism, based on the Philip's infiltration equation (Philip 1957), is used for the infiltration.The q D rate is estimated using the unit head gradient assumption (Table 1; Albertson and Kiely 2001;Montaldo, Albertson, and Mancini 2008).E t and E g are estimated distinctly using the Penman-Monteith equation (e.g.Brutsaert 1982, 224) for each vegetation component, with canopy resistances estimated using a typical Jarvis (1976) approach (Table 1).The actual rate of bare soil evaporation is determined as a(u)PE, where α(θ) is a rate-limiting function, and PE is the potential evaporation estimated by the Penman equation (e.g.Brutsaert 1982, equations 10.15, 10.16 and 10.19).Hence, the total evapotranspiration is estimated as: Paralleling the approach for ET estimation, a 3-component approach is implemented for estimating the total net CO 2 flux (Montaldo, Corona, and Albertson 2013): where F c,t and F c,g are the carbon exchange of trees and grass, respectively, and R bs is the soil respiration.Carbon exchange rates for each PFT (i.e.F c,t , F c,g ) are computed as the difference between photosynthesis and growth respiration (Table 2).Soil respiration is estimated as a function of the temperature (Table 2; Montaldo, Corona, and Albertson 2013;Novick et al. 2004;Ruehr and Buchmann 2010).
The model parameters are presented in Table 3. 2.1.1.2.The vegetation dynamic model.The VDM computes the change in biomass over time from the difference between the rates of biomass production (photosynthesis) and loss, mainly through respiration (e.g.Cayrol et al. 2000;Larcher 1995).The VDM distinguishes tree and grass components.In the VDM of trees, four separate biomass states are considered: green leaves (B l ), stem (B s ), living root (B r ), and standing dead (B d ).However, the VDM of grass only distinguishes three biomass states (green leaves, roots and standing dead).The biomass [g DM m −2 ] components are simulated through ordinary differential equations, integrated numerically at a daily time step (Montaldo, Albertson, and Mancini 2008): Table 1.Equations of drainage (q D ), canopy resistance (r c ) with stress functions of soil moisture (θ), air temperature (T a ) and vapor pressure deficit (VPD), sensible heat flux (H ), net radiation (R n ), soil heat flux (G), and surface temperature (T s ) in the LSM.
Parameters are defined in Table 3.
where C H the heat transfer coefficient Net radiation , with shortwave incoming ration, R swin , longwave incoming ration, R lwin , estimated based on equation 6.10 in Brutsaert (1982), α albedo, ε emissivity and σ the Stefan-Boltzmann constant with T 2 being the mean T s value over one day τ, and C T the soil thermal coefficient where Ph is the gross photosynthesis, a a , a s and a r are allocation (partitioning) coefficients to leaves, stem and root states, R l,μ , R s,μ and R r,μ are the maintenance respiration rates from leaves, stem and root biomass respectively, R l,γ , R s,γ and R r,γ are the growth respiration rates from leaves, stem and root biomass respectively, while S g , S s and S r are the senescence rates of leaves, stem and root Table 2. Equations of the vegetation dynamic model components.Parameters are defined in Table 3.

Allocation
For the tree cover: For grass cover:

Respiration
Maintenance and growth respirations of biomass components R l,m = m a f 4 (T)B l ; R l,g = g a a a P g R s,m = m s f 4 (T)B s ; R s,g = g s a s P g R r,m = m r f 4 (T)B r ; R r,g = g r a r P g f 4 (T) = Q Tm 10 10 with T m = mean daily temperature Soil respiration is the reference respiration rate at 10°C and Q N is the soil respiration sensitivity to temperature.Senescence biomass, respectively, and L a is the litter fall.The model equations are given in Table 2 and the parameters are presented in Table 3, where calibrated values are reported.
The leaf area index is estimated from the biomass by a linear relationship (Arora 2003;Hanson, Skiles, and Parton 1988;Montaldo et al. 2005;Nouvellon et al. 2000): where c l is the specific leaf area of the green biomass, which is different for grass and trees (Table 3).
In the case of LAI predictions of grass, the Equation ( 5) is not used because the VDM of grass only distinguishes three biomass states (green leaves, roots and standing dead) and not stem biomass.
The VDM provides estimates of daily values of leaf biomass and, thus, the LAI of the tree and grass, which is used by the LSM to estimate evapotranspiration, energy flux, rainfall interception, carbon assimilation, and the soil water content at a half-hour time step (Figure 2; Montaldo, Albertson, and Mancini 2008).The LSM provides soil moisture and aerodynamic resistances to the VDM (Figure 2).The details are given in Montaldo et al. (2005), Montaldo, Albertson, andMancini (2008), andMontaldo, Corona, andAlbertson (2013).

Optical remote sensing data
NDVI is estimated from red and near-infrared spectral reflectance measurements of satellite remote sensors (Figure 2).The Operational Land Imager (OLI) -Landsat 8 data is mainly used, and, secondly, the Sentinel 2 Multi Spectral Imager (MSI) data increases the optical database.Landsat 8 has a temporal resolution of 8-days and a spatial resolution of 30 m for the optical bands (panchromatic band at 15 m spatial resolution), which is similar to Sentinel 2 data, and both are freely available.
From Landsat 8 and Sentinel 2 observations, NDVI is estimated at 30 m spatial resolution for being assimilated in the EnKF based approach.LAI is related to NDVI through the Γ operator using an empirical approach (e.g.Gupta, Prasad, and Vijayan 2000;Potithep et al. 2010;Wang et al. 2005): where β 1 , β 2 and β 3 are coefficients for vegetation species.The values of β 1 , β 2 and β 3 have been estimated for the case study from simultaneous observations of LAI in the field and NDVI from remote sensors.Note that analogous solutions should be derivable with different G(NDVI) relationships.

The ensemble Kalman filter
We assimilate observations of NDVI, which is related to LAI through (9), in the VDM, which describes the evolution of LAI.In the Kalman filters, w is a vector of surface state variables (in this case LAI).The equation describing the evolution of w (at the t i time step), as determined by a nonlinear model ( f , in this case the vegetation dynamic model) can be written as (e.g.Crow and Wood 2003): where v relates errors in model physics, parameterization, and/or forcing data, and is taken to be with mean zero and covariance V. H is the operator that represents the observation process which relates w to the actual measurements available at time t j (with j = N la , 2, 3 N la , … N, and N la the number of Δt 1 model time steps in Δt 2 measurement time steps) where 1 represents the vector of measurement errors, assuming a probabilistic distribution with zero mean and covariance R.
In the EnKF (Evensen 1994;Reichle et al. 2002), an ensemble of w z (ζ = 1, … , N e , with N e the size of the ensemble) is predicted in parallel, using (10).The EnKF updates each ensemble member separately, using the d(t j ) observation and the diagnosed state error covariance P − (t j ) (e.g.Reichle et al. 2002, Equation 6b).The superscripts '-' and '+' refer to the state estimates before and after the update at time t j , respectively.Ensemble members are updated using (Reichle et al. 2002): where K is the Kalman gain, which depends on P − , and 1 z is a random realization of the measurement error, which should have the same statistical properties as the error included in (11) (Margulis et al. 2002).The mean of the ensemble members, w + (t j ) is the state estimate of the variables.Model errors in the EnKF are included through errors in the VDM initial conditions, physical parameters and forcing data.Errors of i) LAI initial conditions, ii) incoming short-wave solar radiation (R swin ), and the Photosynthetically Active Radiation (PAR), and iii) model parameters, such as the maintenance respiration coefficients for aboveground biomass (m a ) of grass and trees (Table 2), are included.We chose m a as the VDM parameter for data assimilation after a sensitivity analysis of LAI to VDM parameters, which proved the high sensitivity of grass and tree LAI to m a .Indeed, we performed an univariate sensitivity analysis of LAI to VDM parameters for tree and grass (m a , g a , Q 10 , ξ a , Ω, r s,min , θ lim , θ wp ), varying parameter values in a range of ±50% of the calibrated values (which are in Table 3).When we varied the m a parameter we found the largest effects on predictions of both grass (from 72% to 224% of the calibrated mean LAI value) and tree (from 14% to 121% of the calibrated mean LAI value) LAI, while varying the other parameters the variations of LAI predictions were less than half.
The ensemble of LAI initial values was generated by altering a particular value of LAI through the addition of a normally distributed perturbation with mean zero and SD LAI standard deviation.At each time step, the ensembles of R swin and PAR were generated by multiplying the recorded R swin and PAR values by normally distributed random variables.The ensembles of grass and trees maintenance respiration coefficients (m z a,g for grass and m z a,t for trees), were generated as being normally distributed with means of ma,g and ma,t and standard deviations of SD mag and SD mat , respectively.
In this way, ensembles of LAI ζ of grass and trees, which include model errors, were generated and evolved in time according to (4) and ( 8).The d(t j ) observations were obtained including the NDVI z random error in the NDVI observations derived from Landsat 8 (or Sentinel 2) according to (11), where the operator H is the inverse of Γ in (9).When observations from Landsat 8 (or Sentinel 2) are available, the ensemble of LAI z (i.e.LAI z− (t j )) is replaced by (or updated to) the ensemble LAI z+ (t j ), which is optimally estimated by (12) using the d(t j ) observations.Again, the state estimates of grass and tree LAI are given by the means of the ensembles.Hence, the EnKF filters the errors in observations of NDVI, which is related to LAI through (9), and the errors in LAI predictions made by the VDM.

The proposed approach for the parameter updating
The EnKF approach compensates for both inaccurate initial conditions and moderate model parameter errors.Here, we propose a method for adjusting a VDM parameter, the maintenance respiration coefficient for above-ground biomass, over a longer time scale than the remote sensing observation time scale when it largely diverges from calibrated values (Table 3).The proposed method updates (i.e.dynamically adjusts) m a based on observations of persistent bias in the modeled biomass (i.e.LAI).The proposed procedure derives the required m a adjustment from the conservation equation of the biomass (i.e.LAI) and is the same for grass and tree LAI.Substituting (8) in (4), the biomass balance for the modeled 'm' state variables is: Since the biomass balance must be conserved in both the model and reality, we write (13) for observed 'o' state variables.
Assuming that Ph m ; Ph o , R m l,g ; R o l,g , and S m l ; S o l , and subtracting ( 14) from ( 13), where DLAI o,m is the assimilation correction.From the maintenance respiration equation (Table 2) and ( 8): assuming that f m 3 (T) ; f o 3 (T), the first-order Taylor series expansion of the maintenance respiration function about the modeled parameter values relates the modeled maintenance respiration to the 'real' maintenance respiration, in terms of differences between modeled and 'real' LAI and maintenance coefficient values where Substituting ( 18) into (15) relates the difference between 'real' and modeled LAI to the difference between 'real' and modeled maintenance coefficient values Differentiating ( 16) and substituting into (20) yields Solving ( 21) for the 'real' maintenance respiration coefficient, in terms of known quantities This expression would, theoretically, provide an estimate of the actual m a at each time the LAI is updated through NDVI, from knowledge of the change in DLAI o,m since the last update.However model parameter updating over a short interval is ill advised (Montaldo and Albertson 2003;Montaldo, Albertson, and Mancini 2007), as instantaneous measurement errors would induce detrimental shocks in the model.We propose updating over a reasonable averaging period, to relate persistent bias (as defined by a time average) to model parameter bias.By averaging ( 22) over an appropriate time interval (Δt 3 , e.g. 3 weeks, Figure 1) to capture a reliable estimate of the 'persistent' LAI bias, we are able to estimate the required change in the maintenance respiration coefficient needed to reduce the model bias where we employ the overbar to denote a time averaged term.From inspection of (23), it is apparent that the process of averaging the LAI increments derived from the state variable assimilation will cancel temporally uncorrelated noise in favor of retaining the long-term persistent bias.When ( 23) is used in the EnKF, we need to update each component of the m z a,g and m z a,t ensembles over the Δt 3 time interval, which coincides with time steps t k (k = N la N ma , 2N la N ma , … N and N ma the number of Δt 2 steps in Δt 3 , i.e.Δt 3 = N ma Δt 2 = N la N ma Δt 1 ), through where the notation in the summation operator gives an averaging from (k -N ma N la + N la ) to k in increments of N la .In this way, an estimate of the 'persistent' LAI bias is used for evaluating the necessary change in the m a .Thereby, after a learning (calibration) period the error of the model can be eliminated, allowing to restore a main assumption of the Kalman filter, which is zero mean model error.The solution is the same for grass (m z+ a,g ) and tree (m z+ a,t ) maintenance respiration coefficients.
In short, the multiscale assimilation scheme includes (Figures 1 and 2): 1) the land surface model, which run at the half hourly time scale, required for predicting the diurnal dynamics of water and energy balance terms; 2) the VDM-predicted LAI dynamics of grass and trees through (4) and ( 8) at a daily time scale, generating the ensembles of LAI at the t i time intervals; 3) the EnKF, which filters the NDVI remote data (9) of grass and trees, available over the t j time intervals (e.g.weekly) with model errors and measurements, and optimally updates the ensembles of LAI z− (t j ) of grass and trees through ( 12) to arrive at LAI z+ (t j ); 4) the updating of the m z a ensembles of grass and trees through (24) over the t k time intervals (e.g. 3 weeks).
Hereafter, we indicate the ensemble open loop without assimilation (i.e.only steps 1-2) as 'EnOL', with 'EnKF' being the assimilation approach that includes the ensemble Kalman filter only (i.e.step 1-3), and with 'EnKFdc' being the assimilation approach that includes the four steps described above.
Because the performance of the assimilation approach has to be proven for increasing uncertainty of the model (given the observation errors), the proposed assimilation approach will be tested for increasing errors in grass and tree m a model parameters, comparing EnOL, EnKF, and EnKFdc performance.

Case study
The proposed assimilation approach was tested with observations from a field site at Orroli, Italy, located in east-central Sardinia (39°41'12.57''N, 9°16'30.34''E, 500 m a.s.l.; Detto et al. 2006;Montaldo, Albertson, and Mancini 2008;Montaldo, Corona, and Albertson 2013;Montaldo et al. 2020).The landscape is a patchy mixture of woody vegetation, where the dominant tree species is wild olive with a variable height of 3.5-4.5 m (∼33% of the footprint area), and herbaceous and grass species, on a gently sloping plateau (∼3°from NW to SE).The grass species grow during wet seasons and reach approximate heights of 0.5 m in spring.The soil thickness varies from 15 to 40 cm, averaging 17 cm ± 6 cm (standard deviation, SD) above a fractured basalt (Montaldo, Albertson, and Mancini 2008;Montaldo, Corona, and Albertson 2013).The climate at the flux site is maritime Mediterranean, with a mean annual precipitation of 643 mm, and mean July precipitation of 11 mm.Mean annual air temperature (T a ) is 14.6 °C, with mean July T a of 23.7 °C.

Field data
A 10 m micrometeorological station was operating at the site to measure land-atmosphere flux of energy, water and carbon in addition to key state variables.The apparatus included a Campbell Scientific CSAT-3 sonic anemometer and a Licor-7500 CO 2 /H 2 O infrared gas analyzer positioned adjacent to each other at the top of the tower.These two instruments measured velocity, temperature and gas concentrations for the estimation of sensible heat flux, evapotranspiration (ET) and CO 2 exchanges (Fc) with the standard eddy covariance method (e.g.Baldocchi 2003).Half hourly statistics were computed.
The two-dimensional footprint model of Detto et al. (2006), previously tested for this site, was used for interpreting eddy-correlation measurements in the context of the contributing land cover area.The combined use of the footprint model and the satellite images allowed us to interpret the eddy-correlation observed surface flux and distinguish the source area of each PFT and bare soil to the measured flux, using the methodology in Detto et al. (2006).
LAI was measured indirectly through a ceptometer (Accupar model PAR-80, Decagon Devices Inc., Washington USA), which measures the PAR in the 400-700 nm waveband, and estimates the LAI from these readings (details are given in the instruction manual edited by Decagon Devices Inc.).LAI measurements were performed mainly during the grass growth season (Montaldo, Albertson, and Mancini 2008).Finally, specific leaf areas (LAI divided by dry biomass) of predominant grass (= 0.01 m 2 gDM −1 ) and woody vegetation (= 0.005 m 2 gDM −1 ) species were measured directly (by weighing the dry biomass).

Remote sensing and data assimilation approach for the case study
A total of 176 images were acquired (126 from Landsat 8 and 50 from Sentinel 2, see Figure 2c) for the 2016-2020 period, from which NDVI was derived at a 30 m spatial resolution.Images from the Sentinel 2 radiometer were acquired at the L1C level and atmospherically corrected with the Sen2-Cor tool of the Sentinel Application Platform (SNAP), or directly at the L2A level (already corrected).For Landsat 8 the L1TP product was used (it is radiometrically calibrated and orthorectified using ground control points and a digital elevation model), and the dark object subtraction (DOS) method (Chavez 1996) was used for the atmospheric correction.The coefficients of (9) were estimated using simultaneous NDVI data from remote sensors and LAI observations in the field (a total of 24 simultaneous days) distinguishing grass and trees (β 1 = −0.435,β 2 = 1.014 and β 3 = 0.4029 for grass in the fall-winter period, β 1 = −0.141,β 2 = 1.720 and β 3 = 1.674 for grass in the spring-summer period, β 1 = 0, β 2 = 5.392, and β 3 = 0.486 for trees).
In this case study, the LSM time step was half an hour, VDM time step was one day, the assimilation time step of NDVI data was variable according to data availability ranging from 2 days to 20 days with an average of 7 days, and the time step of the m z a,g and m z a,t updating was 3 weeks for both grass and trees (Figure 1).In the EnKF, N e was 100, which is a sufficiently large number for accurate predictions (Crow and Wood 2003;Reichle et al. 2002).
We assumed the measurement errors being with zero mean and a standard deviation of 0.025, for both grass and tree NDVI.We generated the ensembles of initial grass and tree LAI values of grass (LAI g ) and tree (LAI t ) from a Gaussian distribution with means of 0.5 and 5.5, intentionally different from the observations, and a standard deviation s LAI of 0.2 for both grass and tree LAI.At each time step we generated the ensembles of incoming solar radiation and PAR by multiplying the measured values by a normally distributed random variable with mean zero and a standard deviation equal to 10%.It should be noted that the errors of the initial model states and parameters were uncorrelated.
The proposed assimilation approach was tested comparing the EnOL, EnKF, and EnKFdc approaches for nine initial m z a,g and m z a,t ensembles at most, generated with nine different initial ma,g and ma,t values (with the same SD mag and SD mat = 5% of the initial value).
A spatially distributed version of the multiscale assimilation approach was also applied to the area around the tower for the case study, running the model and assimilating NDVI data for LAI predictions in each cell of the spatial domain (Figure 3a).

Results
The NDVI map of the 500 m × 500 m area around the tower, derived from the Landsat-8 data of a dry summer day (08/14/2017), captured the spatial heterogeneity of the Orroli field site, allowing the identification of cells with predominately bare soil (NDVI < 0.22) and tree cover (NDVI >0.4) components (Figure 3a).The 2-component system (bare soil and tree) evolved from the start of the rainy seasons (typically in autumn in Sardinia), when grass grew replacing bare soil, and reaching its maximum growth in spring.Indeed, the distribution peak of the NDVI frequency in the examined area, which was mainly unimodal (Figure 3b), moved from ≈ 0.35 on a dry summer day to ≈ 0.47 on an autumn day, and reached ≈ 0.65 on a spring day (Figure 3b).The NDVI time evolution of the 289 cells in the examined area marked the heterogeneity and seasonality of the field (Figure 3c), with contrasting summer (NDVI decreased up to 0.2 with a high spread of the data Figure 3. NDVI data of the field around the tower at the Sardinian site: a) the map in a dry day of the 2017 summer estimated from a Landsat 8 image (the selected representative grass and tree cells are market with edges in red; the tower is in the center of the map), b) frequency distributions during representative days of 2017 winter (01/25), spring (04/08), summer (08/14) and fall (11/18), c) evolution of the 289 cells in the field in time, yellow and green represent the selected representative cells of grass and tree respectively (circles: Landsat 8 data; squares: Sentinel 2 data).
distribution) and spring (NDVI up to 0.8 with a narrower distribution).Tree NDVI was much less variant over time than grass NDVI, which ranged from ≈ 0.2 (i.e.no grass, just bare soil) in summer to ≈ 0.75 in spring (Figure 3c).The use of the optical remote sensing data allowed to distinguish grass and tree cells.We selected the representative grass cell with low NDVI (≈ 0.2) and the representative tree cell with high NDVI (≈ 0.45) in the field from the NDVI map of the dry summer day (Figure 3a).We then used the NDVI time series of the two selected representative tree and grass cells (Figure 3c) for the data assimilation.In addition, for the spatially distributed application of EnKFdc, we assimilated the entire NDVI maps (Figure 3a), evaluating spatial patterns of the updated model parameters.

Data assimilation results
The coupled LSM-VDM were calibrated and validated in Montaldo, Albertson, and Mancini (2008) and Montaldo, Corona, and Albertson (2013) for this case study; the model parameters are given in Table 3.Here, the EnOL, EnKF and EnKFdc approaches are compared for increasing model uncertainties, evaluating the benefit of the assimilation approaches.
First, we tested the EnKF approach, generating m z a,g and m z a,t ensembles with slightly lower ma,g and ma,t values (= 0.02 d −1 and 0.0006 d −1 , respectively) than the calibrated values (= 0.032 d −1 and 0.001 d −1 , respectively) intentionally, assuming a moderate model error (Figure 4).The EnOL and EnKF approaches were compared for evaluating the performance of the filter in Figure 4, where the grass and tree LAI ensembles predicted using the EnOL and the EnKF approaches are plotted with the Figure 4. Assimilation results at the Sardinian site comparing the ensemble open loop configuration (EnOL) and the Ensemble Kalman filter (EnKF) approach for grass and tree LAI predictions for slightly low ma,g (for grass) and ma,t (for trees) values of 0.02 d −1 and 0.0006 d −1 : a) the ensemble mean grass LAI predictions using EnOL and EnKF and LAI observations derived by Landsat 8 and Sentinel 2 (obs.)(the 95% confidence interval of the EnKF LAI assimilation is shown as a gray band); c) the evolution of the rmse of the ensemble mean grass LAI predicted by EnOL and EnKF with respect to the observed LAI from remote data using a 60day window, translated in 10-day increments; b) and c) are the same as a) and c), respectively, but for tree LAI.
LAI values derived by NDVI satellite observations using (9).The spread of the ensembles of grass and tree LAI z decreased rapidly through time (Figure 4a and 4b).Compared to EnOL, the evolution of tree LAI with EnKF becomes closer to the observed tree LAI (Figure 3b).The EnKF guided LAI towards observations from the optical sensors (Figure 4), correcting the EnOL especially in spring and earlysummer, mostly for the tree component (rmse of 0.26 for the whole period and = 0.40 for the springs using EnOL and rmse of 0.15 for the whole period and = 0.18 for the springs using EnKF for LAI of grass, and rmse of 4.01 using EnOL and rmse of 0.20 using EnKF for LAI of trees).
The performance of the EnKF assimilation scheme and the proposed EnKFdc were verified for increasing weak knowledge of the maintenance respiration coefficients.Generating m z a,g and m z a,t ensembles with initial ma,g and ma,t values (= 0.12 d −1 and 0.01 d −1 , respectively) being much higher than the calibrated values, meant that the persistent biases of the biomass balances were not removed by the EnKF-based assimilation approach (Figure 5a and 5c), and the EnKF assumption (mean zero of the model error) was not yet respected.Instead, the EnKFdc removed the bias in grass and tree LAI predictions through the dynamic updating of the m z a,g and m z a,t ensembles (Figure 6a and 6c).While using EnKF, the rmse of grass and tree LAI was still high (= 0.25and 4.05, respectively, after one year), using the EnKFdc the rmse of the grass and tree LAI estimates decreased significantly (= 0.12 and 0.13, respectively, after one year).Indeed, in EnKFdc, the m z a,g and m z a,t ensembles decreased and converged close to the calibrated values after a learning (calibration) period (Figure 6a and 6c) because ( 24) guided the paths of the ensemble members toward Figure 5. Assimilation results for grass and tree LAI predictions at the Sardinian site using initial high ma,g (for grass) and ma,t (for trees) values of 0.12 d −1 and 0.01 d −1 , respectively: a) and c) the comparison between LAI observations derived from assimilated remote optical data (obs.), the ensemble mean LAI predicted using the EnOL, EnKF, and EnKFdc approaches; b) and d) same of a) and b) but for a shorter time period.
the actual values (i.e.close to the calibrated values), and the assumption of the filter became satisfied.
We validated the EnKFdc assimilation scheme using a different start period, 1th January of 2018 instead of 1 st January of 2016 (Figure 5b and 5d), always generating m z a,g and m z a,t ensembles with initial ma,g and ma,t values ( = 0.12 d −1 and 0.01 d −1 , respectively) being much higher than the calibrated values (Figure 6b and 6d).Again, EnKFdc removed the bias in grass and tree LAI predictions and dynamically updated the m z a,g and m z a,t ensembles after one year (Figures 5 and 6).In the same way, when initial ma,g and ma,t ( = 0.0032 d −1 and 0.0001 d −1 , respectively) were largely lower than the calibrated values, the use of the proposed multi-scale assimilation approach allowed the increase of both m z a,g and m z a,t ensembles, which converged to values close to the calibrated values after one year (Figure 6).
We compared the performance of EnOL, EnKF and EnKFdc approaches for a large range of initial ma,g (ranging from 0.0032 d −1 to 0.12 d −1 ) and ma,t (ranging from 0.0001 d −1 to 0.01 d −1 ) (Figure 7).Using the EnOL approach, the rmse of the predicted grass LAI compared to the observed LAI derived by remote observations was very high (>0.5)for initial ma,g lower than 0.02 d −1 , decreasing for ma,g values close to the calibrated m a,g , and increasing again for higher ma,g up to a rmse of 0.35 (Figure 7a).When the EnKF approach was used, the rmse of the predicted grass LAI decreased but was lower than 0.3 only for initial ma,g (= 0.015-0.045d −1 ), close to the calibrated value.Only the use of the proposed EnKFdc approach allowed the removal of model bias for all the range of initial ma,g , and the rmse of LAI always became lower than 0.15 (Figure 6a).Using EnOL, the rmse of the predicted tree LAI was even higher when compared to LAI observations, reaching a value close to 4.0 for the highest ma,t (= 0.01 d −1 ) and 2.6 for the lowest ma,t (= 0.0001 d −1 , Figure 7b).Again, by only using EnKFdc, the model bias was removed for the whole range of initial ma,t Figure 6.Assimilation results for grass and tree LAI predictions at the Sardinian site using initial high ma,g (for grass) and ma,t (for trees) values of 0.12 d −1 and 0.01 d −1 , respectively, and initial low ma,g (for grass) and ma,t (for trees) values of 0.0032 d −1 and 0.0001 d −1 , respectively: a) and c) the evolutions of the m z a,g and m z a,t ensembles using the EnKFdc approach (for reference, the calibrated values of m ag and m at are reported in dotted horizontal lines); b) and d) same of a) and b) but for a shorter time period.
with rmse ≈ 0.1 (although EnKF performed sufficiently well, at least for the lowest ma,t values with rmse ≈ 0.3; see Figure 7b).
We also applied the proposed approach to the entire field around the tower (Figure 3a), running the spatially distributed version of EnKFdc from the 1th January of 2018, with still initial LAI values of grass and tree from a Gaussian distribution with means of 0.5 and 5.5, respectively, and still generating m z a,g and m z a,t ensembles with high initial ma,g and ma,t values ( = 0.12 d −1 and 0.01 d −1 , respectively).While the predicted LAI (Figure 8a) were much higher than the estimated LAI from remote sensor observations (Figure 8b; mean predicted LAI of 4.56, mean observed LAI of 1.92) at the end of January 2018, the predictions of LAI (Figure 8c) became close to LAI observations (Figure 8d; mean predicted LAI of 1.60, mean observed LAI of 1.65) at the end of July 2019, because the EnKFdc removed model bias in all the cells of the spatial domain.Indeed, the difference between maintenance respiration coefficients of grass and trees and their calibrated values (Δm a,g and Δm a,t , respectively) were high at the end of January 2018 (Figure 9a and 9b; mean Δm a,g of 0.05 and mean Δm a,t of 0.005), becoming negligible in July 2019 (Figure 9c and 9d; mean Δm a,g of 0.001 and mean Δm a,t of 0.0001.

Discussion
In heterogenous ecosystems, tree and grass covers and their spatial patterns are variable, depending on the water availability and incoming radiation (Sankaran et al. 2005); their vegetation mapping requires fine spatial and time resolutions of remote sensing data.In these ecosystems, the use of satellite images at coarser resolutions, like the widely used MODIS (Fox et al. 2018;Li et al. 2017) that provides optical data at 250 -500 m resolution, cannot provide adequate information for distinguishing grass and tree cover components, which are usually less than 50 m.The combined use of the new satellites, Landsat 8 and Sentinel 2, whose data are freely available, is a promising solution, allowing monitoring of the Sardinian heterogenous ecosystem at fine temporal (average 7 days) and spatial (30 m) resolutions, including capturing the contrasting the seasonal dynamics of grass and trees.While few previous efforts of LAI data assimilation in LSM and VDM were concentrated on the use of MODIS (Demarty et al. 2007;Fox et al. 2018;Li et al. 2017;Ma et al. 2017;Migliavacca et al. 2009;Quaife et al. 2008), mainly due to the high time resolution (16 days) and the robustness of MODIS observations, we demonstrated that the Landsat 8 and Sentinel 2 NDVI data can be successfully used for data assimilation in heterogenous ecosystems.Coupling the observations of Landsat 8 and Sentinel 2, NDVI data became more frequent (average 7 days) and robust, starting in 2016.
Approaches have been developed to account for certain aspects of the spatial heterogeneity in ecohydrological models (Giorgi and Avissar 1997;Koster and Suarez 1992;Montaldo and Albertson 2003).The coupled LSM -VDM of Montaldo, Albertson, and Mancini (2008) follows the common mosaic of tiles approach, which divides the spatial domain of the model into a collection (or mosaic) of relatively homogenous land-surface elements (or 'tiles') and applies the model to each tile (Avissar and Pielke 1989; Koster et al. 2000;Koster and Suarez 1992).While previous efforts assimilated NDVI or LAI at the MODIS spatial scale (≥ 250 m) assuming the homogenous surface in grid cells  (Demarty et al. 2007;Li et al. 2017;Ma et al. 2017;Migliavacca et al. 2009;Quaife et al. 2008) are unable to distinguish grass and tree components, the proposed approach assimilated NDVI data of selected representative grass and tree cells of the heterogenous field in the coupled LSM-VDM, and successfully predicted grass and tree LAI.
The possibility to merge NDVI data and a VDM optimally for predicting accurately grass and tree LAI is demonstrated.The use of a typical Kalman filter, the ensemble Kalman filter (EnKF), as in previous efforts (Cheng et al. 2020;Huang et al. 2015;Kumar et al. 2019;Li et al. 2017;Ma et al. 2017;Migliavacca et al. 2009;Peng et al. 2021;Quaife et al. 2008;Tripathy et al. 2013), allowed the capture of LAI dynamics predicted by VDM, when moderate errors of key VDM parameters, like the m a maintenance respiration coefficients of grass and tree components, were included (rmse = 0.15 and rmse = 0.2 respectively, Figure 4).Instead, poor estimates of m a can bring to large VDM errors, and a main assumption of the Kalman filter (model error with zero-mean) was violated.
When the grass and tree m a base values were weakly estimated (of one order of magnitude), and the parameter bias cannot be ignored compared to the random error, the EnKF approach (which assimilates NDVI data only) poorly predicted grass and tree LAI (rmse = 0.25 and rmse = 3.94 respectively, for high m a base values).Instead, the proposed EnKFdc approach accepted the violation of the filter assumption in the early model runs, but removed the biased model error over time (rmse = 0.14 and rmse = 0.1 respectively, for high m a base values), adjusting all the components of the m z a,g and m z a,t ensembles from the observed persistent bias in LAI (Figure 6).Previously, Montaldo, Albertson, and Mancini (2007) assimilated soil moisture observations in the same LSM using EnKF and developed an assimilation approach that dynamically calibrates a key soil water balance model parameter, the saturated hydraulic conductivity, as a function of the persistent bias in soil moisture predictions.This approach was useful when parameters largely differed from the calibrated values (Montaldo, Albertson, and Mancini 2007), and was applied by Montaldo et al. (2022) also assimilating Sentinel 1 radar data for soil moisture (Montaldo et al. 2022). Lü et al. (2011a;2011b), Nie, Zhu, and Luo (2011) and Zhang et al. (2017) concentrated on assimilating soil moisture observed in the field using EnKF, and calibrating the few parameters in soil water balance models.We compared the performance of EnOL, EnKF and EnKFdc approaches for a large range of initial ma,g (ranging from 0.0032 d −1 to 0.12 d −1 ) and ma,t (ranging from 0.0001 d −1 to 0.01 d −1 ) (Figure 7), and wdemonstrated that the proposed dynamic calibration of key model parameters was also essential for assimilating NDVI data for LAI predictions.The proposed EnKFdc approach, which dynamically calibrates maintenance respiration coefficients of grass and trees, removed the bias in the biomass balance due to the bias in plant respiration (Figure 5), which was essential for balancing photosynthesis production, and reduced the model uncertainty (Figure 7).EnKFdc well corrected the VDM, guiding the entire ensemble of plant maintenance respiration coefficients toward the optimal value throughout the simulations.It should be noted that analogous solutions should be derivable for other VDMs, updating those parameters that highly impact model predictions.The new approach demonstrated its efficiency also when applied to the entire site around the tower (Figure 3a), which is characterized by a spatial variability of vegetation composition.The EnKFdc was able to guide the model for optimally updating the LAI of both vegetation components for all the cells in the field (the mean difference between the modeled and observed LAI in the field was only 3% in July 2019, Figure 8).Indeed, the maintenance respiration coefficients have been efficiently updated in all the cells of the field around the tower, with differences lower than 0.1% the calibrated values.We also demonstrated the effectiveness of the proposed assimilation method for evapotranspiration and CO 2 exchange predictions by comparing the predictions using the EnOL, EnKF, and EnKFdc approaches with ET and Fc observations of the eddy-correlation based tower.We evaluated the proposed assimilation approach for ET and Fc predictions using the full ranges of initial ma,g and ma,t values (Figure 10) and comparing the total predicted ET and Fc using the EnOL, EnKF, and EnKFdc approaches and the total observed ET and Fc (considering the days with observations only, which were 212 and 157 days, respectively).Using the EnOL approach, the errors in grass and tree LAI predictions were reflected in ET predictions (Figure 10a).The EnKF approach was not enough for removing error bias in ET predictions, mainly for high ma,t regardless of ma,g , under-estimating ET by up to 40-50%, and for low ma,g (<0.02 d −1 ) and ma,t (< 0.003 d −1 ) over-predicting ET by up to 15% (Figure 8c).The use of EnKFdc allowed the removal of model bias in ET predictions for the whole range of initial ma,g and ma,t values, remaining a slight over-prediction (up to 7%) only for extremely low initial ma,g < 0.009 d −1 (Figure 10e).Model bias in Fc predictions was even higher when the EnOL and EnKF approaches were used, with over-predictions up to 120% for low ma,g and ma,t , and under-predictions up to 60% for high ma,g and ma,t (Figure 10b and 10d).The use of the proposed EnKFdc approach allowed correction of the model and Fc estimates were close to those observed, except the low mis-prediction of ≈14% for the highest (> 0.1 d −1 ) and the lowest (< 0.009 d −1 ) ma,g (Figure 10f).Hence, for the robust prediction of two key terms of the land surface fluxes, the evapotranspiration and the CO 2 net flux, the EnKFdc approach is demonstrated to be essential.The proposed EnKFdc approach removed the model bias through a dynamic (on-line) calibration, allowing the correct prediction of ET and F c for the entire range of the parameter values.

Conclusions
The assimilation of periodic observations of vegetation index, such as NDVI, from optical remote sensing platforms, allows for guiding the predictions of the coupled land surface model (LSM) and vegetation dynamic model (VDM) in heterogenous ecosystems.Indeed, the new Landsat 8 and Sentinel 2 optical sensors are available at such high temporal (≈ 7 days) and spatial (≈ 30 m) resolutions that they capture the dynamics of the main vegetation components (grass and trees).We demonstrated the possibility to successfully assimilate optical observations of grass and tree cover components of the Sardinian heterogenous ecosystem in a VDM for a long data series (5 years).We merged NDVI observations and the VDM optimally, to provide robust predictions of grass and tree LAI in the heterogenous ecosystem.Moreover, the inefficacy of a typical assimilation approach based on the Ensemble Kalman filter was demonstrated when a key VDM parameter, the m a maintenance respiration coefficient, was estimated poorly, because it brings excessive model errors that alter the grass and tree biomass balances, and the Kalman filter assumption (model error with zeromean) was violated.
Only using the proposed EnKFdc assimilation approach, the persistent bias of the model was removed through the adjusting of the model parameters (m z a,g and m z a,t ) ensembles related to the persistent bias in grass and tree LAI predictions.
Finally, we demonstrated that the proposed EnKFdc approach allowed good prediction of evapotranspiration and CO 2 exchanges, which are strictly related to LAI, when there was a strong, uncorrected estimate of the ma,g and ma,t base values.The proposed multiscale assimilation approach removed the model bias through a form of dynamic calibration.
The proposed assimilation approach can be useful in operational forecasting ecohydrological models over large domains, where the estimate of model parameters would be extremely uncertain.

Figure 1 .
Figure 1.The multiscale assimilation scheme.LSM is the land surface model, VDM is the vegetation dynamic model, EnKF is the ensemble Kalman filter, and LAI is the leaf area index.On the right is an example of the time scale of each component of the assimilation scheme.

Figure 2 .
Figure 2. The technical workflow (flowchart) of the assimilation approach.

Figure 7 .
Figure7.The rmse of ensemble mean LAI of a) grass and b) tree predicted using the EnOL, EnKF, and EnKFdc approaches with respect to the observed LAI derived from remote data, varying the initial ma,g and ma,t values.

Figure 8 .
Figure 8.The LAI predictions for each cell of the field around the tower a) at the end of January 2018 and c) at the end of July 2019, compared with estimated LAI from remote sensor data b) at the end of January 2018 and d) at the end of July 2019.

Figure 9 .
Figure 9.The difference between the maintenance respiration coefficients of a) grass and b) trees and their calibrated values (Δm a,g and Δm a,t ) for each cell in the field around the tower at the end of January 2018; c) and d) same as a) and b) at the end of July 2019.

Figure 10 .
Figure10.The comparison of the total evapotranspiration (ET, left panels) and carbon assimilation (Fc, right panels) predicted using the EnOL, EnKF, and EnKFdc approaches with the total observed ET and Fc varying the initial ma,g and ma,t values.

Table 3 .
Model parameters of the coupled LSM-VDM and their values for the Orroli site.
*Two values for vegetation parameters are those of the grass and tree.