Improved food-insecurity prediction in smallholder-dominated landscapes using MODIS Enhanced Vegetation Index and Google Earth Engine: a case study in South Central Ethiopia

ABSTRACT Recent droughts and food insecurity underline the need for objective, timely, spatially explicit food aid prediction in Ethiopia. We developed a generic user-friendly method to detect greening of agricultural areas and derive predictions of agricultural production for potentially food-insecure areas. We used the Enhanced Vegetation Index (EVI) from combined Terra/Aqua MODIS (Moderate Resolution Imaging Spectroradiometer) images to generate EVI time series over multiple growing seasons. Maximum seasonal greening (EVImax), as proxy for biomass and expected crop yield, was related to rainfall variability and to indicate areas of risk for crop failure due to drought within the necessary reaction time for emergency aid. Four agroecological zones were covered from 2003 to 2019. Vegetation periods per 250m pixel were calculated back from EVImax. EVImax was validated against measured yields on large-scale farms. Interannual means and variability of EVImax served to assess production and drought risk. Yield predictions corresponded well with wheat production (r2≅0.5 p≤0.05). High temporal variability and low absolute EVI indicated drought-prone areas. EVI was positively correlated with rainfall data in cropped drought-prone areas (r2≅0.4, p≤0.05), but negatively in temporally water-logged highlands (r2≅0.3, p≤0.05). Our user-friendly approach on Google Earth Engine can accurately detect imminent food insecurity and facilitate timely interventions.


Introduction
Planning, monitoring, and research activities to mitigate food insecurity in rural areas require information on land use in regard to agricultural production at local and lower administration levels, where food-securityrelated policy and budget allocation strategies are implemented. In developing countries, this information is often not consistently available at the required resolution. For example, in Ethiopia, the Emergency Need Assessment (ENA) is compiled every year by the government, based on interviews with regional, zonal and district officials, but only few districts are visited to confirm the findings (DRMFSS & MoA, 2012). Production data from the statistics agency are only available at a higher administrative level and aggregated across agroecological zones. Consequently, discrepancies between the ENA and farmers' needs have been reported (personal communication). Spatial and temporal information on agricultural production is required to plan, implement, and monitor development activities and should be available at district (woreda) level because many agricultural policy decisions are implemented at this level (Warner et al., 2015). Unfortunately, no such woreda-level agricultural production statistics are currently available in Ethiopia.
Accuracy and usefulness of crop yield assessments can be improved through the application of long-term remote-sensing data, such as by analyzing past trends and identifying areas where agriculture has been shown to be unsustainable (UN Committee on the Peaceful Uses of Outer Space, 2013; Dubovyk et al., 2015;Kibret et al., 2020). Vegetation indices (VI) from remote sensing data at monthly or higher temporal resolution have been used to monitor biomass, yields (Funk & Budde, 2009) and drought (Brema et al., 2019) as well as a combination of both VI of NOAA (National Oceanic and Atmospheric Administration), MODIS (Moderate Resolution Imaging Spectroradiometer) and SPOT images. Various online portals and software packages provide specific processed spatio-temporal information (Eerens et al., 2014), e.g., for agricultural purposes. Commonly, vegetation indices like the Normalized Differenced Vegetation Index (NDVI) and the Enhanced Vegetation Index (EVI), both derived from MODIS, are used for crop monitoring. NDVI has been found to be influenced by noise due to atmospheric effects (Xue & Su, 2017), background signals (e.g., bare soil and litter) under incomplete canopy cover (Huete et al., 2002), and the interactions of both (Graw et al., 2017;Liu & Huete, 1995). EVI has been developed as an enhancement over traditional NDVI to reduce such effects (Huete et al., 2002). It avoids distortions by including the blue reflectance band (Rowhani et al., 2011) and reducing atmospheric influences (Huete et al., 2011). EVI facilitates vegetation monitoring by decoupling the canopy greenness from background signals (Huete et al., 2002), which is of relevance for monitoring of scattered natural vegetation (Ardö et al., 2018) or vegetation with pronounced seasonal change in LAI like annual cropping systems. EVI has been shown to be highly correlated with LAI, biomass, canopy cover, and the fraction of absorbed photosynthetically active radiation (Gao 2000). Therefore, EVI has been widely used and recommended to detect spatio-temporal vegetation patterns like crop rotations, land management (e.g., single vs. double cropping) or land use change. Impacts of human management over longer periods, land degradation and productivity in context with socio-economic factors (Dubovyk et al., 2015) as well as implications on food security (Zhang et al., 2014) have been interpreted based on EVI. The strength of MODIS-EVI to depict LAI, canopy architecture and structure and their variability over time (Gao et al., 2009) has made it a widely used tool for crop monitoring (Li et al., 2014;Nguyen et al., 2020) and classification (Muhammad et al., 2016), particularly in context with phenology (Wang et al., 2020;Yan et al., 2015;Zhang et al., 2014;Zhang et al., 2019) and climatic variables (Dubovyk et al., 2015;Sedighifar et al., 2020). Wardlow and Egbert (2010) state that most cropmapping studies preferred EVI over NDVI, but found similar accuracy in a direct comparison. MODIS vegetation indices are produced globally over land at daily resolution, and their composite per alternating 16-day intervals of two sensors, Terra and Aqua (Didan, 2015a), is made publicly available. Most vegetation studies and online applications are based on EVI products of either of these two sensors (Broich et al., 2015;Clark et al., 2010;Dannenberg et al., 2014;Pervez et al., 2014;Phompila et al., 2015;Potgieter et al., 2013;Sakamoto et al., 2005).
Regression models using vegetation indices to predict current agricultural production or to validate existing models have been suggested by several authors (Funk & Budde, 2009;Meroni et al., 2014). However, most current remote sensing approaches focus on large-scale assessments rather than at a landscape, community or field level. Studies in East Africa monitored agricultural production aggregated to higher administrative boundaries (Funk et al., 2015;Rojas et al., 2011;Tonini et al., 2012), which neglects spatial variability at the landscape scale. Variability can be due to different agroecological zones or topography, which makes generalization of information difficult even at the district level (Ross et al., 2009).
In regions where water is the main factor limiting production, further improvement of crop predictions can be achieved by accounting for the influence of rainfall on vegetation development. The relationship between VI and rainfall has been studied in different regions of Africa, and drylands globally (e.g., Funk & Brown, 2006;Dhakar et al., 2013;Kileshye Onema & Taigbenu, 2009). However, these studies mostly analyzed uniform and fixed rainfall periods, not spatially and temporally variable rainfall patterns.
Our aim was to objectively predict crop failure and identify resulting emergency needs dynamically and at full spatial coverage for South Central Ethiopia based on an enhanced remote sensing approach that can be easily applied in other regions by extension organisations.
The objectives were to a) combine MODIS EVI from Terra and Aqua to achieve high temporal resolution of input data; b) integrate EVI with rainfallagricultural production relationships in different agricultural land use types to achieve enhanced information on spatio-temporal agricultural production patterns; c) improve predictions by stratified assessment based on agroecological zones at a landscape level after exclusion of non-agricultural areas; d) assess the application of our approach in the field for a more reliable emergency response.
We hypothesized that accurate yield predictions can be made based on the available EVI data put within the context of rainfall -biomass relationships and that these predictions will be available within the required reaction period to allow timely emergency aid.

Study area
The study was conducted across four of the nine major agroecological zones found in South Central Ethiopia  (Figure 1). These zones have distinct rainfall regimes (short rain season from February to June and main rain season from June to October) and elevations. Consequently, they are characterized by different land suitability and agricultural production potentials. All smallholder farming systems in the area are rainfed. The main agricultural regions are characterized by two rainy seasons, the Meher and the Belg, which are also synonymous for the two cropping seasons. Meher is the main cropping season (rainy season June to September, harvest between September and February) and produces 90-95% of the nation's total cereals output, whereas Belg refers to the short rainy season with harvest between March and August. Corn accounts from one-third to nearly half of the Belg's cereal production and the remaining output comprises mostly short-cycle wheat, barley, and tef. Only smallholders cultivate crops during the Belg season (Taffesse et al., 2013). The study site encompasses both food-secure andinsecure livelihood zones. The major land cover/use types included rain-fed agricultural land (e.g., maize, haricot beans, tef, wheat, potato), lakes/reservoirs, forest/woodland (natural and plantation forest) and grass/shrub land.
Maize/Beans are grown mostly in the DMH in the Northwest of the study area, which is characterized by a recurring shortfall of agricultural production due to lack and/or late onset of rains. The MH is located in the downstream part of the watershed, and mostly wheat (Triticum aestivum) and barley (Hordeum vulgare) are cultivated. Wheat and barley are also grown in the WH, which is located in the upstream part of the watershed and experiences excess rainfall. The MMH is a double cropping area, where tef (Eragrostis tef), potato (Solanum tuberosum), beans (Vicia faba) and maize (Zea mays) are predominantly cropped during Belg. In the downstream parts of this area, tef, potato, and beans are grown during Meher, and in the upstream parts mostly wheat. Four sites representing the various agro-ecological zones were identified based on information by local people and experts. The major landscape types and homogeneous cropping systems were determined using a recent land cover map (Kibret et al., 2016(Kibret et al., , 2020.

Methods
Specific steps described in this section were to: a) identify areas of significant change in agricultural production over time (2003 to 2019); b) determine the impact of rainfall variability on the production of the major agricultural crop types; c) characterize each year's agricultural production at district level for 2003 to 2019. The results were validated against well-documented data of homogeneous plots on largescale state farms and cross-checked with information from farmers.

MODIS data
The Terra sensor passes the equator from north to south in the morning, and Aqua south to north in the afternoon. The MODIS instruments on both satellites monitor the entire Earth's surface every 1 to 2 days, acquiring data in 36 spectral bands. Their Vegetation Indices MOD13Q1 (Terra) and MYD13Q1 (Aqua) Version 6 are generated every 16 days at a 250 m spatial resolution. Both products provide two primary vegetation layers, NDVI and EVI. An algorithm chooses the best available pixel among all 16-day period acquisitions, filtering by low clouds, low view angle, and highest NDVI/EVI value. HDF files offered for download contain the vegetation layers, two quality layers, the MODIS reflectance bands 1 (red), 2 (near-infrared), 3 (blue), and 7 (mid-infrared), and four observation layers (Didan, 2015a). 391 MOD13Q1 and 391 MYD13Q1 datasets, with alternating 16-day interval from 2003 (full availability of AQUA) until 2019, were accessed in the Google Earth Engine Environment (GEE) environment. In total 46 MODIS products per year were used. Each dataset of MOD13Q1 and MYD13Q1 consisted of twelve data layers at 250 m spatial resolution. Of these twelve data layers, two layers were used including EVI and the corresponding pixel reliability layer (SummaryQA), which contains ranked values describing overall pixel quality (Didan, 2015a(Didan, , 2015b. EVI bands of each image were clipped to the study area. Values 0 and 1 (good and marginal data, respectively) from the respective MODIS quality band were used to exclude cloud affected pixels. The masked Terra and Aqua layers were then merged to create 8-day composite EVI layers. For each pixel, yearly EVI time-series were generated and curves fitted using the harmonic trends function of GEE. To detect double cropping areas the harmonic cycle was increased from two to four (Kibret et al., 2020).

Rainfall data
Spatially explicit rainfall data for 2003 to 2019 were derived from Climate Hazards Group InfraRed Precipitation with Station data (CHIRPS), a 30+ year quasi-global rainfall dataset (Funk et al., 2015). CHIRPS incorporates 0.05° resolution satellite imagery with in-situ station data to create gridded rainfall time series for trend analysis and seasonal drought monitoring. With 4 km spatial resolution, the CHIRPS data set was the best available option. It has a temporal resolution of one day.

Local information
Information on local variability of rainfall and agricultural production was collected for cross-checking from different villages at three of the selected sites through group discussions. The three sites were in Adami Tulu, Shashemene, and Gedeb Hassassa ( Figure 1). Field observations and group discussions were carried out in 2011, 2012 and 2013. Group discussions with 4 to 6 participants per group were carried out in a total of 18 villages throughout the three study sites (six at Adami Tulu site, seven at Shashemene site, five at Gedeb Hassassa site).
Preliminary maps created showing EVI trends were used during group discussions. Furthermore, district reports on agricultural production and hazards were reviewed. The focus was to collect years with either above or below average conditions as of the participants' experience. Based on the interviews, we were able to identify years with pronounced shortfalls or sufficient agricultural production. This information was used to validate the model created in this study.
Additionally, eleven large parcels on a stateoperated farm (Figure 1), each with an area >70 ha, with well-documented wheat yield data from 2003 to 2009 were used to validate the EVI-based monitoring of agricultural production. The parcels were large enough to provide homogeneous land use and management practices given the pixel size of MODIS images. Geradella farm was selected because it is organized in uniform blocks covering several MODIS pixels each. The EVI max was validated against measured crop yields over 7 documented years on eleven large uniform plots.

Plant phenology and cumulative rainfall
Areas of homogeneous cropping systems had been identified in a previous study (Kibret et al., 2020) and were used here. Given the small size of land holdings, fallow was not practiced by smallholders in the area. We used maximum seasonal EVI (EVI max ) from MODIS images during Meher (June to October; Belg is only locally relevant) as a proxy of aboveground biomass (AGB) and, as AGB at a late phenological stage is closely related to crop yield via Harvest Index (Duncan et al., 2015), also validated EVI max against field data of wheat crop yields. Crop yields have been estimated by VI by Guan et al. (2018) and Rojas (2007). We then related EVI max to rainfall patterns assuming that cropping in the region is mainly water-limited and anomalies in rainfall would reflect anomalies in agricultural production (Meroni et al., 2014;Rojas et al., 2011). To this end, smoothened EVI time series were created using MYD13Q1 and MOD13Q1 products in GEE. First, a mask layer was created for each composite using the SummaryQA layer of the composite. Only pixels not affected by clouds, which corresponds to SummaryQA values of 0 and 1, were considered in the process. The masked EVI layers of Terra and Aqua were merged to create a time-series image collection (ee.ImageCollection in GEE). The layers in the image collection were fitted to harmonic models with four cycles per year, so that each year's main season's EVI subset could be modelled taking EVI max as a reference point. Furthermore, the day of the year when the EVI time series reached the maximum value was determined. This was done for each year's Meher. In particular, the EVI max was used to assess biomass ( Figure 2) as proposed by Funk and Budde (2009). Figure 2b shows the EVI profile of a pixel covering the period from 2003 to 2019. The corresponding decadal (i.e. 10-day interval) rainfall of the same pixel was derived from the CHIRPS dataset ( Figure 2c). For every pixel and season cumulative rainfall during ninety days (RF 90d Þ back from EVI max was calculated (Figure 2a), thus approximating duration of the vegetation period. Biomass predictions based on this period gave us the best correlations with measured biomass.
Thus, only local pixel-specific rainfall during the respective season, Meher (rainfall from June to October) or Belg (February to June), was considered to predict crop yields (Figure 3).
In some highland regions, the Belg and Meher seasons merge into one extended growing period whereby both long-cycle grains (such as sorghum and corn) and shortcycle small grains (such as wheat, barley, and teff) can be grown. To avoid confusion between these two growing seasons, the Ethiopian Belg (short season hereafter) crop season is officially defined as any crop harvested between March and August, while the Meher (main season hereafter) cropping season is defined as any crop harvested between September and February.

Derivatives of the main season's EVI and rainfall anomalies
To classify the study area into homogenous regions, two derivatives of the main season's EVI max were used for every pixel, namely: a) the average main season EVI max (EVI max ) of the period 2003 to 2019, and b) the coefficient of variation of the main season EVI max (CVEVI max ). The CVEVI max was derived using EVI max and the standard deviation of EVI max (δEVI max ) as shown in equation 1. EVI max and CVEVI max were then used for supervised classification of the study area into regions of similar production and risk for stratified analysis of agricultural production. The agricultural landscapes were classified into four categories using EVI max and CVEVI max . Four training sites were selected based on the group discussions with local communities and visually analyzing high-resolution satellite imageries covering a) DMH known for recurring shortfall of agricultural production during the main season, b) MMH with shortfall of agricultural production in some years, c) Highlands known for no or rare shortfall of agricultural production due to shortage of rainfall, and d) the same as "c" but with fallowing practice (large-scale state-operated farms). The dataset, comprising EVI max and CVEVI max , was classified using the training samples and the random forest classifier. The deviation of EVI max of a year under consideration from EVI max (equation 2) was used as a proxy for spatial and temporal anomalies of agricultural production. Anomaly in this study refers to the quantitative measure of how different a variable at a certain place and time (the EVI max in this section) was from reference conditions. Anomalies can be derived in different ways. In this study, we used the standardized z-score to derive the EVI max anomaly (EVI z ) from the difference between the selected year's EVI max and the average of 2003 to 2019 EVI max (EVI max Þ, divided by the standard deviation of 2003 to 2019 EVI max (δEVI max Þ. The outputs of equation 2 were yearly time-series anomaly maps (EVI z maps) with positive and negative values indicating deviations above and below average. Negative z-scores in this water-limited context indicate areas drier in that particular year than under average conditions. The anomaly maps were used for analysis of food security and need for relief measures since they indicate areas of concern regarding crop or pasture availability. In analogy to δEVI max for EVI z in Equation 2, seasonal cumulative ninety-day rainfall until EVI max (δRF 90d ) was used as a denominator to derive rainfall anomaly (RF Z ) (Equation 3).
Whereas RF 90d is the ninety-day seasonal cumulated rainfall before EVI max (a proxy of rainfall available during the vegetation period), and RF 90d its seventeenyears average from 2003 to 2019; δRF 90d is the standard deviation of RF 90d . EVI-rainfall analysis was performed first for all pixels in GEE and second for selected agricultural areas in R (R Core Team, 2021). Pixelwise linear regressions were fitted between RF z and EVI z of the period 2003 to 2019 and correlation coefficients and p-values were calculated. The results of this analysis of all pixels were used to produce a distribution map of the rainfall-EVI relationship (RF Z -EVI Z ). The land use level relationship between RF and EVI was displayed per selected site as a scatter plot.
To analyze the trend in biomass of agricultural areas, the 2003 to 2019 main seasons' EVI max were used. To determine places where the EVI max was increasing or decreasing and by how much, the simplest linear regression reducer, an algorithm provided in GEE, was used (Gorelick et al., 2017). The dependent variable was the EVI max of the main season and the independent variable was time (year). The output contained two bands, the "offset" (intercept) and the "scale" (in this context referring to the slope of the line). The result was visualized as a map showing areas of increasing, decreasing and no trend.
For selected pixels at different sites, EVI max trends were analysed visually as charts.
Results of the trend analysis of all pixels were used to produce a distribution map showing the degree and type of biomass change from 2003 to 2019. Pixels with frequent land use change between 2003 and 2019 (Kibret et al., 2020) were excluded from our analysis.

EVI anomaly and food aid requirements
Time-series anomaly maps were created at a district level and used as proxies to identify years with belowand above-average agricultural production. This was further supported by multiple mean comparison of EVI z , also performed at district level, to identify similarities among agricultural production across different years. The output of the comparison was a chart at the district level showing years with no significant difference in mean EVI z (p ≥ 0.05) receiving the same label. Values below zero indicate severity of agricultural production shortfall below long-term average. Values above zero indicate agricultural production that was above average. Results of the comparisons were compiled as maps and charts at a district level to assess historical production and predict agricultural production of the ongoing season.
A regression model for predicting agricultural yield was developed for wheat using 2003 to 2009 recorded production data of a state-operated largescale farm. The regression model was developed using data at block level (with wheat on areas greater than 70 hectares): EVI max per year was employed as predictor and yield per hectare during the same year as response variable. All technical steps related to EVI estimation and correlations to rainfall are summarized in Figure 4.

Major types of landscapes distinguished by EVI max and CVEVI max
The 2003 to 2019 EVI max analysis shows that long-term EVI max varied with elevation. The EVI max of areas under agriculture ranged from 0.3 to 0.8 and increased from the dry areas of the lakes region (DMH) to the MMH and to the WH environment ( Figure 5A). Areas of high intraannual EVI max variability were found in each of the four agroecological zones with varying degree of area coverage ( Figure 5B). Inter-annual variability in EVI max was higher in the DMH than in MMH and WH. The DMH was characterized by larger areas and higher frequency of high CVEVI max values compared to the other two agroecological zones. Large areas of agricultural land with EVI max below 0.50 and elevation below 1700 m a.s.l.
were characterized by CVEVI max above 10%, e.g., in Adami Tulu J/K district in the DMH. The MH zone showed patches with high CVEVI max mostly at locations of state-operated farms that practice crop rotation (label "a" in Figure 5B and D), which is not common among smallholders and thus is not relevant to food insecurity.
The clouds of sample points in Figure 5C show that areas with high EVI max can be clearly grouped into two classes on the basis of the CVEVI max . The information generated by using the EVI max and CVEVI max shows four distinct categories of the agricultural landscape including areas with (1) high EVI max and low CVEVI max , (2) high EVI max and high CVEVI max , (3) low EVI max and low CVEVI max , and (4) low EVI max and high CVEVI max . Among the four classes, the two with high CVEVI max presented higher risk of crop failure due to climatic variability: (1) areas of low EVI max with high CVEVI max , and (2) areas of high EVI max with high CVEVI max ( Figure 5D). The former mainly covered dry maize/other growing areas, the latter wheat/barley under risk of excess rain, which reflects agroecology and land suitability of the smallholder areas. Change in land use is also related with high CVEVI max around towns such as Shashemene (point 4 in Figure 6c). The expansion of the town onto agricultural land resulted in the decline of the EVI max .

Drivers of inter-annual variation of EVI max
To determine the causes behind the inter-annual variations in biomass, we looked at long-term trends of EVI max and at the relationship of the inter-annual rainfall and EVI anomalies.
EVI max increased significantly from 2003 to 2019 in the DMH with maize/haricot bean cropping (Adami Tulu J/K district [ Figure 6A, graphs 2 and 3] and west of Shashemene town [ Figure 6B, graph 4]). The area along Lake Abijata also showed a significant increasing trend ( Figure 6A, graph 1). A negative trend with regard to EVI max was observed in part of the wheat/barley growing highland areas. An example of this was found in Hassassa district ( Figure 6C, points 6 and 7).
A decline in EVI max was also observed along the boundary between the DMH and MMH, where mostly wheat was grown, as well as around the town of Shashemene (in Figure 6B line 5, in Figure 6C point 5).
RF z -EVI z analysis showed spatially heterogeneous relationships depending on crop and agroecological zone (Figure 7), where anomaly is a relative unitless value ranging from −2 to 2. In the drought-prone maize-haricot bean production areas RF z and EVI z were clearly positively correlated in some areas of the maize/haricot bean production areas ( Figure 7A, r = 0.65) and in downstream parts of the wheat/barley production areas ( Figure 7B, r = 0.56). However, EVI z and RF z were negatively correlated in some wheat/barley production areas ( Figure 7C r = − 0.56). The latter areas were located in the upstream part of the Wabishabelle watershed, where rainfall during the main growing season is abundant and waterlogging linked to soil properties has been reported. The downstream part of the wheat/barley growing areas was not as affected by waterlogging as the upstream area, shown by the overall positive EVI z -RF z correlation ( Figure 7B). Figure 8 shows the spatial distribution of the RF z and EVI z relationships. The maize/barley growing areas showed in most areas positive correlations ( Figure 8B). Negative correlations were observed mainly in the wheat/barley growing areas ( Figure 8D).

EVI as proxy for crop yields and temporal trends of production
For the wheat-growing highlands, a regression model was created using the 2003 to 2009 production records of a state-operated large-scale farm (Figure 9) to use as a validation database. A moderate, but highly  significant correlation was found between EVI max and crop yield (r = 0.7, p < 0.05). This supports our assertion that EVI max could be used as a proxy for monitoring the trends in agricultural production, and as an input for the Annual Emergency Assessment.

Added value of spatially explicit EVI time-series
In addition to the long-term trends presented before, yearly EVI z maps describe spatial and temporal anomalies of crop biomass and allow for more detailed interpretations of causalities in agricultural production. Figure 10A is an example of a map series for Adami Tulu, one of the most drought-affected districts in our study area. The estimated least square mean (lsmean) of EVI z at the district level gave further insights to interpret biomass growth patterns ( Figure  10). The EVI z in Adami Tulu/J/K in year 2015 was far below average, and was significantly different from all other years ( Figure 10). The EVI z in 2004, and 2015 was also below average while in 2008,2011,2012,2013,2017 and 2019 the EVI z was above average. In Figure 10, values of EVI z between 0.5 > Z > −0.5 show near normal vegetation periods for nine of 17 years. EVI z <-1 indicates drought events, such as in 2015. This was also used as baseline information for predicting the most recent year's agricultural production, such as in 2019, through analysis of the mean similarity between years ( Figure 10). Therefore, 2019 was a year with above-normal production in this district.

Validity of the approach to map yields and food-insecurity
Validation of our procedure for yield assessment by EVI max from 2003 to 2019 was challenging without systematically measured field data. Accuracy (r 2 ) of a linear regression on documented state farm yield observations for wheat and barley was highly significant, but moderate (r 2 = 0.5). Validation of maize yields under the typical smallholder conditions was not possible at the given resolution of MODIS imagery due to the absence of historical data. In future studies this could be achieved using imagery of higher spatial and sufficient temporal resolution such as Sentinel and ground data collected during the annual agricultural production survey by the Central Statistics Agency. In our study, we used MODIS EVI due to its high temporal resolution and extended coverage since 2003, along with its advantages over NDVI (see above). Recently, Sentinel 2 data, compatible with Landsat 8, have become an alternative to MODIS particularly due to the higher spatial resolution (10 m). Our use of the low-resolution MODIS images in smallholderdominated landscapes was possible under specific circumstances, as land use was synchronized by rainfall and homogeneous within each agroecological zone. To compare both approaches, we merged NDVI layers of Sentinel 2 and Landsat 8 to a 16 days interval from 2016 to 2020. These were then plotted against MODIS EVI data for three sites representing typical cropping systems of our research area (Figure 11). The three The combination of low EVI max and high CVEVI max , found mainly in the maize-growing DMH, was interpreted as an indicator of drought risk, recurring low productivity and thus potentially food insecurity. While low rainfall per se or late onset of the rainy season can be partly compensated by management (e.g., late sowing or switching to resistant varieties or crops), high CVEVI max at low EVI max is an integrated measure of productivity that stands for low resilience of cropping at the site. In the DMH of the maize production areas, increasing EVI max was positively correlated to RF Z (e.g., r 2 ≅ 0.6 at p < 0.05). Thus, biomass tended to be above average during seasons with above average rain, pointing to water as growthlimiting factor (van der Velde et al., 2012). Crops or varieties with better adaptation to water-stress might be more appropriate in these areas. In accordance, increasing greenness in large Sahelian areas between 1983 and 2003 has been interpreted as recovery from the Great Sahelian Droughts (Herrmann et al., 2005). In other environments, greenness is influenced more strongly by soil properties, management (e.g., fertilizer application) or land use.
In the WH, where wheat and barley were mostly cultivated, correlations between RF z and EVI z were found in some areas to be negative, i.e. biomass tended to be below average during seasons with above average rainfall. Possible explanations for this are waterlogging of clay soils, erosion, nutrient leaching, cloud cover (Lotsch et al., 2003) or disease pressure, leading to plant stress. Soil conservation and plant protection measures could be appropriate to tackle these challenges. In summary, multi-seasonal EVI z anomalies as well as EVI z -RF z relations can be used as indicators for agricultural production potential, system resilience (here related to drought) or unsustainable or not welladapted agricultural practices that require correction (like crop selection, planting date, soil conservation or drainage). The combination of EVI max and CVEVI max thus allows for a high spatial resolution identification of critical seasonal food production, contrary to single variables such as average vegetation indices.
Multiple comparisons allowed us to investigate differences between years' EVI z least square means using district-level data. The approach proved its potential to identify years with significant below-average agricultural production and also years with profound above-normal agricultural production. For instance, the approach showed that in 2015 significantly belownormal conditions were experienced in one of the drymid highland districts. This coincided with reports indicating that the worst drought in decades had gripped north and central Ethiopia in 2015, affecting nearly 10 million people (World Weather Attribution, 2015). Furthermore, EVI z trends matched our interview data.

Potential for applied food security strategies
Extension work, as part of food security strategies, requires information on the spatial distribution of both food-secure and insecure areas at smaller scales, because many agricultural policy decisions are implemented at this level. Such information is not adequately available in Ethiopia (Warner et al., 2015) and many other developing countries. Thus, current food security strategies could benefit from our approach, which includes several improvements compared to existing approaches: Firstly, merging of MODIS Terra and Aqua EVI time-series increased the number of high-quality EVI data and temporal resolution (up to 8 days), a necessity to derive spatially explicit continuous EVI timeseries. These were required to derive specific phenology from germination to peak biomass for every pixel in combination with spatially explicit rainfall data for the same pixels ( Figure 2). To identify the impact of local rain on agricultural production, rainfall was calculated backwards from spatially explicit seasonal EVI max , which differs from methods that use rainfall related to spatially averaged growing periods (e.g. Meroni et al., 2014). The resulting local EVI z from EVI fitted to the harmonic model in GEE -as proxy for pixel-specific biomass -could thus be better related to gridded rainfall data during the growing cycle. The presented approach used the long-term EVI z to determine relative crop productivity applying multiple means comparison instead of relying on commonly used fixed thresholds (e.g. Meroni et al., 2014). The same approach was employed for emergency aid prediction as EVI z precedes harvest and actual food insufficiency by several weeks. Furthermore, our approach can provide inputs for stratifying a region in development and research work such as climate change and adaptation-related studies. Information like EVI max and CVEVI can be linked to land use change (Kibret et al., 2016), meteorological data, or the performance of introduced crop varieties to derive mechanistic explanations of crop performance. The example of yield predictions for 2019 demonstrates that the approach can provide real-time spatial information on areas with below average production, which means that food aid could arrive in time, given the period between EVI max and harvest. Such information can also be used for prioritizing targeted field observations and direct interviews. The implementation in GEE opens new perspectives to users on the ground (e.g., agricultural extension services) with limited computing power as calculations over large areas are performed in the cloud and only selected results need to be downloaded. However, a critical precondition is a stable internet connection.
At this stage, our approach is applicable in spatially and temporally homogeneous landscapes, including smallholder and state farms. Occurrence of mixed land use pixels was largely excluded by the knowledge of the local farming systems obtained during our interviews (absence of fallow and local synchronization of planting time by the onset of the rainy season) and the zonation into areas of homogeneous cropping systems during an earlier study (Kibret et al., 2020). Our approach has potential when used with new sensors (we adapted our code to work by merging Landsat 8 and Sentinel 2 as well) that provide easily accessible data at high spatial resolution and independently of cloud cover. This would allow analyzing mosaics of land uses, thus closing an important gap in applicability of the current approach for smallholders, who are most vulnerable to food insecurity. On the other hand, temporal resolution of these new sensors is lower than that of MODIS, which synthesizes daily data.
While our approach works in principle, further geo-referenced independent calibration and validation data, for example, from the Central Statistics Agency's annual agricultural production survey, would need to be routinely considered for practical application. Crop-specific vegetation periods could improve the fit of EVI z -RF z relationships, but they can only be applied once the spatial distribution of agricultural land uses is known. On the other hand, greenness in semi-arid areas is influenced by soil moisture, including the time before germination, rather than actual rainfall (Herrmann et al., 2005;Lotsch et al., 2003), which justifies considering extended periods back from the period of EVI max .
Design of yield inventory questionnaires for compatibility with the satellite images would be important for this purpose. Our study shows that knowledge of an area, its agroecology and farming practices is necessary to correctly interpret model results. Data processing would be ideally centralized to ensure the consistency of methods and the comparability of data. In this case, feedback from the districts would be indispensable to guarantee data quality.

Conclusions
In this study, we proposed a workflow for ex-ante estimation of biomass of smallholder crops using derivatives of time-series remote sensing data. In context with socio-economic factors, our methods can provide improved information for annual emergency need assessments in imminent drought situations like 2009 and 2015 in our research area (Kibret et al., 2020), implementation of food security strategies and research studies.
Classification of agricultural landscapes into high and low EVI max -proxy for biomass -and CVEVI max , allowed a stratified analysis of interannual agricultural production. Furthermore, CVEVI max allowed pinpointing areas with recurring shortfall of agricultural production. The analysis of the relationship between anomalies of rainfall and EVI (EVI z -RF z ) facilitated the quantitative assessment of impacts of rainfall on agricultural production and identified possible reasons for the interannual variability of biomass and agricultural production. Although biomass and crop yields in our research area are strongly determined by water, other factors (e.g., plant available nutrients) are also included in the integrated assessment of EVI monitoring. In this context, the approach could be further refined to account for nutrient deficiencies if images of sufficient resolution were available. The potential of the described approach for food insecurity mitigation and emergency aid lies in the combination of publicly available remote sensing data and cloud computing facilities with spatially explicit crop yield prediction in real time.