Drivers of maize yield variability at household level in Northern Ghana and Malawi

Abstract
Maize is a staple food, but productivity has stagnated due to limited access to advanced farming methods and knowledge. To promote sustainable agriculture, understanding the factors affecting maize yield at the farm level is crucial. This study used panel data on maize yield and agronomic practices in Northern Ghana and Malawi from 2014 to 2020. Satellite-based environmental variables were extracted at household locations, and Random Forest modeling was used to identify factors influencing maize yield variability. Model performance was poor, with low R² values (∼0.1 for Northern Ghana and ∼0.24 for Malawi). Fertilizer and precipitation were the most important factors explaining maize yield variability. Spatial maps showed that maize yield in Malawi can increase with more fertilizer, but adequate rainfall is essential. In Northern Ghana, relying solely on fertilizer may not be enough to boost maize production.

Key policy highlights
Survey data on maize are of limited use for making accurate yield predictions.
Fertilizer use can increase maize yield in both Northern Ghana and Malawi.
Fertilizer use intervention strategies should be region-specific.
The efficiency of fertilizer use depends on adequate rainfall.


Introduction
Increased agricultural productivity is critical for Sub-Saharan Africa's (SSA) economic growth, poverty alleviation, and improved nutrition for the region's growing population. Maize (Zea mays L.) is the second most cultivated crop and a staple among SSA families, and it is primarily grown by small-scale farmers (Oluoch et al. 2022). Maize yield variance in SSA is influenced by agronomic, biophysical, and socio-economic factors such as variety type, soil fertility, fertilizer application, intercropping, crop rotation, irrigation, farm labour allocation, minimum tillage, input costs, and climatic shifts, among others (Danquah et al. 2020). However, evidence on the effect of these factors at the field level is lacking in most SSA countries because maize yield data are typically aggregated to larger administrative units, which averages out salient features of spatial and temporal variability in yield (Vergopolan et al. 2021). For example, even on farms with similar environmental conditions, a farmer's choice of maize cultivar, fertilizer, or pesticides can result in inter-farm yield variability (Muthoni 2021). Therefore, characterizing the drivers of crop production at the farm level is crucial for enabling evidence-based scaling out of sustainable agronomic methods that boost maize productivity.
Globally, machine learning algorithms such as Random Forest (RF) have proven effective at predicting crop yield and characterizing its drivers because they can handle large amounts of data and decode complex non-linear relationships between the response variable and the predictor variables (Delerce et al. 2016). For example, Lohitha Reddy and Siva Kumar (2023) employed three different machine learning techniques (decision tree classifier, random forest classifier, and gradient boosting) to forecast crop yields using weather and soil properties as predictor variables. Their study revealed that the random forest classifier outperformed the other algorithms in accurately predicting yield. Cai et al. (2019) found that machine learning methods outperformed Ordinary Least Squares regression when predicting wheat yield in Australia and also reported that combining climatic and vegetation index data improved the prediction of wheat yield. Other studies have also used RF to predict crop yield with high precision: Charoen-Ung and Mittrapiyanuruk (2019) predicted sugarcane yield using RF and forward feature selection, Jeong et al. (2016) forecasted the yields of maize, wheat, and potato tubers, Everingham et al. (2016) predicted sugarcane yield in Tully, Australia, and Ahmad et al. (2018) predicted maize yield in Pakistan.
It is commonly assumed that machine learning methods like RF are immune to overfitting. However, including skewed training samples and irrelevant or redundant predictor variables can cause substantial overfitting, which becomes apparent when the model is extrapolated beyond the areas where it was trained (Meyer et al. 2018; Meyer et al. 2019; Meyer and Pebesma 2021). Furthermore, most data in nature are geographically dependent. Ignoring spatial dependencies in machine learning models might result in models that perform well on training data but fall short on spatial predictions (Meyer et al. 2019). As a result, applying feature selection approaches that incorporate target-oriented cross-validation (CV) processes, such as Leave-Location-Out (LLO), is critical for improving the model's performance beyond the training area and preventing spatial overfitting (Meyer et al. 2019).
In this study, we used panel household survey data on maize yield and agronomic practices from Ghana and Malawi to 1) identify the target-oriented feature selection and cross-validation strategies that improve the performance of the RF model for predicting maize yield; 2) identify the most important sustainable agriculture intensification practices and socio-economic factors that explain variance in maize yield; and 3) predict the spatial distribution of maize yield under different management practices. The results of this research will provide information on where to scale out specific bundles of sustainable agriculture intensification (SAI) technologies with a low probability of failure.

Study area
The study area covers two countries in SSA, Ghana and Malawi (Figure 1). Maize is a crucial crop in both countries, and its growth is heavily dependent on rainfall. Around 90% of smallholder farmers in Ghana and 97% in Malawi rely on maize farming as their primary source of income (Msowoya et al. 2016; Scheiterle and Birner 2018; White 2019). In Ghana, approximately 85% of maize production is consumed by humans, providing, together with other cereals such as rice and wheat, about 30% of calorie intake, while the remaining 15% is used for animal feed to supplement poultry and livestock production (Andam et al. 2017; Adu et al. 2021). In Malawi, maize makes up more than half of the total calorie intake, with the central region having the largest harvested area, followed by the southern region (Warnatzsch et al. 2020). Soil infertility and inadequate use of improved cultivars are the two major obstacles to maize productivity in Ghana (Marfo-Ahenkora 2020), while in Malawi, total family income and off-farm employment are the major determinants of maize productivity (Tamene et al. 2016). Climate variability has further constrained maize productivity, resulting in malnutrition, poor human development, and a higher poverty index among small-scale farmers who rely on maize production for a living (Shi and Tao 2014; Parkes et al. 2018; Ngcamu and Chari 2020). As a result, determining the best agronomic strategies for increasing maize yield at the farm level will allow these countries to make data-driven decisions to increase yield.
The Palmer Drought Severity Index (PDSI) is a reliable measure used to assess the level of dryness or wetness in comparison to a historical average for a specific time period. In our study, we used the PDSI (Abatzoglou et al. 2018) to evaluate the weather conditions for the two regions and seasons. We observed that the 2018-2019 growing season in Malawi was very wet, while rainfall in all other seasons of both regions was below the normal range, with extreme drought in Ghana during the 2019 season (Figure 2).

Agronomic data
A panel household survey was conducted in Ghana and Malawi in 2013 and 2019 under the Africa RISING program (https://africa-rising.net/; Tinonin et al. 2016). During the two surveys, respondents provided information on household demographics and production practices (Table 1).

Remote sensing variables
The gridded earth observation data were extracted using the Google Earth Engine (GEE) cloud computing platform (Gorelick et al. 2017). These variables include vegetation indices and meteorological, topographic, socio-economic, hydrological, and soil variables (Table 2). Vegetation indices and meteorological data were generated for each month during the respective country's maize growing season.

Model training and evaluation
Table 1. Variables used in the models. The text in brackets indicates how the variables are referred to in the figures.
Columns: Class, Continuous variables, Categorical variables.
Demographics: Household size (Hhsize), age of the household head (Headage), the number of education years for the household head (Headedu), maximum years of adult education (Edumax), average years of adult education (Meanedu) and plots fully managed by females (FempltsF).
Eliminating irrelevant and redundant predictor variables in machine learning models is important because their inclusion can reduce the model's performance. While many feature elimination techniques are available, we used the VSURF feature elimination method, implemented in the VSURF package (Genuer et al. 2022) for the R programming language (R Core Team 2020), to eliminate irrelevant or redundant variables. VSURF eliminates features in three steps: thresholding, interpretation, and prediction. The first step eliminates irrelevant variables from the dataset. The second step selects all variables related to the response for interpretation purposes. The third step refines the selection by eliminating redundancy in the set of variables selected by the second step, for prediction purposes. We focused on variables that were retained at the thresholding step, which retains or drops variables based on how important they are in explaining the response variable. Because most continuous household survey variables lacked corresponding raster data, we developed two sets of models: one that included all household survey data and one that included only the categorical household survey data, the latter to enable spatial predictions under various agronomic scenarios. To elaborate, the categorical household survey variables could be converted into dummy variables, which could then be combined with the gridded raster data and toggled on (1) or off (0) to visualize the impact of using or not using the respective agronomic practice. Appending a dummy variable to the gridded data creates a gridded layer of zeros (the agronomic practice is not used in any pixel), and this layer can be turned on by replacing all values with 1, indicating that the practice is used in every pixel. The VSURF elimination method was applied to the two versions of the dataset (all predictors and only categorical household survey data) independently.
We used the 'ranger' method, as implemented by the train function in the caret R package (Kuhn 2022), to train the maize yield models and used the permutation method to rank variable importance. Before training the model, we used the CAST package to generate training and testing folds for Leave-Location-Out (LLO) cross-validation. The LLO methodology internally splits the data into training and testing sets, and thus we did not withhold any data for independent testing of the models. We then optimized the model by calculating the best mtry for each dataset separately. The Root Mean Square Error (RMSE) and R-squared (R²) values were used to assess the model's performance, where higher R² and lower RMSE values indicate better model performance. We employed the varImp function in the caret package to determine and rank the importance of the variables. To gain insights into the relationship between maize yield and the predictors, we generated partial dependence plots for the top six predictors using the pdp R package (Greenwell 2022). These plots provide a visual representation of the direction of the relationship between the response and the predictor variable.
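The idea behind Leave-Location-Out cross-validation can be illustrated with a minimal sketch. The analysis itself was carried out in R with the CAST and caret packages; the Python code below is only an illustration of the principle, and the function names, the toy data, and the mean-yield baseline standing in for the random forest are all assumptions made for this example. R² is computed here as 1 − SS_res/SS_tot, which may differ from the definition used by the software in the study.

```python
import math
import random
from collections import defaultdict


def leave_location_out_folds(locations, k, seed=0):
    """Yield (train_idx, test_idx) pairs in which whole locations are held out.

    `locations` holds one location label per observation, so all observations
    from the same location always fall on the same side of the split.
    """
    by_location = defaultdict(list)
    for idx, loc in enumerate(locations):
        by_location[loc].append(idx)
    unique = sorted(by_location)
    if k > len(unique):
        raise ValueError("k cannot exceed the number of distinct locations")
    random.Random(seed).shuffle(unique)
    for fold in range(k):
        held_out = set(unique[fold::k])
        test_idx = [i for loc in held_out for i in by_location[loc]]
        train_idx = [i for loc in unique if loc not in held_out
                     for i in by_location[loc]]
        yield sorted(train_idx), sorted(test_idx)


def rmse(observed, predicted):
    """Root mean square error."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy data: two households per location, yields in kg/ha.
locations = ["A", "A", "B", "B", "C", "C", "D", "D"]
yields = [1200.0, 1500.0, 900.0, 1100.0, 2000.0, 1800.0, 700.0, 950.0]

# Pool out-of-location predictions from a mean-yield baseline (a stand-in
# for the fitted model) and score them once over all observations.
pooled = [0.0] * len(yields)
for train_idx, test_idx in leave_location_out_folds(locations, k=2):
    baseline = sum(yields[i] for i in train_idx) / len(train_idx)
    for i in test_idx:
        pooled[i] = baseline

print("LLO RMSE:", rmse(yields, pooled))
print("LLO R2:", r_squared(yields, pooled))
```

Because every observation is predicted by a model that never saw its location, the pooled scores play the role of the independent test set that a random hold-out would otherwise provide.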

Descriptive analysis
We investigated the distribution of maize yield for each season and country at different rainfall clusters using box plots (Figure 3). To label the datasets, we refer to the Ghana data as D1 and D2 for the 2013 and 2019 surveys, respectively, and the Malawi data as D3 and D4 for the 2013 and 2019 surveys, respectively.
To better understand the distribution of maize yield data based on all the predictor variables, we created histograms (Appendix 1) for various continuous variables for each country and season individually.The total number of predictor variables for D1 and D3 was 43, while D2 had 56 predictors and D4 had 57 predictors.

Feature elimination
The count of predictor variables that remained after elimination is presented in Table 3. Additional information on the actual names of the predictors that were retained can be found in the supplementary information (SS1).

Model performances, variable importance, partial dependence plots and spatial predictions
The models explained only a small share of the variability in maize yield, with low R² values across all seasons for each country (Table 4). When continuous household data were used, the explained variability was greater (11-15% in Northern Ghana and 24-35% in Malawi) than when only categorical data were used (6-10% in Northern Ghana and 7-14% in Malawi). This implies that the quantity of measurable agronomic practices used explains yield variability better than whether or not that agronomic practice is used. For both countries and seasons, the RMSE values obtained were consistently high, with normalized RMSE values (nRMSE; RMSE/mean yield) exceeding 50%. These values suggest that the model predictions were either overestimating or underestimating the actual yield by a significant margin, often by as much as twice or half the true value. The results underscore the necessity of refining the predictive model to enhance its accuracy and practical applicability.
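As a worked illustration of the normalized RMSE defined above (RMSE divided by the mean observed yield), the following minimal Python sketch computes the metric for made-up yields. The function name and the numbers are assumptions for this example only and are not values from Table 4.

```python
import math


def normalized_rmse(observed, predicted):
    """RMSE divided by the mean observed value (nRMSE, as a fraction)."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(squared) / len(squared))
    mean_observed = sum(observed) / len(observed)
    return rmse / mean_observed


# Hypothetical yields in kg/ha: the model predicts the same value everywhere.
observed = [1000.0, 2000.0, 3000.0]
predicted = [2000.0, 2000.0, 2000.0]

print(f"nRMSE = {normalized_rmse(observed, predicted):.1%}")
```

An nRMSE above 50%, as reported for these models, means that the typical prediction error is more than half of the average yield.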
Previous studies have reported the usefulness of fertilizer application in increasing maize yield in Northern Ghana (Braimoh and Vlek 2006; Kanton et al. 2016; Buah et al. 2017). Our analysis identified the amount of fertilizer used per hectare and total income (Figure 4a) as the most significant factors that positively (Figure 4c) influenced maize yield in Ghana in 2013. Interestingly, when considering only the categorical version of the agronomic practices, the importance of these two variables was relatively low (Figure 4b), indicating that the quantity used mattered more than simply their presence or absence. Both datasets showed that the total amount of rainfall in October, which marks the end of the season, had a positive influence on maize yield (Figure 4c and d). Agronomic practices were found to be poorly correlated with maize yield variability in 2019 (Figure 5a and b). Instead, the most significant factors influencing yield were August precipitation, which had a positive effect, and temperature, which had a negative impact (Figure 5c and d). The observed dynamics may be attributed to the exceptionally dry season (Figure 2), which resulted in reduced soil moisture and likely exacerbated the effects of temperature on yield.
The spatial prediction maps indicated that introducing fertilizer as an agronomic practice resulted in minimal improvements in maize yield in both seasons (Figure 6a and b). The yield gain observed was relatively low (<50 kg/ha), but parts of the Upper West and Northern regions had a higher yield gain of more than 50 kg/ha (Figure 6c). The limited yield gain observed may be attributed to two key factors: first, the relatively dry conditions during the two seasons (Figure 2); and second, the low ranking of fertilizer use (yes/no) as a predictor of yield.
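The scenario comparison behind these maps (prediction with the fertilizer dummy set to 1 for every pixel, minus prediction with it set to 0) can be sketched as follows. This is a minimal Python illustration under stated assumptions: `toy_model` is a made-up stand-in for the fitted random forest, and the pixel attributes and coefficients are hypothetical, chosen only so that the fertilizer response grows with rainfall.

```python
def predict_scenario(model, pixels, practice, value):
    """Predict yield for every pixel with one practice dummy forced to `value`."""
    return [model({**pixel, practice: value}) for pixel in pixels]


def yield_gain(model, pixels, practice):
    """Per-pixel difference: prediction with the practice minus without it."""
    without = predict_scenario(model, pixels, practice, 0)
    with_practice = predict_scenario(model, pixels, practice, 1)
    return [w - wo for w, wo in zip(with_practice, without)]


def toy_model(pixel):
    """Hypothetical stand-in for a fitted model (not the study's model)."""
    base = 800.0 + 1.0 * pixel["rain_mm"]
    # Assumed interaction: fertilizer pays off more where rainfall is higher.
    return base + pixel["fertilizer"] * 0.5 * pixel["rain_mm"]


# Two hypothetical pixels: one dry, one wet. The dummy layer starts at 0.
pixels = [
    {"rain_mm": 60.0, "fertilizer": 0},
    {"rain_mm": 400.0, "fertilizer": 0},
]

gains = yield_gain(toy_model, pixels, "fertilizer")
print(gains)
```

Because only the dummy layer changes between the two predictions, any difference in the output is attributable to the practice as the model represents it, including its interactions with the gridded covariates.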
While recent studies have found a limited yield response to fertilizer use in Malawi (Burke et al. 2022; De Weerdt and Duchoslav 2022), several other studies have demonstrated that applying fertilizer and improving access to it can significantly boost maize productivity (Sauer and Tchale 2009; Wang et al. 2019; Burke and Jayne 2021; Cairns et al. 2021; Cassim and Pemba 2022). According to the 2013 season analysis, fertilizer usage per hectare and the extent of land devoted to maize cultivation were the primary factors accounting for yield variability (Figure 7a), with the former exerting a positive effect and the latter a negative effect (Figure 7c). Although soil moisture was identified as the most critical variable affecting maize yield when categorical agronomic practices were employed for predictions (Figure 7b), the observation that yield declined with increasing soil moisture (Figure 7d) during a relatively dry season (Figure 2) is perplexing. Fertilizer use (both quantity and yes/no) was an important factor in 2019 (Figure 8a and b), with a positive effect on maize yield (Figure 8c and d). Total household income and labor were significant factors as continuous variables this season, but labor was less important in the categorical analysis (Figure 8a and b). Livestock density had the most significant positive impact in the categorical analysis, likely due to manure use and its positive effect on maize yield (Wang et al. 2019).
Spatial predictions based on agronomic models demonstrated that the introduction of fertilizer in the 2019 growing season resulted in a significantly greater increase in maize yield as compared to the 2013 season (Figure 9). This outcome may be attributed to the favorable soil moisture conditions in 2019 (Figure 2), which allowed for enhanced fertilizer uptake by crops and ultimately contributed to improved yield. Even so, it is important to note that the average maize yield was higher in 2013 as compared to 2019. Two possible reasons could explain this phenomenon: Firstly, the prevalence of extreme floods and soil erosion in Malawi (McCarthy et al. 2021) may have reduced crop yield, particularly given the excessively wet weather in 2019. Secondly, excessively moist environments can increase the incidence of corn ear infections (Wang et al. 2019), thereby leading to a decline in yield.

Discussion
This study examined the factors that affect maize yield in different regions and periods in northern Ghana and Malawi. To do this, we looked at various biophysical, socio-economic and farm management variables as potential predictors and used a random forest machine learning algorithm with spatial blocking cross-validation. Despite efforts to develop accurate models, the performance was suboptimal, with explained variability ranging from 6 to 15% in Ghana and from 7 to 35% in Malawi over the course of two seasons (Table 4). While it is true that spatial blocking cross-validation can lead to reduced R² values (Meyer et al. 2018; Meyer et al. 2019; Meyer and Pebesma 2021), other factors may also have contributed to the underperformance of the models. For example, farmers reported yields from differing numbers of plots that were spatially displaced. These imprecise locations of farmer plots could have introduced errors when matching with remote sensing variables (Burke and Lobell 2017; Lobell et al. 2020). This can be resolved by aggregating the yield data into larger administrative zones, although the practice can mask details in heterogeneous farms. Alternatively, we recommend that household surveys should endeavour to precisely map the plot boundaries to enable matching with satellite data. Also, the maize yield data were based on self-reported estimates, and numerous studies have shown that self-reported estimates are frequently inaccurate when compared to farm-level estimates derived from actual harvest measurements (Jin et al. 2017; Scheiterle et al. 2019; Burke et al. 2020; Li et al. 2022).
The low spatial resolution of the predictor variables used in the models, which were resampled from about 4 km to 0.03 km, could also be a contributing factor to the poor performance of the models. Generating reliable satellite-based productivity estimates for smallholder farms in sub-Saharan Africa, which are typically characterized by small land size and intercropping, is unlikely when using low spatial resolution data due to the presence of mixed crops within a single pixel (Jin et al. 2017; Li et al. 2022). Studies have demonstrated that utilizing higher spatial resolution satellite data, such as those provided by the Sentinel-2 mission (10 m) and PlanetScope (3 m), has resulted in improved model performance (R² > 0.5; Li et al. 2022). However, the utility of such high-resolution data is limited by frequent cloud cover, and processing it requires significant computational resources, particularly when analyzing vast areas. Furthermore, the choice of satellite-based predictor variables used in this study may have been insufficient in explaining the variations in maize yield. According to Jin et al. (2017) and Burke and Lobell (2017), the Green Chlorophyll Vegetation Index (GCVI) is more effective at predicting maize yield than other vegetation indices, likely due to its ability to capture nutrient deficiency, which is highly correlated with yield. In addition, factors such as Leaf Area Index (LAI), radiation, and sowing period have been identified as good predictors of maize yield in several studies (Srivastava et al. 2017; Lambert et al. 2018; Danquah et al. 2020; Li et al. 2022).
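For reference, GCVI is commonly computed from near-infrared and green surface reflectance as NIR/Green − 1. The short Python sketch below shows the calculation for a single pixel; the function name and the reflectance values are assumptions for illustration, and GCVI was not among the predictors used in this study.

```python
def gcvi(nir, green):
    """Green Chlorophyll Vegetation Index: NIR / Green - 1."""
    if green == 0:
        raise ValueError("green reflectance must be non-zero")
    return nir / green - 1.0


# Hypothetical surface reflectance values for one pixel.
print(gcvi(nir=0.5, green=0.25))
```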
Maize farming in the sub-Saharan Africa region heavily relies on adequate rainfall, which may explain why precipitation and soil moisture emerged as significant factors in explaining the variability of maize yield. Both Malawi and Ghana have made significant investments in fertilizer subsidy programs as part of their efforts to increase maize productivity (Mapila et al. 2012; Fearon et al. 2015; Ragasa and Chapoto 2017; Scheiterle and Birner 2018; Andani et al. 2020; Cassim and Pemba 2022; De Weerdt and Duchoslav 2022). The usefulness of fertilizer subsidy programs is debated, with some studies reporting a low yield response (Benin et al. 2013; Fearon et al. 2015; Andani et al. 2020; Burke et al. 2022), while others suggest that these programs have led to improved maize productivity by making fertilizers more accessible and increasing their usage (Braimoh and Vlek 2006; Chibwana et al. 2014; Kanton et al. 2016; Buah et al. 2017; Wang et al. 2019). Our results suggest that the application of fertilizer can significantly enhance maize production in both Malawi and Ghana during seasons with adequate soil moisture. This could be attributed to the fact that both countries face challenges of low soil fertility caused by a combination of factors such as low nutrient levels, continuous cropping, overgrazing, deforestation, and poor soil and water management practices (Tittonell and Giller 2013; Vuntade et al. 2022).
In terms of yield gain/loss, Malawi saw the highest increase in maize yield (Figure 9) when fertilizers were used, while Ghana experienced a much smaller increase (Figure 6). The high yield gain in Malawi is consistent with several studies that have linked the use of fertilizer, urea, and manure to high maize yield (Snapp et al. 2014; Tamene et al. 2016; Liu and Basso 2017; Wang et al. 2019), as well as intercropping, which helps replenish soil fertility (Akinnifesi et al. 2006; Silberg et al. 2017). The difference in yield gain between Ghana and Malawi in 2019 may be attributed to Ghana's comparatively dry season and Malawi's comparatively wet season, which likely explains why Ghana's yield increase was low (<50 kg/ha) while Malawi's was high (>400 kg/ha). Another possible reason why Ghana had a lower yield increase could be the limited access to modern agricultural practices, such as mechanization and the use of improved seed varieties, which continue to constrain productivity (Ragasa and Chapoto 2017).
Outbreaks of parasitic weeds like Striga (Scheiterle et al. 2019; Adu et al. 2022; Martey et al. 2022) and pests like fall armyworm (Agboyi et al. 2020; Nagoshi et al. 2021; Yeboah et al. 2021) in maize farms, together with the increased cost of pesticides that hinders their control, could also be a contributing factor to why fertilizer use does not necessarily result in increased yields. Although hand-picking of the Striga weed is a commonly used method, it is not sustainable in the long term (Kabambe et al. 2008; Wang et al. 2019). Therefore, an integrated approach that incorporates different control methods is necessary to effectively manage the weed. Push-Pull technology, which involves planting Desmodium and Brachiaria grass, has been shown to effectively reduce Striga infestation and ultimately increase maize yield, offering a sustainable and integrated approach to weed control (Niassy et al. 2022). To potentially enhance maize productivity, measures such as timely fertilizer application, adjusting planting dates to accommodate climate variability (Fosu-Mensah et al. 2019; Warnatzsch and Reay 2020), educating farmers on the appropriate fertilizer amounts (Addai and Owusu 2014; Asante et al. 2019; Wang et al. 2019; Andani et al. 2020; Cairns et al. 2021; Setsoafia et al. 2022), and promoting the adoption of improved seed varieties may also be beneficial.
Despite the poor performance of the models in this study, we have identified important variables that are consistent with existing knowledge and previous studies on maize yield. To enhance the model performance, we recommend the following: 1) include satellite-based factors like GCVI and LAI, which have shown better performance in predicting yield; 2) integrate a crop classification map that distinguishes maize and non-maize fields; 3) refine yield data using simple thresholds and generate categorical predictive maps rather than actual yield; and 4) explore simple regression models that directly correlate yield data with vegetation indices, as these have been found to better explain variations in maize yield in sub-Saharan African countries (Jin et al. 2017; Li et al. 2022).

Conclusion
The identification of maize yield determinants through the use of household survey data and low spatial resolution satellite-based estimates of the environment produced models with limited predictive performance. Nonetheless, the significant variables identified align with existing knowledge of the factors that affect maize yield variability both at the farm and larger administrative levels. The findings of this study suggest that promoting the use of fertilizers is a viable option for improving maize yield in Ghana and Malawi. Additionally, since precipitation plays a crucial role in determining yield, it is recommended that measures such as rainwater harvesting be promoted to help cushion against the impact of extreme dry seasons.

Figure 1. Map of the study area showing the location of the survey households and zones with relatively similar rainfall patterns. The rainfall zones were generated from long-term (2014-2020) aggregation of annual TerraClimate satellite rainfall estimates.

Figure 2. The average Palmer Drought Severity Index (PDSI) values for the growing seasons of Malawi and Ghana in 2013-2014 and 2018-2019. The growing season for Ghana spans from April to October, while for Malawi, it takes place between October and April of the following year.

Figure 3. Boxplots showing the distribution of the maize yield data for Ghana and Malawi with (a) and without (b) outliers. The blue text is the number of households per cluster. The clusters are as described in Figure 1.

Figure 4. Variable importance and partial dependence plots for Ghana in 2013. (a) and (b) Variable importance plots with all predictors and only with categorical variables, respectively. (c) and (d) Partial dependence plots for the top six predictors with all predictors and only with categorical variables, respectively.

Figure 5. Variable importance and partial dependence plots for Ghana in 2019. (a) and (b) Variable importance plots with all predictors and only with categorical variables, respectively. (c) and (d) Partial dependence plots for the top six predictors with all predictors and only with categorical variables, respectively.

Figure 6. The spatial maize yield prediction and yield gain for Ghana in 2013 and 2019 when fertilizer use was incorporated as an agronomic practice. (a) Prediction when no agronomic practice was used. (b) Prediction with fertilizer use. (c) Yield gain/loss (Figure 6b minus Figure 6a).

Figure 7. Variable importance and partial dependence plots for Malawi in 2013. (a) and (b) Variable importance plots with all predictors and only with categorical variables, respectively. (c) and (d) Partial dependence plots for the top six predictors with all predictors and only with categorical variables, respectively.

Figure 8. Variable importance and partial dependence plots for Malawi in 2019. (a) and (b) Variable importance plots with all predictors and only with categorical variables, respectively. (c) and (d) Partial dependence plots for the top six predictors with all predictors and only with categorical variables, respectively.

Figure 9. The spatial maize yield prediction and yield gain for Malawi in 2013 and 2019 when fertilizer use was incorporated as an agronomic practice. (a) Prediction when no agronomic practice was used. (b) Prediction with fertilizer use. (c) Yield gain/loss (Figure 9b minus Figure 9a).

Table 2. Data used in the model. The text in brackets indicates how the variables are referred to in the figures.
Vegetation indices: Vegetation indices are essential in agriculture as they assess plant health and vigor. They are sensitive to changes in growth and development, which directly affect crop yield. Among the vegetation indices, EVI is particularly significant in predicting maize yield (Zhang et al. 2019; Meng et al. 2021) because it can more accurately capture changes in vegetation in varying atmospheric conditions and soil backgrounds.
Soil moisture: Maize roots need a proper amount of soil moisture to absorb nutrients effectively. Low soil moisture levels can lead to drought stress, causing reduced yields. Similarly, high soil moisture levels can also be harmful, as they may lead to waterlogging and a decrease in oxygen availability to the roots, resulting in lower yields. Hence, maintaining an appropriate level of soil moisture is crucial for optimizing maize yield.
Topography, Elevation (DEM; Farr et al. 2007): Elevation has a significant impact on maize yield as it affects temperature and rainfall levels. Generally, higher elevations have cooler temperatures that can affect the rate of maize development and the length of the growing season. Moreover, higher elevations usually receive more rainfall, which is advantageous for maize growth and development, but excessive or irregular rainfall can cause waterlogging or drought stress and lower the maize yield.
Livestock density: Livestock can have both positive and negative impacts on maize yield. On the one hand, their manure can provide essential nutrients that have been linked to increased maize productivity. However, when livestock density is too high, it can result in competition for crop residue. In such cases, most of the residue is consumed by the livestock, leaving only a small amount of organic matter in the soil, which can lower soil nutrient levels and ultimately reduce maize yield.

Table 3. The number of retained predictor variables after the VSURF thresholding step.

Table 4. Model performance metrics when all predictors were used and when only the categorical household survey data were used.