Modelling agricultural drought: a review of latest advances in big data technologies

Abstract This article reviews the main recent applications of multi-sensor remote sensing and artificial intelligence techniques in multivariate modelling of agricultural drought. The study focused on three fundamental aspects: descriptive modelling, predictive modelling, and spatial modelling of expected risks and vulnerability to drought. Out of 417 articles on drought, 226 articles published from 2010 to 2022 that met the inclusion criteria were analyzed to provide a global overview of the current state of knowledge on multivariate drought modelling. The main objective is to review the recent scientific evidence on multivariate drought modelling based on the joint use of geospatial technologies and artificial intelligence. The analysis focused on the methods used, the choice of algorithms, and the most relevant variables, depending on whether the models are descriptive or predictive. Criteria such as the skill score, the complexity of the dataset used, and the nature of the validation data were considered to draw the main conclusions. The results highlight the very heterogeneous and very original nature of studies on multivariate modelling of agricultural drought in the recent literature. For future studies, in addition to scientific advances, case studies and comparative studies appear necessary for an in-depth analysis of the reproducibility and operational applicability of the approaches proposed for spatial and temporal modelling of agricultural drought.
HIGHLIGHTS
The components and fundamentals of multivariate modelling of agricultural drought were discussed.
The importance of hybrid artificial intelligence models in improving the performance of traditional machine learning models is widely discussed.
Quantum machine learning algorithms are weakly explored in multivariate drought modelling. Therefore, future studies should explore this approach.
The major challenge of multivariate modelling of drought frequency is mainly related to the difference in the return periods of the different variables (time-shifted and spatial effects).


Introduction
Drought is a stochastic, multifactorial climatic hazard and undoubtedly one of the most complex natural phenomena. It is a normal characteristic of the climate, and its recurrence is inevitable. Most often, it develops slowly and spreads over large areas (West et al. 2019; Zhong et al. 2019; Andersson et al. 2020). Its occurrence, intensity, severity, duration, and spatial extent are regulated by natural mechanisms that are not easily perceptible and that are difficult to control. Around the world, the impacts of drought on local populations and agro-ecosystems are often catastrophic and, in some areas, irreversible (Feller and Vaseva 2014; Feller 2016). In 2019, for example, according to the UNDRR (United Nations Office for Disaster Risk Reduction), 61.7 million people were affected by the direct and/or indirect effects of natural disasters, with 4733 deaths. Worse still, the IPCC (Intergovernmental Panel on Climate Change) predicts in its latest report, published in 2022, that about 2.5 million more people will be affected by droughts in the next 30 years, mostly in developing countries.
In the recent literature, drought is a climatic hazard whose definition varies across disciplines. The multifaceted nature of the concept stems from the highly variable dimensions of drought depending on the geographical area or the variables considered. Agricultural drought, the focus of this study, is a recurrent intra-seasonal phenomenon in several regions of the world. It is characterized by a very low level of soil moisture (Gao et al. 2021; Das et al. 2021) and involves the largest number of variables and processes (Wildemeersch et al. 2015; Chen et al. 2020; Jiménez-Donaire et al. 2020; Chandrasekara et al. 2021). It is therefore particularly difficult to quantify and predict accurately (Ghazaryan et al. 2020; Hara et al. 2021). Yet accurate and reliable information on the occurrence, extent, intensity, and potential impacts of drought is very useful in guiding the choice of strategies to mitigate its damaging effects (Kavitha et al. 2018; Jiménez-Donaire et al. 2020; Wang and Yu 2021; Mokhtar et al. 2021).
Unlike hydrological, meteorological, and socio-economic droughts, agricultural drought is particularly sensitive to the spatio-temporal variability of climatic variables during the rainfed crop season. However, the severity of its impacts is also influenced by several non-climatic factors, including the geographical context (topography, continentality), the agronomic context (crop growth status, genotypes), the soil (soil type, soil depth), and anthropogenic factors such as local crop management practices and the presence or absence of irrigation. For these reasons, agricultural drought has sparked considerable scientific effort over the past two decades (Yang et al. 2016; Han et al. 2019; Bezdan et al. 2019; Bayissa et al. 2019; Ghazaryan et al. 2020; Kulkarni et al. 2020; Bento et al. 2020; Das et al. 2021; Areffian et al. 2021). On this topic, based on 1117 original papers, Orimoloye et al. (2020) show that research on drought vulnerability, drought severity and water shortage (DDVS_WS) has increased sharply in recent years, with an annual growth of about 12.7%. Among the efforts devoted to the study of drought, the development of drought indices is undoubtedly the most important.
On a global scale, all approaches combined, an estimated 150 indices have been developed for drought assessment, classification, and monitoring (Svoboda and Fuchs 2016). They are most often grouped into two distinct families: single indices and multivariate composite indices. However, many of these indices were developed under specific geographical conditions, and their performance in other regions may therefore be limited. Indeed, the variability in the performance of drought indices by geographic region or ecosystem type has been demonstrated in several studies (Mishra and Singh 2010; Naumann et al. 2014; Zhang et al. 2017; Liu et al. 2018; Song et al. 2020; Li et al. 2020b; Hanadé et al. 2022). By comparing the consistency of several indices (SDCI, TVDI, SDI, VCI, TCI, PCI, VHI) over mainland China, Li et al. (2020b) noted that the applicability of many indices differs across regions. Similarly, through statistical evaluation, Liu et al. (2018) found substantial differences in the performance of drought indices (PDSI, SPI, SPEI) for agricultural drought in the North China Plain. The sensitivity of a region to drought is thus strongly influenced by its specific characteristics in terms of energy and water balances (Shahabfar et al. 2012). According to Cartwright et al. (2020), variability in drought sensitivity is influenced by topography, climate, soil characteristics and altitude. These characteristics, which vary strongly in time and space, directly influence the biophysical and climatic variables used to formulate drought indices. Therefore, in arid and hyperarid regions, the estimation of drought indicators is subject to considerable uncertainty due to limited rainfall and very low vegetation cover (Naumann et al. 2014). This implies that the indices, particularly those based on remote sensing data, appear more accurate in rain-fed regions.
In support of this point, in a study of the influence of spatial heterogeneity on index performance, Hanadé et al. (2022) showed that the performance of drought indices varies widely depending on whether the agrosystem is irrigated, rainfed, desert or mountainous. This is particularly true for drought indices based on NDVI (Normalized Difference Vegetation Index) or LST (Land Surface Temperature). Drought index concordance was higher in semi-arid agrosystems than in desert zones. Similarly, it has been reported for the same geographical region that drought indices perform differently depending on the season or the time window considered. This suggests that no multivariate index or model for drought monitoring or prediction can adequately address the multifactorial and multi-temporal stochastic dimension of drought under all climatic conditions. Thus, monitoring and assessment of drought conditions is now tending to become a local priority, on the hypothesis that local early warning systems or plot-scale monitoring of agricultural drought could improve strategies to mitigate drought-related impacts (Andersson et al. 2020; Jung et al. 2021; Das et al. 2021).
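To make the dependence of NDVI- and LST-based indices on local conditions concrete, the following minimal sketch computes the Vegetation Condition Index (VCI), the Temperature Condition Index (TCI), and their weighted combination, the Vegetation Health Index (VHI), for a single pixel using their commonly cited formulations. The function names, the sample values, and the equal weighting (alpha = 0.5) are assumptions chosen for illustration; they are not taken from the studies reviewed here.

```python
def vci(ndvi, ndvi_min, ndvi_max):
    """Vegetation Condition Index (0-100): NDVI scaled by its historical range."""
    return 100.0 * (ndvi - ndvi_min) / (ndvi_max - ndvi_min)


def tci(lst, lst_min, lst_max):
    """Temperature Condition Index (0-100): high when the surface is cool
    relative to its historical range."""
    return 100.0 * (lst_max - lst) / (lst_max - lst_min)


def vhi(vci_value, tci_value, alpha=0.5):
    """Vegetation Health Index: weighted mean of VCI and TCI.
    alpha = 0.5 is an illustrative default, not a recommended value."""
    return alpha * vci_value + (1.0 - alpha) * tci_value


# Hypothetical history for one pixel and one calendar period (made-up values).
ndvi_history = [0.20, 0.35, 0.50, 0.60]
lst_history = [295.0, 300.0, 305.0, 315.0]  # kelvin
current_ndvi, current_lst = 0.30, 310.0

pixel_vci = vci(current_ndvi, min(ndvi_history), max(ndvi_history))
pixel_tci = tci(current_lst, min(lst_history), max(lst_history))
pixel_vhi = vhi(pixel_vci, pixel_tci)
print(round(pixel_vci, 1), round(pixel_tci, 1), round(pixel_vhi, 1))
```

Because both indices are scaled by the local historical range, a narrow NDVI range, as in sparsely vegetated areas, makes the VCI react strongly to small fluctuations, which is one plausible reason why such indices behave differently across agrosystems.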
Recent developments in geospatial technologies, artificial intelligence, and cloud computing platforms such as Google Earth Engine have undoubtedly accelerated drought studies. From this perspective, advanced machine learning methods, as data-driven methods, are becoming an indispensable alternative to conventional methods based on statistical analysis and/or domain expertise (Sutanto et al. 2020; Mokhtar et al. 2021; Jung et al. 2021). Thanks to this technological development, the full and derived spectrum can now be used for the detection and multivariate classification of drought (Dao et al. 2021).
Indeed, drought monitoring and predictive modelling are often subject to inaccuracies related to the spatio-temporal complexity of the cause-and-effect relationships among several variables and processes. Therefore, descriptive or predictive mapping and drought risk analysis must rest on the basic principle of non-stationarity and non-collinearity of drought factors and variables in their temporal and spatial evolution. The multifactorial dimension of agricultural drought lies fundamentally in the complexity of the interactions between static and dynamic factors in time and space. For this reason, quantitative estimation of agricultural drought and spatial modelling of the associated risks are difficult tasks. Difficulty in quantifying the relative importance of covariates, which changes strongly in time and space, and inappropriate model choices for describing or predicting drought conditions are major sources of inaccuracy in spatial drought modelling (Rahmati et al. 2020). Similarly, other recent studies have shown that integrating certain variables into drought prediction models can have a contradictory effect, particularly for the last months of crop growth (Panis et al. 2019), which makes modelling the spatio-temporal patterns of drought even more complex. In this regard, regardless of the geographical region, conventional methods have proven ineffective in modelling the multivariate complexity of the drought phenomenon.
Nowadays, as in other fields, approaches based on multi-sensor remote sensing and artificial intelligence techniques such as random forest (RF), artificial neural networks (ANN), support vector regression (SVR), Cubist, or support vector machines (SVM) have revolutionized descriptive and predictive modelling of drought. These contemporary approaches jointly use various multi-sensor remote sensing products, auxiliary data, and climate model simulations to capture the spatio-temporal and multifactorial complexity of drought. Independently of machine learning techniques, multi-sensor remote sensing has proven to be an effective means of identifying, evaluating, and anticipating the management of extreme events (Richman and Leslie 2018; Jiao et al. 2021; Maity et al. 2021). Thus, in conjunction with technological and geospatial advances, significant efforts have been made in monitoring, quantifying, and/or predictively modelling drought parameters (Park et al. 2020; Roushangar et al. 2021; Zhang et al. 2021). This progress includes predictive modelling of several hydro-climatic and biophysical variables that are of interest to managers and decision-makers in drought management (Thober et al. 2015; Samadianfard et al. 2018; Park et al. 2019; Zhang et al. 2019). Similarly, the combination of several sensors and platforms makes it possible to analyse long periods of up to 40 years of information. The rise of multi-sensor remote sensing and the combined use of climate model data and artificial intelligence algorithms have significantly improved the ability to model various aspects of agricultural drought, including accurate estimation of drought parameters and quantification of expected impacts.
Furthermore, beyond the recent advances in multi-sensor remote sensing itself, advances in algorithm development and the growth of cloud-based computing and storage capacity have significantly improved the application potential of remote sensing and machine learning models for drought studies (Huntington et al. 2017; Zhou et al. 2019; Behera et al. 2021). Thus, over the past two decades, technological development and the explosion of massive multi-source data have made possible intelligent diagnoses of complex and multifactorial phenomena, including agricultural drought. The availability and accessibility of large amounts of multi-source satellite data and large-scale model outputs at all spatio-temporal scales have profoundly changed approaches to drought studies. In this new dynamic, the use of geospatial technologies, cloud computing, and artificial intelligence has been recognized as an essential alternative in the multivariate modelling of agricultural drought. Prediction models based on heterogeneous data and machine learning algorithms have proven very promising in capturing the multifactorial complexity of drought and its rapidly evolving characteristics. These new technologies have the advantage of being able to integrate and jointly manipulate information from different sources. They also require less development time and are less complex than dynamic or physical models (Deo and Şahin 2015).
Typically, machine learning techniques and multi-sensor remote sensing are used to cover several important aspects of agricultural drought. Multivariate drought modelling mainly covers modelling response variables such as moisture of the surface layer of the soil (Park et al. 2017), modelling expected impact functions such as the effect of water stress on grain quality, and drought resistance (Yuan and Tan 2010; Bachmair et al. 2017; Sutanto et al. 2019; Dao et al. 2021; Gupta et al. 2021). In addition, there are forecasts of the risks of agricultural yield losses due to drought (Mann et al. 2019; Feng et al. 2019a; Feng et al. 2020), risk mapping (Carrão et al. 2016; Rahmati et al. 2020; Hoque et al. 2021) or vulnerability to drought (Hoque et al. 2020; Saha et al. 2021; Joko Prasety et al. 2019; Li et al. 2021). Apart from these elements, descriptive drought modelling is particularly interested in the spatio-temporal quantification of drought parameters from composite models (Sun et al. 2017; Bayissa 2018; Shen et al. 2019; Shah and Mishra 2020; Elbeltagi et al. 2021), early detection and classification of crop stress (Moshou et al. 2014; Zhuang et al. 2017; Schmitter et al. 2017; An et al. 2019b; Samantaray et al. 2019; Su et al. 2020; Ramos-Giraldo et al. 2020). Predictive modelling also provides anticipated information on the occurrence and severity of future dynamics of the phenomenon (Rhee and Im 2017; Khosravi et al. 2015), trajectory (Diaz et al. 2018), and general trend in drought conditions (Son et al. 2021).
These advances are particularly interesting for improving the climate resilience of agrosystems in a context where the uncertainty of expected rainfall has become more pronounced due to climate change. In almost all agricultural countries, even the most developed, the resilience of the agricultural sector to climate hazards is likely to depend on the operational deployment of new geospatial technologies and artificial intelligence. Real-time plot-level monitoring of meteorological and agricultural parameters using the Internet of Things (IoT) is already operational in several countries. In the near future, thanks to new space missions and the development of new multi-sensor data fusion algorithms (multi-sensor data aggregation), better predictability of drought risk and real-time monitoring of water and heat stress conditions at fine and large-scale resolution are expected.
In the current state of knowledge, the prediction lead time usually varies from a few hours to a few months depending on the model, the quality of the data, and their velocity. However, a distinction should be made between predictive modelling of drought occurrence (frequency forecasting) and spatial modelling of severe drought areas (priority vulnerable areas) or multivariate mapping of associated risks. The last two do not forecast occurrence. Rather, in preparation for drought mitigation plans, they provide information on the future development and evolution of agricultural drought by assuming that a continuous rainfall deficit will prevail at the local scale (Park et al. 2019). In addition, machine learning techniques have led to a better understanding of the lagged effects of droughts on photosynthesis on a global scale and a better estimation of the asynchronous dynamics of drought-related variables, including the delayed response of NDVI to precipitation and LST (Liu et al. 2020). The other lever for the use of machine learning techniques in agricultural drought modelling concerns quantitative and qualitative modelling of drought-related variables, including evapotranspiration and moisture (Ahmad et al. 2010; Phillips et al. 2014; Chen et al. 2021) and the amount of water in plants (Yuan et al. 2019).
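The notion of asynchronous dynamics between drought-related variables can be illustrated with a simple lagged correlation analysis. The sketch below is only a minimal example under stated assumptions: the series are synthetic, the variable names are hypothetical, and plain Pearson correlation is used here in place of the machine learning models applied in the cited studies.

```python
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlations(driver, response, max_lag):
    """Correlation between driver[t] and response[t + lag], for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = driver[: len(driver) - lag]
        y = response[lag:]
        result[lag] = pearson(x, y)
    return result


# Synthetic monthly series (assumption): the vegetation signal follows
# precipitation with a delay of two time steps.
precip = [10, 40, 80, 30, 5, 0, 20, 60, 90, 35, 10, 5]
ndvi = [0.2, 0.2] + [0.2 + 0.005 * p for p in precip[:-2]]

corr = lagged_correlations(precip, ndvi, max_lag=4)
best_lag = max(corr, key=corr.get)
print(best_lag, round(corr[best_lag], 3))
```

In practice the lag with the highest correlation would be estimated per pixel or per region, and it may differ between seasons, which is consistent with the variable index performance discussed earlier.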
At present, the combination of massive data from multi-sensor remote sensing, climate model simulations, in situ (agrometeorological) observations, and auxiliary data (base maps) is increasingly popular in descriptive and predictive modelling of drought parameters and variables. The complementary use of several data sources has become the preferred practice in new approaches (West et al. 2019; Li et al. 2020a), although deep learning models (RNN, LSTM, CNN) may require little or no auxiliary data to achieve accurate modelling of drought conditions. Since a wide range of multi-sensor drought modelling approaches has been developed and tested in different parts of the world by combining Earth observation data with other data sources and using various machine learning techniques, it is appropriate to analyse the recent literature in order to describe new scientific trends.
Because knowledge is updated quickly in step with rapid scientific and technological advances, several systematic reviews have been conducted on drought (Mishra and Singh 2010, 2011; Hao and Singh 2015; AghaKouchak et al. 2015; Hao et al. 2018; Anshuka et al. 2019; West et al. 2019; Orimoloye et al. 2020; Xu et al. 2022; Singh et al. 2022; Petersen-Perlman et al. 2022; Dikshit et al. 2022). Based on global paleoclimatic studies, Mishra and Singh (2010) examined the fundamental concepts of drought modelling and the spatio-temporal relationships of climatic and hydrological parameters. The historical trajectories of major droughts and their socio-ecological impacts in different regions of the world were analyzed and discussed. The study demonstrates the multifaceted nature of the concept of drought and its very complex socioeconomic, hydrological, and environmental implications. Drought as a complex natural hazard was studied, as well as the comparative strengths, weaknesses, and performance of the main hydroclimatic and remote sensing indices. Building on that study, Mishra and Singh (2011) documented the advantages and disadvantages of different approaches to drought modelling and the relationships among the components of drought modelling. Various aspects of drought were discussed, namely drought parameters (severity, magnitude, intensity, duration, return period), probabilistic characterization, components of the forecast, and the various factors aggravating the consequences of drought.
Similarly, Hao and Singh (2015) carried out a systematic analysis of multivariate approaches to the development of composite drought indices. The advantages and disadvantages of water balance-based approaches, objective and subjective linear combinations of multi-source variables, and multivariate drought statistics approaches were analysed and discussed. The opportunities and limitations of approaches based exclusively on remote sensing data for drought monitoring from a climatological and ecosystem perspective have been documented by AghaKouchak et al. (2015). Other systematic reviews have looked at scientific advances in seasonal drought forecasting. To this end, Hao et al. (2018) listed and compared seasonal drought forecasting methods. In that review, the large-scale mechanisms and local factors of the seasonal predictability of drought, as captured by statistical, physical and hybrid methods, were discussed. The study highlights significant differences in the ability of statistical and physical methods to include in seasonal drought forecasts the complex atmosphere-ocean interactions, the influence of human activities, and the highly variable impacts of climate change. Addressing the topic of drought prediction, Anshuka et al. (2019) provided a detailed comparative analysis of statistical drought forecasting models based on the Standardized Precipitation Index (SPI). The influence of the SPI time scale on the predictive skill of the models was also evaluated, and the comparison of empirical models (meta-regression) with data-based ones supports the merit of data-driven models such as neural networks and wavelet-based neural networks. Regardless of drought type, drought prediction models have also been compared in terms of the performance and accuracy of statistical, stochastic, probabilistic, artificial intelligence and hybrid models.
Studies such as that of West et al. (2019) have reviewed the progress of new geospatial technology, including the rise of passive and active remote sensing in the monitoring and estimation of drought variables and the influence of spatial resolution on the quantification of the impacts of different types of droughts (meteorological, agricultural, and hydrological). Based on the critical circumstances of water supply shortages, Orimoloye et al. (2020) documented the impacts of socio-economic drought, including vulnerability to drought and its severity on a global scale. Recently, the climatic vulnerabilities of drought, including the characteristics that influence its propagation through the components of the water cycle, have also been examined. In the same way, Xu et al. (2022) specifically looked at the characteristics of hydrological drought development and recovery and the influence of human and topographical factors on this process in semi-arid areas. Along the same lines, after examining the interconnections between drought and groundwater, Petersen-Perlman et al. (2022) suggested that drought risk and water resources be managed jointly in terms of governance policies and technological innovations.
Moreover, regardless of the geographical context and the approach used, the study by Singh et al. (2022) provides valuable information on the accuracy and overall acceptability of the indices proposed in the literature for monitoring, evaluating, or predicting drought conditions. Similarly, in view of recent technological advances and the growing use of artificial intelligence models in modelling drought conditions, Dikshit et al. (2022) reviewed the use of artificial neural networks in drought prediction, without documenting the full scope of application of artificial intelligence in this field. A more general analysis of the application of machine learning and deep learning models to the monitoring and forecasting of different types of droughts has nevertheless also been carried out. The present systematic review continues this series of complementary reviews, which further enrich our knowledge of drought. Unlike them, however, it focuses specifically on multivariate modelling of agricultural drought, including the use of learning techniques and multi-source data. Its objective is to identify, synthesize and discuss recent studies on multivariate agricultural drought. Thus, the study should identify the most relevant variables in multivariate modelling of agricultural drought and the agricultural drought parameters that can be modelled. It will also examine and discuss the skill of the different models based on the criteria used for model validation and evaluation. The answers to these research questions should make it possible to identify prospects for better quantification and multivariate modelling of agricultural drought conditions.

Methodology
Regardless of the geographical area, the study focused on the different methods used, the algorithmic performances and the variables in the multivariate modelling of agricultural drought. To identify, analyse and discuss all available evidence related to the research question in an unbiased manner, the first step of the methodology was to develop and validate a detailed research protocol (appendix, supplementary material). This protocol made it possible to compile a database of 417 scientific articles on drought in general, of which 222 articles addressing the research question were analyzed in depth. These articles come from online electronic databases (Google Scholar, Scopus, ScienceDirect, IEEE Xplore). To conduct a comprehensive search of the existing literature related to the research question, the following keyword combinations were used, adapted to the specific syntax of each database: multivariate modelling including ML + multi-sensor remote sensing, multivariate modelling including DL + multi-sensor remote sensing + Big data, or multivariate modelling including DL + ML + multi-sensor remote sensing + Big data.
The primary bibliographic database was managed using the Zotero reference management software. In the selection of primary studies, only research articles published in indexed journals between 2010 and 2022 were considered, excluding traditional review papers, books, and conference proceedings. Subsequently, the secondary database was classified according to several criteria: year of publication, approach, and the skill scores of the models and the variables considered in the modelling of agricultural drought. The results of the different classifications were summarised in the form of tables, graphs, and diagrams (meta-analysis) and then analyzed and discussed to draw relevant conclusions.

Temporal analysis of publications related to the research question
A temporal analysis of the annual publications related to the research question was carried out to identify the number of publications per year in the field of multivariate modelling of agricultural drought by machine learning. The result of the analysis, presented in Figure 1, shows a sharp increase in scientific publications from 2018 onwards. In the first eight years (2010 to 2017), the number of annual publications on multivariate drought modelling combining multi-sensor remote sensing and artificial intelligence techniques was low. From 2018, however, a clear upward trend is observed (Figure 1).

Characteristics used in multivariate modelling of agricultural drought
Two sets of characteristics are fundamental and inseparable for the multivariate modelling of agricultural drought. The first category of characteristics, essential for spatial and temporal modelling of drought, relates to the different factors and/or variables of drought. Across all approaches, the analysis of the recent literature on multivariate modelling of agricultural drought identified at least 84 sets of parameters (different input data) that are commonly used in predictive modelling of agricultural drought risks, vulnerability modelling, or quantitative estimation of descriptive parameters of agricultural drought (Table 1). Given the large number of parameters found in the literature, we grouped them into 14 categories according to the different factors or variables of agricultural drought. Vegetation-related factors cover the greatest diversity (at least 14 documented), followed by factors related to soil moisture (8) and macro-climatic factors (8). Raw satellite images, RGB images, land cover, light use efficiency (NPP, GPP), expert opinion on soil or crop conditions, and composite indices (VSWI, VHI, TVDI, NMDI) are the least frequently used features in multivariate modelling of agricultural drought. In addition to these sets of factors (input variables) of machine learning models, other approaches consider characteristics related to drought parameters, temporal characteristics of the predictors (seasonality, trend, extreme peak), normality of variables, return periods of extreme peaks of different variables, and recovery time after drought (DRT). The incorporation of these characteristics has been shown to be essential in multivariate drought modelling.

Choice of the dependent or response variable
In a multivariate modelling approach to agricultural drought, the choice of the dependent variable is very important. The predicted or modelled response variable should allow for multivariate analysis of agricultural drought parameters, namely its intensity, severity, spatial extent, duration, and potential impacts. The dependent variable is the function that quantitatively represents the expected or observed drought conditions. However, there are no unanimously accepted criteria for choosing the response variable for drought modelling. This is linked to the multitemporal and geographic variability of drought sensitivity. For this reason, a credible analysis of the relationships between potential predictors and the target variable is required in a multivariate modelling approach to agricultural drought (Rahmati et al. 2020). However, the high spatio-temporal variability of the interrelationships among drought variables precludes any systematic analysis of the phenomenon and lends credibility to approaches that are contextual and dynamic in time and space. Therefore, the selection of relevant variables is one of the most important steps in multivariate drought analysis, since the criteria for selecting input variables and assigning their weights vary considerably between models.
In this regard, for example, it has been found that soil moisture and precipitation were the most important variables for assessing drought conditions at different time scales and that LST (Land Surface Temperature) has relatively high importance for short-term drought. This further supports the conclusions of Ke et al. (2016) that temperature-derived indicators were less important than vegetation greenness and soil moisture indicators, because temperature is subject to significant variations on the scale of a few days. However, the study by Park et al. (2016) found that Land Surface Temperature or evapotranspiration factors are of relatively higher importance for short-term meteorological droughts than vegetation-related factors (NDVI). It should therefore be noted that variable importance is only a relative approximation that depends strongly on the context, the reference variable and, to some extent, the machine learning model chosen. For example, Ke et al. (2016) found that vegetation growth indicators (NDVI, EVI, and SAVI) were more important in modelling evapotranspiration with the Random Forest algorithm than indicators of LST or soil moisture.
To date, there is still no universally valid criterion for the selection and weighting of the variables used in multivariate drought modelling. In the literature, the selection of variables and the assignment of weights in models are based on a prior comparison of potential predictors with a reference series. In addition, the Random Forest (RF) algorithm is the most widely used machine learning model for selecting variables and estimating their importance in multivariate drought models. After an initial selection, add, remove, and verify tests are often used to complement the relative importance of the variables obtained by RF. In this analysis, covering the period 2010 to 2022, we documented at least 16 categories of response variables that were modelled with multivariate approaches based on machine learning techniques and multi-source data. Given the complexity of the drought phenomenon and the plurality of approaches and machine learning algorithms used, various validation approaches are found in the literature. Drought inventory maps, crop yield anomalies (drought losses), soil moisture, and hydro-climatic indices (SPI, SPEI, river flows) are the most widely used references for the validation of multivariate models of agricultural drought.

Criteria for assessing the competencies of the models
The evaluation of the skill of predictive models is a fundamental requirement of multivariate modelling of agricultural drought. Several types of error and accuracy indicators are used to evaluate model performance. Here we present a general overview of the main evaluation criteria for machine learning models in multivariate agricultural drought modelling. The most widely used in the literature include the mean absolute percentage error (MAPE) (Hara et al. 2021), the root mean square error (RMSE) (Deo and Şahin 2015; Lees et al. 2020), the coefficient of determination (R²) (Park et al. 2019; Feng et al. 2019b), the mean absolute error (MAE), the root relative squared error (RRSE), and validation using the confusion matrix (Anshuka et al. 2021). A combination of these measures is often preferred for an overall assessment of the skill of forecasting models. Residual analysis is also widely used to evaluate the validity and performance of parametric and non-parametric regression models.
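As an illustration of the error measures listed above, the following minimal sketch computes MAE, RMSE, MAPE, RRSE and R² for paired observed and predicted values. It is not taken from any of the cited studies. The function name `regression_scores` and the sample values are hypothetical, and R² is computed here as 1 minus the ratio of residual to total sum of squares, whereas some studies report the squared Pearson correlation instead.

```python
import math


def regression_scores(observed, predicted):
    """Return MAE, RMSE, MAPE (%), RRSE and R^2 for paired samples.

    Assumptions (illustrative only): observations are non-zero (needed
    for MAPE) and not all identical (needed for RRSE and R^2).
    """
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")

    errors = [p - o for o, p in zip(observed, predicted)]
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                   # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)   # total sum of squares
    if ss_tot == 0:
        raise ValueError("observed values must not all be identical")

    return {
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(ss_res / n),
        # Percentage error relative to each observation.
        "mape": 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n,
        # Error relative to a predictor that always returns the observed mean.
        "rrse": math.sqrt(ss_res / ss_tot),
        "r2": 1.0 - ss_res / ss_tot,
    }


if __name__ == "__main__":
    # Hypothetical yields (t/ha): observed versus predicted.
    scores = regression_scores([2.0, 4.0, 6.0, 8.0], [2.5, 3.5, 6.5, 7.5])
    for name, value in scores.items():
        print(f"{name}: {value:.4f}")
```

Because MAPE divides by each observation, it is better suited to strictly positive targets such as yields than to standardized indices that cross zero, which is one reason why several measures are usually reported together.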
In other studies, practical significance tests such as the Bayesian hypothesis test (Meroni et al. 2021), the false alarm rate, the skill score (Pham et al. 2020), the Brier skill score (Sutanto et al. 2020), or the ROC curve (Feng et al. 2020; Arabameri et al. 2021; Raza et al. 2021; Saha et al. 2021; Seo and Kim 2021) are calculated to validate the predictive skill of ML algorithms. Recently, Ciemer et al. (2020) used the Heidke Skill Score (HSS) to evaluate the proposed drought forecasting scheme in South America. The HSS compares the performance of the model to that of a random prediction. In addition to these prediction error measures, cross-validation is often used to quantify the accuracy of prediction models (Houborg and McCabe 2018; Lee et al. 2020). Similarly, the true skill statistic (TSS), the Friedman and Wilcoxon ranking tests, the K-index (Saha et al. 2021), and the Willmott index of agreement (Deo and Şahin 2015) are also widely used in the evaluation of multivariate machine learning models. Note that the existing literature contains very little information comparing these different criteria for assessing the skill of machine learning models applied to multivariate modelling of agricultural drought.
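To make the comparison with a random prediction concrete, the sketch below computes the Heidke Skill Score from a 2 x 2 contingency table of drought and no-drought forecasts. It uses the standard two-category form of the score and is not the implementation of Ciemer et al. (2020). The function name and the counts are hypothetical.

```python
def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """Heidke Skill Score for a 2 x 2 contingency table.

    hits              : drought forecast and observed
    false_alarms      : drought forecast but not observed
    misses            : drought observed but not forecast
    correct_negatives : no drought forecast and none observed

    Returns 1 for a perfect forecast, 0 for a forecast no better than
    chance, and negative values for a forecast worse than chance.
    """
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    if denominator == 0:
        raise ValueError("score undefined: only one category is present")
    return 2.0 * (a * d - b * c) / denominator


if __name__ == "__main__":
    # Hypothetical verification counts for 100 forecast periods.
    print(round(heidke_skill_score(30, 10, 5, 55), 3))
```

The same value is obtained from the equivalent definition (PC - E) / (1 - E), where PC is the proportion of correct forecasts and E is the proportion expected to be correct by chance given the marginal totals.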

Best machine learning algorithm in multivariate modelling of agricultural drought
In this systematic review, we documented several ML models used in multivariate modelling of agricultural drought. However, in the current state of knowledge, there is still no indisputable evidence for the merit of one machine learning model over the others. The recent literature reports skill scores that are often quite variable for the same type of algorithm across contexts and case studies. In view of this diversity, a comparative analysis of the most popular models was established based on the skill scores found in the literature (Table 2), with emphasis on models based on Random Forest. Based on the skill scores of the different machine learning models listed in Table 2, we deduce that RF generally performs better than other algorithms such as RNN, SVR, Cubist, or SVM. Converging information from the recent literature supports this finding. The fundamental reason given is that the Random Forest algorithm is less sensitive to noise and effective in handling non-linear relationships between drought variables (Kirasich et al. 2018; Sutanto et al. 2019). In addition, RF reduces overfitting and achieves good regression performance by averaging several decision trees (Park et al. 2019). For these reasons, Feng et al. (2019b) suggest that the bias-corrected RF model is the most promising tool for monitoring agricultural drought. In a similar vein, Kuswanto and Naufal (2019) found that an approach based on the synthetic minority oversampling technique (SMOTE) contributes to the improvement of regression and classification models (CARET and Random Forest).
However, based on the skill scores presented in Table 2, the Cubist model remains very competitive with the RF model. In addition, other studies have found that hybrid models improve the performance of machine learning-based models (Kisi et al. 2019; Park et al. 2020; Danandeh Mehr et al. 2020; Ahmed et al. 2021; Danandeh Mehr et al. 2022; Ahmed et al. 2022). For instance, using a PDSI-based predictive modelling approach, Aghelpour et al. (2021a) found that the SVM-DA hybrid model improved SVM accuracy by up to 29%. Similarly, the bio-inspired GA and PSO optimization algorithms improved the prediction accuracy of the ANFIS model by 38.7% and 22.6%, respectively (Aghelpour et al. 2020), and the ICA optimizer is likely to boost the accuracy of the MLP by up to about 28.5% (Aghelpour and Varshavian 2021).

Discussions
Over the past decade, due to recent geospatial advances and data computing, studies on multivariate modelling of agricultural drought have increased significantly. This systematic literature review analyzes recent advances in multivariate modelling of agricultural drought including the use of artificial intelligence techniques. Agricultural drought modelling basically covers three important aspects: descriptive modelling of drought parameters (onset, intensity, and cessation), multivariate predictive modelling of drought intensity and multivariate risk mapping. These components of multivariate agricultural drought modelling are discussed as follows:

Multivariate descriptive modelling of agricultural drought
Multivariate descriptive drought modelling is an approach that combines several variables and attempts to reproduce quantitative and qualitative information essential for drought assessment and monitoring. Scientific efforts for this purpose focus on properly describing drought conditions without providing information on expected conditions. Over the past two decades, substantial progress has been made in multi-scale characterization of agricultural drought conditions. This progress concerns mainly the accuracy of drought parameters estimation, multi-source data aggregation capacity and operational monitoring of drought conditions. These advances in drought monitoring and assessment are closely linked to technological advances and the availability of multi-source data that have fostered the explosion of drought studies around the world.
Unlike traditional methods based on spatial interpolation of in-situ measurements or multivariate statistical modelling of drought, the joint use of multi-sensor remote sensing and machine learning models now offers the possibility of better understanding the spatio-temporal complexity of the drought phenomenon. Along with the explosion of multi-sensor and multispectral data, the use of machine learning algorithms now covers all aspects of multivariate drought modeling. Information on several biophysical variables that control vegetation growth and stress conditions can be generated at multiple temporal and spatial resolutions. The use of artificial intelligence techniques in the fusion of data from several satellites, in improving spatial resolution, or in estimating the relative contributions of drought-related covariates has given new direction to methods and approaches to descriptive modelling of agricultural drought. Both machine learning and deep learning models have been widely used to describe and/or simulate several aspects of agricultural drought. Similarly, the use of hybrid and reinforcement learning models is an emerging trend in the field of drought modelling. On the other hand, in the current state of knowledge, quantum machine learning models have not yet been tested in multivariate drought modeling (García et al. 2022). However, these new algorithms have shown the ability to simulate complex phenomena that evolve over time. In quantitative finance, in particular the predictive modeling of market uncertainties, quantum machine learning has attracted great interest. Given the complex, stochastic and highly evolving nature of drought, the applicability of such approaches should be tested. Analysis of the recent literature highlights several composite models developed in several regions of the world to capture the multidimensional and multifactorial nature of agricultural drought.
A large part of these studies combine multi-sensor remote sensing, artificial intelligence techniques and, to some extent, cloud computing platforms (Wu et al. …). Each of these studies takes a different approach in terms of the choice of response variable and of the algorithm used to assess the relative importance of each variable or index in the multivariate models developed. However, three types of approaches commonly used in multivariate drought modelling can be identified from the second generation of so-called composite indices. These include composite models developed based on local knowledge (expert opinion) (Bezdan et al. 2019; Kulkarni et al. 2020), composite models built on purely statistical approaches (Bayissa 2018; Chen et al. 2020; Wang and Yu 2021; Kim et al. 2021) and, very recently, models based on artificial intelligence and machine learning algorithms (Han et al. 2019; Liu et al. 2020; Son et al. 2021). Indeed, to overcome the problem of the non-linearity of drought variables, several non-binary composite models have recently been developed worldwide with the aim of converging towards objective assessments of drought conditions. From this perspective, for example, Hara et al. (2021) point out that the use of artificial neural networks (ANNs) overcomes the lack of linearity between crop yield and independent variables. Wang and Yu (2021) recently used two statistical methods, namely principal component analysis (PCA) and the analytic hierarchy process (AHP), to build comprehensive drought monitoring models in Yunnan Province, China. To this end, a comprehensive drought index (CDI) was developed from an objective combination of indices (RTCI, TCI, VCI) and then validated against the SPI and SPEI indices and GLDAS data. Depending on the month, the coefficient of determination (R²) between the CDI and the SPI ranges from 0.37 to 0.68 and that with the SPEI from 0.38 to 0.63, while the values reported with the TDVI range from -0.54 to -0.92.
However, the relationship with yield anomalies has not been established, which limits comparison with other models. Similarly, in the Aras and Karheh watersheds in western Iran, the Comprehensive Remote Sensing Drought Index (CRSDI) developed by Shojaei and Rahimzadegan (2022), which incorporates groundwater information from GRACE and GLDAS data, reaches a correlation of only 0.66 with the six-month SPI. In India, Kulkarni et al. (2020) developed the combined drought indicator for Marathwada (CDI-M) using two quantitative methods. The results of the study reveal that the PCA-based CDI-M is more reliable than the version based on expert judgement, with a correlation of 0.6 with cereal yields against 0.4 for the expert judgement-based CDI-M. In the same year, Chen et al. (2020) developed the Modified Composite Drought Index (MCDI) in Hubei Province (China), using information entropy to balance the relationship between variables and to assign variable weights objectively. However, the MCDI has been validated in a very generic way that does not allow an objective critical analysis of its performance relative to other models. Similarly, in the Markazi watershed, Iran, Hosseini-Moghari et al. (2019) introduced for the first time the Modified Total Storage Deficit Index for Drought Monitoring (MTSDI) for water-intensive basins. The MTSDI was developed based on the total signal residue of GRACE observations. It has been shown to be more effective than SPI, SPEI and TSDI in detecting moderate droughts. Such an approach is of major interest for future drought studies in most semi-arid watersheds.
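The idea of assigning indicator weights from information entropy can be illustrated with the generic entropy weight method. The sketch below is a textbook version under stated assumptions and is not the exact procedure of Chen et al. (2020), which is not detailed here. The function names, the assumption that indicators are already scaled to non-negative values in [0, 1], and the sample values are all hypothetical.

```python
import math


def entropy_weights(rows):
    """Entropy weight method for a table of indicator values.

    rows: list of samples, each a list of non-negative indicator values
          (for example condition indices already scaled to [0, 1]).
    Indicators whose values differ more between samples carry more
    information (lower entropy) and therefore receive larger weights.
    """
    m, n = len(rows), len(rows[0])
    if m < 2:
        raise ValueError("at least two samples are required")

    divergences = []
    for j in range(n):
        column = [row[j] for row in rows]
        total = sum(column)
        if min(column) < 0 or total <= 0:
            raise ValueError("indicator values must be non-negative, not all zero")
        entropy = 0.0
        for value in column:
            p = value / total
            if p > 0:                      # convention: 0 * log(0) = 0
                entropy -= p * math.log(p)
        entropy /= math.log(m)             # normalise entropy to [0, 1]
        divergences.append(max(0.0, 1.0 - entropy))

    total_divergence = sum(divergences)
    if total_divergence < 1e-12:           # no indicator discriminates: equal weights
        return [1.0 / n] * n
    return [d / total_divergence for d in divergences]


def composite_index(rows, weights):
    """Weighted linear combination of the indicators for each sample."""
    return [sum(w * x for w, x in zip(weights, row)) for row in rows]


if __name__ == "__main__":
    # Hypothetical samples with three scaled indicators each.
    samples = [[0.20, 0.50, 0.35], [0.40, 0.55, 0.30], [0.90, 0.45, 0.80]]
    weights = entropy_weights(samples)
    print([round(w, 3) for w in weights])
    print([round(v, 3) for v in composite_index(samples, weights)])
```

A purely data-driven weighting of this kind removes the subjectivity of expert judgement, but the weights then depend on the sample used to derive them, so they would need to be recomputed for each region and period.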
In many other studies, further approaches have been developed, often for specific applications. Recently, Saha et al. (2021) developed a state-of-the-art approach for real-time drought monitoring at a spatial resolution of 0.25°. In the upper Nile basin (Ethiopia), Bayissa et al. (2019) proposed the VegOut-UBN model at 250 m resolution. VegOut-UBN was developed with a regression tree approach using climate, biophysical and ocean variables. This approach has the advantage of including both real-time drought monitoring and projected vegetation conditions, with R² > 0.8 for monthly predictions. Similarly, Li et al. (2020c) used regression equations for different types of land use to propose, for the first time, the Cloudy Region Drought Index (CRDI). The CRDI was developed in semi-arid ecosystems in southeastern China based on cloud optical thickness (COT) and the antecedent drought index (ADI).
In other case studies (Ghazaryan et al. 2020; Sultana et al. 2021; Das et al. 2021), multispectral and/or multi-sensor approaches including time series (Landsat, MODIS, Sentinel-2) and Synthetic Aperture Radar (SAR) have proven effective for spatio-temporal monitoring of crop condition and drought impact at the scale of the management unit. Following this logic, Wang et al. (2020) first introduced the Improved Soil Temperature, Vegetation and Moisture Index (iTVMDI) in arid northern China. This study showed that the iTVMDI is strongly correlated with GLDAS soil moisture (-0.7), precipitation (0.65) and surface water (0.81). Nevertheless, despite the innovative multi-sensor nature of this approach, spatio-temporal and statistical relationships with crop yields have not been provided. With regard to agricultural drought, these elements are particularly useful for situating the effectiveness of each model developed. Currently, the availability of geostationary meteorological satellite data has made real-time monitoring and early warning of agricultural drought possible. Recently, Vyas and Bhattacharya (2020) examined, for the first time in India, the potential of geostationary meteorological satellites (Kalpana-1, INSAT-3A) for the development of agricultural drought early warning indicators. To this end, operational products, namely baseline evapotranspiration, precipitation, and crop vigor warning, were used to construct the early warning indicator. Despite a high temporal frequency (10 days), the first major disadvantage of this approach lies in its coarse resolution of 0.25°, which would significantly affect micro-scale analyses. Such a resolution hardly meets the requirements of drought monitoring and assessment at the sub-watershed level, for example. At the local scale, high-resolution, near-real-time indicators are needed to streamline decision-making at community level.
To this end, the approach developed by Aadhar and Mishra (2017) in South Asia, which uses the SYMAP algorithm and a high-resolution DEM to correct biases in high-resolution (0.05°) precipitation and temperature data, seems better suited to monitoring drought conditions at the sub-watershed scale.
From the above, it should be noted that these multivariate composite approaches include both those based on a combination of single-variable drought indices and those integrating several agricultural drought variables. Within this diversity of indices, some multi-purpose drought indices such as the SPI, the JDI, the MSPI, the PDSI, and the SPEI were, until very recently, rarely accepted in the literature as indices of agricultural drought. However, based on several evidence-based studies, some researchers suggest that climate drought indices can be considered indirect indices of agricultural drought, especially in arid and semi-arid regions (Bae et al. 2012; Bazrafshan et al. 2014; Bazrafshan et al. 2015; Aghelpour et al. 2020; Aghelpour et al. 2021b; Fan et al. 2022). By testing the effectiveness of several artificial intelligence models (GRNN, LSSVM, GMDH and ANFIS-FCM), Aghelpour et al. (2021b) demonstrated that multivariate climate indices such as the Multivariate Standardized Precipitation Index (MSPI) can provide valuable information on several types of drought simultaneously (meteorological, hydrological, agricultural, and socio-economic). This view is still not unanimous within the scientific community. Although the causal relationships between climatic and biophysical indices are well established, several studies distinguish between indices of agricultural drought and those of so-called meteorological drought. Indeed, multivariate composite models based primarily on multi-source remote sensing data, such as CDMI, CDMIa_RF, MDCI, CDI, and SDCI, which incorporate several drought variables, can provide a more complete picture of drought conditions. As such, Hanadé et al. (2020) demonstrated that the composite model for drought monitoring offers the best correlative compromise with indices of other forms of drought.
The CDMIa_RF had highly significant correlations with three types of hydrological drought indices derived from measures of actual flows. This suggests that multivariate composite models can inform several forms of drought at once.
For these reasons, particular emphasis is now placed on composite indices or multi-source hybrids that include, in addition to remote sensing data, data from climate model simulations and in situ measurements (Rajsekhar et al. 2015; Shen et al. 2019; Han et al. 2019; Wei et al. 2020; Liu et al. 2020; Hanadé et al. 2020; Bravo et al. 2021; Singh et al. 2021; Ali et al. 2022). There is now a general scientific consensus that nonlinear composite models are better suited for operational monitoring applications and assessment of drought conditions. Early warning systems, which integrate several indicators, have achieved considerable success in mitigating the adverse effects of climate crises (Krishnamurthy et al. 2020; Chen et al. 2020; Wang and Yu 2021; Hara et al. 2021; Hanadé et al. 2022). The major disadvantage of these models, however, is that they require several datasets, often with very different characteristics, that must first be normalized and downscaled.
Furthermore, regardless of the number and types of variables considered and of the learning model used in descriptive modelling, the categorization of indices by drought type remains a major source of confusion in multivariate drought modelling. Therefore, to capture the fundamental characteristics of drought, numerous comparative studies have evaluated the performance of several machine and deep learning models, made possible by the advanced analyses available on cloud computing platforms. Park et al. (2016) used machine learning techniques to explore the importance of combining sixteen drought factors to simultaneously monitor meteorological and agricultural drought during growing seasons across different climate regions of the United States. For this purpose, they used three machine learning approaches, namely RF, boosted regression trees, and Cubist. Similarly, multivariate descriptive drought modelling using machine learning techniques is used to map drought-prone areas and to estimate the cultivated areas affected by severe and extreme droughts (Nguyen et al. 2017; Park et al. 2019; Araneda-Cabrera et al. 2021). This includes high-resolution modelling of key factors in drought monitoring such as surface soil moisture (Park et al. 2017; Hajdu et al. 2018; Wang et al. 2022), evapotranspiration (Ke et al. 2016) or land surface temperature (Ebrahimy and Azadbakht 2019). As such, the water stress function of crops is often estimated from variables such as soil moisture values and water content according to the different phases of crop growth (Samantaray et al. 2019).
In addition, beyond traditional descriptive modelling, the recent literature increasingly shows the emergence of new approaches to multivariate modelling of the spatial vulnerability to drought based on machine learning models and multi-sensor remote sensing (Saha et al. 2021; Arabameri et al. 2021), sometimes coupled with crop models (Li et al. 2021).
Multivariate mapping of drought vulnerability is a multifactor mapping that includes sensitivity, exposure, and mitigation factors that determine climate resilience in natural or anthropogenic ecosystems. This includes the objective land suitability classification of drought-prone areas (Habibie et al. 2020) by considering natural and anthropogenic sensitivity factors. In addition, it should be noted that machine learning techniques are also increasingly used for the fusion of multi-source data for monitoring vegetation growth parameters such as anomalies in the water content of vegetation (Yuan et al. 2019) or modelling soil moisture near the surface (Chatterjee et al. 2022). Quantitative drought estimation from descriptive modelling, as well as results from machine learning multi-sensor fusion, are often used as key parameters in hybrid models of yield predictions.
However, despite substantial progress in multivariate drought modelling, accurate, reliable, and operational drought monitoring applications still require additional efforts. In previous studies, some very fundamental aspects of drought monitoring, such as the influence of local forcing and the cumulative footprints of previous droughts, have not been integrated into multivariate descriptive models of agricultural drought.

Predictive modelling of agricultural drought risk
Spatial modelling of agricultural drought risks based on machine and deep learning techniques is a relatively new approach compared to descriptive modelling of drought parameters. This approach is mainly linked to the growing interest in providing useful information for early drought management, especially in regions where supplementary irrigation is important in the management of drought episodes. It corresponds to multivariate predictive analyses that provide a practical mapping of the expected risks, their intensity, and their spatial distribution (Tian et al. 2018; Kaur & Sood 2020; Rahmati et al. 2020), as well as drought-related forecasts of the risk of reduced agricultural production (Zambrano et al. 2018). It also includes, to some extent, the prediction of areas of severe drought by providing severe stress information in advance (Park et al. 2019), and the multifactorial prediction of vegetation conditions (Barrett et al. 2020; Mokhtari and Akhoondzade 2020; Roushangar et al. 2021), precipitation anomalies, seasonal anomalies, soil moisture (Thober et al. 2015; Zhu et al. 2019) or crop water content (Hendrawan and Murase 2011).
In the recent literature, four types of models are commonly used in the study of drought forecasting: statistical forecast models, dynamic models, machine learning models, and hybrid models, which most often combine the other three approaches. Statistical drought prediction models are based on historical time series analysis and on modelling the relationships between one or more predictor variables and a drought index (Chandrasekara et al. 2021). Statistical models such as multiple linear regression (Pen and. 2017; Petersen 2018; Zambrano et al. 2018; Leroux et al. 2019; Ghazaryan et al. 2020; Fall et al. 2021) and moving average models are traditionally the most widely used (Aghelpour et al. 2021a). Based on the inherent historical persistence of drought indices, several univariate (Lyon 2015; Lees et al. 2020; Evkaya and Kurnaz 2021) and multivariate (Hao et al. 2016; Aghelpour and Varshavian 2021) drought forecasting models have been proposed. In this sense, Hao et al. (2017) consider that drought prediction can be achieved by incorporating previous precipitation conditions and the seasonal cycle of climatic precipitation. Thus, considering the historical characteristics of drought, a significant number of previous studies have focused on univariate drought forecasts using simple indices such as SPI, SDI, SMI, etc. (Evkaya and Kurnaz 2021).
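The role of persistence in univariate statistical forecasting can be illustrated with a first-order autoregressive model, in which the next value of a drought index depends linearly on the current one. The sketch below is a generic illustration and does not reproduce the method of any study cited above. The function names `fit_ar1` and `forecast_ar1` and the monthly index values are hypothetical.

```python
def fit_ar1(series):
    """Fit x[t] = c + phi * x[t-1] by ordinary least squares.

    Returns the intercept c and the persistence coefficient phi.
    """
    previous, current = series[:-1], series[1:]
    n = len(previous)
    if n < 2:
        raise ValueError("at least three observations are required")

    mean_prev = sum(previous) / n
    mean_curr = sum(current) / n
    spread = sum((p - mean_prev) ** 2 for p in previous)
    if spread == 0:
        raise ValueError("series is constant; persistence cannot be estimated")

    covariance = sum((p - mean_prev) * (q - mean_curr)
                     for p, q in zip(previous, current))
    phi = covariance / spread
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast_ar1(last_value, c, phi, steps):
    """Iterate the fitted recursion to forecast `steps` periods ahead."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = c + phi * value
        forecasts.append(value)
    return forecasts


if __name__ == "__main__":
    # Hypothetical monthly values of a standardized drought index.
    index = [-1.2, -0.9, -1.0, -0.4, -0.1, 0.3, 0.2, -0.3, -0.6, -0.8]
    c, phi = fit_ar1(index)
    print(forecast_ar1(index[-1], c, phi, steps=3))
```

When the fitted coefficient lies between 0 and 1, successive forecasts decay towards the long-term mean of the series, which is consistent with the loss of skill generally reported as the forecast horizon increases.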
Unlike statistical models, dynamic (physical) models are based on general circulation models (GCMs) and incorporate interconnections between land, ocean, and atmosphere (Thober et al. 2015; Khan et al. 2020). Dynamic models provide predictions about the potential large-scale occurrence of climate extremes (Feller and Vaseva 2014). However, they are particularly sensitive to the prediction interval and the spatial extent considered. Another disadvantage of climate models is that their predictive skill varies with the climate context. In this regard, Yuan and Wood (2013) pointed out that the NMME model adds skill in tropical regions but not in extratropical regions. Thober et al. (2015) revealed that the NMME-based dynamic seasonal drought forecasting system had better predictive skill than forecasts based on the Ensemble Streamflow Prediction (ESP) model. However, the ESP model remains the most popular model for predicting the seasonality of hydrological and meteorological drought.
In recent years, data-driven seasonal forecasting systems using AI and machine learning techniques have been very successful (Shamshirband et al. 2020; Mokhtari and Akhoondzade 2020; Danandeh Mehr et al. 2020; Saha et al. 2021; Singh et al. 2021). The large and growing availability of multi-source data and the improved storage and processing capacity of machines are the main success factors of AI and ML techniques. These new approaches combining remote sensing, machine learning (ML), in-situ measurements and numerical model data offer the best prospects for meeting the needs of multi-factor drought forecasting. To meet this objective, a significant number of models based on ML and AI algorithms have been proposed (Lees et al. 2020; Aghelpour and Varshavian 2021). Very recently, the drought prediction model developed by Park et al. (2020), which combines remote sensing drought indices (SDCI, SPI) and numerical model outputs, has made it possible to overcome the difficulty of short-term forecasting. Through the synergistic use of convolutional long short-term memory (ConvLSTM) and the Random Forest (RF) algorithm, 8-day drought forecasts were developed. The approach was found to be useful for operationally providing future drought conditions at high resolution. Similarly, by comparing several ML algorithms (RF, SVR, LASSO, GBR and MLP), Meroni et al. (2021) proposed an operational framework for forecasting cereal yields in Algeria with an overall accuracy of 0.16-0.2 t/ha (13-14% of the average yield) over the season. In Morocco, Bouras et al. (2021) compared the predictive performance of five ML models and proposed a dynamic multi-source approach to cereal yield forecasts at the provincial level that seems better suited to overcoming the limitations of empirical models. According to the same authors, the most robust model is XGBoost, with R² = 0.88 and an RMSE of around 0.22 t/ha.
However, it has been reported that developing predictive models by province would improve forecast accuracy by reducing the RMSE by 10-20%. Nevertheless, Petersen (2018), who analyzed monthly anomalies in vegetation health using multiple regression models, obtained cereal yield forecasts comparable to the predictions of ML and/or hybrid models. The forecast accuracy for the main crops at two to four months was -2% error (for 20% forecast) and -5% error (for 40% forecast).
Considering such results, the choice of approach and/or algorithm for short-term forecasts of drought, or of its impacts in terms of falling yields, remains a matter of debate. Studies such as Elbeltagi et al. (2020) and Pham et al. (2020), which compared ANN, PSOANFIS and SVM algorithms for rainfall forecasts, or Khan et al. (2020), which examined the potential of three advanced ML techniques in Pakistan, reported that SVM-based forecasting models are more robust in capturing the spatio-temporal characteristics of drought than models based on ANN and K-Nearest Neighbors (KNN). However, several other studies have found that ANNs are much better forecasting algorithms. For example, when comparing the overall performance of three ML algorithms, Deep Neural Networks (DNN), SVM and RF, for early drought detection, Dao et al. (2021) point out that DNN-based models are more robust than RF- and SVM-based models and that this superiority is much more pronounced for models with few input data. In another study, Mokhtari and Akhoondzade (2020) compared four ML algorithms (ANN, SVR, RF and the decision tree) for predictive modeling of NDVI. The robustness of ANN-based models was further illustrated.
To address these contradictions, future research will need to focus more on approaches that compare statistical, ML and hybrid models. The impacts and occurrence of climate extremes such as drought have characteristics specific to each geographical area or sub-area. There is therefore an urgent need for additional scientific efforts to develop predictive models that consider the local sensitivity of agrosystems. The trend in the recent literature on this subject supports this point of view. The large contradictions in the skill of the prediction models found in the literature underline the need for comparative efforts at several scales and in several hydroclimatic contexts. One of the determining factors that can explain the contradictions in the performance of the models, particularly the ML models, is the choice of model hyperparameters and their parameterization. In the literature, some studies are satisfied with default hyperparameters, while in others the models are tuned through a testing process. In addition, several studies emphasize the importance of selecting and identifying predictors in the development of prediction models to arrive at accurate predictions (Bazrafshan et al. 2014; Khan et al. 2020; Danandeh Mehr et al. 2020; Aghelpour et al. 2021b). In the same way, comparing the skill of Random Forest-based and logistic regression models in predicting drought impacts, Sutanto et al. (2020) point out that the ability to predict drought impacts using ML is highly dependent on data availability. Indeed, predictive drought modeling requires the integration of massive multi-source data, far exceeding the amount and diversity of data required for the development of descriptive multivariate models. Several studies agree on the importance of the quantity and diversity of data for achieving the best performance of data-driven models.
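The difference between keeping a default hyperparameter and tuning it through a testing process can be illustrated with a small cross-validated search. The sketch below selects the number of neighbours of a one-predictor nearest-neighbour regressor by k-fold cross-validation. It is a deliberately simple stand-in for the tuning of RF, SVM or ANN models discussed above. All names and data are hypothetical, and the interleaved folds used here ignore temporal autocorrelation, so blocked folds would be preferable for real drought time series.

```python
import math


def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to `query`."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def cross_validated_rmse(xs, ys, k, folds):
    """RMSE of the k-neighbour model estimated by k-fold cross-validation."""
    n = len(xs)
    squared_error = 0.0
    for fold in range(folds):
        held_out = set(range(fold, n, folds))          # interleaved fold
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        for i in held_out:
            prediction = knn_predict(train_x, train_y, xs[i], k)
            squared_error += (prediction - ys[i]) ** 2
    return math.sqrt(squared_error / n)


def select_k(xs, ys, candidates, folds=5):
    """Return the candidate with the lowest cross-validated RMSE and all scores."""
    scores = {k: cross_validated_rmse(xs, ys, k, folds) for k in candidates}
    best = min(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Hypothetical predictor (e.g. a vegetation index) and response values.
    xs = [float(i) for i in range(20)]
    ys = [2.0 * x for x in xs]
    best, scores = select_k(xs, ys, candidates=[1, 3, 15])
    print(best, {k: round(v, 3) for k, v in scores.items()})
```

Reporting the candidate values tested and the validation scheme alongside the final scores would make the skill scores of different studies easier to compare.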
Other in-depth comparative studies (Roushangar et al. 2021; Meroni et al. 2021; Jiang and Luo 2021; Elbeltagi et al. 2021; Dao et al. 2021; Bouras et al. 2021; Aghelpour and Varshavian 2021) revealed that single drought prediction models, whether based on machine learning or on statistical and/or dynamic methods, had limited forecasting performance. In evaluating the performance of 16 ML and 16 Deep Learning (DL) models for meteorological drought prediction, Jiang and Luo (2021) point out that no single model can achieve the best performance for all evaluation metrics simultaneously. The recent study by Bali and Singla (2022) supports the trend towards the increasing use of hybrid models and deep learning techniques as an effective means of prediction, especially for crop yields (Van Klompenburg et al. 2020). In general, non-stationary models are more reasonable (Liu et al. 2020), more accurate and more efficient than stationary models. For example, models based on artificial neural networks or the Bayesian model (Nie et al. 2021) have outperformed stationary models in several studies. Dao et al. (2021) also found that advanced machine learning and deep learning methods (DNN, SVM, RF) enabled early drought detection with 97.5% to 100% accuracy, while spectral index models had very low early detection accuracy. Roushangar et al. (2021) point out that integrated models could improve the accuracy of predictive modelling by as much as 40 to 50%. Given the limited performance of single models, hybrid or integrated approaches offer better prospects than single-model approaches. Among these approaches, those combining remote sensing techniques and artificial intelligence techniques offer the best prospects for multi-scale and multifactorial drought modelling.
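The observation that no single model is best on every evaluation metric can be illustrated with a toy example. The numbers below are invented for illustration and are not from Jiang and Luo (2021) or any other reviewed study. The simple average of the two forecasts stands in for a hybrid model, which in practice is far more elaborate.

```python
import math


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def rmse(obs, pred):
    """Root mean squared error (penalizes large misses more than MAE)."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))


# Hypothetical observed drought-index values and two single-model forecasts.
observed = [-1.0, -0.5, 0.0, 0.5]
forecasts = {
    "model_a": [-0.6, -0.9, 0.4, 0.1],  # moderate error at every step
    "model_b": [-1.0, -0.5, 0.0, 1.7],  # exact at three steps, one large miss
}

metrics = {"MAE": mae, "RMSE": rmse}
scores = {
    name: {m: fn(observed, pred) for m, fn in metrics.items()}
    for name, pred in forecasts.items()
}

# Which single model wins depends on the metric chosen.
best_single = {
    m: min(scores, key=lambda name: scores[name][m]) for m in metrics
}

# Simple average of the two forecasts, used here as a stand-in for a hybrid.
ensemble = [
    (a + b) / 2 for a, b in zip(forecasts["model_a"], forecasts["model_b"])
]
ensemble_scores = {m: fn(observed, ensemble) for m, fn in metrics.items()}

print("best single model per metric:", best_single)
print("single-model scores:", scores)
print("ensemble scores:", ensemble_scores)
```

In this constructed case, one model has the lower MAE and the other the lower RMSE, while the averaged forecast improves on both because the errors partly cancel. Averaging does not always help; the gain depends on how correlated the individual model errors are.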
Approaches to multivariate spatial and temporal forecasts of agricultural drought stem fundamentally from the limits of reactive management of drought impacts, in particular inappropriate response times and the costs associated with managing impacts, which are often considered very high. When it comes to agricultural drought, proactively managing the expected impacts on agricultural production has been recognized as one of the essential elements for planning appropriate responses and achieving high yields. By allowing the identification of the most vulnerable areas (Sivakumar et al. 2021), the assessment and modelling of drought risks make it possible to optimize measures to mitigate the expected impacts. In this perspective, the combined integration of meteorological, hydrological, agricultural, and socio-economic factors is increasingly favored for the detection and cartographic modelling of agricultural drought risks using learning techniques (Park et al. 2018; Hatami et al. 2018; Park et al. 2019; Joko Prasety et al. 2019; Adede et al. 2019; Arabameri et al. 2021). Approaches vary depending on the number of variables considered, the type of learning model used, and the response variable used to determine risks. In terms of prediction lead times, the shortest is 4 hours (Muñoz et al. 2021) and predictions of 1 to 3 months are the most popular (Tadesse et al. 2014; Adede et al. 2019). Despite the acceptable performance of some predictive models, extending forecast horizons beyond two months increases model errors, with a few exceptions (Aghelpour et al. 2021b). Indeed, unlike long-term and large-scale expected frequency modelling, short-term modelling of agricultural drought at the local scale is the most difficult aspect of multivariate modelling. Taking this difficulty into account, Park et al. (2018) proposed a complex multivariate approach based on RF and 16 variables for drought prediction 10 days in advance.
Other studies, such as that of Rahmati et al. (2020), have proposed new approaches to mapping agricultural drought risks based on an objective comparison of machine learning models (CART, BRT, RF, SVM) and multivariate adaptive regression splines, and on better consideration of hydro-environmental datasets in addition to conventional evidence. Similarly, based on machine learning techniques, a separate study relied on local information (contributions from local experts) in addition to remote sensing data, simulations, and in situ measurements to model weekly maps of drought conditions in the US (Hatami et al. 2018).
This approach is one of the few documented in the literature that uses the judgment of local experts in short-term predictive modelling of drought. Another rarely documented advantage of the US Drought Monitor (USDM) approach is that the indicators are incorporated into the model according to their relative importance by area and at different times of the year. This implies a better consideration of spatial heterogeneity and temporal variability. In addition, multivariate predictive modelling is also used to predict severe drought areas. Recently, Park et al. (2019) developed a short-term severe drought prediction model (SDAP) that does not use rainfall data. The Random Forest algorithm was used to model soil moisture one month in advance, with several variables grouped into four categories of factors (topographic, thermal, vegetative, and water). In this analysis, thermal factors (NIR, SWIR, TIRS) were more decisive land surface factors in the modelling of the soil moisture deficit index than water (MSI, MNDWI, NDMI), vegetative (EVI, NDVI, MSAVI, SAVI), and topographical (slope, TWI) factors.
However, it should be noted that practical (operational) measures of the inaccuracies of these artificial intelligence models have been poorly studied in the recent literature. In this direction, Rulinda et al. (2012) proposed an approach based on a drought membership function for quantifying the inaccuracies of vegetative drought in East Africa. Similarly, the use of SHAP algorithms has recently been suggested to refine confidence in ML model-based approaches to predictive drought modelling.

Predictive modelling of expected impacts
Predictive modelling of expected impacts is undoubtedly the most challenging component of multivariate modelling of agricultural drought. The first difficulty stems essentially from the fact that the fundamental determinants of agricultural drought are characterized by very large temporal and spatial variations that preclude any systematic approach to the phenomenon. Similarly, the severity of agricultural drought and its quantitative impact are highly dependent on the duration and intensity of drought and on the climate vulnerability of agricultural systems. Several studies on predictive modelling of drought impacts are based on the history of several biophysical and hydroclimatic variables without considering contextual socio-economic vulnerabilities (Enenkel et al. 2020; Diaz et al. 2021). This means that little consideration is given to the potential impacts that would result from non-climatic factors.
At this level, the question is how to predict and interconnect all the dynamics, both human and socio-economic, as well as the very heterogeneous spatial realities and the anomalies of biophysical and climatic variables likely to have a beneficial and/or aggravating effect in the predictive estimation of the impacts of agricultural drought. From the above, it should be noted that although it is possible to accurately model the frequency risk, the quantification of the expected impacts is still subject to controversy. In the recent literature, two approaches have been explored on this subject. The first predicts drought impacts using indices such as SPI, SPEI, and RDI (Belayneh and Adamowski 2013; Mouatadid et al. 2018; Sutanto et al. 2020). These studies examine self-learning methods based on various machine and deep learning algorithms such as artificial neural network (ANN), Random Forest, SVR, or logistic regression. Based on this approach, drought impact functions can be predicted up to seven months in advance (Sutanto et al. 2020). In the same year, based on machine learning models (LASSO, Regression, RF, ordinary least squares, AdaBoost), Li et al. (2020a) proposed a long-term predictive drought modelling approach 6, 12, and 24 months in advance based on the SPEI index.
However, in the existing literature, self-learning methods have rarely been applied to remote sensing drought indices to predict the expected impacts of drought. Very recently, other authors have emphasized the importance of multi-sensor predictive modelling of drought impacts, on the assumption that incorporating as many drought-related variables as possible yields more accurate quantitative predictions than univariate modelling. In this perspective, a predictive modelling approach to the long-term impacts of drought, based on the SPEI and on CMIP6 climate model data, has been proposed for several South Asian countries. Expected yield losses for rice crops were estimated at 54.15% by the ensemble machine learning model (EML), which stood out for its high predictive skill, with a maximum R² of 0.918, significantly higher than that of the standalone RF and GBM methods. The same study notes that, regardless of the model adopted, the predictive skill of machine learning models tends to decrease as the prediction time increases. Skill can fall from R² = 0.90 for a one-month prediction to R² = 0.50 for a three-month prediction (Tadesse et al. 2014). Similarly, the ability to predict impacts using machine learning methods is highly dependent on the quality and velocity of data on reported impacts (Sutanto et al. 2019; Sutanto et al. 2020; Torelló-Sentelles and Franzke 2022). In other words, in the current state of knowledge, the impacts of climate extremes can be modelled with machine learning techniques by combining data from climate models, agronomic models, in situ measurements, and spatial remote sensing. As such, the study by Feng et al. (2019a) on assessing the impacts of climate extremes on wheat yield in south-eastern Australia lays the foundation for a very promising approach.
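The decline of predictive skill with lead time described above can be reproduced on synthetic data. The sketch below is an illustration under stated assumptions, not a reconstruction of any reviewed model. The "drought index" is a simulated first-order autoregressive series with an assumed month-to-month persistence `PHI`, and the predictor is a one-variable least-squares regression fitted separately for each lead time.

```python
import random


def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    var = sum((xi - mean_x) ** 2 for xi in x)
    b = cov / var
    return mean_y - b * mean_x, b


def skill_at_lead(series, lead, split):
    """Fit x[t + lead] ~ x[t] on series[:split]; return R2 on series[split:]."""
    train, test = series[:split], series[split:]
    a, b = fit_line(train[:-lead], train[lead:])
    pred = [a + b * x for x in test[:-lead]]
    return r_squared(test[lead:], pred)


# Synthetic monthly index (assumption): AR(1) process with persistence PHI.
PHI = 0.8
rng = random.Random(0)
series = [0.0]
for _ in range(3999):
    series.append(PHI * series[-1] + rng.gauss(0, 1))

skill = {lead: skill_at_lead(series, lead, split=2000) for lead in (1, 2, 3)}
for lead, r2 in skill.items():
    print(f"lead = {lead} month(s): R2 = {r2:.2f}")
```

For such a process the attainable R² shrinks roughly as PHI raised to twice the lead time, so the loss of skill here comes from the memory of the series itself rather than from the choice of algorithm. Real drought indices are not simple AR(1) processes, so the numbers are only indicative of the general pattern.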

Multivariate analysis of drought frequency
Just like predictive modelling of expected impacts, multivariate modelling of the frequency of agricultural drought provides the information needed for proactive management of the phenomenon. Multivariate analysis of the frequency of drought parameters (intensity, severity, duration, and spatial extent) from the joint return periods of drought variables makes it possible to quantify the severity of the expected risk (Gupta et al. 2020; Ji et al. 2022) or the probability of occurrence of impacts (Seo and Kim 2021). By identifying high-risk areas and the magnitude of expected droughts, frequency analyses provide essential information for mitigating the potential impacts of drought episodes.
The basic principle of multivariate drought frequency modelling is to estimate the joint return periods of drought-related variables over a given region using probabilistic or machine learning approaches. However, with the climate changes of recent years, a shortening of return periods as well as exceptional situations are increasingly recorded in several regions of the world. It follows that the extreme peaks of variables, their variabilities, and their interdependencies do not follow any statistically modellable law at any scale. For this reason, we are witnessing the emergence of new approaches that favor the incorporation of local and microclimatic variables in the modelling of the frequency dynamics of expected droughts. For example, using the Random Forest and Bagged Decision Tree classifiers, Herrera and Aristizábal (2021) recently modelled the spatial and temporal occurrence of drought in Colombia based on the history of several variables such as LST, precipitation, evapotranspiration, and microclimatic variables.
There is currently no model that provides multivariate frequency modelling by jointly modelling the shifting and highly variable return periods of the different drought-related variables. The probabilities of occurrence of the extreme peaks of the various variables that condition the occurrence of a drought and the worsening of its intensity are often asynchronous in time and space. For this reason, the joint return period of drought-related variables remains difficult to model from historical data. However, the addition of output data from climate models has proven beneficial in multivariate modelling of the projected frequency of agricultural drought.
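As an illustration of what a joint return period is, the sketch below computes the two usual bivariate definitions for drought duration D and severity S: the "OR" case (D or S exceeds its threshold) and the "AND" case (both exceed). It assumes a Gumbel-Hougaard copula with parameter `theta` for the dependence between the two variables, a common choice in probabilistic frequency analysis but not one attributed here to any of the reviewed studies. The function names and the `mean_interarrival` parameter (mean time between drought events, in years) are hypothetical.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v) for u, v in (0, 1) and theta >= 1.

    theta = 1 corresponds to independence, C(u, v) = u * v.
    """
    if theta < 1.0:
        raise ValueError("theta must be >= 1 for the Gumbel-Hougaard copula")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def joint_return_periods(u, v, theta, mean_interarrival=1.0):
    """Return (T_or, T_and) for non-exceedance probabilities u and v.

    u = F_D(d) and v = F_S(s) are the marginal probabilities that duration
    and severity stay below the thresholds d and s.
    T_or  = mu / P(D > d or  S > s) = mu / (1 - C(u, v))
    T_and = mu / P(D > d and S > s) = mu / (1 - u - v + C(u, v))
    """
    c = gumbel_copula(u, v, theta)
    t_or = mean_interarrival / (1.0 - c)
    t_and = mean_interarrival / (1.0 - u - v + c)
    return t_or, t_and


# Thresholds that each have a 10-year univariate return period (u = v = 0.9),
# assuming one drought event per year on average.
u = v = 0.9
for theta in (1.0, 2.0, 5.0):
    t_or, t_and = joint_return_periods(u, v, theta)
    print(f"theta = {theta}: T_or = {t_or:.1f} yr, T_and = {t_and:.1f} yr")
```

The joint "AND" return period is always at least as long as the univariate ones and the "OR" period at most as long, and the gap between them narrows as the dependence between the variables strengthens. A fixed `theta` assumes stationary dependence, which is precisely the assumption the text above questions for asynchronous and shifting drought variables.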

Limitations of the analysis
The scientific completeness of this analysis is limited by a few difficulties. The first difficulty stems from the multifactorial dimension of the phenomenon studied and the diversity and originality of the studies analysed. Machine learning-based integrated drought monitoring depends on the selection of training data and is highly susceptible to error under varying environmental conditions or climate states, which makes studies difficult to compare. These elements, together with the absence of case studies, make comparisons of the predictive skills of the different models used in the multivariate modelling of agricultural drought less objective. Each of the studies analysed is to some extent unique in terms of its scientific approach, including the number of variables considered, the types, sources, and velocity of data used, the period of analysis, the prediction times, and the choice of the modelled response variable as a condition of drought. Similarly, in this analysis, we did not find any studies that compared the relevance of various criteria for assessing the skills of machine learning models applied to multivariate agricultural drought modelling. A comparison of the skills of predictive models would be more objective if it relied on several case studies using and comparing the same algorithms on the same data types, variables, time scales, etc. The recent literature on the subject is very heterogeneous in terms of originality and crucially lacks studies that reproduce the different approaches proposed. This limits the consistency of the various conclusions drawn on this subject.

Conclusion and prospects for optimal modelling of agricultural drought
This systematic review analyzes the application of artificial intelligence techniques and multi-sensor remote sensing in multivariate modelling of agricultural drought. Analysis shows that all aspects of agricultural drought have been the subject of factual studies at several levels of scale (plot, local, regional, and global) based on classical machine learning models. The parameters (severity, intensity, duration), as well as the determinants of agricultural drought, were modelled using several approaches combining artificial intelligence techniques, multi-sensor remote sensing, and multi-source auxiliary data.
However, this analysis does not reveal any study of spatial modelling of agricultural drought based on quantum machine learning algorithms. Therefore, future studies on multivariate drought modelling need to explore the potential of quantum learning algorithms. These relatively recent methods would be suitable for the study of the stochastic dynamics of drought. In terms of the skills of machine learning models in spatiotemporal drought modelling, there is no conclusive evidence of the merit of a specific model. Regardless of which model is adopted, the predictive skills of machine learning models tend to decrease as prediction time increases, and the saturation of prediction models is often observed above a certain performance threshold. In addition, the performance of the different models varies with the context of each study and the hyperparameter settings used. Selecting an appropriate prediction model therefore remains a challenge, owing to a lack of clear information about the best model. Among the diversity of models tested so far, there is no clearly accepted evidence in the literature on the choice of the best model. However, compared to SVM, RNN, SVR or cubist algorithms, the Random Forest algorithm seems to be better suited for multivariate modelling of agricultural drought, and in general the latest trend shows an increasingly clear superiority of hybrid models over single models of both machine learning and deep learning. Short-term multivariate predictive modelling hours to days in advance is now possible (Anshuka et al. 2019; Dikshit et al. 2022). The shortest prediction lead time is 4 hours (Muñoz et al. 2021) and predictions of 1-3 months are the most popular (Tadesse et al. 2014; Adede et al. 2019). However, despite the significant results recorded in recent years in predictive drought modelling, no approach has been proposed for predictive modelling of the post-drought recovery phase.
Similarly, quantifying the lagged effects of different variables in predictive analyses of agricultural drought remains one of the least explored research objectives. Studies in this direction could lead to a substantial improvement in the accuracy of predictive models as well as an increase in the short-term prediction time of agricultural drought conditions. Quantum machine learning algorithms, for example, can be explored for this purpose in future research (García et al. 2022). Approaches based on quantum machine learning could also be useful to address the challenge of different return periods of agricultural drought variables in multivariate drought frequency modelling.
Among the ancillary data that complement data-based analyses of biophysical parameters, social, economic, and environmental determinants such as the cumulative impacts of previous droughts, local forcing, seasonal influence and characteristics of agricultural practices, including the influence of chemical stress, are very rarely considered in the multivariate modelling of agricultural drought. Regarding the biophysical variables commonly used, thermal factors have proven to be very decisive in the predictive modelling of soil moisture as a response variable, and the latter seems to be more suitable than rainfall for the monitoring or medium-term prediction of agricultural drought. Also, the analysis of existing literature suggests that improvements in the spatial resolution of drought-related variables are necessary to achieve sufficient precision in the operational monitoring of agricultural drought. From this perspective, deep learning models can be explored to derive proxies at the desired resolution by merging multisource data. In this regard, reducing the model space has proven beneficial for the skills of predictive models, especially for ANN.
Furthermore, whether it is predictive or descriptive modelling, practical (operational) measures of inaccuracies in artificial intelligence models will need to be further explored in future studies to move from the scientific framework to the operationalization of results.

Acknowledgments
We gratefully thank the following research structures: the laboratory (UR 18) of the Department of Geodesy and Topography (IAV Hassan II), the laboratory of UMR CNRS ESPACE 7300 (AMU France), and the Regional Center for Agronomic Research of Marrakech (INRA, Morocco) for their contributions and collaborations in this research. We also thank the Islamic Development Bank for its financial support (scholarship). We extend our thanks and appreciation to the editor of Geomatics, Natural Hazards, and Risk Journal and the reviewers for their valuable comments and recommendations.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Data availability statement
The bibliographic database supporting the conclusions of this study is available from the corresponding author upon reasonable request.