Evaluation of Landsat 8 image pansharpening in estimating soil organic matter using multiple linear regression and artificial neural networks

ABSTRACT In agricultural systems, the regular monitoring of Soil Organic Matter (SOM) dynamics is essential. This task is costly and time-consuming when using the conventional method, especially in a very fragmented area with intensive agricultural activity, such as the area of Sidi Bennour. The study area is located in the Doukkala irrigated perimeter in Morocco. Satellite data can provide an alternative and fill this gap at a low cost. Models to predict SOM from a satellite image, whether linear or nonlinear, have attracted considerable interest. This study aims to compare SOM prediction using Multiple Linear Regression (MLR) and Artificial Neural Networks (ANN). A total of 368 points were collected at a depth of 0–30 cm and analyzed in the laboratory. An image at 15 m resolution (MSPAN) was produced from a 30 m resolution (MS) Landsat-8 image by pansharpening with the panchromatic band (15 m). The results obtained show that the MLR models predicted the SOM with (training/validation) R² values of 0.62/0.63 and 0.64/0.65 and RMSE values of 0.23/0.22 and 0.22/0.21 for the MS and MSPAN images, respectively. In contrast, the ANN models predicted SOM with R² values of 0.65/0.66 and 0.69/0.71 and RMSE values of 0.22/0.20 and 0.21/0.18 for the MS and MSPAN images, respectively. Image pansharpening improved the prediction accuracy by 2.60% and 4.30% and reduced the estimation error by 0.80% and 1.30% for the MLR and ANN models, respectively.


Introduction
Soil Organic Matter (SOM) is a term used in agriculture to describe various organic substances with various properties (Keshavarzi et al. 2021). SOM improves soil structure stabilization, water infiltration, water retention capacity, the retention and release of mineral nutrients for plants, and erosion resistance (Taghizadeh-Mehrjardi et al. 2020; van der Wal and de Boer 2017). Its presence in sufficient quantities contributes to the proper nutrition of cultivated species, which results in better plant growth and higher productivity, thereby supporting food security (Taghizadeh-Mehrjardi et al. 2020). SOM also plays a crucial role in the soil's environmental function by offsetting excess greenhouse gas emissions, primarily CO₂, and mitigating their adverse effects on global warming and climate change (Viscarra Rossel et al. 2016; Minasny et al. 2017).
As a result, a better understanding of SOM spatial pattern distribution is required for sustainable soil management. This understanding is essential for the efficient and effective use of land and protection of the environment (Guo et al. 2013). An accurate estimation of SOM would provide vital information on the nutrient and sediment cycle and play an important role in crop management (Liu et al. 2019;Guo et al. 2013). In order to reduce and minimize the costs of preparing SOM maps, methods that use the fewest number of soil analyses should be developed (Tajik, Ayoubi, and Zeraatpisheh 2020). Reasoned and precise agriculture is necessary when considering the need for sustainable management of natural resources, environmental protection, and technological progress; it is insufficient and expensive to adapt conventional methods to study soil properties in such a system, which requires suitable spatiotemporal resolutions (Rahman et al. 2020;Zeraatpisheh et al. 2019).
Remote sensing is an alternative nondestructive method for studying soil attributes that can identify spatial patterns at a fine scale and significantly reduce fieldwork (Mfuka, Byamukama, and Zhang 2020; Shao, Wenfu, and Deren 2021). For instance, multispectral and hyperspectral remote sensors can be used to record reflectance spectra to study the properties of bare or lightly covered soil (Jensen 2014; Gomez and Lagacherie 2016; Lagacherie and Gomez 2018; Baret 2015). Soil properties can be monitored more effectively in space and time by using satellite data owing to their repetitive frequency. Since the 1990s, several studies have worked on the digital mapping of SOM using Landsat multispectral images (Jarmer et al. 2010; Demattê et al. 2007), SPOT (Vaudour et al. 2013) and hyperspectral Hyperion (Gomez, Viscarra Rossel, and McBratney 2008). Reflectance measurements have already been successfully used to predict the organic matter content of agricultural soils, either in the laboratory on dried soil samples or directly in the field, using field spectroradiometers or satellite images that provide a range of information, including organic matter and soil moisture (Poppiel et al. 2021). In the last two decades, some studies have focused on predicting the variation of SOM based on statistical models. For example, multiple linear regression (MLR) has been used, based on the linearity hypothesis and on the correlation between soil properties and environmental variables (Lagacherie and Gomez 2018; Demattê et al. 2007). To solve problems with environmental data modeling, such as lack of normality and nonlinear correlations, other researchers have used artificial neural network (ANN) models to predict soil properties (Huang, Liu, and Jiayi 2021; Dai et al. 2014). A few studies have used satellite image pansharpening techniques in digital soil mapping (Francés and Lubczynski 2011; Xu et al. 2017, 2018; Vaudour et al. 2013; Zeraatpisheh et al. 2020b).
However, this method is not currently used in SOM digital mapping. The pansharpening technique improves the spatial resolution of low-resolution multispectral images by utilizing a high-resolution panchromatic band. Additionally, it has the potential to enhance spectral image quality (Yusuf, Tetuko Sri Sumantyo, and Kuze 2013). Regular monitoring of SOM in time and space could be a tool for tracking its evolution and status, especially in our study area, which is part of the irrigation scheme in the Doukkala Plain, known throughout Morocco for its agriculture (Zeraatpisheh et al. 2020a; Bakhshandeh et al. 2019). The objectives of this study were 1) to predict SOM from Landsat-8 data by using two different modeling methods (MLR and ANN): one is linear, and the other is intelligent and nonlinear; 2) to compare the efficiency and accuracy of the two models; 3) to assess the effect of image pansharpening on SOM prediction; and 4) to use these models to assess the utility of Landsat-8 images in SOM digital mapping. This study may also provide an opportunity to compare the applied models.

Study area
The Doukkala Plain is located in western Morocco and has a large irrigated surface. Our study area lies between latitudes 32°32ʹ N and 33°47ʹ N and longitudes 8°14ʹ W and 8°32ʹ W. It includes Sidi Bennour, Sidi Smail, and a large part of the High Section in the Doukkala irrigation scheme (Figure 1). It covers an area of 436 km² in the middle of the Doukkala irrigated perimeter, at an altitude of approximately 120–130 m (Ferre and Ruhard 1975). Soil and climate conditions are favorable for agricultural development (Bouasria et al. 2020). However, the study area is dominated by micro-properties, with farm sizes of less than or equal to 5 ha (Bouasria et al. 2021).

Soil data
In this study, we selected a restricted random sampling system. The study area was divided into a 1 km grid. Within each grid segment, a single sampling unit (single location) was then randomly selected, on the condition that the soil was bare and the minimum distance between sampling points was 1 km. This technique allowed for coverage of locations throughout the entire study area. The samples were collected at a depth of 0–30 cm in September 2013. A handheld GPS device was used to determine the geographical coordinates of the points. The study included 368 observation points (Figure 1). The samples were dried, crushed, and sieved. The Walkley and Black method was used to determine the organic matter content (Walkley and Black 1934).
SOM content ranges from 0.35 to 3.72% in the study area, with a mean of 1.346% and a standard deviation of 0.481%. The coefficient of variation (CV) was 35.72%, indicating that the SOM for all samples showed moderate to high variability. Soil properties are considered normally distributed if the absolute value of the skewness coefficient is less than 0.5 (Webster and Oliver 2008). SOM had a positive skewness of 0.885, indicating that most values lay to the left of the mean and that the tail of the distribution spread to the right. After testing several types of transformation, the logarithmic transformation was found to fit the SOM data best (He et al. 2009). After the transformation, the skewness became slightly negative (−0.354). The K-S test confirmed this situation (D(368) = 0.072) at a statistically significant level (p < 0.001).
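The skewness check and logarithmic transformation described above can be illustrated with a minimal sketch. The sample values below are hypothetical (only the minimum and maximum are taken from the text), the moment-based skewness formula is an assumption about how the coefficient was computed, and the natural logarithm is assumed for the transformation.

```python
import math

def skewness(values):
    """Moment-based skewness: third central moment / variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed SOM values in percent (not the study's data).
som = [0.35, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.5, 2.0, 3.72]

# Natural-log transform (assumed), then compare skewness before and after.
log_som = [math.log(v) for v in som]
skew_raw = skewness(som)
skew_log = skewness(log_som)
print(f"skewness raw: {skew_raw:.3f}, after log: {skew_log:.3f}")
```

With right-skewed data such as these, the transformed values fall inside the |skewness| < 0.5 band cited above, which is the behavior the paragraph reports for the real samples.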

Satellite data
In this work, we used data from the Landsat-8 satellite, which is equipped with a multispectral sensor (OLI) and a thermal infrared sensor (TIRS). Landsat 8 OLI has nine bands in the Vis-NIR-SWIR that cover the following wavelengths (µm): 0.43 to 0.45 (B1 - coastal/aerosol); 0.45 to 0.51 (B2 - blue); 0.53 to 0.59 (B3 - green); 0.64 to 0.67 (B4 - red); 0.85 to 0.88 (B5 - NIR); 1.57 to 1.65 (B6 - SWIR 1); 2.11 to 2.29 (B7 - SWIR 2); 0.50 to 0.68 (B8 - PAN) and 1.36 to 1.38 (B9 - Cirrus). We selected the image of 19 October 2013, during the summer season, for three reasons: to avoid soil moisture due to local rainfall and irrigation; to avoid excessive vegetation, since almost the only crop left in the field is alfalfa, which lasts for five years; and to prevent soil disturbance by plowing, as tillage does not begin until early November. Radiometric calibration was performed to maintain a stable image quality and extract the signal characteristics from the image data. Then, an atmospheric correction of the image was performed using the FLAASH algorithm, which integrates the MODTRAN 4 model (Berk et al. 1999). After these corrections, to improve the spatial resolution of the multispectral images (30 m) and to synthesize new, more information-rich images, the image was pansharpened to 15 m by applying the Gram-Schmidt algorithm, which respects the consistency of the digital counts and limits distortions compared to other image fusion algorithms (Yusuf, Sumantyo, and Kuze 2013). We applied image pansharpening to the seven bands from B1 to B7 at 30 m resolution using band B8 at 15 m resolution. The original multispectral image at 30 m spatial resolution was named MS, and the pansharpened image at 15 m spatial resolution was named MSPAN. To keep only the bare ground pixels, we first created a vector mask that retains the irrigated area and excludes built-up areas and non-irrigated lands. Then, we generated a raster mask of −0.18 < NDVI < 0.33.
This mask removes unwanted areas from the image and gives them a zero (0) value.
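The NDVI thresholding step can be sketched as follows. This is an illustration only, not the ENVI workflow used in the study: it assumes the standard NDVI definition, (NIR − Red) / (NIR + Red), computed from B5 and B4, and represents each band as a flat list of reflectance values. The function names are hypothetical.

```python
def ndvi(red, nir):
    """Standard NDVI; returns NaN when both reflectances sum to zero."""
    denom = nir + red
    return (nir - red) / denom if denom != 0 else float("nan")

def bare_soil_mask(red_band, nir_band, low=-0.18, high=0.33):
    """True where low < NDVI < high (NaN compares False, so it is masked)."""
    return [low < ndvi(r, n) < high for r, n in zip(red_band, nir_band)]

def apply_mask(band, mask):
    """Set masked-out pixels to zero, as described in the text."""
    return [v if keep else 0.0 for v, keep in zip(band, mask)]

# Hypothetical reflectances: bare soil, dense vegetation, water-like pixel.
red = [0.20, 0.05, 0.25]
nir = [0.30, 0.45, 0.10]
mask = bare_soil_mask(red, nir)
masked_red = apply_mask(red, mask)
print(mask, masked_red)
```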

Data analysis
The soil data were prepared using QGIS v3.4 software. Image pre-processing and pansharpening were performed using IDL/ENVI v5.3. MLR and ANN analyses and modeling were performed using SAS JMP v13. SOM richness was assessed according to the DIAEA/DRHA/SEEN (2008) standard, which defines five SOM classes: 1) very poor < 0.7%, 2) poor from 0.7% to 1.5%, 3) average from 1.5% to 3%, 4) rich from 3% to 6%, and 5) very rich > 6%. To evaluate the model performance, we split the dataset using a random sampling method: 245 samples (66%) were used for training, and the remaining 123 samples (34%) were used to test the model output. The resulting modeling equations were translated into IDL/ENVI expressions and applied to the images using the Band Math tool to generate soil organic matter digital maps.
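The classification thresholds and the random split can be sketched in a few lines. The handling of values that fall exactly on a class boundary is an assumption (the standard as quoted does not say which class owns 1.5% or 3%), and the seed and function names are hypothetical; the study performed the split in JMP.

```python
import random

def som_class(som_percent):
    """Map a SOM value (%) to one of the five classes quoted in the text."""
    if som_percent < 0.7:
        return "very poor"
    if som_percent < 1.5:   # boundary assignment is an assumption
        return "poor"
    if som_percent < 3.0:
        return "average"
    if som_percent <= 6.0:
        return "rich"
    return "very rich"

def random_split(n_samples, n_train, seed=0):
    """Shuffle sample indices and split them into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return indices[:n_train], indices[n_train:]

# 368 samples split into 245 for training and 123 for testing.
train_idx, test_idx = random_split(368, 245)
print(len(train_idx), len(test_idx), som_class(1.346))
```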

Multiple linear regression (MLR)
Multiple statistical regression was used to model the relationship between image spectral reflectance and SOM concentration. The linear regression method aims to explain the value of a quantitative soil variable (dependent variable) using a linear combination of predictors in the form of a regression equation:

$$\hat{y} = a + \sum_{i=1}^{n} b_i x_i + \varepsilon$$

where ŷ is the dependent variable (soil parameter), x_i is the i-th predictor, n is the number of predictors, a is the intercept, b_i is the partial regression coefficient, and ε is the standard error of estimation. The regression equation is determined by minimizing the sum of the squares of the differences between the observed and predicted values (minimizing the residual variance).
Multiple linear regression was used as a prediction model to estimate SOM using Landsat-8 image reflectance values as input variables. In this model, the SOM was the dependent variable, and the satellite bands (B1 -B7) reflectance values were considered as independent variables.
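The least-squares criterion mentioned above can be written out explicitly. In the sketch below, the sample index j, the number of training samples m, and the symbols y_j and x_{ij} (observed SOM and reflectance of band i for sample j) are notation introduced here for illustration; a, b_i, and n are as defined in the text, with n = 7 when bands B1 to B7 are all used.

```latex
S(a, b_1, \dots, b_n) = \sum_{j=1}^{m} \left( y_j - a - \sum_{i=1}^{n} b_i x_{ij} \right)^2,
\qquad
\frac{\partial S}{\partial a} = 0, \quad
\frac{\partial S}{\partial b_i} = 0 \quad (i = 1, \dots, n).
```

Setting the partial derivatives to zero yields a linear system in a and the b_i, which is why the MLR coefficients have a closed-form solution and need no iterative training.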

Artificial neural networks (ANN)
Artificial Neural Networks (ANN) use interconnected synthetic neurons, loosely inspired by the human brain, to solve complex problems. They make it possible to model complex nonlinear relationships between explanatory variables and the variable to be explained (Haykin 2009; Huang 2009). There are a variety of ANN architectures; in this study, we used a multilayer perceptron (MLP) model with supervised learning based on error back-propagation. MLP is one of the most widely used neural networks for solving approximation, classification, and regression problems (Haykin 2009). The MLP network is composed of neurons that are linked together, in most cases by nonlinear functions (Lek, Giraudel, and Guégan 2000). It is divided into three layers (Figure 2): (i) the input layer neurons, which correspond to the explanatory variables (X), in our case the Landsat bands from B1 to B7; (ii) the hidden layer neurons, whose number is determined by the user during the training process; and (iii) the output layer neuron (Y), which corresponds to the variable to be estimated (SOM). Using a specific learning rule, an MLP can be trained on the calibration data (Srinivasa and Brion 2005; Da Silva et al. 2017). Backpropagation is one of the most widely used algorithms in all neural network paradigms for determining the weights of all neurons. This is done with gradient-descent learning, which corrects the weights according to the error between the desired and achieved outputs; the hidden layers are corrected according to their sensitivities (Melesse and Hanley 2005; Rumelhart, McClelland, and Williams 1986). This algorithm implements two main processes: a forward pass and a backward pass. In the forward pass, an input pattern is presented to the network, and its effect propagates through the network layer by layer.
The backward pass then determines the error terms at the output, which are backpropagated to compute the error terms on each hidden neuron and thus obtain the gradient evaluations. This process is repeated until the network outputs are sufficiently close to the desired outputs.

Figure 2. Multilayer perceptron neural network used for organic matter estimation (X stands for Landsat spectral data for the bands from B1 to B7, and Y stands for the predicted SOM).
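The forward and backward passes described above can be illustrated with a minimal sketch of a one-hidden-layer perceptron trained by gradient descent. Several choices here are assumptions not stated in the text: the tanh hidden activation, the linear output neuron, the squared-error loss, the learning rate, and the synthetic data. The study's models were built in JMP, not with this code.

```python
import math
import random

def init_mlp(n_in, n_hidden, seed=0):
    """Small random weights; 7 inputs would correspond to bands B1 to B7."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(net, x):
    """Forward pass: tanh hidden layer (assumed), linear output neuron."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    y_hat = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, y_hat

def train_step(net, x, y, lr):
    """Backward pass for one sample on the loss 0.5 * (y_hat - y) ** 2."""
    hidden, y_hat = forward(net, x)
    err = y_hat - y
    # Hidden-neuron error terms, computed before the output weights change.
    deltas = [err * w * (1.0 - h * h) for w, h in zip(net["w2"], hidden)]
    for j, h in enumerate(hidden):
        net["w2"][j] -= lr * err * h
    net["b2"] -= lr * err
    for j, d in enumerate(deltas):
        for i, xi in enumerate(x):
            net["w1"][j][i] -= lr * d * xi
        net["b1"][j] -= lr * d

def mean_loss(net, xs, ys):
    return sum(0.5 * (forward(net, x)[1] - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Synthetic reflectance-like inputs and a nonlinear target (illustration only).
rng = random.Random(1)
xs = [[rng.uniform(0.0, 0.5) for _ in range(7)] for _ in range(40)]
ys = [1.5 - 2.0 * x[3] + math.sin(4.0 * x[5]) for x in xs]

net = init_mlp(n_in=7, n_hidden=4)
loss_before = mean_loss(net, xs, ys)
for epoch in range(200):
    for x, y in zip(xs, ys):
        train_step(net, x, y, lr=0.05)
loss_after = mean_loss(net, xs, ys)
print(f"mean loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

The number of hidden neurons is the main structural choice left to the user; in practice it is tuned by comparing validation error across several candidate sizes.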

Models accuracy
When calculating predictive equations for linear regressions, only parameters with statistical significance of p ≤ 0.01 were taken into account. Significant differences between the observed and predicted SOM values were determined using a fit test (p ≤ 0.05). For the performance analysis, we used three statistical parameters: the coefficient of determination (R²), root mean square error (RMSE) and mean absolute error (MAE). Furthermore, the performance of each model was assessed by plotting the estimated value versus the actual value and testing the statistical significance of the regression parameters. The statistical indices were calculated as follows:

$$R^2 = 1 - \frac{\sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{N} (Y_i - \bar{Y})^2}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (Y_i - \hat{Y}_i)^2}$$

$$\mathrm{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| Y_i - \hat{Y}_i \right|$$

where $Y_i$, $\hat{Y}_i$ and $\bar{Y}$ represent the observed, predicted, and mean SOM content test values, respectively, and N represents the number of observations (= 123).
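The three indices named above (R², RMSE, and MAE) can be computed with a short sketch. It assumes the standard definitions of these statistics; the observed and predicted values are hypothetical and serve only to exercise the functions.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

# Hypothetical SOM values in percent (not the study's data).
observed = [1.0, 1.5, 2.0, 2.5]
predicted = [1.1, 1.4, 2.2, 2.3]
print(r_squared(observed, predicted), rmse(observed, predicted), mae(observed, predicted))
```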

State of organic matter content
The soils in the studied area were mostly deficient in SOM (Figure 3). A total of 63.3% of the samples had low SOM content (< 1.5%). Soils with medium SOM accounted for 35.6% of the total area. Soils rich in organic matter, on the other hand, accounted for only 1.1%. This situation could be explained by climate conditions (arid and semi-arid) as well as agricultural intensification (Badraoui 2006;Naman 2003). In general, agricultural practices such as agricultural intensification, tillage, irrigation mode, crop rotation, and overall residue management have significant spatial and temporal impacts on soil organic matter variability (Badraoui, Agbani, and Soudi 2000).
Organic matter levels depend on soil texture (particularly clay content), soil depth, and soil class (Badraoui, Agbani, and Soudi 2000). The low contents observed here are mostly independent of the soil's intrinsic properties and are explained by (i) the high mineralization that is encouraged by optimal hydric and thermal conditions, (ii) the poor management of the organic residues collected, (iii) the export of crop residues outside cultivated plots, and (iv) the soil that remains stuck to the sugar beet roots during the harvesting period (Rahoui et al. 2000; Naman 2003).
Soils are poor in organic matter, which reduces their capacity to retain water and exchangeable bases (Chivenge et al. 2007). Reduced soil organic matter levels will lead to a decrease in soil fertility, soil nutrient supply, porosity, penetrability, and thus soil productivity (Gray and Morant 2003). SOM is the primary determinant of soil fertility and quality and is closely related to soil productivity (Reeves 1997). It is an essential indicator for estimating soil carbon stocks (Hamzehpour, Shafizadeh-Moghadam, and Valavi 2019). Therefore, the use of organic inputs as an amendment improves the degraded soil. It is strongly recommended to maintain soil organic matter through organic fertilizers and to bury plant residues (Fatima, Soudi, and Chiang 2015;Badraoui 2006).

Performances of MLR Models
The multiple regression results for predicting SOM from the earth observation data are shown in Table 1. The R² values are high at both resolutions, demonstrating the importance of Landsat-8 data in modeling variations in soil properties at the surface. The difference between the R² and adjusted R² values was also minimal, indicating that the predictor variables explained the dependent variable well.
Multiple linear regression analysis was conducted to determine whether the Landsat-8 data could significantly predict the soil organic matter concentration. The regression results indicated that the models explained 63.30% and 66% of the variance for MS and MSPAN, respectively. The models were significant predictors of soil organic matter concentration, with F(5, 362) = 125.51 (p < 0.001) and F(5, 362) = 140.53 (p < 0.001) for MS and MSPAN, respectively (Table 2). Each retained variable was a statistically significant predictor of SOM (p < 0.05). The relationship between the measured and predicted soil organic matter values gave (training/validation) R² = 0.62/0.63 and R² = 0.65/0.66 for MS and MSPAN, respectively, at a high significance level (p < 0.001). In other words, the models for MS and MSPAN (63.4% and 66%, respectively) were found to be accurate in predicting SOM (Figure 4). The RMSE of the models was (training/validation) 0.23/0.22 and 0.22/0.21, with MAE of 0.18/0.17 and 0.17/0.16 for MS and MSPAN, respectively. The findings indicate a moderately positive relationship between SOM concentration and Landsat-8 data. The regression equations were statistically significant (p < 0.0001). Estimated SOM values ranged from 0.49 to 2.86 (mean ± SD of 1.31 ± 0.38) for the MLR-MS models and from 0.54 to 2.62 (1.32 ± 0.38) for the MLR-MSPAN models.
Residual plots of the observed SOM versus the estimated SOM (Figure 5) showed that all samples were randomly distributed between the two extremes (max and min). The studentized residual plots showed the absence of outliers (Figure 6).

Performances of ANN Models
Determining the optimal number of hidden layer neurons is an essential step in the development of an MLP structure. After several tests with different combinations, we selected four hidden layer nodes for the MS image and six for MSPAN. SOM was predicted using the optimal structures (Table 3). MS and MSPAN images showed statistically significant relationships (p < 0.001) between the measured and predicted SOM values with R² = 0.65/0.66 and R² = 0.69/0.71 (training/validation), respectively. Thus, the models explained 65.50% and 71.50% of the SOM variance for the MS and MSPAN images, respectively. The RMSE values (training/validation) were 0.22/0.20 and 0.21/0.18, and the MAE values were 0.17/0.16 and 0.16/0.14, for the MS and MSPAN images, respectively. Therefore, the findings indicate a moderately positive relationship between the observed and estimated SOM. The results also reveal that the ANN models have higher accuracy in the validation than in the calibration, which suggests that the models are not over-fitted (Figure 7). Estimated SOM values ranged from 0.46 to 2.46 (mean ± SD of 1.32 ± 0.37) for the ANN-MS models and from 0.55 to 2.35 (1.32 ± 0.39) for the ANN-MSPAN models.

Comparison of the two models
The results of this study agree with similar studies that investigated soil attributes using Landsat data and MLR as a predictive model (Demattê     Zhang and Huang 2015). Some studies have adopted SOM prediction modeling techniques from other sensor data, such as ASTER (Nawar, Buddenbaum and Hill 2015) and SPOT (Vaudour et al. 2013). The majority of studies indicate that electromagnetic energy at specific wavelengths interacts with certain soil properties, and that analysis of this behavior can be used to model and map these soil attributes. The ANN model improved the MAE and RMSE, which were (calibration/validation) 0.17/0.16 and 0.22/0.20 for the MS image and 0.16/0.14 and 0.21/0.18 for the MSPAN image, respectively. These results are in agreement with previous studies (Dai et al. 2014; Mirzaee et al. 2016; Guo et al. 2013).
The ANN model yielded better results than the MLR for both MS and MSPAN images. In contrast to MLR, ANN models do not require any prior knowledge of the relationship between input and output (Shafizadeh-Moghadam et al. 2017;Kingsley John et al. 2020). The optimal, possibly nonlinear, relationships linking the input (spectral bands) to the output (SOM) were implemented in an iterative calibration procedure using ANN methods (Schaap, Leij, and van Genuchten 1998). As a result, ANN approaches were successful in extracting as much information as possible from the data (Schaap, Leij, and van Genuchten 1998). Moreover, the efforts required to calibrate the ANN and MLR models were similar, and both required approximately the same amount of processing time and resources (Hattab et al. 2013;Guo et al. 2013).
Although the ANN and regression models produced accurate predictions, there was some unexplained variation in SOM. This variation could be due to a number of factors, such as unsustainable agricultural practices, which can have a significant impact on the distribution of SOM in topsoil (Guo et al. 2013). In addition, other factors such as soil iron oxide richness, texture, and soil type homogeneity can all have an impact on the spectral reflectance of most soil properties, including SOM (Mondal et al. 2017; Demattê et al. 2007; John et al. 2021b). Indeed, R² values ranging between 0.63 and 0.71 could be considered significant, because soil complexity makes it difficult to quantify soil attributes with a sensor located 800 km from the target (Demattê et al. 2007; John et al. 2021a).

Pansharpening effect on prediction quality
For the MLR model, pansharpening improved the prediction accuracy by 2.60% and reduced the error by 0.80%. For the ANN model, it improved the prediction accuracy by 4.30% and reduced the error by 1.30%. Similarly, compared to the MLR model, the ANN model improved the prediction accuracy by 2.10% and reduced the error by 0.20% for the MS image, and improved the accuracy by 3.80% and reduced the error by 0.70% for the MSPAN image. In this study, the ANN model developed to predict SOM explained approximately 70% of the total SOM variability. ANN models explained more variability because they can capture nonlinear relationships between the input and output variables. Overall, the results show that the ANN model successfully identified most of the remote sensing data that influenced SOM. These results also suggest that this methodology can be applied to other regions to analyze soil property data through satellite imagery.
In image enhancement, multispectral image pansharpening has two benefits. It increases the spatial resolution on the one hand, and on the other hand it preserves the spectral fidelity of the image (Xu et al. 2017, 2018). The most widely used image pansharpening methods in digital soil mapping are the Brovey method (TeMing et al. 2001), the Gram-Schmidt (GS) method (Laben and Brower 2000), and the Intensity-Brightness-Hue-Saturation method (Kalpoma and Kudoh 2007).
Few studies have used pansharpening techniques for satellite images in digital soil mapping. Francés and Lubczynski (2011) used QuickBird and orthophoto aerial photos for soil-type classification. Using MLR as a statistical predictive model, Vaudour et al. (2013) concluded that the pansharpened SPOT image has a higher predictive capability for soil organic carbon content than the original image. Xu et al. (2018) applied several pansharpening methods to four different sensor images (WorldView-2, Pleiades-1A, GeoEye-1, and Landsat 8) to estimate total soil nitrogen and exchangeable soil potassium from spectral indices. The results showed that pansharpening, especially the GS method, had a positive effect on the prediction enhancement.
In most studies, the GS method preserves the spectral and spatial information of soil characteristics in the original image better than the other techniques (Sarp 2014; Zhang and Huang 2015). Ghosh and Joshi (2013) used 12 pansharpening techniques on WorldView-2 images, and the results indicated that the GS method is among the most effective in increasing spatial resolution. Compared to other methods, the GS method has the advantage of preserving spectral quality, improving spatial resolution, and avoiding color distortion (Yusuf, Sumantyo, and Kuze 2013).
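To make the idea behind Gram-Schmidt-type fusion more concrete, the sketch below uses a simplified component-substitution form in which each band receives the panchromatic detail scaled by a covariance-based gain. It is not the ENVI implementation used in the study. It assumes that the multispectral bands have already been resampled to the panchromatic grid, that the low-resolution panchromatic band is simulated as the unweighted mean of the bands, and that bands are flat lists of pixel values; the function name and data are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def gs_like_fusion(ms_bands, pan):
    """Inject pan detail into each band: fused = band + gain * (pan_adj - I)."""
    n = len(pan)
    # Simulated low-resolution pan: unweighted band average (assumption).
    intensity = [sum(b[i] for b in ms_bands) / len(ms_bands) for i in range(n)]
    mean_i = mean(intensity)
    var_i = sum((v - mean_i) ** 2 for v in intensity) / n
    # Match the pan band to the mean and standard deviation of the intensity.
    mean_p = mean(pan)
    var_p = sum((v - mean_p) ** 2 for v in pan) / n
    scale = (var_i / var_p) ** 0.5 if var_p > 0 else 1.0
    pan_adj = [(p - mean_p) * scale + mean_i for p in pan]
    fused = []
    for band in ms_bands:
        mean_b = mean(band)
        cov = sum((band[i] - mean_b) * (intensity[i] - mean_i) for i in range(n)) / n
        gain = cov / var_i if var_i > 0 else 0.0
        fused.append([band[i] + gain * (pan_adj[i] - intensity[i]) for i in range(n)])
    return fused

# Two hypothetical bands of four pixels and a pan band with extra detail.
ms_bands = [[0.10, 0.20, 0.30, 0.40], [0.30, 0.10, 0.40, 0.20]]
pan = [0.22, 0.12, 0.38, 0.28]
fused = gs_like_fusion(ms_bands, pan)
print(fused)
```

One property of this form is that a panchromatic band carrying no detail beyond the simulated intensity leaves the multispectral values unchanged, which is one way to see why the approach limits spectral distortion.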

Soil organic matter digital mapping
The equations from the MLR and ANN models for the MS and MSPAN images were translated into functions in the IDL programming language, which were then integrated into the ENVI software through the Band Math tool to generate the digital maps (Figure 8).
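Applying a fitted regression equation pixel by pixel, as the Band Math step does, can be sketched as follows. The intercept and coefficients are hypothetical placeholders rather than the study's fitted values, only two bands are used for brevity, and masked pixels are assumed to carry a zero value in every band.

```python
def apply_linear_model(bands, intercept, coefs, nodata=0.0):
    """Evaluate intercept + sum(coef * band) per pixel, skipping masked pixels."""
    n_pixels = len(bands[0])
    result = []
    for p in range(n_pixels):
        pixel = [band[p] for band in bands]
        if all(v == nodata for v in pixel):
            result.append(nodata)  # keep masked pixels at the no-data value
        else:
            result.append(intercept + sum(c * v for c, v in zip(coefs, pixel)))
    return result

# Hypothetical two-band image with one valid pixel and one masked pixel.
bands = [[0.1, 0.0], [0.2, 0.0]]
som_map = apply_linear_model(bands, intercept=2.0, coefs=[-1.0, -2.0])
print(som_map)
```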
Low values were found in fersiallitic soils which were located mainly in the south of the study area.
The medium values were found in the modal isohumic and immature soils. High values were observed in vertisols and vertic isohumic soils for the four SOM digital maps. SOM richness was related to the mineralization coefficient of the stable humus. Indeed, fersiallitic soils and immature soils have the highest coefficients, and vertisols and isohumic soils have relatively low coefficients (Fatima, Soudi, and Chiang 2015). This could also be due to the low clay fraction in fersiallitic and poorly evolved soils, which has a physical influence on the stability and protection of SOM (Soudi, Naman, and Chiang 2000; Naman 2003; Fatima, Soudi, and Chiang 2015). Indeed, the amount of SOM trapped in microaggregates increases in fine-textured soils; this SOM is not accessible to microorganisms and therefore remains physically protected (Hassink 1992). In addition, SOM adsorption occurs on the clay surfaces (Hassink 1994).
Overall, the soils are relatively poor in SOM because of their strong mineralizing power (Fatima, Soudi, and Chiang 2015), which is supported by environmental hydric and thermal conditions (Rahoui et al. 2000). This situation endangers soil equilibrium and SOM richness, which are continuously declining (Badraoui, Agbani, and Soudi 2000; Soudi, Naman, and Chiang 2000). This problem can only be solved through reasoned and sustainable agricultural practices and regular spatio-temporal monitoring of the SOM status in this area.

Conclusions
This study investigated the utility of remote sensing data for estimating SOM variability in a highly fragmented irrigated area with a semi-arid climate in central-western Morocco. Multivariate statistical models and ANN models were used to determine the relationship between SOM and remotely sensed data. The results obtained show that the MLR models predicted the SOM with R² values of 0.63 and 0.66 and RMSE values of 0.22 and 0.21 for the MS and MSPAN images, respectively. In contrast, the ANN models predicted the SOM with R² values (calibration/validation) of 0.65/0.66 and 0.69/0.71 and RMSE values of 0.22/0.20 and 0.21/0.18 for the MS and MSPAN images, respectively. Image pansharpening improved the prediction accuracy by 2.60% and 4.30% and reduced the estimation error by 0.80% and 1.30% for the MLR and ANN models, respectively.
As a result, the ANN model outperformed the MLR model in terms of its predictive performance. These findings also suggest that future research should consider other soil properties when calibrating statistical models, because soil reflectance properties are affected by a variety of factors, including soil moisture, structure, texture, and mineral composition. Their integration into statistical modeling could lead to higher accuracy in the use of remotely sensed imagery. Image pansharpening also improved the quality of the images and, thus, the quality of the SOM estimation. This improvement concerns the spectral and spatial aspects, especially in a very fragmented perimeter with very small plots.
The Landsat 8 image pansharpening technique was used for SOM prediction, which improved the estimation accuracy with higher spatial resolution. These maps have significant value in research and decision-making because they are free and easy to acquire. Remote sensing-based maps can be used by agricultural stakeholders such as farmers and scientists to identify SOM variation in a small area and to implement field-specific soil management and fertilization plans. Because Landsat imagery is freely available, these MSPAN/MS Landsat-based soil prediction models can be widely used by smallholder farmers in developing countries and help them to develop more field-specific sustainable soil management schemes. Therefore, this work contributes to the growing body of digital soil mapping methods in Morocco.
Abdelmejid Rahimi is a Professor of Geology and Geomatics. He started in 1991 as an Assistant Professor in the Department of Geology at the Faculty of Sciences of Ben M'Sik in Casablanca and then at the Faculty of Sciences in El Jadida. He has been a Senior Professor since 1998. He is the author of several publications and scientific communications. His scientific work focuses on the contribution of remote sensing and GIS in monitoring the dynamics of land use changes and their impacts on the environment.
El Mostafa Ettachfini has been a Professor of sedimentary geology and paleontology at the Chouaïb Doukkali University (UCD), El Jadida, Morocco, since 1993. He received his first PhD from Paul Sabatier University of Toulouse, France, in 1992 and his second PhD (Doctorat Es-Sciences) from UCD in 2006. His main interest, pursued through international collaborations (France, Switzerland, Belgium, Germany, Spain, Algeria and Tunisia), concerns the Cenomano-Turonian and its anoxic event. Since 2010, he has been an expert in geological mapping at the Ministry of Energy and Mines, Morocco.
Badr Rerhou received the engineer degree in agronomy in 2008 from the IAV HASSAN II Institute, Morocco, and the MSc in innovation and sustainability in agro-food production in the Mediterranean area from the University of Catania, Italy. He is currently a PhD candidate in management of soil fertility and crop fertilization at the IAV HASSAN II Institute, Morocco. His interests are crop protection and fertility effects on soil pathogens.

Data availability statement
The data that support the findings of this study are available on request from the corresponding author.