Comparative analysis of GIS and RS based models for delineation of groundwater potential zone mapping

Abstract Groundwater is a crucial natural resource that varies in quality and quantity across Khyber Pakhtunkhwa (KPK), Pakistan. Increased population and urbanization place enormous demands on groundwater supplies, reducing both their quality and quantity. This research aimed to delineate the groundwater potential zones in the Kohat region, Pakistan, by integrating twelve thematic layers. Groundwater Potential Zone (GWPZ) maps of the Kohat region were created by implementing the Weight of Evidence (WOE), Frequency Ratio (FR), and Information Value (IV) models. Sentinel-2 satellite data were used to generate a groundwater inventory map using machine learning algorithms in Google Earth Engine (GEE), and the inventory was validated with a field survey and ground data. The inventory data were divided into training (80%) and testing (20%) datasets. The WOE, FR, and IV models were applied to assess the relationship between the inventory data and the groundwater conditioning factors and to generate the GWPZ of the Kohat region. The Area Under Curve (AUC) values for the WOE, FR, and IV models were 88%, 91%, and 89%, respectively. The final GWPZ can aid future planning for groundwater exploration, management, and water supply in the Kohat region.


Introduction
Groundwater is an important natural resource that makes up about 34% of the world's freshwater supply (Tariq, Siddiqui, et al. 2022). It is the primary water supply and is regarded as less contaminated than other water sources. It supplies approximately half of the freshwater that can easily be accessed and used regularly for cleaning, drinking, and cooking (Termeh et al. 2019). Groundwater meets the requirements of 97% of the world's population for freshwater and provides 50% of the world's irrigation (Tariq and Shu 2020). It can be considered an essential natural asset that occurs in the sediments and fractures of soil and rock (Ahmad et al. (Kaliraj et al. 2014), Quadratic Discriminant Analysis (QDA) (Baloch et al. 2021), K-Nearest Neighbour (KNN) (Naghibi et al. 2018). The SVM is a technique used to predict the Groundwater Potential Zone (GWPZ) (Eid et al. 2023).
Groundwater is the most vital natural resource for sustainability in Pakistan because of the country's agriculture-dependent economy. Although groundwater is a vital element of the economy, population growth, industrialization, unscientific exploration, and groundwater mismanagement have created a major risk to this valuable resource (Moazzam et al. 2022). Therefore, GWPZ mapping is an indispensable technique for mapping and managing the precious water resources in the area of interest (Baloch et al. 2021). Numerous field survey methods, i.e. geological, geophysical, and hydrological studies, have been used by researchers to demarcate potential groundwater zones (Israil et al. 2006). These methods require considerable human resources and financial budgets, and they can be very time-consuming.
In this research, WOE, FR, and IV models were utilized to locate the GWPZ in the KPK region of Pakistan. Although several studies across Pakistan have used RS and GIS techniques to delineate groundwater potential maps, none has been conducted in this region, where sustainable groundwater resources management is essential for the industrial and commercial development and the economy of the country. In the current study, we used twelve influencing parameters that are considered significant to delineate very low, low, medium, high, and very high groundwater potential regions. These parameters were prepared in the GIS platform from various ground and RS data. Three geospatial techniques were used to compute the association of the influencing parameters with the groundwater inventory data and to delineate potential groundwater regions in district Kohat, Pakistan. These GIS-based groundwater mapping models have not previously been investigated in district Kohat. The final GWPZ map can help decision-makers assess and manage groundwater in various parts of the study area.

Study area
The current research is conducted in district Kohat, situated in the southern part of Khyber Pakhtunkhwa (KPK), Pakistan. The Kohat region extends from 33°35′13″ to 33°49′73″ N and from 71°52′49″ E to 71°26′32″ E (Figure 1a-c) (Hussain 2014). The study area lies at an elevation of about 2000 m. Climatically, the region has a steppe climate with slight precipitation throughout the year. In Kohat District, the summers are long, hot, humid, and clear, while the winters are brief, cold, and mostly clear. Temperatures below −0.5 °C or above 43.33 °C are extremely uncommon throughout the year. On average, the temperature ranges from 2.2 °C to 39.45 °C (Azra et al. 2019). Geologically, the study area is situated in the Kohat Plateau. It includes a fold and thrust belt in which thin-skinned structures are covered by thick-skinned structures. Compressional structures dominate a significant portion of the plateau; however, strike-slip faulting is limited to the southern Kohat Plateau (Hussain and Zhang 2018). The Kohat Plateau is mainly occupied by Eocene limestone, shale, evaporites, and subordinate clays, and by younger clastic sedimentary rocks of Miocene-Pliocene age (Hussain et al. 2021). The sedimentary rocks of the plateau range in age from Paleocene to Pliocene and were first deposited on the northern Indian plate margin (Tariq and Qin 2023).

Datasets
Various datasets were used to generate the parameters applied in the current work. The datasets comprise ground data (organizational records and field survey data) and RS data. The ground and satellite information used to prepare the twelve influential parameters for groundwater potential was acquired from appropriate national and international research platforms. The data details and sources of information are given in Table 1.

Methodology
The study was conducted in four phases: i) preparation of the groundwater inventory map of the study area using geospatial, machine learning, and field survey techniques, ii) generation of twelve influential groundwater parameters, iii) generation of the GWPZ using three geospatial models (WOE, FR, and IV), and iv) validation and accuracy assessment using the AUC technique. The complete methodology for the present investigation is shown in Figure 2.

Inventory map of surface water bodies
An accurate water inventory map is the primary and essential input for generating the GWPZ of the region of interest. The ground and RS data for the inventory map were collected from various public organizations and satellite sources. The inventory map of different water bodies was prepared from Sentinel-2 data using an ML model. The inventory map was validated and verified with ground data collected from the public department of district Kohat and with field surveys in the Kohat region. Finally, the inventory data were divided into training (80%) and testing (20%) datasets (Zhu et al. 2022).
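The 80/20 division described above can be sketched as a simple random split. This is a minimal illustration only: the text does not state how the split was drawn, so the random permutation, the seed, the function name `split_inventory`, and the sample coordinates are all assumptions.

```python
import numpy as np

def split_inventory(points, train_fraction=0.8, seed=42):
    """Randomly split inventory points into training and testing subsets."""
    rng = np.random.default_rng(seed)
    points = np.asarray(points)
    order = rng.permutation(len(points))          # shuffled row indices
    n_train = int(round(train_fraction * len(points)))
    return points[order[:n_train]], points[order[n_train:]]

# Hypothetical inventory: 100 (x, y) locations on a made-up grid
inventory = np.column_stack([np.arange(100), np.arange(100)[::-1]])
train, test = split_inventory(inventory)
```

Fixing the seed keeps the split reproducible, so the same testing points can be reused when validating all three models.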

Preparation of GWPZ conditioning parameters
Selecting the groundwater conditioning parameters is a significant task that affects the final GWPZ output map; hence, the conditioning parameters should be chosen carefully (Bui et al. 2019). The existence and yield of groundwater in a given aquifer are influenced by numerous parameters. In the present study, twelve conditioning factors, namely elevation, slope angle, aspect, curvature, drainage network, rainfall, LULC, lithology, fault distance, soil, NDVI, and road distance, are considered to evaluate their influence on groundwater potential in the study area.
2.3.2.2. Slope. The slope gradient is another significant parameter for groundwater potential because the slope angle directly influences the amount of rainwater infiltration and surface run-off in any region. A steep slope gradient negatively impacts groundwater reservoirs because a higher slope enables rapid run-off and reduces water infiltration. In contrast, a low slope promotes water infiltration and potential recharge (Maskooni et al. 2020). The slope of the Kohat region was reclassified into five classes, i.e. <5°, 5-15°, 15-25°, 25-35°, and >35°, using ArcGIS 10.8, as shown in Figure 3b. The highest slope in the research area is 78°, while the lowest is 0°.
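The five-class slope reclassification can be illustrated outside ArcGIS with a few lines of NumPy. This is a sketch, not the workflow used in the study: the class codes 1 to 5 and the treatment of boundary values (a slope of exactly 5° falls in the 5-15° class, and exactly 35° falls in the top class) are assumptions.

```python
import numpy as np

SLOPE_BREAKS = [5, 15, 25, 35]  # class boundaries in degrees, from the text

def reclassify_slope(slope_deg):
    """Map slope in degrees to class codes 1..5 (<5, 5-15, 15-25, 25-35, >35)."""
    # np.digitize returns 0 for values below the first break, 4 above the last
    return np.digitize(slope_deg, SLOPE_BREAKS) + 1

# Hypothetical slope raster covering the reported range of 0 to 78 degrees
slope = np.array([[0.0, 4.9, 5.0], [20.0, 30.0, 78.0]])
classes = reclassify_slope(slope)
```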
2.3.2.3. Aspect. The slope aspect describes the slope direction, which affects the quantity of precipitation, solar radiation, wind speed, and LULC; these in turn control the amount of water that permeates the pore spaces of sediments and so influence groundwater potential in the region (Solomon and Quiel 2006). The aspect of the study area was generated and reclassified into nine classes, as shown in Figure 3c.
2.3.2.4. Curvature. The curvature map indicates the capacity of the land surface to store and hold water. Usually, concave (depressed) structures accumulate more water (Pham et al. 2019). The curvature of the Kohat region was calculated from the ALOS DEM with 12.5 m resolution and reclassified into concave, flat, and convex groups, as shown in Figure 3d.
2.3.2.5. Drainage network. The drainage network has an inverse association with the percolation of water into the fractures and sediments of strata because high river density discourages water retention (Kordestani et al. 2019). Where the river network density is high, recharge is low, and vice versa, because river density favours surface runoff and decreases infiltration. Five buffers were applied to the stream network of the research area, as shown in Figure 3e.
(Bui et al. 2019). Moreover, bare ground and built-up regions usually display low potential, while vegetation and areas near water reservoirs show higher groundwater potential. The LULC map of the Kohat region was generated from Sentinel-2 data in GEE using an ML algorithm. This causative factor was further categorized into six classes to evaluate their influence on groundwater, as shown in Figure 3g. Classification accuracy measures, namely overall accuracy, omission and commission errors, and Cohen's kappa statistic, were derived from confusion matrices (Firdaus 2014). With respect to the reference data, the commission error is the percentage of pixels that have been incorrectly assigned to a class to which they do not belong. The omission error, on the other hand, is the percentage of pixels that should have been assigned to a particular class according to the reference data but were not. The omission and commission errors were determined for each LULC class and averaged over all classes.
2.3.2.8. Lithology. The lithology of strata controls the porosity and permeability of aquifers and influences groundwater through its conductivity and penetrability. These rock and soil properties affect the existence, accumulation, and mobility of groundwater (Muavhi et al. 2022). The lithology of the Kohat region was extracted from the Northern Geological Map of Pakistan, as shown in Figure 3h.
2.3.2.9. Fault distance. Fault buffers at various distance intervals were selected for analysis because faults influence the subsurface flow of fluids (Yin et al. 2018). Geological faults and fractures are therefore critical parameters in detecting groundwater sources. The faults were digitized in this research, as shown in Figure 3i. Five buffers were applied to calculate the relationship of faults with groundwater potential in the study area.
2.3.2.10. Soil types. Soil is the uppermost horizon of land and helps in water infiltration. Soil type is a crucial conditioning factor in groundwater potential mapping because the infiltration capacity of an area is controlled by the pore spaces of the soil (Tariq, Jiango, Lu, et al. 2023). The soil's texture and structure determine its permeability, which in turn represents the soil's capacity to let water and other substances penetrate it (Tariq, Mumtaz, et al. 2023). The soil type map of the research region was produced from FAO data and the Soil Survey of Pakistan, as shown in Figure 3j.

2.3.2.11. NDVI. There is an indirect association between NDVI and groundwater. For example, where plant density increases, the depth to the groundwater table decreases, and vice versa. NDVI values at various water table depths reveal that dense vegetation occurs in regions with a shallow water table (Islam et al. 2022). The NDVI map of the Kohat region was calculated from Sentinel-2 data using machine learning techniques in GEE. The final NDVI map was reclassified into two classes in the GIS environment, as shown in Figure 3k.
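For reference, NDVI is the normalized difference of near-infrared and red reflectance, NDVI = (NIR − Red) / (NIR + Red); for Sentinel-2 these are bands B8 and B4. The sketch below shows the calculation on plain arrays rather than in GEE; the function name `ndvi` and the reflectance values are illustrative assumptions, and pixels where both bands are zero are returned as NaN by choice.

```python
import numpy as np

def ndvi(nir, red):
    """Normalized Difference Vegetation Index, NaN where NIR + Red is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other cells stay NaN
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out

# Hypothetical reflectances: vegetated pixel, bare pixel, no-data pixel
values = ndvi([0.5, 0.3, 0.0], [0.1, 0.3, 0.0])
```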
2.3.2.12. Road distance. The road distance map of District Kohat was generated in ArcGIS from the Google Earth platform and the road network map of the KPK Highway Authority, as shown in Figure 3l.

Groundwater potential zone mapping models
Geospatial modelling was applied in the current research to evaluate the association between the groundwater conditioning parameters and the groundwater inventory data and to generate the GWPZ of the Kohat region. The three models applied in the present study are explained below.

WOE model
This GIS-based technique applies a log-linear form of Bayes' rule to combine data and estimate the unconditional and conditional probability of events (Elmoulat et al. 2015). The WOE model computes the spatial association between the dependent variable, i.e. the location of water bodies, and the independent variables, i.e. the groundwater conditioning parameters, and computes a weight for each class of each parameter. The WOE method was first developed for mineral potential mapping using GIS-based modelling (Bonham-Carter et al. 1989). In this technique, the W⁺ and W⁻ weights are the key quantities. The weight of a conditioning factor (B), based on the presence or absence of water bodies (C) in the study region, is estimated using Eqs. (1)-(3) (Bonham-Carter et al. 1989).
In these equations, p is the probability and ln is the natural logarithm. B and B̄ denote the presence and absence of the causative factor class, respectively. Similarly, C and C̄ denote the presence and absence of inventory water bodies, respectively (Xu et al. 2012). W⁺ signifies the presence of the conditioning parameter at the spatial positions of water bodies, and its magnitude indicates the strength of the positive relationship between the conditioning parameter and water body occurrence. W⁻ represents the absence of the groundwater parameter, and its magnitude indicates the level of negative relationship.
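Eqs. (1)-(3) are not reproduced in this text. For orientation, the standard weights-of-evidence expressions of Bonham-Carter et al. (1989), written with the symbols defined above, are given below. That the third expression (the contrast, here given the hypothetical symbol W_c because C already denotes water bodies) corresponds to the paper's Eq. (3) is an assumption.

```latex
\begin{align}
W^{+} &= \ln \frac{p(B \mid C)}{p(B \mid \bar{C})} \\
W^{-} &= \ln \frac{p(\bar{B} \mid C)}{p(\bar{B} \mid \bar{C})} \\
W_{c} &= W^{+} - W^{-}
\end{align}
```

A positive contrast indicates that the class is spatially associated with water bodies, and a negative contrast indicates that water bodies tend to be absent from it.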

FR model
The FR technique is a widely used bivariate statistical model and a valuable GIS-based approach for evaluating the relationship between the groundwater inventory and groundwater conditioning parameters (Guru et al. 2017). The FR model has been effectively utilized for GWPZ mapping in various regions of the world. An FR value greater than one indicates a positive association between a parameter class and groundwater occurrence, whereas a value below one indicates a weak association. Eq. (4) is applied to compute the FR for all causative factors in the present study (Ahmad et al. 2022).
$$\mathrm{FR} = \frac{E/F}{M/L} \qquad (4)$$
where FR = frequency ratio for each class of a conditioning parameter, E = number of water body pixels in each class of the conditioning parameter, F = total number of water body (well) pixels in the research region, M = number of pixels in each class of the conditioning parameter, and L = total number of pixels in the study area.
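The frequency ratio can be computed per class from a classified raster and an inventory mask. The following is a minimal sketch using the symbols E, F, M, and L defined above; the function name `frequency_ratio` and the tiny example rasters are hypothetical.

```python
import numpy as np

def frequency_ratio(class_map, inventory_mask):
    """Return {class_code: FR} with FR = (E / F) / (M / L)."""
    class_map = np.asarray(class_map)
    inventory_mask = np.asarray(inventory_mask, dtype=bool)
    F = inventory_mask.sum()   # all water body pixels in the region
    L = class_map.size         # all pixels in the region
    ratios = {}
    for code in np.unique(class_map):
        in_class = class_map == code
        E = (in_class & inventory_mask).sum()  # water body pixels in the class
        M = in_class.sum()                     # all pixels in the class
        ratios[int(code)] = float((E / F) / (M / L))
    return ratios

# Hypothetical 2 x 4 raster with two classes and three water body pixels
classes = np.array([[1, 1, 2, 2], [1, 1, 2, 2]])
wells = np.array([[1, 1, 1, 0], [0, 0, 0, 0]], dtype=bool)
fr = frequency_ratio(classes, wells)
```

In this example class 1 holds two of the three water body pixels on half of the area, so its FR exceeds one, while class 2 falls below one.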

IV model
In the current research, the IV model is applied to produce the GWPZ of the Kohat region. IV is one of the most suitable approaches for selecting significant parameters, ranking variables by importance, and computing their association with the inventory data of the study area in a predictive model (Pardeshi et al. 2013). The IV model was improved by Shano et al. (2020). This article computes the IV for each parameter class based on the presence of groundwater inventory pixels in that class. The computed information value helps determine the role of each parameter class in groundwater occurrence (Ali et al. 2023). The conditional probability was calculated by dividing the number of groundwater pixels in each parameter class by the number of pixels in that class, while the prior probability was calculated by dividing the total number of groundwater pixels in the research region by the total number of pixels in the region, using Eq. (5) (Pardeshi et al. 2013).
W symbolizes the weight of the parameters for groundwater.
practice for entire number of pixels in region.
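Eq. (5) is not reproduced in this text. Based on the description above, the information value of a class is commonly written as the natural logarithm of the ratio of conditional to prior probability; the sketch below assumes that form. The function and argument names are hypothetical, and returning negative infinity for a class with no groundwater pixels is a choice made here, not something stated in the paper.

```python
import math

def information_value(class_water_pixels, class_pixels,
                      total_water_pixels, total_pixels):
    """IV = ln(conditional probability / prior probability) for one class."""
    conditional = class_water_pixels / class_pixels  # density within the class
    prior = total_water_pixels / total_pixels        # density over the region
    if conditional == 0:
        return float("-inf")  # assumed handling of classes without water pixels
    return math.log(conditional / prior)

# Hypothetical counts: 2 of 3 water body pixels fall in a class of 4 of 8 pixels
iv = information_value(2, 4, 3, 8)
empty_class_iv = information_value(0, 4, 3, 8)
```

Positive values mark classes where groundwater pixels are denser than in the region as a whole, which matches the sign pattern of the IV values reported in Table 2.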

Delineation of groundwater using WOE
The groundwater potential index (GWPI) was calculated (Eq. (6)) and mapped based on the s values, where s is the final weight for the WOE model.

Delineation of groundwater using FR
In contrast to the WOE, which determines the weight of each class from the characteristics of the conditioning factor, the FR weight of each class is given by the spatial occurrence of wells in that class. The FR is computed for each conditioning variable. Eq. (7) was applied to produce the GWPZ of the Kohat region (Guru et al. 2017).

Delineation of groundwater using IV
The GWPZ of the Kohat region can be produced using Eq. (8).
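Eqs. (6)-(8) are not reproduced in this text. A common formulation for bivariate models, assumed here, sums the class weights (WOE, FR, or IV) of all conditioning layers at each pixel to obtain the potential index. The sketch below follows that assumption; the function name `potential_index`, the two layers, and the weights are made up for illustration and are not values from Table 2.

```python
import numpy as np

def potential_index(class_maps, weight_tables):
    """Sum per-class weights over all conditioning layers, pixel by pixel."""
    index = np.zeros(np.asarray(class_maps[0]).shape, dtype=float)
    for class_map, weights in zip(class_maps, weight_tables):
        class_map = np.asarray(class_map)
        layer = np.zeros(class_map.shape, dtype=float)
        for code, weight in weights.items():
            layer[class_map == code] = weight  # replace class code by its weight
        index += layer
    return index

# Hypothetical two-layer example (e.g. slope and elevation class rasters)
slope_classes = np.array([[1, 2], [2, 1]])
elevation_classes = np.array([[1, 1], [2, 2]])
gwpi = potential_index(
    [slope_classes, elevation_classes],
    [{1: 1.0, 2: -0.5}, {1: 0.5, 2: -1.0}],
)
```

The resulting index raster would then be divided into the five potential classes, from very low to very high.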

Validation of the GWPM
Evaluation of the generated GWPZ is crucial because models without validation have no empirical value. Rather than using the hydraulic parameter of specific capacity, as previous studies did, an indirect indicator of groundwater yield measurement was used in the present research (Jha et al. 2010). From a groundwater sustainability point of view, groundwater yield measurement has been used widely by several researchers (Qureshi et al. 2010; Pardeshi et al. 2013; Ji et al. 2015; Fayez et al. 2018; Arabameri et al. 2019) for validation of GWPZ. In many investigations, the receiver operating characteristic (ROC) curve has been the standard for evaluating the accuracy of the GWPZ (Shirazi et al. 2012). The area under the ROC curve (AUC) measures the accuracy with which a prediction system can determine whether or not an event will occur (Shah et al. 2022). To validate the GWPZ generated by the WOE, FR, and IV models, the testing dataset (20%) was used. The AUC was used to evaluate the spatial efficacy of the GWPZ: it expresses the accuracy with which the model and the influencing variables predict groundwater potential. The model with the highest AUC value is considered superior (Rahman 2008).
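The AUC can be read as the probability that a randomly chosen water body location receives a higher potential index than a randomly chosen non-water location. The sketch below computes it by direct pairwise comparison, counting ties as one half. It assumes that non-water sample points are available alongside the testing points, which the text does not describe; the function name `roc_auc` and the scores are hypothetical.

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]    # index values at water body (testing) points
    neg = scores[~labels]   # index values at assumed non-water points
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))

# Hypothetical potential index values at three water and three non-water points
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
```

Here eight of the nine pairs are ranked correctly, so the AUC is 8/9; a value of 0.5 would correspond to a random ranking.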

Results
In this article, we developed an inventory of groundwater bodies from Sentinel-2 imagery using JavaScript-based algorithms, Google Earth Pro, and Google Earth images. The spatial locations of water bodies such as wells, ponds, and springs are shown in Figure 1. In the present research, we applied three bivariate models to generate the GWPZ of the Kohat area.

WOE model
The contrast value is computed from the two weights and expresses the association between the dependent and independent variables. The final GWPZ map of the Kohat region is shown in Figure 4. Table 2 shows the analytical results of the GIS-based models. A low correlation between the two variables indicates a low groundwater potential zone, and a high value indicates a high groundwater potential zone in the research region. Based on the results of the WOE model for the elevation parameter, altitudes below 500 m show a strong association with groundwater, whereas the elevation class above 800 m shows the weakest relationship with groundwater.

FR model
The final output map of the FR model is shown in Figure 5. The importance of estimating the GWPZ with the FR model cannot be overstated. The FR model produced the GWPZ by correlating the conditioning variables with the locations of bore wells. A higher correlation value suggests a higher groundwater potential, and vice versa. LULC classes have a significant bearing on the effect of industrialization on groundwater potential. According to the findings of this research, the water body class had the highest FR value, 9.923. The mining/industrial region and the vegetation cover area both showed insignificant FR values when the conditioning factor was analysed against the bore well data. These low FR values suggest that more bore well data are needed from these classes. In contrast, vegetation cover was always found to influence the infiltration rate significantly.
In the FR model, the elevation class below 500 m, as shown in Table 2, has a strong correlation with groundwater, whereas the least correlated elevation class is above 800 m. In the results of the IV model, as shown in Table 2, the most significant elevation class for groundwater is <500 m, while the least important class is above 800 m. The correlation values of the WOE, FR, and IV models for the elevation class below 500 m are 1.62, 3.39, and 1.22, respectively. For the elevation class >800 m, the values for the WOE, FR, and IV models are −1.84, 0.18, and −1.72, respectively. The results of the three bivariate models for elevation illustrate that low-lying areas are more permeable and more suitable for groundwater potential than the high-elevation zones of Kohat district. The slope gradient is also considered a very significant parameter for GWPZ.
The bivariate results obtained from the association of the groundwater conditioning parameters and groundwater pixels, as given in Table 2, revealed that the most critical slope class for groundwater potential is <5°, followed by 10-20°. The results also show that the least significant slope class for the current research area is >35°, followed by 25-35°. Based on the slope angle parameter, the slope class below 5° has the maximum weight. The association values for the <5° class by WOE, FR, and IV are 2.68, 3.5, and 2.37, respectively. The values for slopes over 35° are −1.45, 0.25, and −1.53 for the WOE, FR, and IV models, respectively.

IV model
The analytical values in Table 2 describe the association between the dependent and independent variables. They show that lower slopes have the greatest likelihood of groundwater potential, whereas steep slopes adversely affect the occurrence of groundwater because of high runoff. The bivariate models indicate that F (flat) is the most important class of aspect, followed by NE and SW. The values of the WOE, FR, and IV models for the F direction are 1.73, 4.11, and 1.47, respectively. The least significant class of aspect is the S direction, with −0.71, 0.65, and −0.70 for the WOE, FR, and IV models. For the association between groundwater data and curvature shape, the concave structure has the highest values, i.e. 0.99, 1.65, and 0.50 for WOE, FR, and IV, respectively, which makes it the most significant curvature class for groundwater potential zone mapping. The results revealed that the spatial association of groundwater and conditioning parameters for the WOE, FR, and IV models is −1.67, 0.29, and −1.21, respectively. As shown in Table 2, the concave structure is followed by the flat and convex structures. Table 2 also shows that groundwater is more likely to occur close to streams: the likelihood is highest at distances of less than 200 m from a river. The values for the <200 m stream class are 2.0, 3.80, and 1.33 for the WOE, FR, and IV models, respectively. These results illustrate that shorter distances to rivers have a stronger influence on groundwater potential (Figure 6).
However, the least significant class of the stream parameter is >800 m, followed by the 600-800 m range. The values for the WOE, FR, and IV models are −1.32, 0.33, and −1.12, respectively, for the >800 m stream class. In the current research, the precipitation map was produced from CHIRPS satellite data and reclassified into five categories to assess the relationship of rainfall with groundwater bodies. The outcomes for rainfall in Table 2 confirm that rainfall is a significant factor in groundwater potential. The 1000-1050 mm/year precipitation class is the most significant for groundwater potential, followed by >1050 mm/year. The values for the 1000-1050 mm/year class are 1.32, 2.73, and 1.0 for the WOE, FR, and IV models. The precipitation class <900 mm/year has no significant impact on groundwater potential; its values for WOE, FR, and IV are −0.90, 0.45, and −0.79, respectively. From these statistics, it can be concluded that higher precipitation classes show more groundwater occurrence, and vice versa. Cropland is the most important and influential class of the LULC parameter in the study area. Its values for WOE, FR, and IV are 0.63, 1.70, and 0.53, respectively. Agricultural land shows high groundwater potential because it is recharged by the irrigation system of the study area. The scrub/shrub, forest, and urban classes of LULC follow the agricultural land.
The analysis of both variables reveals that Q is the most influential geological formation of the lithology parameter for groundwater potential in the Kohat region. The correlation of both variables is clear in the WOE, FR, and IV models. The results of the present study show that the lithological parameter is the least significant for groundwater. A fault is a significant parameter for groundwater percolation. Geological faults strongly influence groundwater mobility because they enhance the pathways for groundwater through the strata. The maximum likelihood of groundwater potential with respect to fault distance is in the <500 m buffer. The values for the <500 m class for WOE, FR, and IV are 0.60, 1.73, and 0.55, respectively. The >5000 m fault buffer has no significant impact on groundwater; its values for the WOE, FR, and IV models are −0.45, 0.82, and −0.20, respectively. The results indicate that faults are an influential parameter for groundwater potential in the research area. As shown in Table 2, mainly loamy soil is the most suitable soil class for groundwater in the Kohat area. The values for WOE, FR, and IV are 1.55, 2.69, and 1.23, respectively. Table 2 also shows that NDVI is a crucial parameter for groundwater potential. The association values are 0.57, 1.83, and 0.27 for the WOE, FR, and IV models, respectively. The results revealed that NDVI and groundwater table depth are inversely related, i.e. high NDVI corresponds to low water table depth and vice versa.
The current study also computed the association between the road network in the Kohat region and groundwater potential. The most influential class of the road network is the 3000-4000 m buffer, followed by the >4000 m and 2000-3000 m buffers. The values for the WOE, FR, and IV models are 0.78, 1.95, and 0.67, respectively. Table 2 shows that the values for the <1000 m class are −0.40, 0.69, and −0.36 for the WOE, FR, and IV models, respectively. The results for all road network buffers revealed that the road network has adverse impacts on groundwater in the study area.

Validation of models
In modelling, validation is a significant phase for establishing the scientific reliability of a research project (Barakat et al. 2023). In numerous studies, the AUC technique has been used to evaluate GWPZ. The ROC curve is considered a standard index for accuracy assessment and has been extensively used to assess techniques applied in water research. The receiver operating characteristic (ROC) curve was used to validate the WOE, FR, and IV models. The area under the ROC curve (AUC) indicates the precision of the prediction or classification (Pourghasemi and Rossi 2017). In this investigation, we tested and verified the three models from which the final GWPZ categorization was derived. AUC values range from 0 to 1. A value below 0.5 means that the model's classification was inappropriate and should be redone. On the other hand, a value close to 1 suggests that the result is clearly defined (Pourghasemi and Rossi 2017).
To verify the WOE, FR, and IV models, the ROC curves of the GWPZ maps were constructed (Figure 7). The findings demonstrate that the FR model (AUC = 91%) outperformed both the WOE model (AUC = 88%) and the IV model (AUC = 89%). Nevertheless, all the obtained findings were checked for validity and clearly defined (Pourghasemi and Rossi 2017). The FR model is therefore a better representative of the spatial distribution of the GWPZ in this study area than the WOE and IV models. As a result, this validation method is highly recommended for research on groundwater potential evaluation. The validation graphs of the applied models, shown in Figure 7, used twenty percent of the water inventory data. The highest AUC value indicates the most reliable model, while the lowest indicates the least reliable. The validation results for WOE, FR, and IV show that all applied models are consistent and trustworthy methods for producing the GWPZ of the Kohat District of Pakistan. The validation outcome indicates that the FR technique is the most reliable method for GWPZ mapping in the study area.

LULC accuracy assessment
Accuracy assessments were performed after obtaining the land use/land cover classification results. To do so, we used the user's accuracy, the producer's accuracy, and the overall accuracy. User's accuracy was calculated as the ratio of correctly classified cells of a class to the total number of cells mapped to that class. Google Earth was used as the reference for this assessment. The overall accuracy was calculated by dividing the number of correctly classified cells by the total number of pixels. Producer's accuracy was calculated by dividing the number of cells with the correct land use/land cover class by the number of ground truth pixels of that class, as given in Table 3.
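The accuracy measures named above can all be derived from one confusion matrix. The sketch below assumes a layout with mapped classes in rows and reference classes in columns; the function name `accuracy_metrics` and the two-class matrix are hypothetical and are not the values of Table 3.

```python
import numpy as np

def accuracy_metrics(confusion):
    """Overall, user's and producer's accuracy, error rates, and Cohen's kappa."""
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    correct = np.diag(cm)
    mapped = cm.sum(axis=1)      # row totals: pixels mapped to each class
    reference = cm.sum(axis=0)   # column totals: reference pixels per class
    overall = correct.sum() / total
    users = correct / mapped
    producers = correct / reference
    expected = (mapped * reference).sum() / total**2  # chance agreement
    kappa = (overall - expected) / (1 - expected)
    return {
        "overall": overall,
        "users": users,
        "producers": producers,
        "commission": 1 - users,     # commission error per class
        "omission": 1 - producers,   # omission error per class
        "kappa": kappa,
    }

# Hypothetical two-class matrix: rows = mapped class, columns = reference class
example = [[40, 10], [5, 45]]
metrics = accuracy_metrics(example)
```

For this example the overall accuracy is 0.85 and kappa is 0.70, which shows how kappa discounts the agreement expected by chance.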

Discussion
Due to the increased demand for water for urbanization, industrialization, and irrigation, research on the groundwater scenario has increased. This is especially true in arid to semi-arid regions worldwide, where the need for groundwater is even more critical. Numerous studies have sought to understand the science behind water recharge and to prepare GWPZ maps for the scientific exploration and management of groundwater (Arabameri et al. 2019). Proper groundwork and methods should therefore be implemented for GWPZ to manage groundwater, because the method for producing GWPZ maps is still a debated subject (Nampak et al. 2014; Park et al. 2014).
In this article, the groundwater potential of the Kohat region of Pakistan was evaluated using the WOE, FR, and IV models. These models were applied to compute the correlation between water body pixels and the conditioning parameters for groundwater. A low correlation value represents low potential zones, while a high correlation value indicates high groundwater potential. The probability of groundwater potential generally diminishes with increasing elevation (S. Hasan AL-Zuhairy et al. 2017). In the present study, the spatial analysis revealed that the elevation class of <500 m has the higher correlation value between both variables, whereas the >800 m elevation class showed no significant association with groundwater potential. The slope gradient is another influential parameter for groundwater potential. A steep slope gradient adversely affects groundwater because it increases surface run-off and limits the infiltration of precipitation into the ground (Jaiswal et al. 2003). If the slope angle is greater than 35°, groundwater potential is reduced because the aquifer's recharge is restricted (Madrucci et al. 2008). The current study considers slope angle a critical factor for GWPZ. Slopes of up to 20° are strongly associated with the high groundwater potential zone. In contrast, slope angles >35° have an inverse relationship with the groundwater pixels and indicate low groundwater potential, as shown in Table 2. Flat aspect surfaces are more favourable for groundwater storage (Manap et al. 2013).
In the present study, we observed that the flat aspect class has a strong association with the groundwater inventory data. The flat surface has correlation values of 0.86, 1.97, and 0.68 using the WOE, FR, and IV models, respectively. The FR and EBF models revealed that concave and convex structures are less associated with groundwater potential than flat regions. Water storage and aquifer recharge mainly occur in flat regions, whereas convex and concave structures do not support water storage and infiltration (Arabameri et al. 2020). In this study, our results show that the Flat class of curvature correlates strongly with groundwater, followed by Concave, while Convex adversely affects groundwater potential, as shown in Table 2. The highest likelihood of groundwater is observed in denser drainage networks. In the current investigation, the <200 m class of drainage network shows the strongest association with groundwater potential using the WOE, FR, and IV techniques, followed by the 200-400 m and 400-600 m classes. The values for the >800 m drainage class revealed that this class has no impact on groundwater potential. Rainfall correlates strongly and positively with aquifer recharge (Wu et al. 2020). In this study, the precipitation class 1000-1050 mm/year has the strongest positive correlation with groundwater potential, followed by >1050 mm/year. Low-precipitation areas show no inverse relationship with the groundwater potential of the study area. The precipitation class <900 mm/year has minor importance for groundwater potential in the current study area, followed by 900-950 mm/year. The crop and garden classes of the LULC parameter are significantly associated with groundwater, with correlation values of 2.06 and 1.25, respectively, indicating that these classes are high potential water zones (Falah et al. 2017).
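To make the class-level values discussed above more concrete, the following is a minimal Python sketch of how FR, IV, and WOE values are commonly computed for a single class of a conditioning factor. The formulas are the standard bivariate definitions and are an assumption here, since this section does not restate them; the function name bivariate_weights and all pixel counts are hypothetical. It is also an assumption whether the WOE values in Table 2 correspond to the positive weight or to the contrast.

```python
import math


def bivariate_weights(class_pixels, class_gw, total_pixels, total_gw):
    """FR, IV, and WOE values for one class of a conditioning factor.

    class_pixels : pixels in the class
    class_gw     : groundwater inventory pixels inside the class
    total_pixels : pixels in the whole study area
    total_gw     : groundwater inventory pixels in the whole study area
    Assumes 0 < class_gw < total_gw so that every logarithm is defined.
    """
    # Frequency Ratio: share of groundwater pixels / share of area.
    fr = (class_gw / total_gw) / (class_pixels / total_pixels)

    # Information Value: natural log of the same density ratio.
    iv = math.log(fr)

    # Weight of Evidence: positive weight, negative weight, and contrast.
    non_gw = total_pixels - total_gw
    p_class_given_gw = class_gw / total_gw
    p_class_given_non_gw = (class_pixels - class_gw) / non_gw
    w_plus = math.log(p_class_given_gw / p_class_given_non_gw)
    w_minus = math.log((1 - p_class_given_gw) / (1 - p_class_given_non_gw))
    contrast = w_plus - w_minus
    return fr, iv, w_plus, w_minus, contrast


# Hypothetical counts: a class covering 20% of the area holds 40% of the inventory.
fr, iv, w_plus, w_minus, contrast = bivariate_weights(2000, 80, 10000, 200)
print(fr, iv, w_plus, w_minus, contrast)
```

In this reading, FR > 1 (equivalently IV > 0 or a positive weight) marks a class that contains more groundwater inventory pixels than its share of the area would suggest.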
In the context of binary classification, the Receiver Operating Characteristic (ROC) curve is a popular method to evaluate and compare the performance of different models. The ROC curve plots the True Positive Rate (TPR) against the False Positive Rate (FPR) for different thresholds of a model's predicted probability (Li et al. 2021). A model with a higher AUC (Area Under the ROC Curve) is considered better. In this study, we used WOE, FR, and IV and compared their performance using the ROC curve. Model WOE has an AUC of 88%. This means that it has a good balance between TPR and FPR, with relatively few false positives and false negatives. Model WOE is likely to be a good choice for classification tasks where both precision and recall are important (Wahla et al. 2022).
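The AUC described above can be sketched in a few lines of Python. This illustration uses the rank-based interpretation of the AUC (the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample), which equals the area under the ROC curve. The function name roc_auc, the labels, and the scores are hypothetical, and the sketch assumes that the testing data contain both groundwater (1) and non-groundwater (0) locations.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve from binary labels and model scores.

    Computed as the fraction of (positive, negative) pairs in which the
    positive sample has the higher score; ties count as one half.
    """
    positives = [s for label, s in zip(labels, scores) if label == 1]
    negatives = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical test samples: 1 = groundwater location, 0 = non-groundwater location.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2, 0.7, 0.6]  # e.g. normalized potential index
auc = roc_auc(labels, scores)
print(auc)
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, so the reported values of 88% to 91% sit well above the random baseline.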
Model FR has an AUC of 91%. This means that it has a high TPR and low FPR, making it suitable for applications where identifying true positives is crucial and false positives are less of a concern. However, Model FR may be too aggressive in classifying examples as positive, leading to a higher false-positive rate. Model IV has an AUC of 89%. This means that it has a higher FPR and lower TPR compared to Model FR, but still performs far better than random guessing. Model IV may be useful in cases where minimizing false positives is critical, but it may not perform as well in cases where false negatives are costly.
In summary, each model has its strengths and weaknesses, and the choice of the appropriate model depends on the specific requirements of the task at hand. Model WOE strikes a good balance between TPR and FPR, Model FR is suitable when identifying true positives is crucial, and Model IV is useful when minimizing false positives is critical.
Our results for the Kohat region of Pakistan showed that cropland is the most influential factor for groundwater potential. The correlation values for cropland in the current research are 0.63, 1.70, and 0.53 for WOE, FR, and IV, respectively. Concerning the geological fault buffer, it was hypothesized that the association between both variables would weaken with increasing distance from the fault, and strengthen as the distance from the fault decreases (Falah et al. 2017). Our results for the Kohat area indicate that faults favour water infiltration and support aquifer recharge. The most effective fault buffer is <500 m because this class shows a strong positive correlation of 0.60, 1.73, and 0.55 for the WOE, FR, and IV models, respectively, followed by the 1500 m and 3000 m buffers. However, the >5000 m buffer class has no significant role in groundwater potential and recharge. The NDVI is a vital parameter for groundwater potential. NDVI and the water table have a direct relationship, i.e. when the NDVI increases, the water table rises, and vice versa (Seeyan et al. 2014). We observed the same pattern in our study region. The high NDVI zone correlates strongly with groundwater, while the low NDVI zone adversely affects groundwater potential in the present area. As shown in Table 2, the results support this statement on the association of NDVI with groundwater.
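The FR and IV values quoted in this discussion can be cross-checked against each other. Under the common definition in which the Information Value of a class is the natural logarithm of the same density ratio that defines the Frequency Ratio (an assumption here, since the formulas are not restated in this section), the reported IV values should be close to ln(FR). The short Python sketch below performs this check for the three FR/IV pairs quoted in the text; the dictionary name reported is illustrative.

```python
import math

# (FR, IV) pairs as quoted in the discussion text.
reported = {
    "flat aspect": (1.97, 0.68),
    "cropland": (1.70, 0.53),
    "fault buffer <500 m": (1.73, 0.55),
}

# Absolute difference between ln(FR) and the reported IV for each class.
differences = {name: abs(math.log(fr) - iv) for name, (fr, iv) in reported.items()}

for name, (fr, iv) in reported.items():
    print(f"{name}: ln(FR) = {math.log(fr):.3f}, reported IV = {iv:.2f}")
```

The differences stay within rounding error of the two-decimal values, which is consistent with the FR and IV columns being two views of the same class density ratio rather than independent evidence.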
According to the analytical results in Table 2, drainage network, slope, elevation, and rainfall are the most significant parameters for GWPZ in the present research area. Among the GIS-based statistical models, FR is the best technique for GWPZ in the current research project. The final GWPZ map was also produced using the GIS-based models and was then classified into five classes: very low, low, moderate, high, and very high groundwater potential. The final GWPZ can help organizations in sectors such as agriculture and energy to manage groundwater in the present study area.
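The five-class grouping of the final map can be illustrated with a minimal Python sketch. The classification scheme used to derive the class breaks is not stated in this section, so the sketch assumes equal-count (quantile) breaks purely for illustration; the function name classify_five and the index values are hypothetical.

```python
import bisect

CLASS_LABELS = ["very low", "low", "moderate", "high", "very high"]


def classify_five(values):
    """Assign each potential index value to one of five classes.

    Assumption: class breaks are the 20th, 40th, 60th, and 80th
    percentiles of the values (quantile scheme); other schemes such as
    natural breaks would only change how `breaks` is computed.
    """
    ordered = sorted(values)
    n = len(ordered)
    breaks = [ordered[min(n - 1, (k * n) // 5)] for k in range(1, 5)]
    # bisect_right counts how many breaks are less than or equal to the value.
    return [CLASS_LABELS[bisect.bisect_right(breaks, v)] for v in values]


# Hypothetical potential index values for ten pixels.
index_values = [3, 9, 0, 5, 7, 1, 8, 2, 6, 4]
classes = classify_five(index_values)
print(classes)
```

With a quantile scheme each class covers roughly the same number of pixels, which makes maps from different models easy to compare but does not reflect natural groupings in the index values.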

Conclusions
This article describes a study that investigates potential groundwater zones in the Kohat District of Pakistan using three different GIS-based models: Weight of Evidence (WOE), Frequency Ratio (FR), and Information Value (IV). The study uses various data sources, including satellite imagery, ground surveys, and public health department data, to develop an inventory map of groundwater and twelve groundwater conditioning parameters. The study then applies the three GIS-based models to generate GWPZ maps and groups them into five categories based on their potential for groundwater availability. The study finds that stream network, slope angle, elevation, and rainfall are the most significant parameters for GWPZ. The study uses ROC curves to assess the accuracy of the models and finds that FR is the most reliable model for the study area. The study concludes that the GWPZ maps generated by the WOE, FR, and IV techniques can be useful for research and development agencies to improve groundwater exploration and development planning in the future.

Figure 1. (a) Geographical location of Pakistan, (b) Provincial boundary of KPK where study area exists, and (c) Location map of study area with elevation.

Figure 2. Flowchart of the present research work.

Figure 4. The WOE model for groundwater potential zone.

Figure 5. The FR model for groundwater potential.

Figure 6. The IV model for groundwater potential.

Table 1. Description of datasets used in this research.

Table 2. Statistical analysis for GWPZ of District Kohat, Pakistan.

Table 3. Accuracy assessment of land use and land cover. UA = User's Accuracy, PA = Producer's Accuracy, OA = Overall Accuracy, K = Kappa Coefficient.