Modeling temperature dependency of oil - water relative permeability in thermal enhanced oil recovery processes using group method of data handling and gene expression programming using

In the implementation of thermal enhanced oil recovery (TEOR) techniques, the temperature impact on relative permeability in oil–water systems ( K rw and K ro ) is of special concern. Hence, developing a fast and reliable tool to model the temperature effect on K rw and K ro is still a major challenge for precise studying of TEOR processes. To reach the goal of this work, two promising soft-computing algorithms, namely Group Method of Data Handling (GMDH) and Gene Expression Programming (GEP) were employed to develop reliable and simple to use paradigms to predict the temperature dependency of K rw and K ro . To do so, a large database encompassing wide-ranging temperatures and fluids/rock parameters, was considered to establish these correlations. Statistical results and graphical analyses disclosed the high degree of accuracy for the proposed correlations in emulat- ing the experimental results. In addition, GEP correlations were found to be the most consistent with root mean square error (RMSE) values of 0.0284 and 0.0636 for K rw and K ro , respectively. Lastly, the performance comparison against the preexisting correlations indicated the large superiority of the newly introduced correlations. The findings of this study can help for better understanding the temperature dependency of K rw and K ro in TEOR.


Introduction
Nowadays, energy demand is expected to rise significantly with the increased prosperity in different sectors of industry and with the higher and continues consumption (Tillerson, 2008). As fossil source is still the dominant spring of energy, there have been noticeable and significant efforts to promote the standards techniques to improve the outcomes from oil reservoirs (Olayiwola & Dejam, 2019). Due to this fact, extraction of oil from unconventional reservoirs and oil with low API gravity has turned into quite important ways to compensate the expected need in the fossil energy (Meyer, Attanasi, & Freeman, 2007). The high amount of heavy oils and bitumen over the worldwide raises awareness on this supplementary source of fossil energy although the deficiencies in the characteristics of associated oil such as the high viscosity, low API gravity, and asphaltene content (Ameli, Alashkar, & Hemmati-Sarapardeh, 2018;Green & Paul Willhite, 1998;Prats, 1982;Saboorian-Jooybari, Dejam, & Chen, 2016). Therefore, one robust procedure to address such extreme conditions is increasing the temperature by means of steam or hot water injection, to reduce the viscosity which represents the resistance to the flow (Prats, 1982). These temperature-based techniques for oil recovery are assembled beneath the umbrella of the so-called Thermal Enhanced Oil Recovery (TEOR).
TOER includes many methods in which the main screening application criterion is based on the viscosity values. Accordingly, we distinguish steam-assisted gravity drainage (SAGD) process that is applied for the recovery of bitumen, steam flooding which is effective for the case of heavy oil extraction and cyclic steam stimulation (CSS) which is appropriate for extra-heavy oil . It is well known that in such techniques, temperature has strong influence on the porous medium flow; and hence, various mechanisms of heat transfer such as convection, conduction, and radiation can take place. In fact, the increase in the in-situ reservoir temperature brings significant effects in interaction of rock-fluid which can impact the behavior of the flow (Akhlaghinia, Torabi, & Chan, 2013;Ashrafi, Souraki, & Torsaeter, 2012;Esmaeili, Sarma, Harding, & Maini, 2019a). It is worth mentioning that in addition to the presence of heat transfer mechanisms, related-multiphase phenomena such as diffusion and dispersion also make their marks in TOER. As a result, a more complicated multiphase flow in porous media is noticed when implementing TEOR techniques. The commonly applied mathematical approach to describe the flow is these cases is the outgrowth of the Darcy flow equation to multiphase flow (Maini, 1998) and thermal-based Darcy flow .
Relative permeability is considered a vital factor that is involved in the mathematical models describing the multiphase flow in porous media, in which TEOR processes belong (Esmaeili, Sarma, Harding, & Maini, 2019b;Esmaeili et al., 2019a;Maini, 1998;. Relative permeability which is commonly denoted K r , is recognized as the ratio of effective permeability of a fluid at given saturation to the absolute permeability (Ahmed, 2018). Relative permeability data are a must for a large variety of fluid flow calculations related to TEOR. As a matter of fact, modeling and simulation tasks, which are the means to forecast and predict the performances that can be achieved under different scenarios of these techniques cannot be done without the specification of the relative permeability at reservoir conditions. Hence, it is necessary to have accurate and representative values for this parameter to reduce the risks and uncertainties in the simulation results. However, it is needed to add that relative permeability can be affected by various factors and parameters, among which we can cite the absolute permeability, viscosities of water and oil phases and saturation (Honarpour, Nagarajan, & Sampath, 2006). In addition, the changes made in the fluids and rock proprieties by the temperature upsurge influence the relative permeability curves in TEOR (Casse & Ramey Jr, 1979;Ehrlich, 1970;Honarpour et al., 2006;Sinnokrot, 1969;Zhang, Tong, Xiong, & Zhao, 2017).
The temperature impact on relative permeability values and the shape of their curves has received considerable attention during last decades (Ashrafi et al., 2012;Esmaeili et al., 2019a;Maini, 1998;Zhang et al., 2017). Although unanimous agreement is not satisfied in this topic, a dominant part of experimental and modeling studies that have been published, have noticed the dependency of relative permeability in oil -water systems (Kro and Krw) on temperature (Esmaeili et al., 2019a;Esmaeili et al., 2019b;Li et al., 2014;Schembre, Tang, & Kovscek, 2005;Weinbrandt, Ramey Jr, & Casse, 1975). The investigation conducted by (Weinbrandt et al., 1975) confirmed this statement using consolidated Boise sandstone and mineral oil. The studies of (Schembre et al., 2005) and (Li et al., 2014) demonstrated the effect of temperature on the two-phase oil-water relative permeability on two distinct cases. In addition, the research performed by (Ehrlich, 1970) based on the adsorption resulted in analytical paradigm for the temperature dependency of oil-water relative permeability. Besides, some other models based on IFT as intermediate influencing parameters were developed by (Amaefule & Handy, 1982) and (Kumar, Torabzadeh, & Handy, 1985). To keep the work concise, a deep overview about different studies conducted in the literature to inspect the effect of temperature on relative permeability can be found in our prior published work (Nait Amar, Noureddine, Hemmati-Sarapardeh, & Shamshirband, 2019) and other relevant publications (Akhlaghinia et al., 2013;Ashrafi et al., 2012;Esmaeili et al., 2019a;Esmaeili et al., 2019b;Zhang et al., 2017).
Experimentally, the two-phase oil -water relative permeability in heavy oil cases can be measured by means of three possible techniques: low / high rate displacement tests; and the steady-state co-injection method (Maini, 1998). However, the experimental approaches suffer from sensitive drawbacks such as the complexity of lab preparation and realization, the long time needed to accomplish the tests without forgetting the expensive cost. Therefore, in recent years, addressing these issues by establishing cheap and simple-to-use methods to gain the impact of temperature on Kr has triggered a huge amount of scientific inquiry. (Zhang et al., 2017), (Mosavat, Mohsenzadeh, & Al-Wahaibi, 2016), (Torabi, Mosavat, & Zarivnyy, 2016), and (Bennion, Thomas, Schulmeister, & Ma, 2006) are among the well-known predictive correlations that consider the temperature influence on Kr in oil -water systems. A summary of the aforementioned correlations is given in Table 1. As it is shown in this table, although the form straightforwardness of the prior correlations, they suffer from lack of generalization as their applicability domains are limited to restricted ranges of temperature, rock and fluids parameters. In addition, it should be added that these preexisting correlations have been implemented on the basis of limited databank. In the same context, some other correlations have been established by (Esmaeili, Sarma, Harding, & Maini, 2019c), but these models are not unified with respect to the types of the rock and fluids, and hence, each of them is applicable for specific case, such as consolidated or unconsulated sands interacted with light/heavy. Table 1. Summary of the important correlations for temperature-based oil/water relative permeability prediction. (Bennion et al., 2006), , (Torabi et al., 2016) and (Zhang et al., 2017) are.

Model
Correlations Note (Bennion et al., 2006) 6 0 • C < T < 100 • C K rw = 0.021 1 − 0.6−Sw  On the basis of: • Ottawa silica sand • Unsteady state approach • Heavy oil • History match Restrictions: e 1 = 20.14 e 2 = −0.053 e 3 = −1638.84 e 4 = 40763.24 a 1 = 0.0244 a 2 = 3.8848 On the basis of: • Tight sand stone • Unsteady state approach • Light oil • Combination of JBN and Corey correlation Restrictions: On the other hand, smart computational techniques have emerged and evolved as powerful and advanced approaches that can resolve highly complex relatedmodeling topics (  More recently, Esmaeili et al. (Esmaeili et al., 2019b) applied least square support vector machine (LSSVM) to model the dependency of oil -water relative permeability on temperature. (Nait Amar et al., 2019) proposed various intelligent paradigms as kinds of trustworthy models to estimate oil -water relative permeability in TEOR by combining radial basis function (RBF) neural network and LSSVM with some nature-inspired algorithms. The developed models in the two aforementioned studies showed very satisfactory predictions. The present investigation was done with the aim of implementing explicit, user-friendly and accurate correlations using group method of data handling (GMDH) and gene expression programming (GEP) for predicting the dependency of Kr in the two -phase oil -water systems on temperature, so that it could be applicable to a wider range of temperature, and fluids and rock proprieties.
In the present work, group method of data handling (GMDH) and gene expression programming (GEP) are applied to establish reliable correlations for estimating temperature-based oil -water relative permeability through defining five input parameters; namely the saturation of water (S w ), absolute permeability (K), temperature (T), oil and water viscosities (μ o and μ w ). To this end, a comprehensive data source of 1223 points gathered from valid available literature and covering an extensive range of rock and fluids parameters and temperature, is utilized to establish the correlations. After developing GEP and GMDH models, they are assessed by means of several statistical criteria and graphical error analyses. Lastly, to testify the reliability of the proposed correlations, these ones are compared with preexisting correlations that model the dependency of oil -water relative permeability on temperature. There are some important differences between the present study and the previously performed studies in literature: (1) the established paradigms in this study have widespread applicability ranges, and besides, (2) different userfriendly explicit expressions for modeling temperature dependency of Kro and Krw in thermal enhanced oil recovery processes are developed. Figure 1 recaps the sketch of the problem.
The next sections of the paper are ordered as follows. Section 2 highlights a detailed description of the databank employed to establish the correlations. Section 3 describes the GMDH and GEP concepts. Results are described and discussed in Section 4. Finally, Section 5 points out the main outcoming results.

Data description
To develop reliable correlations that can ensure the generalization and accuracy, a comprehensive and a large databank with widespread conditions must be considered. Due to this fact, in this study, 1223 experimental data points were collected from published literature (Akhlaghinia et al., 2013;Ashrafi et al., 2012;Ashrafi, Souraki, & Torsaeter, 2014;Lo & Mungan, 1973;Maini & Okazawa, 1987;Poston, Ysrael, Hossain, & Montgomery III, 1970;Sinnokrot, Ramey Jr, & Marsden Jr, 1971;Torabi et al., 2016;Weinbrandt et al., 1975). The collected data cove a wide range of temperature and fluid/rock conditions. Among the 1223, 648 points describe the oil relative permeability (K ro ) cases, while the remaining 575 correspond to the relative permeability of water (K rw ). The considered inputs to develop the correlations are the following: temperature (T), water saturation (S w ), water viscosity (μ w ), oil viscosity (μ o ) and the absolute permeability (K). Table 2 reports a full description of the employed databank in this study. It should be mentioned that these data have already been used in our previous paper (Nait .
To establish the correlations using GEP and GMDH, the database was divided randomly into training data covering 80% of the whole databank, and testing data including the remaining 20%. The training data were used to investigate for the best correlations, while the testing data were exploited to evaluate the behavior of the correlations with blind data.

Group method of data handling (GMDH)
Group Method of Data Handling (GMDH) known also as polynomial neural network is one of the most promising families of artificial neural networks (ANNs) (Dargahi-Zarandi, Hemmati-Sarapardeh, Hajirezaie, Dabir, & Atashrouz, 2017). Beside the reliability shown by GMDH in modeling complex systems, it ensures the advantage of providing user-friendly polynomial formula to the system being studied. The conception of GMDH technique consists in employing multiple nodes which belong to intermediate layers. The generated value by each GMDH node is calculated based on a quadratic polynomial model that includes the previous neuron. This GMDH version corresponds to the earliest model that was introduced by (Ivakhnenko, Krotov, & Ivakhnenko, 1970). As the earliest version of GMDH presented some generalization lacks, a modified version, known also as hybrid version, was proposed as an extensive version that includes more interactions between the nodes and variables; hence, this version ensures more flexibility for modeling more complex systems (Rostami et al., 2019). The GMDH hybrid version follows the below-shown rule: (1) where y i , x ij...k stand for the inputs and output parameters of the model, respectively; c ij...k denote the polynomial coefficients; m and d mean respectively, the size of layers and the input parameters number.
Afterwards, the full-form mathematical formulation can be done by partial polynomials with predefined orders to combine between the nodes in previous layers; hence, new nodal variables (i.e. O 1 , O 2 , . . . ) are created. For the case of two neurons related with a quadratic polynomial model, the following equation is applied: (2) To adjust the coefficients of the above-shown equation, the least square method (LSM) is applied. Therefore, the following expression is formulated: In which d is the variables number and N t is the size of the training set.

Gene expression programming (GEP)
Gene expression programming (GEP) is an advanced soft computing method which was introduced by Ferreira (Ferreira 2001). This technique is a part of the family of evolutionary algorithms (EAs) and it applies the evolutionary principles. GEP provides the advantage of generating explicit mathematical expression to the studied systems. From the conception standing point of view, GEP is regarded an improved version of Genetic Programming (GP) introduced by (Koza, 1992), as GEP handled the GP issues, such as the limited regression strategies (Ferreira 2001).
As the other evolutionary algorithm, GEP processes the searching for best expression model by employing chromosomes that codify and reflect possible solutions. In addition, another key element which is the Expression Tree (ET) is introduced in GEP. ET is obtained by transforming the chromosomes into real candidates. GEP employs genes that involve terminals and a head containing functions. Each gene has a fixed length list of symbols which represent kinds of operators such as {+, ×, −, /, log, √ } and a terminal set such as {x, y, z} (Teodorescu & Sherwood, 2008). Figure 2 shows a chromosome having two genes and its mathematical formula. The GEP searching procedure is summarized in the following steps: (1) GEP setting parameters: it consists to define the needed key parameters such as the size of the population, the stopping criteria, and the length of genes.
(2) Population initialization: create randomly initial chromosomes (different possible mathematical expression). (3) Evaluate the chromosomes using a fitness function.
(4) Select the fittest individuals and save them for the next generation.
(5) Apply tournament selection to choose the individuals that will be recombined to generate new offspring. One point and two points recombination are available in GEP. The steps from (3) to (7) are reiterated while the stopping criterion is not satisfied.

Developing the correlations
As previously mentioned, after preparing the databank and specifying the training and testing sets for both cases Kro and Krw, the two rigorous techniques namely GEP and GMDH were applied to establish correlations for these two parameters with the following inputs: the saturation of water (S w ), absolute permeability (K), temperature (T), oil and water viscosities (μ o and μ w ). Therefore, the temperature dependency of oil -water relative permeability correlations are developed with respect to  the aforementioned inputs as follows: In both approaches, mean square error (MSE) was defined as the error function to be minimized during the search process for the best correlations. MSE is defined as follows: in which Kr means the oil or water relative permeability, N is the number of points and the subscript pre and exp mean the predicted and experimental values, correspondingly. When implementing GEP technique, its control parameters such as the population size, mutation probability, the included operators, etc. should be tuned to improve the accuracy of the generated correlations. The considered GEP setting parameters in this study are stated in Table 3. A summarized schematic of the K rw and K ro correlations obtained with GMDH are presented in Figures 3  and 4, correspondingly. As it is shown in these figures, the K rw network encompasses one input layer, one output layer and three intermediate layers; while for the case of K ro , one input layer, one output layer and two intermediate layers were obtained. The resulted GMDH correlations are expressed as follows: The resulted GMDH nodes and genomes included in the above-obtained correlations are reported in Appendix A.
The obtained correlations by GEP are expressed as follows: where A, B, C and D are defined as shown-below: The expressions of the terms appearing in the obtained GEP correlation for Kro are specified in Table 4.

Performances evaluation
Graphical error analyses and statistical criteria and were employed to assess the accuracy of the developed correlations and chose the best representative ones in forecasting the temperature -based K ro and K rw .
The root mean square error (RMSE) and coefficient of determination (R 2 ) and are the statistical indexes that were used in this study. These two statistical criteria are defined in Appendix B.
To fine-tune the above-mentioned criteria, broaden the assessment of the established correlations and give visual comparisons, graphical evaluation diagrams such as cross plots, and histograms of error distribution were considered. In the cross plots, the predicted values by the correlations are plotted versus the counterpart    experimental values. Existence of large amount of points nearby the line Y = X indicated the high accuracy of the model and the excellent degree of correspondence between predictions and real data. In the histograms of error, the distribution of errors is plotted in a bar form and if a normal distribution is noticed nearby zero value, the model is deemed very satisfactory. Figures 5 and 6 display cross plots comparing between experimental data and predictions of GEP and GMDH correlations for Kro and Krw, respectively. As it can be obviously seen from these figures, GMDH predictions show large sparse for both Kro and Krw, whereas the predictions of GEP are accumulated nearly enough around the unit slope line. According to this visual survey, it can be said that the GEP correlations are more aweinspiring as sublime accommodations between their predictions and experimental results are noticed. To excavate the integrity of the established correlations and distinguish the most representative one, Table 5 and bar plots of Figure 7 report statistical and graphical error analyses through the considered assessment criteria, namely RMSE and R 2 , for the established correlations. With   accordance to the demonstrated results in Table 5 and Figure 7, it can be concluded that GEP correlations estimate better K rw and K ro compared to GMDH correlations. The temperature-based oil -water relative permeability correlations established using GEP exhibit overall RMSE values of 0.0284 and 0.0636 for K rw and K rw , respectively, and correlation coefficients that exceed 0.97 for the both cases. Therefore, the developed GEP correlations were considered for further investigation in the rest of paper.
To depict effectiveness and reliability of the GEP correlations regarded to the generated results, the comparison between predicted relative permeability from the implemented correlations and their counterpart real values versus corresponding indexes of data samples were demonstrated in Figure 8 for Kro and in Figure 9 for Krw. As these figures illustrate, the gained results from the GEP correlations are as close as possible to actual values of Krw and Kro during the training and testing phases.
For a better understanding of the GEP correlations integrity in estimating the temperature -based Kro and Krw, Figures 10 and 11 demonstrate histograms of errors between the actual and estimated values for Kro and Krw, respectively. These figures include error histograms for training and testing phases in the two cases, Kro and Krw. Based on the reported results in these histograms, we can observe that the most frequent error values are nearby zero. In addition, it can be said that the error distributions follow the normal curve in all the subplots. The error distributions reported in Figures 10 and 11 confirm the high ability of the established correlations in predicting the temperature -based Kro and Krw.

Comparison of developed GEP correlations with literature models
In the present study, the accuracy of the developed GEP correlations was compared to various available correlations in the literature, which include the effect of  temperature on Kro and Krw. These latter include (Bennion et al., 2006), (Zhang et al., 2017), and . It should be mentioned that while applying the preexisting correlations to the employed data in this study, only the points that fall within the application ranges were included according to each correlation. To this end, the estimated values using the previously mentioned correlations versus the experimental data are Figure 14. The obtained (a) root mean squared error and (b) coefficient of correlation while estimating temperature-based oil/water relative permeability by GEP and available pre-existing correlations.
plotted in Figure 12 for Kro and in Figure 13 for Krw. Figures 12 and 13 demonstrate that large scatters in the Kro and Krw data around the unit slop line were generated by (Bennion et al., 2006) and  correlations, while acceptable accumulation around the X = Y line was noticed in the case of estimating Krw with the (Zhang et al., 2017) correlation. This obviously indicates that (Bennion et al., 2006) and , correlations fail in forecasting the correct values of both Kro and Krw, whereas (Zhang et al., 2017) fails particularly in predicting Kro. Table 6 and Figure 14 summarize the performances of the correlations considered in this work along with those of GEP correlations. The comparison results show that the developed GEP correlations lead to the best performances in predicting both Kro and Krw. According to Table 6 and Figure 14, it is concluded that the developed GEP correlations outperforms largely the preexisting temperature-based oil/water correlations.

Validity of the developed GEP correlations in term of water saturation (S w )
To testify the efficiency of the established GEP correlations in predicting the curves of temperature -based Kro and Krw as function of S w , Figure 15 illustrates the generated Kro and Krw curves via GEP correlations, and compare with corresponding experimental values from two different samples included in this study. As the subplots (a) and (b) of Figure 15 depict, a very satisfactory integrity is shown by the GEP correlations to estimate the temperature-based Kro and Krw curves as their emulated results have almost identical behaviors as actual records do. The prediction capability of the proposed GEP correlations has once again been certified in Figure 15.
Finaly, it should be mentioned that the proposed correlation for modeling the temperature dependency of Kro and Krw should be utilized when the data falls within the applicability realm, otherwise its exactness is not ensured as precise results for certain conditions can be generated, and imprecise results for some others. However, as previously stated, these correlations were gained by including widespread databank, and hence, it can be applied for several cases which have input parameters filling in the applicability realm.

Conclusions
In this study, new explicit, simple-to-use and accurate correlations were proposed to model the dependency of relative permeability in oil -water systems on temperature. Group method of data handling (GMDH) and gene expression programming (GEP) were implemented as promising tools to implement the correlations using a large comprehensive databank. Several assessment criteria were considered to figure out integrity and performance of the new correlations. The main conclusions of the study are summarized as follows: 1. GEP-based correlations were found as the most reliable correlations to predict the temperature dependency of Kr in oil -water relative systems. 2. The newly implemented GEP correlations for predicting the temperature-based Kro and Krw exhibited very satisfactory performances with overall RMSE values of 0.0284 and 0.0636 for Krw and Kro, respectively. 3. The developed GEP correlations were compared with other well-known preexisting correlations; namely those of (Zhang et al., 2017), (Bennion et al., 2006) and . The integrity of the proposed correlations was testified and found to be substantially superior to all of these models. 4. By performing a trend analysis of the developed GEP correlations in term of water saturation, the gained curves for both Kro and Krw followed the expected forms and logical variations in term of water saturation. 5. The established correlations in this study can be applied under a wide variety of conditions and also can be improved in presence of new additional data.

Disclosure statement
No potential conflict of interest was reported by the authors.