Application of physics-informed neural networks to inverse problems in unsaturated groundwater flow

ABSTRACT This paper investigates the application of Physics-Informed Neural Networks (PINNs) to inverse problems in unsaturated groundwater flow. PINNs are applied to the types of unsaturated groundwater flow problems modelled with the Richards partial differential equation and the van Genuchten constitutive model. The inverse problem is formulated here as a problem with known or measured values of the solution to the Richards equation at several spatio-temporal instances, and unknown values of solution at the rest of the problem domain and unknown parameters of the van Genuchten model. PINNs solve inverse problems by reformulating the loss function of a deep neural network such that it simultaneously aims to satisfy the measured values and the unknown values at a set of collocation points distributed across the problem domain. The novelty of the paper originates from the development of PINN formulations for the Richards equation that requires training of a single neural network. The results demonstrate that PINNs are capable of efficiently solving the inverse problem with relatively accurate approximation of the solution to the Richards equation and estimates of the van Genuchten model parameters.


Introduction
Predictions of various problems in unsaturated soil mechanics, such as rainfall-induced landslides and excavation collapses, rely on thorough understanding and the capacity to model coupled hydro-mechanical process in soils (e.g. Zhang et al., 2018). Increasing the capacity for understanding and modelling hydromechanical processes in unsaturated soil mechanics contributes to developing strategies for mitigating potentially negative consequences of such events (e.g. Depina, Oguz, and Thakur, 2020). This study examines the capacity of Physics-Informed Neural Networks (PINNs) in solving unsaturated groundwater flow problems modelled by the Richards Partial Differential Equation (PDE) (Richards, 1931).
The Richards equation is a non-linear PDE and has been a great matter of interest in a wide range of fields including, among others, soil mechanics, agriculture, hydrogeology and reservoir engineering. The nonlinear nature of this equation is a reflection of the nonlinear relationship between the soil volumetric water content, θ, and the pressure head, ψ, which together with hydraulic conductivity, k form three primary variables of Richards equation. The primary variables are mutually dependent with θ and k often being expressed as a function of ψ, respectively, with the Water Retention Curve (WRC) and Hydraulic Conductivity Function (HCF). Both of these hydraulic functions are also known as constitutive relationships and are utilised to describe the characteristics of water and solute movement in soils (e.g. Schindler and Müller, 2017). Among several parametric models that are frequently used to represent these constitutive relationships (e.g. Abkenar, Rasoulzadeh, and Asghari, 2019), this study employs one of the most commonly used constitutive models, known as the van Genuchten model (Van Genuchten, 1980).
Solutions to the Richards equation with the van Genuchten model are commonly found by employing finite difference and finite element methods (e.g. Šimunek, Van Genuchten, and Šejna, 2012). An alternative approach is investigated in this study by finding solutions to the Richards equation with PINNs (e.g. Raissi, Perdikaris, and Karniadakis, 2019). The capacity of PINNs to solve partial and ordinary differential equations stems from the capabilities of Deep Neural Networks (DNNs) to server as a universal function approximator (Chen et al., 2020). PINNs extend the capacity of existing DNNs to solve PDEs by restricting the range of acceptable solutions through modifications of the loss function to the ones that enforce the PDE that governs the actual physics of the problem. In this way, this method extends the existing capacity of DNNs to approximate the solution of PDEs. It is achieved with a relatively simple feed-forward neural network architectures that utilise Automatic Differentiation (AD) techniques to efficiently evaluate the partial derivatives in the PDE to ensure that the PINN prediction satisfies the PDE across the problem domain. PINNs are investigated here due to their computational efficiency that can be advantageous in computationally demanding task in unsaturated soil mechanics involving parametric analysis, uncertainty quantification, real-time monitoring and inverse analysis.
This study builds on several earlier studies that investigated the application of PINNs to the Richards equation in Tartakovsky et al. (2018) and Bandai and Ghezzehei (2020). Bandai and Ghezzehei (2020) introduced a new framework for the inverse solution of the Richards equation to estimate WRC and HCF without introducing a specific constitutive model. This involved training three neural networks with the one approximating the solution, the Richards PDE in terms of θ, and the remaining two describing, respectively, WRC and HCF. Tartakovsky et al. (2018) employed PINNs to determine HCF of an unsaturated homogenous soil from synthetic matric potential data based on the two-dimensional time-dependent Richards equation. This study complements and extends the previous studies by investigating an inverse solution to the one-dimensional Richards PDE with the van Genuchten model. The adoption of the van Genuchten constitutive model allows the solution to the inverse problem to be found by training a single neural network. Additionally, inverse solutions to both the volumetric water content and the pressure head formulations of the Richards PDE were investigated to accommodate the respective measurements. The inverse problem is formulated in this study as a problem with known values of θ or ψ at several spatio-temporal instances, and unknown values of the Richards PDE variables at the rest of the problem domain and unknown parameters of the van Genuchten model. The performance of the PINN formulations on the inverse problem was examined on several examples with synthetically generated data and measurements from a water infiltration column test.

Formulation
Mathematically, a deep network can be considered as a particular choice of a compositional function (Lu et al., 2019). This study employs a Feed forward Neural Network (FNN), also known as Multi-Layer Perceptron (MLP), which is constructed by applying linear and nonlinear transformations to the inputs recursively.
Let N L (x):R d in R d out be an L-layer neural network, or a (L − 1)-hidden layer neural network, with N l neurons in the lth layer (N 0 = d in , N L = d out ). Each layer is associated with a weight matrix, W l [ R N l ×N l−1 , and a bias vector, b l [ R N l . With a nonlinear activation function that is applied element-wise, σ, the FNN is recursively defined as follows (Lu et al., 2019): The activation function, σ, is commonly specified as the logistic sigmoid s(x) = 1/(1 − exp (− x)), hyperbolic tangent, s(x) = tanh (x), or the rectified linear unit, s(x) = max (x, 0).
One of the central properties of neural networks that will be used in the development of PINNs is the Automatic Differentiation (AD). AD is used to evaluate the derivatives of the network outputs with respect to the network inputs through the process of backpropagation (Rumelhart, Hinton, and Williams, 1986). The backpropagation process utilises the compositional structure of neural networks to apply the chain derivative rule repeatedly and compute the derivatives. The AD starts with a forward pass to compute the values of all of the variables, which is followed by a backward pass to compute the derivatives. Computing nth-order derivatives requires AD to be applied recursively n times. Additionally, it is important to note that the AD is applied on the neural network and not the data, thus allowing for the possibility to work with noisy data (e.g. Lu et al., 2019).
Given the introduction to neural networks and AD, the following section will introduce PINNs. Consider the following PDE, parameterised by l, for the solution The implementation of PINNs is illustrated in Figure 1 with mixed boundary conditions u(x) = g D (x) on G D , ∂V and ∂u ∂n = g R (u, x) on G R , ∂V. The implementation of a PINN starts by constructing a neural networkû(x; x) as an approximate solution of u(x). The input to the neural network is x and the output is vector of the same dimension as u. Weight matrices and bias vectors of the neural networkû are denoted with x = {W l , b l } 1≤l≤L .
Following the specification of the neural network, it is necessary to ensure that the neural network satisfies the physics defined by the PDE and the boundary conditions. This is usually implemented by restrictingû on a set of collocation points T = {x 1 , x 1 , . . . , x |T | ,} of size |T |. Generally, T can be composed of collocation points in the domain T f , V and on the boundary T b , ∂V. The discrepancy between the neural networkû and the physical constraints is measured through the loss function, defined as weighted sum of the L 2 norm of the residuals for the collocation points in the domain and on the boundary: where w f and w b are the weights, and The derivatives in the loss function are calculated with AD, which is available in machine learning libraries such as TensorFlow (Abadi et al., 2015) and PyTorch (Paszke et al., 2019). The final step in the implementation of PINNs is the training stage in which the set of parameters in x is varied to minimise the loss function L(x; T ). The loss function is usually highly nonlinear and non-convex with respect to x and the optimisation is commonly performed with gradient-based optimisers.

Inverse problem
In addition to the forward problem, PINNs can be easily applied to inverse problems with unknown parameters l in Equation (1). In the case of an inverse problem, it is considered that there are some additional information on points T i , V in addition to the earlier training points in the domain and on the boundary conditions: The implementation of the inverse problem is often relatively straightforward, with the only difference with respect to the forward problem being the loss function with an additional term: where w i is the weight and The training stage is then done by optimising x and l simultaneously: x * ,l = arg min x,l L x, l; T (8) where x * andl are, respectively, optima of x and l. Finding a solution to the inverse problem based on Equation (8) is relatively straightforward and computationally efficient, especially in the case of linear PDEs. However, it was found in some nonlinear problems that the optimisation process is dominated by gradients with respect to x, which results in the optimisation process not advancing toward a more optimal l value or that the optimal solution results in physically inconsistent values (e.g. negative values of positive constants). Although it is likely that these issues can be addressed by reformulating the loss function and introducing constraints to the optimisation formulation in Equation (8), this study aims to resolve them by reformulating the inverse problem in Equation (8) into a double-loop formulation: wherel is an optimum, g k ; k = 1, . . . , N g are constraint functions, and l l and l u are the lower and upper bounds, respectively. The outer loop of the algorithm will be optimising l values and enforcing constraints on the l values.
The outer loop will be implemented with the global optimisation Cross Entropy (CE) algorithm (Botev et al., 2013), while the inner loop will utilise the gradient-based algorithms commonly available in machine learning libraries. The CE algorithm was selected due to its robust performance as a global optimisation algorithm and the relatively straightforward implementation (e.g. Botev et al., 2013). It is expected that similar performance can be achieved with alternative global optimisation algorithms (e.g. Genetic algorithm). Research on the generalisation error for the application of PINNs to inverse problems is ongoing and more information can be found for example in Raissi, Perdikaris, and Karniadakis (2019) and Mishra and Molinaro (2021).

Richards equation
The Richards equation is a nonlinear PDE that captures the flow of water in unsaturated zone (Richards, 1931). One-dimensional form of the Richards equation is given by where θ is the volumetric soil water content(L 3 · L −3 ), t is the time (T ), x is the vertical dimension (L), k is the hydraulic conductivity (L · T −1 ) and ψ is the pressure head (L). The WRC and HCF relationships can be specified by the van Genuchten model (Van Genuchten, 1980): where k s is the saturated hydraulic conductivity, Θ is the relative saturation, u s and u r are, respectively, the saturated and residual volumetric water contents, α (L −1 ), n, and m are the model parameters, where m = 1 − 1/n. Equation (11a) specifies the HCF and Equation (11b) the WRC relationship.

Pressure head formulation
Equation (10) is reformulated to implement the PINN methodology for the Richards equation in terms of the pressure head, ψ as follows: The equation is then reformulated to express the partial derivates of θ and k with respect to ψ.
∂u ∂c Finally, the following expression is obtained as where C = ∂u/∂c is known as the water storage function. The water storage function is evaluated analytically based on the adopted van Genuchten model as follows: where ∂Q/∂c is calculated based on the factorisation of the Richards equation as shown in Rai and Tripathi (2019): The partial derivative ∂k/∂c is calculated analytically based on the van Genuchten model: where ∂Q/∂u = 1/(u s − u r ). The partial derivative ∂k/∂Q is determined from the van Genuchten model, as shown in Rai and Tripathi (2019): In case of heterogeneous soils and known properties of spatially variable properties, the partial derivatives could be calculated by providing the known values of the material properties directly in the analytical expressions for the partial derivatives. In the case of an inverse problem, such as the ones studied here, additional studies are needed to investigate the potential of PINNs in solving inverse problems with spatially variable material properties.
To be consistent with the expression in Equation (1), the pressure head PINN formulation for the Richards equation is defined as where l = [k s , u s , u r , a, n] T is the vector that parameterises the Richards equation.

Volumetric water formulation
Equation (10) is reformulated to implement the PINN methodology for the Richards equation in terms of the volumetric water content, θ, starting with the following expression: The partial derivatives involving ψ are expressed in terms of θ by using the chain derivative rule: The expression is further evaluated by rewriting the second term on the right side: The partial derivative ∂k/∂x was rewritten as follows: Finally, the expression was rewritten by introducing 1/C = ∂c/∂u: The second-order partial derivative in the second term on the right side of the expression is calculated as where ∂c/∂u is specified based on Equation (15): From the expression it can be observed that ∂c/∂u is expressed in terms of Θ with the sign(c) = 1 for unsaturated conditions, thus eliminating ψ from the expression.
Taking the partial derivative of Equation (26) with respect to x results with the following expression: The partial derivative ∂Q/∂x is expressed as ∂u/∂x by applying the chain rule as follows: This expression is then integrated in Equation (27): The partial derivative ∂k/∂u is evaluated analytically based on the van Genucthen model: where ∂k/∂Q is defined in Equation (18). To be consistent with the expression in Equation (1), the VWC PINN formulation for the Richards equation is defined as

Introduction
This section investigates the application of PINNs for Richards equation for the inverse problem with both the pressure head and VWC formulations. The dataset for the inverse problem was generated by using a finite-difference code from Ireson (2020)  subjected to:

Inner loop
The solution to the inverse problem in Equation (32) is found with a double loop optimisation algorithm. The inner loop of the algorithm aims to minimise the loss function in Equation (6) (Zhu et al., 1997) that is available in TensorFlow.

Outer loop
The outer loop aims to estimate an optimum valuel that minimises the loss function over a set of solutions of the Richards equation. The outer loop can be implemented with a wide range of optimisation  algorithms. A relatively simple and robust global optimisation CE algorithm was implemented in this study for the outer loop. The CE is a population-based evolutionary algorithm for solving estimation and optimisation problems (De Boer et al., 2005). The CE solves an optimisation problem by updating a series of random search distributions such that the optimal or a nearoptimal solution is more likely to be located. The random search distributions are updated in an adaptive process to converge to a distribution with high probability mass in the region of near-optimal solutions (Botev et al., 2013). The CE algorithm with normal updating is implemented in this study. In the CE algorithm with normal updating, an independent normal random search distribution is assigned to each of the variables in l. This results in a multivariate normal distribution with three independent components that are parametrised with a vector of means, m l , and a vector of variances, s l 2 . The CE algorithm iteratively updates m l and s l 2 by calculating them as the mean and variance of a subset of samples with the lowest values of L after each iteration. In the first iteration, the values of l were sampled uniformly from the ranges in Equation (32). In the following iterations, m l and s l 2 were calculated, respectively, as the mean and variance of the 50% quantile of samples with the lowest value of L. The CE algorithm was implemented with 5 iterations and 10 samples in each iteration. The mean value, m l , calculated after the last iteration is adopted as an estimate of the optimum,l.

Pressure head formulation
The pressure head PINN formulation was applied to an inverse problem in which T i values of ψ are known at randomly selected locations across the model domain. The selected locations with known values of ψ are shown in Figure 4 with black crosses. In addition to satisfying the solution of the Richards equation at the locations with known values, the PINN aims to satisfy the solution in the rest of the domain by evaluating the loss L f at a set of T f = 40, 000 collocation points that are distributed uniformly across the domain. As discussed earlier, the solution to the inverse problem is found with a double loop algorithm with the estimates of the unknown values of ψ at the collocation points and estimates of the unknown values of input parameterŝ l = [k s ,â,n] T . Figure 4 illustrates the performance of the pressure head PINN formulation on solving the inverse problem with T i = 1000 randomly selected points with known values. The PINN approximation of the solution of the Richards equation at the collocation points is plotted behind the randomly selected points. A comparison between Figures 2 and 4 reveals that the PINN inverse solution and the numerical model solution agree very well. In order to further investigate the quality of the PINN approximation of the solution, the numerical solution and the PINN prediction are compared at several time instances in the lower part of Figure 4. The comparisons made at the time instances 0.25T, 0.5T, and 0.75T reveal good agreement between the numerical solution and the PINN prediction.
In addition to providing a prediction ψ at the collocation points, the solution to the inverse problem involves estimates of the unknown parameterŝ l = [k s ,â,n] T . Estimates of the unknown parameters are provided in Table 1 Table 1 presents the values of the losses L i and L f to provide an insight in the quality of the PINN prediction of the Richards equation across the domain, and the values of s l = [s k s , s a , s n ] T to examine the convergence of the CE algorithm.
From the values of L i and L f in Table 1, it can be seen that the PINN provides good prediction of the solution to the pressure head formulation of the Richards equation with L i ≈ 10 −4 and L f ≈ 10 −13 . The number of locations with known values of the solution does not significantly affect the losses. The values of the estimated input parametersl = [k s ,â,n] T reveal a relatively good agreement with the actual input values. This is reflected in the actual and predicted values of k s and α being relatively close. The estimated values of n show somewhat less good agreement with the predicted values approximately from 4 to 6 and the actual value of 2.06. Some of the potential reasons this disagreement may be from the value of n being less influential in the optimisation process or the CE algorithm requiring additional iterations to converge to a more optimal solution. Additional iterations would be beneficial for increasing the prediction accuracy of the CE algorithm, with the ratio of standard deviation of the estimates over the mean being between 10% and 20% in Table 1. The CE algorithm was implemented in this study with relatively low 5 iterations as a compromise between accuracy and computational time.
The capacity of the PINN to solve the inverse problem with noisy data was investigated by adding normally distributed error to the dataset. The error term is defined as a normally distributed random variable with zero mean and standard deviation of s e · s d , where s d is the standard deviation of the dataset. Figure  5 shows the performance of the pressure head PINN formulation on the inverse problem with noisy data. The data consist of the T i = 1000 randomly selected points with known ψ values, while the noise is specified by s e = 0.05.
From Figures 5 and 2, it can be observed that the provided solution of the Richards equation agrees very well with the original noise-free solution of the numerical model. The quality of the PINN approximation of the solution with a noisy dataset is further investigated in Figure 5 by comparing the noise-free numerical solution and the PINN prediction at several time instances. The comparisons made at the time instances 0.25T, 0.5T and 0.75T reveal very good agreement between the noise-free numerical solution and the PINN prediction. These results demonstrate the capacity of PINNs to solve noise inverse problems and identify the original noise-free solution of the Richards equation despite the dataset being subjected moderate amounts of noise. The effects of the noise level in the dataset on the capacity of the PINN pressure head formulation on solving the inverse problem is investigated for a range of s e = {0.025, 0.05, 0.075} in Table 2 with T i = 1000. The results indicate that the quality of the prediction over the problem domain decreases with increasing noise levels from L i ≈ 10 −4 and L f ≈ 10 −12 for s e = 0.025 to L i ≈ 10 −3 and L f ≈ 10 −10 for s e = 0.075. The estimates of the input parameters do not show significant variations due to increasing noise levels. Further increasing noise revels result in the solution not converging or showing great deviations from the noise-free solution due to overfitting. Among the estimated values, k s and α show a relatively good agreement with the input values, while the estimates of n show higher deviations from the input values. As discussed earlier, some of the potential reasons these deviations may be from the value of n being less influential in the optimisation process or the CE algorithm requiring additional iterations to converge to a more optimal solution.

Volumetric water content formulation
The VWC PINN formulation was applied to an inverse problem in which T i values of θ are known at randomly selected locations across the model domain. The selected locations with known values of θ are shown in Figure 6 with black crosses. In addition to satisfying the solution of the Richards equation at the locations with known values, the PINN aims to satisfy the solution in the rest of the domain by evaluating the loss L f at a set of T f = 40,000 collocation points that are distributed uniformly across the domain. Similar as for the pressure head formulation, the solution to the inverse problem is found with a double loop algorithm with the estimates of the unknown values of θ at the collocation points and estimates of the unknown values of input parametersl = [k s ,â,n] T . Figure 6 shows the performance of the VWC PINN formulation on solving the inverse problem with T i = 1000 randomly selected points with known values. Estimates of the unknown parametersl = [k s ,â,n] T are provided in Table 3  From the values of L i and L f in Table 3, it can be seen that the PINN provides good prediction of the solution to the VWC formulation of the Richards equation with L i ≈ 10 −6 and L f ≈ 10 −9 . The values of the estimated input parametersl = [k s ,â,n] T reveal a relatively  The capacity of the VWC PINN to solve the inverse problem with noisy data was investigated by adding normally distributed error to the dataset as for the pressure head formulation. Figure 7 shows the performance of the pressure head PINN formulation on the inverse problem with noisy data. The data consist of the T i = 1000 randomly selected points with known ψ values, while the noise is specified by s e = 0.04.
Comparison of Figures 7 and 3 shows that the provided solution of VWC formulation agrees very well with the original noise-free solution of the numerical model. The quality of the VWC PINN approximation of the solution with a noisy dataset is further investigated in Figure 7 Table 4 with T i = 1000. The results indicate that the quality of the prediction is not significantly affected by the increasing noise levels with L i ≈ 10 − 6 and L f ≈ 10 −9 for the examined noise levels. Similar to the pressure head formulation, further increasing noise revels result in the solution not converging or showing great deviations from the noise-free solution due to over-fitting. The estimates of the input parameters do not show significant variations due to increasing noise levels. Similar as for the earlier analyses, the estimates n show higher deviations from the input values.

Test setup
The performance of the VWC PINN formulation for the Richards equation is further investigated by solving the inverse problem on a dataset from a water infiltration test. A large-scale water infiltration test was conducted to study the infiltration process in unsaturated soil by using instrumentation and visual interpretation (Robinson, 2019), as shown in Figure 8. The column was designed with reference to the ASTM D7664-10: Standard Test Methods for Measurements of Hydraulic Conductivity of Unsaturated Soils (ASTM, 2010) and infiltration tests performed in McCartney, Villar, and Zornberg (2007), Li, Zhang, and Fredlund (2009), and Duong et al. (2013). The infiltration test consists of a 1-m-high soil column placed within a hollow acrylic cylinder. The cylinder is 1.3 m high with the outer diameter of 0.25 m and the inner diameter of 0.24 m. The remaining 0.3 m above the soil level was used to maintain a constant pressure head of 0.1 m for the duration of the infiltration test. The column was placed on the base assembly to prevent water leakage and a valve to control the boundary conditions at the base. The valve can be opened to allow for water drainage or closed to simulate an impervious base. A filter system was installed on the contact between the soil and the base to minimise fines migration and support the soil column. Similarly, a stainless steel perforated plate and filter cloth were placed on top of the soil column to reduce soil disturbance during the initial stages of water filling. The constant water head of 0.1 m on top of the soil column was maintained with a custom water supply system, as shown in Figure 8.
The soil mixture for the infiltration test was created from a combination of three soils to obtain a soil mixture that was representative of some Norwegian moraine soils. The resulting mixture had approximately    Figure 8 shows the conditions shortly after the start of the water infiltration column test with the top 0.05 m being saturated, as observed by the darker soil colour.

Inverse analysis
The collected VWC measurements will be used in the PINN-based inverse analysis to estimate the parameters of the van Genuchten model and predict VWC at the non-measured spatio-temporal instances as modelled by the Richards PDE. The values of u r = 0.0991 and u s = 0.35 are determined from preliminary laboratory tests conducted in Robinson (2019) with the values of k s , α and n to be determined from the analysis. Given the measurements from the VWC sensors, the VWC PINN formulation was adopted for the implementation of the inverse problem. As introduced earlier, the inverse problem is formulated as a problem in which a solution of the Richards PDE is known at the locations of the VWC sensors for the duration of the test. The temporal domain was discretised into 501 points, which results in T i = 3006 points with known VWC values. The spatial domain was discretised into 76 points, with the total number of collocation points being T f = 38,076. The inverse problem for the water infiltration tests is specified as follows with the bounds on k s , α and n selected from subjectedto: The solution to the inverse problem in Equation (33) is found by implementing a PINN based on the VWC formulation and the double-loop algorithm introduced earlier. The PINN has eight layers with the first layer having two neurons and the output layer a single neuron. The remaining 6 layers have 50 neurons each. Relu activation function was selected as it was found to provide the most stable results.

Results
Results of the inverse problem for the water infiltration column test are presented in Figure 8. The left part of Figure   solution indicated that there is a relatively rapid transition from unsaturated to saturated conditions in the upper 0.2 m of the model. The infiltration of water to the deeper layers of soil slows down somewhat with the soil column becoming saturated before 200,000 s.
In addition to presenting the solution of the Richards PDE across the problem domain, a set of six graphs is presented in the lower right part of Figure 8 to compare the PINN predictions of θ and the known or measured values. The top three graphs show the known or measured values of θ in comparison to the values predicted by PINN at depths of 0, 0.15 and 0.3 m. These graphs demonstrate that there is a very rapid and sharp transition from unsaturated to saturated conditions in the upper part of the column. The capacity of the PINN to capture the rapid and sharp transitions observed in the measurements is conformed with a relatively good agreement between the PINN predictions and the measurements. Closer inspection of the transitions reveals that the PINN predictions are somewhat smoother than the measurements.
The lower three graphs in the right part of Figure 8 compare the PINN prediction with the measurements of θ at the depths of 0.45, 0.6 and 0.75 m. These graphs show that the transitions from unsaturated to saturated conditions are less rapid with the transition being diffused in comparison to the shallower depths. The PINN predictions agree very well with the measurements thus showing a good capacity to approximate the highly nonlinear infiltration process, as modelled by the Richards PDE and the van Genuchten model. Similar to the approximations of sharp transitions at shallower depths, some smoothing occurs in the PINN approximation of the sharp transitions that appear as the water first approaches the locations of the VWC sensors.
To further investigate the generalisation capability and the physical plausibility of the trained PINN, data collected from one of the sensors were excluded from the training dataset and used to validate PINN's predictive capacity. Figure 9 shows the performance of the PINN with the data from the sensor at position x = 0.45 m being excluded from the training dataset. The resulting predictions of the PINN demonstrate very good predictive capacity of the trained PINN both on the training dataset and the unobserved data. As observed from Figures 8 and 9, the PINN approximates the shock behaviour very well. This can be attributed to the use of the ReLU activation function, which was selected among the various activation functions due to its capacity to predict sharp changes in the solutions of the Richards equation.

Discussion
This study investigated the applications of PINNs to the Richards PDE and the van Genuchten model. PINNs have been successfully applied to solve forward and backward problems for various PDEs, benefiting from the DNNs capacity to approximate complex functions and the computational efficiency of AD to evaluate partial derivatives of the PINN predictions with respect to model parameters, which is used to enforce the PDE on a set of collocation points. This paper specifically focused on the Richards PDE with the van Genuchten model due to the van Genuchten model being often used in analysing groundwater flow in unsaturated soils. The adoption of the van Genuchten model in this study allowed for an implementation of a PINN with only one DNN and analytical derivations of WRC and HCF.
Although PINNs can be applied to both the forward and inverse problems for various types of PDEs, this paper investigated the application of PINNs only for the inverse problem with the Richards PDE. One of the main reason for this limitation is the nonlinearity of the Richards PDE with the van Genuchten model, which can lead to difficulties in reliably converging to an accurate solution. Several analyses, conducted as a part of this study, on the PINN solution to the forward problem with the Richards PDE and the van Genuchten model showed promising results. However, due to the inability to consistently converge to relatively accurate solutions they were not presented in this study. Conversely, the focus is on PINN solution to the inverse problem due the capacity of PINNs to provide solutions with relatively high accuracy and high computational efficiency.
The inverse problem is postulated in this study as a problem where the solutions of the Richards PDE are known at a given set of spatio-temporal instances with the unknown values at the remaining instances and unknown parameters of the van Genuchten model. A highly efficient solution to an inverse problem can be obtained with PINNs by utilising the AD in the process of finding the optimal solution and parameter values that minimise the loss function. However, such implementations resulted in sub-optimal performance in this study with the algorithm minimising the loss function not being able to efficiently advance among the values of the van Genuchten model parameters. This resulted in the inverse solution resulting in sub-optimal solutions or physically inconsistent values. This is attributed to the nonlinearity and complexity of the Richards PDE with the van Genuchten model. A double loop algorithm was implemented in this study to overcome this challenge, with the inner loop optimising the parameters of the DNN and the outer loop optimising the van Genuchten parameters. Constraints and bounds on the van Genuchten parameters are enforced on the outer loop, which was implemented with the CE global optimisation algorithm. The double loop algorithm is implemented as a compromise between computational efficiency and the solution accuracy and stability.
This study features the pressure head and the VWC formulations of PINNs for the Richards PDE with the van Genuchten model to adapt to situations in which, respectively, the pressure head and VWC measurement are available. For both of the formulations, the Richards PDE was expressed in terms of the respective variable (i.e. pressure head or VWC) with the partial derivatives of the variable reformulated in terms of analytical expressions and the derivatives of the variables with respect to time or space. The analytical expressions are derived based on the van Genuchten model, while the derivative of pressure head or VWC with respect to time and space are evaluated with the AD based on the PINN prediction. This allows the PINN for the Richards PDE and the van Genuchten model to be efficiently implemented with a single DNN.
The performance of PINNs for the inverse problem with the Richards PDE and the van Genuchten model was investigated on synthetically generated measurements and the dataset from a water infiltration column test. The synthetically generated data were used to evaluate the performance of both the PINN formulations for a range of available measurements and noise levels. The results demonstrated good performance for both formulations with relatively low effect of the number of measurements on the accuracy for the considered range. Relatively similar estimates of saturated permeability in the van Genuchten model were obtained for all analyses, with higher variation of α values and some deviations for n. This might indicate that n is less influential in the optimisation process or that additional iterations of the CE algorithm could have been implemented. The effects of noise in the measurements on the solution to the inverse problems showed that PINNs can deal with low to medium noise levels and identify the original non-noisy solution. Higher noise levels result in the lack of convergence or great deviations from the original solution due to over-fitting. The VWC PINN formulation was applied to VWC measurements for a water infiltration column test to predict the solution of the Richards equation at non-measured spatio-temporal instances and estimate parameters of the van Genuchten model. The results demonstrate that PINNs are capable of providing a very good prediction of the solution to the Richards equation based on realistic data and a very good fit to highly nonlinear and sharp transitions from unsaturated to saturated conditions that are observed in the measurements.
PINNs represent a hybrid modelling approach that bridges the gap between physical-based (e.g. FEM, FDM) and data-driven approaches. Hybrid approaches aim to integrate the advantages of both approaches with the capacity of data-driven approaches to efficiently integrate and process large amounts of data and for the solution to satisfy the physical laws as described by the governing PDEs. These unique modelling capabilities of PINNs allow it to find applications in situations where large amounts of data are available and physical interpretability of the data or predictions based on those data are important. One such application involves real-time monitoring systems.
Real-time monitoring systems are one of commonly employed strategies for managing geohazards risks, which aims to reduce the risks by issuing timely alerts to evacuate people and property under risk. Real-time monitoring systems often feature large numbers of sensors that produce data time series that require efficient interpretation and processing. Hybrid approaches can contribute to the implementation of such systems with the capacity to efficiently process large amounts of data collected through monitoring system and simultaneously be used to provide predictions of natural phenomena that are based on physical laws described by the governing PDEs. Physical interpretability of collected data and model predictions is of great importance in supporting a decision-making process that is necessary for the implementation of reliable warning systems.

Conclusion
This study investigated the application of physicsinformed neural networks to the inverse problem for the Richards partial differential equation with the van Genuchten model. Pressure head and volumetric water content formulations of physics-informed neural networks were developed to adapt to situations in which the respective measurements are available. The implementation of the two formulations requires only one deep neural network to be trained due to analytical derivations of partial derivations in the Richards equation based on the van Genuchten model. The two formulations were successfully implemented on several examples to demonstrate the capacity of physicsinformed neural networks to provide accurate predictions of the inverse solution to the Richards equation and to determine the parameters of the van Genuchten model. Raissi in the development of this study and the valuable contributions from Kate Robinson in implementing the water infiltration column test.

Disclosure statement
No potential conflict of interest was reported by the author(s).