ANN-based methods for solving partial differential equations: a survey

Abstract Traditionally, partial differential equation (PDE) problems are solved numerically through a discretization process, and iterative methods are then used to solve the algebraic system this process generates. Recently, scientists have turned to artificial neural networks (ANNs), which solve PDE problems without a discretization process. In view of this interest, scientists have investigated variations of ANNs that perform better than the classical discretization approaches. In this study, we discuss three methods for solving PDEs effectively, namely the Pydens, NeuroDiffEq and Nangs methods. Pydens modifies the Deep Galerkin method (DGM) in the part that approximates the solution of the PDE. NeuroDiffEq is an ANN model based on a trial analytical solution (TAS). Lastly, Nangs is an ANN-based method that uses grid points as the training data. We compare the numerical results of solving PDEs in terms of the accuracy and efficiency of the three methods. The results show that NeuroDiffeq and Nangs perform better than Pydens in solving high-dimensional PDEs, while Pydens is only suitable for low-dimensional problems.


Introduction
Many physical phenomena in the modern sciences are described using Partial Differential Equations (PDEs) (Evans, Blackledge, & Yardley, 2012). Hence, the accuracy of PDE solutions is a challenge among scientists and has become an interesting field of research (LeVeque & Leveque, 1992). Traditionally, PDEs are solved numerically through a discretization process (Burden, Faires, & Burden, 2015). For instance, the well-known finite difference method (FDM) and the finite element method have been utilized to solve many linear and non-linear PDEs. Other methods, such as the variational iteration method (VIM) and its variations, have been used to solve nonlinear PDEs (He & Latifizadeh, 2020), and a finite difference-spectral method has been investigated to solve fractal mobile and immobile transport (Fardi & Khan, 2021). These methods typically end up with algebraic systems that can be solved using iterative methods (Hayati & Karami, 2007). The big issue in using iterative solvers on large-scale linear systems of equations is that they can break down before reaching a good approximate solution (Maharani & Salhi, 2015), and their accuracy then suffers. Attempts to avoid the breakdown problem include interpolation and extrapolation models (Maharani et al., 2018; Maharani, Larasati, Salhi, & Khan, 2019) and prediction with support vector machines (Thalib, Bakar, & Ibrahim, 2021). However, the problem is still not fully addressed, since these remedies are computationally quite expensive. With no discretization process, artificial neural networks (ANNs) offer an alternative.
ANN is well known as a machine learning (ML) method typically used for regression and classification problems. The development of ANNs for solving PDE problems has been investigated since the beginning of the 21st century. For instance, Malek and Beidokhti (2006) combined an ANN with the Nelder-Mead simplex method to find numerical solutions of high-order PDEs. This hybrid method improved ANN performance by approximating the initial and boundary conditions. Moreover, Sirignano and Spiliopoulos (2018) used the Deep Galerkin method (DGM), embedded with an ANN, for solving high-dimensional PDE problems. Pydens later modified DGM by introducing an ansatz for binding the initial and boundary conditions, which simplifies the original DGM algorithm. Furthermore, another ANN-based method for solving PDEs, called the Physics-Informed Neural Network (PINN), was introduced by Raissi, Perdikaris, and Karniadakis (2017b, 2019). PINN embeds the physical laws of the PDE in the loss function as a regularization term. Guo, Cao, Liu, and Gao (2020) improved the training of this method with the residual-based adaptive refinement (RAR) method, which increases the number of residual points where the PDE residual is large until the residuals fall below a threshold.
The ability of ANNs to solve PDE problems brings several advantages, including approximate solutions that are continuous and differentiable, good interpolation characteristics, and low memory requirements (Chen et al., 2020). Another advantage is that ANNs can utilize automatic differentiation tools, such as Tensorflow (Abadi et al., 2016) and PyTorch (Paszke et al., 2017; Rahaman et al., 2019), allowing researchers to build simpler methods for solving PDE problems (Chen et al., 2020). In this study, we focus on three ANN-based methods for solving PDEs: PyDEns, which modifies the DGM; NeuroDiffEq, which is an ANN approximator with a TAS applied (Chen et al., 2020); and Nangs, which is based on grid points for the training data. This article is structured as follows. Section 1 introduces ANN-based methods for solving PDEs. Section 2 reviews ANNs for solving PDEs. Section 3 discusses the basic theory behind the three methods. Section 4 illustrates how the three methods solve the heat equation. The numerical results of the three methods on different types of PDEs are explained in Section 5. Lastly, we conclude our study in Section 6.

Artificial neural networks (ANN)
ANNs were first introduced in 1943 by McCulloch and Pitts (1943). They are inspired by the way biological neurons work together to perform complex tasks (Schalkoff, 1997). Early on, ANNs were successful in handling several data problems, but they then became less popular, left behind by other ML techniques. From the 1980s, with the tremendous increase in computing power and in the amount of data available to train ANNs, the technique became popular again and was successfully applied in various practical applications (Goodfellow, Bengio, & Courville, 2016; Goldberg, 2016; Helbing & Ritter, 2018; LeCun, Bengio, & Hinton, 2015; Li et al., 2019; Mabbutt, Picton, Shaw, & Black, 2012; Nielsen, 2015; Shanmuganathan, 2016), including the differential equation problems discussed in this article.
One of the most popular ANN architectures, the perceptron, is shown in Figure 1(a) (Haykin, 1999); a version with multiple hidden layers is visualized in Figure 1(b) (Khanna, 1990). However, prior to the invention of the backpropagation algorithm (Rumelhart, Hinton, & Williams, 1985), it was not easy to train a perceptron to make good predictions. In short, backpropagation is a gradient descent method that enables the perceptron to give a better approximation based on the gradient of the loss function. The backpropagation algorithm has since become the most popular ANN optimization algorithm (Li, Cheng, Shi, & Huang, 2012; Nielsen, 2015).

ANN model for solving PDEs: overview
Consider a second-order PDE of the form (McFall, 2010) over the domain X ⊂ ℝ², with an initial condition and a boundary condition

u(x, t) = g(x, t),  (x, t) ∈ ∂X.

Generally, using an ANN to solve the PDE (Equation (1)) starts by generating weights to form a linear combination of the inputs x, t and the biases b_i. This combination is used to compute the first hidden layer, as described in Equation (4):

h_{1,i} = f(w_{x,i} x + w_{t,i} t + b_1),

where h_1 is the first hidden layer, w_{x,i} and w_{t,i} are the weights and b_1 is the bias. The second hidden layer, expressed in Equation (5), is computed by feeding h_1 into it, and is then processed to yield the output layer:

h_{2,j} = f(Σ_i v_{ij} h_{1,i} + b_2),

where v_{ij} are the weights, b_2 is the bias, and f is the activation function. Activation functions perform the diverse computations between the layers. Several activation functions, such as the sigmoid (logistic), tanh, ReLU and Leaky-ReLU, are often used (Haykin, 1999; Jagtap, Kawaguchi, & Karniadakis, 2020). Here, we take the hyperbolic tangent (tanh), as it has been shown to provide better results than other activation functions (Karlik & Olgac, 2011; Panghal & Kumar, 2021). Our aim is to obtain the approximate solution u_net(x, t), written as

u_net(x, t) = Σ_j p_j h_{2,j},

where p_j are the weights of the output layer. To control the accuracy of the approximate solution, we compare it with the right-hand side of the PDE (Equation (1)); this can only be done by partially differentiating u_net(x, t), that is, by computing ∂^k u_net/∂x^k and ∂^k u_net/∂t^k, where k = 1, 2.
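The forward pass described above can be sketched in a few lines of NumPy. This is a minimal illustration, not one of the surveyed implementations: the layer width n = 8 is an arbitrary assumption, the weights are random rather than trained, and a central difference stands in for the automatic differentiation the surveyed libraries use.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative two-hidden-layer network following Equations (4)-(5):
# h1 = f(w_x x + w_t t + b1), h2 = f(V h1 + b2), u_net = p . h2, with f = tanh
n = 8                                      # neurons per hidden layer (arbitrary)
w_x, w_t = rng.normal(size=n), rng.normal(size=n)
b1, b2 = rng.normal(size=n), rng.normal(size=n)
V = rng.normal(size=(n, n))
p = rng.normal(size=n)

def u_net(x, t):
    h1 = np.tanh(w_x * x + w_t * t + b1)   # first hidden layer, Equation (4)
    h2 = np.tanh(V @ h1 + b2)              # second hidden layer, Equation (5)
    return p @ h2                          # output layer

def d_dx(x, t, eps=1e-5):
    # stand-in for automatic differentiation: central difference in x
    return (u_net(x + eps, t) - u_net(x - eps, t)) / (2.0 * eps)
```

In Tensorflow or PyTorch the finite-difference step is replaced by exact automatic differentiation of u_net with respect to its inputs, which is what makes the PDE residual cheap to evaluate.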

ANN-based methods for solving PDEs
In this section, we discuss three methods, Pydens, NeuroDiffeq and Nangs, and compare them in terms of accuracy and efficiency. They differ in the way they generate the training points and in their loss functions.

Pydens method
All ANN-based methods for solving PDE problems use an optimizer to minimize the error. The most commonly used approach is the Deep Galerkin Method (DGM), introduced by Sirignano and Spiliopoulos (2018). The name Pydens comes from the Python module implementing the DGM optimizer. Basically, to approximate u(x, t) in Equation (1) by u_net(x, t), Pydens modifies DGM by applying an ansatz that binds the initial and boundary conditions. The procedure is as follows.
1. Bind the initial and boundary conditions using an ansatz by setting up the equation

   A_net(x, t) = mult(x, t) · u_net(x, t) + add(x, t). (10)

   Thus, the solution of the PDE is approximated by this transformation of the ANN output rather than by u_net(x, t) itself. Equation (10) ensures that the initial and boundary conditions in Equations (2) and (3), respectively, are satisfied whenever the following holds:

   mult(x, t)|_{(x,t)∈∂X} = 0,  add(x, t)|_{(x,t)∈∂X} = g(x). (13)

2. Generate m points inside the batches b_1 from the domain (x, t) ∈ X using the uniform distribution ν_1. Then feed each point (x, t) to the ANN architecture until the optimum output is obtained.
3. Build the loss function, where θ is a vector consisting of the weights and biases.
4. Update the trainable parameters θ to minimize the loss function using the SGD optimizer.
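To make step 1 concrete, the sketch below applies the ansatz of Equation (10) to the heat problem used later in the article (u(x, 0) = sin(πx), u(0, t) = u(1, t) = 0 on [0, 1] × [0, 1]). The particular mult and add functions here are illustrative choices that satisfy the stated conditions; they are not the functions Pydens constructs internally.

```python
import numpy as np

def mult(x, t):
    # vanishes on the initial (t = 0) and boundary (x = 0, x = 1) sets
    return x * (1.0 - x) * t

def add(x, t):
    # matches the prescribed data on those sets:
    # sin(pi x) at t = 0, and 0 at x = 0 and x = 1
    return np.sin(np.pi * x)

def A_net(x, t, u_net):
    # Equation (10): the transformed output satisfies the conditions
    # exactly for *any* network output u_net(x, t)
    return mult(x, t) * u_net(x, t) + add(x, t)

# arbitrary stand-in for an untrained network
u_net = lambda x, t: 42.0
```

Because mult vanishes and add matches the prescribed data on the initial and boundary sets, A_net satisfies the conditions exactly no matter how poorly trained u_net is; the optimizer then only has to fit the PDE residual in the interior.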

Neurodiffeq method
The NeuroDiffeq method applies a trial analytical solution (TAS) (Chen et al., 2020) that satisfies the initial and boundary conditions. Recall Equation (1). The procedure of the NeuroDiffeq method is as follows: 1. Generate m × n input points (x_i, t_j) ∈ X, where i = 1, 2, ..., m and j = 1, 2, ..., n, and divide the set of points into training and validation points.
2. Build the TAS, u_T, in the form of McFall (2010), where A(x) is a function that satisfies the initial and boundary conditions and F[u_net(x_i, t_j)] is chosen to be zero for any (x_i, t_j) on the boundary. This approach is similar to the trial function discussed by Lagaris, Likas, and Fotiadis (1998).
3. Build the loss function (McFall, 2006), where θ is a vector consisting of the weights and biases. Note that the first term approximates the PDE itself, while the second term approximates the boundary condition. A weighting factor γ is used to improve the performance of the loss function, as it appears in Equation (16); in practice, it is determined arbitrarily.
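The two-term loss with the weighting factor γ can be sketched as follows. This is a hedged illustration for the heat equation u_t = u_xx used later in the article, with central differences standing in for automatic differentiation; it is not NeuroDiffeq's actual implementation.

```python
import numpy as np

def residual(u, x, t, eps=1e-4):
    # heat-equation residual u_t - u_xx, via central differences
    u_t = (u(x, t + eps) - u(x, t - eps)) / (2.0 * eps)
    u_xx = (u(x + eps, t) - 2.0 * u(x, t) + u(x - eps, t)) / eps**2
    return u_t - u_xx

def loss(u, interior, boundary, g, gamma=1.0):
    # first term: mean squared PDE residual over interior points
    # second term: gamma-weighted mismatch with g on boundary points
    pde = np.mean([residual(u, x, t) ** 2 for x, t in interior])
    bnd = np.mean([(u(x, t) - g(x, t)) ** 2 for x, t in boundary])
    return pde + gamma * bnd

# sanity check: the analytical solution of the heat problem gives a tiny loss
u_exact = lambda x, t: np.exp(-np.pi**2 * t) * np.sin(np.pi * x)
g_zero = lambda x, t: 0.0
interior = [(0.3, 0.4), (0.5, 0.2), (0.7, 0.6)]
boundary = [(0.0, 0.5), (1.0, 0.5)]
```

With the analytical solution plugged in, both terms are near zero; any function violating the PDE or the boundary data raises the loss, which is what the optimizer exploits.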

Nangs method
Unlike the two methods explained above, the Nangs method does not require a trial solution to minimize the loss function; instead, it generates mesh points for the training data. The details of how the Nangs method approximates a PDE are as follows.
1. Set mesh points (x_i, t_j) ∈ X for i = 1, 2, ..., m and j = 1, 2, ..., n inside the domain, as visualized in Figure 2.
2. For each internal point, once it has been fed through the ANN architecture, compare the output with the right-hand side of the PDE using the corresponding loss function.
3. For all initial and boundary points, compare the outputs with Equations (2) and (3), respectively, using the corresponding loss functions.

Simulation of the three ANN-based methods for solving the heat equation

As an illustration of using ANN-based methods to solve PDEs, the following heat equation is considered (Burden et al., 2015):

∂u/∂t = ∂²u/∂x²,  0 < x < 1,  t > 0, (22)

with the initial condition

u(x, 0) = sin(πx), (23)

and boundary conditions

u(0, t) = u(1, t) = 0. (24)

The analytical solution of this PDE is u(x, t) = e^{−π²t} sin(πx). To compare the three methods, we use the same ANN architecture: three hidden layers of 32 neurons each. The loss function is evaluated on up to 100 × 100 points for the unit inputs (x, t). We also use different numbers of iterations, because each method uses a different Python module: Pydens runs under Tensorflow (Abadi et al., 2016), while NeuroDiffeq and Nangs run under Pytorch (Paszke et al., 2017).

Pydens in solving PDE heat equation
To solve the heat equation in Equation (22), we first use an ansatz function to bind the boundary and initial conditions. We then randomly generate up to 100 × 100 points inside the domain [0, 1] × [0, 1]. Finally, we compute the loss function as in Equation (11). The complete algorithm for this method is described in Algorithm 1.
end for
9. Apply SGD to optimize the trainable parameters θ.
10. end while
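Step 9 is the plain stochastic-gradient update θ ← θ − η ∇L(θ). A minimal sketch on a hypothetical quadratic loss, with a numerical gradient standing in for the backpropagation a real framework would use:

```python
import numpy as np

def sgd_step(theta, loss_fn, lr=0.1, eps=1e-6):
    # theta <- theta - lr * grad L(theta); the gradient is estimated
    # numerically here, where backpropagation would be used in practice
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        d = np.zeros_like(theta)
        d[i] = eps
        grad[i] = (loss_fn(theta + d) - loss_fn(theta - d)) / (2.0 * eps)
    return theta - lr * grad

# toy stand-in for the Pydens loss: quadratic with minimum at theta = (1, 2)
loss = lambda th: (th[0] - 1.0) ** 2 + (th[1] - 2.0) ** 2
theta = np.zeros(2)
for _ in range(200):
    theta = sgd_step(theta, loss)
```

In the actual methods the loss is Equation (11) evaluated on a fresh batch of sampled points at each step, and θ collects all weights and biases of the network.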

Neurodiffeq in solving heat equation
Unlike the PyDens method, NeuroDiffeq uses a TAS (McFall, 2006) to approximate the initial and boundary conditions. Once we set up the domain [0, 1] × [0, 1] as 100 × 100 data points, we construct a TAS satisfying the initial and boundary conditions (23) and (24) for the heat equation (22):

u_T(x, t; θ) = sin(πx) + x t (1 − x)(1 − t) u_net(x, t; θ).

Note that θ indicates the vector of weights and biases.
Apply the SGD optimizer to minimize the loss function.
10. end while
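The TAS above can be checked directly: for any network output, it reproduces the initial condition sin(πx) at t = 0 and vanishes at x = 0 and x = 1. A small sketch (the constant u_net is an arbitrary stand-in for the untrained network):

```python
import numpy as np

def u_T(x, t, u_net):
    # trial analytical solution for the heat problem, as in the text:
    # sin(pi x) + x t (1 - x)(1 - t) u_net(x, t)
    return np.sin(np.pi * x) + x * t * (1.0 - x) * (1.0 - t) * u_net(x, t)

u_net = lambda x, t: 3.7   # arbitrary stand-in for the untrained network
```

Since the multiplier x t (1 − x)(1 − t) vanishes wherever a condition is imposed, training only has to drive the PDE residual of u_T to zero in the interior.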

Nangs method in solving PDE heat equation
The Nangs method adopts the grid points used in the discretization process as the training data. This is done by building mesh points from (x, t) over the entire domain and then splitting the mesh points into internal, initial, and boundary points (see Figure 2). The complete algorithm is shown in Algorithm 3.
Calculate the output as follows:

u_net(x, t; θ) = Σ_{i=1}^{32} p_i f(h_3).

7. Compute each initial and boundary point and compare it with the original initial and boundary conditions of the PDE, as in Equations (23) and (24).
8. end for
9. Apply SGD to optimize the weights and biases.
10. end while
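The mesh construction and the split into internal, initial and boundary points (step 1 and Figure 2) can be sketched as below. The 5 × 5 grid size and the exact split convention are illustrative assumptions.

```python
import numpy as np

# build an m x n grid of (x, t) points on [0, 1] x [0, 1]
m, n = 5, 5
xs = np.linspace(0.0, 1.0, m)
ts = np.linspace(0.0, 1.0, n)
X, T = np.meshgrid(xs, ts, indexing="ij")
points = np.stack([X.ravel(), T.ravel()], axis=1)

# split into initial (t = 0), boundary (x = 0 or x = 1) and internal points
is_initial = points[:, 1] == 0.0
is_boundary = (points[:, 0] == 0.0) | (points[:, 0] == 1.0)
initial = points[is_initial]
boundary = points[is_boundary]
internal = points[~is_initial & ~is_boundary]
```

The internal points feed the PDE-residual loss, while the initial and boundary points feed the losses comparing the outputs with Equations (23) and (24).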

Results and discussion
In this section, the three methods, PyDens, NeuroDiffeq and Nangs, are compared in terms of accuracy and efficiency by solving three types of PDEs, namely elliptic, parabolic and hyperbolic (Burden et al., 2015). We also compare the performance of all three methods with that of a classical method, the FDM. All results are shown in the tables and figures below.

Simulation results in solving PDE heat equation
The comparison of the methods for solving the heat equation in Equations (22)-(24) is recorded in Table 1.
In terms of computational time, Pydens gives the shortest time of the three ANN-based methods: it spent only 53 seconds with 25 × 25 training data, whereas NeuroDiffeq and Nangs needed 65 and 578 s, respectively, on the same training data. Larger training sets, such as 50 × 50, 75 × 75 and 100 × 100, require more time from all three methods, but Pydens remains the fastest among them. In comparison with the classical method, the FDM still gives the shortest time overall, at less than 10 s on all problems. The results are seen more clearly when the analytical solution is visualized in the corresponding figure. The comparison of all ANN-based methods in terms of loss values and computational times is shown in Figure 7. As can be seen there, using more training data for the heat equation affects the performance of the methods; this appears in the NeuroDiffEq and Nangs methods. For Pydens, however, improvement only occurs for the smaller training sets, namely 25 × 25 and 50 × 50; beyond that, its performance is slightly worse.
For further analysis of the performance of the three methods, Table 2 reports the loss values for varying numbers of hidden layers and neurons per layer.
It can be seen from Table 2 that, in general, the prediction accuracy improves as the number of layers and neurons increases. However, a more complex ANN architecture increases the computational time, so determining the best ANN architecture is crucial.

Simulation results in solving PDE wave equation
The wave equation is a hyperbolic PDE containing second-order partial derivatives (Guo et al., 2020). It has been applied in many fields of science, such as seismic and acoustic wave propagation (Gu, Zhang, & Dong, 2018; Kim, 2019; Li, Feng, & Schuster, 2017). The wave equation considered here is

∂²u/∂t² = 4 ∂²u/∂x²,  0 < x < 1,  t > 0,

with initial conditions

u(x, 0) = sin(πx), (27)
∂u/∂t (x, 0) = 0,

and boundary conditions

u(0, t) = u(1, t) = 0.

The analytical solution is u(x, t) = sin(πx) cos(2πt). The simulation results of the three methods for the wave equation are shown in Table 3.
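The stated analytical solution sin(πx) cos(2πt) satisfies the hyperbolic equation ∂²u/∂t² = 4 ∂²u/∂x² together with the initial and boundary conditions above, which can be verified numerically; central differences stand in for exact derivatives in this sketch.

```python
import numpy as np

# analytical solution of the wave problem
u = lambda x, t: np.sin(np.pi * x) * np.cos(2.0 * np.pi * t)

def u_tt(x, t, eps=1e-4):
    # second time derivative by central differences
    return (u(x, t + eps) - 2.0 * u(x, t) + u(x, t - eps)) / eps**2

def u_xx(x, t, eps=1e-4):
    # second space derivative by central differences
    return (u(x + eps, t) - 2.0 * u(x, t) + u(x - eps, t)) / eps**2
```

The factor 4 follows from the solution itself: u_tt = −4π²u while u_xx = −π²u, so u_tt = 4 u_xx everywhere in the domain.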
In Figures 9(a)-11(a), we can see that NeuroDiffEq is closer to the analytical solution than the other methods from 50 × 50 up to 100 × 100 training data. Pydens is the best at 25 × 25 training data; however, larger training sets bring it no significant progress. Meanwhile, for Nangs, no training set size produces results close to the analytical solution. Figure 12 shows trends similar to those in the previous section: Pydens took the shortest time, about 61 s for 25 × 25 training data, compared with 81 and 1062 s for NeuroDiffeq and Nangs, respectively. However, NeuroDiffEq gives better results than the other two methods when larger training sets are used. The simulation results with different ANN architectures are shown in Table 4.
Simulation results in solving PDE Poisson equation

The performances of the three methods are summarized in Figure 17, where it can be seen that NeuroDiffEq is the most accurate method for solving the Poisson Equation (31). However, it is also the slowest compared with the Pydens and Nangs methods, at 1177 s for 100 × 100 training data. Pydens, in contrast, took the shortest time: only 53 s for 25 × 25 training data, compared with 105 and 544 s for the NeuroDiffeq and Nangs methods, respectively. Similarly, with 100 × 100 training data, Pydens needs only 423 s, compared with 1177 and 967 s for the other two methods. The simulation results for different ANN architectures are recorded in Table 6.

Discussion
Overall, it can be said that the approximate solutions of the heat, wave and Poisson equations obtained with NeuroDiffEq are more accurate and more stable than those of Pydens, Nangs and even the classical FDM. This can be seen in Tables 1, 3 and 5, respectively. In terms of computational time, however, the FDM still gives the shortest time of all the methods. Among the three ANN-based methods, NeuroDiffEq took the longest time as more training data were added. Pydens is the fastest method on all the given problems; unfortunately, its performance is disappointing, since adding training data points does not improve its accuracy. Meanwhile, Nangs gives mixed results, sitting between the other two methods; in our view, it can potentially perform better when solving a wider variety of PDE problems. The reader can see the performance of the three methods in Figures 7, 12 and 17, which clearly illustrate the observations above.
To investigate the performance of the three methods further, we also ran the simulations reported in Tables 2, 4 and 6, respectively. The results show that the performance of all three methods can be improved by changing the ANN architecture, adding more layers and neurons; however, this increases the computational time. Other options are to increase the number of iterations or to change the optimizer.

Conclusion
We have discussed the Pydens, NeuroDiffEq and Nangs methods for solving the heat, wave and Poisson equations, which are second-order PDEs, and compared them with a classical method. The training data used ranged from 25 × 25 to 100 × 100 points. We compared the accuracy as well as the time efficiency of each method. Based on our experimental results, each method has advantages and disadvantages. In terms of accuracy, NeuroDiffEq consistently produced the lowest loss values compared with Pydens, Nangs and the FDM, although the training data had to be increased to reach those loss values, which lengthened the computational time. On the other hand, the classical FDM is still the fastest method of all. Interestingly, although Nangs is neither the fastest method nor the one with the lowest loss values, it can potentially produce better loss values when solving high-dimensional problems. It can be seen in Figures 6, 10 and 14 that, in terms of training data, computation times and the trend of the loss values, Nangs could potentially overtake NeuroDiffEq's performance on problems with large training sets.

Disclosure statement
No potential conflict of interest was reported by the author(s).