The idea of using neural networks (NNs) to solve partial differential equations (PDEs) has been around for some time [Dissanayake and Phan-Thien, 1994; Lagaris et al., 1998]. Following the advances in fields like computer vision and natural language processing, NNs have recently enjoyed growing interest in computational science and engineering as well. The deep Galerkin method (DGM) was developed in [Sirignano and Spiliopoulos, 2018] as a mesh-free approach to high-dimensional PDEs. Physics-informed NNs (PINNs) were proposed in [Raissi et al., 2019] as a hybrid model that enables the incorporation of physical prior knowledge into a regression framework.
Reviews of PINNs, and of physics-based machine learning more generally, can be found in [Karniadakis et al., 2021; Blechschmidt and Ernst, 2021; Cuomo et al., 2022]. Although such methods cannot (yet) fully replace more traditional techniques [Knoke and Wick, 2021; Chuang and Barba, 2022; Grossmann et al., 2024], they constitute a very exciting area of research in scientific computing. A very brief introduction is given below.
Residual
We consider the heat equation over a time interval \([0, T]\) on a spatial domain \(\Omega\) as an example system. The unknown temperature \(u(t, \boldsymbol{x})\), or any other quantity that can be similarly described, is then governed by the PDE
\[\frac{\partial u(t, \boldsymbol{x})}{\partial t} - \nabla^2 u(t, \boldsymbol{x}) = 0, \quad t \in [0, T], \quad \boldsymbol{x} \in \Omega.\]
Initial conditions \(u(0, \boldsymbol{x}) = u_0(\boldsymbol{x})\) for \(\boldsymbol{x} \in \Omega\) and boundary conditions \(u(t, \boldsymbol{x}) = u_b(t, \boldsymbol{x})\) for \((t, \boldsymbol{x}) \in [0,T] \times \partial \Omega\) are imposed. Note that other differential operators or boundary conditions can be treated analogously.
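As a concrete test case, one may take \(\Omega = (0, 1)\) with the initial condition \(u_0(x) = \sin(\pi x)\) and homogeneous Dirichlet boundary conditions \(u_b = 0\). Separation of variables then yields the exact solution
\[u(t, x) = e^{-\pi^2 t} \sin(\pi x),\]
against which any approximate solver can be checked.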
The goal is now to construct an NN \(u_{\boldsymbol{\theta}}(t, \boldsymbol{x})\) that approximately solves the governing equations. It will be convenient in the following to define the residual of the NN approximation as
\[r_{\boldsymbol{\theta}}(t, \boldsymbol{x}) = \frac{\partial u_{\boldsymbol{\theta}}(t, \boldsymbol{x})}{\partial t} - \nabla^2 u_{\boldsymbol{\theta}}(t, \boldsymbol{x}).\]
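For concreteness, a minimal sketch of this residual computation is given below, using PyTorch and assuming a single spatial dimension, so that \(\nabla^2 u = \partial^2 u / \partial x^2\). The network architecture and layer sizes are illustrative choices, not prescribed by the method.

```python
import torch

# A small fully connected network representing u_theta(t, x).
# Two tanh hidden layers is an illustrative architecture choice.
class PINN(torch.nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, hidden), torch.nn.Tanh(),
            torch.nn.Linear(hidden, 1),
        )

    def forward(self, t, x):
        # t and x have shape (N, 1); the network input is their concatenation.
        return self.net(torch.cat([t, x], dim=-1))

def residual(model, t, x):
    """Heat-equation residual r_theta = du/dt - d^2u/dx^2 via automatic
    differentiation (one spatial dimension assumed)."""
    t = t.detach().requires_grad_(True)
    x = x.detach().requires_grad_(True)
    u = model(t, x)
    ones = torch.ones_like(u)
    u_t = torch.autograd.grad(u, t, ones, create_graph=True)[0]
    u_x = torch.autograd.grad(u, x, ones, create_graph=True)[0]
    u_xx = torch.autograd.grad(u_x, x, ones, create_graph=True)[0]
    return u_t - u_xx
```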
Physics loss
One can construct a physics-inspired loss function that is tailored to solving the PDE. It contains three components that penalize non-zero residuals and deviations from the initial and boundary conditions, respectively. The so-called physics loss reads
\[L_{\mathrm{physics}} = \lambda_r L_{\mathrm{residual}} + \lambda_i L_{\mathrm{initial}} + \lambda_b L_{\mathrm{boundary}}.\]
Here, the relative importance of the different contributions can be adjusted with scalar weights \(\lambda_r, \lambda_i, \lambda_b > 0\). Given some collocation points \(\{(t_j^{(r)}, \boldsymbol{x}_j^{(r)})\}_{j=1}^{N_r}\) that test the residual within the domain, as well as points \(\{\boldsymbol{x}_j^{(i)}\}_{j=1}^{N_i}\) and \(\{(t_j^{(b)}, \boldsymbol{x}_j^{(b)})\}_{j=1}^{N_b}\) at the space-time boundary that test the initial and boundary conditions, respectively, the different terms are explicitly given as
\[\begin{align*} L_{\mathrm{residual}} &= \frac{1}{N_r} \sum_{j=1}^{N_r} \left( r_{\boldsymbol{\theta}}(t_j^{(r)}, \boldsymbol{x}_j^{(r)}) \right)^2, \quad t_j^{(r)} \in [0, T], \quad \boldsymbol{x}_j^{(r)} \in \Omega, \\ L_{\mathrm{initial}} &= \frac{1}{N_i} \sum_{j=1}^{N_i} \left( u_0(\boldsymbol{x}_j^{(i)}) - u_{\boldsymbol{\theta}}(0, \boldsymbol{x}_j^{(i)}) \right)^2, \quad \boldsymbol{x}_j^{(i)} \in \Omega, \\ L_{\mathrm{boundary}} &= \frac{1}{N_b} \sum_{j=1}^{N_b} \left( u_b(t_j^{(b)}, \boldsymbol{x}_j^{(b)}) - u_{\boldsymbol{\theta}}(t_j^{(b)}, \boldsymbol{x}_j^{(b)}) \right)^2, \quad t_j^{(b)} \in [0, T], \quad \boldsymbol{x}_j^{(b)} \in \partial \Omega. \end{align*}\]
An approximate solution \(u_{\hat{\boldsymbol{\theta}}}(t, \boldsymbol{x})\) of the PDE can eventually be computed by finding the NN weights \(\hat{\boldsymbol{\theta}} = \operatorname{arg\,min}_{\boldsymbol{\theta}} L_{\mathrm{physics}}\) that minimize the physics loss. Note that the formulation presented so far establishes a generic PDE solver.
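Continuing the PyTorch sketch from above, the physics loss can be estimated by Monte Carlo sampling of the collocation points. The uniform sampling over \([0, T] \times (0, X)\), the interval domain, and the default weights are all assumptions made for illustration.

```python
def physics_loss(model, u0, ub, T=1.0, X=1.0,
                 n_r=1000, n_i=100, n_b=100,
                 lam_r=1.0, lam_i=1.0, lam_b=1.0):
    """Monte Carlo estimate of L_physics for Omega = (0, X) and t in [0, T].
    u0(x) and ub(t, x) supply the initial and boundary data."""
    # Residual term: random collocation points in the interior.
    t_r, x_r = T * torch.rand(n_r, 1), X * torch.rand(n_r, 1)
    loss_r = residual(model, t_r, x_r).pow(2).mean()

    # Initial-condition term: random points at t = 0.
    x_i = X * torch.rand(n_i, 1)
    loss_i = (u0(x_i) - model(torch.zeros(n_i, 1), x_i)).pow(2).mean()

    # Boundary term: random times at the endpoints {0, X} of the domain.
    t_b = T * torch.rand(n_b, 1)
    x_b = X * torch.randint(0, 2, (n_b, 1)).float()
    loss_b = (ub(t_b, x_b) - model(t_b, x_b)).pow(2).mean()

    return lam_r * loss_r + lam_i * loss_i + lam_b * loss_b
```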
PINN loss
In a wider context, it may be important to incorporate actual experimental data into the scientific modeling process. This can compensate, to some degree, for the inevitable uncertainties and inadequacies of the physical model. PINNs offer an elegant mechanism for combining physical knowledge with real data. Given a set of data \(\{(t_i, \boldsymbol{x}_i, u_{\mathrm{meas}}(t_i, \boldsymbol{x}_i))\}_{i=1}^N\), one can simply consider an additional regression loss
\[L_{\mathrm{data}} = \frac{1}{N} \sum_{i=1}^N \left( u_{\mathrm{meas}}(t_i, \boldsymbol{x}_i) - u_{\boldsymbol{\theta}}(t_i, \boldsymbol{x}_i) \right)^2.\]
Note that, in a surrogate modeling context, such input-output data could in principle also come from a high-fidelity simulator. A PINN can be trained by minimizing the regression and physics losses together as a function of the NN weights. For the sake of completeness, the full PINN loss \(L = L_{\mathrm{data}} + L_{\mathrm{physics}}\) is written as
\[L = L_{\mathrm{data}} + \lambda_r L_{\mathrm{residual}} + \lambda_i L_{\mathrm{initial}} + \lambda_b L_{\mathrm{boundary}}.\]
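The corresponding data term and a complete, if simplistic, training loop might then look as follows. The "measurements" here are synthetic placeholders sampled from the exact solution of the sine example above, standing in for real data.

```python
def data_loss(model, t_d, x_d, u_meas):
    """Regression loss on measured (or high-fidelity simulated) data."""
    return (u_meas - model(t_d, x_d)).pow(2).mean()

# Problem data matching the sine example above (assumed, not prescribed).
u0 = lambda x: torch.sin(torch.pi * x)   # initial condition
ub = lambda t, x: torch.zeros_like(x)    # homogeneous Dirichlet BC

# Synthetic "measurements" from the exact solution, as a stand-in for data.
t_d, x_d = torch.rand(50, 1), torch.rand(50, 1)
u_meas = torch.exp(-torch.pi**2 * t_d) * torch.sin(torch.pi * x_d)

model = PINN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(5000):
    opt.zero_grad()
    loss = data_loss(model, t_d, x_d, u_meas) + physics_loss(model, u0, ub)
    loss.backward()
    opt.step()
```

The weights \(\lambda_r, \lambda_i, \lambda_b\) are left at one in this sketch; in practice, the individual loss terms can differ in magnitude, and the weights often require tuning.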