Optimal dynamic treatment regime estimation in the presence of nonadherence

TABLE 1

Summary performance statistics for 1000 simulation runs for a 2-stage dynamic treatment regime for scenarios where $\psi_{22} = -1$ and $\psi_{22} = 1$.

| | Corrected | Corrected (Known) | Naive |
| --- | --- | --- | --- |
| $\psi_{22} = -1$ | | | |
| MSE $\psi_{10}$ | 0.146 | 0.144 | 0.159 |
| MSE $\psi_{11}$ | 0.086 | 0.083 | 0.188 |
| MSE $\psi_{20}$ | 0.301 | 0.299 | 0.357 |
| MSE $\psi_{21}$ | 0.074 | 0.073 | 0.123 |
| MSE $\psi_{22}$ | 0.733 | 0.730 | 0.683 |
| Mean optimally treated (stage 1) | 0.975 | 0.975 | 0.977 |
| Mean optimally treated (stage 2) | 0.959 | 0.959 | 0.951 |
| Mean regret ratio | 1.550 | 1.535 | 1.894 |
| $\psi_{22} = 1$ | | | |
| MSE $\psi_{10}$ | 0.205 | 0.201 | 0.249 |
| MSE $\psi_{11}$ | 0.090 | 0.088 | 0.177 |
| MSE $\psi_{20}$ | 0.325 | 0.324 | 0.406 |
| MSE $\psi_{21}$ | 0.080 | 0.078 | 0.162 |
| MSE $\psi_{22}$ | 0.805 | 0.800 | 0.774 |
| Mean optimally treated (stage 1) | 0.972 | 0.972 | 0.963 |
| Mean optimally treated (stage 2) | 0.958 | 0.958 | 0.941 |
| Mean regret ratio | 1.902 | 1.873 | 3.532 |

The mean squared error (MSE) for each contrast function term, the average proportion of optimally treated individuals at stages 1 and 2, and the ratio of the average regret to the average regret assuming perfect adherence are reported for each scenario. The simulation compares the modified G-estimation procedure (Corrected) with the modified G-estimation procedure with known adherence rates (Corrected (Known)) and with standard G-estimation ignoring the effects of nonadherence (Naive).

The results demonstrate an improvement in both parameter estimation and the scaled regrets when using the corrected technique compared with the naive estimators. The MSE is lower for every contrast parameter except $\psi_{22}$, for which the naive estimator is slightly better in both scenarios (0.683 versus 0.733 when $\psi_{22}=-1$, and 0.774 versus 0.805 when $\psi_{22}=1$). The improvements are more pronounced when $\psi_{22}=1$. In this setting, the effect of treatment is larger on average, and as a result, minor differences in assigned treatments can result in large differences in outcomes. The corrected and naive methods perform similarly in terms of the proportion of optimally treated individuals, and both have larger regret than estimation when nonadherence is absent.
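As a minimal sketch of how the summary metrics reported in Table 1 could be computed from simulation output, the Python below evaluates the MSE of a single parameter across runs and the regret ratio. The function names and toy numbers are illustrative assumptions, not the authors' code or data.

```python
from statistics import mean

def mse(estimates, truth):
    """Mean squared error of one parameter across simulation runs."""
    return mean((est - truth) ** 2 for est in estimates)

def regret_ratio(regrets, regrets_perfect_adherence):
    """Average regret divided by the average regret under perfect adherence."""
    return mean(regrets) / mean(regrets_perfect_adherence)

# Hypothetical toy values: three runs estimating psi_22 = -1.
psi22_hat = [-1.5, -0.5, -1.0]
print(mse(psi22_hat, -1.0))

# Hypothetical regrets for an estimated regime and for the perfect-adherence benchmark.
print(regret_ratio([0.25, 0.75], [0.25, 0.25]))
```

A regret ratio above 1 means the estimated regime loses more expected outcome than the benchmark estimated under perfect adherence, which is how the last row of each block in Table 1 can be read.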

4.2 Impact of validation sample sizing

In the second scenario, the impacts of the overall sample size and of the size of the validation sample are investigated. The sample size n is varied over $\lbrace 200, 1000, 5000\rbrace$, with the validation sample varied over $\lbrace 10\%, 20\%, 50\%\rbrace$ of the full sample. The contrast parameter $\psi_{22}$ is set to $-1$ for all runs. Table 2 contains the performance metrics for these simulations.

TABLE 2

Summary performance statistics for 1000 simulation runs for a 2-stage dynamic treatment regime for scenarios where $\psi_{22} = -1$ and n is varied over $\lbrace 200, 1000, 5000\rbrace$.

| | Corrected (10%) | Corrected (20%) | Corrected (50%) | Corrected (Known) | Naive |
| --- | --- | --- | --- | --- | --- |
| $n=200$ | | | | | |
| MSE $\psi_{10}$ | 15.243 | 0.949 | 0.832 | 0.812 | 0.718 |
| MSE $\psi_{11}$ | 9.644 | 0.764 | 0.682 | 0.656 | 0.696 |
| MSE $\psi_{20}$ | 66.127 | 2.047 | 1.976 | 1.964 | 2.649 |
| MSE $\psi_{21}$ | 40.560 | 0.707 | 0.677 | 0.654 | 0.354 |
| MSE $\psi_{22}$ | 51.442 | 4.808 | 4.622 | 4.506 | 4.781 |
| Mean optimally treated (stage 1) | 0.915 | 0.924 | 0.927 | 0.927 | 0.939 |
| Mean optimally treated (stage 2) | 0.876 | 0.879 | 0.880 | 0.880 | 0.864 |
| Mean regret ratio | 3.294 | 3.042 | 2.935 | 2.912 | 3.075 |
| $n=1000$ | | | | | |
| MSE $\psi_{10}$ | 0.153 | 0.150 | 0.146 | 0.144 | 0.159 |
| MSE $\psi_{11}$ | 0.095 | 0.087 | 0.083 | 0.083 | 0.188 |
| MSE $\psi_{20}$ | 0.305 | 0.302 | 0.301 | 0.299 | 0.357 |
| MSE $\psi_{21}$ | 0.078 | 0.076 | 0.073 | 0.073 | 0.123 |
| MSE $\psi_{22}$ | 0.713 | 0.732 | 0.736 | 0.730 | 0.683 |
| Mean optimally treated (stage 1) | 0.974 | 0.975 | 0.975 | 0.975 | 0.977 |
| Mean optimally treated (stage 2) | 0.959 | 0.959 | 0.959 | 0.959 | 0.951 |
| Mean regret ratio | 1.552 | 1.561 | 1.535 | 1.535 | 1.894 |
| $n=5000$ | | | | | |
| MSE $\psi_{10}$ | 0.024 | 0.023 | 0.023 | 0.023 | 0.050 |
| MSE $\psi_{11}$ | 0.015 | 0.014 | 0.014 | 0.014 | 0.110 |
| MSE $\psi_{20}$ | 0.057 | 0.056 | 0.056 | 0.056 | 0.113 |
| MSE $\psi_{21}$ | 0.016 | 0.016 | 0.015 | 0.015 | 0.092 |
| MSE $\psi_{22}$ | 0.131 | 0.130 | 0.130 | 0.129 | 0.189 |
| Mean optimally treated (stage 1) | 0.990 | 0.990 | 0.990 | 0.990 | 0.983 |
| Mean optimally treated (stage 2) | 0.984 | 0.984 | 0.984 | 0.984 | 0.979 |
| Mean regret ratio | 1.061 | 1.046 | 1.038 | 1.041 | 2.337 |

As the sample size increases, the performance of the estimators tends to improve. When the sample size and validation sample are small, the corrected estimators are unstable, resulting in large MSEs (for example, 66.127 for $\psi_{20}$ with $n=200$ and a 10% validation sample), though the proportion of optimally treated individuals and the regrets remain comparable to the naive analysis. In large samples ($n=5000$), the regret of the regime estimated using the corrected technique approaches that of the regime estimated under perfect adherence (regret ratios between 1.038 and 1.061), whereas the naive analysis produces a regime with roughly twice the regret (regret ratio 2.337).
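To make the small-sample instability more concrete, the following sketch enumerates the simulation grid and the implied number of individuals with validation data. It assumes the validation subsample size is simply the stated percentage of n; the variable and function names are illustrative.

```python
from itertools import product

sample_sizes = [200, 1000, 5000]
validation_percents = [10, 20, 50]

def validation_size(n, percent):
    """Number of individuals in a validation subsample of `percent`% of n."""
    return n * percent // 100

# Map each (n, validation %) configuration to its validation subsample size.
grid = {
    (n, p): validation_size(n, p)
    for n, p in product(sample_sizes, validation_percents)
}

for (n, p), m in grid.items():
    print(f"n={n:>4}, validation={p}% -> {m} individuals")
```

Under this assumption, the smallest configuration leaves only 20 individuals for fitting the nonadherence model, which is consistent with the very large MSEs in the first column of Table 2.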

4.3 Bootstrap and asymptotic variance coverage probabilities

The third simulation explores the impact of sample size and validation proportion on the coverage probabilities. Data are generated taking $\psi_{22} = -1$, with the sample sizes and validation sizes as in the second scenario. The corrected estimators are applied with 3 separate techniques for quantifying the uncertainty: a percentile-based bootstrap procedure applied individually to each component, a simultaneous bootstrap procedure based on percentiles (Gao et al., 2021), and estimated standard errors from the sandwich variance matrix. Bootstrap intervals are formed at the $\lbrace 90\%, 91\%, \dots , 99\%\rbrace$ levels based on $B=200$ replicates. The empirical versus nominal coverage results are shown in Figure 1.
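The componentwise percentile bootstrap and the empirical coverage calculation can be sketched as follows. This is a simplified illustration: the statistic here is a plain sample mean standing in for a contrast parameter estimate, the quantiles use a nearest-rank rule, and all names are assumptions. It does not implement the modified G-estimator or the simultaneous procedure of Gao et al. (2021).

```python
import random
from statistics import mean

def percentile_bootstrap_ci(data, statistic, level=0.95, n_boot=200, seed=0):
    """Percentile bootstrap interval for statistic(data) at the given level."""
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on n_boot resamples drawn with replacement.
    replicates = sorted(
        statistic([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = 1.0 - level
    # Nearest-rank positions of the alpha/2 and 1 - alpha/2 quantiles.
    lower = replicates[round((alpha / 2) * (n_boot - 1))]
    upper = replicates[round((1 - alpha / 2) * (n_boot - 1))]
    return lower, upper

def empirical_coverage(intervals, truth):
    """Fraction of intervals that contain the true parameter value."""
    return mean(1.0 if lo <= truth <= hi else 0.0 for lo, hi in intervals)

# Toy usage with hypothetical data centered at -1.
rng = random.Random(1)
sample = [rng.gauss(-1.0, 1.0) for _ in range(100)]
for level in (0.90, 0.95, 0.99):
    print(level, percentile_bootstrap_ci(sample, mean, level=level))
```

In a coverage study, one interval would be computed per simulation run and `empirical_coverage` applied to the resulting list, once for each nominal level.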

FIGURE 1

Empirical and nominal coverage probability based on 1000 simulation runs for the estimated parameter values over varying sample sizes (n) and validation proportions. Coverage is calculated for the naive bootstrap (circular points) and the simultaneous bootstrap (triangular points) at nominal levels $\lbrace 0.9,0.91,\dots ,0.99\rbrace$. Coverage based on the asymptotic variance is calculated continuously over this range (lines). Each parameter is shown in a different color. The black dotted line indicates where the empirical coverage equals the nominal coverage.

The simultaneous bootstrap intervals show conservative coverage across all parameters, all scenarios, and all levels. The standard percentile-based intervals tend to produce coverage that is closer to the nominal level; however, as the sample sizes increase, coverage can remain below the nominal levels for certain parameters. With moderate and large samples, the sandwich estimation tends to produce results that are roughly in line with nominal coverage for most parameters. It is worth noting that the best overall performance occurs with $n=1000$ rather than $n=5000$. We suspect that this is due to numerical stability issues in the simulation runs. Nevertheless, these results suggest that further explorations of the asymptotic distribution or of bootstrapping techniques are valuable avenues for future work.

5 DATA ANALYSIS

We demonstrate the utility of the proposed corrections by considering an analysis of the MACS data (Kaslow et al., 1987). Our analysis follows Wallace et al. (2016) and Hernán et al. (2000) and considers the question of the timing of intervention with a particular antiretroviral drug, AZT. Our sample considers individuals who were HIV positive and AIDS free in March 1986, and includes 2 decision points. In our sample, roughly 2.06% of individuals were prescribed AZT at stage 1 and an additional 3.26% at stage 2, so that 5.32% had been prescribed AZT by stage 2. Approximately 90.08% of individuals who were assigned AZT were fully adherent.

The outcome is the count of a patient’s CD4 cells, a type of white blood cell critical for immune response, at the visit immediately following their second eligible visit. We consider lab results on CD8 cell counts, white blood cell counts, red blood cell counts, and platelet counts, as well as systolic and diastolic blood pressure, weight, and an indicator variable of recent symptoms. Additionally, we use data from a questionnaire administered in October 1998 to assess patient adherence (Kleeberger et al., 2001). While these data correspond to self-reported adherence data, $A_j^{**}$, we treat them as though they reflect actual treatment, $A_j$. Summary statistics for key health information are contained in Table 3.

TABLE 3

Summary of key demographic and health factors present in the MACS data, based on the prescribed treatment at visits 1 and 2.

| | Visit 1: No AZT; Visit 2: No AZT (N = 2614) | Visit 1: No AZT; Visit 2: AZT prescribed (N = 90) | Visit 1: AZT prescribed; Visit 2: AZT prescribed (N = 57) |
| --- | --- | --- | --- |
| Age (years in 1986) | | | |
| Mean (SD) | 35.6 (8.22) | 36.3 (7.53) | 31.6 (8.79) |
| Median [min, max] | 35.0 [-2.00, 72.0] | 35.0 [19.0, 58.0] | 33.0 [15.0, 50.0] |
| CD4 count (visit 1) | | | |
| Mean (SD) | 891 (408) | 329 (236) | 341 (248) |
| Median [min, max] | 864 [9.00, 2800] | 267 [13.0, 929] | 297 [18.0, 993] |
| CD4 count (visit 2) | | | |
| Mean (SD) | 849 (409) | 331 (230) | 316 (253) |
| Median [min, max] | 823 [6.00, 2640] | 297 [8.00, 936] | 291 [9.00, 1070] |
| Body weight (lbs) (visit 1) | | | |
| Mean (SD) | 171 (28.7) | 167 (26.1) | 171 (45.7) |
| Median [min, max] | 166 [56.4, 408] | 163 [125, 270] | 165 [116, 442] |
| Body weight (lbs) (visit 2) | | | |
| Mean (SD) | 172 (30.1) | 169 (27.7) | 166 (39.5) |
| Median [min, max] | 168 [47.4, 409] | 163 [125, 288] | 164 [49.6, 348] |
| Symptoms present (visit 1) | | | |
| No symptoms | 2519 (96.4%) | 74 (82.2%) | 36 (63.2%) |
| Symptoms | 95 (3.6%) | 16 (17.8%) | 21 (36.8%) |
| Symptoms present (visit 2) | | | |
| No symptoms | 2502 (95.7%) | 69 (76.7%) | 45 (78.9%) |
| Symptoms | 112 (4.3%) | 21 (23.3%) | 12 (21.1%) |

All individuals prescribed AZT at visit 1 (N = 57) will remain on AZT at visit 2.
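As a quick arithmetic check, the group sizes in Table 3 can be converted into prescription proportions. The sketch below uses only the three column counts from the table; the variable names are illustrative.

```python
# Group sizes taken from the column headers of Table 3.
never_prescribed = 2614
prescribed_at_visit_2 = 90   # not prescribed at visit 1, prescribed at visit 2
prescribed_at_visit_1 = 57   # prescribed at visit 1 (and remaining on AZT)

total = never_prescribed + prescribed_at_visit_2 + prescribed_at_visit_1

stage1_share = prescribed_at_visit_1 / total
new_at_stage2_share = prescribed_at_visit_2 / total
cumulative_share = (prescribed_at_visit_1 + prescribed_at_visit_2) / total

print(total)                                 # 2761
print(f"{100 * stage1_share:.2f}%")          # 2.06%
print(f"{100 * new_at_stage2_share:.2f}%")   # 3.26%
print(f"{100 * cumulative_share:.2f}%")      # 5.32%
```

The total of 2761 matches the number of individuals in the complete case analysis, and the 5.32% figure is the cumulative share prescribed AZT by the second visit.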

Prescribed treatment, $A_j^{*}$, takes a value of 1 if AZT was started during period j. Individuals prescribed AZT remain on AZT for the duration of the study. We assume that an individual not prescribed AZT does not take AZT, so $\operatorname{P}(A_j = 0 \mid A_j^{*} = 0) = 1$. The nonadherence model is fit using logistic regression on the validation data. We find that the adherence rates appear to be consistent between the first and second stages, and include age and log transformations of the CD4 counts and of the systolic and diastolic blood pressures in the adherence model. The functional form of our DTR follows Wallace et al. (2016). We conduct a complete case analysis, using data on 2761 individuals, with adherence data on 220 patients.
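The structure of the adherence model described above can be sketched in a few lines: the probability of actually taking AZT is 0 when it is not prescribed, and follows a logistic model in the covariates when it is. The coefficient values and covariate layout below are hypothetical placeholders, not estimates from the MACS data.

```python
import math

def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-z))

def prob_takes_treatment(a_star, covariates, beta):
    """P(A_j = 1 | A_j^* = a_star, X = covariates) under a logistic adherence model.

    Individuals who are not prescribed AZT are assumed never to take it,
    so the probability is exactly 0 when a_star == 0.
    """
    if a_star == 0:
        return 0.0
    linear_predictor = sum(b * x for b, x in zip(beta, covariates))
    return expit(linear_predictor)

# Hypothetical example: covariates are (intercept, age); beta is made up.
beta = [2.0, 0.01]
print(prob_takes_treatment(0, [1.0, 40.0], beta))
print(prob_takes_treatment(1, [1.0, 40.0], beta))
```

In practice the coefficients would be estimated by logistic regression on the validation subsample, here the 220 patients with adherence data.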

The treatment-free model contains the CD4 and log-transformed CD4 counts from each stage up to the current one. The contrast models include age, log-transformed CD4 counts, and the symptom indicator for stage 1. At stage 2, the true treatment, $A_1$, is also included. The treatment prescription model is fit using logistic regression with CD4, CD8, red blood cell, white blood cell, and platelet counts. All ages are recorded as of 1986. Table 4 displays the proportion of assigned treatment for both a naive analysis and the corrected procedure, the proportion of agreement between these 2 techniques, and parameter estimates with the estimated standard errors and 95% bootstrap confidence intervals for the contrast parameters.

TABLE 4

Estimated optimal treatment proportions for stages 1 and 2 for both the corrected analysis and a naive analysis (assuming full adherence), along with the number and proportion of treatment agreement at both stages.

| | Modified G-estimation, $A = 0$ | Modified G-estimation, $A = 1$ | Naive, $A = 0$ | Naive, $A = 1$ |
| --- | --- | --- | --- | --- |
| Stage 1: N | 422 | 2339 | 368 | 2393 |
| Stage 1: Agreement | 146 (34.6%) | 2117 (90.5%) | | |
| Stage 2: N | 420 | 2 (Total 2341) | 347 | 21 (Total 2414) |
| Stage 2: Agreement | 144 (34.3%) | 2138 (91.3%) | | |

| Parameter | Modified G-estimation: Estimate (SE) | Modified G-estimation: 95% CI | Naive: Estimate (SE) | Naive: 95% CI |
| --- | --- | --- | --- | --- |
| $\psi_{11}$ | 997.9 (3306.3) | $[-604.1, 17227.9]$ | $-45.5$ (271.5) | $[-783.0, 389.4]$ |
| $\psi_{12}$ | $-10.7$ (24.1) | $[-116.2, 79.8]$ | $-2.2$ (3.6) | $[-8.8, 5.8]$ |
| $\psi_{13}$ | $-77.3$ (340.2) | $[-1482.1, 1218.6]$ | 23.4 (39.8) | $[-42.8, 122.9]$ |
| $\psi_{14}$ | $-74.0$ (133.8) | $[-661.8, 536.6]$ | $-90.4$ (62.5) | $[-207.5, 31.0]$ |
| $\psi_{21}$ | 439.7 (1092.7) | $[-5682.2, 5147.6]$ | 82.7 (127.6) | $[-153.3, 365.4]$ |
| $\psi_{22}$ | $-4.4$ (8.0) | $[-40.9, 39.8]$ | $-1.5$ (2.2) | $[-6.1, 2.7]$ |
| $\psi_{23}$ | $-53.4$ (136.8) | $[-656.6, 697.1]$ | $-11.2$ (17.4) | $[-51.4, 24.1]$ |
| $\psi_{24}$ | $-63.0$ (56.2) | $[-266.2, 91.8]$ | $-46.1$ (41.7) | $[-125.9, 61.2]$ |
| $\psi_{25}$ | 1347.7 (4098.5) | $[-18947.1, 188858.2]$ | 37.5 (35.8) | $[-45.2, 108.0]$ |

In addition to the optimal treatment agreement, the optimal estimated contrast function parameters, standard errors, and 95% naive bootstrapped confidence intervals are provided for both analyses.

Both techniques have highly variable estimates, with no parameters significantly different from 0 at the 0.05 significance level. Both techniques also heavily favor early intervention, with the estimated regimes suggesting that a majority of individuals start AZT at the first stage. Despite this, there is substantial disagreement between the naive and corrected regimes about which individuals should start AZT. At the first stage, the 2 techniques disagree on roughly 18.0% of the patients in the study. At the second stage, they still disagree on treatment for roughly 17.3% of the individuals. These differences in optimal treatment far exceed the differences observed in the simulation studies. While the shortcomings of our analysis render it unlikely that our specific effect estimates are indicative of the underlying reality, this analysis makes clear the concerns with ignoring adherence information. Even small amounts of nonadherence can greatly impact the estimation.
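The disagreement rates quoted above can be recovered from the agreement counts in Table 4. The sketch below assumes that each agreement count is the number of individuals for whom both regimes recommend the same action, and that the denominator is the 2761 individuals in the complete case analysis.

```python
total = 2761  # individuals in the complete case analysis

# Agreement counts from Table 4: (both recommend A = 0, both recommend A = 1).
agreement = {"stage 1": (146, 2117), "stage 2": (144, 2138)}

# Everyone not counted as agreeing is treated as a disagreement.
disagreement_rate = {
    stage: (total - sum(counts)) / total
    for stage, counts in agreement.items()
}

for stage, rate in disagreement_rate.items():
    print(f"{stage}: {100 * rate:.1f}% disagreement")
```

This gives 498 and 479 discordant recommendations at stages 1 and 2, respectively.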

6 DISCUSSION

Nonadherence is a pervasive concern in medical data that can invalidate causal analyses. We propose the first method for optimal DTR estimation that corrects for the impacts of nonadherence. The proposed method maintains the desirable properties of G-estimation when sufficient auxiliary information is available to correctly model the study’s nonadherence. When such data are not available, the technique can be applied to determine how sensitive the estimated DTR is to hypothesized degrees of nonadherence. While the proposed techniques can leverage prescribed or reported treatments to model nonadherence, we suggest that the use of prescribed treatments allows for more natural modeling. Importantly, for consistency, the proposed techniques rely on correctly specified models for the nonadherence mechanisms and on various conditions regarding the dependence structures in the data. We also do not explore the use of multiple error-prone treatment indicators, such as both the prescribed and reported treatments. Both of these limitations present opportunities for future investigation.

ACKNOWLEDGMENTS

The authors thank the anonymous reviewers, and the editors, for their helpful comments.

FUNDING

This research was partially supported by funding from the Natural Sciences and Engineering Research Council of Canada (NSERC) and the New Brunswick Innovation Fund (NBIF)/ResearchNB. Yi is a Canada Research Chair in Data Science (Tier 1). Her research was undertaken in part thanks to funding from the Canada Research Chairs Program.

CONFLICT OF INTEREST

None declared.

DATA AVAILABILITY

The Multicenter AIDS Cohort Study (MACS) data that support the findings in this paper are publicly available upon request via https://www.niaid.nih.gov/research/multicenter-aids-cohort-study-public-data-set.

REFERENCES

Chakraborty, Laber E. B., Zhao (2013). Inference for optimal dynamic treatment regimes using an adaptive m-out-of-n bootstrap scheme. Biometrics, 714–723.

Chakraborty, Murphy, Strecher (2009). Inference for non-regular parameters in optimal dynamic treatment regimes. Statistical Methods in Medical Research, 317–343.

Cotton C. A., Heagerty P. J. (2011). A data augmentation method for estimating the causal effect of adherence to treatment regimens targeting control of an intermediate measure. Statistics in Biosciences.

DiMatteo M. R. (2004). Variations in patients’ adherence to medical recommendations. Medical Care, 200–209.

Gao, Konietschke (2021). On the admissibility of simultaneous bootstrap confidence intervals. Symmetry, 1212.

Han (2021). Identification in nonparametric models for dynamic treatment effects. Journal of Econometrics, 225, 132–147.

Hernán M. Á., Brumback, Robins J. M. (2000). Marginal structural models to estimate the causal effect of zidovudine on the survival of HIV-positive men. Epidemiology, 561–570.

Hernán M. Á., Lanoy, Costagliola, Robins J. M. (2006). Comparison of dynamic treatment regimes via inverse probability weighting. Basic and Clinical Pharmacology and Toxicology, 237–242.

Kaslow R. A., Ostrow D. G., Detels, Phair J. P., Polk B. F., and C. R. R. (1987). The Multicenter AIDS Cohort Study: rationale, organization, and selected characteristics of the participants. American Journal of Epidemiology, 126, 310–318.

Kleeberger C. A., Phair J. P., Strathdee S. A., Detels, Kingsley, Jacobson L. P. (2001). Determinants of heterogeneous adherence to HIV-antiretroviral therapies in the Multicenter AIDS Cohort Study. JAIDS Journal of Acquired Immune Deficiency Syndromes.

Liu, Wang, Kosorok M. R., Zhao, Zeng (2018). Augmented outcome-weighted learning for estimating optimal dynamic treatment regimens. Statistics in Medicine, 3776–3788.

McComsey G. A., Lingohr-Smith, Rogers, Lin, Donga (2021). Real-world adherence to antiretroviral therapy among HIV-1 patients across the United States. Advances in Therapy, 4961–4974.

McCoy (2017). Understanding the intention-to-treat principle in randomized controlled trials. Western Journal of Emergency Medicine, 1075–1078.

Moodie E. E. M. (2009). A note on the variance of doubly-robust G-estimators. Biometrika, 998–1004.

Moodie E. E. M., Richardson T. S. (2010). Estimating optimal dynamic regimes: correcting bias under the null. Scandinavian Journal of Statistics, 126–146.

Murphy S. A. (2003). Optimal dynamic treatment regimes. Journal of the Royal Statistical Society, Series B, 331–355.

O’Brien, Chi Y.-L., Krause K. R. (2021). Measuring health outcomes in HIV: time to bring in the patient experience. Annals of Global Health.

Ranganathan, Pramesh, Aggarwal (2016). Common pitfalls in statistical analysis: intention-to-treat versus per-protocol analysis. Perspectives in Clinical Research, 144.

Robins J. M. (1986). A new approach to causal inference in mortality studies with a sustained exposure period–application to control of the healthy worker survivor effect. Mathematical Modelling, 1393–1512.

Robins J. M. (2004). Optimal Structural Nested Models for Optimal Sequential Decisions, 189–326. New York, NY: Springer.

Robins J. M., Hernán M. Á., Brumback (2000). Marginal structural models and causal inference in epidemiology. Epidemiology, 550–560.

Rubin D. B. (1980). Bias reduction using Mahalanobis-metric matching. Biometrics, 293.

Rubin D. B. (2005). Causal inference using potential outcomes. Journal of the American Statistical Association, 100, 322–331.

Sheiner L. B., Rubin D. B. (1995). Intention-to-treat analysis and the goals of clinical trials. Clinical Pharmacology & Therapeutics.