On the Addams family of discrete frailty distributions for modeling multivariate case I interval-censored data

Table 1:

RFV and distribution parameters of the Addams family and support.

Parameters	$Z \sim$	Distribution parameters	Support
$γ > 0 > α$	$ψ N B_{> 0} (ν, π)$	$ν = \frac{1}{γ - α}$ (number of successes), $π = \frac{- α}{γ - α}$ (success probability)	$ψ \times \{ν, 1 + ν, 2 + ν, \dots\}$
$γ > 0 = α$	$G (γ^{- 1}, γ^{*})$	$γ^{- 1}$ (shape), $γ^{*} = {(μ γ)}^{- 1}$ (rate)	$R_{> 0}$
$α = γ > 0$	$ψ P (λ^{*})$	$λ^{*} = γ^{- 1}$ (rate)	$ψ \times \{0, 1, 2, \dots\}$
$γ > α > 0$	$ψ N B (ν, π)$	$ν = \frac{1}{γ - α}$ (number of successes), $π = \frac{α}{γ}$ (success probability)	$ψ \times \{0, 1, 2, \dots\}$
$α > γ > 0$	$ψ B (b, π)$	$b = {(α - γ)}^{- 1}$ (number of trials), $π = \frac{α - γ}{α}$ (success probability)	$ψ \times \{0, 1, \dots, b\}$

The parameter $α$ plays a unique role in the context of discrete shared frailty modeling. For discrete shared frailty models, the RFV (CRF) approaches either zero (one) or infinity (Bardo and Unkel 2023). Therefore, it is desirable to have a continuous exception within a family of discrete shared frailty distributions for which the RFV (CRF) does not approach zero (one) or infinity as time approaches infinity. Within the $A F$, this exception is the $G$ distribution, which arises for $α = 0$. In that case the RFV (CRF) is constant, a shape that no discrete shared frailty model can generate. The nested structure can be used to test for a constant RFV (CRF) within the $A F$.

Moreover, the $A F$ can choose between a monotonically decreasing or increasing trajectory of the RFV (CRF) through the sign of $α$. This is unprecedented in the context of discrete shared frailty models. Though the ZMPS distribution is able to create decreasing trajectories of the RFV (CRF), this involves the edge of the parameter space for its deflation/inflation parameter. The opposite is true for the K-point distribution: its RFV (CRF) approaches zero (one) in the long run unless $z_{(1)}$ is equal to zero, which again involves the edge of the parameter space. For the $A F$, the parameter $α$ chooses between a decreasing or increasing RFV (CRF) by determining the support of the discrete distribution without involving the edge of the parameter space. If $α > 0$, then $z_{(1)} = 0$ and a cure fraction exists, which induces an increasing trajectory of the RFV (CRF). If $α < 0$ instead, then $z_{(1)} > 0$ and no cure fraction exists, which induces a decreasing trajectory of the RFV (CRF); see Bardo and Unkel (2023) for a discussion of the shape of the RFV (CRF) for discrete frailty models.
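The mapping from the RFV parameters $(α, γ)$ to a family member in Table 1 can be sketched in a few lines of Python. This is only an illustration: the function name `addams_member` and the returned labels are hypothetical, the scale factor $ψ$ is left out, and the cure-fraction flag simply encodes the statement above that $z_{(1)} = 0$ exactly when $α > 0$.

```python
def addams_member(alpha, gamma, mu=1.0):
    """Return (member name, shape parameters, cure fraction exists?) as in Table 1.

    Hypothetical helper for illustration; the scale factor psi is not computed.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    if alpha < 0:
        # gamma > 0 > alpha: shifted negative binomial, support starts at nu > 0
        params = {"nu": 1.0 / (gamma - alpha), "pi": -alpha / (gamma - alpha)}
        return "shifted negative binomial", params, False
    if alpha == 0:
        # gamma > 0 = alpha: the continuous gamma exception (constant RFV)
        params = {"shape": 1.0 / gamma, "rate": 1.0 / (mu * gamma)}
        return "gamma", params, False
    if alpha == gamma:
        return "Poisson", {"rate": 1.0 / gamma}, True
    if alpha < gamma:
        params = {"nu": 1.0 / (gamma - alpha), "pi": alpha / gamma}
        return "negative binomial", params, True
    # alpha > gamma > 0: binomial with a finite upper bound b
    params = {"b": 1.0 / (alpha - gamma), "pi": (alpha - gamma) / alpha}
    return "binomial", params, True


# Rounded male estimates reported later in Table 2 (alpha = -0.502, gamma = 83.447)
print(addams_member(-0.502, 83.447))
```

With the rounded estimates from Table 2 as input, the returned $ν$ and $π$ agree with the tabulated values up to rounding.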

Because the support depends on the distribution parameters, the ${HR}_{W}$ admits a meaningful interpretation. Figure 1a shows examples of the ${HR}_{W}$ for $α > 0$ and $α < 0$. If $α > 0$, ${HR}_{W} (1) \equiv \infty$ and ${HR}_{W} (k) = \frac{k}{k - 1}$ for $k \geq 2$. Note that there may be an upper bound on the RCs if $ψ B$ is chosen. In this case, the upper bound of the frailty as well as the number of RCs chosen through the fitting procedure may be the main component of the analysis, e.g. by comparing the RC-related survival curve of the upper bound versus another RC. If $α < 0$, ${HR}_{W} (k) = \frac{ν + k}{ν + k - 1}$, which approaches $\frac{k}{k - 1}$ for large k. So there is reasonable flexibility for the first few within-stratum HRs, and the focus may be on ${HR}_{W} (1)$, where the model is more flexible than for later RCs. The $A F$ is less flexible in this respect than the K-point distribution, which, provided that K is large enough, may even show a non-monotone trajectory of the ${HR}_{W}$. However, fitting the K-point distribution with a large K is difficult, especially if the cluster size is small, because it involves $2 (K - 1)$ parameters for a latent distribution. In contrast, the $A F$ offers $| Ω | = \infty$ (except for the $ψ B$ case), which can be important for extreme observations where events occur very early in the lifespan. Nevertheless, the parametric constraints of the $A F$ on the trajectory of ${HR}_{W}$ should always be taken into consideration and challenged by the K-point distribution whenever possible.
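The two closed forms for ${HR}_{W}$ can be evaluated with a short Python sketch. It reads ${HR}_{W} (k)$ as the ratio of the $(k + 1)$-th to the $k$-th support point, an interpretation inferred from the formulas and the supports in Table 1 rather than stated here. The function name `hr_within` is hypothetical, and the upper bound of the $ψ B$ case is ignored.

```python
import math


def hr_within(k, alpha, gamma):
    """Within-stratum HR of RC k+1 versus RC k for a discrete member (alpha != 0)."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    if alpha > 0:
        # Support psi * {0, 1, 2, ...}: the lowest RC has zero hazard
        return math.inf if k == 1 else k / (k - 1)
    if alpha < 0:
        # Support psi * {nu, 1 + nu, ...} with nu = 1 / (gamma - alpha)
        nu = 1.0 / (gamma - alpha)
        return (nu + k) / (nu + k - 1)
    raise ValueError("alpha = 0 is the continuous gamma member")


for k in (1, 2, 3, 10):
    print(k, hr_within(k, alpha=0.5, gamma=2.0), hr_within(k, alpha=-1.0, gamma=1.0))
```

The loop shows the pattern described above: the two cases differ most at $k = 1$ and become nearly indistinguishable for later RCs.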

Figure 1:

Within- and across-stratum hazard ratios versus RC k for various members of the Addams family. a) Within-stratum HR for $α > 0$ and $α < 0$ versus RC k. b) Across-stratum HR for $α_{i}, α_{i^{'}} > 0$ versus RC k. c) Across-stratum HR for $α_{i}, α_{i^{'}} < 0$ versus RC k. d) Across-stratum HR for $α_{i} < 0$, $α_{i^{'}} > 0$ versus RC k.

This analysis can be extended to the across-stratum HR, ${HR}_{A}$, if there is a model for the distribution of individual heterogeneity within the $A F$. For that purpose, we allow the parameters to depend on a stratum-specifying set of covariates $\tilde{x}$, which determine the parameters of individual heterogeneity $α (\tilde{x}) = {\tilde{x}}^{T} ζ$ and $γ (\tilde{x}) = \exp \{{\tilde{x}}^{T} κ\}$, with each parameter in the vectors $ζ, κ$ being an element of $R$. For the sake of brevity, we denote $α ({\tilde{x}}_{i}), γ ({\tilde{x}}_{i}),$ and $μ ({\tilde{x}}_{i})$ by $α_{i}, γ_{i},$ and $μ_{i}$, respectively. Figure 1 shows the ${HR}_{A}$ for varying scenarios of $α_{i}$ and $α_{i^{'}}$. For $α_{i}, α_{i^{'}} > 0$ (Fig. 1b), ${HR}_{A} (k) = \frac{ψ_{i}}{ψ_{i^{'}}}$ for all $k \geq 2$ and is undefined for $k = 1$. For $α_{i}, α_{i^{'}} < 0$ (Fig. 1c), however, ${HR}_{A} (k) = \frac{ψ_{i} (ν_{i} + k - 1)}{ψ_{i^{'}} (ν_{i^{'}} + k - 1)}$, which approaches the constant ratio $\frac{ψ_{i}}{ψ_{i^{'}}}$ for large k. Note that the ${HR}_{A}$ might be greater or less than one for all k, but can also cross the threshold of one with increasing k. If $\tilde{x}$ is, for example, an experimental treatment indicator (in a univariate context), the ${HR}_{A}$ represents a heterogeneous treatment effect, which might identify sub-groups within the population for which the treatment is harmful. For $α_{i} < 0, α_{i^{'}} > 0$ (Fig. 1d), ${HR}_{A} (1) \equiv \infty$, as for stratum $i^{'}$ there is a latent sub-population that is not susceptible to the event of interest, whereas for stratum i all latent sub-populations are susceptible. For $k \geq 2$, ${HR}_{A} (k) = \frac{ψ_{i} (ν_{i} + k - 1)}{ψ_{i^{'}} (k - 1)}$. Another scenario, not explicitly shown in Fig. 1b and d, is that one or both strata (i and $i^{'}$) might have (different) upper bounds of frailty in the $ψ B$ case. In such a case, the stratum with the larger upper bound (which could still be $\infty$), say $i^{'}$, could be considered more vulnerable, since that stratum has a higher proportion of individuals who are expected to experience the event very early, namely those with a frailty value greater than $ψ_{i} b_{i}$.
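The three sign scenarios for ${HR}_{A}$ can be reproduced from the support points alone. The following Python sketch is a hedged illustration: the function names and the convention of passing `nu=None` for a stratum with $α > 0$ are assumptions made here, all numbers are made up, and the upper bound of the $ψ B$ case is not handled.

```python
import math


def support_point(k, psi, nu=None):
    """k-th smallest frailty value: psi*(nu + k - 1) if alpha < 0, psi*(k - 1) if alpha > 0.

    Passing nu=None marks a stratum with alpha > 0 (hypothetical convention).
    """
    shift = 0.0 if nu is None else nu
    return psi * (shift + k - 1)


def hr_across(k, psi_i, psi_j, nu_i=None, nu_j=None):
    """Across-stratum HR of stratum i versus stratum j within the k-th RC."""
    num = support_point(k, psi_i, nu_i)
    den = support_point(k, psi_j, nu_j)
    if den == 0.0:
        # 0/0 is undefined (both strata have a cure fraction); x/0 is infinite
        return math.nan if num == 0.0 else math.inf
    return num / den


# Both alpha < 0 (illustrative values): the HR starts above one and ends below one
for k in (1, 2, 5, 100):
    print(k, hr_across(k, psi_i=1.0, psi_j=2.0, nu_i=2.0, nu_j=0.5))
```

In this made-up example the ratio equals 2 in the first RC and tends to $ψ_{i} / ψ_{i^{'}} = 0.5$, so it crosses the threshold of one as k increases.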

Note that the parameters in the formula of ${HR}_{A}$ and ${HR}_{W}$ (and hence the parameters as specified in the legends of Fig. 1) do not uniquely identify the parameters of the frailty distribution, i.e. for a given trajectory of ${HR}_{A}, {HR}_{W}$ there is an infinite set of $(α, γ)$ or $(α_{i}, γ_{i})$ and $(α_{i^{'}}, γ_{i^{'}})$, respectively, that induce the same trajectory but with different distributions of the RCs, which did not need to be specified for Fig. 1. This shows that the analysis of the frailty model always has two branches. The first branch is the analysis of HRs, which indicate the meaning of being in a particular latent RC (in a particular stratum) relative to another latent RC or to another observable stratum in the same latent RC. On the one hand, ${HR}_{W}$ can help to assess the importance of individual heterogeneity, e.g. if ${HR}_{W} (k)$ is large, then latent RC membership has a large effect on expected survival. On the other hand, comparing ${HR}_{W}$ across strata or analysing the ${HR}_{A}$ gives an account of random covariate effects where, e.g. covariates with a beneficial effect on survival or covariates with partly beneficial, partly detrimental effects might be detected. The second branch of the analysis is the distribution of RCs across strata, which may indicate differences in the distribution of risk-taking behavior and predisposition across strata, e.g. by indicating a stratum with a heavier tail of vulnerable RCs. Taken together, the analysis of HRs and the distribution of RCs can provide thorough analytical explanations in terms of selection and random covariate effects that can help to explain the trajectories of population survival curves (where the RCs are marginalized out), i.e. explanations for why the survival curves of two strata come closer or even cross over time.

4 Estimation

In this section, all time-dependent quantities are evaluated at the monitored (censored or uncensored) event times of the individuals. We drop the time argument from the expression and indicate the corresponding quantity with a subscript, e.g. $Λ_{i}^{(j)} = \exp \{{x_{i}^{(j)}}^{T} β^{(j)}\} Λ_{0}^{(j)} (t_{i}^{(j)})$. Furthermore, let $A \in P (\{1, \dots, J\})$, where $P$ denotes the power set. Then, $Λ_{i}^{(A)} = \sum_{j \in A} Λ_{i}^{(j)}$ and $Λ_{i}^{(- A)} = \sum_{j \notin A} Λ_{i}^{(j)}$. Note that we define $Λ_{i}^{(\emptyset)} = 0$ and $Λ_{i} = Λ_{i}^{(\{1, \dots, J\})}$.

We develop estimation routines for case I interval-censored data. With case I interval-censored data, it is only known whether the event occurred during follow-up or not, but the exact event time is unknown. For multivariate cases ($J > 1$), it is easier to understand the likelihood if one starts by exploiting the conditional independence assumption of $T_{i}^{(j)}$ and $T_{i}^{(j^{'})}, j \neq j^{'}$, given $Z_{i} = z$:

$$\begin{aligned} L (θ, λ_{0}, β; data) & = \prod_{i = 1}^{n} \int_{0}^{\infty} \prod_{j = 1}^{J} {(1 - \exp \{- z Λ_{i}^{(j)}\})}^{d_{i}^{(j)}} {\exp \{- z Λ_{i}^{(j)}\}}^{1 - d_{i}^{(j)}} g_{i} (z) d z \\ & = \prod_{i = 1}^{n} \int_{0}^{\infty} \sum_{A \in P (d_{i})} {(- 1)}^{| A |} \exp \{- z (Λ_{i}^{(A)} + Λ_{i}^{(- d_{i})})\} g_{i} (z) d z \\ & = \prod_{i = 1}^{n} \sum_{A \in P (d_{i})} {(- 1)}^{| A |} L (Λ_{i}^{(A)} + Λ_{i}^{(- d_{i})}) , \end{aligned} \tag{3}$$

where $d_{i}^{(j)}$ is the observational-unit- and target-specific event indicator (equal to one if the event occurred during the follow-up, zero otherwise), and $d_{i}$ is the set of targets on which the $i^{th}$ observational unit had an event. The vector $λ_{0}$ contains all parameters of the baseline hazard rates, $β = [β^{(0)}, β^{(1)}, \dots, β^{(J)}]$, and $θ = [ζ, κ]$.
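The step from the first to the third line of (3) is an inclusion-exclusion expansion over the subsets of $d_{i}$. A minimal Python sketch for a single observational unit is given below, assuming the frailty is discrete with finitely many support points so that the integral becomes a sum and both lines can be compared directly. All function names and numbers are hypothetical; the authors' R implementation is not reproduced here.

```python
from itertools import combinations
import math


def subsets(items):
    """Yield every subset of `items`, i.e. the power set P(d_i)."""
    items = list(items)
    for size in range(len(items) + 1):
        yield from combinations(items, size)


def contribution_laplace(cum_haz, events, laplace):
    """Third line of (3): signed sum of Laplace transforms over subsets of d_i."""
    rest = sum(v for j, v in cum_haz.items() if j not in events)  # Lambda_i^(-d_i)
    total = 0.0
    for subset in subsets(events):
        s = sum(cum_haz[j] for j in subset) + rest  # Lambda_i^(A) + Lambda_i^(-d_i)
        total += (-1) ** len(subset) * laplace(s)
    return total


def contribution_direct(cum_haz, events, values, probs):
    """First line of (3) for a discrete frailty: the integral is a finite sum."""
    total = 0.0
    for z, p in zip(values, probs):
        term = p
        for j, lam in cum_haz.items():
            surv = math.exp(-z * lam)
            term *= (1.0 - surv) if j in events else surv
        total += term
    return total


def discrete_laplace(values, probs):
    """Laplace transform E[exp(-s Z)] of a frailty with finitely many support points."""
    def laplace(s):
        return sum(p * math.exp(-z * s) for z, p in zip(values, probs))
    return laplace


# Made-up two-point frailty and cumulative hazards for three targets
values, probs = (0.5, 2.0), (0.3, 0.7)
cum_haz = {"HPV16": 0.4, "HPV18": 0.9, "HPV31": 0.2}
events = {"HPV16", "HPV31"}  # d_i: targets with an observed event
via_laplace = contribution_laplace(cum_haz, events, discrete_laplace(values, probs))
direct = contribution_direct(cum_haz, events, values, probs)
print(via_laplace, direct)
```

The number of terms in the Laplace form is $2^{| d_{i} |}$, so it grows with the number of observed events per unit rather than with the size of the frailty support.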

Quasi-Newton optimization routines are applied to maximize the log-likelihood based on (3). We choose BFGS as the standard method, as implemented in R version 4.2.2 (R Core Team 2022). Standard errors (SEs) are obtained via the Hessian of the log-likelihood, which is approximated by Richardson extrapolation as implemented by Gilbert and Varadhan (2019). The delta method is applied where necessary to obtain SEs. Confidence intervals (CIs) are based on a $\ln$ transformation if the parameter of interest is greater than zero and on a $\ln (- \ln)$ transformation if it lies between zero and one; they are then transformed back to the scale of interest.
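The back-transformed intervals can be sketched as follows in Python. This is an illustration of the standard delta-method construction, not the authors' code. The function names are hypothetical, and the standard error of 0.2146 in the usage line is back-calculated from the interval reported for ${\hat{μ}}_{f}$ in Section 5.1 rather than taken from the text.

```python
import math

Z975 = 1.959964  # 97.5% quantile of the standard normal distribution


def ci_log(est, se, z=Z975):
    """CI for a positive parameter: symmetric on the ln scale, then back-transformed."""
    se_log = se / est  # delta method: d/d(theta) ln(theta) = 1/theta
    return est * math.exp(-z * se_log), est * math.exp(z * se_log)


def ci_loglog(est, se, z=Z975):
    """CI for a parameter in (0, 1): symmetric on the ln(-ln) scale, then back-transformed."""
    g = math.log(-math.log(est))
    se_g = se / abs(est * math.log(est))  # |d/d(theta) ln(-ln(theta))| = 1/|theta ln(theta)|
    # g decreases in theta, so the upper end on the g scale maps to the lower limit
    lower = math.exp(-math.exp(g + z * se_g))
    upper = math.exp(-math.exp(g - z * se_g))
    return lower, upper


print(ci_log(0.328, 0.2146))   # assumed SE, see the note above
print(ci_loglog(0.3, 0.1))     # made-up estimate and SE
```

Both constructions keep the interval inside the admissible range of the parameter, which a symmetric interval on the original scale would not guarantee.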

We provide algorithms that are able to fit univariate and multivariate frailty models for case I interval-censored data. The frailty distributions can be stratified by a (multi-level) factor. The frailty distribution can be either the $A F$ or a member of the power variance family (both parameters can be estimated). The baseline hazard can be chosen to be piecewise constant or to follow the parametric generalized gamma distribution (Cox et al. 2007) or one of its special cases. Covariates can be added in a proportional hazards manner. Overdispersion parameters can be added by means of the Dirichlet compound multinomial distribution. Implementations are available on GitHub (https://github.com/time-to-MaBo/Addamsfamily/).

5 Applications

We illustrate the $A F$ in the context of multivariate case I interval-censored data on the human papillomavirus (HPV), obtained from a serological survey in the Netherlands (PIENTER-2); see Mollema et al. (2009) for details on PIENTER-2 and Scherpenisse et al. (2012) for an investigation of the respective HPV dataset. The data were collected in the years 2006 and 2007 and cover people aged 0 to 79. Participants were asked to complete a questionnaire and to provide a blood sample (Mollema et al. 2009). By means of the blood samples, antibody levels against the high-risk HPV types 16, 18, 31, 33, 45, 52, and 58 were determined in order to detect past infections. Therefore, at the time of observation, it is only known whether the study participants have had an infection in the past or not, but it is never known exactly when the potential infection occurred, resulting in case I interval-censored data, also known as current status data (Sun 2007). Note that at the time of data collection the Dutch national immunization programme did not include a vaccine against HPV.

We analyzed the nationwide sample including oversampled migrants and applied weighting factors to make the sample representative for the Dutch population. We excluded individuals in their first year of life from the analysis, as maternal antibodies could be transmitted to the infant transplacentally or through breastfeeding (Rintala et al. 2005). This left us with a sample size of $n = 6384$ individuals aged 2 to 80. The weighted proportion of females in the dataset is 49.9% (unweighted: 54.4%).

The observed time is the individual’s age at the date of serological monitoring. The event indicator $d_{i}^{(j)} = 1$ means that individual i is seropositive with respect to pathogen j, $j \in \{HPV 16, HPV 18, HPV 31, HPV 33, HPV 45, HPV 52, HPV 58\}$, whereas $d_{i}^{(j)} = 0$ means that the individual is seronegative and still susceptible. Seroprevalence is interpreted as a proxy for past infections. Note, however, that there is a time lag between infection and seroconversion, and that the number of individuals who were infected with HPV differs from the number who seroconverted: in previous studies, antibodies could not be detected for about 20-50% of females who were carriers of HPV DNA. However, antibody responses are relatively stable over time and hence the study of the population’s seroprevalence might yield important insights; see Scherpenisse et al. (2012) for a discussion.

We consider the following models for the individual hazard rates, $λ_{i}^{(j)} (t) = Z_{i} λ_{0}^{(sex: j)} (t)$, $sex \in \{m, f\}$, where the target-specific baseline hazard $λ_{0}^{(sex: j)} (t)$ is either sex-stratified (sex-stratified baseline hazard model) or non-stratified, $λ_{0}^{(sex: j)} (t) = λ_{0}^{(j)} (t)$ (non-stratified baseline hazard model). The purpose of stratifying baseline hazards by sex is twofold. The first is to investigate whether it is justified to estimate baseline hazards jointly for both sexes, and the second is to investigate whether a potential difference in the distribution is better explained by stratified baseline hazards than by different distributions of individual heterogeneity. In any case, the target-specific (and potentially sex-specific) hazard rate is piecewise constant with a unique parameter within each of the intervals $[0; 5)$, $[5; 10)$, $[10; 20)$, $[20; 30)$, $[30; 40)$, $[40; 50)$, $[50; 65)$, $[65; 80)$. The frailty Z is either sex-stratified, i.e. $Z_{i} \sim A F (α_{{sex}_{i}}, γ_{{sex}_{i}})$ (sex-stratified RE model), or non-stratified, i.e. $Z_{i} \sim A F (α, γ)$ (non-stratified RE model). Note that in both cases $μ_{m} \equiv 1$ and $μ_{f} \in R_{> 0}$, except for the stratified-hazard model, where $μ_{f}$ is also set to one for the sake of identifiability. The stratified RE model might be better able to reflect differing patterns in individual heterogeneity due to biological and environmental predisposition as well as a different distribution of risk-related behavior across males and females. We combine the stratification status of the baseline hazard with the stratification status of the RE.
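For a piecewise-constant baseline hazard, the cumulative baseline hazard $Λ_{0}^{(j)} (t)$ entering the likelihood is a sum of rate times exposure over the age intervals. A minimal Python sketch with the cut points listed above is shown here; the function name and the rates are made up for illustration and are not estimates from the paper.

```python
CUTS = [0, 5, 10, 20, 30, 40, 50, 65, 80]  # interval limits in years of age


def cumulative_hazard(t, rates, cuts=CUTS):
    """Cumulative hazard at age t for a hazard that is constant within each interval.

    Ages beyond the last cut point contribute nothing further in this sketch.
    """
    if len(rates) != len(cuts) - 1:
        raise ValueError("need exactly one rate per interval")
    total = 0.0
    for lower, upper, rate in zip(cuts[:-1], cuts[1:], rates):
        if t <= lower:
            break  # later intervals have not been entered yet
        total += rate * (min(t, upper) - lower)  # rate times time spent in the interval
    return total


# Hypothetical rates: no hazard in childhood, highest hazard between 20 and 30
rates = [0.0, 0.0, 0.01, 0.02, 0.005, 0.005, 0.002, 0.002]
print(cumulative_hazard(25, rates))
```

With eight intervals, each target (and each sex, if stratified) contributes eight baseline parameters to $λ_{0}$.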

An HPV infection can be transmitted via skin-to-skin contact, often—though not exclusively (see, e.g. Syrjänen (2010), Rintala et al. (2005) or Meyers et al. (2014))— via sexual intercourse (Gavillon et al. 2010). Therefore, individual—and typically unobserved—behavior is an important determinant of an individual’s risk of contracting HPV. Frailty models have been used previously to incorporate unobservable individual heterogeneity in the transmission of infectious diseases (see, e.g. Unkel et al. (2014) or Hens et al. (2009)). Moreover, we suspect that there may be distinct jumps in the individual hazard rates due to differences in individual behavior that are relevant for transmission, e.g. comparing individuals who have no sex at all to individuals who have (see, e.g. Richardson et al. (2000), Burchell et al. (2006)), or whether the individuals use condoms or not (see, for example, Lam et al. (2014) or Nielson et al. (2010)). More formally, non-Gaussian and discrete omitted covariates may be the most important drivers of individual heterogeneity, and thus a discrete frailty model may be particularly appropriate here.

We start with a bivariate analysis including HPV16 and HPV18. An extension to higher variate data with data on seven types of HPV follows.

5.1 Bivariate data analysis

We begin by exploiting the nested structure of the models for model selection. The stratification of the RE is statistically significant at conventional levels by means of a likelihood ratio test (LRT), regardless of the stratification status of the baseline hazard. The hypotheses are $H_{0} : Z_{m} \sim Z_{f} \sim A F (α, γ)$ vs. $H_{1} : Z_{sex} \sim A F (α_{sex}, γ_{sex})$. Note that the expectation parameter $μ_{f}$ is not included in the hypothesis. In the case of non-stratified baseline hazards, the LRT statistic equals 29.700 on 2 degrees of freedom (p-value $< 0.001$). In the case of sex-stratified baseline hazards, the LRT statistic equals 30.822 on 2 degrees of freedom (p-value $< 0.001$). Better performance of stratified RE models is also suggested by the $ϕ$-plot shown in Fig. 2a. The measure $ϕ$ is an association measure for bivariate current status data introduced by Unkel and Farrington (2012); $ϕ > 0$ indicates positive and $ϕ < 0$ negative association. Additionally, $ϕ$ tracks $\ln \{1 + RFV (t)\}$ with a time lag. It can be observed that the association between HPV16 and HPV18 is higher for females early in life, but declines more strongly than for males. This is likely to be the reason for the success of stratified RE models here, as those models are able to choose a distinct intercept and slope of the RFV across the sexes. We choose the sex-stratified RE model for further analysis.
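The p-values of the two tests above can be checked by hand, because the upper tail of a chi-square distribution with 2 degrees of freedom has the closed form $\exp \{- x / 2\}$. A small Python sketch, assuming the usual chi-square reference distribution for the LRT (the function name is hypothetical):

```python
import math


def lrt_pvalue_2df(statistic):
    """Upper-tail probability of a chi-square distribution with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)


for label, stat in [("non-stratified baseline", 29.700), ("sex-stratified baseline", 30.822)]:
    print(label, lrt_pvalue_2df(stat))
```

Both values are of the order of $10^{- 7}$, far below any conventional significance level.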

Figure 2:

Observed association between HPV types in the PIENTER-2 data. a) Observed association between HPV16 and HPV18 in terms of a $ϕ$-plot. Black dots refer to cohort- and sex-specific nonparametric estimates, with size proportional to precision. The black solid line is the corresponding LOESS. The other dotted and dashed lines are estimates resulting from the corresponding parametric models. b) RFV of the bivariate (left) and higher variate (right) dataset for the non-stratified hazard, stratified RE model. Note the different scales on the y-axis. The curves reach a plateau at around 30 years of age; hence, the x-axis was cut off after 40 years. Note that the seroprevalence curves $(1 - P (T^{(j)} > t))$ also reach a plateau around this time (not shown).

In terms of AIC, the non-stratified baseline hazard model $λ_{0}^{(j)} (t)$, $j \in \{HPV 16, HPV 18\}$, performs better than the sex-stratified baseline hazard model $λ_{0}^{(sex: j)} (t)$ (9647 vs. 9656). Thus, we choose the non-stratified baseline hazard, stratified RE model for further analysis.

An LRT for a constant RFV (CRF), i.e. $H_{0} : Z_{sex} \sim G (γ_{sex})$ vs. $H_{1} : Z_{sex} \sim A F (α_{sex}, γ_{sex})$ for males and females, yields a test statistic of 93.514 on 2 degrees of freedom (p-value $< 0.001$). Hence, the hypothesis of a constant RFV (CRF) can also be rejected. A constant pattern of the RFV is also not suggested by Fig. 2a, where it can be observed that the association falls steadily up to the age of around 50. Considering all tests and pairwise AIC comparisons, we choose the non-stratified baseline hazard, stratified $A F$ model for final analysis.

The RFV parameter estimates for the stratified RE, non-stratified baseline hazard model can be seen in Table 2. The estimated RFV (CRF) is decreasing for both sexes. The estimated RFV parameters indicate higher heterogeneity across clusters, or association within a cluster, for females early on, as indicated by the intercept of the RFV (${\hat{γ}}_{f} > {\hat{γ}}_{m}$). However, the descent is steeper for females ($| {\hat{α}}_{f} | > | {\hat{α}}_{m} |$), and consequently it is estimated that heterogeneity/association is stronger for males from the $4^{th}$ year of life onwards; see the left-hand panel of Fig. 2b. These results are also supported by the non-parametric estimates of $ϕ$ in Fig. 2a.

Table 2:

Estimated RFV parameters (above dashed line) and resulting estimated frailty distribution parameters (below dashed line) for non-stratified hazard, stratified RE model. Parentheses below point estimates show 95%-CIs.

	Male	Female
${\hat{α}}_{sex}$	$\underset{(- 0.809; - 0.196)}{- 0.502}$	$\underset{(- 5.008; - 0.757)}{- 2.882}$
${\hat{γ}}_{sex}$	$\underset{(66.629; 104.509)}{83.447}$	$\underset{(60.167; 137.621)}{90.996}$
${\hat{ψ}}_{sex}$	$\underset{(0.243; 1.038)}{0.502}$	$\underset{(0.336; 2.66)}{0.946}$
${\hat{ν}}_{sex}$	$\underset{(0.009; 0.016)}{0.012}$	$\underset{(0.006; 0.018)}{0.011}$
${\hat{π}}_{sex}$	$\underset{(0.002; 0.015)}{0.006}$	$\underset{(0.009; 0.11)}{0.031}$

The estimated distribution corresponds to a ${\hat{ψ}}_{sex} N B_{> 0} ({\hat{ν}}_{sex}, {\hat{π}}_{sex})$ for males and females. The mean parameter is estimated as ${\hat{μ}}_{f} = 0.328$ (95%-CI $[0.091; 1.182]$) and is not statistically significant, as indicated by the 95%-CI. The resulting distribution parameters can also be found in Table 2. The estimated mean ${\hat{μ}}_{f}$ indicates lower expected frailty (and therefore lower population hazard) for females initially. However, the mean parameter has to be interpreted in the context of its distribution. Let $μ_{sex} (t) = E (Z | T > t, sex)$. The limit of the conditional expectation of the frailty is $μ_{sex} ([\infty, \infty]) = ψ_{sex} ν_{sex}$. With the estimates from Table 2, ${\hat{μ}}_{f} ([\infty, \infty]) = 0.01 > {\hat{μ}}_{m} ([\infty, \infty]) = 0.006$ follows, and the initial order of the expectations is reversed. In this example, this leads to the estimated population seroprevalence $\hat{P} (T^{(HPV 16)} \leq age)$ being higher for males early in life, but from 12 years of life onwards females start to catch up and finally cross the curve of males at 25 years of life (not shown). We will discuss the reason for the switching order of ${\hat{μ}}_{f} (t)$ and ${\hat{μ}}_{m} (t)$, which finally leads to crossing seroprevalence curves, by analysing the distribution of the frailties in the paragraphs below.
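The reversal can be traced with the rounded estimates from Table 2. As time progresses, the survivors concentrate in the lowest RC, so the conditional mean tends to the smallest support point $ψ_{sex} ν_{sex}$, whereas at time zero it equals the mean parameter (fixed at one for males). The following lines are plain arithmetic on the tabulated values:

$$\begin{aligned} \text{initially:} \quad & μ_{m} = 1 > {\hat{μ}}_{f} = 0.328 , \\ \text{in the limit:} \quad & {\hat{ψ}}_{m} {\hat{ν}}_{m} = 0.502 \times 0.012 \approx 0.006 < {\hat{ψ}}_{f} {\hat{ν}}_{f} = 0.946 \times 0.011 \approx 0.010 . \end{aligned}$$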

Table 3 shows an excerpt of the distribution of the RCs. We interpret the distribution of the RCs as the distribution of stratum-relative risk-related behavior and predisposition. The bulk of the population is estimated to be in the lowest RC, though there is more probability to the right of the lowest RC for males. The ratio of the cumulative probabilities between females and males is always above one, also indicating a heavier tail for males. The heavier tail of the distribution of latent RCs for males is the reason for $μ_{m} > {\hat{μ}}_{f}$.

Table 3:

Estimated distribution of RCs and across stratum analysis for stratified RE, non-stratified hazard model. Parentheses below point estimate show 95%-CI.

	$\hat{P} (Z_{sex} \leq {\hat{z}}_{sex, (k)})$			${\hat{z}}_{sex, (k)}$
$k^{th}$ RC	Males	Females	$\frac{\hat{P} (Z_{f} \leq z_{f, (k)})}{\hat{P} (Z_{m} \leq z_{m, (k)})}$	Males	Females	${\hat{HR}}_{A} (k)$
$1^{st}$	$\underset{(0.932; 0.949)}{0.941}$	$\underset{(0.956; 0.97)}{0.964}$	$\underset{(1.013; 1.036)}{1.024}$	$\underset{(0.002; 0.015)}{0.006}$	$\underset{(0.004; 0.026)}{0.01}$	$\underset{(1.411; 2.01)}{1.684}$
$2^{nd}$	$\underset{(0.949; 0.955)}{0.952}$	$\underset{(0.971; 0.976)}{0.974}$	$\underset{(1.013; 1.032)}{1.023}$	$\underset{(0.245; 1.053)}{0.508}$	$\underset{(0.341; 2.684)}{0.956}$	$\underset{(1.064; 3.325)}{1.881}$
$3^{rd}$	$\underset{(0.956; 0.96)}{0.958}$	$\underset{(0.977; 0.98)}{0.978}$	$\underset{(1.017; 1.027)}{1.022}$	$\underset{(0.489; 2.091)}{1.011}$	$\underset{(0.677; 5.344)}{1.902}$	$\underset{(1.061; 3.336)}{1.882}$
$4^{th}$	$\underset{(0.96; 0.963)}{0.961}$	$\underset{(0.98; 0.983)}{0.982}$	$\underset{(1.017; 1.025)}{1.021}$	$\underset{(0.732; 3.13)}{1.513}$	$\underset{(1.014; 8.004)}{2.848}$	$\underset{(1.061; 3.34)}{1.882}$
$5^{th}$	$\underset{(0.963; 0.966)}{0.964}$	$\underset{(0.983; 0.985)}{0.984}$	$\underset{(1.017; 1.024)}{1.02}$	$\underset{(0.975; 4.168)}{2.016}$	$\underset{(1.35; 10.664)}{3.794}$	$\underset{(1.06; 3.341)}{1.882}$
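The cumulative probabilities in the first columns of Table 3 can be approximated from the rounded estimates in Table 2. The Python sketch below assumes the usual negative binomial parameterization (number of failures before $ν$ successes, with real-valued $ν$), shifted by $ν$ as in Table 1. That parameterization is an assumption made here, although it agrees with the tabulated values up to rounding. All function names are hypothetical.

```python
import math


def nb_params(alpha, gamma):
    """nu and pi for the gamma > 0 > alpha row of Table 1."""
    return 1.0 / (gamma - alpha), -alpha / (gamma - alpha)


def nb_pmf(x, nu, pi):
    """P(X = x) for a negative binomial count of failures; nu may be non-integer."""
    log_p = (math.lgamma(nu + x) - math.lgamma(nu) - math.lgamma(x + 1)
             + nu * math.log(pi) + x * math.log1p(-pi))
    return math.exp(log_p)


def rc_cdf(k, nu, pi):
    """P(Z <= z_(k)): total probability of the k lowest risk categories."""
    return sum(nb_pmf(x, nu, pi) for x in range(k))


male = nb_params(-0.502, 83.447)     # rounded estimates from Table 2
female = nb_params(-2.882, 90.996)
for k in range(1, 6):
    p_m, p_f = rc_cdf(k, *male), rc_cdf(k, *female)
    print(k, round(p_m, 3), round(p_f, 3), round(p_f / p_m, 3))
```

Because the inputs are rounded, the last printed digit can differ slightly from Table 3.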

The numerical value of the frailties then assigns a magnitude-related interpretation to the distinct RCs. The estimated support shows that females have a higher category-related hazard in each RC (see the last column of Table 3). Across strata, given the $k^{th}$ RC, the conditional or RC-related HR, ${\hat{HR}}_{A} (k) = \frac{{\hat{z}}_{f, (k)} {\hat{λ}}_{0}^{(j)} (t)}{{\hat{z}}_{m, (k)} {\hat{λ}}_{0}^{(j)} (t)}$, is 1.684 in the important first category. In this case, ${\hat{HR}}_{A} (k)$ approaches its limit (with respect to k), $\frac{{\hat{ψ}}_{f}}{{\hat{ψ}}_{m}} = 1.883$, quickly because the values of ${\hat{ν}}_{sex}$ are small. The higher RC-related hazard for females is the reason for ${\hat{μ}}_{f} (t)$ surpassing ${\hat{μ}}_{m} (t)$ as time progresses: the individuals belonging to the tail of the distribution of the RCs start to seroconvert early. This selection effect is more pronounced for males due to the heavier tail of their distribution of RCs. Because extreme individuals within the male population seroconvert more quickly, the higher RC-related hazard for females causes ${\hat{μ}}_{f} ([t, 0]) > {\hat{μ}}_{m} ([t, 0])$ from 12 years of life onwards.

Within stratum, ${\hat{HR}}_{W} (1) = 94.878$ (95%-CI [60.448; 148.919]) for females and $84.949$ (95%-CI [65.475; 110.215]) for males, i.e. being in the second instead of the lowest RC is estimated to be more hazardous for females than for males even from a relative perspective. ${\hat{HR}}_{W} (k)$ then approaches its limit $\frac{k}{k - 1}$ immediately because ${\hat{ν}}_{sex}$ is small for both males and females.
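The two hazard ratios can be checked numerically with a minimal sketch. It assumes the support of Table 1 for $γ > 0 > α$, so that the $k^{th}$ support point is $z_{(k)} = ψ (ν + k - 1)$, and it assumes that the within-stratum ratio compares consecutive support points. The function names are hypothetical. The values of $ν$ are not reported directly here; they are backed out from the reported ${\hat{HR}}_{W} (1)$, and the ratio ${\hat{ψ}}_{f} / {\hat{ψ}}_{m}$ is taken as the rounded limit 1.883, so agreement with Table 3 can only be expected up to rounding.

```python
def support_point(psi, nu, k):
    """k-th smallest support point z_(k) = psi * (nu + k - 1), k = 1, 2, ..."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    return psi * (nu + k - 1)


def hr_within(nu, k):
    """Within-stratum HR of RC k+1 versus RC k; the scale psi cancels."""
    return support_point(1.0, nu, k + 1) / support_point(1.0, nu, k)


def hr_across(psi_ratio, nu_f, nu_m, k):
    """Across-stratum HR (females vs. males) in RC k; the baseline hazard cancels."""
    return psi_ratio * support_point(1.0, nu_f, k) / support_point(1.0, nu_m, k)


# Point estimates reported in the text (rounded to three decimals).
HR_W1_F, HR_W1_M = 94.878, 84.949
PSI_RATIO = 1.883  # reported limit of HR_A(k)

# Assumption: invert HR_W(1) = (nu + 1) / nu to recover nu for each sex.
nu_f = 1.0 / (HR_W1_F - 1.0)
nu_m = 1.0 / (HR_W1_M - 1.0)

hr_a = [hr_across(PSI_RATIO, nu_f, nu_m, k) for k in range(1, 6)]
hr_w_f = [hr_within(nu_f, k) for k in range(1, 6)]

for k, (a, w) in enumerate(zip(hr_a, hr_w_f), start=1):
    print(f"k={k}: HR_A={a:.3f}  HR_W(females)={w:.3f}")
```

Under these assumptions, the across-stratum ratio starts near the reported 1.684 and settles close to the $ψ$ ratio from the second category onwards, while the within-stratum ratio for $k \geq 2$ is already close to $\frac{k}{k - 1}$.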

Differences in unobserved heterogeneity across the sexes are reflected by the support and the distribution of the RCs. Given that HPV is a sexually transmitted disease, membership of a certain RC is partly governed by stratum-relative (sexual) behavior, in the sense that having, for example, a higher number of sexual partners than some stratum reference should put one in a higher RC than the reference individual. It is tempting to interpret the difference across the sexes in the effect of a given RC on the conditional hazard rate. For the human immunodeficiency virus (HIV), for example, it is known that male-to-female transmission is more likely than female-to-male transmission (see Nicolosi et al. (1994) or European Study Group on Heterosexual Transmission of HIV (1992)). Assuming that each RC comprises the same set of sexual behavior across the sexes, $z_{f, (k)} > z_{m, (k)}$ for all k could also hint at a higher susceptibility of females to infection with HPV16 and HPV18 per relevant contact. However, the RCs are anchored in the stratum and do not necessarily imply the same behavior across the sexes. Hence, this interpretation is highly speculative and assumption-based.

5.2 Higher variate analysis

When including all seven high-risk types of HPV for which we have data, the direction of interpretation is largely similar to that of the bivariate case, and mainly the magnitude changes. The estimated RFV parameters are ${\hat{α}}_{m} = - 1.359$ (95%-CI [-1.8; -0.918]), ${\hat{γ}}_{m} = 9.908$ (95%-CI [8.928; 10.995]), ${\hat{α}}_{f} = - 2.005$ (95%-CI [-2.5; -1.509]), and ${\hat{γ}}_{f} = 6.855$ (95%-CI [6.143; 7.649]). Early in life, the heterogeneity/association is less extreme than in the bivariate case. Afterwards, however, it stays at a higher level than in the bivariate case, as shown in Fig. 2b, indicating that association remains high throughout life. It can also be seen that the RFV (CRF) of females is always below that of males, indicating greater heterogeneity due to individual factors for males throughout the entire time period. As the level of association differs strongly between the bivariate case above and the higher variate case with seven high-risk HPV types, a shared frailty model might be seen as inadequate to capture the patterns of association between the various types of HPV, and a correlated frailty model may be more appropriate. The shared frailty model might nevertheless be chosen for its simplicity if the specific types of HPV are less relevant to the research question and, for example, the prognostic value of one “anonymous” high-risk type for another “anonymous” high-risk type is investigated.

The (initial) expectation of the frailty is virtually identical for males and females; ${\hat{μ}}_{f} = 0.955$ (95%-CI [0.883; 1.033]). In the higher variate case, the distribution of the RCs is not as concentrated on the lower categories. The tail is again heavier for males (not shown). The ${HR}_{A} (k) > 1$ for all k again indicates a higher RC-related hazard for females. The ${\hat{HR}}_{W} (1)$ is less extreme in the higher variate case than in the bivariate case: $9.86$ (95%-CI [8.627; 11.268]) for females and $12.267$ (95%-CI [10.786; 13.95]) for males. Note that the order of ${\hat{HR}}_{W} (1)$ for males and females is reversed compared with the bivariate scenario.
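The reported ${\hat{HR}}_{W} (1)$ values can be traced back to the RFV parameters with a short sketch. It assumes the parametrization of Table 1 for $γ > 0 > α$, i.e. $ν = \frac{1}{γ - α}$ and support $ψ \times \{ν, 1 + ν, \dots\}$, and it assumes that ${HR}_{W} (1)$ is the ratio of the second to the first support point, which then simplifies to $1 + γ - α$. The function names are hypothetical.

```python
def nu_from_rfv_params(alpha, gamma):
    """nu = 1 / (gamma - alpha) in the shifted negative binomial case gamma > 0 > alpha."""
    if not (gamma > 0 > alpha):
        raise ValueError("this sketch only covers the case gamma > 0 > alpha")
    return 1.0 / (gamma - alpha)


def hr_within_first(alpha, gamma):
    """HR of the second versus the lowest RC: (nu + 1) / nu = 1 + gamma - alpha."""
    nu = nu_from_rfv_params(alpha, gamma)
    return (nu + 1.0) / nu


# Point estimates (alpha, gamma) for the seven-type analysis as reported in the text.
estimates = {"males": (-1.359, 9.908), "females": (-2.005, 6.855)}

hr_w1 = {sex: hr_within_first(a, g) for sex, (a, g) in estimates.items()}

for sex, value in hr_w1.items():
    print(f"{sex}: HR_W(1) = {value:.3f}")
```

With the point estimates above, this gives 12.267 for males and 9.86 for females, the values reported in the text.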

6 Conclusion

In this paper, we discuss the Addams family of discrete frailty distributions, which was conceptualized by Farrington et al. (2012) for modeling individual heterogeneity in time-to-event models. We further examine the properties of the conditional time-to-event model induced by the Addams family and develop estimation routines for multivariate case I interval-censored data.

For discrete frailty distributions, the RFV (CRF) approaches either infinity or zero (one) over the course of time, depending on whether the minimum of the support of the frailty is zero or greater than zero, respectively. A few discrete frailty distributions are able to manipulate the support via their parameters and thereby choose the long-term behavior of the two functions, but this typically involves the edge of the parameter space; see Bardo and Unkel (2023) for a discussion. For the Addams family of discrete frailty distributions, the minimum of the support can either be zero, resulting in a cure rate model, or greater than zero without involving the edge of the parameter space. Consequently, the RFV (CRF) is either monotonically increasing or monotonically decreasing, again without involving the edge of the parameter space. Through the introduction of a scaling parameter, the Addams family is also able to steepen or flatten the slope of the RFV (CRF), and the RFV (CRF) might even approach a constant as the family approaches its continuous exception, a shape that is impossible for other discrete shared frailty models to generate. This makes the Addams family a useful general-purpose modeling approach.

A unique feature of the Addams family is that the support of the discrete frailty distribution varies with its parameters and is hence subject to estimation. We suggest interpreting the support as ordered latent risk categories. This feature allows for a unique analysis of the latent model, as the effect of latent risk category membership on the hazard rates can be separated from the distribution of the latent risk category membership. By focusing on the support of the frailty, the latent model can essentially be interpreted analogously to the effect of a covariate, e.g. via time-invariant hazard ratios of different risk categories, which we call the within-stratum hazard ratio. If a model is imposed on the distribution parameters of the Addams family, this analysis can be enriched by the across-stratum hazard ratio, i.e. the hazard ratio of a given latent risk category for different strata that are defined by covariates. In a second step, the distribution of the ordered risk category membership can be examined in order to fully understand the impact of unobserved heterogeneity on observable patterns such as population hazard rates and ratios, which are averaged over the risk category membership of survivors. This type of analysis could also be performed with the discrete K-point distribution. However, there is no counterpart to this covariate-style analysis for continuous frailty distributions, or for discrete frailty distributions where the support is fixed. This is because, in those cases, the distribution of frailty cannot be meaningfully separated from the effect of frailty on hazard rates, as one would need to compare, e.g. quartiles of frailty distributions that vary with survival. Consequently, a time-invariant proportional hazards interpretation is not possible because the hazard ratio inherits the survival condition. Thus, the Addams family and the K-point distribution offer the possibility to analyze the latent model, which may include heterogeneous covariate effects, thoroughly with common measures.

The analysis of the latent model via the within-stratum hazard ratio might help to understand the importance of individual heterogeneity. In that sense, individual heterogeneity might be regarded as important if the within-stratum hazard ratios are large and vice versa. The analysis of the across-stratum hazard ratio may reveal structural differences in individual heterogeneity across covariates, prompting a discussion of the reasons for this. In this sense, the covariate-style interpretation may be beneficial for scientific discussion, as hazard ratios and probabilities are a common way of communicating with a non-statistical audience.

We applied the Addams family to multivariate case I interval-censored infection data and allowed the distribution of individual heterogeneity to differ for males and females. Males are found to have a higher probability of belonging to more hazardous categories, possibly reflecting more cautious behavior in the female population compared to males. However, the estimated hazard in each risk category is higher for females than for males, which might reflect a higher biological burden with respect to the susceptibility to HPV. There was no evidence for the existence of a non-susceptible sub-group, either in the bivariate data set, including HPV 16 and HPV 18, or in the data set containing seven high-risk types of HPV.

Acknowledgments

We are grateful to Fiona van der Klis and Liesbeth Mollema from the Dutch National Institute for Public Health and the Environment (RIVM), for giving us permission to use the data from the second PIENTER study. The authors also thank the anonymous reviewers for their valuable suggestions.

Funding

This work was supported by the Deutsche Forschungsgemeinschaft (DFG; German Research Foundation) [grant number UN 400/2-1].

Conflict of interest statement

None declared.

References

Aalen OO, Borgan Ø, Gjessing HK. Event history analysis: a process point of view. Statistics for biology and health. Dordrecht: Springer; 2008.

Ata N, Özel G. Survival functions for the frailty models based on the discrete compound Poisson process. J Stat Comput Simul. 2013:83(11):2105–2116.

Bardo M, Unkel S. The shape of the relative frailty variance induced by discrete random effect distributions in univariate and multivariate survival models, arXiv, arXiv:2303.04915, preprint: not peer reviewed, 2023:1–25.

Begun AZ, Iachine IA, Yashin AI. Genetic nature of individual frailty: comparison of two approaches. Twin Res Hum Genet. 2000:3(1):51–57.

Bijwaard G. Multistate event history analysis with frailty. Demographic Res. 2014:30:1591–1620.

Burchell AN, Winer RL, de Sanjosé S, Franco EL. Chapter 6: Epidemiology and transmission dynamics of genital HPV infection. Vaccine. 2006:24(Suppl 3):S3/52–61.

Cancho VG, Barriga G, Leão J, Saulo H. Survival model induced by discrete frailty for modeling of lifetime data with long-term survivors and change-point. Commun Stat Theory Methods. 2021:50(5):1161–1172.

Cancho VG, Macera MAC, Suzuki AK, Louzada F, Zavaleta KEC. A new long-term survival model with dispersion induced by discrete frailty. Lifetime Data Anal. 2020a:26(2):221–244.

Cancho VG, Suzuki AK, Barriga GDC, do Espirito Santo Ana PJ. A multivariate survival model induced by discrete frailty. Commun Stat Simul Comput. 2020b:51(11):6572–6590.

Cancho VG, Zavaleta KEC, Macera MAC, Suzuki AK, Louzada F. A Bayesian cure rate model with dispersion induced by discrete frailty. Commun Stat Appl Methods. 2018:25(5):471–488.

Caroni C, Crowder M, Kimber A. Proportional hazards models with discrete frailty. Lifetime Data Anal. 2010:16(3):374–384.

Choi S, Huang X. A general class of semiparametric transformation frailty models for nonproportional hazards survival data. Biometrics. 2012:68(4):1126–1135.

Choi S, Huang X, Chen Y-H. A class of semiparametric transformation models for survival data with a cured proportion. Lifetime Data Anal. 2014:20(3):369–386.

Cox C, Chu H, Schneider MF, Muñoz A. Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution. Stat Med. 2007:26(23):4352–4374.

de Souza D, Cancho VG, Rodrigues J, Balakrishnan N. Bayesian cure rate models induced by frailty in survival analysis. Stat Methods Med Res. 2017:26(5):2011–2028.

Duchateau L, Janssen P. The Frailty Model. Dordrecht: Springer; 2008.

European Study Group on Heterosexual Transmission of HIV. Comparison of female to male and male to female transmission of HIV in 563 stable couples. BMJ (Clin Res Ed.). 1992:304(6830):809–813.

Farrington CP, Unkel S, Anaya-Izquierdo K. The relative frailty variance and shared frailty models. J R Stat Soc Ser B (Stat Methodol). 2012:74(4):673–696.

Gasperoni F, Ieva F, Paganoni AM, Jackson CH, Sharples L. Non-parametric frailty Cox models for hierarchical time-to-event data. Biostatistics (Oxford, England). 2020:21(3):531–544.

Gavillon N, Vervaet H, Derniaux E, Terrosi P, Graesslin O, Quereux C. Papillomavirus humain (HPV): comment ai-je attrapé ça? Gynecol Obstetrique Fertilite. 2010:38(3):199–204.

Gilbert P, Varadhan R. numDeriv: accurate numerical derivatives. 2019. DOI: 10.32614/CRAN.package.numDeriv. https://cran.r-project.org/web/packages/numDeriv/

Hens N, Wienke A, Aerts M, Molenberghs G. The correlated and shared gamma frailty model for bivariate current status data: an illustration for cross-sectional serological data. Stat Med. 2009:28(22):2785–2800.

Hougaard P. Analysis of multivariate survival data. New York: Springer; 2000.

Lam JUH, Rebolj M, Dugué P-A, Bonde J, von Euler-Chelpin M, Lynge E. Condom use in prevention of human papillomavirus infections and cervical neoplasia: systematic review of longitudinal studies. J Med Screen. 2014:21(1):38–50.

Meyers J, Ryndock E, Conway MJ, Meyers C, Robison R. Susceptibility of high-risk human papillomavirus type 16 to clinical disinfectants. J Antimicrob Chemother. 2014:69(6):1546–1550.

Mohseni N, Maboudi AAK, Baghestani A, Saeedi A. A cure rate model with discrete frailty on Hodgkin lymphoma patients after diagnosis. Arch Adv Biosci. 2020:11(4):15–22.

Molina KC, Calsavara VF, Tomazella VD, Milani EA. Survival models induced by zero-modified power series discrete frailty: application with a melanoma data set. Stat Methods Med Res. 2021:30(8):1874–1889.

Mollema L, de Melker HE, Hahne SJ, van Weert JW, Berbers GA, van der Klis FR. PIENTER 2-project: second research project on the protection against infectious diseases offered by the national immunization programme in the Netherlands (Report 230421001/2009). Rijksinstituut voor Volksgezondheid en Milieu. 2009.

Nicolosi A, Léa Corrêa Leite M, Musicco M, Arici C, Gavazzeni G, Lazzarin A. The efficiency of male-to-female and female-to-male sexual transmission of the human immunodeficiency virus: a study of 730 stable couples. Epidemiology. 1994:5(6):570–575.

Nielson CM, Harris RB, Nyitray AG, Dunne EF, Stone KM, Giuliano AR. Consistent condom use is associated with lower prevalence of human papillomavirus infection in men. J Infect Dis. 2010:202(3):445–451.

Palloni A, Beltrán-Sánchez H. Discrete Barker frailty and warped mortality dynamics at older ages. Demography. 2017:54(2):655–671.

Pickles A, Crouchley R. Generalizations and applications of frailty models for survival and event data. Stat Methods Med Res. 1994:3(3):263–278.

R Core Team. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2022.

Richardson H, Franco E, Pintos J, Bergeron J, Arella M, Tellier P. Determinants of low-risk and high-risk cervical human papillomavirus infections in Montreal university students. Sexually Transmitted Dis. 2000:27(2):79–86.

Rintala MAM, Grénman SE, Puranen MH, Isolauri E, Ekblad U, Kero PO, Syrjänen SM. Transmission of high-risk human papillomavirus (HPV) between parents and infant: a prospective study of HPV in families in Finland. J Clin Microbiol. 2005:43(1):376–381.

Scherpenisse M, Mollers M, Schepp RM, Boot HJ, de Melker HE, Meijer CJLM, Berbers GAM, van der Klis FRM. Seroprevalence of seven high-risk HPV types in the Netherlands. Vaccine. 2012:30(47):6686–6693.

Sun J. The statistical analysis of interval censored failure time data. New York: Springer; 2007.

Syrjänen S. Current concepts on human papillomavirus infections in children. APMIS. 2010:118(6–7):494–509.

Troncoso-Ponce D. Estimation of competing risks duration models with unobserved heterogeneity using hsmlogit. 2018. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3114159

Unkel S, Farrington CP. A new measure of time-varying association for shared frailty models with bivariate current status data. Biostatistics (Oxford, England). 2012:13(4):665–679.

Unkel S, Farrington CP, Whitaker HJ, Pebody R. Time varying frailty models and the estimation of heterogeneities in transmission of infectious diseases. J R Stat Soc Ser C (Appl Stat). 2014:63(1):141–158.