Uncovering mortality patterns and hospital effects in COVID-19 heart failure patients: a novel multilevel logistic cluster-weighted modeling approach

TABLE 1

Parameter values acquired through the novel multilevel logistic cluster-weighted model (ML-CWMd) estimation across the 3 clusters.

Parameter	Cluster 1		Cluster 2		Cluster 3
\|$\boldsymbol {\mu }$\|	\|$(82.52; \ 9.87)$\|		\|$(59.67; \ 4.59)$\|		\|$(74.65; \ 29.81)$\|
\|$\boldsymbol {\Sigma }$\|	$${\begin{bmatrix}51.3 & -3.8 \\-3.8 & 36.9 \end{bmatrix}}$$		$${\begin{bmatrix}91.6 & -6.3 \\-6.3 & 19.2 \end{bmatrix}}$$		$${\begin{bmatrix}98.6 & 14.7 \\14.7 & 35.9 \end{bmatrix}}$$
\|$\boldsymbol {\nu }$\|	\|$(0.11; \ -3.41; \ -1.37)$\|		\|$(1.80; \ -2.10; \ -1.90)$\|		\|$(0.28; \ -4.86; \ -1.40)$\|
Variable	Estimate	\|${\bf Pr}(>\|z\|)$\|	Estimate	\|${\bf Pr}(>\|z\|)$\|	Estimate	\|${\bf Pr}(>\|z\|)$\|
PNA	\|$+0.18$\|	0.09	\|$-0.08$\|	0.21	\|$+0.27$\|	0.47
RF	\|$+0.10$\|	0.37	\|$+1.51$\|	0.018*	\|$-0.90$\|	0.046*

The upper section displays the values of the parameters |$\boldsymbol {\mu }$|, |$\boldsymbol {\Sigma }$|, and |$\boldsymbol {\nu }$|. The lower section shows the estimates of the fixed effects parameters |$\boldsymbol {\beta }_c$| (|$c = 1,\ldots ,C$|) and their corresponding P-values.

The second cluster comprises 321 patients with a mean age of 59.67 years and a mean MCS score of 4.59. This points toward a profile of young and middle-aged individuals in comparatively good health. Within this cluster, most patients are male (82.2%), and the majority are free of COPD (84.4%) and of BRH (93.5%). Figure 1B illustrates the interaction parameters, indicating a notable positive correlation between COPD and BRH, alongside a significant negative correlation between BRH and sex. This suggests that BRH is more prevalent among females. In this cluster, 19.3% of patients died within 45 days of hospitalization. Upon examining the fixed effects (Table 1), we observe that the presence of RF in patients within this cluster substantially increases their risk of mortality. This association likely arises from the fact that RF implies impaired lung function, resulting in diminished oxygen levels in the bloodstream. Given that COVID-19 primarily targets the respiratory system, the coexistence of these 2 conditions further exacerbates respiratory dysfunction, making it challenging for the body to maintain adequate oxygen levels. Consequently, despite these patients' overall good health, the severe oxygen deficiency can lead to exceptionally severe complications. In Figure 2B, we observe that only 2 hospitals exhibit significant associations with an increased likelihood of death in these patients. The limited impact of hospital variability on Cluster 2 is likely attributable to the fact that patients within this profile are generally younger and consistently healthier than the rest of the cohort, making them less sensitive to the overall quality of care provided by the treatment facility.

The third cluster comprises 150 patients with a mean age of 74.65 years and a mean MCS score of 29.81. This profile consists predominantly of elderly individuals with a substantial burden of comorbidities. Within this cluster, most patients are male (67.5%), and the majority are free of COPD (62.6%) and of BRH (79.3%). Figure 1C depicts the interaction parameters, revealing a substantial positive correlation between COPD and BRH, as well as a significant positive correlation between BRH and sex. This implies a greater prevalence of BRH among males within this cluster. In this cluster, 43.3% of patients died within 45 days of hospitalization. Interestingly, the presence of RF appears to significantly reduce the probability of death in patients within this cluster (Table 1). This somewhat counterintuitive finding is frequently observed among patients of this profile (West et al., 2014). A plausible explanation is that RF acts as a protective factor: its occurrence in patients already in critical condition prompts medical professionals to prioritize their monitoring and treatment, thereby enhancing their chances of survival. Regarding the random effects (Figure 2C), only 1 hospital shows a significant decrease in the probability of death for this profile. This further suggests that the health conditions of patients in this cluster are so critical that the specific hospital where they are treated has little or no impact on their outcomes.
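To make the reading of the interaction parameters concrete, the following minimal sketch enumerates an Ising distribution over 3 binary covariates and shows how a positive interaction raises a conditional prevalence above the marginal one. It rests on several assumptions that are not confirmed by the text: a 0/1 coding of the binary covariates, |$\boldsymbol {\nu }$| interpreted as the vector of thresholds in the order (sex, COPD, BRH), and purely hypothetical interaction values, since the estimated interactions are reported only graphically in Figure 1. The function names are illustrative.

```python
import itertools
import math


def ising_pmf(thresholds, interactions):
    """Exact pmf of a small Ising model on {0,1}^p (assumed parametrization).

    P(d) is proportional to exp(sum_j nu_j d_j + sum_{j<k} gamma_jk d_j d_k).
    `interactions` maps index pairs (j, k) to gamma_jk; missing pairs are 0.
    """
    p = len(thresholds)
    weights = {}
    for d in itertools.product((0, 1), repeat=p):
        energy = sum(thresholds[j] * d[j] for j in range(p))
        energy += sum(g * d[j] * d[k] for (j, k), g in interactions.items())
        weights[d] = math.exp(energy)
    z = sum(weights.values())  # normalizing constant, feasible only for small p
    return {d: w / z for d, w in weights.items()}


def marginal(pmf, j):
    """P(d_j = 1)."""
    return sum(prob for d, prob in pmf.items() if d[j] == 1)


def conditional(pmf, j, k):
    """P(d_j = 1 | d_k = 1)."""
    joint = sum(prob for d, prob in pmf.items() if d[j] == 1 and d[k] == 1)
    return joint / marginal(pmf, k)


# Thresholds: Cluster 2 values of nu from Table 1; the variable order
# (sex, COPD, BRH) is an assumption.
nu = (1.80, -2.10, -1.90)
# Hypothetical interactions: positive COPD-BRH, negative sex-BRH.
gamma = {(1, 2): 1.0, (0, 2): -1.0}

pmf = ising_pmf(nu, gamma)
print("P(BRH = 1)            =", round(marginal(pmf, 2), 3))
print("P(BRH = 1 | COPD = 1) =", round(conditional(pmf, 2, 1), 3))
```

Under these assumptions, a positive COPD-BRH interaction makes BRH more likely among patients with COPD than in the cluster as a whole, which is the sense in which the interaction parameters are read above.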

When examining the random effects across clusters, a noteworthy trend emerges. For instance, Hospital H20 consistently receives positive evaluations in both Clusters 1 and 3, whereas Hospitals H13 and H15 receive unfavorable ratings in both Clusters 1 and 2. This finding has important implications for policymaking, as it allows suitable monitoring and second-level analysis of “out of control” situations.

Finally, we evaluate the predictive accuracy and calibration of the proposed model on held-out data, comparing it with the generalized linear mixed effects model (GLMM) and the generalized linear model (GLM) under the same regression settings. Specifically, we generate 20 training-test splits with an 80-20 proportion stratified by hospital. The models are fitted on the training data and then used to predict death outcomes for patients in the test set. The results, including training and test predictive accuracy, expected calibration error (ECE), and Brier score, are summarized in Table 2, which reports the mean and standard deviation across the 20 splits. The proposed model achieves the best mean value on all metrics, although the margins over the GLMM are small relative to the reported standard deviations.
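As an illustration of the evaluation protocol described above, the sketch below implements a hospital-stratified 80-20 split together with the three reported metrics. It is not the code used for the analysis: the classification threshold of 0.5, the 10 equal-width bins for the ECE, and all function names are assumptions made for the example.

```python
import numpy as np


def split_by_hospital(hospital_ids, test_frac=0.2, seed=0):
    """Return boolean (train, test) masks, sampling test patients within each hospital."""
    rng = np.random.default_rng(seed)
    hospital_ids = np.asarray(hospital_ids)
    test_mask = np.zeros(hospital_ids.size, dtype=bool)
    for h in np.unique(hospital_ids):
        members = np.flatnonzero(hospital_ids == h)
        n_test = int(round(test_frac * members.size))
        test_mask[rng.choice(members, size=n_test, replace=False)] = True
    return ~test_mask, test_mask


def accuracy(y_true, p_pred, threshold=0.5):
    """Share of correct predictions after thresholding (threshold is an assumption)."""
    y = np.asarray(y_true, dtype=int)
    y_hat = (np.asarray(p_pred, dtype=float) >= threshold).astype(int)
    return float(np.mean(y_hat == y))


def brier_score(y_true, p_pred):
    """Mean squared difference between predicted probability and outcome."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    return float(np.mean((p - y) ** 2))


def expected_calibration_error(y_true, p_pred, n_bins=10):
    """ECE with equal-width probability bins (binning scheme is an assumption)."""
    y = np.asarray(y_true, dtype=float)
    p = np.asarray(p_pred, dtype=float)
    # Bin index of each prediction; p == 1.0 is assigned to the last bin.
    idx = np.minimum((p * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            # Bin weight times |observed death rate - mean predicted probability|.
            ece += in_bin.mean() * abs(y[in_bin].mean() - p[in_bin].mean())
    return float(ece)
```

Repeating the split with 20 different seeds, fitting each model on the training mask, and averaging the metrics over the test masks would give summaries of the kind reported in Table 2.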

TABLE 2

Mean and standard deviation of all metrics across the 20 training-test splits. Bold font denotes the best results.

Evaluation type	ML-CWMd	GLMM	GLM
Train Prediction Accuracy	\|$\boldsymbol{0.660 \pm 0.01}$\|	\|$0.639 \pm 0.01$\|	\|$0.588 \pm 0.003$\|
Test Prediction Accuracy	\|$\boldsymbol{0.620 \pm 0.02}$\|	\|$0.610 \pm 0.01$\|	\|$0.588 \pm 0.01$\|
ECE	\|$\boldsymbol{0.045 \pm 0.009}$\|	\|$0.048 \pm 0.012$\|	\|$0.056 \pm 0.012$\|
Brier score	\|$\boldsymbol{0.209 \pm 0.005}$\|	\|$0.212 \pm 0.004$\|	\|$0.225 \pm 0.003$\|

4.3 Scenario analysis

In this section, we implement a scenario analysis to emphasize the importance of considering both the cluster factor and the hospital effect to capture real-world complexity. We define 3 new COVID-19 HF patients, with characteristics closely mirroring those identified within the 3 clusters. For each of these patients, we consider one version with respiratory diseases (PNA and RF) and one without them, resulting in 6 profiles. For the complete table, including patients’ characteristics, see Web Appendix D. The objective is to illustrate the difference between predicting the likelihood of mortality for these new patients using the proposed model (Equation 1), a GLMM that ignores latent patient clusters, and a GLM that ignores both latent patient clusters and the hospital effect. For each patient, we compute 3 distinct predictions, corresponding, respectively, to random effects equal to |$-\hat{\sigma }_{bc}$| (hospital with commendable performance for that specific patient profile), 0 (hospital with negligible influence on the outcome), and |$+\hat{\sigma }_{bc}$| (hospital with poor performance).

In Figure 3, we show the predictions obtained with the 3 models. For the GLM, we have only a single value since it does not take into account the hospital effect. The bar intervals on the graph show how the predicted probabilities of death change when the hospital is considered beneficial for the patient's characteristics (at |$-\hat{\sigma }_{bc}$|) versus when it is considered detrimental (at |$+\hat{\sigma }_{bc}$|). The GLMM generates highly similar predictions for the same patient with and without respiratory diseases because it does not identify any respiratory disease as a significant covariate. Notably, only age, sex, and MCS emerge as significant variables in the fixed regression component of the GLMM. Consequently, the GLMM fails to capture the variability attributed to respiratory diseases, which could significantly influence a patient's probability of death, particularly among COVID-19 HF patients. By contrast, the ML-CWMd accounts for the effects of respiratory diseases, and its predicted probability of death changes markedly with the presence or absence of these diseases. For instance, for the patients in the green cluster (young individuals with low MCS), the probability of death in a hospital characterized as detrimental for the patient's characteristics is approximately 0.28 if the patient has no respiratory diseases. This probability increases to around 0.4 if the patient has both respiratory diseases. In conclusion, the ML-CWMd provides more differentiated estimates of patient survival probabilities across risk categories. Additionally, it offers a valid tool for the assessment and monitoring of hospital facilities using administrative databases.
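The construction of the three predictions per patient can be sketched as follows for a single logistic component with a hospital random intercept, i.e., conditional on membership in one cluster. The respiratory coefficients are the Cluster 2 estimates from Table 1; the baseline linear predictor `ETA_BASE` and the random-effect standard deviation `SIGMA_B` are hypothetical placeholders, so the output is not intended to reproduce Figure 3.

```python
import math


def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))


def scenario_probabilities(eta_fixed, sigma_b):
    """Predicted death probabilities at random effects -sigma_b, 0, +sigma_b."""
    return tuple(expit(eta_fixed + b) for b in (-sigma_b, 0.0, sigma_b))


# Cluster 2 fixed effects of the respiratory covariates (Table 1).
BETA_PNA, BETA_RF = -0.08, 1.51

# Hypothetical values: contribution of intercept and remaining covariates,
# and the cluster-specific standard deviation of the hospital effect.
ETA_BASE = -1.5
SIGMA_B = 0.5

without_resp = scenario_probabilities(ETA_BASE, SIGMA_B)
with_resp = scenario_probabilities(ETA_BASE + BETA_PNA + BETA_RF, SIGMA_B)

for label, probs in (("no PNA/RF", without_resp), ("PNA and RF", with_resp)):
    best, neutral, worst = probs
    print(f"{label}: best={best:.3f}, neutral={neutral:.3f}, worst={worst:.3f}")
```

The interval between the first and third values corresponds to a bar in Figure 3, while a model without a hospital effect would return only the middle value.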

FIGURE 3

Predicted probabilities of mortality for the 6 patients across the 3 competing models. Bar intervals illustrate the shift in predicted probabilities when the hospital is deemed advantageous versus detrimental. Solid bars represent the novel multilevel logistic cluster-weighted model (ML-CWMd) introduced in the paper, dotted bars denote GLMM, and single points indicate GLM.

5 DISCUSSION

This paper has introduced a novel methodology for simultaneously risk-stratifying patients and conducting cluster-specific hospital evaluations. In detail, a novel ML-CWMd is devised, extending previous works with the ability to capture the dependence among observations within the same cluster and hierarchy, as well as the dependence among dichotomous variables through the inclusion of the Ising model contribution. Resorting to maximum likelihood estimation, we have implemented a tailored CEM algorithm to perform model fitting, tested its performance in a simulated setting, and compared it with state-of-the-art alternatives. Our proposal has demonstrated promising results when dealing with complex scenarios encompassing latent clusters of observations, group effects, and interdependence among binary covariates. This research has been motivated by the challenge of developing tailored models that accommodate diverse patient profiles and hospital-specific effects. Specifically, we have applied our proposal to a real-world administrative dataset from the Lombardy Region, Italy, including information about HF patients hospitalized for COVID-19. The analysis has revealed the existence of 3 distinct patient profiles, each characterized by different survival patterns and comorbidities. In addition, the model setting has provided valuable insights into the ways respiratory diseases and hospitals affect individual patient profiles. The analysis has thus shown actionable margins for defining healthcare interventions to enhance the territorial management of patients with HF through the planning of optimal care pathways, thereby reducing adverse clinical outcomes and improving system efficiency.

The devised methodology also possesses limitations. The independence assumption among observations belonging to the same known hierarchy but different latent clusters may not always be tenable, as the grouping effect could potentially exhibit shared patterns across clusters. Furthermore, the restriction of modeling dependence among dichotomous covariates only was solely motivated by the type of covariates available within the Lombardy Region database. Specifically, the considered Ising model represents the simplest form of Markov random fields (Kindermann and Snell, 1980), and future methodological advancements could indeed extend the current proposal to allow for more general higher-order interactions. In addition, tackling the well-known over-parameterization issue associated with multiple continuous covariates could improve both the flexibility and adaptability of the proposed approach. Solutions to this issue are proposed in Banfield and Raftery (1993), Celeux and Govaert (1995), and Murphy and Murphy (2020), among others. Lastly, one aspect not addressed in this work is the evaluation of uncertainty associated with parameter estimates. While non-differentiability issues may prevent the implementation of an information matrix-based approach in our setting, sampling-based methods provide an alternative and generalizable solution for estimating standard errors. Jackknife and bootstrap techniques could be adapted to our problem, similar to the approaches proposed by O’Hagan et al. (2019) and Berta and Vinciotti (2019) in the context of Gaussian mixtures and ML-CWMds, respectively. Future research will extend the current proposal to address its limitations and enhance its ability to model time-to-event outcomes, with several proposals already under study.
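As a rough indication of what a sampling-based approach to standard errors could look like, the sketch below implements a nonparametric bootstrap that resamples patients with replacement within each hospital. It is a generic illustration, not a procedure from this paper: `estimator` is a placeholder for a full refit of the model on each resampled dataset, the within-hospital resampling scheme is an assumption, and a practical application to mixture-type models would additionally need to align cluster labels across replicates.

```python
import numpy as np


def bootstrap_se(data, groups, estimator, n_boot=200, seed=0):
    """Bootstrap standard error of `estimator`, resampling within each group.

    data      : array of observations (first axis indexes patients)
    groups    : group label (e.g., hospital) of each observation
    estimator : callable mapping a resampled data array to an estimate
    """
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    groups = np.asarray(groups)
    members = [np.flatnonzero(groups == g) for g in np.unique(groups)]
    estimates = []
    for _ in range(n_boot):
        # Resample with replacement inside each group so group sizes are preserved.
        idx = np.concatenate(
            [rng.choice(m, size=m.size, replace=True) for m in members]
        )
        estimates.append(estimator(data[idx]))
    # Spread of the replicated estimates approximates the standard error.
    return np.std(np.asarray(estimates), axis=0, ddof=1)


# Toy usage: standard error of an overall death rate in 3 hospitals of 20 patients.
rng = np.random.default_rng(1)
deaths = rng.binomial(1, 0.3, size=60)
hospitals = np.repeat([0, 1, 2], 20)
print(bootstrap_se(deaths, hospitals, np.mean))
```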

ACKNOWLEDGMENTS

This work is part of the ENHANCE-HEART project: Efficacy evaluatioN of the therapeutic-care patHways, of the heAlthcare providers effects, aNd of the risk stratifiCation in patiEnts suffering from HEART failure. The authors thank the Unità Organizzativa Osservatorio Epidemiologico Regionale and ARIA S.p.A for providing data and technological support. The authors gratefully acknowledge the support from the Department of Mathematics of Politecnico di Milano, which facilitated this research as part of the department’s activities of “Dipartimento di Eccellenza 2023-2027.”

FUNDING

Chiara Masci acknowledges financial support from the Italian Ministry of University and Research (MUR) under the Department of Excellence 2023-2027 grant agreement ‘Centre of Excellence in Economics and Data Science’ (CEEDS).

CONFLICT OF INTEREST

None declared.

DATA AVAILABILITY

The Lombardy Region dataset analyzed in Section 4 contains sensitive information and cannot be shared due to privacy and confidentiality considerations.

REFERENCES

Adeghate E. A., Eid, Singh (2021). Mechanisms of COVID-19-induced heart failure: a short review. Heart Failure Reviews, 363–369.

Ammenwerth, Gräber, Herrmann, Bürkle, König (2003). Evaluation of health information systems—problems and challenges. International Journal of Medical Informatics, 125–135.

Bader, Manla, Atallah, Starling R. C. (2021). Heart failure and COVID-19. Heart Failure Reviews.

Banfield J. D., Raftery A. E. (1993). Model-based Gaussian and non-Gaussian clustering. Biometrics, 803.

Berta, Ingrassia, Punzo, Vittadini (2016). Multilevel cluster-weighted models for the evaluation of hospitals. Metron, 275–292.

Berta, Ingrassia, Vittadini, Spinelli (2024). Latent heterogeneity in COVID-19 hospitalisations: a cluster-weighted approach to analyse mortality. Australian & New Zealand Journal of Statistics.

Berta, Vinciotti (2019). Multilevel logistic cluster-weighted model for outcome evaluation in health care. Statistical Analysis and Data Mining: The ASA Data Science Journal, 434–443.

Biernacki, Celeux, Govaert (2003). Choosing starting values for the EM algorithm for getting the highest likelihood in multivariate Gaussian mixture models. Computational Statistics & Data Analysis, 561–575.

Celeux, Govaert (1992). A classification EM algorithm for clustering and two stochastic versions. Computational Statistics & Data Analysis, 315–332.

Celeux, Govaert (1995). Gaussian parsimonious clustering models. Pattern Recognition, 781–793.

Cheng, Levina, Wang, Zhu (2014). A sparse Ising model with covariates. Biometrics, 943–953.

Committee C.-C. W. P. et al. (2012). Statistical issues in assessing hospital performance. The Committee of the Presidents of Statistical Societies.

Corrao, Rea, Di Martino, De Palma, Scondotto, Fusco, et al. (2017). Developing and validating a novel multisource comorbidity score from administrative data: a large population-based cohort study from Italy. BMJ Open, e019503.

Dayton C. M., Macready G. B. (1988). Concomitant-variable latent-class models. Journal of the American Statistical Association, 173–178.

Gershenfeld (1997). Nonlinear inference and cluster-weighted modeling. Annals of the New York Academy of Sciences, 808–.

Ghosal, Mukherjee (2020). Joint estimation of parameters in Ising model. The Annals of Statistics, 785–810.

Ingrassia, Minotti S. C., Punzo (2014). Model-based clustering via linear cluster-weighted models. Computational Statistics & Data Analysis, 159–182.

Ingrassia, Minotti S. C., Vittadini (2012). Local statistical modeling via a cluster-weighted approach with elliptical distributions. Journal of Classification, 363–401.

Ingrassia, Punzo, Vittadini, Minotti S. C. (2015). The generalized linear mixed cluster-weighted model. Journal of Classification, –113.

Karlis, Xekalaki (2003). Choosing initial values for the EM algorithm for finite mixtures. Computational Statistics & Data Analysis, 577–590.

Kindermann, Snell J. L. (1980). Markov Random Fields and their Applications, Vol. Providence, RI: American Mathematical Society.

Murphy, Murphy T. B. (2020). Gaussian parsimonious clustering models with covariates and a noise component. Advances in Data Analysis and Classification, 293–325.

O’Hagan, Murphy T. B., Scrucca, Gormley I. C. (2019). Investigation of parameter uncertainty in clustering using a Gaussian mixture model via jackknife, bootstrap and weighted likelihood bootstrap. Computational Statistics, 1779–1813.

Rey J. R., Caro-Codón, Rosillo S. O., Iniesta Á. M., Castrejón-Castrejón, Marco-Clement, et al. (2020). Heart failure in COVID-19 patients: prevalence, incidence and prognostic implications. European Journal of Heart Failure, 2205–2215.

Romano P. S., Roost L. L., Jollis J. G. (1993). Further evidence concerning the use of a clinical comorbidity index with ICD-9-CM administrative data. Journal of Clinical Epidemiology, 1085–1090.

Schwarz (1978). Estimating the dimension of a model. The Annals of Statistics, 461–464.