Abstract

Objective

To improve problem list documentation and care quality.

Materials and methods

We developed algorithms that use structured data in the electronic health record (EHR) to infer clinical problems a patient has that are not recorded on the coded problem list, covering 12 clinically significant heart, lung, and blood diseases. We also developed a clinical decision support (CDS) intervention that suggests adding missing problems to the problem list. We evaluated the intervention in a randomized trial at 4 diverse healthcare systems using 3 different EHRs, with 3 predetermined outcome measures: alert acceptance, problem addition, and National Committee for Quality Assurance Healthcare Effectiveness Data and Information Set (NCQA HEDIS) clinical quality measures.

Results

There were 288 832 opportunities to add a problem in the intervention arm and the problem was added 63 777 times (acceptance rate 22.1%). The intervention arm had 4.6 times as many problems added as the control arm. There were no significant differences in any of the clinical quality measures.

Discussion

The CDS intervention was highly effective at improving problem list completeness. However, the improvement in problem list utilization was not associated with improvement in the quality measures. This lack of effect suggests that problem list documentation is not directly associated with improvements in quality as measured by NCQA HEDIS quality measures. However, improved problem list accuracy has other benefits, including supporting clinical care, patient comprehension of health conditions, accurate CDS, population health management, and research.

Conclusion

An EHR-embedded CDS intervention was effective at improving problem list completeness but was not associated with improvement in quality measures.

BACKGROUND AND SIGNIFICANCE

Weed1 first conceptualized the problem list in his landmark article “Medical Records that Guide and Teach”. He presciently envisioned how computerizing the problem list within electronic health records (EHRs) could improve care: “Since a complete and accurate list of problems should play a central part in the understanding and management of individual patients and groups of patients, storage of this portion of the medical record in the computer should receive high priority to give immediate access to the list of problems for care of the individual patient and for statistical study on groups of patients.”

A complete patient problem list is the cornerstone of Dr Weed’s vision of the problem-oriented medical record. It serves as a valuable tool for providers assessing a patient’s clinical status and succinctly communicates this information among providers. An accurate problem list supports problem-oriented charting and can also help guide the flow of a clinical encounter by reminding providers about important health issues to discuss or evaluate during the visit.

Complete problem lists are also important for high-quality clinical decision support (CDS). Considerable evidence exists that, when effectively designed, CDS tools can improve quality of care and patient outcomes.2–4 Effective CDS depends on a complete problem list, since many CDS rules require accurate, coded problem list entries.5 CDS systems that depend on the problem list have been developed for a wide range of purposes, including drug–problem interaction alerts,6,7 problem-based screening reminders,8 and management of chronic diseases.9–21

Problem lists are also often used for clinical research and quality measurement. Many clinical improvement and research investigations, including large genomic studies, are becoming increasingly dependent on EHR data collected during clinical care.22–24 For example, a genome-wide association study might correlate genomic data taken from many patients with phenotypic data derived from those patients’ EHRs.22,25 If EHR data are incomplete, this will lead to false negatives, which will cloud the accuracy of putative associations. To overcome this limitation, several projects, most notably the eMERGE initiative, are developing EHR phenotyping algorithms that attempt to make inferences about patients’ diagnoses (or problems), even when they are missing from the problem list.26,27

Further, the problem list is often used for quality measurements in EHRs, including those in the CMS meaningful use/promoting interoperability incentive program.28 For example, a measure of retinopathy screening for patients with diabetes may depend on accurate documentation of diabetes on the problem list. If diabetes is missing from the patient’s problem list, that patient may be excluded from the measure. Conversely, if a patient has diabetes on his or her problem list, but the diabetes is in remission (or was documented erroneously), the patient may be incorrectly included in the measure, even though he or she may not be appropriate for diabetic retinopathy screening. Precision of the diagnosis is also important: even if diabetes is on the problem list, but diabetic retinopathy is not, the measure may falsely include the patient as eligible for screening, or conversely, score the patient falsely as not meeting the measure because screening appears to not have occurred. A series of recent studies have identified frequent inaccuracies in EHR-derived quality measures due to incomplete data, including problem lists.29–38
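As an illustration of the dependence described above, the retinopathy-screening example can be sketched as a simple eligibility check. The function name and logic below are hypothetical simplifications for illustration, not an actual HEDIS specification.

```python
# Hypothetical, highly simplified sketch of problem-list-driven measure
# eligibility; real HEDIS specifications are far more detailed.

def in_retinopathy_screening_denominator(problem_list):
    """A patient enters the screening denominator only if diabetes is coded:
    a missing entry wrongly excludes the patient, and an erroneous entry
    wrongly includes the patient."""
    return "diabetes" in problem_list
```

The check makes the measurement hazard concrete: the measure never sees the underlying clinical truth, only the coded list.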

An accurate problem list has been associated with higher-quality care in observational studies. For example, in 2005, Hartung et al39 found that patients with congestive heart failure (CHF) on their problem list were more likely to receive angiotensin-converting enzyme inhibitors or angiotensin-II receptor blockers than CHF patients without CHF listed on their problem list. A more recent thematic analysis of the literature also identified several studies which posited links between quality and problem list usage.38 Users of the problem list perform better on National Committee for Quality Assurance Healthcare Effectiveness Data and Information Set (NCQA HEDIS) quality measures in women’s health, depression, colon cancer screening, and cancer prevention, outperforming nonusers by 3.3–9.6 percentage points on HEDIS measure group scores.40 Patients also increasingly view their own problem lists through patient portals,41 and those who view them report finding them helpful, so accurate and complete problem lists may also support patient comprehension.

Despite this importance, coded problem lists are often inaccurate, incomplete, cluttered, and out of date. In previous work, we showed that problem list completeness in 1 network ranged from 4.7% for renal insufficiency or failure, through 50.7% for hypertension and 61.9% for diabetes,42 to a maximum of 78.5% for breast cancer; other institutions have found similar results.43–45 Problem lists may also contain inaccurate, out-of-date, or duplicative entries, causing them to become long and distracting clinicians from important problems.

Problem list completeness and use also vary dramatically by institution. For a single problem (diabetes), we found that, among patients who met laboratory criteria for diabetes, the condition appeared on the problem list 60.2% of the time at the lowest-performing of 10 institutions and 99.4% of the time at the highest-performing institution.46

The causes of problem list incompleteness are myriad. In prior ethnographic work, we observed and interviewed 63 clinicians, and noted a “tragedy of the commons” occurring in many practice settings—providers reported that, frustrated with their incompleteness, they had stopped updating patient problem lists—this disuse then contributed to further decay of the problem list, causing other providers to also discontinue use.47

To improve problem list completeness, in prior work we developed a series of algorithms that identify problems missing from patient problem lists by analyzing other data in a patient’s EHR, including medications, laboratory results, and billing diagnoses. For example, if we detected a patient with an HbA1c of 9.2% who was on metformin, we inferred that the patient had diabetes and, if diabetes was missing from the problem list, alerted the healthcare provider through the EHR. In a single-site randomized trial, we showed a 300% increase in problem list additions for the 17 conditions for which we had developed algorithms.48
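A rule of the kind described above can be sketched as follows. The thresholds, field names, and medication check are illustrative assumptions, not the study's actual published algorithm logic.

```python
# Illustrative sketch of a problem-inference rule; thresholds and field
# names are assumptions for demonstration, not the published algorithms.

def infer_missing_diabetes(patient):
    """Return True when structured data suggest diabetes that is absent
    from the coded problem list (ie, an alert should fire)."""
    has_evidence = (
        patient.get("latest_hba1c", 0.0) >= 6.5           # diagnostic HbA1c level
        or "metformin" in patient.get("medications", [])  # diabetes medication
    )
    on_problem_list = "diabetes" in patient.get("problem_list", [])
    return has_evidence and not on_problem_list

# The example from the text: HbA1c of 9.2% on metformin, diabetes not listed
patient = {
    "latest_hba1c": 9.2,
    "medications": ["metformin", "lisinopril"],
    "problem_list": ["hypertension"],
}
```

For this patient the rule fires; once diabetes is coded (or the provider indicates the patient does not have it), the same rule evaluates to false and the alert stops.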

Based on this prior work, we formulated 2 hypotheses: first, that the results from our single-site study would transfer to additional diseases and institutions, with CDS alerts leading to increased completeness of the problem list; and second, that the CDS alerts would yield improvements in clinical quality, measured using EHR-based quality measures.

METHODS

With funding from the National Heart, Lung, and Blood Institute, we conducted a 4-site randomized trial of an intervention for improving problem list completeness. Table 1 lists the 4 sites, which were selected to provide a mix of vendor-developed EHRs, care settings, and clinical populations, and to be geographically diverse.

Table 1.

Sites

| Site | Location | EHR | Alert setting | Alert interruptive | Alert actionable | Alert repeats | Trigger | Intervention dates |
|---|---|---|---|---|---|---|---|---|
| MGB^a | Boston, MA | Epic | Outpatient | Yes | Yes | Yes, until acknowledged or resolved | Chart open | 6/7/2016–6/7/2017 |
| Holy Spirit Hospital | Camp Hill, PA | Allscripts Sunrise Acute | Inpatient | Yes | No | Yes, until resolved | Chart open | 8/4/2016–5/7/2017^b |
| Oregon Health and Science University | Portland, OR | Epic | Outpatient | No | Yes | Yes, until acknowledged or resolved | Navigator | 6/30/2016–6/30/2017 |
| Vanderbilt University Medical Center | Nashville, TN | Self-developed | Inpatient and outpatient | No | No | Yes, until acknowledged or resolved | Problem list activity | 7/18/2016–7/18/2017 |

^a Previously called Partners HealthCare.

^b Holy Spirit Hospital transitioned from the Allscripts Sunrise EHR to Epic during the intervention period and discontinued the intervention early.


System development

We developed and validated problem identification algorithms for 12 clinically significant heart, lung, and blood diseases:

  • Asthma

  • Atrial fibrillation

  • Chronic obstructive pulmonary disease (COPD)

  • Congestive heart failure (CHF)

  • Coronary artery disease (CAD)

  • Hyperlipidemia

  • Hypertension

  • Myocardial infarction (MI)

  • Sickle cell anemia

  • Sleep apnea

  • Stroke

  • Tuberculosis

These algorithms used a combination of laboratory results, medications, encounter diagnosis codes, and procedures to identify patients who are likely to have one of these conditions but do not have a relevant code on their problem list. The details of the algorithms are given in Supplementary Appendix S1, and their test characteristics, based on chart reviews at Mass General Brigham (MGB) clinics prior to implementation, are described in Supplementary Table S2. To maximize tolerability of the alerts, the algorithms were designed to favor positive predictive value (PPV) while still attaining acceptable sensitivity.

We then developed a CDS alert that fires during a clinical encounter and suggests adding the potentially missing problem to the problem list. Figure 1 shows an example alert, in Epic at MGB, for a patient who has an elevated B-type natriuretic peptide level, which suggests that the patient may have CHF49 but does not have CHF on the problem list. The alert suggests adding CHF, if appropriate, but also allows the user to indicate that the patient does not have CHF, which prevents the alert from firing again for that patient.

Figure 1.

Screenshot of the IQ-MAPLE intervention, for CHF, at MGB.

Each site used the same alerting logic and developed in-workflow CDS alerts to prompt users to add problems to the problem list. However, owing to differences in the clinical environments and EHRs at each site, they were free to tailor the presentation and workflow of the alert, following a standard framework for flexible EHR-based interventions.50 The flexible intervention framework is derived from the United Kingdom Medical Research Council framework, which focuses on standardizing the process and function of an intervention, rather than its form.51

Study design

Each site randomized providers (physicians, physician assistants, and nurse practitioners) to the intervention or control arm, using a random number generator. Each provider had an equal probability of being assigned to either arm. Providers in the intervention arm received the alert during clinical encounters. The alert was generated and logged for providers in the control arm; however, the alert was not actually shown to the user.
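The provider-level randomization described above amounts to an independent, equal-probability assignment per provider. A minimal sketch, in which the function name, seeding, and data shapes are assumptions for illustration:

```python
import random

def randomize_providers(provider_ids, seed=None):
    """Assign each provider independently to the intervention or control
    arm with equal probability (a simple coin flip per provider)."""
    rng = random.Random(seed)
    return {pid: rng.choice(["intervention", "control"]) for pid in provider_ids}

# With a fixed seed the allocation is reproducible for auditing.
assignments = randomize_providers(range(1000), seed=2016)
```

Randomizing at the provider level, rather than the patient level, avoids contaminating a provider's behavior across arms.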

Immediately after study completion, we extracted data on alert firing, alert acceptance, and problem list utilization at each site. We compared the rate of alert acceptance using a chi-squared test and the rate of problem list addition using Poisson regression.
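For the acceptance-rate comparison, a Pearson chi-squared test on a 2x2 table suffices. The following is a self-contained sketch, hand-rolled rather than calling a statistics library, with invented example counts; for 1 degree of freedom the chi-squared survival function reduces to erfc(sqrt(x/2)).

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared test (1 df, no continuity correction) for the
    2x2 table [[a, b], [c, d]]; returns (statistic, p_value)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 df
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

# Invented counts: accepted vs not accepted in two arms of 100 alerts each
stat, p = chi2_2x2(30, 70, 10, 90)
```

The Poisson regression used for problem-addition rates would ordinarily be fit with a statistics package rather than by hand.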

At one site (MGB), we also evaluated the effect of the alert on clinical quality measures. This analysis was done by condition and compared patients for whom an alert was generated and displayed in the intervention arm to patients for whom an alert was generated but not displayed in the control arm. The analysis was done on an intention-to-treat basis. For example, for the “LDL Testing” measure in the CAD condition, we calculated the proportion of patients with a low-density lipoprotein (LDL) test during the measurement period and compared it across the control and intervention arms. Proportions for each quality measure were compared using a chi-squared test, with a Bonferroni correction for multiple comparisons.

The study was registered with clinicaltrials.gov prior to patient accrual, trial identifier NCT02596087, and was approved by the institutional review boards of MGB, Holy Spirit Hospital, Oregon Health and Science University, and Vanderbilt University.

RESULTS

Alert acceptance

Table 2 shows the proportion of missing problems added, by condition and arm. Across all sites, there were 288 832 opportunities to add a problem in the intervention arm, and the problem was added 63 777 times (overall acceptance rate of 22.1%). To isolate the effect of the intervention, we also analyzed the set of patients for whom the alert would have been presented in the control arm (our system was designed to generate and log the alert in the control arm, but not actually show it to the user). In the control arm, there were 298 817 missing problems. Of these, the problem was spontaneously added (without an alert) 6881 times (2.3%). Comparing the 2 groups, the relative ratio of alert-driven problem addition in the intervention arm was 9.6 (P < .0001).
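The relative ratio reported above follows directly from the stated counts; the arithmetic can be checked in a few lines.

```python
# Reproducing the reported rates and their ratio from the counts above.

intervention_rate = 63_777 / 288_832  # problems added per opportunity, intervention arm
control_rate = 6_881 / 298_817        # spontaneous additions, control arm
relative_ratio = intervention_rate / control_rate
```

The two rates round to 22.1% and 2.3%, and their ratio rounds to the reported 9.6.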

Table 2.

Proportion of missing problems added, by condition and arm

| Condition | Control | Intervention | P |
|---|---|---|---|
| Asthma | 404/23 286 = 1.7% | 3164/19 309 = 16.4% | <.0001 |
| Atrial fibrillation | 173/9873 = 1.8% | 1562/9774 = 16.0% | <.0001 |
| COPD | 150/10 496 = 1.4% | 931/9004 = 10.3% | <.0001 |
| CHF | 88/15 197 = 0.6% | 1821/15 597 = 11.7% | <.0001 |
| CAD | 236/17 319 = 1.4% | 1654/15 261 = 10.8% | <.0001 |
| Hyperlipidemia | 3505/110 643 = 3.2% | 36 750/112 793 = 32.6% | <.0001 |
| Hypertension | 2082/79 358 = 2.6% | 14 463/79 401 = 18.2% | <.0001 |
| Myocardial infarction | 28/9650 = 0.3% | 825/8912 = 9.3% | <.0001 |
| Sickle cell | 16/754 = 2.1% | 136/729 = 18.7% | <.0001 |
| Sleep apnea | 93/13 228 = 0.7% | 1417/10 712 = 13.2% | <.0001 |
| Stroke | 78/7962 = 1.0% | 812/6347 = 12.8% | <.0001 |
| Tuberculosis | 28/1051 = 2.7% | 242/993 = 24.4% | <.0001 |
| Total | 6881/298 817 = 2.3% | 63 777/288 832 = 22.1% | <.0001 |

Problem addition

We calculated the total number of study problems added across all conditions and all sites, by arm (Figure 2). In the control arm, 16 132 problems were added in the preintervention period (1 year) and 15 007 were added in the postintervention period (1 year). In the intervention arm, 17 655 problems were added in the preintervention period and 75 088 were added in the postintervention period. Adjusting for baseline differences, the intervention arm had 4.6 times as many problems added as the control arm (P < .0001).
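One way to read the "adjusting for baseline differences" comparison is as a ratio of post-to-pre ratios across arms. The arithmetic below uses the counts from the paragraph above; the study's actual adjustment model may differ, but this reproduces the 4.6-fold figure.

```python
# Ratio-of-ratios on the reported counts: how much more did additions grow
# in the intervention arm than in the control arm?

pre_control, post_control = 16_132, 15_007
pre_intervention, post_intervention = 17_655, 75_088

adjusted_ratio = (post_intervention / pre_intervention) / (post_control / pre_control)
```

The control arm's additions actually declined slightly year over year, which is why the adjusted ratio (4.6) exceeds the raw intervention-arm growth alone would suggest relative to a flat baseline.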

Figure 2.

Number of problems added, by arm and period.

Quality measures

Finally, we evaluated the effect of the intervention on a predetermined set of HEDIS quality measures at one site (MGB). There were no differences in quality measures between the 2 groups (Table 3). Certain HEDIS measures (such as antihyperlipidemic medicines) applied to multiple clinical conditions and were evaluated for each condition. Of the 17 condition-measure combinations, only one had a statistically significant difference; however, after Bonferroni adjustment (for multiple hypothesis testing), this difference was no longer significant (ie, at the 0.05 level, with 17 comparisons, we would expect approximately 1 false positive).
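The Bonferroni reasoning above can be checked directly: with 17 comparisons, the per-test significance threshold falls well below the smallest observed P value.

```python
# Bonferroni adjustment for the 17 condition-measure comparisons.

alpha, n_tests = 0.05, 17
per_test_threshold = alpha / n_tests  # approximately 0.0029
smallest_p = 0.030                    # CAD anti-HLD meds, the only nominal hit
still_significant = smallest_p < per_test_threshold
```

At an unadjusted 0.05 level across 17 tests, roughly one false positive is expected by chance, which is consistent with the single nominal finding.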

Table 3.

Clinical outcomes

| Condition | Measure | Control | Intervention | P |
|---|---|---|---|---|
| CAD | Anti-HLD Meds | 514/768 = 66.9% | 545/755 = 72.2% | .030 |
| CAD | Anti-platelet Meds | 820/1105 = 74.2% | 846/1130 = 74.9% | .757 |
| CAD | BP Control | 289/371 = 77.9% | 287/381 = 75.3% | .456 |
| CAD | LDL Control | 369/768 = 48.0% | 344/755 = 45.6% | .358 |
| CAD | LDL Testing | 459/768 = 59.8% | 438/755 = 58.0% | .520 |
| Hyperlipidemia (HLD) | Anti-HLD Meds | 24 018/28 488 = 84.3% | 26 472/31 355 = 84.4% | .701 |
| Hyperlipidemia (HLD) | LDL Control | 7412/28 488 = 26.0% | 8134/31 355 = 25.9% | .839 |
| Hyperlipidemia (HLD) | LDL Testing | 10 583/28 488 = 37.1% | 11 738/31 355 = 37.4% | .474 |
| Hypertension (HTN) | Anti-HTN Meds | 5959/7920 = 75.2% | 6598/8684 = 76.0% | .276 |
| Hypertension (HTN) | BP Control | 4983/7919 = 62.9% | 5388/8684 = 62.0% | .249 |
| MI | Anti-HLD Meds | 284/457 = 62.1% | 298/486 = 61.3% | .846 |
| MI | Anti-platelet Meds | 572/755 = 75.8% | 642/831 = 77.3% | .521 |
| MI | LDL Control | 181/457 = 39.6% | 179/486 = 36.8% | .418 |
| MI | LDL Testing | 234/457 = 51.2% | 237/486 = 48.8% | .494 |
| Stroke | Anti-HLD Meds | 218/408 = 53.4% | 288/480 = 60.0% | .057 |
| Stroke | Anti-platelet Meds | 348/614 = 56.7% | 444/726 = 61.2% | .108 |

DISCUSSION

Our results demonstrate that the IQ-MAPLE intervention for problem list improvement was effective at increasing problem list documentation, across conditions and at multiple sites. The overall acceptance rate (22.1%) is higher than those reported in many other published studies of CDS52–54; however, this still means that, for most alerts displayed (77.9%), the provider did not act. We did not have an a priori goal for the alert acceptance rate; however, subsequent to this trial, Vanderbilt University Medical Center (VUMC) adopted a goal of 30% acceptance for interruptive alerts. Some of the alerts were likely false positives; however, given the high PPV of our alerts identified during early testing, we expected more of them to be accepted. The causes of nonacceptance are likely multifactorial: providers may not have read the alerts, or may not have thought it was their responsibility, or important, to add the problem to the problem list. Further, providers receive many other types of alerts in the EHR and override many of them; these competing alerts may have contributed to alert fatigue, distracting providers from our problem list alerts. Based on these findings, organizations could expect that alerts such as ours would increase problem documentation significantly; however, if a higher rate of problem list completeness is required, additional strategies are necessary. For example, in a separate study, we found that an intervention in which residents were paid $1.45 per chart to review patient records and confirm whether a patient had a splenectomy (from a list generated by a splenectomy-detection algorithm we developed) was at least twice as effective as a point-of-care alert.55

The overall impact of the alert on problem documentation was strong—leading to a 4.6-fold increase in the number of problems added in the intervention arm. This increase is statistically significant; however, it is likely that there are still many patients who have the problems of interest missing from their problem list, again suggesting a need for alternative strategies for problem list addition.

Our findings confirmed our hypothesis that the IQ-MAPLE intervention would lead to an increase in problem list additions. However, we further hypothesized that the intervention would also lead to an increase in measurable clinical quality. Our analysis suggests that, at least as measured by HEDIS, the intervention did not yield an increase in quality. There are several possible explanations. First, even after the intervention, many patients still had problem list gaps. Second, HEDIS measures may not accurately reflect the true quality of care provided.56 Finally, the link between problem list documentation and quality may, in fact, not be very strong. MGB has CDS related to several of the HEDIS measures, and that CDS uses the problem list, as well as relevant laboratory results and other clinical data, to make recommendations. If problem list usage increases, this CDS may recognize more patients with the relevant problems and offer more alerts. However, the downstream CDS at MGB had a relatively low acceptance rate, attenuating the possible causal chain from the problem list alerts through downstream CDS to better quality. Some of the measures carry financial incentives, and population managers sometimes review gaps for those measures; this could be an alternate mechanism for improvement, although no improvement was found for the metrics studied.

Strengths and limitations

Our study has several strengths. It is the first multi-site, randomized study of problem list alerts conducted using a variety of EHRs, and the intervention was relatively effective. It also had some important limitations. First, we did not assess whether the problems added from the alerts were accurate; we believe that they largely were, but there may have been some false positives. Second, although we had 4 sites, there were many differences in implementation strategy at each site, which meant we could not assess whether specific intervention characteristics (eg, inpatient vs outpatient, interruptive vs noninterruptive, and actionable vs nonactionable) made a difference. More sites, a longer period of data collection, and a random allocation of alert features would be needed to draw these conclusions. Third, our data sharing plan did not include the sharing of additional baseline data about the rate of problem list utilization or number of encounters at each site, so we could not make direct comparisons beyond alert volume, acceptance, and the number of problems added at each site. However, since this study was a randomized controlled trial, we expect these to be well balanced across arms at each site. Fourth, we focused in this study only on adding missing problems, not on removing inaccurate, unimportant, or redundant problems; problem lists frequently contain inaccurate or out-of-date entries, so this would be a useful topic for future research. Fifth, our study looked only at structured data already in the medical record to infer problems; by using, for example, natural language processing on notes, optical character recognition of scanned documents, and integration of external data (eg, through a health information exchange), additional problems could potentially be identified, possibly with higher specificity.
Finally, more fully automated strategies, such as automatic creation and maintenance of the problem list, could also be explored with the goal of reducing the overhead of problem list maintenance for clinicians; such a strategy would need to be weighed against possible issues with accuracy, as well as the possibility that clinician curation of the problem list has benefits as the clinician thinks through a patient’s problems.

CONCLUSIONS

Conducting randomized, multi-site CDS interventions using different EHRs, with different clinical workflows, upgrade cycles, patient populations, CDS governance committees, and abilities to configure local CDS features, is challenging. An EHR-embedded CDS intervention was effective at improving problem list completeness but was not associated with improvements in quality measures. The problem inference algorithms developed may have additional uses, such as improving the accuracy of clinical quality measures or EHR-based phenotyping, locally or in networks such as eMERGE.

FUNDING

Research reported in this publication was supported by the National Heart, Lung, and Blood Institute of the National Institutes of Health under Award Number 1R01HL122225. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.

AUTHOR CONTRIBUTIONS

Dr A.W. and Dr D.F.S. made substantial contributions to the conception and design of the work. All authors assisted in the acquisition, analysis, and interpretation of data. Dr A.W. wrote the initial draft, and all authors revised it critically for important intellectual content. All authors had final approval of the version to be published, and all agree to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

SUPPLEMENTARY MATERIAL

Supplementary material is available at Journal of the American Medical Informatics Association online.

ACKNOWLEDGEMENTS

The authors express their gratitude to both Calvin Beidleman and Kevin Peters for their instrumental assistance with many technical details, including alert programming and randomization of participants.

CONFLICT OF INTEREST STATEMENT

None declared.

DATA AVAILABILITY

The data underlying this article will be shared on reasonable request to the corresponding author.

REFERENCES

1. Weed LL. Medical records that guide and teach. N Engl J Med 1968;278(11):593–600.
2. Garg AX, Adhikari NKJ, McDonald H, et al. Effects of computerized clinical decision support systems on practitioner performance and patient outcomes. JAMA 2005;293(10):1223–38.
3. Bright TJ, Wong A, Dhurjati R, et al. Effect of clinical decision-support systems: a systematic review. Ann Intern Med 2012;157(1):29–43.
4. Kawamoto K, Houlihan CA, Balas EA, Lobach DF. Improving clinical practice using clinical decision support systems: a systematic review of trials to identify features critical to success. BMJ 2005;330(7494):765.
5. Wright A, Goldberg H, Hongsermeier T, Middleton B. A description and functional taxonomy of rule-based decision support content at a large integrated delivery network. J Am Med Inform Assoc 2007;14(4):489–96.
6. Kuperman GJ, Bobb A, Payne TH, et al. Medication-related clinical decision support in computerized provider order entry systems: a review. J Am Med Inform Assoc 2007;14(1):29–40.
7. Bates DW, Teich JM, Lee J, et al. The impact of computerized physician order entry on medication error prevention. J Am Med Inform Assoc 1999;6(4):313–21.
8. Sequist TD, Gandhi TK, Karson AS, et al. A randomized trial of electronic clinical reminders to improve quality of care for diabetes and coronary artery disease. J Am Med Inform Assoc 2005;12(4):431–7.
9. Ebrahiminia V, Riou C, Seroussi B, et al. Design of a decision support system for chronic diseases coupling generic therapeutic algorithms with guideline-based specific rules. Stud Health Technol Inform 2006;124:483–8.
10. Fricton J, Rindal DB, Rush W, et al. The effect of electronic health records on the use of clinical care guidelines for patients with medically complex conditions. J Am Dent Assoc 2011;142(10):1133–42.
11. Graham TA, Kushniruk AW, Bullard MJ, Holroyd BR, Meurer DP, Rowe BH. How usability of a web-based clinical decision support system has the potential to contribute to adverse medical events. In: AMIA Annual Symposium Proceedings; 2008:257–61; Washington, DC.
12. Guzek J, Guzek S, Murphy K, Gallacher P, Lesneski C. Improving diabetes care using a multitiered quality improvement model. Am J Med Qual 2009;24(6):505–11.
13. Hetlevik I, Holmen J, Kruger O, Kristensen P, Iversen H, Furuseth K. Implementing clinical guidelines in the treatment of diabetes mellitus in general practice. Evaluation of effort, process, and patient outcome related to implementation of a computer-based decision support system. Int J Technol Assess Health Care 2000;16(1):210–27.
14. Jean-Jacques M, Persell SD, Thompson JA, Hasnain-Wynia R, Baker DW. Changes in disparities following the implementation of a health information technology-supported quality improvement initiative. J Gen Intern Med 2012;27(1):71–7.
15. Jeffery R, Iserman E, Haynes RB. Can computerized clinical decision support systems improve diabetes management? A systematic review and meta-analysis. Diabet Med 2012;30(6):739–45.
16. Lipton JA, Barendse RJ, Akkerhuis KM, Schinkel AFL, Simoons ML. Evaluation of a clinical decision support system for glucose control: impact of protocol modifications on compliance and achievement of glycemic targets. Crit Pathw Cardiol 2010;9(3):140–7.
17. Munoz M, Pronovost P, Dintzis J, et al. Implementing and evaluating a multicomponent inpatient diabetes management program: putting research into practice. Jt Comm J Qual Patient Saf 2012;38(5):195–206.
18. Rodbard D, Vigersky RA. Design of a decision support system to help clinicians manage glycemia in patients with type 2 diabetes mellitus. J Diabetes Sci Technol 2011;5(2):402–11.
19. Souza NM, Sebaldt RJ, Mackay JA, et al.; the CCDSS Systematic Review Team. Computerized clinical decision support systems for primary preventive care: a decision-maker-researcher partnership systematic review of effects on process of care and patient outcomes. Implement Sci 2011;6(1):87.
20. Swenson CJ, Appel A, Sheehan M, et al. Using information technology to improve adult immunization delivery in an integrated urban health system. Jt Comm J Qual Patient Saf 2012;38(1):15–23.
21. Toh MP, Leong HS, Lim BK. Development of a diabetes registry to improve quality of care in the National Healthcare Group in Singapore. Ann Acad Med Singap 2009;38(6):546.
22. Gerhard GS, Langer RD, Carey DJ, Stewart WF. Electronic medical records in genomic medicine practice and research. Essent Genomic Personalized Med 2009:142–50.
23. Denny JC, Ritchie MD, Basford MA, et al. PheWAS: demonstrating the feasibility of a phenome-wide scan to discover gene–disease associations. Bioinformatics 2010;26(9):1205–10.
24. Wright A, McGlinchey EA, Poon EG, Jenter CA, Bates DW, Simon SR. Ability to generate patient registries among practices with and without electronic health records. J Med Internet Res 2009;11(3):e31.
25. Song W, Huang H, Zhang CZ, Bates DW, Wright A. Using whole genome scores to compare three clinical phenotyping methods in complex diseases. Sci Rep 2018;8(1):11360.
26. McCarty CA, Chisholm RL, Chute CG, et al.; eMERGE Team. The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Med Genomics 2011;4(1):13.
27. Pathak J, Wang J, Kashyap S, et al. Mapping clinical phenotype data elements to standardized metadata repositories and controlled terminologies: the eMERGE Network experience. J Am Med Inform Assoc 2011;18(4):376–86.
28. Blumenthal D, Tavenner M. The “meaningful use” regulation for electronic health records. N Engl J Med 2010;363(6):501–4.
29. Kern LM, Malhotra S, Barron Y, et al. Accuracy of electronically reported “meaningful use” clinical quality measures: a cross-sectional study. Ann Intern Med 2013;158(2):77–83.
30. Parsons A, McCullough C, Wang J, Shih S. Validity of electronic health record-derived quality measurement for performance monitoring. J Am Med Inform Assoc 2012;19(4):604–9.
31. Kerr EA, Smith DM, Hogan MM, et al. Comparing clinical automated, medical record, and hybrid data sources for diabetes quality measures. Jt Comm J Qual Improv 2002;28(10):555–65.
32. Chan KS, Fowles JB, Weiner JP. Electronic health records and the reliability and validity of quality measures: a review of the literature. Med Care Res Rev 2010;67(5):503–27.
33. Persell SD, Wright JM, Thompson JA, Kmetik KS, Baker DW. Assessing the validity of national quality measures for coronary artery disease using an electronic health record. Arch Intern Med 2006;166(20):2272–7.
34. D'Amore JD, McCrary LK, Denson J, et al. Clinical data sharing improves quality measurement and patient safety. J Am Med Inform Assoc 2021;28(7):1534–42.
35. Cholan RA, Weiskopf NG, Rhoton DL, et al. Specifications of clinical quality measures and value set vocabularies shift over time: a study of change through implementation differences. AMIA Annu Symp Proc 2017;2017:575–84.
36. Cholan RA, Weiskopf NG, Rhoton D, et al. From concepts and codes to healthcare quality measurement: understanding variations in value set vocabularies for a statin therapy clinical quality measure. EGEMS (Wash DC) 2017;5(1):19.
37. Colin NV, Cholan RA, Sachdeva B, Nealy BE, Parchman ML, Door DA. Understanding the impact of variations in measurement period reporting for electronic clinical quality measures. EGEMS (Wash DC) 2018;6(1):17.
38. Hodge CM, Narus SP. Electronic problem lists: a thematic analysis of a systematic literature review to identify aspects critical to success. J Am Med Inform Assoc 2018;25(5):603–13.
39. Hartung DM, Hunt J, Siemienczuk J, Miller H, Touchette DR. Clinical implications of an accurate problem list on heart failure treatment. J Gen Intern Med 2005;20(2):143–7.
40. Poon EG, Wright A, Simon SR, et al. Relationship between use of electronic health record features and health care quality: results of a statewide survey. Med Care 2010;48(3):203–9.
41. Wright A, Feblowitz J, Maloney FL, et al. Increasing patient engagement: patients’ responses to viewing problem lists online. Appl Clin Inform 2014;5(4):930–42.
42. Wright A, Pang J, Feblowitz J, et al. A method and knowledge base for automated inference of patient problems from structured data in an electronic medical record. J Am Med Inform Assoc 2011;18(6):859–67.
43. Szeto HC, Coleman RK, Gholami P, Hoffman BB, Goldstein MK. Accuracy of computerized outpatient diagnoses in a Veterans Affairs general medicine clinic. Am J Manag Care 2002;8(1):37–43.
44. Tang PC, LaRosa MP, Gorden SM. Use of computer-based records, completeness of documentation, and appropriateness of documented clinical decisions. J Am Med Inform Assoc 1999;6(3):245–51.
45. Kaplan DM. Clear writing, clear thinking and the disappearing art of the problem list. J Hosp Med 2007;2(4):199–202.
46. Wright A, McCoy AB, Hickman TT, et al. Problem list completeness in electronic health records: a multi-site study and assessment of success factors. Int J Med Inform 2015;84(10):784–90.
47. Wright A, Maloney FL, Feblowitz JC. Clinician attitudes toward and use of electronic problem lists: a thematic analysis. BMC Med Inform Decis Mak 2011;11:36.
48. Wright A, Pang J, Feblowitz JC, et al. Improving completeness of electronic problem lists through clinical decision support: a randomized, controlled trial. J Am Med Inform Assoc 2012;19(4):555–61.
49. Mayo DD, Colletti JE, Kuo DC. Brain natriuretic peptide (BNP) testing in the emergency department. J Emerg Med 2006;31(2):201–10.
50. Wright A, McCoy AB, Choudhry NK. Recommendations for the conduct and reporting of research involving flexible electronic health record-based interventions. Ann Intern Med 2020;172(11 Suppl):S110–15.
51. Craig P, Dieppe P, Macintyre S, et al.; Medical Research Council Guidance. Developing and evaluating complex interventions: the new Medical Research Council guidance. BMJ 2008;337:a1655.
52. Zenziper Straichman Y, Kurnik D, Matok I, et al. Prescriber response to computerized drug alerts for electronic prescriptions among hospitalized patients. Int J Med Inform 2017;107:70–5.
53. Wright A, Aaron S, Seger DL, Samal L, Schiff GD, Bates DW. Reduced effectiveness of interruptive drug–drug interaction alerts after conversion to a commercial electronic health record. J Gen Intern Med 2018;33(11):1868–76.
54. Kwan JL, Lo L, Ferguson J, et al. Computerised clinical decision support systems and absolute improvements in care: meta-analysis of controlled clinical trials. BMJ 2020;370:m3216.
55. McEvoy D, Gandhi TK, Turchin A, Wright A. Enhancing problem list documentation in electronic health records using two methods: the example of prior splenectomy. BMJ Qual Saf 2018;27(1):40–7.
56. Campbell SM, Roland MO, Buetow SA. Defining quality of care. Soc Sci Med 2000;51(11):1611–25.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.