Database Open Access

MIMIC-IV Clinical Database Demo on FHIR

Alex Bennett Hannes Ulrich Joshua Wiedekopf Piotr Szul John Grimes Alistair Johnson

Published: Aug. 27, 2025. Version: 2.1.0


When using this resource, please cite:
Bennett, A., Ulrich, H., Wiedekopf, J., Szul, P., Grimes, J., & Johnson, A. (2025). MIMIC-IV Clinical Database Demo on FHIR (version 2.1.0). PhysioNet. RRID:SCR_007345.

Additionally, please cite the original publication:

Bennett AM, Ulrich H, van Damme P, Wiedekopf J, Johnson AE. MIMIC-IV on FHIR: converting a decade of in-patient data into an exchangeable, interoperable format. Journal of the American Medical Informatics Association. 2023 Apr 1;30(4):718-25.

Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220. RRID:SCR_007345.

Abstract

Interoperability of healthcare data has become increasingly important given the growing deployment of data-driven algorithms in clinical settings. The Fast Healthcare Interoperability Resources (FHIR) standard has emerged as a promising mechanism to share healthcare data across vendors in real-time and batch settings. Real-world datasets available in FHIR would accelerate research and development of data-driven algorithms. Existing datasets in FHIR are primarily synthetic and cover a limited number of resources. To address this gap, we have reformatted the Medical Information Mart for Intensive Care (MIMIC)-IV Clinical Database Demo into FHIR. The MIMIC clinical databases have been widely adopted and the constituent data are well understood by the community. As much as possible, we adhered to the base resources with minimal extensions. Alongside the dataset, we publish openly available code allowing researchers to quickly build upon our work. Translating MIMIC-IV into FHIR provides a benchmark dataset for institutions to experiment with FHIR-based tools, and we hope this resource supports adoption and use of FHIR.


Background

The Fast Healthcare Interoperability Resources (FHIR) standard provides a framework for structuring health data and supporting data exchange amongst disparate systems and vendors. At the G7 discussion on Open Standards and Interoperability, FHIR was noted to be gaining wide traction in five of the seven countries with the remaining two looking to adopt FHIR in the near future [1]. In these countries, translation of legacy systems into FHIR will lead to significant research and development in FHIR. Large scale development of FHIR is often accelerated through testing with real-world data. However, access to patient data is restricted for security and privacy concerns.

The use of synthetic data is an intriguing option to bypass the security and privacy concerns of patient data. Synthetic data generation is an active research field, with one prominent FHIR based example being Synthea [2]. Synthea is a tool to create synthetic electronic health records representing patients with the most common demographic, disease distribution, and disease progressions found in the US. The datasets generated by Synthea cover the primary use cases in testing environments, but were not designed to include artefactual data or information derived from atypical workflows. Real-world data includes these necessary components to support development of robust applications that can handle inconsistent data and edge cases.

MIMIC-IV is a relational database corresponding to over 60,000 patients admitted to the Beth Israel Deaconess Medical Center (BIDMC) in Boston, MA [3]. MIMIC-IV has gained traction in the community due to its transparent mechanism of data access, reasonably large sample size, and authentic capture of a real-world electronic health record database. MIMIC-IV has been utilized in over 3000 publications, exploring retrospective analyses and research application development. Emergency department data for patients admitted to the BIDMC has also been published as MIMIC-IV-ED [4].

Recent work has focused on translation of MIMIC-IV into standard data models. A group based in Germany transformed a substantial portion of MIMIC-III and MIMIC-IV into FHIR to assist in research and development in the German FHIR context [5-6]. MIMIC-IV-on-FHIR Demo, this PhysioNet project, aims to translate MIMIC-IV into FHIR, preserve the MIMIC-IV structure and information, and provide an easily accessible FHIR dataset for use in research and development.


Methods

MIMIC-IV-on-FHIR aims to capture MIMIC-IV as is in the FHIR format. FHIR stores healthcare information in resources. Thus, the MIMIC-IV data tables were mapped to equivalent FHIR resources. The mapping process involved five steps:

  1. Terminology Generation. Capturing the MIMIC-IV terminology in FHIR was needed to retain the rich context MIMIC provides. FHIR stores terminology in two resources: CodeSystems and ValueSets. CodeSystems are the source for codes, while ValueSets are use-case-specific combinations of codes drawn from one or more CodeSystems. Codes were pulled from the MIMIC-IV tables and then converted to CodeSystems and ValueSets. Once created, the ValueSets are bound to data elements in resources. The FHIR server uses these bindings to validate that proper codes are assigned to resource data elements.

  2. Implementation Guide Creation. To provide a reproducible FHIR format for MIMIC, an implementation guide was made. A FHIR implementation guide is a collection of FHIR profiles and terminology aiming to achieve a task within a specific domain. FHIR profiles are modifications of the base FHIR resources. The terminology generated in the first step was used to bind elements in the profiles. Effectively, this adds a layer of validation ensuring that only the codes in the terminology system can be assigned to an element. The MIMIC implementation guide includes 22 profiles, 64 terminology resources, and 2 extensions. Where possible, the US Core R4 profiles were used as the basis for the MIMIC profiles [7]. The MIMIC implementation guide lays the framework for the later steps of mapping and validation.

  3. Mapping. The goal for mapping was to have as complete a picture of MIMIC-IV in FHIR as possible. Each column in MIMIC-IV was investigated to identify potential mappings from MIMIC-IV columns to FHIR resource elements. The final MIMIC-IV column to FHIR element mappings can be found in the MIMIC implementation guide. For MIMIC-IV columns without direct mappings into FHIR, extensions were made to house the information. Custom SQL scripts were used to facilitate the conversion from MIMIC-IV tables to a new schema, mimic_fhir. After mapping into FHIR, the resources are raw and must go through FHIR validation before they are ready for use and distribution.

  4. Validation. To ensure the mappings were consistent with the MIMIC implementation guide, FHIR validation was required. A FHIR server is both a store for FHIR resources and a service to validate, search, and export resources. A FHIR server was created with the MIMIC implementation guide applied to it, meaning any resources sent to the FHIR server would be validated against the guide. Validation can turn up issues in terminology binding, inter-resource referencing, or improper element mappings. The FHIR validation effectively provides unit testing for the correctness of the MIMIC to FHIR mappings.

  5. Export. To make mimic-fhir accessible, the resources needed to be exported to a common format. The NDJSON format is an ideal choice, as it is meant for delivering large structured data. Exporting the validated mimic-fhir resources to NDJSON required two main steps. First, the bulk export functionality of FHIR servers was leveraged to export all the resources. Second, the exported resources were written to the NDJSON format. At this stage the exported NDJSON files represent the full picture of MIMIC-IV-on-FHIR.
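
As a rough illustration of the terminology and validation steps above, the sketch below builds a minimal CodeSystem and ValueSet from a few rows and checks a coding against them. The rows, the http://example.org URLs, and the helper names are hypothetical and are not taken from the MIMIC implementation guide; the project itself generates terminology with SQL and validates resources on a FHIR server.

```python
# Illustrative (code, display) rows standing in for values pulled from a source table.
ROWS = [("0001", "Heart Rate"), ("0002", "Respiratory Rate")]

def build_code_system(url, rows):
    """Return a minimal FHIR CodeSystem resource as a dict."""
    return {
        "resourceType": "CodeSystem",
        "url": url,
        "status": "active",
        "content": "complete",
        "concept": [{"code": code, "display": display} for code, display in rows],
    }

def build_value_set(url, code_system_urls):
    """Return a minimal ValueSet that includes every code of the given CodeSystems."""
    return {
        "resourceType": "ValueSet",
        "url": url,
        "status": "active",
        "compose": {"include": [{"system": u} for u in code_system_urls]},
    }

def coding_is_valid(coding, value_set, code_systems):
    """Check that a coding's system is included in the ValueSet and its code is defined.

    code_systems maps a CodeSystem URL to its CodeSystem dict.
    """
    included = {inc["system"] for inc in value_set["compose"]["include"]}
    system = coding.get("system")
    if system not in included or system not in code_systems:
        return False
    known_codes = {c["code"] for c in code_systems[system]["concept"]}
    return coding.get("code") in known_codes

# Hypothetical URLs, used only for this example.
cs = build_code_system("http://example.org/CodeSystem/items", ROWS)
vs = build_value_set("http://example.org/ValueSet/items", [cs["url"]])
ok = coding_is_valid({"system": cs["url"], "code": "0001"}, vs, {cs["url"]: cs})
bad = coding_is_valid({"system": cs["url"], "code": "9999"}, vs, {cs["url"]: cs})
```

A real FHIR validator checks far more than this (cardinality, data types, references, display names); the sketch only shows why a bound ValueSet lets a server reject codes that are absent from the terminology.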

The methodology for creating the demo version of MIMIC-IV on FHIR exactly followed that of the full version, except that the source data were drawn from the MIMIC-IV Clinical Database Demo v2.2 [12] and the MIMIC-IV-ED Demo v2.2 [13]. As a result, the MIMIC-IV Clinical Database Demo on FHIR project is a 100-patient subset of the full MIMIC-IV on FHIR dataset [14]. See the MIMIC-IV Clinical Database Demo project page for more detail about the selection of the demo patients [12].


Data Description

The source datasets, MIMIC-IV and MIMIC-IV-ED, are distributed as a set of files intended for use within a relational database system. For example, the chartevents table in MIMIC-IV describes the recording of observations using coded integers (itemid), and users must reference the d_items table in order to understand what the observations are. Conversely, data records formatted following the FHIR standard are intended to be standalone: the observations are all provided with the corresponding descriptions. As a result, FHIR datasets tend to be larger as they contain redundant information.
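
To make the contrast concrete, the sketch below resolves an itemid against a lookup table and emits a standalone Observation-like record that carries its own description. The sample rows, the code system URL, the use of subject_id as the patient identifier, and the function name are all assumptions made for illustration; the actual element mappings are defined in the MIMIC implementation guide.

```python
# Hypothetical lookup in the spirit of d_items: itemid -> human-readable label.
D_ITEMS = {100: {"label": "Heart Rate"}}

# Hypothetical chartevents-style row; the column subset is simplified.
CHART_ROW = {
    "subject_id": 1,
    "itemid": 100,
    "charttime": "2180-07-23T14:00:00-04:00",
    "valuenum": 82.0,
    "valueuom": "bpm",
}

def to_observation(row, d_items, system="http://example.org/CodeSystem/items"):
    """Build a self-describing Observation dict from a coded row and its lookup."""
    item = d_items[row["itemid"]]
    return {
        "resourceType": "Observation",
        "status": "final",
        # The label travels with the record, so no join is needed to interpret it.
        "code": {
            "coding": [
                {"system": system, "code": str(row["itemid"]), "display": item["label"]}
            ]
        },
        "subject": {"reference": f"Patient/{row['subject_id']}"},
        "effectiveDateTime": row["charttime"],
        "valueQuantity": {"value": row["valuenum"], "unit": row["valueuom"]},
    }

observation = to_observation(CHART_ROW, D_ITEMS)
```

Because the display text is repeated in every record, the same label is stored once per observation rather than once per item, which is the redundancy described above.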

Every observation in the source dataset was converted into a resource, equivalent to a single record. All resources are prefixed with the term Mimic and adhere to the FHIR base profile while ensuring maximal data retention during the data transformation. Resources describing metadata for MIMIC-IV were included as necessary (e.g. the organization which provides care for MIMIC patients was added as a resource). Resources were serialized using JavaScript Object Notation (JSON).

Resources in the demo dataset include patient and organization resources (MimicOrganization, MimicLocation, MimicPatient, MimicEncounter, MimicEncounterED), observation resources which reference a specimen sampled from a patient (MimicObservationLabevents, MimicObservationMicroTest, MimicObservationMicroOrg, MimicObservationMicroSusc, and MimicSpecimen), medication resources (MimicMedication, MimicMedicationAdministration, MimicMedicationAdministrationICU, MimicMedicationDispense, MimicMedicationDispenseED, MimicMedicationStatementED, MimicMedicationRequest), charted observation resources (MimicObservationChartevents, MimicObservationDatetimeevents, MimicObservationOutputevents, MimicProcedureICU, MimicObservationED, and MimicObservationVitalSignsED), and resources related to billing (MimicCondition, MimicConditionED, MimicProcedure, MimicProcedureED). For simplicity, data records in this project are distributed in Newline Delimited JSON (NDJSON) files grouped according to the resource type.
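
Since each file holds one profile, a quick inventory of the dataset can be taken by counting the lines in every file. The following is a minimal sketch using only the Python standard library; the function names and the folder path in the usage comment are placeholders, and it assumes the files follow the <Profile>.ndjson.gz naming described above.

```python
import gzip
import json
from pathlib import Path

def iter_ndjson(path):
    """Yield one parsed resource per non-empty line of a gzipped NDJSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)

def count_resources(folder):
    """Map each profile name (taken from the file name) to its number of records."""
    suffix = ".ndjson.gz"
    counts = {}
    for path in sorted(Path(folder).glob("*" + suffix)):
        profile = path.name[: -len(suffix)]
        counts[profile] = sum(1 for _ in iter_ndjson(path))
    return counts

# Usage (the folder path is a placeholder):
# for profile, n in count_resources("/path/to/your/mimic-fhir/data").items():
#     print(f"{profile}: {n}")
```

Reading line by line through a generator keeps memory use low, which matters mostly for the larger files such as the charted observations.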

Resource types provided here mirror those in the full MIMIC-IV on FHIR dataset, and detailed descriptions of the data elements can be found on the MIMIC-IV on FHIR PhysioNet project page [14]. The data distributed within this project are a subset of the records available in the MIMIC-IV on FHIR project.


Usage Notes

NDJSON files are compressed with gzip and must be decompressed before use. For example, in Python:
import json
import gzip

with gzip.open('MimicPatient.ndjson.gz', 'rt') as f:
    patient_records = [json.loads(row) for row in f if row.strip()]

Each line within the NDJSON files corresponds to an individual data record. For example, the MimicPatient.ndjson.gz file contains all resources corresponding to the MimicPatient profile, including patient demographics. These demographics can be parsed using basic Python utilities:

import json
import gzip

def extract_demographics(patient):
    demographics = {
        'id': patient['id'],
        'gender': patient.get('gender'),
        'birth_date': patient.get('birthDate'),
        'marital_status': None,
        'race': None,
        'ethnicity': None
    }
    # Extract marital status
    marital = patient.get('maritalStatus', {}).get('coding', [])
    if marital:
        demographics['marital_status'] = marital[0].get('display') or marital[0].get('code')

    # Extract race and ethnicity from extensions
    for ext in patient.get('extension', []):
        url = ext.get('url', '')
        if 'race' in url:
            for race_ext in ext.get('extension', []):
                if race_ext.get('url') == 'text':
                    demographics['race'] = race_ext.get('valueString')
        elif 'ethnicity' in url:
            for eth_ext in ext.get('extension', []):
                if eth_ext.get('url') == 'text':
                    demographics['ethnicity'] = eth_ext.get('valueString')
    return demographics

def load_mimic_patient(patient_file):
    """Extract demographics from all patients"""
    data = []

    with gzip.open(patient_file, 'rt') as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # skip blank lines, which json.loads cannot parse
            patient = json.loads(line)
            data.append(extract_demographics(patient))
    return data

patient_file = "/path/to/your/mimic-fhir/data/MimicPatient.ndjson.gz"
patients = load_mimic_patient(patient_file)
print(f"Loaded {len(patients)} patients")
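
Records from other files can be tied back to a patient through FHIR references. The sketch below groups resources by the identifier found in subject.reference, assuming the standard "Patient/<id>" reference form; whether every profile in this dataset populates that element is an assumption, and the helper names and sample records are hypothetical.

```python
from collections import defaultdict

def reference_id(reference):
    """Return the trailing id of a reference such as 'Patient/abc', or None."""
    if not reference:
        return None
    return reference.rsplit("/", 1)[-1]

def group_by_patient(resources):
    """Group resource dicts by the patient id in their subject.reference."""
    grouped = defaultdict(list)
    for resource in resources:
        patient_id = reference_id(resource.get("subject", {}).get("reference"))
        if patient_id is not None:
            grouped[patient_id].append(resource)
    return dict(grouped)

# Hypothetical records standing in for parsed lines of an observation file.
sample = [
    {"id": "obs-1", "subject": {"reference": "Patient/p1"}},
    {"id": "obs-2", "subject": {"reference": "Patient/p2"}},
    {"id": "obs-3", "subject": {"reference": "Patient/p1"}},
    {"id": "obs-4"},  # no subject: skipped
]
by_patient = group_by_patient(sample)
```

The resulting keys can be matched against the 'id' values returned by the demographics extraction above to combine observations with patient attributes.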

An open source repository, MIMIC-FHIR, was created to store all the components needed to generate and use the mimic-fhir resources [8]. The repository allows for community discussion and collaboration on mimic-fhir. The scripts for building the MIMIC-IV-on-FHIR demo dataset are also archived in Zenodo [9]. A Jupyter notebook was developed to walk through the loading and usage of the mimic-fhir resources with the Pathling FHIR server [10, 11]. Pathling was used to demo the mimic-fhir resources due to its simple NDJSON loading and optimized analytic operations.


Release Notes

The MIMIC-IV on FHIR demo release notes follow those of the full MIMIC-IV on FHIR project as the resources are generated using the same software [8].

Version 2.1.0

The current release of MIMIC-IV Clinical Database Demo on FHIR is v2.1.0. This is a bug-fix release which corrects a number of issues across the dataset encountered during FHIR validation. This version sources data from the MIMIC-IV Clinical Database Demo v2.2 [12] and the MIMIC-IV-ED Demo v2.2 [13].

Validation fixes

  • Generated code system and value set resources from the mimic database (fhir_trm schema).
  • Removed unnecessary code systems and value sets.
  • Added emar_detail.product_unit and emar_detail.dose_given_unit as sources for the cs_unit code system.
  • Fixed display names for cs-diagnosis-icd9, cs-diagnosis-icd10, and cs-medication-etc code systems.
  • Corrected whitespace in pharmacy.frequency (ensured compliance with coding whitespace rules).
  • Updated dosageInstruction.timing in fhir_medication_dispense.sql (ensured inclusion of code or repeat elements).
  • Handled blank medication in medication_dispense generation (Treated blank medications as NULL values).
  • Produced NULL for blank lab_VALUE in fhir_observation_labevents.sql
  • Handled blank dose_given_unit in fhir_medication_administration.sql
  • Corrected display name for LOINC code: 2708-6 in fhir_observation_vitalsigns.sql
  • Changed extraction of quantity from pharmacy.fill_quantity, improving extraction logic for numerical values and units.
  • Handled NULL values in BP vital signs observations (ensured compliance with http://hl7.org/fhir/StructureDefinition/bp|4.0).
  • Changed JSON export format to 'csv', fixing backslash escaping issues.
  • Added null value checks, prevented creation of empty objects in fhir_medication_administration.sql and fhir_specimen.sql.
  • Removed array wrapping for medicationCodeableConcept (fhir_medication_request).
  • Renamed mimic-lab-priority to lab-priority (updated extension URL in fhir_observation_labevents).
  • Fixed unit codes for respiratory and heart rate (fhir_observation_vitalsigns).
  • Removed leading whitespace from display names (map_race_omb.sql).
  • Added 'emar_detail.product_unit' and 'emar_detail.dose_given_unit' as sources for the cs_unit code system to fix: Unknown code 'TPN Bag' in the CodeSystem 'http://mimic.mit.edu/fhir/mimic/CodeSystem/mimic-units' version '2.2.0', e.g. "dose": {"code": "TPN Bag", "unit": "TPN Bag", ... in MedicationAdministration.
  • Aligned coding display names in FHIR resources with the display names in the code system for cs-diagnosis-icd9, cs-diagnosis-icd10, and cs-medication-etc to fix 'Wrong Display Name XXX' errors, e.g. Wrong Display Name 'Acne Therapy Systemic - Tetracycline antibiotic' for http://mimic.mit.edu/fhir/mimic/CodeSystem/mimic-medication-etc#00005953. Valid display is 'Acne Therapy Systemic - Tetracyclines' (en) (for the language(s) '--').
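
Several of the fixes above concern whitespace in codes and blank values. The project implements them in SQL; the sketch below is only a Python stand-in for the same idea, with hypothetical function names. It relies on the pattern that the FHIR R4 specification gives for the code data type, which disallows leading, trailing, and repeated whitespace.

```python
import re

# Pattern for the FHIR R4 'code' data type: non-whitespace tokens separated by
# exactly one whitespace character, with nothing leading or trailing.
CODE_PATTERN = re.compile(r"[^\s]+(\s[^\s]+)*")

def is_valid_code(value):
    """Return True if the string satisfies the FHIR code pattern."""
    return value is not None and CODE_PATTERN.fullmatch(value) is not None

def normalize_code(value):
    """Collapse whitespace runs to single spaces and trim; blank input becomes None."""
    if value is None:
        return None
    cleaned = " ".join(value.split())
    return cleaned or None

# Illustrative inputs, not values taken from the dataset.
raw_values = ["  Q6H   PRN ", "DAILY", "   ", None]
cleaned_values = [normalize_code(v) for v in raw_values]
```

Mapping blank strings to None mirrors the release notes' approach of emitting NULL rather than an empty element, since an empty string would not pass validation as a code.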

Version 1.0

The first release of MIMIC-IV Clinical Database Demo on FHIR was v1.0. This release used data from the MIMIC-IV Clinical Database Demo v2.2 [12] and the MIMIC-IV-ED Demo v2.2 [13].


Ethics

This project builds upon the work of MIMIC-IV v2.0. MIMIC-IV is a collection of deidentified patient data from the Beth Israel Deaconess Medical Center. MIMIC-IV-on-FHIR approval is based on the original MIMIC-IV work being deidentified and approved for credentialed distribution.


Acknowledgements

The authors would like to thank those behind MIMIC-IV for making the data available and the FHIR community for support in answering questions. This work was supported by the Canadian Institutes of Health Research funding reference number 470397.


Conflicts of Interest

The authors declare no conflicts of interest.


References

  1. Department of Health & Social Care. G7 open standards and interoperability. London (UK): Crown copyright; 2021. Available from: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1045267/G7-open-standards-final-report.pdf
  2. Walonoski J, Kramer M, Nichols J, Quina A, Moesel C, Hall D, et al. Synthea: an approach, method, and software mechanism for generating synthetic patients and the Synthetic Electronic Health Care Record. J Am Med Inform Assoc. 2018 Jul 1;25(7):921. https://doi.org/10.1093/jamia/ocx147
  3. Johnson, A., Bulgarelli, L., Pollard, T., Horng, S., Celi, L. A., & Mark, R. (2023). MIMIC-IV (version 2.2). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/6mm1-ek67
  4. Johnson, A., Bulgarelli, L., Pollard, T., Celi, L. A., Mark, R., & Horng, S. (2023). MIMIC-IV-ED (version 2.2). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/5ntk-km72
  5. Ververs S, Ulrich H, Kock A-K, Ingenerf J. Konvertierung von MIMIC-III-Daten zu FHIR. In: Jahrestagung der Deutschen Gesellschaft für Medizinische Informatik, Biometrie und Epidemiologie e.V. (GMDs); 2018. https://doi.org/10.3205/18gmds018
  6. Ulrich H, Behrend P, Wiedekopf J, Drenkhahn C, Kock-Schoppenhauer A-K, Ingenerf J. Hands on the Medical Informatics Initiative Core Data Set — lessons learned from converting the MIMIC-IV. Stud Health Technol Inform. 2021;281:249–53. https://doi.org/10.3233/shti210549
  7. HL7 International. US Core HL7 FHIR Implementation Guide [Internet]. Available from: https://www.hl7.org/fhir/us/core/ [Accessed 2022 Jun 6].
  8. MIMIC-IV-on-FHIR Code on GitHub [Internet]. Available from: https://github.com/kind-lab/mimic-fhir [Accessed 2024 Oct 28].
  9. Bennett AM, Johnson AJ. kind-lab/mimic-fhir: MIMIC-IV-on-FHIR v2.1.0 (v2.1.0). Zenodo; 2022. https://doi.org/10.5281/zenodo.14003175
  10. MIMIC-FHIR Tutorial [Internet]. Available from: https://github.com/kind-lab/mimic-fhir/blob/main/tutorial/mimic-fhir-tutorial-pathling.ipynb [Accessed 2022 Jun 6].
  11. Pathling: Advanced FHIR Analytics Server [Internet]. Available from: https://pathling.csiro.au/ [Accessed 2022 Jun 6].
  12. Johnson, A., Bulgarelli, L., Pollard, T., Horng, S., Celi, L. A., & Mark, R. (2023). MIMIC-IV Clinical Database Demo (version 2.2). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/dp1f-ex47
  13. Johnson, A., Bulgarelli, L., Pollard, T., Celi, L. A., Horng, S., & Mark, R. (2023). MIMIC-IV-ED Demo (version 2.2). PhysioNet. RRID:SCR_007345. https://doi.org/10.13026/jzz5-vs76
  14. Bennett A, Wiedekopf J, Ulrich H, van Damme P, Szul P, Grimes J, Johnson A. MIMIC-IV on FHIR (version 2.1). PhysioNet; 2024. https://doi.org/10.13026/rrj1-ny66

Parent Projects
MIMIC-IV Clinical Database Demo on FHIR was derived from: Please cite them when using this project.
Access

Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.

License (for files):
Open Data Commons Open Database License v1.0

Versions
  • 2.0 - June 7, 2022
  • 2.1.0 - Aug. 27, 2025

Files

Total uncompressed size: 49.5 MB.

Folder: <base>/fhir

Name                                        Size      Modified
MimicCondition.ndjson.gz                    223.4 KB  2024-10-29
MimicConditionED.ndjson.gz                  33.4 KB   2024-10-29
MimicEncounter.ndjson.gz                    37.6 KB   2024-10-29
MimicEncounterED.ndjson.gz                  19.0 KB   2024-10-29
MimicEncounterICU.ndjson.gz                 15.2 KB   2024-10-29
MimicLocation.ndjson.gz                     1.4 KB    2024-10-29
MimicMedication.ndjson.gz                   82.2 KB   2024-10-29
MimicMedicationAdministration.ndjson.gz     2.1 MB    2024-10-29
MimicMedicationAdministrationICU.ndjson.gz  1.1 MB    2024-10-29
MimicMedicationDispense.ndjson.gz           1.0 MB    2024-10-29
MimicMedicationDispenseED.ndjson.gz         50.0 KB   2024-10-29
MimicMedicationMix.ndjson.gz                27.5 KB   2024-10-29
MimicMedicationRequest.ndjson.gz            1.3 MB    2024-10-29
MimicMedicationStatementED.ndjson.gz        148.2 KB  2024-10-29
MimicObservationChartevents.ndjson.gz       33.7 MB   2024-10-29
MimicObservationDatetimeevents.ndjson.gz    577.0 KB  2024-10-29
MimicObservationED.ndjson.gz                140.8 KB  2024-10-29
MimicObservationLabevents.ndjson.gz         7.4 MB    2024-10-29
MimicObservationMicroOrg.ndjson.gz          48.4 KB   2024-10-29
MimicObservationMicroSusc.ndjson.gz         52.8 KB   2024-10-29
MimicObservationMicroTest.ndjson.gz         153.0 KB  2024-10-29
MimicObservationOutputevents.ndjson.gz      375.9 KB  2024-10-29
MimicObservationVitalSignsED.ndjson.gz      289.6 KB  2024-10-29
MimicOrganization.ndjson.gz                 345 B     2024-10-29
MimicPatient.ndjson.gz                      6.1 KB    2024-10-29
MimicProcedure.ndjson.gz                    41.9 KB   2024-10-29
MimicProcedureED.ndjson.gz                  47.9 KB   2024-10-29
MimicProcedureICU.ndjson.gz                 75.5 KB   2024-10-29
MimicSpecimen.ndjson.gz                     58.6 KB   2024-10-29
MimicSpecimenLab.ndjson.gz                  413.7 KB  2024-10-29