Resources


Database Open Access

CGMacros: a scientific dataset for personalized nutrition and diet monitoring

Ricardo Gutierrez-Osuna, David Kerr, Bobak Mortazavi, Anurag Das

CGMacros contains information from two continuous glucose monitors (CGM), food macronutrients, food photographs, physical activity, and anonymized participant demographics, anthropometric measurements and health parameters.

diabetes, machine learning, continuous glucose monitors, obesity, postprandial glucose response, food macronutrients, metabolic models, food photographs, personalized nutrition

Published: Jan. 28, 2025. Version: 1.0.0


Database Contributor Review

COVID Data for Shared Learning (CDSL): A comprehensive, multimodal COVID-19 dataset from HM Hospitales

Álvaro Ritoré, Andreea M Oprescu, Alberto Estirado Bronchalo, Miguel Ángel Armengol de la Hoz

COVID Data for Shared Learning (CDSL) is a multimodal database comprising de-identified structured health data and radiological images from 4,479 patients with COVID-19, as a comprehensive toolkit for developing predictive models.

covid-19, multimodal database, radiological images, open data, healthcare data, machine learning and ai

Published: Oct. 25, 2024. Version: 1.0.0


Database Credentialed Access

MIMIC-IV

Alistair Johnson, Lucas Bulgarelli, Tom Pollard, Brian Gow, Benjamin Moody, Steven Horng, Leo Anthony Celi, Roger Mark

Large database of de-identified health information from patients admitted to Beth Israel Deaconess Medical Center

critical care, intensive care unit, machine learning, mimic

Published: Oct. 11, 2024. Version: 3.1


Database Open Access

VTaC: A Benchmark Dataset of Ventricular Tachycardia Alarms from ICU Monitors

Li-wei Lehman, Benjamin Moody, Lucas McCullum, Hasan Saeed, Harsh Deep, Diane Perry, Tristan Struja, Qiao Li, Gari Clifford, Roger Mark

VTaC is an annotated ventricular tachycardia (VT) arrhythmia alarm database containing over 5,000 waveform recordings with VT alarms from ICU monitors, with each alarm labeled as either true or false by at least two human expert annotators.

arrhythmia, machine learning, icu, false alarms, benchmark dataset, ventricular tachycardia

Published: Oct. 1, 2024. Version: 1.0


Database Credentialed Access

Comprehensive Polysomnography (CPS) Dataset: A Resource for Sleep-Related Arousal Research

Stefan Kraft, Andreas Theissler, Vera Wienhausen-Wilke, Philipp Walter, Gjergji Kasneci

This dataset includes polysomnographic sleep recordings from a study on sleep-related arousal diagnostics, featuring raw and derived data channels, annotated event types, and questionnaire data.

polysomnography, sleep disorders, machine learning in healthcare, sleep arousal diagnostics, pulse wave analysis

Published: Sept. 18, 2024. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-ECG-Ext-ICD: Diagnostic labels for MIMIC-IV-ECG

Nils Strodthoff, Juan Miguel Lopez Alcaraz, Wilhelm Haverkamp

Dataset that links ECG records from MIMIC-IV-ECG to ED discharge and hospital discharge diagnoses, enabling the training of general ECG prediction models based on clinical labels and facilitating the retrieval of further clinical metadata from MIMIC-IV.

machine learning, electrocardiography, mimic

Published: Aug. 30, 2024. Version: 1.0.1


Database Credentialed Access

EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images

Seongsu Bae, Daeun Kyung, Jaehee Ryu, Eunbyeol Cho, Gyubok Lee, Sunjun Kweon, Jungwoo Oh, Lei JI, Eric Chang, Tackeun Kim, Edward Choi

We present EHRXQA, the first multi-modal EHR QA dataset combining structured patient records with aligned chest X-ray images. EHRXQA contains a comprehensive set of QA pairs covering image-related, table-related, and image+table-related questions.

question answering, machine learning, electronic health records, evaluation, chest x-ray, multi-modal question answering, ehr question answering, semantic parsing, benchmark, deep learning, visual question answering

Published: July 23, 2024. Version: 1.0.0


Database Credentialed Access

MIMIC-CXR Database

Alistair Johnson, Tom Pollard, Roger Mark, Seth Berkowitz, Steven Horng

Chest radiographs in DICOM format with associated free-text reports.

computer vision, chest x-rays, natural language processing, machine learning, radiology, mimic

Published: July 23, 2024. Version: 2.1.0


Database Credentialed Access

MIMIC-Ext-MIMIC-CXR-VQA: A Complex, Diverse, And Large-Scale Visual Question Answering Dataset for Chest X-ray Images

Seongsu Bae, Daeun Kyung, Jaehee Ryu, Eunbyeol Cho, Gyubok Lee, Sunjun Kweon, Jungwoo Oh, Lei JI, Eric Chang, Tackeun Kim, Edward Choi

We introduce MIMIC-Ext-MIMIC-CXR-VQA, a complex, diverse, and large-scale dataset designed for Visual Question Answering (VQA) tasks within the medical domain, focusing primarily on chest radiographs.

question answering, machine learning, electronic health records, evaluation, chest x-ray, radiology, benchmark, multimodal, deep learning, visual question answering

Published: July 19, 2024. Version: 1.0.0


Challenge Credentialed Access

MIT Critical Datathon 2023: a MIMIC-IV Derived Dataset for Pulse Oximetry Correction Models

João Matos, Tristan Struja, David S Restrepo, Luis Filipe Nakayama, Jack Gallifant, Luca Weishaupt, Nikita Mullangi, Maria Loureiro, Skyler Shapiro, Adrien Carrel, Leo Anthony Celi

A SaO2-SpO2 Pairs Dataset derived from MIMIC-IV

pulse oximetry, health equity, machine learning

Published: May 8, 2023. Version: 1.0.0