Resources


Database Credentialed Access

A Temporal Dataset for Respiratory Support in Critically Ill Patients

Mira Moukheiber, Lama Moukheiber, Dana Moukheiber, Sicheng Hao, Leo Anthony Celi, Hyung-Chul Lee

A benchmark dataset offering hourly records over a 90-day period for 50,920 ICU subjects, including dynamic pulmonary function data and a spectrum of covariates for respiratory intervention analyses.

observational data time-series

Published: May 31, 2024. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext Clinical Decision Making: A MIMIC-IV Derived Dataset for Evaluation of Large Language Models on the Task of Clinical Decision Making for Abdominal Pathologies

Paul Hager, Friederike Jungmann, Daniel Rueckert

A curated set of emergency department (ED) clinical decision-making cases for four abdominal pathologies. Each case contains the information required for diagnosis, including the history of present illness (HPI), physical examination, laboratory tests, and imaging. Relevant treatment information is also included.

clinical decision making emergency room diagnosis abdominal pathologies large language models treatment plan

Published: May 17, 2024. Version: 1.0


Database Restricted Access

DREAMT: Dataset for Real-time sleep stage EstimAtion using Multisensor wearable Technology

Ke Wang, Jiamu Yang, Ayush Shetty, Jessilyn Dunn

Dataset for Real-time sleep stage EstimAtion using Multisensor wearable Technology

wearable biomedical sleep disorders time series classification

Published: April 30, 2024. Version: 1.0.0


Database Credentialed Access

Medical Expert Annotations of Unsupported Facts in Doctor-Written and LLM-Generated Patient Summaries

Stefan Hegselmann, Shannon Shen, Florian Gierse, Monica Agrawal, David Sontag, Xiaoyi Jiang

Annotations for unsupported facts in 100 original MIMIC patient summaries (discharge instructions) and hallucinations in 100 Large Language Model (LLM) generated patient summaries labeled by two medical experts.

Published: April 28, 2024. Version: 1.0.0


Database Contributor Review

CARMEN-I: A resource of anonymized electronic health records in Spanish and Catalan for training and testing NLP tools

Eulalia Farre Maduell, Salvador Lima-Lopez, Santiago Andres Frid, Artur Conesa, Elisa Asensio, Antonio Lopez-Rueda, Helena Arino, Elena Calvo, Maria Jesús Bertran, Maria Angeles Marcos, Montserrat Nofre Maiz, Laura Tañá Velasco, Antonia Marti, Ricardo Farreres, Xavier Pastor, Xavier Borrat Frigola, Martin Krallinger

CARMEN-I is a Spanish corpus of 2,000 clinical records from Hospital Clínic, Barcelona. It covers COVID-19 patients and comorbidities, serving as a resource for training clinical NLP models and for researchers applying NLP to clinical documents.

de-identification clinical ner anonymization

Published: April 20, 2024. Version: 1.0.1


Database Credentialed Access

EHRNoteQA: A Patient-Specific Question Answering Benchmark for Evaluating Large Language Models in Clinical Settings

Sunjun Kweon, Jiyoun Kim, Heeyoung Kwak, Dongchul Cha, Hangyul Yoon, Kwang Hyun Kim, Seunghyun Won, Edward Choi

A patient-specific question answering benchmark tailored for evaluating Large Language Models (LLMs) in clinical environments.

Published: April 3, 2024. Version: 1.0.0


Database Credentialed Access

RaDialog Instruct Dataset

Chantal Pellegrini, Ege Özsoy, Benjamin Busam, Nassir Navab, Matthias Keicher

Image-based instruct data for Chest X-Ray understanding and analysis.

radiology report generation large vision-language models medical image understanding radiology assistant radiology chatbot

Published: March 25, 2024. Version: 1.0.0


Database Open Access

PADS - Parkinsons Disease Smartwatch dataset

Julian Varghese, Alexander Brenner, Lucas Plagwitz, Catharina van Alen, Michael Fujarski, Tobias Warnecke

The PADS dataset contains smartwatch-based records from interactive neurological assessments of Parkinson's disease patients, differential diagnoses, and healthy controls. The data are complemented with non-motor symptoms and medical history information.

wearables movement disorders parkinsons disease

Published: March 25, 2024. Version: 1.0.0


Database Open Access

ScientISST MOVE: Annotated Wearable Multimodal Biosignals recorded during Everyday Life Activities in Naturalistic Environments

João Areias Saraiva, Mariana Abreu, Ana Sofia Carmo, Hugo Plácido da Silva, Ana Fred

Multimodal (ECG, EMG, EDA, PPG, TEMP, ACC) biosignal dataset of everyday activities. Created with 3 wearable devices based on ScientISST Sense and Empatica E4.

multimodal greet lift uncontrolled environments run jump gesticulate walk wearable

Published: March 25, 2024. Version: 1.0.1


Database Open Access

Respiratory and heart rate monitoring dataset from aeration study

Ella Frances Sophia Guy, Isaac Flett, Jaimey Anne Clifton, Trudy Caljé-van der Klei, Rongqing Chen, Jennifer Knopp, Knut Moeller, James Geoffrey Chase

Respiratory and cardiovascular data collected from 20 subjects. Pressure, flow, aeration, and heart-rate data were collected during trials which included resting breathing, CPAP at varied PEEP settings, breath-holds, and forced expiratory manoeuvres.

Published: March 20, 2024. Version: 1.0.0