Resources
Database
Credentialed Access
Jiawei Cao, Sendong Zhao
This subset of the MIMIC-IV dataset includes the examination results and diagnostic information of 4,761 patients with cardiac disease. The examination results for each patient are listed separately as evidence for the final diagnosis.
Published: May 6, 2025.
Version: 1.0.0
Database
Credentialed Access
Yanqi Yang, Xiaomeng LI, Keyuan Liu, et al.
FDTooth is a dataset containing intraoral photographs and cone-beam computed tomography (CBCT) images with annotations for automated detection of fenestration and dehiscence in anterior teeth.
Published: May 5, 2025.
Version: 1.0.0
Database
Credentialed Access
Amin Dada, Osman Alperen Koras, Marie Bauer, et al.
MeDiSumQA is a dataset of patient-oriented QA pairs from MIMIC-IV discharge summaries, designed to evaluate LLMs in generating safe, patient-friendly medical responses for clinical QA and healthcare communication.
Published: May 5, 2025.
Version: 1.0.0
Database
Open Access
Lily Koffman, John Muschelli
Minute-level step counts obtained from five step-counting algorithms applied to raw accelerometry data, and minute-level Activity Counts, MIMS, wear predictions, and wear flags for all participants who wore accelerometers in NHANES 2011-2014.
accelerometry
physical activity
steps
nhanes
Published: May 5, 2025.
Version: 1.0.1
Database
Restricted Access
Ke Wang, Jiamu Yang, Ayush Shetty, et al.
We present high-resolution multichannel wearable device data along with clinically labeled sleep stage and polysomnography (PSG) recordings from 100 sleep-disordered patients with sleep apnea.
wearable
sleep disorders
biomedical
time series classification
Published: April 30, 2025.
Version: 2.1.0
Database
Restricted Access
Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, et al.
We created 23 questions resembling eligibility criteria from the apixaban clinical trial and evaluated them on a random sample of 100 patient notes from MIMIC-IV. We release the resulting 2,300 question-answer pairs as a dataset here.
clinical q and a evaluation set
clinical trial eligibility
Published: April 30, 2025.
Version: 1.0.0
Database
Credentialed Access
Stefan Hegselmann, Shannon Shen, Florian Gierse, et al.
Annotations for unsupported facts in 100 original MIMIC patient summaries (discharge instructions) and for hallucinations in 100 large language model (LLM)-generated patient summaries, labeled by two medical experts.
Published: April 30, 2025.
Version: 1.0.1
Database
Restricted Access
Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, et al.
In our recent study, we used Llama-3.1-70B-Instruct to generate synthetic training examples resembling clinical trial eligibility criteria. We manually reviewed 1,000 of these examples and release them here.
large language models
synthetic data distillation
clinical trial eligibility
Published: April 22, 2025.
Version: 1.0.0
Database
Open Access
Kenta Tsutsui, Shany Biton Brimer, Joachim Behar
Holter ECG database from Japan, containing data from 100 unique patients with paroxysmal AF, including expert annotations of supraventricular arrhythmias at the beat level.
atrial fibrillation
ecg
holters
Published: April 16, 2025.
Version: 1.0.1
Database
Credentialed Access
Mira Moukheiber, Lama Moukheiber, Dana Moukheiber, et al.
A benchmark dataset offering hourly records over a 90-day period for 50,920 ICU subjects, including dynamic pulmonary function data and a spectrum of covariates for respiratory intervention analyses.
observational data
time-series
Published: April 15, 2025.
Version: 1.1.0