Resources


Model Credentialed Access

RadVLM model

Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruiperez-Campillo, Moritz Vandenhirtz, Sonia Laguna, Alain Ryser, Koji Fujimoto, Mizuho Nishio, Thomas Sutter, Julia Vogt, Jonas Kluckert, Thomas Frauenfelder, Christian Bluethgen, Farhad Nooralahzadeh, Michael Krauthammer

RadVLM is a 7B-parameter vision-language model fine-tuned on public chest X-ray data that drafts reports, lists abnormalities, grounds findings, and chats about a CXR through a single image-to-text interface.

Published: Oct. 8, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext clinical decision support for referral, triage and diagnosis

Farieda Gaber, Altuna Akalin

This MIMIC-IV extended dataset is designed to evaluate and improve LLMs' ability to assist with triage, specialist referral, and diagnosis, using critical patient information such as history of present illness, vital signs, and other relevant data.

Published: Oct. 8, 2025. Version: 1.0.2


Database Credentialed Access

MIMIC-IV-ECHO-Ext-MIMICEchoQA: A Benchmark Dataset for Echocardiogram-Based Visual Question Answering

Rahul Thapa, Andrew Li, Qingyang Wu, Bryan He, Yuki Sahashi, Christina Binder-Rodriguez, Angela Zhang, David Ouyang, James Zou

We present MIMICEchoQA, a benchmark dataset for echocardiogram-based question answering, built from the publicly available MIMIC-IV-ECHO database.

Published: Oct. 7, 2025. Version: 1.0.0


Database Restricted Access

TN-Mammo: A Multi-view Mammography Dataset for Breast Density Classification

Binh Nguyen, Cat Le, Loc Vu, Quynh Nguyen, Ha-Hieu Pham, Phuong Anh Vu, Thuan Huynh, Cao Tien Dung, Nghiem Diep Tuong, Byung-Woo Hong

We release the first version of TN-Mammo (June 2024), a mammogram dataset of 676 cases with breast density labels, providing high-quality data to support machine learning and early breast cancer detection.

Published: Oct. 4, 2025. Version: 1.0.0


Database Restricted Access

Organ Retrieval and Collection of Health Information for Donation (ORCHID)

Hammaad Adam, Vinith Suriyakumar, Tom Pollard, Benjamin Moody, Jennifer Erickson, Greg Segal, Brad Adams, Diane Brockmeier, Kevin Lee, Ginny McBride, Kelly Ranum, Matthew Wadsworth, Janice Whaley, Ashia Wilson, Marzyeh Ghassemi

Multi-center dataset on organ procurement in the United States

organ procurement organizations, organ transplantation

Published: Sept. 29, 2025. Version: 2.1.1


Database Open Access

MIMIC-IV demo data in the Medical Event Data Standard (MEDS)

Robin Philippus van de Water, Ethan Steinberg, Michael Wornow, Patrick Rockenschaub, Matthew McDermott

MIMIC-IV Clinical Database Demo in MEDS (Medical Event Data Standard) format.

ehr, critical care, electronic health record, mimic, machine learning, meds, medical event data standard

Published: Sept. 29, 2025. Version: 0.0.1


Database Credentialed Access

MIMIC-IV-Ext-22MCTS: A 22 Millions-Event Temporal Clinical Time-Series Dataset with Relative Timestamp

Jing Wang, Xing Niu, Tong Zhang, Jie Shen, Juyong Kim, Jeremy Weiss

A time-series dataset of clinical events with concrete temporal information. It comprises 22,588,586 clinical events and their associated timestamps from 267,284 discharge summaries in MIMIC-IV-Note.

mimic, clinical event annotation, time series, temporal annotation

Published: Sept. 29, 2025. Version: 1.0.0


Database Credentialed Access

Multimodal Clinical Monitoring in the Emergency Department (MC-MED)

Aman Kansal, Emma Chen, Tom Jin, Pranav Rajpurkar, David Kim

A multimodal dataset of deidentified clinical and physiological data from emergency department visits, supporting research on patient outcomes, care processes, and the effects of continuous monitoring during and after the COVID-19 pandemic.

Published: Sept. 25, 2025. Version: 1.0.1


Database Credentialed Access

MIMIC-Ext-DrugDetection

Fabrice Harel-Canada, Nanyun Peng, David Goodman, Ruby Romero, Allan Nguyen, Brandon Moghanian, Anabel Salimian

This project offers a multilabel annotated dataset of clinical note sentences from MIMIC-III/IV for substance use detection. It supports NLP research for identifying various co-occurring drug use mentions in patient records.

ehr, mimic-iv, substance use, clinical notes, methamphetamine, multi-label, cocaine, drug detection, polysubstance use, prescription opioid misuse, cannabis, benzodiazepine misuse, injection drug use, heroin, mimic-iii

Published: Sept. 25, 2025. Version: 1.0.0


Database Credentialed Access

RadVLM Instruction Dataset

Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruiperez-Campillo, Moritz Vandenhirtz, Sonia Laguna, Alain Ryser, Koji Fujimoto, Mizuho Nishio, Thomas Sutter, Julia Vogt, Jonas Kluckert, Thomas Frauenfelder, Christian Bluethgen, Farhad Nooralahzadeh, Michael Krauthammer

This dataset is designed to construct RadVLM, a vision–language model for chest X-ray interpretation. It includes instruction data for tasks such as report generation, abnormality detection, region grounding, and multitask conversation.

chest x-rays, vision-language models, medical ai

Published: Sept. 25, 2025. Version: 1.0.0