Resources


Database Restricted Access

VinDr-SpineXR: A large annotated medical image dataset for spinal lesions detection and classification from radiographs

Hieu Huy Pham, Hieu Nguyen Trung, Ha Quy Nguyen

Published: Aug. 24, 2021. Version: 1.0.0


Database Credentialed Access

RadNLI: A natural language inference dataset for the radiology domain

Yasuhide Miura, Yuhao Zhang, Emily Tsai, Curtis Langlotz, Dan Jurafsky

A radiology NLI dataset introduced in the paper: Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation

Published: June 29, 2021. Version: 1.0.0


Database Credentialed Access

RadVLM Instruction Dataset

Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruiperez-Campillo, Moritz Vandenhirtz, Sonia Laguna, Alain Ryser, Koji Fujimoto, Mizuho Nishio, Thomas Sutter, Julia Vogt, Jonas Kluckert, Thomas Frauenfelder, Christian Bluethgen, Farhad Nooralahzadeh, Michael Krauthammer

This dataset is designed to construct RadVLM, a vision–language model for chest X-ray interpretation. It includes instruction data for tasks such as report generation, abnormality detection, region grounding, and multitask conversation.

chest x-rays, vision-language models, medical ai

Published: Sept. 25, 2025. Version: 1.0.0


Database Restricted Access

EchoNext: A Dataset for Detecting Echocardiogram-Confirmed Structural Heart Disease from ECGs

Pierre Elias, Joshua Finer

EchoNext is a curated dataset of electrocardiograms (ECGs) paired with echocardiogram-confirmed structural heart disease labels, designed to support the development and validation of machine learning models.

heart failure, clinical decision support, artificial intelligence, health equity, ecg, machine learning, deep learning, electrocardiogram, aortic stenosis, cardiovascular screening, valvular heart disease, digital health, ai model deployment, left ventricular dysfunction, ai in healthcare, population health, transthoracic echocardiogram, structural heart disease

Published: Sept. 16, 2025. Version: 1.1.0


Database Restricted Access

mcPHASES: A Dataset of Physiological, Hormonal, and Self-reported Events and Symptoms for Menstrual Health Tracking with Wearables

Blue Lin, Jin Yi Li, Kaavya Kalani, Khai Truong, Alex Mariakakis

This initial version of the PHASES dataset includes multimodal menstrual health data—hormone levels, wearable sensor metrics, and self-reported symptoms—collected across two study intervals from 42 young adults.

wearables, hormones, menstrual health, multimodal health, health sensor data, womens health

Published: Sept. 9, 2025. Version: 1.0.0


Database Restricted Access

HYAMD High-Resolution Fundus Image Dataset for age related macular degeneration (AMD) Diagnosis

Meishar Meisel, Benjamin Alfred Cohen, Meital Baskin, Beatrice Tiosano, Joachim Behar, Eran Berkowitz

The HYAMD dataset comprises 1,560 high-resolution fundus images from 325 patients, aimed at validating machine learning models for age-related macular degeneration (AMD) diagnosis.

Published: Sept. 9, 2025. Version: 1.0.0


Database Credentialed Access

SCRIPT CarpeDiem Dataset: demographics, outcomes, and per-day clinical parameters for critically ill patients with suspected pneumonia

Nikolay Markov, Catherine A Gao, Thomas Stoeger, Mengjia Kang, Anna Pawlowski, Prasanth Nannapaneni, Luke Rasmussen, Rogan Grant, Daniel Schneider, Justin Starren, Theresa Walunas, Richard Wunderink, GR Scott Budinger, Alexander Misharin, Benjamin Singer, NU SCRIPT Study Investigators

SCRIPT seeks to delineate the host/pathogen interactions during pneumonia using multiomic analysis of bronchoalveolar lavage fluid joined with clinical data and physician adjudication.

Published: Aug. 4, 2025. Version: 1.8.0


Database Credentialed Access

MIMIC-Ext-CXR-QBA: A Structured, Tagged, and Localized Visual Question Answering Dataset with Question-Box-Answer Triplets and Scene Graphs for Chest X-ray Images

Philip Müller, Friederike Jungmann, Georgios Kaissis, Daniel Rueckert

We present a large-scale CXR VQA dataset derived from MIMIC-CXR with 42M QA pairs, featuring hierarchical answers, bounding boxes, and structured tags. We generated the QA pairs using LLM-based extraction from radiology reports and localization models.

chest x-rays, vqa, localization, scene graphs

Published: July 22, 2025. Version: 1.0.0


Database Open Access

Wearable Device Dataset from Induced Stress and Structured Exercise Sessions

Andrea Hongn, Facundo Bosch, Lara Prado, Paula Bonomini

Physiological signals (electrodermal activity, blood volume pulse, heart rate, temperature, etc.) from 36 healthy volunteers, collected during structured acute stress induction and aerobic/anaerobic exercise sessions using the Empatica E4 wearable device.

exercise, stress, wearable, aerobic, anaerobic

Published: June 24, 2025. Version: 1.0.1


Database Restricted Access

Dataset for Segmentation and Classification of Cardiac Implantable Electronic Devices in Chest X-Rays

Keno Bressem, Felix Busch, Andrei Zhukov, Lisa Adams

This dataset comprises 11,094 converted DICOM and smartphone images of Cardiac Implantable Electronic Devices (CIEDs), collected from 897 patients. It aims to facilitate the development of algorithms for CIED detection and classification.

chest x-ray, radiology, medical imaging, cardiac implantable electronic devices

Published: March 4, 2025. Version: 1.0.0