Resources


Database Restricted Access

CheXchoNet: A Chest Radiograph Dataset with Gold Standard Echocardiography Labels

Pierre Elias, Shreyas Bhave

Early detection of heart failure is vital for improving outcomes. The dataset contains 71,589 CXRs paired with gold standard labels from echocardiograms to enable the training of models to detect pathologies indicative of early stage heart failure.

chest x-rays heart failure early detection cardiac structural abnormalties deep learning

Published: March 20, 2024. Version: 1.0.0


Database Open Access

Hillel Yaffe Glaucoma Dataset (HYGD): A Gold-Standard Annotated Fundus Dataset for Glaucoma Detection

Or Abramovich, Hadas Pizem, Jonathan Fhima, et al.

HYGD is a rigorously annotated fundus image dataset with gold-standard clinical labels designed to improve and benchmark deep learning models for accurate glaucoma detection.

ophthalmology retina dfi gold-standard gon fundus glaucoma

Published: June 3, 2025. Version: 1.0.0


Database Open Access

Hillel Yaffe Glaucoma Dataset (HYGD): A Gold-Standard Annotated Fundus Dataset for Glaucoma Detection

Or Abramovich, Hadas Pizem, Jonathan Fhima, et al.

HYGD is a rigorously annotated fundus image dataset with gold-standard clinical labels designed to improve and benchmark deep learning models for accurate glaucoma detection.

ophthalmology retina dfi gold-standard gon fundus glaucoma

Published: June 3, 2025. Version: 1.0.0


Database Credentialed Access

Deidentified Medical Text

Margaret Douglass, Bill Long, George Moody, et al.

Gold standard corpus of 2,434 deidentified nursing notes

medical text nursing notes hipaa de-identification

Published: Dec. 18, 2007. Version: 1.0


Database Restricted Access

TN-Mammo: A Multi-view Mammography Dataset for Breast Density Classification

Binh Nguyen, Cat Le, Loc Vu, et al.

We release the first version of TN-Mammo (June 2024), a mammogram dataset of 676 cases with breast density labels, providing high-quality data to support machine learning and early breast cancer detection.

Published: Oct. 4, 2025. Version: 1.0.0


Database Restricted Access

HYAMD High-Resolution Fundus Image Dataset for age related macular degeneration (AMD) Diagnosis

Meishar Meisel, Benjamin Alfred Cohen, Meital Baskin, et al.

The HYAMD dataset comprises 1,560 high-resolution fundus images from 325 patients, aimed at validating machine learning models for age-related macular degeneration (AMD) diagnosis.

Published: Sept. 9, 2025. Version: 1.0.0


Challenge Credentialed Access

CXR-LT: Multi-Label Long-Tailed Classification on Chest X-Rays

Gregory Holste, Mingquan Lin, Song Wang, et al.

CXR-LT 2024 was a challenge for long-tailed, multi-label, and zero-shot thorax disease classification on chest X-rays, held at MICCAI 2024. This page contains long-tailed labels for 45 diseases from the CXR-LT 2024 and 2023 challenges.

disease classification artificial intelligence chest x-ray deep learning computer-aided diagnosis long-tailed learning cardiopulmonary disease zero-shot learning

Published: March 19, 2025. Version: 2.0.0


Database Credentialed Access

BOLD, a blood-gas and oximetry linked dataset

João Matos, Tristan Struja, Jack Gallifant, et al.

An open-source pulse oximetry and arterial blood gas dataset, derived from MIMIC-III, MIMIC-IV, and eICU-CRD

pulse oximetry intensive care unit health equity electronic health records

Published: Nov. 8, 2023. Version: 1.0


Challenge Credentialed Access

MIT Critical Datathon 2023: a MIMIC-IV Derived Dataset for Pulse Oximetry Correction Models

João Matos, Tristan Struja, David S Restrepo, et al.

A SaO2-SpO2 Pairs Dataset derived from MIMIC-IV

pulse oximetry health equity machine learning

Published: May 8, 2023. Version: 1.0.0


Database Credentialed Access

Chest ImaGenome Dataset

Joy Wu, Nkechinyere Agu, Ismini Lourentzou, et al.

The Chest ImaGenome dataset is a scene graph dataset with additional chronological comparison relations for chest X-rays. It is automatically derived from the MIMIC-CXR dataset. A manually annotated gold standard is also available for 500 patients.

scene graph visual dialogue object detection semantic reasoning bounding box knowledge graph explainability reasoning relation extraction chest disease progression cxr machine learning chest x-ray radiology multimodal deep learning visual question answering

Published: July 13, 2021. Version: 1.0.0