Resources


Database Credentialed Access

Chest ImaGenome Dataset

Joy Wu, Nkechinyere Agu, Ismini Lourentzou, et al.

The Chest ImaGenome dataset is a scene graph dataset with additional chronological comparison relations for chest X-rays. It is automatically derived from the MIMIC-CXR dataset. A manually annotated gold standard is also available for 500 patients.

scene graph visual dialogue object detection semantic reasoning bounding box knowledge graph explainability reasoning relation extraction chest disease progression cxr machine learning chest x-ray radiology multimodal deep learning visual question answering

Published: July 13, 2021. Version: 1.0.0


Database Credentialed Access

RadNLI: A natural language inference dataset for the radiology domain

Yasuhide Miura, Yuhao Zhang, Emily Tsai, et al.

A radiology natural language inference (NLI) dataset introduced in the paper "Improving Factual Completeness and Consistency of Image-to-text Radiology Report Generation".

Published: June 29, 2021. Version: 1.0.0


Database Open Access

Brain Hemorrhage Extended (BHX): Bounding box extrapolation from thick to thin slice CT images

Eduardo Pontes Reis, Felipe Nascimento, Mateus Aranha, et al.

The first version of this dataset (v1.0) was made available in the forum of the Kaggle competition 'RSNA Intracranial Hemorrhage Detection'. Minor corrections were then implemented (v1.1).

hemorrhage intracranial

Published: July 29, 2020. Version: 1.1


Database Open Access

Brno University of Technology ECG Quality Database (BUT QDB)

Andrea Nemcova, Radovan Smisek, Kamila Opravilová, et al.

The database is intended for the development and objective comparison of algorithms that assess the quality of ECG records, so that results reported by different authors can be compared directly.

Published: July 22, 2020. Version: 1.0.0


Challenge Credentialed Access

SNOMED CT Entity Linking Challenge

Will Hardman, Mark Banks, Rory Davidson, et al.

272 discharge notes from the MIMIC-IV-Note dataset annotated with SNOMED CT concepts.

snomed clinical annotation entity linking

Published: Jan. 12, 2026. Version: 1.2.0


Database Open Access

Brno University of Technology Smartphone PPG Database (BUT PPG)

Andrea Nemcova, Radovan Smisek, Eniko Vargova, et al.

BUT PPG is a database created for evaluating PPG signal quality and estimating heart rate. The data comprise 3,888 ten-second PPG recordings made with a smartphone, together with associated ECG and accelerometer (ACC) signals and annotations.

heart rate artificial intelligence ppg ecg photoplethysmography acc signal quality assessment annotations accelerometric data electrocardiogram

Published: Aug. 23, 2024. Version: 2.0.0


Challenge Credentialed Access

ArchEHR-QA: A Dataset for Addressing Patient's Information Needs related to Clinical Course of Hospitalization

Sarvesh Soni, Dina Demner-Fushman

A dataset for grounded question answering (QA) from electronic health records (EHRs).

question answering electronic health record patient portals clinicians

Published: Jan. 1, 2026. Version: 1.3


Database Restricted Access

TN-Mammo: A Multi-view Mammography Dataset for Breast Density Classification

Binh Nguyen, Cat Le, Loc Vu, et al.

We release the first version of TN-Mammo (June 2024), a mammogram dataset of 676 cases with breast density labels, providing high-quality data to support machine learning and early breast cancer detection.

Published: Oct. 4, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-DrugDetection

Fabrice Harel-Canada, Nanyun Peng, David Goodman, et al.

This project offers a multilabel annotated dataset of clinical note sentences from MIMIC-III/IV for substance use detection. It supports NLP research for identifying various co-occurring drug use mentions in patient records.

ehr mimic-iv substance use clinical notes methamphetamine multi-label cocaine drug detection polysubstance use prescription opioid misuse cannabis benzodiazepine misuse injection drug use heroin mimic-iii

Published: Sept. 25, 2025. Version: 1.0.0