Resources


Database Credentialed Access

Immunosuppressive Condition and Medication Annotations for Admission Notes in the MIMIC-III Database

Vijeeth Guggilla, Melissa Bak, Mengjia Kang, et al.

This database contains 200 MIMIC-III admission notes with adjudicated labels for histories of various immunosuppressive conditions and usage of various immunosuppressive medications.

Published: Aug. 4, 2025. Version: 1.0.0


Database Credentialed Access

Lunguage: A Benchmark for Structured and Sequential Chest X-ray Interpretation

Jong Hak Moon, Geon Choi, Paloma Rabaey, et al.

A radiologist-annotated benchmark of structured chest X-ray reports at single and sequential levels, comprising 1,473 reports across 18 relation types and 80 longitudinal cases.

fine-grained structured reports attribute-level clinical reasoning medical text structuring longitudinal clinical reasoning chest x-ray report parsing medical information structuring benchmark dataset for radiology report medical information extraction structured radiology reports temporal relation extraction radiology report benchmarking longitudinal clinical understanding

Published: Jan. 11, 2026. Version: 1.0.0


Database Credentialed Access

DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries

Jayetri Bardhan, Anthony Colas, Kirk Roberts, et al.

DrugEHRQA is a QA dataset containing question-answers from MIMIC-III tables and discharge summaries.

question-answer qa

Published: April 12, 2022. Version: 1.0.0


Database Credentialed Access

Lunguage: A Benchmark for Structured and Sequential Chest X-ray Interpretation

Jong Hak Moon, Geon Choi, Paloma Rabaey, et al.

A radiologist-annotated benchmark of structured chest X-ray reports at single and sequential levels, comprising 1,473 reports across 18 relation types and 80 longitudinal cases.

fine-grained structured reports attribute-level clinical reasoning medical text structuring longitudinal clinical reasoning chest x-ray report parsing medical information structuring benchmark dataset for radiology report medical information extraction structured radiology reports temporal relation extraction radiology report benchmarking longitudinal clinical understanding

Published: Jan. 11, 2026. Version: 1.0.0


Database Credentialed Access

RadGraph-XL: A Large-Scale Expert-Annotated Dataset for Entity and Relation Extraction from Radiology Reports

Jean-Benoit Delbrouck

RadGraph-XL is a large, expert-annotated dataset of 2,300 radiology reports covering multiple modalities and anatomies. It enables accurate extraction of clinical entities and relations for downstream medical AI tasks.

Published: Sept. 12, 2025. Version: 1.0.0


Database Credentialed Access

Northwestern ICU (NWICU) database

Dana Moukheiber, William Temps, Bhadrappa Molgi, et al.

A freely available COVID-rich ICU database comprising de-identified health-related data from Northwestern Memorial Health Center (NHMC).

Published: Nov. 19, 2024. Version: 0.1.0


Database Credentialed Access

Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries

Melissa Poulsen, Vanessa Troiani, Philip Freda, et al.

The database contains a corpus of annotated data from the MIMIC-III Critical Care Database from a study that aimed to develop and apply an annotation schema to characterize opioid use disorder and related contextual factors.

opioid use disorder substance use natural language processing clinical notes

Published: Feb. 8, 2023. Version: 1.0.0


Database Credentialed Access

Lunguage: A Benchmark for Structured and Sequential Chest X-ray Interpretation

Jong Hak Moon, Geon Choi, Paloma Rabaey, et al.

A radiologist-annotated benchmark of structured chest X-ray reports at single and sequential levels, comprising 1,473 reports across 18 relation types and 80 longitudinal cases.

fine-grained structured reports attribute-level clinical reasoning medical text structuring longitudinal clinical reasoning chest x-ray report parsing medical information structuring benchmark dataset for radiology report medical information extraction structured radiology reports temporal relation extraction radiology report benchmarking longitudinal clinical understanding

Published: Jan. 11, 2026. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-CXR-QBA: A Structured, Tagged, and Localized Visual Question Answering Dataset with Question-Box-Answer Triplets and Scene Graphs for Chest X-ray Images

Philip Müller, Friederike Jungmann, Georgios Kaissis, et al.

We present a large-scale CXR VQA dataset derived from MIMIC-CXR with 42M QA pairs, featuring hierarchical answers, bounding boxes, and structured tags. We generated QA-pairs using LLM-based extraction from radiology reports and localization models.

chest x-rays vqa localization scene graphs

Published: July 22, 2025. Version: 1.0.0


Database Credentialed Access

CHIFIR: Cytology and Histopathology Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jasmine Teng, et al.

A corpus of cytology and histopathology reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp clinical documentation information extraction invasive fungal infections

Published: Feb. 20, 2024. Version: 1.0.2