Resources
Database Credentialed Access
CXReasonBench: A Benchmark for Evaluating Structured Diagnostic Reasoning in Chest X-rays
evaluation chest x-ray benchmark structured chest x-ray qa intermediate reasoning steps structured reasoning grounded reasoning diagnostic reasoning structured diagnostic pipeline
Published: Oct. 23, 2025. Version: 1.0.1
Database Credentialed Access
MIMIC-IV-Ext-MDS-ED: Multimodal Decision Support in the Emergency Department - a Benchmark Dataset for Diagnoses and Deterioration Prediction in Emergency Medicine
emergency department ecg diagnoses prediction deterioration prediction benchmark multimodal
Published: Sept. 12, 2024. Version: 1.0.0
Database Contributor Review
ER-REASON: A Benchmark Dataset for LLM-Based Clinical Reasoning in the Emergency Room
Published: Oct. 23, 2025. Version: 1.0.0
Database Credentialed Access
FFA-IR: Towards an Explainable and Reliable Medical Report Generation Benchmark
fundus fluorescein angiography medical report generation vision and language explainable and reliable evaluation
Published: Jan. 21, 2025. Version: 1.1.0
Database Credentialed Access
EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries
Published: June 26, 2024. Version: 1.0.1
Database Credentialed Access
MedVAL-Bench: Expert-Annotated Medical Text Validation Benchmark
Published: Nov. 14, 2025. Version: 1.0.1
Database Credentialed Access
MIMIC-IV-ECHO-Ext-MIMICEchoQA: A Benchmark Dataset for Echocardiogram-Based Visual Question Answering
Published: Oct. 7, 2025. Version: 1.0.0
Database Credentialed Access
CXR-Align: A Benchmark for CXR-Report Alignment with Negations
Published: Aug. 21, 2025. Version: 1.0.0