PhysioNet Index

Database Credentialed Access

Structured Viewing Classification Annotations From the MIMIC-IV-ECHO Dataset (ECHOVIEW)

Sampath Rapuri, Sofia Sapeta Dias, Maria Salomé Carvalho, et al.

ECHOVIEW provides structured viewing class annotations for 29,196 transthoracic echocardiograms derived from MIMIC-IV-ECHO using a pretrained CNN. Manual clinician review shows substantial agreement (κ=0.69) with these annotations.

Published: March 17, 2026. Version: 0.1

Database Open Access

ReXErr-v1: Clinically Meaningful Chest X-Ray Report Errors Derived from MIMIC-CXR

Vishwanatha Rao, Serena Zhang, Julian Acosta, et al.

Chest X-ray reports derived from the MIMIC-CXR database, containing synthetic errors. Errors were injected using LLMs and sampled across common human and AI model errors.

Published: March 19, 2025. Version: 1.0.0

Model Credentialed Access

RadVLM model

Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruiperez-Campillo, et al.

RadVLM is a 7B-parameter vision-language model fine-tuned on public chest-X-ray data that drafts reports, lists abnormalities, grounds findings, and chats about a CXR through a single image-to-text interface.

Published: Oct. 8, 2025. Version: 1.0.0

Challenge Credentialed Access

BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization

Yanjun Gao, Dmitriy Dligach, Timothy Miller, et al.

Data repository for the BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization.

bionlp clinical natural language processing electronic health record summarization

Published: Nov. 12, 2023. Version: 2.0.0

Database Restricted Access

MIMIC-III-Ext-Synthetic-Clinical-Trial-Questions

Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, et al.

In our recent study, we used Llama-3.1-70B-Instruct to generate synthetic training examples resembling clinical trial eligibility criteria. We manually reviewed 1000 of these examples and release them here.

large language models synthetic data distillation clinical trial eligibility

Published: April 22, 2025. Version: 1.0.0

Database Open Access

Heart and lung segmentations for MIMIC-CXR/MIMIC-CXR-JPG and Montgomery County TB databases

Benjamin Duvieusart, Felix Krones, Guy Parsons, et al.

Heart and lung segmentations for 200 MIMIC-CXR/MIMIC-CXR-JPG chest x-rays and heart segmentations for 138 Montgomery County tuberculosis chest X-rays.

segmentation heart and lungs montgomery country tb mimic-cxr

Published: Aug. 14, 2023. Version: 1.0.0

Database Restricted Access

Swiss-Mammo: A physician-written, synthetic dataset of German mammography reports

Daniel Reichenpfader, Sandro von Däniken, Harald Marcel Bonel

Swiss-Mammo: A physician-written, synthetic dataset of 28 German mammography reports. The dataset is stratified based on BI-RADS categories and available in German and English.

radiology mammography structured reporting bi-rads

Published: June 24, 2025. Version: 1.0.1

Database Restricted Access

Electrocardiogram-Capable Smartwatches: Assessing Their Clinical Accuracy and Application

Joaquin Recas, Mauro Buelga Suárez, Sergio González-Cabeza, et al.

This database supports studying the feasibility of ECG-capable smartwatches as diagnostic tools, focusing on their compliance with clinical standards and their ability to measure critical ECG parameters beyond AF detection.

ischemia fitbit sense ambulatory samsung galaxy watch st-segment withings scanwatch smartwatch apple watch

Published: April 9, 2025. Version: 1.0.0

Database Credentialed Access

MIMIC-IV-Ext Triage Instruction Corpus

Qingyang Shen, Quan Guo

MIMIC-IV-Ext Triage Instruction Corpus includes 9,629 ED triage cases organized by the five-level ESI, enabling LLMs to improve triage accuracy. It provides CSV data, generation prompts, expert validation samples, and SQL QC scripts.

nlp clinical decision support large language models emergency severity index emergency triage machine learning

Published: March 4, 2025. Version: 1.0.0

Database Credentialed Access

Embedding-Based Representations for BRSET and mBRSET

David Restrepo, Chenwei Wu, Michael Morley, et al.

Precomputed image embeddings for the BRSET and mBRSET Brazilian retinal datasets to support efficient, secure, and equitable ophthalmic AI research, enabling tasks such as classification, clustering, multimodal modeling, and fairness analysis.

computer vision vector embeddings ophthalmology

Published: March 30, 2026. Version: 1.0.0
