PhysioNet Index

Database Credentialed Access

MIMIC-IV-Ext-MedicalBench: Evaluating Large Language Models Towards Improved Medical Concept Extraction

Zhichao Yang, Gregory Lyng, Sanjit Batra, et al.

This dataset is an evidence‑grounded benchmark built on MIMIC‑IV discharge summaries that evaluates how well large language models can verify ICD‑10 medical concepts, including implicitly documented diagnoses, by identifying supporting text evidence.

Published: March 23, 2026. Version: 1.0.0

Model Credentialed Access

Fine-tuning foundational models to code diagnoses from veterinary health records

Adam Kiehl, Nadia Saklou, G Joseph Strecker, et al.

A GatorTron LLM fine-tuned to code veterinary diagnoses to 7,739 SNOMED-CT codes, based on clinical summary text from the Colorado State University Veterinary Teaching Hospital.

transformers, natural language processing, large language models, foundational models, one health, diagnoses, snomed-ct, veterinary medicine, omop cdm, veterinary medical records, clinical coding

Published: Jan. 25, 2026. Version: 1.0.0

Database Credentialed Access

MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context

Zishan Gu, Jiayuan Chen, Fenglin Liu, et al.

MedVH is a benchmark for evaluating visual hallucination in large vision language models in the medical context. Its tests use chest X-ray images and include multiple-choice question answering and long-text generation tasks.

Published: Dec. 10, 2025. Version: 1.0.1

Model Credentialed Access

RadVLM model

Nicolas Deperrois, Hidetoshi Matsuo, Samuel Ruiperez-Campillo, et al.

RadVLM is a 7B-parameter vision-language model fine-tuned on public chest-X-ray data that drafts reports, lists abnormalities, grounds findings, and chats about a CXR through a single image-to-text interface.

Published: Oct. 8, 2025. Version: 1.0.0

Database Credentialed Access

SCRIPT X2B8 Dataset: per-day clinical features to model successful next-day extubation

Sam Fenske, Alec Peltekian, Mengjia Kang, et al.

This dataset contains electronic health record (EHR) data from ICU patients receiving mechanical ventilation, aggregated on a daily basis, along with annotations of intubation, extubation, tracheostomy days, and cases of failed extubation. Data can b

Published: Jan. 28, 2025. Version: 1.0.0

Challenge Credentialed Access

MIT Critical Datathon 2023: a MIMIC-IV Derived Dataset for Pulse Oximetry Correction Models

João Matos, Tristan Struja, David S Restrepo, et al.

A SaO2-SpO2 Pairs Dataset derived from MIMIC-IV

pulse oximetry, health equity, machine learning

Published: May 8, 2023. Version: 1.0.0

Database Credentialed Access

GLOBEM Dataset: Multi-Year Datasets for Longitudinal Human Behavior Modeling Generalization

Xuhai Xu, Han Zhang, Yasaman Sefidgar, et al.

GLOBEM comprises the first released multi-year mobile and wearable sensing datasets, collected from 2018 to 2021 and covering 705 person-years from 497 unique participants.

health, ubiquitous computing, well-being, passive mobile sensing, human behavior modeling

Published: March 14, 2023. Version: 1.1

Database Open Access

MIMIC-IV demo data in the OMOP Common Data Model

Michael Kallfelz, Anna Tsvetkova, Tom Pollard, et al.

Preliminary work to transform a MIMIC-IV demo dataset to the OMOP Common Data Model

omop common data model

Published: June 21, 2021. Version: 0.9

Database Credentialed Access

MIMIC-III - SequenceExamples for TensorFlow modeling

Jonas Kemp, Kun Zhang, Andrew Dai

MIMIC-III data converted into TensorFlow SequenceExample format, for use in modeling pipelines.

tensorflow, sequence modeling, deep learning, machine learning

Published: Sept. 29, 2020. Version: 1.0.0

Database Open Access

Longitudinal Cylinder Rearing Behavioral Data in a Mouse Stroke Model Across Multiple Drug Treatments

Yunhao Jiang, Shreyas Venkitaraman, Hee Ra Jung, et al.

This dataset provides cylinder rearing video and behavioral scoring data from 59 mice undergoing stroke and drug treatments.

Published: March 4, 2026. Version: 1.0.0
