Resources


Database Restricted Access

Application of Med-PaLM 2 in the refinement of MIMIC-CXR labels

Kendall Park, Rory Sayres, Andrew Sellergren, et al.

This work further refines the labels associated with CheXpert in MIMIC-CXR-JPG 2.0.0 by filtering with Med-PaLM 2 followed by verification by manual review by three US board-certified radiologists.

mimic-cxr labels

Published: Feb. 4, 2025. Version: 1.0.0


Model Credentialed Access

Characterization of Stigmatizing Language in Medical Records

Keith Harrigian, Ayah Zirikly, Brant Chee, et al.

A suite of classifiers for detecting three types of stigmatizing language in electronic medical records. Trained on MIMIC-IV discharge notes.

clinical natural language processing domain transfer bias stigmatizing language mimic large language models

Published: Nov. 6, 2023. Version: 1.0.0


Model Credentialed Access

Characterization of Stigmatizing Language in Medical Records

Keith Harrigian, Ayah Zirikly, Brant Chee, et al.

A suite of classifiers for detecting three types of stigmatizing language in electronic medical records. Trained on MIMIC-IV discharge notes.

clinical natural language processing domain transfer bias stigmatizing language mimic large language models

Published: Nov. 6, 2023. Version: 1.0.0


Database Credentialed Access

RuMedNLI: A Russian Natural Language Inference Dataset For The Clinical Domain

Pavel Blinov, Aleksandr Nesterov, Galina Zubkova, et al.

RuMedNLI is the full counterpart dataset of MedNLI in Russian language.

natural language inference recognizing textual entailment russian language

Published: April 1, 2022. Version: 1.0.0


Model Credentialed Access

Fine-tuning foundational models to code diagnoses from veterinary health records

Adam Kiehl, Nadia Saklou, G Joseph Strecker, et al.

Fine-tuned GatorTron LLM for veterinary diagnosis coding to 7,739 SNOMED-CT codes based on clinical summary text from the Colorado State University Veterinary Teaching Hospital.

transformers natural language processing large language models foundational models one health diagnoses snomed-ct veterinary medicine omop cdm veterinary medical records clinical coding

Published: Jan. 25, 2026. Version: 1.0.0


Database Credentialed Access

Medication Extraction Labels for MIMIC-IV-Note Clinical Database

Akshay Goel, Almog Gueta, Omry Gilon, et al.

Medication extraction NLP labels for 600 discharge summaries in MIMIC-IV-Note dataset.

Published: Dec. 12, 2023. Version: 1.0.0


Database Credentialed Access

RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

Saahil Jain, Ashwin Agrawal, Adriel Saporta, et al.

RadGraph is a dataset of entities and relations in full-text chest X-ray radiology reports, which are obtained using a novel information extraction (IE) schema to capture clinically relevant information in a radiology report.

entity and relation extraction graph multi-modal natural language processing radiology

Published: June 3, 2021. Version: 1.0.0


Database Credentialed Access

National Institutes of Health Stroke Scale (NIHSS) Annotations for the MIMIC-III Database

Jiayang Wang, Xiaoshuo Huang, Lin Yang, et al.

A dataset of annotated NIHSS scale items and corresponding scores from stroke patients discharge summaries in MIMIC-III.

Published: Jan. 25, 2021. Version: 1.0.0


Challenge Credentialed Access

BioNLP Workshop 2023 Shared Task 1A: Problem List Summarization

Yanjun Gao, Dmitriy Dligach, Timothy Miller, et al.

This is the data storage for BioNLP Workshop Shared Task 1A: Problem List Summarization.

bionlp clinical natural language processing electronic health record summarization

Published: Nov. 12, 2023. Version: 2.0.0


Database Credentialed Access

PIFIR: PET-CT Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jeremy Ong, et al.

A corpus of PET-CT reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp clinical documentation information extraction invasive fungal infections

Published: Feb. 27, 2025. Version: 1.0.0