Resources


Database Open Access

Radiology Report Generation Models Evaluation Dataset For Chest X-rays (RadEvalX)

Amos Rubin Calamida, Farhad Nooralahzadeh, Morteza Rohanian, Mizuho Nishio, Koji Fujimoto, Michael Krauthammer

The RadEvalX is a publicly available dataset developed similarly to the ReXVal dataset. RedEvalX focuses on radiologist evaluations of errors found in automatically generated radiology reports.

Published: June 18, 2024. Version: 1.0.0


Model Credentialed Access

Me-LLaMA: Foundation Large Language Models for Medical Applications

Qianqian Xie, Qingyu Chen, Aokun Chen, Cheng Peng, Yan Hu, Fongci Lin, Xueqing Peng, Jimin Huang, Jeffrey Zhang, Vipina Keloth, Xinyu Zhou, Huan He, Lucila Ohno-Machado, Yonghui Wu, Hua Xu, Jiang Bian

Me-LLaMA is a family of large language models for medical applications trained using clinical text with LLaMA2 models as the base. We release model weights for the foundation models as well as the chat-enhanced models.

large language models

Published: June 5, 2024. Version: 1.0.0


Database Credentialed Access

A Temporal Dataset for Respiratory Support in Critically Ill Patients

Mira Moukheiber, Lama Moukheiber, Dana Moukheiber, Sicheng Hao, Leo Anthony Celi, Hyung-Chul Lee

A benchmark dataset offering hourly records over a 90-day period for 50,920 ICU subjects, including dynamic pulmonary function data and a spectrum of covariates for respiratory intervention analyses.

oberservational data time-series

Published: May 31, 2024. Version: 1.0.0


Database Restricted Access

CheXchoNet: A Chest Radiograph Dataset with Gold Standard Echocardiography Labels

Pierre Elias, Shreyas Bhave

Early detection of heart failure is vital for improving outcomes. The dataset contains 71,589 CXRs paired with gold standard labels from echocardiograms to enable the training of models to detect pathologies indicative of early stage heart failure.

heart failure chest x-rays early detection cardiac structural abnormalties deep learning

Published: March 20, 2024. Version: 1.0.0


Database Restricted Access

OpenOximetry Repository

Nicholas Fong, Michael Lipnick, Philip Bickler, John Feiner, Tyler Law

A repository of matched arterial oxygen and pulse oximeter readings obtained under controlled conditions, with high-frequency physiologic waveforms and skin color measurements.

Published: Feb. 27, 2024. Version: 1.0.0


Database Credentialed Access

EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems

Konstantin Kotschenreuther

Dataset consisting of question and answer pairs synthetically generated from medical discharge summaries, designed to facilitate the training and development of large language models specifically tailored for healthcare applications

mimic-iv clinical question-answering medical discharge summaries large language models

Published: Jan. 11, 2024. Version: 1.0.0


Model Credentialed Access

Characterization of Stigmatizing Language in Medical Records

Keith Harrigian, Ayah Zirikly, Brant Chee, Alya Ahmad, Anne Links, Somnath Saha, Mary Catherine Beach, Mark Dredze

A suite of classifiers for detecting three types of stigmatizing language in electronic medical records. Trained on MIMIC-IV discharge notes.

clinical natural language processing domain transfer bias stigmatizing language large language models mimic

Published: Nov. 6, 2023. Version: 1.0.0


Database Open Access

BIG IDEAs Lab Glycemic Variability and Wearable Device Data

Peter Cho, Juseong Kim, Brinnae Bent, Jessilyn Dunn

Glucose measurements and wrist-worn wearable sensor data from highnormoglycemic participants.

biomedical engineering pre-diabetes biomarkers

Published: Sept. 18, 2023. Version: 1.1.2


Database Open Access

MIMIC-IV-ECG: Diagnostic Electrocardiogram Matched Subset

Brian Gow, Tom Pollard, Larry A Nathanson, Alistair Johnson, Benjamin Moody, Chrystinne Fernandes, Nathaniel Greenbaum, Jonathan W Waks, Parastou Eslami, Tanner Carbonati, Ashish Chaudhari, Elizabeth Herbst, Dana Moukheiber, Seth Berkowitz, Roger Mark, Steven Horng

The MIMIC-IV ECG module contains approximately 800,000 diagnostic electrocardiograms across nearly 160,000 unique patients. These patients overlap with the patients from the MIMIC-IV Clinical Database.

Published: Sept. 15, 2023. Version: 1.0

Visualize waveforms

Database Credentialed Access

MIMIC-IV-ECHO: Echocardiogram Matched Subset

Brian Gow, Tom Pollard, Nathaniel Greenbaum, Benjamin Moody, Alistair Johnson, Elizabeth Herbst, Jonathan W Waks, Parastou Eslami, Ashish Chaudhari, Tanner Carbonati, Seth Berkowitz, Roger Mark, Steven Horng

The MIMIC-IV-ECHO module contains more than 500,000 echocardiograms across more than 4,500 unique patients. These patients overlap with the patients from the MIMIC-IV Clinical Database.

Published: July 21, 2023. Version: 0.1