Resources


Database Credentialed Access

Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information

Yael Bensoussan, Alexandros Sigaras, Anais Rameau, et al.

A dataset of features from voice recordings and metadata to enable the development, benchmarking, and validation of clinically applicable machine-learning models for diagnosing a wide range of health conditions.

health biomarkers bridge2ai voice

Published: May 1, 2026. Version: 3.1.0


Database Credentialed Access

Bridge2AI-Voice Pediatric Dataset

Yael Bensoussan, Alexandros Sigaras, Anais Rameau, et al.

A dataset of questionnaire responses, spectrograms, and other information for pediatric participants, collected for the Bridge2AI "Voice as a Biomarker of Health" project.

health pediatric biomarkers bridge2ai voice

Published: May 1, 2026. Version: 1.1.0


Database Open Access

GRABMyoFlow - Dataset extension

Carl Linus Ehlert, James Tung, Ashirbad Pradhan

GRABMyoFlow extends the GRABMyo sEMG dataset with dynamic transitions and 20 new subjects (63 total). It features multi-day wrist recordings for robust hand gesture recognition and biometrics research.

Published: April 16, 2026. Version: 1.0.0


Database Credentialed Access

MIMIC-CXR-Ext-ILS: Lesion Segmentation Masks and Instruction-Answer Pairs for Chest X-rays

Geon Choi, Hangyul Yoon, Hyunju Shin, et al.

Instruction-guided lesion segmentation data for chest X-rays, including 1.1M instruction-answer pairs and 91K segmentation masks covering seven major lesion types.

chest x-ray segmentation text-guided segmentation lesion segmentation

Published: March 25, 2026. Version: 1.0.0


Database Restricted Access

Microbiological, Immunological and Biochemical Characteristics of the Development of Ventilator Associated Pneumonia

Natalia Sanabria-Herrera, Ingrid Gisell Bustos Moya, Luis Felipe Reyes

This study, conducted in Chía, Colombia, explores the role of the respiratory microbiome in nosocomial lower respiratory tract infections in ICU patients and its impact on disease progression.

Published: Dec. 5, 2025. Version: 1.1.1


Database Contributor Review

ER-REASON: A Benchmark Dataset for LLM-Based Clinical Reasoning in the Emergency Room

Mel Molina, Nikita Mehandru, Niloufar Golchini, et al.

The ER-REASON dataset is a longitudinal collection of 25,174 de-identified clinical notes for 3,437 patients admitted to the emergency room (ER) at a large academic medical center between March 1, 2022, and March 31, 2024.

Published: Oct. 23, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-IV-Ext clinical decision support for referral, triage and diagnosis

Farieda Gaber, Altuna Akalin

This MIMIC-IV extended dataset is designed to evaluate and improve LLMs' ability to assist with triage, specialist referral, and diagnosis, using critical patient information such as history of present illness, vital signs, and other relevant data.

Published: Oct. 8, 2025. Version: 1.0.2


Database Credentialed Access

Multimodal Clinical Monitoring in the Emergency Department (MC-MED)

Aman Kansal, Emma Chen, Tom Jin, et al.

A multimodal dataset of deidentified clinical and physiological data from emergency department visits, supporting research on patient outcomes, care processes, and the effects of continuous monitoring during and after the COVID-19 pandemic.

Published: Sept. 25, 2025. Version: 1.0.1


Database Restricted Access

mcPHASES: A Dataset of Physiological, Hormonal, and Self-reported Events and Symptoms for Menstrual Health Tracking with Wearables

Blue Lin, Jin Yi Li, Kaavya Kalani, et al.

This initial version of the PHASES dataset includes multimodal menstrual health data—hormone levels, wearable sensor metrics, and self-reported symptoms—collected across two study intervals from 42 young adults.

wearables hormones menstrual health multimodal health health sensor data womens health

Published: Sept. 9, 2025. Version: 1.0.0


Database Credentialed Access

CXR-Align: A Benchmark for CXR-Report Alignment with Negations

Hanbin Ko

CXR-Align is a benchmark dataset created to evaluate vision-language models' capability to interpret negations in chest X-ray (CXR) reports, featuring systematically modified reports from MIMIC-CXR.

Published: Aug. 21, 2025. Version: 1.0.0