Resources


Database Credentialed Access

MedNLI - A Natural Language Inference Dataset For The Clinical Domain

Chaitanya Shivade

This is a resource for training machine learning models for language inference in the medical domain.

natural language inference recognizing textual entailment

Published: Oct. 1, 2019. Version: 1.0.0


Database Open Access

Term-Preterm EHG DataSet with Tocogram

Electrohysterogram signals accompanied by a simultaneously recorded external tocogram.

neuroelectric pregnancy electrohysterogram

Published: Aug. 29, 2018. Version: 1.0.0

Database Open Access

BIDMC PPG and Respiration Dataset

ECG signals extracted from the MIMIC-II Matched Waveform Database, with manual breath annotations made by annotators using the impedance respiratory signal.

multiparameter photoplethysmogram ecg

Published: June 20, 2018. Version: 1.0.0

Database Open Access

neuroQWERTY MIT-CSXPD Dataset

Keystroke logs collected from 85 subjects with and without Parkinson's disease.

parkinsons neuroelectric brain

Published: Dec. 20, 2016. Version: 1.0.0


Database Open Access

VitalDB Arrhythmia Database: An Anesthesiologist-Validated Large-Scale Intraoperative Arrhythmia Dataset with Beat and Rhythm Labels

Dain Eun, Kayoung Shim, Hyunsoo Lee, et al.

We present a comprehensive intraoperative arrhythmia dataset with 734,528 seconds of ECG recordings from 482 patients, featuring over 660,000 beats annotated and validated by five anesthesiologists.

ppg vitaldb ecg arterial waveform intraoperative dataset

Published: Feb. 26, 2026. Version: 1.0.0


Database Contributor Review

COVID Data for Shared Learning (CDSL): A comprehensive, multimodal COVID-19 dataset from HM Hospitales

Álvaro Ritoré, Andreea M Oprescu, Alberto Estirado Bronchalo, et al.

COVID Data for Shared Learning (CDSL) is a multimodal database comprising de-identified structured health data and radiological images from 4,479 patients with COVID-19, as a comprehensive toolkit for developing predictive models.

covid-19 multimodal database radiological images open data healthcare data machine learning and ai

Published: Oct. 25, 2024. Version: 1.0.0


Database Credentialed Access

ReXPref-Prior: A MIMIC-CXR Preference Dataset for Reducing Hallucinated Prior Exams in Radiology Report Generation

Oishi Banerjee, Hong-Yu Zhou, Subathra Adithan, et al.

We propose ReXPref-Prior, an adapted version of MIMIC-CXR where GPT-4 has removed references to prior exams from both findings and impression sections of chest X-ray reports.

chest x-rays reinforcement learning hallucination

Published: Aug. 14, 2024. Version: 1.0.0


Database Open Access

Respiratory and heart rate monitoring dataset from aeration study

Ella Frances Sophia Guy, Isaac Flett, Jaimey Anne Clifton, et al.

Respiratory and cardiovascular data collected from 20 subjects. Pressure, flow, aeration, and heart-rate data were collected during trials which included resting breathing, CPAP at varied PEEP settings, breath-holds, and forced expiratory manoeuvres.

Published: March 20, 2024. Version: 1.0.0


Database Credentialed Access

Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database

Marco Guevara, Shan Chen, Spencer Thomas, et al.

Annotation dataset of social determinants of health from MIMIC-III Clinical Care Database notes.

natural language processing social determinants of health

Published: Jan. 24, 2024. Version: 1.0.1


Database Credentialed Access

Generalized Image Embeddings for the MIMIC Chest X-Ray dataset

Andrew Sellergren, Atilla Kiraly, Tom Pollard, et al.

This database contains compact information-rich embeddings of the MIMIC-CXR Database v2.0.0 using the CXR Foundation API v1.0.

Published: Feb. 22, 2023. Version: 1.0