Resources
Database Credentialed Access
BOLD, a blood-gas and oximetry linked dataset
João Matos, Tristan Struja, Jack Gallifant, Luis Filipe Nakayama, Marie Charpignon, Xiaoli Liu, Jaime dos Santos Cardoso, Leo Anthony Celi, An Kwok Wong
pulse oximetry intensive care unit health equity electronic health records
Published: Nov. 8, 2023. Version: 1.0
Database Credentialed Access
PatientSim: A Persona-Driven Simulator for Realistic Doctor-Patient Interactions
Daeun Kyung, Hyunseung Chung, Seongsu Bae, Jiho Kim, Jae Ho Sohn, Taerim Kim, Soo Kim, Edward Choi
electronic health records multi-turn dialogue llm simulation doctor-patient consultation
Published: Oct. 18, 2025. Version: 1.0.0
Database Credentialed Access
Annotated Social Determinants of Health Dataset for Adverse Pregnancy Outcomes
Nidhi Soley, MaKhaila Bentil, Jash Shah, Masoud Rouhizadeh, Casey Taylor
Published: Aug. 4, 2025. Version: 1.0.0
Model Credentialed Access
Shareable Artificial Intelligence to Extract Cancer Outcomes from Electronic Health Records for Precision Oncology Research
Kenneth Kehl, Pavel Trukhanov, Christopher Fong, Justin Jee, Karl Pichotta, Morgan Paul, Chelsea Nichols, Michele Waters, Nikolaus Schultz, Deborah Schrag
Published: Oct. 24, 2024. Version: 1.0.0
Database Credentialed Access
EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems
Konstantin Kotschenreuther
mimic-iv clinical question-answering medical discharge summaries large language models
Published: Jan. 11, 2024. Version: 1.0.0
Database Credentialed Access
EHRXQA: A Multi-Modal Question Answering Dataset for Electronic Health Records with Chest X-ray Images
Seongsu Bae, Daeun Kyung, Jaehee Ryu, Eunbyeol Cho, Gyubok Lee, Sunjun Kweon, Jungwoo Oh, Lei JI, Eric Chang, Tackeun Kim, Edward Choi
question answering chest x-ray electronic health records multi-modal question answering ehr question answering semantic parsing machine learning deep learning evaluation visual question answering benchmark
Published: July 23, 2024. Version: 1.0.0
Model Credentialed Access
Characterization of Stigmatizing Language in Medical Records
Keith Harrigian, Ayah Zirikly, Brant Chee, Alya Ahmad, Anne Links, Somnath Saha, Mary Catherine Beach, Mark Dredze
clinical natural language processing domain transfer bias stigmatizing language large language models mimic
Published: Nov. 6, 2023. Version: 1.0.0
Challenge Credentialed Access
SNOMED CT Entity Linking Challenge
Will Hardman, Mark Banks, Rory Davidson, Donna Truran, Nindya Widita Ayuningtyas, Hoa Ngo, Alistair Johnson, Tom Pollard
snomed entity linking clinical annotation
Published: July 22, 2025. Version: 1.1.0
Database Credentialed Access
MeDiSumQA: Patient-Oriented Question-Answer Generation from Discharge Letters
Amin Dada, Osman Alperen Koras, Marie Bauer, Amanda Butler, Kaleb Smith, Jens Kleesiek, Julian Friedrich
Published: May 5, 2025. Version: 1.0.0
Database Credentialed Access
A Temporal Dataset for Respiratory Support in Critically Ill Patients
Mira Moukheiber, Lama Moukheiber, Dana Moukheiber, Sicheng Hao, Leo Anthony Celi, Hyung-Chul Lee
observational data time-series
Published: April 15, 2025. Version: 1.1.0