Resources
Database Open Access
MIMIC-IV demo data in the Medical Event Data Standard (MEDS)
Robin Philippus van de Water, Ethan Steinberg, Michael Wornow, Patrick Rockenschaub, Matthew McDermott
ehr, critical care, electronic health record, mimic, machine learning, meds, medical event data standard
Published: Sept. 29, 2025. Version: 0.0.1
Database Credentialed Access
MIMIC-IV-Ext-Instr: A Dataset of 450K+ EHR-Grounded Instruction-Following Examples
Zhenbang Wu, Anant Dadu, Mike Nalls, Faraz Faghri, Jimeng Sun
large language models, medical question answering, instruction tuning
Published: Sept. 9, 2025. Version: 1.0.0
Database Credentialed Access
CXR-Align: A Benchmark for CXR-Report Alignment with Negations
Hanbin Ko
Published: Aug. 21, 2025. Version: 1.0.0
Database Credentialed Access
MIMIC-IV-Ext Cardiac Disease
Jiawei Cao, Sendong Zhao
Published: May 6, 2025. Version: 1.0.0
Database Credentialed Access
MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation
Philip Chung, Akshay Swaminathan, Alex Goodell, Yeasul Kim, Momsen Reincke, Lichy Han, Ben Deverett, Mohammad Amin Sadeghi, Abdel badih El Ariss, Marc Ghanem, David Seong, Andrew Lee, Caitlin Coombes, Brad Bradshaw, Mahir Sufian, Hyo Jung Hong, Teresa Nguyen, Mohammad Rasouli, Komal Kamra, Mark Burbridge, James McAvoy, Roya Saffary, Stephen Parnell Ma, Dev Dash, James Xie, Ellen Wang, Cliff Schmiesing, Nigam Shah, Nima Aghaeepour
artificial intelligence, natural language processing, clinical notes, electronic health records, large language models, brief hospital course, long-form text, chart review, text reranking, atomic claim, hybrid retrieval, clinical informatics, clinical medicine, fact verification, retrieval-augmented generation, logical atomism, text embedding, formal logic, llm-as-a-judge, llm evaluation
Published: April 9, 2025. Version: 1.0.0
Database Credentialed Access
EHRCon: Dataset for Checking Consistency between Unstructured Notes and Structured Tables in Electronic Health Records
Yeonsu Kwon, Jiho Kim, Gyubok Lee, Seongsu Bae, Daeun Kyung, Wonchul Cha, Tom Pollard, Alistair Johnson, Edward Choi
Published: March 19, 2025. Version: 1.0.1
Database Restricted Access
OpenOximetry Repository
Nicholas Fong, Michael Lipnick, Philip Bickler, John Feiner, Tyler Law
Published: Feb. 28, 2025. Version: 1.1.1
Database Credentialed Access
MIMIC-IV-Ext-BHC: Labeled Clinical Notes Dataset for Hospital Course Summarization
Asad Aali, Dave Van Veen, Yamin Arefeen, Jason Hom, Christian Bluethgen, Eduardo Pontes Reis, Sergios Gatidis, Namuun Clifford, Joseph Daws, Arash Tehrani, Jangwon Kim, Akshay Chaudhari
natural language processing, clinical notes, brief hospital course, text summarization, machine learning
Published: Feb. 3, 2025. Version: 1.2.0
Database Open Access
CGMacros: a scientific dataset for personalized nutrition and diet monitoring
Ricardo Gutierrez-Osuna, David Kerr, Bobak Mortazavi, Anurag Das
diabetes, continuous glucose monitors, machine learning, obesity, postprandial glucose response, food macronutrients, metabolic models, food photographs, personalized nutrition
Published: Jan. 28, 2025. Version: 1.0.0
Database Credentialed Access
TherLid: A Thermometry Linked Dataset
Jeremy Tan, Inês Martins, João Matos, Tiago Filipe Sousa Gonçalves, Tetsu Ohnuma, Jaime dos Santos Cardoso, Leo Anthony Celi, Vijay Krishnamoorthy, Andrea Lane, An Kwok Wong
thermometry, intensive care unit, health equity, electronic health records
Published: Jan. 21, 2025. Version: 1.0.0