Resources
Database Restricted Access
MIMIC-IV-Ext-DiReCT
Bowen Wang, Jiuyang Chang, Yiming Qian
Published: Jan. 21, 2025. Version: 1.0.0
Database Credentialed Access
MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation
Philip Chung, Akshay Swaminathan, Alex Goodell, Yeasul Kim, Momsen Reincke, Lichy Han, Ben Deverett, Mohammad Amin Sadeghi, Abdel badih El Ariss, Marc Ghanem, David Seong, Andrew Lee, Caitlin Coombes, Brad Bradshaw, Mahir Sufian, Hyo Jung Hong, Teresa Nguyen, Mohammad Rasouli, Komal Kamra, Mark Burbridge, James McAvoy, Roya Saffary, Stephen Parnell Ma, Dev Dash, James Xie, Ellen Wang, Cliff Schmiesing, Nigam Shah, Nima Aghaeepour
artificial intelligence, natural language processing, clinical notes, large language models, brief hospital course, electronic health records, long-form text, chart review, text reranking, atomic claim, hybrid retrieval, clinical informatics, clinical medicine, fact verification, retrieval-augmented generation, logical atomism, text embedding, formal logic, llm-as-a-judge, llm evaluation
Published: April 9, 2025. Version: 1.0.0
Database Restricted Access
MIMIC-III-Ext-Synthetic-Clinical-Trial-Questions
Elizabeth Woo, Michael Craig Burkhart, Emily Alsentzer, Brett Beaulieu-Jones
large language models, synthetic data, distillation, clinical trial eligibility
Published: April 22, 2025. Version: 1.0.0
Database Contributor Review
BRATECA (Brazilian Tertiary Care Dataset): a Clinical Information Dataset for the Portuguese Language
Henrique Dias, Ana Helena Dias Pereira dos Ulbrich
prescriptions, exams, tertiary care, natural language processing, clinical notes
Published: July 14, 2022. Version: 1.1
Database Credentialed Access
MedNLI - A Natural Language Inference Dataset For The Clinical Domain
Chaitanya Shivade
natural language inference, recognizing textual entailment
Published: Oct. 1, 2019. Version: 1.0.0
Database Credentialed Access
MedDec: Medical Decisions for Discharge Summaries in the MIMIC-III Database
Mohamed Elgaar, Jiali Cheng, Nidhi Vakil, Hadi Amiri, Leo Anthony Celi
natural language processing, medical decisions, span classification, discharge summary, mimic
Published: Oct. 16, 2024. Version: 1.0.0
Database Credentialed Access
Nosocomial Risk Datasets from MIMIC-III
Travis Goodwin
pressure injury, risk prediction, acute kidney injury, anemia, forecasting, natural language processing, deep learning
Published: Sept. 15, 2022. Version: 1.0
Database Credentialed Access
DrugEHRQA: A Question Answering Dataset on Structured and Unstructured Electronic Health Records For Medicine Related Queries
Jayetri Bardhan, Anthony Colas, Kirk Roberts, Daisy Zhe Wang
Published: April 12, 2022. Version: 1.0.0
Challenge Credentialed Access
ArchEHR-QA: BioNLP at ACL 2025 Shared Task on Grounded Electronic Health Record Question Answering
Sarvesh Soni, Dina Demner-Fushman
electronic health record, question answering, clinicians, patient portals
Published: April 11, 2025. Version: 1.2
Database Credentialed Access
CLIP: A Dataset for Extracting Action Items for Physicians from Hospital Discharge Notes
James Mullenbach, Yada Pruksachatkun, Sean Adler, Jennifer Seale, Jordan Swartz, T Greg McKelvey, Yi Yang, David Sontag
Published: June 21, 2021. Version: 1.0.0