Resources
Database Credentialed Access
EHR-DS-QA: A Synthetic QA Dataset Derived from Medical Discharge Summaries for Enhanced Medical Information Retrieval Systems
Konstantin Kotschenreuther
mimic-iv clinical question-answering medical discharge summaries large language models
Published: Jan. 11, 2024. Version: 1.0.0
Challenge Credentialed Access
SNOMED CT Entity Linking Challenge
Will Hardman, Mark Banks, Rory Davidson, Donna Truran, Nindya Widita Ayuningtyas, Hoa Ngo, Alistair Johnson, Tom Pollard
snomed entity linking clinical annotation
Published: July 22, 2025. Version: 1.1.0
Database Credentialed Access
MIMIC-III Clinical Database
Alistair Johnson, Tom Pollard, Roger Mark
MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The databas…
clinical intensive care critical care natural language processing machine learning
Published: Sept. 4, 2016. Version: 1.4
Database Credentialed Access
Tasks 1 and 3 from Progress Note Understanding Suite of Tasks: SOAP Note Tagging and Problem List Summarization
Yanjun Gao, John Caskey, Timothy Miller, Brihat Sharma, Matthew Churpek, Dmitriy Dligach, Majid Afshar
Published: Sept. 30, 2022. Version: 1.0.0
Database Credentialed Access
Deidentified Medical Text
Margaret Douglass, Bill Long, George Moody, Peter Szolovits, Li-wei Lehman, Roger Mark, Gari D. Clifford
medical text nursing notes hipaa de-identification
Published: Dec. 18, 2007. Version: 1.0
Model Credentialed Access
Characterization of Stigmatizing Language in Medical Records
Keith Harrigian, Ayah Zirikly, Brant Chee, Alya Ahmad, Anne Links, Somnath Saha, Mary Catherine Beach, Mark Dredze
clinical natural language processing domain transfer bias stigmatizing language large language models mimic
Published: Nov. 6, 2023. Version: 1.0.0
Database Credentialed Access
Annotation dataset of problematic opioid use and related contexts from MIMIC-III Critical Care Database discharge summaries
Melissa Poulsen, Vanessa Troiani, Philip Freda, Danielle Mowery, Anahita Davoudi
opioid use disorder substance use natural language processing clinical notes
Published: Feb. 8, 2023. Version: 1.0.0
Database Credentialed Access
EHRNoteQA: An LLM Benchmark for Real-World Clinical Practice Using Discharge Summaries
Sunjun Kweon, Jiyoun Kim, Heeyoung Kwak, Dongchul Cha, Hangyul Yoon, Kwang Hyun Kim, Jeewon Yang, Seunghyun Won, Edward Choi
Published: June 26, 2024. Version: 1.0.1