Resources


Database Credentialed Access

MedVH: Towards Systematic Evaluation of Hallucination for Large Vision Language Models in the Medical Context

Zishan Gu, Jiayuan Chen, Fenglin Liu, Changchang Yin, Ping Zhang

MedVH provides a visual hallucination evaluation benchmark for large vision language models in the medical context. It formulates tests using chest X-ray images, including multi-choice question answering and long-text generation tasks.

Published: March 11, 2025. Version: 1.0.0


Database Credentialed Access

ReXPref-Prior: A MIMIC-CXR Preference Dataset for Reducing Hallucinated Prior Exams in Radiology Report Generation

Oishi Banerjee, Hong-Yu Zhou, Subathra Adithan, Stephen Kwak, Kay Wu, Pranav Rajpurkar

We propose ReXPref-Prior, an adapted version of MIMIC-CXR where GPT-4 has removed references to prior exams from both findings and impression sections of chest X-ray reports.

chest x-rays reinforcement learning hallucination

Published: Aug. 14, 2024. Version: 1.0.0


Database Credentialed Access

Medical Expert Annotations of Unsupported Facts in Doctor-Written and LLM-Generated Patient Summaries

Stefan Hegselmann, Shannon Shen, Florian Gierse, Monica Agrawal, David Sontag, Xiaoyi Jiang

Annotations for unsupported facts in 100 original MIMIC patient summaries (discharge instructions) and hallucinations in 100 Large Language Model (LLM)-generated patient summaries, labeled by two medical experts.

Published: April 30, 2025. Version: 1.0.1


Database Credentialed Access

MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation

Philip Chung, Akshay Swaminathan, Alex Goodell, Yeasul Kim, Momsen Reincke, Lichy Han, Ben Deverett, Mohammad Amin Sadeghi, Abdel badih El Ariss, Marc Ghanem, David Seong, Andrew Lee, Caitlin Coombes, Brad Bradshaw, Mahir Sufian, Hyo Jung Hong, Teresa Nguyen, Mohammad Rasouli, Komal Kamra, Mark Burbridge, James McAvoy, Roya Saffary, Stephen Parnell Ma, Dev Dash, James Xie, Ellen Wang, Cliff Schmiesing, Nigam Shah, Nima Aghaeepour

A clinician-labeled dataset for fact-checking long-form clinical text against patient EHRs. The dataset contains LLM-written and human-written Brief Hospital Course summaries decomposed into atomic claim and sentence propositions with annotations.

artificial intelligence clinical notes natural language processing large language models brief hospital course electronic health records long-form text chart review text reranking atomic claim hybrid retrieval clinical informatics clinical medicine fact verification retrieval-augmented generation logical atomism text embedding formal logic llm-as-a-judge llm evaluation

Published: April 9, 2025. Version: 1.0.0


Database Restricted Access

CXRGraph: Using Information Extraction to Normalize the Training Data for Automatic Radiology Report Generation

Yuxiang Liao, Hoisang Heung, Hantao Liu, Irena Spasic

CXRGraph is a structured radiology report dataset built upon RadGraph and tailored for the Automatic Radiology Report Generation task. It can identify more task-relevant information such as abnormalities and hallucinated prior references.

relation extraction information extraction natural language processing named entity recognition structured radiology report

Published: Feb. 3, 2025. Version: 1.0.0


Database Credentialed Access

CORAL: expert-Curated medical Oncology Reports to Advance Language model inference

Madhumita Sushil, Vanessa Kennedy, Divneet Mandair, Brenda Miao, Travis Zack, Atul Butte

Medical oncology progress notes annotated with advanced, comprehensive oncology-relevant concepts and relationships.

artificial intelligence information extraction oncology natural language processing large language models electronic health records

Published: Feb. 7, 2024. Version: 1.0


Database Credentialed Access

CXR-PRO: MIMIC-CXR with Prior References Omitted

Vignav Ramesh, Nathan Chi, Pranav Rajpurkar

CXR-PRO is an adaptation of the MIMIC-CXR dataset (consisting of chest radiographs and their associated free-text radiology reports) with references to non-existent priors removed.

generation free-text radiology reports references to priors retrieval large language models

Published: Nov. 23, 2022. Version: 1.0.0