Resources


Challenge Credentialed Access

ShARe/CLEF eHealth 2013: Natural Language Processing and Information Retrieval for Clinical Care

Danielle Mowery

2013 ShARe/CLEF eHealth Evaluation Lab: Natural Language Processing and Information Retrieval for Clinical Care (Tasks 1 and 2).

natural language processing

Published: Feb. 15, 2013. Version: 1.0


Database Credentialed Access

MIMIC-CXR-Ext-ILS: Lesion Segmentation Masks and Instruction-Answer Pairs for Chest X-rays

Geon Choi, Hangyul Yoon, Hyunju Shin, et al.

Instruction-guided lesion segmentation data for chest X-rays, including 1.1M instruction-answer pairs and 91K segmentation masks covering seven major lesion types.

chest x-ray segmentation text-guided segmentation lesion segmentation

Published: March 25, 2026. Version: 1.0.0


Database Credentialed Access

CHIFIR: Cytology and Histopathology Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jasmine Teng, et al.

A corpus of cytology and histopathology reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp clinical documentation information extraction invasive fungal infections

Published: Feb. 20, 2024. Version: 1.0.2


Database Credentialed Access

PIFIR: PET-CT Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jeremy Ong, et al.

A corpus of PET-CT reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp clinical documentation information extraction invasive fungal infections

Published: Feb. 27, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-Ext-CXR-QBA: A Structured, Tagged, and Localized Visual Question Answering Dataset with Question-Box-Answer Triplets and Scene Graphs for Chest X-ray Images

Philip Müller, Friederike Jungmann, Georgios Kaissis, et al.

We present a large-scale CXR VQA dataset derived from MIMIC-CXR with 42M QA pairs, featuring hierarchical answers, bounding boxes, and structured tags. We generated the QA pairs using LLM-based extraction from radiology reports and localization models.

chest x-rays vqa localization scene graphs

Published: July 22, 2025. Version: 1.0.0


Database Open Access

bigP3BCI: An Open, Diverse and Machine Learning Ready P300-based Brain-Computer Interface Dataset

Boyla Mainsah, Chance Fleeting, Thomas Balmat, et al.

A collection of data from P300-based brain-computer interface studies.

brain-computer interface electroencephalography ieee p2731 working group standard amyotrophic lateral sclerosis p300 speller p300 event related potential oddball paradigm error-related potential

Published: May 19, 2025. Version: 1.0.0


Database Restricted Access

LATTE-CXR: Locally Aligned TexT and imagE, Explainable dataset for Chest X-Rays

Elham Ghelichkhan, Tolga Tasdizen

This dataset includes bounding box-statement pairs for chest X-ray images, derived from radiologists’ eye-tracking data (for explainability) and annotations, for local visual-language models.

eye-tracking chest x-ray dataset automatically generated dataset caption-guided object detection image captioning with region-level description grounded radiology report generation phrase grounding xai multi-modal learning local visual-language models localization

Published: Feb. 4, 2025. Version: 1.0.0