Resources


Challenge Open Access

Early Prediction of Sepsis from Clinical Data: The PhysioNet/Computing in Cardiology Challenge 2019

Matthew Reyna, Chris Josef, Russell Jeter, Supreeth Shashikumar, Benjamin Moody, M. Brandon Westover, Ashish Sharma, Shamim Nemati, Gari D. Clifford

The 2019 PhysioNet/Computing in Cardiology Challenge invites participants to predict sepsis early from clinical data.

prediction sepsis challenge

Published: Aug. 5, 2019. Version: 1.0.0


Database Credentialed Access

Tasks 1 and 3 from Progress Note Understanding Suite of Tasks: SOAP Note Tagging and Problem List Summarization

Yanjun Gao, John Caskey, Timothy Miller, Brihat Sharma, Matthew Churpek, Dmitriy Dligach, Majid Afshar

We introduce a hierarchical annotation suite of tasks addressing clinical text understanding, reasoning and abstraction over evidence, and diagnosis summarization. One task tags the major sections of a SOAP note and the other generates diagnoses for a problem list summary.

Published: Sept. 30, 2022. Version: 1.0.0


Database Credentialed Access

CHIFIR: Cytology and Histopathology Invasive Fungal Infection Reports

Vlada Rozova, Anna Khanina, Jasmine Teng, Joanne Teh, Leon Worth, Monica Slavin, Karin Thursky, Karin Verspoor

A corpus of cytology and histopathology reports annotated for terminology relevant to fungal infections. Ideal for validation of named entity recognition and relation extraction methods.

nlp information extraction clinical documentation invasive fungal infections

Published: Feb. 20, 2024. Version: 1.0.2


Database Open Access

KINECAL

Sean Maudsley-Barton, Moi Hoon Yap

A dataset for balance falls-risk assessment and balance impairment analysis

balance posturography clinical tests postural sway falls-risk age-related changes

Published: June 8, 2023. Version: 1.0.3


Database Credentialed Access

CORAL: expert-Curated medical Oncology Reports to Advance Language model inference

Madhumita Sushil, Vanessa Kennedy, Divneet Mandair, Brenda Miao, Travis Zack, Atul Butte

Medical oncology progress notes annotated with advanced, comprehensive oncology-relevant concepts and relationships.

artificial intelligence electronic health records information extraction natural language processing oncology large language models

Published: Feb. 7, 2024. Version: 1.0


Database Contributor Review

CARMEN-I: A resource of anonymized electronic health records in Spanish and Catalan for training and testing NLP tools

Eulalia Farre Maduell, Salvador Lima-Lopez, Santiago Andres Frid, Artur Conesa, Elisa Asensio, Antonio Lopez-Rueda, Helena Arino, Elena Calvo, Maria Jesús Bertran, Maria Angeles Marcos, Montserrat Nofre Maiz, Laura Tañá Velasco, Antonia Marti, Ricardo Farreres, Xavier Pastor, Xavier Borrat Frigola, Martin Krallinger

CARMEN-I is a Spanish corpus of 2,000 clinical records from Hospital Clínic, Barcelona. It covers COVID-19 patients and their comorbidities, and serves as a resource for training clinical NLP models and for researchers applying NLP to clinical documents.

de-identification clinical ner anonymization

Published: April 20, 2024. Version: 1.0.1


Database Open Access

Facial and oral temperature data from a large set of human subject volunteers

Quanzeng Wang, Yangling Zhou, Pejman Ghassemi, Dwith Chenna, Michelle Chen, Jon Casamento, Joshua Pfefer, David McBride

Data for each subject include temperatures measured at 29 facial locations over four rounds with two infrared thermographs (IRTs), oral temperatures measured with a thermometer in two modes, subject demographics (gender, age, ethnicity), and environmental conditions.

clinical accuracy receiver operating characteristic curve infectious disease epidemics thermography fever screening thermometry inner canthus elevated body temperature facial maximum temperatures infrared thermograph pearson correlation coefficients

Published: May 24, 2023. Version: 1.0.0

