Resources


Model Credentialed Access

Shareable Artificial Intelligence to Extract Cancer Outcomes from Electronic Health Records for Precision Oncology Research

Kenneth Kehl, Pavel Trukhanov, Christopher Fong, Justin Jee, Karl Pichotta, Morgan Paul, Chelsea Nichols, Michele Waters, Nikolaus Schultz, Deborah Schrag

The DFCI-imaging-student and DFCI-medonc-student AI models extract cancer outcomes from imaging reports and medical oncologist notes in electronic health records.

Published: Oct. 24, 2024. Version: 1.0.0


Database Contributor Review

InReDD-Dataset-PAN924

Caio Uehara Martins, Camila Tirapelli, Hugo Gaêta-Araujo, Jose Augusto Baranauskas, Breno Zancan, Jose Carneiro, Alessandra Macedo

InReDD‑Dataset-V1 is a collection of 924 anonymised panoramic dental radiographs curated by the Interdisciplinary Research Group in Digital Dentistry (InReDD) at the University of São Paulo.

Published: Nov. 22, 2025. Version: 1.0.0


Database Credentialed Access

Bridge2AI-Voice Pediatric Dataset

Yael Bensoussan, Alexandros Sigaras, Anais Rameau, Olivier Elemento, Maria Powell, David Dorr, Philip Payne, Vardit Ravitsky, Jean-Christophe Bélisle-Pipon, Ruth Bahr, Stephanie Watts, Donald Bolser, Jennifer Siu, Jordan Lerner-Ellis, Frank Rudzicz, Micah Boyer, Yassmeen Abdel-Aty, Toufeeq Ahmed Syed, James Anibal, Dona Amraei, Stephen Aradi, Kirollos Armosh, Ana Sophia Martinez, Shaheen Awan, Steven Bedrick, Helena Beltran, Alexander Bernier, Moroni Berrios, Isaac Bevers, Alden Blatter, Rahul Brito, Amy Brown, Johnathan Brown, Léo Cadillac, Selina Casalino, John Costello, Abhijeet Dalal, Iris De Santiago, Enrique Diaz-Ocampo, Amanda Doherty-Kirby, Mohamed Ebraheem, Ellie Eiseman, Mahmoud Elmahdy, Renee English, Emily Evangelista, Kenneth Fletcher, Hortense Gallois, Gaelyn Garrett, Alexander Gelbard, Anna Goldenberg, Karim Hanna, William Hersh, Jennifer Jain, Lochana Jayachandran, Kaley Jenney, Kathy Jenkins, Stacy Jo, Alistair Johnson, Ayush Kalia, Megha Kalia, Zoha Khawa, Cindy Kostelnik, Alisa Krause, Andrea Krussel, Elisa Lapadula, Genelle Leo, Justin Levinsky, Chloe Loewith, Radhika Mahajan, Vrishni Maharaj, Siyu Miao, LeAnn Michaels, Matthew Mifsud, Marian Mikhael, Elijah Moothedan, Yosef Nafii, Tempestt Neal, Karlee Newberry, Evan Ng, Christopher Nickel, Amanda Peltier, Trevor Pharr, Michaela Pnacekova, Matthew Pontell, Claire Premi-Bortolotto, Parnaz Rafatjou, JM Rahman, John Ramos, Sarah Rohde, Michael de Riesthal, Jillian Rossi, Laurie Russell, Samantha Salvi Cruz, Joyce Samuel, Suketu Shah, Ahmed Shawkat, Elizabeth Silberholz, John Stark, Lala Su, Shrramana Ganesh Sudhakar, Duncan Sutherland, Venkata Swarna Mukhi, Jeffrey Tang, Luka Taylor, Jamie Toghranegar, Julie Tu, Megan Urbano, Gavin Victor, Kimberly Vinson, Jordan Wilke, Claire Wilson, Madeleine Zanin, Xijie Zeng, Theresa Zesiewicz, Robin Zhao, Pantelis Zisimopoulos, Satrajit Ghosh

A dataset of questionnaire responses, spectrograms, and other information for pediatric participants, collected for the Bridge2AI Voice as a Biomarker of Health project.

voice bridge2ai

Published: Dec. 17, 2025. Version: 1.0.0


Database Credentialed Access

Bridge2AI-Voice: An ethically-sourced, diverse voice dataset linked to health information

Yael Bensoussan, Alexandros Sigaras, Anais Rameau, Olivier Elemento, Maria Powell, David Dorr, Philip Payne, Vardit Ravitsky, Jean-Christophe Bélisle-Pipon, Ruth Bahr, Stephanie Watts, Donald Bolser, Jennifer Siu, Jordan Lerner-Ellis, Frank Rudzicz, Micah Boyer, Yassmeen Abdel-Aty, Toufeeq Ahmed Syed, James Anibal, Dona Amraei, Stephen Aradi, Kirollos Armosh, Ana Sophia Martinez, Shaheen Awan, Steven Bedrick, Helena Beltran, Alexander Bernier, Moroni Berrios, Isaac Bevers, Alden Blatter, Rahul Brito, Amy Brown, Johnathan Brown, Léo Cadillac, Selina Casalino, John Costello, Abhijeet Dalal, Iris De Santiago, Enrique Diaz-Ocampo, Amanda Doherty-Kirby, Mohamed Ebraheem, Ellie Eiseman, Mahmoud Elmahdy, Renee English, Emily Evangelista, Kenneth Fletcher, Hortense Gallois, Gaelyn Garrett, Alexander Gelbard, Anna Goldenberg, Karim Hanna, William Hersh, Jennifer Jain, Lochana Jayachandran, Kaley Jenney, Kathy Jenkins, Stacy Jo, Alistair Johnson, Ayush Kalia, Megha Kalia, Zoha Khawa, Cindy Kostelnik, Alisa Krause, Andrea Krussel, Elisa Lapadula, Genelle Leo, Justin Levinsky, Chloe Loewith, Radhika Mahajan, Vrishni Maharaj, Siyu Miao, LeAnn Michaels, Matthew Mifsud, Marian Mikhael, Elijah Moothedan, Yosef Nafii, Tempestt Neal, Karlee Newberry, Evan Ng, Christopher Nickel, Amanda Peltier, Trevor Pharr, Michaela Pnacekova, Matthew Pontell, Claire Premi-Bortolotto, Parnaz Rafatjou, JM Rahman, John Ramos, Sarah Rohde, Michael de Riesthal, Jillian Rossi, Laurie Russell, Samantha Salvi Cruz, Joyce Samuel, Suketu Shah, Ahmed Shawkat, Elizabeth Silberholz, John Stark, Lala Su, Shrramana Ganesh Sudhakar, Duncan Sutherland, Venkata Swarna Mukhi, Jeffrey Tang, Luka Taylor, Jamie Toghranegar, Julie Tu, Megan Urbano, Gavin Victor, Kimberly Vinson, Jordan Wilke, Claire Wilson, Madeleine Zanin, Xijie Zeng, Theresa Zesiewicz, Robin Zhao, Pantelis Zisimopoulos, Satrajit Ghosh

A dataset of features from voice recordings and metadata to enable the development, benchmarking, and validation of clinically applicable machine-learning models for diagnosing a wide range of health conditions.

voice bridge2ai

Published: Dec. 16, 2025. Version: 3.0.0


Database Restricted Access

KURIAS-ECG: a 12-lead electrocardiogram database with standardized diagnosis ontology

Hakje Yoo, Yunjin Yum, Soowan Park, Jeong Moon Lee, Moonjoung Jang, Yoojoong Kim, Jong-Ho Kim, Hyun-Joon Park, Kap Su Han, Jae Hyoung Park, Hyung Joon Joo

The KURIAS-ECG database is a high-quality 12-lead ECG database that uses standard vocabularies (SNOMED CT, OMOP-CDM). Its ECG diagnoses are grouped into 10 diagnoses by applying the Minnesota code.

snomed minnesota 12-lead ecg

Published: Nov. 8, 2021. Version: 1.0


Database Open Access

Myocardial perfusion scintigraphy image database

Wesley Calixto, Solange Nogueira, Fernanda Luz, Thiago Fellipe Ortiz de Camargo

This database provides a collection of myocardial perfusion scintigraphy images. The dataset encompasses a diversity of clinical cases, including various perfusion patterns and underlying cardiac conditions.

nifti artificial intelligence anonymization clinical diagnosis myocardial perfusion systems modeling myocardial perfusion scintigraphy dicom metadata ventricular walls coronary artery disease convolutional neural networks automated segmentation

Published: Sept. 9, 2025. Version: 1.0.0


Database Credentialed Access

MIMIC-III-Ext-VeriFact-BHC: Labeled Propositions From Brief Hospital Course Summaries for Long-form Clinical Text Evaluation

Philip Chung, Akshay Swaminathan, Alex Goodell, Yeasul Kim, Momsen Reincke, Lichy Han, Ben Deverett, Mohammad Amin Sadeghi, Abdel badih El Ariss, Marc Ghanem, David Seong, Andrew Lee, Caitlin Coombes, Brad Bradshaw, Mahir Sufian, Hyo Jung Hong, Teresa Nguyen, Mohammad Rasouli, Komal Kamra, Mark Burbridge, James McAvoy, Roya Saffary, Stephen Parnell Ma, Dev Dash, James Xie, Ellen Wang, Cliff Schmiesing, Nigam Shah, Nima Aghaeepour

A clinician-labeled dataset for fact-checking long-form clinical text against patient EHRs. The dataset contains LLM-written and human-written Brief Hospital Course summaries decomposed into atomic claim and sentence propositions with annotations.

artificial intelligence natural language processing clinical notes electronic health records large language models brief hospital course long-form text chart review text reranking atomic claim hybrid retrieval clinical informatics clinical medicine fact verification retrieval-augmented generation logical atomism text embedding formal logic llm-as-a-judge llm evaluation

Published: April 9, 2025. Version: 1.0.0


Database Credentialed Access

RadGraph-XL: A Large-Scale Expert-Annotated Dataset for Entity and Relation Extraction from Radiology Reports

Jean-Benoit Delbrouck

RadGraph-XL is a large, expert-annotated dataset of 2,300 radiology reports covering multiple modalities and anatomies. It enables accurate extraction of clinical entities and relations for downstream medical AI tasks.

Published: Sept. 12, 2025. Version: 1.0.0

