Database Open Access
VTaC: A Benchmark Dataset of Ventricular Tachycardia Alarms from ICU Monitors
Li-wei Lehman, Benjamin Moody, Lucas McCullum, Hasan Saeed, Harsh Deep, Diane Perry, Tristan Struja, Qiao Li, Gari Clifford, Roger Mark
Published: Oct. 1, 2024. Version: 1.0
When using this resource, please cite:
Lehman, L., Moody, B., McCullum, L., Saeed, H., Deep, H., Perry, D., Struja, T., Li, Q., Clifford, G., & Mark, R. (2024). VTaC: A Benchmark Dataset of Ventricular Tachycardia Alarms from ICU Monitors (version 1.0). PhysioNet. https://doi.org/10.13026/8td2-g363.
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Abstract
False arrhythmia alarms are a persistent problem in intensive care units (ICUs) despite considerable effort from industrial and academic algorithm developers. Among the various arrhythmias, ventricular tachycardia (VT) is particularly challenging to detect accurately, as achieving both high sensitivity and high positive predictivity has proven difficult. We present an annotated VT alarm database, VTaC (Ventricular Tachycardia annotated alarms from ICUs), consisting of over 5,000 waveform recordings with VT alarms triggered by bedside monitors in ICUs. Each VT alarm in the dataset was labeled as true or false by at least two independent human expert annotators. The dataset comprises data collected from ICUs in three major US hospitals and includes data from three leading bedside monitor manufacturers, providing a diverse and representative collection of VT alarm waveform data. Each waveform recording comprises at least two electrocardiogram (ECG) leads and one or more pulsatile waveforms, such as photoplethysmogram (PPG or PLETH) and arterial blood pressure (ABP) waveforms.
Background
Bedside monitors in intensive care units (ICUs) produce a large volume of alarms, a considerable proportion of which are false [1,2,3,4]. False arrhythmia alarms pose a significant challenge, leading to alarm fatigue among healthcare providers and potentially compromising patient care. They not only increase the cognitive load on clinicians but can also mask true arrhythmias, thereby endangering patient safety. Ventricular tachycardia (VT) alarms are among the most frequently occurring life-threatening arrhythmia alarms [1], and false VT alarms have proven to be the most difficult to detect reliably [3,4,5,6,7].
Several annotated VT alarm waveform datasets have been made publicly available. For instance, Aboukhalil et al. [5] developed false arrhythmia alarm reduction algorithms using 1,900 annotated VT alarms from the MIMIC II database [8]. The PhysioNet Challenge 2015 dataset includes a total of 562 annotated VT alarms, with 341 alarms in the training set and 221 in the test set [3,4,9].
While previous algorithms [3,4,5,6,7] show promise in identifying false arrhythmia alarms, they have often been developed or evaluated using single, small, or relatively homogeneous datasets. This limitation hinders their generalizability and real-world applicability. The annotated VT alarm dataset introduced here offers an opportunity to overcome this challenge. By encompassing data from diverse sources, it enables the evaluation and refinement of algorithms in a broader context, spanning a wider range of monitoring devices and clinical settings.
Methods
We extracted and compiled a dataset of 18,465 VT alarm waveform events, derived from 2,376 unique patient waveform records from bedside monitors produced by three leading commercial vendors [10]. These records were sourced from multiple ICUs across three major U.S. hospitals, ensuring a diverse and representative collection of waveform data. Each recording in our dataset consists of a 10-minute segment surrounding the onset of the ventricular tachycardia (VT) alarm, capturing 5 minutes of waveform data before the alarm and 5 minutes after. To maintain diversity, we randomly selected up to five alarm events from any individual patient record, resulting in a total of 5,742 events for annotation. This approach ensures a balanced sampling of events across different patient records, avoiding the over-representation of any single record. All waveform records were de-identified to remove identifiable information such as patient names, dates, and medical record numbers. The signals were uniformly resampled to 250 Hz, and all signal labels were standardized to align with the nomenclature used in the PhysioNet Challenge 2015 database [3,4,9].
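The per-record sampling step described above can be sketched as follows. This is only an illustration of "randomly select up to five events per record", not the authors' actual selection code; the function name, the seed, and the record and event names are hypothetical.

```python
import random


def sample_events(events_by_record, max_per_record=5, seed=0):
    """Randomly keep at most `max_per_record` events for each record.

    `events_by_record` maps a record name to a list of its alarm events.
    The seed is an assumption added here for reproducibility.
    """
    rng = random.Random(seed)
    selected = {}
    for record, events in events_by_record.items():
        k = min(max_per_record, len(events))
        # random.Random.sample draws k distinct items without replacement.
        selected[record] = rng.sample(events, k)
    return selected


# Hypothetical input: one record with many alarms, one with few.
alarms = {
    "recA": ["recA_ev%d" % i for i in range(12)],
    "recB": ["recB_ev0", "recB_ev1"],
}
chosen = sample_events(alarms)
print({record: len(events) for record, events in chosen.items()})
```

Capping the count per record, rather than sampling uniformly over all events, is what keeps records with many alarms from dominating the annotated set.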
Annotation Process
Following the PhysioNet Challenge 2015, a VT episode is defined as five or more consecutive ventricular beats with a heart rate higher than 100 beats per minute (bpm) [3,9]. Expert annotators were randomly assigned batches of VT alarms to annotate. Each VT alarm event was reviewed and labeled independently by at least two annotators. Annotation was performed using an open-source annotation platform, PhysioTag [11,12], which enables experts to collaboratively annotate physiological waveform records using a standard web browser. Please see [11] for a detailed description of the annotation platform and an illustrative example of the user interface for annotating the VT alarms. For our task, annotators were given four options: “True” when they believed the alarm was correct, “False” when they believed the alarm was incorrect, “Uncertain” when they were unsure which annotation to assign, and “Reject” when the alarm was unreadable due to noise, artifacts, or other reasons. When two annotators disagreed, an adjudication process was used to resolve the conflict, either through direct one-on-one discussion between the annotators involved or by an adjudicator’s vote to break the tie. The annotation team consists of six annotators: an arrhythmia analysis expert physician, a highly experienced board-certified cardiac arrhythmia technician, three clinicians, and one biomedical signal processing engineer specializing in arrhythmia.
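The VT definition above (five or more consecutive ventricular beats at more than 100 bpm) can be expressed as a small check over a labeled beat sequence. This is a minimal sketch, not the annotators' procedure or any monitor's algorithm. It assumes beat times and ventricular/non-ventricular labels are already available, and it assumes the heart rate of a run is computed from the mean beat-to-beat interval across five consecutive beats; the source does not specify how the rate is measured.

```python
def find_vt_run(beat_times, is_ventricular, min_beats=5, min_rate_bpm=100.0):
    """Return (start, end) beat indices of the first qualifying run, or None.

    beat_times: beat times in seconds, in increasing order.
    is_ventricular: one boolean per beat.
    A run qualifies when `min_beats` consecutive ventricular beats have a
    mean rate above `min_rate_bpm` (assumption: rate from mean interval).
    """
    n = len(beat_times)
    i = 0
    while i < n:
        if not is_ventricular[i]:
            i += 1
            continue
        # Extend j to the last beat of this uninterrupted ventricular run.
        j = i
        while j + 1 < n and is_ventricular[j + 1]:
            j += 1
        # Slide a window of `min_beats` beats across the run.
        for start in range(i, j - min_beats + 2):
            end = start + min_beats - 1
            duration = beat_times[end] - beat_times[start]
            if duration > 0 and 60.0 * (min_beats - 1) / duration > min_rate_bpm:
                return (start, end)
        i = j + 1
    return None


# Hypothetical beats: five ventricular beats 0.4 s apart (about 150 bpm).
times = [0.0, 0.8, 1.6, 2.0, 2.4, 2.8, 3.2, 4.0, 4.8]
labels = [False, False, True, True, True, True, True, False, False]
print(find_vt_run(times, labels))
```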
A total of 5,742 events were annotated by at least two independent annotators. The two annotators reached unanimous decisions on 4,534 (78.96%) events, whereas 1,208 (21.04%) events received conflicting labels. Among the events with conflicting decisions, 816 (67.55%) were adjudicated. After removing the 392 un-adjudicated events, a total of 5,350 alarm events received final labeling decisions. After excluding "Rejected" and "Uncertain" events, the final dataset in this release contains 5,037 events.
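The event counts in this paragraph follow from one another, and the chain can be checked with a few lines of arithmetic. The sketch below uses only the counts stated in the text; the variable names are illustrative.

```python
total_annotated = 5742   # events annotated by at least two annotators
unanimous = 4534         # events on which the two annotators agreed
adjudicated = 816        # conflicting events that were later adjudicated
final_release = 5037     # events kept in this release

conflicting = total_annotated - unanimous           # events with disagreement
unadjudicated = conflicting - adjudicated           # conflicts left unresolved
with_final_decision = unanimous + adjudicated       # events with a final label
rejected_or_uncertain = with_final_decision - final_release


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


print("conflicting:", conflicting, percent(conflicting, total_annotated))
print("unanimous share:", percent(unanimous, total_annotated))
print("adjudicated share of conflicts:", percent(adjudicated, conflicting))
print("un-adjudicated:", unadjudicated)
print("with final decision:", with_final_decision)
print("rejected or uncertain:", rejected_or_uncertain)
```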
Data Description
The VTaC dataset [10] contains 5,037 annotated VT alarm events, of which 1,441 (28.61%) are true alarms. The VT alarms were automatically generated by commercial bedside patient monitors in ICUs of three major US hospitals. Each waveform record in the current release consists of a 6-minute segment that encompasses the onset of the VT alarm: 5 minutes of waveform data preceding the alarm onset and 1 minute following it. Thus, the alarm onset is at the 5-minute mark of each waveform record. Each waveform recording contains ECG leads and one or more pulsatile waveforms (photoplethysmogram and/or arterial blood pressure waveforms). For a more detailed description of the VTaC dataset, please see [10].
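Because every record is sampled at 250 Hz with the alarm at the 5-minute mark, the onset can be converted to a sample index. The sketch below assumes zero-based indexing and that the onset falls exactly on sample 5 × 60 × 250; actual record lengths should be checked against the data. The helper name is hypothetical.

```python
FS_HZ = 250                 # sampling frequency stated for all signals
PRE_ALARM_SEC = 5 * 60      # waveform data before the alarm onset
POST_ALARM_SEC = 1 * 60     # waveform data after the alarm onset

ONSET_INDEX = FS_HZ * PRE_ALARM_SEC                            # 75000
EXPECTED_SAMPLES = FS_HZ * (PRE_ALARM_SEC + POST_ALARM_SEC)    # 90000


def window_around_onset(n_samples, sec_before, sec_after,
                        fs=FS_HZ, onset=ONSET_INDEX):
    """Return (start, stop) sample indices for a window around the onset.

    The window is clipped to [0, n_samples] so it is safe to use as a slice.
    """
    start = max(0, onset - int(round(sec_before * fs)))
    stop = min(n_samples, onset + int(round(sec_after * fs)))
    return start, stop


# The 10 seconds immediately before the alarm in a full-length record.
start, stop = window_around_onset(EXPECTED_SAMPLES, sec_before=10, sec_after=0)
print(start, stop)  # 72500 75000
```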
The waveform files (in the waveforms folder) are stored in WFDB format [13], the PhysioNet-recommended format for waveform recordings. The data are organized by patient waveform record: each sub-folder represents one patient waveform record and contains up to five waveform records, each corresponding to a VT alarm event selected from that patient record. All signals were uniformly resampled to 250 Hz.
The final labeling decisions for the VT alarm events are listed in the CSV file event_labels.csv. Each row represents the final labeling decision for one VT alarm. The first column, record, contains the patient waveform record name, and the event column contains the unique event name corresponding to a VT alarm in that patient waveform record. As part of the de-identification, all waveform record and event names are randomly generated surrogate names. In this dataset, a patient waveform record can have up to five VT alarm events. The decision column contains the final decision of the human annotators for the corresponding VT alarm.
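A label file with this layout can be summarized using only the Python standard library. The sketch below assumes the CSV has a header row naming the record, event, and decision columns, and that decisions are stored as the strings "True" and "False"; both should be verified against the real file. The sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter


def count_decisions(label_file):
    """Count rows per value of the `decision` column of an open CSV file."""
    return Counter(row["decision"] for row in csv.DictReader(label_file))


# Hypothetical stand-in for event_labels.csv (names and values invented).
sample = io.StringIO(
    "record,event,decision\n"
    "recA,recA_ev1,True\n"
    "recA,recA_ev2,False\n"
    "recB,recB_ev1,False\n"
)
counts = count_decisions(sample)
print(dict(counts))
```

With the real file, the same function can be called as `count_decisions(open("event_labels.csv", newline=""))`.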
The CSV file benchmark_data_split.csv contains the train/validation/test set split used by the machine learning algorithms for VT false alarm reduction in [10].
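Grouping events by their assigned subset might look like the sketch below. The column names `event` and `split` and the subset labels are assumptions, since the layout of benchmark_data_split.csv is not described here; check the file's header row and pass the real column names.

```python
import csv
import io
from collections import defaultdict


def group_events_by_split(split_file, event_col="event", split_col="split"):
    """Map each subset name to the list of event names assigned to it.

    `event_col` and `split_col` are assumed column names, not confirmed ones.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(split_file):
        groups[row[split_col]].append(row[event_col])
    return dict(groups)


# Hypothetical stand-in for benchmark_data_split.csv.
sample_split = io.StringIO(
    "event,split\n"
    "recA_ev1,train\n"
    "recA_ev2,train\n"
    "recB_ev1,val\n"
    "recC_ev1,test\n"
)
groups = group_events_by_split(sample_split)
print({name: len(events) for name, events in groups.items()})
```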
Usage Notes
This dataset is intended for developing algorithms for false arrhythmia alarm reduction. Please cite [10] when using this dataset. Code and scripts for the machine learning algorithms for VT false alarm reduction reported in [10] can be found on the project GitHub website [14]. A limitation of this dataset is the absence of detailed clinical information accompanying the waveform recordings. We leave the collection of matched clinical data to future work.
To visualize the waveform records, please click on the "Visualize Waveform" button above the file panel, and select the records and events of interest to display. The WFDB Python package [15] can be used to load the waveform records in WFDB format.
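A minimal loading sketch with the WFDB Python package follows. `wfdb.rdrecord` and the `sig_name`, `fs`, and `p_signal` attributes are part of that package; the directory layout (waveforms/<record>/<event>) is an assumption inferred from the data description, and the helper names and example names are hypothetical.

```python
import os


def event_record_path(root, record, event):
    """Build the WFDB record path (without file extension) for one event.

    Assumes each event is stored as waveforms/<record>/<event> under `root`.
    """
    return os.path.join(root, "waveforms", record, event)


def load_event(root, record, event):
    """Read one alarm event and return (signal names, sampling rate, samples)."""
    import wfdb  # imported lazily so the path helper works without the package

    rec = wfdb.rdrecord(event_record_path(root, record, event))
    # p_signal is a 2-D array with one row per sample and one column per signal.
    return rec.sig_name, rec.fs, rec.p_signal


print(event_record_path("vtac-1.0", "recA", "recA_ev1"))
```

After downloading the dataset, `load_event` can be called with the local root folder and a record and event name taken from event_labels.csv.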
Release Notes
This is the first release of the VTaC dataset (Version 1.0).
Ethics
This dataset is a compilation of patient waveforms sourced from multiple institutions, and the collection of these data was approved by the Institutional Review Board (IRB) of each participating institution. The project was approved by the Institutional Review Board of the Massachusetts Institute of Technology (Cambridge, MA). The requirement for individual patient consent was waived because the project did not impact clinical care and all protected health information was de-identified.
Acknowledgements
This research is funded by NIH Grant R01 EB030362.
Conflicts of Interest
None.
References
1. Drew, B.J., Harris, P., Zègre-Hemsey, J.K., Mammone, T., Schindler, D., Salas-Boni, R., Bai, Y., Tinoco, A., Ding, Q., & Hu, X. (2014). Insights into the problem of alarm fatigue with physiologic monitor devices: a comprehensive observational study of consecutive intensive care unit patients. PloS ONE, 9(10), e110274. https://doi.org/10.1371/journal.pone.0110274
2. Cvach, M. (2012). Monitor alarm fatigue: an integrative review. Biomedical Instrumentation & Technology, 46(4), 268–277. https://doi.org/10.2345/0899-8205-46.4.268
3. Clifford, G.D., Silva, I., Moody, B., Li, Q., Kella, D., Shahin, A., Kooistra, T., Perry, D., & Mark, R.G. (2015). The PhysioNet/Computing in Cardiology Challenge 2015: Reducing false arrhythmia alarms in the ICU. In 2015 Computing in Cardiology Conference (CinC), pages 273–276.
4. Clifford, G.D., Silva, I., Moody, B., Li, Q., Kella, D., Chahin, A., Kooistra, T., Perry, D., & Mark, R.G. (2016). False alarm reduction in critical care. Physiological Measurement, 37(8), E5–E23. https://doi.org/10.1088/0967-3334/37/8/E5
5. Aboukhalil, A., Nielsen, L., Saeed, M., Mark, R.G., & Clifford, G.D. (2008). Reducing false alarm rates for critical arrhythmias using the arterial blood pressure waveform. Journal of Biomedical Informatics, 41(3), 442–451. https://doi.org/10.1016/j.jbi.2008.03.003
6. Lehman, E.P., Krishnan, R.G., Zhao, X., Mark, R.G., & Lehman, L.H. (2018). Representation learning approaches to detect false arrhythmia alarms from ECG dynamics. In Machine Learning for Healthcare Conference, pages 571–586. PMLR.
7. Zhou, Y., Zhao, G., Li, J., Sun, G., Qian, X., Moody, B., Mark, R.G., & Lehman, L.H. (2022). A contrastive learning approach for ICU false arrhythmia alarm reduction. Nature Scientific Reports, 12(1). https://doi.org/10.1038/s41598-022-07761-9
8. Saeed, M., Villarroel, M., Reisner, A.T., Clifford, G., Lehman, L., Moody, G., Heldt, T., Kyaw, T.H., Moody, B., & Mark, R.G. (2011). Multiparameter intelligent monitoring in intensive care II (MIMIC-II): A public-access intensive care unit database. Critical Care Medicine, 39(5), 952–960. https://doi.org/10.1097/CCM.0b013e31820a92c6
9. Clifford, G., Silva, I., Moody, B., & Mark, R.G. (2015). Reducing false arrhythmia alarms in the ICU: The PhysioNet/Computing in Cardiology Challenge 2015 (version 1.0.0). PhysioNet. https://doi.org/10.13026/c9fg-a467
10. Lehman, L.H., Moody, B., Deep, H., Wu, F., Saeed, H., McCullum, L., Perry, D., Struja, T., Li, Q., Clifford, G., & Mark, R.G. (2023). VTaC: A benchmark dataset of ventricular tachycardia alarms from ICU monitors. Proceedings of the 37th International Conference on Neural Information Processing Systems (NeurIPS 2023), Datasets and Benchmarks Track.
11. McCullum, L., Saeed, H., Moody, B., Perry, D., Gottlieb, E., Pollard, T., Borrat, X., Li, Q., Clifford, G., Mark, R.G., & Lehman, L.H. (2022). PhysioTag: An open-source platform for collaborative annotation of physiological waveforms. Computing in Cardiology (CinC).
12. McCullum, L., Moody, B., Saeed, H., Pollard, T., Borrat Frigola, X., Lehman, L., & Mark, R. (2023). PhysioTag: An open-source platform for collaborative annotation of physiological waveforms (version 1.0.0). PhysioNet. https://doi.org/10.13026/g06j-3612
13. Moody, G., Pollard, T., & Moody, B. (2022). WFDB software package (version 10.7.0). PhysioNet. https://doi.org/10.13026/gjvw-1m31
14. Code repository for the VTaC project. https://github.com/ML-Health/VTaC (Accessed August 31, 2024).
15. Xie, C., McCullum, L., Johnson, A., Pollard, T., Gow, B., & Moody, B. (2023). Waveform Database Software Package (WFDB) for Python (version 4.1.0). PhysioNet. https://doi.org/10.13026/9njx-6322
Access
Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.
License (for files):
Creative Commons Attribution-ShareAlike 4.0 International Public License
Discovery
DOI (version 1.0):
https://doi.org/10.13026/8td2-g363
DOI (latest version):
https://doi.org/10.13026/z4f3-1f07
Topics:
arrhythmia
machine learning
icu false alarms
benchmark dataset
ventricular tachycardia
Project Website:
https://github.com/ML-Health/VTaC
Files
Total uncompressed size: 4.0 GB.
Access the files
- Download the ZIP file (2.7 GB)
- Download the files using your terminal:
  wget -r -N -c -np https://physionet.org/files/vtac/1.0/
Name | Size | Modified |
---|---|---|
waveforms | | |
LICENSE.txt | 16.0 KB | 2024-09-13 |
RECORDS | 142.7 KB | 2024-05-15 |
SHA256SUMS.txt | 964.6 KB | 2024-10-01 |
benchmark_data_split.csv | 121.6 KB | 2024-08-10 |
event_labels.csv | 121.6 KB | 2024-07-13 |
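After downloading, the files can be checked against SHA256SUMS.txt. The sketch below assumes each line of that file has the usual checksum-tool layout of a hexadecimal digest followed by a path relative to the dataset root; this layout is an assumption and should be confirmed by looking at the file. The function names are hypothetical.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(root, sums_name="SHA256SUMS.txt"):
    """Return relative paths that are missing or whose digest differs.

    Assumes lines of the form "<hex digest> <relative path>".
    """
    mismatches = []
    with open(os.path.join(root, sums_name), "r", encoding="utf-8") as f:
        for line in f:
            parts = line.strip().split(None, 1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            expected = parts[0].lower()
            rel_path = parts[1].lstrip("*")  # "*" marks binary mode in some tools
            full_path = os.path.join(root, rel_path)
            if not os.path.isfile(full_path) or sha256_of_file(full_path) != expected:
                mismatches.append(rel_path)
    return mismatches
```

Calling `find_mismatches` on the local dataset root returns an empty list when every listed file is present and intact.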