Challenge Open Access
Paroxysmal Atrial Fibrillation Events Detection from Dynamic ECG Recordings: The 4th China Physiological Signal Challenge 2021
Xingyao Wang , Caiyun Ma , Xiangyu Zhang , Hongxiang Gao , Gari D. Clifford , Chengyu Liu
Published: June 21, 2021. Version: 1.0.0
When using this resource, please cite:
Wang, X., Ma, C., Zhang, X., Gao, H., Clifford, G. D., & Liu, C. (2021). Paroxysmal Atrial Fibrillation Events Detection from Dynamic ECG Recordings: The 4th China Physiological Signal Challenge 2021 (version 1.0.0). PhysioNet. https://doi.org/10.13026/ksya-qw89.
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Abstract
Atrial fibrillation (AF) is the most frequent arrhythmia, but paroxysmal atrial fibrillation (PAF) often remains unrecognized. Early detection of PAF is of great value for choosing AF surgery options, for drug intervention, and for the diagnosis and treatment of various clinical complications. Although accurate detection of PAF is very important, there is currently no algorithm that can efficiently locate the onset and end of AF episodes in dynamic or wearable ECGs. Previous AF detection algorithms usually focus on classifying AF rhythm rather than locating the onsets and ends of AF episodes, which limits their clinical significance for the personalized treatment and management of AF patients. In clinical applications, the identification of AF rhythm is also affected by other abnormal rhythms.
The 4th China Physiological Signal Challenge 2021 (CPSC 2021) aims to encourage the development of algorithms for locating AF episodes in dynamic ECG records. A new dynamic ECG database was constructed to encourage the development of more efficient and robust algorithms for PAF detection. We also developed a new scoring metric to evaluate detection methods for PAF events.
Objective
The 4th China Physiological Signal Challenge 2021 (CPSC 2021) aims to encourage the development of algorithms for detecting paroxysmal atrial fibrillation (PAF) events in dynamic ECG records.
The ECG plays an important role in non-invasive monitoring and clinical diagnosis of cardiovascular disease (CVD). AF is the most frequent arrhythmia, but PAF often remains unrecognized [1, 2]. Early screening for and detection of PAF are therefore particularly important: they are of great value for choosing AF surgery options, for drug intervention, and for the diagnosis and treatment of various clinical complications.
The number and duration of AF episodes may help explain differences in underlying pathophysiological AF mechanisms and related clinical outcomes. Ultimately, this could help personalize AF therapy and may have clinical utility for assessing AF treatment response [3]. Although accurate detection of PAF is very important, there is currently no algorithm that can efficiently screen AF episodes in long-term dynamic or wearable ECGs [4]. Previous AF detection algorithms usually focus on classifying AF rhythm, such as entropy feature-based [5, 6] or machine learning-based methods [7, 8], without locating the onsets and ends of AF episodes. This limits their clinical significance for the personalized treatment and management of AF patients. In clinical applications, the identification of AF rhythm is also affected by other abnormal rhythms. In this year’s challenge, we focus on the detection of PAF episodes in dynamic ECGs. A new dynamic ECG database containing episodes consisting entirely or partially of AF rhythms, or non-AF rhythms, was constructed to encourage the development of more efficient and robust algorithms for PAF detection.
Participation
Sample Submission
A simple example algorithm is provided and may be used as a template for your own submission. Python and MATLAB implementations, python_entry and matlab_entry, are available in the Files section. Please visit the GitHub repositories (Python, MATLAB) for more details and up-to-date versions. Similar to last year’s Challenge, teams must submit both the code for their models and the code for training their models (if you are using supervised learning methods).
Note that the predicted result for each record must be saved separately and named after the record. For example, for record “data_0_1”: if your development environment is Python, save the result as ‘data_0_1.json’ in the format {‘predict_endpoints’: [[s0, e0], [s1, e1], …, [sm-1, em-1]]}; if your development environment is MATLAB, save the variable predict_endpoints = [[s0, e0], [s1, e1], …, [sm-1, em-1]] as ‘data_0_1.mat’. The specific instructions are listed below.
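The Python result format described above can be sketched as follows. This is a minimal illustration, not the code from the example entry: the function name save_prediction and its arguments are hypothetical, and only the file naming and the ‘predict_endpoints’ key are taken from the text.

```python
import json
import os


def save_prediction(record_name, endpoints, save_dir):
    """Write one record's AF episode endpoints to <save_dir>/<record_name>.json.

    `endpoints` is a list of [onset, end] pairs in sample indices.
    The function name and signature are illustrative assumptions.
    """
    # Coerce to plain ints so NumPy integer scalars do not break json.dump.
    clean = [[int(s), int(e)] for s, e in endpoints]
    for s, e in clean:
        if s > e:
            raise ValueError(f"onset {s} is after end {e}")
    os.makedirs(save_dir, exist_ok=True)
    path = os.path.join(save_dir, record_name + ".json")
    with open(path, "w") as f:
        json.dump({"predict_endpoints": clean}, f)
    return path
```

For record “data_0_1” this writes ‘data_0_1.json’; a record with no detected AF episode is saved with an empty list.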
Preparation and submission instructions
- Create a private GitHub or GitLab repository for your code. We recommend cloning our example code and replacing it with your code. Add CPSC-Committee as a collaborator to your repository.
- Add your algorithm code to your repository. Like the example code, your code must be in the root directory of the master branch.
- Do not include extra files that are not required to create and run your code, such as the training data.
- Follow the instructions below for the programming language of your submission.
- Register through http://www.cpscsub.com/Users/Register. We will clone your repository using the HTTPS URL that ends in .git. On GitHub, you can get this URL by clicking on “Clone or download” and copying the URL, e.g., https://github.com/CPSC-Committee/cpsc2021-python-entry.git. Please see the instructions on GitHub for information on how to get your .git URL.
- We will put the scores for successful entries on the leaderboard. The leaderboard will publicly show your team name, run time, and score. Note that the run time is limited: it cannot exceed 1 minute per record on average. If your code exceeds this limit, a timeout error will be returned.
Python-specific instructions
- Using our Sample Python entry (https://github.com/CPSC-Committee/cpsc2021-python-entry) as a template, format your code in the following way. Consider downloading this repository, replacing our code with your code, and adding the updated files to your repository.
- requirements.txt: Add Python packages to be installed with pip. Specify the versions of these packages that you are using on your machine. Remove unnecessary packages, such as Matplotlib, that your submitted code does not need.
- AUTHORS.txt, LICENSE.txt, README.md: Update as appropriate. Please include your authors.
- entry_2021.py: Update this script to load and run your trained model. During testing, we will first run
python entry_2021.py TEST_SET_PATH RESULT_SAVE_PATH
to generate the prediction results for the subsequent scoring step.
- utils_2021.py: Do not change this script. It is a script containing the functions for the baseline method.
- score_2021.py: Do not change this script. It is a sample script for reference to help participants test their code in their local environment. Another standardized score script will be used in the back-end server.
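As a rough sketch of what a script invoked with TEST_SET_PATH and RESULT_SAVE_PATH might do, the code below loops over records and writes one JSON file per record. It is not the sample entry itself: detect_af_episodes is a hypothetical placeholder for your model, and reading the record names from a RECORDS file in the test directory is an assumption.

```python
import json
import os
import sys


def detect_af_episodes(record_path):
    """Hypothetical placeholder: return [[onset, end], ...] for one record."""
    return []  # a real entry would load the signal and run its model here


def run(test_set_path, result_save_path, detector=detect_af_episodes):
    """Run the detector on every listed record and save one JSON per record."""
    os.makedirs(result_save_path, exist_ok=True)
    # Assumption: a RECORDS file lists one record name per line.
    with open(os.path.join(test_set_path, "RECORDS")) as f:
        records = [line.strip() for line in f if line.strip()]
    for name in records:
        endpoints = detector(os.path.join(test_set_path, name))
        out_path = os.path.join(result_save_path, name + ".json")
        with open(out_path, "w") as out:
            json.dump({"predict_endpoints": endpoints}, out)
    return records


def main(argv):
    """Command-line form: python entry_2021.py TEST_SET_PATH RESULT_SAVE_PATH."""
    if len(argv) != 3:
        raise SystemExit("usage: python entry_2021.py TEST_SET_PATH RESULT_SAVE_PATH")
    run(argv[1], argv[2])
```

Ending the script with a call to main(sys.argv) gives the command-line behaviour. Keeping the detector as a parameter makes it easy to time a run locally against the 1 minute per record limit.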
MATLAB-specific instructions
- Confirm that your MATLAB code compiles and runs in MATLAB R2020b or R2021a.
- Using our sample MATLAB entry (https://github.com/CPSC-Committee/cpsc2021-matlab-entry) as a template, format your code in the following way. Consider downloading this repository, replacing our code with your code, and adding the updated files to your repository.
- AUTHORS.txt, LICENSE.txt, README.md: Update as appropriate. Please include your authors. Unfortunately, our submission system will be unable to read your README file to change how we run your code.
- comp_cosEn.m: Do not edit this script. It extracts features from the ECG recordings.
- qrs_detect.m: Do not edit this script. It extracts the position of the R peak.
- challenge.m: Update this script to load and run your model weights and any parameters from files in your submission. It takes the input sample path and returns the predicted endpoints.
- Result.m: Update this script to load and run your model weights and any parameters from files in your submission. It takes the input sample name, sample path, and save path, returns the predicted endpoints, and saves them to a specific path with a file name based on the submission name.
- Add your code to the root/base directory of the master branch of your repository.
- We will download your code and compile it using the MATLAB compiler. The script that outputs each result is compiled by running
mcc -m challenge.m -a .
and the script that saves each result to a specific path, with a file name based on the submission name, is compiled by running
mcc -m Result.m -a .
The compiled programs are then run on our back-end server.
Data Description
Data are recorded from 12-lead Holter or 3-lead wearable ECG monitoring devices. The challenge data consist of variable-length ECG records extracted from lead I and lead II of the long-term dynamic ECGs, each sampled at 200 Hz. To avoid ambiguity in annotation, an AF episode must contain at least 5 heartbeats.
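Because annotated AF episodes contain at least 5 beats, a detector may want to discard shorter candidate episodes before submitting. The sketch below shows one way to do this. It assumes a sorted list of R-peak sample indices is available (for example from a QRS detector); the function names are illustrative, not part of the example entry.

```python
from bisect import bisect_left, bisect_right

FS = 200       # sampling rate in Hz, as stated in the data description
MIN_BEATS = 5  # minimum number of beats in an annotated AF episode


def samples_to_seconds(n_samples, fs=FS):
    """Convert a sample count or index to seconds."""
    return n_samples / fs


def beats_in_episode(r_peaks, onset, end):
    """Count R peaks whose sample index lies in [onset, end].

    `r_peaks` must be sorted in ascending order.
    """
    return bisect_right(r_peaks, end) - bisect_left(r_peaks, onset)


def drop_short_episodes(r_peaks, episodes, min_beats=MIN_BEATS):
    """Keep only the [onset, end] pairs that span at least `min_beats` beats."""
    return [[s, e] for s, e in episodes
            if beats_in_episode(r_peaks, s, e) >= min_beats]
```

Whether such filtering helps depends on the detector; it only mirrors the annotation rule and is not required by the challenge.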
The training set in the 1st stage consists of 730 records, extracted from the Holter records from 12 AF patients (5 PAF patients) and 42 non-AF patients (usually including other abnormal and normal rhythms).
The training set in the 2nd stage consists of 706 records from 37 AF patients (18 PAF patients) and 14 non-AF patients.
The test set comprises data from the same source as the training set as well as from a different data source. We ensure that at least one test subset was collected by a different ECG monitoring system compared with the training set. Similar to previous years, we are not planning to release the test set at any point.
All data is provided in WFDB format and the annotations are standardized according to PhysioBank Annotations. The annotation includes the beat annotations (R peak location and beat type), the rhythm annotations (rhythm change flag and rhythm type) and the diagnosis of the global rhythm. Please refer to the example code entry of the challenge for specific data and label load functions. Note that the flag of atrial fibrillation and atrial flutter (‘AFIB’ and ‘AFL’) in the annotated information is seen as the same type in the scoring method.
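To illustrate how rhythm-change flags can be turned into episode endpoints, the sketch below works on two parallel lists of the kind a WFDB annotation reader exposes (for instance the sample and aux_note attributes returned by wfdb.rdann). The flag strings ‘(AFIB’, ‘(AFL’ and ‘(N’, and the choice to end an episode at the annotation carrying the terminating flag, are assumptions made here; the loader in the example entry is the authoritative reference.

```python
AF_FLAGS = {"(AFIB", "(AFL"}  # assumed flag strings; both are treated as AF


def af_episodes(samples, aux_notes, last_sample):
    """Turn rhythm-change flags into [[onset, end], ...] in sample indices.

    samples[i] is the sample index of annotation i and aux_notes[i] is its
    rhythm flag ('' when the rhythm does not change at that beat).
    """
    episodes, onset = [], None
    for s, note in zip(samples, aux_notes):
        if note in AF_FLAGS:
            if onset is None:
                onset = s  # AF (or flutter) starts here
        elif note and onset is not None:
            # Assumption: the episode ends at the annotation with the non-AF flag.
            episodes.append([onset, s])
            onset = None
    if onset is not None:
        episodes.append([onset, last_sample])  # AF runs to the end of the record
    return episodes
```

Consecutive ‘(AFIB’ and ‘(AFL’ flags are merged into a single episode, which matches the statement that both are scored as the same type.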
Please download the training data from Training_set_I and Training_set_II.
Evaluation
For this year’s Challenge, we developed a new scoring metric that rewards the correct detection of paroxysmal AF events. The scoring metric includes two steps:
- the first step is to classify the rhythm types: non-AF rhythm (N), persistent AF rhythm (AFf) and paroxysmal AF rhythm (AFp).
- the second step is to locate the onset and end points for all AF episode predictions.
The participants are only required to provide the final onset and end locations for AF episodes. If the current ECG record is classified as AFf, the provided onset and end locations should be the first and last record points. If the ECG record is classified as N, the answer should be an empty list.
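The rule above for turning a record-level rhythm class into the submitted answer can be sketched as below. The labels ‘N’, ‘AFf’ and ‘AFp’ come from the text; the function name and the zero-based indexing of the first and last record points are assumptions.

```python
def final_endpoints(rhythm, n_samples, episodes=()):
    """Return the endpoint list to submit for one record.

    rhythm    : 'N', 'AFf' or 'AFp'
    n_samples : record length in samples
    episodes  : detected [onset, end] pairs, used only for 'AFp'
    """
    if rhythm == "N":
        return []  # no AF: empty answer
    if rhythm == "AFf":
        # Persistent AF: one episode spanning the whole record
        # (zero-based sample indices assumed).
        return [[0, n_samples - 1]]
    if rhythm == "AFp":
        return [[int(s), int(e)] for s, e in episodes]
    raise ValueError(f"unknown rhythm label: {rhythm!r}")
```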
Figure 1. The scoring matrix for the global rhythm classification.
A scoring matrix (as shown in Figure 1) is designed to first reward correct answers and penalize misdiagnoses among the three rhythm types. Let $U_r$ be the score for this step.
Figure 2. Grading instance for onset and end detections of AF episodes.
Then the detection of the onset and end points of each AF episode is scored (reward only), based on the agreement between the detected onset (or end) points and the annotated answers. Figure 2 gives a detailed explanation. Let $U_e$ be the score for the detection of onset and end points. $U_e$ is increased by 1 if a detected onset (or end) point is within ±1 beat of the annotated position, and by 0.5 if it is within ±2 beats. Note that a paroxysmal AF record may contain multiple AF episodes, and each onset and end point is scored (rewarded) separately. A weight $\frac{M_a}{\max\{M_r, M_a\}}$ is assigned to $U_e$, where $M_a$ and $M_r$ are the numbers of AF episodes in the annotated answers and in the predictions for each record, respectively. Over-estimating the number of AF episodes may therefore lower the score. The final score is the sum of $U_r$ and the weighted $U_e$, and is defined as:
$$U = \frac{1}{N} \sum_{i=1}^{N} \left( U_{r_i} + \frac{M_{a_i}}{\max\{M_{r_i}, M_{a_i}\}} \times U_{e_i} \right)$$
where $N$ is the number of test records. The score is calculated for each record and then averaged over the entire test set. For example, for an ECG record with only one paroxysmal AF episode, if the algorithm classifies the record correctly, the first score is $U_r = 1$. Then the predicted onset and end of the AF episode are checked. If both the onset and end points are within ±1 beat of the annotated position, the second score is $U_e = 2$. Thus, the final score is $U = 3$.
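The second scoring step can be approximated in a few lines for local experiments. The sketch below is one interpretation, not the official scoring script: it measures beat distance between the R peaks nearest to the predicted and annotated points, assumes the weight takes the form M_a / max(M_r, M_a) (annotated over the larger of predicted and annotated episode counts), and takes the rhythm score U_r as an input because the Figure 1 matrix values are not reproduced here. All function names are hypothetical.

```python
from bisect import bisect_left


def nearest_beat(r_peaks, sample):
    """Index of the R peak closest to a sample position (r_peaks sorted)."""
    i = bisect_left(r_peaks, sample)
    if i == 0:
        return 0
    if i == len(r_peaks):
        return len(r_peaks) - 1
    return i if r_peaks[i] - sample < sample - r_peaks[i - 1] else i - 1


def point_reward(r_peaks, predicted, annotated_points):
    """1 if within 1 beat of an annotated point, 0.5 if within 2 beats, else 0."""
    if not annotated_points:
        return 0.0
    p = nearest_beat(r_peaks, predicted)
    d = min(abs(p - nearest_beat(r_peaks, a)) for a in annotated_points)
    if d <= 1:
        return 1.0
    return 0.5 if d <= 2 else 0.0


def episode_score(r_peaks, predicted, annotated):
    """Weighted U_e for one record, using the assumed weight M_a / max(M_r, M_a)."""
    m_r, m_a = len(predicted), len(annotated)
    if m_r == 0 or m_a == 0:
        return 0.0
    onsets = [a[0] for a in annotated]
    ends = [a[1] for a in annotated]
    u_e = sum(point_reward(r_peaks, s, onsets) + point_reward(r_peaks, e, ends)
              for s, e in predicted)
    return u_e * m_a / max(m_r, m_a)


def challenge_score(per_record):
    """Average of (U_r + weighted U_e) over records.

    `per_record` is a list of (u_r, r_peaks, predicted, annotated) tuples.
    """
    totals = [u_r + episode_score(rp, pred, ann) for u_r, rp, pred, ann in per_record]
    return sum(totals) / len(totals)
```

With one annotated episode and one prediction whose onset and end both fall within one beat, episode_score returns 2. Adding a second, spurious predicted episode halves the weight, which shows how over-estimation lowers the score.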
Conflicts of Interest
The authors have no conflicts of interest to declare.
References
[1] G. F. Michaud and W. G. Stevenson, "Atrial Fibrillation," New England Journal of Medicine, vol. 384, no. 4, pp. 353-361, 2021.
[2] C. Ma, S. Wei, T. Chen, J. Zhong, Z. Liu, and C. Liu, "Integration of results from convolutional neural network in a support vector machine for the detection of atrial fibrillation," IEEE Transactions on Instrumentation and Measurement, 2020.
[3] R. R. De With et al., "Temporal patterns and short-term progression of paroxysmal atrial fibrillation: data from RACE V," EP Europace, vol. 22, no. 8, pp. 1162-1172, 2020.
[4] H. Baumgartner et al., "2020 ESC Guidelines for the management of adult congenital heart disease: The Task Force for the management of adult congenital heart disease of the European Society of Cardiology (ESC). Endorsed by: Association for European Paediatric and Congenital Cardiology (AEPC), International Society for Adult Congenital Heart Disease (ISACHD)," European Heart Journal, vol. 42, no. 6, pp. 563-645, 2021.
[5] C. Liu et al., "A comparison of entropy approaches for AF discrimination," Physiological Measurement, vol. 39, no. 7, p. 074002, 2018.
[6] L. Zhao, C. Liu, S. Wei, Q. Shen, F. Zhou, and J. Li, "A new entropy-based atrial fibrillation detection method for scanning wearable ECG recordings," Entropy, vol. 20, no. 12, p. 904, 2018.
[7] V. Kalidas and L. S. Tamil, "Detection of atrial fibrillation using discrete-state Markov models and Random Forests," Computers in Biology and Medicine, vol. 113, p. 103386, 2019.
[8] X. Zhang, J. Li, Z. Cai, L. Zhang, Z. Chen, and C. Liu, "Over-fitting suppression training strategies for deep learning-based atrial fibrillation detection," Medical & Biological Engineering & Computing, vol. 59, no. 1, pp. 165-173, 2021.
Access
Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.
License (for files):
Creative Commons Attribution 4.0 International Public License
Discovery
DOI (version 1.0.0):
https://doi.org/10.13026/ksya-qw89
DOI (latest version):
https://doi.org/10.13026/h37m-r024
Topics:
event detection
paroxysmal atrial fibrillation
Project Website:
http://www.icbeb.org/CPSC2021
Files
Total uncompressed size: 1.3 GB.
Access the files
- Download the ZIP file (1.0 GB)
- Download the files using your terminal:
wget -r -N -c -np https://physionet.org/files/cpsc2021/1.0.0/
- Download the files using AWS command line tools:
aws s3 sync s3://physionet-open/cpsc2021/1.0.0/ DESTINATION
Name | Size | Modified |
---|---|---|
Training_set_I | | |
Training_set_II | | |
matlab_entry | | |
python_entry | | |
LICENSE.txt | 14.5 KB | 2021-06-15 |
README.md | 211 B | 2021-04-06 |
RECORDS | 36.2 KB | 2021-06-14 |
SHA256SUMS.txt | 397.5 KB | 2021-06-21 |
figure1 | 141.8 KB | 2021-05-27 |
figure2 | 390.4 KB | 2021-05-27 |