Challenge Open Access
You Snooze You Win: The PhysioNet/Computing in Cardiology Challenge 2018
Mohammad Ghassemi , Benjamin Moody , Li-wei Lehman , Roger Mark , Gari D. Clifford
Published: Feb. 21, 2018. Version: 1.0.0
Community forum for the 2018 PhysioNet/CinC Challenge (Feb. 23, 2019, 12:54 p.m.)
If you have any questions or comments regarding this challenge, please post them directly in our Community Discussion Forum. This will increase transparency (benefiting all the competitors) and ensure that all the challenge organizers see your question.
2018 Challenge publications (Feb. 22, 2019, midnight)
- Publications from the 2018 Challenge are now available.
Journal of Physiological Measurement is now accepting submissions for the 2018 Challenge (Nov. 28, 2018, midnight)
- The Journal of Physiological Measurement is now accepting submissions in the focus issue: Automated Analysis of Arousals, Sleep and Sleep-Related Disorders Using Physiological Time Series. Please include updated results from the test dataset if you develop a new algorithm. Note that the scope does not confine you to the data and topic of this year's Challenge. Please see here for more information on the scope and how to submit. The deadline is set for the end of February 2019.
Official results for the 2018 Challenge (Oct. 10, 2018, midnight)
- Official results, as well as a paper describing the Challenge, are now available. Top scores were achieved by
- Matthew Howe-Patterson, Bahareh Pourbabaee, and Frederic Benard (0.54)
- Guðni Fannar Kristjansson, Heiðar Már Þráinsson, Hanna Ragnarsdóttir, Bragi Marinósson, Eysteinn Gunnlaugsson, Eysteinn Finnsson, Sigurður Ægir Jónsson, Halla Helgadóttir, and Jón Skírnir Ágústsson (0.45)
- Runnan He, Kuanquan Wang, Yang Liu, Na Zhao, Yongfeng Yuan, Qince Li, and Henggui Zhang (0.43)
- An unofficial entry from Hongyang Li and Yuanfang Guan (who unfortunately missed the deadline to submit an abstract) achieved a score of 0.55.
Submission is now closed for the 2018 Challenge (Sept. 1, 2018, midnight)
- Challenge entry submission is now closed. Authors must select the entry they wish to be run against the full Challenge data set before 10 Sept 2018 (midnight GMT, 7pm EDT). After this deadline the most recent entry (only) will be selected and evaluated on the full test data.
- Winners will be announced and prizes presented at Computing in Cardiology on 26 Sept 2018.
Notices of acceptance have been sent for CinC abstracts (June 20, 2018, midnight)
- CinC abstract acceptances/rejections have now been sent out.
- If your abstract was rejected, do not be despondent - please see the Rules and Deadlines below for a second chance to have your abstract accepted and stay in the official running for the Challenge. This chance is open to everyone, not just those with rejected abstracts.
- If your abstract was accepted, please log in to the conference site and agree that you will attend. Then, you must submit a full article describing your results and mark it as a preprint (for others to read) by September 15th. (Don't forget that the competition deadline is noon GMT on the 1st September - this deadline will not be extended)
- If you will need a visa to attend CinC, please do this immediately, and follow these instructions.
Top scores in the unofficial phase of the 2018 Challenge (May 7, 2018, midnight)
- Top scores in the Unofficial Phase were achieved by Matthew HP and Bahareh Pourbabaee with a score of 0.439, Yang Liu and Runnan He with a score of 0.244, and Márton Görög, Bálint Varga, and Péter Hajas with a score of 0.228.
- An updated set of arousal files are now available.
- A Matlab implementation of the scoring function is now available.
- Both the Python and Matlab sample submissions have been updated to use the new scoring code.
Python scoring function now available for the 2018 Challenge (April 19, 2018, midnight)
- A Python implementation of the scoring function (gross area under precision-recall curve) is available here.
A second sample submission is available for the 2018 Challenge (April 10, 2018, midnight)
- A second sample submission, implemented using Matlab, has been posted.
- Be sure to submit your abstract for Computing in Cardiology before April 15! See the CinC site for more information. Please submit an abstract with your training results, even if you have not yet been able to obtain a score on the test set.
Submission and scoring functions for the 2018 Challenge are now online. (April 7, 2018, midnight)
- A sample submission and scoring function are now online.
- The entry submission system is now open; you can access it here.
- The deadline for the Unofficial Phase has been extended. The new deadline is noon GMT, April 13.
The 2018 PhysioNet Challenge is now open (Feb. 21, 2018, midnight)
The 2018 PhysioNet/Computing in Cardiology Challenge is now open. This year's Challenge, "You Snooze, You Win", is focused on the problem of automatically detecting disturbances in a patient's sleep. Given a collection of physiological signals recorded during sleep, including EEG, ECG, and EMG, participants are invited to develop an algorithm to automatically identify arousal events, and will be scored based on how well their algorithm's results agree with those of expert human annotators.
2018 Challenge data is now available (Feb. 21, 2018, midnight)
- PhysioNet Challenge data can now be downloaded here.
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Introduction
At the end of last year, American scientists Jeffrey Hall, Michael Rosbash and Michael Young received the Nobel Prize in Physiology or Medicine “for their discoveries of molecular mechanisms controlling the circadian rhythm”, the mechanism that regulates sleep (Osborn, 2017). The precise reasons why humans sleep (and even how much sleep we need) remain a topic of scientific inquiry. Contemporary theorists indicate that sleep may be responsible for learning and/or the clearing of neural waste products (Ogilvie and Patel, 2017).
While the precise reasons why we sleep are not perfectly understood, there is consensus on the importance of sleep for our overall health and well-being. Inadequate sleep is associated with a wide range of negative outcomes including: impaired memory and learning, obesity, irritability, cardiovascular dysfunction, hypotension, diminished immune function (Harvard Medical School, 2006), depression (Nutt et al., 2008), and reduced quality of life (Lee, 2009). Further studies even suggest causal links between quality of sleep and important outcomes including mental health.
It follows that improving the quality of sleep could be used to improve a range of societal health outcomes more generally. Of course, the treatment of sleep disorders is necessarily preceded by the diagnosis of sleep disorders. Traditionally, such diagnoses are developed in sleep laboratory settings, where polysomnography, audio, and videography of the sleeping subject may be carefully inspected by sleep experts to identify potential sleep disorders.
One of the more well-studied sleep disorders is Obstructive Sleep Apnea Hypopnea Syndrome (or simply, apnea). Apneas are characterized by a complete collapse of the airway, leading to awakening, and consequent disturbances of sleep. While apneas are arguably the best understood of sleep disturbances, they are not the only cause of disturbance. Sleep arousals can also be spontaneous, result from teeth grinding, partial airway obstructions, or even snoring. In this year's PhysioNet Challenge we will use a variety of physiological signals, collected during polysomnographic sleep studies, to detect these other sources of arousal (non-apnea) during sleep.
Challenge Data
Data for this challenge were contributed by the Massachusetts General Hospital’s (MGH) Computational Clinical Neurophysiology Laboratory (CCNL), and the Clinical Data Animation Center (CDAC). The dataset includes 1,985 subjects who were monitored at an MGH sleep laboratory for the diagnosis of sleep disorders. The data were partitioned into balanced training (n = 994) and test (n = 989) sets.
The sleep stages of the subjects were annotated by clinical staff at the MGH according to the American Academy of Sleep Medicine (AASM) manual for the scoring of sleep. More specifically, the following six sleep stages were annotated in 30 second contiguous intervals: wakefulness, stage 1, stage 2, stage 3, rapid eye movement (REM), and undefined.
Certified sleep technologists at the MGH also annotated waveforms for the presence of arousals that interrupted the sleep of the subjects. The annotated arousals were classified as either: spontaneous arousals, respiratory effort related arousals (RERA), bruxisms, hypoventilations, hypopneas, apneas (central, obstructive and mixed), vocalizations, snores, periodic leg movements, Cheyne-Stokes breathing or partial airway obstructions.
The subjects had a variety of physiological signals recorded as they slept through the night including: electroencephalography (EEG), electrooculography (EOG), electromyography (EMG), electrocardiography (EKG), and oxygen saturation (SaO2). Excluding SaO2, all signals were sampled at 200 Hz and were measured in microvolts. For analytic convenience, SaO2 was resampled to 200 Hz, and is measured as a percentage.
Objective of the Challenge
The goal of the challenge is to use information from the available signals to correctly classify target arousal regions. For the purpose of the Challenge, target arousals are defined as regions where either of the following conditions were met:
- From 2 seconds before a RERA arousal begins, up to 10 seconds after it ends or,
- From 2 seconds before a non-RERA, non-apnea arousal begins, up to 2 seconds after it ends.
Please note that regions falling within 10 seconds before or after a subject wakes up, has an apnea arousal, or a hypopnea arousal will not be scored for the Challenge.
We have pre-computed the target arousals for you. They are contained in a sample-wise vector (described below in “Accessing the Data”), marked by “1”. Regions that will not be scored are marked by a “-1”, and regions that will be penalized if marked by your algorithm are marked by “0”. You do not need to recompute these scores.
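Although the targets are pre-computed, it can help to see the rules above written out. The following is a minimal sketch, not the organizers' code: the function name `build_targets`, the event tuples, and the choice that unscored regions cover the excluded event itself and override any overlapping target are all assumptions made for illustration.

```python
FS = 200  # sampling frequency in Hz, as stated for the Challenge signals


def build_targets(n_samples, arousals, excluded, fs=FS):
    """Return a sample-wise list of labels: 1 (target), 0 (non-arousal), -1 (unscored).

    arousals: list of (start_s, end_s, kind), kind being "rera" or "other"
    excluded: list of (start_s, end_s) for apnea/hypopnea arousals or wake periods
    Both inputs are hypothetical structures used only for this sketch.
    """
    labels = [0] * n_samples

    def mark(start_s, end_s, value):
        # Convert seconds to sample indices and clip to the record length.
        lo = max(0, int(round(start_s * fs)))
        hi = min(n_samples, int(round(end_s * fs)))
        for i in range(lo, hi):
            labels[i] = value

    # Target regions: 2 s before onset; 10 s after a RERA, 2 s after other arousals.
    for start_s, end_s, kind in arousals:
        post = 10.0 if kind == "rera" else 2.0
        mark(start_s - 2.0, end_s + post, 1)

    # Unscored regions: 10 s on either side of each excluded event.
    # Assumption: the event itself is also unscored and takes precedence.
    for start_s, end_s in excluded:
        mark(start_s - 10.0, end_s + 10.0, -1)

    return labels


# One minute of data with a RERA, a spontaneous arousal, and an apnea.
example = build_targets(60 * FS, [(10, 12, "rera"), (30, 31, "other")], [(50, 52)])
```

The distributed arousal files remain the reference; a sketch like this is mainly useful for sanity-checking your understanding of the three label values.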
Accessing the Data
The Challenge data repository contains two directories (training and test), each approximately 135 GB in size. Each directory contains one subdirectory per subject (e.g. training/tr03-0005). Each subdirectory contains signal, header, and arousal files; for example:
- tr03-0005.mat: a Matlab V4 file containing the signal data.
- tr03-0005.hea: record header file - a text file which describes the format of the signal data.
- tr03-0005.arousal: arousal and sleep stage annotations, in WFDB annotation format.
- tr03-0005-arousal.mat: a Matlab V7 structure containing a sample-wise vector with three distinct values (+1, 0, -1) where:
  - +1: Designates arousal regions
  - 0: Designates non-arousal regions
  - -1: Designates regions that will not be scored
Table 1 lists functions that can be used to import the data into Python, Matlab, and C programs.
File type | Python | Matlab | C / C++ |
---|---|---|---|
Signal (.mat) and header (.hea) files | wfdb.rdrecord | rdmat | isigopen |
Arousal annotation files (.arousal) | wfdb.rdann | rdann | annopen |
Arousal files (.mat) | scipy.io.loadmat | load | libmatio |
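In practice the functions in Table 1 (for example wfdb.rdrecord) read the header for you. To illustrate what a .hea file describes, here is a minimal sketch that parses the general WFDB header layout using only the standard library. The header text in `EXAMPLE_HEADER` is hypothetical (invented signal names, gains, and checksums), and `parse_header` is an illustrative helper, not part of any library.

```python
# Hypothetical header contents, written in the general WFDB header layout:
#   record line:  <record name> <number of signals> <sampling frequency> <number of samples>
#   signal lines: <file> <format> <gain>/<units> <ADC res> <ADC zero> <init> <checksum> <block> <description>
EXAMPLE_HEADER = """\
tr03-0005 3 200 6000
tr03-0005.mat 16+24 1/uV 16 0 -17 -11163 0 F3-M2
tr03-0005.mat 16+24 1/uV 16 0 5 1234 0 ECG
tr03-0005.mat 16+24 1/% 16 0 97 -5678 0 SaO2
"""


def parse_header(text):
    """Extract record-level fields and per-signal names/units from header text."""
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")]
    name, n_sig, fs, n_samples = lines[0].split()[:4]
    signals = []
    for ln in lines[1:1 + int(n_sig)]:
        # The description is the ninth field and may itself contain spaces.
        fields = ln.split(None, 8)
        gain_field = fields[2]
        units = gain_field.split("/", 1)[1] if "/" in gain_field else ""
        signals.append({"name": fields[8] if len(fields) > 8 else "", "units": units})
    return {
        "record": name,
        "fs": float(fs),
        "n_samples": int(n_samples),
        "duration_s": int(n_samples) / float(fs),
        "signals": signals,
    }


info = parse_header(EXAMPLE_HEADER)
print(info["record"], info["fs"], info["duration_s"], [s["name"] for s in info["signals"]])
```

This sketch ignores optional header fields (such as base time or counter frequency), so it should not be treated as a full WFDB parser.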
Submitting your Entry
Participants should use the provided signal and arousal data to develop a model that classifies test-set subjects. More specifically, for each subject in /test, participants must generate a .vec text file that describes the probability of arousal at each sample, such as:
0.001 0.000 0.024 0.051
The names of the generated annotation files should match the name of the test subject. For instance, test/te09-0094.mat should have a corresponding file named annotations/te09-0094.vec.
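The naming rule and file format can be sketched as follows. This is a minimal illustration under two assumptions that are not stated above: that each probability goes on its own line, and that three decimal places are sufficient. The helper names `vec_path` and `write_vec` are hypothetical.

```python
from pathlib import Path


def vec_path(test_mat_path, out_dir="annotations"):
    """Map a test signal file (e.g. test/te09-0094.mat) to its .vec output path."""
    return Path(out_dir) / (Path(test_mat_path).stem + ".vec")


def write_vec(path, probabilities):
    """Write one arousal probability per line (assumed layout), three decimals each."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        for p in probabilities:
            if not 0.0 <= p <= 1.0:
                raise ValueError(f"probability out of range: {p}")
            f.write(f"{p:.3f}\n")
    return path


print(vec_path("test/te09-0094.mat").as_posix())
```

A real entry would write one value for every sample in the record, so the output has as many values as the corresponding signal has samples.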
Entries must be submitted as a zip file containing:
- All of the code and data files needed to train and run your algorithm
- An AUTHORS.txt file containing the list of authors
- A LICENSE.txt file containing the license for your code
- The .vec files described above
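Before uploading, it may be worth checking that the archive actually contains the required files. The sketch below uses Python's standard zipfile module; the function name `check_entry` is hypothetical, and accepting the required files at any directory depth is an assumption, since the required layout inside the zip is not specified here.

```python
import zipfile

REQUIRED_FILES = ("AUTHORS.txt", "LICENSE.txt")


def check_entry(zip_path):
    """Return a list of problems found in the entry archive (empty if none)."""
    with zipfile.ZipFile(zip_path) as zf:
        names = zf.namelist()
    # Compare on base names so files inside a top-level folder are accepted
    # (an assumption made for this sketch).
    base_names = {n.rsplit("/", 1)[-1] for n in names}
    problems = [f"missing {req}" for req in REQUIRED_FILES if req not in base_names]
    if not any(n.endswith(".vec") for n in names):
        problems.append("no .vec files found")
    return problems
```

Calling `check_entry("entry.zip")` on your own archive returns an empty list when nothing is missing. It does not verify that the code runs or that every test record has a .vec file.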
To upload your entry, create a PhysioNet account (if you don't have one), and go to challenge.physionet.org. Entries must be uploaded prior to the deadline in order to be eligible.
Scoring
Your final algorithm will only be graded for its binary classification performance on target arousal and non-arousal regions (designated by +1 and 0 in teNN-NNNN-arousal.mat), measured by the area under the precision-recall curve. The area is defined as follows:
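A formulation consistent with the note that follows, assuming predicted probabilities are thresholded at j/1000 for j = 0, ..., 1000 (the threshold grid is an assumption, chosen to match the three-decimal probabilities in the .vec example), is:

```latex
R_j = \frac{\#\{\text{arousal samples with predicted probability} \ge j/1000\}}
           {\#\{\text{arousal samples}\}}
\qquad
P_j = \frac{\#\{\text{arousal samples with predicted probability} \ge j/1000\}}
           {\#\{\text{samples with predicted probability} \ge j/1000\}}

\mathrm{AUPRC} = \sum_{j} P_j \,\bigl(R_j - R_{j+1}\bigr)
```

Here only scored samples (labels +1 and 0) are counted, and R_{j+1} is taken to be 0 beyond the highest threshold.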
Note that this is the gross AUPRC (i.e., for each possible value of j, the precision and recall are calculated for the entire test database), which is not the same as averaging the AUPRC for each record.
Python (score2018.py) and Matlab/Octave (score2018.m) implementations of the scoring algorithm are available in the challenge files.
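To make the distinction between gross and per-record AUPRC concrete, here is a minimal pure-Python sketch. It is not score2018.py: the function name `gross_auprc`, the quantization of probabilities into 1000 bins, and the input layout (one pair of label and probability sequences per record) are assumptions made for illustration.

```python
def gross_auprc(records, n_bins=1000):
    """Area under the precision-recall curve, pooled over all records.

    records: iterable of (labels, probabilities) pairs, one pair per record.
    Labels use +1 (arousal), 0 (non-arousal), -1 (not scored, ignored).
    """
    pos = [0] * (n_bins + 1)  # arousal samples per quantized probability
    tot = [0] * (n_bins + 1)  # all scored samples per quantized probability
    for labels, probs in records:
        for y, p in zip(labels, probs):
            if y < 0:
                continue  # unscored region
            j = min(n_bins, max(0, int(round(p * n_bins))))
            tot[j] += 1
            if y == 1:
                pos[j] += 1

    n_pos = sum(pos)
    if n_pos == 0:
        return float("nan")  # precision-recall is undefined without positives

    area = 0.0
    true_pos = 0
    predicted = 0
    prev_recall = 0.0
    # Sweep the threshold from the highest bin down to the lowest.
    for j in range(n_bins, -1, -1):
        true_pos += pos[j]
        predicted += tot[j]
        if predicted == 0:
            continue
        recall = true_pos / n_pos
        precision = true_pos / predicted
        area += precision * (recall - prev_recall)
        prev_recall = recall
    return area


record_a = ([1, 0], [0.9, 0.1])
record_b = ([1, 0], [0.2, 0.8])
pooled = gross_auprc([record_a, record_b])
averaged = (gross_auprc([record_a]) + gross_auprc([record_b])) / 2
print(pooled, averaged)
```

In this toy case the pooled score and the average of per-record scores differ, which is the point of the note above: counts are accumulated across the whole database before precision and recall are computed.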
Sample Submission
Two simple example algorithms are provided and may be used as a template for your own submission:
Rules and Deadlines
Entrants may have an overall total of up to three submitted entries over both the unofficial and official phases of the competition (see Table 2). Following submission, entrants will receive an email confirming their submission and reporting how well their arousal annotations match those of the held-out test set.
All deadlines occur at noon GMT (UTC) on the dates mentioned below. If you do not know the difference between GMT and your local time, find out what it is before the deadline!
Phase | Start at noon GMT on | Entry limit | End at noon GMT on |
---|---|---|---|
Unofficial Phase | 15 February | 1 | 13 April |
[Hiatus] | 13 April | 0 | 22 April |
Official Phase | 23 April | 2 | 1 September |
* Wildcard submissions due | | | 15 July |
All official entries must be received no later than noon GMT on Saturday, 1 September 2018. In the interest of fairness to all participants, late entries will not be accepted or scored. Entries that cannot be scored (because of missing components, improper formatting, or excessive run time) are not counted against the entry limits.
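For working out what noon GMT means locally, a short sketch using Python's standard datetime module is shown below. The fixed UTC offsets are illustrative assumptions; check the offset that actually applies to your location on the deadline date, including any daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Official Phase deadline stated above: noon GMT (UTC) on 1 September 2018.
OFFICIAL_DEADLINE = datetime(2018, 9, 1, 12, 0, tzinfo=timezone.utc)


def deadline_in_offset(deadline_utc, utc_offset_hours):
    """Express a UTC deadline in a zone with the given fixed offset from UTC."""
    local_zone = timezone(timedelta(hours=utc_offset_hours))
    return deadline_utc.astimezone(local_zone)


def time_remaining(now_utc, deadline_utc=OFFICIAL_DEADLINE):
    """Time left until the deadline; negative once the deadline has passed."""
    return deadline_utc - now_utc


# Example offsets (assumed): UTC-4 and UTC+8.
for offset in (-4, 8):
    print(offset, deadline_in_offset(OFFICIAL_DEADLINE, offset).isoformat())
```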
To be eligible for the open-source award, you must do all of the following:
- Submit at least one open-source entry that can be scored before the Phase I deadline (noon GMT on Monday, 9 April 2018).
- Submit at least one entry during the second phase (between noon GMT on Monday, 16 April 2018 and noon GMT on Saturday, 1 September 2018). Only your final entry will count for ranking.
- Entering an Abstract to CinC: Submit an acceptable abstract (about 299 words) on your work on the Challenge to Computing in Cardiology no later than 15 April 2018. Include the overall score for your Phase I entry in your abstract. Please select “PhysioNet/CinC Challenge” as the topic of your abstract, so it can be identified easily by the abstract review committee. You will be notified if your abstract has been accepted by email from CinC during the first week in June.
- Wildcard submissions: For teams who did not submit an abstract in time, or whose abstracts were not accepted, the team who submits the highest-scoring entry before 15 July 2018 will have another chance to compete, if they submit a high-quality abstract and present their work at the CinC conference. We will contact the winners in July with more information.
- Submit a full (4-page) paper on your work on the Challenge to CinC no later than the deadline of conference paper submission.
- Attend CinC 2018 (23-26 September 2018) in Maastricht and present your work there.
Please do not submit analysis of this year’s Challenge data to other Conferences or Journals until after CinC 2018 has taken place, so the competitors are able to discuss the results in a single forum. We expect a special issue from the journal Physiological Measurement to follow the conference and encourage all entrants (and those who missed the opportunity to compete or attend CinC 2018) to submit extended analysis and articles to that issue, taking into account the publications and discussions at CinC 2018.
Attending the Conference
If your abstract is accepted, you must log in to the conference site and agree that you will attend. Then, you must submit a full article describing your results and mark it as a preprint (for others to read) by September 15th. (Don't forget that the competition deadline is noon GMT on the 1st September - this deadline will *not* be extended.)
After agreeing to attend, you must register for the conference, pay the conference fee (prices go up after July ends), and secure a visa if you need one. See the Computing in Cardiology site for more information.
If you need a visa, we strongly suggest you register this week and begin the process. Visas can take months to issue and attendance is mandatory - you cannot receive a prize if you do not attend because defending your work is part of the Challenge. The conference (not PhysioNet) will supply you with a letter for your visa. Please see the CinC 2018 site for details on how to obtain that letter and who to contact. If you have any questions about this process, or are concerned about paying the conference fee before securing a visa, please contact the conference organizers, not PhysioNet.
If your abstract is rejected, then you have one more chance! This year we are introducing a 'wildcard' submission. On July the 15th, the top scoring entry that has not so far been accepted to CinC will be offered the opportunity to submit another (or a new) abstract to the conference system (containing full results). If the team can submit a quality abstract (with performance results) and register for the conference then its members will be eligible for a prize (assuming they also attend the conference and present a poster). Don't forget, your abstract was probably rejected because it didn't contain any useful results (even on training data) and/or did not describe your methods well. So please pay attention to the abstract when submitting - it won't be automatic. We strongly believe that if you are unable to explain what you did and why, then the code is of very limited value.
We hope this is a suitable encouragement for teams that are either late to the Challenge or failed to secure a place at the conference to continue with their efforts in the competition. It would be a shame not to see potentially great works at the conference.
Look out for future announcements via the community discussion forum.
After the Challenge
As is customary, we hope to run a special issue in Physiological Measurement with a closing date of 31 January 2019. We will therefore encourage competitors (and non-competitors) to submit updates and further reworks based on the Challenge after the award ceremony at the Computing in Cardiology Conference in Maastricht in September.
Obtaining complimentary MATLAB licenses
The MathWorks has kindly decided to sponsor PhysioNet’s 2018 Challenge by providing complimentary MATLAB licenses to all teams that wish to use MATLAB. Users can apply for a license and learn more about MATLAB support through The MathWorks’ PhysioNet Challenge link. If you have questions or need technical support, please contact The MathWorks at studentcompetitions@mathworks.com.
Challenge Results
Official results, as well as a paper describing the Challenge, are now available. Top scores were achieved by
- Matthew Howe-Patterson, Bahareh Pourbabaee, and Frederic Benard (0.54)
- Guðni Fannar Kristjansson, Heiðar Már Þráinsson, Hanna Ragnarsdóttir, Bragi Marinósson, Eysteinn Gunnlaugsson, Eysteinn Finnsson, Sigurður Ægir Jónsson, Halla Helgadóttir, and Jón Skírnir Ágústsson (0.45)
- Runnan He, Kuanquan Wang, Yang Liu, Na Zhao, Yongfeng Yuan, Qince Li, and Henggui Zhang (0.43)
An unofficial entry from Hongyang Li and Yuanfang Guan (who unfortunately missed the deadline to submit an abstract) achieved a score of 0.55.
Papers
The following paper is an introduction to the challenge topic, with a summary of the challenge results and a discussion of their implications. Please cite this publication when referencing the Challenge.
Ghassemi MM, Moody B, Lehman L, Song C, Li Q, Sun H, Westover M, Clifford GD., "You Snooze, You Win: the PhysioNet/Computing in Cardiology Challenge 2018," 2018 Computing in Cardiology Conference (CinC), 2018, pp. 1-4, doi: 10.22489/CinC.2018.049.
Over 20 papers were presented at Computing in Cardiology 2018. These papers have been made available under the terms of the Creative Commons Attribution License 3.0 (CCAL). See this page for details. We wish to thank all of the authors for their contributions.
Access
Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.
License (for files):
Open Data Commons Attribution License v1.0
Discovery
DOI (version 1.0.0):
https://doi.org/10.13026/6phb-r450
DOI (latest version):
https://doi.org/10.13026/1q9b-ge17
Topics:
apnea
circadian
sleep
challenge
polysomnography
Files
Total uncompressed size: 266.6 GB.
Access the files
- Download the ZIP file (45.8 MB)
- Access the files using the Google Cloud Storage Browser here. Login with a Google account is required.
- Access the data using the Google Cloud command line tools (please refer to the gsutil documentation for guidance):
  gsutil -m -u YOUR_PROJECT_ID cp -r gs://challenge-2018-1.0.0.physionet.org DESTINATION
- Download the files using your terminal:
  wget -r -N -c -np https://physionet.org/files/challenge-2018/1.0.0/
- Download the files using AWS command line tools:
  aws s3 sync --no-sign-request s3://physionet-open/challenge-2018/1.0.0/ DESTINATION
Name | Size | Modified |
---|---|---|
tr11-0659-arousal.mat | 476.3 KB | 2018-04-20 |
tr11-0659.arousal | 10.7 KB | 2018-02-20 |
tr11-0659.hea | 675 B | 2018-02-15 |
tr11-0659.mat | 139.9 MB | 2018-02-13 |