This data set is used and described in
When referencing this material, please include the citation above, and also include the standard citation for PhysioNet:
The neuroQWERTY MIT-CSXPD database contains keystroke logs collected from 85 subjects with and without Parkinson's disease (PD). The dataset was collected and analyzed to show that routine interaction with computer keyboards can be used to detect motor signs in the early stages of PD.
The subjects were recruited from two movement disorder units in Madrid (Spain) following the institutional protocols approved by the Massachusetts Institute of Technology, USA (Committee on the Use of Humans as Experimental Subjects approval no. 1402006203), Hospital 12 de Octubre, Spain (no. CEIC:14/090) and Hospital Clinico San Carlos, Spain (no. 14/136-E).
Each data file contains the timing information recorded during typing sessions in a standard word processor on a Lenovo G50-70 i3-4005U with 4MB of memory and a 15-inch screen running Manjaro Linux. Subjects were instructed to type as they normally would at home and were free to correct typing mistakes only if they wanted to. The key acquisition software had a temporal resolution of 3/0.28 (mean/std) milliseconds.
There are two datasets collected from two sets of experiments:
- PD_MIT-CS1PD - 31 subjects. 13 healthy controls and 18 PD sufferers. Subjects were asked to visit a movement disorder unit twice to complete the study. Therefore each subject's data is stored in 2 csv files.
- PD_MIT-CS2PD - 54 subjects. 30 healthy controls and 24 PD sufferers. Subjects were asked to visit a movement disorder unit once to complete the study.
Along with the raw typing collections, clinical evaluations were also performed on each subject, including UPDRS and finger tapping tests. See the referenced publication for more details.
The data from each of the two experiment sets are split into their own subdirectories. Each dataset contains a subject summary csv file, GT_DataPD_MIT-CSXPD.csv, which lists for each subject:
- pID - Patient ID
- gt - Ground truth label of whether or not they had PD
- updrs108 - Unified Parkinson’s Disease Rating Scale part III (UPDRS-III)
- afTap - Alternating finger tapping result
- sTap - Single key tapping result
- nqScore - neuroQWERTY index (nQi)
- Typing speed
- file_n - The csv file(s) containing the patient's typing data
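As a rough illustration of how the summary file could be read, here is a minimal sketch using only the Python standard library. The sample rows are made up, the `True`/`False` encoding of `gt` and the `file_1` column name (one `file_n` column per session) are assumptions, and the typing speed column is left out because its header name is not given above. The helper `load_summary` is hypothetical and is not part of nqDataLoader.py.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; real values come from the
# GT_DataPD_MIT-CSXPD.csv file of each dataset.
SAMPLE = """pID,gt,updrs108,afTap,sTap,nqScore,file_1
1001,True,20.5,110.0,150.0,0.12,1001_session.csv
1002,False,2.0,160.0,190.0,0.03,1002_session.csv
1003,True,31.0,95.5,140.0,0.20,1003_session.csv
"""


def load_summary(handle):
    """Return one dict per subject with typed fields (hypothetical helper)."""
    subjects = []
    for row in csv.DictReader(handle):
        subjects.append({
            "pID": row["pID"],
            # Assumption: the ground-truth label is stored as True/False text.
            "has_pd": row["gt"].strip().lower() == "true",
            "updrs108": float(row["updrs108"]),
            "nqScore": float(row["nqScore"]),
            # Collect every non-empty file_n column (file_1, file_2, ...).
            "files": [v for k, v in row.items()
                      if k and k.startswith("file_") and v],
        })
    return subjects


subjects = load_summary(io.StringIO(SAMPLE))
counts = Counter("PD" if s["has_pd"] else "control" for s in subjects)
print(counts)
```

To read a real summary file, the `io.StringIO(SAMPLE)` argument would be replaced by an open file handle, after checking the actual header names and label encoding.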
Each keystroke data csv file has four columns which give:
- The key pressed.
- The hold duration in seconds.
- The key release time in seconds from time 0.
- The key press time in seconds from time 0.
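The four-column layout above can be parsed with a few lines of Python. The following is a sketch only: it assumes comma-separated files with no header row, uses made-up keystrokes, and the helper names `load_keystrokes` and `press_latencies` are hypothetical rather than taken from nqDataLoader.py.

```python
import csv
import io

# Made-up keystrokes; column order follows the description above:
# key, hold duration, release time, press time (all times in seconds).
SAMPLE = """h,0.100,1.100,1.000
e,0.080,1.330,1.250
l,0.120,1.620,1.500
"""


def load_keystrokes(handle):
    """Parse keystroke rows into dicts with float timing fields."""
    rows = []
    for key, hold, release, press in csv.reader(handle):
        rows.append({
            "key": key,
            "hold": float(hold),
            "release": float(release),
            "press": float(press),
        })
    return rows


def press_latencies(rows):
    """Time between consecutive key presses (press to press), in seconds."""
    return [b["press"] - a["press"] for a, b in zip(rows, rows[1:])]


rows = load_keystrokes(io.StringIO(SAMPLE))
latencies = press_latencies(rows)
print([r["key"] for r in rows], latencies)
```

Hold durations and press-to-press latencies are the kind of timing features that can be derived directly from these columns.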
The neuroQWERTY.zip file includes all of the data along with the scripts described in the next section.
The nqDataLoader.py Python module contains functions used to filter anomalous results and load the data from the csv data files. The readme.ipynb IPython notebook uses these functions and demonstrates how to load and display the data.
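The filtering criteria used by nqDataLoader.py are not reproduced here. Purely as an illustration of what filtering anomalous keystrokes might involve, the sketch below applies hypothetical sanity checks derived from the column definitions: a positive hold time below an arbitrary ceiling, a release that does not precede the press, and a hold time consistent with release minus press. The function name, the 5-second ceiling and the tolerance are all assumptions.

```python
def is_plausible(row, max_hold=5.0, tol=1e-3):
    """Hypothetical sanity check for one (key, hold, release, press) row.

    This is NOT the logic of nqDataLoader.py; the thresholds are arbitrary.
    """
    _key, hold, release, press = row
    if hold <= 0 or hold > max_hold:
        return False  # non-positive or implausibly long hold
    if release < press:
        return False  # key released before it was pressed
    # Hold duration should match release time minus press time.
    return abs((release - press) - hold) <= tol


# Made-up rows: "b" has a negative hold, "c" an implausibly long one.
rows = [
    ("a", 0.10, 1.10, 1.00),
    ("b", -0.05, 2.00, 2.05),
    ("c", 9.00, 12.00, 3.00),
    ("d", 0.08, 4.08, 4.00),
]
clean = [r for r in rows if is_plausible(r)]
print([r[0] for r in clean])
```

For actual analyses, the functions shipped in nqDataLoader.py and demonstrated in readme.ipynb should be preferred over this sketch.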
These datasets were collected as part of the neuroQWERTY project at the Massachusetts Institute of Technology thanks to financial support from the Comunidad de Madrid, Fundacion Ramon Areces and The Michael J Fox Foundation for Parkinson's Research (grant number 10860). We thank the M + Vision faculty for their guidance in developing this project. We also thank our many clinical collaborators at MGH in Boston, at “12 de Octubre”, Hospital Clinico and Centro Integral en Neurociencias HM CINAC in Madrid for their insightful contributions.
Name  Last modified  Size
MD5SUMS  2016-12-20 11:44  194
MIT-CS1PD/  2016-12-20 11:07  -
MIT-CS2PD/  2016-12-20 11:07  -
SHA1SUMS  2016-12-20 11:44  226
SHA256SUMS  2016-12-20 11:44  322
neuroQWERTY.zip  2016-12-20 11:07  2.1M
nqDataLoader.py  2016-12-20 11:07  9.8K
readme.ipynb  2016-12-20 11:07  77K
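The listing includes checksum files (MD5SUMS, SHA1SUMS, SHA256SUMS) that can be used to verify a download. The sketch below assumes SHA256SUMS follows the usual `sha256sum` output format (hex digest, whitespace, file name); that format is an assumption, and the helper names are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text):
    """Parse 'HEXDIGEST  filename' lines into a {filename: hexdigest} dict."""
    sums = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        hexdigest, name = line.split(None, 1)
        # A leading '*' marks binary mode in sha256sum output; drop it.
        sums[name.lstrip("*").strip()] = hexdigest.lower()
    return sums


def verify(path, name, sums):
    """Return True if the file at `path` matches the digest listed for `name`."""
    return sums.get(name) == sha256_of(path)
```

A typical use would be `verify("neuroQWERTY.zip", "neuroQWERTY.zip", parse_sums(open("SHA256SUMS").read()))`, run from the directory holding both downloaded files.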
Comments and issues can also be raised on PhysioNet's GitHub page.
Updated Friday, 28 October 2016 at 16:58 EDT