Name: National Institutes of Health Stroke Scale (NIHSS) Annotations for the MIMIC-III Database
Published: Jan. 25, 2021
License: https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

Database Credentialed Access

Jiayang Wang, Xiaoshuo Huang, Lin Yang, Jiao Li

Published: Jan. 25, 2021. Version: 1.0.0

When using this resource, please cite:
Wang, J., Huang, X., Yang, L., & Li, J. (2021). National Institutes of Health Stroke Scale (NIHSS) Annotations for the MIMIC-III Database (version 1.0.0). PhysioNet. https://doi.org/10.13026/gyjg-0t90.


Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.


Abstract

The National Institutes of Health Stroke Scale (NIHSS) is a 15-item neurologic examination stroke scale. It quantifies the physical manifestations of neurological deficits and provides crucial support for clinical decision making and early-stage emergency triage. NIHSS scores stored in the free text of Electronic Health Records (EHRs) often lack standardization, and the way they are written depends heavily on the habits of individual clinicians. This limits how readily the data can be reused.

There is benefit in developing robust algorithms to extract NIHSS scores from the free-text of EHRs. We developed a dataset for NIHSS score identification, a task defined as the extraction of scale items and corresponding scores from discharge summaries. Discharge summaries of stroke cases in the Medical Information Mart for Intensive Care III (MIMIC-III) database were used to create an annotated NIHSS corpus.

Each discharge summary was manually annotated for the presence of NIHSS scores by two annotators with backgrounds in medical informatics. Annotations include all scale items (e.g. “4. Facial Palsy”), the corresponding score “measurement”, and their relation “has value”. The dataset is intended to support academic and industrial research in the field of medical natural language processing (NLP).

Background

Electronic Health Record (EHR) data carries enormous amounts of medical treatment information that is not well structured, stored instead as free text. This data has great potential to provide insight into medical treatment and to facilitate retrospective studies. The National Institutes of Health Stroke Scale (NIHSS) quantifies the physical manifestations of neurological deficits and provides crucial support for clinical decision making and early-stage emergency triage, and it is often recorded within free text fields.

NIHSS quantifies 15 items: “1a. Level of Consciousness (LOC)”, “1b. LOC Questions”, “1c. LOC Commands”, “2. Best Gaze”, “3. Visual”, “4. Facial Palsy”, “5a. Left Arm”, “5b. Right Arm”, “6a. Left Leg”, “6b. Right Leg”, “7. Limb Ataxia”, “8. Sensory”, “9. Best Language”, “10. Dysarthria” and “11. Extinction and Inattention”. Each item can be marked with a score based on the responses of a patient to certain queries. The scoring system is widely used for the initial assessment of the severity of stroke, to assess treatment response, and for bedside monitoring. NIHSS is a commonly recorded item in EHR data and it has been used in many stroke-related studies [1-3].

MIMIC-III is a freely accessible database that contains 58,976 intensive care patient records collected from 2001 to 2012 [4]. The discharge summaries in the NOTEEVENTS table store information from hospital admission to discharge, including examinations, medications, procedures, and other treatment records. Unlike the structured data in MIMIC-III, discharge summaries are stored as free text and linked to patient IDs. The summaries include detailed notes on medical history, physical examinations, ECG / X-ray / MRI findings, therapeutic regimens, discharge medications, and other unstructured information, including NIHSS. We sought to develop an approach for extracting NIHSS scores from unstructured patient notes.

Methods

Following the approaches of Woodfield et al. [5] and Mitchell et al. [6], we set our research scope to cover four major types of stroke: ischemic stroke, hemorrhagic stroke, subarachnoid hemorrhage (SAH), and intracerebral hemorrhage (ICH). These map to ICD-9 codes 430, 431, 432, 433, 434, and 436 and their subtypes. In total, 3660 stroke cases with valid discharge summaries were selected for further annotation.
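The code-based case selection described above can be illustrated with a short sketch. It is not the authors' selection script. The function names, the `(hadm_id, icd9_code)` input shape, and the toy rows are assumptions made for illustration, as is the normalisation step that removes decimal points before matching on the three-digit prefix.

```python
# ICD-9 code prefixes named in the text; subtypes share the same 3-digit prefix.
STROKE_PREFIXES = ("430", "431", "432", "433", "434", "436")


def is_stroke_code(icd9_code: str) -> bool:
    """Return True if an ICD-9 code falls under one of the stroke prefixes."""
    # Assumption: codes may or may not contain a decimal point, so strip it first.
    normalised = icd9_code.replace(".", "").strip()
    return normalised.startswith(STROKE_PREFIXES)


def select_stroke_admissions(diagnoses):
    """Collect admission IDs that carry at least one stroke diagnosis.

    `diagnoses` is an iterable of (hadm_id, icd9_code) pairs (hypothetical shape).
    """
    return sorted({hadm_id for hadm_id, code in diagnoses if is_stroke_code(code)})


# Toy rows for illustration only, not real MIMIC-III data.
rows = [
    (100001, "43491"),
    (100002, "4280"),
    (100003, "431"),
    (100001, "4019"),
    (100004, "435.9"),
]
stroke_hadm_ids = select_stroke_admissions(rows)
```

Matching on a prefix picks up the subtypes of each code automatically, while a neighbouring code such as 435 is left out.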

Based on the NIHSS structure and common recording conventions, we developed a primary annotation guideline. Two expert annotators with backgrounds in medical informatics were recruited for the study. The experts first annotated the same 100 discharge summaries individually using BRAT, a widely used online text annotation tool that allows multiple people to work simultaneously [7]. The graphical user interface allows annotators to label entities (NIHSS item) and relations (item - score) by selecting and dragging, and it then generates the corresponding position and relation records. After the first round of annotation, the annotators discussed the discrepancies and the guideline was adjusted accordingly. Third parties were consulted in cases of disagreement. This process was repeated until the guideline reached a stable state. At this point, Cohen’s Kappa [8] for inter-annotator agreement reached 0.901, which falls in the "almost perfect" agreement band (0.81 to 1.00) of Landis and Koch [9].
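For readers unfamiliar with the agreement statistic, Cohen's Kappa compares the observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it for two label sequences. The source does not state the unit of comparison (token, entity, or document), so the token-level toy labels here are an assumption, and the function name is illustrative.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's Kappa for two annotators who labelled the same sequence of units."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of units.")
    n = len(labels_a)
    # Observed agreement: fraction of units where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each annotator's marginal label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # Both annotators used a single identical label throughout.
    return (observed - expected) / (1 - expected)


# Toy token-level labels (assumption: agreement is measured per token).
annotator_1 = ["O", "B", "O", "O", "B", "O", "O", "B", "O", "O"]
annotator_2 = ["O", "B", "O", "O", "O", "O", "O", "B", "O", "O"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy case the annotators agree on 9 of 10 units (p_o = 0.9), chance agreement is p_e = 0.62, and kappa is 0.28 / 0.38, roughly 0.74.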

The two annotators then completed the annotation following the final guideline. The annotated results were saved in “.ann” files. The original discharge summary text was then combined with these annotation files to create a dictionary that contains HADM_ID (the patient’s admission ID), token (the words of the discharge summary, split into a list), tags (Begin-Inside-Outside tags), relations (entity-entity relations), entities (NIHSS items recognized by the annotators) and code (whether each token belongs to an entity).
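How the tags and code fields relate to the annotated entities can be sketched as follows. This is a minimal illustration, not the authors' conversion code. It assumes that each entity is a tuple of (entity ID, entity type, start token index, end token index), that the indices are zero-based and inclusive, and that non-entity tokens receive the tag "O" and the code "0". The function name is hypothetical.

```python
def build_tags_and_code(num_tokens, entities):
    """Derive BIO tags and per-token entity codes from entity token spans.

    Assumption: `entities` holds (entity_id, entity_type, start, end) tuples with
    zero-based, inclusive token positions.
    """
    tags = ["O"] * num_tokens   # "O" marks tokens outside any entity
    code = ["0"] * num_tokens   # "0" marks tokens that belong to no entity
    for entity_id, entity_type, start, end in entities:
        for position in range(start, end + 1):
            # The first token of an entity gets "B-", any following token gets "I-".
            prefix = "B" if position == start else "I"
            tags[position] = f"{prefix}-{entity_type}"
            code[position] = entity_id
    return tags, code


# Toy input for illustration only.
tokens = ["for", "NIHSS", "of", "22", ":", "-"]
entities = [("T1", "NIHSS", 1, 1), ("T2", "Measurement", 3, 3)]
tags, code = build_tags_and_code(len(tokens), entities)
```

With this toy input, the token "NIHSS" is tagged `B-NIHSS` with code `T1`, and the token "22" is tagged `B-Measurement` with code `T2`.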

Data Description

Our corpus contains data for 312 stroke patients with 2929 NIHSS items, 2774 measurements, and 2733 item-score relations. The corpus was separated into a training set, NER_RE_Train.txt (220 discharge summaries), and a testing set, NER_RE_Test.txt (92 discharge summaries). These training and testing sets can be used for NIHSS entity recognition and relation recognition. RE_End2End_Test.txt is an end-to-end relation test set, generated from NER_RE_Test.txt. The end-to-end relation set contains randomly generated entity-score relations that can be used to validate predicted relations.

Data example

Key	Value
HADM_ID	`xxxxxx`
Token	`['for','NIHSS','of','22',':','-',……]`
Tags	`['O','B-NIHSS','O','B-Measurement','O','O'……]`
Relations	`[('T1','T2','R1','Has_Value'),……]`
Entities	`[('T1','NIHSS',9,9),……]`
Code	`['0','T1','0','T2','0','0',……]`

Corpus dictionary keys are defined as follows:

HADM_ID: the 6-digit hospital admission ID from the MIMIC-III database. It is distinct for each discharge summary and can also be used to extract further data from MIMIC-III.
Token: the free text, split into a list of word tokens.
Tags: the Begin-Inside-Outside (BIO) tag assigned to each token.
Relations: the relations between entities, each with a relation sequence number and a relation type.
Entities: the NIHSS entities recognized by the annotators, each with an entity sequence number and its position in the token list.
Code: a per-token label aligned with the token list, in which entity tokens are labeled with their entity sequence number.
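As an illustration of how these keys fit together, the sketch below resolves each "Has_Value" relation to the text of its item and its score. It is a sketch under stated assumptions. The record is a toy dictionary built by hand, and the key spellings follow the data example table. Entity positions are assumed to be zero-based, inclusive token indices, and relation tuples are assumed to be ordered as (item ID, score ID, relation ID, relation type). The function name is hypothetical, and the on-disk format of the corpus files is not covered here.

```python
def item_score_pairs(record):
    """Return (item text, score text) pairs for every Has_Value relation in a record."""
    tokens = record["Token"]
    # Map each entity ID to the text of the tokens it spans.
    entity_text = {
        entity_id: " ".join(tokens[start:end + 1])
        for entity_id, _entity_type, start, end in record["Entities"]
    }
    pairs = []
    for item_id, score_id, _relation_id, relation_type in record["Relations"]:
        if relation_type == "Has_Value":
            pairs.append((entity_text[item_id], entity_text[score_id]))
    return pairs


# Toy record for illustration only; the values are not taken from the corpus.
record = {
    "HADM_ID": "xxxxxx",
    "Token": ["for", "NIHSS", "of", "22", ":", "-"],
    "Tags": ["O", "B-NIHSS", "O", "B-Measurement", "O", "O"],
    "Relations": [("T1", "T2", "R1", "Has_Value")],
    "Entities": [("T1", "NIHSS", 1, 1), ("T2", "Measurement", 3, 3)],
    "Code": ["0", "T1", "0", "T2", "0", "0"],
}
pairs = item_score_pairs(record)
```

Going through Relations rather than pairing adjacent tags keeps the lookup correct when an item and its score are separated by other tokens.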

Descriptive statistics of the corpus

The table below displays the number of cases for different features within the dataset.

	n
Corpus overview
Stroke cases	312
Sentences	7848
Average sentences per case	25
Average tokens per case	153
Scale item
NIHSS	548
1a. LOC	150
1b. LOC Questions	160
1c. LOC Commands	150
2. Best Gaze	154
3. Visual	164
4. Facial Palsy	191
5. Motor Arm	24
5a. Left Arm	136
5b. Right Arm	133
6. Motor Leg	26
6a. Left Leg	130
6b. Right Leg	121
7. Limb Ataxia	147
8. Sensory	171
9. Best Language	174
10. Dysarthria	186
11. Extinction and Inattention	164
Score
Measurement	2774
Relation
Has value	2733
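The counts in the table can be reconciled with the corpus totals given in the Data Description. The short check below sums the scale item rows and compares the measurement and relation counts. The variable names are illustrative. Reading the difference as measurements that take part in no relation assumes that each relation links exactly one measurement and that no measurement is shared between relations.

```python
# Scale item counts copied from the table above.
scale_item_counts = {
    "NIHSS": 548,
    "1a. LOC": 150,
    "1b. LOC Questions": 160,
    "1c. LOC Commands": 150,
    "2. Best Gaze": 154,
    "3. Visual": 164,
    "4. Facial Palsy": 191,
    "5. Motor Arm": 24,
    "5a. Left Arm": 136,
    "5b. Right Arm": 133,
    "6. Motor Leg": 26,
    "6a. Left Leg": 130,
    "6b. Right Leg": 121,
    "7. Limb Ataxia": 147,
    "8. Sensory": 171,
    "9. Best Language": 174,
    "10. Dysarthria": 186,
    "11. Extinction and Inattention": 164,
}

total_items = sum(scale_item_counts.values())  # should match the 2929 NIHSS items
measurements = 2774
relations = 2733
# Assumption: one measurement per relation, so the difference is unlinked measurements.
unlinked_measurements = measurements - relations
```

The item rows sum to 2929, which matches the stated number of NIHSS items, and the measurement count exceeds the relation count by 41.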

Usage Notes

NIHSS scores are clinically important, particularly for the assessment of stroke. To develop algorithms that make use of these scores (for example, algorithms for outcome prediction), it is necessary to develop approaches for extracting the relevant information from free-text notes. We share our annotated data in the hope that it will help researchers who are interested in identifying scale scores (NIHSS or other kinds) in unstructured EHR data.

This corpus was generated as part of a project that explored quantified evidence for stroke. Our code applied to this corpus can be found on GitHub [10]. We encourage you to transfer this work to related NLP tasks.

Acknowledgements

This research is supported by the Beijing Natural Science Foundation (Grant No. Z200016) and the Chinese Academy of Medical Sciences (Grant Nos. 2017PT63010, 2018PT33024, 2018-I2M-AI-016). We appreciate the efforts of the MIT Laboratory for Computational Physiology and collaborating research groups for sharing MIMIC-III with the research community. We would also like to thank clinicians from the China National Clinical Research Center for Neurological Disease, Beijing Tiantan Hospital, Capital Medical University for their clinical support.

Conflicts of Interest

The authors have no conflicts of interest to report.

References

1. Yuan, H., An, J., Zhang, Q., Zhang, X., Sun, M., Fan, T., Cheng, Y., Wei, M., Tse, G., Waintraub, X., Li, Y., Day, J. D., Gao, F., Luo, G., & Li, G. (2020). Rates and Anticoagulation Treatment of Known Atrial Fibrillation in Patients with Acute Ischemic Stroke: A Real-World Study. Advances in Therapy, 37(10), 4370–4380. https://doi.org/10.1007/s12325-020-01469-w
2. Betts, K. A., Hurley, D., Song, J., Sajeev, G., Guo, J., Du, E. X., Paschoalin, M., & Wu, E. Q. (2017). Real-World Outcomes of Acute Ischemic Stroke Treatment with Intravenous Recombinant Tissue Plasminogen Activator. Journal of Stroke and Cerebrovascular Diseases, 26(9), 1996–2003. https://doi.org/10.1016/j.jstrokecerebrovasdis.2017.06.010
3. Abzhandadze, T., Reinholdsson, M., & Sunnerhagen, K. (2020). NIHSS is not enough for cognitive screening in acute stroke: A cross-sectional, retrospective study. Scientific Reports, 10. https://doi.org/10.1038/s41598-019-57316-8
4. Johnson, A. E., Pollard, T. J., Shen, L., Lehman, L. W., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., & Mark, R. G. (2016). MIMIC-III, a freely accessible critical care database. Scientific Data, 3, 160035. https://doi.org/10.1038/sdata.2016.35
5. Woodfield, R., Grant, I., UK Biobank Stroke Outcomes Group, UK Biobank Follow-Up and Outcomes Working Group, & Sudlow, C. L. (2015). Accuracy of Electronic Health Record Data for Identifying Stroke Cases in Large-Scale Epidemiological Studies: A Systematic Review from the UK Biobank Stroke Outcomes Group. PLoS ONE, 10(10), e0140533. https://doi.org/10.1371/journal.pone.0140533
6. Mitchell, J., Collen, J., Petteys, S., & Holley, A. (2011). A simple reminder system improves venous thromboembolism prophylaxis rates and reduces thrombotic events for hospitalized patients. Journal of Thrombosis and Haemostasis, 10, 236–243. https://doi.org/10.1111/j.1538-7836.2011.04599.x
7. Stenetorp, P., Pyysalo, S., Topic, G., Ohta, T., Ananiadou, S., & Tsujii, J. (2012). brat: a Web-based Tool for NLP-Assisted Text Annotation. The 13th Conference of the European Chapter of the Association for Computational Linguistics, Avignon, France, 102–107.
8. Cohen, J. (1960). A Coefficient of Agreement for Nominal Scales. Educational and Psychological Measurement, 20(1), 37–46. https://doi.org/10.1177/001316446002000104
9. Landis, J. R., & Koch, G. G. (1977). The measurement of observer agreement for categorical data. Biometrics, 33(1), 159–174.
10. Code for extracting NIHSS scores from MIMIC-III. GitHub. https://github.com/huangxiaoshuo/NIHSS_IE [Accessed: 19 January 2021]