Database Restricted Access
VinDr-SpineXR: A large annotated medical image dataset for spinal lesions detection and classification from radiographs
Published: Aug. 24, 2021. Version: 1.0.0
When using this resource, please cite:
Pham, H. H., Nguyen Trung, H., & Nguyen, H. Q. (2021). VinDr-SpineXR: A large annotated medical image dataset for spinal lesions detection and classification from radiographs (version 1.0.0). PhysioNet. https://doi.org/10.13026/q45h-5h59.
Hieu T Nguyen, Hieu H Pham, Nghia T Nguyen, Ha Q Nguyen, Thang Q Huynh, Minh Dao, Van Vu. "VinDr-SpineXR: A deep learning framework for spinal lesions detection and classification from radiographs" - International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021.
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Radiographs are the most critical imaging tool for identifying spine anomalies in clinical practice. The evaluation of spinal bone lesions, however, is a challenging task for radiologists. To the best of our knowledge, no existing studies are devoted to developing and evaluating a comprehensive system for classifying and localizing multiple spine lesions from X-ray scans. The lack of large-scale spine X-ray datasets with high-quality images and human expert annotations is the key obstacle. To fill this gap, we introduce a large-scale annotated medical image dataset for spinal lesion detection and classification from radiographs. The dataset, called VinDr-SpineXR, contains 10,466 spine X-ray images from 5,000 studies, each of which is manually annotated by an experienced radiologist with bounding boxes around abnormal findings drawn from 13 types of abnormalities. To the best of our knowledge, this is the largest dataset to date that provides radiologists' bounding-box annotations for developing supervised-learning algorithms for spine X-ray analysis.
Spinal conditions account for a central portion of the overall burden of musculoskeletal conditions. Conventional radiography, the simplest and most accessible modality, still plays an essential role in studying spinal disorders despite the rapid development of advanced imaging techniques such as MRI and CT. It remains the primary tool for identifying and monitoring spine abnormalities such as fractures, osteophytes, thinning of the bones, vertebral collapse, or tumors. Interpreting spine X-ray images requires an in-depth understanding of diagnostic radiography, and the large variability in the number, size, and general appearance of spine lesions makes this a complex and time-consuming task. These factors raise the risk of missing significant findings, which can have severe consequences for patients and clinicians.
Deep convolutional neural networks (CNNs) [4, 5] have recently shown significant improvements in musculoskeletal analysis from X-rays. Most of these studies focus on automated fracture detection and localization [6, 7, 8, 9, 10]. To the best of our knowledge, no existing studies are devoted to developing and evaluating a comprehensive system for classifying and localizing multiple spine lesions from X-ray scans. The lack of large datasets with high-quality images and human expert annotations is the key obstacle. To tackle this challenge, we created a large benchmark dataset of spine X-rays that are manually annotated at the lesion level by experienced radiologists. Because deep learning methods offer limited explanations for their predictions, this dataset provides the exact locations of spine anomalies, so that the resulting models can support radiologists by pointing to pathologies in the image instead of only reporting a disease conclusion. In total, the dataset provides 10,466 spine X-ray images from 5,000 studies (sometimes referred to as exams or procedures) that are manually annotated with 13 types of abnormalities by radiologists. To the best of our knowledge, this is the largest dataset to date that provides radiologists' bounding-box annotations for developing supervised-learning object detection algorithms. Table 1 below provides a summary of publicly available musculoskeletal X-ray datasets and a comparison with our dataset.
|Dataset|Year|Study type|Label|# Images|
|---|---|---|---|---|
|Digital Hand Atlas|2007|Left hand|Bone age|1,390|
|Osteoarthritis Initiative|2013|Knee|K&L Grade|8,892|
|MURA|2017|Upper body|Abnormalities|40,561|
|RSNA Pediatric Bone Age|2019|Hand|Bone age|14,236|
|Kuok et al.|2018|Spine|Lumbar vertebrae mask|60|
|Kim et al.|2020|Spine|Spine position|797|
|Ours|2021|Spine|Multiple abnormalities|10,466|
The study was approved by the Institutional Review Boards of Hospital 108 (H108) and Hanoi Medical University Hospital (HMUH), from which the raw data was collected. Patient informed consent was waived since this study did not impact clinical care at these hospitals and patient-identifiable information has been removed from the data. The process of building the VinDr-SpineXR dataset consists of four steps: (1) data acquisition, (2) data de-identification, (3) data filtering, and (4) data labeling. We describe each step in detail below.
More than 50,000 raw X-ray scans in DICOM format were collected from the PACS (Picture Archiving and Communication System) of the HMUH and H108. These scans were created in the period from 2011 to 2020.
The process of de-identifying DICOM imaging data involves three steps: (1) removing protected health information in DICOM tags, (2) removing patient information in the pixel data of the image, and (3) manually reviewing the de-identified images.
With regard to the DICOM tags, a specific set of tags that provide basic demographic information (namely sex and age) or parameters for image processing was retained, while the protected health information (PHI) was removed. The retained tags are listed in the supplemental file [supplemental_file_DICOM_tags_SpineXR.pdf]. Where a patient is more than 90 years old, the age was set to 90. This step was performed with a Python script.
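The tag-level step can be illustrated with a minimal sketch. It is not the authors' script: the function names, the dictionary representation of the tags, and the example whitelist are all hypothetical, and the real list of retained tags is the one in the supplemental file. The sketch assumes ages are stored in the DICOM Age String form `nnnY`.

```python
def cap_age(age_string, max_years=90):
    """Cap a DICOM Age String such as '095Y' at max_years.

    Assumes the 'nnnY' form; values expressed in days, weeks, or months
    ('D', 'W', 'M') and malformed values are returned unchanged.
    """
    value = age_string.strip()
    if len(value) != 4 or value[-1] != "Y" or not value[:3].isdigit():
        return value
    years = min(int(value[:3]), max_years)
    return f"{years:03d}Y"


def deidentify_tags(tags, retained):
    """Keep only whitelisted tags and cap the age (hypothetical helper).

    tags: dict mapping tag keyword to value
    retained: set of tag keywords to keep (illustrative, not the real list)
    """
    kept = {name: value for name, value in tags.items() if name in retained}
    if "PatientAge" in kept:
        kept["PatientAge"] = cap_age(kept["PatientAge"])
    return kept
```

A whitelist is used here rather than a blacklist so that any tag not explicitly retained is dropped by default.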
First, to remove patient information in the pixel data, all the scans were examined manually to spot DICOM images that contain textual information. These images were grouped according to the text's location, either the top or the bottom of the image. Subsequently, a simple Python script was used to crop the textual regions in these images. In the last step, the de-identified images were manually reviewed to ensure that patient information was removed correctly. A team of four human readers, independent of the previous steps, was formed for this task. Each scan was reviewed in two rounds by two different readers, with the aim of ensuring that patient information in the pixel data was completely removed. Both the DICOM metadata and the image data were examined in this step.
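As an illustration of the cropping step, the sketch below removes a band of rows from the top or bottom of an image. It is an assumption-laden example, not the script used for the dataset: the function name and the `band_rows` parameter are hypothetical, and the image is represented as a plain list of pixel rows to keep the example self-contained.

```python
def crop_text_band(pixels, location, band_rows):
    """Remove band_rows rows from the top or bottom of a 2-D pixel grid.

    pixels: list of rows, each a list of pixel values
    location: 'top' or 'bottom', i.e. where the burned-in text sits
    band_rows: hypothetical height of the text band, in pixel rows
    """
    if location not in ("top", "bottom"):
        raise ValueError("location must be 'top' or 'bottom'")
    if not 0 <= band_rows < len(pixels):
        raise ValueError("band_rows must leave at least one row")
    if location == "top":
        return pixels[band_rows:]
    return pixels[:len(pixels) - band_rows]
```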
The DICOM data collected from the hospitals may contain images of other body parts, e.g., knee, hand, or arm. To extract spine images, we used a semi-automated filtering process. First, we manually classified 20,000 images from the collected data into five classes: (1) spine (cervical, thoracic, and lumbar spine), (2) bone (humerus, forearm, lower leg, femur), (3) small joints (wrist, hand, ankle, foot), (4) big joints (knee, elbow, shoulder, hip), and (5) other bone (skull, sternum). Next, a classifier based on convolutional neural networks was trained on these 20,000 labeled images. The resulting classifier was then used to classify the remaining images.
A label set of thirteen types of lesions was created based on the recommendation from a committee of experienced radiologists from the HMUH and H108. The finding set includes (1) Ankylosis, (2) Disc space narrowing, (3) Enthesophytes, (4) Foraminal stenosis, (5) Fracture, (6) Osteophytes, (7) Sclerotic lesion, (8) Spondylolysthesis, (9) Subchondral sclerosis, (10) Surgical implant, (11) Vertebral collapse, (12) Foreign body, and (13) Other lesions. These lesion types were selected based on two criteria: (1) it is common enough to acquire a considerable sample size, and (2) it can be interpreted from radiographs.
A group of three experienced radiologists was formed to annotate the data. All of them are certified in diagnostic radiology and hold the healthcare profession certificate from the Vietnamese Ministry of Health. A set of 5,000 studies was randomly chosen from the dataset for labeling. Each study was assigned to and annotated by one radiologist. An in-house web-based annotation tool for medical images, called VinDr Lab, was used to facilitate the labeling process.
The data is split into two subsets: a set of 4,000 studies with 8,389 images for training and the remaining 1,000 studies with 2,077 images for testing. The images and associated annotations are split accordingly. This is the split we used in the associated paper, and the two subsets were created by the same process. Therefore, different train and test splits can be used as desired. Regarding lesion annotations, each lesion instance is specified by a lesion category (i.e., one of the 13 given categories), its spatial information in the form of a rectangle with four variables indicating the position of the four edges, the radiologist who marked the lesion, and the identifier of the image on which the lesion appears. The identifier of each image is masked by hashing the value of the Service Object Pair Instance Unique Identifier given in the DICOM tag (0008,0018). The identifiers of the image's series and study are masked by the same method.
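For readers who want a different split, one reasonable approach is to split at the study level so that images from the same study never appear in both subsets. The sketch below is one way to do that and is not the procedure used to create the released split; the function name, the default `seed`, and the row format (dictionaries with a `study_id` key) are assumptions. The default `test_fraction` of 0.2 simply mirrors the 4,000/1,000 proportion described above.

```python
import random
from collections import defaultdict


def split_by_study(rows, test_fraction=0.2, seed=0):
    """Split annotation rows into train/test so that no study spans both.

    rows: iterable of dicts with at least a 'study_id' key
    test_fraction, seed: illustrative defaults, not values from the dataset
    """
    by_study = defaultdict(list)
    for row in rows:
        by_study[row["study_id"]].append(row)

    # Sort before shuffling so the result depends only on the seed.
    study_ids = sorted(by_study)
    random.Random(seed).shuffle(study_ids)

    n_test = round(len(study_ids) * test_fraction)
    test = [r for sid in study_ids[:n_test] for r in by_study[sid]]
    train = [r for sid in study_ids[n_test:] for r in by_study[sid]]
    return train, test
```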
The data is structured into three folders containing training images, testing images, and annotations, respectively:
train_images: a directory that contains 8,389 training images from 4,000 studies. Each image file follows the name convention shown in the folder structure below, e.g. 004004095d8a302b1c0815ccb044c018.dicom.
test_images: a directory that contains 2,077 testing images from 1,000 studies. Each image file follows the name convention shown in the folder structure below, e.g. 00073745e02e69432c002b527c565151.dicom.
annotations: a directory that contains two CSV files, namely train.csv and test.csv, corresponding to radiologists' annotations of the training and testing subsets, respectively. Both CSV files follow the same format, where each row represents a lesion instance in an image. A row provides information about the lesion through 9 attributes:
image_id: Identifier of the image - the encoded value of the Service Object Pair Instance Unique Identifier provided by the DICOM tag (0008,0018).
series_id: Identifier of the series to which the image belongs - the encoded value of the Series Instance Unique Identifier provided by the DICOM tag (0020,000E).
study_id: Identifier of the study to which the image belongs - the encoded value of the Study Instance Unique Identifier provided by the DICOM tag (0020,000D).
rad_id: Identifier of the radiologist who read the scan - the value is either "rad1", "rad2", or "rad3".
lesion_type: Name of the lesion, which is either one of the 13 aforementioned lesion types or "No finding" indicating the absence of abnormality in an image.
xmin: The absolute coordinate of the left bound of the box in the image.
ymin: The absolute coordinate of the top bound of the box in the image.
xmax: The absolute coordinate of the right bound of the box in the image.
ymax: The absolute coordinate of the bottom bound of the box in the image.
An image with no abnormality is represented by a row with lesion_type being "No finding" and the four coordinate variables left empty.
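The row format above can be read with the standard library alone. The following is a minimal sketch, assuming each CSV file has a header row whose column names match the nine attributes exactly; the function name `load_boxes` is hypothetical.

```python
import csv
from collections import defaultdict


def load_boxes(csv_file):
    """Group lesion boxes by image_id from an open annotation CSV file.

    Returns {image_id: [(lesion_type, xmin, ymin, xmax, ymax), ...]}.
    Images whose only rows are "No finding" map to an empty list.
    """
    boxes = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Touching the key registers the image even if it has no boxes.
        image_boxes = boxes[row["image_id"]]
        if row["lesion_type"] == "No finding" or not row["xmin"]:
            continue
        image_boxes.append((
            row["lesion_type"],
            float(row["xmin"]), float(row["ymin"]),
            float(row["xmax"]), float(row["ymax"]),
        ))
    return dict(boxes)
```

The function takes an open file object, so it can be called on a handle for annotations/train.csv or annotations/test.csv opened with `newline=""`, as the `csv` module recommends.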
The folder structure of the dataset is as follows:
├── annotations
│   ├── test.csv
│   └── train.csv
├── test_images
│   ├── 00073745e02e69432c002b527c565151.dicom
│   ├── ...
│   └── fffa8adcc5e692cdb816051b6202870d.dicom
└── train_images
    ├── 004004095d8a302b1c0815ccb044c018.dicom
    ├── ...
    └── ff6a81f9fa386401ce11a0eb74e1f661.dicom
The VinDr-SpineXR dataset was created for the purpose of developing and evaluating algorithms for detecting and localizing anomalies in spine X-ray scans. It has been used previously in the associated paper, which developed and implemented a deep learning-based framework for the classification and localization of abnormalities from spine X-rays. The dataset can also be used for general computer vision tasks, such as object detection and multi-label image classification.
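For the multi-label classification use case, lesion-level rows can be collapsed into one binary vector per image. The sketch below shows one way to do this. It assumes the lesion_type strings in the CSV files match the thirteen lesion names as spelled in this description, which should be checked against the actual files; the function and constant names are hypothetical.

```python
LESION_TYPES = [
    "Ankylosis", "Disc space narrowing", "Enthesophytes", "Foraminal stenosis",
    "Fracture", "Osteophytes", "Sclerotic lesion", "Spondylolysthesis",
    "Subchondral sclerosis", "Surgical implant", "Vertebral collapse",
    "Foreign body", "Other lesions",
]


def multilabel_targets(rows, classes=LESION_TYPES):
    """Build one binary vector per image from lesion-level rows.

    rows: iterable of dicts with 'image_id' and 'lesion_type' keys
    Returns {image_id: [0 or 1 per class]}; "No finding" rows, and any
    label not present in classes, leave the vector unchanged.
    """
    index = {name: i for i, name in enumerate(classes)}
    targets = {}
    for row in rows:
        vector = targets.setdefault(row["image_id"], [0] * len(classes))
        position = index.get(row["lesion_type"])
        if position is not None:
            vector[position] = 1
    return targets
```

An image with only a "No finding" row ends up with an all-zero vector, which is the usual encoding of a normal image in multi-label settings.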
This is the first public release (v1.0) of the VinDr-SpineXR dataset.
We would like to thank Hospital 108 (H108) and Hanoi Medical University Hospital (HMUH) for their collaboration in creating the VinDr-SpineXR dataset. We also thank the radiologists, technicians, and other collaborators who were involved in this work, particularly Anh T. Nguyen, Ha T. Vuong, Nguyet T.B. Dang, Tien D. Phan, Dung T. Le, and Chau T.B. Pham, for their assistance in the data construction and normalization process.
Conflicts of Interest
Vingroup Big Data Institute (VinBigdata) supported the creation of this resource. Hieu Huy Pham, Hieu Trung Nguyen and Ha Quy Nguyen are currently employed by VinBigdata. VinBigdata did not profit from the work done in this project.
1. Deyo, Richard A., and Andrew K. Diehl. "Lumbar spine films in primary care." - Journal of General Internal Medicine 1.1 (1986): 20-25.
2. Cieza, A., Causey, K., Kamenov, K., Hanson, S. W., Chatterji, S., & Vos, T. (2020). "Global estimates of the need for rehabilitation based on the Global Burden of Disease study 2019: a systematic analysis for the Global Burden of Disease Study 2019." - The Lancet, 396(10267), 2006-2017.
3. Santiago, F. R., Ramos-Bossini, A. J. L., Wáng, Y. X. J., & Zúñiga, D. L. (2020). "The role of radiography in the study of spinal disorders." - Quantitative Imaging in Medicine and Surgery, 10(12), 2322.
4. LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." - Nature 521.7553 (2015): 436-444.
5. Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "ImageNet classification with deep convolutional neural networks." - Communications of the ACM 60.6 (2017): 84-90.
6. Kim, K.C., Cho, H.C., Jang, T.J., Choi, J.M., Seo, J.K.: "Automatic detection and segmentation of lumbar vertebrae from X-ray images for compression fracture evaluation." - Computer Methods and Programs in Biomedicine p. 105833 (2020)
7. Lindsey, R., Daluiski, A., Chopra, S., Lachapelle, A., Mozer, M., Sicular, S., Hanel, D., Gardner, M., Gupta, A., Hotchkiss, R., et al.: "Deep neural network improves fracture detection by clinicians." - Proceedings of the National Academy of Sciences 115(45), 11591–11596 (2018)
8. Rajpurkar, P., Irvin, J., Bagul, A., Ding, D., Duan, T., Mehta, H., Yang, B., Zhu, K., Laird, D., Ball, R.L., et al.: "MURA: Large dataset for abnormality detection in musculoskeletal radiographs." - arXiv preprint arXiv:1712.06957 (2017)
9. Thian, Y.L., Li, Y., Jagmohan, P., Sia, D., Chan, V.E.Y., Tan, R.T.: "Convolutional neural networks for automated fracture detection and localization on wrist radiographs." - Radiology: Artificial Intelligence 1(1), e180001 (2019)
10. Zhang, X., Wang, Y., Cheng, C.T., Lu, L., Xiao, J., Liao, C.H., Miao, S.: "A new window loss function for bone fracture detection and localization in X-ray images with point-based annotation." - arXiv preprint arXiv:2012.04066 (2020)
11. Gertych, A., Zhang, A., Sayre, J., Pospiech-Kurkowska, S., Huang, H.: "Bone age assessment of children using a digital hand atlas." - Computerized Medical Imaging and Graphics 31(4-5), 322–331 (2007)
12. Osteoarthritis initiative: A multi-center observational study of men and women. URL: https://oai.epi-ucsf.org/datarelease/, accessed: 2021-02-22
13. Halabi, S.S., Prevedello, L.M., Kalpathy-Cramer, J., Mamonov, A.B., Bilbily, A., Cicero, M., Pan, I., Pereira, L.A., Sousa, R.T., Abdala, N., et al.: "The RSNA pediatric bone age machine learning challenge." - Radiology 290(2), 498–503 (2019)
14. Kuok, C.P., Fu, M.J., Lin, C.J., Horng, M.H., Sun, Y.N.: "Vertebrae segmentation from X-ray images using convolutional neural network." - International Conference on Information Hiding and Image Processing (IHIP). pp. 57–61 (2018)
15. Nghia T. Nguyen, Phuc T. Truong, Van T. Ho, Trung V. Nguyen, Hieu T. Pham, Mi T. Nguyen, Long T. Dam, Ha Q. Nguyen: "VinDr Lab: A Data Platform for Medical AI", URL: https://github.com/vinbigdata-medical/vindr-lab, 2021
16. Nguyen, Hieu T., et al. "VinDr-SpineXR: A deep learning framework for spinal lesions detection and classification from radiographs." - International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2021.
Only registered users who sign the specified data use agreement can access the files.
License (for files):
PhysioNet Restricted Health Data License 1.5.0
Data Use Agreement:
PhysioNet Restricted Health Data Use Agreement 1.5.0