
CheXmask Database: a large-scale dataset of anatomical segmentation masks for chest x-ray images

Nicolas Gaggion, Candelaria Mosquera, Martina Aineseder, Lucas Mansilla, Diego Milone, Enzo Ferrante

Published: Sept. 6, 2023. Version: 0.2

When using this resource, please cite:
Gaggion, N., Mosquera, C., Aineseder, M., Mansilla, L., Milone, D., & Ferrante, E. (2023). CheXmask Database: a large-scale dataset of anatomical segmentation masks for chest x-ray images (version 0.2). PhysioNet.

Additionally, please cite the original publication:

Gaggion, N., Mosquera, C., Mansilla, L., Aineseder, M., Milone, D. H., & Ferrante, E. (2023). CheXmask: A large-scale dataset of anatomical segmentation masks for multi-center chest x-ray images. arXiv, 2307.03293.

Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.


Abstract

The CheXmask Database presents a comprehensive, uniformly annotated collection of chest radiographs, constructed from six public databases: CANDID-PTX, ChestX-ray8, CheXpert, MIMIC-CXR-JPG, Padchest and VinDr-CXR. The database aggregates 676,803 anatomical segmentation masks, all produced by processing the source images with the HybridGNet model so that the segmentations are consistent across datasets. To let users assess the quality of the segmentations, the database includes an individual Reverse Classification Accuracy (RCA) score for each segmentation mask. The dataset is intended to support further work in semantic chest X-ray analysis and to serve as a resource for researchers in the medical imaging domain.


Background

Chest radiography is an indispensable tool for diagnosing lung diseases, but complex thoracic structures make the images difficult to interpret. Advances in deep learning have allowed automated analysis systems to be integrated into radiologists' workflows, easing the burden created by the scarcity of radiologists and making chest X-ray labeling more efficient [1,2]. However, the efficacy of these systems depends on large, diverse, and accurately annotated datasets for training. We employed the HybridGNet [3] model, originally introduced with a small dataset, to segment six extensive chest X-ray databases, namely CANDID-PTX [4], ChestX-ray8 [5], CheXpert [6], MIMIC-CXR-JPG [7], Padchest [8] and VinDr-CXR [9], and publicly release the resulting segmentation masks together with an automatic quality control index based on Reverse Classification Accuracy (RCA) [10].


Methods

Image Segmentation: We employed the HybridGNet deep learning model for image segmentation. This model segments organ contours by predicting anatomical landmark coordinates, and it is trained to reduce the distance between the predicted and actual landmark positions. Using an encoder-decoder architecture, it combines standard convolutions for image encoding with graph generative models that produce anatomically plausible representations. The pixel-level masks are obtained by filling the contours derived from the landmarks predicted by HybridGNet. To enhance the model's robustness and generalizability, we retrained HybridGNet on an extensive dataset that includes the complete Chest-Xray-Landmark dataset [12] used in previous studies [3, 11]. We applied data augmentation techniques such as random rotations, scaling, and color shifting during training.
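
The contour-filling step can be illustrated with a small sketch. This is not the authors' implementation: it assumes a closed contour given as an ordered list of (x, y) landmark points and marks a pixel as foreground when its center lies inside the polygon (even-odd ray casting). The function name `fill_contour` and the toy square are invented for illustration.

```python
import numpy as np

def fill_contour(points, height, width):
    """Rasterize a closed polygon (ordered (x, y) points) into a binary mask.

    A pixel is set when its center lies inside the polygon (even-odd rule).
    """
    pts = np.asarray(points, dtype=float)
    # Pixel-center coordinates for every row and column.
    ys, xs = np.mgrid[0:height, 0:width]
    px, py = xs + 0.5, ys + 0.5
    inside = np.zeros((height, width), dtype=bool)
    x0, y0 = pts[:, 0], pts[:, 1]
    x1, y1 = np.roll(x0, -1), np.roll(y0, -1)  # next vertex, wrapping around
    for ax, ay, bx, by in zip(x0, y0, x1, y1):
        if ay == by:
            continue  # a horizontal edge never crosses a horizontal ray
        # Does this edge straddle the horizontal line through the pixel center?
        crosses = (ay > py) != (by > py)
        # x coordinate where the edge meets that horizontal line.
        x_at_y = ax + (py - ay) * (bx - ax) / (by - ay)
        # Toggle pixels whose rightward ray crosses this edge.
        inside ^= crosses & (px < x_at_y)
    return inside.astype(np.uint8)

# Toy example: a 4x4 square contour inside an 8x8 image.
square = [(2, 2), (6, 2), (6, 6), (2, 6)]
mask = fill_contour(square, 8, 8)
```

In practice a library rasterizer would normally be used for this step; the explicit loop is only meant to show what "filling a contour" computes.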

Automated Quality Assessment: To assess the quality of the segmentations, we used Reverse Classification Accuracy (RCA) [10] to estimate the Dice Similarity Coefficient (DSC), a widely accepted metric for evaluating the similarity between two binary label sets. RCA produces DSC estimates without requiring Ground Truth (GT) data for the image being evaluated. The RCA framework trains a new segmentation model, the reverse classifier, using the predicted mask as the GT. This reverse classifier is then applied to a reference image set with known GT masks. The segmentation accuracy of the reverse model on the reference set is assumed to be indicative of the original image's segmentation quality. To expedite the evaluation, we adopted a deep learning-based atlas registration process as the reverse classifier. This two-stage process starts with rigid registration for global alignment, followed by deformable registration to correct any local misalignments. The transformation aligning the atlas to the reference image is then applied to the predicted mask, and Dice is estimated from the overlap between the deformed predicted mask and the known reference segmentation.
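
For reference, the Dice Similarity Coefficient between two binary masks A and B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it with NumPy. It shows only the metric, not the registration-based RCA pipeline described above; the name `dice_coefficient` and the choice to return 1.0 when both masks are empty are assumptions made for this example.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice Similarity Coefficient between two binary masks of equal shape."""
    a = np.asarray(mask_a).astype(bool)
    b = np.asarray(mask_b).astype(bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Convention chosen here: two empty masks agree perfectly.
        return 1.0
    return 2.0 * np.logical_and(a, b).sum() / total

# Toy example: a 4-pixel mask against a 6-pixel mask that fully contains it.
deformed_pred = np.zeros((4, 4), dtype=np.uint8)
deformed_pred[1:3, 1:3] = 1
reference = np.zeros((4, 4), dtype=np.uint8)
reference[1:3, 1:4] = 1

score = dice_coefficient(deformed_pred, reference)  # 2 * 4 / (4 + 6) = 0.8
```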

Data Description

The CheXmask dataset is structured as individual CSV files, one for each chest x-ray dataset. The dataset does not include original images or metadata. Instead, an image ID is provided in the first column of each CSV file, aligning with the ID column of the respective original dataset, making it straightforward to match rows in the CSV with the original dataset's instances. Because HybridGNet requires inputs of a fixed size, all images were preprocessed to a 1024x1024 shape, and the resulting masks were then restored to the original image shapes. The pre-processed versions of the masks are also included, so that all datasets can be used at the same image resolution. It's worth noting that the masks from CANDID-PTX and ChestX-ray8 are already at the desired resolution.

The structure of the CheXmask CSV file is detailed as follows:
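
The ID-based matching can be sketched with Python's standard csv module. Everything concrete in this example is invented for illustration: the column name `image_id` (the real ID column name varies by dataset), the `path` column, and the in-memory CSV contents standing in for a mask file and an original dataset's metadata file.

```python
import csv
import io

# Hypothetical stand-in for one CheXmask CSV (mask columns omitted).
masks_csv = io.StringIO(
    "image_id,Dice RCA (Mean),Height,Width\n"
    "img_001,0.91,1024,1024\n"
    "img_002,0.64,2048,2500\n"
)
# Hypothetical stand-in for the original dataset's metadata.
source_csv = io.StringIO(
    "image_id,path\n"
    "img_001,images/img_001.png\n"
    "img_003,images/img_003.png\n"
)

def index_by(rows, key):
    """Build a lookup table from an ID value to its row."""
    return {row[key]: row for row in rows}

source_by_id = index_by(csv.DictReader(source_csv), "image_id")

# Keep only mask rows whose ID also exists in the original dataset,
# merging the columns of both rows.
matched = [
    {**row, **source_by_id[row["image_id"]]}
    for row in csv.DictReader(masks_csv)
    if row["image_id"] in source_by_id
]
```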

  • Image ID: Contains references to the original images, according to the original metadata, leading to variability in the column name across datasets.
  • Dice RCA (Max): Provides the maximum Dice Similarity Coefficient for the Reverse Classification Accuracy (RCA), indicating the quality of the segmentation.
  • Dice RCA (Mean): Provides the mean Dice Similarity Coefficient for the Reverse Classification Accuracy (RCA), serving as another measure of the segmentation quality.
  • Landmarks: Incorporates a set of points that represent the organ contours, as obtained by the HybridGNet model.
  • Left Lung: Contains the segmentation masks of the left lung, formatted in run-length encoding (RLE).
  • Right Lung: Contains the segmentation masks of the right lung, also in RLE format.
  • Heart: Contains the segmentation masks of the heart, formatted in RLE.
  • Height: Indicates the height of the segmentation mask, which is essential for decoding the RLE.
  • Width: Denotes the width of the segmentation mask, necessary for RLE decoding.

This layout offers a comprehensive view of the data in each record, providing information about the image ID, segmentation quality metrics, organ contours, and segmentation masks for each organ, along with mask dimensions.
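
As a minimal sketch of how an RLE column can be turned back into a mask, the code below decodes a string using the Height and Width values. The exact RLE convention is not stated in this description, so the one used here is an assumption: space-separated "start length" pairs, 1-based starts, pixels counted in row-major order. It should be checked against the project's own decoding code before use. The names `decode_rle` and `encode_rle` are hypothetical.

```python
import numpy as np

def decode_rle(rle, height, width):
    """Decode an RLE string into a (height, width) binary mask.

    Assumed convention: space-separated "start length" pairs,
    1-based starts, row-major pixel order.
    """
    values = np.array(rle.split(), dtype=int)
    starts, lengths = values[0::2] - 1, values[1::2]  # to 0-based starts
    flat = np.zeros(height * width, dtype=np.uint8)
    for start, length in zip(starts, lengths):
        flat[start:start + length] = 1
    return flat.reshape(height, width)

def encode_rle(mask):
    """Inverse of decode_rle under the same assumed convention."""
    flat = np.asarray(mask, dtype=np.uint8).ravel()
    padded = np.concatenate([[0], flat, [0]])
    # 1-based positions where the value changes: run starts, then run ends.
    changes = np.flatnonzero(padded[1:] != padded[:-1]) + 1
    starts, ends = changes[0::2], changes[1::2]
    return " ".join(f"{s} {e - s}" for s, e in zip(starts, ends))

# Toy example on a 2x4 mask: pixels 2-3 and pixel 7 (1-based) are set.
toy_mask = decode_rle("2 2 7 1", height=2, width=4)
round_trip = encode_rle(toy_mask)
```

The round trip through `encode_rle` is included only as a sanity check of the assumed convention on toy data.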

Usage Notes

The database itself does not release any of the images used to generate the segmentation masks. To obtain the source images, users will need to refer to the original source of each dataset and comply with the requirements that each original dataset has, such as ethics courses and training. For downstream analysis, we recommend using only those segmentation masks whose Dice RCA (Mean) is >= 0.7, to ensure that potentially erroneous masks are not included in the analysis.
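
The recommended quality filter can be sketched as follows. The 0.7 threshold comes from the recommendation above; the header strings are taken from the field names listed in the Data Description and may differ from the exact headers in the released files, and the `image_id` column and sample rows are invented for illustration.

```python
import csv
import io

RCA_THRESHOLD = 0.7  # recommended minimum Dice RCA (Mean)

def keep_reliable(rows, threshold=RCA_THRESHOLD, column="Dice RCA (Mean)"):
    """Return only the rows whose mean RCA Dice estimate meets the threshold."""
    return [row for row in rows if float(row[column]) >= threshold]

# Hypothetical stand-in for a CheXmask CSV (mask columns omitted).
sample = io.StringIO(
    "image_id,Dice RCA (Max),Dice RCA (Mean)\n"
    "img_001,0.95,0.91\n"
    "img_002,0.80,0.70\n"
    "img_003,0.72,0.64\n"
)

reliable = keep_reliable(csv.DictReader(sample))
reliable_ids = [row["image_id"] for row in reliable]
```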


Ethics

All publicly available datasets utilized in this study adhered to strict ethical standards and underwent thorough anonymization, with identifiable details removed. The study does not release any part of the original image datasets; it only provides already anonymized image identifiers to allow researchers to match the original images with our annotations. The CANDID-PTX and MIMIC-CXR-JPG datasets required additional ethics training and research courses for access. The study authors fulfilled all ethics courses and data use agreement requirements to ensure ethical data usage.

Conflicts of Interest

The authors have no conflicts of interest to declare.


References

  1. Ronneberger O, Fischer P, Brox T. U-net: Convolutional networks for biomedical image segmentation. In: Navab N, Hornegger J, Wells W, Frangi A, editors. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Cham: Springer; 2015. p. 234-241. (Lecture Notes in Computer Science; vol 9351).
  2. Moukheiber D, Mahindre S, Moukheiber L, Moukheiber M, Wang S, Ma C, Shih G, Peng Y, Gao M. Few-Shot Learning Geometric Ensemble for Multi-label Classification of Chest X-Rays. In: Data Augmentation, Labelling, and Imperfections: Second MICCAI Workshop, DALI 2022, Held in Conjunction with MICCAI 2022, Singapore, September 22, 2022, Proceedings. Cham: Springer Nature Switzerland; 2022. p. 112-122.
  3. Gaggion N, Mansilla L, Mosquera C, Milone DH, Ferrante E. Improving anatomical plausibility in medical image segmentation via hybrid graph neural networks: applications to chest x-ray analysis. IEEE Trans Med Imaging. 2022. doi:10.1109/TMI.2022.3224660.
  4. Feng S, et al. Curation of the candid-ptx dataset with free-text reports. Radiology: Artificial Intelligence. 2021;3(6):e210136.
  5. Wang X, et al. Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. Proceedings of the IEEE conference on computer vision and pattern recognition. 2017.
  6. Irvin J, et al. Chexpert: A large chest radiograph dataset with uncertainty labels and expert comparison. Proceedings of the AAAI conference on artificial intelligence. 2019;33(01).
  7. Johnson AE, et al. MIMIC-CXR-JPG, a large publicly available database of labeled chest radiographs. arXiv preprint. 2019. arXiv:1901.07042.
  8. Bustos A, et al. Padchest: A large chest x-ray image dataset with multi-label annotated reports. Med Image Anal. 2020;66:101797.
  9. Nguyen HQ, et al. VinDr-CXR: An open dataset of chest X-rays with radiologist’s annotations. Sci Data. 2022;9(1):429.
  10. Valindria VV, et al. Reverse classification accuracy: predicting segmentation performance in the absence of ground truth. IEEE Trans Med Imaging. 2017;36:1597–1606.
  11. Gaggion N, Vakalopoulou M, Milone DH, Ferrante E. Multi-center anatomical segmentation with heterogeneous labels via landmark-based models. In: 20th IEEE International Symposium on Biomedical Imaging (ISBI). IEEE; 2023.
  12. Gaggion N. Chest-xray-landmark-dataset [Internet]. GitHub repository. Available from: [Accessed 6/27/2023]

Parent Projects
CheXmask Database: a large-scale dataset of anatomical segmentation masks for chest x-ray images was derived from: Please cite them when using this project.

Access Policy:
Anyone can access the files, as long as they conform to the terms of the specified license.

License (for files):
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International Public License

Versions
  • 0.1 - June 27, 2023
  • 0.2 - Sept. 6, 2023
  • 0.3 - Jan. 9, 2024
  • 0.4 - March 1, 2024


Total uncompressed size: 37.7 GB.
