Database Restricted Access
Smartphone-Captured Chest X-Ray Photographs
Published: Sept. 27, 2020. Version: 1.0.0
When using this resource, please cite:
Kuo, P., Tsai, C., Lopez, D. M., Karargyris, A., Pollard, T., Johnson, A., & Celi, L. A. (2020). Smartphone-Captured Chest X-Ray Photographs (version 1.0.0). PhysioNet. https://doi.org/10.13026/7b2j-nq93.
Please include the standard citation for PhysioNet:
Goldberger, A., Amaral, L., Glass, L., Hausdorff, J., Ivanov, P. C., Mark, R., ... & Stanley, H. E. (2000). PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation [Online]. 101 (23), pp. e215–e220.
Recent research has applied deep learning imaging techniques to detect pulmonary pathology in chest X-rays (CXR) and achieved performance comparable with radiologists on specific lung abnormalities. Industry has managed to deploy such artificial intelligence (AI) solutions on mobile devices to help clinicians improve their workflow. However, the feasibility of applying these modern AI algorithms to CXR photographs has yet to be evaluated. In this project, 6,453 smartphone photographs were taken of frontal-view CXR images from two publicly available databases, MIMIC-CXR and CheXpert. We constructed four derivative CXR photograph datasets (Photo-MMC, Photo-CXP, Photo-MED, and Photo-DEV), including photographs taken by several resident doctors and photographs taken with different devices.
The chest X-ray is one of the most common imaging tests performed in clinical practice to detect chest abnormalities. Given their strong performance, artificial intelligence (AI) models are expected to ease the daily burden on radiologists and to identify insidious diseases that are missed by the naked eye [1-3]. Despite these promising results, no research has thoroughly investigated whether conventional high-performing AI models can be directly transferred to diagnose low-resolution smartphone-captured CXRs.
We used frontal-view CXR images from both the MIMIC-CXR [4] and CheXpert [5] datasets. The images from MIMIC-CXR were annotated by the CheXpert [5] and NegBio [6] labelers, and a total of 14 labels (no finding, enlarged cardiomediastinum, cardiomegaly, airspace opacity, lung lesion, edema, consolidation, pneumonia, atelectasis, pneumothorax, pleural effusion, pleural other, fracture, and support devices) were automatically extracted from radiology reports. The labels of CheXpert images were directly given by CheXpert. Four CXR photograph datasets were manually created by making photographic copies of CXR images from both datasets:
- Photo-MMC: CXR photographs were captured by three participants using eight different devices (Apple iPhone X, Apple iPhone 6s, Apple iPad 2, Acer Z330, ASUS Zenfone 3, Asus Zenfone 5z, Samsung A30, Samsung J7). During capture, the CXR images were displayed on eight different computer monitors (MSI GE 40, MSI PS63, Asus VE278, Toshiba Portege R700, Apple MacBook Pro13, Lenovo Yoga 520, Dell SE2417HGX, Samsung SyncMaster 191T plus). The photographs were taken at different times and locations and under various lighting conditions. A total of 1759 photographs of randomly selected MIMIC-CXR images were taken with the devices listed above.
- Photo-CXP: With the same settings, a total of 1337 photographs of randomly selected images from CheXpert were taken.
- Photo-MED: To simulate the end-user scenario, 1337 photographs of the same source images as those in Photo-CXP were taken by nine resident doctors. Each doctor photographed one portion of the images using their own smartphone. The application Microsoft Office Lens (Microsoft Corp.) was recommended for immediately and automatically cropping the photos on their smartphones. No quality requirements or environmental restrictions were imposed. The doctors were instructed to "take photos as if you want to send them to your radiology colleague and have their opinion." The nine smartphones they used were: Google Pixel 3, Apple iPhone 8, Apple iPhone 6s Plus, Apple iPhone 7 Plus, Asus Zenfone 5z, iPhone 6s, Apple iPhone XS, Asus ZenFone3, and Apple iPhone XR. The computer monitors used were: Dell XPS13, Apple MacBook Air, Apple MacBook Pro 13, Apple MacBook Pro 15, Acer vg270k, Acer Swift 3, and BenQ ew2775zh.
- Photo-DEV: The 202 images corresponding to the validation dataset provided by CheXpert were each photographed 10 times by a single physician. For the first 9 rounds, different device settings (3 different smartphones: Apple iPhone 6s, Acer Z330, Asus Zenfone 5z; 3 different computer monitors: MSI GE 40, Dell SE2417HGX, Asus VE278) were used under the same lighting condition and at the same place. An additional set of 202 photographs was collected under a brighter lighting condition. A total of 2020 photographs were collected.
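The dataset sizes above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch rather than part of the released project. It assumes that the nine Photo-DEV device settings are the full cross of the three smartphones and three monitors, which the description suggests but does not state outright.

```python
from itertools import product

# Photograph counts per dataset, as stated in the descriptions above.
counts = {
    "Photo-MMC": 1759,
    "Photo-CXP": 1337,
    "Photo-MED": 1337,
    "Photo-DEV": 2020,
}

# Photo-DEV: 202 source images, each photographed in 10 rounds.
n_source_images = 202
phones = ["Apple iPhone 6s", "Acer Z330", "Asus Zenfone 5z"]
monitors = ["MSI GE 40", "Dell SE2417HGX", "Asus VE278"]

# ASSUMPTION: the nine device settings are every phone/monitor pairing.
device_settings = list(product(phones, monitors))

# Nine device-setting rounds plus one brighter-lighting round.
n_rounds = len(device_settings) + 1

photo_dev_total = n_source_images * n_rounds
total_photographs = sum(counts.values())

print(f"Photo-DEV: {n_source_images} x {n_rounds} = {photo_dev_total}")
print(f"All four datasets: {total_photographs}")
```

Under that assumption the Photo-DEV count matches the stated 2020, and the four datasets together give the 6,453 photographs reported for the project.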
Four directories correspond to the four CXR photograph datasets. Each folder contains the CXR images in JPEG format and their corresponding labels in CSV format (not_mentioned = 0; negative = 1; uncertain = 2; positive = 3). The name of each image is composed of the subject_ID, study_num, and image_num in the original MIMIC-CXR/CheXpert dataset. For Photo-CXP-DEV, labels are either positive or negative, as given by CheXpert. For each dataset, the detailed photographing settings for each set of images can be found in a Readme file in TXT format. Photo-MMC can be downloaded along with this project. The images for Photo-CXP, Photo-MED, and Photo-DEV can be downloaded from the Stanford Photo-CheXpert website: http://eepurl.com/hcIi0T, and their labels and Readme files are available in this project.
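As an illustration of the label encoding described above, the integer codes in a label CSV could be decoded as follows. This is a minimal sketch under stated assumptions: the `image` column name, the file names, and the two-finding excerpt are hypothetical and are not taken from the released files, whose actual layout should be checked against each dataset's Readme.

```python
import csv
import io

# Integer codes used in the label CSV files, as described above.
LABEL_CODES = {0: "not_mentioned", 1: "negative", 2: "uncertain", 3: "positive"}


def decode_labels(csv_file, id_column="image"):
    """Map each image name to a dict of {finding: decoded label}.

    ASSUMPTION: one row per image, with the image name in `id_column`
    and every other column holding an integer label code.
    """
    decoded = {}
    for row in csv.DictReader(csv_file):
        name = row.pop(id_column)
        decoded[name] = {
            # int(float(...)) also accepts codes written as "3.0".
            finding: LABEL_CODES[int(float(code))]
            for finding, code in row.items()
        }
    return decoded


# Hypothetical excerpt with two findings; the datasets define 14 labels.
sample = io.StringIO(
    "image,cardiomegaly,edema\n"
    "example_image_1.jpg,3,0\n"
    "example_image_2.jpg,1,2\n"
)
labels = decode_labels(sample)
print(labels["example_image_1.jpg"])
```

To read an actual file, pass an open file handle, for example `open(path, newline="")`, in place of the in-memory sample.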
Use of the dataset is free to all researchers after signing a data use agreement which stipulates, among other items, that the user will not share the data and will make no attempt to reidentify individuals. The user must also agree to the Research Use Agreement from Photo-CheXpert (http://eepurl.com/hcIi0T).
The work was conceived, designed, and conducted during the fall 2019 course HST.953 Collaborative Data Science in Medicine at the Harvard-MIT Division of Health Sciences and Technology. The creation of the MIMIC-CXR dataset and authors (TJP, AEWJ, and LAC) were funded by the National Institutes of Health through R01 grant EB017205. PCK received support from the LEAP program funded by the Taiwanese Ministry of Science and Technology. DML received funding from the Fulbright 2019 Visiting Scholar Program. We thank Dr. Wei-Chi Huang, Dr. Huang Yung (National Taiwan University Hospital), Dr. Yu-Tung Lan, Dr. Te-Wei Wang, Dr. Fan-Yun Lan (Harvard T.H. Chan School of Public Health), Dr. Po-Ya Tung (Taipei Veteran Hospital), Dr. Ning-Hsuan Chin (Far Eastern Memorial Hospital), Dr. Tsung-An Chen (Taipei City Hospital, Zhongxiao Branch), and Dr. Hao-Hsiang Hsu (National Cheng Kung University Hospital) for collecting CXR photographs. We also thank Dr. Roger G. Mark (Massachusetts Institute of Technology) for supporting this project.
Conflicts of Interest
We declare no competing interests.
References
- Rajpurkar P, Irvin J, Ball RL, Zhu K, Yang B, Mehta H, Duan T, Ding D, Bagul A, Langlotz CP, Patel BN. Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS medicine. 2018 Nov 20;15(11):e1002686.
- Rubin J, Sanghavi D, Zhao C, Lee K, Qadir A, Xu-Wilson M. Large scale automated reading of frontal and lateral chest x-rays using dual convolutional neural networks. arXiv preprint arXiv:1804.07839. 2018 Apr 20.
- Majkowska A, Mittal S, Steiner DF, Reicher JJ, McKinney SM, Duggan GE, Eswaran K, Cameron Chen PH, Liu Y, Kalidindi SR, Ding A. Chest radiograph interpretation with deep learning models: Assessment with radiologist-adjudicated reference standards and population-adjusted evaluation. Radiology. 2020 Feb;294(2):421-31.
- Johnson AE, Pollard TJ, Berkowitz SJ, Greenbaum NR, Lungren MP, Deng CY, Mark RG, Horng S. MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Scientific Data. 2019;6.
- Irvin J, Rajpurkar P, Ko M, Yu Y, Ciurea-Ilcus S, Chute C, Marklund H, Haghgoo B, Ball R, Shpanskaya K, Seekins J. CheXpert: A large chest radiograph dataset with uncertainty labels and expert comparison. In Proceedings of the AAAI Conference on Artificial Intelligence 2019 Jul 17 (Vol. 33, pp. 590-597).
- Peng Y, Wang X, Lu L, Bagheri M, Summers R, Lu Z. Negbio: a high-performance tool for negation and uncertainty detection in radiology reports. AMIA Summits on Translational Science Proceedings. 2018;2018:188.
Only logged in users who sign the specified data use agreement can access the files.
License (for files):
PhysioNet Restricted Health Data License 1.5.0
Data Use Agreement:
PhysioNet Restricted Health Data Use Agreement 1.5.0