The Effectiveness of Data Augmentation for Bone Suppression in Chest Radiograph using Convolutional Neural Network

Research Article

Austin J Cancer Clin Res. 2021; 8(2): 1095.

Ren G¹, Lam S-K¹, Ni R¹, Yang D¹, Qin J² and Cai J¹*

¹Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China

²School of Nursing, The Hong Kong Polytechnic University, Hong Kong SAR, China

*Corresponding author: Jing Cai, Department of Health Technology and Informatics, The Hong Kong Polytechnic University, Hong Kong SAR, China

Received: July 23, 2021; Accepted: August 11, 2021; Published: August 18, 2021

Abstract

Objective: Bone suppression in chest radiographs holds great promise for improving localization accuracy in Image-Guided Radiation Therapy (IGRT). However, data scarcity has long been considered the prime obstacle to developing Convolutional Neural Network (CNN) models for this task. In this study, we explored the effectiveness of various data augmentation techniques for bone suppression.

Methods: Chest radiographs and bone-free chest radiographs were derived from 59 high-resolution CT scans. Two CNN models (U-Net and Generative Adversarial Network (GAN)) were adapted to explore the effectiveness of various data augmentation techniques for bone signal suppression in the Chest Radiograph (CR). The lung radiograph and the bone-free radiograph were used as the input and target label, respectively. The impacts of six typical data augmentation techniques (flip, cropping, noise injection, rotation, shift and zoom) on model performance were investigated. A series of statistical evaluation metrics, including Peak Signal-To-Noise Ratio (PSNR), Structural Similarity (SSIM) and Mean Absolute Error (MAR), were used to comprehensively assess the prediction performance of the two networks under the six data augmentation strategies. Quantitative comparison showed that the different data augmentation techniques influenced the performance of the CNN models to varying degrees in the task of CR bone signal suppression.

Results: For the U-Net model, flips, rotation (10 to 20 degrees), all the shifts, and zoom (1/8) improved model prediction accuracy. By contrast, the other studied augmentation techniques adversely affected model performance. The GAN model was found to be more sensitive to the studied augmentation techniques than the U-Net: vertical flip was the only augmentation method that improved its performance.

Conclusion: We found that different data augmentation techniques had varying degrees of impact on the prediction performance of the U-Net and GAN models in the task of bone suppression in CR. However, it remains challenging to determine the optimal parameter settings for each augmentation technique. A more comprehensive evaluation is still warranted to assess the effectiveness of different augmentation techniques in task-specific image synthesis.

Keywords: Data augmentation; Bone suppression; Chest radiograph

Introduction

Lung cancer is the second most commonly occurring cancer worldwide, accounting for about 11.4% of new cancer cases [1]. One of the standard treatments for lung cancer is radiation therapy [2,3]. With the help of On-Board Imaging (OBI) systems, Image-Guided Radiation Therapy (IGRT) can deliver a more accurate dose to the tumor region and reduce radiation toxicity to normal tissues [4,5]. The 2D Chest Radiograph (CR) generated by the OBI system is commonly used to determine the patient position and reduce variations in patient position during the IGRT course for lung cancer [6]. However, the bony structures in CR often obscure the target or landmarks, causing a maximum localization error of 22 mm during IGRT of lung cancer [7]. Bone suppression in CR is regarded as a promising solution for improving the localization accuracy of IGRT [8].

Various efforts have been made to suppress bone in CR. Dual-Energy (DE) radiographic imaging leverages the difference in attenuation coefficients between bones and soft tissues to separate bone and soft-tissue images using two levels of X-ray exposure [9]. Despite the increased diagnostic sensitivity, its clinical application in radiation oncology is still largely restricted. More recently, deep learning techniques have been extensively studied, and a variety of CNNs have achieved remarkable progress in the task of bone signal suppression, including multiple massive-training artificial neural networks [10], filter learning [11], massive training artificial neural network [12], cascade of multi-scale convolutional neural networks [13], and frequency-specific deep neural network convolution [14], to name a few. These methods suppress bone structures by regression prediction, in which the bone-free Digital Radiograph (DR) is used as the training ground truth [10-16]. Although such bone suppression methods provide radiologists with an unobstructed view of the lung tissue and improve the diagnostic sensitivity of CR without incurring additional radiation dose to patients, the prediction accuracy of deep learning models still relies heavily on the availability of large-scale, high-quality data [17,18]. This poses a practical challenge in real-world scenarios, since substantial expense and manual effort are required to obtain the large amount of data needed, especially when the desired labels are sparsely available [19].

To overcome this roadblock in building deep learning models, data augmentation, the process of applying one or more geometric deformations to artificially inflate the size of the training dataset [20,21], has been widely adopted. Because deep learning models treat a geometrically transformed image as a meaningful image, the deformed dataset provides CNN models with additional “unseen” training data. As such, data augmentation plays a vital role in enhancing the performance of classification and segmentation, since it increases data variability [22,23] without affecting the semantic validity of the original dataset [24]. The effectiveness of data augmentation has been tested on many natural image datasets, including MNIST handwritten digit recognition, CIFAR-10/100, ImageNet, tiny-imagenet-200, SVHN (street view house numbers), Caltech-101/256, MIT places, MIT-Adobe 5K dataset, Pascal VOC, and Stanford Cars [19]. Hussain et al. compared model performance under different augmentation strategies, and their results suggested that both discriminative and generative performance were drastically affected [25]. Several data augmentation approaches, such as flips and rotations, have been widely studied in the literature involving raw medical images. Nevertheless, the impact of various data augmentation techniques on medical image synthesis problems, particularly bone signal suppression in CR, remains to be investigated.
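As a minimal illustration of augmentation in an image-to-image setting, the sketch below applies the same geometric transform to an input radiograph and its bone-free target so that the pair stays aligned. It is not the implementation used in this study: the function names, the zero-filled borders, and the choice to inject noise into the input only are assumptions made for illustration, and rotation and zoom are omitted because they require interpolation.

```python
import numpy as np

def flip_pair(x, y, axis):
    """Flip input and target along the same axis (0 = vertical, 1 = horizontal)."""
    return np.flip(x, axis=axis), np.flip(y, axis=axis)

def shift_pair(x, y, dy, dx):
    """Shift both 2D images by (dy, dx) pixels, zero-filling the exposed border.

    Assumes |dy| < height and |dx| < width (hypothetical helper).
    """
    def shift(img):
        h, w = img.shape
        assert abs(dy) < h and abs(dx) < w
        out = np.zeros_like(img)
        # Region of the source image that remains visible after the shift.
        src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
        r0, c0 = max(0, dy), max(0, dx)
        out[r0:r0 + src.shape[0], c0:c0 + src.shape[1]] = src
        return out
    return shift(x), shift(y)

def crop_pair(x, y, top, left, size):
    """Cut the same square window out of input and target."""
    window = (slice(top, top + size), slice(left, left + size))
    return x[window], y[window]

def add_noise(x, sigma, rng):
    """Add zero-mean Gaussian noise to the input only (an assumption here)."""
    return x + rng.normal(0.0, sigma, size=x.shape)

# Toy 4x4 "radiograph" pair; real inputs would be full-size images.
x = np.arange(16.0).reshape(4, 4)
y = 2.0 * x
fx, fy = flip_pair(x, y, axis=1)
sx, sy = shift_pair(x, y, dy=1, dx=-1)
cx, cy = crop_pair(x, y, top=1, left=1, size=2)
nx = add_noise(x, sigma=0.05, rng=np.random.default_rng(0))
```

Applying an identical transform to both images matters because the target must remain the pixel-wise counterpart of the input; transforming only one of them would corrupt the supervision signal.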

In this study, we investigated the impacts of six typical data augmentation techniques (flip, cropping, noise injection, rotation, shift, and zoom), each with varying intensities of augmentation, on bone signal suppression in CR using two popular deep learning architectures, U-Net and GAN. A series of statistical evaluation metrics, including Peak Signal-To-Noise Ratio (PSNR), Structural Similarity (SSIM) and Mean Absolute Error (MAR), were used to comprehensively evaluate the prediction performance of the deep learning models under different data augmentation strategies. Our overarching purpose was to provide insights into the optimal use of data augmentation for synthesizing bone-free CR with these two architectures.
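For readers unfamiliar with these metrics, the following sketch computes them with their standard textbook definitions. It assumes images scaled to the range [0, 1] (the `data_range` parameter) and is not the evaluation code of this study. In particular, `global_ssim` is a simplified single-window variant computed over the whole image, whereas SSIM is usually averaged over local sliding windows.

```python
import numpy as np

def mae(pred, target):
    """Mean absolute error between prediction and target."""
    return float(np.mean(np.abs(pred - target)))

def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(data_range^2 / MSE)."""
    mse = np.mean((pred - target) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))

def global_ssim(pred, target, data_range=1.0):
    """Structural similarity over the whole image as a single window.

    Simplified illustration; the constants follow the common choice
    C1 = (0.01 * L)^2 and C2 = (0.03 * L)^2 with L = data_range.
    """
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_p, mu_t = pred.mean(), target.mean()
    var_p, var_t = pred.var(), target.var()
    cov = np.mean((pred - mu_p) * (target - mu_t))
    num = (2 * mu_p * mu_t + c1) * (2 * cov + c2)
    den = (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    return float(num / den)

# Toy example: the prediction is the target plus a constant offset of 0.1.
target = np.linspace(0.0, 1.0, 16).reshape(4, 4)
pred = target + 0.1
scores = {
    "MAE": mae(pred, target),
    "PSNR": psnr(pred, target),
    "SSIM": global_ssim(pred, target),
}
```

Higher PSNR and SSIM and lower mean absolute error indicate a prediction closer to the bone-free target.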