Fast brain tumour segmentation using optimized U-Net and adaptive thresholding

Brain tumour segmentation has evolved as a dominant task in brain image processing. Most contemporary research proposals devise deep neural networks and sparse representation to address this issue. These methods inherently suffer from high computational cost and additional memory requirements. Thus, optimization of the computational cost has become a challenging task for contemporary research. This paper discusses an optimized U-Net model with post-processing for fast brain tumour segmentation. The proposed model includes two phases: training and testing. The training phase computes weights for the optimized U-Net and an adaptive threshold value. In the testing phase, the trained U-Net model predicts a rough tumour segment, and adaptive thresholding then extracts the final tumour with improved segmentation results. We have considered a brain tumour dataset of 3064 images with three types of brain tumours for evaluation. Our proposed model exhibits results superior to the existing models in terms of recall and dice similarity, and competitive performance in accuracy and precision. Moreover, the proposed model outperforms its competitive models in training time.


Introduction
Brain cancer is the 10th leading type of cancer causing death and is established as one of the deadliest hazards in the world [1]. In earlier days, brain cancer diagnosis was a tedious task for neurologists. Later, neuroscience was enriched with magnetic resonance (MR) imaging, which simplifies the task of brain tumour diagnosis through visualization of the brain structure. More aggressive or high-grade tumours are incurable and decrease life expectancy [2]. According to the American Cancer Society annual report, there is a rapid increase in the number of brain cancer cases [1]. Hence, there is a need for automated diagnosis systems to help neurologists. The main objectives of brain tumour diagnosis automation systems are to reduce human intervention and to detect tumours at an early stage. Some contemporary diagnosis applications include brain tumour growth estimation [3], brain health tissue estimation [4], brain tumour nuclei/cell detection [5] and brain image classification [6].
Brain tumour segmentation is one of the essential operations for an automated brain diagnosis system. In general, brain segmentation techniques can be segregated into region-based, threshold-based, clustering-based and classification-based methods [7].
(1) Region-based methods select a seed point and then grow or split the region based on the intensity of regions; thus, these methods depend on the homogeneity of image intensity. Contour/shape-based, level-set-based and graph-based methods are some of the region-based methods for tumour region detection. In general, threshold-based segmentation methods are computationally more efficient than other methods. Figure 1 demonstrates the influence of the threshold on tumour segmentation. Consider the most common types of brain tumours, namely, meningioma, glioma and pituitary, as shown in Figure 1(a, e and i), respectively. The second column of Figure 1 depicts binary images with threshold (th) > 0.2. Similarly, the third and fourth columns show binary images with th > 0.3 and th > 0.4, respectively. It also evidences that a common threshold can be expected within the range [0.3, 0.4] for these three images. However, this threshold range changes as the number of images increases. The threshold value depends on the type of tumour, orientation of the slice, number of slices and modality of the image (T1, T2 and FLAIR). A single fixed threshold value is not sufficient in all cases, and there is a need for an adaptive threshold value. Although threshold-based methods are computationally efficient, they are unable to achieve high segmentation accuracy.
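The threshold behaviour illustrated in Figure 1 can be sketched in a few lines of NumPy (the pixel intensities below are hypothetical, chosen only to show how the binary mask shrinks as th grows):

```python
import numpy as np

# Hypothetical normalized intensities from a brain MR slice (illustrative only).
pixels = np.array([0.25, 0.35, 0.45, 0.15])

def binarize(img, th):
    # Keep only pixels brighter than the threshold, as in Figure 1.
    return (img > th).astype(np.uint8)

for th in (0.2, 0.3, 0.4):
    # The surviving mask shrinks as the threshold grows, which is why
    # one fixed threshold cannot fit every image in the dataset.
    print(th, binarize(pixels, th))
```

With th = 0.2 three pixels survive, with th = 0.4 only one: the mask shape depends directly on the chosen threshold, motivating the adaptive threshold used later in the paper.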
Nowadays, classification-based methods especially deep learning techniques are gaining popularity in the applications of medical image analysis [8]. U-Net [9] and its variants are the most dominant deep convolutional neural network (DCNN) models for medical image segmentation. DCNN refers to neural networks with many layers that extract hierarchical features from raw input images [8]. The major bottleneck of the deep convolutional neural networks is computation time due to a huge number of convolution layers. Thus, the design of DCNN with an optimal number of layers is a challenging task for researchers. Moreover, DCNN produces a grey-scale segment image which needs a thresholding operation to achieve a final binary tumour segment.
Basic U-Net consists of two paths, namely, the contracting path and the expansion path. Convolution along with pooling is performed in the contracting path. On the other hand, up-convolution is performed in the expansion path to produce the tumour segment. Consider a brain tumour image and its ground truth image, as shown in Figure 2(a, b), respectively. The resulting tumour segment obtained from basic U-Net can be observed in Figure 2(c). This image represents probabilities of the tumour pixels, and hence it needs thresholding to produce a binary tumour mask. In general, binary thresholding with threshold th > 0.5 is used to extract the tumour mask, as shown in Figure 2(d). The shape of this tumour mask depends on the threshold value, as can be visualized from Figure 2(e, f). These images reveal that the shape of the predicted tumour varies with varying threshold values. However, the shape of the tumour segment obtained using th > 0.3 is more accurate than with other threshold values in this case. Thus, there is a need for an adaptive threshold after U-Net to achieve the accurate shape of the tumour.
It motivated us to propose a new model with optimized U-Net and adaptive thresholding. The proposed model encompasses training and testing phases. The training phase acquires weights for optimized U-Net and an adaptive threshold value. These weights and adaptive threshold value are used in the testing phase to predict tumour segment.
The rest of the paper is organized as follows: Section 2 presents a detailed analysis of the existing models. The optimized U-Net architecture is discussed in Section 3. Section 4 describes the working process of the proposed model. The discussion of results is included in Section 5. Finally, Section 6 concludes our findings.

Literature review
In this section, we analyse various brain tumour segmentation methods. Kermi et al. [10] presented a new fully automated brain tumour segmentation method. It performs segmentation as a three-step process. Initially, image pre-processing is applied to remove noise. Then, symmetry analysis is performed to locate the tumour using a deformable model. Finally, region growing and geodesic level-set methods are used to acquire the final tumour. Experiments are conducted using 285 subjects of 3D MR images with different types and shapes of tumours from the BraTS 2017 dataset, attaining sensitivity scores of 81.59% and 89.01% for T2 and FLAIR, respectively.
Hao et al. [11] developed a voxel-wise residual network with a set of training schemes. The 2D residual learning is extended into a 3D variant to solve segmentation tasks with a deeper network. Twenty-five layers are used to train the deep network with limited training data for brain segmentation. Results are evaluated using the BrainS 2013 dataset, achieving a dice similarity of 89.46%.
Wu et al. [12] proposed a radiomics framework for the differentiation of two clinical problems. It performs feature extraction and selection using sparse representation. Statistical characteristics of the tumour regions are explored with dictionary learning and sparse-representation-based feature extraction. A new regularization term coefficient is introduced using a multi-feature collaborative sparse representation classification framework. Results are evaluated using a private dataset of 102 patients, achieving 98.51% accuracy.
Chen et al. [13] devised a light-weight dilated multi-fibre network to attain real-time segmentation. Group convolution is performed to explore multi-fibre units. 3D dilated convolutions are used to build a multi-scale feature representation. Results are evaluated on the BraTS 2018 challenge dataset, acquiring a dice similarity of 90.62%.
The majority of these existing methods use deep convolutional neural networks and sparse representation. A common demerit of these methods is high computational cost [14,15]. Moreover, deep neural networks require high memory to store training parameters or weights. Thus, the objective of the proposed model is to optimize U-Net without losing its performance. Our proposed method uses only 10 convolution layers to acquire an optimal computational cost. Moreover, the number of trainable parameters is reduced as the input image is resized to (64, 64). Adaptive thresholding is performed as post-processing to improve the segmentation results. Existing methods used the BraTS dataset, which consists of low-grade and high-grade tumours. We considered a three-tumour dataset to exhibit the performance of multi-tumour segmentation.

Proposed optimized U-Net architecture
The U-Net model [9] is the most dominant deep convolutional neural network for medical image segmentation. The basic U-Net model consists of two separate convolution paths, namely, the contracting path and the expansive path. Two 3 × 3 unpadded convolutions are performed in each step of the contracting path. Convolution layers are followed by a rectified linear unit (ReLU), and max-pooling of (2, 2) with stride 2 is performed for down-sampling. Two convolutions and max-pooling together form a block, and four such blocks are used in the contracting path. Up-sampling of the feature map followed by a 2 × 2 up-convolution is performed in each step of the expansive path. This path expands the size of the image, which is the reverse of the contracting path. Concatenation with the correspondingly cropped feature map from the contracting path is then performed. The expansive path also uses four blocks of up-convolutions to regain the original size of the image. The final layer uses a 1 × 1 convolution to map each 64-component feature vector to the desired number of classes.
Basic U-Net and its variants include more than 20 convolution layers along with pooling and dropout layers. Our proposed optimized U-Net (OU-Net) model uses one convolution layer in each convolution block instead of two. We use a batch normalization layer in place of the second convolution layer. The normalized results are passed through pooling layers in the contracting path. Dropout layers with a dropout rate of 0.05 are included to optimize the trainable parameters. Thus, the proposed OU-Net model consists of ten convolution layers, as can be visualized from Figure 3.
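A minimal Keras sketch of such an architecture is given below. The filter widths (16 to 256) are illustrative assumptions; the paper specifies only one 3 × 3 convolution plus batch normalization per block, dropout of 0.05, and ten convolution layers in total for (64, 64) inputs:

```python
from tensorflow.keras import layers, models

def conv_block(x, filters):
    # One convolution per block (instead of two in basic U-Net),
    # with batch normalization in place of the second convolution.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.BatchNormalization()(x)
    return x

def build_ou_net(input_shape=(64, 64, 1)):
    inp = layers.Input(input_shape)
    skips, x = [], inp
    # Contracting path: four blocks with max-pooling and dropout 0.05.
    for f in (16, 32, 64, 128):
        x = conv_block(x, f)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
        x = layers.Dropout(0.05)(x)
    x = conv_block(x, 256)  # bottleneck: 5th convolution layer
    # Expansive path: up-sample, concatenate the skip, one conv per block.
    for f, skip in zip((128, 64, 32, 16), reversed(skips)):
        x = layers.UpSampling2D(2)(x)
        x = layers.Concatenate()([x, skip])
        x = conv_block(x, f)  # convolution layers 6-9
    out = layers.Conv2D(1, 1, activation="sigmoid")(x)  # 10th convolution
    return models.Model(inp, out)

model = build_ou_net()
n_convs = sum(isinstance(l, layers.Conv2D) for l in model.layers)
print(n_convs)  # 10 convolution layers, matching the proposed OU-Net
```

The count of ten covers the four contracting convolutions, the bottleneck, the four expansive convolutions and the final 1 × 1 output convolution.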
The complete configuration of the proposed OU-Net can be observed from Table 1. The training and validation error of each model is compared, as shown in Figure 4(a, b), respectively. These plots reveal that the error becomes consistent after 10 epochs in both models. However, the basic U-Net and the proposed OU-Net models produce approximate errors of 0.04 and 0.7, respectively, after 10 epochs.
In the proposed OU-Net, the predicted probabilities of tumour pixels decrease due to the smaller number of convolution layers compared to basic U-Net, as can be visualized from Figure 5. Consider a sample brain MR image and its ground truth image, as shown in Figure 5(a, b), respectively. After training, basic U-Net predicts the tumour region quite accurately, as shown in Figure 5(c). However, the proposed OU-Net fails to retain the complete shape of the tumour, as can be observed from Figure 5(d). This figure shows that a few tumour pixels at the bottom right corner are not included in the tumour prediction. This is due to the removal of a few relevant tumour pixels in binary segmentation (using threshold th > 0.5). Thus, the proposed OU-Net suffers from a higher validation error than basic U-Net. This motivated us to use adaptive thresholding as a post-processing operation to improve the segmentation results. The adaptive threshold value is the mean threshold value of the dataset, which helps to include the relevant tumour pixels in segmentation.

Proposed optimized U-Net and adaptive thresholding
The proposed Optimized U-Net and Adaptive Thresholding (OUAT) model uses adaptive thresholding along with the optimized U-Net. The proposed OUAT model consists of two phases, namely, training and testing. The training phase acquires the weights of the optimized U-Net and an adaptive threshold (ath) value. The tumour segment is predicted using the trained OU-Net and the adaptive threshold in the testing phase. A complete block diagram of the proposed model is depicted in Figure 6. Details of each step are as follows.

Training phase
The training phase comprises two independent tasks: training of the optimized U-Net model and computation of the adaptive threshold value. The compilation process generates an untrained model with random weights. Our model is compiled using the Adam optimizer and the Jaccard loss function. Training of OU-Net is performed on the train data after the compilation process. The training process reads each image of the train data and updates the model weights to produce the trained OU-Net. In addition to the OU-Net, the proposed model uses an adaptive threshold selection algorithm to compute the adaptive threshold value. Let DI be a set of N train images and GI be a set of N ground truth tumour images. If the size of each image is (X, Y), then the proposed adaptive threshold selection procedure is given by Algorithm 1.
Here, th_i represents the mean threshold of the i-th image. Similarly, ath denotes the mean threshold of the train dataset. This threshold value helps in rejecting non-tumour pixels and including tumour pixels in the convoluted image obtained from the optimized U-Net model.
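A possible NumPy sketch of Algorithm 1 is shown below, under the assumption that the per-image threshold th_i is taken as the mean intensity of the pixels marked as tumour in the ground truth mask; the paper defines th_i only as the mean threshold of the i-th image, so this interpretation is hypothetical:

```python
import numpy as np

def adaptive_threshold(train_images, ground_truths):
    """Compute ath, the mean threshold over the train dataset.

    Assumption: th_i is the mean intensity of the tumour pixels of
    image i, selected via its ground truth mask (Algorithm 1 sketch).
    """
    per_image = []
    for img, gt in zip(train_images, ground_truths):
        tumour_pixels = img[gt > 0]       # intensities inside the tumour mask
        if tumour_pixels.size:
            per_image.append(tumour_pixels.mean())  # th_i for this image
    return float(np.mean(per_image))      # ath: mean of all th_i
```

For example, an image whose tumour pixels have intensities 0.8 and 0.6 contributes th_i = 0.7, and ath is the average of these per-image values over the whole train set.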

Testing phase
Initially, the trained OU-Net is used to predict the tumour segment. It incurs lower segmentation accuracy due to the smaller number of layers. Then, adaptive thresholding is performed as a post-processing operation. This operation gives the final segmentation result by suppressing the non-tumour region and including the tumour region. If TE(x, y) is the output of the trained OU-Net, then the final predicted tumour image PI(x, y) can be computed using the following equation:

PI(x, y) = 1 if TE(x, y) ≥ ath, and PI(x, y) = 0 otherwise. (1)
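The testing-phase thresholding can be sketched directly in NumPy (the probability map and threshold value below are hypothetical):

```python
import numpy as np

def apply_adaptive_threshold(te, ath):
    # te: probability map TE(x, y) predicted by the trained OU-Net;
    # ath: adaptive threshold computed in the training phase.
    # PI(x, y) = 1 where TE(x, y) >= ath, else 0.
    return (te >= ath).astype(np.uint8)

# Hypothetical 2 x 3 probability map and an example threshold of 0.45.
te = np.array([[0.10, 0.50, 0.80],
               [0.40, 0.44, 0.90]])
print(apply_adaptive_threshold(te, 0.45))
```

Pixels with predicted probability below ath are suppressed as non-tumour, while those at or above it are kept, yielding the final binary tumour mask PI.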

Results and discussion
Basic U-Net [9] and variants of the residual network (Resnet) [16], such as Resnet18 and Resnet34, are among the most promising medical segmentation models, so we used these models in the performance analysis. Cheng et al. [17] have provided the brain tumor dataset (BTDS) of 3064 T1-weighted MR images, which is publicly available [18]. BTDS comprises magnetic resonance images of 233 patients with three types of tumours: meningioma, glioma and pituitary. Thus, we used this dataset for the evaluation of the models. The proposed optimized U-Net and adaptive thresholding (OUAT) model along with its competitive models have been simulated using Python; the Resnet code has been adapted from the Keras segmentation models library. Experiments have been conducted on an Intel Xeon-based system with 13 GB RAM. In our experiments, we use 50% of the dataset as train data and 50% as test data.

Performance analysis
We have considered four significant metrics, namely, accuracy, precision, recall and dice similarity. The first three metrics concentrate on the pixel classification rate. On the other hand, dice similarity focuses on the amount of overlap between the predicted images and the ground truth images. In general, binary images of the predicted and ground truth images are considered for the comparison of brain tumour segmentation. There are four possible labels for each pixel while comparing the predicted image with the ground truth image: true positive (TP), true negative (TN), false positive (FP) and false negative (FN). Accuracy is the sum of true-positive and true-negative pixels divided by the total number of pixels; it can be computed using Equation (2). Precision is the fraction of relevant pixels among the retrieved pixels of the segmentation results. Recall measures the proportion of positive voxels that are correctly segmented. Precision and recall can be computed using Equations (3) and (4), respectively.
Dice similarity mainly focuses on the amount of overlap between the predicted segment image and ground truth image. It is an important metric to compare overall segmentation scores and Equation (5) is used to compute dice similarity.
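All four metrics can be computed from the pixel-wise counts with a short NumPy function. This is a sketch following the standard definitions behind Equations (2)-(5); binary masks are assumed:

```python
import numpy as np

def segmentation_metrics(pred, gt):
    # pred, gt: binary masks (1 = tumour, 0 = background).
    tp = np.sum((pred == 1) & (gt == 1))   # true positives
    tn = np.sum((pred == 0) & (gt == 0))   # true negatives
    fp = np.sum((pred == 1) & (gt == 0))   # false positives
    fn = np.sum((pred == 0) & (gt == 1))   # false negatives
    accuracy = (tp + tn) / (tp + tn + fp + fn)   # Equation (2)
    precision = tp / (tp + fp)                   # Equation (3)
    recall = tp / (tp + fn)                      # Equation (4)
    dice = 2 * tp / (2 * tp + fp + fn)           # Equation (5)
    return accuracy, precision, recall, dice
```

For instance, a prediction that captures every tumour pixel plus one background pixel has perfect recall but reduced precision, mirroring the precision/recall trade-off discussed for the adaptive threshold.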
In our experiments, we have trained the models with 5, 10, 20 and 50 epochs. Comparison of accuracy, precision, recall and dice similarity values is depicted in Figure 7(a-d), respectively. Figure 7(a) shows that the proposed model and its competitive models achieve an accuracy of 98% with 10 and more epochs.
Resnet34 outperforms our proposed model in precision, as can be observed from Figure 7(b). The proposed model uses an adaptive threshold value, which includes non-tumour pixels that mimic tumour pixels; this causes a decrease in the precision of the proposed model. However, the proposed OUAT model outperforms the competitive models in recall, as can be visualized from Figure 7(c). Our proposed model takes advantage of adaptive thresholding to retain the maximum number of tumour pixels and thereby achieve high recall. Figure 7(d) depicts the comparison of dice similarity. Our proposed model exhibits performance superior to its competitive models in dice similarity. Table 2 depicts a visual comparison of the brain tumour segmentation results on three brain MR images of meningioma, glioma and pituitary. The first and second columns show the original and ground truth images, respectively. The remaining columns visualize the segmentation results of U-Net, Resnet18, Resnet34 and the proposed OUAT, respectively. It can be observed that our proposed model predicts the location of the tumour as accurately as the other models. The proposed model acquires accurate tumour shape and size when compared to the other models in the case of meningioma and pituitary tumours. None of the models performs well on the glioma tumour due to its diverse characteristics. However, Resnet34 performs relatively well in the case of the glioma tumour, as listed in Table 3.
To compare the overall performance, we have computed the mean of the key metrics, including accuracy, precision, recall and dice similarity. Each model has been executed ten times with 20 epochs, and the mean of each metric is computed. Table 4 lists the mean values of accuracy, precision, recall and dice similarity. Resnet34 performs well in both accuracy and precision. However, our proposed model outperforms the existing models in recall and dice similarity.

Computational complexity
Convolution layers contribute the major part of the computational cost of convolutional neural networks. The cost of a convolution layer depends on the number of multiplication operations. The cost of a single convolution operation is T_S = O(a^2) when the kernel size is (a, a). The total computational cost of a convolution layer depends on the size of the convolved image and the cost of a single convolution (T_S). Thus, the total cost of a single convolution layer is T_C = O(m^2 * a^2) if the size of the convolved image is (m, m). Table 5 compares the number of layers and the average time complexity of the proposed model and its competitive models. It shows that the proposed OU-Net outperforms the existing models with fewer convolution layers. Similarly, Table 6 lists the comparison of training times on CPU and GPU systems. Our proposed model takes 179 sec., 875 sec. and 1511 sec. less training time than U-Net, Resnet18 and Resnet34, respectively, on the CPU. Similarly, on the GPU, it takes 3 sec., 46 sec. and 90 sec. less training time than U-Net, Resnet18 and Resnet34, respectively.
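The layer-cost expression T_C = O(m^2 * a^2) can be made concrete with a small counting function (channel counts are ignored here, as in the big-O argument above):

```python
def conv_layer_mults(m, a):
    # Single convolution: T_S = O(a^2) multiplications for an (a, a) kernel.
    # Whole layer: each of the m * m positions of the (m, m) convolved
    # image applies one kernel, giving T_C = O(m^2 * a^2) in total.
    return m * m * a * a

# Example: a 3 x 3 kernel over a 64 x 64 feature map, matching the
# (64, 64) input size used by the proposed OU-Net.
print(conv_layer_mults(64, 3))  # 36864 multiplications
```

Since each convolution layer contributes such a term, reducing the layer count from more than 20 (basic U-Net) to 10 (OU-Net) roughly halves this dominant component of the cost.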

Conclusions
Deep neural networks are the contemporary tools to address brain tumour segmentation. It is recommended to optimize the number of layers to reduce the computational cost; however, reducing the layers incurs a loss of accuracy. This work presented an optimized U-Net model with adaptive thresholding as a post-processing operation. The adaptive threshold represents the mean threshold of the train dataset, which helps retain tumour pixels. The proposed model starts with a training phase that acquires the weights of the optimized U-Net model and an adaptive threshold value from the train dataset. In the testing phase, the given test image is convolved with the optimized U-Net model, and the final tumour segment is then produced with adaptive thresholding. A brain tumour dataset of 3064 images with three types of tumours is used for the evaluation. To exhibit the performance of the proposed model, we adopted four key metrics: accuracy, precision, recall and dice similarity. Our proposed model exhibits results superior to its competitive models in recall and dice similarity, and shows competitive performance with Resnet34 in precision and accuracy. Moreover, our model outperforms the existing models in terms of training time. Our future work focuses on improving precision for more accurate brain tumour segmentation.

Disclosure statement
No potential conflict of interest was reported by the author(s).