A robust method for skin cancer diagnosis based on interval analysis

Early diagnosis of skin cancer from dermoscopy images significantly reduces the mortality due to this cancer. However, several factors affect the precision of a diagnosis system. One important problem arises during image acquisition: medical photography is often subject to uncertainties such as noise, brightness variations, initial digitization and sampling, all of which affect image quality. This study presents a new approach for border detection of the cancer area that takes these uncertainties into account. Interval analysis is used to extend the proposed edge detection method, and the Hukuhara difference is used to develop the differentiation formula for edge detection in the interval space. The method is applied to two different skin cancer atlases, and the results are compared with three popular methods under two types of noise, Gaussian noise and salt-and-pepper noise. The results show that the introduced method gives better results than the compared methods.


Introduction
Skin cancer is the result of abnormal changes in the outer layer of the skin and is recognized as the most common cancer in the world, accounting for 75% of the world's cancers [1]. Although most people with skin cancer are cured, it remains a major concern because of its high prevalence.
Most skin cancers grow only locally and invade adjacent tissues, but some of them, particularly melanoma (cancer of the pigment cells), which is the rarest type of skin cancer, may spread through the circulatory or lymphatic system and reach the farthest points of the body. Melanoma is the most serious type of skin cancer. About 6850 people were expected to die of melanoma in 2020 (about 4610 men and 2240 women) [2]. On average, this disease currently causes about one death per hour. Melanoma is more prevalent in some areas, especially in western regions and countries.
According to the findings, diagnosing skin cancer, especially melanoma, in the early stages of the disease can significantly reduce mortality. However, early-stage diagnosis is difficult even for specialists and experts, so a method that helps them diagnose melanoma more easily in the early stages would be very helpful.
In recent years, with the advancement of technology, and in particular artificial intelligence, suitable methods have been developed for this issue. In the meantime, image processing techniques are progressing as successful ways for these purposes. Using methods and techniques for image processing and cancer diagnosis from images reduces human errors and increases the speed of detection.
Besides, medical image processing is important because it helps physicians and radiologists diagnose the disease more easily, thus protecting the patient against irreparable harm.
In the last decade, image processing has become a major component of intelligent decision support systems, and it is often applied to digital images and computer systems [3][4][5][6].
Various uses of image processing in various fields of technology, industry, urban, medicine, and science have made it a very active topic among research fields [7][8][9].
Several methods have been introduced for skin cancer detection based on image processing [10].
Among them, thresholding-based methods have been used in numerous studies because of their simple implementation.
Examples are Otsu's method [11] and Kapur's method [12], which divide the image into two classes and binarize it based on a threshold point.
However, these methods sometimes have difficulties; for instance, the segmented cancer area may be smaller than its real size, which can result in extremely asymmetrical lesion boundaries.
Oliveira [10] presented a substitute method for melanoma recognition through dermoscopy images, based on exterior appearance and shade of colour characteristic extraction.
In 2010, Sadeghi et al. presented a graph-based technique for pigment diagnosis [13]. The reported accuracy of the technique for detecting the cancerous and healthy parts was 92.6%.
In 2013, Razmjooy et al. proposed a simple method for skin cancer detection by considering four main signs of cancer, i.e. asymmetry, border, colour and diameter [14]. The method gave good results and resulted in a stand-alone method for melanoma detection.
In 2018, Razmjooy et al. presented a method based on soft computing for skin cancer detection [6]. They proposed a method based on the optimized neural network by World Cup Optimization (WCO) algorithm.
In 2018, Salem et al. proposed an optimized method for melanoma detection based on a genetic algorithm [15]. The method is a two-phase technique for classifying images into malignant or benign.
In 2019, Hagerty et al. presented a combined approach based on handcrafted technique and deep learning for melanoma cancer detection [16].
Edge detection algorithms can identify many objects in an image from their outlines. Medical applications are a prime example. The human vision system performs a kind of edge detection before recognizing the colour or intensity of the light.
Therefore, it is logical to find the edges before interpreting images in automated systems. Edge detection is an important processing step in many artificial vision systems [17,18].
Edge detection is a set of mathematical operations that identify areas of the image where the brightness changes sharply, i.e. it can be used to detect drastic changes in lighting, which are usually a sign of an important change in the environment.
A shadow border is not a physical boundary; it is where a part of the image starts or ends. An edge can be considered as the place where the horizontal and vertical sides of an object come together.
Edge detection responds to a change between two grey levels, that is, between the brightness values of two adjacent pixels, at a specific location of the image.
One of the main processes required to store an input image function $f:\mathbb{R}^d \to \mathbb{R}$ in computer memory is to quantize and sample it, where quantizing discretizes the image range and sampling discretizes the spatial domain.
Because of sampling and quantizing, the discretization process always loses some information from the input image. For this reason, the intensity information of the image pixels is never fully certain.
This missing information introduces uncertainties which should be considered in different applications of image processing such as edge detection.
For instance, in edge detection, these uncertainties make it difficult even to agree on the correct boundary between objects. Different methods can be used to handle uncertainties, for instance fuzzy methods [19,20], statistical methods [21] and interval methods [22].
Among the presented methods, interval analysis is a method which only needs lower and upper ranges of uncertainty. Because of the nature of digital images, interval analysis is selected for handling the uncertainty [23].
In this research, an interval-based representation for images will be introduced for better managing their inherent uncertainties. Afterward, an extension of the Laplacian method will be proposed for using on the interval-valued images for edge detection purposes.

Interval representation of the image
In this research, skin cancer images are considered to be matrices of i rows and j columns; i.e. for an image, $\gamma = [1, \ldots, i] \times [1, \ldots, j]$ is the set of its positions.
Consider an image A with pixel values $A(\gamma)$ and a certain position $\alpha \in \gamma$. Also, $n(\alpha) \subset \gamma$ denotes the set of positions in a 4 × 4 neighbourhood centred at $\alpha$, including $\alpha$ itself. Unless $\alpha$ belongs to the image border, $|n(\alpha)| = 16$ [24].
From the previous section, it is concluded that digital images carry some uncertainties after the discretization process, which partially or even completely affect the ordinary operations of image processing.
Some of these uncertainties may be generated due to the noise, brightness intensity limiting during the discretization, etc. [25,26]. Image noise contains a random variation of colour information or brightness in images and is usually a characteristic of electronic noise. It can be generated by the image sensor and digital camera or circuitry of a scanner. As noted before, the image discretizes the reality in two different facets, quantizing and sampling. In this research, the brightness of the image has been considered as a part of sampling ambiguity and a more significant fact of uncertainty.
In the process of brightness sampling, a finite number of intensities is stored; however finely the intensities are stored, their precision is limited.
There are usually $2^{24}$ intensity values in RGB images and $2^{8}$ intensity values in greyscale images. However, even at higher levels of detail, the brightness accuracy is always limited. Therefore, the intensity measurement error for a pixel is about $\pm\sigma$. Assume a greyscale image A with i rows and j columns. The interval-based intensity image IA generated from A can be considered as follows:

$$IA(\alpha) = \left[\max\left(0,\ A(\alpha) - \sigma\right),\ \min\left(L,\ A(\alpha) + \sigma\right)\right], \quad \alpha \in \gamma \tag{1}$$

where L is the maximum intensity value of the image class (for instance, L = 1 for the double class and L = 255 for the uint8 class) and $\sigma$ is the brightness uncertainty. Figure 1 shows an example image and its interval representation, including the lower and the upper bounds.
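The interval representation can be sketched in a few lines of code. This is a minimal illustration, assuming each pixel is widened by ±σ and clipped to the valid range [0, L]; the function name `to_interval_image` and the sample values are hypothetical and not taken from the paper's MATLAB implementation.

```python
def to_interval_image(image, sigma, max_level=255):
    """Return (lower, upper) bound images for a greyscale image.

    Assumption: each pixel a becomes the interval
    [max(0, a - sigma), min(max_level, a + sigma)].
    """
    lower = [[max(0, p - sigma) for p in row] for row in image]
    upper = [[min(max_level, p + sigma) for p in row] for row in image]
    return lower, upper


# Tiny uint8-like example with brightness uncertainty sigma = 2.
image = [[0, 10, 254],
         [100, 128, 255]]
lower, upper = to_interval_image(image, sigma=2)
```

Pixels near 0 or 255 get a narrower interval because of the clipping, so the bounds never leave the representable range.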

Improving the histogram of the input image
The main role of a histogram equalization-based method is to enhance the contrast of an input image. The histogram shows the difference between the lowest and the highest brightness in an image, i.e. the histogram will be narrow if this difference is small. Histogram equalization is an approach for spreading the histogram and increasing the contrast to simplify the next steps of image processing.
For clarity, consider an input image im of size [m, n] whose integer pixel intensities lie in the range [0, L − 1], where L is the number of possible intensity values (256 for the uint8 class). The normalized histogram h of im is obtained by the following formula:

$$h(n) = \frac{\text{No. pixels with intensity } n}{\text{Total number of pixels}}, \quad n = 0, 1, \ldots, L - 1 \tag{2}$$

Using the above formula, the histogram-equalized image H is obtained as follows:

$$H(i,j) = fl\left((L-1)\sum_{n=0}^{im(i,j)} h(n)\right)$$

where $fl(\cdot)$ rounds its argument down to the nearest integer. The method treats the intensities of im and H as continuous random variables X and Y in the range [0, L − 1]. Y is obtained by the following formula:

$$Y = CUM_d(X) = (L-1)\int_0^X p_X(x)\,dx$$

where $p_X$ is the probability density function of X and $CUM_d$ is the differentiable and invertible cumulative distribution function of X multiplied by (L − 1). Figure 2 shows a simple example of a skin cancer image with an unsuitable histogram.
As is clear, histogram equalization improves the probability density function of the image and gives the image better contrast.
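The discrete form of histogram equalization can be sketched as follows. This is an illustrative sketch, assuming the standard mapping "floor of (L − 1) times the cumulative normalized histogram"; the function name `equalize_histogram` is hypothetical, and integer arithmetic is used for the floor to avoid rounding surprises.

```python
def equalize_histogram(image, levels=256):
    """Map each intensity p to floor((levels - 1) * cumulative_histogram(p))."""
    total = sum(len(row) for row in image)

    # Histogram: number of pixels at each intensity level.
    hist = [0] * levels
    for row in image:
        for p in row:
            hist[p] += 1

    # Cumulative counts, turned into the new intensity for each level.
    mapping = []
    running = 0
    for count in hist:
        running += count
        mapping.append(((levels - 1) * running) // total)  # floor division

    return [[mapping[p] for p in row] for row in image]


# Small 4-level example: the dark pixels are spread over the full range.
equalized = equalize_histogram([[0, 0], [1, 3]], levels=4)
```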

Median filtering
Often, in the photography process (especially medical photography), noise is introduced by oscillations, and unintended changes appear in the measured signals. Noise is a serious problem in image processing operations. It has a negative effect on image processing, especially on edge detection: because edge detection relies on differentiation, it amplifies high-frequency content, especially noise. A simple way to reduce this problem is to use a median filter [27]. This operation is important for eliminating noise in the input medical images [28,29]. The main advantage of median filtering is that it removes noise while preserving edges. This filter is a nonlinear low-pass filter that needs more processing time. In median filtering, an m × n neighbourhood is considered. All the pixels in the neighbourhood are arranged in ascending order, and the middle element of the ordered values replaces the central pixel. The median filter is well suited to eliminating salt-and-pepper noise. In this paper, a 6 × 6 mask has been employed for median filtering of the input images. Although increasing the mask size reduces the image noise, it also loses some vital edges. Figure 3 shows a simple melanoma image with noise before and after median filtering.
After median filtering a greyscale image, each pixel holds the median of the grey values in its neighbourhood.
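The sort-and-pick-the-middle procedure can be sketched as below. This is only an illustration: it assumes an odd window (3 × 3 by default, rather than the 6 × 6 mask used in the paper) and a window that is simply cropped at the image border; the function name `median_filter` is hypothetical.

```python
def median_filter(image, size=3):
    """Replace each pixel with the median of its size x size neighbourhood.

    Assumptions: `size` is odd, and the window is cropped at the borders.
    For an even number of samples the upper-middle value is taken.
    """
    rows, cols = len(image), len(image[0])
    half = size // 2
    out = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            window = sorted(
                image[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
            )
            out[r][c] = window[len(window) // 2]
    return out


# A single "salt" pixel (255) in a flat region is removed by the filter.
noisy = [[10, 10, 10],
         [10, 255, 10],
         [10, 10, 10]]
filtered = median_filter(noisy)
```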

Image thresholding based on Otsu's method
Before using the proposed image edge detection, we need to threshold the filtered image. Here, we use Otsu's method [12,30]. Otsu's method searches exhaustively for the threshold level that minimizes the within-class variance, which is formulated as:

$$\sigma_w^2(t) = \omega_0(t)\,\sigma_0^2(t) + \omega_1(t)\,\sigma_1^2(t)$$

where $\omega_i$ are the probabilities of the two classes separated by the threshold value t, and $\sigma_i^2$ are the variances of these classes.
Indeed, in Otsu's method, minimizing the within-class variance is equivalent to maximizing the between-class variance [11], i.e.

$$\sigma_b^2(t) = \sigma^2 - \sigma_w^2(t) = \omega_0(t)\,\omega_1(t)\left[\mu_0(t) - \mu_1(t)\right]^2$$

in which the terms $\mu_i$ are the class means.
The algorithm is briefly explained as follows: (1) Calculate the histogram and the probabilities of all intensity levels; (2) Initialize $\omega_i(0)$ and $\mu_i(0)$; (3) For each possible threshold level (t = 1, 2, . . .), update $\omega_i$ and $\mu_i$; (4) Calculate $\sigma_b^2(t)$; (5) The optimal threshold corresponds to the maximum value of $\sigma_b^2(t)$. A simple image threshold based on Otsu's method is given in Figure 4.
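The five steps above can be sketched as a single pass over the histogram. This is a minimal sketch under the usual convention that class 0 holds intensities below t and class 1 the rest; the function name `otsu_threshold` and the sample image are illustrative only.

```python
def otsu_threshold(image, levels=256):
    """Return the threshold t that maximizes the between-class variance."""
    # Step 1: histogram of intensity levels.
    hist = [0] * levels
    total = 0
    for row in image:
        for p in row:
            hist[p] += 1
            total += 1
    sum_all = sum(level * count for level, count in enumerate(hist))

    # Step 2: initial class-0 weight (as a pixel count) and intensity sum.
    count0 = 0
    sum0 = 0
    best_t, best_var = 0, -1.0

    # Steps 3-4: update the class statistics and evaluate each threshold.
    for t in range(1, levels):
        count0 += hist[t - 1]
        sum0 += (t - 1) * hist[t - 1]
        count1 = total - count0
        if count0 == 0 or count1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / count0
        mu1 = (sum_all - sum0) / count1
        var_between = (count0 / total) * (count1 / total) * (mu0 - mu1) ** 2
        # Step 5: keep the threshold with the largest between-class variance.
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Two well-separated groups of intensities.
sample = [[10, 10, 12],
          [200, 202, 200]]
t = otsu_threshold(sample)
binary = [[1 if p >= t else 0 for p in row] for row in sample]
```

Any threshold between the two groups gives the same variance here, so the sketch returns the first one it meets.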

Interval analysis
A classical definition of an interval number over the field of real numbers is as follows:

$$X = [\underline{x}, \overline{x}] = \{x \in \mathbb{R} \mid \underline{x} \le x \le \overline{x}\}$$

where X is an interval number over $\mathbb{IR}$ and $\underline{x}$, $\overline{x}$ are its lower and upper bounds, respectively [31]. Note that all interval numbers are denoted by uppercase symbols. If the lower and the upper bounds of an interval are the same, it is called a degenerate interval number; in this sense, $\mathbb{R} \subset \mathbb{IR}$.
The width ($x_w$), the radius ($x_r$) and the mid-point ($x_c$) of an interval number X are obtained by the following formulas [31,32]:

$$x_w = \overline{x} - \underline{x}, \qquad x_r = \frac{\overline{x} - \underline{x}}{2}, \qquad x_c = \frac{\overline{x} + \underline{x}}{2}$$

Using the definitions above, the interval number can be written in terms of the centre and the radius of the interval as follows:

$$X = x_c + [\Delta x], \qquad [\Delta x] = [-x_r,\ x_r]$$

where $[\Delta x]$ is the symmetric interval of [x].
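These quantities are easy to check numerically. The following is a small illustrative sketch; the `Interval` class and its method names are hypothetical helpers, not part of the paper's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed real interval [lo, hi] with lo <= hi."""
    lo: float
    hi: float

    def width(self):
        return self.hi - self.lo

    def radius(self):
        return (self.hi - self.lo) / 2

    def midpoint(self):
        return (self.hi + self.lo) / 2

    def is_degenerate(self):
        # A degenerate interval is an ordinary real number.
        return self.lo == self.hi

    @staticmethod
    def from_centre(centre, radius):
        # Centred form: X = centre + [-radius, radius].
        return Interval(centre - radius, centre + radius)


# A pixel of intensity 100 with brightness uncertainty 2.
x = Interval(98, 102)
rebuilt = Interval.from_centre(x.midpoint(), x.radius())
```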

Hukuhara difference method
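The ordinary interval difference (Minkowski difference) does not provide a proper difference, i.e. $X + (-X) \neq \{0\}$, where {0} is the degenerate interval zero. In other words, in ordinary interval arithmetic, the additive inverse and the opposite of an interval number are not the same.
In 1967, Hukuhara proposed the Hukuhara difference (H-difference) as a set Z such that $X \ominus Y = Z \Leftrightarrow X = Y + Z$; the important feature of this approach is that $X \ominus X = \{0\}$ [34][35][36][37].
The H-difference $X \ominus Y = Z$ exists if and only if X contains a translate $\{z\} + Y$ of Y. In 2010, Stefanini proposed a generalized version of the H-difference [38].
Definition 6.1: Let X and Y be two intervals, where $X = [\underline{x}, \overline{x}]$ and $Y = [\underline{y}, \overline{y}]$. The gH-difference between these two intervals can be defined as follows [39]:

$$X \ominus_{gH} Y = Z \Leftrightarrow \begin{cases} (i)\ \ X = Y + Z, \ \text{or} \\ (ii)\ \ Y = X + (-1)Z \end{cases}$$

and:

$$X \ominus_{gH} Y = \left[\min\{\underline{x} - \underline{y},\ \overline{x} - \overline{y}\},\ \max\{\underline{x} - \underline{y},\ \overline{x} - \overline{y}\}\right]$$

More details on the gH-difference can be found in [38][39][40][41].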
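The contrast between the ordinary (Minkowski) difference and the gH-difference can be seen in a short sketch. Intervals are represented here as `(lower, upper)` tuples; the function names are illustrative assumptions, and the gH formula used is the closed min/max form for intervals.

```python
def minkowski_difference(x, y):
    """Ordinary interval difference: [x_lo - y_hi, x_hi - y_lo]."""
    return (x[0] - y[1], x[1] - y[0])


def gh_difference(x, y):
    """Generalized Hukuhara difference of two intervals (min/max form)."""
    a = x[0] - y[0]  # difference of lower bounds
    b = x[1] - y[1]  # difference of upper bounds
    return (min(a, b), max(a, b))


x = (1, 3)
ordinary = minkowski_difference(x, x)  # width doubles instead of cancelling
hukuhara = gh_difference(x, x)         # collapses to the degenerate zero

# Case (i) of the definition: X = Y + Z.
z = gh_difference((1, 5), (2, 3))
```

With the ordinary difference, subtracting an interval from itself yields an interval twice as wide as the original, which is the overestimation the gH-difference avoids.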

Interval-based derivative
Since the derivative is an instantaneous rate of change, its wrapping effect error is high, whereas the error of the average rate of change is lower.
The wrapping effect error is the extra interval width that is not actually required. Since edge detection methods are built on derivatives of different orders, an interval-based derivative must be defined before edge detection can be applied to an interval input image.
Here, the definition and the improvements of the proposed interval-based derivative will be described.
Many definitions have been proposed for the derivatives. Among the various methods, an efficient method that is closer to the derivative definition is the method of Stefanini et al. [38].

Interval derivative
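Consider the derivative of the function f on the interval X. The interval derivative at a point $x_0$ is as follows:

$$f'(x_0) = \lim_{h \to 0} \frac{f(x_0 + h) \ominus_{gH} f(x_0)}{h}$$

If the limit above exists, f is generalized Hukuhara differentiable at $x_0$ and $f'(x_0)$ is its generalized Hukuhara derivative.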

Definition 7.2:
The continuity of an interval function is expressed by Equation (22). Note that the generalized Hukuhara derivative exists if the function is continuous and the right derivative $f'_r(x_0)$ and the left derivative $f'_l(x_0)$ are equal, i.e. $f'_r(x_0) = f'_l(x_0)$.
Finally, with a central definition, we derive the derivative as follows.

Definition 7.3:
Assuming the centred definition of the function ($x = x_c + x_r I_c$), we have: By the definition above, the partial derivative is considered as follows:

Taylor inclusion functions
Considering the centred inclusion method and extending it to higher derivatives, the Taylor inclusion functions method is obtained. Consider a twofold extension of the equation [42][43][44] (Equation (27)). The symmetric form of the Hessian matrix is $h_{ij} = \partial^2 f / \partial x_i \partial x_j$ for all i and j. Therefore, a Taylor single-valued system of order n is given as follows [45]:

Edge detection based on Laplacian (Hessian)
As stated earlier, edge detection is the process of characterizing the boundaries among the objects in an image. The Laplacian of Gaussian (LoG) is one such edge detection algorithm. LoG was first introduced by Marr and Hildreth and combines the Laplacian with Gaussian filtering. This method is not very popular in image processing, so improving it can increase its popularity among researchers.
Indeed, the Laplacian is a 2-D isotropic measure of the second-order derivative of an image.
Generally, applying a derivative operator to an image highlights regions with rapid intensity variations; hence, the Laplacian filter, as a second-order derivative operator, can be used for extracting the edges of an image.
In the LoG method, the Laplacian is applied to an image after it has been smoothed by a Gaussian filter to reduce noise, which is where the technique gets its name.
The Laplacian L(i, j) of an image with pixel intensity values ϕ(i, j) is as follows:

$$L(i,j) = \frac{\partial^2 \phi}{\partial i^2} + \frac{\partial^2 \phi}{\partial j^2}$$

which is computed with a convolution filter. Because the input images are digital, the convolution kernel must first be discretized to approximate the Laplacian filter. A commonly used small kernel for the Laplacian (second-order differentiation) is shown in Figure 5.
The 2-D LoG function centred on zero and with Gaussian standard deviation σ can be presented as follows:

$$LoG(i,j) = -\frac{1}{\pi\sigma^4}\left[1 - \frac{i^2 + j^2}{2\sigma^2}\right] e^{-\frac{i^2 + j^2}{2\sigma^2}}$$

One of the most significant drawbacks of LoG is that it malfunctions at places where the intensity level varies. Therefore, a method that accounts for these uncertainties can improve its accuracy. In the following, an extension based on interval analysis is introduced to improve the LoG performance.
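Both the discrete Laplacian and a sampled LoG kernel can be sketched briefly. This is an illustration under stated assumptions: the 4-neighbour kernel (0 1 0; 1 −4 1; 0 1 0) is one commonly used choice and may differ from the kernel in Figure 5, the LoG is sampled from its standard closed form, and the function names are hypothetical.

```python
from math import exp, pi


def laplacian(image):
    """4-neighbour discrete Laplacian; border pixels are left at 0."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            out[r][c] = (image[r - 1][c] + image[r + 1][c]
                         + image[r][c - 1] + image[r][c + 1]
                         - 4 * image[r][c])
    return out


def log_kernel(size, sigma):
    """Sample the 2-D LoG function on a size x size grid centred on zero."""
    half = size // 2
    kernel = []
    for i in range(-half, half + 1):
        row = []
        for j in range(-half, half + 1):
            r2 = (i * i + j * j) / (2 * sigma ** 2)
            row.append(-1.0 / (pi * sigma ** 4) * (1 - r2) * exp(-r2))
        kernel.append(row)
    return kernel


# A vertical step edge: the response changes sign across the edge.
step = [[0, 0, 10, 10],
        [0, 0, 10, 10],
        [0, 0, 10, 10]]
response = laplacian(step)
kernel = log_kernel(5, 1.0)
```

The sign change of the response between neighbouring pixels (a zero crossing) is what marks the edge location.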

An extension of the edge detection based on interval analysis
As explained before, each image must first be discretized before processing. Therefore, in this section, a discrete model of the interval derivative operator is proposed.
The Laplacian (Hessian) equation is obtained by differencing the two equations above and then applying interval analysis, which yields the Hessian matrix for an i × j image matrix. Figure 6 shows how the interval-based Laplacian (Hessian) works.
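One possible discretization of an interval Laplacian is sketched below. It is not the paper's exact formulation: it assumes a 4-neighbour kernel, sums the four neighbour intervals with ordinary interval addition, and subtracts four times the centre interval with the gH-difference; the function name `interval_laplacian` is hypothetical.

```python
def interval_laplacian(lower, upper):
    """Interval Laplacian of an interval image given by its bound images.

    Assumption: (sum of 4 neighbours) gH-minus (4 * centre), evaluated
    on interior pixels only; border pixels are left as [0, 0].
    """
    rows, cols = len(lower), len(lower[0])
    out_lo = [[0] * cols for _ in range(rows)]
    out_hi = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Interval sum of the four neighbours.
            n_lo = (lower[r - 1][c] + lower[r + 1][c]
                    + lower[r][c - 1] + lower[r][c + 1])
            n_hi = (upper[r - 1][c] + upper[r + 1][c]
                    + upper[r][c - 1] + upper[r][c + 1])
            # gH-difference with the scaled centre interval.
            a = n_lo - 4 * lower[r][c]
            b = n_hi - 4 * upper[r][c]
            out_lo[r][c], out_hi[r][c] = min(a, b), max(a, b)
    return out_lo, out_hi


# A flat region with uncertainty +/-2 produces the degenerate zero interval.
flat_lo, flat_hi = interval_laplacian([[8] * 3] * 3, [[12] * 3] * 3)

# A step edge (8 -> 18 with the same uncertainty) gives a clear response.
edge_lo, edge_hi = interval_laplacian([[8, 8, 18]] * 3, [[12, 12, 22]] * 3)
```

With the ordinary interval difference, the flat region would instead yield a wide interval around zero, which illustrates why a Hukuhara-type difference suits edge detection on interval-valued images.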

Dataset description
In this study, two different datasets, the DermIS Digital Database [46] and the Dermquest Database [47], are studied. The Dermquest Database has over 22,000 clinical images, providing an extensive array of resources for dermatologists that are available to download for educational purposes, as well as a clinical photo-sharing facility to share clinical images with colleagues. The images have irregular sizes and were taken under different lighting and brightness conditions (mostly in the range 0 to 255) and with different cameras. The images fall into two classes, melanoma and nonmelanoma, with ground truth. The DermIS Digital Database is the largest online medical information service available on the internet, containing several types of skin cancers with their diagnoses for use in medical image processing. There are 116 melanoma images (74 from Dermquest and 42 from DermIS) and 82 nonmelanoma images (58 from Dermquest and 24 from DermIS) [46,47] (Figure 7).

Method implementation
The presented technique is implemented in MATLAB® R2017 on a 64-bit system with an Intel® Core™ i7 2.6 GHz CPU and 16 GB RAM. Although several methods have been proposed for edge detection, their efficiency is affected by different uncertainties. In this study, the brightness variations of the skin cancer images are considered as the uncertainty factor, and the purpose is to design an edge detection system that is robust in the presence of this uncertainty. Figure 8 shows the results of the presented method for some images. Experimental results show that the presented edge detection gives good results.
To clarify the capability of the system, the peak signal-to-noise ratio (PSNR) is employed as the quantitative metric for analysing the robustness of the presented method to noise in two different settings (Gaussian and salt-and-pepper noise) on 198 random images. The PSNR is a good measure for evaluating the quality of the detected edges (or the performance of the edge detection algorithm). Although the PSNR is mostly used in image compression applications, here it is used to compare edge detection quality. The PSNR represents a measure of the peak error. Based on [48], if an operator gives a resultant image with a lower PSNR, the operator has a higher edge detection capability. Table 1 reports the PSNR variations of the studied images under salt-and-pepper noise for variance (σ) varying from 0.2 to 1 for state-of-the-art methods. As can be seen, the presented approach has a better PSNR than the other methods. Moreover, as σ increases, the other methods get worse while the proposed method improves.
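For reference, the PSNR between two images can be computed as in the sketch below. It assumes the standard definition, 10 log10(MAX² / MSE), with MAX = 255 for 8-bit images; the function name `psnr` and the sample values are illustrative and not taken from the paper's experiments.

```python
from math import inf, log10


def psnr(reference, test, max_level=255):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_error = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for a, b in zip(ref_row, test_row):
            squared_error += (a - b) ** 2
            count += 1
    mse = squared_error / count
    if mse == 0:
        return inf  # identical images
    return 10 * log10(max_level ** 2 / mse)


clean = [[0, 0], [255, 255]]
same = psnr(clean, clean)                  # no error at all
worst = psnr([[0, 0]], [[255, 255]])       # maximal error on every pixel
```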
The PSNR variations of the studied images under Gaussian noise with mean value μ = 0.1 and μ = 0.5 are shown in Tables 2 and 3, respectively. The Canny filter, a popular classic edge detector, gives the worst results. The Fuzzy method and the improved Sobel method follow with successively better results. Finally, the proposed method gives the best results among the compared methods in the presence of noise and brightness variations [51,52]. The Dice overlap ratios for the six images in Figure 8 are given in Table 4. As can be observed, the estimated edge masks are typically very accurate over the skin cancer boundary. Furthermore, three earlier state-of-the-art methods, Improved Sobel [49], Fuzzy [50] and the Canny filter, are compared with the proposed method as described before. The methods are applied to the studied DermIS Digital Database and Dermquest Database. In Table 5, we report the Dice similarity coefficient (DSC) as a metric for evaluating the performance of the edge detection methods for skin lesion detection.
As can be observed, all the methods give good detection results, but the proposed method achieved the best results with a Dice score of 0.826. In this category, Improved Sobel performed second best with a Dice score of 0.769. For melanoma, the proposed method achieved a Dice score of 0.634, which was also the best performer across all the metrics.
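The Dice score used above measures the overlap between a predicted mask and the ground truth. A minimal sketch, assuming binary masks stored as nested lists of 0/1 and the standard definition 2|A ∩ B| / (|A| + |B|); the function name `dice` and the sample masks are hypothetical.

```python
def dice(mask_a, mask_b):
    """Dice similarity coefficient of two equally sized binary masks."""
    intersection = size_a = size_b = 0
    for row_a, row_b in zip(mask_a, mask_b):
        for a, b in zip(row_a, row_b):
            size_a += a
            size_b += b
            intersection += a and b
    if size_a + size_b == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2 * intersection / (size_a + size_b)


ground_truth = [[1, 1],
                [0, 0]]
predicted = [[1, 0],
             [0, 0]]
score = dice(ground_truth, predicted)
```

A score of 1 means the masks coincide exactly, and 0 means they do not overlap at all.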

Conclusions
This paper presents a new outlook on the edge detection of skin cancer images. The main idea of this study is to account for the uncertainties that arise in medical image photography. These uncertainties can take different forms, such as noise and brightness variations. The main advantage of the presented method is that it uses interval analysis to capture these uncertainties within intervals and thereby give a robust edge detection result. To design the interval differentiation, the Hukuhara difference followed by the Laplacian of Gaussian method is used. The method is applied to two different skin cancer atlases, the DermIS Digital Database and the Dermquest Database, to show the system performance. The results are also compared with three popular methods under two different types of noise, Gaussian noise and salt-and-pepper noise, to show the superiority of the system.

Disclosure statement
No potential conflict of interest was reported by the author(s).