A study of a clothing image segmentation method in complex conditions using a features fusion model

Drawing on a priori knowledge of clothing images in complex conditions, this paper proposes an unsupervised image segmentation algorithm for clothing images that combines colour and texture features. First, block truncation coding is used to expand the traditional three-dimensional colour space into a six-dimensional colour space so that finer colour features can be obtained. Then, a texture feature based on an improved local binary pattern (LBP) algorithm is designed and used together with the colour features to describe the clothing image. After that, according to the statistical regularity with which the object region and the background appear in clothing images, a bisection method is proposed for the segmentation operation. Because the image is divided into several subimage blocks, the bisection segmentation can be carried out efficiently. The experimental results show that the proposed algorithm can quickly and effectively extract clothing regions from complex scenes without any manually set parameters. The proposed clothing image segmentation method is expected to be useful in computer vision, machine learning applications, pattern recognition and intelligent systems.


Introduction
Image segmentation separates regions of interest (ROIs) in an image from a complex background. A number of papers have specifically addressed the segmentation of garment images in complex scenes. Clothing images have their own inherent visual characteristics: the colour and texture of clothing contrast with the background. Many researchers have built on these characteristics. Previous work [1] includes studies based on the colour feature of an image, wherein a clothing image segmentation method using the mixed Gaussian model was proposed. However, this method [1] only considered the colour features of clothing images and did not consider their texture features. The mean shift algorithm was also applied in the literature [2]. In that work, the original images were first processed by the segmentation procedure; regional characteristics were then collected, and human-computer interaction in the form of manual rough marking was used to complete the segmentation, which achieved better results than other solutions. However, the algorithm used high-dimensional features (4,096 dimensions) and had high computational complexity; additionally, human-computer interaction was required to obtain the final segmentation results. In another study [3], the JSEG segmentation algorithm was applied to salient image regions, and human face detection was used to discriminate the clothing area from the background area. Another work [4] on clothing image segmentation used a two-stage clustering architecture and achieved good results.
The main idea of this method was to first apply the mean shift algorithm, which used the numbers N and N to obtain a regional feature vector; then, k-means clustering was applied twice, and the segmentation results of this method were more stable than those of others. Another previously designed algorithm [5] combined the histogram threshold method with an improved fuzzy c-means clustering algorithm to segment the image and achieved good results. One image segmentation algorithm designed in a previous study [6] segmented the image using a priori knowledge of the object shape. The a priori information was incorporated into an energy function. The main idea of this adaptive shape method was that the less object edge information is available, the more shape information is required. The method performed better than others on grayscale images. One paper [7] presented an improved texture image segmentation method based on region merging, in which LAB colour features were fused with texture features derived from the Gabor energy of the original image. Segmentation was then completed through a human-computer interactive process, so the algorithm achieved manually assisted image segmentation. Another work [8] then refined and optimized this algorithm, and further designs were presented in the literature [9,10]. The work in [8] replaced the 4,096-dimensional feature vectors with LBP texture features fused with colour features to describe the image, which improved the efficiency of the segmentation algorithm. In another paper [11], the oversegmentation and undersegmentation of texture images by the traditional active contour segmentation algorithm were defined, isotropic and anisotropic diffusion were analyzed, and an edge-preserving fuzzy texture function that considers the image gradient was designed to establish a fuzzy model of image texture change.
Additional experimental results have been obtained in previous works. One such work in the literature [12] used an improved graph cut algorithm based on colour features and texture features. Its locally adaptive regularization parameter addresses a weakness of the graph cut algorithm, which can easily produce false segmentation and the shrinking bias phenomenon when the foreground and background colours are similar. This kind of algorithm has achieved good results and also reduces the complexity of the algorithm, but it requires manual user participation to realize the image segmentation effect.
In this paper, an unsupervised fuzzy segmentation algorithm for clothing images is proposed; it is based on a priori knowledge and combines colour features and texture features. The proposed algorithm can realize the adaptive segmentation of clothing images in complex scenes without any manually set parameters.

The proposed algorithm
The colour and texture characteristics of clothing images contrast with the background. Therefore, to accurately describe the effective content of the original images, colour features and texture features should be integrated. This paper first extracts six-dimensional colour features using block truncation coding (BTC) and then uses an improved LBP algorithm to obtain texture features suitable for fusion with the six-dimensional colour features. Image segmentation amounts to classifying the pixels of the original image into the object area and the background area. However, if segmentation is performed directly at the pixel level, the algorithm becomes complicated because the number of pixels in the original image is very large. Additionally, a single pixel cannot reflect the colour features and texture features of the garment. Therefore, the image is divided into blocks for processing. For each subimage block, the colour features and texture features are extracted, and the subimage blocks are clustered into two categories, the object area and the background area, to achieve image segmentation. Before the clustering operation, the two clustering centres can be determined according to a priori knowledge to avoid converging to a local optimum.
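The block division described above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function name `split_into_blocks` and the representation of an image as a list of pixel rows are assumptions made for the example, and the 10 × 10 grid is the size used later in the paper.

```python
def split_into_blocks(image, rows=10, cols=10):
    """Split a 2-D list of pixels into rows x cols rectangular blocks.

    Returns the blocks in row-major order; each block is itself a 2-D list.
    Block boundaries are spread as evenly as integer division allows, so the
    image size does not have to be a multiple of the grid size.
    """
    height, width = len(image), len(image[0])
    row_edges = [height * r // rows for r in range(rows + 1)]
    col_edges = [width * c // cols for c in range(cols + 1)]
    blocks = []
    for r in range(rows):
        for c in range(cols):
            block = [row[col_edges[c]:col_edges[c + 1]]
                     for row in image[row_edges[r]:row_edges[r + 1]]]
            blocks.append(block)
    return blocks
```

Each returned block would then be passed to the colour and texture feature extractors, so that clustering operates on at most `rows * cols` feature vectors rather than on individual pixels.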

The colour features based on block truncation
The RGB colour model is the default colour model for images. However, it is not consistent with human visual perception [13] of colour differences, so it is necessary to transform the RGB colour model into a perceptually uniform colour model that is suitable for clustering. This paper uses the LAB colour model.
Block truncation coding (BTC) is a classic compression method that was originally applied to the lossy compression of images. The algorithm divides the image into blocks of suitable size and, while preserving the mean and standard deviation of the pixels in each block, reduces the number of grayscale values in the image to achieve compression, as in [9] and [10]. This paper obtains the colour features of an image block using this method. The traditional LAB colour model yields only three-dimensional colour features. In this paper, the BTC algorithm is used to obtain six-dimensional colour features, which describe colour more accurately. Assume that there is an image block of size $M \times N$. The pixel $X_{ij}$ in row $i$ and column $j$ can be regarded as a feature vector containing the three components L, a and b, which can be expressed as $X_{ij}(L)$, $X_{ij}(a)$ and $X_{ij}(b)$, respectively. The steps used to calculate the colour feature of the image block are as follows:
(1) The means and standard deviations of the three components L, a and b in the image block are calculated and recorded as $\bar{L}$, $\bar{a}$ and $\bar{b}$, and $\sigma_L$, $\sigma_a$ and $\sigma_b$, in which:
$$\bar{L} = \frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N} X_{ij}(L), \qquad \sigma_L = \sqrt{\frac{1}{MN}\sum_{i=1}^{M}\sum_{j=1}^{N}\left(X_{ij}(L) - \bar{L}\right)^2}$$
In the same way, $\bar{a}$, $\bar{b}$, $\sigma_a$ and $\sigma_b$ are calculated.
(2) The three components L, a and b of the image block are truncated; the L component is taken as an example. The mean $\bar{L}$ is regarded as the threshold and is compared with the L value of every pixel in the image block. Pixels whose L value is less than the threshold are set to $L_{low}$, and the remaining pixels are set to $L_{high}$, which take the standard BTC form:
$$L_{low} = \bar{L} - \sigma_L\sqrt{\frac{p}{MN - p}}, \qquad L_{high} = \bar{L} + \sigma_L\sqrt{\frac{MN - p}{p}}$$
where $p$ is the number of pixels in the image block with an L component greater than or equal to the threshold $\bar{L}$.
(3) The same truncation is performed for the a and b components of all pixels in the image block, which yields the six-dimensional colour feature of the image block: $L_{low}$, $L_{high}$, $a_{low}$, $a_{high}$, $b_{low}$ and $b_{high}$.
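The three steps above can be sketched in Python. This is only an illustration under stated assumptions: the low and high levels use the standard two-level BTC quantisation (which preserves the block mean and standard deviation), the standard deviation is the population form, and the names `btc_levels` and `btc_colour_feature` as well as the handling of flat blocks are hypothetical choices made for the example.

```python
import math

def btc_levels(values):
    """Return the (low, high) BTC levels of one colour component of a block."""
    n = len(values)
    mean = sum(values) / n
    # Population standard deviation of the component within the block.
    sigma = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    # p = number of pixels at or above the threshold (the block mean).
    p = sum(1 for v in values if v >= mean)
    if p == 0 or p == n:
        # Flat block (assumed handling): both levels collapse to the mean.
        return mean, mean
    low = mean - sigma * math.sqrt(p / (n - p))
    high = mean + sigma * math.sqrt((n - p) / p)
    return low, high

def btc_colour_feature(block):
    """Six-dimensional colour feature of a block of (L, a, b) pixels."""
    pixels = [px for row in block for px in row]
    feature = []
    for channel in range(3):
        low, high = btc_levels([px[channel] for px in pixels])
        feature.extend([low, high])
    return feature  # [L_low, L_high, a_low, a_high, b_low, b_high]
```

For a component that takes only two distinct values in a block, the two levels reproduce those values; for example, the values 2, 2, 2, 10 give a low level of 2 and a high level of 10.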

Improved LBP texture features
Local binary patterns (LBPs) are operators used to describe the local texture features of an image [14]. In a 3 × 3 pixel neighbourhood, the pattern starts from the top-left pixel and moves in a clockwise direction. If the value of a neighbouring pixel is greater than or equal to that of the centre pixel, it is recorded as 1; otherwise, it is recorded as 0, so that an 8-bit binary number is obtained, as shown in Figure 1(a) [15][16][17]. In the LBP neighbourhood shown there, the centre pixel is encoded as the binary number "11001001", that is, the decimal value "201", and this value reflects the relationship between the pixel and its neighbourhood. In this way, the LBP texture feature of every pixel in the block corresponds to a decimal value, and the histogram of these values serves as the texture feature of the image block. However, the texture features and colour features cannot be fused directly, because the resulting histogram is 256-dimensional, whereas the colour features are only six-dimensional. With such an imbalance in dimensionality, the texture features would dominate the description of an image block, and the effect of the colour features would be very limited. To solve this problem, the original LBP texture feature extraction algorithm is improved to reduce the dimensionality of the texture histogram while minimally affecting the image texture profile. This is shown in Figure 1(b).
For the improved LBP features, the four-neighbourhood of the centre pixel is used instead of the eight-neighbourhood; in the example, the centre pixel is encoded as the binary number 1010, which is converted to the decimal value ten. Because the binary pattern is built from fewer neighbourhood pixels, the corresponding decimal number lies only in the range 0-15, which reduces the texture features from 256 dimensions to 16 dimensions.
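Both the original and the reduced LBP code can be sketched with one helper. This is a hedged illustration rather than the paper's code: the choice of top, right, bottom and left as the four neighbours, the convention that the first neighbour visited is the most significant bit, the use of raw counts in the histogram, and the names `lbp_code` and `lbp_histogram` are all assumptions.

```python
# Offsets (row, col) of the eight neighbours, clockwise from the top-left.
EIGHT = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
# Assumed four-neighbourhood: top, right, bottom, left.
FOUR = [(-1, 0), (0, 1), (1, 0), (0, -1)]

def lbp_code(gray, r, c, offsets):
    """Pattern code of interior pixel (r, c); the first neighbour is the MSB."""
    centre = gray[r][c]
    code = 0
    for dr, dc in offsets:
        bit = 1 if gray[r + dr][c + dc] >= centre else 0
        code = (code << 1) | bit
    return code

def lbp_histogram(gray, offsets=FOUR):
    """Histogram (raw counts) of pattern codes over the interior pixels."""
    bins = [0] * (2 ** len(offsets))
    for r in range(1, len(gray) - 1):
        for c in range(1, len(gray[0]) - 1):
            bins[lbp_code(gray, r, c, offsets)] += 1
    return bins

patch = [[6, 7, 2],
         [9, 5, 1],
         [3, 4, 8]]
print(lbp_code(patch, 1, 1, EIGHT))  # 0b11001001
print(lbp_code(patch, 1, 1, FOUR))   # 0b1001
```

With `FOUR` the histogram has 16 bins and with `EIGHT` it has 256, which is the dimensionality reduction the text describes.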

Image segmentation operation
After the original image is divided into blocks and the colour features and texture features are extracted from the subblocks [18][19][20], image segmentation becomes a classification of blocks: the subimage blocks are divided into two categories, the object area and the background area [21,22]. As the number of categories is known, the k-means algorithm can theoretically be used for clustering. However, the k-means algorithm is sensitive to the initial clustering centres. If the initial clustering centres are not properly selected, the algorithm may converge to a locally optimal solution. An experimental analysis of a large number of clothing pictures reveals a statistical rule, which is shown in Figure 2. Each image is divided into a 10 × 10 grid of blocks; because the clothing is the object of the image, most of the clothing area appears in the middle of the image, and the several image blocks marked in the figure basically belong to the background region [23][24][25][26]. On this basis, a bisection method based on a priori knowledge is proposed in this paper. As shown in Figure 2, if each image is divided into a 10 × 10 mesh, eight blocks are marked as background according to the a priori knowledge, because the probability of these blocks lying in the background area is very large. Assume that an image $S$ contains $N$ subimage blocks, namely, $S = \{s_1, s_2, \ldots, s_N\}$, and each subimage block $s_i$ can be regarded as a 22-dimensional feature vector (six colour dimensions and sixteen texture dimensions) after the above feature extraction. These subimage blocks are divided into two categories: the object area $G_1$ and the background area $G_2$. In the initial case, the eight subimage blocks marked in Figure 2 belong to $G_2$; that is, $G_2 = \{s_1, s_2, \ldots, s_7, s_8\}$. All the remaining subimage blocks belong to $G_1$; that is, $G_1 = \{s_9, s_{10}, \ldots, s_N\}$.
Next, the subimage blocks in $G_1$ that actually belong to the background are selected and moved into $G_2$, one by one. Finally, the largest connected region composed of the subimage blocks remaining in $G_1$ is the object area. The specific steps of the algorithm are as follows:
(1) The selection of the image blocks in $G_1$ to be moved into $G_2$ should follow the principle of optimal classification. The objective function $E$ is defined as the average between-class distance between the two classes $G_1$ and $G_2$, where $n_1$ and $n_2$ represent the numbers of image blocks in $G_1$ and $G_2$, respectively. The greater the value, the better the classification.
(2) All blocks in $G_1$ are tried exhaustively; namely, the objective function is calculated with each of $s_9, s_{10}, \ldots, s_N$ placed in turn in the $G_2$ class. The block $s_i$ whose placement in $G_2$ yields the largest objective function value is removed from $G_1$ and added to $G_2$. This value is recorded as $E(1)$, the first objective function value. At this time, $G_1 = \{s_j \mid j \neq i, 9 \le j \le N\}$ and $G_2 = \{s_1, s_2, \ldots, s_8, s_i\}$.
(3) Step 2 is repeated to obtain the maximum objective function $E(2)$ and the corresponding image block $s_p$. The objective functions $E(1)$ and $E(2)$ are compared. If $E(2) \ge E(1)$, $s_p$ is incorporated into $G_2$, and the above steps are repeated until the current objective function $E(k + 1)$ is smaller than the previous objective function $E(k)$.
(4) At this point, $G_1$ contains $N - 8 - k$ subimage blocks, and $G_2$ contains $8 + k$ subimage blocks. The largest connected region composed of the subimage blocks contained in $G_1$ is used as the final object area, and the image segmentation process ends.
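The greedy procedure in steps (1) to (3) can be sketched as below. The exact form of the objective function is not given in this text, so the sketch assumes it is the mean pairwise Euclidean distance between blocks of $G_1$ and blocks of $G_2$; that choice, the function names and the `background_seed` parameter (the indices of the blocks marked as background a priori) are all hypothetical.

```python
import math

def average_between_class_distance(features, g1, g2):
    """Assumed objective E: mean Euclidean distance over all G1-G2 pairs."""
    total = sum(math.dist(features[i], features[j]) for i in g1 for j in g2)
    return total / (len(g1) * len(g2))

def bisect_blocks(features, background_seed):
    """Greedily move blocks from G1 (object) to G2 (background).

    features: one feature vector per block.
    background_seed: indices of blocks assumed a priori to be background.
    Returns the sorted index lists (G1, G2).
    """
    g2 = set(background_seed)
    g1 = set(range(len(features))) - g2
    previous = None
    while len(g1) > 1:
        # Exhaustively try moving each remaining block and keep the best move.
        best_value, best_block = max(
            (average_between_class_distance(features, g1 - {i}, g2 | {i}), i)
            for i in g1
        )
        # The first move, E(1), is always made; later moves must not lower E.
        if previous is not None and best_value < previous:
            break
        g1.remove(best_block)
        g2.add(best_block)
        previous = best_value
    return sorted(g1), sorted(g2)
```

The sketch stops at the partition itself; extracting the connected region of $G_1$ in step (4) is left out. Because each round evaluates every remaining block against every block of the other class, the cost grows quickly with the number of blocks, which is why the small 10 × 10 grid keeps the exhaustive search affordable.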
This algorithm is based on statistical laws and uses a priori knowledge [27][28][29] to avoid the problem the k-means algorithm has when selecting the initial clustering centres. Additionally, the designed objective function optimizes the classification result at each step. The use of a priori knowledge helps to ensure that the algorithm does not fall into a local optimum [30,31]. In addition, although an exhaustive method is used for classification, the efficiency of the algorithm is very high because the number of image blocks is limited.

Experimental results analysis
To verify the validity of the proposed algorithm, 2,000 clothing images [32] with different background settings were tested [33][34][35]. First, the improved LBP texture features were tested, as shown in Figure 3. Figure 3(a) is the original image, and Figure 3(b) is the texture image extracted using the traditional LBP algorithm [36][37][38]. Figure 3(c) is the texture image extracted using the improved algorithm. As can be seen from the figure, although the texture feature is reduced from 256 dimensions to sixteen dimensions, it still retains the texture of the original image very well.
Then, to verify the feature fusion and the validity of the proposed bisection method, five groups of experimental schemes are designed [32,39,40].
In experiment one, only the six-dimensional colour features are used together with the segmentation method outlined in this paper. In experiment two, only the sixteen-dimensional texture features are used together with the segmentation method. In experiment three, the traditional LAB colour features [41] are combined with the sixteen-dimensional texture features and the segmentation method. In experiment four, the six-dimensional colour features proposed in this paper are combined with the sixteen-dimensional texture features and the segmentation method. In experiment five, the six-dimensional colour features proposed in this paper are combined with the sixteen-dimensional texture features and the k-means algorithm [42]. The results of the experiments are shown in Figure 4.
From the experimental results of Figure 4, we find the following:
(1) Experiments one and two use only a single colour or texture feature, so when the background is complex or the clothing has complex colour changes or texture patterns, the segmentation effect is greatly affected, as shown in Figure 4(b) and (c).
(2) Figure 4(d), showing experiment three, is segmented using the traditional three-dimensional LAB colour features with the sixteen-dimensional LBP texture features. Figure 4(e), showing experiment four, gives the results when using the six-dimensional LAB colour features with the sixteen-dimensional texture features.
(3) Figure 4(f), showing experiment five, combines the sixteen-dimensional texture features and six-dimensional colour features proposed in this paper and implements image segmentation using the k-means algorithm. Its results differ from those of experiment four because the k-means algorithm determines its clustering centres randomly, which makes the segmentation result unstable and causes background and object areas to be misidentified. To address the unstable segmentation results of the k-means algorithm, one solution is to run the algorithm many times and then determine the initial clustering centres statistically, but this undoubtedly increases the time and complexity of the image segmentation algorithm.

Conclusions
This paper proposed an unsupervised segmentation algorithm for clothing images based on colour features and texture features, which can be used to extract the main parts of clothing in complex scenes. Although the colour features and texture features are fused, the feature dimensionality is not high, and the image is divided into blocks, so the classification can be carried out quickly. Additionally, the validity of the classification is guaranteed to some extent by the use of a priori knowledge based on statistical laws. The experiments show that the proposed algorithm yields satisfactory segmentation results in many kinds of complex scenes, which prepares the ground for subsequent image understanding and image retrieval.