Analysis and recognition of characteristics of digitized tongue pictures and tongue coating texture based on fractal theory in traditional Chinese medicine

Abstract Simple fractal dimensions have been proposed for use in the analysis of the characteristics of digitized tongue pictures and tongue coating texture, which could further the establishment of objectified classification criteria as the sample size expands. However, detailed descriptions of simple fractal dimensions have been limited. Therefore, BP (back propagation) neural network model classifiers were designed by further calculating the multiple fractal spectrum characteristics of digitized tongue pictures in order to classify and recognize the thin/thick or greasy characteristics of tongue coating. A total of 587 digitized tongue pictures were collected in a standard environment, and their fractal dimensions were calculated. A statistical analysis was conducted on the calculation results of the sample data, and the sensitivity of the fractal dimensions to the thin/thick and greasy characteristics of digitized tongue pictures was observed. Because the value ranges of a single parameter overlapped between classes, 8 characteristic parameters of the multiple fractal spectra of the digitized tongue pictures were further proposed as the elements of the input layer of a three-layer BP neural network. Automatic recognition classifiers were designed and trained for the characteristics of digitized tongue pictures and tongue coating textures. The simple fractal dimension was sensitive to the thin/thick and greasy characteristics of digitized tongue pictures and could better judge the thickness of the tongue coating. A classifier that used the characteristic parameters of multiple fractal spectra as the input vectors of the BP neural network model could effectively increase the accuracy of judging the characteristics of the tongue coating texture.


Introduction
Tongue diagnosis, as one of the special diagnostic methods in traditional Chinese medicine (TCM), is a part of inspection diagnosis [1]. It is defined as a comprehensive diagnosis method for syndrome differentiation based on tongue color, tongue coating, tongue shape and sublingual veins [2]. The presentation of the tongue is believed to be closely associated with in vivo viscera, qi and blood as well as body fluids; tongue appearance can be used not only as an auxiliary diagnostic indicator for multiple diseases but also as an indicator for the self-monitoring of daily health [3]. In modern times, computer science and technology have matured, and the computer-assisted digitized management and analysis of tongue pictures has become one of the hot topics in the objectification of TCM tongue diagnosis [4,5]. Its study contents mainly include color-orientated analysis of tongue color and tongue coating color [6], geometrical morphology-focused analysis of tongue shapes [7], and texture structure-dominated analysis of tongue coating texture [8]. Currently, the digitized analysis of tongue shapes and coating texture structures is limited because consensus on a standard classification cannot easily be reached, owing to a lack of complete mathematical data support in related studies [9]. Within the objectification of TCM tongue diagnosis, few studies specifically analyze the tongue coating texture in tongue pictures [10].
Representative studies include the analysis of tongue coating texture in tongue pictures based on Gabor wavelet transform proposed by Dapeng Zhang [11,12] from the Harbin Institute of Technology, the analysis and recognition of characteristics of tongue coating texture in tongue pictures based on the gray-scale differential statistical method proposed by Jiatuo Xu [13,14] from the Shanghai University of Traditional Chinese Medicine, and the analysis of characteristics of greasy tongue coating texture in tongue pictures based on the subspace method proposed by Baoguo Wei [15] from the Beijing University of Technology.

Background
An analysis of the texture of tongue coating in a digitized tongue picture focuses on analyzing and identifying the thin/thick and greasy characteristics of the tongue coating (i.e., whether the tongue coating is thin or thick and whether the tongue coating is greasy) by processing digitized tongue pictures [16]. During the observation and analysis of digitized tongue pictures it was found that the tongue pictures with a thick tongue coating were rough. The thinner the tongue coating was, the smoother the tongue pictures would become, whereas the tongue pictures of curdy and greasy tongue coatings were marked by granules with uneven sizes. Therefore, the density of the tongue coating could reflect the level of curdiness and greasiness in tongue coatings [17]. Tang Rongsheng [11] extracted the characteristics of tongue coating textures using Gabor wavelets after a pretreatment that removed the samples' reflective spots and tested the classification results using sample training classifiers; the results indicated that the combined use of different classifiers could enhance the classification results. Wei Baoguo [15] applied the subspace method to judge the density of tongue coating texture structures with the projection length as the classification characteristic and comprehensively judged the types of curdy and greasy tongue coating in digitized tongue pictures in combination with the roughness of the tongue coating texture. A summary of the studies of tongue coating texture in tongue pictures to date suggests that the roughness and density of tongue coating texture are closely associated with the thin/thick and greasy characteristic parameters of tongue coating.
However, the establishment of a standard classification still lacks strong support from objective data, as the description of the roughness of the tongue coating texture in tongue pictures mainly depends on the distinguishability of the tongue pictures collected [18]. Meanwhile, with the development of nonlinear science, the fractal dimension has emerged as a powerful new instrument in the analysis of medical pictures. In 1984, A.P. Pentland proved that the fractal characteristics of the surfaces of objects were consistent with the fractal characteristics of the normal vectors and their components on the surface; it followed that if a rough surface had fractal characteristics, so did the gray surfaces of the pictures produced [19]. Based on this theory, numerous scholars have further studied picture-related fractal theories or used this theory as a basic instrument to explore related issues in all fields of application [20][21][22][23]. Currently, many studies apply fractal theory to image texture analysis and pattern recognition. The self-similarity concept of fractal theory coincides with the TCM theory of "correspondence between man and universe," a basic syndrome differentiation mode in TCM [24]. Therefore, fractal dimensions, as a stable quantitative measure, can be used to describe the roughness of a tongue coating texture, to analyze the thin/thick and greasy characteristics of tongue coating, and to divide classification criteria.

Digitized tongue pictures
According to ISO 20498-2, the collection light environment was set as follows: illuminance, 6500 lux; color rendering index, 95; and color temperature, 6500 K. The light paths of the lighting source were arranged in a 45°/0° geometry (angle of illumination/angle of observation), and the sizes of the collected pictures were (512-768) × 512 [25].
The primary subjects for the collection of tongue pictures were undergraduates with healthy body indexes. The age distribution was 19-22 years old. As effective data, 499 digitized tongue picture samples were obtained under standard collection conditions, and each tongue picture was accompanied by an investigation table recording the manual analysis of the tongue picture texture by TCM experts. To analyze characteristic tongue pictures, another 88 tongue pictures with obvious texture characteristics were collected clinically, among which 44 had a thick tongue coating and 44 had a curdy and greasy tongue coating.

Simple fractal dimensions spectrums
Fractal dimension is a quantitative description of fractal characteristics, and there are numerous methods for calculating the simple fractal dimensions of irregular fractals, such as the grayscale-based difference method, the calculation method based on self-similar fractal Brownian motion models, the basic differential box-counting method, the wavelet decomposition-based fractional-dimensional approach, and the covering blanket method [26]. Compared with other algorithms, the differential box-counting method and its derived algorithms are sensitive to tongue coating textures with a low degree of roughness, and they respond more sharply to changes when the roughness is low, meeting the requirements for the judgment of tongue coating texture [27]. Meanwhile, Wang Diji [28] measured the time consumed by different algorithms in calculating the fractal dimensions of the same group of pictures and concluded that the basic differential box-counting method was advantageous in terms of lower average calculation time and lower algorithm complexity.

Determination of simple fractal dimensions of digitized tongue pictures
On the basis of pretreatment such as the subarea division of digitized tongue pictures collected in a standard environment [29], the digitized pictures of tongue coating texture were converted to grayscale pictures to extract the gray value corresponding to each pixel. The gray values were converted to effective plot areas to form gray surfaces through interval mapping. Tetragonal prisms were set to cover the gray surfaces, the heights of which were determined by the difference between the maximum and minimum gray values on the gray surface within the coverage area of each tetragonal prism. The base of a tetragonal prism was taken as d × d, and the prism was moved step by step over all pixel points. X(d) was defined as the total volume obtained by summing the volumes of all tetragonal prisms. The size of d was changed gradually, and a series of X(d) could be obtained for d = 1, 1/2, 1/4 and so on. Figure 1 shows the 3-dimensional (3D) surfaces of the tetragonal prism-covered gray surfaces corresponding to different numerical values of d. With $N(d) = X(d)/d^{3}$, curves of $\ln N(d)$ against $\ln(1/d)$ were constructed, and the slope of the linear part was taken as the fractal dimension D.
According to the definition of fractal dimensions, the smaller the subarea size, the more accurate the calculated fractal dimension of the picture would be. The lower limit of d could be set to the size of one pixel point, because at the scale of a single pixel the picture reduces to a 2-dimensional (2D) set without fractal properties; its upper limit could be set to the side length of the pre-treated picture.
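The prism-covering procedure above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `prism_cover_dimension`, the default box sizes, the use of non-overlapping cells (the text moves the prism over all pixel points), and the convention that pixels are grid vertices shared by neighbouring cells are all choices made here for clarity. The straight line is fitted over all listed sizes rather than over a hand-picked linear part.

```python
import numpy as np

def prism_cover_dimension(gray, sizes=(1, 2, 4, 8, 16)):
    """Estimate D as the slope of ln N(d) versus ln(1/d), with N(d) = X(d)/d^3.

    gray: square 2D array of shape (M+1, M+1). Pixels are treated as grid
    vertices, so a cell of s pixels per side spans s+1 samples (assumption).
    """
    gray = np.asarray(gray, dtype=float)
    m = gray.shape[0] - 1
    if gray.shape[0] != gray.shape[1] or m < max(sizes):
        raise ValueError("expected a square image larger than the largest box")
    span = gray.max() - gray.min()
    if span == 0:
        raise ValueError("a flat image has no texture to measure")
    z = (gray - gray.min()) / span        # interval mapping of gray values to [0, 1]

    log_inv_d, log_n = [], []
    for s in sizes:
        d = s / m                         # box side relative to a picture of side 1
        volume = 0.0
        for i in range(0, m - s + 1, s):
            for j in range(0, m - s + 1, s):
                patch = z[i:i + s + 1, j:j + s + 1]
                # prism volume = base area d*d times height (max - min gray)
                volume += d * d * (patch.max() - patch.min())
        log_inv_d.append(np.log(1.0 / d))
        log_n.append(np.log(volume / d**3))
    slope, _ = np.polyfit(log_inv_d, log_n, 1)
    return slope

# Synthetic check images (not tongue data): a smooth ramp and white noise.
ramp = np.tile(np.arange(65, dtype=float), (65, 1))
noise = np.random.default_rng(0).random((65, 65))
d_ramp = prism_cover_dimension(ramp)
d_noise = prism_cover_dimension(noise)
```

A smooth ramp should come out close to 2 and a noisy surface noticeably higher, which matches the intuition in the text that rougher coatings yield larger fractal dimensions.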

Multiple fractal spectrums
The basic differential box-counting method does not consider the pixel counts within each box, and it easily loses detailed information because it only gives an average description of the study subject as a whole. Multiple fractal dimensions, also termed fractal measures, can comprehensively reflect the probability distribution set of the pixel counts in the boxes after normalization. The width of multiple fractal spectra can quantify the degree of roughness of the characteristics on the surface, the maximum and minimum probability subsets of multiple fractal spectra, and the ratios of the number of maximum and minimum surface heights. The application of multiple fractal spectra can compensate for the disadvantage of simple fractal dimensions, which lack a description of detailed information [30].

Determination of characteristic parameters of multiple fractal spectrums of digitized tongue pictures
When there were overlapped areas in the value domains of characteristic parameters of different types of sample data, a classification could not be made using value domains by selecting only one characteristic parameter. However, multiple fractal spectra could obtain a group of probability distribution sets using the pixel count in the normalized box and effectively compensate for the disadvantages (e.g., lack of detailed information) of simple fractal dimensions.

Probability measure
Pretreated digitized tongue pictures were covered by tetragonal grids of dimension $\varepsilon$ ($\varepsilon \le 1$, with the size of the whole picture taken as 1). The pixel count with fractal properties in each $\varepsilon$-dimension grid was recorded as $N_{ij}$, and the total over all grids of the whole picture was recorded as $\sum N_{ij}$. The probability measure, also known as the occupying rate of the fractal dimension picture in each $\varepsilon$-dimension grid, could be obtained using the following formula:

$$P_{ij}(\varepsilon) = \frac{N_{ij}}{\sum N_{ij}}$$

Thereby, the distribution of the probability measure $P_{ij}(\varepsilon)$ of the physical quantities in the discussed fractal structures could be obtained.

Partition function
The partition function $X_q(\varepsilon)$ of the multiple fractal system was defined as $X_q(\varepsilon) = \sum P_{ij}(\varepsilon)^{q}$, i.e., the partition function was defined as the weighted summation of the q-th power of the probability measure $P_{ij}$, in which q could range from $-\infty$ to $+\infty$. q was termed the weight factor, and different values of q set the relative contribution of each probability measure $P_{ij}$ to the partition function $X_q(\varepsilon)$. In practical application, $|q| \le 100$ was advisable. In this study, $|q| \le 60$ was applied.
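Based on the characteristics of fractal structures, $X_q(\varepsilon)$ also followed a scaling relationship in scale-free self-similar areas, i.e., there was a power-function relationship between $X_q(\varepsilon)$ and the scale parameter $\varepsilon$. Hence, the quality index $\tau(q)$ could be obtained from the slope of the $\ln X_q(\varepsilon)$-$\ln \varepsilon$ curve, as shown in the following formula.

$$X_q(\varepsilon) \propto \varepsilon^{\tau(q)}, \qquad \tau(q) = \lim_{\varepsilon \to 0} \frac{\ln X_q(\varepsilon)}{\ln \varepsilon}$$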
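The chain from probability measure to partition function to quality index can be sketched as follows. This is an illustrative sketch only: the names `box_probabilities` and `mass_exponent`, the choice of box sizes, and the use of a generic non-negative square array `mass` (for example a binary mask of pixels regarded as having fractal properties) are assumptions made here and are not taken from the study.

```python
import numpy as np

def box_probabilities(mass, s):
    """Probability measure P_ij for boxes of s x s pixels (empty boxes dropped)."""
    m = mass.shape[0]
    n = m // s
    # Sum the pixel mass inside each s x s box.
    boxes = mass[:n * s, :n * s].reshape(n, s, n, s).sum(axis=(1, 3))
    p = boxes / boxes.sum()
    return p[p > 0]

def mass_exponent(mass, qs, sizes=(1, 2, 4, 8, 16)):
    """Quality index tau(q): slope of ln X_q(eps) versus ln(eps) for each q."""
    mass = np.asarray(mass, dtype=float)
    m = mass.shape[0]                      # square picture assumed, side length 1
    log_eps = np.log(np.array(sizes, dtype=float) / m)
    tau = []
    for q in qs:
        # Partition function X_q(eps) = sum of P_ij(eps)^q over the boxes.
        log_x = [np.log(np.sum(box_probabilities(mass, s) ** q)) for s in sizes]
        slope, _ = np.polyfit(log_eps, log_x, 1)
        tau.append(slope)
    return np.array(tau)

# Sanity check on a uniform picture, for which tau(q) = 2 * (q - 1).
tau_uniform = mass_exponent(np.ones((64, 64)), qs=[-2, 0, 1, 3])
```

For large |q| (the study uses values up to 60) the powers of very small probabilities can underflow or overflow in floating point, so a practical implementation may need to work with logarithms throughout.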

Characteristic parameters of multiple fractal spectra
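Generalized dimensions $D(q)$ and multiple fractal spectra $f(\alpha)$ could be calculated using the quality index $\tau(q)$, as shown by the following formulas:

$$D(q) = \frac{\tau(q)}{q - 1}, \qquad \alpha(q) = \frac{d\tau(q)}{dq}, \qquad f(\alpha) = q\,\alpha(q) - \tau(q)$$

The fractal spectra $f(\alpha)$-$\alpha$ of randomized fractal pictures were obtained, in which 8 parameters were used as characteristic quantities.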
1. $\alpha_{\min}$: The minimum value of singular exponent $\alpha$;
2. $\alpha_{\max}$: The maximum value of singular exponent $\alpha$;
3. $f_{\min}$: Value of multiple fractal spectrum $f(\alpha)$ corresponding to $\alpha_{\min}$;
4. $f_{\max}$: Value of multiple fractal spectrum $f(\alpha)$ corresponding to $\alpha_{\max}$;
5. Spectral width: Value of $\alpha_{\max} - \alpha_{\min}$;
6. Spectral difference: Value of $f_{\max} - f_{\min}$;
7. $\alpha_d$: Value of singular exponent $\alpha$ that resulted in a maximum value of $f(\alpha)$;
8. B: One of the parameters in the quadratic fit curve of $f(\alpha)$, also termed the degree of asymmetry.
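As a rough illustration of how the eight characteristic quantities might be derived from a sampled quality index, the sketch below applies a numerical Legendre transform and then reads off the parameters. The function name `spectrum_parameters` is hypothetical, and the quadratic fit form f(α) ≈ A(α − α_d)² + B(α − α_d) + C, with B read as the asymmetry term, is an assumption; the text does not spell out the fit. The check uses a binomial cascade with a known analytic quality index rather than tongue data.

```python
import numpy as np

def spectrum_parameters(q, tau):
    """Derive alpha(q), f(alpha), D(q) and the 8 spectrum parameters from tau(q)."""
    q = np.asarray(q, dtype=float)
    tau = np.asarray(tau, dtype=float)
    alpha = np.gradient(tau, q)            # alpha(q) = d tau / d q (finite differences)
    f = q * alpha - tau                    # Legendre transform f(alpha) = q*alpha - tau
    with np.errstate(divide="ignore", invalid="ignore"):
        # Generalized dimension; left undefined (NaN) at q = 1.
        d_q = np.where(np.isclose(q, 1.0), np.nan, tau / (q - 1.0))

    i_min, i_max, i_top = np.argmin(alpha), np.argmax(alpha), np.argmax(f)
    alpha_d = alpha[i_top]
    # Assumed fit form: f ~ A*(alpha - alpha_d)^2 + B*(alpha - alpha_d) + C.
    _, b, _ = np.polyfit(alpha - alpha_d, f, 2)
    params = {
        "alpha_min": alpha[i_min],
        "alpha_max": alpha[i_max],
        "f_min": f[i_min],                 # f at alpha_min, as defined in the list
        "f_max": f[i_max],                 # f at alpha_max, as defined in the list
        "spectral_width": alpha[i_max] - alpha[i_min],
        "spectral_difference": f[i_max] - f[i_min],
        "alpha_d": alpha_d,
        "B": b,
    }
    return params, d_q

# Synthetic check: binomial cascade with weight p, tau(q) = -log2(p^q + (1-p)^q).
q_grid = np.linspace(-20, 20, 401)
p = 0.3
tau_binomial = -np.log2(p ** q_grid + (1 - p) ** q_grid)
params, d_q = spectrum_parameters(q_grid, tau_binomial)
```

The resulting dictionary has exactly eight entries, so its values could be stacked directly into an 8-element input vector for a classifier.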

Design of classifiers
A three-layer BP neural network model structure was established [31], i.e., the input layer, hidden layer and output layer. The network coefficients were initialized first, followed by input of the training data. Then, programs were used to calculate the deviation between the actual output values and the expected output values, after which the network coefficients were modified according to the size of the deviation. A network deviation limit was set in advance, and iterative training was performed on the BP neural network until the network output converged to within the deviation limit, which completed the establishment of the network model. In this study, the characteristic parameters of multiple fractal spectra were used to judge the characteristics of tongue coating texture, e.g., whether the tongue coating was thin or thick and whether the coating was greasy in the tongue pictures. Therefore, the number of input layer units of the BP neural network was set as 8, and the number of output layer units was 1. The experiment indicated that if the number of units in the hidden layer was too low, the network could not have the necessary learning capacity; if the number was excessive, it would greatly increase the complexity of the network structure. The number of units in the hidden layer therefore has to be chosen carefully. Based on the method verified by Shen Huayu [32] to determine the number of units in the hidden layer, the number of units in the hidden layer was set to 9 and the initial learning rate was set to 0.1 in this study [33].

Analysis of simple fractal dimensions of digitized tongue pictures
A pretreatment including subarea division and size normalization was applied to the digitized tongue pictures for texture analysis, and the differential box-counting method was applied for the calculation of the fractal dimensions of the 2D pictures. The fractal dimensions of digitized tongue picture samples from healthy students were 2.14-2.36 and followed a normal distribution, whereas the fractal dimensions of digitized tongue pictures with obvious pathological characteristics were notably distinguished from those of normal populations, in which the fractal dimensions of 39 out of 44 tongue picture samples with thick tongue coating were >2.36, and those of 44 tongue picture samples with greasy tongue coating were 2.36-2.52. The detailed parameter distribution is shown in Figure 2.
Preliminary conclusions were obtained specific to the distribution characteristics of the sample data:
1. Whether the tongue coating was thick or thin was determined by the fractal dimensions of digitized tongue pictures based on the criterion of whether the bottom of the tongue coating was visible or invisible. A thin tongue coating was defined as a coating through which the underlying tongue nature was visible. Meanwhile, a thick tongue coating was defined as a coating through which the underlying tongue nature was invisible [15]. The simple fractal dimensions of digitized tongue pictures determined by the box dimension method could better judge whether the tongue coating was thick or thin, and minimal sample overlapping occurred when the fractal dimensions were 2.32-2.36.
2. Whether the tongue coating was greasy was determined by the fractal dimensions of the digitized tongue pictures based on the changes in size and density of the granules on the tongue coating. A greasy coating was mainly marked by fine and dense granules that were fused into flakes on the tongue coating, and it was thick in the middle of the tongue and thin around the tongue [15]. The fractal dimensions of greasy tongue coating samples overlapped partially with those of the thick tongue coating samples, and most curdy and greasy tongue coating samples fell in the middle-to-high range of the thick tongue coating samples, where the fractal dimensions were >2.36. This overlap arose because the box dimension method gives only an average description of the tongue coating texture and therefore loses some detailed information when it is used to determine the simple fractal dimensions of the digitized tongue pictures [34].

Analysis of characteristic parameters of multiple fractal spectrums of digitized tongue pictures
Programs were written to draw the multiple fractal spectra and obtain the characteristic quantities. Figure 3 shows the testing results of the characteristic parameters of multiple fractal spectra of two samples of digitized tongue pictures with different characteristics. Figure 3(a) shows a normal tongue coating sample while Figure 3(b) shows a greasy tongue coating sample. As shown in Figure 3, the symmetry of the quadratic fit curves of the multiple fractal spectra of the normal tongue pictures is better than that of the greasy tongue pictures. To analyze the degrees of contribution of different characteristic parameters in multiple fractal spectra to the classification and recognition, a balanced sample size was set as the premise, i.e., equal numbers of the different types of sample data were drawn from the sample database to obtain the distribution value domains of all characteristic parameters of the multiple fractal spectra. In Figure 5, type 1 samples have a thin tongue coating while type 2 samples have a thick tongue coating. Take the spectral width $\Delta\alpha = \alpha_{\max} - \alpha_{\min}$ for example. As shown in the figure, the distribution value domain of 20 thin tongue coating samples was ~0.03-0.15 and was concentrated in 0.05-0.13. Meanwhile, the distribution value domain of 20 thick tongue coating samples was ~0.11-0.25 and was concentrated in 0.11-0.23, and the spectral width value was ~0.27 in individual greasy tongue coating samples.
In conclusion, although the categories could be judged preliminarily by analyzing the distribution of all parameter value domains, ideal efficacy could not be achieved when the sample size could not be expanded continually. If any one characteristic parameter was used for classification by the value domain judgement method, the value domains of that parameter overlapped between the different types of samples. Therefore, a proper recognition method should be selected.
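The overlap problem can be made concrete with the spectral width ranges quoted above (~0.03-0.15 for thin and ~0.11-0.25 for thick coatings). The helper name `overlap` is hypothetical; the snippet is only a small worked example showing that a single threshold on this parameter cannot separate the two classes.

```python
def overlap(a, b):
    """Return the shared part of two closed intervals, or None if disjoint."""
    lo, hi = max(a[0], b[0]), min(a[1], b[1])
    return (lo, hi) if lo <= hi else None

thin_width = (0.03, 0.15)    # approximate spectral width range, thin coating
thick_width = (0.11, 0.25)   # approximate spectral width range, thick coating

shared = overlap(thin_width, thick_width)
# Any sample whose spectral width falls inside `shared` is ambiguous
# under a single-parameter value-domain rule.
print(shared)
```

Samples with a spectral width between 0.11 and 0.15 fall in both value domains, which is the motivation for feeding all eight parameters to a classifier together.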

Design of classifiers and analysis of results
Usually, the performance of a machine learning algorithm will be affected if the training samples are unbalanced. Therefore, the training samples need to be balanced before training. In this study, after pretreatment including region segmentation and size normalization of the digitized tongue pictures collected in a standard environment, 30 normal tongue pictures, 30 thick tongue coating samples and 30 greasy tongue coating samples were randomly selected and stored in a tongue picture training sample database, while the other samples were stored in a test sample database. The sample size for a single test was set to 20. To ensure experimental accuracy, test samples were randomly drawn from the digitized tongue picture test sample database after the number of samples of each type for testing was determined. Then, the following three situations were examined to observe the classification efficacy under 3 different degrees of sample balance: 1. The number of type 1 test samples is larger than that of type 2 test samples. 2. The number of type 1 test samples is equal to the number of type 2 test samples. 3. The number of type 1 test samples is less than that of type 2 test samples.
Correct recognition was defined as agreement between the predicted type and the actual type; any other outcome was counted as incorrect recognition. The tansig function and logsig function were set as the activation functions for the hidden layer and the output layer, respectively.
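A compact NumPy sketch of a network with the stated shape (8 inputs, 9 hidden units with a tansig activation, 1 output with a logsig activation, learning rate 0.1) is given below. It is an illustration under assumptions, not the authors' code: the class name `BPNet`, the weight initialization, the batch gradient descent on squared error, the deviation limit of 0.01 and the synthetic two-class data are all invented here for demonstration.

```python
import numpy as np

def logsig(x):
    """Logistic sigmoid (the 'logsig' activation)."""
    return 1.0 / (1.0 + np.exp(-x))

class BPNet:
    """Three-layer network: 8 inputs, 9 tanh ('tansig') units, 1 logsig output."""

    def __init__(self, n_in=8, n_hidden=9, n_out=1, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.w1 = rng.normal(0.0, 0.5, (n_in, n_hidden))
        self.b1 = np.zeros(n_hidden)
        self.w2 = rng.normal(0.0, 0.5, (n_hidden, n_out))
        self.b2 = np.zeros(n_out)
        self.lr = lr

    def forward(self, x):
        self.h = np.tanh(x @ self.w1 + self.b1)
        self.y = logsig(self.h @ self.w2 + self.b2)
        return self.y

    def train(self, x, t, error_limit=1e-2, max_epochs=5000):
        """Iterate until the mean squared deviation drops below error_limit."""
        mse = float("inf")
        for _ in range(max_epochs):
            y = self.forward(x)
            err = y - t                          # actual output minus expected output
            mse = float(np.mean(err ** 2))
            if mse < error_limit:
                break
            # Back-propagate the gradient of 0.5 * mean squared error.
            delta2 = err * y * (1.0 - y)         # logsig derivative
            delta1 = (delta2 @ self.w2.T) * (1.0 - self.h ** 2)  # tanh derivative
            n = len(x)
            self.w2 -= self.lr * (self.h.T @ delta2) / n
            self.b2 -= self.lr * delta2.mean(axis=0)
            self.w1 -= self.lr * (x.T @ delta1) / n
            self.b1 -= self.lr * delta1.mean(axis=0)
        return mse

# Synthetic stand-in for 30 + 30 training vectors of 8 spectrum parameters.
rng = np.random.default_rng(1)
x_train = np.vstack([rng.normal(-1.0, 0.5, (30, 8)), rng.normal(1.0, 0.5, (30, 8))])
t_train = np.vstack([np.zeros((30, 1)), np.ones((30, 1))])

net = BPNet()
final_mse = net.train(x_train, t_train)
predicted = (net.forward(x_train) > 0.5).astype(int)
accuracy = float((predicted == t_train).mean())
```

With real data the eight inputs would typically be rescaled to comparable ranges before training, since the spectrum parameters differ in magnitude.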

Analysis of greasy characteristics of tongue coatings
A total of 30 normal samples (type 1 samples) and 30 greasy tongue coating samples (type 2 samples) were collected from the training sample database of digitized tongue pictures and used as training samples, with a total training sample size for the two kinds of tongue pictures of 60. Figure 6 shows the statistical classification results of the test samples under conditions of 3 degrees of sample balance: 1. The overall precision rate of recognition was 89.01%, in which 12 out of 13 type 1 samples were identified correctly with a precision rate of 92.31%, and 6 out of 7 type 2 samples were identified correctly with a precision rate of 85.71%; 2. The overall precision rate of recognition was 90.00%, in which 8 out of 10 type 1 samples were identified correctly with a precision rate of 80.00%, and 10 out of 10 type 2 samples were identified correctly with a precision rate of 100.00%; 3. The overall precision rate of recognition was 96.16%, in which 7 out of 7 type 1 samples were identified correctly with a precision rate of 100.00%, and 12 out of 13 type 2 samples were identified correctly with a precision rate of 92.31%.
Figure 5. Statistics of comparison results of characteristic parameter spectrums between thin and thick tongue coating.
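The arithmetic behind these rates can be reproduced with a short script. The three overall figures reported for the greasy/normal tests appear consistent with the unweighted mean of the two per-type rates rather than with the pooled fraction of correct samples (for example 18/20 = 90% in the first case); this reading is an inference from the numbers, not something the text states. The function name `recognition_rates` is hypothetical.

```python
def recognition_rates(correct_1, total_1, correct_2, total_2):
    """Per-type rates (%) and their unweighted mean for one test situation."""
    rate_1 = 100.0 * correct_1 / total_1
    rate_2 = 100.0 * correct_2 / total_2
    return rate_1, rate_2, (rate_1 + rate_2) / 2.0

# (correct type 1, total type 1, correct type 2, total type 2) per situation
situations = [(12, 13, 6, 7), (8, 10, 10, 10), (7, 7, 12, 13)]
results = [recognition_rates(*s) for s in situations]

for rate_1, rate_2, mean_rate in results:
    print(f"type 1: {rate_1:.2f}%  type 2: {rate_2:.2f}%  mean: {mean_rate:.2f}%")
```

The computed means agree with the reported 89.01%, 90.00% and 96.16% to within rounding of the per-type rates.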

Analysis of thick/thin characteristics of tongue coatings
A total of 15 normal samples and 15 thin-greasy tongue coating samples were collected from the training sample database of digitized tongue pictures and used as training samples of thin tongue coating (type 1 samples), and 15 thick tongue coating samples and another 15 thick-greasy tongue coating samples were collected and used as training samples of thick tongue coating (type 2 samples); the total training sample size of the two kinds of tongue pictures was 60. Figure 7 shows the statistical classification results of the test samples under conditions of 3 degrees of sample balance: 1. The overall precision rate of recognition was 90.49%, in which 11 out of 12 type 1 samples were identified correctly with a precision rate of 91.67%, and 7 out of 8 type 2 samples were identified correctly with a precision rate of 87.5%; 2. The overall precision rate of recognition was 90.00%, in which 9 out of 10 type 1 samples were identified correctly with a precision rate of 90.00%, and 9 out of 10 type 2 samples were identified correctly with a precision rate of 90.00%; 3. The overall precision rate of recognition was 91.67%, in which 8 out of 8 type 1 samples were identified correctly with a precision rate of 100.00%, and 10 out of 12 type 2 samples were identified correctly with a precision rate of 83.33%.

Discussion
In this study, the fractal dimension was used to describe the roughness of tongue coating texture, to analyze the thin/thick and greasy characteristics of tongue coating, and to divide classification criteria. Mathematical support for the objective development of tongue diagnosis in TCM has thus been put forward. Although the Gabor wavelet transform provides a mathematically complete description, it is better suited to the extraction and recognition of the overall characteristics of the tongue body because of its relatively significant edge effect [11,12]. Rong Liang [35] noted, however, that the clinical analysis of tongue pictures should focus on capturing abnormal information; thus, the characteristics of tongue coating texture cannot be judged intelligently from the overall characteristics of the tongue body surface. To address this issue, a subspace method was proposed by Baoguo Wei [15] in which the tongue body area is divided into subareas of fixed size, and each subarea is then classified and recognized. This study also aimed to capture the abnormal information. Therefore, the tongue body was divided into 5 subareas, namely, the apex linguae, middle tongue, root of the tongue, left side of the tongue and right side of the tongue, to achieve the analysis and recognition of the characteristics of tongue coating texture in each subarea. The basic concept of the gray-scale differential statistical method proposed by Jiatuo Xu [14] is to describe the pixels of texture pictures and the gray change between adjacent pixels, which is extremely close to the concept of the differential box-counting method used in this study (in which the size of a pixel point was set as the base side d of a tetragonal prism).
The gray-scale differential statistical method is mainly used to analyze the characteristics of tongue coating texture by judging the ranges of 4 parameter value domains, namely contrast (CON), angular second moment (ASM), entropy (ENT) and mean (MEAN). In contrast, this study proposed to use the differential box-counting approach for the calculation of simple fractal dimensions of digitized tongue pictures, and to consider simple fractal dimensions as effective reference data for the computer-assisted automatic recognition of the thin/thick characteristics of tongue coatings. The study results indicated that the accuracy of classification criteria could be further enhanced as the sample data size continues to grow. If the sample data size cannot be expanded continually, this study proposes that 8 characteristic parameters of multiple fractal spectra of digitized tongue pictures be collected as input vectors, and that three-layer BP neural network classifiers be designed to establish a neural network model for the judgement of binary characteristics of tongue coating texture, e.g., whether the tongue coating is thin or thick and whether it is curdy or greasy. Preliminary statistics and verification on laboratory data indicated that the multiple fractal spectrum parameters of digitized tongue pictures provided a sounder basis for quantitative mathematical analysis and an ideal classification accuracy rate when used as the input vectors of the three-layer BP neural network classifiers; the coincidence rate with the judgement of clinical TCM doctors reached 90-93%.
In this study, the tongue body was divided into 5 subareas based on the corresponding organ distribution, and the characteristics of each subarea were recognized; the thin/thick and greasy characteristics of the tongue coating in the tongue picture as a whole were not judged, which conformed to the doctors' approach of capturing abnormal information during the clinical analysis of tongue pictures. However, doctors can spontaneously make a comprehensive judgement of the overall characteristics of tongue pictures based on the characteristics of each subarea when observing tongue pictures. Therefore, reference weights for the thin/thick and greasy characteristics of tongue coatings in given subareas can be added through sample training on the basis of the current study to achieve a comprehensive judgement of the overall characteristics of tongue pictures.

Conclusion
The characteristics of digitized tongue pictures and tongue coating texture can be described by simple fractal dimensions, a stable quantitative measure. The study results indicated that the accuracy of classification criteria could be further enhanced as the sample data size continues to grow. If the sample data size cannot be expanded continually, the characteristic parameters of multiple fractal spectra of digitized tongue pictures can be used as an effective mathematical basis for the classification and recognition of the thin/thick and greasy characteristics of tongue coatings. Eight characteristic parameters of multiple fractal spectra of digitized tongue pictures were used as the input vectors of the three-layer BP neural network classifiers, and their coincidence rate with the judgement of clinical TCM doctors reached 90-93%.