Quality inspection of nectarine based on hyperspectral imaging technology

ABSTRACT This paper proposes a method for inspecting the quality of nectarines based on hyperspectral imaging technology. The external quality classes are intact, cracked, rust, dysmorphic and dark damaged, while the internal quality index is the soluble solid content (SSC). Firstly, 480 nectarine samples (160 intact and 320 defective) of similar shape and size are selected. Secondly, 5 spectral principal components and 6 texture values are extracted in the spectral range of 420–1000 nm according to the external and internal quality indexes. Finally, Partial Least Squares (PLS), Least Squares Support Vector Machine (LS-SVM) and Extreme Learning Machine (ELM) are used to establish external quality discrimination models and internal quality prediction models. For PLS, LS-SVM and ELM, respectively, accuracies of 89.73%, 94.45% and 88.62% are obtained in the identification of external quality, and SSC is predicted with determination coefficients of 0.8540, 0.8747 and 0.8146 and root mean squared errors of 0.9849, 0.9101 and 1.0732. The results indicate the great potential of the LS-SVM model for discriminating the external quality and predicting the internal quality of nectarines.


Introduction
In recent years, consumers have become increasingly demanding about the quality of fruit and vegetables, and the non-destructive detection of their internal and external quality has attracted the attention of researchers; see e.g. Huyskens-Keil and Schreiner (2003). Nectarine is an edible fruit of the peach species in the family Rosaceae, characterized by smooth, hairless skin, a sweet and sour taste and high nutritional value (Lurie & Crisosto, 2005). Due to the influence of the environment and other factors, the quality of nectarines varies considerably. Therefore, a prior evaluation of the internal and external quality is necessary to offer consumers fruit that caters to their preferences.
Nondestructive testing (NDT) of fruit and vegetables is an emerging technology that has developed along with advances in science and technology (Zhi-Xia & Ji-Yun, 2013). It can detect the internal and external conditions of fruit and vegetables without destroying them. In the past few years, acoustic, mechanical, visible-spectrum and short-wave near-infrared techniques have been widely adopted in research on fruit and vegetable disease detection, quality detection and grading (Dan, 2020; Dubey & Jalal, 2015; Jhuria et al., 2014; Mendoza et al., 2012). More recently, hyperspectral imaging technology, a fast, non-contact NDT technology, has attracted the interest of researchers (Liu et al., 2020). It obtains the image and spectral information of the study object simultaneously. The image information reflects the size, shape and other external features of the sample, and the spectral information reflects differences in the physical structure and chemical composition of the samples. These characteristics give hyperspectral imaging unique advantages in detecting the internal and external quality of agricultural products (Ariana & Lu, 2010; Cho et al., 2013; Nicolai et al., 2007). Hyperspectral imaging has been used to discriminate damaged jujube from intact jujube, with a discrimination accuracy of 94% (Wang et al., 2011). A detection method based on hyperspectral imaging has also been developed for faster and more effective identification of saccharin jujube (Zhang et al., 2020).
CONTACT Hua Yang yanghua@sxau.edu.cn, 331336198@qq.com
Some scholars analyse the soluble solid content (SSC) and hardness of three types of apples using hyperspectral imaging technology and establish partial least squares (PLS) models. Comparing full-wavelength data with fusion values of spectral characteristic wavelengths and image features, the experimental results show that models combining spectral and image characteristic parameters clearly outperform models built on spectral or image characteristic parameters alone (Mendoza et al., 2011). Some progress has also been made in the NDT of nectarines and peaches by hyperspectral technology. Hyperspectral reflectance imaging is used to identify nectarine varieties with similar appearances and different tastes (Sun et al., 2017). Hyperspectral reflectance imaging combined with partial least squares discriminant analysis (PLS-DA), artificial neural network (ANN) and support vector machine (SVM) is used to evaluate the cold injury of peach. The results show that, using full wavelengths, the ANN model has the highest classification rates for the prediction set, with accuracies of 85.37%, 96.11% and 99.29% for four-class, three-class and two-class classifications, respectively. PLS-DA is used to develop a prediction model that distinguishes intact fruits of the cultivars using pixel-wise and mean spectrum approaches, and the model is then projected onto the complete surface of the fruits to allow visual inspection. The results indicate that the mean spectrum of the fruit is the most accurate approach, achieving a correct discrimination rate of 94%. Wavelength selection based on the regression coefficients of the PLS-DA model reduces the dimensionality of the hyperspectral images, and an accuracy of 96% is obtained using 14 optimal wavelengths (Munera et al., 2018).
Munera et al. (2019) inspect the internal quality of nectarines (Prunus persica L. Batsch var. nucipersica) cv. 'Big Top' (yellow flesh) and 'Magique' (white flesh) by means of hyperspectral transmittance imaging. Hyperspectral images of intact fruits are acquired in the spectral range of 630–900 nm in transmittance mode during ripening under controlled conditions. The detection of split pit disorder and the classification according to an established firmness threshold are performed using PLS-DA. The prediction of the Internal Quality Index (IQI) related to ripeness is performed using PLSR. As a result, an accuracy of 94.7% is obtained in the detection of fruits with split pit in the 'Big Top' cultivar. Accuracies of 95.7% and 94.6% are achieved in the classification of the 'Big Top' and 'Magique' cultivars, respectively, according to the firmness threshold. The internal quality is predicted through the IQI with $R^2$ values of 0.88 and 0.86 for the two cultivars.
In conclusion, as a technology that fuses image and spectral information, hyperspectral imaging is widely used in the NDT of fruit and vegetables. However, NDT of nectarines has mainly addressed either external defects or internal quality alone, and the problem of detecting both simultaneously has not been thoroughly investigated. This paper is an attempt to bridge this gap. The main contributions of this paper are as follows: (1) The information obtained by hyperspectral imaging is highly mixed, and using it directly would strongly affect the discrimination results, so useful information must be extracted with signal processing techniques, which is a challenge. In this paper, 5 spectral principal components and 6 texture values are selected according to the internal and external quality indexes. (2) The proposed method takes the fusion of the spectral information and image texture information obtained from hyperspectral images as input to detect the external defects and internal quality of nectarines simultaneously.
The paper is structured as follows. In Section 2, the experimental materials and methods are introduced. Three kinds of external quality discrimination and internal quality prediction models are established in Section 3 using PLS, least squares support vector machine (LS-SVM) and extreme learning machine (ELM). In Section 4, the model with the highest accuracy is selected through verification and analysis. Finally, the results are summarized and possible future research directions are pointed out in Section 5.

Experimental design
A total of 480 'Chinese nectarine No. 9' samples of similar shape and size are collected from Wan'an Village, Yuncheng City, Shanxi Province, China. Each sample weighs between 160 and 205 g. The selected samples include 160 intact fruit and 320 defective fruit; the defective fruit comprise 80 each of cracked, rust, dysmorphic and dark damaged fruit.
Based on the Kennard-Stone algorithm, 360 samples are used to construct the models and the remaining 120 samples are used to verify them. The numbers of samples and their distribution are given in Table 1. As shown in Figure 1, the hyperspectral imaging system mainly consists of a camera, a spectrometer (ImSpector V10E, Spectral Imaging Ltd., Oulu, Finland), a computer and the ENVI spectral image processing software. The hyperspectral measurements are carried out over a wavelength range of 420–1000 nm.
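The Kennard-Stone split mentioned above can be illustrated with a minimal sketch. The paper does not state which distance or feature space is used, so the Euclidean distance on raw feature vectors and the function name `kennard_stone` are assumptions made only for illustration.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule.

    Assumption: Euclidean distance between feature vectors.
    """
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(samples)")
    dist = [[math.dist(a, b) for b in samples] for a in samples]

    # Start with the two samples that are farthest apart.
    i0, j0 = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [i0, j0]
    remaining = [k for k in range(n) if k not in (i0, j0)]

    # Distance from each remaining sample to its nearest selected sample.
    nearest = {k: min(dist[k][i0], dist[k][j0]) for k in remaining}

    while len(selected) < n_select:
        # Add the sample whose nearest selected neighbour is farthest away.
        pick = max(remaining, key=lambda k: nearest[k])
        selected.append(pick)
        remaining.remove(pick)
        del nearest[pick]
        for k in remaining:
            nearest[k] = min(nearest[k], dist[k][pick])
    return selected


# Toy usage: choose 3 calibration samples, keep the rest for validation.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (5.0, 0.0)]
calibration = kennard_stone(points, 3)
validation = [k for k in range(len(points)) if k not in calibration]
```

For the data set described here, the same call with 480 feature vectors and `n_select=360` would give the calibration indices, with the remaining 120 indices forming the validation set.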

The selection of external quality indicators
According to their surface characteristics, the samples are divided into intact fruit and defective fruit. The surface defect classes are cracked, rust, dysmorphic and dark damaged, which refer, respectively, to cracking of the fruit surface, rust spots on the fruit surface caused by fungi, an abnormal shape formed by poor ventricular union, and slight pitting of the skin caused by damage. An intact fruit is one that has no surface defects.
The intact sample and four defective samples are shown in Figure 2.

The selection of internal quality indicators
Four indexes of hardness, SSC, titratable acid content and vitamin C content are selected to characterize the intrinsic quality of nectarines. The values of the indexes are determined by the following methods.
(1) Fruit hardness: a TMS-PRO food property analyser (FTC company, USA) is used to determine the hardness of the nectarines.
(2) SSC: nectarine samples weighing 5.00 g, including skin and flesh, are ground, and the juice is filtered and measured with a hand-held sugar meter.
(3) Titratable acid content: the titratable acid content c (%) is given by the following formula:

$$c = \frac{v \, m \, l \, b}{a \, d} \times 100$$

where v (mL) is the volume of NaOH solution consumed; m (mol/L) is the concentration of the NaOH solution; l is the conversion coefficient, i.e. the mass (g) of the organic acid per millimole, taken as 0.067; a (g) is the weight of the sample; b (mL) is the total volume of the sample solution; and d (mL) is the volume of sample solution used for titration.
(4) Vitamin C content: the vitamin C content w (%) is determined by redox titration as follows:

$$w = \frac{(h - h_1) \, e \, q}{f \, p} \times 100$$

where h (mL) is the volume of dye used to titrate the sample; $h_1$ (mL) is the volume of dye used in the blank titration; e (mg) is the mass of ascorbic acid equivalent to 1 mL of dye solution; f (mL) is the volume of sample solution taken for titration; q (mL) is the total volume of the sample solution after dilution; and p (g) is the weight of the sample.
Using these methods, the measured values of each index and their maximum, minimum, average, standard deviation (SD) and test statistic u are obtained. The results are shown in Table 2.
From the table of critical u values, $u_{0.05} = 1.96$ and $u_{0.01} = 2.58$. Table 2 shows that the u values of hardness, titratable acid content and vitamin C content are all lower than $u_{0.01}$, which means that there are no significant differences between intact and defective samples in these three internal quality indexes. The u value of SSC is greater than $u_{0.01}$, which indicates a significant difference between intact and defective samples in SSC, so SSC is adopted as the internal quality index of nectarines.
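The comparison against the critical values 1.96 and 2.58 can be sketched as below. The paper does not give the formula for u, so this assumes the usual large-sample two-group statistic (difference of means divided by its standard error, with unpooled sample variances); the function names are hypothetical.

```python
import math
import statistics


def u_statistic(group_a, group_b):
    """Two-sample u (z) statistic, assuming unpooled sample variances."""
    mean_a = statistics.fmean(group_a)
    mean_b = statistics.fmean(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return abs(mean_a - mean_b) / standard_error


def significance(u, u_05=1.96, u_01=2.58):
    """Compare u with the critical values quoted in the text."""
    if u >= u_01:
        return "significant at 0.01"
    if u >= u_05:
        return "significant at 0.05"
    return "not significant"
```

Under this reading, an index such as SSC is retained as a quality indicator only when `significance(u_statistic(intact, defective))` reports a significant difference.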

Extraction of spectral information
The spectral data of nectarine samples are extracted by region of interest, and spectral curves of the 5 types of samples (intact, cracked, rust, dysmorphic and dark damaged) are shown in Figure 3.
According to the spectral curves of the defective and intact samples, the reflectivity of intact samples is higher than that of defective samples in the visible region (420–780 nm) and lower than that of defective samples in the near-infrared region (780–1000 nm). This shows that the spectral curves of intact and defective samples are significantly different. Because the data extracted over 420–1000 nm contain redundant information, appropriate dimensionality reduction of the full-band spectrum is necessary. Principal component analysis (PCA) is used to reduce the dimension of the spectral data. The principal components of the samples are extracted, and the contribution rates of the first 10 spectral principal components are shown in Table 3.
The first 5 principal components in Table 3 explain 99.66% of the information in the original spectra, so they are selected to establish the models.
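As an illustration of how contribution rates and principal component scores of this kind can be computed, the following sketch applies PCA through a singular value decomposition of the mean-centred spectra. It assumes NumPy is available; the function name `pca_reduce` and the random stand-in data are hypothetical and are not the authors' data or code.

```python
import numpy as np


def pca_reduce(spectra, n_components):
    """Return PC scores and cumulative contribution rates.

    spectra: array of shape (n_samples, n_bands).
    """
    X = np.asarray(spectra, dtype=float)
    Xc = X - X.mean(axis=0)  # centre each band
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    contribution = s**2 / np.sum(s**2)  # variance share per component
    scores = Xc @ Vt[:n_components].T  # project onto leading components
    return scores, np.cumsum(contribution)[:n_components]


# Stand-in data: 20 samples x 50 bands of random numbers.
rng = np.random.default_rng(0)
demo_spectra = rng.normal(size=(20, 50))
demo_scores, demo_cumulative = pca_reduce(demo_spectra, 5)
```

With real spectra, the last entry of the cumulative array for `n_components=5` corresponds to the kind of figure reported in Table 3 (99.66% in this study).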

Image recognition and extraction of texture index value
In this study, six indexes (mean, contrast, correlation, energy, homogeneity and entropy) are adopted to compare the texture features in the 0° direction. Mean, Contrast, Correlation, Energy, Homogeneity and Entropy represent the average grey value in the window, the grey-level difference in the region, the linear correlation of grey levels, the uniformity of the grey distribution, the local change of grey levels and the disorder of the grey distribution, respectively. The six values are given as follows:

$$\mathrm{Mean} = \sum_{i}\sum_{j} i \, P(i,j)$$

$$\mathrm{Contrast} = \sum_{i}\sum_{j} (i-j)^2 \, P(i,j)$$

$$\mathrm{Correlation} = \sum_{i}\sum_{j} \frac{(i-\mu_x)(j-\mu_y) \, P(i,j)}{\sigma_x \sigma_y}$$

$$\mathrm{Energy} = \sum_{i}\sum_{j} P(i,j)^2$$

$$\mathrm{Homogeneity} = \sum_{i}\sum_{j} \frac{P(i,j)}{1+(i-j)^2}$$

$$\mathrm{Entropy} = -\sum_{i}\sum_{j} P(i,j) \, \ln P(i,j)$$

where i and j are the grey levels of two pixels, P(i, j) is the grey-level co-occurrence matrix, $\mu_x$ and $\mu_y$ are the grey-level means of the corresponding row and column, and $\sigma_x$ and $\sigma_y$ are the corresponding grey-level standard deviations. The co-occurrence matrix texture values of the intact and defective samples are shown in Figure 4. Figure 4(a) shows that the texture means of intact and defective nectarines lie in the ranges 0.02–0.065 and 0.03–0.13, respectively, so the two ranges partially overlap. Figure 4(b) shows that the average contrasts of the intact and defective samples are 0.37 and 0.45, so the average contrast of the defective samples is clearly higher. Figure 4(c,e) shows that the correlation and homogeneity of the defective samples are lower than those of the intact samples. Figure 4(d) shows a serious overlap between the energy values of the defective and intact samples. Figure 4(f) shows that the average entropy of the defective samples is significantly higher than that of the intact samples. Taken together, the six texture indexes can effectively distinguish intact fruit from defective fruit.
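The six co-occurrence texture values described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: a pixel offset of 1 in the 0° (horizontal) direction, a symmetric and normalized co-occurrence matrix, integer grey levels starting at 0, and the commonly used textbook definitions of the six features. The authors obtained their values with ENVI, whose exact settings are not given in the text.

```python
import math


def glcm_0deg(image, levels):
    """Normalized, symmetric co-occurrence matrix for horizontal neighbours."""
    counts = [[0.0] * levels for _ in range(levels)]
    for row in image:
        for a, b in zip(row, row[1:]):
            counts[a][b] += 1
            counts[b][a] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[c / total for c in r] for r in counts]


def texture_features(P):
    """Six texture values from a normalized co-occurrence matrix P."""
    n = len(P)
    cells = [(i, j, P[i][j]) for i in range(n) for j in range(n)]
    mu_x = sum(i * p for i, j, p in cells)
    mu_y = sum(j * p for i, j, p in cells)
    sd_x = math.sqrt(sum((i - mu_x) ** 2 * p for i, j, p in cells))
    sd_y = math.sqrt(sum((j - mu_y) ** 2 * p for i, j, p in cells))
    covariance = sum((i - mu_x) * (j - mu_y) * p for i, j, p in cells)
    # Correlation is undefined for a constant image; report NaN in that case.
    correlation = covariance / (sd_x * sd_y) if sd_x * sd_y > 0 else float("nan")
    return {
        "mean": mu_x,
        "contrast": sum((i - j) ** 2 * p for i, j, p in cells),
        "correlation": correlation,
        "energy": sum(p * p for i, j, p in cells),
        "homogeneity": sum(p / (1 + (i - j) ** 2) for i, j, p in cells),
        "entropy": -sum(p * math.log(p) for i, j, p in cells if p > 0),
    }


# Toy usage on a 2-level image.
toy_image = [[0, 0, 1, 1], [0, 0, 1, 1]]
toy_features = texture_features(glcm_0deg(toy_image, levels=2))
```

The resulting six numbers per sample play the role of the 6 texture values that are later fused with the 5 spectral principal components.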
So far, the external quality indexes (intact, cracked, rust, dysmorphic, dark damaged) and the internal quality index (SSC) of nectarines have been selected, and 5 spectral principal components and 6 texture values have been extracted accordingly. In the following, three kinds of models are constructed from these 5 spectral principal components and 6 texture values to discriminate external quality and predict internal SSC simultaneously.

Model building
In this section, we aim to establish a model to discriminate the external quality and predict the internal quality simultaneously in terms of the selected 5 spectral principal components and 6 texture values. In order to improve the accuracy, the models of PLS, LS-SVM and ELM are adopted in the following.

PLS
PLS is suitable for the analysis of both small and large samples, and it is widely used in modelling based on spectral data. The basic idea of PLS is to decompose the observed data into a small number of components, each of which is a weighted linear combination of the original variables. Let X and Y be the input matrix of spectral data and the output matrix of chemical index data, with appropriate dimensions. In PLS, the input matrix X and the output matrix Y are decomposed as follows:

$$X = T P^{\mathrm{T}} + E$$

$$Y = U Q^{\mathrm{T}} + F$$

where T and U are the score matrices of X and Y, P and Q are the loading (principal component) matrices of X and Y, and E and F are the fitting residual matrices of X and Y, respectively. Then, linear regression is performed between the matrices T and U.
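To make the decomposition more concrete, the following sketch fits a single-response PLS regression (PLS1) with the NIPALS deflation scheme. It is a simplified stand-in under stated assumptions, not the authors' implementation: NumPy is assumed, only one response column is handled, and the function names are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns (coefficients, x_mean, y_mean)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean  # residuals of X and y
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)  # weight vector
        t = E @ w  # score vector
        tt = t @ t
        p = E.T @ t / tt  # X loading
        c = f @ t / tt  # y loading
        E = E - np.outer(t, p)  # deflate X
        f = f - c * t  # deflate y
        weights.append(w)
        loadings.append(p)
        y_loadings.append(c)
    W = np.array(weights).T
    P = np.array(loadings).T
    q = np.array(y_loadings)
    coef = W @ np.linalg.solve(P.T @ W, q)  # regression vector in X space
    return coef, x_mean, y_mean


def pls1_predict(model, X):
    """Predict the response for new rows of X."""
    coef, x_mean, y_mean = model
    return (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean
```

In a setting like the one described here, X would hold the 11 fused features per sample and y either the SSC value or a numeric class label; the number of components is a tuning choice that the text does not specify.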

LS-SVM
SVM is a method for classification and nonlinear regression based on statistical learning theory. LS-SVM is an improvement on the classic SVM that uses a least squares loss function, so that training reduces to solving a system of linear equations instead of a convex quadratic programming problem. Compared with SVM, LS-SVM has lower computational complexity and higher efficiency. The discriminant function is given as follows:

$$y(x) = \operatorname{sign}\left( \sum_{i=1}^{n} a_i \, y_i \, K(x_i, x) + b \right)$$

where x is the input vector, $x_i$ is the i-th training sample, $y_i$ is the target value corresponding to $x_i$, n is the number of training samples, $a_i$ is the support value, b is the deviation value, and $K(x_i, x)$ is the radial basis function (RBF) kernel, which is selected as follows:

$$K(x_i, x) = \exp\left( -\frac{\lVert x - x_i \rVert^2}{2\sigma^2} \right)$$

and $\sigma^2$ is the RBF kernel parameter, selected as $5.2 \times 10^{2}$ by optimization.
For the prediction of an unknown vector x, the following regression function is adopted:

$$y(x) = \sum_{i=1}^{n} a_i \, K(x_i, x) + b$$
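As a rough illustration of LS-SVM regression with an RBF kernel, the sketch below solves the standard LS-SVM linear system for the support values and the bias. It assumes NumPy, a kernel of the form exp(-||x - x_i||^2 / (2 sigma^2)), and a regularization parameter `gamma`; the text does not report the regularization value, so `gamma` and all function names are hypothetical.

```python
import numpy as np


def rbf_kernel(A, B, sigma2):
    """RBF kernel matrix between rows of A and rows of B."""
    sq = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma2))


def lssvm_fit(X, y, gamma, sigma2):
    """Solve the LS-SVM linear system; returns (b, a)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(X, X, sigma2) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    solution = np.linalg.solve(A, rhs)
    return solution[0], solution[1:]  # bias b, support values a


def lssvm_predict(X_train, b, a, X_new, sigma2):
    """Evaluate sum_i a_i K(x_i, x) + b for each row of X_new."""
    X_train = np.asarray(X_train, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    return rbf_kernel(X_new, X_train, sigma2) @ a + b
```

Because training is a single linear solve, the cost is dominated by the (n + 1) by (n + 1) system, which is the efficiency advantage over quadratic programming that the text refers to.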

ELM
ELM is a neural network algorithm developed to overcome deficiencies of the single-hidden-layer feedforward neural network, such as local minima and slow training speed. The ELM model (Figure 5) is composed of an input layer, a hidden layer and an output layer. In Figure 5, $X = \{x_1, x_2, \ldots, x_n\}$ is the input vector, $Y = \{y_1, y_2, \ldots, y_m\}$ is the output vector, $\omega_{ij}$ are the connection weights between the input layer and the hidden layer, and $\beta_{jk}$ are the connection weights between the hidden layer and the output layer.
The discriminant model for an unknown sample vector X using the ELM algorithm is given as follows:

$$y_k = \sum_{j=1}^{N} \beta_{jk} \, g\left( \sum_{i=1}^{n} \omega_{ij} x_i + b_j \right), \quad k = 1, 2, \ldots, m \tag{17}$$

where N is the number of hidden layer elements, $b_j$ is the bias of the j-th hidden layer element, and g(·) is a sigmoid activation function.
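A minimal sketch of ELM training is given below, assuming NumPy. The input weights and hidden biases are drawn at random and kept fixed, and only the output weights are fitted, here with a Moore-Penrose pseudo-inverse. The uniform sampling range, the seed and the function names are assumptions for illustration only.

```python
import numpy as np


def _hidden_output(X, W, bias):
    """Sigmoid activations of the hidden layer."""
    return 1.0 / (1.0 + np.exp(-(X @ W + bias)))


def elm_fit(X, Y, n_hidden, seed=0):
    """Train an ELM; returns (input weights, hidden biases, output weights)."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    W = rng.uniform(-1.0, 1.0, size=(X.shape[1], n_hidden))  # fixed at random
    bias = rng.uniform(-1.0, 1.0, size=n_hidden)  # fixed at random
    H = _hidden_output(X, W, bias)
    beta = np.linalg.pinv(H) @ Y  # least-squares output weights
    return W, bias, beta


def elm_predict(model, X):
    """Forward pass through the trained ELM."""
    W, bias, beta = model
    return _hidden_output(np.asarray(X, dtype=float), W, bias) @ beta
```

Since no iterative weight updates are involved, training amounts to one pseudo-inverse, which is why ELM avoids the slow training and local minima mentioned above.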

Experimental results
In this section, the established PLS, LS-SVM and ELM models are used to discriminate the external quality classes and predict the SSC values. The results are shown in Figure 6 and Table 4. To validate the proposed models, two criteria are introduced: the determination coefficient ($R^2$) of prediction and the root mean squared error of prediction (RMSEP). Figure 6(a-c) depicts the classification results for external defects: the predicted category values of intact, cracked, rust, dysmorphic and dark damaged nectarines fall in the ranges 0.5–1.5, 1.5–2.5, 2.5–3.5, 3.5–4.5 and 4.5–5.5, respectively. If the predicted category value is less than 0.5 or greater than 5.5, the sample is regarded as heterogeneous fruit. In Figure 6(d-f), the predicted SSC values of intact and defective nectarines are mainly in the ranges 7.8–9 and 6.6–7.4, respectively. A possible reason is that external damage may lead to a decrease in SSC.
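The band rule described above, which maps a continuous predicted category value to a class, can be sketched as follows. The class coding 1 to 5 follows the order given in the text; how values exactly on a band boundary (for example 1.5) are assigned is not stated, so sending them to the upper band is an assumption, as are the function names.

```python
import math

CLASSES = {
    1: "intact",
    2: "cracked",
    3: "rust",
    4: "dysmorphic",
    5: "dark damaged",
}


def assign_category(value):
    """Map a predicted category value to a class name using the 0.5-wide bands."""
    if value < 0.5 or value > 5.5:
        return "heterogeneous"
    # Round half up, then clamp so that exactly 5.5 stays in the last band.
    label = min(5, max(1, math.floor(value + 0.5)))
    return CLASSES[label]


def classification_accuracy(predicted_values, true_labels):
    """Fraction of samples whose band matches the true class name."""
    hits = sum(
        assign_category(v) == t for v, t in zip(predicted_values, true_labels)
    )
    return hits / len(true_labels)
```

Applied to the 120 validation samples, a function of this kind would yield accuracy figures comparable in form to those reported in Table 4.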
Table 4 shows that the PLS, LS-SVM and ELM models achieve accuracies of 89.73%, 94.45% and 88.62%, respectively, in the identification of external quality, and predict SSC with $R^2$ of 0.8540, 0.8747 and 0.8146 and RMSEP of 0.9849, 0.9101 and 1.0732. The LS-SVM model therefore has the smallest error and the highest accuracy in both discrimination and prediction, and it has great potential for discriminating the external quality and predicting the internal quality of nectarines.
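For reference, the two prediction criteria can be computed as in the sketch below. It assumes the common definitions, with the determination coefficient taken as one minus the ratio of residual to total sum of squares and RMSEP as the root of the mean squared prediction error; the paper does not spell out its formulas (some studies report the squared correlation coefficient instead), so these definitions are an assumption.

```python
import math


def r_squared(y_true, y_pred):
    """Determination coefficient: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmsep(y_true, y_pred):
    """Root mean squared error of prediction."""
    squared_error = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared_error / len(y_true))
```

A model with a higher `r_squared` and a lower `rmsep` on the validation set is preferred, which is the basis on which LS-SVM is ranked first here.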
According to the analysis, it can be seen that the LS-SVM model established by data fusion of spectral principal component value and image texture index value has a strong universality, which can achieve the purpose of simultaneously detecting the external defect and internal quality (SSC) of nectarine. It provides theoretical support and basis for the research and development of online nectarine detection equipment.

Conclusion and prospects
This paper presents a new approach to the simultaneous detection of the external and internal quality of nectarines using spectral and image texture information extracted from hyperspectral images. With intact, cracked, rust, dysmorphic and dark damaged as the external quality indexes and SSC as the internal quality index, 5 spectral principal components and 6 texture values are adopted, and three external defect identification models and three SSC prediction models are established using PLS, LS-SVM and ELM, respectively. The results confirm that the LS-SVM model has great potential for evaluating the external and internal quality of nectarines.
The present models for the external quality discrimination and the internal quality prediction are constructed in terms of 5 types of external indexes and 4 types of internal indexes. However, there are great differences in the external characteristics and internal indexes of different fruit. The model constructed in this study has low universality in the discrimination of other fruit because more physical and chemical indexes are not taken into account. In the future, the diversity of defect types and physical and chemical indicators can be appropriately improved to develop a more widely applicable and more accurate discrimination algorithm.

Disclosure statement
No potential conflict of interest was reported by the authors.