Tilapia freshness prediction utilizing gas sensor array system combined with convolutional neural network pattern recognition model

ABSTRACT The freshness of tilapia kept in cold storage was studied using a gas sensor array combined with a convolutional neural network (CNN) pattern recognition model. The total volatile basic nitrogen (TVBN) index was measured to provide a freshness reference for the tilapia samples. A portable electronic nose was designed and fabricated, and the responses of its sensor array to the tilapia samples were recorded. Principal component analysis (PCA) was applied to the gas sensor array data, but it could not discriminate all the samples. The CNN model was then optimized. Its structure comprised one input layer, four convolutional layers, two pooling layers, one dropout layer, one fully connected layer, and one output layer. The prediction accuracy of the optimized CNN was 92.31%. The method offers rapid detection, easy operation, high sensitivity, and high precision, and it is promising for quality evaluation of aquatic food products.


Introduction
Tilapia is a medium-sized fish whose meat is widely appreciated for its flavor. It is also rich in protein and in a variety of unsaturated fatty acids, and many researchers have studied its high nutritional value. China is the world's largest producer and exporter of tilapia, so quality determination for cold-stored tilapia is of great significance.
At present, fish freshness is determined by physical discrimination, chemical examination, microbial detection, and sensory evaluation. [1,2] These traditional methods are tedious, time-consuming, and costly, so a simple, accurate, and rapid method for detecting fish freshness is urgently needed. In recent years, gas sensor array analysis has developed quickly. [3] It is widely used in food quality control, food production, and other industries. [4][5][6] The technique offers advantages including rapid detection and stability. [7][8][9][10] Meanwhile, with the development of machine learning, more methods, such as the support vector machine (SVM) and the CNN, have been applied effectively to food quality examination. However, there are few reports on the application of gas sensor arrays combined with machine learning models to the rapid monitoring of tilapia quality.
In this work, a portable electronic nose system was designed and fabricated using eight kinds of gas sensors. The electronic nose was used to measure tilapia samples stored for different lengths of time under cold storage. The TVBN index was measured to provide a freshness reference for the electronic nose detection. PCA was used to classify the fish samples. A CNN model was optimized and used to predict fish freshness.

Materials and pretreatment
Tilapia was obtained from an agricultural market in Hangzhou. The fish was washed with ice water, and the scales, viscera, head, and tail were removed. The fish was cut along the spine into small pieces (each weighing 25 g ± 5 g) and stored at 277 K in a refrigerator. Samples were taken at random for TVBN tests and gas sensor array experiments on eight consecutive days.

TVBN
TVBN was determined according to the national standard SC/T 3032-2007.

Gas sensor array
System setup. Figure 1a is a schematic diagram of the gas sensor array system used in this experiment, which mainly consists of a gas collection device, a sensor array, and a signal processing system. The sample gas collection chamber and the clean air collection chamber are separate, which improves the purity of the sample gas and therefore the accuracy of the experimental data. The array comprises eight gas sensors, which are listed in Table 1.
Gas sensor array experiments. Fish samples (10 g) were taken at random from the refrigerator (277 K), placed in a 100 ml sample bottle, and sealed with sealing membrane. Half an hour later, the sampling needle and the zero-gas needle were inserted into the headspace of the sample container. The volatile gas was drawn into the E-nose and reacted with the gas sensors. Each measurement lasted 50 s, and the real-time responses of the E-nose were recorded. After each measurement, clean air was fed continuously until the gas sensor array recovered to its initial value. Five parallel experiments were set up, and each was repeated three times.
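The clean-air recovery step between measurements can be sketched as a simple polling loop. This is a minimal illustration only: the relative tolerance `rel_tol`, the function names, and the simulated sensor array are assumptions made for the example and are not taken from the instrument described here.

```python
from typing import Callable, List, Sequence

def has_recovered(baseline: Sequence[float], reading: Sequence[float],
                  rel_tol: float = 0.02) -> bool:
    """True when every sensor is back within rel_tol of its baseline value."""
    return all(abs(r - b) <= rel_tol * abs(b) for b, r in zip(baseline, reading))

def purge_until_recovered(read_sensors: Callable[[], List[float]],
                          baseline: Sequence[float],
                          rel_tol: float = 0.02, max_reads: int = 1000) -> int:
    """Poll the array while clean air flows; return the number of reads needed."""
    for n in range(1, max_reads + 1):
        if has_recovered(baseline, read_sensors(), rel_tol):
            return n
    raise RuntimeError("sensor array did not return to baseline")

def make_simulated_array(baseline, start, decay=0.5):
    """Hypothetical stand-in for the hardware: each read halves the excess signal."""
    state = list(start)
    def read():
        for i, b in enumerate(baseline):
            state[i] = b + (state[i] - b) * decay
        return list(state)
    return read

baseline = [1.0] * 8                      # eight sensors, arbitrary units
read = make_simulated_array(baseline, start=[2.0] * 8)
reads_needed = purge_until_recovered(read, baseline)
print(reads_needed)
```

In a real setup `read_sensors` would wrap the data acquisition call of the E-nose, and the tolerance would be chosen from the noise level of the sensors.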

CNN model development and optimization
The CNN model structure is shown in Figure 1b. The model is developed in four steps: (i) The CNN model is built according to the initial structure and parameters, and the parameters are initialized. [11,12] (ii) The training data set is fed to the model in batches of size batch_size, the output of each layer is obtained, and the model parameters are updated based on the output. (iii) The accuracy on the testing data set is checked against the required standard. If it is reached, the next step is executed; if not, the procedure returns to step (i). (iv) The model parameters are saved and training ends, which completes the development of the CNN model.
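The four development steps can be sketched as a control loop. To keep the example runnable with the standard library alone, a one-weight threshold classifier stands in for the CNN; the toy data, the target accuracy of 0.9, and all function names are assumptions for illustration, not the settings used in this work.

```python
import json
import random

def init_params(rng):
    # Step (i): initialise the parameters. One weight and one bias stand in
    # for the CNN weights (hypothetical toy model, not the network of Figure 1b).
    return {"w": rng.uniform(-0.1, 0.1), "b": 0.0}

def predict(params, x):
    return 1 if params["w"] * x + params["b"] > 0 else 0

def train_one_pass(params, train, batch_size, lr=0.1):
    # Step (ii): feed the training set batch by batch and amend the
    # parameters from the errors observed on each batch.
    for start in range(0, len(train), batch_size):
        batch = train[start:start + batch_size]
        grad_w = sum((y - predict(params, x)) * x for x, y in batch)
        grad_b = sum(y - predict(params, x) for x, y in batch)
        params["w"] += lr * grad_w / len(batch)
        params["b"] += lr * grad_b / len(batch)

def accuracy(params, data):
    return sum(predict(params, x) == y for x, y in data) / len(data)

def develop_model(train, test, target=0.9, batch_size=2, epochs=20,
                  max_restarts=5, seed=0):
    rng = random.Random(seed)
    for _ in range(max_restarts):
        params = init_params(rng)                      # step (i)
        for _ in range(epochs):
            train_one_pass(params, train, batch_size)  # step (ii)
        if accuracy(params, test) >= target:           # step (iii)
            return json.dumps(params)                  # step (iv): save
    raise RuntimeError("target accuracy not reached")

# Toy one-dimensional data: label 1 when the feature is positive.
train_set = [(-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1)]
test_set = [(-0.8, 0), (-0.2, 0), (0.3, 1), (0.9, 1)]
saved = develop_model(train_set, test_set)
final_accuracy = accuracy(json.loads(saved), test_set)
print(final_accuracy)
```

The `max_restarts` limit is an addition of the sketch: it bounds the return to step (i) so that the loop cannot run forever when the target is unreachable.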
A CNN model has an input layer, convolutional layers, pooling layers, a dropout layer, a fully connected layer, and so on. The input layer receives the input data. The convolutional layers extract features. The pooling layers reduce the dimensionality of the features extracted by the convolutional layers and accelerate the convergence of the model. The dropout layer prevents the CNN model from overfitting by randomly zeroing out part of the data. The fully connected layer reduces the output features and passes them to the output layer, which outputs the discrimination results. Considering the practical requirements, the CNN model is designed as shown in Figure 1c. It contains one input layer, four convolutional layers, two pooling layers, one dropout layer, one fully connected layer, and one output layer. Based on this structure, the parameters of each layer are initialized.
The model parameters are optimized in four steps. (i) Convolution kernel size. The accuracy of the model for different convolution kernel sizes is given in Table 2. The optimized kernel size is 9. (ii) Number of convolution kernels. With the kernel size fixed at 9, the effect of the number of convolution kernels on model accuracy is evaluated, and the results are given in Table 2. The optimized number of kernels is 100. (iii) Batch_size. With a kernel size of 9 and 100 kernels, the optimized batch_size is 400. (iv) Dropout. The dropout parameter prevents overfitting by randomly setting part of the data to 0. Table 2 shows the discrimination results for different dropout values. The optimized dropout value is 0.5. After optimization, the final model parameters are as shown in Table 3.
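The optimization described here tunes one parameter at a time and fixes each chosen value before moving to the next. A minimal sketch of that procedure follows. The candidate grids and starting values are hypothetical (the values actually tested are those of Table 2), and `toy_evaluate` is a placeholder score that simply peaks at the reported optimum so that the example runs; in practice it would train the CNN with the given configuration and return its accuracy.

```python
def tune_sequentially(evaluate, search_space, defaults):
    """Optimise one parameter at a time, keeping earlier choices fixed."""
    best = dict(defaults)
    for name, candidates in search_space.items():
        scores = {}
        for value in candidates:
            trial = dict(best)
            trial[name] = value
            scores[value] = evaluate(trial)
        best[name] = max(scores, key=scores.get)
    return best

# Optimised values reported in the text.
REPORTED = {"kernel_size": 9, "n_kernels": 100, "batch_size": 400, "dropout": 0.5}

def toy_evaluate(cfg):
    # Placeholder for "train the model and measure accuracy" (assumption):
    # the score drops with the relative distance from the reported optimum.
    penalty = sum(abs(cfg[k] - REPORTED[k]) / REPORTED[k] for k in REPORTED)
    return 1.0 - penalty / len(REPORTED)

# Hypothetical candidate grids, listed in the order used in the text.
search_space = {
    "kernel_size": [3, 5, 7, 9, 11],
    "n_kernels": [25, 50, 100, 200],
    "batch_size": [100, 200, 400, 800],
    "dropout": [0.2, 0.3, 0.5, 0.7],
}
defaults = {"kernel_size": 3, "n_kernels": 25, "batch_size": 100, "dropout": 0.2}

best_config = tune_sequentially(toy_evaluate, search_space, defaults)
print(best_config)
```

Tuning sequentially needs far fewer training runs than a full grid over all four parameters, at the cost of possibly missing interactions between them.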

Results of TVBN
Under the action of microorganisms and endogenous enzymes, the protein of tilapia decomposes into ammonia, amines, and other basic nitrogen compounds, which are volatile. With the volatile basic nitrogen method, [13] the volatile basic nitrogen content of fish can be determined and the freshness of the fish can be approximately evaluated. The results are displayed in Figure 2a.
The results indicate that the TVBN index increases with storage time. However, it cannot accurately characterize the freshness changes of tilapia during cold storage. [2] According to GB 2733-2005, the fish meat begins to spoil. [14] The reason for this phenomenon is that the fat in fish meat is easily oxidized to a certain extent. The experimental results show that the growth rate of the volatile basic nitrogen content increases on the 7th day. TVBN reaches its maximum and the fish is seriously spoiled.
Figure 2b shows the original response curves of the gas sensor array, which is equipped with a total of eight gas sensors. The responses of S1 and S2 are much larger than those of the other sensors, followed in turn by those of S5 and S6. Sensor S8 gives the weakest response. The results indicate that the tilapia meat samples produce characteristic sensor array responses.

PCA results
The measurement data generated by the sensor array system for tilapia meat samples at different storage times form a multi-dimensional matrix with considerable data redundancy. [15] PCA can reduce the dimensionality of such a complex data set by selecting a few important indices, each a linear combination of the original variables, that capture the most important features. The PCA results are shown in Figure 2c and indicate that the freshness of the fish meat changes considerably during the experiment. The total contribution of the first two principal components (PC1 and PC2) is about 85.09%. Over the 8 days, PC1 shows a declining trend and PC2 an increasing trend, so the samples move overall from the lower right to the upper left of the score plot. The clustering of the samples measured on the same day becomes looser as storage time increases, indicating that the degree of spoilage differs considerably among samples of the same day. Overall, PCA is not suitable for characterizing the freshness of the tilapia samples.
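The "total contribution" of PC1 and PC2 is the share of the total variance carried by the two largest eigenvalues of the covariance matrix. The sketch below computes it with power iteration and deflation, using only the standard library. The three-sensor toy readings are invented for the example (the third column is the sum of the first two, so two components explain all the variance), and the function names are assumptions.

```python
import math

def center(rows):
    """Subtract the column means from every row."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]

def covariance(centered):
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def power_iteration(mat, iters=500):
    """Dominant eigenvalue and unit eigenvector of a symmetric matrix."""
    d = len(mat)
    vec = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(v * v for v in vec))
    vec = [v / norm for v in vec]
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break
        vec = [v / norm for v in nxt]
    eigval = sum(vec[i] * sum(mat[i][j] * vec[j] for j in range(d))
                 for i in range(d))
    return eigval, vec

def pca(rows, n_components=2):
    """Return the sample scores and the explained-variance ratio per component."""
    centered = center(rows)
    cov = covariance(centered)
    total_var = sum(cov[i][i] for i in range(len(cov)))
    work = [row[:] for row in cov]
    components, ratios = [], []
    for _ in range(n_components):
        eigval, vec = power_iteration(work)
        components.append(vec)
        ratios.append(eigval / total_var)
        # Deflation: remove the component just found before the next search.
        for i in range(len(work)):
            for j in range(len(work)):
                work[i][j] -= eigval * vec[i] * vec[j]
    scores = [[sum(c * v for c, v in zip(comp, row)) for comp in components]
              for row in centered]
    return scores, ratios

# Invented readings: 6 samples x 3 sensors, third column = first + second.
readings = [[1, 2, 3], [2, 1, 3], [3, 4, 7], [4, 3, 7], [5, 6, 11], [6, 5, 11]]
scores, ratios = pca(readings)
print([round(r, 4) for r in ratios], round(sum(ratios), 4))
```

For this toy data the first component carries about 97% of the variance and the two together carry all of it. With real sensor data, the sum of the first two ratios corresponds to the cumulative contribution quoted in this section.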

CNN prediction results
The training set contains 3500 groups of data, and the prediction set contains 2000 groups. Prediction experiments were conducted on data drawn at random from the prediction set. The CNN model without optimization served as the baseline for the optimized CNN model. The results are displayed in Table 4. The accuracy of the CNN model without optimization is 85.19%, and the prediction accuracy of the optimized CNN model is 92.31%. A back-propagation (BP) network was also tested for comparison with both CNN models, but it reached an accuracy of only 61.54%. The results indicate that the optimized CNN model is more suitable for the prediction. Moreover, the optimized model needs a shorter processing time. In field applications, two factors are usually considered in food quality analysis: testing accuracy and testing efficiency. Some models give better prediction accuracy but relatively low testing efficiency, which prevents their wide application. The optimized CNN model offers a compromise between detection time and detection accuracy.
In this work, a tilapia freshness monitoring method using a gas sensor array combined with a machine learning model (CNN) was proposed and validated. The novelty of the method lies in the combination of gas sensor array hardware and CNN software. The gas sensor array generated abundant measurement data, which were divided into a training set and a testing set. The CNN model was trained and optimized, and the optimized model was used to determine tilapia freshness on the testing data. The method offers rapid detection and good accuracy.
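The two criteria discussed in this section, prediction accuracy and testing time, can be measured together for any model. The sketch below is illustrative only: the threshold classifier, the sample values, and the labels are invented, and `evaluate` is a hypothetical helper rather than the evaluation code used for Table 4.

```python
import time

def accuracy_percent(true_labels, predicted_labels):
    """Share of correctly predicted samples, in percent."""
    if not true_labels or len(true_labels) != len(predicted_labels):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)

def evaluate(predict, samples, true_labels):
    """Return (accuracy in percent, elapsed prediction time in seconds)."""
    start = time.perf_counter()
    predicted = [predict(s) for s in samples]
    elapsed = time.perf_counter() - start
    return accuracy_percent(true_labels, predicted), elapsed

# Invented example: a threshold rule stands in for a trained model.
def toy_model(x):
    return 1 if x > 0.5 else 0

samples = [0.1, 0.4, 0.6, 0.9]
labels = [0, 0, 1, 0]
acc, seconds = evaluate(toy_model, samples, labels)
print(f"accuracy = {acc:.2f}%, time = {seconds:.6f} s")
```

Running the same helper on each candidate model and on the same prediction set gives directly comparable accuracy and timing figures.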

Conclusion
In this paper, freshness monitoring of tilapia kept in cold storage using a gas sensor array and a pattern recognition method was studied. The following results were obtained. (i) The TVBN index provided a freshness reference for the tilapia samples. The index increased with storage time, demonstrating that the meat quality decreased continuously during the experiments. (ii) A portable gas sensor array system was designed and fabricated, and its responses to the tilapia samples were recorded. PCA was applied to the sensor array data but could not discriminate all samples from different days. (iii) An optimized CNN model was designed. Its structure had one input layer, four convolutional layers, two pooling layers, one dropout layer, one fully connected layer, and one output layer. The CNN model parameters were determined from experimental results. The prediction accuracy of the optimized CNN was 92.31%. (iv) The method explored in this work offers rapid detection, easy operation, high sensitivity, and high precision, and it is promising for rapid quality analysis of aquatic products.

Acknowledgments
This work was supported by the Science and Technology Research Project of Hangzhou Dianzi University.