Pattern recognition analysis on nutritional profile and chemical composition of edible bird’s nest for its origin and authentication

ABSTRACT Authenticity of food is of great importance to ensure food safety and quality, and to protect consumer rights. A rapid and accurate method for authentication of edible bird’s nest (EBN) is proposed that combines the nutritional profile and chemical composition of EBN with pattern recognition analysis. The authentication of EBN includes identification and classification of EBN by production origin (houses or caves), species origin (Aerodramus fuciphagus or Aerodramus maximus) and geographical origin (Peninsular Malaysia or East Malaysia) based on their active compositional content. Three pattern recognition methods, principal component analysis (PCA), hierarchical cluster analysis (HCA) and linear discriminant analysis (LDA), were employed to develop classification models for authentication of EBN origins. Compared to PCA and HCA, LDA was more accurate and efficient in distinguishing EBN by production, species, and geographical origins, with a classification ability of 100% and a prediction ability of at least 92% as validated by cross-validation. The key chemical markers for production origin differentiation are total phenolic content, zinc, valine, and calcium; for species origin discrimination, sialic acid, serine, phenylalanine and valine; and for geographical origin differentiation, arsenic and mercury. The findings suggest that nutritional and chemical profiles combined with pattern recognition analysis are a promising strategy for rapid authentication of EBN and its products.


Introduction
Composition of food has been used for food classification and identification, as in the Codex Alimentarius. [1] Recently, the classification and identification of specific compounds in food have also led to studies on food origin and authentication. [2][3][4][5] Knowing the origin of food is important for quality control and food traceability. It ensures consumer protection and fair competition in the food industry against fraudulent substitution and adulteration practices. Foods with high economic value are often substituted and/or adulterated with cheaper materials in order to gain higher profit. [6] These fraudulent and adulterated foods can cause adverse health impacts to consumers and lead to economic losses to the food industry. [7] Edible bird's nest (EBN) is a type of food built from the salivary secretion of swiftlets. In Malaysia, EBN is mainly produced by two swiftlet species, namely Aerodramus fuciphagus and Aerodramus maximus. [8] It is one of the common targets for food fraud due to its high economic value and commercial importance. Many studies have reported medicinal and therapeutic effects of EBN, such as rejuvenating skin, inhibiting influenza virus and inflammation, promoting cell proliferation, enhancing bone strength and dermal thickness, reducing tumor production, and treating erectile dysfunction and osteoporosis. [9][10][11][12][13][14][15] Owing to its potential benefits and high price, EBN is increasingly substituted fraudulently with lower quality EBN and adulterated with cheaper ingredients such as Tremella fungus, karaya gum, red seaweed and fried porcine skin, and then marketed at a premium price for greater financial gain. [16] Visual inspection is unlikely to detect these substitutions and adulterants because they look similar to genuine EBN. Thus, for the sake of consumers' health and the industry's economic interests, it is important to establish an accurate and reliable approach to examine EBN.
Physicochemical, proximate, elemental, fatty acid, triacylglycerol, sialic acid, saccharide, peptide, and amino acid data have been used previously to classify EBN by coloration, region, production site, and harvesting season, [16][17][18][19][20][21] but species origin has not been reported except in a recent study by Quek et al. [4] A holistic representation of the characteristics and properties of EBN requires many parameters, including the physicochemical properties, proximate content, antioxidant activities, elemental compounds, amino acids, nitrite, nitrate, and sialic acid, and this often results in a very large sampling size and a very large dataset, which are challenging to analyze and interpret. Conventional univariate data analysis is inadequate for handling large datasets with highly complex food composition. Thus, pattern recognition analysis, also known as multivariate data analysis, is introduced. Pattern recognition analysis is widely used in the agricultural and food domains to interpret large and complex datasets efficiently, [22] and to examine the failure risk of processes and products in a wide range of industries including EBN. [23][24][25][26] It uses mathematical and statistical procedures to obtain useful information from the dataset. [27] Among the pattern recognition methods, the unsupervised algorithms of principal component analysis (PCA) and hierarchical clustering analysis (HCA), and the supervised algorithm of linear discriminant analysis (LDA), are most commonly used and compared. The unsupervised algorithms excel at discovering unexpected patterns in a dataset, while the supervised algorithms excel at classification. [28] PCA was previously employed in analyzing active components (protein and sialic acid) of EBN to classify EBN by color, production sites, and geographical origins. [21,29] In addition, Fourier transform infrared spectroscopy (FTIR) data of EBN were analyzed by LDA with a satisfactory classification ability of 94.12%. [30]
This research aimed to use the pattern recognition analyses of PCA, HCA, and LDA on the nutritional profile and chemical composition of EBN to differentiate EBN by production, swiftlet species and geographical origins. This approach has been used previously to classify olive oils, fish, wine, honey, and dairy products, [31–35] but has not been reported for EBN. The pattern recognition models developed and the chemical markers identified from the nutritional and chemical profiles are intended to provide a rapid and accurate method for identifying EBN origin, helping to ensure that products sold in the market are safe and genuine, to prevent fraud in the bird's nest industry, and to protect consumers from the danger of adulterated products.

Edible bird's nest preparation
Thirteen EBN samples were collected in Malaysia from 2013 to 2014. These samples originated from two different production origins (house and cave), species origins (A. fuciphagus and A. maximus) and geographical origins (Peninsular Malaysia and East Malaysia) (Table 1). The production and geographical origins of the EBN samples were guaranteed by the respective farmers, except for two samples (EBN 12 and 13) that were purchased from local markets. The species origin of the EBN samples was preliminarily confirmed using a molecular technique, DNA sequencing of the cytochrome b gene. [36] The EBN samples were ground into powder using a MFM-202 high-speed grinder (Ta Feng Electrical Appliances Co. Ltd., Taoyuan, Taiwan) at 20,000 rpm and screened through a 1 mm mesh. The ground EBN samples were kept in airtight containers and stored at −20°C until analysis.

Compositional analysis
A total of 45 compositional properties were assessed on each EBN sample, including the physicochemical properties of water activity and color. Water activity was determined using a Fast-lab water activity meter (GBX, Romans-sur-Isere, France). Color was measured using a CR-10 colour reader (Konica Minolta Sensing Inc., Tokyo, Japan) and results were represented as L* for lightness, a* for redness and b* for yellowness. Hue and chroma values were calculated using the following equations, respectively.
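The two equations referred to here are not reproduced in this text. Assuming the conventional CIELAB definitions were used (an assumption, not confirmed by the source), hue angle and chroma follow from a* and b* as:

```latex
h^{\circ} = \arctan\!\left(\frac{b^{*}}{a^{*}}\right),
\qquad
C^{*} = \sqrt{\left(a^{*}\right)^{2} + \left(b^{*}\right)^{2}}
```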
Proximate analysis was conducted using the established AOAC Official Methods. [37] Moisture content, protein, ash, and fat were determined following the AOAC Official Methods 950.46, 981.10, 923.03, and 991.36, respectively. Carbohydrate was determined by difference method using the following equation.
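The equation itself does not appear in this text. The usual by-difference calculation, assumed here to be the one applied, subtracts the four measured proximate fractions (each in percent) from 100:

```latex
\text{Carbohydrate}\ (\%) = 100 - \left(\text{moisture} + \text{protein} + \text{ash} + \text{fat}\right)
```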
Nitrite and nitrate were determined using a Dionex ICS-90 ion chromatography system (Dionex Co., Sunnyvale, USA) coupled with a conductivity detector following the Malaysian Standard MS 2509:2012 (P) [41] and Paydar et al. [42] Sialic acid was determined using a Dionex UltiMate 3000 HPLC system (Dionex Co., Sunnyvale, USA) coupled with a FLD-3400RS fluorescence detector (Dionex Co., Sunnyvale, USA). [43] All analyses were conducted at least in duplicate to ensure reproducibility.

Data analysis
A data matrix of 26 observations (13 EBN samples × 2 replicates) and 45 compositional variables (physicochemical, proximate, antioxidants, elementals, amino acids, nitrite, nitrate, and sialic acid) was used in this study. The EBN replicates were used as observations in order to enlarge the sample size. Because the number of variables is larger than the number of observations, a variable selection step is required. [27] This step removes variables that contain redundant and noisy information from the dataset to minimize overfitting in classification. [44] Variable selection was conducted using analysis of variance (ANOVA) and Pearson correlation analysis. One-way ANOVA was first performed on each variable to determine which variables significantly differentiate EBN between classes. A variable with a P-value below 0.05 was considered statistically significant. Next, the variables with significant P-values were subjected to Pearson correlation analysis to determine how strongly they were correlated. Among highly correlated variables, only the variable with the highest F-ratio value was retained. The F-ratio is the ratio of between-group to within-group variance; it indicates how strongly a variable differentiates between classes and was used to rank the candidate variables. [6] The selected variables then proceeded to pattern recognition analysis.
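The selection procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the Statistica workflow used in the study: the data values, the correlation cut-off `r_max = 0.8`, and the use of a tabulated critical F value in place of a computed P-value are all assumptions made for the example.

```python
from math import sqrt
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F-ratio: between-group over within-group variance."""
    values = [v for g in groups for v in g]
    grand = mean(values)
    k, n = len(groups), len(values)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def select_variables(data, labels, f_min, r_max):
    """Keep variables with F >= f_min, then prune correlated ones by F rank."""
    classes = sorted(set(labels))
    f = {}
    for name, col in data.items():
        groups = [[x for x, lab in zip(col, labels) if lab == c] for c in classes]
        f[name] = f_ratio(groups)
    ranked = sorted((n for n in f if f[n] >= f_min), key=f.get, reverse=True)
    kept = []
    for name in ranked:
        # A variable survives only if it is not highly correlated with any
        # already-kept variable, which by construction has a higher F-ratio.
        if all(abs(pearson(data[name], data[other])) <= r_max for other in kept):
            kept.append(name)
    return kept, f

# Hypothetical toy data: 4 house and 4 cave observations.
labels = ["house"] * 4 + ["cave"] * 4
data = {
    "TPC":      [9.0, 9.5, 10.0, 9.5, 5.0, 5.5, 6.0, 5.5],
    "DPPH":     [8.0, 9.0, 10.0, 9.0, 5.0, 6.0, 7.0, 6.0],
    "zinc":     [1.0, 2.0, 1.0, 2.0, 2.0, 3.0, 2.0, 3.0],
    "moisture": [10.0, 12.0, 11.0, 13.0, 11.0, 13.0, 10.0, 12.0],
}
# 5.99 is the tabulated critical F at P = 0.05 for (1, 6) degrees of freedom.
kept, f = select_variables(data, labels, f_min=5.99, r_max=0.8)
print(kept)
```

In this toy case DPPH passes the ANOVA screen but is pruned because it is highly correlated with TPC, which has the larger F-ratio; moisture is dropped at the ANOVA step.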
Three pattern recognition methods were employed for analyzing the selected variables, in order to classify and differentiate EBN by production origin, species origin, and geographical origin. The three methods were the two unsupervised algorithms PCA and HCA and the supervised algorithm LDA. All compositional data were standardized (autoscaled or unit-variance scaled) prior to pattern recognition analysis. Standardization was performed on each variable individually by subtracting its mean value and then dividing by its standard deviation, so that all variables contribute on an equal scale. Statistical analyses were performed using Statistica software version 10.0 for Windows (StatSoft Inc., Oklahoma, USA).
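As a small illustration of the autoscaling step, the sketch below standardizes two variables that sit on very different scales. The values are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not state which form was used.

```python
from statistics import mean, stdev

def autoscale(column):
    """Subtract the mean and divide by the sample standard deviation."""
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]

calcium = [500.0, 700.0, 900.0]  # hypothetical values on a large scale
valine = [3.0, 4.0, 5.0]         # hypothetical values on a small scale

z_calcium = autoscale(calcium)
z_valine = autoscale(valine)
print(z_calcium, z_valine)
```

After scaling, both variables have zero mean and unit variance, so neither dominates a distance or variance calculation merely because of its units.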

Principal Component Analysis (PCA)
PCA is a commonly used unsupervised pattern recognition method to reduce data dimensionality and discover unsuspected relationships in large datasets. PCA transforms a large number of original variables into a smaller set of new uncorrelated variables, known as principal components. [27] Principal components retain most of the information from the original data in terms of variance. The first principal component (PC1) accounts for the maximum variance of the data and the subsequent principal components (PC2, PC3 . . . PCn) account for the remaining variance in decreasing proportion. The optimal number of principal components to retain was determined based on eigenvalues greater than 1. [45] The first two or three principal components often represent the main structure of the data, while the remaining principal components contain noise or less relevant information. [46] The EBN samples were differentiated based on the correlation matrix and the results were presented on the PCA score plot and loading plot. The score plot portrays the similarities or differences between observations, while the loading plot illustrates the correlations between the principal components and the variables. [47]

Hierarchical Cluster Analysis (HCA)
HCA is an unsupervised pattern recognition method that forms natural groupings of observations. The observations are grouped into clusters using a similarity or distance metric without a priori information about the class memberships. [48] The EBN samples were grouped using a distance metric based on single Euclidean distance with the complete linkage clustering method, and the results were presented on a dendrogram.
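The clustering step can be illustrated with a minimal pure-Python sketch of agglomerative clustering using Euclidean distance and complete linkage, stopped at a chosen linkage distance. The function name `complete_linkage`, the six two-variable observations, and the cut value are hypothetical; the study itself used Statistica.

```python
from math import dist  # Euclidean distance between two points

def complete_linkage(points, cut):
    """Merge clusters until the closest pair is farther apart than `cut`."""
    clusters = [[i] for i in range(len(points))]

    def gap(a, b):
        # Complete linkage: cluster distance is the largest pairwise distance.
        return max(dist(points[i], points[j]) for i in a for j in b)

    while len(clusters) > 1:
        pairs = [(gap(a, b), x, y)
                 for x, a in enumerate(clusters)
                 for y, b in enumerate(clusters) if x < y]
        d, x, y = min(pairs)
        if d > cut:
            break  # equivalent to cutting the dendrogram at this height
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]

# Hypothetical autoscaled observations: indices 0-2 and 3-5 form two groups.
points = [(-1.0, -1.0), (-1.2, -0.8), (-0.9, -1.1),
          (1.0, 1.0), (1.3, 0.9), (0.8, 1.2)]
clusters = complete_linkage(points, cut=1.0)
print(clusters)
```

Lowering `cut` yields more, smaller clusters, which mirrors reading a dendrogram at a lower linkage distance.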

Linear Discriminant Analysis (LDA)
LDA is the most frequently used supervised pattern recognition method that uses variables and observations with prior known information to build a classification model. The model built is cross-validated using a new independent set of observations with prior known information. After cross-validation, the model can be used to estimate the class memberships of unknown samples. [49] LDA is also able to identify marker variables, which contribute to the differentiation between the classes. The LDA model for classification of EBN samples was built using forward stepwise analysis. Variables that contributed most to differentiating EBNs were sorted and included in the model step by step based on the Fisher criterion test (F to enter/remove values). In general, the F-value of a variable refers to its statistical significance in the differentiation between groups. The F to enter value indicates how significant the contribution of a variable must be for it to be added to the model, while the F to remove value determines how insignificant the contribution of a variable must be for it to be removed from the model. The most significant and differentiating variables identified for each origin differentiation were determined with respect to their P-values (P < 0.05). The performance of the LDA model constructed was examined and the results were expressed as classification ability and prediction ability. Classification ability is the capability to assign the observations used to establish the classification rule to the correct category, while prediction ability is the capability to assign new observations with prior known information to the correct category. [47] A 10-fold cross-validation method (internal validation) was performed to validate the model and estimate its predictive ability.
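The distinction between classification ability and prediction ability can be made concrete with a small two-class, two-variable Fisher LDA written in plain Python. This sketch omits the forward stepwise selection used in the study, assumes equal class priors (midpoint threshold), and runs on hypothetical, well-separated data; the function names are illustrative only.

```python
def fit_lda(X, y):
    """Two-class Fisher LDA for two features; returns (weights, threshold)."""
    groups = {c: [x for x, lab in zip(X, y) if lab == c] for c in (0, 1)}
    means = {c: [sum(col) / len(g) for col in zip(*g)] for c, g in groups.items()}
    # Pooled within-class covariance matrix (2 x 2).
    s = [[0.0, 0.0], [0.0, 0.0]]
    for c, g in groups.items():
        for x in g:
            d = [x[0] - means[c][0], x[1] - means[c][1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(X) - 2
    s = [[v / dof for v in row] for row in s]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    diff = [means[1][0] - means[0][0], means[1][1] - means[0][1]]
    # w = S^-1 (m1 - m0), using the closed-form inverse of a 2 x 2 matrix.
    w = [(s[1][1] * diff[0] - s[0][1] * diff[1]) / det,
         (s[0][0] * diff[1] - s[1][0] * diff[0]) / det]
    mid = [(means[0][i] + means[1][i]) / 2 for i in range(2)]
    return w, w[0] * mid[0] + w[1] * mid[1]

def predict(model, x):
    w, threshold = model
    return int(w[0] * x[0] + w[1] * x[1] > threshold)

def accuracy(model, X, y):
    return sum(predict(model, x) == lab for x, lab in zip(X, y)) / len(y)

def cross_validate(X, y, k=10):
    """k-fold CV: each observation is predicted by a model that never saw it."""
    hits = 0
    for fold in range(k):
        test = [i for i in range(len(X)) if i % k == fold]
        train = [i for i in range(len(X)) if i % k != fold]
        model = fit_lda([X[i] for i in train], [y[i] for i in train])
        hits += sum(predict(model, X[i]) == y[i] for i in test)
    return hits / len(X)

# Hypothetical data: class 0 scattered around (1, 1), class 1 around (4, 4).
offsets = [(0.0, 0.2), (0.3, -0.1), (-0.2, 0.1), (0.1, -0.3), (-0.3, 0.0),
           (0.2, 0.3), (-0.1, -0.2), (0.3, 0.2), (-0.2, -0.3), (0.0, 0.1)]
X, y = [], []
for dx, dy in offsets:
    X += [(1.0 + dx, 1.0 + dy), (4.0 + dx, 4.0 + dy)]
    y += [0, 1]

classification_ability = accuracy(fit_lda(X, y), X, y)  # same data as training
prediction_ability = cross_validate(X, y, k=10)         # held-out data only
print(classification_ability, prediction_ability)
```

Classification ability is computed on the observations that built the rule, so it tends to be optimistic; the cross-validated figure is the more conservative estimate of how the model would treat new samples.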
For evaluation of the LDA models, sensitivity, specificity, and accuracy were measured. Sensitivity and specificity are statistical measures of a binary classification test. [50] Sensitivity, also known as the true positive rate, measures how well the test correctly predicts a condition, whereas specificity, also called the true negative rate, measures how well the test correctly predicts the other condition. Sensitivity is the proportion of true positives (TP) among all positive cases, while specificity is the proportion of true negatives (TN) among all negative cases. Accuracy measures how well the test correctly predicts both conditions, that is, the proportion of true results among all results. With FN and FP denoting false negatives and false positives, sensitivity, specificity, and accuracy are calculated using the following equations.
$$\text{Sensitivity} = \frac{TP}{TP + FN}, \qquad \text{Specificity} = \frac{TN}{TN + FP}, \qquad \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
Table 2 summarizes the chemical compositions of the EBN samples, namely physicochemical properties, proximate composition, antioxidant activities, elemental content, amino acid profile, nitrite and nitrate, and sialic acid contents, grouped by production, species and geographical origins. The compositional data of the EBN samples were combined with the pattern recognition analyses PCA, HCA, and LDA to identify the different origins of EBN.
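As a worked illustration of the sensitivity, specificity, and accuracy measures used to evaluate the LDA models, the sketch below computes them from confusion-matrix counts. The counts are hypothetical: the 16/10 class split is assumed for the example, and only the total of 26 observations with two misclassified is chosen to match the scale of this study.

```python
def binary_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)                 # true positive rate
    specificity = tn / (tn + fp)                 # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)   # all correct / all cases
    return sensitivity, specificity, accuracy

# Hypothetical counts: 26 observations, 2 positives predicted as negative.
sens, spec, acc = binary_metrics(tp=14, fn=2, tn=10, fp=0)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

With two errors in 26 observations the accuracy is 24/26, about 92%, which is the order of the prediction ability quoted in the abstract.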

Production origin differentiation
The EBN samples studied were from two different production origins, house and cave. The house EBNs were obtained from man-made buildings or houses, while the cave EBNs were collected from natural limestone caves. A dataset consisting of 22 observations (11 EBN samples × 2 replicates) and 45 compositional variables was used to differentiate the house and cave EBNs. After performing one-way ANOVA, 19 variables with a significant effect (P < 0.05) in differentiating EBN were selected and ranked based on F-ratio value as listed in Table 3 (a). A higher F-ratio value indicates a greater difference between the house and cave EBNs. The Pearson correlation analysis showed that TPC, DPPH, and FRAP were highly correlated (0.82 ≤ r ≤ 0.89). TPC was retained for subsequent analysis as it had the largest F-ratio value. Hue and color a* had a strong correlation (r = −0.83), while nitrite and nitrate had a good correlation (r = 0.59). For the same reason, based on F-ratio value, hue and nitrite were retained. Hence, the 15 remaining compositional variables proceeded to pattern recognition analysis.
PCA revealed that the first four principal components, which together explained 87.39% of the total variance, were able to differentiate EBN samples based on their production origin. PC1 explained 51.11% of the total variance, and the three subsequent principal components explained 16.80%, 11.75% and 7.73%, respectively. Because four-dimensional plots are impractical to display, only the first two principal components were plotted. Fig. 1 shows the plots of PC1 and PC2, which together explained 67.91% of the total variance. The score plot in Fig. 1(a) shows that the house and cave EBNs were separated distinctly into two groups, as indicated by the two ellipses. The house EBN samples were grouped more tightly than the cave EBN samples, implying a higher degree of similarity within the house EBN. On the loading plot, variables that contributed to each of the principal components were identified (Fig. 1(b)). Generally, the closer a variable lies to the unit circle, the higher its contribution to the principal components. [47] Variables 1, 2, 5, 6, 11, 12, 13, and 15 (color L*, hue, TPC, mercury, valine, isoleucine, glutamic acid and sialic acid), and variables 4, 8, 9, 10, and 14 (ash, calcium, magnesium, zinc, and nitrite) had the highest negative and positive loadings on PC1, respectively. The results indicated that these variables have higher values in the observations with the most negative scores (house EBN samples) and the most positive scores (cave EBN samples) on PC1, respectively. Thus, it could be deduced that the house EBN is whiter, less yellowish in color and has higher concentrations of TPC, mercury, valine, isoleucine, glutamic acid, and sialic acid. The ash, calcium, magnesium, and nitrite contents were higher in the cave EBN. Variables 3 and 7 (carbohydrate and cadmium) had the highest negative and positive loadings on PC2. These results indicated that the variables on PC1 were responsible for the separation between the house and cave EBNs, while carbohydrate and cadmium accounted for the variability within the house or cave EBN. In addition, PCA was also conducted using all 45 compositional variables without the variable selection step. As expected, a poor separation between the house and cave EBNs was obtained. This is probably due to the incorporation of less relevant variables, which did not provide useful information in differentiating EBN and introduced noise to the PCA model, hence reducing the accuracy of the results.
Fig. 2 shows the dendrogram from the HCA performed to confirm the clustering of EBN samples by production origin. Cutting the dendrogram at a linkage distance of 6.2 gave three clusters. The first cluster consisted of house EBN samples, while the other two clusters contained cave EBN samples. This result was in agreement with the results obtained in PCA.
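The number of principal components retained for production origin can be cross-checked against the eigenvalue-greater-than-1 rule. For PCA on a correlation matrix the eigenvalues sum to the number of variables, so each eigenvalue equals the variance fraction times that number. The sketch below applies this identity to the reported percentages; it assumes the PCA was run on the correlation matrix of the 15 selected variables, as the methods indicate, and the helper name is illustrative.

```python
def eigenvalues_from_variance(percent_explained, n_variables):
    """Recover eigenvalues from explained-variance percentages.

    Valid for PCA on a correlation matrix, whose eigenvalues sum to the
    number of variables.
    """
    return [p / 100 * n_variables for p in percent_explained]

reported = [51.11, 16.80, 11.75, 7.73]  # PC1-PC4 for production origin
production = eigenvalues_from_variance(reported, n_variables=15)

cumulative_two = reported[0] + reported[1]  # variance shown in the PC1-PC2 plots
cumulative_four = sum(reported)             # variance in all retained components
print([round(e, 2) for e in production], cumulative_two, cumulative_four)
```

All four implied eigenvalues exceed 1 (the smallest is about 1.16), which is consistent with retaining four components, and the sums reproduce the 67.91% and 87.39% quoted in the text.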
LDA was employed to build a model with high percentage of correct classification based on EBN production origin. A classification model was developed with 9 compositional variables using forward stepwise analysis. Variables that contributed in differentiating the house and cave EBN samples were, in descending order, TPC, zinc, valine, calcium, isoleucine, cadmium, magnesium, carbohydrate, and ash. The first four variables were the most significant and differentiating variables. The LDA model developed achieved 100% classification ability, where all observations were correctly classified. It can be concluded that the 9 variables were evidently significant in classifying EBNs following their production origin. A 10-fold cross-validation was performed to examine the reliability of this LDA model. All observations were correctly predicted with an excellent prediction ability of 100% in cross-validation. These results suggested that the LDA model developed was reliable and valid to be used for classification of EBNs based on their production origin. The results of classification and prediction abilities including sensitivity, specificity, and accuracy of the LDA model are shown in Table 4.
Although TPC, zinc, valine, and calcium contents were identified here as responsible for EBN classification based on production origin, Seow et al. [51] reported glutamic acid and tyrosine as the differentiating markers between house and cave nests. This may be due to the use of the entire amino acid profile for analysis in Seow et al.'s study. Inclusion of amino acids that do not contain relevant information for the differentiation of EBN could have introduced noise to the analysis and thus affected the accuracy of the results. The classification rate achieved in this study is believed to be better than that of Seow et al.'s study because more comprehensive compositional data, involving more differentiating variables, were used for the analysis.

Species origin differentiation
The EBN samples were differentiated by swiftlet species origin, A. fuciphagus and A. maximus, commonly known by locals in Malaysia as the white-nest swiftlet and black-nest swiftlet, respectively. One-way ANOVA and Pearson correlation analysis were applied to the dataset containing 26 observations (13 EBN samples × 2 replicates) and 45 compositional variables to eliminate less relevant and redundant variables. Twenty-six variables with a significant effect (P < 0.05) were selected and listed in Table 3 (b). The Pearson correlation analysis showed strong correlations between TPC, DPPH and FRAP (0.77 ≤ r ≤ 0.84), and between color L*, color b* and chroma (0.58 ≤ r ≤ 1.00). Only TPC and color L* were retained for subsequent analysis because they had larger F-ratio values. A good correlation was also obtained between nitrite and nitrate (r = 0.62). Similarly, nitrite, which had the larger F-ratio value, was retained. The 21 remaining compositional variables were then used for pattern recognition analysis.
For PCA, the first five principal components, which accounted for 90.73% of the total variance, were used to differentiate EBN samples by species origin. PC1 accounted for 47.26% of the total variance, and the following principal components accounted for 19.50%, 11.59%, 6.74%, and 5.64%, respectively. The plots of PC1 and PC2, which together accounted for 66.76% of the total variance, are shown in Fig. 3. The score plot shows that EBN samples produced by A. fuciphagus and A. maximus overlapped slightly, as illustrated by the ellipses (Fig. 3 (a)). This indicated that EBNs from the two swiftlet species were not well differentiated. From the loading plot in Fig. 3 (b), variables 1, 4, 9, 10, 12, 15, 18, and 21 (color L*, TPC, threonine, leucine, valine, serine, tyrosine and sialic acid), and variables 3, 6 and 20 (ash, calcium and nitrite) showed the highest negative and positive loadings on PC1, respectively. The results implied that these variables have higher values in the observations with the most negative scores (A. fuciphagus) and the most positive scores (A. maximus) on PC1, respectively. Thus, it could be derived that EBN produced by A. fuciphagus was whiter in appearance and had higher concentrations of TPC, threonine, leucine, valine, serine, tyrosine, and sialic acid, while EBN produced by A. maximus had higher ash, calcium, and nitrite contents.
HCA was performed to affirm the clustering of EBN samples by species origin. The HCA result presented in a dendrogram (Fig. 4) shows two predominant clusters, A. fuciphagus and A. maximus, obtained by cutting at a linkage distance of 8.4. Two observations from EBN sample 6, which was produced by A. fuciphagus, were incorrectly clustered with A. maximus. This result was consistent with the results attained in PCA.
A forward stepwise LDA was used to construct a model with a high correct classification rate based on EBN species origin. A classification model with 100% classification ability was developed using 9 compositional variables. Variables that contributed most to differentiating EBN samples from A. fuciphagus and A. maximus were sialic acid, serine, phenylalanine and valine, followed by calcium, color L*, mercury, ash, and zinc, in descending order. The reliability of the LDA model constructed was further examined using a 10-fold cross-validation. The LDA model obtained a satisfactory prediction ability of 92%. Out of the 26 observations, two observations (the two replicates of EBN sample 6) originating from A. fuciphagus were misclassified as A. maximus. This could be explained by the close relationship between A. fuciphagus and A. maximus, which belong to the same family (Apodidae) and genus (Aerodramus). Table 4 shows the abilities of the LDA model in classifying and predicting EBN by species origin.
Sialic acid, serine, phenylalanine, and valine had the most significant impact on the differentiation of A. fuciphagus and A. maximus. Sialic acid is produced from the salivary glands of swiftlets and the amino acids are coded by nucleotides containing information unique to every swiftlet. These variables are therefore closely linked to the genetic information of swiftlets. Thus, it is suggested that genetic factors could be a promising alternative tool to differentiate EBN from A. fuciphagus and A. maximus.

Geographical origin differentiation
The EBN samples studied were from two geographical origins, Peninsular Malaysia and East Malaysia (Sabah and Sarawak). A dataset containing 22 observations (11 EBN samples × 2 replicates) and 45 compositional variables was used to differentiate EBN samples from Peninsular and East Malaysia. Fewer observations were used in this classification than the 26 observations used in the species origin classification, because EBN samples of unknown geographical origin were excluded.
Fifteen variables having a significant effect at P < 0.05 in differentiating EBN samples by geographical origin were selected using one-way ANOVA and ranked based on F-ratio value as shown in Table 3 (c). The Pearson correlation analysis revealed that TPC, DPPH and FRAP (0.82 ≤ r ≤ 0.89), and hue and color a* (r = −0.83) had strong correlations. Nitrite was also positively correlated with nitrate at r = 0.59. Following the rule of retaining the variable with the higher F-ratio value, TPC, hue, and nitrite were retained for the subsequent analysis. The 11 remaining compositional variables were used for pattern recognition analysis.
PCA showed that the first two principal components, which together explained 71.37% of the total variance, were able to differentiate EBN samples by geographical origin. PC1 and PC2 explained 56.80% and 14.57% of the total variance, respectively. Fig. 5 illustrates the PCA plots of PC1 and PC2. The score plot in Fig. 5 (a) shows a distinct separation between the Peninsular Malaysia and East Malaysia EBN samples. The EBN samples from Peninsular Malaysia were more closely grouped together than those from East Malaysia, implying higher similarity within the Peninsular Malaysia EBN. The loading plot in Fig. 5 (b) shows the contribution of variables with respect to each principal component. For PC1, variables 3, 8, 9, and 10 (ash, calcium, magnesium, and nitrite), and variables 1, 2, 5, 6, and 11 (color L*, hue, TPC, mercury, and sialic acid) had the highest negative and positive loadings, respectively. These variables were used to characterize the Peninsular Malaysia and East Malaysia EBN samples.
HCA was conducted to confirm the clustering of EBN samples by geographical origin. Fig. 6 shows the dendrogram obtained by HCA. At a linkage distance of 6.3, two clusters were obtained. The first cluster was predominantly Peninsular Malaysia EBN samples, while the second cluster contained East Malaysia EBN samples.
A classification model was developed using forward stepwise LDA to classify EBN samples based on geographical origin. Eight compositional variables were used in developing the LDA model, which achieved an excellent classification ability of 100%. Among these variables, arsenic and mercury contributed most significantly to the geographical origin differentiation, followed by color L*, nitrite, TPC, magnesium, hue, and ash. The significant variations in arsenic and mercury (P < 0.05) between geographical origins were probably linked to environmental conditions. Peninsular Malaysia has a high concentration of industrial activities that are expanding rapidly, for instance around the Straits of Malacca, one of the busiest shipping lanes in the world. [52,53] These activities emit toxic metal pollutants that contaminate the environment and ecosystem, including the forage of the swiftlets. Swiftlets are aerial insectivores that usually forage for insects around their habitats. [54] Swiftlets from Peninsular Malaysia mostly forage around the industrial areas, thus their diets are often enriched with toxic metal pollutants. As expected, the EBNs that they produce are higher in toxic metal contents. The diets of swiftlets may have a significant effect on the marker variables and influence the differentiation of EBN by geographical origin. This result is consistent with the findings reported by Chua et al., [55] who stated that the swiftlet's diet significantly affected the variables that differentiate EBN from different countries.
A 10-fold cross-validation was employed to examine the reliability of the LDA model in terms of its prediction ability. All 22 observations were correctly predicted, giving a prediction ability of 100% in cross-validation. Given its high reliability and validity, this LDA model is recommended for classifying EBN by geographical origin. Table 4 presents the results of classification and prediction abilities of the LDA model.

Conclusion
Good classifications of EBN were demonstrated by the PCA, HCA, and LDA models, with good agreement and consistency among them. Comparison of the three models indicated that the supervised stepwise LDA model achieved a more effective classification of EBN with regard to reliability, time, and cost of analysis. The LDA models required only 8 to 9 variables, making them more time-saving and cost-effective than the PCA and HCA models in identifying EBN origins. The LDA models were highly reliable and valid for EBN identification, with 100% classification abilities and at least 92% prediction abilities achieved. TPC, zinc, valine, and calcium were identified as key markers to differentiate EBN by production origin; sialic acid, serine, phenylalanine and valine by species origin; and arsenic and mercury by geographical origin. It is suggested that nutritional and chemical profiles coupled with pattern recognition analysis are a viable approach to rapidly determine EBN origins for food safety, quality control, traceability, and authenticity purposes.

Highlights
• Rapid and accurate authentication method of edible bird's nest (EBN) origins
• Pattern recognition analysis based on nutritional profile and chemical composition
• EBN can be distinguished by production, species, and geographical origins
• Key discriminating parameters are antioxidant, sialic acid, amino acid, and elemental contents
• Developed LDA model is accurate and reliable with excellent classification ability
• First report to identify swiftlet species of EBN using chemical pattern recognition