Detection of Fire Blight disease in pear trees by hyperspectral data

ABSTRACT Rapid and early detection of Fire Blight, the most destructive bacterial disease of apple and pear trees, is very important to avoid product loss. The objective of this research was to evaluate the usefulness of visible and near-infrared spectrometry for early detection of Fire Blight. Three kinds of samples were selected: healthy leaves (H) from healthy trees, and symptomatic (S) and non-symptomatic diseased (MS) leaves from infected trees. For spectral analysis, different preprocessing and processing techniques were carried out. Linear discriminant analysis, quadratic discriminant analysis, Mahalanobis discriminant analysis, soft independent modeling of class analogy (SIMCA) and partial least squares-discriminant analysis were applied as classification techniques. A laboratory test by the selective culture method was used to detect the bacteria. Based on these analyses, wavelengths for detection of H, MS and S leaves were obtained. SIMCA proved to be the strongest of all classifiers for discriminating healthy leaves from diseased leaves. The results indicated that the structure insensitive pigment index and the modified simple ratio were sensitive enough to discriminate H–S, H–MS and S–MS leaves. The renormalized difference vegetation index showed potential to classify H–S and S–MS samples. The anthocyanin reflectance index showed potential to discriminate H–MS samples. Finally, modified triangular vegetation index1 and modified chlorophyll absorption ratio index1 were identified as spectral indices to discriminate S–MS samples. Based on these results, this technique is reliable for detecting non-symptomatic diseased leaves and is capable of early detection of Fire Blight before it spreads.


Introduction
Pome fruits such as apple, with production of 14.2 ton/ha, and pear, with production of 12.6 ton/ha, are the most important horticultural export products of Iran. Fire Blight is one of the most destructive bacterial diseases of apple (Malus domestica) and pear (Pyrus communis) trees. The causal agent is the necrogenic Gram-negative bacterium Erwinia amylovora (Ea). This pathogen enters the tree through natural openings such as nectarthodes or through wounds on succulent aerial parts. Once inside the susceptible host plant, the bacteria multiply mainly in the apoplast of parenchyma cells and colonize actively growing shoots, inducing progressive necrosis of the infected plant tissues. In resistant host plants or in nonhost plants, the bacteria cause local cell death (a hypersensitive-like reaction) and are unable to colonize the plant tissue further (Gaucher, Bernonville, Guyot, Dat, & Brisset, 2013). The bacterium inhibits nutrient flow to the leaves, which gradually causes their death (Gaucher et al., 2013). In this situation, leaf color changes to yellow-brown because of structural disruption. To avoid the spread of the disease, detection of infected trees and attempts to prevent infection with one to several protective sprays are recommended. Antibiotics offer good to excellent efficacy at low material cost, with little risk of reducing the appearance quality of the developing fruit. Acibenzolar-S-methyl has recently been integrated with antibiotics to protect pear and apple from Fire Blight (Johnson et al., 2016). Besides antibiotics, other ways of reducing or stopping Fire Blight are pruning infected branches, reducing nitrogen fertilization, controlling insects and, finally, removing trees. Early detection of Fire Blight is therefore important and leads to lower cost and less crop damage. Rapid identification and chemical control of the disease will reduce the spread of Fire Blight in pome fruit trees.
Currently, scouting orchards is the most widely used method for Fire Blight detection, but it is time consuming, laborious, often subjective and prone to errors. Thus, there is a need for accurate and real-time sensing technologies to improve plant disease detection (Futch, Weingarten, & Irey, 2009).
Several advanced techniques have been developed for plant disease detection and monitoring, including molecular techniques such as enzyme-linked immunosorbent assay (ELISA) and polymerase chain reaction (PCR) (López et al., 2003; Sankaran, Mishra, Ehsani, & Davis, 2010), electronic nose systems (Sankaran et al., 2010; Zhang, Shen, Chen, Xiao, & Bao, 2008) and remote sensing systems, which include visible (Vis) and infrared spectrometry, infrared thermography, fluorescence spectrometry and X-ray imaging (Sankaran et al., 2010). All these techniques have specific merits in detecting crop diseases and the potential to be incorporated into disease diagnostic systems. However, based upon practical concerns, spectrometry in the Vis and near-infrared (NIR) range can potentially be used for plant disease and plant stress detection (Sankaran et al., 2010).
Differences in the spectral reflectance of healthy and diseased plants in the Vis and infrared regions of the electromagnetic spectrum can be used as an indication of plant disease (West et al., 2003). Reflectance spectra of vegetation, measured in the Vis and infrared regions, contain information on plant pigment concentration, leaf cellular structure and leaf moisture content (Borengasser, Gottwald, & Riley, 2001). Because Vis and NIR spectrometry is an accurate tool for plant status monitoring, it has been implemented in a wide variety of decision support systems in agriculture for both ground and aerial remote sensing (Berni, Zarco-Tejada, Suarez, & Fererez, 2009; Zarco-Tejada, Gonzalez-Dugo, & Berni, 2012). Various researchers (Delalieux, Van Aardt, Keulemans, Schrevens, & Coppin, 2007; Purcell, O'Shea, Johnson, & Kokot, 2009; Spinelli, Noferini, & Costa, 2006; Yang, Cheng, & Chen, 2007) have used spectral reflectance-based techniques for disease detection in plants. Vis and NIR spectrometry has also been used extensively to detect plant anomalies (Delalieux et al., 2007; Kobayashi, Kanda, Kitada, Ishiguro, & Torigoe, 2001; Pontius, Hallett, & Martin, 2005; Sankaran et al., 2010; Wu, Feng, Zhang, & He, 2008). For example, Malthus and Madeira (1993) studied the spectral reflectance of field bean leaves infected with the fungus Botrytis fabae. They found that the blue (470-500 nm) and red (590-700 nm) regions were positively correlated with infection. They further reported a decrease in NIR reflectance around 800 nm due to infection. Kobayashi et al. (2001) used a multispectral radiometer and an airborne multispectral scanner to identify panicle blast in rice. They concluded that the reflectance ratios R470 nm/R570 nm, R520 nm/R675 nm and R570 nm/R675 nm decreased with panicle blast progression.
Muhammed and Larsolle (2003) concluded that the fungus Drechslera tritici-repentis mainly affected the spectral signature by a flattening of the green reflectance peak together with a general decrease in reflectance in the NIR region, a decrease at the shoulder of the NIR reflectance plateau and a general increase in the Vis region between 550 and 750 nm. NIR spectrometry was also applied to predict previsual decline in eastern hemlock trees (Pontius et al., 2005). Graeff, Link, and Claupein (2006) studied powdery mildew infection in wheat under greenhouse conditions by means of leaf reflectance measurements. Huang et al. (2007) found that the photochemical reflectance index could be used to track the physiological changes occurring in winter wheat plants infected by yellow rust. The reflectance around 580 and 610 nm had a significant response to yellow rust at the canopy level. Wu et al. (2008) used Vis-NIR reflectance spectroscopy (325-1075 nm) for early detection of Botrytis cinerea disease in eggplant leaves at symptomatic stages. They achieved an accuracy rate of 85% in predicting fungal infections. Naidu, Perry, Pierce, and Mekuria (2009) applied Vis and infrared spectrometry (350-2500 nm) for detecting grapevine leafroll disease. Their results suggested that the green (530-595 nm) and red (710-750 nm) bands have potential to discriminate diseased leaves from healthy leaves. Balasundaram, Burks, Bulanon, Schubert, and Lee (2009) recommended spectral regions between 500 and 800 nm for successful detection of canker in citrus peel. De Castro, Ehsani, Ploetz, Crane, and Abdulridha (2015) used spectral imaging for early detection of laurel wilt (LW) disease in avocado. They obtained wavelengths of 580 nm, 650 nm, 740 nm, 750 nm, 760 nm and 850 nm for LW disease detection.
Previous research shows that Vis and NIR spectrometry is a powerful method for detection of plant disease. Despite considerable Fire Blight damage, no research has been carried out on early detection of this harmful disease. The main objective of this research was therefore to investigate the capability of Vis and NIR spectrometry to detect Fire Blight at an early stage, before it spreads. In addition, wavelengths specific to Fire Blight detection and vegetation indices for discriminating healthy and diseased leaves were determined. The results of this research will be used to improve early Fire Blight detection through remote sensors.

Data collection
The experiments were conducted in a 5-ha pear orchard in Damavand, Tehran Province, Iran, in May 2016 (35°39ʹ59.8" N, 52°4ʹ42.3" E). Leaves from healthy and diseased trees were collected (Figure 1) between 10:00 and 14:00 (Tehran local time). Healthy and diseased trees were selected by expert decision. To confirm the accuracy of the expert detection, leaf sample units were tested by the selective culture test in a laboratory.
A total of 106 leaf sample units were collected for laboratory and spectral measurement, including 34 healthy leaves (H), 50 symptomatic diseased leaves (S) of varied severity and 22 non-symptomatic diseased leaves (MS). Sample units were packed in separate labeled plastic bags, kept in a cool and dark box and immediately transported to a nearby indoor laboratory for spectral measurements. Spectral reflectance data were collected under controlled laboratory conditions and restricted to the spectral range of 380-1000 nm because the spectra were noisy at the extremes. All leaves were then tested in a laboratory by the selective culture method to confirm the identity of E. amylovora in each sample. In this laboratory test, all samples were washed in tap water. Infected tissues were surface sterilized by immersion in 10% household bleach for 3 min and rinsed twice in sterile distilled water for a few minutes. Each leaf sample was macerated in a few drops of sterile distilled water in a sterile glass Petri dish using a sterile scalpel and forceps. Thirty minutes after maceration, 30 μL of macerated tissue was streaked onto King's B medium agar. The plates were then incubated at 27°C for 2-3 days and observed daily for bacterial growth. Suspected colonies of E. amylovora (white, circular, mucoid and curved) were selected and further purified on King's B medium agar at 27°C (King, Ward, & Raney, 1954).

Spectral measurements
A portable high-resolution fiber-optic spectrometer (Avaspec-ULS3648, the Netherlands) was used to collect spectral reflectance data from leaves in the range of 200-1100 nm with a resolution of 0.05-20 nm under laboratory conditions. Spectral measurements were taken five times for each sample. Data for H, S and MS leaves were collected and recorded. Spectral data for H, S and MS leaf samples were analyzed with Unscrambler software (version X10; CAMO Software, Oslo, Norway). Unscrambler is a software package for multivariate analysis, preprocessing and post-processing of spectra.

Preprocessing methods of spectral data
When spectral data are collected, there are always some undesired systematic variations, which are primarily caused by light scattering and differences in spectroscopic path length. These unwanted variations often constitute the major part of the total variation in the sample set and can be observed as shifts in baseline (additive effects), changes in slope (multiplicative effects) and nonlinearities. To remove these systematic variations and noise and to correct the spectra, some type of pretreatment is commonly applied before calibration modeling (Wold, Antti, Lindgren, & Ohman, 1998). In this research, the spectrometer measured the reflectance of each leaf five times; the five spectra were then averaged to give one reflectance spectrum per leaf before preprocessing.
The averaged reflectance spectra were preprocessed by normalization, detrending (DT), standard normal variate (SNV), multiplicative scatter correction (MSC) and derivatives. In this research, area normalization was used to produce an approximately common scaling for all data. This transformation normalizes an observation (i.e. spectrum, chromatogram) by calculating the area under the curve of the spectrum and dividing by it. Derivatives were computed with a Savitzky-Golay filter, whose coefficients were estimated by an unweighted linear least-squares fit of a polynomial model. In this study, a window length (frame size) of 9 was used, with polynomial orders of 2 and 4 for the first and second derivatives, respectively. Higher derivatives were not used because they remove basic features of the spectra. MSC was also applied; it is a transformation method used to compensate for additive and/or multiplicative effects in spectral data (Mohamadi Monavar et al., 2013).
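The pretreatments named above can be illustrated with a minimal NumPy sketch. It is not the Unscrambler implementation used in this study: the function names are hypothetical, the area is taken with a simple trapezoidal rule, MSC uses the mean spectrum as its reference, and the Savitzky-Golay step returns only the interior points where a full window fits. These choices are assumptions made for illustration.

```python
import numpy as np
from math import factorial


def area_normalize(X, wavelengths):
    """Divide each spectrum (row of X) by its trapezoidal area (assumed rule)."""
    dw = np.diff(wavelengths)
    area = np.sum(0.5 * (np.abs(X[:, 1:]) + np.abs(X[:, :-1])) * dw, axis=1)
    return X / area[:, None]


def snv(X):
    """Standard normal variate: center and scale each spectrum individually."""
    mean = X.mean(axis=1, keepdims=True)
    std = X.std(axis=1, ddof=1, keepdims=True)
    return (X - mean) / std


def msc(X, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each spectrum is regressed on the reference (x = intercept + slope * ref)
    and corrected as (x - intercept) / slope. The mean spectrum is the
    default reference, which is an assumption of this sketch.
    """
    ref = X.mean(axis=0) if reference is None else reference
    out = np.empty_like(X, dtype=float)
    for i, x in enumerate(X):
        slope, intercept = np.polyfit(ref, x, 1)
        out[i] = (x - intercept) / slope
    return out


def savgol_derivative(x, window=9, polyorder=2, deriv=1, delta=1.0):
    """Savitzky-Golay derivative of a 1-D spectrum (interior points only).

    The filter coefficients come from an unweighted least-squares polynomial
    fit over the window; `delta` is the spacing between wavelengths.
    """
    half = window // 2
    z = np.arange(-half, half + 1)
    A = np.vander(z, polyorder + 1, increasing=True)
    coeffs = np.linalg.pinv(A)[deriv] * factorial(deriv) / delta**deriv
    return np.correlate(x, coeffs, mode="valid")
```

With the settings reported in the text, the first derivative would correspond to `savgol_derivative(x, window=9, polyorder=2, deriv=1)` and the second derivative to `savgol_derivative(x, window=9, polyorder=4, deriv=2)`.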

Spectral data analysis
Spectral data sets were analyzed using principal component analysis (PCA). PCA is a latent variable method that reduces a data set to orthogonal components, which represent most of the variability in the original data and contain a reduced amount of random measurement noise. Statistical parameters commonly used for spectroscopic techniques were used to evaluate the ability of the Vis-NIR spectroscopy method to classify samples.
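As an illustration of the PCA step, the following sketch computes PC scores from a matrix of spectra and keeps the smallest number of components whose cumulative explained variance reaches a target. The 0.95 default mirrors the 95% figure reported for the classifier inputs; the function name and the SVD-based implementation are assumptions of this sketch, not the software routine used in the study.

```python
import numpy as np


def pca_scores(X, variance_target=0.95):
    """Return PC scores, loadings and explained-variance ratios.

    X has one spectrum per row. Components are kept until their cumulative
    explained variance first reaches `variance_target`.
    """
    Xc = X - X.mean(axis=0)                      # mean-center each wavelength
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = s**2 / np.sum(s**2)              # variance ratio per component
    k = int(np.searchsorted(np.cumsum(explained), variance_target)) + 1
    k = min(k, len(s))
    scores = U[:, :k] * s[:k]                    # orthogonal score vectors
    return scores, Vt[:k], explained[:k]
```

The returned scores are mutually orthogonal, so they can be passed to a discriminant classifier without the collinearity present in raw spectra.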
The PC scores that accounted for 95% of the variance in the original feature data sets were used as input to the classifiers, and the samples were randomly divided into calibration and validation data sets. Linear discriminant analysis (LDA), quadratic discriminant analysis (QDA), Mahalanobis discriminant analysis, soft independent modeling of class analogy (SIMCA) and finally partial least squares-discriminant analysis (PLS-DA) were applied as classification techniques. LDA is a classification model that generates a linear transformation of n-dimensional samples, while QDA develops a quadratic model for classification of unknown (unclassified) sample units (validation). SIMCA is based on constructing a PCA model for each class in a defined training set. Unknown samples are then compared to the class models and assigned to classes according to their proximity to the training samples. For example, the discriminant plot (Figure 2) illustrates the difference in class separation based on LDA, QDA and Mahalanobis using seven PCs derived from the normalized spectra. In Figure 2 every sample is displayed, color-coded by class, and the axes are for two of the classes in the model. Sample units lying close to zero for a class are associated with that class. Finally, the model was validated using cross validation and a test set. Cross validation is a model evaluation method in which some of the data are withheld before training and then used to test the fitted model. Although full cross validation is more time consuming than other methods, its estimate of the residual variance is more reliable. Leave-one-out (LOO) cross validation is K-fold cross validation taken to its logical extreme, with K equal to N, the number of data points in the set. LOO fits the model with N-1 observations and classifies the one observation left out. This process is repeated N times in total, each time with a different observation left out.
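The leave-one-out procedure can be sketched as follows. The classifier here is a simple stand-in, assumed for illustration only: each sample is assigned to the class whose mean is nearest in Mahalanobis distance under a pooled within-class covariance, which corresponds to LDA with equal priors. It is not claimed to reproduce the exact LDA, QDA or Mahalanobis routines of the software used in this study, and all function names are hypothetical.

```python
import numpy as np


def fit_pooled_mahalanobis(X, y):
    """Class means plus the inverse of the pooled within-class covariance."""
    classes, inverse = np.unique(y, return_inverse=True)
    means = np.array([X[inverse == k].mean(axis=0) for k in range(len(classes))])
    resid = X - means[inverse]
    cov = resid.T @ resid / (len(X) - len(classes))
    return classes, means, np.linalg.pinv(cov)


def predict(model, X):
    """Assign each row of X to the class with the smallest squared distance."""
    classes, means, precision = model
    diff = X[:, None, :] - means[None, :, :]
    d2 = np.einsum("nkd,de,nke->nk", diff, precision, diff)
    return classes[np.argmin(d2, axis=1)]


def leave_one_out_accuracy(X, y):
    """Fit on N-1 samples, classify the held-out one, repeat N times."""
    n = len(X)
    hits = 0
    for i in range(n):
        keep = np.arange(n) != i
        model = fit_pooled_mahalanobis(X[keep], y[keep])
        hits += int(predict(model, X[i:i + 1])[0] == y[i])
    return hits / n
```

Here `X` would hold the PC scores (one row per leaf) and `y` the class labels such as "H", "S" and "MS".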
Each pattern recognition procedure was run for every pretreatment method. To improve reliability, the classification was executed three times, with 21 PCs used as input features for each algorithm. In this study, all samples were divided into calibration (84 sample units) and validation (22 sample units) sets based on the Kennard-Stone algorithm (Kennard & Stone, 1969). Calibration samples were used to establish an estimation model, whereas validation samples were used to assess the model's predictive ability.
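A compact sketch of a Kennard-Stone style split is given below. It starts from the two most distant samples and then repeatedly adds the sample that lies farthest from its nearest already-selected neighbor. Euclidean distance on the feature matrix is an assumption of this sketch, as is the function name; the paper does not state which distance or feature space was used.

```python
import numpy as np


def kennard_stone(X, n_select):
    """Return (selected, remaining) row indices of X.

    `selected` holds n_select indices chosen to cover the feature space
    uniformly; `remaining` holds the indices left for validation.
    """
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]                  # the two most distant samples
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_select:
        # distance from every candidate to its nearest selected sample
        nearest = dist[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(nearest))]
        selected.append(pick)
        remaining.remove(pick)
    return np.array(selected), np.array(remaining)
```

For the split sizes reported here, `kennard_stone(scores, 84)` would return 84 calibration indices and leave 22 for validation, where `scores` is a hypothetical 106-row matrix of PC scores.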

Vegetation indices
The spectral features adopted in this research included vegetation indices shown in previous studies to be sensitive to variations in leaf pigments and health (Table 1). After calculation of the vegetation indices, analysis of variance (ANOVA) was performed for each index using SAS software 9.4 (Statistical Analysis System) at the 5% level of significance to identify spectral indices sensitive to Fire Blight disease and to discriminate healthy and diseased leaves. Figure 3 shows the spectra of H, S and MS leaf samples treated with the first derivative of the Savitzky-Golay algorithm. Relative to the spectra of symptomatic and non-symptomatic diseased leaves, the spectra of healthy leaves diverged and shifted to the left. Healthy leaves mostly fluctuated at 469, 493, 541 and 672 nm in the Vis region and at 738 and 755 nm in the NIR region, while MS samples corresponded to 490, 526, 552 and 680 nm in the Vis region and to some small peaks at 749, 763, 868 and 951 nm in the NIR region, which could be due to changes in foliar internal structure. The reflectance difference of diseased leaves was distinguishable in the red edge.
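As a hedged illustration of how such indices are computed from a reflectance spectrum, the sketch below evaluates four of them. The ARI expression follows the fragment of Table 1 that survives in the text, (R550)^-1 - (R700)^-1. The RDVI, MTVI1 and MCARI1 expressions are the forms commonly published for these indices and are assumed, not confirmed, to match Table 1. Picking the band nearest to each nominal wavelength is also an assumption, and the function names are hypothetical.

```python
import numpy as np


def reflectance_at(wavelengths, spectrum, target_nm):
    """Reflectance at the measured band closest to target_nm."""
    return float(spectrum[np.argmin(np.abs(wavelengths - target_nm))])


def vegetation_indices(wavelengths, spectrum):
    """Compute ARI, RDVI, MTVI1 and MCARI1 for one reflectance spectrum."""
    r550 = reflectance_at(wavelengths, spectrum, 550)
    r670 = reflectance_at(wavelengths, spectrum, 670)
    r700 = reflectance_at(wavelengths, spectrum, 700)
    r800 = reflectance_at(wavelengths, spectrum, 800)
    return {
        # anthocyanin reflectance index, as in the surviving Table 1 fragment
        "ARI": 1.0 / r550 - 1.0 / r700,
        # commonly published forms (assumed to match Table 1)
        "RDVI": (r800 - r670) / np.sqrt(r800 + r670),
        "MTVI1": 1.2 * (1.2 * (r800 - r550) - 2.5 * (r670 - r550)),
        "MCARI1": 1.2 * (2.5 * (r800 - r670) - 1.3 * (r800 - r550)),
    }
```

Each index value would then form one observation per leaf for the ANOVA comparison between the H, S and MS classes.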

Feature extraction
Healthy-looking diseased leaves were spectrally more similar to diseased leaves, with the peaks of MS samples shifted 8-10 nm to the right compared with the H samples (Figure 3). For instance, the first derivative reflectance includes leaf pigment peaks at 469 and 541 nm related to H leaves. MS samples were clearly separated in the NIR range, where the reflectance of MS samples increased abruptly from 760 to 850 nm. S samples showed a drift of less than 7 nm to the right in each peak in comparison with MS samples. On the other hand, an accelerated increasing trend was seen at 541, 552 and 559 nm for H, MS and S spectra, and decreasing reflectance was seen at 672, 680 and 685 nm in H, MS and S samples, respectively; the three spectra then became horizontal with some weak peaks.
In addition, MS spectra were very similar to S spectra. This result was consistent with the selective culture laboratory test, in which E. amylovora bacteria were found in all MS samples. The extraction of leaf spectral features could therefore be used for detection of infected trees before symptoms appear and Fire Blight spreads.

Discrimination analysis of healthy and diseased samples
The classification accuracy of discriminant analysis was used to study the effects of different preprocessing methods on calibration and validation modeling. Based on the results (Table 2), Mahalanobis accuracy was greater than for the Savitzky-Golay algorithm. The results of SNV and MSC were similar, especially for SIMCA analysis, because both methods used the same algorithm for decreasing redundant zones and spectral noise. SIMCA, as a non-linear method, had an acceptable and reliable outcome (more than 96%), whereas first-derivative spectra produced 93.45% accuracy. The reason may be that some principal features were removed from the spectra during derivation because the second derivative was not considered. Sankaran, Mishra, Maja and Ehsani (2011) obtained classification accuracy of approximately 92% for a combined data set with SIMCA-based algorithms. MSC was the only preprocessing method with an accuracy of more than 95%. Wouters, Ketelaere, Baerdemarker and Saeys (2013) obtained similar results for MSC. The average classification accuracy for SNV, Savitzky-Golay and DT was approximately 95% (Table 2). The SNV and first derivative showed lower average classification accuracy in the LDA and QDA models than in the Mahalanobis model. However, comparing the individual classification accuracies, it was found that the QDA, Mahalanobis and SIMCA models were more accurate in the healthy category than LDA. Moreover, the Mahalanobis and SIMCA models were more reliable and accurate, with 100% recognition of Fire Blight infected leaves (S) and healthy ones.
Table 1 (fragment). ARI = (R550)^-1 - (R700)^-1 (Gitelson et al., 2001); the preceding entry cites Gamon et al. (1992).
It is most important to identify diseased and non-symptomatic leaves with minimum error so that as many infected leaves as possible are recognized. In this respect, SIMCA was more appropriate for detecting Fire Blight (Table 2).
In addition, SIMCA could quantify the distance of MS samples from the two basic groups of H and S. MS samples had a significant distance to the H category, while the distance to the S category was smaller (Table 3). According to the SIMCA algorithm, discrimination is reliable if the distance between categories is more than 3. Therefore, MS samples were recognized as belonging to the diseased class. Because MS leaves were cut from infected trees, they could be categorized as diseased samples.
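The idea of a distance between SIMCA class models can be sketched as follows. Each class gets its own PCA model, and the distance compares how poorly each class fits the other class's model with how well each class fits its own. The formula used here is one common simplified definition without degrees-of-freedom corrections; it is an assumption of this sketch and may differ from the exact expression behind Table 3. All function names are hypothetical.

```python
import numpy as np


def fit_class_model(X, n_components):
    """PCA model of one class: its mean and leading loading vectors."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]


def residual_variance(model, X):
    """Mean squared residual of the rows of X after projection on the model."""
    mean, loadings = model
    Xc = X - mean
    resid = Xc - Xc @ loadings.T @ loadings
    return float(np.mean(np.sum(resid**2, axis=1)))


def model_distance(model_a, Xa, model_b, Xb):
    """Cross-fit residuals relative to self-fit residuals (simplified form)."""
    cross = residual_variance(model_a, Xb) + residual_variance(model_b, Xa)
    own = residual_variance(model_a, Xa) + residual_variance(model_b, Xb)
    return float(np.sqrt(cross / own))


def classify(models, x):
    """Assign one sample to the class model that leaves the smallest residual."""
    return min(models, key=lambda label: residual_variance(models[label], x[None, :]))
```

Under this definition a class compared with itself has a distance of 1, and larger values indicate better separated class models.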
For the last step, PLS-DA was developed based on PCs to discriminate H-MS and also H-S samples. PLS-DA was performed for the purpose of creating a classification model able to determine the class of new samples. PLS-DA uses PLS regression for discrimination purposes. Once the PLS model has been checked and validated, it could be used to classify new leaves. Figure 4 shows the score and loading graph of factors which explained 90% of total variance. It appears that the intrinsic spectra could be considered as a sensitive method that may allow discriminating H and S leaves.

Vegetation indices for detection of Fire Blight infection
Based on the extraction of wavelengths sensitive to Fire Blight disease, some vegetation indices related to vegetation condition and plant structure were calculated. ANOVA was used to analyze the differences among class means and the variation within and between classes. The ANOVA results for the vegetation indices are presented in Table 4. ANOVA provides a statistical test of whether or not the means of three classes are equal and therefore generalizes the t-test to three classes. Among the studied indices, the modified simple ratio (MSR) and structure insensitive pigment index (SIPI) were sufficiently sensitive to discriminate H-S, H-MS and also S-MS samples. For the renormalized difference vegetation index (RDVI), there were significant differences between H and S and also between MS and S samples at the 5% probability level. This index was therefore considered suitable for discriminating healthy leaves from symptomatic diseased leaves and for discriminating symptomatic from non-symptomatic infected leaves. For the anthocyanin reflectance index (ARI), the p-value for H and MS samples showed a significant difference at the 5% probability level, so this index could be used specifically for discrimination of H and MS samples. For the physiological reflectance index (PhRI) and the transformed chlorophyll absorption and reflectance index (TCARI), the differences between H-S and also H-MS were significant, while there was no significant difference between S and MS samples. It could therefore be concluded that PhRI and TCARI were not sensitive enough to discriminate S and MS samples. The p-value for the nitrogen reflectance index (NRI) showed a significant difference at the 5% probability level between H-MS and S-MS samples. Modified triangular vegetation index1 (MTVI1) and modified chlorophyll absorption ratio index1 (MCARI1) could discriminate S and MS samples.
These indices could therefore be used as specific indices to detect and discriminate symptomatic diseased leaf samples and non-symptomatic infected samples. One reason for this finding is that Vis-NIR reflectance at the 550, 670 and 800 nm wavelengths was used for the calculation of MTVI1 and MCARI1. These wavelengths were in accordance with the wavelength ranges obtained for Fire Blight disease detection in this research, as shown in Figure 3.
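The one-way ANOVA comparison applied to each index can be sketched with the F statistic below. This is a minimal illustration under stated assumptions, not the SAS procedure used in this study: the function name is hypothetical, and only the F value and its degrees of freedom are returned. The p-value would be the upper tail of the F distribution with those degrees of freedom, obtained from a statistics library or table.

```python
import numpy as np


def one_way_anova_f(*groups):
    """F statistic and degrees of freedom for a one-way ANOVA.

    Each argument holds the index values of one class (for example H, S, MS).
    """
    groups = [np.asarray(g, dtype=float) for g in groups]
    all_values = np.concatenate(groups)
    grand_mean = all_values.mean()
    k, n = len(groups), len(all_values)
    # variation of the class means around the grand mean
    ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
    # variation of the observations around their own class mean
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    df_between, df_within = k - 1, n - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return float(f_value), df_between, df_within
```

Passing only two groups reduces the test to the square of the pooled two-sample t statistic, which is one way to examine a single pair such as H-MS.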
It may be concluded that SIPI and MSR had potential to discriminate H-S, H-MS and S-MS samples. The reason is that the SIPI index uses 570, 670 and 870 nm wavelengths and MSR index uses 670 and 800 nm wavelengths for calculation. All these wavelength ranges were obtained for the detection of Fire Blight-infected leaves in the present research (based on the results of Figure 3). Also, ARI was sensitive to discriminate H and MS samples. So, MTVI1 and MCARI1 had potential to discriminate S and MS samples. The reason for the significant differences among all studied indices was that the green, red and NIR wavebands were used to calculate these indices, and these wavebands often are capable of detecting plant health based on reflectance changes.
Based on the ANOVA results (Table 4) and a comparison of p-values for the vegetation indices, all indices were found to be sufficiently sensitive to detect Fire Blight disease. The reason is that E. amylovora gradually causes the leaf structure to wither and the leaves to die; leaf color changes from green to yellow-brown as the disease develops. Light absorption in the Vis and NIR ranges therefore changes because of the change in leaf color. Damage to foliar internal structure, changes in intracellular water content, decreasing chlorophyll content and pigment changes all alter light absorption. Therefore, by comparing the spectral differences between healthy and diseased leaves and calculating spectral vegetation indices in the Vis-NIR range, it was possible to discriminate healthy, symptomatic and non-symptomatic diseased leaves.
In the first stage of infection by E. amylovora, there is no visual or physiological symptom on the tree. At this time, the Vis spectral range may not be useful for detection of infection. Instead, the NIR range could be used for early disease detection, because at this stage the internal structure of the leaf starts to change, which affects light absorption in the NIR range. Later, the disease is clearly detectable in the Vis range because the damage to leaf structure has become larger. Vegetation indices based on a combination of Vis and NIR wavelengths can therefore reveal infection.

Conclusions
In the present research, the application of Vis and NIR spectrometry was evaluated for the detection of Fire Blight disease in pear trees. Based on the results of this research, several conclusions are justified. First, wavelengths of 496-560 nm and 680-700 nm in the Vis region and 765, 872 and 960 nm in the NIR region represented Fire Blight disease symptoms. Namely, the 541, 552 and 559 nm wavelengths and the 672, 680 and 685 nm wavelengths in the Vis region were extracted for the detection of healthy, non-symptomatic and diseased leaves, respectively. Second, based on spectral analysis, non-symptomatic diseased leaves from infected trees were identified and classified as infected leaves. Laboratory test results confirmed this finding. Therefore, spectral feature extraction for non-symptomatic diseased leaves could be used for the detection of infected trees before Fire Blight disease spreads. Third, a comparison of different classification methods indicated that the SIMCA method was more reliable and accurate than the other models for Fire Blight detection. Fourth, SIPI and MSR were identified as ideal candidates for discrimination of H-S, H-MS and S-MS leaves, and ARI was identified as a good candidate for discrimination of H and MS leaves. RDVI was found to be able to discriminate H-S and S-MS samples. Finally, the results showed the potential of NRI for classifying H-MS and S-MS samples and the potential of MTVI1 and MCARI1 to discriminate S-MS samples.

Disclosure statement
No potential conflict of interest was reported by the authors.