Characterizing Tree Species in Northern Boreal Forests Using Multiple-Endmember Spectral Mixture Analysis and Multi-Temporal Satellite Imagery

Abstract Northern boreal forests are characterized by open stands whereby trees, understory background, and shadow are all significant components of the spectral response within a pixels’ spatial footprint. To overcome this mixed pixel problem, accurate spectral characterization of these (endmember) components is necessary for spectral mixture analysis (SMA) to generate forest classifications at the species level. Obtaining these endmember spectra in the field, however, can be difficult or impossible. This study examined whether image endmember spectra can be identified using forest inventory information to derive dominant tree species classifications. This was tested using multiple-endmember SMA (MESMA) and single- and multi-date Landsat imagery of a forested area in the Northwest Territories, Canada. Image classifications (n = 80) were generated based on 20 image-date combinations and four unmixing models. Accuracies of 80% and 82% were achieved for open and medium dense forest stands, respectively using multi-date imagery, which outperformed single-date imagery acquired at peak phenology. The overall accuracy is 72%; lower due to challenges in very open stands. The multi-date MESMA approach was robust for both compositionally pure and mixed stands. The approach merits further investigation, particularly within the context of the increasing availability of regional-scale satellite imagery enabling composite time-series and spectral-temporal image features.


Introduction
In Canada, the boreal zone occupies 552 million ha, within which 309 million ha consists of forests and woodlands (Brandt 2009).Its forests are ecosystems that host high levels of biodiversity, influence biogeochemical cycles, provide a wide range of ecosystem services, and have commercial, economic, and cultural value (Kimmins 1996;Brandt et al. 2013).To sustain these ecosystem services within a sustainable forest management context requires an understanding of the composition, structure, and distribution of boreal forests (MacDicken et al. 2015;Siry et al. 2018).This understanding is increasingly important due to changes in climate and fire regimes that are being expressed as forest decline, land-cover change, and permafrost degradation (Price et al. 2013;Veraverbeke et al. 2017;Carpino et al. 2018;Wang et al. 2020).Forest management decisions require up-to-date, spatially referenced information to determine the current forest state, assess rates of change, understand ecosystem dynamics, and ensure forest sustainability.Knowledge of forest species is a necessary component of forest management (Leckie and Gillis 1995;Ørka et al. 2013), and relevant for describing stand structure and landscape configuration (Plourde et al. 2007;Thompson et al. 2015;Dearborn and Baltzer 2021;Hauglin et al. 2021).
Species composition, defined as the relative proportion of tree species in a stand, is among the fundamental attributes collected in a forest inventory (Gillis and Leckie 1993;Leckie and Gillis 1995;Alberta Sustainable Resource Development 2005;Government of Northwest Territories 2006).Within the boreal, knowledge of species composition is used specifically to estimate timber growing stock (Boudewyn et al. 2007), net primary productivity (Tang et al. 2010), carbon uptake (Kurz et al. 2013), nutrient cycling (Prescott 2002), soil and dead organic matter carbon stocks (Shaw et al. 2015), and successional pathways following disturbance (Amos-Binks et al. 2010;Baltzer et al. 2021;Day et al. 2023).Information regarding species composition is also used to determine fuel types for wildfire behavior prediction (e.g., Canadian Forest Fire Behavior Prediction System) (Tymstra et al. 2010), estimate post-fire fuel consumption, carbon emissions or lichen biomass recovery (de Groot et al. 2007;Greuel et al. 2021), and define habitat selection models for a range of boreal mammals (e.g., woodland caribou; Rangifer tarandus caribou, grizzly bear; Ursus arctos) and birds (Hobson and Bayne 2000;Rettie and Messier 2000;McDermid et al. 2009).In total, spatially explicit information regarding tree species composition is a necessary attribute for a wide-range of forest management information needs.
The interpretation of aerial photographs has been the most common approach to obtain operational forest inventory information (Hall 2003;Pinto et al. 2007;Lutz et al. 2008) and although softcopy methods (digital aerial photographs and interpretation) are now routinely used, it remains cost-prohibitive over large, remote forests (Falkowski et al. 2009).This is certainly the case in northern boreal (Taiga Plains) forests, such as in the Northwest Territories (NWT), Canada, which have at least 42 million ha of forested land, as estimated based on Earth Observation for Sustainable Development of Forests (EOSD) landcover information (Wulder et al. 2008;Mahoney et al. 2018).However, <10% of this forested land has a complete inventory.This, combined with long distances to urban areas, lack of physical access, and the relatively small industrial forest base in the NWT compared to the total forest resources (Ogden and Innes 2007) has resulted in a limited means by which conventional forest inventories can be completed.Satellite imagery is a desirable, if not the only data source that can provide comprehensive forest information throughout these vast, remote, northern areas, and particularly within the context of applying multi-source approaches (Margolis et al. 2015;Matasci et al. 2018;Mahoney et al. 2018;Breidenbach et al. 2021;Castilla et al. 2022).However, these lower density, more open forest stands exhibit the "mixed-pixel problem" (Adams et al. 1993;Van der Sluijs et al. 2016) in remote sensing whereby (in this environment) a satellite image pixel signal comprises reflected energy from the sunlit portion of trees as well as from visible ground cover and tree shadows.This makes it more difficult to obtain information about trees, especially for more challenging applications, such as deriving tree species information.Spectral mixture analysis (SMA; also known as unmixing) approaches (Adams et al. 1993) has evolved to become a widely accepted approach to dealing with (and providing important new information from) mixed pixels, with more advanced SMA methods, such as multiple-endmember SMA (MESMA) (Roberts et al. 1998) providing additional options, flexibility and robustness.For example, MESMA has been successfully applied to obtain tree species distribution maps in California (Dennison and Roberts 2003a), Australia (Youngentob et al. 2011), and Hawaii (Somers and Asner 2013).However, there are additional challenges in northern boreal regions and these have received less attention (Section Tree species mapping opportunities; Van der Sluijs et al. 2016).Addressing some of these key challenges is an important part of this study, with the longer-term goal to provide improved forestry information over large, remote, lower density forest stands using satellite imagery.

Remote sensing of tree species
Deriving information about tree species from digital remote sensing has long been a research problem of considerable interest (Franklin 2001;Lutz et al. 2008).Most studies regarding tree species mapping are either data-driven or sensor-driven, and conducted over small areas that are not easily comparable (Fassnacht et al. 2016).When undertaking a study to map or classify tree species, knowing how species composition is defined in the field is relevant, particularly if it serves as a training and validation data source (Van der Sluijs et al. 2016).Image analysis approaches for medium spatial resolution airborne and satellite data have typically involved per-pixel classifiers, regression, and spectral mixture analysis (SMA) (Goodenough et al. 2003;Peddle et al. 2007;Wolter and Townsend 2011;Persson et al. 2018), or k nearest neighbor or Random Forest modeling of presence and/or abundance (Beaudoin et al. 2014;Thompson et al. 2015;Hermosilla et al. 2022).Process-based segmentation and Individual-Tree-Crown (ITC) classifications have been preferred for high spatial resolution data from airborne imagery and LiDAR (Dalponte et al. 2014;Korpela et al. 2014;St-Onge et al. 2015).These studies and others have had widely variable results (30%-90% overall accuracy), an indicator of the challenges associated with species characterization.Due to a lack of studies the performance of species classifications in open, shorter forest stands common to the northern boreal region is largely unknown.Fassnacht et al. (2016) recommended that future studies include a greater emphasis on understanding classification errors, improving the representativeness of reference samples, and addressing issues pertaining to scale (e.g., the effect of forest background signals, mixed species).Specifically, these reviews highlighted opportunities in the use of SMA, multi-temporal datasets, and classification procedures that incorporate existing knowledge (e.g., forest inventory information).

Tree species mapping opportunities in boreal forests of the Taiga Plains
The boreal forests within the NWT Taiga Plains are characterized by predominately open stands (i.e., crown closure generally <60%) where understory ground vegetation is a significant component of pixel-level reflectance (Peddle et al. 1999;Franklin et al. 2003;Van der Sluijs et al. 2016;Castilla et al. 2022).There are considerable canopy gaps and open areas due to the variable tree heights and smaller tree crowns often occurring in lower stem density stands due to the predominance of tree species, such as black spruce (Picea mariana (Mill.)BSP) and jack pine (Pinus banksiana Lamb.) compared to boreal forests at more southerly latitudes (Van der Sluijs et al. 2016).The forest canopy, understory, and shadow components contribute considerably to the overall pixel spectral signal in both high-and mediumspatial resolution images (e.g., 2-m, 30-m, respectively).In very low density, open stands, the background understory is often the main reflecting target.Conventional pixel-level approaches for classification of medium-to-high spatial resolution imagery at the tree species level (Dymond et al. 2002;Dalponte et al. 2008) are not suitable in these open stand environments and therefore were not tested here.Instead, a classification approach based on SMA (Adams et al. 1993), whereby mixed pixels are decomposed to physically meaningful components of sunlit canopy, background, and shadow (i.e., endmembers) may be better suited to derive information regarding tree species in northern boreal forests (Van der Sluijs et al. 2016).
A key to successful SMA is the appropriate selection and derivation of endmembers (Tompkins et al. 1997), which involves identifying the number and type of endmembers and their corresponding spectral signatures.Endmember spectra may be: (i) derived by field measurements with a spectroradiometer (reference endmembers); (ii) obtained from spectral libraries; (iii) extracted directly from the image data (image endmembers), or (iv) estimated using canopy reflectance models (Peddle et al. 1999), with reference and image endmembers being the most common sources.Reference endmember spectra can often be measured in the field, but physical accessibility, budgetary limitations, equipment requirements (e.g., availability and setup of a field spectroradiometer system), and/or highly variable forest stand composition and structure may pose challenges or prevent the collection of representative measurements (Van der Sluijs et al. 2016).Furthermore, reference endmembers should ideally capture the understory diversity and species abundance when defining the background component.The use of reference endmembers, as with any target, may also limit the classification of coniferous tree species at the stand scale (Roberts et al. 2004;Van der Sluijs et al. 2016) due to differences in atmospheric conditions between reference endmembers and the image data for which calibration to reflectance is required for both but may be challenging or not possible and differences in the scales at which the endmember spectra are obtained and applied (Drake et al. 1999).Alternatively, the use of image endmembers avoids these challenges as well as limitations resulting from site accessibility, measurement uncertainties, available budget, and it does not require any field spectroradiometer equipment.Image endmembers would also be expected to be better suited for multi-temporal imagery to exploit phenological differences among tree species and may be more suited to conduct mapping over larger geographic extents.
Despite having advantages, such as not requiring field measurement or reflectance calibration, the use of image endmembers in northern boreal forests can pose different challenges.In this study, it was unknown whether useful image endmembers for classification purposes could be identified with the aid of forest inventory information.Numerous endmember extraction algorithms exist to find (or derive) pure image endmembers and their spectra, such as N-FINDR (Winter 1999), pixel purity index (Boardman 1994), orthogonal subspace projection (Harsanyi and Chang 1994), principal component analysis (Bateson and Curtiss 1996), vertex component analysis (Nascimento and Dias 2005), convex hull approaches (Boardman 1994), and iterative error analysis (Neville et al. 1999).However, pure and/or derived image endmember spectra may have an unknown physical meaning, may not exist in the image, or may not be useful for the application (e.g., a spectrally bright gravel road in a forested scene) (Plaza et al. 2012).Other endmember extraction methods do not assume the presence of pure endmembers in the image and instead generate virtual endmembers that are not necessarily present in the image itself (Tompkins et al. 1997;Plaza et al. 2012).Considerable research has been undertaken to automatically extract image endmembers, yet little is known about the application of image endmembers identified using forest inventory information.Therefore, a logical step is to test and compare the use of field reference versus image endmembers.A potential issue regarding the creation of endmembers is the effects of vegetative phenology on the degree to which boreal tree species can be discriminated, as this is relevant to the timing of image collection (Wolter and Townsend 2011;Persson et al. 2018;Hemmerling et al. 2021).This issue is also relevant to the timing of field data collection, and to defining image endmembers more generally, as it influences the required dates of data collection within a multi-temporal image database.Thus, it was important to assess if multi-temporal imagery acquired at different stages of the growing season would improve discrimination of the dominant species in a representative northern boreal forest environment within the Taiga Plains.To address these issues for a study area in the NWT, three research questions were pursued in this study: Are there differences in SMA-based classification performance when using reference endmembers versus image endmembers?Is the performance of image endmember spectra dependent on the type of forest inventory information used to select representative sunlit canopy and background components?Does the use of multi-temporal imagery to represent different stages of vegetative phenology improve the accuracy of dominant tree species classification?

Study area
The 342 km 2 (18 km Â 19 km) study area located 20 km south of Fort Providence, NWT (Figure 1) is in the Great Slave Lowland Mid-Boreal Ecoregion (Ecosystem Classification Group 2007) of the Taiga Plains Ecozone (Ecological Stratification Working Group 1995).It was selected based on the presence of representative northern boreal tree species, accessibility, and availability of a forest inventory field dataset.The study area has had no recorded forest fires over the entire record of the NWT fire database before this study .Considering the scarcity of remote sensing literature on deriving tree species information of northern boreal forests, such as in the NWT, experiments at a small geographic extent were pursued to better understand whether the spatial heterogeneity of tree species in these forests can be retrieved with suitable precision.
Tree species in the study area included white spruce (Picea glauca (Moench Voss)) and trembling aspen (Populus tremuloides Michx.) that are dominant along the alluvial flats adjacent to rivers, and contain diverse herb and shrub understories.Mixtures of jack pine and trembling aspen occur primarily in open stands (but also include some of higher density) and are found in dry, coarse-textured soils associated with beach ridges, and typically have sparse shrub, forb, and reindeer lichen (Cladina mitis Sandst.)Hustich) understories.Colder, poorly drained sites are predominantly occupied by predominantly black spruce, where understories typically consisted of Labrador tea (Ledum groenlandicum Oeder) and mosses (Sphagnum spp., Drepanocladus spp.).Other tree species common to the study area, such as tamarack (Larix laricina (DuRoi) K. Koch), white birch (Betula papyrifera Marshall), and balsam poplar (Populus balsamifera L.) were not addressed in this study as their relative abundances were small in the field plot data.
Representative photos of the study area can be found in Van der Sluijs et al. (2016).

Imagery
Five Landsat-5 Thematic Mapper (TM) terrain-corrected scenes (L1T products) acquired May 11, 2006, June 12, 2006, July 8, 2004, August 21, 2005, and September 6, 2005 were retrieved and correspond to a monthly progression through seasons that include snow-free image dates (i.e., spring, summer, fall).Solar zenith angles at the time of image acquisition for all scenes ranged between 39 and 56 .The scenes were all within ±1 year of the forest inventory (July 2005), and the July scene was closest to the timing and phenological stage of the spectral field data (July 16-22, 2013;Van der Sluijs et al. 2016).Any changes in understory diversity and relative abundance of species between the image data and 2013 spectral field data collection were deemed to be minor because of the slow growth of sunlit and background components, and the absence of wildfire and other major forest disturbances in the study area.An ancillary August 30, 2003 QuickBird image was available for site reconnaissance (Figures 1 and 2).
Landsat-5 TM at-sensor radiance data, generated using sensor gains and offsets from Chander et al. (2009), were atmospherically corrected to surface reflectance using the Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH) module in ENVI 4.8 ( 2009) with a Sub-Arctic Summer atmospheric model and rural aerosol model.To create a multi-temporal dataset, all scenes and bands were layer-stacked into a 30-band datacube (i.e., 5 dates Â 6 TM bands).No additional co-registration of Landsat scenes was conducted as the geo-registration of standardized Landsat L1T deliverables were already based on ground control points (GCPs) and a digital elevation model (DEM) to correct for terrain and view angles (spatial accuracy tolerance of L1T collection 0.5 pixels).For this work, the geometric residuals ranged between 3.4 m and 4.5 m RMSE as reported by image metadata.Multiple date combinations were created from the data-cube to allow testing of whether combinations of early spring (green-up, new conifer needles), peak, and autumn (senescence) phenological conditions in the sunlit canopy and its associated changes in vegetative background improved species discrimination (Ross et al. 1986;Miller et al. 1997;Mickelson et al. 1998;Hu et al. 2008;Rautiainen et al. 2011;Clark 2017).A total of 20 single-date and multidate combinations were tested to capture the general variability in classification performance using combined spring, peak, and senescence phenological stages (Figure 2).

Forest inventory data
Forty-eight 20-m Â 20-m plots were located using a stratified random sampling process in representative stands of jack pine, white spruce, black spruce, trembling aspen, and various mixed-woods based on Government of Northwest Territories Forest Vegetation Inventory (GNWT FVI) maps that were available for the study area (Hall and Skakun 2007).The field plots were collected in July 2005 using protocols based on the National Forest Inventory (Gillis 2001).All plots were at least 100 m away from roads, cut-lines, water bodies, and non-forested areas.Plot centers were established using a metal pigtail with flagging tape, after which the locations were recorded using a Trimble differentially corrected GPS system (NAD 83 UTM Zone 11N).Measuring tape and a compass were used to mark the cardinal (i.e., N, W, S, and E) and intercardinal (i.e., NW, SW, SE, and NE) points of the plot with flagging tape and to determine plot boundaries.The species, diameter at breast height (dbh), and total height were recorded for every tree that exceeded both 1.3 m in height and 5 cm in dbh.Within each plot, crown-closure estimates were obtained using a spherical densitometer (Stumpf 1993) at five locations: the plot center, and the four intercardinal corners.

Image endmember spectra extraction
As boreal stands are inherently open from a nadir spaceborne sensor perspective, at-sensor signals comprise reflected energy from tree canopy, understory vegetation, and shadows.Given these open stands, the background endmember is often more prominent (i.e., has high area fractions within a pixel, especially at low crown closures), resulting in a larger contribution to pixel-level reflectance from the background endmember.Previous work in the study area highlighted the importance of the composition of the understory and the influence of spectral diversity in open boreal forests on the selection of reference endmembers and subsequent classification of dominant tree species (Van der Sluijs et al. 2016).In particular, improvements in dominant tree species classification accuracies were achieved when endmembers reflected the compositional and structural complexity of the area.Thus, SMA-based classifications of tree species using image endmember spectra may likely be influenced by how the corresponding spectral signatures of endmember components are identified.This section outlines the experimental approaches and criteria used in determining background and sunlit canopy endmember spectra from the Landsat imagery, as well as the specification of shadow endmember spectra.

Background endmembers
Two methods of representing the sunlit background endmember were tested, with pixels located either just outside the forest cover or inside of the forest cover (Table 1).For the first background endmember approach ("PurestBg"), single pixel spectra were acquired as the pixel immediately outside the forested edge where no spectral contamination by the treed canopy was present (as also visually interpreted from an ancillary August 30, 2003 QuickBird image; Table 1).For the second approach ("LowestCC"), single pixel spectra were selected from within the forest area at six representative plots with the lowest crown closure estimates, acknowledging canopy presence (Table 1).Both image endmember spectra sets were acquired using the software plugin VIPER Tools (Roberts et al. 2007) available for ENVI.In cases where the locations of plots overlapped multiple Landsat pixels, the individual spectra of all intersecting pixels were extracted without averaging to accommodate spectral and ecological variability, which resulted in 14 and 10 spectra for PurestBg and LowestCC, respectively (Figure 2).

Shadow endmember
Shadows on the ground or canopy sides receive mostly diffuse light in treed environments, originating from neighboring trees.There was no viable means to select pure shadow endmember spectra from the Table 1.Criteria for establishing sunlit canopy and background spectral libraries.

Image endmember Criteria Definition
Sunlit canopy (i.e., tree species) By purest basal area "Basalarea" Pure plots: Plots selected where the relative abundance of a particular tree species exceeded 95% of the total basal area of the dominant trees above stand height Ã Mixed plots: Plots of mixed abundance (<80%) by total basal area of the dominant trees above stand height Ã ; selected to incorporate compositional and structural diversity.By highest crown closure "HighestCC" Pure plots: Plots selected that were pure by species (>80%) Ã , and had the highest crown closure estimates within each species group.Mixed plots: Plots of mixed abundance (<80%) Ã ; had the highest crown closure estimates within each mixed-species group.Background By spectral purity "PurestBg" Homogenous areas of low vegetation near the forest edge were selected using a panchromatic QuickBird image as a visual reference.By lowest crown closure "LowestCC" Pure: Plots were pure by species (>80%) Ã , and had the lowest crown closure estimates within each species group.Mixed: Plots had the lowest crown closure estimates within each mixed-species group.Ã Species determined by all trees in the dominant and co-dominant layer based on tree measurements of total height and diameter at breast height.
imagery because in these either open areas, or within open forested stands, at this image spatial resolution no pixels came close to being completely in shadow.Instead, this study adopted a commonly applied solution by representing these signals as photometric shade (i.e., zero reflectance across the electro-magnetic spectrum) (Clark and Roberts 2012).

Sunlit canopy endmember
Two methods of representing the sunlit canopy endmember were tested to determine whether the performance of image-derived endmember spectra for tree classifications depending on the type of forest inventory information used to select representative samples for input to SMA.Representative pure plots were selected from the 48 forest inventory plots either by basal area or crown closure criteria (Table 1).With the basal area approach ("Basalarea"), candidate pure plots were retained with the highest tree species purity in the overstory (i.e., !95%).A total of 14 candidate conifer plots met this criterion.With the crown closure approach ("HighestCC"), candidate pure plots (n ¼ 8) with the highest crown closure estimate within each tree species group were retained as long as they met the compositional purity criterion (!80%; Table 2).The inventory data did not contain any pure aspen stands that met the criteria of the Basalarea or HighestCC sunlit canopy endmember, and therefore two aspen endmember image spectra were identified for each approach based on pure patches (at the Landsat scale) visible on an ancillary August 30, 2003 QuickBird image.
The identified candidate pure plots that met the Basalarea criteria represented potential calibration data to derive species classification outputs yet a final selection procedure was necessary to accommodate forest heterogeneity but also to maintain the highest number of independent plots used for validation of the species classification (Section Validation).To achieve this, a final pruning procedure was used to retain only those candidate plots that were spectrally most similar to other plots of the same species and most dissimilar to plots of other species.VIPER Tools (Roberts et al. 2007) offered three techniques for identifying optimal endmember spectra associated with these 14 candidate plots using July imagery, including (1) count-based endmember selection, (2) endmember average Root Mean Square Error (RMSE), and (3) minimum average spectral angle.Firstly, count-based endmember selection, proposed by Roberts et al. (2003), identifies those spectra that model the greatest number of spectra within their tree species class (by assessing fraction and RMSE constraints when unmixing other spectra in the candidate library).Secondly, endmember average RMSE, proposed by Dennison and Roberts (2003b), identifies the spectra within a tree species class that provides the best fit when modeling that class (i.e., the lowest average RMSE when it is used to model all other spectra of the same class).Thirdly, minimum average spectral angle, proposed by Dennison et al. (2004) is conceptually similar to endmember average RMSE as it identifies the spectra with the best average fit within a species class, but uses average spectral angle and not RMSE as the fitness metric.Whereas a single spectra will often possess both the minimum endmember average RMSE and average spectral angle values, this may not always be the case depending on the brightness of the image scene, and further, both values may disagree with count-based endmember selection, requiring manual intervention and prioritization (Roberts et al. 2007).Of the 14 candidate conifer plots that met the Basalarea criteria, a final eight pure plots were manually identified as being spectrally most similar to other plots of the same species and most dissimilar with plots of other species (Table 2; Figure 2).
Only pure plots were considered up to this point in the endmember spectra selection, yet northern boreal forests are often comprised of mixed species composition ( 80%).This requires attention in any representative tree species classification approach to meet the mapping needs of operational forest management (Fassnacht et al. 2016).Rather than discarding or masking out compositionally mixed plots, small samples of these plots were purposefully selected based on similar basal area and crown closure criteria as with the pure plots (Table 1) (Foody and Mathur 2006).Representative mixed Basalarea plots were manually selected based on forest inventory data (n ¼ 6; Table 2; Figure 2), whereby the compositional and structural heterogeneity of the mixed plots for each dominant species was accommodated (e.g., a jack pine leading plot with a white spruce sub-component, or a white spruce leading plot with a black spruce component).Representative mixed HighestCC plots were also manually selected based on forest inventory data based on the criteria in Table 1 (n ¼ 5; Table 2; Figure 2), but differ from Basalarea plots in that the HighestCC plots selected two compositionally mixed plots with the highest crown closure.
The criteria differences for selecting background and sunlit canopy endmember plots support the study objective to investigate whether the performance of image-derived spectra is affected by the type of forest inventory information used to select representative sunlit canopy and background components.Once the optimal plots for species discrimination were known for the Basalarea (n ¼ 14) and HighestCC sets (n ¼ 13), the final image endmember spectra sets were acquired through VIPER Tools (Table 2; Figure 2).These Basalarea and HighestCC plots had similar distributions of Lorey's height and stem density, while crown closure was highest for the latter plots, as expected (Figure S1).Multi-temporal spectra were extracted from the same plot locations to test whether different combinations of multi-temporal imagery improved species discrimination.As noted earlier, for the multi-temporal datasets, the image endmembers were already aligned as the standard Landsat L1T level product data were already corrected using GCPs and a DEM ( 0.5 pixels).For multi-temporal imagery, the spectra were simply concatenated into respective spectral library files.In case optimal plots overlapped multiple Landsat pixels, the individual spectra of all intersecting pixels were extracted without averaging to accommodate spectral and ecological variability, which resulted in 26 and 21 spectra for Basalarea and HighestCC, respectively (Figure 2).The observed differences in extracted spectra were small among the three coniferous species throughout the growing season (Figures S2 and S3); differences hypothesized could be exploited similarly to other SMA works attempting to map the distribution of spectrally similar vegetation species (Somers and Asner 2014;Chance et al. 2016).

Classification and validation
Spectral mixture analysis Spectral mixture analysis (SMA) quantifies the proportion within a pixel that is occupied by each of the pure features occurring on the ground (i.e., endmembers).The output is a fraction (% area within a pixel) for each endmember along with the error of fit (Adams et al. 1993).For each pixel, this linear model can be expressed as: where the (spectrally mixed) image pixel value R 0 i is the reflectance in band i for a given pixel, modeled as the sum of the endmember reflectances (R) in band i for N endmembers as weighted (multiplied) by fraction f k associated with each endmember (k).The e i term is the remainder between the actual image reflectance value (e.g., from Landsat TM) and the sum of the SMA terms, and is expressed as a band residual.Model fitness can be assessed either by using the residual term or via the RMSE (Roberts et al. 1998) over the total number of bands (): For this study, MESMA (Roberts et al. 1998) was used instead of conventional SMA to accommodate the inherent variability in boreal environments (e.g., different tree species, different structures or types of understory cover-and various combinations of these).MESMA is not constrained by a single set of endmembers as it allows the number and types of endmembers to vary on a per-pixel basis to account for spatial heterogeneity, whereby the best endmember set is determined based on the lowest error across multiple bands (e.g., RMSE from the system of equations, one equation per wavelength band).In our study, the tree species associated with the sunlit canopy endmember of the best-match endmember set was assigned as the tree species class for that pixel.This is how the tree species classification was achieved.
MESMA was facilitated using VIPER Tools (Roberts et al. 2007) which provided a raster image of the best fit endmember model of each pixel (i.e., lowest RMSE) among other outputs (Figure 2).The resulting MESMA output was converted to a per-pixel landcover classification image by assigning as the pixel class the tree species identified as associated with the best fit endmember set.Results were obtained using the MESMA partially constrained unmixing option whereby the (per-pixel) sum of the fractions must equal 1.00, but individual endmember fractions can be slightly <0.00 and slightly >1.00.Fraction limits of À10% and 110% were used, based on empirical tests similar to other studies that showed instances of obtaining optimal class accuracies from models that deviated slightly from the standard physical thresholds (Dennison and Roberts 2003b;Thorp et al. 2013).The RMSE was constrained to below 2.5% reflectance, representing a constraint to ensure the measured Landsat reflectance was unmixed using realistic models while assuring models were selected for the majority of pixels in the study area (Roberts et al. 1998).The RMSE threshold ensured endmember stability by excluding unsuitable combinations of endmember spectra that exceeded the 2.5% reflectance constraint; more stringent RMSE thresholds would limit the chance of pixels to be classified thus leading to unclassified pixels.The Landsat TM datasets were unmixed using a three-endmember model having sunlit canopy, background, and shadow components.The two different approaches of identifying sunlit canopy image endmembers (Basalarea, HighestCC) and background endmembers (Purestbg, LowestCC) produced a controlled experiment, in that the Landsat TM datasets were unmixed using each of the four unique combinations of image endmember spectra sets as well as the same shadow endmember spectra (photometric shade).
The first unmixing model ("Basalarea þ PurestBg") was an attempt to classify the Landsat TM imagery with the purest and most compositionally representative sunlit canopy endmembers (albeit with some degree of spectral contamination from the understory) and the purest homogenous background spectra sets (just outside the forested area).All possible combinations (n ¼ 364) of Basalarea-criteria selected sunlit canopy spectra (n ¼ 26) and PurestBg-criteria selected background spectra (n ¼ 14; Figure 2; Figure S4a) were input to MESMA, with the best model determined for each pixel as the combination of sunlit canopy, background, and shadow endmember components that yielded the lowest RMSE.As the sunlit canopy endmember is species specific, the best-performing endmember combination was simply used as the tree species class, thus producing the tree species-level image classification.The second unmixing model ("Basalarea þ LowestCC") tested was an attempt to better capture the understory characteristics of northern boreal forests that often comprise several species (Barbier et al. 2008), acknowledging some degree of spectral contamination from the canopy (26 Basalarea spectra with 10 LowestCC spectra (Figure 2; Figure S4b; for a total of 260 possible combinations).The third unmixing model ("HighestCC þ PurestBg") aimed to reduce spectral contamination for both sunlit canopy and background endmember components by using representative spectra from the highest density stands and the purest homogenous background spectra sets (21 HighestCC spectra with 14 PurestBg spectra, for a total of 294 possible combinations; Figure 2).The fourth unmixing model ("HighestCC þ LowestCC") was an attempt to better characterize the overstory and understory by using spectra from the highest and lowest density stands, respectively (21 HighestCC spectra with 10 LowestCC spectra, for a total of 210 possible combinations (Figure 2).All four unique combinations of image endmember spectra sets (endmember model combinations) were used to unmix 20 Landsat single-date and multi-date combinations, resulting in a total of 80 image classifications under evaluation.For each of these 80 classifications, the raster outputs produced by MESMA identified for each pixel the best model combination of spectra that yielded the lowest RMSE (Figure 2).MESMA metadata about each model combination was referenced to determine which tree-species specific sunlit canopy endmember performed best for each pixel, which in turn was used to recode the best model raster into the four dominant species of interest for this work (i.e., white spruce, black spruce, jack pine, and aspen).The best-performing model was thus used to determine the tree species class, in turn producing the tree specieslevel image classification.

Validation
Validation of the species classification products for the 80 image classifications was expressed as agreement against the dominant species from the independent field-inventory plots.The dominant species per fraction of the total basal area of the dominant and codominant trees were used as the field-based species description because Landsat TM imagery showed the highest sensitivity to this measure for validation (Van der Sluijs et al. 2016).The results of the July image endmember classification were compared to the best (field-measured) reference endmember model as reported by Van der Sluijs et al. (2016).The same set of plots was used in the validation of the July-only reference endmember model classification and multitemporal image classifications (i.e., related samples).Contingency matrices were derived to determine the overall agreement with validation data as well as the producer and user accuracy [agreement] (Congalton and Green 2009).McNemar's tests, a nonparametric test based upon standardized normal test statistics calculated from contingency matrices, were conducted to estimate the statistical significance of the difference between two estimated accuracies (Foody 2004).We discuss the quantity disagreement and allocation disagreement indicators in favor of the kappa statistic given that kappa is fundamentally not an index of accuracy, is redundant to percent agreement, and has not been recommended for use in remote sensing accuracy assessment (Pontius and Millones 2011).Quantity disagreement refers to the amount of difference between the reference endmember and image endmember classifications that is due to the less than perfect match in the proportions of the categories (i.e., compositional differences).Allocation disagreement describes error due to differences in the location of map categories (i.e., spatial pattern/configuration differences).For forest inventory purposes, a low quantity disagreement is desirable to correctly model the proportional (net) availability of tree species in a given study area, while low allocation disagreement is desirable for the specific location of tree species.
To support the validation of the species classification products, forest inventory plots independent from those where endmember spectra sets were collected for the Basalarea and HighestCC sets were retained.Out of the 48 plots, there were 34 plots (48 minus 8 pure spectra plots and 6 mixed spectra plots) and 35 plots (48 minus 8 pure spectra plots and 5 mixed spectra plots) available for the Basalarea and HighestCC classifications, respectively.The forest inventory data did not contain any independent plots to assess trembling aspen stands, therefore, five 20-m by 20-m validation plots were manually included where dominant patches of aspen were identified from a 1994 aerial photo-interpretation-based forest inventory and with reference to August 30, 2003 QuickBird image.The five image plots were not located in the same areas or stands where the sunlit canopy spectra for the trembling aspen class were extracted.A total of 39 and 40 independent validation locations were thus used to determine the agreement between field observations and image classifications based on the Basalarea and HighestCC spectral sets, respectively (Figure 2).

Validation of single-date image classifications
The first study question was to determine if there were differences in SMA-based classification performance between reference endmembers and image endmembers.Validation (level of agreement with ground data) results of the image endmember classifications generated relatively low agreements ranging between 33% and 44% with independent forest inventory plots (Figure 3).These results were lower than the 52% agreement achieved with reference endmember classifications (Van der Sluijs et al. 2016), although absolute differences were small.Agreement levels using singleimage date datasets exhibited a large range of variability, with image endmember classifications near the early peak of phenology (June, July) performing considerably better than the late peak (August) or senescent stage (September) images.Classifications based on May imagery resulted in unmixing RMSEs exceeding the 2.5% maximum threshold and thus lead to unclassified forest stands.The classification results achieved using a single-image date representative of peak growth conditions highlight the considerable challenges with identifying dominant tree species in the study area using either field reflectance or imagederived endmember spectra.
The second study question was to determine whether the performance of image endmember spectra was influenced by the type of forest inventory information used to select representative sunlit canopy and background components.Considerable differences in agreement existed among the various endmember models depending on the phenological stage, with June and July datasets showing greater differences among image dates than those later in the growing season (Figure 3; up to 23% and 11%, respectively).For the August and September image datasets, the mean differences in classification agreement among unmixing models ranged between 2% and 3% relative to the maximum agreement for that single-month dataset.The highest agreement occurred when the Basalarea spectra were used to define the sunlit canopy, with no discernible differences observed for the background image endmember.None of the image endmember datasets or spectral library combinations provided satisfactory results, hence an ideal scenario for single-image date tree species classification appears unlikely.

Validation of multi-temporal image classifications
The third study question was to determine if the use of multi-temporal imagery to represent different stages of vegetative phenology improves the accuracy of dominant tree species classification.Multi-temporal combinations improved species discrimination compared to single-date datasets (Figure 4).The mean overall agreement differed significantly among single datasets and datasets consisting of 2, 3, 4, and 5 image dates [One-way ANOVA F (4,79) ¼ 17.394, p < 0.001], whereby median agreement increased with an increasing number of image dates used in the unmixing, up to combinations of 4 image dates (Figure 4).Maximum classification agreement increased from the baseline comparison of single date datasets (i.e., 46%) to 56%, 62%, and 72% for datasets consisting of 2, 3, and 4 image dates, respectively.Tree species discrimination was improved with multi-temporal imagery that represented different stages of vegetative phenology.
The experimental design of four endmember model combinations used in the multi-temporal datasets was to further analyze whether the performance of image endmember spectra was influenced by the type of forest inventory information used to identify representative sunlit canopy and background components (second question).In determining meaningful patterns, the results of each of the four unique unmixing model combinations were grouped for the calculation of means and variance.Mean overall agreement differed significantly among unmixing models [One-way ANOVA F (3,79) ¼ 2.929, p < 0.039; Figure 5].Mean and maximum classification agreements ranged between a low of 36% and 48% for the "HighestCC þ LowestCC" unmixing model to a high of 48% and 72% for the "Basalarea þ PurestBg" unmixing model, respectively.This difference was found to be statistically significant using Scheffe's multiple comparisons post-hoc test (mean: 12.4%, S.E: 4.3%, p ¼ 0.048).No other multiple comparisons between grouped unmixing models were found to be statistically significant (Figure 5).With the differences in overall accuracy among the unmixing models, it was clear that the performance of image endmember spectra was dependent on the type of forest inventory information used to select the sunlit canopy and background endmember inputs.
Consistent with observations for single-date datasets, there were considerable differences in agreement among the various endmember models used to unmix the multi-temporal imagery.For each of the 20 imagery combinations, the mean classification agreement among the four unmixing models differed from 4% to 15% relative to the maximum agreement for that combination.However, for some image combinations, the differences between unmixing models were much larger and approached 30% (e.g., 2 months late season, 4 months without May, 4 months without July; Figure 6).The large differences observed among unmixing models are indicators of unequal classification performance.The magnitude of the differences among unmixing models did not appear to be influenced by the number of image dates in a datacube, with mean differences ranging between 8% and 10% and maximum differences ranging between 16% and 21% among datasets that contain 2, 3, and 4 image dates.The considerable differences observed between unmixing models, therefore, indicated that image endmember classification performance was dependent on the type of forest inventory information used to select representative sunlit canopy and background components.

Contingency matrices of select classifications
The results described in the previous section highlighted that multi-temporal combinations improved tree species discrimination compared to single-date datasets, with a 72% agreement for the best multi-temporal classification compared to 46% agreement for the best single-date classification based on image endmembers (June dataset; Figure 7a) or a 52% agreement for the best reference endmember classification based on Van der Sluijs et al. (2016) (July dataset; Figure 7c).Substantial improvements in producer and user  accuracy were realized for jack pine (52% to 71% and 73% to 79%, respectively), white spruce (25% to 75% and 40% to 86%, respectively), and black spruce (20% to 60% and 13% to 38%, respectively) relative to the June classification (Tables 3 and 4).Noticeable differences in species distribution patterns were also observed between the June-only image classification and the best multi-temporal image classification (Figures 7a and 7b).A distinct pattern of trembling aspen was observed in the June image classification generated with the "Basalarea þ LowestCC" unmixing model, yet its low user accuracy of 36% suggested that this vegetation class was over-represented (Table 3).This over-representation was not present in the multi-temporal classification based on the "Basalarea þ PurestBg" unmixing model (trembling aspen user accuracy improved from 36% to 80%; Table 4).Furthermore, the detected species distribution patterns shown in Figure 7b follow documented ecological gradients and site preferences (see Study area).Multi-temporal image classifications were therefore clearly important for discrimination among dominant tree species in the study area.
Two components of disagreement, quantity, and allocation disagreement, were used to explain the reasons for classification disagreements based on information in the contingency matrices (Pontius and Millones 2011).The primary source of disagreement for the best multi-temporal image endmember classification (72% agreement, thus 28% disagreement) was due to a less than optimal match in the spatial allocation of the species classes (20% out of a total of 28%) rather than differences in quantities (8% out of 28%; Table 5).This indicated that the derived tree species distribution map (Figure 7b) reasonably depicted the proportional availability of tree species in the study area.It was suspected that compositionally mixed plots and crown closure density likely affected disagreement with ground validation.To test this, i.e., whether this overall disagreement was a result of the inclusion of compositionally mixed plots in the ground validation assessment, the same agreement estimates were produced for a subset of independent pure plots (>80% one species, n ¼ 26).A small improvement of 5% overall accuracy (77% vs. 72%) was realized as a result of a 5% reduction in quantity disagreement (not shown in Table 5).The percentage of allocation disagreement remained at 20%, indicating that mixed forest inventory plots were mapped at similar accuracies as compositionally pure plots and were consequently not the main cause of allocation disagreement.Instead, when classification accuracies were analyzed for plots grouped by crown closure, it became evident that very open stands were the most challenging to classify according to dominant species, and that MESMA performed very well in open to   medium-dense stands (accuracy !80%; Table 6).Confusion between tree species occurred mainly between jack pine and black spruce.This confusion was expected given that upland black spruce and jack pine are spectrally similar and often occur on the same well-drained sites with reindeer lichen understories (Van der Sluijs et al. 2016).

Discussion
Three main advances were reported in this study.First, this is the only documented study that we are aware of involving sub-pixel scale tree-species classification in a northern boreal forest context using multi-temporal, medium spatial resolution satellite imagery and imageendmembers within an SMA experimental design.Second, forest inventory information was helpful in identifying representative image endmember spectra for SMA-based classifications.Third, multi-temporal SMA represents different stages of vegetative phenology that improved the accuracy of dominant tree species classification compared to single-date approaches.

Image endmember spectra for classification purposes
In this study, existing forest inventory information facilitated a stratified-random plot design approach for deriving image-endmembers that considered the representativity of forest diversity.Unlike reference endmembers, image endmembers are selected at the same scale, units, and atmospheric conditions as the imagery (Drake et al. 1999) because they are extracted directly from the image.The levels of agreement from single-date image classifications at peak growth conditions were similar using endmembers obtained directly from the image versus reference endmembers acquired with a field spectroradiometer.This work indicated a high degree of flexibility in optimizing classification results by selecting representative sunlit   canopy and background image endmember spectra based on existing forest inventory information.Northern boreal forests, such as those occurring within the Taiga Plains consist of predominately open forest stands (i.e., crown closure generally <60%) whereby understory ground vegetation represents a significant component of pixel-level reflectance.In this study, the best results were obtained using background component image endmember spectra that were acquired just outside of forest cover, highlighting how these are useful for SMA-based classifications of tree species.
Approaches based on image-derived spectra can be more representative, advantageous, and operationally feasible compared to reference endmember approaches, especially in remote forested areas that are difficult to access.Although factors, such as phenology, bi-directional reflectance, calibration, and solar zenith angles can limit the portability of single-scene image endmembers between different images (Dennison and Roberts 2003a), they are usually used in the context of a single image (and thus the aforementioned factors do not apply).Bidirectional reflectance distribution function (BRDF) modeling and inversion of established canopy reflectance models can also be used to derive scene-specific image endmember spectral inputs that are adjusted for different scene physics and image attributes (e.g., adjusting image endmember spectra from one image for use with a different image with different scene acquisition properties, such as variable sun-sensor-surface geometry).The classification results obtained from single-date peak growth conditions (accuracies ranging between 33% and 46%) highlight the considerable challenges with identifying dominant tree species using either field reflectance or image-derived endmember spectra from single-date medium spatial resolution imagery alone.One future extension of SMA-based classifications in this study region is to apply referenceand image-endmembers to AVIRIS hyperspectral imagery collected as part of NASA's Arctic Boreal Vulnerability Experiment Airborne Campaign (Miller et al. 2019) as well future hyperspectral satellite missions (Cawse-Nicholson et al. 2021), which could allow for a comparison against single-date multispectral imagery datasets used in this work and explore the viability and possible further advantages of hyperspectral image data for which SMA and MESMA techniques are well suited.
The classification challenges encountered in this study also underscore the need for additional research into the capabilities of other medium spatial resolution sensors with improved radiometric and spectral resolution [e.g., Landsat-8 Operational Land Imager (Wulder et al. 2019), Sentinel-2 Multi-Spectral Instrument (Persson et al. 2018;Breidenbach et al. 2021;Hemmerling et al. 2021)] that could be applied to the large-area mapping of dominant tree species across the broader northern boreal landscape.Existing national-scale species mapping initiatives (Beaudoin et al. 2014;Hermosilla et al. 2022) relied on spatial imputation processes (k nearest neighbor or Random Forest) that inherently generalize patterns of tree species based on the limitations of spatial precision of the input data.For example, the national-scale map identified the entire study area as black spruce dominant (Figure 7d), whereas the MESMA outputs show greater diversity which properly corresponded with field inventory data (54% jack pine, 25% white spruce, 19% black spruce, 2% trembling aspen; Van der Sluijs et al. 2016) and documented ecological transitions.National scale products are not necessarily expected to be accurate at the scale of local studies, such as in this experiment.In this work we pursued experiments at small geographic extents using medium spatial resolution data to better understand whether the spatial heterogeneity of tree species in northern boreal forests can be retrieved with suitable precision using MESMA.Future extensions of this work need to increase the spatial footprint in its application and determine whether it can be used for dynamic monitoring of species composition.The MESMA implementation in this study weighted all spectral bands equally during model inversion, however, other SMAbased studies have shown how improvements may be realized via a band weighting scheme to emphasize differences in reflectance at particular wavelengths (Somers et al. 2009).Future work should thus also evaluate whether there are benefits to spectral band weighting schemes, such as emphasizing the visible spectrum that has been shown to be important for discriminating coniferous species (Van Aardt and Wynne 2007; Dalponte et al. 2013).

SMA-based classifications of multi-temporal imagery for northern boreal tree species
SMA-based models and multi-temporal imagery were used to derive empirical assessments of the occurrence of dominant tree species of both pure and mixed species stands occurring in open to dense canopies.The similarity of reflectance spectra of differing tree species and the large intra-species variation in reflectance spectra is well known to complicate the discrimination among different conifer tree species using imagery from medium-spatial resolution sensors (Treitz and Howarth 1999;Cochrane 2000).SMA-based models offer a potential solution herein, as while other approaches may need to resort to "mixed" class labels in case spectrally insignificant differences exist between species, MESMA selects the best fitting spectra even when tree species are spectrally similar but not the same (Somers and Asner 2014;Chance et al. 2016).MESMA thus represents a different, higherlevel approach allowing for greater sensitivity in producing candidate species identification outputs, despite needing only a small amount of training data.MESMA can be a computationally intensive technique, driving the need to better understand the tradeoff between improvements of classification accuracy (normally achieved with a larger number of image bands) and computational efficiencies (achieved with smaller number of image bands; Degerickx et al. 2019).Advances with modern computing platforms and cloud-based storage and processing continue to enable the processing requirements and this is seen as an advantage to further use of approaches, such as MESMA, not a barrier.Furthermore, one of the main research questions of this work identified that a subset of image dates (excluding peak growth conditions) can outperform an all-dates image combination, yielding higher computational efficiency to scale this approach to larger extents and reducing the need for image availability.One possible reason why multitemporal classification results were better without July imagery is the similarity of reflectance spectra between conifer tree species and between tree species-specific background components at the peak of the growing season.Imagery acquired at key phenological periods provides information that characterizes distinct spectral-temporal behavior among tree species.For example, boreal conifer species exhibit subtle foliar changes during seasonal transitions related to photosynthetic activity (Springer et al. 2017).Differences can also be attributed to temporal changes in reflectance of the background component which is often spectrally brighter and of larger surficial contribution of spectral response within a pixel than from the canopy (Bubier et al. 1997;Gerylo et al. 2002).In this study, phenological differences were exploited through multi-temporal imagery which improved the discrimination of dominant species in northern boreal zones and reached similar classification accuracies as reported by Wolter et al. (1995) and Peddle et al. (2004) in more southerly boreal forest settings.Even though Landsat TM has a limited spectral resolution compared to sources of hyperspectral imagery, SMA and multi-temporal imagery partially compensated for these limitations.
The improvements found in this study with multidate within-season imagery could be further extended by incorporating readily available imagery from multiple years, and including the same or similar withinseason coverage as used in our study within each given annual time-series.Pixel-based, cloud-free image composite time-series representing different stages of vegetative phenology are likely necessary in future pursuits of tree species discrimination at large spatial scales (White et al. 2014;Strickland et al. 2020;Wulder et al. 2021).Thompson et al. (2015) mapped the spatial distribution of six tree species over an area >39 million ha using a multiyear Best-Available-Pixel Landsat composite acquired at peak phenology to generate a cloud-free coverage of their study area.Hemmerling et al. (2021) concluded that using all available Sentinel-2 data for building dense time-series is mandatory for improved tree species mapping.These findings suggest there are advantages to developing methods that are suitable for Landsat [and similar] imagery based on phenology-optimized image fusion (Somers and Asner 2013) or using spectraltemporal image features (Pasquarella et al. 2018;Hemmerling et al. 2021), given its lengthy archive, broad spatial coverage relative to airborne sensors, and cost-effectiveness.Spectral-temporal image features, derived through very dense time-series, exploit key phenological indicators between tree species (e.g., start and end of photosynthetically active period, seasonal amplitudes of band reflectances).Recent studies have leveraged national-scale image composite timeseries for tree species mapping (Hermosilla et al. 2022), and further assessment would be of interest in northern boreal forest environments characterized by the challenges of a relative lack of available in-situ species information.Overcoming these complexities may be achieved by expanding the experiments and findings of this work within cloud-computing environments (Gorelick et al. 2017) and through crossdiscipline leverage of a growing NWT archive of drone mapping data (Fraser et al. 2017;Van der Sluijs et al. 2018, 2020;Carpino et al. 2021;Fraser et al. 2022) where detailed tree species information across a broad range of forest conditions is available.Future studies could evaluate if greater integration of spectral-temporal concepts and metrics within a MESMA framework would achieve improvements toward wallto-wall mapping of tree species occurring in northern boreal forest environments.

Conclusion
Precise information on tree species composition is critical for forest and land management, with variable success in the use of satellite data for this application.Determining the dominant species of open forest stands that typify northern boreal forests, such as within the Taiga Plains is particularly challenging from a remote sensing perspective when using conventional image classification methods.Spectral mixture analysis approaches provide an improved framework, however, it has been found that using spectra from a field spectroradiometer as reference endmember inputs are sub-optimal.This study investigated whether classification results can be improved by using image endmembers as an alternative to fieldmeasured endmember spectral inputs, and further, whether multi-temporal imagery could exploit phenological differences among tree species during the growing season.
This study found that the criteria used to identify image endmember spectra using forest inventory information, and what stages of vegetative phenology were included in the multi-date imagery were both important factors that affected tree species classification.The use of multiple endmember inputs to SMA and the optimization search protocols provided by MESMA were important to this study, and this is especially appealing over larger areas and longer time series with greater variability.The improvements using image endmembers for multi-temporal imagery resulted in overall agreement improvements to 72% compared to results from using reference field-based endmembers, which may or may not be satisfactory dependent on the specific natural resource or ecological application.No clear differences in overall agreement (35%-45%) were observed between reference-and image-derived endmembers for single-date imagery at peak growth conditions (July).The criteria used to select image endmember spectra of the sunlit canopy and background components affected the classification accuracy for multi-temporal imagery.The sunlit canopy endmember that yielded the best results was obtained using a basal-area purity criterion whereby the compositional diversity of forest stands was most accounted for, whereas spectra obtained just outside of the forest canopy best approximated the required background endmember.These image-based endmember spectral protocols are viable alternatives, even when field-based spectral measurements are available.
This work has presented new experimental observations in a challenging and understudied region of the boreal forest and applied innovative methodology (MESMA) for a more advanced application (tree species).The smaller study area was chosen to develop and test approaches within the reality of reduced site accessibility and data availability, with the goal in future work to expand to larger areas in the NWT.Further investigations are needed to evaluate whether additional classification improvements can be achieved by optimizing the selection of multi-temporal spectral signatures, band weighting, integrated time-series, and leveraging drone-based observations of tree species to expand calibration and validation networks.The overall appeal of this multi-temporal, MESMA-based approach is further amplified by the increased availability (and quality) of remote sensing data including extensive time series archives.The fundamental principle of sub-pixel scale analysis is particularly important for lower density open stands in northern boreal forests where the reflected energy from trees is typically a minority portion of the overall pixel signal.The ability to extend SMA to multiple sensors, dates (seasonal, interannual, and over longer time periods), larger geographic extents (possibly via cloud computing), and to incorporate greater, more representative variability through multiple within-class endmembers via MESMA is especially important for more complex applications, such as tree species classification and these approaches merit investigation for other applications across a broad range of ecological conditions.
the University of Lethbridge (P.I.'s P. Teillet, K. Staenz, and D. Peddle)-the Advanced Methods, Education and Training in Hyperspectral Science and Technology (NSERC CREATE AMETHYST) Program, as well as by in-kind contributions and funding support from Natural Resources Canada, the Canadian Forest Service Northern Forestry Centre (CFS-NoFC), and the Government of the Northwest Territories (GNWT).

Figure 1 .
Figure 1.Map of the study area in the Northwest Territories (Canada).Inset: QuickBird scene (Copyright DigitalGlobe Inc.) acquired August 2003 for reference.The study area transitions from alluvial flats along the Mackenzie River in the north to dry, coarse-textured soils as well as colder, poorly drained soils in the central and southern regions.

Figure 3 .
Figure 3. Classification agreement with field observations for single date datasets and different unmixing models using image-based endmember spectra (CC: crown closure; BG: background).The horizontal red dashed line represents the classification agreement (52%) obtained using reference endmembers (Van der Sluijs et al. 2016).Classifications based on May imagery resulted in unmixing RMSEs exceeding the 2.5% maximum threshold and thus lead to unclassified forest stands.

Figure 4 .
Figure 4. Classification agreement distributions and medians (black crossbar) by single-date and multi-date combinations.The number of months refers to the number of fused image dates used in the datacube.Individual classification results are provided (grey dots) to indicate maximum classification accuracy (maximum: 72% for 4 month datacube).The horizontal red dashed line represents the classification agreement (52%) obtained using reference endmembers (Van der Sluijs et al. 2016).Boxplots are not shown due to unequal sample size.

Figure 6 .
Figure 6.Classification agreement with field observations for single-date and multi-date classifications and different unmixing models using image-based endmember spectra.The horizontal red dashed line represents the classification agreement (52%) obtained using reference endmembers (Van der Sluijs et al. 2016).

Figure 7 .
Figure 7. Tree species distribution maps were generated using: (a) June image endmembers, (b) multi-temporal image endmembers, and (c) July reference endmembers (Van der Sluijs et al. 2016).Recent national-scale tree species distribution map by Hermosilla et al. (2022) for reference (d).EOSD refers to Earth Observation for Sustainable Development of Forests (Wulder et al. 2008).Insets are centered at 61.18 latitude and À117.65 longitude, and are $15 km wide.Species distributions in (b) follow documented ecological gradients from jack pine stands on dry coarse-textured soils and poorly drained sites populated by black spruce in the southern and central regions of the study area, to more productive white spruce and trembling aspen stands along the alluvial flats adjacent to the Mackenzie River.

Table 2 .
Plots selected for image endmember spectra.Tree species (Sw: white spruce, Sb: black spruce, Pj: jack pine, Aw: trembling aspen, La: larch) listed for each plot in decreasing order of occurrence with percent cover recorded with a subscript in 10% increments.† Stand height (Lorey's height) in meters; Crown closure in percent; Stem density in stems per hectare.

Table 3 .
Contingency matrix of the best single-date (June) image classification based on image endmembers.

Table 4 .
Contingency matrix of the best multi-temporal (all months except July) image classification based on image endmembers.

Table 5 .
(Van der Sluijs et al. 2016multi-temporal (MT) imagery for different unmixing models using image-based endmember spectra.McNemar's test results describe classification accuracy differences relative to single-date imagery at peak growth conditions (July) using reference endmembers(Van der Sluijs et al. 2016).

Table 6 .
Classification accuracy of the best multi-temporal image classification grouped by crown closure.