Classification of Tundra Vegetation in the Krkonoše Mts. National Park Using APEX, AISA Dual and Sentinel-2A Data

ABSTRACT The aim of this study was to evaluate and compare the suitability of aerial hyperspectral data (AISA Dual and APEX sensors) and Sentinel-2A data for classification of tundra vegetation cover in the Krkonoše Mts. National Park. We compared the classification results (accuracy, maps) of pixel-based (Maximum Likelihood, Support Vector Machine and Neural Net) and object-based approaches. The best classification results (overall accuracy 84.3%, Kappa coefficient = 0.81) were achieved for AISA Dual data using the per-pixel SVM classifier on 40 PCA bands. The best classification result for APEX, though, was only 1.7 percentage points lower. To obtain comparable results for Sentinel-2A, the classification legend had to be simplified; with the simplified legend, the accuracy using the MLC classifier reached 77.7%.


Introduction
Climate change, airborne pollution (e.g. deposition of nitrogen, sulphur, and heavy metals) and habitat loss have been identified as major current threats to biodiversity [Millennium Ecosystem Assessment, 2005; Kłos et al., 2015]. Tundra ecosystems (alpine treeless areas) belong to the most valuable natural phenomena worldwide. At the same time, biotopes above the treeline are very sensitive to various types of environmental factors. Changes can be very fast in these areas and their dynamics are a good indicator of different types of driving forces. Across the European arctic-alpine tundra areas, changes in vegetation composition and species diversity [Kapfer et al., 2013], species ranges [Felde et al., 2012], and altitudinal limits of trees and shrubs [Rundqvist et al., 2011; Treml et al., 2012] have been observed over the past several decades. Most of these changes have been explained by recent climate change (warming and reduction of the length of snow cover), or locally by changes in grazing practices or by other human impacts.
Besides botanical studies, which are usually limited in space and time, Earth observation has in recent decades become widely used to trace land cover changes and to identify anthropogenic pressures in protected areas [Nagendra et al., 2015]. Recently, new data, partly commercial and partly freely available, with very high spatial resolution, good spectral resolution and in some cases also good temporal resolution, have brought new possibilities for extensive land cover monitoring of the rather fragmented tundra biome. Many studies from different areas of the world focusing on the tundra ecosystem have been published over the last decades. Multispectral data were often used, applying pixel-based, object-oriented, and other special classification methods [Král, 2009; Atkinson and Treitz, 2012; Moody et al., 2014; Reese et al., 2014; Virtanen and Ek, 2014]. Time series of imagery from satellite and aircraft platforms were employed by Lin et al. [2012]. The potential of data fusion, a combination of radar data (PolSAR, TerraSAR-X and Radarsat-2) with multispectral Landsat 8 data, was tested by Ullman et al. [2014].
While the use of multispectral data (aerial imagery, VHR images like WorldView-2, Ikonos etc.) for vegetation classification has become rather common, utilization of hyperspectral aerial data, especially in the case of tundra, remains scarce. Field spectroscopy intended to distinguish among four arctic tundra plant communities at Ivotuk, Alaska was applied by Bratsch et al. [2016]. A two-step sparse partial least squares and linear discriminant analysis was used for community separation with rather high accuracy (55-94%). Halabuk et al. [2013] used field spectroscopy for alpine grasslands with the aim of identifying season-dependent relationships between spectral vegetation indices and aboveground phytomass. Zagajewski [2010] conducted mapping based on aerial hyperspectral data (DAIS 7915 and ROSIS) in the eastern part of the Tatra National Park (Slovakia) using neural net methods.
Remote sensing of vegetation in mountainous areas is rather difficult (great altitudinal differences, steep slopes, cloudy climate etc.) and it is a challenging topic also in the case of the tundra in the Krkonoše Mts. This unique southernmost relict area of the arctic-alpine tundra in Europe, located in the Czech Republic (50°N, 15°E), is included in the international tundra monitoring program (INTERACT: International Network for Terrestrial Research and Monitoring in the Arctic) [Soukupová et al., 1995; Jeník and Štursa, 2003]. Only a few studies using Earth observation for vegetation monitoring in the Krkonoše Mts. tundra have been published so far. Multispectral orthoimages and the maximum likelihood method were used by Müllerová [2005] to detect seven categories: Pinus mugo scrub, Nardus stricta stands, subalpine tall grasslands and tall-herb vegetation, vegetation along roads, roads, water areas, and wetlands. Hyperspectral data were used by Marcinkowska et al. [2014], who classified tundra and all other types of mountain vegetation in the Szrenica Mount region on the border between Poland and the Czech Republic in the Krkonoše Mts. National Park using APEX data and the Support Vector Machines classifier. Jarocińska [2016] carried out an analysis of the condition of the Krkonoše Mts.' meadows. She used APEX images as a base for remote sensing indices, which were verified by field-acquired data (chlorophyll content, absorption of photosynthetically active radiation, leaf area index, and evapotranspiration).
Our previous study [Suchá et al., 2016] compared the suitability of multispectral data with different spatial and spectral resolutions (orthoimages with an infrared band, WorldView-2 and Landsat 8) for classifications of vegetation above the treeline in the Krkonoše Mts. National Park. Nevertheless, no study using aerial hyperspectral data for the evaluation of tundra vegetation in the Krkonoše Mts. has been published yet. Since the Sentinel-2A satellite launch in 2015 there is an additional (freely available) data source with very good spatial, spectral and temporal resolution that should be examined and compared with the other above-mentioned available data in tundra vegetation cover classification and research.
The main aim of this study was to evaluate and compare the suitability of aerial hyperspectral data (AISA Dual and APEX sensors) and freely available Sentinel-2A data for classification of tundra vegetation cover in the Krkonoše Mts. National Park. Different classification methods (pixel- and object-based) were used to find out which classification algorithm for which type of data can bring the most accurate classification results. We expected that the best accuracy would be achieved using the hyperspectral data with the higher spatial and spectral resolution (AISA Dual). A further assumption was that in the case of Sentinel-2A data, with their limited spatial and spectral resolutions, some vegetation (especially grassland) categories would not be distinguishable.

Study Area
The highest parts of the Krkonoše Mts. National Park (KRNAP) rise above the treeline (1,300 m a. s. l.) and are covered by relict tundra. It covers an area of 47 km² (32 km² on the Czech territory, 15 km² on the Polish territory), which makes up 7.4% of the total Krkonoše Mts. area. As a result of palaeogeographical history [Treml et al., 2008; Margold et al., 2011], the Krkonoše Mountains represent a "biodiversity crossroads" where Nordic and alpine flora and fauna coexist [Jeník and Štursa, 2003]. Besides the mosses, lichens, and alpine heathlands, the prevailing vegetation types are: closed alpine grasslands dominated by Nardus stricta, subalpine tall grasslands, and Pinus mugo scrub [Chytrý et al., 2001].
Over the years, the arctic-alpine tundra in the Krkonoše Mts. was affected by human impacts: tundra was expanding due to local agricultural practices that included deforestation and grazing from the 9th century till the beginning of the 19th century [Lokvenc, 1995; Speranza et al., 2000; Novák et al., 2010]. Since the early 20th century this human impact has been reduced and the area became strictly protected as a nature reserve. At present, the tundra vegetation is being disturbed by occasional avalanches and debris flows. Fragmented tundra areas are characterized by depauperized species composition under long-term high loads of air-borne nitrogen and sulphur deposition. Observations from the Krkonoše Mts. suggest a recent spread of the grasses Molinia caerulea and Calamagrostis villosa [Hejcman et al., 2009]. In addition to the shifts in the distribution of herb species, expansion of prostrate dwarf pine (Pinus mugo) onto areas formerly covered by Nardus stricta, and expansion of Norway spruce [Harčarik, 2007; Treml et al., 2012] were also recorded.
The Krkonoše tundra has two spatially separated parts: the Western and the Eastern Tundra. The Western part is situated near Labská bouda and covers about 1,284 hectares. The Eastern part is located around Luční bouda, covering 2,284 hectares. Both parts of the tundra were examined in full using the Sentinel-2A data (Fig. 1). The computational demands of hyperspectral data processing required creation of a spatial subset in order to facilitate the classifications using AISA and APEX data. This area of interest is located in the Eastern Tundra and covers 656.5 hectares. The boundaries follow the contour lines 1,300 m a. s. l. and 1,475 m a. s. l. south of the Luční hora mountain and marked escarpments (Obří důl, deep valleys, etc.) to avoid unwanted shadows (see Figure 1).

Image data
Data from three sensors were used and compared in this study. The aerial hyperspectral image data are represented by two sensors with different spectral resolutions and very high spatial resolution: APEX (Airborne Prism EXperiment) data acquired on September 10, 2012, and AISA Dual data acquired on June 19, 2013. In addition to the aerial hyperspectral data, freely available satellite data from the Sentinel-2A sensor were used. The cloud-free Sentinel-2A image acquired on August 30, 2015 was downloaded from the Sentinel SciHub as a Level-1C product. Table 1 shows the basic features of all the data used.

Reference data
Reference data were collected during two periods: June 23-25, 2014 and July 7-11, 2015. In total, 123 polygons corresponding to vegetation categories as defined in the detailed legend (for the legend definition see the chapter Methods) were identified by a botanist in the field and recorded using a GNSS receiver (Trimble Geoexplorer 3000 Geo XT, accuracy 10 centimetres). Polygons corresponding to the categories Pinus mugo scrub, Picea abies stands, water areas, and block fields and anthropogenic areas were added later using manual vectorization based on visual interpretation of orthoimages acquired in June 2012. Samples for all categories were collected in accordance with the categories' spatial distribution and abundance equally across the entire study area.
The dataset collected in the field and completed with polygons added on the basis of visual interpretation of orthoimages was divided into training and validation parts. Many studies deal with the number of training samples (e.g. Camps-Valls et al. [2004], Pal and Mather [2005], Waske et al. [2010]) and recommend different shares of training and validation data. According to some of them (Pal and Mather [2005]), the SVM method works very well with a low number of training data (20-30% of the dataset), while this share is not ideal for other classification methods. Based on our experience (Zagajewski [2010], Suchá et al. [2016]), we used 40% of the data collected in the field for training and 60% for validation for all data types classified according to the detailed legend in the Eastern Tundra (Figure 4). Fifty-one polygons (11,388 m²) were used as the training dataset and the classification accuracies were assessed using seventy-two validation polygons (17,129 m²), both representing all eleven categories. As a next step, these training and validation datasets were adjusted to the different image data (rasterized to Regions of Interest and edited when needed); see the chapter Methods.
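The polygon-level 40/60 split described above can be sketched as follows. This is a minimal illustration with generic polygon identifiers; the authors prepared the split in their GIS environment, not with this code.

```python
import random

def split_polygons(polygon_ids, train_share=0.4, seed=42):
    """Split reference polygons into training and validation sets.

    The split is done at the polygon level (not the pixel level) so that
    pixels from one polygon never end up in both sets.
    """
    ids = list(polygon_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_share)
    return ids[:n_train], ids[n_train:]

# 123 reference polygons -> roughly 40% training / 60% validation
train, valid = split_polygons(range(123))
```

Note that a random polygon-level split does not by itself guarantee that all eleven categories appear in both sets; the study's datasets were checked for that property.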
Regarding the Sentinel-2A data classified according to the simplified legend, the training as well as the validation data were different (Figure 4). As Sentinel-2A data have a significantly larger pixel than the hyperspectral data used, the low number of training pixels could decrease the classification accuracy of the Sentinel-2A image. To increase the number of training data, polygons based on the visual interpretation of orthoimages were used for the training phase. This was possible because the categories of the simplified legend were recognizable from the orthoimages. The field dataset from 2014 (data available for the whole tundra area) was used for validation. The final dataset for the Sentinel-2A classification based on the simplified legend consisted of 7,057 pixels in 73 polygons (i.e. 705,700 m²) for training and 2,106 pixels in 110 polygons (i.e. 210,600 m²) for validation.

Classification legend
A detailed classification legend including the most important categories of grassland vegetation as well as other vegetation and non-vegetation categories was created in cooperation with the National Park botanists. The legend contains the following eleven categories:
1. Block fields and anthropogenic areas
2. Pinus mugo scrub (Mountain pine)
3. Subalpine Vaccinium vegetation (Blueberries, cranberries and bog bilberries)
4. Closed alpine grasslands*
4a. Nardus stricta stands (Matgrass)
4b. Species-rich vegetation with high cover of forbs
5. Subalpine tall grasslands*
5a. Calamagrostis villosa stands (Hairy reed grass)
5b. Molinia caerulea stands (Purple moor grass)
5c. Deschampsia cespitosa stands (Tufted hair grass)
6. Alpine heathlands
7. Wetlands and peat bogs
8. Water areas (not for Sentinel-2A)
* Group categories that were not classified
For pictures of the detailed legend categories see Figure 2. This detailed legend was used for classifications of all the image data (AISA Dual, APEX and Sentinel-2A) in the Eastern Tundra (see Figure 4). Moreover, the simplified legend with grouped categories, specifically created for Landsat 8 classification in Suchá et al. [2016], was used in this study for the Sentinel-2A image classification of the whole Krkonoše tundra area (Figure 4). We wanted to check whether the simplified legend can lead to better classification results than the detailed one in the case of Sentinel-2A data, which have lower spatial and spectral resolution in comparison with the other data types used.
The simplified legend contains the following eight categories:
1. Block fields and anthropogenic areas
2. Picea abies stands (Norway spruce)
3a. Pinus mugo scrub dense (more than 80% of total cover)
3b. Pinus mugo scrub sparse (30-80% of total cover)
4. Closed alpine grasslands dominated by Nardus stricta
5. Grasses (except Nardus stricta) and subalpine Vaccinium vegetation
6. Alpine heathlands
7. Wetlands and peat bogs

Image data pre-processing
The geometric, radiometric, and atmospheric corrections of the hyperspectral image data were conducted by the data providers. In the case of the APEX data it was the VITO company (Flemish Institute for Technological Research); the data pre-processing procedure is described in detail by Schaepman et al. [2015], Sterckx et al. [2016] and Vreys et al. [2016a; 2016b]. In the case of AISA Dual it was Geodis Brno; the pre-processing methods are summed up in an internal technical report [AISA, 2013] and include in-flight calibration data, field spectrometric measurements using an ASD FieldSpec 3, and a digital elevation model. ENVI/IDL and CaliGeoPro software were used to carry out the corrections. The geometric correction of the Sentinel-2A data was ensured by using the Level-1C product, which includes orthorectification that makes use of a Digital Elevation Model and transforms the image into cartographic coordinates [European Space Agency, 2015]. Atmospheric correction of the Sentinel-2A Top Of Atmosphere reflectance data was not carried out because it is not necessary when all the images are classified separately [Song et al., 2001].
Stacking of the AISA Eagle (400-1,000 nm, 254 spectral bands) and AISA Hawk (978-2,451 nm, 244 spectral bands) data into one file constituted the next step in hyperspectral data processing. In the part of the spectrum where the two sensors overlap, the data from AISA Eagle, which has the higher spectral and spatial resolutions, were kept. Consequently, the resulting image consists of 494 bands (out of the original 498) and its spatial resolution is 1 m. To cover the study area with the aerial hyperspectral data, mosaics from four flight lines of AISA Dual data and from two flight lines of APEX data (after exclusion of the zero-value bands) were created and colour balancing was performed. The edges of the flight lines with the biggest distortions were not included.
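The Eagle/Hawk stacking rule (keep the Eagle bands in the overlapping 978-1,000 nm region) can be illustrated with a small numpy sketch. The band-centre wavelengths below are synthetic, evenly spaced placeholders, not the real sensor band positions.

```python
import numpy as np

# Synthetic band-centre wavelengths (nm) standing in for the real sensors
eagle_wl = np.linspace(400, 1000, 254)   # AISA Eagle: 400-1,000 nm, 254 bands
hawk_wl = np.linspace(978, 2451, 244)    # AISA Hawk: 978-2,451 nm, 244 bands

# In the overlap region, keep Eagle: drop Hawk bands at or below Eagle's maximum
keep_hawk = hawk_wl > eagle_wl.max()
stacked_wl = np.concatenate([eagle_wl, hawk_wl[keep_hawk]])

# With this synthetic spacing, four Hawk bands fall inside the overlap,
# so the stacked cube keeps 494 of the 498 bands, as in the real mosaic
```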
Finally, to reduce data dimensionality and avoid information redundancy, Principal Component Analysis (PCA) was applied. This is a mathematical procedure that transforms correlated bands into a new image set of uncorrelated principal components. All image bands were transformed into PCA space. For classification purposes, the first few components containing almost all of the original information (5 PCA components in the case of APEX data, 7 in the case of AISA data) were used. Based on the results of tests with the number of bands entering the classification, 40 PCA-transformed bands were also used for the classification of both data types.
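The PCA transformation described above can be sketched in a few lines of numpy: centre the pixel spectra, eigendecompose the band covariance matrix and project onto the components with the largest variance. This is an illustrative sketch with random data, not the ENVI implementation used in the study.

```python
import numpy as np

def pca_reduce(pixels, n_components):
    """Project pixels (n_samples x n_bands) onto the first principal components."""
    centred = pixels - pixels.mean(axis=0)
    cov = np.cov(centred, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # largest-variance components
    return centred @ eigvecs[:, order]

# e.g. reduce a 494-band AISA Dual image (flattened to pixels) to 40 PCA bands
rng = np.random.default_rng(0)
cube = rng.normal(size=(500, 494))
reduced = pca_reduce(cube, 40)
```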

Reference data adjustment
In order to improve classification results based on the polygons collected in the field, the training polygons were visualized in the feature space of the first few principal components and edited. Firstly, the informational categories from the legend were divided into several spectral categories; e.g., in the case of the AISA Dual data, the category "block fields and anthropogenic areas" was divided into the spectral categories asphalt road, block fields, and buildings. The vegetation categories were also divided into more spectral categories where marked differences occurred. In total, 21 spectral categories were created for the AISA data and 23 for the APEX data. Secondly, outliers were erased from the clusters of all spectral categories so that these would consist of similar pixels. Feature spaces for the first two components of AISA and APEX before and after the edits are shown in Figure 3. Finally, the spectral categories were aggregated to form the categories shown in the legend.
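The removal of outlying training pixels from a spectral category's cluster can be sketched as a simple distance-to-centroid filter in PC space. The threshold used here (mean distance plus two standard deviations) is a hypothetical criterion chosen for illustration; the authors edited the clusters interactively in the feature-space view.

```python
import numpy as np

def drop_outliers(pc_pixels, n_std=2.0):
    """Keep only training pixels whose Euclidean distance to the category
    centroid in PC space is below mean + n_std * std of all distances."""
    centroid = pc_pixels.mean(axis=0)
    dist = np.linalg.norm(pc_pixels - centroid, axis=1)
    return pc_pixels[dist <= dist.mean() + n_std * dist.std()]

# A compact cluster of training pixels plus two obvious outliers
rng = np.random.default_rng(1)
cluster = rng.normal(0, 1, size=(200, 2))
cluster_with_outliers = np.vstack([cluster, [[10.0, 10.0], [-12.0, 9.0]]])
cleaned = drop_outliers(cluster_with_outliers)
```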
Because the large Sentinel-2A pixels do not follow the boundaries of the reference polygons identified in the field, the field dataset had to be adjusted for the Sentinel-2A classification (training and validation data for the detailed legend and validation data for the simplified legend) based on visual interpretation of orthoimages. When the area surrounding a polygon was identified as the same category as the polygon itself, the pixel was kept in the training/validation dataset. On the contrary, pixels that contained any different category were not used.
The last adjustment concerned water areas. Water areas were not classified from the Sentinel-2A image because they are very small: their extent is usually less than or equal to one Sentinel-2A pixel, and thus it was not possible to obtain enough pixels for training and validating the classifications.

Classification methods and parameters
Two approaches to classification were tested: per-pixel and object-based. As already shown in our previous study [Suchá et al., 2016], object-based classification can bring very good results for data with very high spatial resolution (e.g. orthoimages). On the other hand, for larger pixels this approach is not appropriate, as mentioned for example by Blaschke [2010]. In this study, the hyperspectral data (APEX, AISA Dual) were classified by the object-based approach as well as by pixel-based algorithms (Maximum Likelihood, Support Vector Machine, Neural Network). Sentinel-2A data were classified only by per-pixel algorithms. All the algorithms were performed using ENVI 5.3 software and are described in detail in the following text. Parametrization of the classification methods was in general performed in accordance with many analogous studies (Pal and Mather [2005], Petropoulos et al. [2012], Zhou and Yang [2008]) based on a set of tests. We ran all the classification algorithms many times, testing different numbers of bands and different parameter settings (always changing only one parameter gradually while the other parameters stayed the same); the final sets of parameter combinations that produced the best (or very good) classification results for each classification method are introduced in the chapter Results (see also Tabs. 2 and 3).
Regarding the original bands used as inputs to the classification processes, the following bands were used: AISA Dual: all 494 bands; APEX: all bands excluding the zero-value bands no. 150-153 and 201-212; Sentinel-2A: all 10 m and 20 m bands (i.e. 10 bands; the 60 m bands intended for atmosphere evaluation were not used). In addition, for the hyperspectral data, 5 and 40 PCA-transformed bands for APEX, and 7 and 40 PCA-transformed bands for AISA Dual, were used for the classifications.
The above-described phases of the data analysis are schematically shown in Figure 4.

Maximum Likelihood classification (MLC)
To apply this widely used algorithm successfully, two conditions have to be met. First, the image data should show a normal distribution [Fernandez-Prieto et al., 2006]. Second, the training samples' statistical parameters (e.g., mean vector and covariance matrix) should truly represent the corresponding land cover category [Duarte et al., 2005].
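Under the normality assumption, MLC estimates a mean vector and covariance matrix per class from the training samples and assigns each pixel to the class with the highest Gaussian log-likelihood. A minimal numpy sketch of this decision rule (with synthetic two-class data, not the study's training set):

```python
import numpy as np

def mlc_fit(samples_by_class):
    """Estimate a mean vector and covariance matrix for each class."""
    return {c: (s.mean(axis=0), np.cov(s, rowvar=False))
            for c, s in samples_by_class.items()}

def mlc_predict(model, pixels):
    """Assign each pixel to the class maximizing the Gaussian log-likelihood."""
    classes, scores = list(model), []
    for c in classes:
        mean, cov = model[c]
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        d = pixels - mean
        # log-likelihood up to a constant: -0.5 * (log|C| + Mahalanobis distance)
        scores.append(-0.5 * (logdet + np.einsum('ij,jk,ik->i', d, inv, d)))
    return np.array(classes)[np.argmax(scores, axis=0)]

rng = np.random.default_rng(2)
train = {'grass': rng.normal(0, 1, (100, 3)), 'scrub': rng.normal(5, 1, (100, 3))}
model = mlc_fit(train)
pred = mlc_predict(model, np.array([[0.1, 0.2, -0.1], [5.2, 4.9, 5.1]]))
```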
The MLC classifier was used for the Sentinel-2A classification and for the PCA-transformed APEX and AISA Dual data. ENVI software allows setting just one parameter, the probability threshold, which was not used.

Support Vector Machines algorithm (SVM)
The SVM algorithm is a machine learning classification algorithm that belongs among supervised non-parametric methods, which means that no particular data distribution (e.g. a normal distribution) is required. It is based on statistical learning theory and aims to find the best hyperplane in a multidimensional feature space that optimally separates categories. The term "best hyperplane" refers to a decision boundary obtained in the training step that minimizes misclassifications. The training samples used for the construction of the hyperplane are called support vectors. These lie on the margins of the categories to be classified and are extracted automatically by the algorithm [Camps-Valls et al., 2004; Jones and Vaughan, 2010; Mountrakis et al., 2011; Petropoulos et al., 2012].
All three kernel types available in ENVI software (radial basis function (RBF), linear, polynomial) were tested. The best results were achieved using the RBF kernel. The two basic parameters affecting the classification results are the gamma in the kernel function and the penalty parameter. The penalty parameter controls the trade-off between allowing training errors and forcing rigid margins; increasing its value increases the cost of misclassifying points and causes ENVI to create a more accurate model that may not generalize well (https://www.harrisgeospatial.com/docs/supportvectormachine.html). After a series of tests evaluated by classification accuracy, the default parameters were used, similarly to Petropoulos et al. [2012]: the kernel gamma was set to the inverse of the number of bands in the input image (e.g. 0.025 for 40 PCA bands) and the penalty parameter was 100. The classification probability threshold and pyramid levels were not used.
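The parameter choices above (RBF kernel, gamma equal to the inverse of the number of bands, penalty parameter 100) can be reproduced outside ENVI, for example with scikit-learn's SVC. The training data here are synthetic stand-ins for the classified pixels.

```python
import numpy as np
from sklearn.svm import SVC

n_bands = 40                        # e.g. the 40 PCA-transformed bands
clf = SVC(kernel='rbf',
          gamma=1.0 / n_bands,      # inverse of the number of input bands = 0.025
          C=100)                    # penalty parameter, as used in the study

# Tiny synthetic two-class problem standing in for training pixels
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(0, 1, (50, n_bands)), rng.normal(4, 1, (50, n_bands))])
y = np.array([0] * 50 + [1] * 50)
clf.fit(X, y)
```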

Neural Network algorithm (NN)
The NN algorithm also belongs among machine learning non-parametric methods. It is designed to simulate the human learning process by establishing linkages between input and output data via one or more hidden layers. The basic unit of each layer is called a neuron (node) [Benediktsson et al., 1990]. The classic model of a feed-forward multilayer neural network, known as the multilayer perceptron (MLP), features fully-connected neurons among all layers (input, output, and hidden), which means that each neuron of a given layer feeds all neurons in the next layer [Camps-Valls et al., 2004]. This model is used in our processing tool, the ENVI 5.3 software.
To ensure maximal classification accuracy, different NN architectures and the full set of ENVI parameters were tested during the network training and optimization. The tests were based mainly on our previous experience [Suchá et al., 2016] and recommended settings from the literature (for example Ndehedehe et al. [2013], Zhou and Yang [2008], Wan-Kadir et al. [2011]). Finally, a single-layer feed-forward architecture, the logistic activation function and the parameter configurations shown in Table 2 were used. The first configuration (the ENVI default), which performed best for the simplified legend, uses a low training rate, which decreases the risk of oscillations or non-convergence of the training result but increases the computation time necessary for the algorithm's training phase. The ENVI default setting, on the other hand, uses a high value of training momentum, which takes larger steps during training than lower momentum values and encourages weight changes along the current direction. A setting with the opposite values of these two important parameters (training rate and training momentum) provided the best results for classifications based on the detailed legend. The training threshold contribution parameter, which adjusts the changes to a node's internal weights, remained unchanged at a quite high level, which can result in better classification but worse generalization (https://www.harrisgeospatial.com/docs/neuralnet.html).
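The two parameters discussed above map directly onto the learning rate and momentum of stochastic gradient descent. As an illustration outside ENVI, a single-hidden-layer MLP with a logistic activation can be configured the same way in scikit-learn; the architecture and values below are illustrative, not the study's exact Table 2 settings.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Single hidden layer, logistic activation, SGD with an explicit
# learning rate ("training rate") and momentum ("training momentum")
net = MLPClassifier(hidden_layer_sizes=(10,),
                    activation='logistic',
                    solver='sgd',
                    learning_rate_init=0.1,   # training rate (illustrative)
                    momentum=0.9,             # training momentum (illustrative)
                    max_iter=1000,
                    random_state=0)

# Tiny synthetic two-class problem standing in for training pixels
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(3, 1, (50, 5))])
y = np.array([0] * 50 + [1] * 50)
net.fit(X, y)
```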

Object-based image classification
Object-based image analysis (OBIA) works with homogeneous clusters of pixels called segments. Segments are areas generated by one or more criteria of homogeneity. Thus, compared to single pixels, segments include additional spectral information (e.g. mean values per band, minimum and maximum values, mean ratios, variance etc.) [Blaschke, 2010].
The edge algorithm, in which images are divided on the basis of Sobel's method of edge detection, was used for image segmentation. The edge segmentation algorithm is recommended for optical image data in the manual of the software used (http://www.harrisgeospatial.com/docs/FXRuleBasedTutorial.html). The segmentation parameters had to be set in accordance with the spatial resolution and so as to avoid the overlap of two different training polygons into one segment. After testing several combinations, the Scale Level was set to 60 and the Merge Level to 50 for AISA Dual, and to 50 and 85 respectively for APEX.
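Sobel edge detection, the basis of the edge segmentation algorithm, convolves the image with horizontal and vertical gradient kernels and combines them into an edge-magnitude image whose ridges delineate segment boundaries. A minimal numpy sketch (ENVI's actual segmentation additionally thresholds and merges the resulting regions):

```python
import numpy as np

def sobel_magnitude(img):
    """Edge magnitude from the two 3x3 Sobel gradient kernels (valid region only)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]])
    ky = kx.T
    h, w = img.shape
    gx = np.zeros((h - 2, w - 2))
    gy = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            patch = img[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()   # horizontal gradient
            gy[i, j] = (patch * ky).sum()   # vertical gradient
    return np.hypot(gx, gy)

# A step edge between two homogeneous "segments"
img = np.hstack([np.zeros((6, 4)), np.ones((6, 4))])
edges = sobel_magnitude(img)
```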
Segmentation was carried out using all bands, and also using 40 PCA bands for APEX and 7 PCA bands for AISA Dual, in ENVI software with the Feature Extraction extension. The applied parameters are listed in Table 3.
Based on our previous experience [Jakešová et al., 2014], the example-based approach, which sorts segments into pre-defined categories using training areas (segments) and selected attributes, was employed using the support vector machine algorithm with the radial basis function (RBF) kernel and default settings. In accordance with Jakešová et al. [2014], the example-based approach was able to classify all categories of the detailed legend, while detection of some grassland categories (Species-rich vegetation with high cover of forbs, Calamagrostis villosa stands, Deschampsia cespitosa stands) was not possible using the rule-based approach.
Texture is one of the important characteristics in the identification of image objects [Haralick et al., 1973] and textural parameters may contribute to classification accuracy improvement, especially in the case of low spectral variability between the classified categories. Therefore, besides spectral parameters, textural parameters were also extensively tested with the intention of better distinguishing spectrally similar grassland categories. We used the following spectral and textural attributes for all input bands (the same as for the segmentation process): Spectral Mean, Spectral Max, Spectral Min, Spectral STD, Texture Range, Texture Mean, Texture Variance, and Texture Entropy.
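The listed attributes can be sketched for a single segment and band as follows. This is a simplified per-segment analogue: ENVI computes its texture attributes over a moving kernel and averages them per segment, and the 16-bin histogram used here for the entropy is an illustrative choice.

```python
import numpy as np

def segment_attributes(band_values):
    """Spectral and simple textural attributes of one segment in one band,
    analogous to the attribute set used in the object-based classification."""
    v = np.asarray(band_values, dtype=float)
    hist, _ = np.histogram(v, bins=16)          # illustrative binning for entropy
    p = hist[hist > 0] / v.size
    return {
        'spectral_mean': v.mean(),
        'spectral_min': v.min(),
        'spectral_max': v.max(),
        'spectral_std': v.std(),
        'texture_range': v.max() - v.min(),
        'texture_variance': v.var(),
        'texture_entropy': float(-(p * np.log2(p)).sum()),
    }

# Reflectance values of a small hypothetical segment
attrs = segment_attributes([0.21, 0.25, 0.22, 0.27, 0.24, 0.23])
```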

Results
The results are summarized in Tables 4-9; error matrices for the best classifications of each data type are given in the Appendix.
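The overall accuracies and Kappa coefficients reported below are both derived from the error (confusion) matrix. A small numpy sketch of the two formulas, applied to a hypothetical 2-class matrix (not taken from the paper's Appendix):

```python
import numpy as np

def accuracy_and_kappa(cm):
    """Overall accuracy and Cohen's Kappa from a confusion matrix
    (rows = reference categories, columns = classified categories)."""
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    observed = np.trace(cm) / n                                 # overall accuracy
    expected = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / n**2   # chance agreement
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical 2-class error matrix
oa, kappa = accuracy_and_kappa([[80, 10], [5, 55]])
```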

Per-pixel classification -APEX data
The best classification results for the APEX data were obtained by the SVM classifier for forty components of PCA (Table 4); the overall accuracy reached 82.59% (Kappa coefficient = 0.79). When the different categories of the legend are compared (Table 9), the categories "Block fields and anthropogenic areas", "Water areas" and "Pinus mugo" show the best results: the user's and producer's accuracies exceeded 90% in all cases. On the contrary, the category "Subalpine Vaccinium vegetation" shows the worst results of all; though the user's accuracy equalled 100%, the producer's accuracy reached only 8.5%. The most common overlaps were between grassland categories. A good result was also obtained by SVM classification of all APEX bands: the overall accuracy reached 77.7% (Kappa coefficient = 0.74), while the MLC classifier failed and did not produce any satisfactory results.

Per-pixel classification -AISA Dual data
The AISA Dual data, which have the highest spatial and spectral resolutions in this study, gave good results with overall accuracies above 70% for all classification methods (Table 5). The best results were obtained by the SVM classifier, especially when 40 components of the PCA transformation of the original data were used (overall accuracy 84.31%). On the other hand, with 40 components of PCA the Maximum Likelihood Classifier did not work well: a lot of unclassified places remained in the output. The SVM algorithm was also tested on the original set of 494 bands, but the process proved to be quite time-consuming and the results were no better than with the transformed data.
As expected, the categories "Block fields and anthropogenic areas", "Water areas" and "Pinus mugo" showed the highest classification accuracies (Table 9). On the contrary, grassland categories were often mixed together. Considering all the classifications, the worst results were obtained for "Subalpine Vaccinium vegetation" and "Species-rich vegetation with high cover of forbs". The category "Calamagrostis villosa stands" showed a very low producer's accuracy.

Per-pixel classification -Sentinel-2A data
The Sentinel-2A pixel size (10 and 20 metres) does not wholly meet the requirements for the detailed classification of tundra vegetation in the Krkonoše Mts. National Park. The overall accuracies reached 50 to 60% (Table 6). Only the categories "Block fields and anthropogenic areas" and "Pinus mugo" did not show classification problems (Table 9). The low abundance of the category "Species-rich vegetation with high cover of forbs" in the observed area and its location along the paths in the Eastern Tundra determined its worst results. Looking at the best classification output, obtained by the NN classifier, the category "Molinia caerulea stands", with the lowest producer's accuracy, was often classified as "Deschampsia cespitosa stands" and "Subalpine Vaccinium vegetation". The categories "Alpine heathlands", "Calamagrostis villosa stands" and "Nardus stricta stands" also show low producer's accuracies and not very high user's accuracies.
To get better results from the Sentinel-2A data, the simplified legend, which is more suitable for the pixel size of the data, was tested (Table 7). The categories "Pinus mugo scrub dense", "Alpine heathlands", "Picea abies stands", and "Block fields and anthropogenic areas" were classified best, with user's and producer's accuracies reaching more than 80%. On the contrary, the category "Pinus mugo scrub sparse" shows a rather low user's accuracy (40%) although its producer's accuracy equalled almost 90%. This means that "Pinus mugo scrub sparse" was over-classified, largely at the expense of "Wetlands and peat bogs", "Grasses (except Nardus stricta)" and "Subalpine Vaccinium vegetation". It illustrates well the problems brought by the larger pixel size: "Pinus mugo scrub", of course, does not respect pixel edges, and the surrounding vegetation included in the pixel (mixel) affects the final reflectance.

Object-based approach - APEX and AISA Dual data
Object-based classification was used for the APEX and AISA Dual data in the Eastern Tundra (detailed legend). Table 8 clearly shows that the AISA Dual data (7 PCA components) yield better results than the classifications of the APEX data. The overall accuracy reached 80.66% (Kappa coefficient = 0.77). The best classification results were again achieved for the categories "Block fields and anthropogenic areas", "Water areas" and "Pinus mugo"; both the user's and producer's accuracies exceeded 95% in all cases (Table 9). By contrast, the category "Species-rich vegetation with high cover of forbs" showed the worst results; the most common confusion was with "Deschampsia cespitosa stands". The grassland categories "Nardus stricta stands" and "Calamagrostis villosa stands" showed very good results, with producer's and user's accuracies ranging between 73% and 83%.

Results summary
When comparing the APEX, AISA Dual, and Sentinel-2A data and the classification methods used, the best overall accuracy was reached for the AISA Dual data using per-pixel SVM classification of 40 PCA bands (overall accuracy 84.3%, Kappa coefficient = 0.81). The best classification result for APEX was only 1.7 percentage points lower. The best object-based classification result (for AISA Dual) had an overall accuracy only about 4 percentage points lower than that of the per-pixel classification, and about 10 percentage points higher than the object-based result for the APEX data. The best Sentinel-2A classification with the same detailed legend as for the hyperspectral data (NN classifier) reached an accuracy about 26 percentage points lower than the best classification result (AISA Dual). On the other hand, with the simplified legend the accuracy of the MLC classifier reached 77.7%.
As for the detailed legend categories (Table 9), "Block fields and anthropogenic areas", "Water areas", and "Pinus mugo" were classified best, with the highest user's and producer's accuracies for both the per-pixel and object-based approaches. Among the grassland categories, rather poor results were obtained for "Species-rich vegetation with high cover of forbs" in both pixel-based and object-based classifications of all the data types. We have to emphasize, however, that two grassland categories, "Nardus stricta stands" and "Deschampsia cespitosa stands", were classified satisfactorily from the hyperspectral data by both pixel-based and object-based methods. OBIA produced the best result for "Calamagrostis villosa stands", much better than the pixel-based classifications. "Molinia caerulea stands" were classified best from the AISA Dual data by the pixel-based approach. In general, worse results were obtained for the grassland categories that often became mixed ("Calamagrostis villosa stands", "Molinia caerulea stands"). Such results, however, are not surprising from the botanical point of view, as these categories represent grasses of similar plant habitus.
As for the non-grassland vegetation categories, good results were obtained from the hyperspectral data with all the methods for "Alpine heathland", while "Subalpine Vaccinium vegetation" was classified poorly. The category "Wetlands and peat bogs" was classified very satisfactorily even from the Sentinel-2A data.
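All the accuracy measures compared above (overall accuracy, Kappa coefficient, producer's and user's accuracy) are derived from a confusion matrix of reference versus classified pixels. A minimal sketch with an invented 3-class matrix (the numbers are illustrative, not from the study's tables):

```python
import numpy as np

# Hypothetical confusion matrix: rows = reference classes,
# columns = classified (map) classes. Values are illustrative only.
cm = np.array([[50,  5,  5],
               [10, 40, 10],
               [ 0,  5, 55]], dtype=float)

n = cm.sum()
overall_accuracy = np.trace(cm) / n

# Producer's accuracy: correctly classified pixels / reference totals
# (per row); measures omission errors.
producers = np.diag(cm) / cm.sum(axis=1)

# User's accuracy: correctly classified pixels / map totals (per column);
# measures commission errors (over-classification).
users = np.diag(cm) / cm.sum(axis=0)

# Cohen's Kappa: agreement corrected for chance, using the expected
# proportion of agreement from the row and column marginals.
expected = (cm.sum(axis=1) * cm.sum(axis=0)).sum() / n**2
kappa = (overall_accuracy - expected) / (1 - expected)
```

A high producer's accuracy with a low user's accuracy, as reported above for "Pinus mugo scrub sparse", corresponds to a class whose column total far exceeds its diagonal entry.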

Map outputs
Figures 5 and 6 show the final land cover maps for the best classification results. Figure 5 combines the outputs of the per-pixel and object-based classifications. The comparison shows that the per-pixel and object-based outputs for the same data type are very similar (e.g. both classifications for APEX and both for AISA Dual), whereas the APEX and AISA Dual outputs differ from each other. This difference may have been partly caused by problematic colour balancing at the borders of the two APEX flight lines, and possibly also by the seasonal abundance of Calamagrostis villosa, as the APEX and AISA data were acquired in different periods of the year. The outputs of the Sentinel-2A MLC classification for both the Eastern and Western parts of the tundra, based on the simplified legend, are shown in Figure 6. This output cannot be compared with the maps in Figure 5, as the legend is different. However, the comparison of the Sentinel-2A classification map with the Landsat 8 classification map (refer to Suchá et al. [2016]) shows that the spatial distribution of the respective categories is almost identical. The Sentinel-2A map is obviously more detailed due to its higher spatial resolution.

Discussion
When evaluating the suitability of different data for tundra vegetation cover classification, one has to consider, among other factors, the differences in acquisition dates. Consequently, the classification results may have been influenced by varying weather conditions and also by seasonality, in the sense of changing meteorological conditions during one growing season. The ongoing climatic changes may result in dramatic changes of tundra vegetation patterns; thus, accurate and cost-effective approaches to monitoring vegetation types using Earth observation methods are in high demand. Vegetation categories in the tundra tend to be rather compact, with short height, during spring and autumn, while in summer (July, August) the grassland vegetation advances according to species-specific phenology and different types of grasses are mixed together in one pixel. The onset of flowering may also influence the spectral signatures of some species. For this reason, in our opinion, better acquisition dates in the tundra may be late spring/early summer or late summer/early fall, rather than mid-summer (end of July to first half of August). Unfortunately, it is practically impossible to acquire all the required data within one year, or even within one season or month. As we have worked in our study area for several years, we were able to collect a rather extensive data series. Although the data are from different years and seasons, our results are remarkable, because studies using, or even comparing, aerial hyperspectral data for tundra ecosystems are rare, and no study using Sentinel-2A data for tundra has been published yet.
Our results for aerial hyperspectral imagery can be compared with other studies that use hyperspectral images for the classification of mountain vegetation. Marcinkowska et al. [2014] classified vegetation communities (15 categories) in the Szrenica Mountain region on the border between Poland and the Czech Republic in the Krkonoše Mts. National Park using APEX data and an SVM classifier. The overall classification accuracy (79.13%) as well as the Kappa coefficient (0.77) reached slightly lower levels than those in our research. Zagajewski et al. [2010] mapped the eastern part of the Tatra National Park, Poland, with aerial hyperspectral data (DAIS 7915 and ROSIS sensors). They focused on the mountain vegetation of the subalpine, alpine, and sub-nival zones, utilizing the maximum likelihood and neural net methods. The NN classifier identified 39 vegetation categories with an overall accuracy of 86%.
No research comparing hyperspectral and multispectral data classification for the tundra ecosystem has been carried out yet. Feilhauer et al. [2014] dealt with Natura 2000 habitat types in a boggy area in Bavaria, Germany, comparing airborne spectroscopy data (AISA Dual), resampled spectrally and spatially, with the characteristics of two state-of-the-art multispectral sensors (RapidEye and Sentinel-2). The classification models resulted in a calibration overall accuracy of 64% for the AISA Dual data, 62% for RapidEye, and 59% for the Sentinel-2 data. Our results for AISA Dual and Sentinel-2A data were better; the legends, however, are not fully comparable.
The results for hyperspectral data from our study can be compared with the results of classifications of multispectral data with very high spatial resolution published by our team in a previous study [Suchá et al., 2016]. In Suchá et al. [2016], orthoimages (blue, green, red, NIR bands; spatial resolution 0.125 m) were classified by an object-based classifier (SVM algorithm), and WorldView-2 data were classified with per-pixel (SVM, NN, MLC) and object-based (SVM) classifiers. The same detailed legend as in this study was applied, but in the Western Tundra. The best overall classification accuracy for the orthoimages was 71.96%, with a Kappa coefficient of 0.65. For the orthoimages (spatial resolution 0.125 m), OBIA achieved better results than all the pixel-based classifiers used (MLC, NN, SVM), even better than those for the APEX data classified by OBIA in our study. In the case of the WorldView-2 data, the best results were also obtained by the object-based approach: overall accuracy 66.5%, Kappa coefficient 0.6. It is obvious that the use of hyperspectral data brings the best results as regards vegetation cover classification in the tundra of the Krkonoše Mts. National Park. The overall accuracy was higher by 12 percentage points for AISA Dual, and by 10 percentage points for APEX data, compared with the orthoimages. For the WorldView-2 imagery, the difference is even almost 18 and 16 percentage points, respectively (refer to Suchá et al. [2016]).
References dealing with the same or similar categories as we used are scarce; therefore, the comparison of results is limited. Müllerová [2005] classified "Pine stands" by MLC and ISODATA classifiers with producer's and user's accuracies around 70%, and "Grass communities" with accuracies around 60%. Our results were significantly better for both categories. Suchá et al. [2016] used different classifiers for multispectral data and orthoimages; the accuracies of "Pinus mugo scrub" were comparable to those reached in our analysis, while the results for most of the vegetation categories were significantly worse than ours. It is interesting that the category "Subalpine Vaccinium vegetation" was problematic in our case, while Suchá et al. [2016] obtained producer's and user's accuracies above 70% for orthophotos and WorldView-2 imagery. Our results may have been influenced by the above-mentioned dates of acquisition or by the training data. In comparison with the orthophotos and multispectral data in Suchá et al. [2016], the hyperspectral data revealed significant improvements in classification accuracy for some grassland categories ("Nardus stricta stands" and "Deschampsia cespitosa stands" for per-pixel classification and OBIA, and "Calamagrostis villosa stands" for OBIA).

Conclusions
Three types of data were classified using different classifiers to evaluate and compare their suitability for tundra vegetation cover classification. In accordance with our expectation, the best classification results for tundra vegetation in our study were achieved for the hyperspectral data with the highest spectral and spatial resolution, i.e. the AISA Dual data. However, the overall accuracy, and also the results for most of the individual categories, in the case of the APEX data are comparable, and it can be concluded that the higher spatial and spectral resolutions of the AISA Dual data brought only a moderate improvement. Similarly to many older and recent studies [for example Camps-Valls et al., 2004; Pal and Mather, 2005; Petropoulos et al., 2012], the best results for both types of hyperspectral data were achieved using the SVM classification algorithm. As for the Sentinel-2A data, especially in the case of the simplified legend, the NN and MLC methods achieved better results than SVM. As for OBIA, the results of this study support our conclusion from Suchá et al. [2016] that one of the main attributes increasing the accuracy of this method is spatial resolution.
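The best-performing combination reported above, a per-pixel SVM applied to PCA-reduced hyperspectral bands, can be sketched with scikit-learn. The random data, band count, class count, and default SVM parameters below are placeholders; they are not the study's actual data or tuned settings.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# Placeholder training pixels: 500 samples x 490 spectral bands
# (AISA Dual-like dimensionality), with integer class labels for
# a hypothetical 9-category legend.
X = rng.normal(size=(500, 490))
y = rng.integers(0, 9, size=500)

# Reduce the spectra to 40 principal components, then classify with an
# RBF-kernel SVM (40 PCA bands + SVM gave the study's best result;
# the hyperparameters here are library defaults, not the tuned values).
clf = make_pipeline(StandardScaler(), PCA(n_components=40), SVC(kernel="rbf"))
clf.fit(X, y)

# In practice the fitted pipeline is applied to every image pixel,
# yielding one legend category per pixel.
labels = clf.predict(X)
```

The PCA step both suppresses noise in the hundreds of correlated hyperspectral bands and keeps the SVM's training cost manageable, which is why dimensionality reduction typically precedes classification in such workflows.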
It was supposed, and proved, that Sentinel-2A data with 10/20-metre spatial resolution and 10 spectral bands cannot reach the accuracy of the APEX and AISA Dual data for the detailed legend. However, the rather high overall accuracy for the simplified legend is quite promising. The assumption that some grassland categories would not be distinguishable in the Sentinel-2A data was confirmed. This supports our conclusion (Suchá et al. [2016]) that it is not appropriate to use the same classification legend for data with significantly (order of magnitude) different spatial and spectral resolutions.
Our results of the Sentinel-2A classification are unique, as they are among the first outputs using real (not simulated) data, and therefore they cannot be compared with other studies dealing with Sentinel-2A data. We assume that the Sentinel-2A classification accuracy could be increased by using a series of images acquired during one season. The detection of Pinus mugo could be significantly improved by analysing images from spring or autumn (when the surrounding grassland is dry). A series of images could also help to distinguish among the "green" non-forest grass vegetation categories (Molinia caerulea, Calamagrostis villosa, and others) that have different seasonal behaviour (growth, flowering, senescence).
All the data types used can be assessed as suitable for the classification of tundra vegetation cover and for the monitoring of the ongoing vegetation changes [Hejcman et al., 2009; Harčarik, 2007; Treml et al., 2012], provided that an appropriate legend and classification method are used. The achievable level of detail, however, depends on the spatial and spectral resolution, and an increase in classification accuracy for some grassland categories is desirable. An analysis of time series for the further improvement of classification accuracy is applicable especially to Sentinel-2A data because of its very good temporal resolution; in the case of commercial data, however, it is problematic. For aerial hyperspectral data, improvements can be achieved through experiments with training and validation datasets, additional classification inputs such as DEM and DSM, field spectral signatures, combinations of classification methods, or new concepts such as those of Waske et al. [2010] or Magiera et al. [2015].