Average annual and seasonal Land Surface Temperature, Spanish Peninsular

ABSTRACT The first long-term Land Surface Temperature (LST) maps for the Peninsular Spain at annual and seasonal time scales for 1981–2015 is presented in this work. A robust protocol for correcting and calibrating NOAA-AVHRR images and computing LST datasets at the spatial resolution of 1.1 km has been used. Simultaneously, maximum air temperature (Tmax) maps at the same spatial resolution have been produced using data from meteorological stations. The comparison between the two datasets resulted in statistically significant spatial correlations at annual and seasonal scales. Finally, the Normalized Difference Vegetation Index (NDVI) data were also compared with the obtained LST datasets and the results showed significant negative correlations between the two variables, especially in summer.

LST can be obtained by large wide infrared radiometric information from different sensors on board of satellite platforms, some of them recording this data from the early of the 1980s, being the LST from the National Oceanic and Atmospheric Administration Advanced Very High Resolution Radiometer data (NOAA-AVHRR hereafter) the most widely used to obtain long-term time series of the sea surface temperature (SST, e.g. Reynolds, Rayner, Smith, Stokes, & Wang, 2002). In continental areas, there are larger problems to retrieve LST given the diversity of land cover types with different emissivity (Becker & Li, 1990;Qin, Dall, Karni, & Berliner, 2001;Sobrino, Raissouni, & Li, 2001), land cover changes (Jin & Liang, 2006), etc.
Although several studies have focused on the development of algorithms for an accurate estimation of the LST from NOAA-AVHRR data (e.g. Coll, Caselles, Sobrino, & Valor, 1994;Sobrino, Coll, & Caselles, 1991), or used LST to improve air temperature mapping (Vogt, Viau, & Paquet, 1997), there is not still a study that explores the potential of the NOAA-AVHRR thermal infrared data to obtain long-term LST climatology at high spatial resolution over the past three decades.
The objective of this study is to use 34 years (July 1981-June 2015) of NOAA-AVHRR to (i) develop the first long-term LST dataset for the Peninsular Spain and (ii) infer robust annual and seasonal longterm LST climatology at a high spatial resolution (i.e. 1.1 km) never showed before for the region.

Materials and methods
2.1. Protocol for processing development of a historical NOAA-AVHRR LST dataset AVHRR sensor collects information in two long-wave thermal infrared bands at 10.3-11.3 μm and 11.5-12.5 μm. The two thermal bands on board of the NOAA-AVHRR satellites show a spatial resolution of 1.1 km at the nadir, collected at images of around 2700 km wide. In the thermal infrared region, there is a direct emission of energy, which is directly related to the temperature of the surface (Becker & Li, 1990;Sobrino et al., 1991). According to the Planck thermodynamic law, the temperature of a body may be estimated according to its radiation, by means of the simple inversion of the Planck equation (e.g. Becker & Li, 1990;Snyder, Wan, Zhang, & Feng, 1998;Urban, Eberle, Hüttich, Schmullius, & Herold, 2013).
Here, we have used the historical daily NOAA-AVHRR dataset comprising 12,418 afternoon images from July 1981 to June 2015 for the entire Peninsular Spain. All the available daily images from 1981 to 1986 were acquired to the Dundee Satellite Reception Center. Images from 1986 to 1997 were required to the European Spatial Agency and rescued from the original tapes. Finally, the images from 1997 were obtained from the Advanced High Resolution Picture Transmission (AHRPT/HRPT) antennae located at the Centro de Recepción, Proceso, Archivo y Distribución de Imágenes de Observación de la Tierra (CRE-PAD) that the Spanish National Institute of Aerospace Technology (INTA) has in its Canaries Space Centre (Maspalomas, Gran Canaria). Software was designed and implemented in the IDL language to read the different NOAA-AVHRR satellites and convert all images to the same format. We used the complete range of available AVHRR bands (1-5) since in addition to the two thermal bands (4 and 5) used to calculate the LST, we needed the information from the visible, near-infrared and medium infrared spectral regions with the purpose of calculating the surface emissivity and obtaining robust cloud coverage masks.
The first step was to apply a geometric correction to the digital numbers (DN) of the NOAA-AVHRR images. For this purpose, we used the algorithm developed by Baena-Calatrava (2002), which is based first on the satellite parameters and second on geometric corrections. The second geometric correction uses a set of 97 Ground Control Points, which are based on recognizing land patterns (Ho & Asem, 1986) and are mostly located in coastal accident (e.g. capes), but also in land areas (e.g. reservoirs). The images were then re-projected to the European-ED50-UTM zone 30N over an area bounded between 34°22 ′ N and 44°1 2 ′ N and 11°7 ′ W and 4°16 ′ E.
The second step was to calibrate the different satellites (Price, 1991). Initially, we applied a linear correction to obtain the radiance recorded by the satellite (L a ) from the DNs, using the coefficients embedded in the images (G and I), for bands 4 and 5. These coefficients are different for each line of the image.
The radiances were corrected non-linearly for the satellites N-7, N-9, N-11 and N-14: where A, B and C are coefficients that can be retrieved from Walton, Sullivan, Rao, and Weinreb (1998) where c 1 = 1.1910659 × 10 −5 m W sr −1 cm 4 , c 2 = 1.438833 cm K. The values of v change for each satellite and they can be found in the NOAA satellite specifications (http://www.ncdc.noaa.gov/oa/pod-guide/ncdc/docs/ podug/html/c1/sec1-4.htm; last accessed 1 February 2017). Once brightness temperatures were obtained, we removed cloudy pixels. For this purpose, we used the cloud detection algorithm developed by Azorin-Molina et al. (2013) also implemented in the IDL software. To avoid problems related to cloud shadows, we also removed three pixels around the identified cloudy pixels. In addition, to avoid geometric and radiometric problems related to the view angle of satellites, we removed those pixels with observation angles higher than 50°.
Red and near-infrared bands were calibrated using revised calibration coefficients considering the orbit degradation for each satellite (Rao & Chen, 1995, 1999; http://www.ncdc.noaa.gov/oa/pod-guide/ncdc/ docs/intro.htm; last accessed 1 February 2017). Channel 1 and 2 radiances were transformed to top-of-the-atmosphere reflectances and cross-calibrated to the NOAA-9 sensor to correct the different spectral response of the sensors (Trishchenko, 2009;Trishchenko, Cihlar, & Li, 2002), and finally the reflectances were topographically corrected by means of a non-Lambertian model using a digital elevation model at the resolution of 1.1 km (Riaño, Chuvieco, Salas, & Aguado, 2003).
To obtain the LST, we applied a split-window algorithm developed by Sobrino and Raissouni (2000) and widely tested by the atmospheric conditions of the Iberian Peninsula: where W is the water vapor content in each pixel of the image, according to the algorithm by Sobrino, Jimenez, Raissouni, and Soria (2002), valid for the latitudes in which the Iberian Peninsula is located: R is obtained for the whole pixels available in the image (free of clouds and lower than 50°of view angle).
T 4,k is the value of the pixel k in the band 4; T 4,0 is the average of all the available pixels in the band 4; T 5,k is the value of the pixel k in the band 5; T 5,0 is the average of all the pixels in the band 5. The emissivity ε and Δε are obtained according to the method developed by Sobrino et al. (2008) using thresholds of the Normalized Difference Vegetation Index (NDVI) values from the red and near-infrared bands. The coefficients c 0 to c 6 were obtained from Lahraoua, Raissouni, Chahboun, Azyat, and Achhab (2012). For inland water bodies and given they are relatively small in Spain, we did not apply a different algorithm for them. We considered that the same algorithm used to estimate the LST is valid for water areas changing the value of emissivity corresponding to the water. This was addressed in the emissivity methodology (Sobrino et al., 2008) used in this study.
LST may vary as a function of the viewing angle (Wan & Dozier, 1996), but also as a function of the time at which the images are taken, given the different sun angles and radiation received by the surface (Price, 1991). Julien and Sobrino (2012) illustrated this problem and demonstrated that orbit drift is an important source of noise in LST time series. The difficulties of developing a physical approach to correct these effects motivated the use of a statistical approach, similar to that followed by Julien and Sobrino (2012) to remove from each daily LST image the effects of the different solar zenith and observation angles. Given the strong seasonality found in solar effects, we split the available daily time series in semimonthly series (all the images between 1st January and 15th January, between 16th January and 31st January and so on). This selection produced 24 series of LST for the period 1981-2015. For each series and pixel, we removed the effect of the zenith and observation angles by means of a multiple regression model: where a, b and c are the regression coefficients and a z and a 0 are the solar zenith and observation angles, respectively. Using the coefficients obtained, we calculated the free-effect LST (LST_c) removing the obtained residuals (observed minus predicted LST by the model) according to: where m is the average LST in the semimonthly time series.
Daily images were composited in semimonthly periods (two per month, from the 1st day of the month to the 15th, and from 16th to the end of the month) using the maximum temperature recorded in the period. The use of maximum LST values instead of the mean of the semimonthly periods was made to ensure the minimum effect of any remaining cloud coverage on our data. Moreover, as a function of the dates in which data is available the mean values may produce strong spatial contrasts mostly driven by data availability. On the other hand, and as recommended by NDVI, maximum instead mean values   also produce much better results in terms of spatial and temporal homogeneity of the resulting LST.
The large number of images affected by clouds, large observation angles and geometric distortions caused several gaps in the data series. To fill these gaps we used a regression model considering the LST values of the series of the pre-and post-semimonthly periods as predictors. The procedure was applied iteratively to complete all the gaps existing in the images and to make sure all consecutive periods with no data are correctly filled. The series of each pixel were temporally filtered to reduce residual noise by means of the algorithm developed by Quarmby, Milnes, Hindle, and Silleos (1993), which filters low LST values since high LST values are less affected by atmospheric contamination and residual noise. LST = Max{LST (n) , (LST (n−1) + LST (n+1) )/2} Finally, the semimonthly images were used to create monthly, seasonal (defined as winter (DJF), spring (MAM), summer (JJA), autumn (SON)) and annual series, using the average values for each period. Figure  1 shows a summarizing schema of the main steps of the LST dataset development and Figure 2 shows an example of the monthly images from January to December 2010 as an example.

Air temperature data
To deepen the knowledge of the relationship between the spatial variability of the LST retrieved from NOAA-AVHRR images and air temperature, we have used the available information on air temperature in Spain. There are different gridded layers of air temperature in Spain (e.g. Gonzalez-hidalgo, Peña-angulo, & Cortesi, 2015;Herrera, Fernández, & Gutiérrez, 2016). Nevertheless, these datasets show low spatial resolutions to be compared with the high spatial resolution (1.1 km) of the LST dataset created in this study. For this reason, we have developed a new monthly gridded dataset for the period 1981-2015 to match with the spatial resolution of the LST images. We have focused on maximum air temperatures (Tmax) since they are more comparable to the LST.
To generate a gridded dataset for air temperature, we have used the most complete, quality controlled and homogenized dataset available for Spain, based on the Monthly Temperature Data for Spain (MOTE-DAS) dataset (Gonzalez-hidalgo et al., 2015). This dataset contains more than 1300 stations over Spain. The dataset was updated to 2015 using the raw time series of air temperature from the Spanish National Meteorological Agency (AEMET), and the few existing gaps filled by means of the use of reference series, following the same approach used by Gonzalez-hidalgo et al. (2015) to generate the MOTEDAS dataset. A total number of 1348 meteorological stations, which cover the entire study domain, were used to generate the gridded layers ( Figure 3).

Regression-based approach
To generate the gridded layers at the spatial resolution of 1.1 km, we applied a regression-based approach (Ninyerola, Pons, & Roure, 2000). For this purpose, we used different geographic and topographic variables that affect the spatial distribution of air temperatures (i.e. elevation, distance to the sea, latitude and longitude) that were incorporated as independent layers in the generation of global regression models in which the dependent variable was Tmax. Correlation between variables (collinearity) can complicate the interpolation of the results. To prevent this problem, a forward stepwise procedure, with 'probability to enter' set to .05, was used to select only significant variables (Hair, Anderson, Tatham, & Black, 1998).
Air temperature predictions by the linear regression models are inexact since usually there is a difference between the observed and predicted air temperature values in the location of the points of observation. For this reason, we applied a local interpolation of the residual values (observed minus predicted) to obtain exact values in the points of measurement by means of the difference between the regression results and the interpolated residuals. For this purpose, a kriging algorithm was used considering independent spherical semivariograms for each month.

Validation of the predictions
To determine the quality of the Tmax predictions, the grid layers were validated by a Jackknife resampling procedure (Phillips, Dolph, & Marks, 1992). This approach is based on withholding, in turn, one station out of the network, estimating regression coefficients from the remaining observatories, locally interpolating the residuals and calculating the difference between the predicted and observed value for each withheld observatory. We calculated two validation statistics by means of the comparison between the observed and predicted values of each monthly layer: the mean absolute error (MAE) and the agreement index (D) (Willmott, 1981), which are relative and bounded measures of model validity. Figure 4 shows the temporal evolution of D (blue) and MAE (red) in the different monthly gridded data created. These results indicate the good quality of predictions since D values are always higher than 0.9 and they are close to 0.95 most of the months, which indicates a high agreement between observations and predictions. The MAE oscillates around 1°C, being the errors a bit higher for winter than for summer months. Figure 5 presents an example of the obtained grids corresponding to the average monthly maps of Tmax over the Peninsular Spain during 2010 in which there is certain spatial and seasonal agreement of the maps to the obtained results for the LST during the same year.

Normalized Difference Vegetation Index
To study the interaction between the spatial variability of the LST and the vegetation coverage, we have used the NDVI dataset at the same spatial resolution of 1.1 km. The NDVI dataset has recently been developed by Martin-Hernandez et al. (2017) using the same NOAA-AVHRR dataset processed in this study therefore covering the same spatial and temporal coverage.

Results
The Main Map shows the spatial distribution of the average annual and seasonal LST in the Peninsular Spain for 1981-2015 showing marked spatial and seasonal differences. Summer is the season that recorded the highest values, in some areas of the South and Southwest higher than 55°C. Also in areas of the Northeast, in the semiarid regions of the Ebro river basin, LST greater than 50°C have been recorded in summer. In spring and autumn, the spatial distribution of the LST is highly controlled by the relief disposition, with low LST recorded in the different topographic chains (the Pyrenees, Central system, etc.). In winter, dominated a North-South gradient, with the lowest temperatures recorded in the northern chains (the Pyrenees, Cantabrian mountains), and also in the northern plateau of the Iberian Peninsula. Annual distribution of LST shows a mixture of these seasonal patterns. There is a high topographic control, but also clear latitudinal gradient, which is modulated by the spatial distribution of the vegetation coverage.
The observed spatial distribution of LST is strongly related to the spatial distribution of the average Tmax at seasonal and annual scales (Figures 6  and 7). The spatial distribution of the average annual Tmax also shows higher values in the southwestern and southeastern parts of Spain. The Guadalquivir valley shows the highest average values and the lowest values are recorded in the North, corresponding to the Pyrenees and the Cantabrian range. The maps show strong seasonality in the air temperature, but independently of the season of the year, the spatial patterns are very close to that observed for the average LST.
Nevertheless, there is also a close relationship with the spatial distribution of the vegetation activity, represented by the NDVI. The average LST shows a negative and strong correlation with the average NDVI (Figure 8), indicating that areas that show higher vegetation activity/coverage tend to show a lower LST than areas showing nude soils or low vegetation coverage. Thus, at the annual scale, the correlation is strong among these two variables (r = −0.68).

Conclusion
This study develops a long-term database  of LST at a high spatial resolution (1.1 km 2 ) from the set of NOAA-AVHRR images available in the Peninsular Spain, which were obtained after processing the thermal infrared bands (4 and 5) using a split-window algorithm. This dataset has improved the spatial resolution of previous available datasets for Spain, i.e. the Pathfinder 64 km 2 - (Julien, Sobrino, & Verhoef, 2006;Ouaidrari, Goward, Czajkowski, Sobrino, & Vermote, 2002) and the Land Long Term Data Record (LTDR) 16 km 2 (Sobrino & Julien, 2016).
The study analyzed for the first time the spatial patterns of the long-term high spatial resolution (1.1 km 2 ) LST for the Peninsular Spain. The LST climatology shows more spatial details and contrasts than the available air surface temperature climatology for Spain (e.g. http://www.opengis.uab.es/wms/iberia/mms/index. htm) since it allows to identify features related to the terrain conditions (e.g. vegetation coverage and type) and to cover areas in which the availability of meteorological stations is low.
There are important spatial and seasonal differences of the LST across Spain. Nevertheless, strong agreement between the spatial distribution of the average LST and the average Tmax has been found. In addition, we have also showed that other factors may also contribute to explain the spatial distribution of the LST. The principal of these factors is the spatial distribution of the vegetation coverage, being mainly evident during the summer season, in which the contrasts between the areas covered by vegetation and areas of nude soil are clearer in terms of radiative fluxes (Kustas & Anderson, 2009). Thus, in the summer season, there is a negative and linear relationship between the spatial distribution of the LST and the NDVI, which is also identified at the annual scale. The strong relationship between the average distribution of the LST and the NDVI has implications to understand current soil moisture differences, which can be determined by means of conceptual models based on the combination between LST and NDVI (Sandholt, Rasmussen, & Andersen, 2002).
The gridded seasonal and annual climatology layers of LST and air temperature for continental Spain are available in Ascii ESRI format at the Institutional Repository of the Spanish National Research Council (http://hdl.handle.net/10261/164053).

Software
Interactive Data Language (IDL) software was developed to process the daily NOAA-AVHRR images. R 3.3.2 was used to develop the seasonal and annual dataset of LST. MiraMon 6.4 and Idrisi Taiga were used to model mean air temperature. ArcGis 10.2 was used to generate the final maps.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported by the research projects I-Link1001 (Validation of climate drought indices for multi-sectorial applications in North America and Europe under a global warming scenario) financed by Consejo Superior de Investigaciones Científicas [PCIN-2015-220, CGL2014-52135-C03-01], Red de variabilidad y cambioclimático RECLIM (CGL2014-517221-REDT) financed by the Spanish Foundation for Science and Technology and FEDER European Regional Development Fund, IMDROFLOOD financed by the Water Works 2014 Joint Programming Initiative Water challenges for a changing world co-funded call of the European Commission, and 'LIFE12 ENV/ES/000536-Demonstration and validation of innovative methodology for regional climate change adaptation in the Mediterranean area (LIFE MEDACC)' financed by the LIFE Financial Instrument for the Environment program of the European Commission.