Uncertainty in hottest years ranking: analysis of Tibetan Plateau surface air temperature

Abstract
Changes in surface air temperature can directly affect hydrology, agriculture, and ecosystems through extreme climate events such as heat waves. For this reason, and to improve climate change adaptation strategies, it is important to investigate the ranking of the hottest years. In this study, the Wilcoxon signed-rank test and Monte Carlo simulation are used to estimate the ranking of the hottest years for the Tibetan Plateau (TP) in recent decades, and the uncertainty in that ranking. The Wilcoxon signed-rank test shows that the top 10 hottest years on record over the TP mainly occur after 1998. The top three hottest years are ranked as 2006, 2009, and 2010, but there is almost no significant difference between them. When both sampling and observational errors are considered, only five years have a non-zero probability of being the hottest year, with the three highest probabilities being for the years 2006 (~47.231%), 2009 (~40.390%), and 2010 (~12.376%). Similarly, for the probability that a given year is among the 10 hottest years, all the years ranked 1–10 by the Wilcoxon signed-rank test have probabilities above 10%, while the years 2001 and 2012 have probabilities of 3% and 4%, respectively.


Introduction
It is widely accepted that many applications need reliable and well-synthesized information regarding climate extremes and their impacts to form reasonable strategies and make sensible decisions. One way to meet these needs is to use indicators of climate extremes, such as rankings of the hottest years, thus helping to gain a better understanding of the scientific problem. The ranking of hottest years is determined based on data from meteorological observational stations and statistical inference, and as such has been investigated at different temporal scales and in different regions of the world, such as the USA (Shen, Lee, and Lawrimore 2012;Arguez et al. 2013;Shen et al. 2016), Australia (King et al. 2014), China (Zhai et al. 2016), Europe (Luterbacher et al. 2004), and globally (Zhang, Li, and Wan 2016).
The Tibetan Plateau (TP), known as the 'third pole', is the highest plateau in the world and home to the headwaters of several important large rivers in East Asia. In recent years, significant warming has been detected over the TP, a trend with the potential to reinforce climate extremes and disasters (Wang et al. 2013), and therefore to affect the natural resources and ecosystems in China. Although the hottest years on record over the TP have been ranked in previous studies (eastern part (Li et al. 2015); northeastern part (Pan, Wu, and Liu 2015); southeastern part (Fan et al. 2011); and central part (Yan and Liu 2014)), these studies were mainly based on the arithmetic average of surface air temperature (SAT), without strict statistical inference. Furthermore, such rankings do not consider the various errors in the data or the related uncertainties in their conclusions.
These critical scientific questions remain unanswered, and only by addressing them can we build a cumulative understanding of climate and climate change. Accordingly, the present study aims to (1) characterize the annual ranking of TP temperature from 1951 to 2013, and (2) investigate the uncertainty in that ranking.

Data and method
The monthly mean maximum and minimum temperature (Tmax and Tmin) records observed at 100 meteorological stations over the TP (above 2000 m), for the period 1951-2013, are employed in this study. The locations and elevations of the stations are given in Figure 1.
The Wilcoxon signed-rank test (Wilcoxon 1945) is used to rank the hottest years. The approach is performed on paired data to test whether a year is significantly hotter than the following year. The p-value of the test, a scalar between 0 and 1, is the probability, under the null hypothesis, of observing a test statistic at least as extreme as the observed value; it is used to determine whether the two samples are significantly different. A Monte Carlo simulation (Guttorp and Kim 2013) is used to explore the uncertainty of the annual rankings for the TP temperature time series. The main idea of Guttorp and Kim (2013) is to assume that the SAT of each year, β(t), is treated independently. Then, one can simulate different time series from the annual mean SAT anomaly and its standard error, ε(t), by shifting the anomaly up or down, i.e.

T_i(t) = \beta(t) + \varepsilon(t)\, r_i(t)    (1)

Here, T_i(t) is the ith simulated annual mean SAT anomaly time series and r_i(t) is a standard normal random number. Thus, from a large number of simulated time series, we can calculate the rank of each year in each simulated series, and count the proportion of simulations in which a given year is the hottest year, or is among the top 10 hottest years.

Results
As the first step of data analysis, the mean temperature (Tmean) is calculated as the average of Tmax and Tmin. Table 1 shows the results of the top 10 hottest years ranked by the Wilcoxon signed-rank test. In the 63 years from 1951 to 2013, the top 10 hottest years mainly occur after 1998. Table 1 also shows very small differences in SAT anomalies for some years, such as 2006/2009, and these differences may be affected significantly by the selection of spatial averaging or integration methods. This can also be seen in Table 1 in that, although the year 2006 is by far the hottest year, its p-value is not particularly small. Furthermore, the remaining hottest years are not significantly different, statistically, except for the fourth hottest year, 2007, which has a significant p-value at the 95% confidence level; this makes it difficult to say definitively which year is hotter.
Furthermore, owing to the replacement of instrumentation, the uneven spatial distribution of stations, changing observational practices, the relocation of observation sites, and the effect of urbanization, errors contaminate the observational data and create uncertainties (Jones, Osborn, and Briffa 1997; Brohan et al. 2006; Morice et al. 2013; Hua, Shen, and Wang 2014; Hua et al., forthcoming). The Wilcoxon signed-rank test can rank the hottest years, but it does not consider the data error, and therefore the question arises as to how the uncertainty might influence the ranking of the hottest years. In statistical climatology, a typical expression of this uncertainty is the standard error. Here, we utilize the standard error from our previous study (Hua et al., forthcoming), which investigated the sampling error (the uncertainty caused by a nonexhaustive survey) and the observational error (the uncertainty caused by station data quality). The sampling error and observational error are calculated based on the correlation-factor method of Shen, Lee, and Lawrimore (2012) and the assumption postulated in Hua et al. (forthcoming).
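As an illustration of the pairwise testing step behind Table 1, the following minimal sketch applies a Wilcoxon signed-rank test to station-paired annual anomalies for two years. It assumes a one-sided alternative ("year A is hotter than year B") and a 0.05 significance level; the function name `hotter_than` and the synthetic station values are hypothetical and are not the TP observations.

```python
import numpy as np
from scipy.stats import wilcoxon


def hotter_than(temps_a, temps_b, alpha=0.05):
    """One-sided Wilcoxon signed-rank test on station-paired values.

    temps_a, temps_b: annual mean anomalies at the same stations for two years.
    Returns (p_value, significant); a small p-value supports "A is hotter than B".
    The one-sided alternative and alpha=0.05 are assumptions of this sketch.
    """
    result = wilcoxon(temps_a, temps_b, alternative="greater")
    return result.pvalue, bool(result.pvalue < alpha)


# Synthetic anomalies at 100 hypothetical stations (illustrative only).
rng = np.random.default_rng(0)
year_a = rng.normal(loc=1.3, scale=0.5, size=100)
year_b = year_a - rng.normal(loc=0.3, scale=0.2, size=100)  # clearly cooler year
year_c = year_a - rng.normal(loc=0.0, scale=0.2, size=100)  # nearly identical year

p_ab, sig_ab = hotter_than(year_a, year_b)
p_ac, sig_ac = hotter_than(year_a, year_c)
print(f"A vs B: p = {p_ab:.3g}, significant = {sig_ab}")
print(f"A vs C: p = {p_ac:.3g}, significant = {sig_ac}")
```

When two years differ only slightly at most stations, as in the A versus C comparison, the p-value can be large even if one year has the higher regional mean, which is the situation described for several neighbouring ranks in Table 1.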
Figure 2 shows the annual mean series with its uncertainties (adapted from Figure 5 in Hua et al. (forthcoming)). From the figure, we can see that the year 2007, with an annual mean anomaly of 1.210 °C, is recorded as the fourth hottest year. However, this year may be ranked as the second hottest year if its anomaly plus the error bar at the 95% confidence level is considered. Similarly, the same year can be ranked as eighth if the anomaly minus the error bar is considered. Thus, it is important to understand uncertainty when determining rankings of hottest years.
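The effect of an error bar on a single year's rank can be sketched as follows. This is a simplified illustration, assuming a symmetric 95% interval of ±1.96 standard errors and holding all other years at their central values; the function name `rank_range` and the anomaly and standard-error values are hypothetical, not those of Figure 2.

```python
import numpy as np


def rank_range(anomaly, std_err, index, z=1.96):
    """Rank (1 = hottest) of one year at its central, upper, and lower estimates.

    All other years are held at their central anomalies, which is a
    simplification made for this sketch.
    """
    others = np.delete(anomaly, index)

    def rank_of(value):
        # Rank is one plus the number of other years that are hotter.
        return 1 + int(np.sum(others > value))

    half_width = z * std_err[index]
    centre = anomaly[index]
    return rank_of(centre), rank_of(centre + half_width), rank_of(centre - half_width)


# Hypothetical anomalies (degrees C) and standard errors for nine years.
anomaly = np.array([1.40, 1.35, 1.30, 1.21, 1.15, 1.10, 1.05, 1.00, 0.90])
std_err = np.full(anomaly.shape, 0.08)

central, best, worst = rank_range(anomaly, std_err, index=3)
print(central, best, worst)  # 4 2 6
```

With these illustrative numbers, a year ranked fourth at its central value moves to second at the upper bound and sixth at the lower bound.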
To explore the uncertainty in ranking, we first simulate the annual mean Tmean time series using Equation (1). We then repeat this a large number of times (10,000 times in this paper) to estimate the probability of each year being the hottest year, with a precision of two decimal places in the proportions. Figure 3 shows 10 simulated random series. We can then easily obtain the rank of each year in the simulated series. Table 2 shows the probability of a year being the hottest in recent decades.
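The simulation step can be sketched as below. This is a minimal illustration, assuming that Equation (1) has the form T_i(t) = β(t) + ε(t) r_i(t) with r_i(t) drawn independently from a standard normal distribution; the function name `hottest_year_probability` and the six years of anomalies and standard errors are hypothetical values, not the data behind Table 2.

```python
import numpy as np


def hottest_year_probability(years, beta, eps, n_sim=10_000, seed=0):
    """Estimate, for each year, the proportion of simulations in which it is hottest.

    beta: annual mean anomalies; eps: their standard errors (same length).
    Assumes T_i(t) = beta(t) + eps(t) * r_i(t) with independent standard
    normal r_i(t), which is this sketch's reading of Equation (1).
    """
    beta = np.asarray(beta, dtype=float)
    eps = np.asarray(eps, dtype=float)
    rng = np.random.default_rng(seed)
    r = rng.standard_normal((n_sim, beta.size))
    sims = beta + eps * r                  # one simulated series per row
    winners = np.argmax(sims, axis=1)      # index of the hottest year in each row
    counts = np.bincount(winners, minlength=beta.size)
    return {int(y): float(c) / n_sim for y, c in zip(years, counts)}


# Hypothetical anomalies (degrees C) and standard errors.
years = [2005, 2006, 2007, 2008, 2009, 2010]
beta = [1.05, 1.32, 1.21, 0.95, 1.27, 1.22]
eps = [0.08] * 6

probs = hottest_year_probability(years, beta, eps)
for year, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(year, f"{p:.2%}")
```

Because each simulated series has exactly one hottest year, the estimated probabilities sum to one across years.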
We can see from Table 2 that the three highest probabilities of being the hottest year are for the years 2006 (~47.231%), 2009 (~40.390%), and 2010 (~12.376%).
But what about the probability that a given year is among the top 10 hottest years? Table 3 shows the results. Among the 63 years from 1951 to 2013, eight have probabilities of more than 70%, and the years 2005 and 2003 have probabilities of more than 35% and 12%, respectively. It is worth noting that, although their probabilities are quite small (2%-3%), the years 2001 and 2012 (not ranked in Table 1) can also appear among the top 10 hottest years. Thus, it is not acceptable to neglect the uncertainties in data when performing ranking analyses.
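The top-10 membership probability can be estimated from the same kind of simulation by ranking every year within each simulated series. The sketch below assumes the same simulation form as Equation (1) with standard normal noise; the function name `top_k_probability` and the 63 synthetic anomalies with a linear trend are hypothetical and serve only to show the counting step.

```python
import numpy as np


def top_k_probability(beta, eps, k=10, n_sim=10_000, seed=0):
    """Proportion of simulations in which each year ranks among the k hottest.

    Assumes T_i(t) = beta(t) + eps(t) * r_i(t) with independent standard
    normal r_i(t); this form is an assumption of the sketch.
    """
    beta = np.asarray(beta, dtype=float)
    eps = np.asarray(eps, dtype=float)
    rng = np.random.default_rng(seed)
    sims = beta + eps * rng.standard_normal((n_sim, beta.size))
    # Double argsort converts values to ranks per row (0 = hottest).
    order = np.argsort(-sims, axis=1)
    ranks = np.argsort(order, axis=1)
    return (ranks < k).mean(axis=0)


# Hypothetical 63-year anomaly series with a linear warming trend.
beta = np.linspace(-1.0, 1.3, 63)
eps = np.full(63, 0.1)

p_top10 = top_k_probability(beta, eps, k=10)
print(np.round(p_top10[-12:], 3))  # probabilities for the last 12 years
```

Each simulated series contains exactly k top-k years, so the probabilities sum to k across all years; this gives a quick consistency check on the counting.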

Conclusion
In this paper, the ranking of the hottest years in the TP region and the uncertainty in that ranking are explored using the Wilcoxon signed-rank test and Monte Carlo simulation. The results show that the top 10 hottest years mainly occur after 1998, and the three hottest years over the TP are ranked as 2006, 2009, and 2010, although there is almost no significant difference between them.
To obtain insight into the uncertainty in climate change, both sampling and observational errors are considered to assess the uncertainty in the rankings for the TP SAT time series. Only five years have probabilities above 1% of being the hottest year, with the three highest probabilities being for the years 2006 (~47.231%), 2009 (~40.390%), and 2010 (~12.376%). We also analyze the probability that a given year is included in the top 10 hottest years. All the years ranked in Table 1 have a probability of more than 10%; moreover, the years 2001 and 2012, although not listed in Table 1, have small but non-zero probabilities of being among the top 10 hottest years. Both sampling and observational errors are considered here and, according to our previous study, the sampling error is greater than the other errors. However, to better understand the uncertainties in climate change, future work could estimate other errors, such as bias error, homogenization adjustment error, normal error, and so on.

Disclosure statement
No potential conflict of interest was reported by the authors.