Estimating loss functions of experts

ABSTRACT We propose a new and simple methodology to estimate the loss function associated with experts’ forecasts. Under the assumption of conditional normality of the data and the forecast distribution, the asymmetry parameter of the lin–lin and linex loss function can easily be estimated using a linear regression. This regression also provides an estimate for potential systematic bias in the forecasts of the experts. The residuals of the regression are the input for a test for the validity of the normality assumption. We apply our approach to a large data set of SKU-level sales forecasts made by experts, and we compare the outcomes with those for statistical model-based forecasts of the same sales data. We find substantial evidence for asymmetry in the loss functions of the experts, with underprediction penalized more than overprediction.


I. Introduction
Sales forecasts are often the outcome of a process in which an expert with domain-specific knowledge modifies a model-generated forecast. Typically, simple extrapolation models are used to create such model forecasts, and often they are generated by automated statistical software which gets fed by lagged sales and other possibly relevant variables.
There is a long tradition in the sales forecasting literature of examining the quality of these expert forecasts relative to model forecasts (if these are available). Key questions are whether the domain-specific knowledge translates into improved forecasts, or whether experts downplay the model forecasts too much, thereby quoting less accurate forecasts. Classical studies are Blattberg and Hoch (1990) and Mathews and Diamantopoulos (1986), where various case studies are examined.
Recently, this literature has seen a revived interest with the advent of a range of large data sets that allow for more generalizing statements. For example, Fildes et al. (2009) study thousands of expert and model forecasts, and conclude that expert forecasts tend to be biased and that expert forecasts are not necessarily better than model forecasts. Franses and Legerstee (2010), using a database with over 30,000 forecasts and realizations, show that, on average, model forecasts and expert forecasts are about equally good, but when expert forecasts are worse they are much worse.
A common finding in these two recent studies is that expert forecasts tend to exceed model forecasts, or in other words, judgemental adjustment is often positive. A potential explanation for this finding is that the experts dislike underpredicting more than overpredicting, perhaps due to planning reasons. Hence, when creating forecasts, their loss function may not be a mean squared error loss function symmetric around zero, but some other asymmetric loss function. If such an alternative loss function is indeed used, this may then also explain why expert forecasts seem less accurate than model forecasts, as forecasts are typically evaluated using criteria like the root mean squared prediction error.
The loss function of experts is usually not known in practice. Given available data, one may however try to estimate this loss function by evaluating theoretical properties of loss functions against actual data. Various forms of asymmetric loss functions have been proposed in the literature, like, for example, the lin-lin loss function, the quad-quad loss function and the linex function proposed by Varian (1975). These loss functions have been frequently analysed, for example, by analysing the optimal forecast under a specific asymmetric loss function, see Zellner (1986) and Diebold (1996, 1997), among others.
In this article, we are interested in estimating the parameters of loss functions given the availability of expert forecasts. Clatworthy, Peel and Pope (2012) investigate whether financial analysts' loss functions are asymmetric or not, but they do not estimate the loss function. A notable exception is Elliott, Komunjer and Timmermann (2005). These authors propose a linear instrumental variable (IV) estimator for the shape parameter of a general class of loss functions which signals the degree of asymmetry in the loss function. The general class of loss functions nests four popular loss functions, and these are the absolute deviation loss function and its asymmetric counterpart the lin-lin loss function, and the squared loss function and its asymmetric counterpart the quad-quad loss function. They use their methodology to estimate the asymmetry in forecasts of budget deficits for the G7 countries made by the IMF and OECD.
To estimate the loss function of experts in the sales forecasting industry, we propose a methodology that differs from the methodology proposed by Elliott, Komunjer and Timmermann (2005) in a number of ways. By making a normality assumption on the conditional distribution of the variable to be forecasted, and thereby on the forecast distribution, we demonstrate that the estimation of the asymmetry parameter is simplified substantially. Elliott, Komunjer and Timmermann (2005) need IVs for their estimation method, but in our proposed methodology, only simple linear regressions (OLS) are used, using panel data on expert forecasts and on the variable to be forecasted. If the normality assumption is valid, OLS is more efficient than using IVs, and the methodology can easily be extended to multiple-step-ahead forecasts. Furthermore, with our methodology, it is possible to estimate the key parameters of the well-known and useful linex loss function.
The outline of our article is as follows. In Section II, we show that for two well-defined loss functions, the lin-lin loss function and the linex loss function, simple regressions can be used to estimate the asymmetry parameter of the functions, provided the relevant data are available. In Section III, we illustrate this methodology for a large database covering forecasts from a range of experts. We also consider statistical model forecasts to establish to what extent symmetric loss functions prevail. The robustness of our crucial assumption on the forecast distribution is tested in three ways. One way, for example, is to compare our estimates with those obtained with the methodology of Elliott, Komunjer and Timmermann (2005). Upon estimating our two loss functions, we find overwhelming support for the conjecture that experts may feel that negative forecast errors (meaning the forecasts are below actual sales) require more weight in the loss function than positive forecast errors. Section IV concludes this article with a summary and suggestions for further research.

II. Loss functions
Suppose that $Y_{t+1}$ is the random variable to be forecasted with forecast density $f(y_{t+1}; \theta, Y_t, X_t)$ that may depend on parameters $\theta$, lagged values $Y_t = \{y_{t+1-j}\}_{j=1}^{J}$ and other exogenous variables summarized in $X_t$. To simplify notation, we write $f(y_{t+1}; \theta)$ instead of $f(y_{t+1}; \theta, Y_t, X_t)$. In this article, we confine our analysis to one-step-ahead forecasts.
Given the forecast distribution, a point forecast $p_{t+1}$ for $Y_{t+1}$ can be obtained by specifying a loss function. For example, the quadratic loss function is given by
$$\mathrm{QL}(Y_{t+1}, p_{t+1}) = (p_{t+1} - Y_{t+1})^2, \qquad (1)$$
where we adopt the convention that a forecast error is the forecast minus the realization. The point forecast $\hat{p}_{t+1}$ results from minimizing expected quadratic loss $\mathrm{E}[\mathrm{QL}(Y_{t+1}, p_{t+1})]$ with respect to $p_{t+1}$, where $\mathrm{E}$ denotes the expectation operator. In case of quadratic loss, this results in $\hat{p}_{t+1} = \mathrm{E}[Y_{t+1} \mid \theta]$. Hence, the optimal forecast is unbiased.
From a supply chain management point of view, it can be necessary to put a higher penalty on negative forecast errors than on positive forecast errors. For example, if one forecasts sales, the consequences of a prediction which is lower than the realized demand may be worse than a prediction which is higher than the demand. In other words, being out of stock is worse than having a little too much stock.
To allow for different penalties, one may then consider an asymmetric loss function.

Asymmetric absolute loss function
An example of an asymmetric loss function is the lin-lin loss function, further also called the asymmetric absolute loss (AAL) function, which is given by
$$\mathrm{AAL}(Y_{t+1}, p_{t+1}) = \begin{cases} \alpha_A\,|p_{t+1} - Y_{t+1}| & \text{if } p_{t+1} < Y_{t+1} \\ |p_{t+1} - Y_{t+1}| & \text{if } p_{t+1} \geq Y_{t+1} \end{cases} \qquad (2)$$
One sets $\alpha_A > 1$ if one wants to put more penalty on a forecast that is smaller than the true realization, see also Ferguson (1967). The optimal point forecast is obtained by minimizing expected loss, that is,
$$\hat{p}_{t+1} = \arg\min_{p_{t+1}} \mathrm{E}[\mathrm{AAL}(Y_{t+1}, p_{t+1})]. \qquad (3)$$
The expected loss function $\mathrm{E}[\mathrm{AAL}(Y_{t+1}, p_{t+1})]$ can be written as
$$\mathrm{E}[\mathrm{AAL}(Y_{t+1}, p_{t+1})] = \alpha_A \int_{p_{t+1}}^{\infty} (y - p_{t+1})\, f(y; \theta)\,dy + \int_{-\infty}^{p_{t+1}} (p_{t+1} - y)\, f(y; \theta)\,dy. \qquad (4)$$
The first-order partial derivative is given by
$$\frac{\partial\, \mathrm{E}[\mathrm{AAL}(Y_{t+1}, p_{t+1})]}{\partial p_{t+1}} = -\alpha_A \big(1 - F(p_{t+1}; \theta)\big) + F(p_{t+1}; \theta), \qquad (5)$$
where we used the Leibniz integral rule and where $F(\cdot; \theta)$ is the forecast distribution function of $Y_{t+1}$ (with $f(\cdot; \theta)$ as its derivative). The optimal point forecast is obtained when this derivative is set equal to zero and solved for $p_{t+1}$, which results in
$$\hat{p}_{t+1} = F^{-1}\!\left(\frac{\alpha_A}{1 + \alpha_A};\, \theta\right). \qquad (6)$$
This corresponds to the $\alpha_A/(1+\alpha_A)$-th percentile of the forecast distribution. Under symmetric loss ($\alpha_A = 1$) we obtain the median of the forecast distribution. For $\alpha_A > 1$, we have a forecast which is larger than the median, and for $\alpha_A < 1$, we obtain a forecast which is smaller than the median.
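As a numerical sanity check, the quantile result above can be verified by Monte Carlo: under illustrative assumptions (a standard normal forecast distribution and $\alpha_A = 2$, both chosen here purely for the example), the point forecast minimizing expected AAL loss should coincide with the $\alpha_A/(1+\alpha_A)$ quantile of the forecast distribution.

```python
import numpy as np
from scipy import stats, optimize

# Illustrative assumptions: standard normal forecast distribution, alpha_A = 2
# (underprediction penalized twice as heavily as overprediction).
alpha_A = 2.0
dist = stats.norm(loc=0.0, scale=1.0)

def expected_aal(p, n=200_000, seed=0):
    """Monte Carlo estimate of E[AAL(Y, p)]; fixed seed keeps it deterministic."""
    y = dist.rvs(size=n, random_state=seed)
    e = p - y                        # forecast error: forecast minus realization
    return np.mean(np.where(e < 0, alpha_A * np.abs(e), np.abs(e)))

# Minimize expected loss numerically over the point forecast p.
p_opt = optimize.minimize_scalar(expected_aal, bounds=(-3, 3),
                                 method="bounded").x

# Theory: the optimum is the alpha_A/(1+alpha_A) percentile of the forecast CDF.
p_theory = dist.ppf(alpha_A / (1 + alpha_A))   # Phi^{-1}(2/3)
print(p_opt, p_theory)
```

The numerical minimizer and the theoretical quantile agree up to Monte Carlo error.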
Hence, apparently biased forecasts of an expert may be due to the fact that an asymmetric loss function is used. Our main claim in this article is that, if we were to observe several forecasts of experts together with the corresponding realizations, it is possible under some testable assumptions to estimate the value of $\alpha_A$, see also Section 'Estimation'.
Suppose that we have data with $T$ forecasts where for each point forecast created at time $t = 1, \ldots, T$, the conditional forecast distribution is normal with mean $m_t$ and variance $s_t^2$. Furthermore, assume that all forecasts are constructed using the same AAL function. Under these assumptions, the forecasts are thus generated by
$$p_{t+1} = m_t + s_t\,\Phi^{-1}\!\left(\frac{\alpha_A}{1 + \alpha_A}\right), \qquad (7)$$
where $\Phi^{-1}$ is the inverse CDF of a standard normal distribution. Further assume that the realizations $y_{t+1}$ result from a normal distribution with mean $\mu_t$ and variance $\sigma_t^2$ for $t = 1, \ldots, T$, and hence $y_{t+1} = \mu_t + \sigma_t \varepsilon_t$, where $\varepsilon_t$ is a realized draw from a standard normal distribution. If there is a systematic bias in the forecast distribution, it holds that $m_t = \mu_t + b$ with $b \neq 0$. If we consider the difference between $p_{t+1}$ and $y_{t+1}$, we obtain
$$p_{t+1} - y_{t+1} = b + s_t\,\Phi^{-1}\!\left(\frac{\alpha_A}{1 + \alpha_A}\right) - \sigma_t \varepsilon_t. \qquad (8)$$
If we can obtain consistent estimates of $s_t$ and $\sigma_t$, one can use the simple regression
$$\frac{p_{t+1} - y_{t+1}}{\hat{\sigma}_t} = \beta_0 \frac{1}{\hat{\sigma}_t} + \beta_1 \frac{\hat{s}_t}{\hat{\sigma}_t} + \varepsilon_t \qquad (9)$$
to provide the estimate for $\beta_0 = b$ and for $\beta_1 = \Phi^{-1}(\alpha_A/(1+\alpha_A))$. An estimate of $\alpha_A$ can easily be obtained by solving
$$\Phi^{-1}\!\left(\frac{\hat{\alpha}_A}{1 + \hat{\alpha}_A}\right) = \hat{\beta}_1, \quad \text{that is,} \quad \hat{\alpha}_A = \frac{\Phi(\hat{\beta}_1)}{1 - \Phi(\hat{\beta}_1)}.$$
In sum, in this scenario, it is possible for a forecaster to have an asymmetric loss function and a systematic bias in the forecasting distribution. The expression in (9) shows that it is possible to calibrate both the loss function and the bias.
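A small simulation illustrates this recovery step. The sketch below uses purely illustrative values ($\alpha_A = 2$, bias $b = 0.5$, known standard deviations), generates forecasts under the AAL rule, runs regression (9) by OLS, and inverts $\hat{\beta}_1$ to recover $\hat{\alpha}_A$.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)

# Illustrative simulation: an expert with asymmetry alpha_A = 2 and bias b = 0.5.
T, alpha_A, b = 5_000, 2.0, 0.5
sigma = rng.uniform(1.0, 3.0, size=T)       # s.d. of the data (known here)
s = sigma                                   # forecast s.d. equal to data s.d.
mu = rng.normal(10.0, 2.0, size=T)          # conditional means
y = mu + sigma * rng.standard_normal(T)     # realizations
# Forecasts generated under AAL: p = m_t + s_t * Phi^{-1}(alpha_A/(1+alpha_A))
p = (mu + b) + s * norm.ppf(alpha_A / (1 + alpha_A))

# Regression (9): (p - y)/sigma on 1/sigma and s/sigma, no intercept
X = np.column_stack([1.0 / sigma, s / sigma])
beta, *_ = np.linalg.lstsq(X, (p - y) / sigma, rcond=None)

b_hat = beta[0]                                          # estimate of the bias b
alpha_hat = norm.cdf(beta[1]) / (1 - norm.cdf(beta[1]))  # invert for alpha_A
print(b_hat, alpha_hat)
```

Up to sampling noise, the OLS estimates recover the bias and the asymmetry parameter used to generate the data.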

Linex loss function
An alternative non-linear asymmetric loss function is the linear-exponential function, also called the linex (LIN) loss function, see Varian (1975) and Zellner (1986). This function is given by
$$\mathrm{LIN}(Y_{t+1}, p_{t+1}) = \exp\!\big(\alpha_L (p_{t+1} - Y_{t+1})\big) - \alpha_L (p_{t+1} - Y_{t+1}) - 1, \qquad (10)$$
with $\alpha_L \neq 0$. A negative value of $\alpha_L$ implies that a $p_{t+1}$ lower than $Y_{t+1}$ is more costly than a $p_{t+1}$ higher than $Y_{t+1}$. To be more precise, if $\alpha_L < 0$, the linex loss function shows an almost exponential increase in loss to the left of the origin ($p_{t+1} - Y_{t+1} = 0$) and an almost linear increase in loss to the right of the origin. A positive value of $\alpha_L$ implies the opposite, and $\alpha_L \to 0$ implies symmetric loss. Zellner (1986) shows that the point forecast which minimizes expected loss is given by
$$\hat{p}_{t+1} = -\frac{1}{\alpha_L} \log \mathrm{E}\big[\exp(-\alpha_L Y_{t+1})\big]. \qquad (11)$$
Hence, if we assume that the forecast distribution of $Y_{t+1}$ is normal with mean $m_t$ and variance $s_t^2$, then the point forecast is given by
$$\hat{p}_{t+1} = m_t - \frac{\alpha_L}{2} s_t^2. \qquad (12)$$
Again, it is possible to estimate $\alpha_L$ in case we observe several forecasts of experts together with the corresponding realizations. Under the same conditions as above and using the same arguments, taking the difference between $p_{t+1}$ and $y_{t+1}$ and dividing by $\hat{\sigma}_t$ results in the simple regression
$$\frac{p_{t+1} - y_{t+1}}{\hat{\sigma}_t} = \beta_0 \frac{1}{\hat{\sigma}_t} + \beta_1 \frac{\hat{s}_t^2}{2\hat{\sigma}_t} + \varepsilon_t. \qquad (13)$$
OLS provides the estimate for the systematic bias $b = \beta_0$ and asymmetry parameter $\alpha_L = -\beta_1$.
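Zellner's closed-form result for the normal case can again be checked numerically. The sketch below uses illustrative values ($m = 10$, $s = 2$, $\alpha_L = -0.5$) and compares the Monte Carlo minimizer of expected linex loss with $m - (\alpha_L/2) s^2$.

```python
import numpy as np
from scipy import stats, optimize

# Illustrative assumptions: normal forecast distribution N(m, s^2), alpha_L = -0.5,
# so underprediction (p below Y) is the more costly error.
m, s, alpha_L = 10.0, 2.0, -0.5
dist = stats.norm(loc=m, scale=s)

def expected_linex(p, n=400_000, seed=0):
    """Monte Carlo estimate of E[exp(a(p-Y)) - a(p-Y) - 1]; fixed seed."""
    e = p - dist.rvs(size=n, random_state=seed)
    return np.mean(np.exp(alpha_L * e) - alpha_L * e - 1.0)

p_opt = optimize.minimize_scalar(expected_linex,
                                 bounds=(m - 4 * s, m + 4 * s),
                                 method="bounded").x

# Zellner (1986): under normality the optimal forecast is m - (alpha_L/2) s^2,
# here 10 - 0.5*(-0.5)*4 = 11, i.e. above the mean because alpha_L < 0.
p_theory = m - 0.5 * alpha_L * s**2
print(p_opt, p_theory)
```

With $\alpha_L < 0$ the optimal point forecast lies above the forecast mean, which matches the stylized fact that experts who fear underprediction quote high.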

Estimation
To run regressions (9) and (13), we need estimates of $s_t^2$ and $\sigma_t^2$, where $s_t^2$ is the variance of the forecast distribution and $\sigma_t^2$ is the variance of the data. Note that the variance $s_t^2$ cannot be estimated from the variance of the available expert forecasts, as these forecasts may be biased. However, it is possible to construct an econometric model to create $T$ unbiased model forecasts $mf_{t+1}$ for $y_{t+1}$ such that $mf_{t+1} = \mathrm{E}[y_{t+1}]$. We can now assume that $s_t^2$ is constant ($s_t^2 = s^2$ for $t = 1, \ldots, T$) and that the variance of the expert is equal to the model variance.
If we combine the unbiased model forecasts ($mf_{t+1} = \mathrm{E}[y_{t+1}]$) and the realizations $y_{t+1}$, we can also estimate $\sigma^2$ using
$$\hat{\sigma}^2 = \frac{1}{T} \sum_{t=1}^{T} (y_{t+1} - mf_{t+1})^2, \qquad (14)$$
under the assumption that $\sigma_t^2 = \sigma^2$. Because $\sigma_t$ and $s_t$ are now constant over $t$, we need panel data with expert forecasts and realizations in order to estimate the parameters in (9) and (13). In other words, if we have forecasts for variables $i = 1, \ldots, N$ over periods $t = 1, \ldots, T$, namely $p_{i,t+1}$ and $mf_{i,t+1}$, we can estimate $\hat{s}_i^2$ and $\hat{\sigma}_i^2$ for each $i$. In case of the lin-lin loss function, we can now estimate the bias and asymmetry parameter with the regression
$$\frac{p_{i,t+1} - y_{i,t+1}}{\hat{\sigma}_i} = \beta_0 \frac{1}{\hat{\sigma}_i} + \beta_1 \frac{\hat{s}_i}{\hat{\sigma}_i} + \varepsilon_{i,t}, \qquad (15)$$
where $b = \beta_0$ and $\Phi^{-1}(\alpha_A/(1+\alpha_A)) = \beta_1$. In case of the linex function, we can estimate the bias and asymmetry parameter with
$$\frac{p_{i,t+1} - y_{i,t+1}}{\hat{\sigma}_i} = \beta_0 \frac{1}{\hat{\sigma}_i} + \beta_1 \frac{\hat{s}_i^2}{2\hat{\sigma}_i} + \varepsilon_{i,t}, \qquad (16)$$
where $b = \beta_0$ and $\alpha_L = -\beta_1$.
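The panel version of the estimation can be sketched end-to-end. The simulation below is illustrative throughout: a hypothetical panel of $N = 200$ products over $T = 25$ periods, one expert with $\alpha_A = 1.5$ and no bias, per-product variance estimated from the model-forecast errors, and the pooled regression (15) run by OLS.

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)

# Hypothetical panel: N products, T periods, one expert using AAL with
# alpha_A = 1.5 and no systematic bias (b = 0). All numbers are illustrative.
N, T, alpha_A = 200, 25, 1.5
q = norm.ppf(alpha_A / (1 + alpha_A))        # Phi^{-1}(alpha_A/(1+alpha_A))

X_rows, z_rows = [], []
for i in range(N):
    sigma_i = rng.uniform(0.5, 2.0)          # true data s.d. for product i
    mu = rng.normal(20.0, 5.0, size=T)       # conditional means
    y = mu + sigma_i * rng.standard_normal(T)
    mf = mu                                  # unbiased model forecasts
    sigma_hat = np.sqrt(np.mean((y - mf) ** 2))  # sigma_i from forecast errors
    s_hat = sigma_hat                        # expert variance = model variance
    p = mu + sigma_i * q                     # expert forecasts under AAL
    X_rows.append(np.column_stack([np.full(T, 1.0 / sigma_hat),
                                   np.full(T, s_hat / sigma_hat)]))
    z_rows.append((p - y) / sigma_hat)

# Pooled regression (15) by OLS over all N*T observations
X, z = np.vstack(X_rows), np.concatenate(z_rows)
beta, *_ = np.linalg.lstsq(X, z, rcond=None)
alpha_hat = norm.cdf(beta[1]) / (1 - norm.cdf(beta[1]))
print(beta[0], alpha_hat)    # bias should be near 0, asymmetry near 1.5
```

Note that using $\hat{\sigma}_i$ instead of $\sigma_i$ introduces some estimation noise, which is one of the reasons cited later for differences with the Elliott, Komunjer and Timmermann (2005) estimates.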

Misspecification
The error terms $\varepsilon_{i,t}$ for $i = 1, \ldots, N$ and $t = 1, \ldots, T$ should be normal with mean 0 and variance 1 in regressions (15) and (16). If this is not the case, some of the assumptions, such as the assumption of a normal forecast distribution, may not be valid, or the loss function may not be adequate. It is therefore important to test whether the estimated residuals are standard normally distributed. If tests show that the error terms are not standard normally distributed, or if there are other reasons to doubt whether the forecast density is normal, it is also possible to assume that the forecasts are lognormally distributed in case of the lin-lin loss function (AAL). Under this distribution, the forecasts are generated by
$$p_{t+1} = \exp\!\left( m_t + s_t\,\Phi^{-1}\!\left(\frac{\alpha_A}{1 + \alpha_A}\right) \right), \qquad (17)$$
where $m_t$ and $s_t^2$ are the mean and variance of the logarithm of the forecast distribution and $\Phi^{-1}$ is the inverse CDF of a standard normal distribution. Assume now that the realizations $y_{t+1}$ result from a lognormal distribution with parameters $\mu_t$ and $\sigma_t^2$ for $t = 1, \ldots, T$, and hence $\log(y_{t+1}) = \mu_t + \sigma_t \varepsilon_t$, where $\varepsilon_t$ is a realized draw from a standard normal distribution. We can now write
$$\log(p_{t+1}) - \log(y_{t+1}) = b + s_t\,\Phi^{-1}\!\left(\frac{\alpha_A}{1 + \alpha_A}\right) - \sigma_t \varepsilon_t, \qquad (18)$$
where $b$ is again the systematic bias in the forecast distribution, thus $m_t = \mu_t + b$. Using the estimates of $s_t$ and $\sigma_t$ and using the relevant panel data, this leads to the regression
$$\frac{\log(p_{i,t+1}) - \log(y_{i,t+1})}{\hat{\sigma}_i} = \beta_0 \frac{1}{\hat{\sigma}_i} + \beta_1 \frac{\hat{s}_i}{\hat{\sigma}_i} + \varepsilon_{i,t}. \qquad (19)$$
Again, if the assumptions are correct, including the assumption of lognormality of the forecast distribution, and the loss function is AAL, the error terms $\varepsilon_{i,t}$ for $i = 1, \ldots, N$ and $t = 1, \ldots, T$ should be normal with mean 0 and variance 1.
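In practice, the check on the residuals can be carried out with a one-sample Kolmogorov-Smirnov test against the standard normal CDF, as done in the empirical section. The sketch below uses artificial residuals (one well-specified sample, one deliberately non-normal sample) purely to illustrate the mechanics.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Illustrative residual check: if the normality assumption and the loss
# function are correct, the regression residuals should be standard normal.
good = rng.standard_normal(1000)                      # well-specified case
bad = rng.lognormal(mean=0.0, sigma=0.5, size=1000)   # misspecified case

# One-sample Kolmogorov-Smirnov test against the standard normal CDF
p_good = stats.kstest(good, "norm").pvalue
p_bad = stats.kstest(bad, "norm").pvalue
print(p_good, p_bad)
```

A small p-value signals that the residuals are incompatible with a standard normal distribution, so either the distributional assumption or the assumed loss function is questionable.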
Another way to check whether the assumptions are correct is to compare the results for the AAL loss function with the results found with the method of Elliott, Komunjer and Timmermann (2005). They use the general loss function
$$L(p_{t+1}, Y_{t+1}) = \big[\alpha_E + (1 - 2\alpha_E)\, I[\,p_{t+1} - Y_{t+1} > 0\,]\big]\, |p_{t+1} - Y_{t+1}|^q, \qquad (20)$$
where $I[\cdot]$ is an indicator function which takes a value of 1 if the statement between brackets is true and is 0 otherwise, where $\alpha_E \in (0, 1)$, and where they impose $q = 1$ or $q = 2$. By setting $q = 1$, the AAL loss function is obtained as defined above in (2), but with weight $\alpha_E$ for cases where $p_{t+1} \leq Y_{t+1}$ and with weight $1 - \alpha_E$ for cases where $p_{t+1} > Y_{t+1}$. Stated differently, $\alpha_E/(1 - \alpha_E) = \alpha_A$. Elliott, Komunjer and Timmermann (2005) do not make assumptions on the distribution of the forecasts. Therefore, if the normality assumption is valid, their methodology should result in an $\hat{\alpha}_E$ for which $\hat{\alpha}_E/(1 - \hat{\alpha}_E) \approx \hat{\alpha}_A$, where $\hat{\alpha}_A$ is obtained from (15). Differences between $\hat{\alpha}_E/(1 - \hat{\alpha}_E)$ and $\hat{\alpha}_A$ might result from the chosen IVs for the estimation of $\alpha_E$, from the use of $\hat{s}$ and $\hat{\sigma}$ instead of $s$ and $\sigma$ for the estimation of $\alpha_A$, or both.
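The mapping between the two parameterizations can be verified directly: with $q = 1$ and $\alpha_E = \alpha_A/(1 + \alpha_A)$, the two loss functions are proportional and therefore imply the same optimal forecast. A short check, with an illustrative $\alpha_A = 1.5$:

```python
import numpy as np

def ekt_loss(e, alpha_E, q=1):
    """EKT-style loss; e is the forecast error (forecast minus realization),
    with weight alpha_E on e < 0 and 1 - alpha_E on e > 0."""
    return (alpha_E + (1 - 2 * alpha_E) * (e > 0)) * np.abs(e) ** q

def aal_loss(e, alpha_A):
    """Lin-lin (AAL) loss as in equation (2)."""
    return np.where(e < 0, alpha_A * np.abs(e), np.abs(e))

alpha_A = 1.5
alpha_E = alpha_A / (1 + alpha_A)      # so that alpha_E/(1 - alpha_E) = alpha_A
e = np.linspace(-3, 3, 7)

# With q = 1, EKT loss equals (1 - alpha_E) times the AAL loss for every error,
# so both are minimized by the same point forecast.
ratio_check = ekt_loss(e, alpha_E) - (1 - alpha_E) * aal_loss(e, alpha_A)
print(np.abs(ratio_check).max())
```

Because the two losses differ only by the positive scale factor $1 - \alpha_E$, comparing $\hat{\alpha}_E/(1 - \hat{\alpha}_E)$ with $\hat{\alpha}_A$ is a valid cross-check of the two estimation strategies.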
In the next section, we will illustrate the techniques and robustness checks described in this section for a range of forecasts made by many experts.

III. Illustration
We apply our methodology to an extensive panel data set. The data set covers SKU-level sales data and is described in detail in the next subsection. In the subsections after that, the results of our analysis are discussed.

Data set
For our case study, we use monthly sales data of a large pharmaceutical company. The company has its headquarters in The Netherlands and has local offices in various countries worldwide. The company uses an automated statistical package to create forecasts using lagged sales data as the only input. The experts know that these data are the only input. Each month, the model selection and parameter estimation are updated, whereby the package uses techniques such as Box-Jenkins and Holt-Winters. These model forecasts are then sent to the managers/experts in the local offices, after which they quote their own forecasts.
The forecasts are available for the months November 2004 through November 2006. They are created for various horizons, but we only use the 1-step-ahead forecasts in the analysis presented in this article. In each country, forecasts are created by a different expert, and hence we have forecasts for 35 countries and thus 35 distinct individuals. For confidentiality reasons, we denote the countries with Roman numerals I-XXXV. Forecasts are created for 1038 different products. In the notation of the previous section, this means that $i$ ranges from 1 to 1038. Per product we have a minimum of 15 and a maximum of 25 observations for which the model forecast, the expert forecast and realized sales are available to us. Thus, $T$ depends on $i$ and $15 \leq T_i \leq 25$ for $i = 1, \ldots, 1038$. All together, we have 24,897 observations. We denote the model forecasts as constructed by the statistical programme of the company as MF, and the final forecasts from the experts are denoted as EF. The model that we use to estimate $\sigma_i$ and $s_i$ is for each $i$ an AR(1) model, for which the parameters are estimated over all available observations for $i$. For $mf_{i,t+1}$, for all $i$ and $t$, we consider the in-sample forecasts generated by these AR(1) models. Note that $MF_{i,t+1}$ and $mf_{i,t+1}$ are different forecasts: the first is the statistical model forecast as used by the company and the second is the forecast from the AR(1) model used to estimate $\sigma_i$ and $s_i$.
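The per-product AR(1) step can be sketched as follows. The series below is simulated (all parameter values are illustrative, not from the actual data set); the AR(1) is fitted by OLS, its in-sample fitted values play the role of $mf_{i,t+1}$, and the root mean squared in-sample error provides $\hat{\sigma}_i$ (and, by assumption, $\hat{s}_i$).

```python
import numpy as np

rng = np.random.default_rng(4)

# Illustrative product series: AR(1) sales with 25 monthly observations,
# roughly matching the T_i range reported for the actual data set.
T, c, phi = 25, 20.0, 0.5
y = np.empty(T)
y[0] = c / (1 - phi)                  # start at the stationary mean
for t in range(1, T):
    y[t] = c + phi * y[t - 1] + rng.standard_normal()

# Fit the AR(1) by OLS: y_{t+1} = c + phi * y_t + u_{t+1}
X = np.column_stack([np.ones(T - 1), y[:-1]])
coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)

mf = X @ coef                         # in-sample one-step-ahead forecasts mf_{t+1}
sigma_hat = np.sqrt(np.mean((y[1:] - mf) ** 2))
s_hat = sigma_hat                     # expert variance set equal to model variance
print(coef, sigma_hat)
```

In the application this fit is repeated for each of the 1038 products, giving product-specific $\hat{\sigma}_i$ and $\hat{s}_i$ for regressions (15) and (16).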
The parameters in (15), (16) and (19) are estimated for each expert separately by multiplying the two variables in the regressions by dummy variables for the managers. We also estimate α E per expert. Observations per expert range from 96 to 2132 with an approximate average of 710 observations.

Estimated asymmetry
We begin by analysing the results obtained under the assumption that the AAL function is used by the experts. Column 2 in Table 1 presents the estimated asymmetry parameter $\hat{\alpha}_A$ per expert. We see that 26 of the 35 experts have an $\hat{\alpha}_A$ that is significantly different from 1 at a significance level of 10%. For 21 of these managers, the difference is even significant at the 1% significance level. For all those 26 managers, the $\hat{\alpha}_A$ exceeds 1, meaning that sales forecasts that are too low are penalized more than forecasts that are too high. On average over the 35 experts, $\hat{\alpha}_A$ has a value of 1.40, which indicates that too-low forecasts are weighted 40% more heavily than too-high forecasts. To get some more insight into this value of $\hat{\alpha}_A$, see Figure 1.
The estimated systematic bias $\hat{b}$ for each expert can be found in Column 3 of Table 1. There are 11 experts with a significant systematic bias at the 1% significance level and another 2 experts with a significant systematic bias at the 5% significance level. Most of these biases are positive, and most are linked to a significantly positive asymmetry parameter.
If we only take the 1% significance level into consideration, we can conclude that 15 experts have an asymmetric loss function, but no systematic bias. Another six experts have an asymmetric loss function and also a systematic bias. Only five experts have a systematic bias and no asymmetric loss function, and finally, only nine experts seem to have a symmetric loss function and no systematic bias.
When we apply the test regression to the model forecasts MF, we obtain the results as reported in Columns 4 and 5 of Table 1. As we might expect from model forecasts based on techniques such as Box-Jenkins and Holt-Winters, we find much less evidence of asymmetry in the loss function and of systematic bias. For only eight countries, the $\hat{\alpha}_A$ is significantly different from 1 at the 1% significance level, and for one more at the 5% significance level. The average of the 35 $\hat{\alpha}_A$ values is 1.03, which is very close to 1. Some evidence of systematic bias is found in 12 countries, but at the 1% significance level, only 4 of these cases remain. In sum, the model-based forecasts in general seem unbiased and appear to have been created using a symmetric loss function.

Now we turn to the results when we assume that the linex loss function is used by the experts. See Table 2 for the estimated asymmetry parameters and systematic biases, again for both EF and MF. In the second column of this table, we find $\hat{\alpha}_L$ for each expert. For 18 experts, we find an $\hat{\alpha}_L$ significantly different from 0 (thus asymmetry) at a significance level of 10%. For 12 of these, the difference is also significant at 1%. So this is almost half of the cases in which we found asymmetry for the AAL function. All except 1 (which is only significant at the 10% level) have a negative asymmetry parameter, indicating that again negative forecast errors weigh more heavily than positive forecast errors. All except two (which are both again only significant at the 10% level) were also found to have an asymmetric loss function under AAL. On average, $\hat{\alpha}_L$ has a value of −0.0002. See Figure 2 for the shape of LIN with an $\alpha_L$ equal to this average estimate.
However, we find a significant systematic bias more often under the linex loss function than under the lin-lin loss function, see Column 3 of Table 2. 22 experts have a $\hat{b}$ significantly different from 0 at the 10% significance level, and for 16 of them this difference is also significant at the 1% level. In some instances, the linear asymmetry as found under AAL seems to be replaced by a (more profound) systematic bias, see for example the experts denoted with IV, XX and XXX. In general, the bias is again positive.
In sum, we find that at the 1% significance level, there are far more experts with a symmetric loss function (23) than with an asymmetric loss function (12) if we assume the linex loss function. 12 of the experts with a symmetric loss function also do not have a systematic bias, although 16 experts do have a systematic bias. Results are also a bit more ambiguous than in the AAL situation, because there are more countries for which significant asymmetry and/or bias is found at the 5% or 10% significance level but not at the 1% significance level. Finally, we also compare these linex results for EF with the linex results for MF, see Columns 4 and 5 of Table 2. Again, we do not find much evidence for asymmetry and systematic bias in the model forecasts. $\hat{\alpha}_L$ is on average −4.13e−06, so much closer to 0 than the average $\hat{\alpha}_L$ of −0.0002 found for EF. For only 10 countries is the asymmetry parameter significantly different from 0 at the 10% level, and in only 3 countries at the 1% level. The number of significant systematic biases is 16 at the 10% level and 6 at the 1% level. So again these results confirm that statistical model forecasts are unbiased and derived from a symmetric loss function.

Specification checks
So far, we have analysed the results given the assumptions underlying the analysis. To test these assumptions, we now follow the strategy as outlined in section 'Misspecification'.
The first step is to check whether the error terms of the regression models (15) for AAL and (16) for LIN are standard normally distributed. To that end, we use the Kolmogorov-Smirnov test, see D'Agostino and Stephens (1986). The test is performed on the error terms of each country separately, so we have 35 test results. In the second and third column of Table 3, we see how often these 35 tests reject the null hypothesis of standard normally distributed error terms at the 1% significance level. For the AAL function, we see fairly low figures. For EF, the null hypothesis is rejected in a little over one-third of the tests, and for MF in a little over one-fifth. Note that the numbers of observations on which the tests are performed are large (see section 'Data set') and that the power of the test to detect deviations from standard normality increases with this number of observations. For countries with many observations, we might therefore use an even lower significance level, and the number of rejections might decline even further.
For the linex loss function, we find much higher numbers of rejections, namely 23 (66%) for EF and 12 (34%) for MF. As the numbers for AAL are much lower, this might indicate that we should not reject the assumption of a normal forecast distribution at this point, but that the linex loss function is perhaps not appropriate. The AAL function seems to be the loss function that is more likely to be used by the managers creating the forecasts in this data set.
As we deal with sales forecasts in this application, which are always positive, it might be reasonable to assume that the forecasts are lognormally distributed instead of normally. Therefore, we also estimate (19), again with separate coefficients for each country, and again we test if the error terms are standard normally distributed. We find overwhelming evidence that the forecast distribution is not lognormal, see Column 4 of Table 3. Both for EF and MF, the null hypothesis of standard normal error terms is rejected for all 35 countries. This again indicates that assuming a normal forecast distribution seems acceptable for our data.
Our final specification check involves a comparison of our AAL results with those obtained using the method of Elliott, Komunjer and Timmermann (2005). Table 4 and Figures 3 and 4 give the results. Note that Columns 2 and 4 of Table 4 are the same as Columns 2 and 4 in Table 1, but are repeated for ease of comparison. Columns 3 and 5 present the results as obtained using the method of Elliott, Komunjer and Timmermann (2005), where we used as IVs a constant and one-month lagged sales. Remember that we would expect $\hat{\alpha}_A$ and $\hat{\alpha}_E/(1 - \hat{\alpha}_E)$ to be approximately the same if the assumptions for our method are correct.
First note, from Table 4, that whenever $\hat{\alpha}_A$ is significantly larger than 1 at any significance level, $\hat{\alpha}_E/(1 - \hat{\alpha}_E)$ is never significantly smaller than 1 at any significance level. Furthermore, whenever $\hat{\alpha}_A$ is significantly smaller than 1 at the 1%, 5% or 10% significance level (which happens only twice, for MF), $\hat{\alpha}_E/(1 - \hat{\alpha}_E)$ is never significantly larger than 1 at the 1%, 5% or 10% significance level. Both statements also hold true when $\hat{\alpha}_E/(1 - \hat{\alpha}_E)$ is evaluated against $\hat{\alpha}_A$. These results indicate that we never find fully conflicting results with the two alternative methods.
The largest difference between the two methods is that we sometimes find significant asymmetry with one method and no significant asymmetry with the other. If we focus on the 1% significance level, this happens 8 times for EF and 2 times for MF, but in most of these cases (7), the other method also shows asymmetry at the 5% or 10% level. Hence, we find that the two methods may differ in the amount of asymmetry, but not in the sign of the asymmetry, and hardly in the existence of the asymmetry.
To get a more precise idea of the size of the differences in estimated asymmetry parameters, we can take a look at the histograms in Figures 3 and 4. Here, the differences between $\hat{\alpha}_A$ and $\hat{\alpha}_E/(1 - \hat{\alpha}_E)$ are depicted, for EF in the first figure and for MF in the second. Multiplying the differences by 100 gives the differences in percentages. Thus, for example, a value of 0.1 indicates that the difference in weight between too-low forecasts and too-high forecasts is 10% higher according to $\hat{\alpha}_A$ than according to $\hat{\alpha}_E/(1 - \hat{\alpha}_E)$.
Although we see some outliers in the graphs, the largest being a case where $\hat{\alpha}_E/(1 - \hat{\alpha}_E)$ exceeds 1 by 97 percentage points more than $\hat{\alpha}_A$ does, on average the difference is around 6% (0.06 in the figure). Furthermore, in 23 of the 35 countries, the difference is smaller than 25% in absolute terms, and in 31 of the 35 countries the difference is smaller than 50%. For MF, see Figure 4, these differences are even smaller, with an average difference of around 2.5% and a maximum difference of around 51%, both in absolute terms. The larger differences do not necessarily seem to be related to the rejection of a normal forecast distribution. The correlation between the absolute difference and the p-value of the Kolmogorov-Smirnov test is −0.14 for EF and −0.06 for MF. If we look at the test results for some countries with large differences in estimation results, we sometimes find rejection of the null hypothesis and sometimes we do not.
To conclude, we do not find large differences between the results of the two methods, and we take this as a final indication that the assumptions underlying our analysis do not need to be rejected, at least for the data set at hand.

IV. Conclusions
There is much available research on asymmetric loss functions for forecasters, but most of it is focussed on the theoretical discussion of possible shapes of those loss functions and on the resulting optimal forecasts. Very little is known about which loss function is actually used by experts when they create their forecasts, and it is rarely quantified to what extent the loss functions are asymmetric. We are aware of only one study, and this is presented in Elliott, Komunjer and Timmermann (2005).
In the present article, we propose a new and simple methodology to deduce the asymmetry parameter of the AAL function and of the linex loss function. The derivation is based on some simplifying assumptions which can be held against actual data in a number of ways. The derivations were shown to lead to simple linear regressions.
We applied our methodology to a large data set of SKU-level sales forecasts where model forecasts are received by experts, after which they provided their final forecasts. We documented substantial evidence that the experts use an asymmetric loss function, where we diagnosed that most likely it is the AAL function. Forecasts that are too low have a weight in the loss function that is on average 40% higher than forecasts that are too high.
The methodology proposed in this article yields results similar to those found with the methodology of Elliott, Komunjer and Timmermann (2005), and in general, we find no obvious indications that the assumptions underlying our analysis should be rejected. To what extent this is true for other data sets remains to be analysed.
Further research on loss functions of experts could focus on multi-step-ahead forecasts. As forecast errors might be correlated in such situations, the methodology might be a bit more complicated than the one presented here. Finally, forecast updates, that is, sequential forecasts for the same event, are also interesting to analyse.

Disclosure statement
No potential conflict of interest was reported by the authors.