Trends in Extreme Value Indices

Abstract We consider extreme value analysis for independent but nonidentically distributed observations. In particular, the observations do not share the same extreme value index. Assuming continuously changing extreme value indices, we provide a nonparametric estimate for the functional extreme value index. Besides estimating the extreme value index locally, we also provide a global estimator for the trend and its joint asymptotic theory. The asymptotic theory for the global estimator can be used for testing a prespecified parametric trend in the extreme value indices. In particular, it can be applied to test whether the extreme value index remains at a constant level across all observations.


Introduction
Extreme value analysis makes statistical inference on the tail region of a distribution function. Balkema and de Haan (1974) show that extreme observations above a high threshold follow approximately a scaled generalized Pareto distribution (GPD). Consequently, one main parameter governs the tail behavior: the shape parameter in the GPD, also known as the extreme value index. Estimation of this parameter is therefore of prime importance for tail inference. Classical extreme value analysis assumes that the observations are independent and identically distributed (iid). Recent studies aim at dealing with the case where observations are drawn from different distributions. In this article, we deal with non-iid observations: we consider a continuously changing extreme value index and aim to estimate the functional extreme value index accurately.
Consider a set of distribution functions F_s(x) for s ∈ [0, 1] and independent random variables X_i ∼ F_{i/n}(x) for i = 1, . . . , n.
Here F_s(x) is assumed to be continuous with respect to s and x. To perform extreme value analysis, assume that F_s ∈ D_{γ(s)}, where D_γ denotes the max-domain of attraction with corresponding extreme value index γ. In this article, we consider the case where the function γ is positive and continuous on [0, 1]. The goal is to estimate the function γ and test the hypothesis that γ = γ_0 for some given function γ_0, based on the observations X_1, . . . , X_n.
The idea for estimating γ(s) locally is similar to kernel density estimation. More specifically, we use only the observations X_i in the h-neighborhood of s, that is, i ∈ I_n(s) := {i : |i/n − s| < h}. Ranking these observations as X^{(s)}_{1,[2nh]} ≤ · · · ≤ X^{(s)}_{[2nh],[2nh]}, we define the local Hill estimator

γ̂_H(s) = (1/[2kh]) Σ_{j=1}^{[2kh]} ( log X^{(s)}_{[2nh]−j+1,[2nh]} − log X^{(s)}_{[2nh]−[2kh],[2nh]} ).   (1)

We start with the local asymptotic normality. Under suitable conditions on k and h, we can show that, as n → ∞, for each fixed s ∈ (0, 1),

√(2kh) (γ̂_H(s) − γ(s)) →_d N(0, γ²(s)).

This result is comparable with the asymptotic normality of the Hill estimator, but now the estimation is based on observations with different extreme value indices. The speed of convergence is √(2kh) because only the top [2kh] order statistics are used in the estimation.
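As an illustration (not the article's code), the local Hill estimator can be sketched as follows, assuming the window {i : |i/n − s| < h} and the top [2kh] order statistics within it; the function name `local_hill` and the iid Pareto sanity check are our own:

```python
import numpy as np

def local_hill(x, s, k, h):
    """Local Hill estimator of gamma(s): the Hill estimator applied to the
    top [2kh] order statistics among the observations X_i with i/n in the
    h-neighborhood of s."""
    n = len(x)
    grid = np.arange(1, n + 1) / n
    window = np.asarray(x, dtype=float)[np.abs(grid - s) < h]
    m = int(2 * k * h)                     # number of top order statistics used
    top = np.sort(window)[::-1][: m + 1]   # top m+1 observations, decreasing
    logs = np.log(top)
    return float(np.mean(logs[:m]) - logs[m])

# sanity check on iid standard Pareto data, where gamma(s) = 1 for all s
rng = np.random.default_rng(0)
x = rng.pareto(1.0, size=5000) + 1.0       # standard Pareto sample
est = local_hill(x, s=0.5, k=200, h=0.1)   # should be near 1
```

With 2kh = 40 top order statistics, the estimate fluctuates around the true value 1 with standard deviation of roughly 1/√(2kh) ≈ 0.16.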
Next, we consider testing the hypothesis that γ(s) = γ_0(s) for all s ∈ [0, 1]. Although we are able to estimate the function γ locally, the local estimators use only local observations, so their asymptotic limits are independent. That prevents us from constructing a testing procedure. In addition, the local estimators converge at a slow speed of convergence, 1/√(2kh). To achieve the stated goal, we consider the estimation of Γ(s) = ∫_0^s γ(u) du and test the equivalent hypothesis that Γ = Γ_0, where Γ_0 is the function defined as Γ_0(s) = ∫_0^s γ_0(u) du for s ∈ [0, 1].
The function Γ is estimated by aggregating the local estimators of γ(s) into a "global estimator" as follows. Consider a discretized version of γ̂_H(s): γ̂_H((2[s/(2h)] + 1)h). Define the estimator of Γ(s) as the integral of the discretized version: for all 0 ≤ s ≤ 1,

Γ̂_H(s) = ∫_0^s γ̂_H((2[u/(2h)] + 1)h) du.   (2)
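A corresponding sketch of the global estimator, integrating the blockwise-constant local Hill estimates over [0, s]; the function names and the blockwise implementation are illustrative assumptions, not the article's code:

```python
import numpy as np

def hill(top):
    """Hill estimator from the top m+1 observations (decreasingly sorted)."""
    logs = np.log(top)
    return float(np.mean(logs[:-1]) - logs[-1])

def gamma_hat_global(x, k, h, s):
    """Gamma_hat_H(s): integral over [0, s] of the discretized local Hill
    estimate, held constant at the value taken at the block midpoint
    (2p + 1)h on each block (2ph, 2(p + 1)h]."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    grid = np.arange(1, n + 1) / n
    m = int(2 * k * h)
    nblocks = int(np.ceil(s / (2 * h) - 1e-12))
    total = 0.0
    for p in range(nblocks):
        mid = (2 * p + 1) * h                 # block midpoint
        window = x[np.abs(grid - mid) < h]    # observations in the block's h-window
        top = np.sort(window)[::-1][: m + 1]
        step = min(2 * h, s - 2 * p * h)      # last block may be partial
        total += hill(top) * step
    return total

# sanity check: iid standard Pareto, so Gamma(1) = integral of gamma = 1
rng = np.random.default_rng(1)
x = rng.pareto(1.0, size=5000) + 1.0
G1 = gamma_hat_global(x, k=200, h=0.025, s=1.0)   # should be near 1
```

Averaging 20 blockwise Hill estimates reduces the noise of each local estimate, which is in line with the faster 1/√k rate of the global estimator.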
There are two notable features in this asymptotic relation. Firstly, the convergence is uniform for all s ∈ [0, 1]. Secondly, the speed of convergence for the estimator Γ̂_H(s) is 1/√k. From these two features, it is possible to construct efficient testing methods for the null hypothesis that Γ(s) = Γ_0(s) for all s ∈ [0, 1] with some given function Γ_0.
Our approach can be regarded as a combination of kernel density estimation and extreme value statistics. To prove the local and global asymptotic normality, we need to combine two limiting procedures as the number of observations tends to infinity. First, the observations used are from an h-neighborhood that is shrinking. Second, within each h-neighborhood, we apply a threshold to all observations that increases toward infinity. If the h-neighborhood shrinks too fast, there will not be sufficient observations in each neighborhood for statistical inference. If it shrinks too slowly, we would involve too many observations with very different extreme value indices, distorting the estimation. Therefore, the two limiting procedures have to be balanced so that the resulting estimators possess proper asymptotic behavior.
For that purpose, we assume some conditions regarding the choice of k and h that are related to the speed of variation of the distribution function F_s and the continuity of the extreme value index γ(s); see conditions (4)-(7). The first two conditions (4) and (5) are typical assumptions in kernel density estimation and extreme value statistics, respectively. The third condition (6) ensures that the extreme value index function γ(s) is sufficiently smooth. In other words, observations in the h-neighborhood have extreme value indices that are not too far off. Notice that γ(s) governs the parametric structure of the limit only. In order that a unified "threshold" can be applied to all observations in the h-neighborhood, we further assume the smoothness of the intermediate quantiles as in condition (7). These conditions are not too restrictive; see Example 2.1.
The studies closest to our approach are Gardes and Girard (2010) and Goegebeur, Guillou, and Schorgen (2014). The setups of these two studies are similar to our analysis, albeit formulated in a conditional setup. The former focuses on nonstochastic covariates, whereas the latter focuses on random covariates. Both approaches propose estimators using observations locally and establish the local asymptotic normality only. The conditions assumed in these two studies to obtain the local asymptotic normality are quite different from ours, and the results obtained therein also differ from our results. Besides, we establish a global result for Γ̂_H(s). The global asymptotic result is necessary for conducting hypothesis testing.
Our article is also related to, but differs from, heteroscedastic extremes. Einmahl, de Haan, and Zhou (2016) model the tail region of distributions of non-iid observations by considering the quotient between the tails of different distributions and a common tail. Assuming that such quotients stay positive and finite as one goes further into the tail, the asymptotic constant is called "scedasis." Within such a framework, the extreme value index remains unchanged across the non-iid observations. Compared to the heteroscedastic extremes, we allow for a continuously changing extreme value index and try to estimate the functional extreme value index accurately. In our case, the tails of the probability distributions are of different order, that is, the quotient between the tails of the distributions at two locations with different extreme value indices tends to either zero or infinity. In other words, we are dealing with distributions that differ much more than in the heteroscedastic extremes. Therefore, our situation cannot be handled in the same way as in heteroscedastic extremes.
This article is also related to the literature dealing with the variation or trend in the extreme value index when considering a purely parametric model such as the generalized extreme value (GEV) distribution or the GPD. First, one may model the trend in the parameters of such models as a specific functional of the covariates; see, for example, Smith (1989) for the GEV model and Davison and Smith (1990) for the GPD model. Second, the trend can also be estimated nonparametrically using various local estimation techniques; see, for example, Davison and Ramesh (2000) using the local likelihood method and Hall and Tajvidi (2000) using the local linearization method, among others. Compared to all these studies, we do not impose a fully parameterized model and therefore maintain a semiparametric approach.
Finally, this article contributes to the literature on testing the null hypothesis of constant extreme value index. Quintos, Fan, and Phillips (2001) and more recently Hoga (2017) considered testing a change point in the tail index. Einmahl, de Haan, and Zhou (2016) proposed two tests for the same purpose. Nevertheless, in all these studies, the main asymptotic result for the constructed tests is under a more restrictive null than having constant extreme value index only. In contrast, we consider a wider null hypothesis potentially including models with constant extreme value index that are excluded from the null of the two existing studies. In addition, our study allows for testing the null hypothesis of having a general prespecified trend in the extreme value index beyond the constant function, such as γ (s) = γ 0 (s) for all s.
We demonstrate the performance of our testing procedure by extensive simulation studies. In addition, our estimation procedure for the γ(s) function is also validated when the function differs from a constant. We apply our developed method to two datasets. The first dataset consists of daily precipitation at Saint-Martin-de-Londres, France from 1976 to 2015. The testing results show that we do not reject the constant extreme value index in this period. The second dataset consists of the losses of the S&P 500 index. The testing results show that we do not reject the constant extreme value index in the period from 1988 to 2012 but do reject this null in a longer period from 1963 to 2012. In the second application, we deal with the presence of serial dependence in the data. Nevertheless, we prefer to postpone the incorporation of serial dependence into the theory in order not to overload the already complicated article.
The article is organized as follows. Our main theorems regarding the local and global estimators are presented in Section 2. The testing procedure is established in Section 3 with simulations. Section 4 is devoted to the applications. Proofs are postponed to the Appendix.

Main Theorem
We need the following conditions for obtaining the asymptotic theories of the local and global estimators.
First, we assume the usual second order condition, but uniformly for all s ∈ [0, 1], as follows. Denote U_s = (1/(1 − F_s))^← as the quantile function corresponding to the distribution F_s. Suppose there exists a continuous negative function ρ(s) on [0, 1] and a set of auxiliary functions A_s(t) that are continuous with respect to s, such that

lim_{t→∞} ( U_s(tx)/U_s(t) − x^{γ(s)} ) / A_s(t) = x^{γ(s)} (x^{ρ(s)} − 1)/ρ(s)   (3)

holds for x > 1/2 and uniformly for all s ∈ [0, 1]. A similar uniform second order condition has been adopted in Einmahl and Lin (2006). Next, we require that the intermediate sequence k and the bandwidth h are properly chosen as follows: there exists some positive constant ε > 0 such that as n → ∞,

h = h_n → 0, k = k_n → ∞, k_n/n → 0, k_n h_n / log(1/h_n) → ∞,   (4)

Δ_{1,n} := √(k_n) sup_{0≤s≤1} |A_s(n/k_n)| → 0,   (5)

Δ_{2,n} := √(k_n) log k_n sup_{|s_1−s_2|≤h_n} |γ(s_1) − γ(s_2)| → 0,   (6)

Δ_{3,n} := √(k_n) sup_{|s_1−s_2|≤h_n} |U_{s_1}(n/k_n)/U_{s_2}(n/k_n) − 1| → 0.   (7)

Condition (4) ensures that the number of high order statistics used in each local interval tends to infinity. Condition (5) is the one usually required in extreme value analysis to guarantee that the estimator has no asymptotic bias. Condition (6) assumes that k_n is compatible with the h_n-variation of the γ function. Condition (7) states that the (1 − k_n/n)-quantiles of the distributions are sufficiently smooth in short h-intervals.
The following theorem gives the local asymptotic normality of the estimator γ̂_H(s) defined in (1).
Theorem 2.1. Let X_1, X_2, . . . , X_n be independent random variables with X_i ∼ F_{i/n}, where F_s(x) is continuous with respect to s and x and F_s ∈ D_{γ(s)}, where γ(s) is a positive continuous function on [0, 1]. Assume conditions (3)-(7). Then as n → ∞, we have that for all s ∈ (0, 1),

√(2kh) (γ̂_H(s) − γ(s)) →_d N(0, γ²(s)).

Remark 2.1. The result of Theorem 2.1 is still valid under weaker conditions if replacing √k in (5) and (6) by √(kh).
The next theorem gives the asymptotic normality of the global estimator Γ̂_H(s) defined in (2).
Theorem 2.2. Assume the same conditions as in Theorem 2.1. Then under a Skorokhod construction, there exists a series of Brownian motions W_n(s) such that as n → ∞,

sup_{0≤s≤1} | √k (Γ̂_H(s) − Γ(s)) − ∫_0^s γ(u) dW_n(u) | →_P 0.

We show through an example that the assumptions in (4)-(7) are consistent and not too restrictive.
Finally, we verify condition (7) in this example. It holds as n → ∞, guaranteed by the fact that ξ > (−ρ)(1 − η). To conclude, we have shown that our required conditions are consistent. Notice that this example can easily be generalized to U_s(t) = C(s) t^{γ(s)} (1 + D(s) t^{ρ(s)}), with proper functions C(s) > 0 and D(s) continuous on [0, 1]. Furthermore, the exponent in the Lipschitz continuity condition on γ and ρ could be any positive number, not necessarily one. Therefore, our required conditions are not too restrictive.

Testing Trends in Extreme Value Indices
Theorem 2.2 provides the possibility to test whether the extreme value indices follow a specific trend, that is, H_0: γ(s) = γ_0(s) for all s ∈ [0, 1], with some given function γ_0. Similar to testing a specific trend in the "scedasis" function in Einmahl, de Haan, and Zhou (2016), we apply an equivalent test for H_0: Γ(s) = Γ_0(s) for all s ∈ [0, 1]. Clearly, one may construct a Kolmogorov-Smirnov (KS) type test with the testing statistic defined as

T = √k sup_{s∈[0,1]} | Γ̂_H(s) − Γ_0(s) |.

Then, Theorem 2.2 implies that under the null hypothesis H_0:

T →_d sup_{s∈[0,1]} | ∫_0^s γ_0(u) dW(u) |,

where W(u) is a standard Brownian motion defined on [0, 1]. It is often of interest to test whether the extreme value index remains constant over time, without prior knowledge of the constant extreme value index, that is, H_0: γ(s) = γ for all s ∈ [0, 1] without specifying γ. In this case, one may use Γ̂_H(1) as an estimator of the constant extreme value index γ and define the testing statistic as

T̂ = √k sup_{s∈[0,1]} | Γ̂_H(s) − s Γ̂_H(1) | / Γ̂_H(1).

It is straightforward to show that under the null hypothesis H_0:

T̂ →_d sup_{s∈[0,1]} | B(s) |,

where B(s) is a standard Brownian bridge defined on [0, 1]. Note that the limit distribution is identical to that in the classical KS test.
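A hedged numerical sketch of the constant-index test: we take the statistic in the form √k sup_s |Γ̂_H(s) − s Γ̂_H(1)| / Γ̂_H(1), which is one form consistent with the stated Brownian-bridge limit, and compute its p-value from the classical Kolmogorov alternating series; all function names are ours, and the input below is synthetic noise mimicking the limit rather than real estimates:

```python
import numpy as np

def ks_pvalue(t, terms=100):
    """P(sup_{s in [0,1]} |B(s)| > t) for a standard Brownian bridge B,
    via the classical Kolmogorov alternating series."""
    if t <= 0:
        return 1.0
    j = np.arange(1, terms + 1)
    p = 2.0 * np.sum((-1.0) ** (j - 1) * np.exp(-2.0 * j**2 * t**2))
    return float(min(1.0, max(0.0, p)))

def constant_index_test(s_grid, Gamma_hat, k):
    """KS-type statistic for H0: gamma(s) is constant, computed on a grid."""
    G1 = Gamma_hat[-1]                      # Gamma_hat_H(1)
    T = float(np.max(np.sqrt(k) * np.abs(Gamma_hat - s_grid * G1) / G1))
    return T, ks_pvalue(T)

# under the null, Gamma(s) = gamma * s; mimic the limit theory by adding
# Brownian-bridge noise of size gamma / sqrt(k) to the exact line
rng = np.random.default_rng(2)
s = np.linspace(0.0, 1.0, 201)
k, gamma = 400, 0.5
w = np.concatenate(([0.0], np.cumsum(rng.normal(0.0, np.sqrt(1 / 200), 200))))
bridge = w - s * w[-1]                      # Brownian bridge on the grid
Gamma_hat = gamma * s + gamma * bridge / np.sqrt(k)
T, p = constant_index_test(s, Gamma_hat, k)
```

Self-normalizing by Γ̂_H(1) removes the unknown constant γ from the limit, which is why the classical KS quantiles apply.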
We run a simulation study to demonstrate the finite sample performance of the testing procedure using T̂. In all our simulations we generate m = 2000 samples with n observations in each sample. We start with setting the sample size to n = 5000. For the two parameters k and h, we choose combinations of k = 100, 200 and h = 0.025, 0.04.
For each sample, we simulate the observations from the following data generating process. For the function γ(s) we consider either a linear trend γ(s) = 1 + bs or a trend following the sine function γ(s) = 1 + c sin(2πs). If b = 0 or c = 0, the two models reduce to the iid case, that is, the null hypothesis that the extreme value indices remain constant holds. We consider four alternative cases: b = 1, b = 2, c = 1/4, and c = 1/2. In total, we have 20 sets of simulations due to the various choices of k, h, and the model of γ(s).
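The display of the data generating process itself is not reproduced above; one standard choice consistent with the setup is X_i = Z_i^{γ(i/n)} with Z_i standard Pareto, which has extreme value index exactly γ(i/n). A sketch under that assumption:

```python
import numpy as np

def simulate_trend_sample(n, gamma_fun, rng):
    """Independent X_i with P(X_i > x) = x**(-1/gamma(i/n)) for x >= 1,
    i.e. X_i = Z_i**gamma(i/n) for standard Pareto Z_i, so that X_i has
    extreme value index exactly gamma(i/n). (The article's exact DGP
    display is not recoverable here; this is one standard choice.)"""
    z = rng.pareto(1.0, size=n) + 1.0        # standard Pareto sample
    s = np.arange(1, n + 1) / n
    return z ** gamma_fun(s)

rng = np.random.default_rng(3)
# linear trend with b = 1 and sine trend with c = 1/2
x_lin = simulate_trend_sample(5000, lambda s: 1.0 + 1.0 * s, rng)
x_sin = simulate_trend_sample(5000, lambda s: 1.0 + 0.5 * np.sin(2 * np.pi * s), rng)
```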
For each simulated sample j we apply the test T̂ in Section 3 to test whether the extreme value indices remain constant and obtain the corresponding p-value, p_j, for j = 1, 2, . . . , m. For the simulations based on b = 0 (or c = 0), that is, when the null hypothesis holds, we make QQ-plots of the simulated p-values below 0.1 across all m samples against a uniform distribution on [0, 0.1]. If the size of the test agrees with the significance level, the dots in the QQ-plots should line up on the 45-degree line. Figure 1 presents four QQ-plots corresponding to the four choices of (k, h). The plots confirm the validity of our test under the null hypothesis.
Next, for all sets of simulations, we calculate the rejection rate at each significance level α as #{j : p_j < α}/m for α = 0.01, 0.05, and 0.1. The rejection rates are reported in Table 1.
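The rejection-rate computation #{j : p_j < α}/m can be sketched as follows, here fed with hypothetical uniform p-values standing in for a run under the null:

```python
import numpy as np

def rejection_rates(pvals, alphas=(0.01, 0.05, 0.10)):
    """Empirical rejection rate #{j : p_j < alpha} / m per level alpha."""
    pvals = np.asarray(pvals)
    return {a: float(np.mean(pvals < a)) for a in alphas}

# hypothetical p-values: m = 2000 draws, uniform as expected under the null
rng = np.random.default_rng(5)
rates = rejection_rates(rng.uniform(0.0, 1.0, size=2000))
```

Under the null the rates should be close to the nominal levels, which is exactly the Type I error check in the first panel of Table 1.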
In the first panel, we observe that under the null hypothesis, the rejection rates, that is, the Type I error, are close to the significance levels. The difference between the two choices of h is negligible when choosing k = 200. The difference between the two choices of k is also negligible when considering h = 0.025. For k = 100 and h = 0.04, the test is conservative.
In the next two panels, the rejection rates can be read as the power of the test. Between the two choices of h, h = 0.025 leads to a slightly higher power for rejecting the linear trend, while h = 0.04 leads to a slightly higher power for the sine trend. Between the two choices of k, k = 200 leads to a much higher power in all alternative models. Therefore, choosing a higher k is preferred as long as the bias is not an issue, whereas the choice of h depends on the shape of the trend.
When comparing across the models, the power is higher for b = 2 (c = 1/2) than for b = 1 (c = 1/4). This is in line with the intuition that the test is more powerful at detecting larger deviations from the null hypothesis of a constant extreme value index.
Finally, for the two sine trends, c = 1/4 and c = 1/2, we plot the average of the estimated γ(s) across the m samples (the solid line) and its corresponding 95% confidence interval for each given s. There are two ways to construct the confidence interval. First, we use the asymptotic theory in Theorem 2.1 to construct the confidence interval based on the averaged estimate of γ(s) (the dotted lines). Second, we can obtain an empirical confidence interval from the m estimates (the dash-dotted lines). The comparison between the two provides a validation of our asymptotic theory. In this exercise, we fix k = 200 and h = 0.025. Figure 2 shows the estimation results for the sine trends. Firstly, the average estimate across the m samples resembles the true value of the γ(s) function (the dashed line). Secondly, the confidence intervals derived from our asymptotic theory are close to those obtained from the simulation, indicating the validity of our asymptotic theory. In both cases, the empirical confidence interval is shifted slightly upward compared to the theoretical confidence interval. This shift cannot be explained by estimation bias because the average estimate is close to the true value. An alternative explanation is that the asymptotic normality requires a large 2kh, but in this simulation 2kh = 10 is rather low. To validate this reasoning, we make QQ-plots of the estimates of γ(1/2) against the standard normal distribution in Figure 3. The solid line has a slope of 1/√(2kh). The deviation of the dots from the solid line indicates that the current level of 2kh is rather low for normality of the estimates to hold. Although the middle part of the distribution might be close to a normal distribution, both the left and right tails deviate from normality.
Besides working with the large sample size n = 5000, we also consider a smaller sample size, n = 2000. For the small sample size, we keep h = 0.025 and h = 0.04, but choose lower levels of k, namely k = 100 and k = 50. The rejection rates are reported in Table 2. The general patterns observed in the simulations for n = 5000 are preserved.

Application
We apply our developed method to two datasets to test whether the extreme value indices remain unchanged over time. If the null is not rejected, we estimate the constant extreme value index; if it is rejected, we estimate the time variation in the extreme value indices. Throughout the application analysis, we choose h = 0.025.

Application 1: Precipitation at Saint-Martin-de-Londres
We employ a dataset consisting of the precipitation at Saint-Martin-de-Londres from 1976 to 2015, with 14,610 daily observations. We test the constancy of the extreme value indices over the entire period. The obtained p-values against various levels of k are shown in the upper panel of Figure 4. We do not reject the null hypothesis at the 5% significance level (the dashed line). We then estimate the constant extreme value index by applying the Hill estimator to all observations, that is, estimating Γ(1). The obtained estimates against various levels of k are shown in the lower panel of Figure 4. Choosing k = 200, we get an estimate for the constant extreme value index of 0.395.

Application 2: Loss Returns of S&P 500
We employ the same dataset as in Einmahl, de Haan, and Zhou (2016), that is, the S&P 500 index from 1988 to 2012. We construct daily loss returns defined as X_t = log(P_t/P_{t+1}), where P_t is the index on day t. This results in a sample with 6302 observations. Similar to Einmahl, de Haan, and Zhou (2016), we test the constancy of the extreme value indices over the period from 1988 to 2012. The obtained p-values against various levels of k are shown in the upper panel of Figure 5. We do not reject the null hypothesis for k up to 750 at the 5% significance level (the dashed line). This result differs from the conclusion in Einmahl, de Haan, and Zhou (2016), where the constancy of the extreme value index in the period from 1988 to 2012 was rejected. We attribute the difference to the fact that we are testing a broader null. Notice that in the heteroscedastic extremes model in Einmahl, de Haan, and Zhou (2016), the scedasis function is assumed to be bounded away from 0 and +∞, whereas our model potentially allows for an unbounded scedasis under the null hypothesis. As a consequence, our test for the constancy of the extreme value index may have a lower power.
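The loss-return construction X_t = log(P_t/P_{t+1}) can be sketched as follows (the toy price series is ours, not S&P 500 data):

```python
import numpy as np

def loss_returns(prices):
    """Daily loss returns X_t = log(P_t / P_{t+1}): positive on days the
    index falls, zero when it is unchanged."""
    p = np.asarray(prices, dtype=float)
    return np.log(p[:-1] / p[1:])

prices = [100.0, 99.0, 101.0, 101.0]   # toy price series
x = loss_returns(prices)               # 3 loss returns from 4 prices
```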
Since the null hypothesis is not rejected for the sample from 1988 to 2012, we consider an extended sample from 1963 to 2012 (12,586 observations). The result is shown in the lower panel of Figure 5. We reject a constant extreme value index during this long period for k ranging from 250 to 750 at the 5% significance level. Based on this analysis, we conclude that there is a change in the extreme value index during the period from 1963 to 2012.
One concern in the aforementioned analysis is that financial data such as stock returns exhibit serial dependence. The presence of serial dependence would in general enlarge the asymptotic variance of the local estimators for γ(s). Correspondingly, the critical value of the proposed test should be higher. By using the test based on the assumption of no serial dependence, we tend to over-reject the null. Given that the analysis using the data from 1988 to 2012 did not reject the null, accounting for serial dependence would not alter that conclusion. However, the rejection result based on the data from 1963 to 2012 may suffer from the serial dependence issue. Therefore, we conduct an additional analysis as follows. We split the dataset into two subsets that consist of the daily returns on the even and odd days, respectively. In other words, we do not take returns from consecutive trading days. The split of the full dataset helps to mitigate the serial dependence, and the data in each subset are closer to the iid assumption. We conduct our tests on each subset, with the results presented in Figure 6.
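The even/odd-day split can be sketched as follows; the helper name is ours:

```python
import numpy as np

def split_even_odd(x):
    """Split a daily series into even-day and odd-day subsamples, so that
    neither subsample contains two consecutive trading days."""
    x = np.asarray(x)
    return x[::2], x[1::2]

x_even, x_odd = split_even_odd(np.arange(10))   # stand-in for a return series
```

Thinning the series this way halves the sample but weakens day-to-day dependence, at the cost of power.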
From the two panels in Figure 6, we observe that the null hypothesis is not rejected at the 5% level for the dataset containing the daily returns on the even days only. For the daily returns on the odd days, the result is less conclusive: for k ranging from 400 to 600, the null hypothesis is not rejected at the 5% level; however, for a lower choice of k, such as k = 200 or k = 300, the null is rejected at the 5% level. Overall, the additional analysis reveals that the rejection result for the full dataset might be affected by the serial dependence. After accounting for serial dependence, there is no conclusive evidence that the extreme value index varies over this period.
Finally, we plot the estimated Γ(s) function (the solid line) and the corresponding 95% confidence band, uniformly for all s ∈ [0, 1], in Figure 7 for the period from 1963 to 2012. In this analysis, we use k = 400, for which the null hypothesis of a constant extreme value index was rejected. We obtain the confidence band in two ways.
Without having prior information on the shape of Γ(s), we obtain from Theorem 2.2 that as n → ∞,

√k (Γ̂_H(s) − Γ(s)) →_d ∫_0^s γ(u) dW(u),

where W(u) is a Brownian motion. We simulate the quantile of the limit and use that for constructing the uniform confidence band. Since the limit distribution involves the function γ(s), we plug the estimate of the γ(s) function into the stochastic integral and simulate the statistic sup_{s∈[0,1]} |∫_0^s γ̂_H(u) dW(u)| one million times. Then we take the numerical 95% quantile among the one million simulations, denoted as q(0.95), which yields the band Γ̂_H(s) ± q(0.95)/√k for s ∈ [0, 1]. The dashed line always lies within the confidence band, which is seemingly contradictory to our testing result. Notice, however, that the construction of this uniform confidence band is not based on the null hypothesis in the testing analysis. The band is therefore relatively wide due to the stochastic integral.
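The simulation of q(0.95) can be sketched as below, approximating the stochastic integral by Riemann sums on a grid; the absolute supremum, the flat plug-in γ̂, the grid size, and the number of paths (far fewer than the article's one million, for speed) are our own choices:

```python
import numpy as np

def sup_integral_quantile(gamma_grid, n_paths, q, rng):
    """q-quantile of sup_{s in [0,1]} |int_0^s gamma_hat(u) dW(u)|,
    with the stochastic integral approximated by Riemann sums."""
    m = len(gamma_grid)
    # Brownian increments dW on a grid of mesh 1/m, one row per path
    dW = rng.normal(0.0, np.sqrt(1.0 / m), size=(n_paths, m))
    paths = np.cumsum(gamma_grid * dW, axis=1)   # integral up to each grid point
    return float(np.quantile(np.max(np.abs(paths), axis=1), q))

# flat plug-in gamma_hat = 0.4: the limit is 0.4 * sup|W|, whose 95%
# quantile is about 0.4 * 2.24
rng = np.random.default_rng(4)
q95 = sup_integral_quantile(np.full(400, 0.4), n_paths=20000, q=0.95, rng=rng)
```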
To be consistent with the testing procedure, we also construct a uniform confidence band under the null hypothesis that γ(s) = γ for all s ∈ [0, 1], shown in the lower panel of Figure 7. More specifically, we use the quantile of the limit distribution under this null, sup_{s∈[0,1]} |B(s)|.

Appendix A: Proofs
We start by presenting auxiliary results that are necessary for the proofs of our main theorems. Then we establish the asymptotic theory for the "local tail empirical process." Finally, we provide the proofs of the two main theorems.

A.1. Auxiliary Results
The following lemma shows that, under the conditions of Theorem 2.1, the quantile function in the regular variation property does not vary much in an h-neighborhood. Denote q_n = k^{1+ε} for some ε > 0. Firstly, the second order condition (3) ensures (see Theorem 2.3.9 in de Haan and Ferreira (2006)) that, uniformly for all s_1 ∈ [0, 1], x ≥ 1/2, as n → ∞, the uniform expansion holds. Together with condition (5), we get the corresponding bound as n → ∞. Secondly, conditions (5) and (7) ensure the analogous bound as n → ∞. Lastly, from condition (6), we get that as n → ∞, √k sup_{|s_1−s_2|≤h, 1/2≤x≤q_n} |I_3 − 1| = o(1). The lemma is then proved by combining the three components.

A.2. Proof of Theorem 2.1
We prove the theorem by constructing upper and lower bounds for the local estimator γ̂_H(s) at a fixed s.

The local estimator is based on the observations {X_i : i ∈ I_n(s)}. Write X_i = U_{i/n}(Z_i), where Z_1, . . . , Z_n are iid standard Pareto distributed random variables. To construct the local Hill estimator, we rank the observations {X_i : i ∈ I_n(s)} into order statistics X^{(s)}_{j,[2nh]}, j = 1, 2, . . . , [2nh]. Since the U-functions corresponding to these Z_i are all bounded below, we get that there are at least j random variables among {X_i : i ∈ I_n(s)} that are bounded below by U_{s,n}(Z^{(s)}_{j,[2nh]}). This proves the inequality for the upper bound. A similar argument can be made for the lower bound.
Therefore, we get an upper bound for γ̂_H(s) as follows. We now apply Corollary A.1 to bound the two terms in (A.2). For that purpose it is necessary to check that (A.3) holds as n → ∞. Here, we consider the maxima over all Z_i across all observations, and use the fact that max_{1≤i≤n} Z_i/n = O_P(1) as n → ∞. In this way, we have verified the upper bound q_n in (A.3). Now we are ready to apply Corollary A.1 to the two terms in (A.2) and continue the inequality as follows: as n → ∞, within the set where (k/n) Z^{(s)}_{[2nh]−j+1,[2nh]} ∈ [1/2, q_n] for all j = 1, 2, . . . , [2kh], the remainder term is o(1) as n → ∞. Notice that here we use the fact that kh → ∞ as n → ∞ from condition (4).

A.3. Proof of Theorem 2.2
Next, we prove the global asymptotic normality in Theorem 2.2, which is a uniform result over all s ∈ [0, 1].
Recall the definition of Γ̂_H(s) in (2), which is the partial integral of a discretized version of the function γ̂_H(·). We apply the same discretization to the function γ(·): define γ̃(s) := γ((2[s/(2h)] + 1)h). Later on, we handle the uniformly negligible difference between γ̃(·) and γ(·).
for some C > 0. Here the first inequality is an application of a stronger version of Hoeffding's inequality for binomial distributions: Theorem 2(ii) in Okamoto (1959). As n → ∞, we have that (1/(2h)) exp(−Ckh) → 0 since kh/log(1/h) → ∞ and kh → ∞; see condition (4). The lemma is proved.
Again, we consider the increments S̃_n(2hp) − S̃_n(2h(p − 1)), for p = 1, 2, . . . , [1/(2h)]. To handle the S̃_n process, we first apply Theorem 2.2(ii) in Csörgő and Horváth (1993) to obtain an asymptotic expansion of the S̃_n process. Note that E(e^{tY_1}) = e^{−t}/(1 − t) for all t < 1. By verifying the condition in that theorem, we obtain that under a Skorokhod construction, there exists a series of Brownian motions W_n(u) and a constant C_1 such that, as n → ∞,

As n → ∞, from the modulus of continuity of W_n, we have a bound on θ_n(u) for some positive constant C_3. Write, for j = 1, 2, the corresponding decomposition. Firstly, we handle L_1, where γ̄ is the uniform upper bound of the γ(·) function. In the last step, we use the inequality (A.7). Conditions (4) and (6) then give that the remainder is o_P(1).
The first limit relation is ensured by the local asymptotic normality of γ̂_H((2[1/(2h)] − 1)h). The second limit relation follows from the uniform continuity of W_n and the fact that the γ(·) function is uniformly bounded. Therefore, we can extend the region of s to the full interval [0, 1] and obtain (A.5). Finally, we show that γ̃(·) and Γ̃(·) can be replaced by γ(·) and Γ(·) in (A.5). From the definition of γ̃(·), it is straightforward to verify that sup_{0≤s≤1} |γ̃(s) − γ(s)| ≤ sup_{|s_1−s_2|≤h} |γ(s_1) − γ(s_2)|. Notice that for any fixed n, {∫_0^s (γ̃(u) − γ(u)) dW_n(u)}_{0≤s≤1} is a martingale. We apply Doob's inequality to the submartingale {|∫_0^s (γ̃(u) − γ(u)) dW_n(u)|}_{0≤s≤1} and get that for any fixed n and ε > 0,

Pr( sup_{0≤s≤1} |∫_0^s (γ̃(u) − γ(u)) dW_n(u)| > ε ) ≤ ε^{−2} ∫_0^1 (γ̃(u) − γ(u))² du.

By taking n → ∞ and applying condition (6), we obtain (A.8).
Consequently, we have proved the theorem.