Central limit theorem for the capacity of the range of stable random walks

In this article, we establish a central limit theorem for the capacity of the range process for a class of d-dimensional symmetric α-stable random walks with index satisfying d/α > 5/2. Our approach is based on controlling the limiting behaviour of the variance of the capacity of the range process, which then allows us to apply the Lindeberg–Feller theorem.


Introduction
Let (Ω, F, P) be a probability space, and let {X_i}_{i∈N} be a sequence of i.i.d. Z^d-valued random variables defined on (Ω, F, P), where d ≥ 1 and Z^d stands for the d-dimensional integer lattice. For x ∈ Z^d, define S_0 = x and S_n = S_{n−1} + X_n, n ≥ 1. The stochastic process {S_n}_{n≥0} is called a Z^d-valued random walk starting from x.
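For readers who wish to experiment, the walk and its range are straightforward to simulate. The following Python sketch uses a nearest-neighbour step law purely for illustration (the article's walks are more general); it generates a trajectory S_0, …, S_n started at the origin and records the visited sites.

```python
import random

def simulate_walk(steps, d=2, rng=None):
    """Simulate a Z^d random walk started at the origin with i.i.d.
    nearest-neighbour steps (illustrative choice of step law only);
    return the trajectory S_0, ..., S_n as a list of lattice points."""
    rng = rng or random.Random(0)
    pos = (0,) * d
    path = [pos]
    for _ in range(steps):
        axis = rng.randrange(d)           # pick a coordinate direction
        sign = rng.choice((-1, 1))        # and a sign, uniformly
        pos = tuple(p + (sign if i == axis else 0) for i, p in enumerate(pos))
        path.append(pos)
    return path

def range_process(path):
    """Return the range R_n = {S_0, ..., S_n} as a set of lattice points."""
    return set(path)

path = simulate_walk(1000, d=3)
R = range_process(path)
print(len(path), len(R))
```

Note that #R_n is typically much smaller than n + 1, since transient walks still revisit sites with positive probability.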
Throughout the article, we will often rely on the Markovian nature of {S_n}_{n≥0}; we therefore need to allow arbitrary initial conditions for the underlying probability measure. For this purpose, we redefine the probability space in the following way. Put Ω̄ = Z^d × Ω, F̄ = P(Z^d) ⊗ F, and P_x = δ_x × P for x ∈ Z^d. A random variable X on (Ω, F, P) is extended automatically to (Ω̄, F̄, {P_x}_{x∈Z^d}) by the rule X(x, ω) = X(ω) for x ∈ Z^d and ω ∈ Ω. Furthermore, define S_0 : Ω̄ → Z^d by S_0(x, ω) = x for x ∈ Z^d and ω ∈ Ω. Clearly, P_x(S_0 = x) = 1, and for each x ∈ Z^d, the process {S_n}_{n≥0} is a Z^d-valued random walk on (Ω̄, F̄, P_x) starting from x. Also, it is a (strong) Markov process (with respect to the corresponding natural filtration). Observe that the corresponding transition probabilities are given by p(x, y) = P(X_1 = y − x) =: p_1(y − x). To simplify notation, we write (Ω, F, {P_x}_{x∈Z^d}) instead of (Ω̄, F̄, {P_x}_{x∈Z^d}), and when x = 0, we suppress the index 0 and write P instead of P_0. We denote by G(x, y) = Σ_{n≥0} p_n(y − x), x, y ∈ Z^d, the Green function of {S_n}_{n≥0}, where p_n stands for the n-step transition probability. Due to the spatial homogeneity of {S_n}_{n≥0}, we sometimes write G(y − x) instead of G(x, y). Recall that {S_n}_{n≥0} is called transient if G(0) < ∞; otherwise it is called recurrent.
The main aim of this article is to establish a central limit theorem (CLT) for the capacity of the range process of {S_n}_{n≥0}. Recall that the range process {R_n}_{n≥0} is defined as the random set R_n = {S_0, S_1, …, S_n}, and that the capacity of a finite set A ⊆ Z^d is given by Cap(A) = Σ_{x∈A} P_x(T_A^+ = ∞). Here, T_A^+ denotes the first return time of {S_n}_{n≥0} to the set A, that is, T_A^+ = inf{n ≥ 1 : S_n ∈ A}. Also, when A = {x}, x ∈ Z^d, we write T_x^+ instead of T_{{x}}^+. We are interested in the long-time behaviour of the process {C_n}_{n≥0} defined as C_n = Cap(R_n).
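The quantity Cap(A) = Σ_{x∈A} P_x(T_A^+ = ∞) can be approximated by Monte Carlo simulation. The sketch below (simple random walk on Z^3, illustrative only) replaces the event {T_A^+ = ∞} by the proxy {T_A^+ > horizon}, which biases the estimate upward; it is a toy sanity check, not the method of the article.

```python
import random

def escape_prob(start, A, horizon=200, trials=400, d=3, rng=None):
    """Monte Carlo estimate of P_start(T_A^+ > horizon) for the simple
    random walk on Z^d -- an upward-biased proxy for P_start(T_A^+ = inf)."""
    rng = rng or random.Random(1)
    A = set(A)
    hits = 0
    for _ in range(trials):
        pos = start
        returned = False
        for _ in range(horizon):
            axis = rng.randrange(d)
            sign = rng.choice((-1, 1))
            pos = tuple(p + (sign if i == axis else 0) for i, p in enumerate(pos))
            if pos in A:          # first return to A at some time n >= 1
                returned = True
                break
        if not returned:
            hits += 1
    return hits / trials

def capacity_estimate(A, **kw):
    """Cap(A) = sum over x in A of P_x(T_A^+ = infinity), approximated."""
    return sum(escape_prob(x, A, **kw) for x in A)

A = [(0, 0, 0), (1, 0, 0)]
cap = capacity_estimate(A)
print(cap)
```

Since each summand is a probability, the estimate always lies between 0 and #A, and transience of the walk in d = 3 keeps it strictly positive.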
Before stating the main result, we introduce and discuss the assumptions which we impose on the random walk {S n } n≥0 .
(A1) {S_n}_{n≥0} is aperiodic, that is, the smallest additive subgroup generated by the set supp p_1 = {x ∈ Z^d : p_1(x) > 0} is equal to Z^d.
(A2) {S_n}_{n≥0} is symmetric and strongly transient.
(A3) The step X_1 of the random walk {S_n}_{n≥0} belongs to the domain of attraction of a non-degenerate α-stable law with 0 < α ≤ 2, meaning that there exists a regularly varying function b(n) with index 1/α such that S_n/b(n) converges in distribution to U_α, where U_α is an α-stable random variable on R^d.
(A4) {S_n}_{n≥0} admits one-step loops, that is, p = p_1(0) > 0.
Let us remark that assumption (A1) is not restrictive in any sense. Namely, if {S n } n≥0 is not aperiodic, we can then perform our analysis (and obtain the same results) on the smallest additive subgroup of Z d generated by supp p 1 (see Ref. [25, p. 20]).
To discuss (A2) and (A3), we recall that a transient random walk {S_n}_{n≥0} is called strongly transient if Σ_{n≥1} n p_n(0) < ∞; otherwise it is called weakly transient. It is known that every transient random walk is either strongly or weakly transient (see Refs. [22] and [26, Theorem 7]). The notion of strong transience was first introduced in Ref. [19] for Markov chains and was later used in Ref. [13] in the context of the limit behaviour of the range of random walks. Actually, a slightly different definition of strong (weak) transience is used in Ref. [13]: a transient random walk {S_n}_{n≥0} is called strongly transient if Σ_{n≥1} n P(T_0^+ = n) < ∞; otherwise it is called weakly transient. For the reader's convenience, we show that these two definitions are equivalent. Indeed, starting from the classical identity Σ_{n≥0} p_n(0) s^n = (1 − Σ_{n≥1} P(T_0^+ = n) s^n)^{−1}, 0 ≤ s < 1 (see Ref. [25]), and using the well-known fact that the condition G(0) < ∞ forces P(T_0^+ < ∞) = Σ_{n=1}^∞ P(T_0^+ = n) < 1, one sees that both series must converge simultaneously. We remark that the strong transience assumption is very natural in this context. Namely, it ensures that the range process {R_n}_{n≥0} grows fast enough, which allows us to conclude that the limiting distribution in Theorem 1.1 is non-degenerate; in other words, the constant σ_d does not vanish.
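The generating-function argument behind this equivalence can be sketched as follows (a reconstruction of the standard computation, with F and Γ denoting the generating functions of the return-time and return-probability sequences):

```latex
% Let F(s) = \sum_{n\ge 1} P(T_0^+ = n)\, s^n and
% \Gamma(s) = \sum_{n\ge 0} p_n(0)\, s^n.  The classical identity reads
\Gamma(s) = \frac{1}{1 - F(s)}, \qquad 0 \le s < 1 .
% Differentiating,
\Gamma'(s) = \frac{F'(s)}{\bigl(1 - F(s)\bigr)^{2}} ,
% and since transience gives F(1) = P(T_0^+ < \infty) < 1, letting
% s \uparrow 1 (monotone convergence) shows
\sum_{n\ge 1} n\, p_n(0) < \infty
\iff
\sum_{n\ge 1} n\, P(T_0^+ = n) < \infty .
```

Thus the two notions of strong transience coincide, with the denominator 1 − F(1) bounded away from zero precisely because G(0) < ∞.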
Validity of condition (A3) can be checked with the aid of the regular variation of the tails of the step X_1. In the one-dimensional case, we refer to Ref. [6, Theorem 8.3.1]; in the multidimensional case, analogous results can be found in Refs. [20, Section 5.4.2] and [21, Theorem 4.2] and the references therein. Conditions for stability of a random vector in terms of its one-dimensional projections can be found in Ref. [11].
Finally, assumption (A4) is of a technical nature only. By using a random time-change argument and a loop decomposition technique, it allows us to conclude that the limit in (1.1) exists and is non-degenerate. One could ask whether our main result holds without (A4), but for the time being we cannot address this demanding question. Let us remark, however, that in the case of a simple random walk (for which (A4) is obviously violated), the authors of Ref. [3] established a CLT with a normal limit law, and in their proof they employ an analogous idea (so-called no double-backtracks at even times) to obtain non-degeneracy of the limit law. Unfortunately, this idea is designed for simple random walks, and it is not clear whether it can be modified for more general walks (for instance, for stable random walks).
A natural way to construct a random walk that satisfies assumptions (A1)–(A4) is to employ the recently introduced method of discrete subordination (see Ref. [5]). To be more precise, consider the simple symmetric random walk on Z^d, which we denote by {Z_n}_{n≥0}. Furthermore, let {η_n}_{n≥0} be an increasing random walk on Z starting from 0, independent of {Z_n}_{n≥0}, which is uniquely determined by the relation E[λ^{η_1}] = 1 − ψ(1 − λ), λ ∈ [0, 1]. Here, ψ(λ) is a Bernstein function (see Ref. [24]) such that ψ(0) = 0 and ψ(1) = 1. We then define the subordinate random walk as S_n = Z_{η_n}, n ≥ 0. Such a random walk is aperiodic and symmetric. Moreover, it satisfies (A3) with index 0 < α ≤ 2 if and only if the function ψ(λ) is regularly varying at zero with index α/2 (see Refs. [4,18]). For instance, one can take ψ(λ) = λ^{α/2}. More general examples of random walks satisfying assumption (A3) may be found in Ref. [27].
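The composition S_n = Z_{η_n} is easy to prototype. In the sketch below, the increment law of η is an illustrative heavy-tailed stand-in, P(η_1 = k) ∝ k^{−1−α/2} truncated at a cutoff; it is not the exact distribution induced by a Bernstein function ψ, only a toy with the same polynomial tail, so the code demonstrates the time-change mechanism rather than the precise construction of Ref. [5].

```python
import random

def subordinate_walk(n, d=3, alpha=1.0, rng=None):
    """Sketch of discrete subordination S_n = Z_{eta_n}: run a simple
    walk Z on Z^d and evaluate it along an increasing walk eta.  The
    increment law P(eta_1 = k) ~ k^{-1-alpha/2} used here is only an
    illustrative heavy-tailed stand-in for the law induced by psi."""
    rng = rng or random.Random(2)
    kmax = 10_000                     # truncation of the increment law
    weights = [k ** (-1 - alpha / 2) for k in range(1, kmax + 1)]
    eta, times = 0, [0]
    for _ in range(n):                # sample n increasing time points
        eta += rng.choices(range(1, kmax + 1), weights=weights)[0]
        times.append(eta)
    pos = (0,) * d                    # run the underlying simple walk Z
    traj = [pos]
    for _ in range(times[-1]):
        axis = rng.randrange(d)
        sign = rng.choice((-1, 1))
        pos = tuple(p + (sign if i == axis else 0) for i, p in enumerate(pos))
        traj.append(pos)
    return [traj[t] for t in times]   # S_k = Z_{eta_k}

S = subordinate_walk(50)
print(len(S))
```

Because η makes long jumps, consecutive points of S can be far apart, which is exactly how heavy-tailed (stable-like) steps arise from a light-tailed base walk.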
We now state the main result of the article.
Theorem 1.1: Assume (A1)–(A4) and suppose that d/α > 5/2. Then, as n → ∞, (C_n − E[C_n])/√n converges in distribution to σ_d N(0, 1), where σ_d ∈ (0, ∞) and N(0, 1) stands for the standard normal distribution.

Outline of the proof
Let us briefly explain the main steps of the proof. We follow the path of Ref. [3]. The argument consists of two main steps: (i) we prove that the limit σ_d^2 = lim_{n→∞} Var(C_n)/n exists and is finite, and (ii) we show that σ_d^2 > 0. With this at hand, and with a more general form of the capacity decomposition Cap(A ∪ B) = Cap(A) + Cap(B) − χ(A, B), which was obtained in Ref. [3, Corollary 2.1], we conclude that the left-hand side in (1) converges in distribution to a zero-mean normal law with variance σ_d^2, which is exactly the limit of {Var(C_n)/n}_{n≥1}. Here, χ(A, B) is the error term, which is the main object to be studied in order to obtain estimates of the sequence {Var(C_n)}_{n≥0}. The proof of step (i) follows the approach of Ref. [3], which is based on the estimates of the moments of {C_n}_{n≥0} extracted from Ref. [16], combined with an application of Hammersley's lemma (see Ref. [12]). Also, this is the place in the article where the restriction d/α > 5/2 plays a key role.
To conclude step (ii), we require (A4). The proof is based on a random time-change argument and loop decomposition technique (see Subsection 5.1).

Literature overview and related results
The study of the range process {R_n}_{n≥0} of a Z^d-valued random walk {S_n}_{n≥0} has a long history. A pioneering work is due to Dvoretzky and Erdös [10], who obtained a law of large numbers for {#R_n}_{n≥0} when {S_n}_{n≥0} is the simple random walk and d ≥ 2. Here, #R_n denotes the cardinality of R_n. The result was later extended by Spitzer [25] to an arbitrary random walk in d ≥ 1. A CLT for {#R_n}_{n≥0} was obtained by Jain and Orey [13] when {S_n}_{n≥0} is strongly transient. Le Gall and Rosen [17] were the first to consider the strong law of large numbers (SLLN) and the CLT for {#R_n}_{n≥0} in the case when {S_n}_{n≥0} is a stable aperiodic random walk, that is, when it satisfies (A1) and (A3). On the other hand, the first results on the long-time behaviour of the capacity process {C_n}_{n≥0} are due to Jain and Orey [13], who obtained a version of the SLLN for any transient random walk. Very recently, Asselah et al. [3] proved a CLT for {C_n}_{n≥0} for the simple random walk in d ≥ 6. In the case d = 5, Schapira [23] obtained an analogous result for a class of symmetric random walks which fulfil a certain moment condition. Versions of a law of large numbers and a CLT in the case d = 4 were proved by Asselah et al. [1]; see also Ref. [7] for d = 3. Asselah and Schapira [2] also established a large deviation principle for d ≥ 5.
The aim of this article is to obtain a CLT for the capacity of the range process for a class of α-stable strongly transient random walks with index satisfying d/α > 5/2. To the best of our knowledge, this is the first result in this direction dealing with random walks that do not have a finite second moment. Our motivation comes from the article by Le Gall and Rosen [17] and the approach developed by Asselah et al. [3]. The type of limit behaviour of the sequence {C_n}_{n≥0} depends on the value of the ratio d/α. Our CLT reveals that the capacity of the range of stable random walks with d/α > 5/2 behaves as the cardinality of the range for d/α > 3/2, cf. Ref. [17, Result 1]. If d/α = 5/2, we conjecture that the limit law is again normal, but the scaling sequence should be of the form n g(n), where g(n) = Σ_{k=1}^n k^2 b(k)^{−2d} is a slowly varying function. This corresponds to the scaling sequence for the range process in the case d/α = 3/2, as established in Ref. [17, Section 4.5]. We remark that the case α = 2 and d = 5 has recently been partially solved by Schapira [23], who considers a class of symmetric random walks satisfying a certain moment condition and obtains a normal limit law under the scaling n log n. For 2 ≤ d/α < 5/2, the limit law should be non-normal, and we expect yet another scaling sequence, which would involve the truncated Green function, cf. Ref. [17, Result 2].
The study of the case 2 ≤ d/α ≤ 5/2 is an ongoing project and is postponed to follow-up articles.

On the SLLN for {C n } n≥0
In this section, we prove that under (A2) the sequence {C_n}_{n≥0} satisfies a (version of the) SLLN with a strictly positive limit. This result will be crucial in showing that the limit in (1.1) is non-degenerate (see Section 5). Recall first that for any transient random walk on Z^d, the corresponding capacity process {C_n}_{n≥0} satisfies lim_{n→∞} C_n/n = μ_d, P-a.s. and in L^1, for some constant μ_d ≥ 0 (see Ref. [13, Theorem 2]). In the rest of this section, we show that under (A2) the constant μ_d is necessarily strictly positive. We start with the following auxiliary lemma, whose final inequality follows from (A2). Recall that for A, B ⊆ Z^d, the quantity G(A, B) is defined as G(A, B) = Σ_{x∈A} Σ_{y∈B} G(x, y).
We now show that 0 cannot be an accumulation point of {E[C n ]/n} n≥1 .

Proposition 2.2: Assume (A2).
Then there is a constant c > 0 such that E[C_n] ≥ c n for all n ≥ 1. Proof: For fixed n ≥ 1, we consider the following (random) probability measure on Z^d: ν_n(x) = #{1 ≤ k ≤ n : S_k = x}/n. Clearly, supp ν_n = R_{[1,n]}. According to Ref. [14, Lemma 2.3], for symmetric random walks the capacity of a set A ⊆ Z^d has the following representation: Cap(A) = (inf_ν Σ_{x,y} G(x, y) ν(x) ν(y))^{−1}, where the infimum is taken over all probability measures ν on Z^d with supp ν ⊆ A. By setting ν = ν_n, we obtain C_n ≥ (Σ_{x,y} G(x, y) ν_n(x) ν_n(y))^{−1} = n^2 (Σ_{k,l=1}^n G(S_k, S_l))^{−1}. Finally, by Jensen's inequality we have that E[C_n] ≥ n^2 (E[Σ_{k,l=1}^n G(S_k, S_l)])^{−1}, which together with Lemma 2.1 proves the assertion.
As a direct consequence of (3) and Proposition 2.2, we conclude strict positivity of the constant μ d .

Error term estimates
The goal of this section is to obtain estimates of the error term, which is of the form (2). This will be crucial in the analysis of the sequence {Var(C_n)}_{n≥0}. In the sequel, we assume (A3). Recall that the function b(x) is necessarily of the form b(x) = x^{1/α} ℓ(x), where α ∈ (0, 2] and ℓ(x) is a slowly varying function. Without loss of generality, we may assume that b(x) is continuous and increasing with b(0) = 0 (see Ref. [6]). If, in addition, (A1) holds true, then by Ref. [17, Proposition 2.4] there exists a constant C > 0 such that for any n ≥ 0 and x ∈ Z^d, p_n(x) ≤ C b(n)^{−d}. Furthermore, for n ≥ 0, we write G_n(x, y) for the Green function up to time n, that is, G_n(x, y) = Σ_{k=0}^n p_k(y − x). Also, similarly as before, we use the notation G_n(x) = G_n(0, x), x ∈ Z^d. We start with the following auxiliary lemma. Lemma 3.1: Assume (A1)–(A3). Then there exists a constant C > 0 such that for all n ≥ 1 and all a ∈ Z^d, Σ_{x,y∈Z^d} G_n(x) G(x, y) G_n(y − a) ≤ C h_d(n), where h_d(n) is given by h_d(n) = Σ_{k=1}^n k^2 b(k)^{−d}. Observe that the function n ↦ Σ_{k=1}^n k^{−1} ℓ(k)^{−d} is non-decreasing and slowly varying.
Proof: By (5), we have that for all k, j ≥ 0, where the last inequality follows from Ref. [6, Theorem 1.5.6]; summing over k and j and using the same reference again finishes the proof.
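The closing observation of Lemma 3.1 can be checked numerically in a toy case: taking ℓ ≡ 1, the sum reduces to the harmonic numbers H_n ~ log n, and slow variation means that the doubling ratios H_{2n}/H_n decrease to 1. A minimal Python check:

```python
# With l(k) = 1 the sum n -> sum_{k<=n} k^{-1} l(k)^{-d} is the harmonic
# number H_n ~ log n.  Slow variation of H means H_{2n}/H_n -> 1, which
# the doubling ratios below approach as n grows.
def H(n):
    return sum(1.0 / k for k in range(1, n + 1))

ratios = [H(2 * n) / H(n) for n in (10**2, 10**4, 10**6)]
print([round(r, 4) for r in ratios])
```

The ratios are strictly decreasing towards 1, in line with H_{2n}/H_n ≈ 1 + log 2 / log n.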
We next obtain estimates of the error term. Let us remark here that a similar result has been obtained in Ref. [3, Lemma 3.2], but for the simple random walk only. We give an alternative proof of this result, which relies on the Markovian structure of random walks and was motivated by techniques applied to estimate moments of intersection times of random walks, cf. Refs. [8,15]. Our approach is valid for all random walks satisfying (A1)–(A3).

Lemma 3.2:
Assume (A1)–(A3). Let {S̃_n}_{n≥0} be an independent copy of {S_n}_{n≥0} and denote the corresponding range process by {R̃_n}_{n≥0}. Then, for all k, n ≥ 1, we have E[G(R_n, R̃_n)^k] ≤ C h_d(n)^k, where C > 0 is a constant that depends on k, and h_d(n) is defined in Lemma 3.1.
Proof: Let us consider the hitting times T_x = inf{n ≥ 0 : S_n = x}, x ∈ Z^d, and let T̃_x be defined analogously for {S̃_n}_{n≥0}. It then holds that G(R_n, R̃_n) = Σ_{x,y∈Z^d} 1_{{T_x ≤ n}} 1_{{T̃_y ≤ n}} G(x, y). Since P(T_x ≤ n) ≤ G_n(x), for k = 1 we conclude the result in view of Lemma 3.1. For k > 1, we proceed as follows. We first expand the k-th power of the above double sum. For simplicity, we introduce shorthand notation for the resulting summands. We clearly have the corresponding decomposition over Π(k), the set of all permutations of the set {1, …, k}. Hence, ordering the hitting times accordingly, the strong Markov property employed at time T_{x_{k−1}} yields the required factorization. We thus obtain the desired bound for the outer terms. For the last term, by Lemma 3.1, the remaining sum is bounded by a constant times h_d(n). Repeating the same argument k times, we get the result.

Variance estimates
In this section, we show that the limit of the sequence {Var(C_n)/n}_{n≥1} exists when d/α > 5/2. We follow the approach of Ref. [3, Lemma 3.5]. The proof is based on the following two results, which we state for the reader's convenience. The first one is Hammersley's lemma.

Lemma 4.1 ([12, Theorem 2]):
Let {a_n}_{n≥1} and {b_n}_{n≥1} be sequences of real numbers satisfying a_{n+m} ≤ a_n + a_m + b_{n+m} for all n, m ≥ 1. If {b_n}_{n≥1} is non-decreasing and Σ_{n=1}^∞ b_n/n^2 < ∞, then {a_n/n}_{n≥1} converges to a finite limit.
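Hammersley's hypothesis is easy to probe on a toy sequence. The sketch below takes a_n = 2n + √n with perturbation bound b_n = √n (so that Σ b_n/n² = Σ n^{−3/2} < ∞), verifies the almost-subadditivity inequality on a grid, and watches a_n/n converge; all names are illustrative.

```python
import math

# Toy sequence: a_n = 2n + sqrt(n).  We check Hammersley's hypothesis
# a_{n+m} <= a_n + a_m + b_{n+m} with b_n = sqrt(n), for which
# sum_n b_n / n^2 = sum_n n^{-3/2} < infinity, and watch a_n/n -> 2.
a = lambda n: 2 * n + math.sqrt(n)
b = lambda n: math.sqrt(n)

ok = all(
    a(n + m) <= a(n) + a(m) + b(n + m)
    for n in range(1, 60)
    for m in range(1, 60)
)
print(ok)                                  # hypothesis holds on the grid
print(round(a(10) / 10, 3), round(a(10_000) / 10_000, 3))
```

Here a_n/n = 2 + n^{−1/2} converges to 2, matching the lemma's conclusion; in the article the role of a_n is played by ‖C̄_n‖₂² and that of b_n by the error-term bounds of Lemma 3.2.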
The second is the capacity decomposition formula discussed in the introduction. We remark that Proposition 1.2 in Ref. [3] is stated for the simple random walk only, but the proof is valid for an arbitrary transient random walk. Proof: Let n, m ≥ 1 be arbitrary. Due to the spatial homogeneity of the capacity, that is, Cap(x + A) = Cap(A) for all x ∈ Z^d and all finite A ⊆ Z^d, and according to Lemma 4.2, we may write C_{n+m} in terms of C^{(1)}_n and C^{(2)}_m (with ranges R^{(1)}_n and R^{(2)}_m), which are independent and have the same laws as C_n and C_m (R_n and R_m), respectively. Furthermore, for k ≥ 1 define C̄_k = C_k − E[C_k], and similarly C̄^{(1)}_k and C̄^{(2)}_k. Taking expectations in (7) and then subtracting the two relations yields the centred version of the decomposition. Denote ‖·‖_2 = E[(·)^2]^{1/2}; clearly, Var(C_k) = ‖C̄_k‖_2^2, k ≥ 1. The triangle inequality and the independence of C̄^{(1)}_n and C̄^{(2)}_m, together with the estimate E[G(R^{(1)}_n, R^{(2)}_m)] ≤ ‖G(R^{(1)}_n, R^{(2)}_m)‖_2 and Lemma 3.2, imply the corresponding bound. Consequently, by setting a_k = ‖C̄_k‖_2^2, k ≥ 1, the above relation reads a_{n+m} ≤ a_n + a_m + c_2 ‖C̄_n‖_2 h_d(n + m). In the sequel, we find an upper bound for the third term on the right-hand side of inequality (8).
We first consider the case d/α ≥ 3, for which h_d(n) is slowly varying. If we prove that ‖C̄_k‖_2 ≤ c k^{1/2} h_d(k), k ≥ 1, the assertion of the lemma will follow directly from (8) and Lemma 4.1. For k ≥ 1, we set α_k = max{‖C̄_n‖_2 : 2^k ≤ n < 2^{k+1}}. Furthermore, for k ≥ 2, we take n ≥ 1 such that 2^k ≤ n < 2^{k+1}, and we set l = ⌊n/2⌋ and m = n − l. Here, ⌊a⌋ stands for the largest integer smaller than or equal to a ∈ R.
Analogously as above, we arrive at (10). Recall that C̄^{(1)}_l and C̄^{(2)}_m (R^{(1)}_l and R^{(2)}_m) are independent and have the same laws as C̄_l and C̄_m (R_l and R_m), respectively. Hence, Equation (10) implies the corresponding bound, and taking the supremum over 2^k ≤ n < 2^{k+1} yields an inequality for α_k. We next set β_k = α_k/h_d(2^k). This, together with the monotonicity of h_d(n), gives a recursive inequality; by iterating it, we obtain β_k ≤ c_6 2^{k/2}, which implies α_k ≤ c_6 2^{k/2} h_d(2^k). Finally, using the definition of α_k and Ref. [6, Theorem 1.5.6], we obtain the claimed bound, which finishes the proof for d/α ≥ 3.
Next, we consider the case 5/2 < d/α < 3. We set ε = d/α − 5/2 and observe that h_d(n) ≤ C n^{1/2−ε} s(n), where s(n) = (ℓ(n))^{−d} is a slowly varying function. If we prove (11), then by defining b_k = c k^{3ε/2} h_d^2(k), k ≥ 1, the assertion of the lemma will again follow directly from (8), for k ≥ k_0. We again set β_k = α_k/h_d(2^k). Dividing (10) by h_d(2^k) and using the monotonicity of h_d(n), we find M ≥ 1 such that h_d(2^{k−1})/h_d(2^k) ≤ M 2^{(3ε−1)/2}, k ≥ 1, and thus we may easily extend (12) to all k ≥ 1. Hence, by iterating (12), we obtain the desired bound with α_k defined in (9). We thus conclude (11) and the result follows.

CLT for {C n } n≥0
In this section, we first show strict positivity of the limit σ_d^2 from Lemma 4.3, and then we finally prove Theorem 1.1. Namely, σ_d^2 will be exactly the variance parameter of the limiting normal law in (1). To show that σ_d^2 is strictly positive, we adapt an idea from Ref. [3], where the simple random walk is decomposed into two independent processes: the first process counts the number of double-backtracks, and the second one has no double-backtracks. For our class of random walks, we use one-step loops instead of double-backtracks. To be more precise, we say that {S_n}_{n≥0} makes a one-step loop at time n if S_n = S_{n−1}. Clearly, {S_n}_{n≥0} admits one-step loops if and only if p_1(0) > 0. Also, when the walk makes a one-step loop, the range evidently remains unchanged. We will first build a random walk {S̃_n}_{n≥0} with no one-step loops, and then we will show how to construct, starting from {S̃_n}_{n≥0}, a random walk {Ŝ_n}_{n≥0} with (i) the same law as {S_n}_{n≥0} and (ii) a range process that is a certain random time-change of the range process of {S̃_n}_{n≥0}.

Strict positivity of σ 2 d
Assume (A1)–(A4), and let {X̃_i}_{i∈N} be a sequence of i.i.d. random variables with distribution P(X̃_1 = x) = p_1(x)/(1 − p), x ∈ Z^d \ {0}. Recall that p = p_1(0) > 0 (assumption (A4)). Furthermore, let {S̃_n}_{n≥0} be the corresponding random walk. Clearly, {S̃_n}_{n≥0} has no one-step loops. We now construct a random walk {Ŝ_n}_{n≥0}, with the same law as {S_n}_{n≥0}, by adding an independent geometric number of one-step loops to {S̃_n}_{n≥0} at each step. Let {ξ_i}_{i≥0} be a sequence of i.i.d. geometric random variables with parameter p, independent of {S̃_n}_{n≥0}; recall that P(ξ_1 = k) = (1 − p) p^k, k ≥ 0. We set N_{−1} = 0 and N_k = ξ_0 + ⋯ + ξ_k for k ≥ 0, and define {Ŝ_n}_{n≥0} according to the following procedure. We start by setting Ŝ_0 = 0. For k ≥ 0, we define I_k = {k + N_{k−1} + 1, …, k + N_k}. If I_k ≠ ∅, then for each i ∈ I_k we set Ŝ_i = S̃_k. We next follow the path of {S̃_n}_{n≥0}, which means that we set Ŝ_{k+N_k+1} = S̃_{k+1}. This construction provides a random walk {Ŝ_n}_{n≥0} with the same law as {S_n}_{n≥0}. We also have Ŝ_{n+N_{n−1}} = S̃_n, where the second equality holds since n + N_{n−1} = (n − 1) + N_{n−1} + 1. Consequently, R̂_{n+N_{n−1}} = R̃_n, where {R̃_n}_{n≥0} and {R̂_n}_{n≥0} are the range processes of {S̃_n}_{n≥0} and {Ŝ_n}_{n≥0}, respectively. We now show that σ_d must be strictly positive. We first establish two technical lemmas. Therefore, the claimed estimate holds with a constant c_1 > 0 that we now specify. Notice that there exists a constant c_2 > 0 such that E[M_n] ≤ c_2 n, n ≥ 0. Set c_1 = c_2 + ε for some ε > 0. Then, by Chebyshev's inequality, we obtain the required bound. To bound the first term of the penultimate estimate, we observe that G(x − a, y − a) = G(x, y) for x, y, a ∈ Z^d, and that the two random variables in question are independent. Thus, instead of the second random set, we may write R_{(c+c_1)n}, where {R_n}_{n≥0} is the range process of a random walk that is an independent copy of {S̃_n}_{n≥0}. We obtain the corresponding estimate, where the constant c_3 is chosen, as above, so that P(N_{n−1} ≥ c_3 n) tends to zero as n goes to infinity. We finally set c_4 = max{1 + c_3, c + c_1} and apply the Markov inequality, Lemma 3.2 and Ref. [6, Theorem 1.5.6] to conclude. Since for d/α > 5/2 the index of h_d(n) is less than 1/2, the last term tends to zero and the result follows.
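The loop-insertion mechanism described above is easy to prototype. The sketch below works on Z with an illustrative step sequence (higher dimensions are identical coordinate-wise) and checks the key property used in the proof: inserting Geometric(p) many one-step loops changes the time parametrization but never the set of visited sites.

```python
import random

def insert_loops(tilde_steps, p, rng=None):
    """Insert an independent Geometric(p) number of one-step loops
    (zero increments) before each step of a loop-free walk, mimicking
    the construction of the hat-walk from the tilde-walk.  Coin
    flipping realises P(xi = k) = (1 - p) p^k, k >= 0."""
    rng = rng or random.Random(3)
    steps = []
    for s in tilde_steps:
        while rng.random() < p:   # each additional loop occurs w.p. p
            steps.append(0)       # a one-step loop: the walk stays put
        steps.append(s)
    return steps

def trajectory(steps, start=0):
    pos, out = start, [start]
    for s in steps:
        pos += s
        out.append(pos)
    return out

tilde = [1, -1, 1, 1, -1]         # illustrative loop-free increments on Z
hat = trajectory(insert_loops(tilde, p=0.3))
same_range = set(trajectory(tilde)) == set(hat)
print(same_range)                 # ranges coincide: loops add no new sites
```

This is exactly the identity R̂ at the time-changed index equals R̃: the hat-walk's range at time n + N_{n−1} is the tilde-walk's range at time n.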
Proof: Since S̃_k = Ŝ_{k+N_{k−1}} and S̃_{k+n} = Ŝ_{(k+n)+N_{k+n−1}}, we obtain the claimed identity, as desired.
We finally prove that σ_d is strictly positive. Proof: We define three sequences depending on a constant A > 0, which will be specified later. Lemma 5.2 implies the first estimate; thus, for n large enough the corresponding bound holds. Similarly, we show the analogous bound for n large enough. By Lemma 5.1, we get, for n large enough, the two remaining estimates. We introduce the following events: by the CLT, there exists a constant c_A > 0 such that P(B_n) ≥ c_A and P(D_n) ≥ c_A for n large enough. We distinguish between two cases. We first study case (i). By Lemma 4.2, we have the decomposition, and we thus obtain the first inequality. In view of the assumption, the spatial homogeneity of the capacity and (13), we have the second inequality, which together with (15) implies the third. By the independence of {N_n}_{n≥−1} and {S̃_n}_{n≥0}, we get the fourth. We next observe that on D_n we have k_n + N_{k_n−1} ∈ [n, n + 2√n]. We also recall the identity from the construction, whence the corresponding estimate follows. Since {Ĉ_n}_{n≥0} is clearly increasing in n, we deduce the penultimate bound and, finally, the deterministic bound Ĉ_{n+2√n} ≤ Ĉ_n + 2√n. Choosing A large enough so that Aμ_d/2 − 2 > 0 and applying Chebyshev's inequality shows that in case (i) we have Var(C_n) = Var(Ĉ_n) ≥ c n, as desired.
In case (ii), we proceed similarly. By Lemma 4.2, we have the analogous decomposition. Next, Equations (14) and (16) and the fact that P(B_n) ≥ c_A imply the corresponding bound. On B_n, we have i_n + N_{i_n−1} ∈ [n, n + 2√n], and it follows that the required estimate holds. We thus finally conclude the claimed lower bound, and an application of Chebyshev's inequality finishes the proof.

Proof of Theorem 1.1
We start with the following technical lemma.

Proof:
The proof is similar to that of Lemma 4.3. For k ≥ 1, we set the corresponding quantity, where ‖·‖_4 = E[(·)^4]^{1/4}. For k ≥ 2, we take 2^k ≤ n < 2^{k+1} and set l = ⌊n/2⌋ and m = n − l. Using Lemma 4.2, as in the proof of Lemma 4.3, we obtain the analogous inequality, where again C̄^{(1)}_l and C̄^{(2)}_m (R^{(1)}_l and R^{(2)}_m) are independent and have the same laws as C̄_l and C̄_m (R_l and R_m), respectively. We observe that the cross terms vanish, where we used the fact that C̄^{(1)}_l and C̄^{(2)}_m are independent and centred random variables. From Lemma 4.3, we have the variance bound, whereas by Lemma 3.2 and the fact that h_d(n) is regularly varying with index less than 1/2, the error moments are controlled. Combining this with the elementary inequality (a + b)^{1/4} ≤ a^{1/4} + b^{1/4}, a, b ≥ 0, we get the required estimate. Similarly as in Lemma 4.3, we thus obtain a recursive inequality. Setting β_k = 2^{−k/2} α_k, k ≥ 1, we deduce that {β_k}_{k≥1} is a bounded sequence. Therefore, α_k ≤ c_5 2^{k/2}, k ≥ 1, which immediately yields the result.
The proof of Theorem 1.1 is based on the dyadic capacity decomposition formula derived in Ref. [3, Corollary 2.1] and the Lindeberg–Feller CLT, which we state for the reader's convenience.
Using (19) and Lemma 3.2, we get the corresponding estimate. Next, we distinguish between two cases. If 5/2 < d/α < 3, then we set L = ⌊log_2(n^{ε/2})⌋, where ε = d/α − 5/2. This implies (20). If d/α ≥ 3, then h_d(n) is slowly varying, and in this case it is enough to choose L = ⌊log_2(n^{1/4})⌋ to obtain (20). We are thus left to prove that