Semiparametric Tests for the Order of Integration in the Possible Presence of Level Breaks

Abstract
Lobato and Robinson developed semiparametric tests for the null hypothesis that a series is weakly autocorrelated, or I(0), about a constant level, against fractionally integrated alternatives. These tests have the advantage that the user is not required to specify a parametric model for any weak autocorrelation present in the series. We extend this approach in two distinct ways. First, we show that it can be generalized to allow for testing of the null hypothesis that a series is I(δ) for any δ lying in the usual stationary and invertible region of the parameter space. The second extension is the more substantive and addresses the well-known issue in the literature that long memory and level breaks can be mistaken for one another, with unmodeled level breaks rendering fractional integration tests highly unreliable. To deal with this inference problem, we extend the Lobato and Robinson approach to allow for the possibility of changes in level at unknown points in the series. We show that the resulting statistics have standard limiting null distributions, and that the tests based on these statistics attain the same asymptotic local power functions as infeasible tests based on the unobserved errors. Hence there is no loss in asymptotic local power from allowing for level breaks, even where none is present. We report results from a Monte Carlo study into the finite-sample behavior of our proposed tests, as well as several empirical examples.


Introduction
It is well known that, if not accounted for, level shifts in a weakly autocorrelated (or short memory) process, denoted I(0), can induce features in the autocorrelation function and the periodogram of a time series that can be mistaken for evidence of long memory (see, e.g., Diebold and Inoue 2001; Gourieroux and Jasiak 2001; Granger and Hyung 2004; Mikosch and Stărică 2004; Qu 2011; Iacone, Leybourne, and Taylor 2019). To avoid the possibility of spurious inference being made about the memory properties of a time series, it is therefore important to develop tests on the fractional integration (memory) parameter of a time series which are robust to level shifts. As a consequence, Iacone, Leybourne, and Taylor (2019) generalized the parametric Lagrange multiplier (LM) time domain based fractional integration tests of Tanaka (1999) and Nielsen (2004) to allow for the possibility of a single break in the deterministic trend function at an unknown point in the sample. These tests are equivalent to analogous extensions of the frequency domain tests of Robinson (1994) to allow for breaks in the deterministic trend function. Iacone, Leybourne, and Taylor (2019) showed that this approach delivers an LM test which, regardless of whether a break occurs or not, is a locally most powerful test and has a $\chi^2_1$ limiting null distribution. However, a significant practical disadvantage of the tests of Iacone, Leybourne, and Taylor (2019) is that, like the tests of Robinson (1994), Tanaka (1999), and Nielsen (2004) from which they are derived, they are based on fitting a full parametric model to the data. Crucially, the short run component of this model must be correctly specified under the null hypothesis for the resulting test to be correctly asymptotically sized.
This requirement is clearly problematic in practice, and is likely to be further complicated in the case where level breaks are present as this would likely interfere with any preliminary model selection stage used to specify the form used for the short memory component. It therefore seems worthwhile developing long memory tests analogous to those of Iacone, Leybourne, and Taylor (2019) but which do not require the user to specify a parametric model for the short memory component of the series.
Our contribution in this article is therefore to develop semiparametric analogues of the parametric tests of Iacone, Leybourne, and Taylor (2019). We will base our approach on an extension of the semiparametric frequency domain based fractional integration tests of Lobato and Robinson (1998). This approach is based on the use of a low frequency approximation provided by the local Whittle (LW) likelihood, which obviates the need to explicitly model any short range dependence present in the data. To account for the possibility of level breaks, the Lobato and Robinson (1998)-type statistics we propose are constructed from data which have been de-trended allowing for the possibility of level breaks, the locations of which are estimated by a standard residual sum of squares estimator applied to the levels data. The tests proposed in Lobato and Robinson (1998), again based on the LM testing principle, are specifically designed for testing the null hypothesis that a time series is I(0). We show that, as conjectured in Lobato and Robinson (1998, p. 478), their approach can be generalized to provide a valid test for the null hypothesis that the series is integrated of order δ, for any δ lying in the stationary and invertible region of the parameter space (−0.5 < δ < 0.5). It is also possible to test orders of integration outside the stationary and invertible region using data transformations. For example, the null hypothesis of an autoregressive unit root can be obtained by testing for the null hypothesis of short memory in the first differences of the series; as such this is then a test in the levels data for a unit root allowing for the possibility of trend breaks. Because the tests are based on the LM testing principle, no preliminary estimation of the memory parameter is required.
Our focus on the Lobato and Robinson (1998) testing approach is due, at least in part, to results in Shao and Wu (2007a), who showed that the standard Lobato and Robinson (1998) tests are, for a suitable choice of the bandwidth parameter m used in the local Whittle loss function, considerably more powerful than other semiparametric tests for testing the null of I(0) against the alternative of fractional integration that are available in the literature. In particular, they showed that tests based on the rescaled range and rescaled variance statistics and tests based on the well-known KPSS statistic of Kwiatkowski et al. (1992) have power against local alternatives of order $(\ln T)^{-1}$, where T denotes the sample size. On the other hand, the Lobato and Robinson (1998) tests have power against local alternatives of order $m^{-1/2}$, where the bandwidth parameter m is typically of the type $m = T^{\alpha}$ for some $0 < \alpha < 4/5$. Moreover, these other approaches have only been developed to test the null hypothesis of I(0) against the alternative of fractional integration, whereas we wish to maintain the flexibility to test a more general I(δ) null hypothesis. Busetti and Harvey (2001, 2003) developed extensions of the KPSS test that allow for a single level break at an unknown point in the sample, although their approach is based on the assumption that a level break is known to occur.
We establish that, regardless of whether level breaks occur or not, the large sample properties of the tests we propose are identical to those which obtain for the standard Lobato and Robinson (1998) tests for δ = 0 in the case where no level breaks occur. In particular, our proposed LM-type test has a $\chi^2_1$ limiting null distribution and the corresponding t-type test a N(0, 1) limiting null distribution, regardless of the value of δ being tested under the null hypothesis, and each attains the same asymptotic local power function as the corresponding infeasible test based on the unobserved errors. Moreover, these asymptotic local power functions do not alter between the break and no break cases and so there is no loss in asymptotic local power from allowing for level breaks, even where no breaks are present. Although based on different and hence not directly comparable models, these large sample properties contrast with those of most popular unit root tests, such as that of Dickey and Fuller (1979), and stationarity tests, such as that of KPSS. In particular, the limiting null distributions of unit root and stationarity test statistics tend to be nonstandard and depend on the functional form of the fitted deterministic component, differing between the no break and break cases, and dependent on the locations of the breaks. Moreover, where breaks are fitted but not actually present in the data, these tests show a considerable decline in asymptotic local power relative to the case where no break is fitted.
The remainder of the article is organized as follows. Section 2 sets out the fractionally integrated level break model within which we work. Section 3 describes our tests in the infeasible case where the errors are observable. Our proposed semiparametric statistics for the case of unknown level breaks are described in Section 4, where we also establish their large sample properties. Section 5 summarizes the results from a Monte Carlo simulation study into the finite sample size and power properties of our proposed tests and compares them with the nonparametric KPSS-type tests of Busetti and Harvey (2001, 2003). Illustrative empirical applications of the methods developed in this article to bitcoin returns data, VIX market volatility, U.S. CPI inflation, and U.S. real GDP growth are considered in Section 6. Section 7 concludes. Proofs of our main results are provided in a mathematical appendix. A supplementary appendix contains full details of the Monte Carlo design and results.

The Fractionally Integrated Model With Level Breaks
Consider the scalar time series process, $y_t$, satisfying the data generating process (DGP),

$$y_t = \beta_1 + \beta_2' DU_t(\tau^*) + u_t, \qquad t = 1, \ldots, T. \qquad (1)$$

In (1), $\beta := (\beta_1, \beta_2')'$ is a vector of fixed parameters and $DU_t(\tau^*) := (DU_t(\tau^*_1), \ldots, DU_t(\tau^*_k))'$ is a vector of k level break terms, where $DU_t(\tau)$ is defined for a generic argument τ as $DU_t(\tau) := \mathbb{I}(t \geq \lfloor \tau T \rfloor)$, $\mathbb{I}(\cdot)$ denotes the usual indicator function, $\lfloor \cdot \rfloor$ denotes the integer part of its argument, and where A := B and B =: A are used to denote that A is defined by B. The formulation in (1) therefore allows for up to k level breaks, where $\tau^* := (\tau^*_1, \ldots, \tau^*_k)'$ is the vector of (unknown) putative level break fractions and $\beta_2 := (\beta_{2,1}, \ldots, \beta_{2,k})'$ the associated break magnitude parameters, such that a level break occurs at time $\lfloor \tau^*_i T \rfloor$ when $\beta_{2,i} \neq 0$ for i = 1, . . . , k. The true but unknown number of level breaks present, $k^*$ say, is then given by the number of nonzero elements of the vector $\beta_2$. The (putative) level break fractions are assumed to be such that $\tau^*_i \in [\tau_L, \tau_U] =: \Lambda$ for all i = 1, . . . , k, where $\Lambda \subset (0, 1)$ is compact and the quantities $\tau_L$ and $\tau_U$ are trimming parameters below and above which, respectively, a level break is deemed not to occur. We also make the standard assumption that $|\tau^*_i - \tau^*_j| \geq \lambda > 0$ for all $i \neq j$, such that there are at least $\lfloor \lambda T \rfloor$ observations between breaks. Note that these conditions bound the number of breaks that can feasibly be accommodated: because k break fractions in $[\tau_L, \tau_U]$ must lie at least λ apart, $k \leq 1 + (\tau_U - \tau_L)/\lambda$.

In the context of (1) the shocks, $u_t$, are assumed to follow a stationary and invertible process which is fractionally integrated of order δ, denoted $u_t \in I(\delta)$. For our purposes, we define fractional integration for $u_t$ as

$$u_t := \Delta^{-\delta} \eta_t, \qquad (2)$$

where $\Delta := 1 - L$ denotes the difference operator, with L the lag operator, and $\eta_t$ is a zero mean I(0) process. We define I(0) to be such that $\eta_t$ has spectral density f(λ) with f(λ) → G for some G ∈ (0, ∞) as λ → 0; formal assumptions on $\eta_t$ required for our large sample theory results will be delayed until Section 3.
The assumption that u t is stationary and invertible entails that the long memory parameter, δ, is such that δ ∈ (−0.5, 0.5). A process satisfying the conditions just stated for u t is often referred to in the literature as a Type I fractionally integrated process.
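The model in (1)-(2) can be illustrated with a short simulation. The following Python code is a minimal sketch under stated assumptions, not the authors' simulation code: the stationary process is approximated by truncating the infinite moving-average expansion of the fractional filter $(1 - L)^{-\delta}$ at the available burn-in shocks, $\eta_t$ is taken to be Gaussian white noise, and the function names `frac_weights` and `simulate_level_break_series` are hypothetical.

```python
import random


def frac_weights(delta, n):
    """First n coefficients pi_j of (1 - L)^(-delta) = sum_j pi_j L^j."""
    w = [1.0]
    for j in range(1, n):
        # Recursion pi_j = pi_{j-1} * (j - 1 + delta) / j, with pi_0 = 1.
        w.append(w[-1] * (j - 1 + delta) / j)
    return w


def simulate_level_break_series(T, delta, beta1=0.0, beta2=(), tau=(),
                                burn=1000, seed=0):
    """Simulate y_t = beta1 + sum_i beta2[i] * 1(t >= floor(tau[i] * T)) + u_t.

    Assumption: u_t is approximated by a truncated MA expansion that uses
    all shocks generated so far (burn-in plus sample), with N(0, 1) shocks.
    """
    rng = random.Random(seed)
    n = T + burn
    eta = [rng.gauss(0.0, 1.0) for _ in range(n)]
    w = frac_weights(delta, n)
    u = [sum(w[j] * eta[s - j] for j in range(s + 1)) for s in range(burn, n)]
    y = []
    for t in range(1, T + 1):
        # Level shifts that have already occurred by time t.
        shift = sum(b for b, f in zip(beta2, tau) if t >= int(f * T))
        y.append(beta1 + shift + u[t - 1])
    return y, u
```

For instance, `simulate_level_break_series(512, 0.0, beta2=(2.0,), tau=(0.5,))` would mimic a short memory series with one mid-sample level break of two innovation standard deviations, the kind of design used in the Monte Carlo section.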
Our interest focuses on testing the null hypothesis that $u_t$, and hence $y_t$, is $I(\delta_0)$ for some $\delta_0 \in (-0.5, 0.5)$; that is, $H_0: \delta = \delta_0$ in (1). Note that the extension to allow $\delta_0 \neq 0$ is nontrivial, in the sense that testing $\delta = \delta_0$ on $y_t$, as we do in this article, is different from testing δ = 0 on $\Delta^{\delta_0} y_t$, as is done, for example, in Iacone, Leybourne, and Taylor (2019), since the latter process has a different unconditional mean, thereby changing the model and its interpretation. Based on the familiar LM testing principle we will develop tests against two-sided alternatives of the form $H_1: \delta \neq \delta_0$ ($y_t$ is not $I(\delta_0)$) together with corresponding t-type tests against one-sided alternatives of the form $H_1: \delta > \delta_0$ ($y_t$ is more persistent than an $I(\delta_0)$ series) or $H_1: \delta < \delta_0$ ($y_t$ is less persistent than an $I(\delta_0)$ series).
Next, in Section 3, we discuss the tests proposed in Lobato and Robinson (1998) which were developed for testing the specific null hypothesis that $y_t$ is short memory. These tests apply to the case where either $u_t$ in (1) is observable or where it is known that $\beta_2 = 0$ (so that no level breaks are present). We show that this approach can be readily extended to develop tests for the null hypothesis that $y_t$ is $I(\delta_0)$ for some $\delta_0 \in (-0.5, 0.5)$. Then, in Section 4, we show how these tests can be generalized to allow for the possibility that $\beta_2 \neq 0$ in (1), such that level breaks could potentially occur in the data. The testing approach we outline in Section 4 does not assume knowledge of whether level breaks genuinely occur; that is, we do not assume knowledge of whether $\beta_2 = 0$ or $\beta_2 \neq 0$.

Tests of $H_0: \delta = \delta_0$ When It Is Known That $\beta_2 = 0$
Suppose for the purposes of this section that it is known to be the case that $\beta_2 = 0$ in (1). Under this restriction we can also set $\beta_1 = 0$ with no loss of generality because, as discussed in Lobato and Robinson (1998, p. 477), the statistics we will discuss in this article are invariant to $\beta_1$ in the case where $\beta_2 = 0$. The restriction that $\beta_2 = 0$ is therefore equivalent to the case where $\beta_1$, $\beta_2$ and $\tau^*$ are all known, such that $u_t$ in (1) is observable. We may therefore proceed as if $u_t$ were observable. We will discuss the application of the tests to $u_t$, although in the context of this section they could equally be applied to $y_t$ because no mean correction is required (provided the mean is constant) due to invariance to $\beta_1$.
For observable u t , semiparametric inference on δ based on the approximation of the Whittle likelihood at low frequencies was proposed by Künsch (1987) and analyzed further in Robinson (1995b). This approach is semiparametric as it does not require the specification of a parametric model for f (λ) and, within the class of semiparametric methods, it has the advantage of being based on a (local) likelihood, and it is therefore considerably more efficient than other semiparametric estimates such as the log-periodogram regression of Geweke and Porter-Hudak (1983) and Robinson (1995a).
For a generic series $a_t$, let $w_a(\lambda) := (2\pi T)^{-1/2} \sum_{t=1}^{T} a_t e^{i\lambda t}$ denote the Fourier transform of $a_t$, and let $I_a(\lambda) := |w_a(\lambda)|^2$ denote the periodogram. Then, as discussed in Robinson (1995b), for the observable series $u_t$, the local Whittle estimate of δ is obtained by minimizing the loss function R(d) with respect to d, where

$$R(d) := \ln\left(\frac{1}{m}\sum_{j=1}^{m} \lambda_j^{2d} I_u(\lambda_j)\right) - \frac{2d}{m}\sum_{j=1}^{m} \ln \lambda_j \qquad (3)$$

and m denotes the bandwidth, satisfying the rate condition that 1/m + m/T → 0 as T → ∞. Recall that $\lambda_j := 2\pi j/T$ for integer j are the Fourier frequencies. Applying the LM principle to the objective function in (3) yields a t-type statistic and an LM-type statistic to test $H_0: \delta = \delta_0$. In terms of the Fourier frequencies and the periodogram ordinates at those frequencies, these statistics can be written as

$$t^*_m(\delta_0) := -\sqrt{m}\,\frac{\sum_{j=1}^{m} \nu_j \lambda_j^{2\delta_0} I_u(\lambda_j)}{\sum_{j=1}^{m} \lambda_j^{2\delta_0} I_u(\lambda_j)}, \qquad \nu_j := \ln j - \frac{1}{m}\sum_{i=1}^{m} \ln i, \qquad (4)$$

$$LM^*_m(\delta_0) := \left(t^*_m(\delta_0)\right)^2. \qquad (5)$$

The null hypothesis $H_0$ that $u_t$ is $I(\delta_0)$ can then be rejected for large values of $LM^*_m(\delta_0)$, while a large positive (negative) value of $t^*_m(\delta_0)$ would allow rejection against the one-sided alternative $H_1: \delta > \delta_0$ ($H_1: \delta < \delta_0$). It will turn out that standard critical values can be employed in the context of these decision rules. Lobato and Robinson (1998) analyzed the special case of the $t^*_m(0)$ and $LM^*_m(0)$ statistics in (4) and (5), respectively, which obtain on setting $\delta_0 = 0$, such that one is testing the null hypothesis of short memory, $H_0: \delta = 0$.

For the purpose of later sections, we also need to define the Lobato and Robinson (1998) t- and LM-type test statistics for the hypothesis $H_0: \delta = \delta_0$ applied to the observed data, $\{y_t\}$, and which do not account for the possibility of level breaks; we will denote these as $t_m(\delta_0)$ and $LM_m(\delta_0)$, respectively. These differ from the infeasible statistics $t^*_m(\delta_0)$ and $LM^*_m(\delta_0)$ for the hypothesis $H_0: \delta = \delta_0$, which are applied to the unobserved innovations, $\{u_t\}$. In the context of this section, where it is known that $\beta_2 = 0$, $t_m(\delta_0)$ and $t^*_m(\delta_0)$ coincide, as do $LM_m(\delta_0)$ and $LM^*_m(\delta_0)$.
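The computation of the t- and LM-type statistics from the periodogram can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the authors' code: the statistic is taken to be $t = -\sqrt{m}\,\sum_j \nu_j \lambda_j^{2\delta_0} I(\lambda_j) / \sum_j \lambda_j^{2\delta_0} I(\lambda_j)$ with $\nu_j = \ln j - m^{-1}\sum_i \ln i$ and LM equal to its square, a form chosen here so that large positive values of t point toward $\delta > \delta_0$, as the text describes. The function names and the default exponent 0.65 in `bandwidth` are hypothetical choices.

```python
import cmath
import math


def bandwidth(T, alpha=0.65):
    """Bandwidth m = floor(T^alpha); alpha = 0.65 is only an example value."""
    return int(T ** alpha)


def periodogram(x, m):
    """Periodogram ordinates I_x(lambda_j), j = 1..m, lambda_j = 2*pi*j/T."""
    T = len(x)
    out = []
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / T
        w = sum(x[t] * cmath.exp(1j * lam * (t + 1)) for t in range(T))
        out.append(abs(w) ** 2 / (2.0 * math.pi * T))
    return out


def lr_statistics(x, m, delta0=0.0):
    """Return (t, LM) for H0: delta = delta0 using the first m frequencies."""
    T = len(x)
    I = periodogram(x, m)
    mean_log = sum(math.log(j) for j in range(1, m + 1)) / m
    num = den = 0.0
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / T
        weighted = lam ** (2.0 * delta0) * I[j - 1]
        num += (math.log(j) - mean_log) * weighted   # nu_j-weighted sum
        den += weighted
    t_stat = -math.sqrt(m) * num / den
    return t_stat, t_stat ** 2
```

Because the periodogram is evaluated only at nonzero Fourier frequencies, adding a constant to `x` leaves both statistics unchanged, which mirrors the invariance to $\beta_1$ noted in this section; rescaling `x` also has no effect because the statistic is a ratio.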
Lobato and Robinson (1998) established that, under certain regularity conditions (see Assumption 1), $t^*_m(0)$ and $LM^*_m(0)$ have N(0, 1) and $\chi^2_1$ limiting null distributions, respectively. Shao and Wu (2007a) subsequently demonstrated that under local alternatives of the form $H_c: \delta = cm^{-1/2}$, $t^*_m(0) \overset{d}{\to} N(2c, 1)$ and, hence, $LM^*_m(0) \overset{d}{\to} \chi^2_1(4c^2)$, where $\chi^2_1(4c^2)$ denotes a noncentral $\chi^2_1$ distribution with noncentrality parameter $4c^2$. Before progressing to consider the case where $u_t$ is not observable, that is, where it is not known for sure that $\beta_2 = 0$ in (1), we first show that the properties established for the $LM^*_m(0)$ and $t^*_m(0)$ statistics in Lobato and Robinson (1998) and Shao and Wu (2007a) carry over to the general case of the $LM^*_m(\delta_0)$ statistic in (5) and corresponding $t^*_m(\delta_0)$ statistic in (4) for testing $H_0: \delta = \delta_0$ for any $\delta_0 \in (-0.5, 0.5)$. To do so we first introduce sufficient conditions for establishing these large sample results. We will discuss two sets of possible assumptions under which our large sample results obtain. The first set, given in Assumption 1, coincides with the conditions adopted by Robinson (1995b). The second set, given in Assumption 2, coincides with those employed by Shao and Wu (2007a).
ii. The weights $\psi_j$ are such that $\sum_{j=0}^{\infty} \psi_j^2 < \infty$. iii. The spectral density of $\eta_t$, f(λ), is twice boundedly differentiable in a neighborhood of λ = 0 and satisfies, as

Remark 3.1. The conditions on $\eta_t$ detailed in Assumption 1 coincide with those given in Robinson (1995b) and are slightly stronger than those in Lobato and Robinson (1998). A full discussion of these conditions is given in Robinson (1995b, pp. 1634 and 1641) and Lobato and Robinson (1998, p. 478). Assumption 1 includes all stationary and invertible finite-order ARMA models for $\eta_t$. Assumption 1 allows for nonlinearity via the martingale difference assumption on the innovations, but is otherwise linear. Notice also that Assumption 1 requires f(λ) to be smooth only around λ = 0 and so does not rule out long memory behavior at frequencies other than λ = 0 (although this needs to be strengthened in Assumption 3 to obtain results for our feasible tests).
The assumption of conditional homoscedasticity imposed by part (i) of Assumption 1 may be considered unacceptable for many data applications, in particular those involving financial data. Shao and Wu (2007a, 2007b) showed that this can be weakened to allow for a wide class of stationary, causal nonlinear processes. To that end, suppose that $\eta_t = F(\ldots, \varepsilon_{t-1}, \varepsilon_t)$, where $\varepsilon_t$ are independent and identically distributed (IID) random variables and F is a measurable function such that $\eta_t$ is well defined as a stationary, causal, ergodic process. For a random variable ξ and p > 0, write $\xi \in L^p$ if $\|\xi\|_p := (E(|\xi|^p))^{1/p} < \infty$.
Remark 3.2. Assumption 2 includes a number of widely used nonlinear time series models for $\eta_t$ such as bilinear models, threshold models, GARCH and ARMA-GARCH models; see Shao and Wu (2007a, p. 254) and Shao and Wu (2007b) and the references therein for further discussion of this assumption and further examples of classes of nonlinear processes which satisfy it. While Assumption 2 weakens, inter alia, the conditional homoscedasticity restriction of Assumption 1, this comes at the cost of a stronger assumption on the bandwidth, which is restricted to be such that $m = o(T^{2/3})$. Moreover, as discussed in Shao and Wu (2007b, Remark 3.1), Assumption 2(ii) implies continuous differentiability of f(λ) for all frequencies, whereas, as discussed in Remark 3.1 and Robinson (1995b), Assumption 1 only imposes conditions on f(λ) in a local-to-zero band. There is therefore a clear trade-off between the conditions imposed on $\eta_t$ by Assumptions 1 and 2.
In Theorem 1, we now derive the large sample properties of the $LM^*_m(\delta_0)$ and $t^*_m(\delta_0)$ statistics, obtained for the case where it is known that $\beta_2 = 0$ in (1). To facilitate discussion of asymptotic local power, we consider the local alternative $H_c: \delta = \delta_0 + cm^{-1/2}$.

Theorem 1. Let $y_t$ be generated according to (1) with $\beta_2 = 0$, and let either Assumption 1 or Assumption 2 hold on $\eta_t$. Then, for any $\delta_0 \in (-0.5, 0.5)$, under $H_c: \delta = \delta_0 + cm^{-1/2}$: (i) $t^*_m(\delta_0) \overset{d}{\to} N(2c, 1)$; and (ii) $LM^*_m(\delta_0) \overset{d}{\to} \chi^2_1(4c^2)$.

Remark 3.3. Theorem 1 shows that the results obtained for the limiting null distributions of the $LM^*_m(0)$ and $t^*_m(0)$ statistics in Lobato and Robinson (1998) apply more generally to the $LM^*_m(\delta_0)$ and $t^*_m(\delta_0)$ statistics for testing the null hypothesis that $u_t$ is $I(\delta_0)$ for any $\delta_0$ in the usual stationary and invertible region. Theorem 1 also shows that tests based on the $LM^*_m(\delta_0)$ and $t^*_m(\delta_0)$ statistics possess the same local power functions as tests based on the $LM^*_m(0)$ and $t^*_m(0)$ statistics. Moreover, these results hold regardless of whether $u_t$ is conditionally homoscedastic or conditionally heteroscedastic (satisfying Assumption 2). Finally, note that the result in Theorem 1 was anticipated without proof by Marinucci and Robinson (2001, sec. 4), at least under $H_0$ and Assumption 1.
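The asymptotic local power implied by a noncentral $\chi^2_1(4c^2)$ limit for the LM-type test, and by the corresponding N(2c, 1) limit for the t-type test, can be evaluated in closed form from the standard normal distribution function. The Python sketch below is illustrative only; the function names are hypothetical and the 5% level is an assumed example.

```python
import math

Z_975 = 1.959963984540054   # 97.5% standard normal quantile
Z_95 = 1.6448536269514722   # 95% standard normal quantile


def norm_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def local_c(delta, delta0, m):
    """Local drift c implied by delta = delta0 + c * m^(-1/2)."""
    return (delta - delta0) * math.sqrt(m)


def lm_local_power(c):
    """P(chi2_1(4c^2) > 5% critical value) = P(|Z + 2c| > z_0.975)."""
    return norm_cdf(-Z_975 - 2.0 * c) + 1.0 - norm_cdf(Z_975 - 2.0 * c)


def t_local_power(c):
    """Right-tailed 5% t-type test against delta > delta0: P(Z + 2c > z_0.95)."""
    return 1.0 - norm_cdf(Z_95 - 2.0 * c)
```

At c = 0 both functions return the nominal size, and the LM power is symmetric in c, whereas the right-tailed t-type test has power only against c > 0.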

Feasible Tests of $H_0: \delta = \delta_0$ Allowing for up to k Level Breaks
Recall that the LM- and t-type tests discussed in Section 3 are based on the assumption that $\beta_2 = 0$, such that the $LM_m(\delta_0)$ and $t_m(\delta_0)$ statistics calculated on the observed data $\{y_t\}$ will coincide with the $LM^*_m(\delta_0)$ and $t^*_m(\delta_0)$ statistics based on the shocks, $\{u_t\}$, even if $\beta_1 \neq 0$ such that $\{u_t\}$ are unobservable (because the statistics are invariant to $\beta_1$). However, where $\beta_2 \neq 0$ this is no longer the case, and we cannot proceed as if the tests were based on the unobservable shocks, $\{u_t\}$. Moreover, where $\beta_2 \neq 0$ the $LM_m(\delta_0)$ and $t_m(\delta_0)$ statistics constructed from the observed data, $\{y_t\}$, yield nonsimilar tests and will diverge. For example, if $\delta_0 = 0$ it can be shown that both statistics will diverge with the sample size, even under the null hypothesis $H_0$. As a consequence, the Lobato and Robinson (1998) tests will spuriously reject the null with probability tending to one as the sample size diverges. That is, tests based on $LM_m(\delta_0)$ or $t_m(\delta_0)$ are uninformative if it is unknown whether $\beta_2 = 0$ or not. In this section, we will therefore discuss how feasible versions of the tests discussed in Section 3 can be derived for the case where it is not known for certain whether $\beta_2 = 0$ or not.
In the context of (1), the disturbances $u_t$ are not observable and so they must be estimated. For a generic vector of (putative) break locations, $\tau = (\tau_1, \ldots, \tau_k)'$, we can use ordinary least squares (OLS) estimators of the parameters $\beta_1$ and $\beta_2$ in (1). To that end, let $\beta := (\beta_1, \beta_2')'$, and let $\hat{\beta}(\tau) := (\hat{\beta}_1(\tau), \hat{\beta}_2(\tau)')'$ denote the OLS estimator of β from the regression of $y_t$ on a constant and $DU_t(\tau)$. For a given value of τ we then have the corresponding estimated residuals

$$\hat{u}_t(\tau) := y_t - \hat{\beta}_1(\tau) - \hat{\beta}_2(\tau)' DU_t(\tau), \qquad t = 1, \ldots, T.$$

Based on $I_{\hat{u}(\tau)}(\lambda_j)$, we can then define analogues of the $LM^*_m(\delta_0)$ statistic of (5) and the corresponding t-type statistic $t^*_m(\delta_0)$ in (4), for testing $H_0: \delta = \delta_0$, as follows:

$$t_m(\delta_0; \tau) := -\sqrt{m}\,\frac{\sum_{j=1}^{m} \nu_j \lambda_j^{2\delta_0} I_{\hat{u}(\tau)}(\lambda_j)}{\sum_{j=1}^{m} \lambda_j^{2\delta_0} I_{\hat{u}(\tau)}(\lambda_j)}, \qquad LM_m(\delta_0; \tau) := \left(t_m(\delta_0; \tau)\right)^2. \qquad (8)$$

If the true vector of break fractions, $\tau^*$, were known then one would simply evaluate $LM_m(\delta_0; \tau)$ and $t_m(\delta_0; \tau)$ at $\tau = \tau^*$. Our focus, however, is on the case where $\tau^*$ is unknown and so will need to be estimated from the data. An obvious candidate is the minimum residual sum of squares (RSS) estimator considered in Lavielle and Moulines (2000, pp. 38-39), which can be written as

$$\hat{\tau} := \arg\min_{\tau} \sum_{t=1}^{T} \hat{u}_t(\tau)^2, \qquad (9)$$

where the minimization is over vectors τ with $\tau_i \in [\tau_L, \tau_U]$ and $|\tau_i - \tau_j| \geq \lambda$ for all $i \neq j$, and where it is recalled that $\tau_L$ and $\tau_U$ are trimming parameters such that $[\tau_L, \tau_U] \subset (0, 1)$. Given the RSS estimator $\hat{\tau}$ in (9), tests for $H_0: \delta = \delta_0$ can then be based on $LM_m(\delta_0; \hat{\tau})$ and $t_m(\delta_0; \hat{\tau})$. For these tests to be operational, we will need to establish the large sample behavior of the $LM_m(\delta_0; \hat{\tau})$ and $t_m(\delta_0; \hat{\tau})$ statistics under the null hypothesis, $H_0: \delta = \delta_0$, and show that unique asymptotic critical values (in the sense that they do not depend on any nuisance parameters) for the tests can be obtained from these distributions. In fact, we will be able to show in what follows that these statistics have the same limiting null distributions as were obtained for their infeasible counterparts $LM^*_m(\delta_0)$ and $t^*_m(\delta_0)$ in Theorem 1. To do so, however, we must impose some additional regularity conditions on $\eta_t$. In particular, Assumptions 1 and 2 must be strengthened to Assumptions 3 and 4, respectively, as follows:

Assumption 3. Let Assumption 1 hold. Assume further that:

Assumption 4. Let Assumption 2 hold, and define the projection . Then we assume further that:

Remark 4.1. Both Assumptions 3 and 4 impose the additional moment condition that $q > 1/(1 + 2\delta)$ moments exist. This condition is needed so that we can appeal to the functional central limit theorem (FCLT) for fractional processes, for which the moment condition is necessary; see Theorem 2 of Johansen and Nielsen (2012). The fractional FCLT also requires that q > 2, but this is implied in Assumptions 1 or 2 so is not stated explicitly here. The condition placed on the weights $\psi_j$ in Assumption 3(ii) is quite standard for the (fractional) FCLT and is met by all stationary and invertible finite-order ARMA models. This condition also implies continuity of the spectral density of $\eta_t$ and hence rules out long memory at other frequencies; see Remarks 3.1 and 3.2. The condition that $0 < \sum_{j=0}^{\infty} \psi_j < \infty$ (and a similar condition for the nonlinear process) is again omitted because it is implied by the assumption 0 < f(0) < ∞. The additional condition required to hold on the bandwidth in part (iv) of Assumptions 3 and 4 is not restrictive in practice because much larger bandwidths will typically be used.
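The feasible procedure described in this section (OLS de-trending for a candidate break, minimum-RSS break location, then the statistics computed from the residuals) can be sketched for the single-break case k = 1 as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the search is over integer break dates b with $DU_t = \mathbb{I}(t \geq b)$, the default trimming 0.15 and 0.85 is an example choice, the statistic uses the same assumed form $t = -\sqrt{m}\,\sum_j \nu_j \lambda_j^{2\delta_0} I(\lambda_j)/\sum_j \lambda_j^{2\delta_0} I(\lambda_j)$ with LM equal to its square, and all function names are hypothetical.

```python
import cmath
import math


def ols_residuals(y, b):
    """Residuals from OLS of y on a constant and the step dummy 1(t >= b).

    With a single step dummy, OLS reduces to demeaning each regime.
    Dates are 1-indexed, so observation t is y[t - 1].
    """
    pre, post = y[: b - 1], y[b - 1:]
    mu_pre, mu_post = sum(pre) / len(pre), sum(post) / len(post)
    return [v - mu_pre for v in pre] + [v - mu_post for v in post]


def estimate_break_date(y, tau_l=0.15, tau_u=0.85):
    """Minimum-RSS break date over the trimmed range of candidate dates."""
    T = len(y)
    candidates = range(max(2, int(tau_l * T)), int(tau_u * T) + 1)
    return min(candidates,
               key=lambda b: sum(e * e for e in ols_residuals(y, b)))


def t_lm_statistics(x, m, delta0=0.0):
    """t- and LM-type statistics from the first m periodogram ordinates."""
    T = len(x)
    mean_log = sum(math.log(j) for j in range(1, m + 1)) / m
    num = den = 0.0
    for j in range(1, m + 1):
        lam = 2.0 * math.pi * j / T
        w = sum(x[t] * cmath.exp(1j * lam * (t + 1)) for t in range(T))
        weighted = lam ** (2.0 * delta0) * abs(w) ** 2 / (2.0 * math.pi * T)
        num += (math.log(j) - mean_log) * weighted
        den += weighted
    t_stat = -math.sqrt(m) * num / den
    return t_stat, t_stat ** 2


def feasible_statistics(y, m, delta0=0.0, tau_l=0.15, tau_u=0.85):
    """Return (estimated break fraction, t, LM) allowing for one level break."""
    b_hat = estimate_break_date(y, tau_l, tau_u)
    t_stat, lm_stat = t_lm_statistics(ols_residuals(y, b_hat), m, delta0)
    return b_hat / len(y), t_stat, lm_stat
```

The exhaustive search costs O(T²) operations for one break, which is adequate for a sketch; for several breaks the candidate set grows combinatorially and a dynamic programming search would be the more usual design choice.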
We are now in a position to state our main result in Theorem 2, which details the large sample behavior of the feasible statistics $LM_m(\delta_0; \hat{\tau})$ and $t_m(\delta_0; \hat{\tau})$ under local alternatives of the form $H_c: \delta = \delta_0 + cm^{-1/2}$. We will first state our main result and then provide some discussion around this result. We will also provide further insights into this result through the case of k = 1, where results are easier to explain.
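Theorem 2. Let $y_t$ be generated according to (1), and let either Assumption 3 or Assumption 4 hold on $\eta_t$. Then, for any $\delta_0 \in (-0.5, 0.5)$, under $H_c: \delta = \delta_0 + cm^{-1/2}$, and regardless of whether $\beta_{2,i} = 0$ or $\beta_{2,i} \neq 0$, i = 1, . . . , k: (i) $t_m(\delta_0; \hat{\tau}) \overset{d}{\to} N(2c, 1)$; and (ii) $LM_m(\delta_0; \hat{\tau}) \overset{d}{\to} \chi^2_1(4c^2)$.

Remark 4.2. A comparison of the results in Theorem 2 with those given previously in Theorem 1 yields the following immediate consequence. Regardless of whether any particular element of $\beta_2$ in (1) is zero or nonzero, the tests based on $LM_m(\delta_0; \hat{\tau})$ and $t_m(\delta_0; \hat{\tau})$ attain exactly the same asymptotic local power functions as obtained by the infeasible tests based on $LM^*_m(\delta_0)$ and $t^*_m(\delta_0)$, respectively. Moreover, under $H_0$ (that is, c = 0), $t_m(\delta_0; \hat{\tau}) \overset{d}{\to} N(0, 1)$ and $LM_m(\delta_0; \hat{\tau}) \overset{d}{\to} \chi^2_1$, so that standard critical values can be used for both tests, again regardless of whether $\beta_{2,i} = 0$ or $\beta_{2,i} \neq 0$, i = 1, . . . , k, holds in (1).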
A proof of Theorem 2 is provided in the appendix. The proof strategy is to consider the distances between the feasible statistics $LM_m(\delta_0; \hat{\tau})$ and $t_m(\delta_0; \hat{\tau})$ and the infeasible $LM^*_m(\delta_0)$ and $t^*_m(\delta_0)$ statistics, respectively, in large samples. Inherent in doing so is to analyze the distance between $u_t$ and $\hat{u}_t(\hat{\tau})$, the latter given by $\hat{u}_t(\tau)$ evaluated at $\tau = \hat{\tau}$, and establish how this affects the distance between the feasible and infeasible statistics. The behavior of both $LM_m(\delta_0; \hat{\tau})$ and $t_m(\delta_0; \hat{\tau})$ clearly depends on the large sample properties of the estimates $\hat{\tau}$ in (9) and $\hat{\beta}(\hat{\tau})$, the latter given by $\hat{\beta}(\tau)$ evaluated at $\tau = \hat{\tau}$. For the properties of $\hat{\tau}$ we apply a result of Lavielle and Moulines (2000), and we combine this with a fractional FCLT for $u_t$ to obtain results for $\hat{\beta}(\hat{\tau})$.
Remark 4.3. To give some insight into the mechanics behind the proof, it is instructive to specialize our discussion to the case where k = 1. Accordingly, and with an obvious notation, the break fraction, its estimate, the break magnitude, and its estimate are now the scalars τ, $\hat{\tau}$, $\beta_2$, and $\hat{\beta}_2$, respectively. The proof proceeds by establishing that two key results hold under the conditions of Theorem 2. The first result is that if $\beta_2 = 0$ (so that no level break occurs), then $t_m(\delta_0; \tau) - t^*_m(\delta_0) = o_p(1)$ and $LM_m(\delta_0; \tau) - LM^*_m(\delta_0) = o_p(1)$, in each case uniformly in τ. This result establishes that when no level break occurs, the differences between the statistics based on $u_t$ and $\hat{u}_t(\tau)$ are asymptotically negligible, and that this holds uniformly in τ and, hence, holds for $\hat{\tau}$. To prove this, we first establish uniformly in τ results for $\hat{\beta}(\tau)$. It is at this stage that the fractional FCLT is used. We can then derive properties of the estimated residuals $\hat{u}_t(\tau)$ and analyze the distance between the Fourier transforms (and hence the periodograms) of $\hat{u}_t(\tau)$ and of $u_t$. The second result is that if $\beta_2 \neq 0$, then $t_m(\delta_0; \hat{\tau}) - t^*_m(\delta_0) = o_p(1)$ and $LM_m(\delta_0; \hat{\tau}) - LM^*_m(\delta_0) = o_p(1)$. That is, when $\beta_2 \neq 0$, such that a level break occurs, the differences between the statistics based on $u_t$ and $\hat{u}_t(\hat{\tau})$ are asymptotically negligible. In this case, we first establish the properties of the estimate of the break fraction, $\hat{\tau}$, using results from Lavielle and Moulines (2000). These properties allow us to bound the distance between $\hat{\beta}(\hat{\tau})$ and $\hat{\beta}(\tau^*)$, and use this to analyze the distance between the Fourier transforms (and the periodograms) of $\hat{u}_t(\hat{\tau})$ and of $\hat{u}_t(\tau^*)$.
Remark 4.4. As discussed in Remark 4.3, the difference between the feasible and infeasible test statistics is shown to be $o_p(1)$ in Theorem 2. However, these remainder terms are nonetheless functions of $\delta_0$, or equivalently of δ because of the local asymptotic framework (see, e.g., (A.17), (A.24), and (A.25) in Appendix A.2). This finite sample dependence on $\delta_0$ can also be observed in the Monte Carlo results; see point (v) in Section 5.
Remark 4.5. The result in Theorem 2 shows that there is no loss in asymptotic local power from allowing for k breaks when the true number of breaks, $k^*$ say, is smaller than k. However, as the simulation results in Section 5 show, the finite sample size and power properties of the feasible LM-type test, $LM_m(\delta_0; \hat{\tau})$, deteriorate somewhat if k is chosen to be larger than $k^*$. On the other hand, if k is chosen to be smaller than $k^*$ then we know from the discussion at the start of Section 4 that the $LM_m(\delta_0; \hat{\tau})$ statistic will diverge, even under the null hypothesis. In practical applications it would therefore seem sensible to select the number of breaks used in constructing the $LM_m(\delta_0; \hat{\tau})$ statistic according to a consistent information criterion. Theorem 9 of Lavielle and Moulines (2000, pp. 49-50) provides the conditions required on the penalty function such that an information criterion-based approach will consistently select the true number of breaks in the context of the DGP in (1) under the conditions of Theorem 2. Their result shows that, provided the maximum number of breaks allowed, k, is at least as large as the true number of breaks, $k^*$, then the commonly used Bayes information criterion (BIC) of Schwarz (1978) and Hannan-Quinn information criterion (HQIC) of Hannan and Quinn (1979) will both deliver consistent estimates of $k^*$. We recommend the use of the HQIC as this is less parsimonious than the BIC, and hence constitutes a safer choice in practice, given the severe implications of fitting too few level breaks. We will illustrate the use of the BIC and HQIC in the empirical applications in Section 6.

Monte Carlo Evidence
We begin this section by investigating how well the large sample predictions of Theorem 2 hold in finite samples for a DGP that has either zero or one level break, and we accordingly set k = 1 so that the notation of Remark 4.3 applies. To that end, Figures 1 and 2 graph simulated finite sample power functions of the feasible LM-type test, $LM_m(\delta_0; \hat{\tau})$, proposed in Section 4 and the corresponding Lobato and Robinson (1998) test, $LM_m(\delta_0)$, that does not allow for the possibility of a level break. In the context of the $LM_m(\delta_0; \hat{\tau})$ statistic, we set the trimming parameters to be $\tau_L = 0.15$ and $\tau_U = 0.85$. Also graphed are the power functions of the corresponding infeasible tests, $LM_m(\delta_0; \tau^*)$, defined just under (8), and $LM^*_m(\delta_0)$ defined in (5). The former assumes knowledge of the true break location, $\tau^*$, but not the innovations, $u_t$, and the latter assumes knowledge of the innovations.
The simulated data used to construct the power curves in Figures 1 and 2 were generated according to the DGP in (1)-(2) for T = 512 and T = 1024 setting k = 1 and with η t ∼ NIID(0, 1), and where β 1 was set equal to zero with no loss of generality. All of the reported tests are for testing H 0 : δ = 0 at the nominal asymptotic 5% level. The graphs depict the simulated power functions of the tests under the local alternative H c : δ = cm −1/2 for a range of values of c and with the corresponding values of δ shown on the horizontal axes. Results are reported for two bandwidth choices, namely m = T 0.65 and m = T 0.8 . The results in Figure 1 relate to the case considered in Theorem 2 with no level break, that is, β 2 = 0, while the results in Figure 2 relate to Theorem 2 for the specific case of a level break with β 2 = 2 at τ * = 0.5, that is, a break equal to two standard deviations of the innovation process occurring midway through  the sample. The simulated power curves were computed using 10,000 Monte Carlo replications using the RNDN function of Gauss 20. As a benchmark, we also include in each graph the corresponding asymptotic local power curves obtained directly from the noncentral χ 2 1 (4c 2 ) distribution, where c = δ √ m. Consider first the results in Figure 1 for the no break case. Here, given knowledge that no level break was present, the best possible test to use among the three considered would be the basic Lobato and Robinson (1998) test, LM m (δ 0 ) = LM * m (δ 0 ). Against positive values of δ this test has power closest to the asymptotic local power function and is somewhat more powerful than the infeasible LM m (δ 0 ; τ * ) test, which in turn is more powerful than the feasible LM m (δ 0 ; τ ) test. 
These differences are, however, reduced for T = 1024 vis-à-vis T = 512 and for m = T 0.8 vis-à-vis m = T 0.65 ; indeed for T = 1024 and m = T 0.8 the differences between the three tests are quite small with all three lying close to the asymptotic local power curve. For negative values of δ there are only very slight differences between the three tests. Overall, when β 2 = 0, the large sample predictions from Theorem 2 appear to hold reasonably well in finite samples, particularly so for the larger bandwidth considered.
Consider next the results in Figure 2 for the case where a level break of magnitude β 2 = 2 occurs. Here the infeasible LM * m (δ 0 ) test no longer coincides with the feasible Lobato and Robinson (1998) test, LM m (δ 0 ). In this case the divergence of the LM m (δ 0 ) test is clearly seen, regardless of whether the null hypothesis holds or not, with the test rejecting essentially 100% of the time even for the smaller sample size considered. The power functions of the infeasible LM m (δ 0 ; τ * ) and feasible LM m (δ 0 ; τ ) tests essentially coincide regardless of the sample size or bandwidth considered, suggesting that τ * is very accurately estimated by τ in this case. As with the results for the no break case in Figure 1, for positive values of δ the power curve of the feasible LM m (δ 0 ; τ ) test lies only slightly below that of the infeasible LM * m (δ 0 ) test, which in turn lies close to the asymptotic local power curve, with the differences between the power curves reducing as T and/or m is increased. For negative values of δ the power curves of the LM m (δ 0 ; τ ) and LM * m (δ 0 ) tests are almost indistinguishable regardless of m or T. Again the large sample predictions from Theorem 2 would appear to hold reasonably well in finite samples.
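The asymptotic benchmark curves referred to above can be computed without simulation. The following is a minimal sketch, assuming only the stated noncentral χ 2 1 (4c 2 ) limit: because such a variate is the square of an N(2c, 1) variate, the rejection probability of a two-sided test at a given nominal level has a closed form in the standard normal distribution function. The function name `lm_local_power` is ours and is used purely for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and variance 1

def lm_local_power(c, level=0.05):
    """Asymptotic local power of a test whose statistic is chi2_1(4c^2).

    Uses Z^2 ~ chi2_1(4c^2) for Z ~ N(2c, 1), so that
    P(Z^2 > z^2) = P(Z > z) + P(Z < -z).
    """
    z = _STD_NORMAL.inv_cdf(1.0 - level / 2.0)  # square root of the chi2_1 critical value
    mu = 2.0 * c
    return (1.0 - _STD_NORMAL.cdf(z - mu)) + _STD_NORMAL.cdf(-z - mu)
```

Evaluating `lm_local_power(c)` over a grid of c, and mapping each c to δ = cm −1/2 for the chosen bandwidth, gives a curve of the kind plotted as the benchmark; at c = 0 the function returns the nominal level.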
In the remainder of this section, we summarize the results from a large set of Monte Carlo experiments designed to investigate the finite sample size and power properties of the semiparametric long memory tests proposed in Section 4. Specifically, we compare the empirical size and power properties of the LM m (δ 0 ; τ ), LM m (δ 0 ; τ * ), and LM m (δ 0 ) tests along with the corresponding t-type tests, t m (δ 0 ; τ ), t m (δ 0 ; τ * ) and t m (δ 0 ), respectively. Results are reported for DGPs with either zero, one or two level breaks. In the case where a maximum of one level break is allowed, comparison is also made with the KPSS stationarity test, denoted KPSS, together with the generalizations thereof proposed in Busetti and Harvey (2001, 2003) which allow for a level break at either a known or unknown location, denoted KPSS(τ * ) and KPSS( τ ), respectively. The full set of results together with details of the experimental design can be found in the supplementary appendix.
We considered models for {y t } of the form given in (1) with either k = 1 or k = 2:
• For the k = 1 case (so that up to one level break is allowed) the DGP had either no level break or a level break at the sample midpoint with magnitude β 2 ∈ {0.5, 1, 2}. Results are reported for testing H 0 : δ = 0, both where δ = 0 (empirical size) and where δ ∈ {−0.15, 0.15} (empirical power). The empirical size properties of tests for H 0 : δ = 0.3 and H 0 : δ = −0.3 were also explored. For the empirical size results the error process η t was allowed to follow either an IID process, an AR(1) process or an ARCH(1) process, while for empirical power IID and ARCH(1) processes were considered.
• For the k = 2 case (so that up to two level breaks are allowed) the DGP had either no level break or was such that two level breaks occurred with the level shifting from 0 to β 2 to 2β 2 at 1/3 and 2/3, respectively, of the way through the sample with β 2 ∈ {0.5, 1, 2}. Results are again reported for testing H 0 : δ = 0, both where δ = 0 (empirical size) and where δ ∈ {−0.15, 0.15} (empirical power).
All of the tests were implemented for both a range of fixed bandwidths and using data-driven bandwidth rules. Again we set the search set to [0.15, 0.85]. The principal findings of our Monte Carlo results can be summarized as follows, where comments (i)-(vi) relate to results in Tables S.1-S.18 for the single (putative) level break case, and comment (vii) relates to results in Tables S.19-S.24 for the double (putative) break case:
i. As with the findings in Lobato and Robinson (1998) our results demonstrate that the bandwidth m has a significant impact on the finite sample properties of the tests, with a clear trade-off seen between size and power. In particular, for a given sample size, excluding those tests which are nonsimilar (i.e., excluding the LM m (δ 0 ) and t m (δ 0 ) tests when β 2 ≠ 0), we observe the following general patterns: (a) for a given pattern of weak dependence, the observed distortions from the nominal (asymptotic) significance level are greater the larger is the bandwidth, m, and (b) empirical power against a given fixed alternative increases as the bandwidth, m, increases. Generally, a range of bandwidths between m = T 0.5 and m = T 0.65 provides reasonable finite sample size control across the cases considered.
ii. Our results suggest that the automatic bandwidth, m LR , of Lobato and Robinson (1998) delivers a reasonable trade-off between finite sample size and power considerations, at least when the data are conditionally homoscedastic. In the conditionally heteroscedastic ARCH(1) case, the empirical size of tests based on m LR does not improve, other things equal, as the sample size is increased. This is perhaps not surprising given that the m LR bandwidth rule is not consistent with the bandwidth rate imposed on m by Assumption 2, and we therefore recommend caution in using the m LR bandwidth rule with data which are suspected to display conditional heteroscedasticity. For the KPSS-type tests, the automatic bandwidth rule recommended in Lobato and Robinson (1998) also appears to deliver a reasonable size-power trade-off.
iii. Overall, our results suggest that it may be helpful in practice to consider the automatic bandwidth, m LR , together with a range of bandwidths between m = T 0.5 and m = T 0.65 . This is what we will do in the empirical examples in Section 6.
iv. As expected, where a level break occurs (β 2 ≠ 0), the nonsimilar LM m (δ 0 ), t m (δ 0 ), and KPSS tests are highly unreliable, displaying severe oversize (excepting the left-tailed t m (δ 0 ) test which is correspondingly undersized), and hence spurious evidence of long memory. The observed size distortions seen with these tests are higher, other things equal, the larger is the sample size or the level break magnitude.
v. Although asymptotically equivalent under both the null and local alternatives (cf. Theorem 2), differences are observed between the finite sample size and power properties of the pairs of tests LM m (δ 0 ; τ ) and LM m (δ 0 ; τ * ), and t m (δ 0 ; τ ) and t m (δ 0 ; τ * ). The LM m (δ 0 ; τ * ) and t m (δ 0 ; τ * ) tests are based on knowledge of whether a level break occurs or not (i.e., whether β 2 = 0 or β 2 ≠ 0) and, where a break occurs, also knowledge of the level break location τ * , while LM m (δ 0 ; τ ) and t m (δ 0 ; τ ) do not assume knowledge of either. The differences between the finite sample properties of these pairs of tests are seen to diminish as either the sample size or, in the case where a level break occurs, the break magnitude increases; indeed, for the largest magnitude considered, β 2 = 2, these differences are largely eliminated even for the smaller of the two sample sizes considered. The observed differences between the empirical power properties of these pairs of tests are seen to be slightly larger, other things equal, in the case where the errors are ARCH(1) vis-à-vis the IID case. Moreover, the finite sample differences between the pairs of tests are smallest for the tests of H 0 : δ = −0.3 and largest for the tests of H 0 : δ = 0.3; cf. Remark 4.4. Where no level break is present, the finite sample differences between the LM m (δ 0 ; τ ) test and LM m (δ 0 ) (which assumes no level break is present) are again relatively small, other things equal, particularly for the larger sample size considered. This is also broadly true for a comparison between the t m (δ 0 ; τ ) and t m (δ 0 ) tests, although the differences are larger than for the LM-type tests. Overall, the asymptotic theory presented in Theorem 2 appears to provide a reasonable prediction of the finite sample behavior of the LM m (δ 0 ; τ ) and t m (δ 0 ; τ ) tests.
vi. For a given DGP, the one-sided t-tests have more power (in the correct tail) than the corresponding two-sided LM tests, as would be expected. Moreover, and consistent with both the discussion concerning theoretical power rates against local alternatives in Shao and Wu (2007a) and the simulation findings in Lobato and Robinson (1998), the KPSS-type tests have considerably lower power to detect departures from short memory than do the corresponding LM- and t-based fractional integration tests discussed in this article, at least provided reasonable bandwidths m are chosen.
vii. The results for the case where two putative breaks are allowed for (k = 2) are qualitatively similar to the corresponding results discussed above for the case of a single (putative) level break. However, as might be expected, the patterns seen for k = 1 are somewhat magnified for k = 2.

Empirical Examples
Throughout the empirical examples in this section, we set the trimming parameters equal to the same values as were used in the Monte Carlo experiments in Section 5, that is, τ L = 0.15 and τ U = 0.85. Where multiple breaks were estimated, we set the minimum spacing parameter λ defined in Section 4 to λ = 0.10, except for the VIX example where we set λ = 0.05 to allow larger values of k. For k ≥ 4, a complete enumeration of all possible break date combinations is infeasible, so the break dates are estimated by numerical (integer) optimization of the RSS function using a Matlab program which is available in the supporting material.
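For a single break, the RSS-based search over the trimmed range is a one-dimensional enumeration, which the following minimal sketch illustrates. It is not the Matlab program mentioned above. It assumes a pure level-shift model (a separate mean before and after the break), candidate break points between the integer parts of τ L T and τ U T, and the convention that a break at index b puts observations 0, ..., b − 1 in the first regime. The function name `estimate_single_break` and these indexing conventions are ours.

```python
def estimate_single_break(y, tau_l=0.15, tau_u=0.85):
    """Return (break_index, rss) minimizing the RSS of a one-break level model.

    A break at index b means y[:b] and y[b:] are fitted by separate means.
    Candidates are restricted to int(tau_l * T) <= b <= int(tau_u * T).
    """
    T = len(y)
    lo = max(1, int(tau_l * T))
    hi = min(T - 1, int(tau_u * T))
    # Prefix sums of y and y^2 let each segment RSS be computed in O(1).
    s, q = [0.0], [0.0]
    for v in y:
        s.append(s[-1] + v)
        q.append(q[-1] + v * v)

    def seg_rss(a, b):
        # RSS of y[a:b] around its own sample mean.
        n = b - a
        total = s[b] - s[a]
        return (q[b] - q[a]) - total * total / n

    best_b, best_rss = None, float("inf")
    for b in range(lo, hi + 1):
        rss = seg_rss(0, b) + seg_rss(b, T)
        if rss < best_rss:
            best_b, best_rss = b, rss
    return best_b, best_rss
```

With k breaks the same idea requires a search over k-tuples of break points subject to the minimum spacing λ, and the number of candidates grows rapidly with k, which is why the text resorts to numerical optimization for k ≥ 4.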

Bitcoin Returns
We apply the semiparametric long memory tests described in this article to the daily returns of Bitcoin over the period 17 September 2014 to 31 December 2019, giving a total of T = 1932 daily observations. The data were retrieved from Yahoo Finance. The logarithm of the closing price of Bitcoin in USD is graphed in Figure 3 along with the returns series, defined as first differences of the (log) closing price series. A visual inspection of the data suggests the plausibility of changes in slope, implying changes in level at the same point in the returns series, with the most obvious case being at around the beginning of 2018. The red line on the graphs shows the fitted deterministic trend/level of the series allowing for two breaks, the locations of which are estimated by applying the RSS-based estimator discussed in Section 4 to the returns data setting k = 2. The estimated break dates are March 24, 2017 and December 16, 2017. Evidence of long memory in returns would of course be in strong violation of the efficient market hypothesis, and so it is of interest in the context of the Bitcoin returns data to test H 0 : δ = 0 against the alternative H 1 : δ > 0. We do so using both the test based on the t m (0) statistic of Lobato and Robinson (1998), which does not allow for a level break, and the analogues of this test based on the t m (0; τ ) and t m (0; τ ) statistics allowing for the presence of either one or two level breaks, respectively, in each case occurring at unknown points in the sample. Following the recommendations from our Monte Carlo study we computed the statistics for a range of values of the bandwidth parameter, m, lying between T 0.5 = 43 and T 0.65 = 137, inclusive, as well as for the automatic bandwidth rule, m LR of Lobato and Robinson (1998) with the value that this takes reported in parentheses below the outcome of the statistics. The results are summarized in Table 1. 
Here, and also in Tables 4 and 5, the superscripts * , * * and * * * denote outcomes which are statistically significant at the 10%, 5%, and 1% level, respectively, while the superscripts HQ and BIC indicate the number of breaks chosen by the HQIC and BIC, respectively; cf. Remark 4.5.
Using Lobato and Robinson's t m (0) test, we can reject H 0 at the 10% level when using the data-dependent bandwidth rule, m LR , and for all but the smallest and largest of the other bandwidths considered. The null can also be rejected at the 5% level for m = 75 and m = 93. On balance we surmise from the results for the standard Lobato and Robinson test that the short memory null hypothesis is rejected in favor of long memory in the Bitcoin returns data. On the other hand, for the test based on t m (0; τ ), which fits a level break to the data, the evidence against the null hypothesis is considerably weaker and, in particular, H 0 can only be rejected at the 10% level for bandwidths m ∈ {75, 93, m LR }. Allowing for two breaks, which is the number chosen by our preferred HQIC, no choice of bandwidth results in a rejection at even the 10% level for the t m (0; τ ) test. This suggests that the finding of long memory in Bitcoin returns by the Lobato and Robinson (1998) test is likely attributable to the presence of at least one level break in the returns data.

VIX Market Volatility
In the next example, we consider market volatility, measured by VIX, using daily data from January 1, 2000 to December 31, 2019 for a total of T = 5031 observations. The data were downloaded from Yahoo Finance and are graphed in Figure 4. The red step function on the graph shows the fitted deterministic level of the series allowing for 10 level breaks. It has been argued by several authors that long memory in volatility is an important stylized fact (see, e.g., Andersen et al. 2001 and references therein). Furthermore, long memory in volatility is relevant in asset pricing. For example, Baillie, Bollerslev, and Mikkelsen (1996) used asset pricing as motivation for their FIGARCH model, and Christensen and Nielsen (2007) discussed implications of long memory in volatility in the context of stock pricing. Other authors, however, suggest volatility might be a short memory process with the statistical evidence for long memory disappearing once level shifts in the data are accounted for; see, among others, Granger and Hyung (2004).
NOTE: All statistics in this table are significant at the 1% level, excepting those with a superscript a which are significant at the 5% level, those with a superscript b which are significant at the 10% level, and those with a superscript c which are not significant at the 10% level. Superscripts HQ and BIC indicate the number of breaks chosen by the HQIC and BIC, respectively.
To investigate this further, we test the short memory null hypothesis H 0 : δ = 0 against the long memory alternative H 1 : δ > 0 in the VIX data. We report the outcomes of the t m (0) statistic, the t m (0; τ ) statistic which allows for the presence of up to one level break, and the t m (0; τ ) statistic which allows for up to k level breaks for each of k = 2, . . . , 10. We again computed these statistics for a range of values of the bandwidth parameter, m, between T 0.5 = 70 and T 0.65 = 254, inclusive, together with the automatic bandwidth rule, m LR . The results are summarized in Table 2. Following Andersen et al. (2001), we also conducted the analysis using logarithmically transformed VIX data, and the results were nearly identical to those reported in Table 2.
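The endpoints of the bandwidth range quoted above follow from the sample size by simple arithmetic. The sketch below assumes that T 0.5 and T 0.65 are rounded down to the nearest integer; this rounding convention is our assumption, chosen because it matches the values reported for this series, and the function name `bandwidth_endpoints` is hypothetical.

```python
import math

def bandwidth_endpoints(T, low_exp=0.5, high_exp=0.65):
    """Integer bandwidth endpoints floor(T**low_exp) and floor(T**high_exp)."""
    return math.floor(T ** low_exp), math.floor(T ** high_exp)

# VIX sample size from the text: T = 5031.
vix_low, vix_high = bandwidth_endpoints(5031)
```

For T = 5031 this gives the pair (70, 254) used in the VIX example.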
It is seen from the results in Table 2 that the short memory null hypothesis is easily rejected at the 1% significance level for all of the bandwidths considered, other than m LR , regardless of how many level breaks we fit to the data. The tests based on m LR provide weaker evidence of long memory in the VIX data where 5 or more level breaks are fitted; for example, the HQIC selects 10 breaks and here the t m (0; τ ) test is only able to reject at the 10% level when using m LR . In conclusion, though, the results of these tests strongly suggest that long memory is a feature of the VIX data, and that this would not appear to be spurious long memory due to unmodeled level breaks. Table 3 repeats the analysis of Table 2, but testing the null hypothesis H 0 : δ = 0.4 against the two-sided alternative H 1 : δ ≠ 0.4. The value δ = 0.4 is very commonly found to characterize volatility data in empirical work (e.g., Andersen et al. 2001; Christensen and Nielsen 2007), and thus seems like a natural null hypothesis. For bandwidths m ≥ 125, including m LR , the null hypothesis is rejected at the 1% level regardless of the number of breaks allowed for. However, for m ≤ 108 the evidence against the null hypothesis becomes weaker. Using the number of breaks selected by either HQIC or BIC, the null cannot be rejected at the 10% level for any m ≤ 100, but can be rejected for larger m. On balance, unless a relatively small bandwidth is used, we conclude that the VIX is more persistent than an I(0.4) series (because rejection is in the right tail). The latter finding is in line with some recent empirical work (e.g., Frederiksen, Nielsen, and Nielsen 2012).

U.S. CPI Inflation
We next consider U.S. CPI inflation, defined as the first differences of the logarithm of the price index. Specifically, we used the series CPIAUCSL from the FRED database, which is the CPI for all items, Urban consumers, seasonally adjusted, base year 1984. We used monthly observations spanning January 1970 to December 2019, for T = 599 observations on the first differences. The log-CPI data along with the inflation data, the latter multiplied by 1200 to return a measure that is compatible with the commonly reported inflation rate, are both plotted in Figure 5. U.S. inflation is widely argued to have gone through several different policy regimes over the sample period considered here, most notably the Great Inflation period of the 1970s, the subsequent Volcker-Greenspan era of inflation rate targeting by the U.S. Federal Reserve starting in the early 1980s, and the response to the financial crisis of 2008. Figure 5 is indeed suggestive of the possibility of several level breaks in the inflation data. The red step line on the graphs again shows the fitted deterministic trend/level of the series allowing for up to four breaks. The estimated break dates are August 1977, July 1982, January 1991, and July 2008, broadly consistent with the regimes discussed above.
We again test the short memory null hypothesis, H 0 : δ = 0, against the alternative of (positive) long memory in the U.S. inflation data. We consider both the test based on the t m (0) statistic of Lobato and Robinson (1998), and the corresponding tests based on the t m (0; τ ) and t m (0; τ ) statistics allowing for the presence of up to k = 1, . . . , 4 level breaks, in each case at unknown points in the sample. The results are reported in Table 4 again for a range of values of the bandwidth parameter, m, lying between T 0.5 = 24 and T 0.65 = 63, inclusive, and the data-dependent bandwidth rule, m LR .
Lobato and Robinson's t m (0) test overwhelmingly rejects short memory at any conventional significance level for all of the bandwidths considered. Allowing for the presence of level breaks considerably reduces the magnitude of the test statistics. The test outcomes are generally still strongly significant when allowing for one or two level breaks, but when three level breaks are allowed for (the number chosen by BIC), the null cannot be rejected at the 5% level for bandwidths up to m = 40. When allowing for four level breaks (the number chosen by HQIC) only the tests based on bandwidths of m = 50 and m = 63 are significant at the 5% level. Consequently, while the standard Lobato and Robinson (1998) test presents very strong evidence in favor of long memory in the U.S. inflation rate, tests which allow for different policy regimes within the sample period are more suggestive that U.S. inflation is a short memory series.
Table 3. Tests of H 0 : δ = 0.4 versus H 1 : δ ≠ 0.4 in VIX volatility data.

Real U.S. GDP Growth Rate
Finally, we consider U.S. GDP growth rates obtained as the first difference of the logarithm of real U.S. quarterly GDP (seasonally adjusted) over the period 1947Q1 to 2019Q4 obtained from the FRED database (series GDPC1), for a total of T = 292 quarterly observations. The data for U.S. (log) GDP and the GDP growth rates are both graphed in Figure 6. The red line on the graphs again shows the fitted deterministic trend/level of the series allowing for up to three breaks. The estimated break dates are 1973Q2, 1982Q3 and 2000Q2, broadly consistent with the first oil crisis, changes in the Fed policy (discussed in the context of the U.S. CPI data in Section 6.3) and the end of the dot-com bubble. In particular, we will test the null hypothesis that growth rates are short memory, H 0 : δ = 0, such that the log-level of GDP follows an I(1) process, against the alternative of negative long memory (antipersistence) in growth rates, H 1 : δ < 0, such that the log-level of GDP is less persistent than an I(1) process. As in the previous examples, we consider the test of Lobato and Robinson (1998) based on the t m (0) statistic, and the corresponding tests based on the t m (0; τ ) and t m (0; τ ) statistics allowing for up to k = 1, 2, 3 level breaks, in each case at unknown points in the sample. The results are reported in Table 5, again for a range of values of the bandwidth parameter, m, lying between T 0.5 = 17 and T 0.65 = 40, inclusive, and the data-dependent bandwidth rule, m LR .
NOTE: * , * * , and * * * denote outcomes which are statistically significant at the 10%, 5%, and 1% level, respectively, while superscripts HQ and BIC indicate the number of breaks chosen by the HQIC and BIC, respectively.
With only a few exceptions, the tests reported are unable to reject the null hypothesis that GDP growth rates are short memory against H 1 : δ < 0 at conventional significance levels. The results from these tests do not therefore appear to support the conjecture of Perron (1989) that U.S. GDP is I(0) about a broken linear trend, particularly when recalling that our test is of the null hypothesis that U.S. GDP is I(1) around a broken trend.
NOTE: * , * * , and * * * denote outcomes which are statistically significant at the 10%, 5%, and 1% level, respectively, while superscripts HQ and BIC indicate the number of breaks chosen by the HQIC and BIC, respectively.

Conclusions
We have developed semiparametric tests, based on the Lagrange multiplier testing principle, for the fractional order of integration of a univariate time series which may be subject to the presence of level breaks. This is of significant practical importance as it is well known that long memory and level breaks can be mistaken for one another, with unmodeled level breaks rendering standard fractional integration tests highly unreliable. Our approach generalizes the tests for the null hypothesis of weak dependence (I(0)) developed in Lobato and Robinson (1998). These tests are based on the local Whittle approach, and therefore do not require the user to specify a parametric model for any weak autocorrelation present in the data, which is a considerable practical advantage where the confounding effects of long memory and level breaks are present. We also show how, as conjectured in Lobato and Robinson (1998, p. 478), their testing approach can be generalized to develop tests of the null hypothesis that a series is I(δ) for any δ lying in the usual stationary and invertible region of the parameter space, not just δ = 0. In spite of these generalizations, our tests are shown to attain the same standard asymptotic null distributions and asymptotic local power functions as the corresponding tests in Lobato and Robinson (1998); hence, there is no loss of asymptotic local power from allowing for level breaks, even where no level breaks are present. Monte Carlo simulations suggest that the tests perform well and that the predictions from the asymptotic theory appear to hold reasonably well in finite samples. The practical relevance of our proposed tests was highlighted with a number of empirical examples relating to macroeconomics and finance.

Appendix A: Mathematical Proofs
In this appendix, we provide proofs of Theorems 1 and 2.

A.1. Proof of Theorem 1
We use the notation $\delta_c := c m^{-1/2}$, so that, under $H_c$, we have $\delta = \delta_0 + \delta_c$. Consider first the proof under Assumption 1. We rewrite $t^*_m(\delta_0)$ in (4) as

Letting $I_\varepsilon(\lambda_j)$ denote the periodogram of $\varepsilon_t$, (4.8) of Robinson (1995b) shows that, for $r \le m$,

$= O_p\big(r^{1/3}(\ln(r))^{2/3} + r^{3}T^{-2} + r^{1/2}T^{-1/4}\big)$. (A.5)

Then, letting $b_j := \nu_j j^{-2\delta_c}$ and proceeding as in Robinson (1995b) it follows that the remainder term (A.2) is $o_p(1)$. This involves using summation by parts, (A.5), and the bound $|b_j - b_{j+1}| = O(j^{-1})$, which follows by elementary calculations. From (4.11) of Robinson (1995b) it follows directly that (A.3) converges in distribution to $N(0, 1)$. Next, by a Taylor series expansion and by definition of

(A.6)

Writing $\ln j = \nu_j + m^{-1}\sum_{k=1}^{m}\ln k$, the first term of (A.6) is

Noting that $2c\,m^{-1}\sum_{j=1}^{m}\nu_j^{2}\,2\pi E(I_\varepsilon(\lambda_j)) = 2c\,m^{-1}\sum_{j=1}^{m}\nu_j^{2} \to 2c$, the first term converges in probability to $2c$ by a law of large numbers. Using the result for (A.3) and the fact that $m^{-1}\sum_{k=1}^{m}\ln k = O(\ln m)$, the second term is $O_p(m^{-1/2}\ln m) = o_p(1)$. Next, the expectation of the absolute value of the second term of (A.6) is

where the last equality follows because $m^{1/2} \ge \ln m$, which implies $m^{2|\delta_c|} \le m^{2|c|/\ln m} = e^{2|c|}$. This shows that the second term of (A.6) converges to zero in $L_1$-norm and hence in probability.
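The last inequality in the preceding argument can be spelled out in one line. The display below only unpacks the step stated in the text, using the definition of δ c and the inequality between the square root of m and ln m for m > 1; nothing beyond those two facts is assumed.

```latex
\[
|\delta_c| = |c|\, m^{-1/2} \le \frac{|c|}{\ln m}
\quad\Longrightarrow\quad
m^{2|\delta_c|} \le m^{2|c|/\ln m}
= \exp\!\Big(\frac{2|c|}{\ln m}\,\ln m\Big) = e^{2|c|}.
\]
```

The factor involving δ c is therefore bounded uniformly in m, which is what allows it to be absorbed into the order of magnitude of the term.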
The denominator of $t^*_m(\delta_0)$ in (A.1) may be analyzed in the same way to establish the result that $m^{-1}\sum_{j=1}^{m}\lambda_j^{2\delta} j^{-2\delta_c}\, 2\pi I_u(\lambda_j) \to_p G$. The claim of Theorem 1 under Assumption 1 follows by combining these results.
Next, we prove the theorem under Assumption 2. Instead of the bound (A.5) from (4.8) of Robinson (1995b), we let $\alpha_T(\lambda) := (1 - e^{i\lambda})^{-(\delta_0+\delta_c)}$ and use Lemma 4 of Shao and Wu (2007a), where it is shown that, under Assumption 2,

where the last equality follows by using bounds for the low-frequency approximation of the ratio of $f(\lambda_j)$ to $G$, see Assumption 2(iii), and of $|\alpha_T(\lambda_j)|^{2}$ to $\lambda_j^{-2\delta}$ as in Robinson (1995b).
For the leading term in (A.8), we let $b_j := \nu_j|\alpha_c(\lambda_j)|^{2}$ (with slight abuse of notation), and rewrite it as

As in the analysis of (A.2) it holds that (A.9) is $o_p(1)$ using (A.7). The term (A.10) is asymptotically normal as shown in Shao and Wu (2007a). As in the previous case, the same arguments also give $m^{-1}\sum_{j=1}^{m}\lambda_j^{2\delta}\, 2\pi I_u(\lambda_j) \to_p G$, and the claim of Theorem 1 under Assumption 2 follows by combining these results.

A.2. Proof of Theorem 2
Proof of Lemma 1. Given that δ 0 < 1/2, for m large enough there is δ = 1/2 − such that δ 0 < δ and δ < δ. Using a mean value theorem expansion, where |δ mvt − δ 0 | < |δ − δ 0 | and k is an integer to be chosen. From the fractional FCLT, the first term on the right-hand side of (A.11) satisfies in the Skorohod metric (see, e.g., Hosoya 2005;Wu and Shao 2006). Moreover, because the jumps in the partial sums take place at fixed points in time, and the limit W(τ ; δ) is a.s. continuous, the weak convergence also takes place in the uniform metric.
By the same argument, including a slowly varying function, it follows that 1 ln(T) ; in both cases uniformly in τ . The k − 2 remaining terms in the expansion of −δ η t in (A.11) can be analyzed the same way.
For the last term on the right-hand side of (A.11), notice that where we recall that f (λ) is the spectral density of η t , which is bounded, uniformly in λ, under either Assumption 3 or Assumption 4. Then, by the Cauchy-Schwarz inequality, and note that this is uniform in τ . So, upon choosing k finite but sufficiently large, T −(1/2+δ 0 ) m −k/2 T → 0 by Assumption 3(iv) or Assumption 4(iv), and consequently Combining these arguments we obtain the desired result.
In what follows, results for stochastic functionals of τ are to be considered as uniform in τ , unless otherwise specified. We omit the reference to uniformity in τ for brevity.
We divide the remainder of the proof into three parts for readability. Recall that k * is the true number of breaks, that is, the number of nonzero elements of β 2 .

A.2.1. Proof of Theorem 2 When k > k * = 0
In this case $\beta_{2,i} = 0$ for all $i = 1, \ldots, k$. To lighten the notation, we give the proof for the case with $k = 1$ and $k^* = 0$; see also the notation in Remark 4.3. The proof for the general case is the same, but with vectors and matrices replacing scalar quantities. Thus, we prove that, when $\beta_2 = 0$, $t_m(\delta_0; \tau) - t^*_m(\delta_0) = o_p(1)$ uniformly in $\tau$. It is sufficient to show that

We give only the proof of (A.12). The proof of (A.13) is almost identical, leaving out the factor $\nu_j$ and noting the different normalization.