The asymptotic behaviour of the residual sum of squares in models with multiple break points

ABSTRACT Models with multiple discrete breaks in parameters are usually estimated via least squares. This paper first derives the asymptotic expectation of the residual sum of squares and shows that the number of estimated break points and the number of regression parameters affect the expectation differently. Second, we propose a statistic for testing the joint hypothesis that the breaks occur at specified points in the sample. Our analytical results cover models estimated by ordinary, nonlinear, and two-stage least squares. An application to U.S. monetary policy rejects the assumption that breaks are associated with changes in the chair of the Fed.


Introduction
There is a considerable literature in econometrics on least squares-based estimation and testing in models with discrete breaks in the parameters. The seminal paper by Bai and Perron (1998) developed a framework for estimation and inference in linear regression models estimated via ordinary least squares (OLS) that has served as the template for similar frameworks in more general models, including systems of linear regression models (Perron and Qu, 2006), linear models with endogenous regressors estimated via two-stage least squares (2SLS; Hall et al., 2012), and nonlinear regression models estimated by nonlinear least squares (NLS; Boldea and Hall, 2013).
Within these models, the key parameters of interest are those indexing the breaks (the break fractions) and the regime-specific coefficients. If the model in question is assumed to have m breaks, then these key parameters are estimated by minimizing the residual sum of squares over all possible data partitions involving m breaks. The asymptotic analysis then focuses on establishing the consistency of and a limiting distribution theory for these parameters, and also on the development of a limiting distribution theory for statistics relating to the number of breaks. However, relatively little attention has been paid to the minimized residual sum of squares per se, despite its key role in inference for these models.
The first study to examine analytically the consequences of coefficient break point estimation on the residual sum of squares appears to be Ninomiya (2005), who considers breaks in the mean of a Gaussian process with inference on the number of breaks conducted through the Akaike information criterion (AIC) viewed as the bias-corrected maximum log-likelihood estimator. Ninomiya (2005) finds that the required bias implies that estimation of each break fraction parameter has an impact on the maximized log likelihood equivalent to the estimation of three mean parameters. Kurozumi and Tuvaandorj (2011) extend Ninomiya's (2005) analysis to systems of linear regressions with exogenous regressors and heteroscedasticity. Within this general case, the implications for the relative impacts of the parameter estimators and break estimators are less easily summarized. However, if their results are specialized to the case of a single regression equation with homoscedastic errors, then their analysis reveals the same conclusions as Ninomiya's (2005) regarding the relative impacts of the estimated parameters and estimated breaks on AIC.
The paper makes three contributions. First, we derive the asymptotic expectation of the residual sum of squares in models with breaks in the coefficients at unknown dates. For linear or nonlinear regression models with exogenous regressors, this expectation depends on the numbers of estimated break points and estimated mean parameters, with the former having a weight of three relative to each mean parameter. For the linear model, this finding reproduces that of Kurozumi and Tuvaandorj (2011), but the extension to nonlinear models is new. In addition, our derivation is different from that of Kurozumi and Tuvaandorj (2011) and is based on a decomposition of the residual sum of squares that, we believe, provides interesting insights into the derived result. Although the expression is more complicated in linear models estimated via 2SLS, the principal result, namely, that each estimated break date has the same impact on the expectation as three estimated mean parameters, carries over to this context. Second, we propose a statistic for testing the joint hypothesis that the breaks occur at specified points in the sample. Under its null hypothesis, this statistic is shown to have a limiting distribution that is non-standard but, under certain assumptions, asymptotically pivotal after normalization; percentiles are provided for this limiting distribution. Although the same distribution is obtained by Hansen (2000) (see also Hansen, 1997) in the context of testing the location of the single threshold in a threshold autoregressive (TAR) model, no joint test appears to have been proposed previously in the literature. This statistic can be used to construct confidence sets for the breaks. This issue has recently received some attention in linear models; e.g., see Elliott and Mueller (2007), Eo and Morley (2015), and Chang and Perron (2015). Unlike these previous methods, our approach treats the breaks jointly rather than constructing individual intervals for each break.
Our third contribution is to examine breaks in U.S. monetary policy, for which we shed new light on the common assumption that Volcker taking over as Fed chair marked an immediate policy change (Clarida et al., 2000).
An outline of the remainder of the paper is as follows. Section 2 obtains the asymptotic expectation of the minimized residual sum of squares for regression models with exogenous regressors. Section 3 then examines the case of a model with endogenous regressors estimated via 2SLS, where the reduced form may be either stable (with no breaks) or unstable and subject to breaks that need not coincide with those of the structural form. Section 4 proposes our joint test for the hypothesis that breaks occur at certain prespecified points in the sample, discusses its use to construct joint confidence sets, and evaluates the properties of the test via a simulation study. Section 5 examines breaks in U.S. monetary policy, while Section 6 concludes. All proofs are relegated to a mathematical appendix.

RSS with exogenous regressors
Our analysis of the asymptotic expectation of the residual sum of squares covers both linear and nonlinear regression models estimated by least squares. However, since the assumptions differ in some important ways, it is convenient to treat the two cases separately. Although the linear case is already covered by Kurozumi and Tuvaandorj (2011), it is pedagogically convenient to begin by first developing our results in that context for the following reasons. First, our presentation is different from that of Kurozumi and Tuvaandorj (2011) as it involves a decomposition of the residual sum of squares that we believe provides an interesting insight into why the asymptotic expectation takes the form it does in that model. Second, we apply a similar decomposition in all three models considered here and so an explicit presentation of the result for the linear model with exogenous regressors serves to underscore for the reader the common underlying structure that is present in all three cases. Third, this common structure is exploited to introduce new tests in Section 4 that can be applied in models estimated by OLS, NLS, and 2SLS, and so it is convenient to highlight how this structure is present in all three cases from the outset. Finally, as the linear model with exogenous regressors is the simplest (and a leading) case, it provides the most convenient framework in which to introduce the results.

Linear models
Consider the case in which the equation of interest is a linear regression model exhibiting m breaks, such that

$$y_t = x_t'\beta_i^0 + u_t, \qquad i = 1, \ldots, m+1, \quad t = T_{i-1}^0 + 1, \ldots, T_i^0, \tag{1}$$

with $T_0^0 = 0$ and $T_{m+1}^0 = T$, where T is the total sample size. Thus, $y_t$ is the dependent variable, while $x_t$ is a $p \times 1$ vector of exogenous explanatory variables that typically includes the constant term, and $u_t$ is a mean zero error. As is usual in the literature, we require the true break points to be asymptotically distinct.
Suppose now that a researcher knows the number of breaks but not their location(s). We use $\lambda$ to denote an arbitrary set of m break fractions, with $\lambda = [\lambda_1, \ldots, \lambda_m]'$ and $0 < \lambda_1 < \ldots < \lambda_m < 1$, $\lambda_0 = 0$, and $\lambda_{m+1} = 1$. In order to minimize the overall residual sum of squares, the researcher estimates the regression model for each possible unique m-partition of the sample, where $T_i = [\lambda_i T]$, $[\cdot]$ denotes the integer part of the quantity in brackets, and $e_t^*$ is an error term. This is embodied in the following assumption: The parameter $\epsilon$ is known as the trimming parameter. Assumption 2 requires that each segment considered contains sufficient observations for estimation of the model with finite T, while containing a positive fraction of the sample asymptotically. The second part of this restriction is motivated by the requirement that large sample statistical theory can be applied to deduce the limiting behavior of the relevant statistics in every subsample considered. In applications with macroeconomic data, $\epsilon$ is typically set equal to 0.2, 0.15, or 0.10, with this choice justified by simulation studies calibrated to the sample sizes and time series properties of macroeconomic data (for example, see Bai and Perron, 2006). However, outside these settings, other values may be appropriate. Judged by the reliability of asymptotic inference, Bai and Perron (2006) provide simulation evidence that $\epsilon$ can be smaller when data are independently and identically distributed than when the data are heteroscedastic and/or serially correlated. Bai and Perron (2006) also suggest that the trimming parameter can be reduced as the sample size increases. Such a scheme might be appropriate with high-frequency data. (We thank a referee for drawing our attention to this issue.) However, the appropriate choice of $\epsilon$ in this context is potentially complicated by the fact that as T increases, it seems desirable to allow for the number of break points also to increase. Killick et al. (2012) provide an algorithm for break point selection that allows the number of possible breaks to increase with the sample size, but they do not provide any formal guidance on the choice of $\epsilon$. Thus, while clearly an important issue, it is to our knowledge an open question as to how the trimming parameter should be chosen as the sample size increases.
The estimates of $\beta^* = (\beta_1^{*\prime}, \ldots, \beta_{m+1}^{*\prime})'$ are obtained by minimizing the sum of squared residuals with respect to $\beta = (\beta_1', \ldots, \beta_{m+1}')'$. We denote these estimators by $\hat{\beta}(\{T_i\}_{i=1}^m)$, with $\hat{\beta}_j$ being the associated estimator of $\beta_j^*$ relating to segment j. The estimators of the break points, $(\hat{T}_1, \ldots, \hat{T}_m)$, are then defined as the partition that minimizes this sum of squared residuals, where the minimization is taken over all possible partitions, $(T_1, \ldots, T_m)$, and the associated minimized residual sum of squares is denoted as $RSS(\hat{T}_1, \ldots, \hat{T}_m)$; $\hat{\beta}(\{\hat{T}_i\}_{i=1}^m)$ are then the regression parameter estimates associated with the estimated partition. The estimated break fractions are collected in $\hat{\lambda}$, the $m \times 1$ vector with j-th element $\hat{T}_j/T$. Bai (1997) and Bai and Perron (1998) derive the large sample behaviors of $\hat{\lambda}$ and $\hat{\beta}(\{\hat{T}_i\}_{i=1}^m)$, together with various tests for parameter variation that arise naturally in this context.
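The search over partitions described above can be sketched in a few lines. This is a minimal brute-force illustration under stated assumptions, not the algorithm used in the paper: the function names `segment_rss` and `estimate_breaks`, the default `eps=0.15`, and the minimum segment length `max(floor(eps*T), p)` are all hypothetical choices made for this example, and a break date `T_i` is taken to mean that regime i ends after observation `T_i`.

```python
import itertools
import numpy as np

def segment_rss(y, X, start, end):
    """RSS from an OLS fit of y on X using observations start..end-1."""
    beta, *_ = np.linalg.lstsq(X[start:end], y[start:end], rcond=None)
    resid = y[start:end] - X[start:end] @ beta
    return float(resid @ resid)

def estimate_breaks(y, X, m, eps=0.15):
    """Enumerate all admissible m-partitions and return the one with minimal RSS.

    Assumption of this sketch: every segment must contain at least
    h = max(floor(eps * T), p) observations (the role played by trimming).
    """
    T, p = X.shape
    h = max(int(np.floor(eps * T)), p)
    best_rss, best_breaks = np.inf, None
    for breaks in itertools.combinations(range(h, T - h + 1), m):
        bounds = (0, *breaks, T)
        segments = list(zip(bounds[:-1], bounds[1:]))
        if any(end - start < h for start, end in segments):
            continue  # partition violates the trimming restriction
        rss = sum(segment_rss(y, X, start, end) for start, end in segments)
        if rss < best_rss:
            best_rss, best_breaks = rss, breaks
    return best_breaks, best_rss
```

Full enumeration costs on the order of $T^m$ regressions, so it is only practical for small m; it is shown here because it mirrors the definition of the estimator directly.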
Our focus is the large sample behavior of the minimized residual sum of squares. To this end, consider the asymptotic expectation of the bias term $\xi_T$. Hence $\xi_T$, defined by (5), is the difference between the (minimized) residual sum of squares in (3) and the expected error sum of squares, $T\sigma^2 = E[\sum_{t=1}^{T} u_t^2]$, in the data generating process (DGP) of (1). The bias term of (5) arises from (i) estimating the unknown break dates, (ii) estimating the regime-specific coefficients of (1), and (iii) the random disturbances. Reflecting these, we decompose $\xi_T$ into three components, $\xi_T = \xi_{1,T} + \xi_{2,T} + \xi_{3,T}$. The first component, $\xi_{1,T}$, represents the effect on the residual sum of squares from using the estimated rather than the true break dates. The second component, $\xi_{2,T}$, is defined in terms of $ESS(T_1^0, \ldots, T_m^0)$, the error sum of squares for (1) evaluated using the true $\{\beta_i^0\}_{i=1}^{m+1}$. Hence $\xi_{2,T}$ is the impact on the residual sum of squares from estimating the coefficients of (1) with known (true) break dates. The final component, $\xi_{3,T}$, captures the effects of the specific random disturbance sequence $\{u_t\}$. Previous structural break analyses, including Bai (1997) and Bai and Perron (1998), separately consider the roles of break date and coefficient estimation. Therefore, (8) can be viewed as explicitly recognizing a decomposition that has previously been implicit in the literature.
Let AE[·] denote the asymptotic expectation of the term in brackets. To derive $AE[\xi_T]$, we make the following assumption about the magnitudes of the breaks:
Assumption 3 is the so-called "shrinking breaks" case, which is designed to capture the situation in which there is uncertainty about the location of the breaks in moderate-sized samples. This assumption, with breaks restricted to shrink at a slower rate than $T^{-1/2}$, is commonly employed in the literature to deduce a limiting distribution for break-point estimators; see Bai (1997) and Bai and Perron (1998).
Assumptions are also imposed about the regressors and errors, as follows.
is uniformly positive definite for all T sufficiently large, and
Assumption 6. There exists an $l_0 > 0$ such that for all $l > l_0$, the minimum eigenvalues of $A_{il}$ = (1/l)
Assumption 4 limits the behavior of the regressor cross product matrix and rules out trending regressors but allows regime-specific behavior. As noted by Qu and Perron (2007, p. 471), an assumption of this type is introduced with shrinking breaks, so that the limiting distribution of the break date estimator does not depend on the distribution of $u_t$ in (1). The role of Assumption 5 is to limit the dependence structure of $\{x_t u_t\}$ and $\{u_t\}$. In particular, parts (i)-(iii) of Assumption 5 ensure that $\{x_t u_t\}$ is short memory and satisfies a functional central limit theorem within each regime (White, 2001, Theorem 7.19). Parts (iv) and (v) concern $\{u_t\}$, and the parts are stated separately since they are relaxed in some parts of our analysis. Qu and Perron (2007, p. 466) discuss models which satisfy assumptions of this type. Two prominent time series examples relevant to our case are (a) a regression model with (nontrending) exogenous variables whose properties are time-invariant within each regime and with short memory disturbances, such as a stationary ARMA-GARCH process, and (b) an individual equation from a first-order VAR system where coefficient breaks occur simultaneously across equations, with the roots of each regime-specific characteristic polynomial lying outside the unit circle. In both cases, relaxing part (v) would allow the disturbance parameters to change at the break dates. Finally, Assumption 6 requires there be enough observations near the true break points, so that they can be identified, and is analogous to the extension proposed in Bai and Perron (1998) to their Assumption A2.
The component $\xi_{1,T}$ is the focus of much of our analysis. This is closely related to the asymptotic distribution of the estimator for the location of a single break point obtained, under an assumption of a "shrinking" or "small" break, by Yao (1987) for the mean of an i.i.d. process and very recently extended to more general linear and nonlinear univariate time series models by Ling (2015). Bai (1997) examines the break point estimator in a regression model, with Hansen (1997, 2000) considering the analogous case of threshold estimation in a single threshold TAR model, while multiple breaks are studied in Bai and Perron (1998). Lemmata 1-3 are stated in the Appendix to Kurozumi and Tuvaandorj (2011).
Lemma 1. Under Assumptions 1, 2, 3, 4, 5(i)-(iii), and 6, there exist positive constants $K_i$, $i = 1, \ldots, m$, such that for large T, in which $W_{i,j}(\cdot)$ ($i = 1, \ldots, m$, $j = 1, 2$) are independent Brownian motions on $[0, \infty)$ and
Clearly, minimization of $G_i(k_i)$ is equivalent to maximization of $-G_i(k_i)$, namely, the maximum of two independent Brownian motion processes with negative drifts. The following lemmata and definition provide distributional results relating to this maximum.
Definition 1. Let $B(\mu_1, \mu_2)$ denote the distribution with CDF
Lemma 2, which is stated in Bai (1997, p. 563) and, for $\gamma = 1$, in Stryhn (1996, Proposition 1), makes clear that the maximum value taken by an individual Brownian motion process with negative drift follows an exponential distribution. Our notation for the distribution of the maximum of two independent processes is given by (16). The result in (17), which is key to our analysis, follows from the mean of an exponential distribution and is stated in Kurozumi and Tuvaandorj (2011, p. 221).
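As a worked illustration of the exponential result referred to above, the following uses generic notation that is an assumption of this sketch (a standard Brownian motion $W$, a drift $\mu > 0$, and exponential rates $a$ and $b$) rather than the paper's own $\gamma$ and $\mu_{i,j}$:

$$
\begin{aligned}
P\Big(\sup_{s \ge 0}\{W(s) - \mu s\} > x\Big) &= e^{-2\mu x}, \quad x \ge 0, \qquad \text{so the supremum is exponential with mean } \tfrac{1}{2\mu}; \\
E[\max(E_a, E_b)] &= \frac{1}{a} + \frac{1}{b} - \frac{1}{a+b} \qquad \text{for independent } E_a \sim \mathrm{Exp}(a),\ E_b \sim \mathrm{Exp}(b); \\
E[\max(E_a, E_b)] &= 1 + 1 - \tfrac{1}{2} = \tfrac{3}{2} \qquad \text{when } a = b = 1 \ (\text{i.e., } \mu = \tfrac{1}{2}).
\end{aligned}
$$

The second line follows because the minimum of the two exponentials is itself exponential with rate $a + b$. How these generic constants map to the scaling used in (16) and (17) depends on definitions that are not reproduced here.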
Although not stated in this form, Ninomiya (2005)
Remark 2. A comparison of $AE[\xi_{1,T}]$ and $AE[\xi_{2,T}]$ indicates that the break parameters and the regression parameters affect $AE[\xi_T]$ differently. Proposition 1(i) shows that the bias due to estimation of an additional break date increases in absolute value by $3\sigma^2$. From Proposition 1(ii), estimation of the regression parameters in the additional regime increases the asymptotic bias in absolute value by $p\sigma^2$ (with p the number of regression coefficients in the additional regime). As noted by Ninomiya (2005), this can be interpreted as implying estimation of the break fraction has thrice the impact of estimation of a regression parameter on the bias, providing a theoretical motivation for the modified information criteria penalty function proposed by Hall et al. (2013) in the context of structural break estimation.
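Reading the increments in Remark 2 additively gives a simple count of the total bias. The aggregation below is an inference from the per-break and per-regime increments quoted above, and it additionally assumes that the disturbance component $\xi_{3,T}$ has zero asymptotic expectation; it is a sketch, not a restatement of Proposition 1:

$$
AE[\xi_T] = \underbrace{-3m\sigma^2}_{m \text{ break dates}} \; \underbrace{-\, p(m+1)\sigma^2}_{(m+1) \text{ regimes of } p \text{ coefficients}} = -\big[3m + p(m+1)\big]\sigma^2 .
$$

For example, with $m = 2$ breaks and $p = 3$ regressors, the count is $3 \cdot 2 + 3 \cdot 3 = 15$, so the minimized residual sum of squares would fall short of $T\sigma^2$ by about $15\sigma^2$ on average in large samples, of which $6\sigma^2$ is attributable to break date estimation.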

Nonlinear models
Analogously to (1), consider a univariate nonlinear model with m unknown breaks, (18), where $f: \mathbb{R}^q \times B \to \mathbb{R}$ is a known measurable function on $\mathbb{R}$ for each $\beta \in B$. For simplicity, let $f_t(\beta) = f(x_t, \beta)$. To avoid excessive notation, redefine the estimators and residual sum of squares analogously to Section 2.1, replacing $x_t'\beta_i$ by $f_t(\beta_i)$ in (3). Compared with the OLS case, the consistency and large sample distribution of $\hat{\lambda}$ and $\hat{\beta}(\{\hat{T}_i\}_{i=1}^m)$ have been established to date in the NLS setting only under more restrictive conditions on the dynamic structure of the data and also the rate of shrinkage between regimes; see Boldea and Hall (2013, Assumptions 2-8). These additional restrictions arise because of the inherent nonlinearity of the model; see Boldea and Hall (2013) for further discussion. We impose these conditions but, for brevity, relegate some to the Appendix. In addition to (18) replacing (2), Assumption 3 is modified, so that $\alpha \in [0.25, 0.5)$, and analogues are required for Assumptions 4 [with $x_t$ replaced by $F_t(\beta^0) = \partial f_t(\beta)/\partial\beta\,|_{\beta = \beta^0}$] and 5 [with $h_t$ replaced by $u_t F_t(\beta^0)$]. We note that these assumptions cover a range of models including nonlinear AR, smooth transition autoregressive, and nonlinear ARCH. Boldea and Hall (2013, pp. 160-161) provide a detailed discussion.
Remark 3. Theorem 1 reveals that $AE[\xi_T]$ does not depend on the form of $f(\cdot)$, beyond that embodied in the assumptions. Consequently, Remark 2 continues to apply in the nonlinear context.

Two stage least squares RSS
Now we consider the case in which the equation of interest is a structural relationship from a simultaneous system, with this equation exhibiting m breaks such that

$$y_t = x_t'\beta_{x,i}^0 + z_{1,t}'\beta_{z,i}^0 + u_t, \qquad i = 1, \ldots, m+1, \quad t = T_{i-1}^0 + 1, \ldots, T_i^0, \tag{19}$$

where $T_0^0 = 0$, $T_{m+1}^0 = T$, and T is the total sample size. Here $x_t$ is a $p_1 \times 1$ vector of endogenous explanatory variables, $z_{1,t}$ is a $p_2 \times 1$ vector of exogenous variables including the intercept, and $u_t$ is a mean zero error. We define $p = p_1 + p_2$. As in the previous section, we assume the location and magnitude of the breaks are governed by Assumptions 1 and 3, respectively.
As (19) is a structural equation, the endogenous explanatory variables, $x_t$, are (in general) correlated with the errors, $u_t$, and so 2SLS requires a reduced form representation to be estimated using appropriate instruments. The reduced form is discussed in the first subsection below, before attention is focussed on (19). It should be noted that the analysis of this section assumes strong instruments; some comments are made in Section 6 about extending the analysis to the case of weak instruments.

Reduced form model
The reduced form model is where $T_0^\dagger = 0$ and $T_{h+1}^\dagger = T$. The vector $z_t = (z_{1,t}', z_{2,t}')'$ is $q \times 1$ and contains variables that are uncorrelated with both $u_t$ and $v_t$ and are appropriate instruments for $x_t$ in the first stage of the 2SLS estimation. The reduced form parameter matrices for each regime are each $q \times p_1$. In line with Section 2, the number of reduced form breaks, h, is assumed known, but with the break points $\{T_i^\dagger\}$ unknown.
The reduced form of (20) can be rewritten in terms of regime indicators, where $I\{\cdot\}$ is an indicator variable that takes the value one if the event in the curly brackets occurs. Let $\hat{\pi} = [\hat{\pi}_1, \ldots, \hat{\pi}_h]'$ denote estimators of $\pi^0$. These estimators are not our prime concern and it is assumed that they satisfy the following condition.
This condition would be satisfied if, for example, the break dates in the reduced form are estimated by OLS equation by equation and the estimates of the break fractions are then pooled; see Bai and Perron (1998, Proposition 5) and Bai (1997, Proposition 1). Notice that under our Assumption 8, $1 - 2\alpha_r > 0$ and $\hat{\pi}$ is consistent for $\pi^0$. Let $\hat{x}_t$ denote the resulting fitted values, given by (22), where $\tilde{z}_t(\hat{\pi})$ is defined analogously to $\tilde{z}_t(\pi^0)$. In the special case when the reduced form is stable, (20) is replaced by a model with a single regime (h = 0), while Assumptions 7 and 8 are redundant. Obviously, (22) then becomes the corresponding OLS expression for $\hat{x}_t'$.
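The first-stage fitted values can be sketched as follows. This is an illustration under stated assumptions rather than the paper's expression (22): the function name `first_stage_fitted` and the argument `rf_breaks` (reduced form break dates, taken as given) are hypothetical, and each reduced form regime is simply fitted by OLS on its own observations. With `rf_breaks=()` the sketch reduces to the usual OLS projection for a stable reduced form.

```python
import numpy as np

def first_stage_fitted(X, Z, rf_breaks=()):
    """Regime-by-regime OLS fitted values of X (T x p1) on instruments Z (T x q).

    rf_breaks holds the (assumed given) reduced form break dates; a date b
    means that one regime ends after observation b. An empty tuple
    corresponds to a stable reduced form (h = 0).
    """
    T = Z.shape[0]
    bounds = (0, *rf_breaks, T)
    X_hat = np.empty_like(X, dtype=float)
    for start, end in zip(bounds[:-1], bounds[1:]):
        # q x p1 coefficient matrix for this reduced form regime
        coef, *_ = np.linalg.lstsq(Z[start:end], X[start:end], rcond=None)
        X_hat[start:end] = Z[start:end] @ coef
    return X_hat
```

In the second stage, these fitted values would replace the endogenous regressors before the residual sum of squares is minimized over structural form partitions.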

Structural form RSS
For estimation of (19), the statistic of interest is the minimized residual sum of squares from the second-stage estimation. Now suppose that a researcher knows the number of breaks in (19) but not their locations. As in the previous section, we use $\lambda$ to denote an arbitrary set of m break fractions in the model of interest. The second stage of 2SLS can begin with the estimation via OLS of (23) for each possible unique m-partition of the sample, where $T_i = [\lambda_i T]$ and $u_t^*$ is an error term. With $\beta_i^*$ defined for a given partition, estimation proceeds by minimizing the residual sum of squares as discussed in Section 2, leading to the 2SLS estimates. Given the existence of breaks in both structural and reduced form equations, we modify the definition of admissible partitions over which the minimization is achieved.

Assumption 10. Equation (23) is estimated over all partitions
The generalization in Assumption 10 implies that the search for structural form breaks not only covers the relevant structural form intervals but is also conducted in all intervals between (true) reduced form breaks. However, when the reduced form is stable, this latter requirement is redundant. For ease of presentation, the following assumptions also redefine some notation used in Section 2.
, $I_a$ denotes the $a \times a$ identity matrix and $0_{a \times b}$ is the $a \times b$ null matrix.
Assumption 13. There exists an $l_0 > 0$ such that for all $l > l_0$, the minimum eigenvalues of $A_{il}$ = (1/l)
Assumption 11 requires $h_{1,t}$ to be a conditionally homoscedastic martingale difference sequence, and imposes sufficient conditions to ensure the analogue of $T^{-1/2}\sum_{t=1}^{[Tr]} h_t$ satisfies a functional central limit theorem within each regime (see White, 2001, Theorem 7.19). It also contains the restrictions that the implicit population moment condition for 2SLS is valid, i.e., $E[z_t u_t] = 0$, and that the conditional mean of the reduced form is correctly specified. Assumptions 11 and 14 combined imply that $V_i = V$, where $V$ is a Kronecker product with second factor $Q_{ZZ}$. Assumptions 12 and 14, in conjunction with Assumption 11, imply the standard rank condition for identification in IV estimation of the linear regression model. Note that Assumption 12 implies $q \ge p$. Assumption 13 requires there be enough observations near the true break points of the structural equation, so that they can be identified.
To facilitate the analysis below, we introduce an alternative version of the structural equation, (24), in which $\tilde{u}_{t,i}$ of (25) is the composite disturbance that applies in (19) for regime i when the endogenous $x_t$ are substituted by $E[x_t \mid z_t]$ from the reduced form. Therefore, (24) applies when the reduced form coefficients are known, with $\tilde{x}_t = E[x_t \mid z_t]$ embodying the true reduced form regimes when those coefficients are subject to breaks. Also define
Applying Assumption 3 to the coefficient vector $\beta_i^0 = (\beta_{x,i}^{0\prime}, \beta_{z,i}^{0\prime})'$, breaks in the structural form coefficients are of asymptotically negligible magnitude, with $\beta_{x,i}^0 \to \beta_x^0$, say, for all $i = 1, \ldots, m+1$. Under this assumption, then we have for all $i = 1, \ldots, m+1$
With known reduced form coefficients, the quantity $\rho^2$ provides the asymptotic variance of the composite structural form disturbance $\tilde{u}_{t,i}$ of (25) with shrinking coefficients. Therefore, $T\rho^2$ plays an analogous role in our analysis of the residual sum of squares for 2SLS as does $T\sigma^2$ for the OLS case.
in which AE[·] again denotes the asymptotic expectation operator. Hence $\xi_T$ is the difference between the residual sum of squares in the second step of 2SLS and the expected error sum of squares in (24).
Generalizing the approach of Section 2 to the 2SLS case requires the role of the reduced form to be recognized and we now decompose $\xi_T$ into four components, $\xi_T = \xi_{1,T} + \xi_{2,T} + \xi_{3,T} + \xi_{4,T}$. The first component, $\xi_{1,T}$, represents the effect on the second-stage residual sum of squares from estimating the coefficients of (19) within each structural form partition based on the estimated rather than the true break dates in both the structural equation and (if relevant) the reduced form. Both elements of (31) are obtained using $\hat{x}_t$ from (22). The second component, $\xi_{2,T}$, is defined in terms of $ESS(T_1^0, \ldots, T_m^0)$, the error sum of squares for (19) evaluated using the true $\{\beta_i^0\}_{i=1}^{m+1}$ in conjunction with $\hat{x}_t$. Hence $\xi_{2,T}$ is the impact on the residual sum of squares from estimating the coefficients of (23) with known (true) break dates and evaluated using the first-stage $\hat{x}_t$ with true break dates. The third component, $\xi_{3,T}$, is defined in terms of $ESS_e(T_1^0, \ldots, T_m^0)$, the error sum of squares evaluated using the true coefficients. Consequently $\xi_{3,T}$ is the effect from using $\hat{x}_t$ rather than $\tilde{x}_t = E[x_t \mid z_t]$ for computation of the structural equation error sum of squares. The final component, $\xi_{4,T}$, captures the effects of the composite disturbance $\tilde{u}_{t,i}$ in the structural equation of (24). Theorem 2 then generalizes the result of Proposition 1 to the 2SLS case, employing the notation $\delta\lambda_i^0$ for the true structural form regime fractions, with $\lambda_0^0 = 0$ and $\lambda_{m+1}^0 = 1$; $\delta\pi_i^0$ ($i = 1, \ldots, h+1$) is defined analogously for the true reduced form regime fractions.
Theorem 2. Let $y_t$ be generated by (19), $x_t$ be generated by (20), and $\hat{x}_t$ be given by (22). Let Assumptions 1, 3, 7-14 hold. Then we have: in which $d_i$ is defined as follows: one expression applies if there are no reduced form breaks between $\lambda_{i-1}^0$ and $\lambda_i^0$, and another applies if there are reduced form breaks between $\lambda_{i-1}^0$ and $\lambda_i^0$.
Remark 4. Theorem 2 indicates that $AE[\xi_T]$ depends on: the number of structural form breaks, m; the number of mean parameters in each regime, p; the number of instruments, q; the covariance structure of the composite error $\tilde{u}_{t,i}$ through $(\rho^2 - \sigma^2) = 2\gamma'\beta_x^0 + \beta_x^{0\prime}\beta_x^0$; and also on the relative locations of the structural and reduced form breaks.
Remark 5. The expression for $AE[\xi_{1,T}]$ carries over from Proposition 1 and Theorem 1, and so the effect of estimating the break dates on the residual sum of squares of interest is asymptotically the same irrespective of whether the model is a linear or nonlinear equation with exogenous regressors or a linear equation with endogenous regressors and consistently estimated reduced form break dates. We also note that Lemma 3 underlies this result in all cases.
Remark 6. Theorem 2(i) does not require Assumption 14(ii), and so $AE[\xi_{1,T}]$ has the stated form even if the instrument cross product matrix exhibits the regime-specific behavior delineated in part (i) of that assumption.
The special case of a stable reduced form is of particular interest. Using the definition of $d_i$ for the case of no reduced form breaks in the structural form regime i, it immediately follows that a stable reduced form implies
The resulting asymptotic expectation of the residual sum of squares in the second-stage regression is stated as a Corollary to Theorem 2:
Corollary 1. Let $y_t$ be generated by (19), with $x_t$ generated by (20) and $\hat{x}_t$ given by (22), both with h = 0. Let Assumptions 1-3, 9, and 11-14 hold. Then we have:
Remark 7. With a stable reduced form, the expression for $AE[\xi_{2,T}]$ in Corollary 1 can be written as $-p[(m+1)\rho^2 - (\rho^2 - \sigma^2)]$. Ignoring the second term, which is independent of m, the term $-(m+1)p\rho^2$ can be associated with estimation of the $(m+1)p$ structural form coefficients. Combined with $AE[\xi_{1,T}] = -3m\rho^2$, the comment in Remark 2 about the relative impacts of break-fraction and regression parameter estimation in models with exogenous regressors applies equally in models with endogenous regressors estimated via 2SLS with stable reduced forms. When the reduced form is unstable, however, this result is modified in that p enters the second term of $AE[\xi_{2,T}]$ in Theorem 2(ii).
Remark 8. Corollary 1 also clarifies the role of the reduced form in minimization of the 2SLS residual sum of squares in models with no breaks. When conventional 2SLS is applied to a stable structural form (m = 0, h = 0), (30) becomes $\xi_T = RSS - T\rho^2$ and
The result shows that the downward bias in the minimized 2SLS residual sum of squares compared with $E[u_t^2]$ depends not only on the number of structural form coefficients estimated, p, but also on the extent of overidentification $(q - p)$ and the additional asymptotic variation induced in the structural form by the use of IV estimation, namely, $E[\tilde{u}_t^2 - u_t^2] = (\rho^2 - \sigma^2)$. In this context where both the reduced forms and structural forms are stable, Pesaran and Smith (1994) propose a generalized $R^2$ criterion computed from the second-stage regression, and (36) makes clear that the value of this criterion will asymptotically depend on characteristics of the reduced form (including the number of instruments) as well as the goodness-of-fit of the structural form equation itself.
Remark 9. Two further special cases of Theorem 2 are of interest; in both only the numbers of breaks matter, not their locations per se. First, when all reduced form breaks coincide with structural form breaks, with possible additional structural form breaks, then $\sum_{i=1}^{m+1} d_i/(\delta\lambda_i^0) = h + 1$ (see the proof of Theorem 2 in the Appendix). In this case,
This expression has a similar interpretation to that drawn out in Remark 7, with the first term of (37) giving the bias due to estimation of the structural form coefficients and break dates, while the second shows the roles of the additional asymptotic variation from using IV, $(\rho^2 - \sigma^2)$, and the extent of overidentification $(q - p)$, with the number of reduced form regimes $(h + 1)$ now magnifying the latter effects. Second, when all structural form breaks coincide with the dates of reduced form breaks, with possible additional reduced form breaks, then $\sum_{i=1}^{m+1} d_i/(\delta\lambda_i^0) = m + 1$ (as again seen from the Appendix) and
This has a similar interpretation to (37), although overidentification in the second term of (38) appears in the form of a comparison of the total numbers of reduced and structural form coefficients estimated.
Remark 10. For the general case where reduced and structural form break dates do not necessarily coincide, the theorem shows that although $AE[\xi_T]$ depends on the relative locations of structural and reduced form break points, the extent of this dependence is bounded. Consequently, based on the interpretation of (37) and (38) in Remark 9, the quantity $q(h + 1) - p\sum_{i=1}^{m+1} d_i/(\delta\lambda_i^0)$ might be interpreted more generally as a measure of the extent of overidentification of the structural form parameters in the presence of structural and/or reduced form breaks.
Remark 11. Hall et al. (2015) propose an information criterion for break selection in models estimated by 2SLS in which the penalty function gives the number of breaks thrice the weight of the number of parameters. They report simulation evidence suggesting that this enhanced weighting of the breaks improves performance over a criterion that weights the estimated breaks and parameters equally in the penalty function.

Testing break dates
The discussion of Sections 2 and 3 notes that $AE[\xi_{1,T}]$ exhibits similar behavior in all the models considered, and this is due to the large sample behavior of $\xi_{1,T}$ being governed by a version of Lemma 1, and more specifically (12)-(13), in each case. The current section exploits this structure to propose a statistic for testing $H_0: \lambda_i^0 = \lambda_i$ for all $i = 1, \ldots, m$ (39), with $0 < \lambda_1 < \cdots < \lambda_m < 1$, against the alternative hypothesis that at least one $\lambda_i^0 \neq \lambda_i$ ($i = 1, \ldots, m$). In other words, we consider the situation where the researcher knows the number of breaks and wishes to test a joint hypothesis regarding their locations. Given the common structure underlying $AE[\xi_{1,T}]$, we consider the OLS case in some detail in the first subsection and then note (in Subsection 4.2) how the result extends to other models considered above.
Remark 12. The limiting distributions in Theorem 3 depend on model parameters. However, asymptotically valid inference can be performed by simulating the null distribution using consistent estimators of $\mu_{i,j}$ under $H_0$ and then comparing $N_\lambda(\lambda)$ to the appropriate percentile of this simulated distribution. A consistent estimator for $\mu_{i,j}$ is given by: This provides a heteroscedasticity-consistent estimator. If Assumption 5(iv) holds and homoscedasticity applies within each regime, then an alternative consistent estimator is μ 2 ℓ,t . Finally, if Assumption 5(v) holds and the error variance is constant over all regimes, an additional consistent estimator is $\hat{\mu}_{i,j} = 0.5\hat{\sigma}^2$ (43).

Remark 13. If Assumption 5(iv)-(v) holds, then it is possible to normalize the statistic to remove nuisance parameters from the limiting distribution. To this end consider the F-type test statistic This leads to the following corollary to Theorem 3:

Corollary 2. Under the conditions of Theorem 3, including Assumption
are mutually independent and b i ∼ B(0.5, 0.5).
Percentiles of this limiting distribution, simulated in MATLAB using 10 million replications, are presented in Table 1. Hansen (1997, 2000) develops a test of the null hypothesis of a known threshold value in a single threshold TAR model, with his statistic being a special case of $F_\lambda(\lambda)$ with $m = 1$.
The critical values presented by Hansen (1997, 2000) are effectively identical to those of Table 1 for $m = 1$.

Table 1. Critical values at the 10, 5, and 1% significance levels of the limiting distribution of $F_\lambda(\lambda)$ in Corollary 2, for models with $m$ breaks.
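The agreement with Hansen's critical values for $m = 1$ can be checked numerically. The following is a minimal sketch under stated assumptions, not the MATLAB code used for Table 1: it assumes that each $b_i$ has the distribution function $(1 - e^{-x/2})^2$ reported by Hansen (2000) for his threshold statistic (the maximum of two independent exponential variables with mean 2), and that the limit for $m$ breaks is the sum of $m$ independent copies. The function names and the number of replications are hypothetical choices.

```python
import numpy as np

def hansen_quantile(level):
    """Closed-form quantile of the CDF (1 - exp(-x/2))**2 (assumed m = 1 case)."""
    return -2.0 * np.log(1.0 - np.sqrt(level))

def simulate_critical_values(m, levels=(0.90, 0.95, 0.99), reps=200_000, seed=0):
    """Simulate quantiles of a sum of m independent draws (assumed limit form)."""
    rng = np.random.default_rng(seed)
    # Each draw is the maximum of two independent exponentials with mean 2,
    # which has distribution function (1 - exp(-x/2))**2.
    draws = rng.exponential(scale=2.0, size=(reps, m, 2)).max(axis=2)
    return np.quantile(draws.sum(axis=1), levels)

if __name__ == "__main__":
    print([round(float(hansen_quantile(p)), 2) for p in (0.90, 0.95, 0.99)])
    print(simulate_critical_values(2))
```

Under these assumptions the closed form gives approximately 5.94, 7.35, and 10.59 at the 10, 5, and 1% levels for $m = 1$, which are the values reported by Hansen (2000).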
The statistics above can be used to generate confidence sets for the break fractions. For the linear model with exogenous regressors, an approximate $100(1 - \alpha)\%$ confidence set for the break fractions is given by: where $N_\lambda(\lambda)$ is defined in (40) and $q_{m,1-\alpha}$ is the $100(1 - \alpha)$th quantile of $\sum_{i=1}^{m} b_i$ defined in Theorem 3. Clearly, with Assumptions 5(iv) and (v) imposed, the asymptotic critical values of Table 1 can be employed for $q_{m,1-\alpha}$.
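The inversion of the test into a confidence set can be illustrated for a single break in the mean of a series. This is a sketch under stated assumptions rather than the paper's procedure: the statistic is taken to be the difference between the RSS at the hypothesized date and the minimized RSS, scaled by a variance estimate whose degrees of freedom include a correction of 3 for the estimated break; the default critical value 7.35 is the 5% value of Hansen (2000) for $m = 1$; and the function names and the mean-shift model are hypothetical.

```python
import numpy as np

def rss_one_break(y, k):
    """RSS of a mean-shift model whose first regime ends at observation k."""
    a, b = y[:k], y[k:]
    return float(((a - a.mean()) ** 2).sum() + ((b - b.mean()) ** 2).sum())

def break_confidence_set(y, crit=7.35, eps=0.1):
    """Return the estimated break date and all dates not rejected by the test."""
    T = len(y)
    h = int(np.floor(eps * T))            # minimum regime length
    dates = np.arange(h, T - h + 1)       # admissible break dates
    rss = np.array([rss_one_break(y, k) for k in dates])
    k_hat = int(dates[rss.argmin()])
    # Assumed variance estimator: two regime means plus a correction of 3
    # for the single estimated break date.
    sigma2 = rss.min() / (T - 2 - 3)
    stat = (rss - rss.min()) / sigma2     # assumed F-type statistic
    return k_hat, dates[stat <= crit]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    y = np.concatenate([rng.normal(0.0, 1.0, 60), rng.normal(4.0, 1.0, 60)])
    k_hat, conf_set = break_confidence_set(y)
    print(k_hat, conf_set)
```

Because every admissible date is tested separately, the retained dates need not be contiguous.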
The limiting distribution of the multiple break date test statistic in Theorem 3 has not, to the best of our knowledge, been obtained in the previous literature. Nevertheless, under similar assumptions to ours, Yao (1987) and Bai (1997) obtain the marginal distribution of a single break fraction estimator. This special case of the distribution in the theorem is used by Bai (1997) and also Bai and Perron (1998) to construct a confidence interval for the date of each break. Since the $m$ break date distributions are asymptotically independent, a joint test of the null hypothesis (39) could be deduced from these. However, rather than using a confidence interval approach, (44) compares RSS at the hypothesized break dates with the overall minimized RSS, providing a natural test statistic in the least squares context considered here. In common with the approach of Elliott and Mueller (2007), but not that of Yao (1987), the confidence sets in (45) do not imply that the dates corresponding to a specific $\lambda_i$ are necessarily contiguous. However, unlike our methods, the results of Elliott and Mueller (2007) apply only to the one-break model. Eo and Morley (2015) recently propose a method for constructing confidence intervals for individual breaks in systems of linear models with exogenous regressors based on inverting a likelihood ratio statistic and show it yields narrower confidence intervals than those proposed by Bai (1997). It would be interesting to explore the joint approach proposed here within the context of likelihood ratio test-based inference, but this is beyond the scope of the current paper.

Other models
As shown in the Appendix, Lemma 1 continues to apply for nonlinear regression models that satisfy Assumptions 1, 2 with (18) replacing (2), 3 for $\alpha \in [0.25, 0.5)$, 5(i), and A.1-A.4 (in the Appendix). In the NLS case, however, $a_{i,j}$, $c_{i,j}$ given in (14) and (15) are replaced by the Appendix expressions (54) and (55), respectively. It therefore follows from Lemmata 2 and 3, together with Definition 1, that the statistic $N_\lambda(\lambda)$ given by (40) has the limiting distribution for a nonlinear model as given in the first part of Theorem 3. Further, the imposition of Assumption 5 parts (iv) or (iv) and (v) yields the same specializations of $\mu_{i,j}$ as described in Theorem 3.
A consistent estimator of $\mu_{i,j}$ for use in simulation of the limiting distribution is given by (41), except that the following changes are required: $\hat{\beta}_i$ now denotes the NLS estimator of the parameter vector in (estimated) regime $i$; $x_t$ is replaced by $F_t(\hat{\beta}_\ell) = \partial f_t(\beta)/\partial\beta|_{\beta=\hat{\beta}_\ell}$ in $\hat{Q}_\ell$ and $\hat{V}_\ell$. If Assumption 5(iv) holds, then an alternative consistent estimator is given by (42) but with $\hat{u}_{\ell,t}$ being the NLS residual; if Assumption 5(v) holds, then a further consistent estimator is given by (43) with the same redefinition of the residual. Similarly, we can define an analogous version of $F_\lambda(\lambda)$ for this model which has the limiting distribution given in Corollary 2 under all the assumptions made for this model, including Assumption 5(iv)-(v). Therefore, under these assumptions, the critical values of Table 1 can be applied for testing the joint break fraction hypothesis of (39) in a nonlinear regression model.
In the 2SLS case, however, the construction of a consistent estimator of $\mu_{i,j}$ for use in simulation of the limiting distribution depends on the location of the $i$th break in the structural equation relative to the reduced form breaks. If $\hat{\pi}_{k-1} < \hat{\lambda}_i < \hat{\pi}_k$ for some $k$, then a consistent estimator of $\mu_{i,j}$ is given by: k=1ˆ k I{t ∈ (π k−1 ,π k ]}. If $\hat{\pi}_{k-1} = \hat{\lambda}_i$ for some $k$ then a consistent estimator of $\mu_{i,j}$ is given by: and all other definitions remain the same. Regardless of the relative positions of the structural and reduced form breaks, if in addition Assumption 11(iv) holds, then a consistent estimator for $\mu_{i,j}$ is provided by Further, if Assumptions 11(iv)-(v) hold, then an alternative consistent estimator for $\mu_{i,j}$ is $\hat{\mu}_{i,j} = 0.5\hat{\rho}^2$ (49). In this last case, the dependence of the limiting distribution on model parameters can be removed by using Under the assumptions listed above for the 2SLS case, including Assumption 11(iv)-(v), and the $H_0$ of (39), $F^{2SLS}_\lambda(\lambda)$ converges to the limiting distribution in Corollary 2. This enables the critical values of Table 1 to be employed also for testing break dates in a structural model estimated by 2SLS.
As discussed for the linear model with exogenous regressors in the previous subsection, the hypothesis tests for break dates can be inverted to obtain joint confidence intervals for the dates of the $m$ breaks in the models of this subsection.
Following the implication of Proposition 1 that the impact of the break fraction estimation is thrice that of a regression parameter, we add $-3m$ as an additional correction in the degrees of freedom for the variance estimator that takes the, otherwise generic, form: where $RSS(\cdot)$ is defined in (6)-(7).
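As an illustration of the degrees-of-freedom correction, the sketch below assumes the generic estimator divides the minimized RSS by $T - (m+1)p$, so that the corrected divisor is $T - (m+1)p - 3m$. This form is an assumption based on the description above, and the function name is hypothetical.

```python
def break_adjusted_variance(rss, T, m, p):
    """Error variance estimate with an extra -3m degrees-of-freedom correction.

    Assumed form: rss / (T - (m + 1) * p - 3 * m), where (m + 1) * p counts the
    regression coefficients over m + 1 regimes and 3 * m accounts for the m
    estimated break fractions.
    """
    dof = T - (m + 1) * p - 3 * m
    if dof <= 0:
        raise ValueError("not enough observations for the requested m and p")
    return rss / dof

if __name__ == "__main__":
    # One break, two coefficients per regime, T = 120: divisor is 120 - 4 - 3.
    print(break_adjusted_variance(100.0, T=120, m=1, p=2))
```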
As in the analysis of Section 2, estimation is performed imposing the true number of breaks. The break dates are estimated as defined in (4) except that in practice regimes are restricted to contain at least $[\epsilon T]$ observations. The parameter $\epsilon$, often referred to as the trimming parameter, is set at $\epsilon = 0.1$. The efficient search algorithm of Bai and Perron (2003) is employed in our analysis.
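The search over partitions with a minimum regime length can be sketched with a dynamic program. This is written in the spirit of, but is not, the Bai and Perron (2003) implementation: it handles only a model with a regime-specific mean, and the function names are hypothetical.

```python
import numpy as np

def segment_rss(y):
    """rss[i, j] = RSS from fitting a constant mean to y[i:j], 0 <= i < j <= T."""
    T = len(y)
    c1 = np.concatenate(([0.0], np.cumsum(y)))
    c2 = np.concatenate(([0.0], np.cumsum(y ** 2)))
    rss = np.full((T + 1, T + 1), np.inf)
    for i in range(T):
        j = np.arange(i + 1, T + 1)
        s1, s2 = c1[j] - c1[i], c2[j] - c2[i]
        rss[i, j] = s2 - s1 ** 2 / (j - i)
    return rss

def estimate_breaks(y, m, eps=0.1):
    """Minimize total RSS over all m-break partitions with regimes >= [eps*T]."""
    T = len(y)
    h = max(int(np.floor(eps * T)), 1)
    rss = segment_rss(y)
    # cost[k, j]: minimal RSS when y[:j] is split into k + 1 regimes.
    cost = np.full((m + 1, T + 1), np.inf)
    arg = np.zeros((m + 1, T + 1), dtype=int)
    cost[0, h:] = rss[0, h:]
    for k in range(1, m + 1):
        for j in range((k + 1) * h, T + 1):
            cand = np.arange(k * h, j - h + 1)   # candidate dates of the k-th break
            vals = cost[k - 1, cand] + rss[cand, j]
            best = int(np.argmin(vals))
            cost[k, j], arg[k, j] = vals[best], cand[best]
    # Backtrack from the full sample to recover the break dates.
    breaks, j = [], T
    for k in range(m, 0, -1):
        j = int(arg[k, j])
        breaks.append(j)
    return sorted(breaks), float(cost[m, T])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = np.concatenate([np.zeros(40), 5 * np.ones(40), np.zeros(40)])
    y = y + 0.1 * rng.standard_normal(120)
    print(estimate_breaks(y, m=2))
```

A break date $j$ here means that observation $j$ is the last observation of its regime.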
We examine the power of the test for a range of null hypotheses given by $H_0: \lambda_1 = 0.5 + \kappa$ for $\kappa = 0, 0.02, 0.04, \ldots, 0.2$. Since $\lambda_1^0 = 0.5$ in our DGP, $\kappa = 0$ corresponds to the case in which the null is true, and as $\kappa$ increases the distance between the hypothesized value and the truth increases. The calculated test statistic is compared to the critical value in Table 1. As expected, power increases with $\kappa$ for each $T$ and $\alpha$, with power inversely related to $\alpha$ for given $T$ and $\kappa$. For example, with $\alpha = 0.1$, power is more than 0.95 when $T = 480$ and $\kappa = 0.20$ ($\lambda_1 = 0.7$) but reaches little more than 0.5 for these $T$ and $\kappa$ values when $\alpha = 0.4$. Clearly, it is difficult to detect deviations from the hypothesized location when the break is small. On the other hand, although developed under the shrinking break assumption, the test performs well when the break magnitude is fixed ($\alpha = 0$).
The test also exhibits good size performance overall. It is generally a little under-sized for small values of α, is well-sized (with empirical sizes between 0.041 and 0.057) when α = 0.2 and is typically modestly oversized for larger α, although it remains marginally under-sized for T = 120 even with α = 0.4. Perhaps not surprisingly, the greatest size distortion across the cases considered occurs for the small breaks that apply with α = 0.49 and T = 480, where the empirical size is 0.085.

U.S. monetary policy
U.S. monetary policy is widely acknowledged to have undergone change since the 1970s, with many arguing that this provides a key explanation for changes in the properties of inflation and (sometimes) real activity. Studies that explore these issues typically either treat the date(s) of change as known or employ essentially ad hoc approaches to deal with the issue. For example, Boivin and Giannoni (2006) split their sample in 1979, reflecting the date at which Volcker became chairman of the U.S. Federal Reserve, while Ahmed et al. (2004) use subsamples covering 1960-1979 and 1984-2002, with 1980-1983 omitted due to uncertainty about potential dates of change. In a similar vein, the seminal study of Clarida et al. (2000) adopts the 1979 change date, but also acknowledges uncertainty about breaks and examines interest rate reaction functions estimated over the individual subsamples implied by the periods of office of the four Fed chairmen within their overall sample period, and also considers a possible post-1982 sample. Although the literature largely accepts that a new monetary policy regime commenced immediately on Volcker becoming chairman in 1979Q3, Duffy and Engle-Warnick (2006) throw some doubt on this finding, since their application of the sequential test procedure of Bai and Perron (1998) in a dynamic monetary policy model finds a 1980Q3 break rather than a year or more earlier. Nevertheless, the tests available to Duffy and Engle-Warnick (2006) do not allow for endogeneity and they employ only backward-looking specifications.
We examine hypotheses about breaks in U.S. monetary policy using the forward-looking dynamic model where $r_t$ is the actual federal funds rate, while $\pi_{t+1|t}$ and $x_{t+1|t}$ are forecasts of inflation and a proxy for the output gap, respectively. We follow Orphanides (2004), who revisits the analysis of Clarida et al. (2000), by employing real-time data and, more specifically, Greenbook forecasts prepared by Fed staff for meetings of the Federal Open Market Committee (FOMC). The Greenbook provides forecasts of key variables, including inflation, output, and unemployment, which inform FOMC interest rate decisions. Although, for simplicity, our specification in (51) assumes that policymakers focus on forecasts for the following quarter, Orphanides (2004) finds results to be largely unaffected for horizons between 1 and 4 quarters. Our sample period is 1968Q4 to 2005Q4, which is appropriate for our purpose of examining implicit hypotheses made in the literature about changes in U.S. monetary policy responses. Although FOMC meetings are held more frequently (and sometimes irregularly), we follow the usual convention of treating them as quarterly by employing forecasts made for the meeting closest to the middle of the quarter. As Greenbook output gap forecasts are available only from late 1987, we follow Boivin (2006) and employ a real-time unemployment gap measure as a proxy in (51). More explicitly, as in Boivin (2006), $x_{t+1|t}$ is measured as the natural rate of unemployment minus the Fed's forecast, where the natural rate is computed as an average of the historical unemployment rate over data as available at $t$. The inflation forecasts $\pi_{t+1|t}$ relate to the Gross National Product (GNP) or Gross Domestic Product (GDP) price deflator (as appropriate) and are given in the Greenbook as quarter-on-quarter growth rates, expressed as annualized percentage points.
The interest rate series is the average actual federal funds rate for the third month of the quarter, with the third month used to ensure that $r_t$ reflects any monetary policy change effected during that quarter.
As already noted, Greenbook forecasts are prepared by Fed staff in advance of FOMC meetings and they are, in principle, conditional on interest rate policy remaining unchanged over the forecast horizon. However, it may not be appropriate to treat these as exogenous in (51), since Ellison and Sargent (2012) argue that the FOMC may doubt the accuracy of these staff forecasts and instead favor a "worst case" scenario. Consequently the Greenbook forecasts may be measured with error in relation to the forecasts of the FOMC itself, with the measurement errors correlated with interest rate decisions. To guard against this possibility, our analysis of breaks in (51) employs a 2SLS approach. The instruments used are $\pi_{t-i}$, $x_{t-i}$, $r_{t-i}$, for lags $i = 1, 2$, GNP/GDP growth (as appropriate at $t$) and the interest rate spread between long-term (10 year) bonds and the short-term federal funds rate, also for the two quarters prior to $t$, with all variables real time as at $t$. Lagged inflation and growth rates are employed as instruments in line with new Keynesian models in which expectations in (51) are formed from past observations on output growth, inflation and interest rates. There is both theoretical and empirical evidence that the interest rate spread contains useful information for monetary policymakers [see, for example, Ellingsen and Söderström (2001) and Rudebusch and Wu (2008)] and hence is also included. Based on the analyses of Hall et al. (2013, 2015), we use an information criteria approach to inference in both the reduced form equations for $\pi_{t+1|t}$ and $x_{t+1|t}$ and in the structural form (51). Specifically, we employ the Bayesian Information Criterion (BIC) and the Hannan-Quinn Information Criterion (HQIC), with the penalty function in each case taking account of coefficient and break estimation by counting the number of effective parameters estimated as $(p + 3)m$, as suggested by Proposition 1.
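The selection of the number of breaks by such criteria can be sketched as follows. The sketch assumes the criteria take the familiar form $\ln(RSS_m/T) + k_m c_T/T$, with $k_m = (p + 3)m$ effective parameters, $c_T = \ln T$ for BIC and $c_T = 2\ln\ln T$ for HQIC; the exact definitions used by Hall et al. (2013, 2015) are not reproduced in this section. The function name is hypothetical and the RSS values in the usage example are made-up numbers, not results from the application.

```python
import math

def select_breaks(rss_by_m, T, p, criterion="bic"):
    """Pick the number of breaks minimizing an assumed BIC or HQIC form.

    rss_by_m maps each candidate number of breaks m to its minimized RSS.
    """
    if criterion == "bic":
        c_T = math.log(T)
    elif criterion == "hqic":
        c_T = 2.0 * math.log(math.log(T))
    else:
        raise ValueError("criterion must be 'bic' or 'hqic'")
    values = {
        m: math.log(rss / T) + (p + 3) * m * c_T / T   # penalty counts (p + 3)m
        for m, rss in rss_by_m.items()
    }
    return min(values, key=values.get), values

if __name__ == "__main__":
    # Illustrative (invented) minimized RSS values for m = 0, 1, 2.
    rss_by_m = {0: 100.0, 1: 60.0, 2: 59.5}
    print(select_breaks(rss_by_m, T=149, p=4, criterion="bic")[0])
    print(select_breaks(rss_by_m, T=149, p=4, criterion="hqic")[0])
```

With these invented inputs the second break lowers the RSS too little to offset the penalty, so both criteria select one break.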
The maximum number of breaks is set to 5 in each case, with trimming parameters set to $\epsilon = 0.15$ (15% of the total sample) for each reduced form equation and $\epsilon = 0.10$ for the structural form. Bai and Perron (2006) provide a discussion and some evidence on the choice of the trimming parameter for structural break tests, while the simulations of Hall et al. (2013) consider the choice for information criteria approaches. We use $\epsilon = 0.10$ in (51) since there are relatively few coefficients to be estimated and the disturbances are uncorrelated. Both criteria find the reduced form equation for $x_{t+1|t}$ to be stable over the sample period, but three breaks are indicated in the $\pi_{t+1|t}$ equation, dated at 1974Q4, 1980Q4, and 1986Q3. Using the reduced form predictions (with breaks taken into account) rather than observations for $\pi_{t+1|t}$ and $x_{t+1|t}$ in (51), both criteria then indicate that two breaks occur in U.S. monetary policy. The search algorithm estimates the break dates as 1980Q3 and 1985Q3. The estimated monetary policy reaction functions are presented in Table 2, the first column of which shows the 2SLS estimated coefficients under the assumption that the reduced and structural forms are stable, with the remaining three columns taking account of reduced form and structural form breaks. Under the assumption of stability, the equation is poorly determined, with no individual coefficient significant. On the other hand, allowing for breaks shows U.S. monetary policy to react significantly to forecasts for both the unemployment gap and inflation until 1980Q3, followed by a period to 1985Q3 where the response appears to be targeted strongly to inflation. The final regime, from 1985Q4 to 2005Q4, is one of low inflation and relative stability (the so-called Great Moderation), during which responses appear to be dominated by interest rate dynamics.
It is notable that, nevertheless, the implied steady-state monetary policy responses to inflation are effectively constant over the whole sample period. This finding contrasts with Clarida et al. (2000), who argue that the monetary policy response to inflation was stronger after Volcker became Fed chairman than previously, but agrees with the real-time analysis of Orphanides (2004).
As discussed above, many studies of U.S. monetary policy, including Clarida et al. (2000) and Orphanides (2004), assume that a break occurs in 1979Q2, with a new regime applying when Volcker took up appointment as the Fed chairman in the following quarter. Indeed, Clarida et al. (2000) take this further and informally investigate whether monetary policy changes with each Fed chairman. In terms of our analysis, this would imply that the true date of the second break we detect is 1987Q2, with Greenspan taking up office in August that year. Applying the tools of Section 4, we therefore test the joint null hypothesis that the breaks occur at 1979Q2 and 1987Q2. Under the assumption of homoscedasticity, the test statistic of (50) is $F^{2SLS}_\lambda(\lambda) = 37.29$, which strongly rejects the null hypothesis at the 1% level in relation to the critical values of Table 1. Relaxing the homoscedasticity assumption by using the 2SLS analogue of (40) leads to a statistic of $N^{2SLS}_\lambda(\lambda) = 49.73$, which also leads to rejection at the 1% level whether the null distribution of $b_i \sim B(\mu_{i,1}, \mu_{i,2})$ is simulated under the assumption of regime-dependent variances as in (48) or allowing more general heteroscedasticity as in (46). Indeed, $N^{2SLS}_\lambda(\lambda)$ always rejects the joint null hypothesis at this level for any hypothesized $T_1^0 \neq$ 1980Q3.

Figure 2. 99% break fraction confidence set for monetary policy application. The confidence set shows the break fraction pairs $(\lambda_1, \lambda_2)$ for which the statistic $F_\lambda(\lambda)$ does not reject the corresponding joint null hypothesis at the 1% level, when applied to each permissible null hypothesis subject to a 15 observation minimum segment ($\epsilon = 0.10$). The $\lambda_1 = 0.32$ break fraction corresponds to 1980Q3 and is the only date of a first break that does not reject the null, while $\lambda_2$ can take any value from 0.42 to 0.62, or 1984Q2 to 1991Q4.
On the other hand, there is substantial uncertainty about the second break date, with a 99% joint confidence set including all dates from 1984Q2 (the lower bound of the search interval in combination with 1980Q3) to 1991Q4, inclusive, while reducing the confidence level to 90% brings forward the latter date by only two quarters. Figure 2 illustrates the 99% joint confidence set graphically in terms of the break fractions, with the horizontal line emphasizing the relative uncertainty about $\lambda_2$ in contrast to $\lambda_1$.
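The correspondence between the quoted dates and break fractions can be verified with a few lines of arithmetic. The sketch assumes the estimation sample consists of all $T = 149$ quarters from 1968Q4 to 2005Q4 and that a break fraction is the position of the break quarter divided by $T$; the function names are hypothetical.

```python
START_YEAR, START_QUARTER = 1968, 4      # first observation: 1968Q4
END_YEAR, END_QUARTER = 2005, 4          # last observation: 2005Q4

def quarter_index(year, quarter):
    """Position of a quarter in the sample, counting 1968Q4 as observation 1."""
    return (year - START_YEAR) * 4 + (quarter - START_QUARTER) + 1

def break_fraction(year, quarter):
    """Break fraction implied by a break date (assumed: index divided by T)."""
    T = quarter_index(END_YEAR, END_QUARTER)
    return quarter_index(year, quarter) / T

if __name__ == "__main__":
    for date in [(1980, 3), (1984, 2), (1991, 4)]:
        print(date, round(break_fraction(*date), 2))
```

Under these assumptions 1980Q3, 1984Q2, and 1991Q4 map to 0.32, 0.42, and 0.62, and 1984Q2 lies exactly 15 observations after 1980Q3, in line with the minimum segment length.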
These results shed new light on the timing of changes in U.S. monetary policy. In particular, the widely accepted break date of 1979Q2 is not supported, with our results strongly pointing to the break occurring in 1980Q3. Interestingly, Duffy and Engle-Warnick (2006) also find evidence of a break at this later date in a dynamic monetary policy model. Although detailed analysis of the evidence is beyond the scope of this paper, it is notable that the policies now referred to as "Reaganomics," and introduced after his election as U.S. President in November 1980, included a focus on the control of inflation. Therefore, it may be that the monetary policy regime often associated with Volcker as Fed Chairman is for practical purposes very similar to that of Reagan, making the two difficult to distinguish empirically. As to the second break, our results support other studies, including Clarida et al. (2000), who suggest that the date of change is unclear. However, we go further than previous authors in the sense that our 90% confidence set includes dates into the early 1990s.

Concluding remarks
A considerable literature now exists concerned with least squares-based estimation and testing in models with multiple discrete breaks in the parameters; see, inter alia, Bai and Perron (1998), Hall et al. (2012), and Boldea and Hall (2013). In these contexts, if the model is assumed to have $m$ breaks, then the break points (the points at which the parameters change) are estimated by minimizing the residual sum of squares over all possible data partitions involving $m$ breaks. A natural side product of this estimation is the minimized residual sum of squares and this quantity plays an important role in subsequent inferences about the model. This paper, first, derives the asymptotic expectation of the residual sum of squares, the form of which indicates that the number of estimated break points and the number of regression parameters affect this expectation in different ways. Second, we propose a statistic for testing the joint hypothesis that the breaks occur at specified fixed break points in the sample. Under its null hypothesis, this statistic is shown to have a limiting distribution that is nonstandard but simulatable, being a functional of independent random variables with exponential distributions whose parameters can be consistently estimated. In a special case, the statistic can be normalized to make it pivotal and we provide percentiles for the associated limiting distribution. These results cover the cases of either the linear or nonlinear regression model with exogenous regressors estimated via ordinary (or nonlinear) least squares or a linear model in which some regressors are endogenous and the model is estimated via two stage least squares.
The paper also illustrates the usefulness of the results through an application to breaks in U.S. monetary policy. Such breaks are widely acknowledged in the literature, but are usually assumed to coincide with changes in the chair of the Federal Reserve; see, for example, Clarida et al. (2000). When subjected to test, we reject this hypothesis on the coincidence of change. In particular, the widely assumed break date of 1979Q2 associated with the end of the pre-Volcker era is rejected in favor of a break in late 1980. Nevertheless, we also note that monetary policy under both Volcker, as Fed Chairman, and President Reagan focused on inflation, and the start of the new regime may be difficult to determine from the data. An important side-product of our analysis is the joint confidence set we obtain for two dates of change detected in monetary policy over the period 1969-2005. Our analysis of the 2SLS case assumes that the instruments are strong and that the difference between the regimes is shrinking. It would be interesting to extend our analysis to the case where the relationship between the instruments and endogenous regressors is allowed to diminish with the sample size in the fashion of nearly weak or weak instruments. Recently, Antoine and Boldea (2016) introduce a framework in which the strength of the instruments is potentially regime dependent and implicitly controls the rate at which the breaks in the reduced form are shrinking. Interestingly, they show that if a break is located between two regimes, in one of which the instruments are nearly weak and in the other of which the instruments are weak, then the break fraction can still be consistently estimated. It would be interesting to examine the behavior of the statistics considered in our paper within the framework of Antoine and Boldea (2016); however, this is left to future research.
(ii) $\{v_t\}$ is a β-mixing process with exponential decay, i.e., there exists $N > 0$ such that for $B \in \mathcal{F}^a_{-\infty}$, The function $f_t(\cdot)$ is a known measurable function, twice continuously differentiable in $\beta$ for each $t$.
Assumption A.4. (i) S T (T 1 , . . . , T m ; β) has a unique global minimum at β 0 and (T 0 The proof follows similar lines to that of Proposition 1. From the arguments of Boldea and Hall (2013), it follows that (12) and (13) continue to apply, but now with for j = 1, 2. The result for ξ 1,T then follows using arguments as for the proofs of Lemma 1 and Proposition 1. For ξ 2,T , the proof again follows the same argument as Proposition 1 using T 1/2 (β i − There are then two scenarios of interest for the general case of an unstable reduced form with h > 0 in (20), namely, whether the (true) reduced form and structural breaks are common or not. To be more precise, and following Boldea et al. (2012), we consider scenarios where some breaks occur in the structural form but not the reduced form and where at least some breaks are common to both; the former includes the special case of a stable reduced form. These scenarios can be represented as follows.
Using the same arguments as Boldea et al. (2012) in the proof of their Theorem 2, it follows that the limiting distribution of $\xi_{1,T}$ is given by (12) and (13) as in Lemma 1, but with [from Assumption 14 and Assumption 11(iii)], (14)-(15) replaced by Under Assumption 11(iv) (ℓ) = ν ℓ ℓ ν ′ ℓ ⊗ Q ZZ (ℓ), and with the addition of Assumption 11(v), we have (ℓ) = ν ℓ ν ′ ℓ ⊗ Q ZZ (ℓ). Thus, under our assumptions where $\rho^2$ is defined in (27) and Assumption 3 is imposed. Therefore, applying Lemmata 1 and 2, we have and so, as we can consider the breaks separately, it follows from Lemma 3 that Under the shrinking breaks Assumption 8, and with distinct reduced and structural form breaks such that $\pi_j^0 < \lambda_{k+1}^0 < \cdots < \lambda_{k+\ell}^0 < \pi_{j+1}^0$, the result immediately extends to the case where the number of reduced form breaks is $h > 1$. It also immediately specializes to the case of a stable reduced form.

Scenario 2
Under this scenario, consider $h = 1$ in the case where the first of the $m$ structural breaks coincides with the single reduced form break. Hence the data generation process is identical to Scenario 1, except that $T_1^\dagger = T_1^0$ and, consequently, $\pi_1^0 = \lambda_1^0$. From Boldea et al. (2012), and since the $m$ breaks at $T_1^0, \ldots, T_m^0$ can be considered separately, the limiting distribution of $\xi_{1,T}$ applies as for Scenario 1, with $a_{i,j}$ and $c_{i,j}$ as given by (56) and (57), respectively, for $i = 2, \ldots, m$, but $a_{1,j}$ and $c_{1,j}$ are as follows: Under our assumptions, therefore, (58) applies and consequently (59) holds for a break that is common to the reduced and structural forms. Therefore (60) holds under Scenario 2.