Stochastic model-based assessment of power systems subject to extreme wind power fluctuation

Extreme outliers of wind power fluctuation are a source of severe damage to power systems. In our previous work, we proposed a modelling framework, verified its usefulness via real data, and developed a model-based evaluation method of the impact of such extreme outliers. However, it has been a drawback that the obtained estimates of frequency fluctuation of power systems are sometimes excessively conservative for their practical use. To overcome this weakness, theory and methods for tightening the fluctuation estimates are investigated in this paper. This is done by applying a robust performance analysis method of a Lur'e system to the error analysis of stochastic linearization. The usefulness of our proposed method is shown through a load frequency control model.


Introduction
In many engineering problems, it is important to evaluate the effect of probabilistic uncertainty. Stochastic dynamical models play an important role in such evaluation. In particular, a linear stochastic system driven by a Wiener process is a popular framework. The great advantage is that its various statistical properties are explained by Gaussian distributions; see e.g. H 2 -control design, Kalman filtering [1]. On the other hand, in recent years, it has been pointed out that the fluctuation of wind power generation is usually small, but it takes extremely large values due to the occurrence of wind gusts and turbulence at a non-negligible frequency. These outliers bring about large frequency fluctuation to power systems interconnected to wind energy [2]. Therefore, Gaussian distributions whose tails decay rapidly are not suitable for modelling wind power fluctuation.
In view of this, the authors proposed a modelling method for such uncertainty using stable processes [3]. Here, stable processes can represent extreme outliers that obey a power law. In [4], the authors verified based on real data that wind power fluctuation can be well modelled by this proposed framework. To the best of our knowledge, this is the only result of building a dynamical model of wind power fluctuation taking its power law tail into an explicit account.
Concerning model-based analysis of power systems, nonlinearities such as saturation and rate limiter of power generation are crucial. Consequently, it is difficult to analyse their behaviour analytically, and we usually rely on Monte Carlo simulations. Note that, because power systems require a very high resilience to perturbations, it is indispensable to properly assess risks with guaranteed accuracy. However, the relationship between the number of Monte Carlo samples and their accuracy is not trivial. To make matters worse, a huge sample size is often required to capture the effect of extreme outliers [5]. To resolve these issues, a stochastic linearization method [6,7] is extended for the systems driven by stable processes based on their mathematical similarity to Gaussian distributions [3]. Moreover, our previous work [3,8] derived theoretical error bounds for stochastic linearization to guarantee the accuracy of the approximation. This makes it possible to perform approximate evaluations with guaranteed accuracy without generating any sample paths. A remaining drawback of our proposed framework has been that the obtained bounds are sometimes excessively conservative for their practical use.
Contribution: The goal of this paper is to overcome this weakness to develop methods for the assessment of power system fluctuation subject to extreme outliers. To this end, in this paper, we exploit the fact that the proposed error analysis can be seen as robust performance analysis of a Lur' e system having diagonal nonlinearity. This enables us to apply a scaling approach used to reduce the conservativeness of the small gain condition for structured uncertain systems [9]. The present paper shows that this method is also effective for the error analysis of stochastic linearization. This is not trivial due to the heavy-tailed stochasticity in the dynamics.
Actually, limits and expectation in the proof of the main result in [8] are interchanged partially carelessly. Its theoretically rigorous justification is also provided in this paper; see Appendix 2. Finally, it should be noted that the proposed evaluation method is applicable to analysis and synthesis of various systems which have extreme outliers.
Organization: This paper is organized as follows: In Section 2, we provide mathematical preliminaries and formulate our problem. In Section 3, we derive novel error bounds for stochastic linearization via scaling. In Section 4, its usefulness is shown through a load frequency control (LFC) model. Some concluding remarks are given in Section 5.
Notation: The set of real numbers is denoted by R. The imaginary unit is denoted by j. The identity matrix is denoted by I. For a matrix A ∈ R n×n , we write A 0 (resp. ≺ 0) if A is symmetric and positive (resp. negative) definite. For A 0, A 1/2 denotes the unique positive definite square root. Let ( , F, P) be a complete probability space equipped with a natural filtration {F t } t≥0 . The expectation is denoted by E. Only when we need to specify the underlying probability distribution, or conditioning, will this be shown in the subscript. For a stochastic process {x t }, the convergence in law [10,11] The probability distribution of x ∞ is referred to as the stationary distribution of x t . The vector 1-norm and Euclidean norm is denoted by · 1 and · , respectively. The matrix norm induced by the vector 1-norm is denoted by · 1 ind , i.e.
and a real-valued function f (t) defined on t ≥ 0, the L α -norm of f is defined by provided that it is finite. The gamma function is defined by (2)

Stable process
In this paper, we deal with systems driven by stable processes. Here, we briefly introduce them. We begin with an extension of a Gaussian distribution [12].

Definition 2.1:
A real-valued random variable X is said to have a symmetric stable distribution with parameter α ∈ (0, 2] and σ > 0, simply denoted by X ∼ SαS(α, σ ), if its characteristic function satisfies The parameter α represents the degree of non-Gaussianity. In particular, SαS(2, σ ) coincides with the Gaussian distribution with zero-mean and variance 2σ 2 . The scale parameter σ acts like the standard deviation of the Gaussian distribution. The parameter is omitted when σ = 1 such as X ∼ SαS(α). For X ∼ SαS(α), we have This is a power law of the stable distribution. In Figure 1, the probability density function p α (x) of the stable distribution SαS(α) is plotted for different α. Due to its power law, the slope of p α (x) on a log-log plot approaches −(α + 1) as x → ∞ except for α = 2. As α becomes small, the resulting distribution has a heavier tail.
Next, we extend the notion to stochastic processes. A Wiener process can be generalized in a similar way to Definition 2.1.

Definition 2.2:
Every α-stable process is a Lévy process, that is, it has stationary independent increments. Stable processes include a Wiener process (α = 2). Finally, similarly to the Ito integral for the Wiener process, we can introduce a stochastic differential equation associated with stable processes [12, Chapter 3].
Next, consider the following linear system driven by a stable process: where A ∈ R n×n is Hurwitz, b, c ∈ R n , and {L t } is an α-stable process with α ∈ (1, 2]. Then, the stationary behaviour is characterized as follows [3, Theorem 2].
This analytical tractability is a great advantage of the stable processes over other stochastic processes having a power law. For example, although Student's t-distribution also has a power law and is widely used to model extreme outliers, its associated stochastic process [13] cannot replace the stable process in Proposition 2.2.

Problem formulation
This paper focuses on a system consisting of a linear system and saturation nonlinearity in Figure 2: where x t ∈ R n , u t ∈ R r , z t ∈ R, y t ∈ R r denote the state, control input, evaluation output, and observation variables, respectively. The matrices A, B, and C y have compatible dimensions, and b, c z ∈ R n . The stochastic input {L t } is an α-stable process with parameter α > 1, and with the given threshold vector d ∈ [0, ∞) r and where y j (resp. d j ) is the jth element of y (resp. d).
Then, we are ready to state our problem.
Problem 2.1: Consider the feedback system with saturation given above. Suppose that (A + BKC y ) is Hurwitz for any K ∈ {diag(k 1 , . . . , k r ) : k j ∈ [0, 1]} and that x 0 has finite pth moment for some p ∈ (1, α), and x t converges in law to a random variable x ∞ valued in  R n as t → ∞. Then, given an approximate stationary distribution of z t by a linearization of the saturation, guarantee its accuracy.
Thanks to Proposition 2.2, we can obtain the stationary distribution of z t once the nonlinearity is approximated by a suitable linear gain. To be more precise, if we replace sat d by However, it is not trivial (1) how to choose K, and (2) how to guarantee the accuracy.
The former question has been thoroughly studied in our work [3] based on stochastic linearization; see Appendix 3 for its brief review. In this paper, we shed light on the latter question.

Remark 2.1:
Although we deal only with the saturation nonlinearity in this paper, our analysis can be applied to other nonlinearities as shown in Figure 3.

Theoretical error bounds
For the error analysis of the linearization by K ∈ {diag(k 1 , . . . , k r ) : k j ∈ [0, 1]}, we apply the loop shifting in Figure 4 to obtain with (10) and (11) where Thanks to the linearity of the dynamics, the solutions to See the block diagram in Figure 5.
It is, therefore, crucial for the error analysis to evaluate δ t . The main difficulty lies in the fact that δ t is a signal generated in the nonlinear feedback loop. In order to state the main result, let us define where C yj is the jth row of C y , and K ∈ K := {diag(k 1 , . . . , k r ) : k j ∈ [0, 1]}. Also, define the scaling set S := {diag(s 1 , . . . , s r ) : s j > 0}. Here, we derive a novel error bound using scaling [9]. The proof is provided in Appendix 1.
where · 1• for a matrix-valued function f (t) is defined by The obtained error bound can be understood intuitively as follows: First, we consider the case S = I corresponding to the previous result [8]. The L 1 -gain [14] of C y (sI − (A + BKC y )) −1 B is less than or equal to η y . Therefore, ζ and η y can be viewed as (incremental) gain upper bounds of the static nonlinearity ψ(·) and the linear system fromū t to y t , respectively. Consequently, the assumption ζ η y < 1 plays a role of a small gain condition to ensure the stability of the nonlinear feedback loop in Figure 5. Under this condition, 1/(1 − ζ η y ) can be seen as the sensitivity function that characterizes how much the signal added toū t , which is related to r j=1 η j (K), affectsū t . Similarly, η z is the gain of the linear system fromū t to z t . The upper bound in (27) is the product of these three terms. Next, consider a general scaling S ∈ S. This amounts to inserting S and S −1 into the system as shown in Figure 6. We emphasize that this system is no longer equivalent to the original system in Figure 4 due to the nonlinearity of ψ(·). Nevertheless, the same interpretation as explained above applies to this case. For instance, η y can be regarded as the gain of the linear system fromū t to y t in Figure 6.
In addition, we can employ [8, Theorem 2] to compute η j (K) by numerical integration, which is computationally much more effective than Monte Carlo methods. Hence, it is not necessary to generate samples at all in computing the error bound in Theorem 3.1. Note that one of the diagonal elements in scaling matrices can be normalized to 1 since for any γ > 0, Applying some modifications to the proof of Theorem 3.1, we obtain another type of the error bound. The proof is given in Appendix 1. where · 1• for a vector-valued function f (t) is defined by provided that it is finite. Then, if r j=1 ζ j η yj < 1, where ζ j is defined in (24).
The condition r j=1 ζ j η yj < 1 may be milder than that of Theorem 3.1 when some of k j , j = 1, . . . , r are close to 1/2 while the others are close to 0 or 1.

Choice of a scaling
Differently from the previous error bounds in [8], we can relax the condition required to obtain the bounds and tighten them by choosing a suitable scaling. Especially when one wants to find a scaling that satisfies ζ η y < 1 in Theorem 3.1, it is reasonable to minimize η y with respect to S ∈ S since ζ is a constant. In addition, we can expect a smaller η y makes 1/(1 − ζ η y ) smaller, and the right-hand side of (27) as well. Unfortunately, the authors are not aware of any method to efficiently find S that directly minimizes η y . Instead, we minimize an upper bound of η y given by the following proposition.
where · H ∞ is the H ∞ -norm.

Proof: It can be shown that
where · 2 ind is the matrix norm induced by the Euclidean norm; see [16]. We also have for anyĀ ∈ R r×r ; see [17]. Combining (33) and (34), we obtain (32).
The problem of finding a scaling that minimizesη y can be solved efficiently using the following proposition.

Proposition 3.2: Consider the scaling set S and a continuous-time transfer function T(s) of (not neces-
A ∈ R n×n ,B ∈ R n×r , andC ∈ R r×n . For any γ > 0, the following statements are equivalent. (i)Ā is Hurwitz and there existsS ∈ S such that (ii) There exist positive definite solutions X andS ∈ S to the linear matrix inequality (LMI): Proof: Using the strict bounded real lemma [18], (i) is equivalent to the condition: By left and right multiplying the left-hand side of (37) by diag(I,S 1/2 ), we obtain (ii).
Note that the LMI (36) can be solved by convex optimization techniques [19,20]. In addition, by the proof, a solutionS ∈ S to the LMI (36) satisfies (35), and vice versa. Hence, the scaling S =S 1/2 ∈ S that minimizes η y can be found efficiently by replacingĀ by the Hurwitz matrix (A + BKC y ) and performing a bisection search on γ .

Model description
In this section, we apply the proposed framework to evaluate the frequency fluctuation of power systems interconnected to wind energy. In particular, we consider two types of thermal generators: a coal generator and a combined cycle thermal unit. The latter can change its output at a faster rate, but takes a higher fuel cost than the former. From an economic point of view, it is preferable to increase the ratio of the former capacity to the latter one. However, the excessive increase leads to the situation that the output of the thermal power generators cannot follow the abrupt change of wind power. Hence, it is important to balance the ratio. This is why we model the two types of the thermal generators separately. Specifically, we utilize the LFC model in Figure 7. The physical meaning of each signal and block is • t: time, • n t : wind power fluctuation, • e t : power deviation, • f t : frequency deviation, • v 1t , v 2t : power adjustment by LFC units, • W: frequency domain model of the wind power fluctuation, • P 1 : physical inertia, e.g. load characteristics and system inertia of the power system, • P 21 , P 22 : thermal power generators for LFC, • P 31 , P 32 : characteristics of LFC units.
The values of d 1 , d 2 > 0 represent the capacity of the thermal power generators and largely affect the running cost and the resulting frequency deviation.
Considering the 1500 MW interconnected wind power case, and based on the discussion in [4,7], we take d 1 = 25 MW, d 2 = 35 MW, k fq = 1/250 Hz/MW, k fb1 = 0.75, k fb2 = 0.10, The time constant of P 21 corresponding to a combined cycle thermal unit is shorter than that of P 22 for a coal generator. In summary, the dynamics in Figure 7 with z t = f t are given by (9) to (12)

Simulation result
In order to assess the resulting average frequency fluctuation, we approximately compute E[|f ∞ |]. We determine the linearized gain by the algorithm proposed in [3]; see Appendix 3 for detail. In the linear dynamics where the saturation is replaced by K, it readily follows from Proposition 2.2 this is too conservative. Next, we find the scaling that minimizesη y : For S * , we obtain η y (S * ) = 0.4198, ζ η y = 0.4198 < 1, and an error bound is given by In this case, there is not much difference between the error bounds (43) and (46). This implies that we can find a sufficiently suitable scaling and obtain a tight error bound by the proposed method. Finally, we emphasize that when the dimension r becomes larger, e.g. when rate limiters are contained in a power system model (r = 4), the parameter sweep is no longer realistic. Even for our example (r = 2), the computation time of η y for each S is about 0.05 seconds, and the parameter sweep takes about 1 hour. On the other hand, the proposed method takes 1 second. The advantage of the proposed method in terms of computation time becomes significant especially when we design controller parameters while referring to the uncertainty evaluation. In this case, we need to recompute the error bound whenever the controller parameters are changed, and therefore the parameter sweep is impractical.

Conclusion
In this paper, we have derived novel theoretical error bounds for stochastic linearization of a class of feedback systems. Specifically, we have analysed the effect of scaling on stochastic Lur' e systems, and proposed how to choose a suitable scaling efficiently. This makes it possible to assess uncertainties containing extreme outliers with high reliability, and our proposed method becomes applicable to a much broader range of systems. Furthermore, we have examined that the proposed method can evaluate the impact of wind power on the power system.

A.1 Theorem 3.1
In what follows, the following explicit representation is employed: for any t > t > 0. We ignore the first term since it vanishes as t → ∞ independent of the choice of t . This implies that whereC y := SC y ,ψ(y) := Sψ(y). Furthermore, due to the In summary, By the same argument, also holds. Next, the inequality where ψ j is the jth element of ψ, yields where C yjxt d − → SαS(α, C yj e (A+BKC y )t b L α ), Lemma A.2, and (A4) are applied. Therefore, under the assumption ζ η y < 1, Finally, this inequality with (A5) completes the proof.

A.2 Corollary 3.1
First, instead ofC y δ t in (A2), we estimate each of its component from above as follows. (A12) Here, we applied (A10) and whereψ j is the jth element ofψ. Hence, under the assumption r j=1 ζ j η yj < 1, we obtain the desired result.

Appendix 2. Interchange of limits and expectation
Here, we provide results to guarantee that the interchange of limits and expectation is valid. In our problem setting, the interchangeability is not trivial due to the power law of the stable process. for some p > 1, we can prove this lemma; see [21,Theorem 3.5,(3.18)]. In what follows, we show that (A15) is satisfied for p =p wherep ∈ (1, α) and x 0 has finitepth moment. Equation (18) (α, σ yj ), then it is reasonable to approximate sat d j (y j ) by the linear gain k j = min 1, d j γ α σ yj . (A29) Conversely, once each saturation sat d j (y j ) is approximated by a linear gain k j , the stationary distribution of y j is given by by Proposition 2.2. Combining (A29) and (A30), we obtain the algorithm to determine a linearized gain, that is, we choose the solution K = diag(k 1 , . . . , k r ) to the coupled equations k j = min 1, d j γ α C yj e (A+BKC y )t b L α , j = 1, . . . , r (A31) as a linearized gain.