A connection between time domain model order reduction and moment matching for LTI systems

ABSTRACT We investigate the time domain model order reduction (MOR) framework using general orthogonal polynomials by Jiang and Chen [1] and extend their idea by exploiting the structure of the corresponding linear system of equations. Identifying an equivalent Sylvester equation, we show a connection to a rational Krylov subspace, and thus to moment matching. This theoretical link between the MOR techniques is illustrated by three numerical examples. For linear time-invariant systems, the link also motivates that the time-domain approach can be at best as accurate as moment matching, since the expansion points are fixed by the choice of the polynomial basis, while in moment matching they can be adapted to the system.


Introduction
Various mathematical and physical processes can be modeled as linear time-invariant (LTI) input-output systems E _ xðtÞ ¼ AxðtÞ þ BuðtÞ; yðtÞ ¼ CxðtÞ; where E; A 2 R nÂn are sparse matrices, B 2 R nÂp and C 2 R qÂn are input and output matrices, respectively, xðtÞ 2 R n is the state vector, uðtÞ 2 R p is the input vector, yðtÞ 2 R q is the output vector and t 2 R represents time.
Since the order of the LTI system in (1) is often huge n ) 10 3 , a numerical simulation might be too expensive or even impossible, caused by immense computational time and memory requirements. Nevertheless, the input-output behaviour of (1) can be computed fast and accurately if the given system is reduced to a system E r _ x r ðtÞ ¼ A r x r ðtÞ þ B r uðtÞ; y r ðtÞ ¼ C r x r ðtÞ; that approximates the dynamic behaviour of (1), but E r ; A r 2 R rÂr , B r 2 R rÂp , C r 2 R qÂr , x r ðtÞ 2 R r , y r ðtÞ 2 R q with the reduced order r ( n. The aim of MOR is to approximate a system (1) with a huge order n by a system (2) with a much smaller order r, such that structural properties are preserved and the approximation error yðtÞ À y r ðtÞ over a given time interval ½t 0 ; t f , or the error of the transfer functions GðsÞ À G r ðsÞ over a frequency range ½s 0 ; s 1 , is small.
Transfer functions describe the relation between input and output in frequency domain. For (1), (2) with zero initial states and an evaluation point s in frequency domain, these are given by (3) G r ðsÞ ¼ C r ðsE r À A r Þ À1 B r : There exist numerous MOR approaches. H 2 optimal MOR techniques like the iterative rational Krylov algorithm (IRKA) (see, e.g. [2]) or the two-sided iteration algorithm (see, e.g. [3]) measure their approximation error 2 in the H 2 system norm (see, e.g. [4, Chapter 5]) 2 :¼ k GðsÞ À G r ðsÞk H 2 : These techniques are just two types of projection based MOR methods. There, the system (1) is reduced using so-called projection matrices V; W 2 R nÂr that map the matrices E; A; B and C onto a subspace approximating the space of the state vector xðtÞ.
The reduced system is given as where xðtÞ % Vx r ðtÞ. A more simple example for projection methods is moment matching, where V and W are computed to approximate the moments of the transfer function. This method and its relevant properties are repeated in Section 4.1. It represents one step in the procedure behind the IRKA iteration. Also the balanced truncation technique falls into the class of projection based reduction methods. Its error 1 is measured in the H 1 norm (see, e.g. [4,Chapter 5]) 1 :¼k GðsÞ À G r ðsÞk H 1 : Applying this method, the system (1) is first balanced, i.e. the observability and controllability Gramians P O and P C , given as the solutions of two Lyapunov equations are made equal and diagonal, such that P O ¼ P C ¼ diagðσ 1 Á Á Á σ n Þ and σ 1 ! Á Á Á ! σ n > 0 are the systems invariant Hankel singular values (HSVs). The discardable portions are identified and truncated according to the magnitude of the HSVs. More details about this method can be found, e.g. in [4,Chapter 7].
The above MOR techniques are motivated and derived by frequency domain considerations.
In contrast to that, we next review the idea of Jiang and Chen [1] presenting a time domain MOR framework based on orthogonal polynomials. In this paper, we only consider single-input single-output (SISO) systems, i.e. p ¼ q ¼ 1 in (1) to simplify the notation. Drawbacks of this method are the dependence of the reduced order model (ROM) on the input uðtÞ and the initial state xðt 0 Þ ¼ x 0 .
The dependence on the input can be neglected, since we will see that piecewise constant controls, which are the most important ones in practical applications anyway, allow for a joint ROM to exist. However, a ROM depending on the initial state is undesirable since the reduced system needs to be recomputed for each initial state or ROMs need to be stored for all possible initial values. Frequency domain based model reduction methods, such as balanced truncation or moment matching, assume xðt 0 Þ ¼ 0 Á Á Á 0 ½ T ¼: O n;1 2 R n , in the first place, in order to avoid additional terms in the transfer function representation (3). For the comparison, we will do the same in the time domain case in the following.
The remainder of this paper is organized as follows. The time domain MOR approach based on general orthogonal polynomials by Jiang and Chen [1] is introduced in Section 2. In Section 3, the structure of the resulting linear system of equations is exploited to derive an equivalent Sylvester equation. Further a slight variation of the approach is discussed that eliminates the initial condition in the case it is assumed to be zero, which also simplifies the structure of the coefficients in the Sylvester equation. Since we want to show a connection to moment matching, we briefly introduce this Krylov subspace method in Section 4 concluding with an important equivalence to the approaches of Section 3. Still, the additional freedom in the choice of the coefficients in the Sylvester equation makes moment matching theoretically more flexible and better adaptable to the original system under investigation. Numerical examples illustrated in Section 5 demonstrate this advantage. Concluding remarks are given in Section 6.

Time Domain MOR Based on Orthogonal Polynomials (TDMOR)
As already mentioned above, we restrict ourselves to SISO systems, i.e. p ¼ q ¼ 1.
The framework in [1] uses W ¼ V in (5) and obtains the projection matrix V 2 R nÂr from the vector valued coefficients in series expansions of the state and input, sampling their time dependence via orthogonal polynomials [5,Chapter 22].
The key property of orthogonal polynomials for the derivation of the framework in [1] is given by the following theorem. Theorem 2.1 (Differential recurrence formula, e.g. [6, Section 2.15]) For three subsequent orthogonal polynomials g i ðtÞ (i 2 N 0 ) holds g n ðtÞ ¼ α n _ g nþ1 ðtÞ þ β n _ g n ðtÞ þ γ n _ g nÀ1 ðtÞ; "n 2 N; where α n ; β n and γ n are differential recurrence coefficients. A list of such coefficients for selected families can be found in Table 1.
We restrict our considerations to the polynomials investigated in [1]. Other polynomials fulfilling Theorem 2.1 are for instance the Gegenbauer polynomials, a generalization of the Legendre polynomials (see, e.g. [6,Section 2.11]).
The following repeats some of the details of the derivation in [1]. First, the state, the initial condition and the input vector are approximated by the following truncated series expansions: uðtÞ % u r ðtÞ ¼ where v i 2 R n and w i 2 R are weights determining the subspace span V f g and g i ðtÞ are orthogonal polynomials representing the time dependence. Note, that the open literature provides no information about the remainder terms in equations (6)- (8). Thus, the estimation of approximation errors, and resulting model reduction errors, is at best difficult.
The approximations of the state (6) and the input (8) are inserted into the state equation of (1) and using Theorem 2.1, one obtains an expression that only depends on _ g i ðtÞ, since g 0 is always constant: A comparison of coefficients leads to the huge ðnr Â nrÞ linear system of equations Hv ¼ f presented in (9), where the approximation of the initial state (7) is only required to obtain a square matrix g 0 ðt 0 ÞI n g 1 ðt 0 ÞI n g 2 ðt 0 ÞI n g 3 ðt 0 ÞI n Á Á Á g rÀ1 ðt 0 ÞI n À g0ðtÞ A   2   6  6  6  6  6  6  6  4 vecðAÞ ¼ a 1;1 ; . . . ; a m;1 ; a 1;2 ; . . . ; a m;2 ; . . . ; a 1;n ; . . . ; a m;n ½ T and is called a vectorization (see, e.g. [7,Section 4.2]). A further orthogonalization of V leads to the desired projection matrix. In the following, we will refer to this algorithm as SYLTDMOR1.
Note that the matrix pencil ðÃ;ẼÞ has at least one eigenvalue equal to zero caused by the structure ofÃ, which arises from the initial state condition. Therefore, the matrix H in (9) is not invertible resulting in an infinite number of solutions and thus in a possibly infinite number of ROMs. Hence, this method is not well-posed and its solution not well-defined.

Variation of the Presented Algorithm (SYLTDMOR2)
In the approach of Jiang and Chen [1], the approximation of the initial state is only required to obtain a square matrix. As a conclusion of Section 3.1, this condition turns out to be linearly dependent anyway. Besides, a ROM depending on the initial state is not desirable.
In Section 5, we compare to the frequency domain methods, thus also here we fix the initial state to Doing so, we can neglect the constant polynomials g 0 ðtÞ in the approximation (6). In order to keep an r dimensional approximation, we shift the sums by 1. Using the same procedure as in Section 2, we end up with an nr Â nr linear system of equationŝ Hv ¼f . Rewriting it, again using the Kronecker product, we obtain where nowÊ Exploiting the equivalence (10) and the factÂ ¼ I r , the linear system of equationŝ Hv ¼f can be reformulated as the Sylvester equation wherev ¼ vecV À Á andf ¼ vecF À Á . As in Section 3.1, the projection matrix can be obtained by orthogonalization ofV. In the following, we will call this method SYLTDMOR2.
Compared to (11), Sylvester equation (12) does not depend on the initial state. Moreover, the pencil ðI r ;ÊÞ does not have a zero eigenvalue, such that (in contrast to (11)) (12) always allows for a unique solution. Thus, this method is well-posed and the ROM is well-defined.

Reincorporation of Non-Zero Initial Conditions in SYLTDMOR2
Although we removed it in the formulation, it is possible to use the initial state condition in SYLTDMOR2. One way to include the initial condition is given by the approach presented in [8], where the given SISO system is reformulated to a multipleinput single-output (MISO) system, by adding the initial state as a column in B and using a corresponding Dirac input. Another and more flexible method is described in [9] for the frequency domain MOR methods, we want to compare with. Here, a whole variety of initial state conditions, instead of only one condition, can be considered using an approach splitting the problem into a homogeneous and inhomogeneous part, that can be solved separately. This method preserves the SISO system and can also be applied to the time domain MOR approach. If the subspace of relevant initial conditions is known, this method clearly offers a more flexible setting and overcomes the problem of storing a separate reduced model for every possible initial condition.

Moment Matching and its Relation to SYLTDMOR2
Our main goal in this paper is to show a connection between the above mentioned time domain MOR approaches and moment matching. To this end, we repeat the basics of this Krylov subspace technique by first introducing a standard Krylov subspace (see, e.g. [10, Section 1.6]) of order r for a matrix A 2 R nÂn and a vector b 2 R n as K r ðA; bÞ ¼ span b; Ab; Á Á Á ; A rÀ1 b È É :

Moment Matching
Moment matching is a projection based MOR technique. It constructs the projection matrix starting from a series expansion of the transfer function rather than the state exploiting the Neumann series (see, e.g. [11]) where T 2 R nÂn is a matrix, such that ðI n À TÞ is in fact invertible. Assuming, that ðs 0 E À AÞ is invertible, and using (13), the transfer function of the original system can be expressed as where M s 0 k are called moments of the original transfer function around s 0 .
The aim of moment matching is to find a reduced system of order r ( n, such that for some k ¼ 0; :::; 1 for the moments of the reduced order transfer function we haveM s 0 k ¼ M s 0 k . This equality of moments can be guaranteed by using an orthonormal basis of the input or output Krylov subspace around a single expansion point s 0 2 C to form the orthogonal matrices Q 1 and Q 2 If the one-sided Krylov subspace method is used, i.e. V ¼ W ¼ Q 1 is used to project, r moments will match (see, e.g. [4,Chapter 11]). In [12,Chapter 3], it is pointed out that this property also holds if Q 2 is used instead of Q 1 . In contrast, if both V ¼ Q 1 and W ¼ Q 2 , then 2r moments of the original and reduced order systems will match (see, e.g. [4,Chapter 11]). This method is called two-sided Krylov subspace method. If multiple expansion points s 1 ; . . . ; s k 2 C are given, Q 1 and Q 2 can be obtained as a basis of the union of Krylov subspaces, that belong to the expansion points: where P k i¼1 r i ¼ r. Using only V ¼ W ¼ Q 1 to project, the first r i moments around s i of the original and reduced order model match for i ¼ 1; . . . ; k. In the two-sided Krylov subspace method using multiple expansion points 2r i moments will match around s i for i ¼ 1; . . . ; k (see, e.g. [12], Chapter 3]).

Moment Matching and Sylvester Equations
Since TDMOR presented in Section 2 and SYLTDMOR1 and SYLTDMOR2 presented in Section 3 only use one projection matrix V to obtain a ROM, we will only focus on the one-sided Krylov subspace method. On the one hand, the projection matrix V can be obtained using the approach presented in Section 4.1. On the other hand, there is a very useful result describing a relation between the basis of a Krylov subspace and the solution of a Sylvester equation, which can be found in [13,Section 3.4] and [10,Section 2.3]. This connection requires the observability of a matrix pair ðS; LÞ 2 C rÂr Â C pÂr . This is, e.g. given (see, e.g. [4,Chapter 4]), when the corresponding observability matrix has full rank. One then has the following theorem. . Given the expansion point s 0 2 C, such that s 0 is not an eigenvalue of E À1 A, the columns of V 2 C nÂr form a basis of a rational Krylov subspace if and only if there exists an observable pair ðS; LÞ, where S 2 C rÂr , L 2 C 1Âr ,which admits the Jordan canonical form J, for an appropriate transformation matrix T 2 C rÂr , such that the Sylvester equation This theorem also extends to the case of multiple expansion points. if and only if there exists an observable pair ðS; LÞ with S 2 C rÂr , L 2 C 1Âr , which admits the Jordan canonical form J, for an appropriate transformation matrix T 2 C rÂr , such that the Sylvester equation is satisfied. Moreover, the reduced model G r ðsÞ ¼ C r ðsE r À A r Þ À1 B r from (2) matches the moments M s i 0 ¼M s i 0 ; i ¼ 0; . . . ; r À 1, if none of the s i is a pole of G r ðsÞ. Here, the eigenvalues of S correspond to the expansion points in moment matching. Following [13,Theorem 3.23], the eigenvalues of S are either interpolation points between GðsÞ and G r ðsÞ or the inverse of common poles between GðsÞ and G r ðsÞ. Considering multiple-input multiple-output (MIMO) systems, the matrix L is of importance, since tangential directions are stored in its columns. Conversely, every solution of a Sylvester equation consisting of an observable matrix pair ðS; LÞ spans a Krylov subspace with expansion points given by the eigenvalues of S.
The following Section 4.3 uses Theorems 4.1 and 4.2 to show a novel connection between moment matching and the time domain MOR framework based on orthogonal polynomials.

Equivalence of SYLTDMOR2 and Moment Matching
In this section, we apply Theorems 4.1 and 4.2 to the derived Sylvester equations (11) and (12) from Section 3. In the moment matching MOR method, it is assumed, that the initial state vector is Since this condition is also required for SYLTDMOR2, we only need to set the initial state vector to zero for the remaining time domain MOR approaches to compare these methods. Recall, that the approximation of the initial state was only needed to derive a square linear system of equations. For a consistent initial state, it is thus redundant and the matrix H in the linear system (9) is actually singular. Another restriction, we make in this paper, is to set (without loss of generality) the time interval to t ¼ ½0; 1. Note that for the general case t 1 2 ½t 0 ; t f , this can always be obtained by the simple transformation t 1 To obtain the structure of the Sylvester equations (14) or (15) from Theorems 4.1 and 4.2, it is necessary to invert theẼ (SYLTDMOR1) andÊ (SYLTDMOR2) matrices containing information about the orthogonal polynomials. Due to the structure of these matrices, it is only possible to invert them in the following cases: MatrixẼ is regular for: • Hermite: r odd (otherwise g rÀ1 ðt 0 Þ ¼ 0 and thus we obtain a zero row) • Laguerre: all r • Legendre, Chebychev of first and second kind: r odd (otherwise a zero row is obtained due to linear combination)

MatrixÊ is
• Hermite: always singular • Laguerre: always regular • Legendre, Chebychev of first and second kind: regular for r even (otherwise a zero row is obtained due to linear combination) Explicit representations of the inverse matrices for the different polynomials listed above can be found in [14]. The inverse matrices of the Jacobi polynomials cannot be obtained as easy as for the above mentioned polynomials caused by the structure and the influence of parameters a and b. Therefore we assume to choose a and b, such thatẼ andÊ are invertible. In the following, the Jacobi polynomials are only used to proof the assumptions of Theorems 4.1 and 4.2 since the Legendre and Chebychev polynomials are special cases of these polynomials (see, e.g. [5,Chapter 22]).
Assuming either of the aforementioned cases and exploiting the zero initial state, we rewrite the Sylvester equation (11) as Equivalently, we can rewrite the Sylvester equation (12): Since the eigenvalues of S andŜ are the expansion points only in case of observability, we now have to check the observability of the matrix pairs ðS; LÞ and ðŜ;LÞ. While S andŜ only depend on the choice of the orthogonal polynomial, L andL additionally depend on the expected input uðtÞ, since w 1 ; . . . ; w r are weights of the approximated input (8).
Remark 4.3 In practice, the input often needs to be realized piecewise constant. Therefore we assume, that uðtÞ ¼ 1. Note that any other constant value for uðtÞ only scales the solution and thus changes the basis but not the subspace spanned by V. As a consequence, the ROM stays the same.
Under this condition, we compare the numerical ranks of the associated observability matrices for certain orthogonal polynomials. Here, the tolerance of MATLAB®s rank function was set to 10 À20 to ensure a good rank estimation. 1 The differences between the reduced orders r and the numerical rank of the observability matrix are depicted in Figure 1. For the matrix pair ðS; LÞ, we only consider an odd order r, for ðŜ;LÞ, only an even order r due to the invertibility conditions for the matricesẼ andÊ. There are two exceptions: Since theẼ andÊ matrices are always invertible in case of Laguerre polynomials, their numerical rank is plotted for all r. Even though theÊ matrix using the Hermite polynomials is not invertible at all, we use here matrix pair ðS;LÞ from equation (17) and thus observability matrix (18) instead to obtain its numerical rank.
In both subfigures it is easy to see that the Legendre and both types of Chebychev polynomials always lead to a full numerical rank. In case of these polynomials, the rank is only plotted with one mark, because the result is always the same. The Laguerre polynomials show the same behaviour in both figures for all r. In contrast, the Hermite polynomials in SYLTDMOR2 only have a full numerical rank if the reduced order r is small enough, i.e. r 14. For SYLTDMOR1, their numerical rank is full only if r 27.
Note, that Figure 1 only presents numerical ranks. We now, considering only our proposed new variant SYLTDMOR2, prove the full rank of the observability matrices for the Jacobi, Laguerre, Legendre and Chebychev polynomials of first and second kind rigorously. We will also show that for the Hermite polynomials, theoretically, the rank of ObðS;LÞ is always full as opposed to the numerical rank.
To overcome the difficulties with the singularity ofÊ for some kinds of orthogonal polynomials, we will rewrite Sylvester equation (12) as whereS Thus, the solution of the Sylvester equation (17) is a basis of a Krylov subspace with expansion points 1 s i for i ¼ 1; . . . ; r.

Hermite Polynomials
Since for these polynomials theÊ matrix is singular for all r, we choose Sylvester equation (17) to prove the equivalence to moment matching. As mentioned above, we assume the input to be chosen piecewise constant and thusL Due to this special structure, the ðr Â rÞ observability matrix is a diagonal matrix with non-zero entries and thus ðS;LÞ is observable. Converting this problem back to Sylvester equation (16), the expansion points are generalized eigenvalues of ðÀÊ; I r Þ and thus inverse eigenvalues of ðÀI r ;ÊÞ, whereÊ ¼ ÀS. SinceÊ is a strict upper triangular matrix, all eigenvalues of ðÀI r ;ÊÞ are zero and thus all expansion points are 1 (see, e.g. [15] ([16, Section 1.6 4]).

Laguerre Polynomials
Since theÊ matrix for these polynomials is always invertible, we choose Sylvester equation (12). The explicit inverse tô is given by the upper triangular matrix . . . SinceŜ ¼ ÀÊ À1 by definition and due to the choice of piecewise constant inputs, we haveL : Observing that the first rows in the powers ofŜ can be written in terms of binomial coefficients and using the sum formula for integers k; l; n ! 0 (see, e.g. [17, Chapter 1]), and the properties for integers n; k ! 1, we obtain a structured observability matrix ; that is known as the Pascal matrix. It can be shown (see, e.g. [18]) that the LU decomposition of this matrix leads to its triangular factors being triangular Pascal matrices, and thus the determinant is always 1. Consequently, ObðŜ;LÞ always has full rank and we have established the equivalence to moment matching choosing the expansion points as eigenvalues ofŜ, i.e. s 0 ¼ 1.

Jacobi Polynomials (Including Legendre and Chebychev Polynomials)
In this case we choose Sylvester equation (17) to avoid problems with a singularÊ. As for the Hermite polynomials we considerL ¼ 1 0 . . . 0 ½ : For this class of orthogonal polynomials, theS matrix has the same structure and only differs in its entries α i ; β i and γ i for i ¼ 1; . . . ; r : . . . γ rÀ1 β rÀ1 α rÀ1 γ r β r : We now prove the full rank of the observability matrix by induction.
Base clause: If we compute the observability matrix for certain small r with the above mentionedL andS, we can clearly see a structure, namely: These matrices are lower triangular and obviously have full rank, because α i Þ0 by definition.
Induction hypothesis: Now, assume that the observability matrix of size r Â r has a lower triangular structure and full rank with Ã representing the non-zero entries. Induction step: To prove the lower triangular structure and the full rank of the observability matrix of size ðr þ 1Þ Â ðr þ 1Þ, we only need to have a closer look at the r þ 1 st row and column because the r Â r block is unchanged. Since the first row is the first unit vector by definition andS is a tridiagonal matrix, the first r entries of the last columns are equal to zero. The last row is computed by multiplying the r th row withS. Due to the tridiagonal structure the first r entries are non-zero if α i ; β i and γ i are non-zero for i ¼ 1; . . . ; r. The r þ 1 st entry is obtained by multiplying Thus, the observability matrix is a lower triangular matrix whose diagonal entries are non-zero since α i Þ0 "i, which proves its full rank and completes the induction. The proof for Legendre and Chebychev polynomials is omitted since these polynomials are special cases of the Jacobi polynomials (see, e.g. [5,Chapter 22]).
For SYLTDMOR2 presented in Section 3.2, we can apply Theorem 4.1 directly for the Hermite and Laguerre polynomials for SYLTDMOR2, since all generalized eigenvalues of ðÀI r ;ÊÞ are 1 and 1 respectively, and obtain an equivalence. Regarding the Legendre, Chebychev polynomials of first and second kind, the Jacobi polynomials in both approaches and the Laguerre polynomials for the original approach proposed in [1], we first have to prove the distinct eigenvalues before applying Theorem 4.2. Since this is hard to verify, we show in Figure 2 in the left subfigure the minimal distance between the generalized eigenvalues of ðÀÃ;ẼÞ with matrices arising in Sylvester equation (11) in SYLTDMOR1 and in the right subfigure for the matrix pencil ðÀI r ;ÊÞ, whereÊ occurs in Sylvester equation (12) in the SYLTDMOR2 algorithm, for a reduced order r ¼ 1; . . . ; 1000. A reduced order r > 1000 is not desirable, since reducing the original system and simulating the ROM is done with dense matrices of dimension r and thus the computational costs become Oðr 2 Þ or even Oðr 3 Þ, when using an implicit solving scheme, which is way to too expensive. Since, in the right subfigure, the minimal distance between the eigenvalues up to a reduced order of r ¼ 1000 is clearly larger than zero, even in finite precision, we conclude that one can apply Theorem 4.2 for the practically relevant r. Considering the Jacobi polynomials, we assume that a; b > À 1 are chosen such that all eigenvalues are distinct. Summarizing these findings, we have proven the equivalence to a Krylov subspace MOR method for SYLTDMOR2 presented in Section 3.2.  Nevertheless, we have also shown the observability numerically in the left subfigure of Figure 1 for the Laguerre, Legendre and Chebychev polynomials of first and second kind. In the left subfigure of Figure 2 we can also see, that the minimal distance between the generalized eigenvalues of ðÀÃ;ẼÞ is clearly larger than zero in case of Legendre and Chebychev polynomials of first and second kind. Hence, for these polynomials the eigenvalues are distinct. In case of the Laguerre polynomials, the minimal distance seems to decrease, but for a reduced order of r ¼ 1000 the minimal distance is approximately 10 À3 . It is thus too large for a computational error from round off accumulation. Hence, also the generalized eigenvalues for the Laguerre polynomials are distinct for all relevant reduced orders r. Thus, we conjecture the equivalence to moment matching for the TDMOR approach of Jiang and Chen [1] and thus SYLTDMOR1: Conjecture 4.5 (Equivalence between SYLTDMOR1 and moment matching) Consider (1) with piecewise constant input and zero initial state.
If we choose • Laguerre polynomials, • Legendre polynomials, odd reduced order, • Chebychev polynomials of first kind, odd reduced order, • Chebychev polynomials of second kind, odd reduced order, then SYLTDMOR1 presented in Section 3.1 is equivalent to the moment matching method, where the expansion points are chosen to be the eigenvalues of the matrix pencil ðÀÃ;ẼÞ arising in Sylvester equation (11) and depend on the choice of the orthogonal polynomials and the reduced order. Remark 4.6. The equivalence between the Laguerre based time domain MOR and moment matching has been proven rigorously by Eid in [12].

Advantages of Moment Matching over SYLTDMOR(1/2) in Practice
Comparing the polynomial based time domain MOR with moment matching, one should keep in mind, that the expansion points can be freely chosen using moment matching. Since we use an IRKA based algorithm to determine the rational Krylov subspace, these expansion points will be optimized fitting to the original LTI system. Computing the reduced system with the time domain framework of Jiang and Chen presented in Section 2 or SYLTDMOR2 in Section 3.2, the expansion points are fixed by the choice of the family of polynomials. Exemplary, these expansion points or equivalently the generalized eigenvalues of matrix pairs ðÀÃ;ẼÞ (SYLTDMOR1) and ðÀI r ;ÊÞ (SYLTDMOR2), are shown in Figure 3 for a reduced order r ¼ 40.
Thus, a variation of the expansion points arising from the time domain approach can be achieved by in-or decreasing the reduced order r or rather using another family of orthogonal polynomials, on the one hand. On the other hand, orthogonal polynomials of higher degree s 2 Z up to degree s þ r À 1 could be used or r arbitrarily chosen orthogonal polynomials. To the best of our knowledge, this has not been tried in the open literature so far.
Considering the choice of expansion points, the time domain MOR framework based on orthogonal polynomials, thus only seems to be a restriction of moment matching to a rather limited set of possible combinations. Thus, in practice IRKA has to be expected to provide more accurate ROMs in almost all cases. Further, IRKA has to be at least as good as the discussed time domain MOR approaches.

Numerical Examples
To illustrate the effectiveness of the time domain MOR techniques based on orthogonal polynomials, but also to confirm our conjecture about the restriction of this method compared to moment matching, we will present results for three well-known test examples.
The basic setup for the first two examples is the same, namely: ; t 2 ½0:1; 0:2Þ: 1 ; t 2 ½0:2; 1 8 < : The test system is always the following:   where yðtÞ and y r ðtÞ are computed using the implicit Euler method (see, e.g. [19,Chapter 2]). Note, that the 1-norm averaged error over time looks comparable and is thus not illustrated. We also show the total time, that was spent to reduce the original LTI system and to solve the reduced LTI system compared to the time, that was spent on solving the original LTI system. In these figures, we compare SYLTDMOR1 and SYLTDMOR2 to the two most important and well accepted methods for stable LTI systems, IRKA (one-and two-sided) and balanced truncation. For the one-sided IRKA approach, the projection matrix V is computed from the output Krylov subspace, i.e. in case of one expansion point s 0 V is the basis of K r ðA À s 0 EÞ ÀT E T ; ðA À s 0 EÞ ÀT C T . In case of multiple expansion points s 1 ; . . . ; s r V is a basis of S r i¼1 K r i ðA À s i EÞ ÀT E T ; ðA À s i EÞ ÀT C T . This is implemented according to the theory in Section 4 and does not present a restriction to moment matching since upon convergence of IRKA, the expansion points are (locally) optimally placed for the system with respect to H 2 approximation. In the figures, we use the following notations for Chebychev polynomials of first (Chebychev1) and second kind (Chebychev2), the onesided IRKA resp. moment matching (oIRKA/oMM), the two-sided IRKA resp. moment matching (IRKA/tMM) and balanced truncation (BT). Note, that we computed 50 cycles to average the results.

Triple Peak Example
Our first example is the triple peak, sometimes also called FOM, example (see, e.g. [20]), where the dynamical system (1) of order n ¼ 1 006 is given by E ¼ I n ; A 2 R 1 006Â1 006 a block diagonal matrix of the form ! ; . . . ; À1 000g; and the input and output matrices are . . . ; 10 |fflfflfflfflfflffl{zfflfflfflfflfflffl} 6 ; 1; . . . ; 1 |fflfflffl ffl{zfflfflffl ffl} 1 000 2 R 1Â1 006 : That means xðtÞ 2 R 1 006 and uðtÞ; yðtÞ 2 R. In Figure 4, the 2-norm averaged relative error over time is illustrated for the reformulated time-domain approach SYLTDMOR1, presented in Section 3.1, and its variation SYLTDMOR2, presented in Section 3.2, for Hermite, Laguerre, Legendre and Chebychev polynomials of first and second kind, each compared to the one-and two-sided IRKA and balanced truncation. In both subfigures, we can easily see, that the frequency domain MOR approaches approximate the original system, for a reduced order r ! 11, much better than the time domain approaches. Only the Legendre and the two Chebychev polynomial families in SYLTDMOR2 show a considerable decay of the relative error ending up with a relative error of 10 À10 for a reduced order r ¼ 40. The same orthogonal polynomials in SYLTDMOR1 have only a slight decay of the relative error ending up at around 10 À5 for a reduced order r ¼ 40. Compared to this, balanced truncation and both IRKA approaches show a nicer decay ending with a relative error of 10 À12 for r ! 28 and are thus at least 2 orders lower compared to the Legendre and both Chebychev polynomials for both represented time domain approaches. Regarding the Laguerre polynomials, the relative error does not seem to change for an increasing reduced order r and stagnates around 10 À1 . This phenomenon can be explained, if we look at their differential recurrence coefficients stated in Table 1. These coefficients are constant and do not depend on the reduced order r such that matrixŜ is always a triangular matrix with only 1 on the diagonal as seen in Section 4.3.2. Thus the only expansion point of these polynomials for SYLTDMOR2, given by the multiple eigenvalue ofŜ, is 1 with multiplicity r. That means in the sense of moment matching, that moments up to order r are matched. But on the other hand all other important frequencies are ignored leading to a bad approximation. Considering SYLTDMOR1 it might be possible that this effect is also caused by the clustering of the eigenvalues. A special case are the Hermite polynomials, since they are only illustrated until r ¼ 24. The reason for this is the extreme condition number, i.e. numerical singularity, of the matrices H andĤ, which makes it impossible to solve the linear system of equations.
To get an impression how the condition numbers grow by increasing the reduced order, the 2-norm condition numbers κ 2 ð:Þ using Hermite polynomials are shown for the triple peak example presented in Section 5.1 and the example of the following Section 5.2 for the matrices A; H andĤ in Figure 5. Since in MATLAB® a matrix is numerically not invertible for κ 2 ð:Þ > 10 16 , this bound is added in the subfigures of Figure 5. Nevertheless, we can easily see in Figure 4 that the Hermite polynomials have a large relative error of around 10 À1 in SYLTDMOR1 and about 10 0 in SYLTDMOR2 at r ¼ 24, such that the original system is not approximated well.
The total time, that is spent on reducing the original system and solving the reduced system, is illustrated in Figure 6 and is compared to the time, that is needed to solve the original ð1006 Â 1006Þ system. This figure is divided into three subfigures containing the reduction and solution times for determining V by solving the huge ðnr Â nrÞ linear system of equations (9) with MATLAB®'s backslash operator and Sylvester equations (11) and (12) with the method from [21]. In all three subfigures of Figure 6 it is easy to see, that all solution methods are faster than solving the original system. Comparing the time with the backslash operator and solving a Sylvester equation, irrespective whether SYLTDMOR1 or SYLTDMOR2 is chosen, the Sylvester solver is, for r ¼ 40, up to two times faster than solving with MATLAB's® backslash. For a reduced order of r ! 34, there is an increase of the total time for both IRKA variants. The reason for this can be found in Figure 4, since the relative error of these methods is close to machine precision and thus an improvement of the relative error is not possible.
Basically, this example demonstrates that the time domain MOR framework [1] reduces the LTI system fast. Further, these time domain MOR approach might be even faster by inserting the eigenvalues of matrix pencils ðÀÃ;ẼÞ or ðÀI r ;ÊÞ directly as expansion points in the moment matching method ending up with the same projection matrix according to Theorem 4.4 and Conjecture 4.5. This Could be implemented by precomputing the generalized eigenvalues and saving them in a data base, such that they are quickly available. Here, the Jacobi polynomials are excluded, since they require a further analysis of the choice of parameters a and b. Nevertheless, this example also demonstrates that the time domain MOR approaches are less accurate compared to IRKA or balanced truncation.

Triple Chain Example
The second example is the triple chain example from [23], i.e. three mass-springdamper chains of length 200 are fixed by one coupling mass. Since this example, which is parametrized as in [22], results in a second order systems M€ xðtÞ þ D _ xðtÞ þ KxðtÞ ¼ BuðtÞ; yðtÞ ¼ CxðtÞ; (19) we transform it into the first order system Here, the matrices of the second-order system (19) are given by a diagonal matrix M 2 R 601Â601 containing the masses, the damper matrix D 2 R 601Â601 and the stiffness matrix K 2 R 601Â601 . The input and output matrices are again transposes of each other given by Hence, E; A 2 R 1 202Â1 202 ; B T ; C 2 R 1Â1 202 ; zðtÞ 2 R 1 202 and uðtÞ; yðtÞ 2 R. Figure 7 illustrates the 2 -norm averaged relative error over time for the time domain MOR approaches compared to the frequency domain MOR methods as mentioned in the previous example. Again, the relative error using Laguerre polynomials shows only minor changes and is for a reduced order r ¼ 40 at around 10 À1 using SYLTDMOR1 or SYLTDMOR2. Similar to the Laguerre polynomials, the Hermite polynomials do not approximate the original system well since the relative error is, especially in case of SYLTDMOR2, too large, namely 29 for a reduced order r ¼ 24. In case of SYLTDMOR1, the relative error decreases to around 10 À2 , but since the H matrix is numerically singular, it is not possible to determine further projection matrices. In both subfigures, the 2-norm averaged relative error over time decreases, if Legendre or Chebychev polynomials of first or second kind are used. This relative error is around 10 À3 for SYLTDMOR1 and 10 À4 for SYLTDMOR2, for a reduced order r ¼ 40. Comparing both time domain approaches to the one-sided IRKA algorithm, we can clearly see that the relative error using moment matching is always at least as small as in 5 10 15  case of the time domain MOR, but mostly even smaller. Looking at the two-sided IRKA algorithm and the balanced truncation method, we see, that these methods are even more successful since they have a steeper decrease of the relative error ending up with a relative error of order 10 À8 , which is 4 orders of magnitude lower than for the onesided method. We now take look at the total time, i.e. the time needed to reduce the original system and to simulate the reduced system. In the upper subfigure of Figure 8, illustrating the total time for the backslash solver, we can easily see that, if we use the time domain approach with Hermite and Laguerre polynomials, we are faster than solving the original system. Unfortunately, these polynomials do not approximate the original system well. Using the Legendre or one of the Chebychev families, this approach is faster than simulating the original system until r ¼ 21. Hence, considering a larger reduced order is not effective any more. If we use one of the Sylvester solvers instead, we can achieve a high speed-up, e.g. choosing a reduced order r ¼ 40, we can reduce the system and simulate the reduced system up to 19 times faster than simulating the original system. This also holds for the Hermite and Laguerre polynomials. The former ones are only considered up to a reduced order r ¼ 24. Regarding both IRKA implementations and balanced truncation, it is clearly visible that these MOR techniques consume more time than the time domain approaches using Sylvester equations. Nevertheless, these methods are still faster than simulating the original triple chain example.

Butterfly Gyroscope Example
A more practice oriented and larger example is given by the butterfly gyroscope example from the Oberwolfach benchmark collection for model order reduction (see, [24]) described in [25], which represents a vibrating micro-mechanical gyroscope. The device consists of a three-layer silicon wafer stack. Its middle layer contains the sensor element, which consists of two wing pairs that are connected to a common framethe reason the gyro is called butterfly. This example is given as second-order system (19) and transformed into a first order realization Here, matrices M; D; K 2 R 17 361Â17 361 of the second-order system are symmetric matrices, the input is the vector B 2 R 17 361 and output matrix is given as C 2 R 12Â17 361 . In order to obtain a SISO system, we only consider the first row of C.
Hence the matrices and vectors of the first order system are of dimension E; A 2 R 34 722Â34 722 ; B T ; C 2 R 1Â34 722 ; zðtÞ 2 R 34 722 and uðtÞ; yðtÞ 2 R.
However, this can easily be transformed to a time interval t 2 ½0; 1 using the transformation t 1 ¼ t 1 0:005 with time step τ ¼ 0:001 as in the previous examples. The initial state x 0 and the input uðtÞ are chosen as in the previous examples. Furthermore, we computed only one cycle instead of 50 to limit the computation time, since external effects are expected to be far less influential in this case.
In Figure 9, the 2-norm averaged relative error over time is illustrated. These subfigures differ from the ones from the previous examples. Here, each subfigure consists of two plots, where the upper plot is logarithmically scaled from 10 3 to 10 300 and the lower plot is logarithmically scaled from 10 À6 to 10 3 . The reasons for this unusual scaling are large differences of the relative errors of the time domain MOR approach using orthogonal polynomials compared to the remaining MOR methods as IRKA and balanced truncation. While the maximum relative error of the remaining methods is around 10 3 , the minimum relative error of the orthogonal polynomials is in the same area and increases even up to an order of 10 294 . Furthermore, it is even possible that this method produces unsuitable, i.e. non stable ROMs. Then, the relative error is given by NaN and omitted in the plot. Compared to this, the relative error for the balanced truncation method and both IRKA approaches nicely decreases. Looking at reduced order r ¼ 40, balanced truncation and the two-sided IRKA method end up  with a relative error of around 10 À5 and the one-sided IRKA approach with a relative error of around 10 À2 . Thus Figure 9, illustrates impressively that the orthogonal polynomial based time domain MOR framework fails completely for this example. Figure 10 illustrates the total time that is needed to reduce and simulate the reduced system. Just like in the previous examples, SYLTDMOR1 and SYLTDMOR2 are clearly faster than using one of the IRKA approaches or the balanced truncation method. Still, the increased computation time is a price worth paying, since the latter methods produce suitable and reliable ROMs in all test cases. Note, that we cannot report the total time using MATLAB®'s backslash operator, since the computations became too memory consuming even for our well equipped test system.
If we now take the relative errors illustrated in Figure 9 into account, IRKA and balanced truncation are clearly more desirable than the time domain MOR techniques, because all of them approximate the original model behavior far better.

Observations
The numerical examples, reported above, give us an impression of the effectiveness of the time domain framework of Jiang and Chen and its variation SYLTDMOR2 compared to other frequency domain MOR techniques. Regarding the relative errors, the one-and two-sided IRKA algorithm and balanced truncation always have a stronger decrease and thus (almost) always smaller relative errors. The fact, that these methods consume more time than the time domain approaches is negligible, since these methods (nearly) always performed faster than solving the original system.
Note, that we also tried to implement the iterative splitting method proposed in [1], but our implementation never converged for the above mentioned examples. Thus, we are not able to compare them with our proposed solving techniques. Since solving with MATLAB®s backslash operator is time-consuming, we also tried to solve (9) using the preconditioned Generalized Minimum Residual (GMRES) method. But since the matrix H in (9) is ill-conditioned, we were not able to find a good preconditioner, such that (9) can be solved fast.
The aim of model reduction in our context is to find a reduced system approximating the original system well, such that the time, that is spent on reducing and solving the reduced system, is less than simulating the original system. SYLTDMOR1 and its variation SYLTDMOR2 clearly consume less time than solving the original system, but the relative error either decreases very slowly as in the first two examples and thus the reduced order needs to be comparably large, or the computed ROM is not suitable at all. This behaviour can be easily explained when looking at the trajectories of all examples illustrated in Figure 11. The trajectories of the triple peak and the triple chain example can easily be expressed by using low-order polynomials. In contrast to this, the trajectory of the butterfly gyroscope example is oscillating rather fast and thus it requires higher order polynomials to approximate the solution properly, which are not presented in the bases generating low-order models following the Jiang/Chen framework.

Concluding Remarks
In this paper, we picked up the time domain MOR framework based on the idea of Jiang and Chen in [1] and transformed the resulting huge linear system of equations into a Sylvester equation, that can be solved very efficiently. A slight variation of the formulation leads to another even nicer Sylvester equation considering a fixed initial state and leading to easier structure in the small coefficient matrices. Using the duality theorem, we show a connection of the time domain MOR methods to moment matching, but we also illustrate that the expansion points created by the time domain approaches cannot adapt to the system and thus the time domain approaches in this paper cannot keep up with proper moment matching, which is only one iteration step of IRKA.

Note
1. Machine precision, i.e. a tolerance of % 10 À16 , turned out to give unreliable rank decisions in the numerical experiments.

Code Availability
The MATLAB® implementation used to compute the presented results can be obtained from doi: 10.5281/zenodo.1243090 and is authored by: MANUELA HUND and JENS SAAK.