Enhancing feedforward controller tuning via instrumental variables: with application to nanopositioning

ABSTRACT Feedforward control enables high performance of a motion system. Recently, algorithms have been proposed that eliminate bias errors in tuning the parameters of a feedforward controller. The aim of this paper is to develop a new algorithm that combines unbiased parameter estimates with optimal accuracy in terms of variance. A simulation study is presented to illustrate the poor accuracy properties of pre-existing algorithms compared to the proposed approach. Experimental results obtained on an industrial nanopositioning system confirm the practical relevance of the proposed method.

ILC algorithms update the feedforward signal by learning from previous tasks under the assumption that the task is repetitive. ILC consequently enables superior performance with respect to model-based feedforward for a specific task by compensating for all repetitive disturbances. However, changes in the reference signal typically result in significant performance deterioration (see, e.g. Hoelzle, Johnson, & Alleyne, 2014). Motion systems are typically confronted with similar yet slightly different reference signals (Lambrechts, Boerlage, & Steinbuch, 2005;Oomen et al., 2014). In contrast to ILC, modelbased feedforward results in moderate performance for a class of reference signals instead of only one specific CONTACT Frank Boeren f.a.j.boeren@tue.nl reference (Butterworth et al., 2012). Note that the performance for model-based feedforward is highly dependent on the model quality of the parametric model of the system and the accuracy of model-inversion (Devasia, 2002). By introducing basis functions in ILC, the advantages of model-based feedforward and ILC are combined in van de Wijdeven and Bosgra (2010). This approach is further improved in van der Meulen et al. (2008), where the need for an approximate model of the system, as is common in ILC, is eliminated by exploiting results from iterative feedback tuning (Bazanella, Campestrini, & Eckhard, 2012). The iterative feedforward control approach in van der Meulen et al. (2008) is extended to multivariable systems and input shaping in Heertjes, Hennekens, and Steinbuch (2010) and Boeren, Bruijnen, van Dijk, and Oomen (2014), respectively. However, in Boeren, Oomen, and Steinbuch (2015), it is shown that the least-squares algorithm used in van der Meulen et al. (2008) can lead to a bias error in the estimated parameters. A new algorithm is proposed in  based on instrumental variable (IV) techniques that results in unbiased parameter estimates. However, accuracy in terms of variance of the estimate has not yet been investigated.
Although iterative feedforward control based on instrumental variables is promising for motion control, existing approaches suffer from poor accuracy properties in terms of variance. This severely limits the practical applicability of existing approaches in case of noisy signals. This paper aims to reveal non-optimal accuracy for the approaches in  and van der Meulen et al. (2008), and develop an algorithm that leads to optimal accuracy in the presence of noise.
The contributions of this paper are fourfold. First, an analysis is provided to show that the approaches in  and van der Meulen et al. (2008) lead to poor accuracy in terms of variance. Therefore, variance results developed in open-loop identification (Söderström & Stoica, 1983, Chapters 5 and 6) and closed-loop identification (Forssell, 1999;Gilson & Van den Hof, 2005) are extended towards iterative feedforward control. As a second contribution, this insight in the accuracy aspects is exploited to develop an algorithm that achieves optimal accuracy. The proposed algorithm (1) exploits an iterative refined instrumental variable (RIV) method that is similar to the approaches presented in Young (1976Young ( , 2015, Young and Jakeman (1979), Jakeman and Young (1979), and Young and Jakeman (1980), and (2) is closely connected to the estimation of inverse systems (see, e.g. Jung & Enqvist, 2013). Third, a simulation study is presented to (1) illustrate that the proposed algorithm leads to enhanced accuracy properties compared to pre-existing approaches and (2) confirm the significance of the accuracy of parameter estimates on the achievable performance. As a final contribution, experimental results confirm the practical relevance of the proposed algorithm. This paper significantly extends earlier results reported in  and Boeren, Oomen, and Steinbuch (2014) by thorough proofs and extended experimental results. Related datadriven tuning algorithms are available in Formentin, van Heusden, and Karimi (2013b), Karimi, Butcher, and Longchamp (2008), and Kim and Zou (2013). Furthermore, instrumental variable approaches are often used for estimating the parameters of industrial robots (see, e.g. Janot, Vandanjon, & Gautier, 2014a, 2014bPuthenpura & Sinha, 1986;Yoshida, Ikeda, & Mayeda, 1992). This paper is organised as follows. In Section 2, the problem formulation is outlined. In Section 3, asymptotic expressions for optimal accuracy are developed for feedforward control. To provide a concise presentation of the contributions of this paper, the second contribution is presented before the first contribution is explained. That is, a new tuning algorithm for iterative feedforward control is proposed in Section 4. Then, in Section 5, the accuracy properties of existing approaches are analysed. In Section 6, a simulation study of a motion system is presented to compare the proposed and preexisting approaches. In Section 7, the theoretical results are confirmed by experiments on an industrial nanopositioning system. Finally, conclusions are drawn in Section 8.

Notation:
The variable q denotes the forward shift operator qu(t) = u(t + 1). For a vector x, ||x|| 2 where N is the number of samples.

Problem setup
Consider the two degree-of-freedom control configuration as depicted in Figure 1. The true unknown system P(q) is assumed to be discrete-time, single-input singleoutput, and linear time-invariant, with rational representation . The control configuration consists of a given stabilising feedback controller C fb (q), and a feedforward controller C j f f (q). The index j denotes the jth task in a sequence of finite time tasks of length N samples, where j = 0, 1, ..., M. Furthermore, T s denotes the sampling time.
Let r denote the reference signal. Typically, r is designed as a known nth-order multi-segment polynomial trajectory with constraints on the first n derivatives, as in, e.g. Biagiotti and Melchiorri (2012) and Lambrechts et al. (2005). Also, w j (t) = H(q)ϵ j (t) denotes an unknown disturbance, where H(q) is a monic, asymptotically stable, proper system, and {ϵ j (t)} is normally distributed white noise with zero mean and variance λ 2 ε . Hence, w j and r are uncorrelated. The feedforward signal is denoted by u j f f , while the measured signals e j m and y j m in the jth task are given by and sensitivity function S(q) = (1 + P(q)C fb (q)) −1 . Since P(q) is assumed to be unknown and w j is an unknown disturbance, it is not possible to determine e j r and e j w (resp. y j r and y j w ) based on the measured signal e j m (resp. y j m ).

Iterative feedforward: batch-wise tuning
In iterative feedforward control, the measured signals e j m (t ) and y j m (t ), for t = 1, ..., N, are stored. Hence, the data set that is used for the estimation of the feedforward controller is given by After the jth task is finished, this batch of measured data is used to perform an offline update of the existing feed- before initiating the (j + 1)th task.
The update C f f (q, θ ) is given by with parameters θ = [θ 1 θ 2 . . . θ n θ ] T ∈ R n θ , and polynomial basis functions To illustrate a typical selection of the parameter vector θ and basis functions (q), consider the following example that is aimed at feedforward control for motion systems.

Example 2.1:
Let C j f f (q) = 0, i.e. only a feedback controller C fb (q) is used in task j, and let r(t) be a fourthorder reference trajectory. A typical parametrisation of C f f (q, θ ) is given by with basis functions j s ] T , and sampling time T s . The parametrisation in (5) consists of acceleration feedforward with acceleration a(t) = ψ a (q −1 )r(t), i.e., the second derivative of r(t), and snap feedforward with snap s(t) = ψ s (q −1 )r(t), i.e. the fourth derivative of r(t). Furthermore, θ a denotes the mass of the system, while θ s denotes the snap parameter. As such, C f f (q, θ ) in (5) can compensate for the dominant component of the referenceinduced error (see, e.g. Lambrechts et al., 2005). Note that double (and fourth) differentiation is possible since r(t) is a deterministic and known signal. The corresponding feedforward signal u ff (t) is given by The approach proposed in this paper aims to estimate the parameters θ a and θ s based on measured data.
The use of feedforward is a standard approach to obtain a small error signal e j m (t ) when tracking a reference trajectory r(t) (see, e.g. Steinbuch & Norg, 1998). By subdividing e j m (t ) into e j r (t ) and e j w (t ), as defined in (2), it is immediately clear that C j+1 f f (q, θ j+1 ) has no influence on e j w (t ). Indeed, the goal of feedforward control is to determine a C j+1 f f (q, θ j+1 ) that minimises the referenceinduced error e j+1 r (t, θ j+1 ) for t = 1, ..., N in a suitable sense. Given the definition ofC j+1 f f (q, θ j+1 ) in (3), the aim of this paper is to determine θ j + 1 such that e j+1 r (t, θ j+1 ) is as small as possible. It directly follows from (2) that and e j+1 r (t, θ j+1 ) = 0 for all t if C j+1 f f (q, θ j+1 ) = P −1 (q). However, since P(q) is assumed to be unknown, it is not possible to determine either P −1 (q), or determine e j+1 r (t, θ j+1 ) before initiating the (j + 1)th task. Instead, the measured signal e j m (t ) in the jth task, contaminated by w j (t), is used in an optimisation problem to determine C j+1 f f (q, θ j+1 ). That is, the aim in iterative feedforward control is to determine the parametersθ in (4), before starting the (j + 1)th task, from the optimisation problemθ where the criterion V(θ ) is based on the stored signals e j m (t ) and y j m (t ) for t = 1, ..., N, as measured in the jth task. Then, C j+1 f f (q, θ j+1 ) is determined according to (3). Next, two assumptions are introduced.
Assumption 2.1: C fb (q) is designed such that S(q)H(q) = 1, where the noise model is parametrised as Assumption 2.2: The true unknown system is given by = 1 1 + a 1 q −1 + · · · + a n q −n .
Concerning Assumption 2.1, recall from (6) that C j f f (q, θ j ) aims to minimise e j r (t ), i.e. the error induced by the known r(t). In the optimal case, it holds that e j r (t ) = 0 for all t. In contrast, the main goal of the feedback controller C fb (q) is to compensate for w j (t) in view of minimising e j w (t ). These disturbances are assumed to be stochastic with a certain spectrum. Typically, the dominant disturbances are in the low-frequency range, e.g. due to amplifier noise (Fleming, 2014), cable slab, commutation errors, or immersion water-flow in lithographic applications. Then, C fb (q) is designed to compensate for these disturbances, in which case the optimal result is e j w (t ) = ε j (t ), i.e. the error signal being white noise. Clearly, this corresponds to S(q)H(q) = 1. Similar approximations are used in the identification of robotics (see, e.g. Janot, Gautier, Jubien, & Vandanjon (2014)). Assumption 2.1 can be achieved by, e.g. using traditional PID tuning, possibly with error-based retuning (van de Wal, van Baars, & Sperling, 2000), and common LQG control designs (Åström, 1970, Section 6.2). In a typical approach to design C fb (q) such that S(q)H(q) = 1, e j m (t ) is measured in an experiment where r(t) = 0 for all t. For this case, e j r (t ) = 0 for all t, and consequently e j m (t ) = e j w (t ). Then, C fb (q) is tuned until e j m (t ) = ε j (t ), i.e. e j m (t ) being white noise. Concerning Assumption 2.2, the assumption that P(q) = 1/A 0 (q) implies that there is a θ j + 1 such that C j+1 f f (q, θ j+1 ) = P −1 (q). This result immediately follows by observing that P −1 (q) = A 0 (q) and C j+1 f f (q, θ j+1 ) is restricted to a polynomial parametrisation as in (3). This assumption may appear as a stringent requirement on P(q). However, for a general class of motion systems as described in Lambrechts et al. (2005), the reference signal r(t) has a dominant low-frequency signal content, and P −1 (q) can be accurately described by an accelerationdependent and snap-dependent term (Boerlage, Tousain, & Steinbuch, 2004). That is, high-frequency dynamic aspects are concealed by the specific input design of r(t) (see also Hjalmarsson (2009) for a further explanation of this aspect). These results will be corroborated by the simulation example in Section 6 and the experimental results in Section 7, where it is shown that the reference-induced contribution to the error signal can be almost completely compensated for by using C j+1 f f (q, θ j+1 ) as in (3) of low degree. The proposed approach can be extended by allowing a more general parametrisation for C j+1 f f (q, θ j+1 ), as in, e.g. Boeren, Blanken, Bruijnen, and Oomen (2015) and Boeren, Bruijnen, van Dijk, et al. (2014) if a reference is needed with a significant high-frequency signal content.
Throughout, measured data from a single task is used to determineθ according to (7). This approach is pursued since it can effectively handle slow variations that are typically present in a motion system, for example, wear, by means of continuous adaptation of the feedforward parameters. Note that the presented approach can be directly extended to exploit data from multiple tasks (e.g. as in Norrlöf (2001, 2006) and Kushner and Yin (2003)).

Iterative feedforward control
Based on the known r(t) and measured e j m (t ) and y j m (t ) in task j, the predicted errorˆ j+1 (t, θ ) in task j + 1 can be determined as (see Figure 2): In the proposed iterative feedforward control approach, the parameters θ should be estimated directly from measured data as is done in Hjalmarsson, Gevers, Gunnarsson, and Lequin (1998), i.e. without estimating a model of P(q). As such, (8) cannot be directly used to determine θ . Instead, the following estimate of is determined based on the known r(t), and measured e j m (t ) and y j m (t ) in task j.
To show that (9) is a suitable estimate of (8), note that the commutative property of SISO systems enables rewriting y j (10) Moreover, by taking the expectation of (10), it follows that By noting that r(t) is deterministic, it follows that ES(q)P(q)r(t ) = S(q)P(q)r(t ). Furthermore, for w j (t) as defined in Section 2.1, it is immediately clear that EC −1 (q)S(q)w j (t ) = 0(see, e.g. Söderström, 2002, Lemma 4.1). By combining these results, (11) implies that (10) is an unbiased estimator of S(q)P(q)r(t).
In the remainder of this paper, (9) is used as the predicted error in task j + 1. Substituting the parametrisation defined in (4) into (9) and rearranging terms leads to the following estimation equation: where Note that (12) is linear in the parameters θ . Finally, the optimal parameters, denoted as θ 0 , are defined. By subdividing e j m (t ) in (12) into e j r (t ) and e j w (t ), as defined in (2), and rearranging terms,ê j+1 (t, θ ) in (12) is given bŷ Recall from Section 2.2 that the goal of feedforward control is to minimise the reference-induced error e j+1 r (t, θ j+1 ). Given the definition of e j r (t ) in (2) together with the parametrisation for C j+1 f f (q, θ j+1 ) in (3), it directly follows that the reference-induced error contribution in (12) can be expressed aŝ Then, the optimal parameters θ 0 are defined such that (14) is equal to zero for all t, i.e.ê j+1 r (t, θ 0 ) = 0, which directly implies that e j r (t ) = (ϕ j (t )) T θ 0 , and consequently that Note that this definition of

An instrumental variable approach to iterative feedforward control
A general framework is proposed in  for iterative feedforward control based on instrumental variables (IV). The rationale is that unbiased estimates ofθ are obtained without the need for a correct model of w j , which is in contrast to least-squares estimation, including van der Meulen et al. (2008). In IVbased approaches, V(θ ) is typically selected as where z(t ) ∈ R n z are instrumental variables that are uncorrelated with w j , W is a positive-definite weighting matrix, n z ࣙ n θ , L(q) is a prefilter andê j+1 (t, θ ) in (12).
Since r(t) is uncorrelated with w j , the instrumental variables z(t) are in the remainder of this paper selected as a function of (derivatives of) r(t).
Since V (θ ) is quadratic in θ , the minimiserθ of V (θ ) follows from the necessary condition for optimality ∂V (θ ) ∂θ = 0 since W is a positive-definite matrix. By substituting (12) in (16), it is straightforward to show that the minimiserθ of V(θ ) in (16) is given bŷ The key idea behind (16) is that high performance is obtained if z(t), consisting of (derivatives of) r(t), and e j+1 (t, θ ) are uncorrelated. This concept is, in fact, very well known and finds its roots in traditional feedforward tuning techniques in control engineering, as illustrated in the following example.
and parameter θ a . This parametrisation corresponds to acceleration feedforward with acceleration a(t) = ψ a (q −1 )r(t), i.e. the second derivative of r(t), while θ a denotes the mass of the system. Note that double differentiation is possible since r(t) is a known, noise-free signal. The instruments are selected as z(t) = a(t), while W = I, n z = n θ , and L(q) = I. For this specific case, (16) becomes The aim of the IV approach is to determine θ a such that a(t) andê j+1 (t, θ a ) are uncorrelated.
For the considered experimental setup in Section 7, the (normalised) a(t) of a typical r(t) is depicted in Figure 3, together withê j+1 (t, θ a ) based on (1) θ a = 0 (red), and (2) the minimiserθ a of V(θ a ) in (18) (green). Clearly, high performance is obtained withθ a , i.e. when a(t) andê j+1 (t, θ a ) are uncorrelated. In contrast, low performance is obtained for θ a = 0, which corresponds to significant correlation between a(t) andê j+1 (t, θ a ). This shows that the correlation between a(t) andê j+1 (t, θ a ) is a measure for the performance of a motion system.
Next, the asymptotic covariance matrix P IV is derived that is related toθ . Consider the asymptotic distribution ofθ given by where θ 0 is the asymptotic parameter estimate as defined in (15). It can be shown thatθ is a consistent estimator, i.e.θ converges to θ 0 for N to infinity, along similar lines as in, e.g. Söderström, Stoica, and Trulsson (1987). Then, the asymptotic covariance matrix P IV in (19) is given by with and reference-induced part ϕ j r (t ) of ϕ j (t) in (13) given by with y j r (t ) as in (2). A derivation of (20) for iterative feedforward control follows the proof derived in Söderström and Stoica (1989, App. A8.1) for open-loop identification, and Söderström et al. (1987, Section 3) for closed-loop identification.
Note that P IV in (20) depends on the design of z(t) and L(q). Next, a lower bound is derived for P IV as a function of z(t) and L(q), which corresponds to the minimum variance achievable with an IV method for feedforward control.

Optimal design procedure for z(t) and L(q)
The covariance matrix P IV in (20) depends on the design of z(t) and L(q). Optimal accuracy in terms of variance is obtained if z(t) and L(q) are designed such that P IV is equal to an optimal covariance matrix P opt IV , where for any z(t) and L(q), it holds that P IV P opt IV . The optimal covariance matrix P opt IV for iterative feedforward control based on instrumental variables is given by where P opt IV 0. A derivation of P opt IV follows along similar lines as the derivation for open-loop identification in Söderström and Stoica (1989, Chapter 8).
Equivalence between P IV in (20) and ϕ j r (t ) in (22). This result follows by substituting z opt (t), L opt (q), and W opt in (20). Note that this design is not unique, and closely related to the proposed instrumental variables in the RIV algorithms.
The following two observations are made based on z opt (t), L opt (q), and W opt . First, the optimal design reveals that minimum variance is obtained when the number of instruments n z is equal to the number of parameters n θ , and uniform weighting (W = I) is applied for t = 1, ..., N. Based on this observation, W and n z are furthermore selected as W = I and n z = n θ . Then, (20) becomes with R zϕ j and J as in (21). Furthermore, (17) reduces to a basic IV method, i.e.θ Second, the optimal instruments z opt (t) cannot be determined since P(q) is assumed to be unknown in the developed framework. To see this, note that ϕ j r (t ) in (22), and in particular y j r (t ), cannot be determined based on the known r(t), and measured e j m (t ) and y j m (t ) when P(q) is unknown. In the next section, a novel algorithm is proposed that alternates between iteratively updating the estimate of the noise-free z opt (t), and determiningθ , similar to the algorithms described in Young (2015).

Proposed RIV algorithm
The algorithm proposed in this section is an iterative method to achieve optimal accuracy by jointly updating the estimate of the optimal instruments z opt (t) in (24) and solving forθ as in (26). Note that measured data from a single task is sufficient. Related iterative RIV methods are developed in system identification (see  ;Young, 2015;, 1980. Let the index i denote the ith computational iteration of the proposed algorithm. Furthermore,θ <i−1> denotes the parameter estimate in iteration i − 1. In the ith iteration, z opt (t) is approximated by Subsequently, z p, <i > (t) in (27) is used to determineθ <i> in the ith iteration similar to (26): (1). The proposed algorithm to determineθ with optimal accuracy is summarised in Algorithm 4.1.  (13) and e j m (t ) in (1). The comparison study presented in Sections  and  of this paper shows that optimal accuracy is not achieved with the existing approaches in van der Meulen et al. () and Boeren, Oomen, et al. () while the proposed approach can achieve optimal accuracy upon convergence of the proposed RIV algorithm.

Method
Instrumental Covariance matrix P IV Optimal variables accuracy Optimal Before initiating Algorithm 4.1, it is advised to determine an initial parameter θ j by using a linear least squares estimation approach as in van der Meulen et al. (2008). Similar to the RIV algorithms presented in Young (2015), practical use has shown that such an initial estimate is typically sufficient for subsequent convergence of Algorithm 4.1.

Accuracy analysis of the proposed approach
In this section, it is shown that optimal accuracy in terms of accuracy is obtained with Algorithm 4.1. Consider the covariance matrix P IV, p corresponding to z p, <i > (t) as given in Table 1. The covariance P IV, p follows by substituting (27) in (25). Clearly, optimal accuracy for the proposed method, i.e. P IV, p equal to P opt IV in (23), is obtained if z p, <i > (t) converges to z opt (t) in Algorithm 4.1.
To show that z p, <i > (t) converges to z opt (t) in subsequent iterations of the proposed algorithm, substitute (2) and (22) in (24) where the last equality is obtained by using the commutative property of SISO systems. Then, the difference between z opt (t) and z p, <i > (t) in the ith iteration can be expressed as and z p, <i > (t) = z opt (t), i.e. optimal accuracy, is obtained if It remains to be shown that (31) holds, i.e. z p, <i > (t) converges to z opt (t), to guarantee that Algorithm 4.1 results in optimal accuracy. Recall from Section 3 that θ <1> in iteration i = 1 is a consistent estimator, i.e.θ <1> = θ 0 for N to infinity, and that consistency ofθ <1> implies that Substituting (29) in (29) with i = 2 and rearranging terms illustrates that z p, <2 > (t) = z opt (t). This result shows that optimal accuracy in terms of accuracy is achieved with Algorithm 4.1.

Remark 4.2:
For finite N,θ <1> is not exactly equal to θ 0 and multiple iterations are typically required to refine z p, <i > (t). Practical use of the algorithm shows good convergence properties. This is in accordance with the convergence analysis for closely related iterative RIV algorithms (Young, 2015).

Design procedure
Next, Algorithm 4.1 is embedded in a design procedure to determine C j+1 f f (q, θ j+1 ) according to (7) with V(θ ) in (16). This design procedure implements the main contribution of this paper, and is given next.

Remark 4.3:
Procedure 4.1 is based on measured data from a single task. The proposed procedure can directly be implemented as a (batch-wise) recursive approach, where measured data from multiple tasks is used to reduce the variance ofθ .

Accuracy analysis of existing approaches
In this section, the accuracy properties of the iterative feedforward tuning approaches in van der Meulen et al. (2008) and  are compared with the optimal approach derived in Section 3. An overview of this comparison is provided in Table 1.

Accuracy analysis of the approach in Boeren, Oomen, et al. (2015)
The instrumental variable approach in  uses as instruments z 1 (t) = (q)r(t), with (q) the basis functions of C j f f (q), and L 1 (q) = 1. The covariance matrix P IV, 1 corresponding to this design follows by substituting z 1 (t) and L 1 (q) = 1 in (25), and is given by Based on (31) and (23), it can be shown that P IV,1 P opt IV , i.e. the approach in  results in non-optimal accuracy in terms of variance.

Accuracy analysis of the approach in van der Meulen et al. (2008)
The iterative feedforward approach proposed in van der Meulen et al. (2008) utilises L 2 (q) = 1 and z 2 (t) = ϕ 2 (t) as instruments, where ϕ 2 (t ) = (q)(C f b (q) + C j f f (q)) −1 y m (t ) is constructed based on measured data obtained in an additional task. As such, measured data from two tasks is required to determineθ , which results in a 1/ √ 2 reduced accuracy compared to the optimal approach in Section 3. To see this, note that the asymptotic distribution ofθ corresponding to z 2 (t) yields based on two tasks of each N samples, where P IV, 2 yields In contrast, the asymptotic distribution corresponding to the optimal instruments z opt is given by based on two tasks of each N samples. Comparing (32) and (33) reveals a 1/ √ 2 reduced accuracy for the approach based on z 2 (t) when compared to the optimal approach. This result confirms that non-optimal accuracy is obtained with the approach proposed in van der Meulen et al. (2008).

Simulation example
In this section, a simulation study is presented to (1) Confirm that unbiased parameter estimates are obtained for all considered IV approaches in Table 1,  (2) Show that the proposed approach z p, <i > (t) leads to enhanced accuracy of the parameter estimates compared to the pre-existing approaches based on z 1 (t) and z 2 (t), (3) Illustrate the close relation between the statistical accuracy ofθ and the performance of a motion system.

System description
Consider the system P(q) given by P(q) = 1.761 × 10 −9 1 − 3.69q −1 + 5.225q −2 − 3.38q −3 + 0.8451q −4 , which represents a two-mass spring damper system with non-collocated dynamics (see Figure 4 for a Bode plot). A schematic illustration of the two-mass spring damper system is depicted in Figure 5. The feedback controller C fb (q) is given by Recall from Section 2.2 that the unknown disturbance w j (t) is assumed to be given by w j (t) = H(q)ϵ j (t). Here, H(q) is designed such that Assumption 2.1 holds, i.e. H(q) = 1 + P(q)C fb (q), and {ϵ j (t)} is normally distributed white noise with zero mean and standard deviation λ ϵ = 2.5 × 10 −8 . The system is excited by a thirdorder reference signal, designed as in Biagiotti and Melchiorri (2012). Furthermore, the feedforward controller The corresponding feedforward controller is denoted as C j f f (q,θ j ).

Simulation results: parameter estimation
The results of the Monte Carlo simulation study are given in Figure 6 and Table 2. The following observations are made: (1) Unbiased estimates of θ 0 , i.e.θ j = θ 0 , are obtained for z p, <i > (t), z 1 (t), and z 2 (t). This confirms that all considered IV approaches in Table 1 lead to unbiased estimates.
(2) The standard deviation ofθ j s is significantly smaller for z p, <i > (t) when compared to z 1 (t) and z 2 (t). This confirms that the proposed approach leads to improved accuracy in terms of variance. , z  (t) (middle) and the proposed z p, <i > (t) (right) show that the standard deviation ofθ a is comparable for all approaches, while the standard deviation ofθ s is significantly smaller for the proposed z p, <i > (t) compared to z  (t) and z  (t).

Simulation results: accuracy and performance
Next, the relation between the statistical accuracy of the estimated parameters and the obtained performance is analysed for the considered system. Recall from (6) that a high-performance C j f f (q, θ j ) minimises e j r (t, θ j ), i.e. the error induced by the known r(t). Sinceθ j in (34) is an unbiased estimate of θ 0 for all approaches in Table 1, it follows from Section 3 that e j r (t,θ j ) = 0 for all t when C j f f (q,θ j ) is used as feedforward controller. Consequently, (1) implies that which is the best possible result in the developed framework for a fixed C fb (q). However, as already shown in Section 6.2, the estimateθ j l in realisation l can significantly deviate from the sample meanθ j . In this section, the influence of this deviation on the achieved performance is analysed for the IV approaches in Table 1. Suppose thatθ while the feedforward controller is denoted as C j f f (q,θ j wc ). Next, it is shown that enhanced accuracy in terms of variance results in a smaller difference between e j wc (t,θ j wc ) in (36) and e j m (t,θ j ) in (35), i.e. improved worst-case performance.
The error e 1 wc (t ) in task j = 1 and cumulative power spectrum are depicted in Figures 7 and 8, respectively. The following observations are made: Table . Summary of results of Monte Carlo simulation. The mean value ofθ 1 a andθ 1 s in task j =  for z p, <i > (t), z  (t) and z  (t) confirm that unbiased parameter estimates are obtained for all methods, while the standard deviation ofθ 1 a andθ 1 s confirms that an enhanced accuracy is obtained with the proposed approach z p, <i > (t) compared to z  (t) and z  (t).  The error signal e 1 wc (t,θ 1 wc ) in task j =  shows that e 1 wc (t,θ 1 wc ) contains a significant reference-induced component e 1 r (t,θ 1 wc ) for z  (t) (left), while e 1 wc (t,θ 1 wc ) is dominated by e 1 w (t ) for z p, <i > (t) (right). For comparison, the peak value of the error signal with feedback only is given by  ×  − (m). (1) For z p, <i > (t), the contribution of e 1 r (t,θ 1 wc ) to e 1 wc (t,θ 1 wc ) is negligible compared to the contribution of e 1 w (t ). Hence, e 1 wc (t,θ 1 wc ) is similar to the optimal case e 1 m (t,θ 1 ).
Similar results as provided for z 1 (t) are obtained for z 2 (t), and are omitted for brevity. The provided simulation study showed that an enhanced accuracy of the parameter estimates results in a reduced difference between e 1 wc (t,θ 1 wc ) in (36) and e 1 m (t,θ 1 ) in (35). Hence, the worst-case error e 1 wc (t,θ 1 wc ) based on a single set of measured data is improved by using the proposed instruments z p, <i > (t). This confirms that the statistical accuracy properties ofθ j are important for performance.

Experimental results
In this section, the proposed approach in Section 4 is applied to the prototype industrial nanopositioning system depicted in Figure 9. The positioning stage, measurement system, and actuation system are placed on a vibration isolation table to isolate the system from external disturbances originating from the environment.
The positioning stage is magnetically levitated and actuated, and controlled in six motion degrees of freedom (DOFs): three translations (x, y, and z) and three rotations (R x , R y and R z ). Magnetically levitated stages exhibit contactless operation. Therefore, friction (which is typically an important disturbance in motion control) is eliminated.
The actuation system consists of six linear magnetic motors, with an added position offset such that each actuator can also generate a force in the perpendicular direction. The permanent magnets are connected to the vibration isolation table, while the coils are part of the positioning stage.
The measurement system consists of laser interferometers in conjunction with a mirror block, connected to the vibration isolation table and the positioning stage, respectively. This system enables high-accuracy position measurements in all six motion DOFs. In particular, subnanometer resolution position measurements are available for the translational DOFs x, y and z. Throughout, all systems and signals operate in discrete time with a sampling time of T s = 2 × 10 −4 s.

Control configuration and reference signal design
The proposed design procedure for the feedforward controller is applied to the x-direction of the system, i.e. the long-stroke (80 mm) direction of the setup in Figure 9. A  is designed for the multivariable system by means of sequential loopshaping (see Skogestad & Postlethwaite, 2005, Section 10.6) for details. By closing the control loops for the remaining 5 DOFs, i.e. y, z, R x , R y and R z , a single-input, single-output equivalent system P(q) is obtained for the x-direction. The frequency response function of the equivalent system P(q) for the x-direction is depicted in Figure 10. The dynamical response of a linear motion system P(s), where s is the Laplace operator, with proportional damping can be written as a sum of N second-order subsystems with n the number of rigid-body modes, c T i the ith column of the output matrix C ∈ R n×N , b i the ith row of the input matrix B ∈ R N×n , ζ i the dimensionless damping constant, and ω i the natural frequency of the ith second-order subsystem. Inspection reveals rigid-body behaviour (as described by the first term in (37)) below approximately 300 Hz, while the first resonance phenomena (as described by the latter term in (37)) appear at 480 and 860 Hz. For a motion system with dominant rigid-body dynamics in the frequency range of interest, parametric models are typically developed that only describe the rigid-body dynamics, i.e. the first term in (37).
The feedback controller C fb (q) for the x-direction of the system is depicted in Figure 11, and achieves a bandwidth (defined as the lowest frequency where |C fb P| = 1) of 120 Hz. This bandwidth results in rejection of low-frequency disturbances, while having sufficient robustness against uncertainty in the resonances of P(q).
The reference r(t) of the performed servo task is depicted in Figure 12, together with its acceleration, jerk and snap, i.e. the second, third and fourth derivative of r(t).

Parametrisation feedforward controller
For the general parametrisation of C ff (q, θ) in (3), (1) the number of parameters n θ and (2) the selection of basis functions (q) should be determined to obtain C ff (q, θ). Here, the parametrisation of C ff (q, θ) is chosen as in Lambrechts et al. (2005). This parametrisation is developed for motion systems with dominant rigid-body dynamics as in Figure 10, and is given by with basis functions and corresponding parameters θ = [θ v θ a θ j θ s ] T ∈ R n θ . For C ff (q, θ) in (40), it holds that i.e. the static gain of C ff (q, θ) is equal to zero. This condition implies that the feedforward signal u ff is equal to zero when the system is in stand-still, which is a desired property for motion systems with rigid-body dynamics. Furthermore, recall that the considered experimental setup operates contactless, thereby eliminating performancedeteriorating friction. As such, friction feedforward is not included in the feedforward controller in (40).

Experimental results for optimal IV method
In this section, the key experimental results of this paper are presented, which involve the application of Procedure 4.1 on the nanopositioning system in Figure 9. Five tasks are performed of N = 2700 samples each, with r(t) as depicted in Figure 12. In addition, r(t) filtered by the basis functions in (41)   performance obtained in task j = 1 is, therefore, the performance with only feedback control applied to the system. Alternatively, the linear least squares estimation approach in van der Meulen et al. (2008) can be used to determine an initial parameter vector θ 1 .
The measured error signal e j m (t ) in the first, second and third task are depicted in Figure 13, while the corresponding cumulative power spectrum is shown in Figure 14. In addition, the two-norm of e j m (t ), i.e. e j m (t ) 2 2 , as a function of tasks is depicted in Figure 15. The following observations are made:  (1) The peak value of e 3 m (t ) in task j = 3 is reduced by approximately 97% when compared to e 1 m (t ) in the first task. This confirms that the proposed iterative feedforward approach significantly enhances the positioning accuracy of the system compared to only feedback control.
(2) Figures 13 and 14 show that the low-frequency contribution up to approximately 10 Hz is not compensated for by C 3 f f (q, θ 3 ) in task j = 3. This implies that the dynamical behaviour of the experimental setup in this frequency range is not captured by the parametrisation proposed in Section 7.2. This can be contributed to the dynamics of the cable connection between the fixed world and the (moving) positioning stage, which acts as a low-frequency disturbance on the system.
(3) Figure 15 shows that e j m (t ) 2 2 converges in two tasks. This confirms that fast convergence in terms of e j m (t ) 2 2 is obtained with the proposed approach.

Analysis of Algorithm 4.1
The iterative refinement of z p, <i > (t) proposed in Algorithm 4.1 is an essential attribute of Procedure 4.1. In this section, Algorithm 4.1 is illustrated for the considered experimental setup. Recall from (28) that the optimal instruments z opt (t) can be expressed as and optimal accuracy is obtained, i.e. z p, <i > (t) = z opt (t), if This result enables a visual illustration of Algorithm 4.1 by comparing the identified frequency response function of S(q)P(q) with the frequency response of (C f b (q) + C j f f,<i> (q)) −1 . The analysis in this section is based on the measured signals e 1 m (t ) and y 1 m (t ), for t = 1, …, N, in task j = 1 with C 1 f f (q, θ 1 ) implemented on the experimental setup. Furthermore, the number of computational iterations of Algorithm 4.1 is given by K = 3. Figure 16 reveals that (C f b + C 1 f f,<3> ) −1 in iteration i = 3 of Algorithm 4.1 is a significantly improved approximation of S(q)P(q) in the frequency range up to 200 Hz, when compared to iteration i = 1. This confirms that iterative refinement of z p, <i > (t) by means of Algorithm 4.1 results in z p, <i > (t) that resemble the optimal z opt (t).

Enhanced flexibility to reference variations and comparison with ILC
As is argued in Sections 1 and 2, the key motivation for the proposed feedforward approach compared to ILC algorithms is an enhanced flexibility with respect to changes in the reference trajectory. To demonstrate this flexibility, two similar yet slightly different reference trajectories are applied to the considered nanopositioning system. These reference trajectories are depicted in Figure 17. Then, the optimal IV approach proposed in Section 4 is compared to the standard frequencydomain ILC-based approach (see, e.g. Bristow et al., 2006). Flexibility with respect to changes in the reference trajectory between tasks for ILC. Before task j = , the reference trajectory is changed from r  to r  (see Figure ). For standard ILC, the measured error signal e  (t) (blue) in task j =  is significantly smaller than for e  (t) (green) in task j = . This confirms that for standard ILC, the servo performance is severely deteriorated if the reference is changed at j = . (To view this this figure in colour, please see the online version of this journal.) Flexibility with respect to changes in the reference trajectory between tasks for the proposed approach in Section . Before task j = , the reference trajectory is changed from r  to r  (see Figure ). The measured error signal e  (t) (blue) in task j =  is similar to e  (t) (green) in task j = , which shows that the servo performance for the proposed IV-approach is invariant to changes in the reference.
(To view this figure in colour, please see the online version of this journal.) The measured error signals e 5 (t) in task j = 5 and e 6 (t) in task j = 6 as depicted in Figure 18 for ILC, and in Figure 19 for the proposed optimal IV approach confirm that (1) the servo performance obtained with ILC severely deteriorates when the reference is slightly changed and (2) the proposed approach is insensitive to reference changes. Note that the error in tasks j = 5 is smaller for ILC than for the proposed approach. Since motion systems are typically confronted with similar yet slightly different reference signals (Lambrechts et al., 2005;Oomen et al., 2014), the proposed optimal IV approach is preferred in industrial practice since learning transients are eliminated when the reference trajectory is changed.

Conclusions
In this paper, a new algorithm is proposed for iterative feedforward control based on instrumental variables. The key advantage of the proposed algorithm is that it achieves optimal accuracy in terms of variance, in contrast to existing approaches, which are shown to be non-optimal. To achieve optimal accuracy, the proposed algorithm iteratively updates an estimate of the optimal instruments. The assumptions that are introduced in Section 2.2 are for a large class of motion systems nonrestrictive for the achievable performance. If the considered motion system is subject to reference signals with highfrequency signal content, the proposed approach can be extended by allowing more general parametrisations by means of input shaping (Boeren, Bruijnen, van Dijk, et al. 2014) and rational feedforward (Bolder & Oomen, 2015). The proposed method is validated by means of a simulation example, showing improved accuracy compared to pre-existing approaches. Finally, the procedure is successfully applied to an industrial nanopositioning system. The presented experimental results confirm the practical relevance of the proposed approach.
Ongoing research focuses on extensions towards optimal input design (Formentin, Karimi, & Savaresi, 2013a), inferential control, positioning-varying effects (Groot Wassink, van de Wal, Scherer, & Bosgra, 2005) and multivariable systems. For the considered nanopositioning system in Figure 9, improved performance can be obtained by compensating for the dynamics of the cable connection between the fixed world and the (moving) positioning stage by means of feedforward control.