Robust analysis for data-driven model predictive control

Here the data-driven idea is introduced into model predictive control to establish our proposed data-driven model predictive control. Considering a first-order discrete time nonlinear dynamical system, the essence of the data-driven idea is that the actual output value in the cost function for model predictive control is identified from observed input-output data, in the cases of unknown but bounded noise and of a martingale difference noise sequence. After substituting the identified actual output into the cost function, the total cost function of model predictive control is reformulated in its standard form, i.e. a quadratic program with input and output constraints. Then a semidefinite relaxation scheme is used to derive a lower bound on its optimal value, and the robust counterpart of the uncertain quadratic program is reduced to a conic quadratic problem. The semidefinite relaxation scheme and the conic quadratic problem together constitute the corresponding robust analysis, based on convex optimization theory. Finally, a simulation example is used to demonstrate the efficiency of the proposed theory.


Introduction
Model predictive control (MPC) is a powerful methodology that has been widely studied and used in a variety of industrial applications, such as chemical processes, water networks and building energy management. More specifically, model predictive control is a special form of suboptimal control, whose objective is to keep the state of a system near some desired point. MPC combines elements of several well-known ideas, for example certainty equivalent control, multistage lookahead and rollout algorithms. MPC improves on several applied properties of classical linear quadratic control; i.e. there are two main reasons for replacing classical linear quadratic control by MPC. (1) The considered system may be nonlinear, and a model linearized around the desired point may be inappropriate. (2) There may be control and state constraints, which are not handled adequately through a quadratic penalty on state and control: the solution obtained from a linear quadratic model is not suitable here, because the quadratic penalty on state and control tends to blur the boundaries of the constraints. Generally, MPC converts an optimal control problem into a numerical optimization problem with equality or inequality constraints, which correspond to the control and state constraints. Moreover, the considered system is either deterministic, or else it is stochastic and replaced with a deterministic version by using typical values in place of all uncertain quantities, as in the certainty equivalence approach to implementing MPC. Roughly speaking, at each stage, an optimal control problem is solved over a fixed length horizon, starting from the current state. The first component of the corresponding optimal policy is then used as the control of the current stage, while the remaining components are discarded.
CONTACT: Tang Xiaojun, 18279820758@163.com. This paper was not presented at any IFAC meeting.
The optimization process is then repeated at the next stage, once the next state is revealed or the optimization algorithm is terminated.
From the above detailed description, MPC corresponds to a numerical optimization problem whose cost function, or loss function, is an error value between the actual output and its desired output reference. In reality, the desired output is given, but the actual output is unknown a priori, so we first need to model the considered system and collect its actual output by persistently exciting the system with an appropriate input signal. That is, the considered system is identified and then used to calculate the actual output. There are two modelling approaches used to identify the considered system, i.e. first principles and system identification. First principles modelling needs a lot of prior information about the considered system, such as Newton's laws and other mathematical or physical laws. The main essence of system identification is to excite the considered system and then use the collected input-output data to identify or estimate the unknown parameters; the parameters are estimated online and used to describe the considered system, whether in open loop or closed loop. The advantage of the system identification approach is that no prior information is needed, only input-output data. In this big data period, this requirement for input-output data is tolerable. Roughly speaking, to obtain the actual output in the cost function for MPC, the input-output data, corresponding to the considered system in open loop or closed loop, are collected to identify the system through some statistical method, for example the least squares method or the maximum likelihood method. Then the identified system is applied to express or describe the actual output, so the actual output depends on the accuracy of the identified system.
In practice, system identification is a well-developed technique for estimating system parameters from operational data, typically taken during dedicated system testing or excitation, so system identification is also referred to as a data-driven approach.
Owing to the application of system identification within MPC and other control strategies, such as adaptive control, internal model control and robust control, a new concept, identification for control, was proposed in the 2010s. Here we give a concise introduction to identification for control and its contributions. In the case of unknown but bounded noise, a bounded error identification is proposed to identify unknown systems with time varying parameters. Then a feasible parameter set is constructed to include the unknown parameter with a given probability level. In Alamo et al. (2009), the feasible parameter set is replaced by a confidence interval, as this confidence interval can accurately describe the actual probability that the future predictor will fall into the constructed confidence interval. The problem of how to construct this confidence interval is solved by a linear approximation/programming approach, which can identify the unknown parameter only for linear regression models. According to the obtained feasible parameter set or confidence interval, the midpoint or centre can be taken as the final parameter estimate; further, a unified framework for computing the centre of the confidence interval is modified to ensure robustness. This robustness concerns other external disturbances, such as outliers and unmeasured disturbances (Bertsimas & Goyal, 2012). The above-mentioned identification strategy, which constructs a set or interval for the unknown parameter, is called set membership identification, and deals with unknown but bounded noise. There are two types of descriptions of external noise: one is the probabilistic description, the other is the deterministic description, corresponding to the unknown but bounded noise here (Blackmore et al., 2011). In the probabilistic description, the noise is always assumed to be white, and its probability density function (PDF) is known in advance.
In the deterministic description of external noise, by contrast, the only information about the noise is its bound, so this deterministic description relaxes the strict assumptions of the probabilistic description. In practice, bounded noise is more common than white noise. Within the deterministic description of external noise, set membership identification is adjusted to design controllers with two degrees of freedom (Calafiore, 2010); this corresponds to data-driven control or set membership control. Set membership control is applied to design feedback control in a closed loop with a nonlinear system in Campi and Garatti (2016), where the considered system is identified by set membership identification, and the obtained system parameter benefits the prediction output. After substituting the obtained system parameter into the prediction output to construct a cost function, reference (Campi et al., 2009) takes the derivative of this cost function with respect to the control input to obtain an optimal input. Set membership identification can be applied not only in MPC, but also in stochastic adaptive control (Callawy & Hiskens, 2011), where a learning theory kernel is introduced to approximate the nonlinear function or system. Based on bounded noise, many parameters are also known a priori to lie in given intervals; robust optimal control with adjustable uncertainty sets is then studied in Zhang et al. (2017), where robust optimization is introduced to handle uncertain noise and uncertain parameters simultaneously. To handle the expectation operation that depends on the uncertainty, the sample size of random convex programs is considered so as to replace the expectation by a finite sum (Zhang et al., 2015). Generally, many practical problems in systems and control, such as controller synthesis and state estimation, are often formulated as optimization problems (Garatti & Campi, 2013).
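A minimal sketch of set membership identification under unknown but bounded noise, for a scalar linear regression: each observation constrains the parameter to an interval, and the feasible parameter set is the intersection of those intervals, whose midpoint can serve as the central estimate mentioned above. All numerical values here are illustrative assumptions.

```python
import numpy as np

# Set membership identification for y(t) = theta * phi(t) + w(t),
# |w(t)| <= w_bar: each sample constrains theta to an interval, and
# the feasible parameter set is the intersection of all intervals.
rng = np.random.default_rng(4)
theta_true, w_bar, T = 1.5, 0.2, 100
phi = rng.uniform(0.5, 2.0, T)                 # positive regressors
y = theta_true * phi + w_bar * rng.uniform(-1, 1, T)

# Sample i gives (y[i]-w_bar)/phi[i] <= theta <= (y[i]+w_bar)/phi[i]
lo = np.max((y - w_bar) / phi)
hi = np.min((y + w_bar) / phi)
theta_center = 0.5 * (lo + hi)                 # central estimate

print(lo, hi)  # interval guaranteed to contain theta_true
```

By construction the true parameter always lies in [lo, hi], and the interval shrinks as more informative samples arrive.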
In many cases, the cost function incorporates variables that are used to model uncertainty, in addition to the optimization variables, and reference (Farina et al., 2016) employs uncertainty described by probabilistic variables. Generally, the above-mentioned references are divided into two types, i.e. system identification and model predictive control, and these two topics are treated separately in today's research. Given this separation, their combination, i.e. identification for control, is now considered. The mission of this paper is to achieve this goal, i.e. combining system identification and model predictive control. More specifically, system identification is proposed to construct the nonlinear function estimation, then this nonlinear function estimation is substituted into the cost function for model predictive control. Finally, robust analysis is further studied from the point of view of bounded noise and martingale difference noise.
In this paper, we apply our previous work on robust control and robust optimization to this model predictive control problem. In real-world applications, the data are not always known exactly; all we know about the data is that they belong to a given set. From the theoretical perspective, an uncertain quadratic program for model predictive control is formulated through our own derivations; the robust counterpart of this uncertain quadratic program then constitutes our robust analysis for model predictive control. More specifically, consider a first-order discrete time nonlinear dynamic control system: its output is needed in the cost function for MPC, so first we need to construct the actual output of the considered nonlinear dynamical system. This setting differs from the cited references, as a nonlinear dynamical system is considered. To implement the proposed MPC well, the actual output is identified for the nonlinear dynamical system. Moreover, our derivations follow the spirit of system identification, i.e. the input-output data are used to construct the actual output directly, not to identify the nonlinear system parameter. Roughly speaking, data are used to describe the actual output directly, thus avoiding estimating the nonlinear system parameter. This process of using the input-output data to denote the actual output directly coincides with the essence of the data-driven idea. After substituting the obtained actual output into the cost function for MPC, a numerical optimization problem with input and output constraints needs to be solved. After simple but tedious calculation, this constrained optimization problem can be formulated as a standard quadratic program. Numerous problems of planning, scheduling, routing, etc. can be posed as combinatorial optimization problems, i.e. optimization programs with discrete design variables.
A combinatorial problem can likewise be posed as minimizing a quadratic objective under quadratic inequality constraints. Lagrange relaxation, or dual theory, is proposed to analyse this quadratic program with inequality constraints; this problem is further reduced to a conic quadratic program in the case of uncertain data, i.e. the data are typically only known to lie in an uncertainty set in the space of data, which for sure contains the actual data. This case exists in reality: in spite of the data uncertainty, our decision must satisfy the actual constraints, whether we know them or not. Throughout this paper, the idea of applying system identification theory to identify or construct the actual output of a discrete time nonlinear dynamic control system is what makes the model predictive control data-driven. For the obtained quadratic program with inequality constraints, a semidefinite relaxation scheme is proposed to derive a lower bound on the optimal value, and the robust counterpart of the uncertain quadratic program is shown to be a conic quadratic problem. This robust counterpart coincides with the robust analysis for the above data-driven model predictive control.
More specifically, this paper makes the following three contributions.
(1) In case of the unknown but bounded noise and martingale difference noise, the nonlinear function estimation, corresponding to the unknown nonlinear function, is constructed based on the input-output data.
(2) After substituting this obtained nonlinear function estimation into the cost function in model predictive control, one quadratic program problem with input and output constraints is reformulated.
(3) Considering the robust counterpart of an uncertain quadratic program, the semidefinite relaxation scheme is applied to obtain a lower bound.
This paper is organized as follows. In Section 2, a first-order discrete time nonlinear dynamical control system is considered, and some preliminaries about the noise affecting the nonlinear function are formulated. In Section 3, the estimation of the unknown nonlinear function is studied. Two estimators of the unknown nonlinear function are derived for the two noise descriptions, i.e. unknown but bounded noise and martingale difference noise. This process of constructing the unknown nonlinear function is the nonlinear function estimation based on input-output data; that is, we apply the collected input-output data to identify the unknown nonlinear function. In Section 4, the obtained estimation of the unknown nonlinear function is substituted into the cost function of model predictive control. This cost function is rewritten as a quadratic program in the presence of input and output constraints; through our own derivations, it is reduced to a quadratic program with inequality constraints. Then a semidefinite relaxation scheme is used to obtain a lower bound on its optimal value in Section 5, where the robust counterpart of the uncertain quadratic program is transformed into a conic quadratic problem. The whole analysis process is what we call our robust analysis. In Section 6, a simulation example illustrates the effectiveness of the proposed theory. Section 7 ends the paper with final conclusions and points out the next subject of ongoing research. All the mathematical derivations in this paper are our own contributions.

Nonlinear dynamic system
To give an explicit expression for the actual output in the cost function of our considered model predictive control, we need to describe this actual output for the considered system, whether it is a linear or a nonlinear dynamic system.
As linear systems are widely studied in the literature, here we consider a more general case, i.e. the following first-order discrete time nonlinear dynamical system:

y(t + 1) = f(y(t)) + u(t) + w(t + 1),  (1)

where y(t) and u(t) are the system output and input respectively at time instant t, and the nonlinear function f(·) is completely unknown. Equation (1) shows the close relation between two adjacent time instants (t + 1) and t. To be convenient for the later mathematical derivations, the right side of Equation (1) includes the external noise w(t + 1). The goal of the next section is to estimate or identify this nonlinear function from the input-output data sequence {u(t), y(t)}. The external noise w(t) is an unknown but bounded noise, which extends the special case of white noise, and its upper bound w̄ > 0 satisfies

|w(t)| ≤ w̄.  (2)
To apply MPC to design the predictive controller u(t), the output is expected to track a desired output reference y_des(t). To measure the discrepancy, the error value (f(y(t)) − y_des(t)) needs to be expanded at time instant t. Since the nonlinear function f(y(t)) is unknown, the urgent mission is to estimate this nonlinear function and use its estimation ŷ(t) in the error value, i.e. (ŷ(t) − y_des(t)).
In the case of unknown but bounded noise w(t), a nearest neighbour estimation of the nonlinear function f(·) is used to achieve the tracking. Set

d_t(i) = |y(t) − y(i)|,  i = 0, 1, …, t − 1,  (3)

and

i_t = arg min_{0 ≤ i ≤ t−1} d_t(i).  (4)

Then we set

f̂_t = y(i_t + 1) − u(i_t).  (5)

So at each time instant t ≥ 1, the nearest neighbour estimation of the nonlinear function f(y(t)) is given as

f̂(y(t)) = f̂_t = y(i_t + 1) − u(i_t).  (6)

By Equation (1), Equation (6) can be rewritten as

f̂(y(t)) = f(y(i_t)) + w(i_t + 1).  (7)

To make the actual output y(t) track the desired output reference y_des(t), we define the tracking error

e(t) = y(t) − y_des(t).  (8)

Based on the definition (8), the controller is defined as follows:

u(t) = y_des(t + 1) − f̂(y(t)),  (9)

where ε > 0 is an arbitrarily small positive value used in the tracking analysis. We remark that the estimators (5) and (6) may be referred to as the nearest neighbour estimator for f(·), as can be seen intuitively from Equation (7). It is a natural one when we only know the generalized Lipschitz continuity of f(·) and the boundedness of the noise {w(t)}. Using the above controller u(t) in (9), the tracking mission is achieved, i.e. |y(t) − y(i_t)| → 0, and the tracking error is eventually bounded in terms of the noise bound w̄ and ε. Observing the above description, the estimation of the nonlinear function is decisive in determining the controller. Returning to our considered MPC, such an estimation of the nonlinear function is also needed, so in Section 3 we derive another estimation of the nonlinear function based on our own derivation. Equation (8) is only used in tracking control, not in our model predictive control; here we only gave one example of constructing a nearest neighbour estimation of the nonlinear function for the tracking mission. Using the same idea, a new nonlinear function estimation is studied for our proposed model predictive control.
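Assuming the first-order system of Section 2, y(t+1) = f(y(t)) + u(t) + w(t+1), the nearest neighbour estimator and the resulting tracking controller can be sketched as follows. The particular nonlinear map, the constant reference and the noise bound are illustrative assumptions, not values from the paper.

```python
import numpy as np

def f(y):  # true nonlinear map, unknown to the controller
    return np.sin(y) + 0.5 * y

rng = np.random.default_rng(1)
w_bar = 0.01                      # noise bound
T = 300
y = np.zeros(T + 1)
u = np.zeros(T)
y_des = np.ones(T + 1)            # constant desired reference

for t in range(T):
    if t == 0:
        f_hat = 0.0               # no data yet: arbitrary initial guess
    else:
        # nearest neighbour index i_t = argmin_i |y(t) - y(i)|
        i_t = int(np.argmin(np.abs(y[:t] - y[t])))
        # estimate f(y(t)) from the observed one-step transition:
        # y(i_t + 1) - u(i_t) = f(y(i_t)) + w(i_t + 1)
        f_hat = y[i_t + 1] - u[i_t]
    # certainty-equivalent tracking controller u(t) = y_des(t+1) - f_hat
    u[t] = y_des[t + 1] - f_hat
    w = w_bar * rng.uniform(-1, 1)        # unknown but bounded noise
    y[t + 1] = f(y[t]) + u[t] + w

print(np.mean(np.abs(y[-50:] - 1.0)))  # tracking error near the noise level
```

As past states accumulate near the reference, the nearest neighbour gap shrinks and the closed-loop error settles at roughly the noise level, mirroring the boundedness claim above.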

Nonlinear function estimation
The nearest neighbour estimation (6) is efficient only in the case of bounded noise (2). To go further, we now assume instead that the noise w(t) is a martingale difference sequence, i.e.

E[w(t + 1) | F_t] = 0,  (10)

where F_t denotes the σ-algebra generated by the observations up to time t.
Making use of the property of the martingale difference sequence, for an arbitrarily small positive value ε > 0, set

T_t = {i ∈ Z : 0 ≤ i ≤ t − 1, |y(t) − y(i)| ≤ ε},

where Z is the set of all integers. We define the following interval value function Ω_ε(·) as, for each y,

Ω_ε(y) = [y − ε, y + ε].

It means Ω_ε(y) covers the ε-neighbourhood of y. For every t ≥ 1, whenever |y(t) − y(i)| ≤ ε, the indicator I_{Ω_ε(y(t))}(y(i)) equals 1, and averaging the observed one-step transitions over this neighbourhood yields the estimation of the actual output

ŷ(t) = [Σ_{i=0}^{t−1} (y(i + 1) − u(i)) I_{Ω_ε(y(t))}(y(i))] / [Σ_{i=0}^{t−1} I_{Ω_ε(y(t))}(y(i))].  (16)

Since the martingale difference noise averages out over the neighbourhood, (16) is a consistent local estimate of f(y(t)).
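A sketch of the indicator-weighted local average estimator under martingale difference noise, assuming the same first-order system form; the averaging over the ε-neighbourhood cancels the zero-mean noise. The chosen map sin(·), the neighbourhood radius and the noise level are illustrative assumptions.

```python
import numpy as np

def local_average_estimate(y, u, y_query, eps):
    """Estimate f(y_query) by averaging the one-step transitions
    y(i+1) - u(i) over all past states y(i) within eps of y_query,
    mirroring the indicator-weighted estimator (16) in the text."""
    y_past = y[:-1]
    mask = np.abs(y_past - y_query) <= eps     # indicator I_{Omega_eps}
    if not mask.any():
        return None                            # no data in the neighbourhood
    return np.mean((y[1:] - u)[mask])

# Data from y(t+1) = sin(y(t)) + u(t) + w(t+1), with w a zero-mean
# (martingale difference) noise sequence
rng = np.random.default_rng(2)
T = 5000
u = rng.uniform(-2, 2, T)
y = np.zeros(T + 1)
for t in range(T):
    y[t + 1] = np.sin(y[t]) + u[t] + 0.1 * rng.standard_normal()

est = local_average_estimate(y, u, 0.5, eps=0.05)
print(est)  # close to sin(0.5)
```

With a few thousand samples, dozens of past states fall in the ε-neighbourhood of the query point, so the zero-mean noise largely cancels in the average.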

Model predictive control
The goal of MPC is to make the considered system track the desired output reference y_des(t) and reject noise from t = 0 up to a finite time horizon N. MPC turns an optimal control problem into a numerical optimization problem, whose cost function is set as

min Σ_{t=0}^{N} [(ŷ(t) − y_des(t))^T Q (ŷ(t) − y_des(t)) + u(t)^T S u(t)],  (17)

where ŷ(t) is the actual output, coming from Equation (16), y_des(t) is the desired output reference, and Q and S are two positive definite weighting matrices. For convenience, the cost function (17) is simplified into the standard form for the subsequent analysis. The main step in MPC is to obtain the optimal control inputs {u(0), u(1), …, u(N)} minimizing the cost function.
Expanding the cost function (17) gives

Σ_{t=0}^{N} [ŷ(t)^T Q ŷ(t) − 2 y_des(t)^T Q ŷ(t) + y_des(t)^T Q y_des(t) + u(t)^T S u(t)].  (18)

Neglecting the third term y_des(t)^T Q y_des(t), as it is independent of the control inputs {u(0), u(1), …, u(N)}, the simplified cost function is

min Σ_{t=0}^{N} [ŷ(t)^T Q ŷ(t) − 2 y_des(t)^T Q ŷ(t) + u(t)^T S u(t)],  (19)

where the actual output ŷ(t) is given as an explicit function of the control input, i.e.

ŷ(t) = [Σ_{i=0}^{t−1} (y(i + 1) − u(i)) I_{Ω_ε(y(t))}(y(i))] / [Σ_{i=0}^{t−1} I_{Ω_ε(y(t))}(y(i))].

A common way to specify MPC is through a quadratic cost function with linear constraints, which handles both the dynamics and the state and input constraints. The optimal solution to a constrained quadratic program can be found by solving a quadratic program with equality or inequality constraints. Here we add the corresponding input and output constraints as follows:

u_min ≤ u(t) ≤ u_max,  ŷ_min ≤ ŷ(t) ≤ ŷ_max,  t = 0, 1, …, N,  (20)

where u_min and u_max are the lower and upper bounds for the control input u(t), t = 0, 1, …, N, respectively, and the definitions of ŷ_min and ŷ_max are similar.
Combining Equations (19) and (20), our considered model predictive control is the following optimization problem with inequality constraints.
To simplify the cost function in Equation (21), i.e. the minimization of (19) subject to (20), collect the weights into the block diagonal matrices Q̄ = I_N ⊗ Q and S̄ = I_N ⊗ S, where I_N is the identity matrix with dimension N,
and define the vectors u and y as follows:

u = [u(0), u(1), …, u(N)]^T,  y = [ŷ(0), ŷ(1), …, ŷ(N)]^T.  (24)

To simplify notation, set

x = [y, u],  so that Σ_{t=0}^{N} ŷ(t)^T Q ŷ(t) + u(t)^T S u(t) = x^T diag(Q̄, S̄) x.  (25)

Through these two equations (24) and (25), the cost function in (21) can be rewritten as

min_x x^T A x + 2 b^T x,  (26)

with A = diag(Q̄, S̄) and b collecting the linear terms −Q̄ y_des. The inequality constraints on input and output can also be reformulated as follows:

u ≤ U_max,  −u ≤ −U_min,  y ≤ Y_max,  −y ≤ −Y_min.

Combining the above inequalities, define A_i = [0 I; 0 −I]_i as the ith row block and c_i = [U_max; −U_min]_i; then Equation (29) is reduced to

A_i x ≤ c_i,  i = 1, 2, …, N.  (30)

Applying the simplified cost function (26) and the simplified inequality constraints (30), the following quadratic program problem is obtained:

min_x x^T A x + 2 b^T x  subject to  A_i x ≤ c_i,  i = 1, 2, …, N.  (31)

To solve the above quadratic program problem (31), many existing optimization methods can be applied directly, such as Newton's method, the Gauss-Seidel algorithm, ADMM and our studied dynamic programming. In the next section, however, we do not consider how to solve this quadratic program with inequality constraints, but give a preliminary robust analysis.

Robust analysis
Observing the quadratic program problem with inequality constraints A_i x ≤ c_i, i = 1, 2, …, N again, note that many universal forms of combinatorial problems share this structure: a combinatorial problem can be posed as minimizing a quadratic objective under quadratic inequality constraints.

Semidefinite relaxation scheme
To bound from below the optimal value in Equation (31), we may use the Lagrange relaxation scheme: choose weights λ_i ≥ 0, i = 1, 2, …, N, and add the inequality constraints with these weights to the cost function, coming to the following Lagrange function:

L_λ(x) = x^T A x + 2 b^T x + Σ_{i=1}^{N} λ_i (A_i x − c_i).  (32)

By construction, for every feasible x the Lagrange function L_λ(x) is less than or equal to the actual objective x^T A x + 2 b^T x. Consequently, the unconstrained infimum of this Lagrange function is a lower bound for the optimal value of x^T A x + 2 b^T x over the feasible set. From dual theory, we assume that λ ∈ R_+^N and ξ ∈ R are such that

inf_x L_λ(x) ≥ ξ.  (33)
Recalling the structure of the Lagrange function L_λ(x), condition (33) means that the inhomogeneous quadratic form

f_λ(x) = x^T A x + 2 b^T(λ) x + c(λ) − ξ,  with b(λ) = b + (1/2) Σ_{i=1}^{N} λ_i A_i^T and c(λ) = −Σ_{i=1}^{N} λ_i c_i,

is nonnegative on the entire space. It is then worth cataloguing this simple observation: the inhomogeneous quadratic form f_λ(x) is nonnegative everywhere if and only if (λ, ξ) satisfies the following linear matrix inequality:

[ A  b(λ) ; b(λ)^T  c(λ) − ξ ] ⪰ 0,  (35)

where Equation (35) means the matrix on the left is positive semidefinite.
Expanding the linear matrix inequality (35), any pair (λ ≥ 0, ξ) satisfying it certifies L_λ(x) ≥ ξ for all x, so ξ is a lower bound for the optimal value of (31); maximizing ξ over such pairs is a semidefinite program, the semidefinite relaxation of (31).
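For a convex toy instance, the Lagrange relaxation lower bound can be computed in closed form and compared with the true constrained optimum; the one-dimensional problem below is an illustrative assumption. Here the bound is tight because the instance is convex; for nonconvex instances the relaxation generally yields a strict lower bound.

```python
import numpy as np

# Toy instance of min x'Ax + 2b'x  s.t.  a1'x <= c1 (scalar case):
# minimize x^2 - 2x subject to x <= 0.5.
A, b = 1.0, -1.0
a1, c1 = 1.0, 0.5

def inf_lagrangian(lam):
    """Closed-form unconstrained infimum of
    L_lam(x) = A x^2 + (2b + lam*a1) x - lam*c1, for A > 0."""
    b_lam = b + lam * a1 / 2.0          # effective linear coefficient
    return -b_lam**2 / A - lam * c1

# Lagrange relaxation lower bound: maximize over lam >= 0 (grid search)
lams = np.linspace(0.0, 10.0, 100001)
xi = max(inf_lagrangian(l) for l in lams)

# True constrained optimum by direct grid search over feasible x
xs = np.linspace(-5.0, 0.5, 100001)
p_star = min(x * A * x + 2 * b * x for x in xs)

print(xi, p_star)  # both close to -0.75, and xi <= p_star
```

The optimal multiplier is λ = 1, at which the infimum of the Lagrangian equals the constrained optimum −0.75, illustrating that ξ from (33) bounds the optimal value of (31) from below.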

Conic quadratic program
Consider the quadratic program problem again: in real applications, the data A, b, A_i, c_i, i = 1, 2, …, N are not always known exactly. What is typically known is a domain in the space of data, an uncertainty set U, which for sure contains the actual data. There are cases in reality where, in spite of this data uncertainty, our decision x must satisfy the actual constraints, whether we know them or not.
If indeed all we know about the data is that they belong to a given set U, but we still have to satisfy the actual constraints, the only way to meet the requirements is to restrict ourselves to robust feasible candidate solutions, i.e. those satisfying all possible realizations of the uncertain constraints: vectors x such that

A_i x ≤ c_i for all (A_i, c_i, A, b) ∈ U,

where U is the given uncertainty set. In order to choose the best among these robust feasible solutions, we should decide how to aggregate the various realizations of the objective into a single characteristic. In other words, the robust counterpart of the quadratic program problem (31) is the following optimization problem:

min_x { sup_{(A,b,A_i,c_i) ∈ U} x^T A x + 2 b^T x : A_i x ≤ c_i for all (A_i, c_i, A, b) ∈ U }.  (38)

Note that Equation (38) is a usual (certain) optimization problem.
As we will see shortly, in many cases it is reasonable to specify the uncertainty set as an ellipsoid, i.e. the image of the unit Euclidean ball under an affine mapping. In this case, the robust counterpart of an uncertain quadratic program is an explicit conic quadratic program. Thus robust quadratic programs with ellipsoidal uncertainty sets can be viewed as a generic source of conic quadratic problems.
Observe the robust counterpart of the uncertain quadratic program in the case of simple ellipsoidal uncertainty: assume that the uncertainty set is

U = {(A_i, c_i) = (A_i^0, c_i^0) + P_i e_i : e_i^T e_i ≤ 1, i = 1, 2, …, N},

where (A_i^0, c_i^0) are the nominal data and the terms P_i e_i, i = 1, 2, …, N, represent the data perturbations; the restrictions e_i^T e_i ≤ 1 force these perturbations to vary in ellipsoids.
To see that the robust counterpart of our uncertain quadratic program is a conic quadratic problem, note that x is robust feasible if and only if, for every admissible perturbation e_i with e_i^T e_i ≤ 1, the perturbed constraint holds. Thus x is robust feasible if and only if it satisfies a system of conic quadratic inequalities of the form

A_i^0 x + ||P_i^T x|| ≤ c_i^0,  i = 1, 2, …, N.

Similarly, a point (x, t) satisfies all realizations of the inequality x^T A x + 2 b^T x ≤ t allowed by our ellipsoidal uncertainty set if and only if it satisfies a corresponding conic quadratic inequality. Thus the robust counterpart becomes a conic quadratic program.
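The key identity behind this conic quadratic reformulation, namely that the worst case of (a_0 + P e)^T x over the unit ball ||e|| ≤ 1 equals a_0^T x + ||P^T x||, can be checked numerically; the dimensions and random data below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 4, 3
a0 = rng.standard_normal(n)       # nominal constraint data
P = rng.standard_normal((n, k))   # perturbation directions
x = rng.standard_normal(n)        # candidate decision

# Conic-quadratic reformulation: worst case of (a0 + P e)'x over ||e|| <= 1
worst_closed_form = a0 @ x + np.linalg.norm(P.T @ x)

# Monte Carlo check: sample perturbations e inside the unit ball,
# and include the analytic maximizer e* = P'x / ||P'x||
samples = rng.standard_normal((200000, k))
samples /= np.maximum(np.linalg.norm(samples, axis=1, keepdims=True), 1.0)
e_star = P.T @ x / np.linalg.norm(P.T @ x)
worst_sampled = max(np.max((a0 + samples @ P.T) @ x),
                    (a0 + P @ e_star) @ x)

print(worst_closed_form, worst_sampled)  # the two values agree
```

Since the linear form is maximized over the ball at e aligned with P^T x, the sampled worst case matches the closed form exactly, which is why the robust constraint collapses to a single norm inequality.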

Simulation example
Now we propose a simulation example to illustrate the nature of the above results. Consider a problem of multi-UAV formation cooperation. As we know, multi-UAV formation cooperation depends on the transmission and sharing of information while implementing the task. Task coupling and information sharing make the cooperative formation a distributed networked intelligent system, where each UAV is regarded as one node and a wireless communication network realizes the communication among all UAVs. Based on this distributed networked intelligent system, each UAV can perform the monitoring task according to the different locations in the mission plan. One special architecture of the monitoring mission for multi-UAV formation cooperative reconnaissance is plotted in Figure 1, where the cooperative modules are shown explicitly.
Through this network module, there are many types of function modules for each UAV, for example cooperative mission planning, cooperative trajectory planning, tracking, attacking and searching, etc. The network module sends the control requirement to the cooperative mission planning module, so that each UAV is controlled according to its network goal. All of these function modules communicate with the neighbouring UAVs to share information resources. The architecture in Figure 1 can be simplified to Figure 2, where the basic internal structure of the function module is given to achieve the cooperative mission planning.
Further, from flight control theory, we see that the UAV reconnaissance platform acquires a frame of image coinciding with the ground target and landscape. Two cases exist for extracting target location information from the received image. The ground target positioning process is plotted in Figure 3, where the primary target is in the centre of the field of the airborne optoelectronic platform, and the secondary target is not in the centre of the image. As each UAV is regarded as one node and a wireless communication network realizes the communication among all UAVs in Figure 4, there are many communications among these UAVs, and the UAVs can be deemed subsystems. The formation includes three UAVs: one radar jamming UAV, one missile jamming UAV and one investigation UAV. The initial positions of the three UAVs are all located at the starting point coordinates (0, 0) and the terminal positions are concentrated on coordinates (700, 700). Each UAV's maximum flight speed, minimum flight speed and speed deviation are bounded. The surrounding battlefield environment contains a radar threat, a missile threat and an antiaircraft positions threat. The deployment coordinates of the radar threat are (300, 300), and the deployment coordinates of the missile threat are (250, 200). The region of the artillery positions threat is a rectangular range with a height of 300 and a width of 300; this rectangular range is a no-fly zone. When the sample period is T, the matrices A_ii and B_ii of each subsystem are defined accordingly. In the simulation environment, the input signal is the excitation signal chosen by the user, the output is measured from the point set collected by the accelerator, and the number of sampled points is set to N = 4096. The output and input of the 4096 sampled data are divided into 4 equal data blocks; each data block contains 1024 sample data.
The weighting factor for each UAV, the weighting matrices in the cost function and the discrete time sampling period are fixed in the simulation configuration, and the simulation environment runs under the Windows system. The simulation scenario includes three UAVs which constitute a wedge formation in the two-dimensional plane. The initial states of the UAVs are respectively (500, 500), (500, 5000) and (2000, 5000). In the final formation form, the displacements of the UAVs relative to the confluence point are (−100, 100), (−100, 100) and (0, 0). The optimal convergence point is (3200, 3500). The formation speed at the convergence point is (200, 0). The maximum acceleration of each UAV is limited to 35 m/s^2, the maximum speed is 250 m/s, and the total execution time is 30 s. Based on the dynamic programming strategy, each UAV can find the optimal confluence point which lowers the energy consumption in the formation process. The three different colour curves in Figure 5 represent the best track trajectories after the optimal confluence point is determined by optimization.
The iteration curve of the objective criterion function versus the iteration steps is given in Figure 6. From Figure 6, we see that after 600 iteration steps the whole algorithm terminates and stable convergence is guaranteed.

Conclusion
This paper connects system identification, model predictive control and convex optimization theory to construct data-driven model predictive control. The combination of system identification and model predictive control serves to derive the actual output value for the cost function, and convex optimization theory is used to carry out the robustness analysis of the reformulated quadratic program with inequality constraints, corresponding to the optimization problem within model predictive control. As this is our preliminary analysis of data-driven model predictive control, how to merge game theory with data-driven model predictive control is our next ongoing work.