Parametric uncertainty handling of under-actuated nonlinear systems using an online optimal input–output feedback linearization controller

This research introduces a new online optimal controller based on input–output feedback linearization and a multi-crossover genetic algorithm for under-actuated nonlinear systems with parametric uncertainties. First, the input–output feedback linearization method is used to derive the control law for a two degrees of freedom cart-pole nonlinear system. Then, the optimization algorithm is applied to find the design parameters of the controller for different values of the uncertain variables. Next, an approximation function is suggested to calculate the optimum gains of the controller in the presence of uncertainties in the system parameters. The simulation results illustrate the effectiveness of the introduced scheme in overcoming some common issues in actual systems, i.e. under-actuation, nonlinearity and uncertainty.


Introduction
The system considered in this paper is a cart-pole system with two degrees of freedom and one actuator, which makes it under-actuated. Cart-pole systems have long been regarded as a fundamental benchmark for challenging different types of control approaches. This importance stems from characteristics such as being nonlinear, unstable and under-actuated, and from the possibility of imposing various types of constraints and uncertainties on the system (Chiu & Wang, 2019; Franco et al., 2018; Irfan et al., 2018; Song et al., 2019; Su et al., 2018; Wang & Kumbasar, 2018; Wang & Liu, 2019).
On the other hand, although the control of an under-actuated system seems far more difficult and complicated than that of a fully actuated one, the control of such systems has received increasing interest in the control literature in recent years (Bansal et al., 2018; Li et al., 2019; Liu et al., 2020; Ye & Luo, 2019; Zheng & Xiong, 2014). This interest can be attributed to two main reasons. The first is the practical application of such systems in controlling and stabilizing common plants like ships, aeroplanes, submarines, helicopters and robots. The second is that a technical defect may occur at any moment of operation and transform a real industrial system from a fully actuated one into an under-actuated one.
CONTACT M. J. Mahmoodabadi mahmoodabadi@sirjantech.ac.ir
One of the effective control methodologies for this kind of nonlinear system is the feedback linearization scheme. The main advantage of this controller is that it linearizes the system dynamics using a change of variables instead of an estimation of variables. For instance, Andalib Sahnehsaraei et al. (2013) have controlled a cart-type inverted pendulum plant by combining approximate feedback linearization and sliding mode control for both the position of the cart and the angular position of the pendulum. Mahmoodabadi et al. (2018) have introduced a hybrid optimal controller based on a combination of robust decoupled sliding mode and adaptive feedback linearization for a class of fourth-order systems. Giuseppi et al. (2019) have presented a feedback linearization control strategy to govern a life-support system that may be attached to a satellite to increase its operational lifespan. Nechak (2019) has dealt with the feedback linearization active control of friction-induced limit cycle oscillations generated from the mode-coupling mechanism, which are often undesirable in numerous applications of nonlinear dynamical friction systems. Djilali et al. (2019) have implemented an input-output feedback linearization controller based on a recurrent high-order neural network identifier trained with an extended Kalman filter for a doubly fed induction generator prototype connected to the grid. Kali et al. (2018) have designed an optimal super-twisting algorithm with time delay estimation based on the input-output feedback linearization for uncertain robot manipulators. Mahmoodabadi and Khoobroo Haghbayan (2019) have introduced a combination of approximate feedback linearization and sliding mode control approaches for stabilization of a class of fourth-order nonlinear systems.
Finally, a novel optimum fuzzy combination of robust decoupled sliding mode and adaptive feedback linearization controllers has been established for uncertain under-actuated nonlinear systems using non-dominated sorting genetic algorithm by Mahmoodabadi and Soleymani (2020).
Besides, if the gains of a controller are tuned for one particular set of system parameters, they may be unsuitable for other values of the parameters (Li et al., 2019a; Sun et al., 2020a, 2020b). In other words, the optimum performance of a controller depends on the values of the system parameters (Li et al., 2019b; Sun et al., 2019; Tong et al., 2020). Hence, some researchers have tried to introduce online control approaches in which the gains are changed based on the system conditions. To name but a few, Gabasov et al. (2004) have investigated an optimal output online controller operating in real time on the signals of a dynamical sensor for linear systems under uncertainties. Lu and Yao (2014) have proposed an online constrained optimization-based adaptive robust controller for a class of multiple-input-multiple-output systems with input saturation, state constraints, matched parametric uncertainties and input disturbances. Mahmoodabadi and Bisheban (2014) have introduced an online optimal linear state feedback control using a straightforward particle swarm optimization and moving least squares approximation. An online optimal decoupled sliding mode control approach has also been suggested to determine the optimum parameters for an under-actuated ball and beam system. This research study contributes to nonlinear control theory by introducing a new online optimal control based on the input-output feedback linearization and a multi-crossover genetic algorithm for under-actuated nonlinear systems having parametric uncertainties. To reach this goal, a control law is first extracted via the input-output feedback linearization approach for stabilization of an under-actuated cart-pole system with two degrees of freedom. Then, the obtained control law is improved by an online optimal approach so that it always presents an optimal robust controller against the parametric uncertainties of the system.
The proposed method is built on the idea that a nonlinear system should operate optimally for any values of its parameters, not only for a few particular ones. In order to reach this goal, an approximation function is designed and applied to estimate suitable control parameters under any conditions. The initial data of the approximation function are produced by a multi-crossover genetic algorithm. Generally speaking, the suggested control structure is unlike any standard form of controller previously seen in the literature, and it is not necessary to investigate the advantages of the references.
The rest of the paper is organized as follows. Section 2 briefly presents the dynamical equations of the cart-pole system. The input-output feedback linearization method is implemented on the system in Section 3. Optimization and multi-crossover concepts are presented in Section 4. Section 5 introduces the proposed online optimal control and depicts the simulation results. Finally, Section 6 concludes the paper.

Cart-pole system
Consider the cart-pole system illustrated in Figure 1, composed of a cart which is able to move left and right on a horizontal rail and a pole pivoted on the cart that can rotate around the vertical axis. The dynamical equations of motion are derived via the Lagrange approach as stated in Andalib Sahnehsaraei et al. (2013), where M and m denote the masses of the cart and the pole, respectively, and l represents half the length of the pole. x shows the lateral displacement of the cart, and θ states the angular displacement of the pole. Finally, F signifies the only input applied to the cart in order to stabilize the system. The control objective of this system is to find a law for F so that the pole stands in the vertical position while the cart holds the origin point. If the state vector is regarded as $z = [z_1, z_2, z_3, z_4]^T$, the dynamics can be written in the state-space form of Eq. (3), where u represents the manipulated control input.

Input-output feedback linearization
Generally speaking, two strategies are commonly suggested in order to linearize nonlinear systems. The first strategy is to linearize the nonlinear equations around the equilibrium points using a Taylor expansion (Jacobian linearization). The second one is to change the coordinates and transform the nonlinear system dynamics into a fully or partially linear one (feedback linearization) (Khalil, 1996; Sastry, 1999; Slotine & Li, 1991). In this research, a special type of the feedback linearization method, namely input-output feedback linearization, is considered to utilize the second idea. In this method, in order to control and stabilize both outputs of the two degrees of freedom system by the use of only one control input (the force applied to the cart), a reference angular position for the pole is defined based on the position of the cart. Then, the angular position of the pole is controlled using the input-output feedback linearization technique so that it tracks the reference angular position. Now, if the position of the cart converges to zero, the angular position of the pole will converge to zero, too (Henmi et al., 2010, July; Jouili & Braiek, 2016). If the control effort u for Eq. (3) is defined by the control law (4), then the state equations are linearized, and with the selection of $v_\theta = -k_3 (x_3 - \theta_r) - k_4 (x_4 - \theta_r)$, the angular position of the pole, namely $x_3$, will be able to track the reference angular position $\theta_r$. The reference angular position is defined as $\theta_r = (2\alpha/\pi) \tan^{-1}\left(k_1 (x_1 - x_r) + k_2 (\dot{x}_1 - \dot{x}_r)\right)$, where $x_r$ is the desirable or reference value for the position of the cart. Now, by selecting $x_r = 0$, the convergence of both the angle of the pole and the position of the cart to zero becomes possible (Henmi et al., 2010, July). It should be noted that the constant parameters $k_1$, $k_2$, $k_3$, $k_4$ and $\alpha$ are found via the optimization process. Moreover, by selecting $\alpha, k_1, k_2 > 0$, the stability of the zero dynamics and, consequently, the stability of the internal dynamics is guaranteed.
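The two expressions quoted in this section, the reference angle $\theta_r$ and the outer-loop input $v_\theta$, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `reference_angle` and `outer_input` are hypothetical, the states are plain scalars, and the formulas are transcribed exactly as they appear in the text; the control law (4) itself is not reproduced here.

```python
import math


def reference_angle(x1, x1_dot, k1, k2, alpha, x_r=0.0, x_r_dot=0.0):
    """Reference pole angle theta_r built from the cart position error.

    theta_r = (2*alpha/pi) * atan(k1*(x1 - x_r) + k2*(x1_dot - x_r_dot))
    Because |atan(.)| < pi/2, the result is bounded by alpha in magnitude.
    """
    error = k1 * (x1 - x_r) + k2 * (x1_dot - x_r_dot)
    return (2.0 * alpha / math.pi) * math.atan(error)


def outer_input(x3, x4, theta_r, k3, k4):
    """Outer-loop input v_theta, transcribed as written in the text.

    v_theta = -k3*(x3 - theta_r) - k4*(x4 - theta_r)
    x3 is the pole angle and x4 the pole angular velocity.
    """
    return -k3 * (x3 - theta_r) - k4 * (x4 - theta_r)
```

With positive $\alpha$, the arctangent saturates the reference angle, so a large cart displacement can never request a pole angle beyond $\pm\alpha$; this is one plausible reading of why $\alpha$ is treated as a design parameter alongside the four gains.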

Optimization and multi-crossover genetic algorithm
Optimization is one of the oldest topics in science and has been widely extended to different fields of technology. For example, in economics, maximization of profit and sales as well as minimization of costs is desirable, and in daily life, people want to reach a maximum degree of happiness with the least amount of effort. The present study uses the genetic algorithm as a powerful evolutionary algorithm to solve both single- and multi-objective optimization problems. The proposed genetic algorithm uses the tournament mechanism as the selection operator and a multi-crossover operator instead of the traditional one. The multi-crossover operator uses three parents to produce three children, while the traditional or classical crossover operator uses two parents to produce two offspring. Andalib Sahnehsaraei et al. (2012a, 2012b) have shown that such an algorithm acts far better than a traditional genetic method for solving both single- and multi-objective optimization problems. Let $\rho_i(t)$, $\rho_j(t)$ and $\rho_k(t)$ represent three randomly selected chromosomes, where $\rho_i(t)$ has the smallest fitness value among these chromosomes. The suggested formulae for the multi-crossover operation use random values $r_1, r_2, r_3 \in [0, 1]$. If $P_{mc}$ is the probability of the multi-crossover and N is the population size, $(P_{mc} \times N)/3$ chromosomes would be selected for changing. Moreover, in the mutation formula, $\rho_l(t)$ is a randomly selected chromosome, $r \in [-1, 1]$ is a random value and $\varepsilon$ is a constant. If $P_m$ and N, respectively, denote the probability of the mutation and the population size, then $P_m \times N$ chromosomes would be randomly mutated.
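As an illustration of the three-parent operator described above, the following Python sketch uses an update that is common in the multi-crossover literature: each parent is moved along the direction $2\rho_i - \rho_j - \rho_k$, scaled by its own random value. This specific update, the per-gene additive mutation $\varepsilon r$, and the function names are assumptions made for the sketch and are not confirmed by the text of this paper.

```python
import random


def multi_crossover(rho_i, rho_j, rho_k, rng=random):
    """Three parents produce three children.

    rho_i is assumed to be the parent with the smallest fitness value.
    ASSUMED update (not taken from this paper):
        child = parent + r * (2*rho_i - rho_j - rho_k),  r in [0, 1]
    with an independent r drawn for each of the three parents.
    """
    direction = [2.0 * a - b - c for a, b, c in zip(rho_i, rho_j, rho_k)]
    children = []
    for parent in (rho_i, rho_j, rho_k):
        r = rng.random()  # plays the role of r_1, r_2, r_3
        children.append([p + r * d for p, d in zip(parent, direction)])
    return children


def mutate(rho_l, eps, rng=random):
    """ASSUMED mutation: add eps * r to every gene, with r uniform in [-1, 1]."""
    return [gene + eps * rng.uniform(-1.0, 1.0) for gene in rho_l]
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing operator variants.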

Online optimal control against parametric uncertainties
In Section 3, the control law was obtained by the input-output feedback linearization technique for stabilizing the two degrees of freedom cart-pole system. Here, the purpose is to develop the control law so that a continuous optimum control takes place at any moment against changes of parameters such as the length of the pole and the mass of the cart over time.
To reach this goal, the domains of change of the system parameters are first defined, and then different values of these parameters are produced. Let us select the domains of change of the system parameters as M ∈ [0.1, 2], l ∈ [0.05, 1] and m = 0.1l. Suppose the steps of change are 0.1 and 0.05 for the mass of the cart and half the length of the pendulum, respectively. With a uniform distribution of the two parameters, each takes 20 values, so 20 × 20 = 400 ordered pairs (M, l) are produced.
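The grid of operating points described above can be generated with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and the rounding is only there to avoid floating-point drift in the grid values.

```python
def parameter_grid(m_step=0.1, l_step=0.05, count=20):
    """Return the ordered pairs (M, l) on a uniform grid.

    M runs over 0.1, 0.2, ..., 2.0 and l over 0.05, 0.10, ..., 1.0,
    which gives count * count = 400 pairs for the default arguments.
    """
    pairs = []
    for i in range(1, count + 1):
        cart_mass = round(i * m_step, 10)
        for j in range(1, count + 1):
            half_length = round(j * l_step, 10)
            pairs.append((cart_mass, half_length))
    return pairs


def pole_mass(half_length):
    """Pole mass tied to the half length by m = 0.1 * l, as in the text."""
    return 0.1 * half_length
```

Each pair would then be handed to the optimizer once, so the genetic algorithm is run 400 times offline before the estimating function is fitted.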
At the next stage, the optimum gains of the controller are found by the multi-crossover genetic algorithm for the different values of the parameters. Clearly, each of the obtained sets of control gains is optimal only for a specific set of parameters. In the objective function f regarded for this optimization, which must be minimized, x and θ denote the displacement of the cart and the angular displacement of the pole, respectively, and F represents the control force applied to the cart. The design parameters for this objective function are the controller gains $k_1$, $k_2$, $k_3$, $k_4$ and $\alpha$. All required values for implementation of the multi-crossover genetic algorithm are listed in Table 1. Some of the obtained optimum gains are given in Table 2. In the third step, the purpose is to obtain an estimating function which can produce optimal control gains at any moment according to the changes of the parameters. The coefficients of the estimating function are determined by the multi-crossover genetic algorithm. In the general form of the estimating function (10) regarded for the control gains ($k_1$, $k_2$, $k_3$, $k_4$ and $\alpha$), $w_j$ ($j = 1, \dots, 5$) are the coefficients of the function. Furthermore, M and l, respectively, denote the mass of the cart and half the length of the pole, which vary with time.
In order to obtain the best correspondence between the estimating function $y_{\text{estimating}}$ and the actual optimum gains $y_{\text{actual}}$, the introduced multi-crossover genetic algorithm is employed. The root mean squared error (RMSE) is considered as the objective function to be minimized, while the function parameters $w_j$ ($j = 1, \dots, 5$) are regarded as the design variables of this optimization process:
$$\text{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_{\text{actual},i} - y_{\text{estimating},i} \right)^2}$$
where n is the number of ordered pairs (M, l), which is equal to 400 here. Figures 2-6 compare the actual and estimated values for the control gains $k_1$, $k_2$, $k_3$, $k_4$ and $\alpha$, respectively. Table 3 shows the values of $w_j$ ($j = 1, \dots, 5$) and the RMSE corresponding to the control gains $k_1$, $k_2$, $k_3$, $k_4$ and $\alpha$ found in the optimization process based on the multi-crossover genetic algorithm. Figure 7 depicts the block diagram of this online optimal control technique for stabilizing the system under time-varying uncertainties.
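The fitting criterion can be illustrated with a short Python sketch of the standard root mean squared error between the optimum gains and the values returned by a candidate estimating function. The function name `rmse` and the list-based interface are assumptions for illustration; in the setting of this paper, each list would hold one gain evaluated at the n = 400 pairs (M, l).

```python
import math


def rmse(y_actual, y_estimating):
    """Root mean squared error between two equally long sequences.

    y_actual holds the optimum gains found by the optimizer and
    y_estimating the values predicted by a candidate estimating function.
    """
    if len(y_actual) != len(y_estimating) or len(y_actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_actual)
    squared_error = sum((a - e) ** 2 for a, e in zip(y_actual, y_estimating))
    return math.sqrt(squared_error / n)
```

Since the coefficients $w_j$ are fitted separately for each gain, this value would be computed five times, once per gain, which matches the per-gain RMSE entries the text attributes to Table 3.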
It is worth mentioning that the proposed online optimal controller will be able to adapt itself to any new conditions imposed by uncertainties and then present not only a robust control but also an optimal one. For the time-varying uncertainties, let us suppose that M and l change as step and trapezoidal functions over time according to Figure 8(a-d). In the following, Figures 9-12 depict the response of the state variables of the system against the changes of parameters M and l over time. The graphs compare the results obtained when the control law (4) uses the estimating function (10) to obtain the optimal control gains $k_1$, $k_2$, $k_3$, $k_4$ and $\alpha$ with those of the method proposed by Henmi et al. (2010, July), which utilizes the fixed control gains $k_1 = 0.241259$, $k_2 = 0.33335$, $k_3 = 3.256576$, $k_4 = 1.159235$ and $\alpha = 0.299905$ found for M = 1 kg, l = 0.5 m and m = 0.05 kg.
Looking more closely at the obtained graphs, it can be seen from Figures 9(a), 10(a), 11(a) and 12(a) that the values of the settling time for the proposed controller are correspondingly about 8, 17, 14 and 18 s, while those are, respectively, about 16, 23, 16 and 23 s for the approach suggested by Henmi et al. (2010, July). Furthermore, these graphs show that the online optimal method has an overshoot of around 5 (m), while the other one displays an overshoot of more than 10 (m). On the other hand, Figures 9(c), 10(c), 11(c) and 12(c) correspondingly show that the introduced control method is able to stabilize the pole of the system in about 10, 18, 13 and 24 s, while those are, respectively, about 14, 28, 14 and 28 s for the input-output feedback linearization presented by Henmi et al. (2010, July). Moreover, these graphs demonstrate that the online optimal input-output feedback linearization controller has an overshoot of around 0.66 rad, while the approach introduced by Henmi et al. (2010, July) exhibits a larger value. Generally, it can be seen that the proposed online optimal control method is capable of decreasing the values of the overshoot and settling time for both the pole angle and the cart position. At the end, Table 4 shows how the values of the objective function (12) have become better (smaller) when the control law (4) uses the estimating function to approximate the optimum control gains.
Table 4. Values of the objective function introduced by Equation (9) and the related improvement percentage (columns: step M and step l; step M and trapezoidal l).

Conclusion
The present paper has considered a two degrees of freedom cart-pole system having only one actuator and parametric uncertainties as a well-known benchmark in control engineering. The paper has proposed an online optimal controller in order to deliver optimal performance against the uncertainties at any moment of time. This means that the proposed online optimal controller adapts itself to the current conditions of the system at any moment. The proposed algorithm has been tested in controlling the state variables of the system, including the cart's position and velocity as well as the pole's angle and velocity, when it was exposed to parametric uncertainties such as various combinations of variations in the cart's mass and the pole's length. Then, the results have been compared to those of a non-online fixed-gain controller recently published in the literature. The comparisons reveal that the proposed online optimal controller can improve the settling time and overshoot of both degrees of freedom in all situations of the parametric variations.

Disclosure statement
No potential conflict of interest was reported by the author(s).