Robust pose tracking control for a fully-actuated hexarotor UAV based on Gaussian processes

This paper presents a robust position/attitude tracking control method for a fully-actuated hexarotor unmanned aerial vehicle (UAV) based on Gaussian processes. Multirotor UAVs suffer from modelling errors due to their structure complexity and aerodynamical disturbances whose perfect mathematical formulation is intractable. To handle this issue, this paper incorporates a data-based learning technique with model-based control. The hexarotor UAV dynamical model, considering modelling errors and aerodynamic disturbances as unknown dynamics, is first derived. Gaussian process regression is next introduced as a learning method for the unknown dynamics, which provides probabilistic distributions of the predicted values. The predicted means are regarded as deterministic information and cancelled out by feedforward control inputs. The predicted variances are considered as the bounds of the model uncertainties with high probability, and a robust control method to ensure ultimate boundedness of the tracking control error is proposed for the uncertain system. The effectiveness of the proposed method is demonstrated via experiments with a self-developed hexarotor UAV testbed.


Introduction
A multirotor unmanned aerial vehicle (UAV) has become one of the most common vehicles for these decades, thanks to its high mobility in threedimensional (3D) space. Its industrial application is expected in various fields including agriculture monitoring [1], building inspection [2], load transportation [3], and work with a robotic arm [4]. In standard structures of multirotor UAVs, each rotor is placed at the vertex of a regular polygon and in the same direction as shown in Figure 1(a). This same direction property causes underactuation, that is, UAVs cannot move in the horizontal direction without tilting their bodies. This motion limitation is often undesired, e.g. in the aforementioned application. To guarantee the full actuation property, i.e. to control the 6D pose (3D position and 3D attitude) individually, special structures have been recently developed [5][6][7][8][9]. One of the common approaches is to introduce tilted rotors as shown in Figure 1(b). Here, at least six rotors are required to individually control the 6D pose, and multirotor UAVs with six rotors are called hexarotors.
However, this kind of fully-actuated structure brings a new technical issue that tilted rotors cause aerodynamical interference between the rotors, which does not appear in standard structures. Multirotor UAV structures also have the model complexity such as rotor dynamics and aerodynamics associated with thrust generation. Therefore, perfect mathematical formulation of fully-actuated multirotor UAV dynamics is intractable. Moreover, when the UAV flies near buildings (walls) or the ground, counter wind, called ground effects, strongly affects the UAV motion, and modelling of this effect is also quite challenging. In view of these facts, this paper incorporates a data-based learning method for these kinds of unknown dynamics with model-based control.
Multirotor UAV motion control considering the effect of aerodynamical interference from surrounded environments is investigated in [10][11][12]. The works [10,11] tackle landing control problems taking account of ground effects, and [12] proposes flight control methods near a wall. They achieve desired control tasks based on adaptive robust control techniques but focus only on standard multirotor UAV structures. Since fully-actuated UAVs with tilted rotors suffer from different types of aerodynamical interferences, we have to design other new controllers. Additionally, while the methods in [10][11][12] succeed in attenuating aerodynamical effects, the characteristics of those unknown dynamics are not directly estimated. Compared with such an adaptive control approach, Gaussian Process Regression (GPR) [13] directly provides unknown dynamics prediction as a probabilistic distribution with mean and variance information. The prediction results are also provided in the closed form based on current states, which enables us to easily utilize the predicted values for control. With hope for this versatility, this paper introduces GPR as a learning method for unknown dynamics of the fully-actuated UAVs.
Another advantage of GPR is that it is a nonparametric regression model, i.e. the structure of a target model is not restricted. Thanks to these versatile properties, GPR has been recently utilized as a statistical modelling methodology in robotics and control engineering fields [14][15][16][17][18][19][20]. Especially, similarly to these works, the authors in [14,15] employ GPR for multirotor UAV control. The work [14] approximates an evaluation index of control performance by GPR, and the optimization of controller gains is achieved while guaranteeing the stability of the UAV. The authors in [15] guarantee the safety during learning process by introducing control barrier functions, where GPR is used for learning the disturbances of the UAV and the improvement of tracking control performance is confirmed via simulation. The work [16] also designs control barrier functions for safe flight control based on GPR learning results, where more general uncertainties are learned and the predicted variance modifies the control barrier function values. Whereas these studies handle standard structures of multirotor UAVs, we expect that GPR can be also applied to fully-actuated structures thanks to its versatility. Moreover, since the predicted variance can be regarded as uncertainties of the system, GPR is useful for robust controller design.
This paper presents a new robust pose tracking control method of a fully-actuated multirotor UAV based on GPR for learning of its unknown dynamics. The proposed technique explicitly utilizes the predicted mean and variance information for robust controller design. By regarding the predicted means by GPR as deterministic information and cancelling out them by feedforward inputs, the control performance is improved. On the other hand, the predicted variances are considered as the bounds of the model uncertainties with high probability, and a robust control method to ensure ultimate boundedness of the tracking control error is proposed for the uncertain system. The main contribution is to show the utility of the proposed scheme via experimental verification with a self-developed fullyactuated hexarotor UAV testbed. Here, we demonstrate learning of unknown dynamics caused by modelling errors, aerodynamical interference between the tilted rotors, and ground effects.

Fully-actuated hexarotor structure
Throughout this paper, we consider the hexarotor UAV structure shown in Figure 2, which is proposed by our previous works [8,9]. This structure guarantees the full actuation of the hexarotor UAV, namely, can generate 6D wrench (3D force and 3D torque) inputs independently.
Let the world frame be w and body-fixed frame b whose origin coincides with the centre of mass of the vehicle. The position of each rotor i ∈ {1, . . . , 6} in b is denoted by r i ∈ R 3 , the tilting angle around the direction of r i is represented by γ i ∈ (−π/2, π/2), and the direction of the thrust force generated by each rotor is denoted by s i ∈ R 3 ( s i = 1). We also represent each counter torque coefficient by c i ∈ R, where c i > 0 (c i < 0) means that the direction of the counter torque generated by rotor i is the same as (opposite to) that of the thrust force [21].
Then, the fully-actuated hexarotor UAV structure in Figure 2 is formulated as follows: Here, (1a)-(1c) mean the symmetry of the structure, (1d) guarantees the full-actuation property, and (1e) implies that all rotors have the same specification.

Remark 2.1:
Although we provide a special structure here for reasons of brevity and according to the experimental testbed used in Section 5, the subsequent discussion holds for any kinds of multi-rotor UAVs as far as they are fully actuated.

Hexarotor model with unknown dynamics
In this work, we consider translational force and angular velocity as the control inputs as in [22,23]. The usage of the angular velocity inputs is motivated by the fact that not torque inputs but angular velocity ones are used for many multirotor UAVs now in use, i.e. desired angular velocity is relatively easily achieved by a welltuned local controller. In fact, we can also directly command the angular velocity inputs in the experiments in Section 5. The position of the hexarotor UAV in w is denoted by p = [x y z] T ∈ R 3 , and the attitude is represented by XYZ Euler angles ξ = [φ θ ψ] T ∈ R 3 [24]. The force and angular velocity inputs of the vehicle in b , generated by the six rotors, are written by F ∈ R 3 and ∈ R 3 , respectively. Then, the nominal model of the UAV is given as follows [21]: Here, e 3 = [0 0 1] T ∈ R 3 , m > 0 is the mass of the vehicle, and g > 0 the gravity acceleration.
is the rotation matrix representing the attitude of b relative to w . (I n ∈ R n×n is the n × n identity matrix.) This rotation matrix is defined as are respectively the basic rotation matrices around x-, y-, and z-axes. J(ξ ) ∈ R 3×3 is the Jacobian matrix provided as follows: By denoting the total states as x = [p TṗT ξ T ] T ∈ R 9 and the total inputs as u = [F T T ] T ∈ R 6 , dynamics (2) can be written by the following state equation: Here,x ∈ R 9 indicates the state of the nominal system. Let us next consider unknown dynamics for (3). In this work, the modelling error of the nominal system (3), aerodynamical interference between the rotors, and the ground effects are mainly considered as the unknown dynamics, and suppose that they are dependent on the state x. In this case, by regarding the total modelling error from the nominal model (3) as the unknown dynamics: we finally obtain the following state equation as the actual system:ẋ Here, we also suppose that d(x) = [0 T 3 d 1 (x) · · · d 6 (x)] T holds because our focus is learning the modelling error and aerodynamical interference that mainly affect the position dynamics rather than the position kinematics. (0 n ∈ R n is the n-dimensional zero vector.)

Remark 2.2:
It is mathematically possible to consider the unknown dynamics dependent also on the input u, i. e. d(x, u). However, this setting causes time inconsistency in the implementation stage because d(x, u) is used for determining u in the robust controller. On the other hand, since the control input u is formed by a state-feedback-type controller, we expect that the input information is indirectly used for learning of the unknown dynamics d. In fact, we show the present setting works in the experimental verification in Section 5.

Control objective
The objective of this paper is to propose a robust tracking controller for given reference pose trajectories so that the tracking errors to the trajectories are bounded. Here, the probability distributions of the unknown dynamics d(x) are learned by GPR and utilized for the controller design.

Gaussian process
A Gaussian process (GP) is a kind of probability process whose distribution is given by random variables dependent on its inputs. Suppose that a function δ(·) : R n → R follows a GP. Then, given any finite input variables Here, N (·, ·) is the Gaussian distribution, μ(χ) ∈ R the mean function, and k(χ , χ ) > 0 covariance one. The fact that δ(·) follows a GP is represented by k(χ , χ )).
The function k(χ , χ ), also called the kernel function, represents the covariance of δ(χ) and δ(χ ). This paper employs the following squared exponential kernel (see [13] for a review of different kernels): . . , n} are called hyperparameters. This kernel implies that the more similar the values of χ and χ are to each other, the stronger the correlation between δ(χ) and δ(χ ) is. In addition to the choice of kernel functions, the way to choose the hyperparameters is the design freedom of GPR.

Gaussian process regression
By using state/input data of the UAV, given (measured) by preliminary experiments, as training data, we predict d i (x), i ∈ {1, . . . , 6} of (4) under the assumption that they follow GPs. For the simplicity of the learning process, each d i (x) is individually predicted by supposing that they are independent of each other, i.e. k i (x, x )) holds for all i ∈ {1, . . . , 6}. In this section, we briefly explain the GPR method for one element of d(x), simply denoted by d(x) ∈ R. Suppose that the training data D = {x i ,d(x i )} N i=1 is given by preliminary experiments. Here, keeping the measurement noise for the training data in mind, we considerd(x i ) ∼ N (d(x i ), σ 2 n ). We note that the variance σ 2 n of the measurement noise can be also handled as a hyperparameter of the GPR. Then the posterior predictive distribution of d(x * ) for any query state x * ∈ R 9 can be calculated by the prior distribution of d(x) and the joint distribution with it.
Since we have no information about the prior distribution of the unknown dynamics, we simply suppose that the mean of the prior distribution is zero. In this case, the joint distribution of the training datâ d(x i ), i = 1, . . . , N and the predicted value d(x * ) is given as follows: Here the covariance vector of d D and d(x * ), and k(x * , x * ) the covariance of d(x * ) itself, i.e. the prior covariance. Then, we get the following predicted distribution from the joint distribution (5) [13]: Notice here that the predicted mean μ(x * ) and variance σ 2 (x * ) are given in the closed form, which is one of the versatile properties of GPR, that is, the predicted results can be easily utilized for control. The hyperparameters of the GPR are determined by solving the optimization problem: Here, x D = {x 1 , . . . , x N } is the training data, and θ = [σ f l 1 · · · l 9 σ n ] T ∈ R 11 . This optimization problem can be numerically solved by standard gradient methods [13].

Unknown dynamics representation as uncertainties
Considering the probabilistic characteristics of the predicted values by the GPR, we represent the unknown dynamics as the uncertainties with some offsets as follows: Here, μ(x) = [μ 1 (x) · · · μ 6 (x)] T ∈ R 6 and σ (x) = [σ 1 (x) · · · σ 6 (x)] T ∈ R 6 are the stuck vectors of the means and variances predicted by the GPR, respec- . . , 6}, and β > 0 is the parameter to determine the confidence region of the existence of the uncertainties. The representation (6) means that we deal with the predicted mean μ(x) as the deterministic information and the probabilistic characteristic is handled by β σ (x). In this work, we set β = 3, i.e. the 99.73% confidence region (only in the sense of the given training data).

Feedback linearization based on predicted means
We first denote the desired pose trajectories by p Then, defining the outputs as the tracking errors: we obtain the following output dynamics from (2), (4), (6), and (7): Let us next consider the following input transformation with the new inputũ = [F T˜ T ] T ∈ R 6 : Define the new state as z = [y T pẏ T p y T ξ ] T ∈ R 9 . Then the variance information of the unknown dynamics can be rewritten by σ (x) = σ (t, z) because the desired pose trajectories p d and ξ d are given by functions of time t. Substituting (9) into (8) yields the following uncertain linearized state equation: β σ (t, z)) β σ (t, z)). (10) Here, O n ∈ R n×n is the n × n zero matrix.
Remark 4.1: (10) is the linear system with the bounded uncertainties determined by the parameter β and the variance information σ (x) predicted by the GPR. Here, the predicted mean μ(x) is cancelled out by the feedforward input in (9). Therefore, standard robust control methods can be applied as shown in the following section.

Robust pose tracking control
We finally design a pose tracking controller to guarantee robust stability against the uncertainty β σ (t, z). Based on the Lyapunov redesign technique [25], we propose the following tracking control law: Here, K c ∈ R 6×9 is the feedback gain matrix to stabilize the origin of (10), i.e. designed so that A − BK c is Hurwitz. P ∈ R 9×9 is the solution of the following Lyapunov equation for any positive semidefinite matrix Q ∈ R 9×9 : and > 0 is the design parameter to determine the tracking performance.
Then, we have the following theorem.
Proof: See the proof of Theorem 14.3 in [25].
Remark 4.2: Theorem 4.1 seems to indicate that any small ultimate bound can be obtained by making the parameter small, but actually, we have to take care of high gain feedback in such a case. We also note that the statement of Theorem 4.1 holds only in the sense of the given training data.

Experimental environment
The utility of the proposed approach is demonstrated via experiments with the self-developed hexarotor UAV testbed shown in Figure 3. The testbed mainly consists of the flight controller "Pixhawk," motors "Cobra CM-2204/28 Motor," propellers "HQprop 5 × 3 Carbon Composite," and frames made of the aluminium alloy "A2017." The physical parameters are summarized in Table 1.
The experimental environment is shown in Figure 4. The states of the hexarotor UAV are obtained from the measurement by the motion capture system with the motion capture cameras "OptiTrack Prime13" and numerical differentiation techniques, where the velocity information is obtained by applying the following linear predictor as a noise reduction filter: Here, k ∈ Z is the step number to show each sampling period of the experiments. The control input is calculated by the PC with the CPU "Intel ® Core TM i7-7700K @4.20GHz" and RAM 64.0 GB, and sent to the vehicle via the wireless communication device "DFT, D8R-II PLUS." The sampling time of the state measurement is 10 ms and that of sending the control input command is 23 ms. The prediction of the unknown dynamics d(x) via the GPR is conducted by using 200 data points obtained by the preliminary experiments. Here, the 200 data points are extracted from 1500 data points by k-means clustering [26]. 1 The hyperparameter θ is optimized by GPML Matlab Code [27].

Experiment 1
We first demonstrate the effectiveness of the proposed approach via standard flight to mainly consider the modelling error of the nominal model and the aerodynamical interference between the tilted rotors.   is applied to the hexarotor UAV testbed. As the comparison, we also apply the nominal controller: (9) without μ(x) andũ = −K c z with K c in (12). This K c is empirically tuned to obtain good tracking performance for the nominal controller. In the preliminary experiments to collect data for the GPR learning, we modify the desired trajectory to cancel out the offset error yielded by the nominal controller and moreover apply small deviations to obtain the data around the true desired trajectory. In addition, we utilize the desired trajectories replacing z d (t) = 0.

Experiment 2
Let us next demonstrate the effectiveness of the proposed method via landing flight to also consider the ground effects from the experimental field.   the preliminary experiments to collect data for the GPR learning, we also modify the desired trajectory to cancel out the offset error yielded by the nominal controller and apply small deviations x d (t) = ±0.1 and y d (t) = ±0.1 to obtain the data around the true desired trajectory. In this experiment, legs with the height 0.1 m are attached to the vehicle so that z = 0.1 means landing of the UAV.
The experimental results are depicted in Figures 7 and 8. Figure 7(a) shows that there exist steady position tracking errors and the hexarotor UAV cannot land even at t = 8 [s] when the desired height becomes 0 m (less than the landing height 0.1 m). We also see from Figure 7(b) that when the vehicle approaches to the ground, especially in the region z ≤ 0.25, the attitude errors are much oscillating and the maximum size of the   errors is around 3.0 deg. These poor results imply that the ground effects are not small enough to be neglected. On the other hand, we see from Figure 8(a) that the position tracking is much improved and the hexarotor UAV successfully lands on the field. Figure 8(b) also shows that the attitude errors are attenuated within ±1.8 deg even when the vehicle is near the ground. These results show the utility of the proposed approach.
Since the current UAV testbed is designed aiming only at adding the ability to horizontally move without tilting its body, it cannot significantly change its roll/pitch angle. Therefore, one of future directions of this work is to conduct experimental verification to evaluate the robustness against large body tilting.

Conclusion
This paper presented a robust pose tracking control method of a fully-actuated hexarotor UAV based on GPR for learning of its unknown dynamics. In the proposed scheme, the means predicted by the GPR are regarded as the deterministic values and explicitly cancelled out by feedforward control inputs to improve the tracking performance. On the other hand, the predicted variances are considered as the bounds of the model uncertainties with high probability. Then, we proposed a robust control law based on the Lyapunov redesign technique for the uncertain system. The utility of the proposed method was demonstrated via two kinds of experiments with a self-developed hexarotor UAV testbed. The first experiment was the standard flight to show the effectiveness of the proposed approach for modelling errors and aerodynamical interference between the tilted rotors. The second experiment was the landing flight to show the utility also for ground effects.
Although the present experimental results show the utility of incorporating GPR, the proposed method works only for repeating process as in the current experimental settings since the learning is conducted offline with preliminary experimental results. Therefore, the main future work is to extend the present approach to incorporate online GPR as in [28,29], where the main issues are to guarantee the stability of the UAV during the learning process and computational cost for the online implementation.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
This work is supported in part by JSPS KAKENHI under 21K04113 and in part by JSPS KAKENHI under 21H01348.

Notes on contributors
Tatsuya Ibuki received the B.Eng., M.Eng., and Ph.D.Eng. degrees from Tokyo Institute of Technology in 2008, 2010, and 2013, respectively. He was a research fellow with the Japan Society for the Promotion of Science from 2012 to 2013, an assistant professor with the Department of Systems and Control Engineering, Tokyo Institute of Technology from 2013 to 2020, and a visiting scholar with the School of Electrical and Computer Engineering, Georgia Institute of Technology in 2019. Since 2020, he has been a senior assistant professor with the Department of Electronics and Bioinformatics, Meiji University. His research interests include cooperative control of robotic networks, fusion of control theory and machine learning, and vision-based estimation and control.
Hiroto Yoshioka received the B.Eng. and M.Eng degrees in Control Engineering in 2017 and 2019, respectively. He is currently engaged in software development using Robot Operation System (ROS) at eSOL Co.,Ltd.