Dynamic learning from adaptive neural control for flexible joint robot with tracking error constraints using high-gain observer

ABSTRACT This paper presents dynamic learning from adaptive neural control with prescribed tracking error performance for a flexible joint robot (FJR) with unknown dynamics. First, a system transformation method is introduced to convert the original FJR system into a normal system. As a result, only one neural network (NN) approximator is needed to identify the uncertain system nonlinearities, and the verification of neural weight convergence is greatly simplified. To further address the prescribed performance issue, a performance function is introduced to describe the tracking error constraint, and an error transformation technique is used to convert the constrained tracking control problem into the unconstrained stabilization of an error system. By combining a high-gain observer with the backstepping method, an adaptive neural controller is presented to stabilize the unconstrained error system. Under a partial persistent excitation condition, the adaptive neural controller is shown to be capable of achieving acquisition, expression and storage of the unknown dynamics. Furthermore, a neural learning controller that reuses the stored NN weights is proposed for the same or a similar control task, so that the time-consuming online NN adjustment process can be avoided and better control performance can be obtained. Simulation results demonstrate the effectiveness of the proposed control method.


Introduction
Over the past two decades, the dynamic behaviour analysis, modelling and control of robotic manipulators with joint flexibility have received considerable attention (Benosman & Le, 2004; Jiang, Liu, Chen, & Zhang, 2015; Nicosia & Tomei, 1990; Ozgoli & Taghirad, 2006; Rahimi & Nazemizadeh, 2014; Spong, Hutchinson, & Vidyasagar, 2006). In comparison with rigid robots, flexible joint robots (FJR) have a number of advantages, such as low power consumption, light weight, good safety and a large workspace (Rahimi & Nazemizadeh, 2014). However, due to the presence of elastic gearboxes, it is necessary to consider the motor mass and inertia in FJR modelling. Therefore, for an n-link FJR, 2n generalized coordinates are usually required to describe the whole dynamic behaviour, and thus the FJR model is more complex than that of a rigid robot. Additionally, model uncertainties, including parametric uncertainties and nonparametric dynamic uncertainties caused by modelling errors and external disturbances, make controller design for FJR extremely challenging. Based on the singular perturbation approach, Spong (1989) proposed an adaptive control scheme for FJR. In Al-Ashoor, Patel, and Khorasani (1993), a robust adaptive controller was developed for a reduced-order flexible-joint model using on-line identification of the manipulator parameters. Subsequently, many efficient methods, such as the backstepping approach (Krstic, Kanellakopoulos, & Kokotovic, 1995; Oh & Lee, 1999; Soukkou & Labiod, 2015), sliding mode control (Huang & Chen, 2004), neural network (NN) control (Ge, Lee, & Tan, 1998) and adaptive fuzzy control (Li, Tong, & Li, 2013; Tang, Chen, & Lu, 2001), have been widely applied to the control problem for FJR. By combining backstepping design with the function approximation technique, an adaptive sliding controller was proposed in Huang and Chen (2004) for a single-link flexible-joint robot with mismatched uncertainties.
The H∞ disturbance attenuation design and recurrent NN adaptive control technique were proposed in Miao and Wang (2013) to achieve the desired H∞ tracking performance of uncertain FJR. Although the backstepping design has extensive applications in FJR control, it is limited by its inherent computational complexity arising from the repeated differentiation of virtual controllers. To address this problem, Swaroop, Hedrick, Yip, and Gerdes (2000) presented a dynamic surface control technique that introduces a first-order low-pass filter at each recursive step, so that the repeated differentiation of the virtual control inputs can be eliminated. Using dynamic surface control design, a robust adaptive neural control (ANC) method was developed in Yoo, Park, and Choi (2006) for FJR with model uncertainties using a self-recurrent wavelet NN. The result in Yoo et al. (2006) was further extended to FJR with only position measurements (Yoo, Park, & Choi, 2008). It should be pointed out that, as a recursive design similar to backstepping, dynamic surface control (DSC) still needs to employ many function approximators when unmodelled dynamics are present in the considered systems. This not only requires large computation, but also makes the convergence of the estimated parameters difficult to verify. Meanwhile, most existing intelligent control schemes for FJR have focused only on the stability of the closed-loop system, without guaranteeing that the estimated parameters converge to their optimal values. Therefore, the optimal parameter values cannot be stored and reused, so that when repeating the same or a similar control task, the estimated parameters still have to be recalculated online (Ge, Hang, & Zhang, 1999; Ge & Wang, 2002; He, Chen, & Yin, 2016; Liu, Li, Tong, & Chen, 2016; Wang, Liu, Chen, & Zhou, 2014; Yu, Yu, Chen, Gao, & Qin, 2012). From this point of view, the learning ability of FJR is very limited.
However, driven by practical specifications, learning and adaptation have become essential abilities for FJR control systems to improve the control performance in uncertain dynamic environments. Learning in a dynamic environment is an extremely challenging problem for control systems. As stated in Narendra and Annaswamy (2012), the obstruction to achieving parameter convergence is mainly the verification of a persistent excitation (PE) condition. Recently, a deterministic learning mechanism (Wang & Hill, 2006) was proposed to verify that a partial PE condition can be satisfied by localized radial basis function (RBF) NNs along recurrent trajectories. Consequently, the method of Wang and Hill (2006) achieves convergence of partial NN weight estimates and accurate approximation of the unknown dynamics. By presenting an extension of a recent result on stability analysis of linear time-varying (LTV) systems, the deterministic learning method was further developed for nth-order affine/nonaffine systems in normal form (Dai, Wang, & Wang, 2014; Liu, Wang, & Hill, 2009). To address the learning and control problem of unknown cascaded nonlinear systems, several elegant dynamic learning methods were proposed by combining a recursive design with a system decomposition strategy (Wang, Wang, Liu, & Hill, 2012; Wang & Wang, 2015a, 2015b). Furthermore, the learning mechanism has also been applied to physical systems such as marine surface vessels (Dai, Wang, & Luo, 2012; Dai, Zeng, & Wang, 2016), robot manipulators (Wang, Ye, & Chen, 2017b), and gait recognition (Zeng & Wang, 2016). For FJR with unknown system dynamics, two or three NN approximators would be used in a recursive design process. However, in order to store the unknown knowledge and reuse the stored information in the same or a similar control task, the convergence of the neural weight estimates needs to be verified.
When multiple NN approximators are employed to approximate the unknown dynamics, the NN convergence of each subsystem relies strictly on the NN convergence of the previous subsystem. The verification of the convergence of all neural weights is still a complex and challenging task. Although deterministic learning can effectively recall the learned knowledge to improve the transient performance of the closed-loop system, it cannot guarantee that the tracking error satisfies predefined transient performance specifications, such as maximum overshoot and convergence speed. However, a large number of practical systems are subject to predefined transient performance requirements, and violating them may cause safety problems or system damage. The predefined performance issue is an extremely challenging problem (Peng, Wang, & Wang, 2018). Recently, Bechlioulis and Rovithakis (2008) proposed a robust adaptive neural controller for multi-input multi-output feedback linearizable nonlinear systems with unknown nonlinearities, which achieved prescribed performance. The key idea is to use a performance function to describe the prescribed performance and to transform the constrained system into an equivalent unconstrained one. The prescribed performance algorithm was further extended to the tracking control of many nonlinear systems, such as strict-feedback nonlinear systems (Han & Lee, 2013; Tong, Chen, & Wang, 2005), robot-tracking control systems (Chen, Wu, Jiang, & Jiang, 2014; Kostarigka, Doulgeri, & Rovithakis, 2013; Teng, Yang, Dai, & Wang, 2016; Wang & Yang, 2017), a servo system (Na, Chen, Ren, & Guo, 2013), and permanent magnet synchronous motors (Chang & Tong, 2017).
Using the existing schemes (Bechlioulis & Rovithakis, 2008; Chang & Tong, 2017; Chen et al., 2014; Han & Lee, 2013; Kostarigka et al., 2013; Na et al., 2013; Teng et al., 2016; Tong et al., 2005; Wang & Yang, 2017), the resulting closed-loop error systems are complex when the studied systems include unmodelled dynamics, so that it is difficult to achieve convergence of the estimated parameters and accurate approximation of the unknown system dynamics.
In this paper, we address dynamic learning control for FJR with prescribed performance. As stated above, the use of multiple NN approximators might prevent the achievement of neural learning control, because verifying the convergence of multiple NN weights is difficult or impossible. To overcome this problem, a novel system transformation with a set of new state variables is designed to convert the original FJR system into a normal system, and a high-gain observer is then employed to estimate the unavailable new state variables. The main benefit of the transformation is that only one NN approximator is needed to identify the uncertain system nonlinearities, which greatly simplifies the controller design and the verification of neural weight convergence. Consequently, a stable ANC scheme is proposed for the FJR to guarantee closed-loop stability and the prescribed tracking error performance. Further, by verifying the PE condition, the NN weights converge to the ideal constant weights, which means that the converged NN weight estimates can be stored during the adaptive training phase. For the same or a similar control task, the experienced knowledge (constant NN weights) is reused to reduce the computational burden and to achieve better transient performance than in the adaptive training phase.
The rest of this paper is arranged as follows. The problem formulation and some preliminaries are presented in Section 2. In Section 3, the detailed adaptive control procedure with prescribed performance is given. Section 4 presents how to acquire and store the experience knowledge, together with a rigorous proof. In Section 5, neural learning control using the experience knowledge is developed. Simulation studies verifying the effectiveness of the proposed controller are reported in Section 6. Finally, conclusions are drawn in Section 7.

System description
In this paper, we consider a single-link flexible joint manipulator rotating in the vertical plane. The dynamic equations for this system are given by the following differential equations (Slotine & Li, 1991): where q1, q̇1, q̈1 ∈ R are the angular position, velocity and acceleration of the link, respectively. Similarly, q2, q̇2, q̈2 ∈ R are the angle, velocity and acceleration of the motor shaft, respectively. u ∈ R is the control input, namely the motor torque. The parameter J denotes the rotary inertia of the motor. M and I are the mass and inertia of the link, respectively. L describes the distance from the joint to the centroid of the link. K is the stiffness parameter depicting the joint flexibility. In this paper, the dynamics (including these system parameters) of the robot manipulator (1) are uncertain.
To simplify the controller design, we define x1 = q1, x2 = ẋ1 = q̇1, x3 = ẋ2, x4 = ẋ3 as new state variables. The system (1) can then be transformed into the following normal form: where G = K/(IJ) and q = [q1, q̇1, q2, q̇2]ᵀ ∈ R⁴. It should be pointed out that F(q) is an unknown smooth function and G is an unknown constant, since the system dynamics, including the parameters I, M, L, J and K, are unknown.
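To make the transformation concrete, the following Python sketch uses the standard single-link FJR model (Spong's model, which we assume matches (1)) with the paper's simulation parameters, builds x1–x4 by repeated differentiation of q1, and checks numerically that ẋ3 = x4 along a trajectory. The specific initial state and input are illustrative.

```python
import numpy as np

# Illustrative parameters (the paper's simulation values); K is the joint
# stiffness, g is gravity.
M, L, g, I, J, K = 2.3, 1.0, 9.8, 2.3, 0.5, 15.0

def fjr_dynamics(q, u):
    """Single-link FJR: I*q1'' + M*g*L*sin(q1) + K*(q1 - q2) = 0,
                        J*q2'' - K*(q1 - q2) = u."""
    q1, dq1, q2, dq2 = q
    ddq1 = (-M * g * L * np.sin(q1) - K * (q1 - q2)) / I
    ddq2 = (u + K * (q1 - q2)) / J
    return np.array([dq1, ddq1, dq2, ddq2])

def to_normal_form(q):
    """New coordinates x1..x4 obtained by repeated differentiation of q1."""
    q1, dq1, q2, dq2 = q
    x1, x2 = q1, dq1
    x3 = (-M * g * L * np.sin(q1) - K * (q1 - q2)) / I          # = q1''
    x4 = (-M * g * L * np.cos(q1) * dq1 - K * (dq1 - dq2)) / I  # = q1'''
    return np.array([x1, x2, x3, x4])

# Sanity check: along a simulated trajectory, the finite difference of x3
# should match x4, i.e. the new coordinates form a chain of integrators.
dt, q, u = 1e-5, np.array([0.3, 0.0, 0.2, 0.1]), 0.5
x_now = to_normal_form(q)
q_next = q + dt * fjr_dynamics(q, u)
x_next = to_normal_form(q_next)
print(abs((x_next[2] - x_now[2]) / dt - x_now[3]))  # ~0: x3' = x4
```

Differentiating x4 once more produces ẋ4 = F(q) + G·u with G = K/(IJ), since q̈2 (and hence u) first appears at the fourth derivative of q1.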
The system states q1, q̇1, q2, q̇2 of the robot manipulator (1) are all measurable, but the states x3 and x4 of the transformed system (2) are not available, since the dynamics of the considered system (1) are uncertain. Therefore, we design an observer to estimate x3 and x4. According to the following lemma, the high-gain observer used in Behtash (1990) is adopted in this paper.
Lemma 2.1: Consider the following high-gain observer: where p is a small positive design constant, ν = [ν1, ν2, ν3]ᵀ ∈ R³ is the observer state, and the parameters ci (i = 1, 2) are chosen such that the polynomial s³ + c1s² + c2s + 1 is Hurwitz. According to the conditions of Lemma 2 in Wang and Wang (2015a), the following condition can be guaranteed: where φ = ν3 + c1ν2 + c2ν1, and ψ(i) and φ(i) denote the ith derivatives of ψ(t) and φ, respectively. There exist positive constants t* and hk such that |φ(k)| ≤ hk holds for all t > t*.
Remark 2.1: For the considered flexible joint manipulator (1), multiple NN approximators would have to be employed to identify the unknown system dynamics using the existing ANC methods. This makes the controller design a complicated procedure. At the same time, it is a challenge to obtain the convergence of the NN weights for the high-order flexible joint manipulator (1). To solve this problem, the original FJR system (1) is transformed into the normal system (2) by designing the new state variables. The transformed normal system effectively reduces the number of NN approximators, but it contains the unmeasurable states x3 and x4. As a result, a third-order observer (3), instead of a full-state observer, is employed to estimate these unmeasurable states.
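The observer of Lemma 2.1 can be sketched as a numerical differentiator. Below, it is driven by a known test signal ψ(t) = sin(t), so ν2/p and ν3/p² should track ψ̇ and ψ̈ up to O(p) errors; the gains c1 = c2 = 3 make s³ + 3s² + 3s + 1 = (s + 1)³ Hurwitz. All numerical values are illustrative, not the paper's.

```python
import numpy as np

# Third-order high-gain observer (Lemma 2.1), explicit Euler integration.
p, c1, c2 = 0.01, 3.0, 3.0   # small p; c1, c2 chosen so (s+1)^3 is Hurwitz
psi = np.sin                 # measured signal; derivatives are cos, -sin

dt, T = 1e-4, 5.0
nu = np.zeros(3)
for k in range(int(T / dt)):
    t = k * dt
    dnu = np.array([nu[1] / p,
                    nu[2] / p,
                    (-c1 * nu[2] - c2 * nu[1] - nu[0] + psi(t)) / p])
    nu = nu + dt * dnu

print(abs(nu[1] / p - np.cos(T)))      # small: nu2/p estimates psi'
print(abs(nu[2] / p**2 + np.sin(T)))   # small: nu3/p^2 estimates psi''
```

Shrinking p tightens the estimation error bound but amplifies measurement noise and the initial peaking transient, which is why p is a design trade-off.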
The desired reference trajectory yd is chosen as a recurrent signal, generated by the following reference model: where

RBF network and spatially localized approximation
RBF NNs have been widely employed as function approximators due to several significant merits, including their linear-in-parameters structure, universal approximation property and good learning ability (Sanner & Slotine, 1992). For any smooth function Q(X): R^l → R over a compact set Ω ⊂ R^l, there exists an arbitrarily small value η such that sup_{X∈Ω} |Q(X) − WᵀS(X)| < η holds, where W ∈ R^N is the weight vector with N NN nodes and S(X) = [s1(X), s2(X), . . . , sN(X)]ᵀ ∈ R^N is the RBF vector. The RBF si(X) is chosen as: where ξi and ςi are the centre and width of the receptive field, respectively. With a sufficiently large node number N and suitable values of ξi and ςi, the RBF NN WᵀS(X) can approximate any smooth function Q(X) to arbitrary accuracy: where W* is the ideal weight vector, η(X) is the approximation error, and |η(X)| ≤ η* with η* > 0 a small value for all X ∈ ΩX. In Wang and Hill (2006), it has been proven that for any bounded trajectory X(t) within the compact set ΩX, the continuous function Q(X) can be approximated using a limited number of neurons located in a local region close to the trajectory: (8) where ζ indexes the region around the trajectory X(t), and the approximation error ηζ(X) satisfies that |ηζ(X) − η(X)| is small. The convergence of estimated parameters in adaptive systems (Gorinevsky, 1995) is challenging since the PE condition is difficult to verify a priori. According to the results in Wang and Hill (2006), the regressor subvector Sζ(X) satisfies the PE condition along any recurrent orbit X(t). The following lemma states this PE property of RBF NNs.

Lemma 2.2 (Partial PE for RBF NN (Wang & Hill, 2006)):
Consider any continuous recurrent orbit X(t), and assume that X(t) remains in a bounded compact set X . Then, for the RBF NN W T S(X) with centres laid on a regular lattice which is large enough to cover the compact set X , the regression subvector S ζ (X) consisting of RBFs close to the recurrent orbit X(t) is persistently exciting.
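The localized approximation property behind Lemma 2.2 can be illustrated numerically: on a regular lattice of Gaussian RBFs, a smooth function is approximated accurately along a bounded trajectory, while neurons whose centres lie far from the visited region are barely activated. The lattice, width and test function below are illustrative choices, not the paper's.

```python
import numpy as np

# Regular lattice of Gaussian RBFs covering [-3, 3], fitted by least squares.
centres = np.linspace(-3, 3, 31)   # N = 31 nodes, spacing 0.2
width = 0.4

def S(x):
    """RBF regressor vector s_i(x) = exp(-(x - xi_i)^2 / width^2)."""
    return np.exp(-((x - centres) ** 2) / width ** 2)

Q = lambda x: np.sin(2 * x) + 0.5 * x   # stand-in for the unknown smooth dynamics
xs = np.linspace(-2, 2, 400)            # trajectory stays inside [-2, 2]
Phi = np.array([S(x) for x in xs])
W, *_ = np.linalg.lstsq(Phi, Q(xs), rcond=None)

err = np.max(np.abs(Phi @ W - Q(xs)))
print(err)   # small uniform approximation error along the trajectory

# Locality: neurons far from the visited region contribute almost nothing,
# so only the subvector S_zeta of RBFs near the orbit matters.
far = np.abs(centres) > 2.9
print(np.max(np.abs(Phi[:, far])))   # activations of far-away neurons ~ 0
```

This is exactly why only the subvector Sζ(X) near a recurrent orbit needs to satisfy PE: the remaining weights are essentially never excited.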

ANC with prescribed performance
In this section, our main task is to develop a novel adaptive neural controller which guarantees convergence of the tracking error and, at the same time, achieves the prescribed tracking performance. In order to avoid using a large number of NN approximators and to verify the convergence of the NN weights easily, the flexible-joint robot system (1) is transformed into the normal form (2); the control objectives for the transformed system (2) remain unchanged. We then introduce a performance function which describes the prescribed performance, such as convergence rate, maximum overshoot and steady-state error. To solve the constrained tracking control problem, a performance transformation technique is introduced to transform the constrained tracking error into an equivalent unconstrained one. Finally, combining the backstepping method with RBF NN approximators, we design an adaptive neural controller and present the corresponding stability analysis.

Constrained error transformation
According to Lemma 2.1 and Equation (4), it can be concluded that ν2/p and ν3/p² estimate the unmeasurable states x3 and x4 well. To facilitate the analysis, define the following error vectors: e, ê = [e1, e2, ê3, ê4]ᵀ and ẽ, where e is the ideal error vector (involving the unmeasurable x3, x4), ê is the actual error vector constructed from the observer states, and ẽ refers to the observation error vector. In this paper, the tracking error e1 = x1 − xd1 is required to satisfy the following prescribed performance condition: where σ̄ and σ̲ are positive design constants and ρ(t) denotes a performance function chosen as where ρ0 > ρ∞ > 0 and s > 0 are predefined parameters.
From (13), the performance function ρ(t) is positive, smooth and exponentially decreasing with lim_{t→∞} ρ(t) = ρ∞ > 0. According to the performance function (13) and the error constraint (12), the prescribed performance can be detailed as follows: (1) σ̄ρ0 and −σ̲ρ0 denote the upper and lower bounds on the overshoot of the tracking error e1, respectively. (2) A lower bound on the convergence rate of e1 is set by the decay rate s introduced by the exponential term e^{−st}. (3) The upper and lower bounds of the allowable steady-state tracking error e1 are σ̄ρ∞ and −σ̲ρ∞, respectively.
By properly tuning the parameters σ̄, σ̲, s, ρ0 and ρ∞, different prescribed tracking error performances can be obtained to satisfy real industrial demands.
To achieve trajectory tracking control with prescribed performance, a performance transformation method is introduced to convert the constrained error e1(t) into an equivalent unconstrained error z1(t). Define a smooth, strictly increasing function Φ(z1) satisfying the following conditions: In this paper, the transformation function Φ(z1) is constructed as where δ = ln(σ̲/σ̄). An illustration of the function Φ(z1) is shown in Figure 1.
Remark 3.1: Note from Figure 1 that the designed function Φ(z1) in (15) passes through the origin. This is a significant feature for the asymptotic convergence of the tracking error e1. From (15) and (16), e1 = 0 is guaranteed when Φ(0) = 0. In other words, the tracking error e1 converges to zero, rather than merely to a neighbourhood of zero, when the transformed error z1 converges to zero.
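The funnel ρ(t) and the transformation Φ can be sketched as follows. The form of Φ below is one common choice with Φ(0) = 0, bounded by (−σ̲, σ̄) and invertible in closed form; its shift constant is δ = ½·ln(σ̲/σ̄), which may differ in detail from the paper's (15) with δ = ln(σ̲/σ̄). The numbers are illustrative (σ̲ = 1, σ̄ = 1.2 follow the paper's simulation).

```python
import numpy as np

rho0, rho_inf, s = 2.0, 0.1, 1.0
sigma_u, sigma_l = 1.2, 1.0            # upper/lower constraint constants

rho = lambda t: (rho0 - rho_inf) * np.exp(-s * t) + rho_inf  # performance funnel
delta = 0.5 * np.log(sigma_l / sigma_u)                      # makes Phi(0) = 0

def Phi(z):
    """Strictly increasing, maps R onto (-sigma_l, sigma_u), Phi(0) = 0."""
    y = z + delta
    return (sigma_u * np.exp(y) - sigma_l * np.exp(-y)) / (np.exp(y) + np.exp(-y))

def Phi_inv(w):
    """Unconstrained error z1 from the normalised error w = e1 / rho(t)."""
    return 0.5 * np.log((w + sigma_l) / (sigma_u - w)) - delta

print(Phi(0.0))                 # 0: z1 = 0 corresponds exactly to e1 = 0
print(Phi(10.0), Phi(-10.0))    # saturates at sigma_u and -sigma_l
e1, t = 0.3, 1.0
z1 = Phi_inv(e1 / rho(t))
print(Phi(z1) * rho(t))         # recovers e1: the transformation is invertible
```

As long as z1(t) stays bounded, e1(t) = ρ(t)·Φ(z1(t)) automatically stays inside the funnel (−σ̲ρ(t), σ̄ρ(t)), which is the whole point of the transformation.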

ANC and stability analysis
In this subsection, the backstepping design is carried out based on the unconstrained error variable z1(t), and the unknown dynamics of the flexible-joint robot system are approximated by an RBF NN.
According to (28), the desired control law is designed as where k4 > 0 is a design constant. Noting that the system dynamics Q(X) is unknown, an RBF NN WᵀS(X) is employed to approximate it. Then we have where the NN input vector is X = [q1, q̇1, q2, q̇2, f4(·)]ᵀ ∈ R⁵, W* is the ideal constant weight vector, and η with |η| ≤ η* is the NN approximation error with an arbitrarily small positive constant η*. Let Ŵ be the estimate of W*, and define the estimation error W̃ = Ŵ − W*. Then the adaptive neural controller can be designed as with the update law for the NN weights Ŵ given by where Γ > 0 is a designed diagonal matrix and σ > 0 is a small value introduced to improve the robustness of the adaptive controller.
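The structure of the σ-modified update law (33) can be illustrated on a scalar analogue (not the paper's controller): a plant ẋ = θ·sin(x) + u with unknown θ tracks a recurrent reference, so the regressor is persistently exciting and the estimate converges near the true parameter, while the σ-term keeps it bounded at the cost of a small bias. All gains are illustrative.

```python
import numpy as np

# Scalar analogue of (32)-(33): certainty-equivalence control plus a
# sigma-modified gradient update for the unknown parameter.
theta, k, gamma, sigma = 2.0, 5.0, 10.0, 0.001

dt, T = 1e-3, 40.0
x, th_hat = 0.0, 0.0
for i in range(int(T / dt)):
    t = i * dt
    xd, dxd = np.sin(t), np.cos(t)          # recurrent reference -> PE regressor
    e = x - xd
    u = dxd - k * e - th_hat * np.sin(x)    # cancels the estimated nonlinearity
    x += dt * (theta * np.sin(x) + u)
    th_hat += dt * gamma * (np.sin(x) * e - sigma * th_hat)  # sigma-modified law

print(abs(x - np.sin(T)))    # small tracking error
print(abs(th_hat - theta))   # estimate close to the true parameter (small bias)
```

Setting σ = 0 removes the bias but loses the robustness guarantee; this is the same trade-off the paper invokes for the vector-valued law (33).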
Theorem 3.1 (Stability and Tracking): Consider the closed-loop system consisting of the flexible joint manipulator (1) with the prescribed performance restriction (12), the reference recurrent model (5), the error transformation (17), and the ANC law (32) together with the neural weight update law (33). If the bounded initial conditions satisfy the prescribed performance (12), there exists a finite time T such that the proposed control scheme ensures: • all signals in the closed-loop system are uniformly ultimately bounded (UUB); • the tracking error e1 converges to a small neighbourhood of zero while satisfying the prescribed performance restriction (12).

Learning from ANC
Based on the stable ANC scheme proposed in Section 3, this section discusses how to achieve knowledge acquisition, expression and storage of the unknown system dynamics using only a limited number of neurons located in a local region along the trajectory. Without accurate NN weight convergence, it is hard to acquire and store the knowledge carried by the time-varying NN weights. It has been shown in Wang and Hill (2006) that the convergence of the NN weights is related to the PE condition of the regression vector S(X). From Lemma 2.2, the PE condition can be achieved if the recurrent property of the NN input X can be verified. Subsequently, the converged NN weights can be obtained and stored as experience knowledge.
Using the spatially localized approximation property of the RBF NN presented in (7), the closed-loop system consisting of (33) and (34) can be expressed as: where Sζ(X) is the subvector of S(X) consisting of RBFs close to the reference orbit X(t), Ŵζ is the corresponding subvector of estimated weights with fewer than N entries, ζ̄ denotes the region far away from the orbit X(t), and η′ζ = η − W̃ζ̄ᵀSζ̄(X) is the approximation error along the reference orbit; ||η′ζ| − |η|| is close to zero because of the small value of W̃ζ̄ᵀSζ̄(X).
Proof: In order to obtain exponential stability of the closed-loop system, the system (36) needs to be described as a class of LTV systems with small perturbations, whose exponential stability has been proven under certain conditions (Liu et al., 2009). where However, for an actual flexible manipulator, a large joint stiffness K together with the inertia parameters I, J may make the term G = K/(IJ) large, and the disturbance Gηζ may then also be large; in that case, the existing exponential convergence result becomes invalid. Therefore, the following two cases are discussed in this paper: (1) When G is small, Gηζ is a small perturbation. As a result, the system (38) can be regarded as a class of LTV systems with small perturbations.
(2) When G > 1 is large, a state transformation zs = z4/G is introduced to make the perturbation Gηζ small. Then the system (38) can be transformed as The LTV system with small perturbations is exponentially stable if the following conditions are satisfied: C1: B(t) satisfies the PE condition; C2: there exists a symmetric positive definite matrix P(t) such that P(t)B(t) = C(t) holds; C3: Aᵀ(t)P(t) + P(t)A(t) + Ṗ(t) < 0 holds.
Condition C1 requires verifying that Sζ(X) satisfies the PE condition. As stated in Lemma 2.2, the regression vector Sζ(X) is persistently exciting if the NN input X = [q1, q̇1, q2, q̇2, f4(·)]ᵀ is recurrent. From Theorem 3.1 and the stability analysis in Appendix 1, zi and e1 converge exponentially to small neighbourhoods of zero for t > T. Since e1 = x1 − xd1 and z2 = x2 − α1, the states x1 = q1 and x2 = q̇1 follow the recurrent signals xd1 and α1 for all t > T, respectively. Noting z3 = ν2/p − α2, ν2/p is recurrent with the same period as α2. From Lemma 2.1, the state x3 is also recurrent, like ν2/p. By combining the original system (1) and the transformed system (2), it can be concluded that q2 is recurrent. Recursively, the recurrent property of f4(·) can also be obtained. Hence, the regression vector Sζ(X) satisfies the PE condition due to the recurrent input X(t). Meanwhile, a positive definite P(t) satisfying conditions C2 and C3 can easily be found. For example, by choosing P(t) = Γζ/G or P(t) = GΓζ (for large Gηζ), we obtain P(t)B(t) = C(t) and Aᵀ(t)P(t) + P(t)A(t) + Ṗ(t) < 0 if the positive constant k4 is appropriately designed.
Based on perturbation theory (Lemma 4.6 in Khalil, 1996) and W̃ζ = Ŵζ − W*ζ, the estimates Ŵζ converge exponentially to small neighbourhoods of the optimal weights W*ζ for t > T. Therefore, the constant converged weights W̄ can be obtained according to (37), and the unknown system dynamics can be approximated by where ηζ1 and ηζ2 are close to the ideal approximation error η* because of the small convergence errors W̃ζ. Noting that ζ̄ denotes the region far away from the orbit X(t), the NN weights Ŵζ̄ are barely activated or updated once Ŵ(0) is chosen; in other words, W̃ζ̄ remains small (close to zero). Hence (40) can be rewritten as where η2 = ηζ2 − W̃ζ̄ᵀSζ̄(X) with ||η2| − |ηζ2|| small. This completes the proof.

Learning control using learned knowledge
In the previous section, it was shown how to achieve knowledge acquisition, expression and storage of the unknown dynamics Q(X). In this section, the stored NN weights W̄ in (37) are reused to construct a neural learning controller for the same control task. The control performance is improved with the following neural learning controller where the learned constant NN weights W̄ are obtained from (37), and the definitions of z3 and z4 are the same as in the ANC process in Section 3. For clarity, the schematic diagram of the proposed neural learning control scheme is shown in Figure 2. In what follows, the following results can be obtained based on the proposed neural learning controller (42).
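The benefit of reuse can be illustrated with the same scalar analogue used for the adaptive law (not the paper's controller): the plant is run once with an adaptive estimate starting from zero, and once with a stored constant estimate playing the role of W̄ in (42). The learned controller skips the online adjustment, so its transient is smaller. All values are illustrative.

```python
import numpy as np

# Experience reuse on a scalar plant x' = theta*sin(x) + u.
theta, k, gamma, sigma = 2.0, 5.0, 10.0, 0.001
th_bar = theta    # stand-in for the converged, stored weights W_bar

def run(adaptive, T=3.0, dt=1e-3):
    """Return the worst tracking error after the initial transient."""
    x, th = 0.5, (0.0 if adaptive else th_bar)
    worst = 0.0
    for i in range(int(T / dt)):
        t = i * dt
        xd, dxd = np.sin(t), np.cos(t)
        e = x - xd
        u = dxd - k * e - th * np.sin(x)
        x += dt * (theta * np.sin(x) + u)
        if adaptive:   # online adjustment only in the adaptive (ANC) phase
            th += dt * gamma * (np.sin(x) * e - sigma * th)
        if t > 0.5:
            worst = max(worst, abs(e))
    return worst

e_adapt, e_learn = run(True), run(False)
print(e_adapt, e_learn)   # learning controller: smaller post-transient error
```

Because the learned controller carries no update law, it also avoids the per-step cost of adapting the weights, which is the source of the running-time savings reported in Section 6.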

Simulation results
In this section, the simulation of the fourth-order one-link flexible manipulator is performed to test the effectiveness of the proposed ANC scheme and the NN learning control scheme. Considering the dynamic model of the one-link flexible manipulator system (1), the actual parameters of the system are chosen as M = 2.3 kg, L = 1 m, g = 9.8 m/s², I = 2.3 kg·m², J = 0.5 kg·m², K = 15. The recurrent reference trajectory yd is chosen as which is shown in Figure 3. The tracking error e1 is required to satisfy the prescribed performance described by where the two design constants in (12) are chosen as σ = 1 and σ = 1.2.

Learning from stable ANC
In order to demonstrate the better transient performance of the proposed adaptive neural controller (32)-(33), a simulation comparison is performed with the existing adaptive DSC method of Wang and Wang (2015a), which has no prescribed performance. For the comparison, we use the flexible manipulator (1) with the same system parameters and the same reference trajectory (43). The transient and steady-state performance of e1 under both control schemes is shown in Figure 4. It can be seen that a faster convergence speed and a smaller overshoot of the tracking error e1 are obtained with the proposed ANC method than with the DSC method in Wang and Wang (2015a). From Figure 4, the tracking error obtained in Wang and Wang (2015a) not only violates the prescribed performance (44), but also shows damped oscillations. On the other hand, according to Appendix 1, the tracking error e1 of the proposed ANC method can be made arbitrarily small by choosing large k2, k4 and small σ. In particular, two NN approximators have to be used in Wang and Wang (2015a) to identify the uncertain system dynamics of the flexible manipulator (1), which makes the time consumption much larger than with the proposed method. For example, on the same computer setup, the actual time consumption is 2548 s using the proposed method and 4005 s using the existing method in Wang and Wang (2015a). Figure 5 indicates that the state observer outputs estimate the unmeasurable system states well. The bounded control input u is shown in Figure 6. From Figures 3-6, the proposed adaptive neural controller (32)-(33) guarantees that all signals of the closed-loop system are bounded and that the tracking error remains within the predefined bound (44) at all times.

Learning control with experience reuse
After achieving stable ANC in Section 6.1, the main objectives of this section are to show that the neural weight estimates Ŵ exponentially converge to constant weights W̄; that the unknown system dynamics Q(X) in (29) can be accurately identified by the constant RBF NN W̄ᵀS(X); and that the constant RBF NN can be reused to design a neural learning controller with improved control performance for the same control task.
As stated in Section 4, the regression vector S(X) satisfies the partial PE condition when the system output tracks the given periodic reference trajectory (see Figure 3 for the output tracking performance). Under the PE condition, the NN weight estimates Ŵ converge to constant values, as shown in Figure 7. From Figure 7, the NN weight estimates achieve good convergence within the time interval [450 s, 500 s]. Therefore, based on Theorem 4.1, the converged constant NN weights W̄ can be calculated by Using the constant NN weights W̄ in (45), we can design the neural learning controller (42). To show the improved control performance for the same control task using the experience knowledge, a simulation comparison is given between the proposed ANC (32)-(33) and the proposed learning controller (42) with the experience knowledge (45). For the comparison, the flexible manipulator (1) is again used as the plant, now controlled by the neural learning controller (42). The reference trajectory and the system initial condition are the same as those used in Section 6.1. Simulation results are shown in Figures 8-9. From Figure 8, it can be seen that the unknown system dynamics Q(X) is well approximated by the constant RBF NN W̄ᵀS(X) with the experience knowledge (45). From Figure 9, it is clear that the tracking error transient performance in the learning control phase is better than in the ANC phase. Moreover, compared with the ANC (32)-(33), the proposed neural learning controller effectively recalls the experience knowledge (45), so that the running time is reduced by nearly two-thirds and the computational burden is greatly eased, since the NN weight estimates tuned in the ANC process need not be recalculated.
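The averaging step behind (45) can be sketched as follows: the converged weight estimates Ŵ(t) still carry a small residual ripple, which a time average over a post-convergence window removes. The recorded trajectory below is synthetic (a converged value plus a small oscillation), since the paper's actual Ŵ(t) comes from the closed-loop simulation; the window [450 s, 500 s] follows Figure 7.

```python
import numpy as np

# Synthetic recorded weight-estimate trajectory: converged values W_star
# plus a small residual ripple, sampled every 0.01 s for 500 s.
dt, T = 0.01, 500.0
t = np.arange(0.0, T, dt)
W_star = np.array([0.8, -0.3, 1.5])           # stand-in for converged weights
ripple = 0.05 * np.sin(2 * np.pi * 0.2 * t)   # small residual oscillation
W_hat = W_star[None, :] + ripple[:, None] * np.array([1.0, -0.5, 0.3])

# W_bar = time average of W_hat over the post-convergence window [ta, tb].
ta, tb = 450.0, 500.0
mask = (t >= ta) & (t < tb)
W_bar = W_hat[mask].mean(axis=0)
print(np.max(np.abs(W_bar - W_star)))   # averaging removes the ripple
```

The resulting constant vector W̄ is what the learning controller (42) stores and reuses, with no further online updating.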

Conclusions
In this paper, we addressed learning from ANC of FJR under a predefined performance constraint. By combining a system transformation method with a high-gain observer, the proposed ANC scheme guaranteed that the tracking error converged to a small residual set around zero in a finite time T and that all signals in the closed-loop system were UUB. Simultaneously, the prescribed performance was guaranteed by the error transformation method, which converted the original constrained tracking error control problem into an equivalent unconstrained error stabilization problem. In particular, the proposed adaptive neural control scheme used only one NN approximator, so that the partial PE condition of the RBF NN could easily be verified. Under the partial PE condition, the proposed ANC achieved knowledge acquisition, representation and storage of the unknown system dynamics. Using the stored experience knowledge without online NN adjustment, the neural learning control scheme was proposed to obtain better transient and steady-state performance for the same or a similar control task. Further work may extend the proposed scheme to other kinds of control problems, including output-feedback control (Wang, Liu, Li, & Wang, 2018; Wang, Liu, & Shi, 2017a) and leader-follower formation control (He, Wang, Dai, & Luo, 2018).

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported in part by the National Natural Science