Online reinforcement learning control of a robotic arm in the presence of high variation in friction forces

The operation and accuracy of industrial robotic arms can be negatively affected by significant fluctuations in friction forces within their joints, potentially resulting in financial and operational losses. To mitigate these issues, an online model-free reinforcement learning controller specifically designed to handle high variations in the joints' friction forces is proposed. To the best of our knowledge, this is the first time a reinforcement learning controller has been used to handle high friction variations in a robotic arm. Initially, the dynamic equations of the robotic arm are derived, verified and validated to ensure an accurate representation of real-world behaviour. The stability of the closed-loop system is analyzed using the Lyapunov second method. The performance of the proposed controller in terms of position tracking is compared against four controllers commonly used in the literature for similar applications: (i) a nonlinear model-based computed torque controller, (ii) a proportional-derivative controller, (iii) an adaptive iterative learning controller and (iv) a radial basis function neural network adaptive controller. Simulation results demonstrate that the reinforcement learning controller outperforms the other controllers in terms of tracking performance, even in the presence of significant variations in joint friction forces.


Introduction
Motion control of robotic manipulators is challenging, mainly due to the presence of joint friction, nonlinearity and coupled dynamics. The motion control of a robot becomes even more challenging, especially when the robot operates in harsh environments (i.e. in the presence of dust and debris) (Wong et al., 2018). Usually, high friction forces in the manipulator's joints are the most dominant factor affecting controller performance. Friction causes stick-slip in relative motion, affects the durability and reliability of systems, and leads to significant performance losses if it is not adequately considered (Liu et al., 2019a). Friction effects can be cancelled by using friction compensation. This can be done by incorporating an estimate of the friction force into the control signal to cancel out its influence (Caldarelli et al., 2022). Model-based friction compensation schemes require an in-depth understanding of friction characteristics (Gao et al., 2022a). In the last decades, the friction phenomenon has been studied extensively and many friction models have been proposed, including the Coulomb model, viscous model, Stribeck model, Dahl model, LuGre model and Generalized Maxwell-Slip model (Gao et al., 2022a). Generally, utilizing any of the aforementioned friction models within a model-based friction compensation scheme assumes that the model parameters are constant or slowly varying in nature (Huang et al., 2019). This assumption is often invalid, particularly when the robot operates under harsh conditions, as in mining, grinding and polishing applications. In such applications there is a significant presence of pollutants like dust and debris, causing the frictional forces to increase. Therefore, model-based compensation approaches usually lose their effectiveness (Chew et al., 2021; Yin et al., 2021). Consequently, model-based controllers are usually combined with another control strategy that helps in compensating for uncertainties, as in Abraham et al. (2020).
An alternative to the model-based controller is the model-free controller, which does not depend on any mathematical model of the manipulator. Intelligent controllers are among the most promising model-free control strategies. Inspired by biological systems and human cognitive capabilities, they possess learning and adaptation capabilities. Different intelligent control strategies have been investigated in the literature. In Esmaeili et al. (2019), a data-driven observer with an adaptive sliding mode controller for manipulators was investigated. An iterative feedback model-free adaptive learning control of a pneumatic artificial muscle was investigated in Wu et al. (2019). Aliman et al. (2022) designed an adaptive fuzzy proportional-derivative (PD) controller for a rehabilitation lower-limb exoskeleton. In Gundogdu and Celikel (2021), a nonlinear autoregressive control scheme was proposed to control a single-link manipulator at low speed.
Many of the intelligent control strategies mentioned earlier require offline training and tuning. However, when a robot operates in a harsh industrial environment (e.g. a dusty one), performance with offline training strategies deteriorates because the friction forces can vary significantly. Therefore, learning approaches that estimate and compensate for friction forces online have shown promising performance, as in Roveda et al. (2022).
Recently, some promising artificial intelligence (AI) techniques that are capable of performing training and adaptation online have been utilized to control robots in the presence of uncertainties and disturbances. In Chen et al. (2020), a linear quadratic regulator (LQR) with a radial basis function neural network (RBFNN) was employed to enhance the tracking performance under variable admittance control for human-robot collaboration. In Kumar and Rani (2021), a model-free scheme was integrated with an RBFNN to compensate for the unknown dynamics and uncertainties. In Lee et al. (2019), an adaptive iterative learning control (ILC) algorithm was proposed to adaptively identify the friction model over multiple iterations. In Gao et al. (2022b), an online adaptive backstepping integral nonsingular terminal sliding mode control was proposed for precision trajectory tracking of manipulators under unknown dynamics and external disturbances. Cremer et al. (2020) investigated a model-free online neuro-adaptive controller with inner and outer neural networks (NNs) for human-robot interaction. An active inference online joint-space torque controller for manipulators was proposed in Pezzato et al. (2020). In Liu et al. (2019b), RBFNN-based tracking control of underactuated systems with unknown parameters and with matched and mismatched disturbances was developed and tested on a two-link planar manipulator. In Zhang et al. (2022), a PD controller with an augmented NN was utilized to compensate for both the continuous dynamics and the discontinuous friction of a two-degree-of-freedom (DOF) robotic arm.
Another promising model-free AI control approach that can deal with nonlinearities, uncertainty and significant variation in the friction forces is reinforcement learning (RL). RL is a data-driven decision-making framework that focuses on the interactions of an agent with its environment, where the agent tries to find a set of actions that maximizes the cumulative reward (Li & Deng, 2021). Pane et al. (2019) tested an RL-based controller on a robot manipulator and compared its performance with a PD controller, model predictive control (MPC) and ILC. The results showed better performance with the RL-based controller than with PD, MPC and ILC. An RL tracking controller with a kernel-based transition dynamic model was proposed in Hu et al. (2020). In this approach, a reward function was defined according to the features of tracking control to speed up the learning process. The results showed that the proposed algorithm had better tracking performance than NN and adaptive NN controllers when tested on a 2-DOF robotic arm. In Ouyang et al. (2020), adaptive control with actor-critic RL was proposed for a 2-DOF arm with elastic joints. The tracking performance with the RL controller was better than that obtained with a PD controller. In Lee and An (2021), an RL-NN-based controller was developed and tested experimentally on a self-balancing quadruped robot. Their results revealed a promising control algorithm that can replace mathematically based robot control systems. An RL-based optimal controller was utilized in Liu et al. (2022) to minimize the tracking errors of a shape-memory-alloy-actuated manipulator.

Paper contributions
Based on the conducted literature review, and to the best of our knowledge, RL has not been evaluated in the position tracking of a robotic arm in the presence of high variation in the joints' friction forces. On the other hand, friction variation in a robotic arm is unavoidable and is more significant when the robot operates in a harsh industrial environment. If the controller does not compensate for friction variation in real time, the tracking performance will degrade significantly.
In this paper, an online model-free RL control approach is utilized to control a 3-DOF robotic arm in the presence of high variation in its joints' friction forces. The tracking performance of the proposed controller is compared against four controllers using different desired trajectories. The four controllers selected for comparison represent the major control strategies usually used for similar applications, namely: a nonlinear model-based computed torque (CT) controller, a linear control strategy (i.e. PD), and model-free adaptive control strategies (i.e. ILC and RBFNN).

Paper organization
The remainder of the manuscript is organized as follows: the robot arm dynamics are derived, validated and verified in Section 2; Section 3 introduces the proposed RL controller; the simulation results and comparison with other controllers are presented in Section 4; finally, concluding remarks are presented in Section 5.

Robotic arm modelling, validation and verification
The dynamic model of the 3-DOF articulated robotic arm shown in Figure 1 is derived using the well-known energy-based Lagrangian dynamic formulation. The dynamic equation of the manipulator is represented in state space as in (1), where T is the joints' torque, $\theta$ is the joint angle, $\dot{\theta}$ is the joint angular velocity, $\ddot{\theta}$ is the joint angular acceleration, $M(\theta)$ is the mass matrix, $V(\theta, \dot{\theta})$ is the vector of centrifugal and Coriolis forces, $G(\theta)$ is the gravity force vector and $\tau_f$ is the friction torque vector.
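As a minimal sketch of how (1) can be simulated, the snippet below rearranges the assumed rigid-body form $T = M(\theta)\ddot{\theta} + V(\theta,\dot{\theta}) + G(\theta) + \tau_f$ to solve for the joint accelerations; `M_fn`, `V_fn`, `G_fn` and `tau_f_fn` are hypothetical placeholders for Equations (2)-(14):

```python
import numpy as np

def forward_dynamics(theta, theta_dot, T, M_fn, V_fn, G_fn, tau_f_fn):
    """Solve the assumed form of Eq. (1) for the joint accelerations.

    M_fn, V_fn, G_fn and tau_f_fn are hypothetical callables standing in
    for the mass matrix, Coriolis/centrifugal vector, gravity vector and
    friction torque of Equations (2)-(14).
    """
    rhs = T - V_fn(theta, theta_dot) - G_fn(theta) - tau_f_fn(theta_dot, T)
    return np.linalg.solve(M_fn(theta), rhs)  # joint accelerations
```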
$M(\theta)$ is given in Equations (2)-(6), where $m_i$ is the mass of link i, $I_i$ is the inertia tensor of link i, $L_i$ is the length of link i and $L_{ic}$ is the length to the center of mass of link i, for i = 1 to 3. $G(\theta)$ is given in Equations (11)-(13), where g is the gravitational acceleration.
In the developed model, the Stribeck friction model was adopted to represent the joints' friction forces, as shown in Figure 2. The friction torque vector $\tau_f$ is given in (14), where $f_s$ is the static friction coefficient, $f_e$ is the applied joint torque, $f_c$ is the Coulomb friction coefficient, $f_v$ is the viscous friction coefficient and $\dot{\theta}_s$ is the Stribeck velocity. The mass matrix $M(\theta)$ is always positive definite; that is, $\dot{\theta}^T M(\theta)\,\dot{\theta}$ is always greater than zero, as stated in Siciliano and Khatib (2016). This condition is evaluated using different sets of joints' trajectories: Figure 3(a) shows one of those joints' trajectory sets, and Figure 3(b) shows that the positive definite condition is satisfied. Moreover, the developed manipulator model has been validated against the experimental results presented in de Jesús Rubio et al. (2014), whose robot is similar to the robotic arm investigated in this paper. The validation approach presented in de Jesús Rubio et al. (2014) was followed: Figure 4(a) shows the input voltage for each joint, and Figure 4(b) shows the angle of each joint. The obtained results are very similar to the results presented by those authors, which validates the robot model developed in this study.
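To make the adopted friction model concrete, the following sketch implements a common formulation of the Stribeck model for a single joint; the exact form of Equation (14) may differ in detail, and the stiction branch based on the applied torque $f_e$ is an assumption:

```python
import numpy as np

def stribeck_friction(theta_dot, f_e, f_s, f_c, f_v, theta_dot_s):
    """Stribeck friction torque for one joint (a common formulation,
    not necessarily identical to the paper's Eq. (14))."""
    if np.isclose(theta_dot, 0.0):
        # Stiction branch: friction balances the applied torque f_e,
        # saturating at the static friction level f_s (assumption).
        return float(np.clip(f_e, -f_s, f_s))
    # Sliding branch: Coulomb + Stribeck (exponential) + viscous terms.
    return (f_c + (f_s - f_c) * np.exp(-(theta_dot / theta_dot_s) ** 2)) \
        * np.sign(theta_dot) + f_v * theta_dot
```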

Reinforcement learning controller
RL is inspired by how a human learns when faced with an ambiguous task. In contrast to supervised learning algorithms, which aim to map between known inputs and outputs, the RL algorithm is based on rewarding desired behaviour or punishing undesired behaviour without any prior knowledge of the inputs and outputs.

RL controller design
The controller proposed in this paper contains two NNs, namely a critic and an actor network, as shown in Figure 5.
The actor network provides the actual control signal (u), which is the control torque sent to the arm actuators, while the critic network provides long-term performance information (Q) to the actor network, which is used to improve the actor's performance. Actor-critic RL is suitable for learning control policies online and is capable of adapting to time-varying system parameters such as the joint friction in the robot arm. The objective of the controller is to bring the joints to the desired angles $\theta_d(t)$ while ensuring the stability of the closed-loop system. For a given desired angle trajectory $\theta_d(t)$, the angle tracking error vector is given as $e(t) = \theta_d(t) - \theta(t)$,
where $\theta_d(t)$ and $\theta(t)$ are the desired and actual joint trajectories, respectively. A utility function p(t) is defined to measure the current system performance, i.e. the current state of the angle tracking error, and is expressed as $p(t) = \|e(t)\|_1$, where $\|\cdot\|_1$ is the first norm of the angle tracking error. The long-term performance of the system, which represents the long-term effect of a given control action on the tracking error, is defined as a discounted sum of the utility over the control horizon, where u(t) is the current control action, $\alpha$ is the discount factor, a positive constant with $0 < \alpha < 1$, and R is the control horizon, selected as R > 1.
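A short sketch of these two quantities follows; the discounted sum over the horizon is one plausible reading of the paper's definition of Q(t), not its exact equation:

```python
import numpy as np

def utility(e):
    # p(t): first norm (1-norm) of the joint tracking error vector e(t).
    return float(np.sum(np.abs(e)))

def long_term_performance(p_future, alpha=0.99):
    """Discounted sum of the utility over the control horizon R, where
    p_future = [p(t), p(t+1), ..., p(t+R)]; a plausible reading of Q(t),
    not necessarily the paper's exact equation."""
    return sum(alpha ** k * p for k, p in enumerate(p_future))
```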

Critic network
The critic NN is utilized to approximate the long-term performance Q(t) as $\hat{Q}(t) = W_c^T(t)\,N_c(t)$ (17), where $\hat{Q}(t)$ is the approximation of the long-term performance Q(t), and $W_c(t)$ and $N_c(t)$ are the weights of the output layer and the output of the hidden layer of the critic NN, respectively. The activation function $N_c(t)$ is defined in (18), where $W_{ci}$ are the input-hidden weights of the critic network, Z(t) is the critic network input vector, defined as Z(t) = [e(t), u(t)], and $\sigma$ is the smoothing parameter.
The critic prediction error $E_c(t)$ is the difference between the predicted and the actual long-term performance, $E_c(t) = \hat{Q}(t) - Q(t)$. Substituting (17) into (20) expresses this error in terms of the critic weights. Based upon the prediction error $E_c(t)$, and utilizing (18), the update rule of the critic NN is expressed as in (22), where $\alpha_c$ is the adaptation gain of the critic NN weights.
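A sketch of the critic's forward pass and weight update, assuming a Gaussian hidden layer (implied by the smoothing parameter $\sigma$) and a plain gradient step on the squared prediction error; `Q_target` is a hypothetical stand-in for the measured long-term performance:

```python
import numpy as np

def critic_forward(z, W_ci, W_c, sigma):
    """Critic output Q_hat(t) = W_c^T N_c(t); z = [e(t), u(t)] is the
    network input and W_ci holds the input-hidden weights (centres)."""
    N_c = np.exp(-np.sum((W_ci - z) ** 2, axis=1) / (2.0 * sigma ** 2))
    return float(W_c @ N_c), N_c

def critic_update(W_c, N_c, Q_hat, Q_target, alpha_c=0.1):
    # Gradient step on 0.5 * E_c^2 with E_c = Q_hat - Q_target;
    # a sketch of the update rule (22), not its exact form.
    E_c = Q_hat - Q_target
    return W_c - alpha_c * E_c * N_c
```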

Actor network
The actor network is used to approximate the control signal u(t) as $u(t) = W_a^T(t)\,N_a(t)$, where $W_a(t)$ and $N_a(t)$ are the weights of the output layer and the output of the hidden layer of the actor NN, respectively.
The activation function $N_a(t)$ is defined analogously to (18), where $W_{ai}$ are the input-hidden weights of the actor network and $\sigma$ is the smoothing parameter.
The update law of the actor network weights is given in (25), where $\alpha_a$ is the actor network adaptation gain.
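A corresponding sketch of the actor, again assuming a Gaussian hidden layer; the update follows the usual actor-critic pattern of descending the critic's cost estimate through its sensitivity to the control action (`dQ_du`), which is an assumption about the exact form of (25):

```python
import numpy as np

def actor_forward(e, W_ai, W_a, sigma):
    """Actor control torque u(t) = W_a N_a(t) for tracking error e(t)."""
    N_a = np.exp(-np.sum((W_ai - e) ** 2, axis=1) / (2.0 * sigma ** 2))
    return W_a @ N_a, N_a

def actor_update(W_a, N_a, dQ_du, alpha_a=100.0):
    # Descend the critic's long-term cost estimate:
    # dQ/dW_a = (dQ/du) * N_a^T (hypothetical gradient form).
    return W_a - alpha_a * np.outer(dQ_du, N_a)
```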

Network convergence analysis
According to the universal approximation theorem, a NN with a single hidden layer that has a sufficient number of hidden neurones can approximate any nonlinear function under certain conditions. For a given function Y and its NN approximation $\hat{Y}$, this approximation property is typically expressed as $Y = W^T N + \epsilon$, where $\epsilon$ is the approximation error of the NN, bounded as $|\epsilon| \le \zeta$ for a positive number $\zeta$, W is a constant ideal weight vector and N is the output of the hidden layer of the NN. The critic and actor networks are used to approximate the long-term performance and the control law, respectively, so the weights of these networks are expected to vary in a bounded range, and both networks will produce approximation errors bounded as $|\epsilon| \le \zeta$.

Stability analysis
To ensure the stability of the closed-loop system, the Lyapunov second method is used for this analysis. If the adaptation gains are selected as in (28) and (29), then the closed-loop system is semi-globally ultimately bounded (SGUB). Proof: Let us define the weight errors $E_{wa}(t)$ and $E_{wc}(t)$ of the actor and critic networks, respectively, as $E_{wa}(t) = W_a(t) - W_a^*$ and $E_{wc}(t) = W_c(t) - W_c^*$, where $W_a(t)$ and $W_c(t)$ are the actual output weights of the actor and critic, respectively, and $W_a^*$ and $W_c^*$ are the constant ideal output weights of the actor and critic, respectively.
Let us consider a Lyapunov function candidate as follows.
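A representative quadratic candidate over the weight errors, of the kind commonly used for such proofs, is sketched below; the paper's actual candidate may include additional tracking-error terms:

```latex
L(t) = L_1(t) + L_2(t), \qquad
L_1(t) = \tfrac{1}{2}\,\operatorname{tr}\!\big(E_{wa}^{T}(t)\,E_{wa}(t)\big), \qquad
L_2(t) = \tfrac{1}{2}\,\operatorname{tr}\!\big(E_{wc}^{T}(t)\,E_{wc}(t)\big)
```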
The time derivative of the first Lyapunov function $L_1(t)$ is evaluated and, after simplification, satisfies $\dot{L}_1(t) \le 0$; the same procedure leads to $\dot{L}_2(t) \le 0$. Since both $\dot{L}_1(t) \le 0$ and $\dot{L}_2(t) \le 0$, the system is semi-globally ultimately bounded. The proof is complete.

Simulation results
To test the proposed controller, the previously derived dynamic equations of the robot were implemented and simulated in MATLAB with the ode14x solver on a computer running Windows 10 with a Core i5 processor. The input-hidden weights for the critic and actor networks, $W_{ai}$ and $W_{ci}$, are initialized as $W_{ai} = [-1 : 0.1 : 1]$ and $W_{ci} = [-1 : 0.1 : 1]$. The output weights of the critic and actor networks, $W_c$ and $W_a$, are initialized to zeros. The critic and actor gains are selected as $\alpha_c = 0.1$ and $\alpha_a = 100$, respectively. The values of $\alpha$ and $\sigma$ were chosen to be 0.99 and $\sqrt{0.5}$, respectively. The proposed RL control strategy is an online learning scheme; hence, no previous learning or training is needed. Figure 6 shows the learning pattern of the RL controller when a sinusoidal trajectory is set as the reference for the three joints. The evolution of the norm of the actor network weights, depicted in Figure 7, indicates that the weights of the actor network converge after some learning time to vary within a bounded range below 200. In the beginning, the responses of the three joints oscillate around the desired trajectory, which represents the RL controller's learning stage. However, the responses improve significantly shortly afterwards, reflecting the fact that the RL controller tuned itself successfully.
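In Python, the reported initialization might look as follows; treating each value of the [-1 : 0.1 : 1] grid as one hidden-unit centre is an assumption, since the paper states only the weight ranges:

```python
import numpy as np

alpha_c, alpha_a = 0.1, 100.0       # critic and actor adaptation gains
alpha, sigma = 0.99, np.sqrt(0.5)   # discount factor and RBF smoothing

# Input-hidden weights spanning [-1, 1] in steps of 0.1, mirroring the
# MATLAB-style initialization W_ai = W_ci = [-1 : 0.1 : 1].
grid = np.arange(-1.0, 1.0 + 1e-9, 0.1)

n_hidden, n_joints = grid.size, 3
W_c = np.zeros(n_hidden)              # critic output weights start at zero
W_a = np.zeros((n_joints, n_hidden))  # actor output weights start at zero
```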
In the upcoming subsections, the proposed RL controller is tested with trajectories in both the joint and Cartesian operational spaces, with the friction forces in the joints increased significantly. Moreover, the tracking performance of the proposed RL controller is compared against the performances obtained with four common benchmark controllers for manipulators: (i) CT, (ii) PD, (iii) ILC and (iv) RBFNN.

Benchmark controllers
The performance of the proposed RL-based controller is compared with the CT, PD, ILC and RBFNN control strategies.
The four benchmark controllers are usually used for manipulator position control and represent the major types of controllers presented in the literature. CT is an example of a model-based non-adaptive controller, while PD is an example of a model-free non-adaptive linear controller. On the other hand, the ILC and RBFNN controllers belong to the online adaptive model-free control strategies. In this subsection, the four controllers are introduced, tuned, and then tested on the 3-DOF manipulator discussed in Section 2. Their tracking performances are compared with the performance obtained with the proposed RL controller.

Computed torque control strategy
The CT controller is a special type of feedback linearization and requires full knowledge of the system's dynamics and parameters. Feedback linearization is used to cancel all the nonlinearities in the dynamics of the robot; hence, the overall closed-loop system acts like a fully linear system, and it guarantees the stability of the closed-loop system, as stated in Siciliano and Khatib (2016). As shown in Figure 8, it uses two loops: (i) a feedforward loop that cancels out the nonlinearities by use of the inverse dynamic model, and (ii) a feedback loop used for trajectory tracking with a proportional-velocity (PV) controller. The controller output is $U_{CT} = M(\theta)\,(\ddot{\theta}_d + K_{v\_CT}\,\dot{e} + K_{p\_CT}\,e) + V(\theta, \dot{\theta}) + G(\theta)$, where $K_{v\_CT}$ and $K_{p\_CT}$ are the velocity and proportional gains, respectively, and $\dot{\theta}_d$ and $\theta_d$ are the desired angular velocity and angle, respectively. It is worth mentioning that the design and tuning were carried out according to Siciliano and Khatib (2016). For the computed torque controller, the gains are selected as $K_{v\_CT} = 300$ and $K_{p\_CT} = 1000$.
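A sketch of this classical computed torque law, consistent with the description and gains above; `M_fn`, `V_fn` and `G_fn` are the model terms from Section 2:

```python
import numpy as np

def computed_torque(theta, theta_dot, theta_d, theta_d_dot, theta_d_ddot,
                    M_fn, V_fn, G_fn, Kp=1000.0, Kv=300.0):
    """Inverse-dynamics feedforward plus PV feedback: a sketch of the
    textbook computed torque law, not necessarily the paper's exact
    notation."""
    e = theta_d - theta
    e_dot = theta_d_dot - theta_dot
    a_cmd = theta_d_ddot + Kv * e_dot + Kp * e   # commanded acceleration
    return M_fn(theta) @ a_cmd + V_fn(theta, theta_dot) + G_fn(theta)
```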

Proportional-derivative control strategy
The PD controller is a widely used linear controller in industrial processes due to its simple design. Figure 9 shows the block diagram of the PD controller for the robotic arm. The control signal is $U_{PD} = K_{p\_PD}\,e(t) + K_{d\_PD}\,\dot{e}(t)$, where $K_{p\_PD}$ is the proportional gain, $K_{d\_PD}$ is the derivative gain, e(t) is the tracking error and $\dot{e}(t)$ is the derivative of the tracking error. As with the CT controller, the design and tuning were carried out according to Siciliano and Khatib (2016). For the PD controller, the gains are chosen as $K_{p\_PD} = 1000$ and $K_{d\_PD} = 100$.
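For completeness, the PD law with the gains above is a one-liner:

```python
def pd_control(e, e_dot, Kp=1000.0, Kd=100.0):
    # U_PD = Kp_PD * e(t) + Kd_PD * de(t)/dt.
    return Kp * e + Kd * e_dot
```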

Iterative learning control strategy
ILC is a model-free adaptive control strategy. Due to its simplicity and robustness, ILC is becoming popular in robotics applications. The ILC controller is based on the assumption that the tracking errors in a repetitive task stay unchanged in the absence of an explicit external correction. A feedforward ILC is added to a PD feedback tracking controller to compensate for this repetitive error, as depicted in Figure 10 and given in (47). This PD-type ILC controller is similar to the one adopted in Boudjedir and Boukhetala (2021):

$u_{j+1}(t) = H(q)\,[\,u_j(t) + K_{p\_ILC}\,e_j(t) + K_{d\_ILC}\,(e_j(t+1) - e_j(t))\,]$  (47)

where j denotes the j-th iteration, H(q) is a low-pass filter, e is the tracking error, and $K_{p\_ILC}$ and $K_{d\_ILC}$ denote the proportional and derivative gains, respectively. For the ILC controller, the gains are chosen as $K_{p\_ILC} = 1000$ and $K_{d\_ILC} = 100$.
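A sketch of this PD-type iteration update; the identity function stands in for the low-pass filter H(q), and the forward difference of the sampled error plays the role of $e_j(t+1) - e_j(t)$:

```python
import numpy as np

def ilc_feedforward(u_j, e_j, Kp=1000.0, Kd=100.0, lowpass=lambda s: s):
    """Iteration update j -> j+1 over one trial (a sketch of Eq. (47));
    u_j and e_j are arrays of the feedforward signal and tracking error
    sampled over the trial, and `lowpass` is a placeholder for H(q)."""
    de = np.append(np.diff(e_j), 0.0)  # e_j(t+1) - e_j(t), zero-padded
    return lowpass(u_j + Kp * e_j + Kd * de)
```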

Adaptive RBFNN control strategy
When utilized in control systems, RBFNNs have the benefits of superior process-learning capabilities and some degree of disturbance immunity. Of the benchmark controllers, the adaptive RBFNN is the most comparable to the suggested RL controller, because both learn online and are model-free. The RBFNN control law adopted in this study is similar to the one in Liu et al. (2021). The control law $U_{RBFNN}$ of the adaptive RBFNN is given in (48) and (49), and the weights are updated according to the error as given in (50), where x is the input vector, $w_{in}$ are the weights of the input layer, $k(x, w_{in})$ is the radial basis function in the hidden layer, $w_o$ are the output-layer weights, $\alpha_{RBF}$ is the learning rate of the RBFNN and $\varepsilon$ is the error vector. That is, the network modifies its parameters to adapt to variations in the plant. Figure 11 shows the block diagram of the RBFNN controller. The learning rate $\alpha_{RBF}$ of the RBFNN controller is selected to be 3.4, and the weights and parameters of the RBFNN are initialized in the same way as for the actor network.
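A sketch of one step of the adaptive RBFNN controller, with a Gaussian hidden layer and an error-driven output-weight update; the exact update form is an assumption based on the description of (48)-(50):

```python
import numpy as np

def rbfnn_control_step(x, w_in, w_o, sigma, eps, alpha_rbf=3.4):
    """Compute the control signal U_RBFNN and adapt the output weights
    w_o with the error vector eps (a sketch, not the paper's exact
    equations)."""
    k = np.exp(-np.sum((w_in - x) ** 2, axis=1) / (2.0 * sigma ** 2))
    u = w_o @ k                               # control law (Eqs. (48)-(49))
    w_o = w_o + alpha_rbf * np.outer(eps, k)  # weight update (Eq. (50))
    return u, w_o
```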

Joint space simulation results
In this section, the proposed RL controller is tested in joint space and its trajectory tracking performance is compared with that obtained with the benchmark controllers described in Subsection 4.1. All controllers were well tuned using the same trajectory presented in Figure 6. The controllers were then tested on a sinusoidal trajectory, shown in Figure 12, for the three joints, with the friction increased by 100% and 200% at 4 and 8 s, respectively. The increase in friction forces represents a possible increase in the joints' friction that may happen in a real industrial environment due to pollutants such as dust and other debris. As depicted in Figure 12, all controllers performed well when there was no increase in friction, from 0 to 4 s. However, when the friction forces increased by 100% at 4 s and 200% at 8 s, the performance of the CT controller started to degrade significantly compared to the other controllers, especially for joint 3. The significant failure of the CT controller in tracking the desired trajectory when friction increases is mainly due to the dependency of CT on an accurate model of the robot and its lack of any adaptation. Figure 13 shows the tracking error for the three joints of the robot with all controllers; the tracking error for CT is excluded since it was significantly higher than the others. It can be seen from Figure 13 that the smallest tracking error was obtained with the RL controller. The performances with RBFNN and ILC were better than the PD performance, which is expected since both RBFNN and ILC are online adaptive controllers. The worst tracking performances were achieved with the non-adaptive controllers, CT and PD. Since the RL controller is a learning-based technique, it learns the changes in friction and updates its policy to overcome these changes. On the other hand, both PD and CT are not capable of changing the control policy, and thus they both fail to sustain the performance of the arm. Learning in the RL controller is continuous, so it keeps learning on the fly and updating its parameters using the update laws indicated in (22) and (25). Since RL learns on two fronts, using the critic and actor NNs, it achieves better performance than ILC and RBFNN, as both of the latter methods rely on a single learning mechanism.

Task space simulation results
In this section, the simulation results for the end-effector position in Cartesian space are discussed, with the desired trajectory given in the operational task domain (X, Y and Z). Figure 14 shows the responses of the five controllers when the desired trajectory is rectangular and there is no variation in the friction forces. While the RL, CT, ILC and RBFNN controllers are all able to follow the required trajectory, the PD controller fails to follow the trajectory efficiently, as depicted in Figure 14. Table 2 lists the performances of the controllers in tracking the rectangular trajectory. As shown in Table 2, when there is no variation in the friction forces, the PD controller has the largest error compared to the other controllers. However, when the friction forces increase by 100%, the CT performance degrades drastically and becomes the worst among all controllers, as depicted in Figure 15. Compared to the CT, the performance of the PD did not worsen significantly, but it was still inaccurate. On the other hand, ILC, RBFNN and RL adapted to the changes in friction, in contrast to PD and CT. To further compare the ILC, RBFNN and RL controllers, Figure 16 shows only the tracking errors of PD, ILC, RBFNN and RL; the CT error is excluded since its poor performance is already evident. Figure 16 and Table 2 show the superiority of the RL performance compared with that of ILC and RBFNN. For the 200% friction increase case, Figure 17 shows the performance of all the controllers except the CT, because of its out-of-range response. It is clear that the PD did not handle the variation in friction well, and its performance worsened in comparison to the previous cases. The performances of ILC and RBFNN also worsened when the friction increased by 200%, and both oscillate around the desired reference, as depicted in Figure 18 and Table 2. The RL, on the other hand, adapted to the friction increase and had the lowest error of the five controllers. From Table 2, it was found that, in the 200% friction increase case, the RMSE with RL is less than the RMSE with PD, ILC and RBFNN by 95%, 70% and 69.1%, respectively. Moreover, the IAE with RL was 94%, 64.7% and 74.6% less than the IAE with PD, ILC and RBFNN, respectively. This shows the ability of the RL controller to cope with the high changes in the joints' friction forces that may occur in the system, hence maintaining the accuracy of the system.

Conclusions
This study presents an online model-free control approach to efficiently deal with variations in the friction forces that occur in a robot's joints. In most cases, friction variation in a robotic arm is unavoidable and usually occurs when the robot ages or operates in a harsh industrial environment. In such situations, model-based and linearized control techniques fail to obtain acceptable tracking performance, especially in the presence of high friction variation. Even most adaptive controllers, which are tuned offline and are incapable of re-adjusting their adaptive parameters online, lose their effectiveness when significant friction variation occurs. In this study, an online model-free RL control approach was proposed and tested on a 3-DOF robotic arm in the presence of high friction variation in the joints. The position tracking performance of the proposed controller was compared against four popular controllers using different desired trajectories in both the joint and Cartesian domains. Results showed that the RL controller has the best tracking performance of the five controllers, even in the presence of 100% and 200% increases in the joints' friction forces. In the presence of high friction variation, the performances in terms of RMSE and IAE degrade significantly with both the PD and CT controllers. The performance of the other adaptive controllers, ILC and RBFNN, worsened when the friction increased by 200% in the task domain, demonstrating the lower adaptability of these two controllers. This confirms the superiority of the RL controller, which was able to adapt to the new changes in friction and to follow the desired trajectory accurately with the smallest tracking error.

Figure 2. Adopted Stribeck friction model in the joints of the robotic arm.

Figure 3. Verification results for the positive definite condition for M(θ): (a) joints' trajectories; (b) positive definite condition result.

Figure 4. Model validation: (a) input voltages for each joint; (b) response of the robot's joints.

Figure 6. Learning pattern of the RL controller with a sinusoidal desired trajectory.

Figure 7. Evolution of the norm of the actor network weights with time.

Figure 8. Block diagram of the robotic arm with CT controller.

Figure 9. Block diagram of the robotic arm with PD controller.

Figure 10. Block diagram of the robotic arm with ILC controller.

Figure 11. Block diagram of the robotic arm with RBFNN controller.

Figure 12. Responses of the robot's three joints with RL, PD, CT, ILC and RBFNN controllers, with friction forces increased by 100% and 200% at 4 s and 8 s, respectively.

Figure 13. Trajectory error of the robot's three joints with RL, PD, ILC and RBFNN controllers, with friction forces increased by 100% and 200% at 4 s and 8 s, respectively.

Figure 14. Responses of the robot's end effector for the rectangle trajectory in Cartesian space with RL, PD, CT, ILC and RBFNN controllers and with 0% friction increment.

Figure 15. Responses of the robot's end effector for the rectangle trajectory in Cartesian space with RL, PD, CT, ILC and RBFNN controllers and with 100% friction increment.

Figure 16. Tracking error responses of the robot's end effector for the rectangle trajectory in Cartesian space with RL, PD, ILC and RBFNN controllers and with 100% friction increment.

Figure 17. Responses of the robot's end effector for the rectangle trajectory in Cartesian space with RL, PD, ILC and RBFNN controllers and with 200% friction increment.

Figure 18. Tracking error responses of the robot's end effector for the rectangle trajectory in Cartesian space with RL, PD, ILC and RBFNN controllers and with 200% friction increment.

Table 1. Simulation parameter values utilized in the 3-DOF robotic arm model.

Table 2. RMSE and IAE for PD, CT, ILC, RBFNN and RL controllers with the rectangular trajectory in Cartesian space and with different friction variations.