Path planning for coal mine robot via improved ant colony optimization algorithm

This paper is concerned with the path planning of the coal mine robot. A new workspace model is presented to describe the complex coal mine environment. Thus, the cost of a path is composed of not only the distance of the path but also some hybrid costs that can be linked to the criteria of path optimization. To overcome the drawbacks of conventional ant colony optimization (ACO) algorithm, an improved ACO algorithm is developed to tackle the issues of path planning of coal mine robot based on the new workspace model. Some simulation experiments are carried out on the path planning of coal mine robot, and the validity and superiority of the new approach can be confirmed by the simulation results.


Introduction
Nowadays, the coal mine robot (CMR) is one of the emerging research areas. Path planning, which has been under intensive investigations in robotics (Luo et al., 2020), is of practical importance for CMR to carry out its tasks in the severe coal mine environment (Ge, 2019).
Up to now, the path planning of CMR is increasingly becoming a hot topic in the robotic field. For example, a modified genetic algorithm has been presented for the global path planning of coal mine rescue robot in Zhao et al. (2009), where new schemes are proposed for the initialization, crossover, and mutation. In Xu and Huang (2019), a new method based on genetic membrane algorithm has been put forward to plan robot path in the complex coal mine environment, and its validity is verified by several simulations and experiments. In Yao et al. (2014), a path planning method based on the artificial fish swarm algorithm (AFSA) has been proposed for the rescue robot, where the constraint conditions are optimized by the method of detecting the distances to the threat regions, and the performances of the algorithm are evaluated in the cases of different reference points and path lengths. A new model of the coal mine environment has been described in Xu (2014), and then a global path planning method is developed for the coal mine rescue robot based on the artificial bee colony (ABC) algorithm. In Ma and Mao (2018), a two-step path planning strategy has been proposed to achieve the CONTACT Lin Xu xulin724@163.com global/local path planning in gas distribution area on the basis of Dijkstra and ant colony optimization (ACO) algorithms. In Berglund et al. (2010), B-spline based smooth path planning and obstacle avoiding algorithm has been presented for the high-speed autonomous mining vehicles, so as to obtain the path that can ensure safety margin and minimum curvature variation. In Tao et al. (2019), a smooth path has been planned for the coal mine rescue robot based on an improved A * algorithm, where the redundant points on the path are deleted by using the D-P algorithm, and the extracted key points are fitted by the cubic spline function, so that the obtained smooth path has less turning points and shorter length. Nevertheless, to the best of our knowledge, the current studies pay more attentions to the robot path planning in a plane rather than the complex coal mine environment. Some scattered studies can be found on the path planning based on the terrain perception or non-planar environment (Lu et al., 2019;Tan et al., 2017), but these studies are usually subjected to some limitations such as the simple workspace model and the large computational burden. To combat such obstacles, a new workspace model is presented to describe the coal mine environment in this paper, and an improved ant colony optimization (ACO) algorithm is developed for the path planning of CMR based on the new workspace model. The main contributions of the current paper can be outlined as follows: (1) A new workspace model is presented to describe the coal mine environment, where the path cost is related to not only the distance of the path but also some other hybrid costs; (2) To handle some drawbacks of the conventional ACO algorithm, an improved ACO algorithm is developed to deal with the path planning of CMR based on the new workspace model; (3) Some simulation experiments are carried out on the path planning of CMR, and the validity and superiority of the new approach can be verified by several simulation results.
The rest of the paper is organized as follows. Section 2 is devoted to the modelling of the workspace, where the cost of the path is addressed in detail. Section 3 details the improved ACO algorithm for the path planning of CMR. In Section 4, some simulation experiments are implemented to verify the validity and superiority of the proposed approach. Finally, the conclusion is drawn in Section 5.

Model of workspace
In this paper, the workspace of CMR is modelled as a 2D plane with many weighted grids. Without loss of generality, we suppose the information of the workspace is known in advance, including the size of the workspace, the amount, size and position of the obstacles, etc.
Firstly, the 2D plane is divided into many squared grids, which can be numbered to describe their positions as shown in Figure 1. The size of the grid is determined according to the measurement of the mobile robot, and the grids are painted with white or black respectively in terms of whether they are feasible grids or partly occupied. Actually, the grid method has been widely used in the path planning of mobile robot because it is convenient to depict the obstacles in the workspace.
Secondly, a weight is set for each feasible grid of the workspace in accordance with the cost that the mobile robot has to consume to pass through the grid as shown in Figure 2. The larger weight of the grid means the  mobile robot will pay more costs (e.g. power and time) if this grid is on the path of the robot. The cost actually depends on several aspects affecting the movement of the mobile robot in the coal mine environment, e.g. height, slope, roughness, and humidity of the grid, which are also assumed to be known information of the grid. In order to clearly identify the grids with different weights, each feasible grid is painted with different colours, i.e. white, grey, blue, and red (see Figure 2 for more details).
In this paper, the positions of the grids can be represented by the methods of serial number and Cartesian coordinate. In the serial number method, the grids are numbered orderly from upper left to lower right as mentioned above. In the coordinate representation, the origin of the coordinates is set at the lower left of the workspace, and the positions of the grids are described by the coordinates of the grid centres. Accordingly, the two methods can be converted with each other by the following equations: where n denotes the grid number; N x and N y indicate respectively the amounts of the grids on X-and Ycoordinates; mod and ceil stand for the remainder operation and the rounding up operation respectively; and a denotes the side length of a grid. Based on the above workspace model, the cost of a robot path can be computed as follows: where C is the total cost composed of the length of the path L and other hybrid costs H, and w 1 and w 2 are respectively the weights of L and H. Moreover, the path length L can be calculated by where n 1 is the total grids on the robot path; n 2 is the amount of grids that are passed through diagonally. The hybrid cost H can be calculated by where h i indicates the weight of the ith grid as seen in Figure 2. For the sake of simplicity, the side length of grid is set as a = 1 in the following of this paper.
3. An improved ACO algorithm for path planning of mobile robot

Conventional ACO algorithm
The ant colony optimization (ACO) algorithm invented by Dorigo is an intelligent optimization algorithm inspired by the swarm behaviour of the ants (Dorigo et al., 2006), which can find the optimal path from the ant nest (start) to the position of food (destination).
In the food searching process, the ants will pick a suitable path at the intersection based on the transition rule related to the concentration of the pheromone that the ants release along the road. The pheromone will diffuse with time, thereby its concentration may decrease if there is not enough ants on the road. This means the concentration of pheromone is inversely related to the length of the path, because less ants would pass the longer paths that have less pheromone. Thus, the ants will more likely pick the path with more pheromones so as to find the shortest path. In conventional ACO algorithm, the transition rule is devised in accordance with the random proportion , whose probability can be calculated as follows: where p k ij (t) is the probability that the kth ant walks from the ith grid to the jth grid at the current iteration t; N k i denotes the set of grids that have not been visited by the kth ant at the current ith grid; τ ij denotes the concentration of pheromone between the ith and jth grids, which means the influence of the search history on the path choice of the ants, and α is its importance factor; η ij , whose importance factor is β, denotes the attraction of the jth grid to the ants, and can be calculated by where d ij indicates the distance between the ith and jth grids.
When an ant finds a path from start to destination successfully, the pheromone on the path will be updated using the following equations (Gao et al., 2020): where ρ denotes the pheromone diffusion factor; M denotes the amount of ants; and τ k ij indicates the pheromone released by the kth ant on the path between the ith and jth grids (denoted as l ij ) and calculated as: where L k denotes the successful path of the kth ant. (3) are not linked to the updating function of pheromone in conventional ACO algorithm. In the improved ACO algorithm, the cost composed of path length and other hybrid costs is applied to update the pheromone of a path, thereby the optimization criteria are linked to not only the length of a path but also other hybrid costs such as height, slope, roughness and humidity of the grids on the path.

Improved ACO algorithm
To balance the relation of exploration and exploitation operations, the transition rule is developed according to the pseudo-random proportion rule as follows: where q 0 is a parameter set in [0, 1]; J is a random variable with a probability distribution consisting with Equation (6). The heuristic information η iu is defined by where d uE is the distance between the next candidate grid and the destination. In some complex environments, the algorithm will stagnate if the ants encounter the special obstacles (Lee, 2017), see Figure 3 for an example. To deal with this problem, the early death strategy is usually utilized in literature, i.e. the life of the ant is ended in advance if it falls into a trap and stops the path finding. However, this method will reduce the effective ants of the algorithm, and thereby decreasing the efficiency of path finding. As such, the retreat-punishment strategy is presented in this paper to tackle the issues mentioned above. The ant will return to the previous grid if it falls into a trap surrounded by infeasible grids. Meanwhile, to avoid the repeated selection of the path, the punishment is executed for the pheromone of the path as follows: where δ denotes the intensity of the punishment in [0, 1]. A large δ implies the small probability that the path can be selected by other ants. In practical engineering, the smooth path will make the movement of the mobile robot more efficient. Thus, the corner of the path is considered for the pheromone updating in the grid environment. The corners of the path, including flat, obtuse, right and acute angles, are set the cost values 1, 2, 3, and 4 respectively in the light of their impacts on the smooth movement of the mobile robot. Therefore, the pheromone updating scheme is modified as follows: where C k is the path cost of the kth ant computed by Equation (3); φ k is the sum of the cost values of the path corners for the kth ant; and λ 1 and λ 2 are respectively the weights indicating the influences of the two aspects on the pheromone of the path.

Remark 3.2:
It should be pointed out that the costs of path corners are counted in the updating function of the path pheromone. That is the smoothness of the path is considered in the path planning of mobile robot. This scheme is another improvement for the conventional ACO algorithm, and will undoubtedly obtain a 'smoother' path with less corners, which is of great importance for the path tracking control of the mobile robot.
Additionally, pheromone diffusion factor ρ is set a large value at the beginning of the algorithm to hold the global exploration ability, and then it is decreased at the end of the algorithm to achieve a fast convergence. In this paper, the pheromone diffusion factor ρ is computed as follows: where γ denotes the regulator of the diffusion factor. To avoid a too small value of ρ that will affect the exploration of the ACO algorithm, a lower limit is given for the diffusion factor, which can be expressed by where ρ min is the lower limit of the diffusion factor. Besides, the diffusion factor will be increased to enhance the exploration ability of the ACO algorithm when the update of optimal path stops for a long time. To sum up, the improved ACO algorithm for the path planning of CMR can be demonstrated in Algorithm 1.

Simulation results
In this section, some simulation experiments are carried out to evaluate the validity and superiority of the developed path planning approach. The main computer configurations for the simulation experiments are as follows: Window 7 (64bit) and MATLAB R2014b for the software environment, Intel Core TM i7-6700 CPU @ 3.40GHz and 16.0GB RAM for the hardware environment. In the simulation experiments, some parameters are set as follows: M = 90 for the maximum ant amount, T = 200 for the maximum iteration, α = 1.4 for the pheromone importance factor, and β = 4.5 for the heuristic importance factor. Figures 4 and 5 show the path planning results using the conventional and improved ACO algorithms on the map of Figure 2, where the start and destination of the path are respectively set at the upper left and lower right grids. It can be found that the path produced by the improved ACO algorithm has less corners than its counterpart, because the corner costs have been linked to the criteria of the path optimization. Besides, the fitness values are demonstrated in Figurss 6 and 7. It is obvious that the fitness value of the improved ACO is smaller than that of conventional ACO, and the convergence speed of improved ACO is also more stable and faster.     The above-mentioned conclusions can be further verified by the path planning results on the more complex map shown in Figures 8 and 9. It is apparent that the path produced by the improved ACO is superior to that   of conventional ACO on not only the path length but also the corner cost. Additionally, the fluctuation of the fitness value of conventional ACO in Figure 10 is as severe as that of Figure 6. In contrast, the fitness value fluctuation of improved ACO is smaller and more stable in Figures 11  and 7.
Remark 4.1: Based on the above-mentioned observations, the paths generated by improved ACO consume less total costs than that of conventional ACO algorithm, which can be explained from two aspects. For one thing, the ants will select the next grids closer to the destination owing to the heuristic information composed of the distance between the current grid and the destination. For another, the global exploration ability is promoted by using the pseudo-random proportion rule, thereby extending the search space of ACO.

Conclusion
In this paper, an improved ACO algorithm is developed for the path planning of coal mine robot. A new workspace model is presented for the coal mine environment considering the hybrid costs denoted as the weights of the grids. Owing to the new pheromone updating scheme, the obtained path of the improved ACO consumes less costs than its conventional counterpart. Besides, the less corners of the path produced by the improved ACO will make the path tracking control of the mobile robot more convenient. Finally, these advantages of the new approach have been verified by several simulation experiments on path planning of the coal mine robot.
In future work, we will pay more attention to the path planning of coal mine robot based on other intelligent optimization algorithms (Zeng et al., 2016(Zeng et al., , 2020, and the application of the improved ACO to other problems Yin et al., 2020;Zeng et al., 2019Zeng et al., , 2021.

Disclosure statement
No potential conflict of interest was reported by the authors.