The robust shortest path problem for multimodal transportation considering timetable with interval data

ABSTRACT In the multimodal transport network, due to various uncertain factors such as weather and traffic conditions, the transport time will become uncertain accordingly, besides, railway transport and water transport are usually limited by timetable. These factors will inevitably affect the path selection. The purpose of this study is to seek an optimal transport scheme that considers both the uncertainty of the multimodal transport network and the timetable limit. In view of the uncertainties of the multimodal transport network, interval data are used to represent the uncertainty of network weights, and robust optimization method is then adopted to process the interval data. An optimal model of robust shortest path considering timetable limit is established and genetic algorithm (GA) is designed to solve the problem. The GA designed provides an encoding method for variable-length chromosomes applicable to shortest path problem solving in the multimodal transport. And the handling methods for loops and inaccessible paths due to chromosome crossover and mutation are also suggested. And finally, numerical examples are provided to verify the validity of the model and algorithm.


Introduction
Logistics costs have remained high for a long time, and choosing a reasonable transport path is of great significance to reduce logistics costs and improve transport efficiency. In the path selection of the multimodal transport, due to various uncertain factors such as weather and traffic conditions, the transport time is not certain, due to which, the transport scheme given under certain conditions would be unable to meet the requirements. In addition, there is timetable limit in railway and waterway transportation. These factors will inevitably affect the path selection. However, in the current researches on the multimodal transport path optimization, there is no research that considered both the uncertainty of multimodal transport network and timetable limit in different modes of transportation. The path selection in the multimodal transport is essentially a shortest path problem (SPP) in network optimization. The study of the SPP falls into two categories: one is the study on SPP in a deterministic network, and the other is the study on SPP in an uncertainty network. Due to the uncertainties of the transport network, the study of the SPP in an uncertainty network is much closer to the actual transport network, so the practical significance of this research will be a great insight into different study areas and researchers CONTACT Yong Peng pengyong@cqjtu.edu.cn as well. In most existing researches on the change of transport time caused by uncertainties, the uncertainties are always handled with stochastic programming and fuzzy programming, which are influenced by many factors that difficult to predict, such as traffic conditions, accidents, traffic jams or weather conditions in transportation. Usually, we are only able to achieve a time interval as shown in Figure 1. In this network, V = {1, 2, 3, 4, 5, 6, 7}, E = (1,2),(1,3),(2,3),(2,4),(2,5),(3,2),(3,4),(3,6),(4,5),(4,6),(5,6), (5,7),(6,7). Supposing t m 1,2 we have six samples: 10,14,15,20,14,17. The reference point t m 1,2 could be either the mean 15 or the mode 14 of the sample set. The support set could use the minimal and maximal values as bounds [10,20], depending on user's confidence level about the data set and preference of uncertainty. The interval data problem has received extensive attention from scholars (Montemanni, 2006; . The processing of interval data always relies on the robust optimization method. This subject was first proposed in 2001 by Karasan, Pinar, and Yaman (2001). Several exact techniques were introduced later by Montemanni, Gambardella, and Donati (2004), which extended the determination of a robust shortest path for general network. Subsequently, the robust shortest path problem (RSPP) has been considered by many authors in the context of interval data uncertainty (Escoffier, Monnot,& Spanjaard, 2008;Shahabi, Unnikrishnan, & Boyles, 2013). However, they only focused on the RSPP in mono-modal mode of transportation. As the multimodal transport develops, increasing attention begins to be paid to the characteristics of the multimodal transport, more and more researches have been focusing on the path optimization in the multimodal transport (Abbaspour & Samadzadegan, 2010;Chen, Hong, & Jia, 2015;Davies & Lingras, 2003;van der Weijde, Verhoef, & van den Berg, 2013;Xin, Feng, & Zhang, 2016;Zhang & Wang, 2015;Ziliaskopoulos & Wardell, 2000).
However, most of the existing researches handled uncertain factors with stochastic programming and fuzzy programming. That is, few of them dealt these uncertainties as the interval data. Though most of them have taken the transit time into consideration, they actually ignored timetable limit which surely exist in the multimodal transport, such as in waterway and railway transports. These would definitely affect the efficiency of the transport scheme. In Figure 2, for example, supposing that a batch of goods requires a transshipment time of t 0 to be transported to the destination port, however, due to the differences in transit time, the departure time of the goods would be differentiated, which are (t 0 + t 3 ), (t 0 + t 2 ) and (t 0 + t 1 ), respectively. As waterway and other modes of transport all have fixed departure time, the goods will not be transported to the next node immediately after they arrive. If the departure times of the goods are respectively T1,T2, T3 · · · , when the transit time is T3, the actual departure time will be T2 instead of (t 0 + t 3 ) When the transit time is (t 2 or t 1 ), the actual departure time will be T3 instead of (t 0 + t 2 ) or(t 0 + t 1 ).
Based on this, from the perspective of multimodal transport carriers, this paper investigates the practical problems existing in decision-making of the multimodal transport path selection. To handle uncertainties in the arc lengths, the study applies the robust optimization framework to solve the problem. Data uncertainty is structured by taking the arc lengths as intervals defined by known lower and upper bounds and no assumptions of any probability distribution are considered. The robust optimization method is used to establish a path selection model for the multimodal transport considering the interval data under uncertain circumstances. In light of the difficulty and uniqueness of the model, a heuristic algorithm is designed to study the path selection problem in uncertain network concerning the interval data.

Problem description
In today's logistics transport network, it always requires to pass through several nodes to carry goods to the destination. And among the nodes, there are various transport modes such as highway, railway and waterway to be chosen. As shown in Figure 3, when the goods are transported from node 1 to node 10, we can choose different transport modes. For the transport cost and time may differ from one transport mode to another, and the transport time is an uncertain interval number due to factors such as traffic accidents and weather changes, so different transport modes can be chosen in transporting goods. Therefore, the transshipment time and cost occur  in shipment transfer. In addition, since railway and waterway have a fixed dispatching time, the goods still need to wait at the node after shipments transfer for departure time.
It is required to figure out a transport plan with the least transport cost as the goods are transported to the destination in the allowed time range.

Notations
The following notations are used in this paper, see Table 1.

Model assumptions
(1) The goods can only be transshipped at the nodes.
(2) After the transshipment at the nodes, the goods is transported to the next node with the upcoming scheduled transport mode. (3) Only the delay due to transshipment is allowed in the process. (4) Sites and facilities at the nodes are all available for the transshipment between different transport modes.

Problem definition
Formally, we define the problem on a directed graph G = (V, E, M) where V is the set of nodes, Assume that nodes s ⊆ V and d ⊆ V denote the origin and destination, respectively, E is the set of arcs, Let e i,j represent the arc connecting node i to node j. M is the set of transport

V
The set of all the nodes in the multimodal transport networks, i, j ∈ V M The set of transport modes, m, n ∈ M E Index pairs (i, j) for all arcs in the network, with node i being the starting node. s The starting node that the goods departs d The  modes (i.e. highway transportation, railway transportation, waterway transport). An arc e i,j ∈ E which originates from i and terminates at j by taking mode m can be identified by a triplet (i, j, m), where m ∈ M and (i, j) ∈ V are two adjacent stops of mode m. There is cost and time weights at the arc, and the transport time is the interval data. A path P(e1, e2, . . . , ek) is required to be found in the network with the following properties: (1) P must start at s; (2) the departure time at s must be t o ; (3) P must end at d; (4) The minimum total transport time and the maximum total transport time required by P must be between T andT; (5) the total transport cost of P is preferred to be the minimum among all the other paths that have the properties (1)-(4).

Robust optimization formulation
Under uncertain circumstances, based on the limited information that has been learned, the following interval estimations are made for the transport time between In an uncertain network, it is of little probability for actual transshipment time to obtain the interval boundary value. If t m i,j = t m i,j , the path selected in this situation may result in a failure to arrive at the destination on time; if t m i,j = t m i,j + t m i,j , the solution obtained in this situation is too conservative. To regulate the robustness and optimality of the solution, the control parameter therefore is introduced. The disturbance range of transport time is controlled by the value of the control parameter. The robust optimization method is considered to be applied, that Through the value of , the disturbance of transport time in the interval is controlled. The value of it is reasonably given according to the characteristics of initial performance and impact strength of uncertainties. And meanwhile, the value of the control parameter could also reflect the attitude of the policy makers to avoid uncertainties. The larger the value is, the more conservative the attitude towards uncertainties.
Comprehensively considering the impact of the uncertainties of multimodal transport network and timetable limit on the multimodal transport path selection, this paper establishes a multimodal robust path optimization model that takes the total transport cost as the optimization goal, which requests the goods to be transported to the destination within the prescribed time. The model is shown as follows: In the model, the objective function refers to the minimization of the total cost, including the transport cost and transshipment cost. Equations (2)-(4)ensure that a continuous path starts from a specified starting node and ends at a specified destination. Equation (5) means that only one mode of transport can be chosen between two nodes. Equation (6) is the constraint for ensuring the consistence of the transport. Equation (7) is the relative robust control model. Equation (8) is the constraint of the time requirement, composed of transport time, transshipment time and waiting time. Equation (9) ensures the flow balance. Equations (10)-(11) mean that all the decision variables are binary. Equation (12) means that the goods can be transshipped for only once at a node. Equation (13) is a variable non-negative constraint.

Genetic algorithm design
The problem description and models previously demonstrate that this path optimization problem can be regarded as the SPP with the interval data. In other words, it is a problem of searching optimized path between cities under certain constraints. Aron and Hentenryck (2004), Lebedev (2004, 2005) proved that the problem is NP-hard. There are many algorithms for solving such optimization problems, such as accurate algorithm of dynamic programming method, branch and bound algorithm. However, these methods can only solve the problem of smaller scale effectively, and the accurate algorithm cannot resolve the contradiction between the time complexity and the quality of the solution. And it is difficult to determine the globally optimal solution within an acceptable time. And it also cannot use to solve our problem with timetable. The genetic algorithm (GA), however, has its own advantages in solving complex problems, which can reach the globally optimal solution or satisfactory solution in a relatively shorter time, and thus it has been widely applied (Rajarathinam, Gomm, Yu, & Abdelhadi, 2017). Therefore, a genetic algorithm is designed in this research to solve this problem.

Encoding scheme
Earlier, some researchers tried to adopt binary encoding to solve the SPP. This encoding method is characterized by its simple genetic operators operation and the 2 n spaces. For example, in the network shown in Figure 1, there are seven vertexes, which enjoy 2 7 = 128 encoding spaces. However, there are only 18 solutions in the  Figure 1, if the chromosome is {1111011}, it means that the vertexes 1, 2, 3, 4, 6 and 7 are on the path, but in this case, there are two paths 1-2-3-4-6-7 and 1-3-2-4-6-7. That is, the mapping between the encoding and the path may be of a one-to-many relationship.
Since the multimodal transport network is large in scale, the number of the shortest path cannot be known in advance. For the path may become longer, the length of the chromosome should not be fixed. In addition, since loops are not allowed, the number of nodes on the path will not be more than the number of nodes in the network. Therefore,to efficiently find out the shortest path, the algorithm must satisfy at least the following two requirements in coding: (1) use serial numbers of the codes in encoding and (2) use variable-length in encoding. Based on this, this paper designs a path encoding method of a variable length. For there are multiple modes of transport in multimodal transport network, the chromosome is divided into two segments to make it more convenient for encoding. The first segment represents the codes of nodes that the paths pass through and the second segment represents the transport modes between different nodes. The corresponding transport scheme of chromosome decoding is shown in Figure 5. In crossover and mutation, it is necessary to insert the transport modes into the gaps of the first segment. The numbers in the chromosome refer to the nodes on the path, while the letters represent the transport modes between two nodes.

The generation of the initial population
In the GA, the initial population may directly affect the convergence speed and results of the algorithm. This research, therefore, based on the characteristics of multimodal transport, proposes a heuristic method as follows to initialize population. This method is high in efficiency and helpful in avoiding loops. See the steps as follows: Step 1: make connection matrix according to the transport network. That is, as long as there is one mode of transport between two nodes, the nodes are connected. The weight of the matrix edge takes the minimum value of the weight of each connected transport mode.
Step 2: produce the first segment of the chromosome with Floyd algorithm and choose the transport modes between nodes to produce the second segment of the chromosome, and then, an initial chromosome will be prepared.
Step 3: as the number of populations has reached the required number, stop it; otherwise, randomly generate decimals from 1-2 and multiply the weights of the arcs of the path, and then update the multimodal network weights, return to step 2.

Evaluation function and constraints judgement
In the GA, the fitness function is used to indicate the superiority and inferiority of the individual, and the higher the fitness, the higher the probability that the individual passes its characteristics to the next generation. The objective function of the model established in this paper is to find the minimum value. However, in the fitness function, the fitness is required to be as large as possible. In this sense, the objective function cannot be directly used as fitness function F. This paper takes the reciprocal of the objective function as the fitness function, specifically: (14) In this model, there is a constraint that the total transport time should not be more than the transport time limit, but in the evolution of the algorithm, the transport time of some individuals may be more than the transport time limit, and the evolution efficiency of the algorithm would be probably lowered. Therefore, it is necessary to control and adjust the populations according to following steps: (a) Calculate the total transport time of the scheme represented by the individuals. (b) Determine whether the total transport time of the individuals is more than the transport time limit, if so, delete the individual from the population according to a certain probability.
(c) If there are q individuals are deleted in step (b), then copy q individuals randomly from the remaining population to keep the population size constant.

Selection
There are several selection methods, including random pairing, weighted random (roulette wheel) pairing and tournament selection. The roulette wheel pairing is used in this study. In this method, the probability assigned to a chromosome is inversely proportional to its cost.

Crossover
Crossover operation is an important feature of GA that distinguishes it from other evolutionary algorithms. It plays a key role in GA and is the main method for generating new individuals. Since the encoding in this paper is a path encoding with indefinite length, if a traditional single-point crossing or multi-point crossing strategy is adopted, it will possibly and inevitably result in illegal path P, which would be infeasible.
For there are two cases of chromosomes selected for cross-operation: first, two parent chromosomes share the same nodes (except the starting node and the final node); second, the two parent chromosomes do not have the same nodes. Therefore, an improved singlenode crossover method is designed to conduct crossover operation on the first segment of the chromosome. For the first case, we use the same nodes to conduct crossoperation. Assuming that P1 = (v 1 , v 2 , . . . , v k ) and P2 = (v 1 , v 2 , . . . , v k ) are the path nodes of the 2 chromosomes. Select a same node from the starting point as the intersection point in these two chromosomes, and switch the parts after the same node. If there are multiple identical nodes, then randomly select one sharing node as the intersection point. For example, as v i = v i , then the following two new chromosomes can be obtained. Obviously, these two new chromosomes are also a feasible path from the starting point 1 to the ending point k. Consider the network shown in Figure 1 as an example. This type of crossover operation is shown in Figure 6(a). Since the network in reality is often incomplete, for the second case, because it is possible to create illegal individuals in crossoperations, it is necessary to correct the illegal individuals. Taking the network illustrated in Figure 1 as an example, assume that node 1 is the starting node and node 7 is the destination node. Two valid paths 1-3-6-7 and 1-2-4-5-7 are selected as the parental chromosomes. As shown in Figure 6(b), a gene position {3, 2} is randomly selected for crossover operation. The child 2 generated after the crossover operation is an illegal path. There is no arc segment between node 2 and node 6. For this purpose, find a valid path from node 2 to node 6 according to the method described in 4.1 to make the child 2 chromosome a valid path. In addition, it should be noted that in  the chromosomes generated after crossover, there would be duplicated genes, suggesting that a loop appears in the path as shown in Figure 7, which then should be eliminated. The duplicate genes can be deleted. That is, delete the genes between the duplicate genes. For instance, if the chromosome generated after crossover operation is P = {1, 2, 4, 5, 4, 3, 6, 7}, after eliminating the duplicated gene, it will become P = {1, 2, 4, 3, 6, 7}. After completing the crossover on the first segment of the chromosome, for the gene parts remain unchanged, keep their transport modes on the second segment of the chromosome. As for the parts which have been charged, randomly select transport modes between nodes to generate their corresponding second segment parts.

Mutation
The chromosome would be easily trapped in local optimal solutions during selection and crossover operation, while mutation operators are able to endow GA with the abilities of local random searching, maintaining population diversity and avoiding premature convergence. Singlepoint mutation is then adopted in this research on the first segment of the chromosome. The specific operation is as follows: (1) Generate a random number r between 0 and 1, if r is less than the mutation probability p m , the gene position of the chromosome should be randomly selected with a certain probability v i .(2) Cut the path after v i and locally optimize the segment of the gene with the Floyd method. Starting from the gene v i , re-grow the path to obtain a new chromosome (path) as shown in Figure 8. If a loop appears after mutation, process it with the method described in the crossover operation. In addition, what calls for special attention is that there may be no accessible path from the mutated node to the destination. If so, find the previous node connected to the mutated node and adopt the Floyd method to generate a path from v i−1 to the destination. If there is still no path from v i−1 to the end, continue to find the previous node, and so on until find the path. Combine the path from the starting node to v i−1 with the path from v i−1 to the destination to replace variant chromosome. After the mutation operation on the first segment of the chromosome, for the gene parts remain unchanged, keep their transport modes on the second segment of the chromosome. As for the parts which have been charged, randomly select transport modes between nodes to generate their corresponding second segment parts. The following formula is used to adjust the mutation probability so as to enhance the search ability for the optimal solution of the population at the later stage of iteration.
In the formula, I represents the iteration times.

Elitism strategy and stopping conditions
Elitism strategy is also adapted to update the global optimal solutions. And the maximum iteration times is taken as the ending condition.

Population management strategy
In the algorithm, two kinds of population management strategies are adopted. One is a single-population strategy that satisfies a given population size N. That is, it is able to conduct selection, crossover and mutation operations on the whole population. It is called singlepopulation strategy genetic algorithm (SGA). According to whether genetic mutation probability adjustment is adopted, it again can be divided into SGA-1 with fixed mutation probability and SGA-2 with mutation probability and adjusted by iteration times as Equation (15). The other is a multi-population strategy that satisfies a given population size N. To be specific, N is randomly divided into n sub-populations. And operations of selection, crossover and mutation are carried out on each of the sub-populations. In each evolution, conduct the crossover operation on optimal solutions of each subpopulation to produce a new individual. And then, the fitness value is calculated. Select the optimal solution from the local optimal solutions, the new individual fitness values and the current global optimal solutions to update the global optimal solution. This method is known as multi-population strategy genetic algorithm (MGA). Similarly, according to whether genetic mutation probability adjustment is adopted, it can be divided into MGA-1 with fixed mutation probability and MGA-2 with variation probability and adjusted by iteration times.

Experiments
This study generated a directed lattice graph with nominal travel time and cost of each arc independent. The generated graph is a 5 × 5 lattice graph, as shown in Figure 9. Vertex 1 (upper left corner) is the departure point, and Vertex 25 (lower right corner) is the destination point. There are three transport modes, highway, railway and waterway available between two nodes. H, R and W, respectively, represent highway, railway and waterway. Because the RSPP for the multimodal transportation considering timetable with the interval data is a relatively new issue, there is no standard case library for testing. Thus, the study needs to construct test data by its own. The most generally used two methods for constructing data: one is random generation, and the other is modification based on existing instances. Due to the lack of authoritativeness and representativeness of the data that is generated randomly, the examples used in this research are modified based on the Solomon standard examples (Solomon, 1987). In Solomon's case, the data are divided into three types: C-type, R-type and RC-type. This research selects the first 25 nodes in Solomon's R101 to map the nodes in Figure 9. The Euclidean distance/90 (km/h) between nodes in Solomon's nodes indicate the average transport time by highway,  Euclidean distance/60 (km/h) represents the average transport time by railway and Euclidean distance/40 (km/h) demonstrates the average transport time by waterway. The disturbance time is equal to the average transport time. Supplementary data: the transport speed, transport cost and timetables for transport mode are shown in Table 2. The transshipment time and cost between different modes of transport are shown in Table 3. Supposing that we now need to transport a batch of goods from starting node 1 to the destination 25 at 8:00, and the goods is required to carry to the destination 25 in 2000 to 4000 min. So how can we arrange the transport so that the transport cost will be the lowest. Set the population size N = 100, the maximum iteration times Maxgen = 50, B represents the best fitness value, A refers to the average fitness value, C is the iteration times to find the best fitness value, t is the calculation time while T represents the total transport time. Then it was calculated with software on a computer with a CPU of Intel Core (I5), and a memory of 4.0 GB. When = 0.1, after constant adjustment of the crossover as well as mutation probability, use SGA-1, SGA-2, MGA-1 and MGA-2 to conduct the calculation. Table 4 shows that the minimum cost learned from the four calculating methods is 1282. The continuous varying of fitness calculated by each algorithm indicates that the algorithm has a strong ability in maintaining population diversities and avoiding inferior solutions, which is able to prevent the solutions from being trapped in local optimal solutions. In addition, the table also shows that the four calculating methods and objective function values will get stable in a short period, demonstrating that the algorithm is of satisfying efficiency. In MGA-2, with different crossover and mutation probabilities, the objective function becomes stable after less than 20 iterations. And the population then becomes homogeneous population. All the time taken is less than 6 s. Therefore, we say the MGA-2 is the most stable and efficient calculation method.
When the crossover probability is set to 0.8, the mutation probability is 0.1, = 0.1 and the iteration times is 50, adjust the size of the population and then conduct calculation with the four calculation methods. The results are shown in Table 5. From the table, it also can be seen that in the four calculation methods, the objective function values all could become stable within a short time, which further shows that the initial population generation method designed in this research is satisfying. Additionally, MGA-2 still uses less than 20 iterations to make the objective function values stable. The calculation time is still less than 9 s.
As population size N = 100, the max iteration times Maxgen = 50, crossover probability is set to be 0.8 and  mutation probability 0.1, is 0, 0.5, 0.75, 1. Adopt MGA-2 to conduct the calculation. The convergence of objective function values at different control levels then obtained is shown in Figure 10. The figure indicates that with the increase of the parameter with uncertain level, the total cost increases as well. In the robust optimization model, the uncertainty level parameter, to some extent, is able to measure decision-maker's conservativeness and risk aversion. Therefore, the decision-maker can choose appropriate according to their risk aversion level, and then determine the transport scheme in the uncertain environment. When the control parameter is 0, the transport cost of the path is the smallest, but it is, at the same time, the most unstable. When the control parameter is 1, that means the decision-maker intends to avoid risks, and the value of the total cost reaches the maximum. At the time, the robustness of the decision-making scheme is the strongest, while the optimality is the worst. When the control parameter is 0.5, the decision-maker holds mutual attitude towards risks, and the value of the total cost is moderate. In addition, as the robust control parameters increase, the required transportation costs increase accordingly. The transport schemes and required transport costs at different control levels are shown in Table 6. It shows that when the robust control parameter is 0, without considering uncertainties, the smallest value of objective function is 1166. And as the parameter goes up to 0.5, the smallest value obtained is 3519. Since the feasible region becomes smaller at the time, the minimum cost of the path suggested is obviously greater than that in the situation when the robust control parameter is 0. Similarly, when the robust control parameter is 1, the minimum cost of the path suggested is also more than the minimum cost in other situations, indicating that it is better in robustness. The fewer the paths satisfying the time limit, the more the cost required.

Conclusion
An RSPP for multimodal transportation considering time table with the interval data has been given in this paper. And then, based on the characteristics of the model, a GA has been designed to find the solution. The example shows that the model and algorithm have good robustness and are able to solve this path selection problem. Therefore, it is able to provide a scientific quantitative support to the path scheme selection in the multimodal transport under various uncertainties.

Funding
This research is supported by MOE (Ministry of Education in China) Project of Humanities and Social Sciences Youth Fund