A two-stage stochastic programming model for collaborative asset protection routing problem enhanced with machine learning: a learning-based matheuristic algorithm

ABSTRACT In this paper, a two-stage stochastic mathematical model is developed for an asset protection routing problem under a wildfire. The main aim of this study is to reduce the negative impact of a wildfire. Some parameters, such as travel and service times, obtaining profit by protecting an asset, and upper bounds of time windows, are considered as stochastic parameters. Generating proper scenarios for uncertain parameters has a large impact on the accuracy of the obtained solutions. Therefore, artificial neural networks are employed to extract possible scenarios according to previous actual wildfire events. The problem cannot be solved by exact solvers for large instances, so two matheuristic algorithms are proposed in this study to solve the problem in a reasonable time. In the first algorithm, a set of feasible routes is generated based on a heuristic approach, then a route-based mathematical model is used to obtain the final solution. Also, another matheuristic algorithm based on adaptive large neighbourhood search (ALNS) is proposed. In this algorithm, routing decisions are determined using the ALNS algorithm while other decisions are achieved by solving an intermediate mathematical model. The numerical analysis confirms the efficiency of both proposed algorithms; however, the first algorithm performs more efficiently.


Introduction
Wildfire disasters pose a major threat to humans and wildlife. Also, wildfire disasters have a significant economic impact (Nuraiman, Ozlen, and Hearne 2020). In Australia, 173 were killed and 414 people were injured during the Black Saturday wildfire in 2009. Also, it forced the evacuation of about 7500 people, and the cost of the damage was almost $4.5 billion (Shahparvari, Abbasi, and Chhetri 2017). In the 2019/2020 fire season in Australia, almost 2000 homes were destroyed and at least 28 people were killed (Green 2020). Wildfire risk management not only mitigates the risk of wildfire occurrence but also reduces the costs of the damage caused by wildfires.
Wildfire risk management consists of four phases: prevention and mitigation; preparedness; response; and recovery. Prevention and mitigation include activities concerned with preventing wildfires and reducing their negative impact. The activities related to fuel treatment are carried out in the first phase in order to prevent wildfires and reduce their effects. In the second phase, activities focus on adequate preparation for controlling CONTACT Mahdi Bashiri mahdi.bashiri@coventry.ac.uk Centre for Business in Society, Coventry University, Coventry, UK wildfires. The response phase includes actions such as asset protection and the evacuation process required to reduce the costs of the damage caused by wildfires. The last phase includes actions to repair or rebuild infrastructure during and after wildfire incidents. The activities performed in the preparedness and response phases are more critical because they can affect the efficiency of the actions in other phases. In this study, the strategic and operational decisions for managing a wildfire incident are considered simultaneously. Decisions related to the selection and opening of protection centres, as well as assigning vehicles to them in the preparedness phase, are determined by considering their effects on asset protection in the response phase. Assets can be part of production systems. A Phillips semiconductor plant faced a fire event in 2000 that led to disruption in production, with a loss of $400 million to Ericsson (Chopra and Sodhi 2004). Many firms in global supply chains are facing to supply disruptions, and fire can be one of them. Some researchers, such as Wu, Blackhurst, and Chidambaram (2006) and Ho et al. (2015), have indicated that fire is the main risk factor in production-related activities. Therefore, the protection of production-related assets in fires caused by natural disasters could be a very important application of the current study, which needs to be considered in the production research literature.
The values of some factors, such as wind direction, wind speed, monthly rainfall, and monthly maximum temperature, are unknown when decisions related to the first and second phases of wildfire risk management are made. These factors affect the intensity of wildfires and their spread patterns. As a result, some important parameters in wildfire risk management, such as travel and service times, utility obtained by protecting assets, and upper bounds of time windows, might be stochastic in nature. The values of these parameters are unknown, but they can be defined by different scenarios corresponding to each combination of the mentioned factors. Considering uncertainty related to these parameters helps managers to make decisions more accurately. Also, considering the uncertainty of these parameters is critical because it has a significant effect on saving human lives. So, in this study, a two-stage stochastic programming model is proposed for the asset protection problem by considering travel and service times, utility obtained by protecting assets, and upper bounds of time windows as stochastic parameters. The establishment of protection centres and assigning vehicles to them are the first-stage decision variables, which are independent of scenarios. Collaboration between main and backup protection centres and routing decisions are determined in the second stage after revealing the uncertain parameters.
For generating the scenarios, the wildfire spread pattern should be determined. Wildfire spread patterns depend on the severity of wildfire and wind direction. In this study, two neural networks, a generalised regression neural network (GRNN) and a multilayer perceptron artificial neural network are utilised for predicting the intensity of the wildfire based on data obtained from previous wildfires. The efficiency of these networks for estimating the severity of wildfires is examined.
The main contributions of this study can be described as follows: We propose a two-stage stochastic programming model for the asset protection routing problem. Collaboration between protection centres is considered in this paper. In the proposed mathematical model, some decisions related to the second and third phases of wildfire risk management are determined simultaneously. The travel and service times, utility obtained by protecting assets, and upper bounds of time windows are considered as uncertain parameters and are defined by scenarios generated based on real data. A generalised regression neural network and a multilayer perceptron artificial neural network are utilised to predict the severity of the wildfire used for obtaining the wildfire spread pattern and generating the scenarios. A two-phase learning-based matheuristic algorithm is proposed for solving the mathematical model. In the first phase, a set of feasible routes is determined by a heuristic algorithm. Then a routebased model is solved to obtain the solutions. Also, a matheuristic algorithm based on adaptive large neighbourhood search is developed in this study. All notations, sets, parameters, and decision variables in this study are defined in Table 1 (Nomenclature).
This paper is organised as follows; studies related to optimisation problems under wildfire, asset protection problem, and machine learning are briefly reviewed in Section 2. The problem is defined and formulated in Section 3. The scenario generation method is introduced in Section 4, the solution algorithms are explained in Section 5. Section 6 is devoted to numerical examples. Finally, the conclusions and some managerial insight are provided in the last section.

Literature review
Many recent studies have focused to reduce wildfires' negative impacts. Wang et al. (2019) proposed a video fire detection that has good accuracy, especially in the indoor environment. Tian, Ren, and Zhou (2016) developed a multi-objective hybrid differential evolution particle swarm optimisation algorithm for scheduling rescue vehicles during forest fires. Wu et al. (2017) proposed a biobjective model for scheduling fire engines during frost fires and developed an exact method based on dynamic programming as well as a greedy heuristic to solve it. Rashidi et al. (2017) proposed a robust maximal covering location-based model to determine ignition points of the wildfire that cause maximum damage to the landscape. Asset protection and evacuation are the activities done in the response phase to save human lives and to reduce the number of damaged infrastructures. Shahparvari, Abbasi, and Chhetri (2017) proposed a vehicle routing problem for evacuation during wildfires. Also, they developed a heuristic algorithm to solve it. Zhou and Erdogan (2019) proposed a multi-objective two-stage stochastic programming model for allocating firefighters and evacuation humans and then a goal programming approach is applied to solve the problem.  developed a mathematical model to determine the scheduling and routing decisions for evacuating humans under wildfire. They proposed a greedy method to solve the problem.
There are also some previous studies related to the asset protection problem. A limited number of studies have been published on the asset protection problem related to wildfire risk management. However, as the asset  Lower bound of time window for the ith (i ∈ I) asset UB i (ξ ) Upper bound of time window for the ith (i ∈ I) asset under scenario ξ (ξ ∈ ) V i (ξ ) Obtained utility by protecting the ith (i ∈ I) asset under scenario ξ (ξ ∈ ) C ij Travel cost between nodes i (i ∈ N ) and j (j ∈ N ) TT ij (ξ ) Travel time between nodes i (i ∈ N ) and j (j ∈ N ) under scenario ξ (ξ ∈ ) ST i (ξ ) Servicing time of the ith (i ∈ M) node under scenario ξ (ξ ∈ )). (ST i (ξ ) = 0, ∀i ∈ {I + 1, I + 2, . . .
Probability of scenario ξ (ξ ∈ ) T An arbitrary large number, T = max i∈I,ξ ∈ (UB i (ξ )) π ak A binary parameter and is equal to 1 if the ath (a ∈ A) protection activity can be done by the equipment in the kth (k ∈ K) vehicle, 0 otherwise γ ia A binary parameter and is equal to 1 if the ith (i ∈ I) asset requires the ath (a ∈ A) protection activity, 0 otherwise θ k Required equipment, protection components and space for vehicle k (k ∈ K) in a protection centre The time that servicing starts at the ith (i ∈ M) asset by kth (k ∈ K) vehicle under scenario ξ (ξ ∈ ) Parameters in Section 4 IST i Initial expected servicing time to protect asset i C T (ξ ) Normalised value corresponding to month's maximum temperature under scenario ξ (ξ ∈ ) C RF (ξ ) Normalised value corresponding to month's rainfall under scenario ξ (ξ ∈ ) C WS (ξ ) Normalised value corresponding to wind speed under scenario ξ (ξ ∈ ) α 1 , α 2 and α 3 The coefficients of the month's maximum temperature, month's rainfall, and wind speed in calculation of service time, respectively NC(ξ ) The number of cells in the region affected by the wildfire under scenario ξ (ξ ∈ ) Ar(ξ ) The area of region affected by wildfire under scenario ξ (ξ ∈ ) AS Area of each cell IV i The initial obtained utility value by protecting the ith asset C FS (ξ ) The fire time coefficient under scenario ξ (ξ ∈ ) i (ξ ) A binary parameter that is equal to 1 when the cell that the ith asset is located on is affected by wildfire ITBC i The initial time bound for the ith asset ν 1 , ν 2 , and ν 3 The coefficients of the month's maximum temperature, month's rainfall and wind speed in determining the upper bounds of time windows

Itt ij
The initial travel time between nodes i and j C I (ξ ) Travel time coefficient depends on the severity of wildfire and is determined based on the intensity of wildfire obtained from neural networks CA da (ξ ) Set of assets that have a great chance to be selected for protection (critical assets) Sets in Section 5.1 R(ξ ) Set of feasible routes obtained in the first phase of the proposed two-phase matheuristic algorithm under scenario ξ (ξ ∈ ), R = {1, 2, . . . , R(ξ )} In ns Set of the assets in the visiting sequence of the nsth subset (VS ns ) that time window constraints violated for them I ns Set of the assets in the nsth subset route da (ξ ) The set of obtained routes that start their tour from the dth (d ∈ D M ) main protection centre for doing protection activity type a (a ∈ A) under scenario ξ (ξ ∈ ) Parameters in Section 5.1 β ir (ξ ) Binary parameter that is equal to 1 if the ith node exists in the rth route under scenario ξ (ξ ∈ ) L id (ξ ) Critical level of the ith (i ∈ I) asset for the main protection centre d (d ∈ D M ) under scenario ξ (ξ ∈ ) The minimum allowable critical level for the assets to be selected as critical assets in the iterations of the two-phase matheuristic algorithm except iteration 1 2 The minimum allowable critical level for the assets to be selected as critical assets in the first iteration of the two-phase matheuristic algorithm NM i (ξ ) Neighbouring measure for the ith (i ∈ I) asset in the modified nearest neighbour algorithm under scenario ξ (ξ ∈ (continued)

CV ipns
Cumulative value of violation in time window constraints when asset i is located in position p in visiting sequence related to the nsth subset CC r (ξ ) Travelling cost corresponding to the rth (r ∈ R(ξ )) route under scenario ξ (ξ ∈ ) Float r (ξ ) The minimum value of violation between the arrival time and the upper bounds of time windows for the assets in the rth (r ∈ R(ξ )) route under scenario ξ (ξ ∈ ) NR id (ξ ) The number of routes in the set R(ξ ) consisting of the ith asset λ id (ξ ) Random number generated from an interval The time that servicing starts at node i (i ∈ I) if it is added in route VS ns after the last node ρ A weight to determine the importance of time window constraints in calculation of neighbouring measure Decision variables in Section 5.1 u rk (ξ ) 1 if route r (r ∈ R(ξ )) is assigned to the kth (k ∈ K) vehicle under scenario ξ (ξ ∈ ), 0 otherwise Sets in Section 5.2 R ad Set of routes are considered to serve the assets that require protection activities type a (a ∈ A) in main protection centre d, Set of assets that have not been visited in the current solution under scenario ξ (ξ ∈ ) RL Restricted list of assets that their corresponding MV are greater than zero AL dr a (ξ ) Set of positions in route r that requires protection activity type a (a ∈ A) in main protection centre d Parameters in Section 5.2 VT dar (ξ ) Obtained profit if a vehicle is assigned to the r th route that requires protection activity type a (a ∈ A) in main protection centre Total travel cost if a vehicle is assigned to the r th route with requiring protection activity type a (a ∈ A) in main protection centre d (d ∈ D M ) under scenario ξ (ξ ∈ ) zz d A binary parameter is equal to 1 if at least one asset is assigned to the dth (d ∈ D M ) main protection centre in the current solution Float1 dar (ξ ) The minimum value of violation between the arrival time and upper bounds of time windows for the assets in the r th route corresponding to the dth (d ∈ D M ) main protection centre for protection activity type a under scenario ξ (ξ ∈ ) L ro d (ξ ) Number of nodes removed by the selected removal strategy from the route corresponding to the dth Number of nodes inserted by the selected insertion strategy to the route corresponding to the dth Neighbour solution 1

Random number Cim dr aj (ξ )
Total travel cost if selected asset (î) is inserted in the jth position in the route r that requires protection activity type a (a ∈ A) in main protection centre d (d ∈ D M ) under scenario ξ (ξ ∈ ) Vim dr aj (ξ ) Total obtained profit by protecting the assets in route r that require protection activity type a (a ∈ A) in main protection centre d (d ∈ D M ) under scenario ξ (ξ ∈ ) y best kd The values for variables q kd in the best solution obtained by the improved ALNS algorithm z best d The values for variables z d in the best solution obtained by the improved ALNS algorithm NN dr a (ξ ) The number of assets in the route r that requires protection activity type a (a ∈ A) in main protection Decision variables in Section 5.2 w dkr (ξ ) 1 if vehicle k (k ∈ K) is assigned to the route r corresponding to the dth (d ∈ D M ) main protection centre when the protection activity provided by that vehicle is compatible with the protection activity required by assets in route r , 0 otherwise ww dr aj (ξ ) 1 if assetî is assigned to the jth position in the route r that requires protection activity type a (a ∈ A) in main protection centre d (d ∈ D M ) under scenario ξ (ξ ∈ ), 0 otherwise.
protection problem can be formulated as a team orienteering problem with time windows (Nuraiman, Ozlen, and Hearne 2020), mentioned studies are reviewed in this section as well including the team orienteering problem with time windows and also the problem with uncertain parameters. Machine learning helps decision-makers to diagnose the complex relationship between data and make decisions more accurately. Machine learning can be classified into different classes such as supervised, unsupervised, semi-supervised, reinforcement, active, and deep learning (e.g. Ieracitano et al. 2020;Shi, Kang, et al. 2020;Cao et al. 2018;Principi et al. 2019). The studies related to the machine learning method can be categorised into three classes including optimisation (e.g. Wu et al. 2020;Priore et al. 2019;Liu, Pang, and Qi 2020;Pandey, Wang, and Boyles 2020;Kim, Jeong, and Shin 2020;Shi, Fan, et al. 2020), learning of an operator behaviour in a heuristic (e.g. Burke, Kendall, and Soubeiga 2003;Bai et al. 2012;Sabar and Kendall 2015;Qu et al. 2020;Choong, Wong, and Lim 2018;Zhao and Zhang 2020;Wang et al. 2020;Shao, Pi, and Shao 2018;Tinós 2020), and estimation of a parameter (e.g. Barteczko-Hibbert, Gillott, and Kendall 2009;Li and Bai 2016;Cui, Rajagopalan, and Ward 2020;Chen, Guo, and Zhao 2020;Yao and Cao 2020).
In this section, previous studies are investigated in the following three sub-sections: Studies related to the Asset Protection Problem, Studies related to the use of machine learning for optimisation, and finally, studies related to using machine learning for prediction in wildfire-related researches. van der Merwe et al. (2015) proposed a mathematical model for the asset protection problem by considering different types of vehicles and defining travel time as a function of vehicles' speed. They used the CPLEX solver to optimise the proposed model. Roozbeh, Ozlen, and Hearne (2018) developed an adaptive large neighbourhood search (ALNS) algorithm for asset protection problem during a wildfire. Nuraiman, Ozlen, and Hearne (2020) proposed a two-phase decompositionbased matheuristic algorithm for asset protection problem in a wildfire. In the first phase, the assets are classified into different stages and a vehicle routing problem is solved for each stage. Then the obtained routes in different stages are connected in the second phase. In the mentioned studies related to asset protection in a wildfire, it is assumed that the locations of active protection centres are known and only optimal routes for asset protection are determined. Also, all parameters such as travel and service times are considered as deterministic parameters. Bashiri et al. (2021) proposed a two-stage stochastic programming mathematical model for the asset protection problem. Their proposed model determines opening protection centres and routing decision variables, simultaneously. Also, in their mathematical model, one type of protection centre was considered. They used the Frank-Wolfe progressive hedging algorithm to solve the problem. In our study, a two-stage stochastic programming model is proposed for the asset production routing problem by considering two types of centres including main and backup protection centres. It is assumed that vehicles can be transferred from a backup protection centre to the main protection centre to deal with uncertainty. Also, we developed two matheuristic algorithms to solve the proposed model.

The asset protection problem
As mentioned before, the asset protection problem can be formulated as a team orienteering problem with time windows. Therefore, the team orienteering problem with time windows and orienteering problem with uncertain parameters are reviewed. The orienteering problem is an NP-hard problem (Tsiligirides 1984) and the team orienteering problem with time windows can be defined as an extension of that problem. Gunawan, Lau, and Vansteenwegen (2016) reviewed the studies related to variants of orienteering problem and the solution methods developed to solve them. Different approaches have been proposed for solving the team orienteering problem with time windows. Lin and Vincent (2017) proposed a multi-start simulated annealing algorithm for the team orienteering problem with time windows by considering mandatory and optional customers.  proposed a selective discrete particle swarm optimisation for solving the team orienteering problem with time windows and partial scores. Hu, Fathi, and Pardalos (2018) developed a multi-objective evolutionary algorithm based on the combination of decomposition and constraint programming approaches for a multiobjective team orienteering problem with time windows.  proposed a hybrid artificial bee colony algorithm for the team orienteering problem with time windows when the obtained score for each node depends on the time of visit. Verbeeck, Vansteenwegen, and Aghezzaf (2016) developed a metaheuristic algorithm for solving the time-dependent orienteering problem by considering travel time as a stochastic parameter. Amarouche et al. (2020) proposed the neighbourhood search algorithm for the team orienteering problem with time windows. They improved the efficiency of the neighbourhood search algorithm by considering the memory mechanism and splitting algorithm. Saeedvand, Aghdasi, and Baltes (2020) proposed a multi-objective team orienteering problem with time windows for robots in rescue operations. They developed a multi-objective evolutionary algorithm for solving the proposed model and also used Q-Learning to consider the dynamic nature of disasters and probable changes.
Based on the authors' knowledge, the uncertainty was considered in limited studies related to the orienteering problem. While considering uncertainty will cause more accurate decisions. The studies related to asset protection problem and orienteering problem are reviewed in Table 2 and compared with this study. In this study, travel and service times, utility value, and upper bounds of time windows are considered as stochastic parameters and a two-stage stochastic model is proposed.
As shown in Table 2, no previous study related to the asset protection and team orienteering problem with time windows considers collaboration between depots. Also, the heuristic and metaheuristic algorithms are applied in most studies related to the asset protection problem and the team orienteering problem with time windows. Recently, matheuristic algorithms have been used widely in studies related to vehicle routing problems e.g. Yu, Fang, et al. (2019). In some types of these algorithms, a heuristic or metaheuristic algorithm is developed to obtain feasible routes. Then a route-based version of the proposed model is solved to achieve a feasible solution. Bertazzi et al. (2019) proposed a matheuristic algorithm for a multi-depot Inventory routing problem. They developed a heuristic algorithm based on the clustering of nodes for generating feasible routes. Then a route-based formulation of the multi-depot Inventory routing problem is optimised to obtain a feasible solution. Hu and Lim (2014) proposed a matheuristic algorithm for the orienteering problem with time windows. The simulated annealing and local search algorithms were used to construct a set of feasible routes in the first phase. In the second phase, a route-based formulation of the team orienteering problem with time windows is solved to achieve a high-quality solution. Mancini (2017) developed a two-phase matheuristic algorithm for a vehicle routing problem with time-dependent travel time. In the first phase, feasible routes are constructed by a multistart random constructive heuristic. A set partitioning formulation of the proposed model is solved in the second phase to obtain a near-optimal solution. Grenouilleau et al. (2019) developed a matheuristic algorithm for home healthcare routing and scheduling problem. In the proposed algorithm, a set partitioning-based formulation is solved based on the feasible routes obtained by a large neighbourhood search heuristic. In this paper, a two-phase matheuristic algorithm is proposed for the asset protection problem. In the first phase, a set of feasible routes are generated based on a developed heuristic algorithm. Then a set partitioning formulation of the asset protection problem is solved to obtain a nearoptimal solution. Another matheuristic algorithm based on the ALNS algorithm is developed in this study, as well.

Using machine learning to solve combinatorial problems
Using machine learning for solving combinatorial problems has attracted much interest in recent years. Bengio, Lodi, and Prouvost (2020) reviewed the studies that used machine learning algorithm for solving combinatorial optimisation. Talbi (2016) proposed a combination of a metaheuristic algorithm and machine learning technique to solve optimisation problems. Abbasi et al. (2020) proposed an approach to solve large-sized instances of stochastic optimisation problems by using a machine learning method. They used classification and regression tree, k-nearest neighbours, random forest, and multilayer perceptron artificial neural network models to approximate the relationship between the parameters of the problem and the optimal values of decision variables. Václavík et al. (2018) used an online machine learning technique to approximate a tight bound for the pricing problem in the branch and price algorithm and, as a result, the computational time of the algorithm was reduced. Bayliss et al. (2020) proposed a learning-based heuristic algorithm for solving team orienteering problem. They assumed that travel time between two nodes depends on a drone's flight path between previous nodes. Therefore, a machine learning-based heuristic approach is developed in this paper to predict travel times. Different types of vehicle routing problems are combinatorial problems and a machine learning algorithm is applied for solving them. A combination of reinforcement learning and metaheuristic or heuristic algorithm is used widely for solving different types of vehicle routing problems. In some studies reinforcement learning is applied to manage neighbourhood search in local search algorithm proposed for solving VRPs. Peng et al. (2020) proposed a learning-based memetic algorithm by applying reinforcement learning for pickup and delivery vehicle routing problem. Chen, Qu, et al. (2020)

Using machine learning for prediction under wildfires
As mentioned before, machine learning can also be applied in the studies to better estimation of parameters in the models and as a result to reduce the uncertainty of decisions and make more accurate decisions.  proposed a multi-objective model for scheduling a single bus corridor. They used machine learning to obtain a pattern based on the historical data to mitigate uncertainty. Silva et al. (2020) used machine learning and fuzzy logic to obtain wildfire risk and danger indices according to available data that are used in the dynamic wildfire warning system. Zema et al. (2020) used an artificial neural network to predict the hydrological treatments as well as erosion in burned and non-burned soils after forest wildfires. Watson et al. (2019) examined 10 machine learning models including elastic net regression, generalised additive models (GAM), gradient boosting, k-nearest neighbour regression, lasso regression, linear models, multivariate adaptive regression splines (MARS), neural network, random forest, and support vector machines with a radial basis kernel (SVM) to predict ozone exposure during wildfires. Zhai et al. (2020) used a neural network to predict wildfire spread. Sayad, Mousannif, and Al Moatassime (2019) used artificial neural networks and support vector machine to predict wildfires. In this paper, machine learning is utilised to estimate the fire spread pattern in the wildfire based on historical data. Artificial neural networks are used to determine fire spread pattern approximately. Artificial neural networks are the methods used in supervised learning and applied invariant fields of study. Fire spread pattern can affect other parameters such as utility value and upper bound of time windows. Therefore, an accurate estimation can mitigate the uncertainty of the obtained decisions.
According to the above-mentioned literature review, it is observed that there are no previous studies to consider the cooperation between protection centres for the asset protection problem with efficient solution method development. The main contributions of the current study can be summarised as follows: Developing a two-stage stochastic mathematical model for the asset protection problem with collaboration between protection centres, using the ANN to generate more precise possible scenarios, developing two matheuristic algorithms with a learning mechanism in the first algorithm.

Problem description
In this paper, a stochastic asset protection routing problem is proposed for determining the locations of fire protection centres and protection of assets in a wildfire by considering uncertainty. Two types of fire protection centres, main and backup protection centres, are considered, and different types of vehicles are assigned to these centres. The vehicles depart from the main protection centres, and backup protection centres are used to transfer more vehicles to the main centres to deal with uncertainty. The locations of the fire protection centres are determined according to the utility obtained by protecting assets and routing costs. Also, the assets are selected for protection based on a trade-off between their obtained utility and corresponding routing cost/time. Therefore, in some cases, assets that are far from the main protection centres might be selected to be protected because of their higher obtained utility.
The wildfire spread pattern may affect some of the model parameters, such as utility obtained by protecting assets and upper bounds of time windows. Therefore, estimating wildfire intensity is necessary to determine the values of uncertain parameters such as the utility obtained by protecting assets and upper bounds of time windows under different scenarios. An artificial neural network is applied to estimate the intensity of the wildfire based on historical data. Also, some additional factors, such as temperature, wind direction, wind speed, etc. may affect the model parameters, so considering these factors makes the decisions more accurate. However, the deterministic values of these factors are unknown in the planning stage, but their probability distribution can be estimated according to historical data. The problem is illustrated in Figure 1. The assumptions of the proposed model are as follows: • Two types of fire protection centres are considered, main and backup centres. Vehicles are transferred from backup protection centres to main centres to deal with uncertainty. • A limited number of vehicles can be assigned to each protection centre according to its capacity. • Different types of vehicles, such as tankers, pumpers, and aerial vehicles, are considered. • Different types of protection activities, such as cleaning debris and hosing down structures, are considered. • The travel and service times, upper bounds of time windows, and utility obtained by protecting assets are considered as uncertain parameters and defined by the scenarios.

Problem formulation
In this section, a two-stage stochastic programming model for an asset protection routing problem in a wildfire is developed, with the possibility of transferring vehicles from backup protection centres to main centres. A stochastic asset protection routing problem is defined on a direct network, G = (N , E), where N is the set of nodes partitioned into a set of assets I and a set of protection centres D and E is the arc set E = {(i, j)|i, j ∈ N }. The sets, parameters, and decision variables of the proposed model are shown in Table 1.
Here, a two-stage stochastic programming model is proposed for an asset protection problem by considering uncertain parameters. Determining the locations of the protection centres and assignment of vehicles to the centres are strategic decision variables that are considered as the first-stage variables. Also, assignment and routing decisions are the second-stage variables. The proposed two-stage stochastic programming model, called SAPP-PC, is presented below.
The objective function (1) maximises the total achieved profit through the obtained value of protecting assets minus establishment and transportation costs. The third term of the objective function contains the cost corresponding to vehicles transferring from backup to main protection centres. Constraints (2) guarantee that each vehicle can be assigned to at most one protection centre. Constraints (3) ensure that vehicles can be assigned to the established protection centre. Constraints (4) guarantee that at least one vehicle should be assigned to each protection centre. Constraints (5) state that the kth vehicle can start its route from the main protection centre d and visit nodes if this vehicle is assigned to the main protection centre d or transferred from a backup protection centre to this protection centre. Constraints (6) ensure that the kth vehicle can be transferred from the d th backup protection centre to main protection centre d if this vehicle is assigned to the d th backup protection centre and the dth main protection centre is established. Constraints (7) ensure that the kth vehicle can protect the ith asset if this vehicle leaves its protection centre. Constraints (8) ensure that each asset can be visited by at most one vehicle that has the required equipment to protect that asset. Constraints (9) state that if the ith node is assigned to the kth vehicle departs from the main protection centre d, that node should be visited in the route of vehicle k. Constraints (10) are flow conservation constraints. Constraints (11) state that each protection centre has a capacity, therefore, a limited number of vehicles can be assigned to it. Constraints (12) calculate the arrival time at the main protection centres for vehicles transferred from backup protection centres. Constraints (13) are sub-tour elimination constraints. Constraints (14) are time windows constraints. Constraints (15) to (18) define the domain of the decision variables.

Scenario generation method
In this study, scenarios for stochastic parameters are generated based on real data in the south region of Hobart, Tasmania (Australia) based on the procedure proposed by Bashiri et al. (2021). In this study, the mentioned procedure has been improved by using an artificial neural network (ANN). In the above-mentioned procedure, a predefined value is considered to recognise the affected region during a wildfire. While in this study, we assume that the area of the region affected by wildfire depends on some factors such as month's rainfall, month's maximum temperature, etc. Therefore, an artificial neural network is applied to estimate the area of the affected region during the wildfire based on historical data, more accurately. Different values of some factors such as wind direction, wind speed, month's rainfall, and month's maximum temperature can affect the parameters of the proposed model. Different levels of factors considered in this paper for generating the scenarios are shown in Table 3. These factors have different dimensions so that mentioned levels should be normalised to remove dimension and balance magnitude differences when they are considered to generate scenarios. Therefore, the normalised values corresponding to each level of wind speed, month's rainfall, and month's maximum temperature factors are reported in Table 3. Different levels of wind direction, month's rainfall, and month's max temperature are determined based on the data obtained from bom.gov.au. Also, the levels of wind speed factors are determined based on historical data about wind speed in previous wildfires in Canada, the United States, and Australia. Parameters under each scenario are calculated based on the normalised values of mentioned factors. The notations used in this section are as mentioned in Table 1.
The service time of each asset under a scenario is obtained by Equation (19).
The obtained utility by protecting the asset and upper bounds of time windows depends on the fire spread pattern and start point of wildfire as well as the mentioned factors. The fire spread pattern depends on the intensity of the wildfire and wind direction. The intensity of wildfire under each scenario is unknown but some factors that affect it such as the month's maximum temperature, month's rainfall, and wind speed are known. Therefore, the intensity of wildfire (area of the region affected by wildfire) and as a result, the region that has been affected by wildfire should be predicted. As mentioned before, artificial neural networks (ANN) are applied to predict the area of the region that can be affected by a wildfire. More details about the artificial neural network can be found in Section 4.1. Artificial neural networks forecast the intensity of wildfire by considering 6 features as inputs including continent, country or state, and sub-region, month's maximum temperature, month's rainfall, and wind speed. Data obtained from the wildfires occurred in Canada, the United States, and Australia. It is assumed that the region is partitioned into M sub-regions. These sub-regions are called 'cells' and their area are equal. The selected region in Hobart is divided into 36 cells in Figure 2. The number of cells in the region affected by the wildfire under scenario ξ (NC(ξ )) is determined based on the following equation.
The fire spread pattern depends on the start point of wildfire (the cell that wildfire starts from it) and the wind direction as well as the number of cells in the region affected by wildfire. Therefore, the start point of wildfire must be determined under each scenario. The wildfire spread patterns when wind directions are west (W), northwest (NW), and southwest (SW) are shown in Figure 3. It is assumed that the wildfire starts from cell 13 and the number of affected cells by wildfire is equal to 11. Based on obtained wildfire spread pattern, scenarios for utility value and upper bound of time window are calculated by Equations (21) and (22), respectively.
The fire time coefficient depends on escape fire time. This coefficient is equal to 3, 2, and 1 for working hours in non-school holidays, working hours in school holidays, and non-working hours, respectively. The travel time between the nodes under each scenario is determined by the following equation.

Artificial neural network
An artificial neural network (ANN) is inspired by the neurons in the human brain. Artificial neural networks can be categorised into two main classes including multilayer feedforward networks and recurrent networks based on the network architectures (Zhang 2018). Two different neural networks including general regression neural network algorithm (GRNN) and multilayer perceptron artificial neural network (MLPNN) are used for predicting the wildfire intensity. The efficiency of these types of neural networks for predicting the intensity of wildfire is based on data obtained from the previous wildfire in Canada, the United States, and Australia. The general regression neural network was proposed by Specht (1991). This algorithm can be applied for regression, prediction, and classification. The GRNN is a one-pass learning algorithm and as a result, backpropagation is not required. Also, the GRNN requires less data for convergence. This algorithm has been used widely in studies in the forecasting fields (e.g. Liang, Niu, and Hong 2019;Niu, Liang, and Hong 2017;Haidar et al. 2011).
The multilayer perceptron artificial neural network (MLPNN) is one of the well-known and useful types of feedforward neural networks. The MLPNN consists of three main layers including input, hidden, and output. One or more hidden layers can be considered in this algorithm. The MLPNN used backpropagation for training and can estimate the non-linear relationship between inputs and outputs accurately (Bishop et al. 1995).
The parameters of the neural networks are reported in Table 4. The number of input and output layers are 6 and 1, respectively. The input features include continent, country or state and sub-region, month's maximum temperature, month's rainfall, and wind speed. Data obtained from the wildfires occurred in Canada, the United States, and Australia. Also, the month's maximum temperature, month's rainfall, and wind speed in the location and time that wildfire occurred are obtained from the weather spark website (https://weatherspark.com).
The MATLAB source code was used to build neural network models. For tuning the parameter of MLPNN including the number of hidden layers and the proportion of data used for training, this algorithm runs 50  times under different combinations of these parameters. The average and standard deviation of mean square error (MSE) for each combination are reported in Table 5. Also, the average computational times of learning MLPNN are reported in this table. As reported in this table, the average and standard deviation of MSE when the number of hidden layers and proportion of data used for training are 10 and 0.7, respectively are less than other configurations. Therefore, 10 hidden layers are selected and 0.7 of the data are used for training the MLPNN.

Two-phase matheuristic algorithm
To solve the SAPP-PC efficiently, a two-phase matheuristic algorithm is proposed and the pseudo-code of the developed algorithm is described in Algorithm 1. Also, the main steps of the algorithm are illustrated in Figure 4. In the first phase, the feasible routes are constructed based on the proposed heuristic algorithm. Then, a routebased formulation of SAPP-PC called RBSAPP-PC is constructed according to the obtained routes from the previous phase to be solved and achieve a near-optimal solution for SAPPPC. The sets, parameters, and decision variables in the matheuristic algorithm have been defined in Table 1.

Constructing feasible routes
Generating a restricted number of feasible routes for SAPP-PC is the aim of this phase. The feasible routes are constructed according to an iterative algorithm. In each iteration, a set of assets that have a greater chance to be selected for protection (critical asset) is determined at first. For this purpose, a random number (λ id ) is generated from an interval {min j∈N {C ij }, 2 × C i,(I+d) } for each asset under each scenario. If the difference between λ id and obtained utility by protecting the ith asset under scenario ξ is less than a predefined value ( 2 ), it is selected as a critical asset. While in other iterations, the critical assets are determined based on a learning approach. In these iterations, a measure (L id (ξ )) is calculated for each combination of nodes under each scenario to determine the critical level. The critical level for each asset is determined based on the number of obtained feasible routes which include mentioned asset in previous iterations. On the other hand, the critical level of the assets is achieved based on the learning from the patterns of assets' selection in the previous iterations. The critical level is calculated by the following equation: A diversification mechanism is applied to avoid trapping in a local optimum solution. In this mechanism, the critical levels are 1 for the assets that did not exist in the set of critical assets in the previous iteration. Therefore, these assets are selected as critical assets in the current iteration. This diversification mechanism is done in all iterations except the first iteration. Then the set of critical assets with the same required activities for protection is defined for each main protection centre under each scenario. For generating the feasible routes, all subsets corresponding to set CA da (ξ ) are generated, and then the travelling salesman problem with time windows (TSPTW) should be solved for each subset to achieve the visiting sequence. To solve the TSPTW efficiently a heuristic algorithm based on the nearest neighbour algorithm (NN) (Kizilateş and Nuriyeva 2013) is proposed. The pseudo-code of the developed heuristic algorithm is presented in Algorithm 2. In the nearest neighbour algorithm, the nearest node (having a minimum value of travelling cost) is selected from the set of the unvisited nodes in each iteration. However, in the modified nearest neighbour algorithm (MNN) the nearest node not only is determined based on travelling cost but also the time window constraints are considered to select the nearest node. For this reason, a neighbouring measure (NM i (ξ )) is proposed and calculated for each asset under each scenario based on the following equation.
where i − is the last node added to the route corresponding to the nsth subset (VS ns ). Considering the term UB i (ξ ) − tr i in the calculation of the neighbouring measure guarantees that the assets have more priority to select as a nearest node when the difference between Algorithm 1: The framework of the proposed two-phase matheuristic algorithm.
Calculate critical level (L id (ξ )) by using Equation (24) ; iter = iter + 1; A route-based formulation of SAPP-PC model (RBSAPP-PC) according to the obtained routes R(ξ ) is solved to obtain the solution; their arrival times and upper bounds of time windows are small for them. In some cases, the obtained routes from the MNN algorithm are not feasible, therefore, an algorithm is developed to repair the infeasible routes. The pseudo-code of the proposed algorithm is illustrated in Algorithm 3.
In Algorithm 3, the cumulative value of violation in time window constraints is calculated for each asset in different positions of VS ns . Let CV ipns is the cumulative value of violation if the ith asset is located in the pth position of the route VS ns when visiting a sequence of other assets in this route is constant. To calculate CV ipns , the asset i is removed from its current position in the route VS ns and is located in the pth position. Let, redVS ns is constructed by omitting the ith asset from the visiting sequence VS ns . Then the ith asset is inserted in the pth position of redVS ns , therefore, the positions of the nodes located in the positions {1, 2, . . . , p − 1} are not changed. However, the nodes in the positions {p, p + 1, . . . , |VS ns | − 1} are located in new positions as {p + 1, p + 2, . . . , |VS ns |}, respectively. It can be concluded that the obtained route by locating the ith asset in the pth position of the route VS ns is feasible if CV ipns is equal to zero. Two situations can occur based on the obtained value of CV ipns for the assets. In the first condition, there is at least one position for the assets the cumulative value of violation in time window constraints is equal to zero for that position. In this condition, the asset and position corresponding to the cumulative value of violation are selected and the selected asset is located in the new position (selected position). In the second condition, there are not any positions for the assets, therefore, this route can not be considered as a feasible route. Therefore, finding a feasible visiting sequence for all nodes in the route VS ns or removing all nodes from the set I ns are the stopping condition of this algorithm.

Optimisation of route-based SAPP-PC
In this phase, a route-based model for the SAPP-PC is proposed according to the feasible routes obtained from the previous phase. A feasible solution for the SAPP-PC is achieved by solving the RBSAPP-PC model.
Calculating the neighbouring measure for each asset in the set UN based on the Equation (25) (2)-(4), (6), (11) r∈R(ξ ) k∈K a∈A (π ak γ ia ) β ir (ξ ) u rk (ξ ) ≤ 1, The objective function (26) maximises the profit through obtained values by protecting the assets minus opening and transportation costs. Constraints (27) ensure that the rth route can be assigned to the established main protection centres. Constraints (28) states that the rth route can be assigned to the kth vehicle under scenario ξ if this vehicle exists in the main protection centre corresponding to the rth route. Constraints (29) guarantee that the rth route can be assigned to the vehicles transported from backup to main protection centres if the travel time between these protection centres does not make the route infeasible. In this constraint, Float r (ξ ) is the minimum value of violation between the arrival times and the upper bounds of time windows for the assets in the rth route under scenario ξ and is calculated by (Float r (ξ ) = min i∈I|β ir (ξ )=1 {UB i (ξ )-arrival time of asset i in route r}). Constraints (30) state that each asset should be visited by at most one vehicle that has suitable equipment to do the required protection by this asset. Constraints (31) to (33) define the domain of the decision variables.

The improved adaptive large neighbourhood search
Because of the large number of decision variables in the problem, the classic ALNS (Hemmelmayr, Cordeau, and Crainic 2012) algorithm does not have proper efficiency, so we improved it by enhancing the algorithm by adding mathematical models as additional steps in the algorithm. The pseudo-code of the improved version of the ALNS is illustrated in Algorithm 4. Also, the proposed removal and insertion strategies are explained in this section. Moreover, the improvement phase of the proposed algorithm is illustrated in Algorithm 5. Extra sets, parameters, and decision variables for this algorithm have been defined in Table 1. As shown in Algorithm 4, first L ro d (ξ ) nodes are removed from the routes related to the dth main protection centre under scenario ξ by using one of the removal strategies. Then L io d (ξ ) assets are inserted to the routes related to the dth main protection centre under scenario ξ according to the selected insertion strategy. The roulette wheel procedure is used for selecting the removal and insertion strategies on the current solution (SS) according to their achieved scores in the previous steps. The feasibility of obtained routes is checked and infeasible routes are repaired. If there is not any feasible location for an asset in the route, it is removed from the route. Other

Solution representation for the improved ALNS algorithm
The solution representation used in this study for the improved ALNS algorithm is defined based on the location routing problem solution and consists of assets ({1, 2, . . . , I}), main protection centres ({I + 1, I + 2, . . . , I + d M }), and a dummy node (I + D + 1). Each row in the solution representation illustrates the routes of the corresponding scenario. For example, the first row shows the routes related to the first scenario. Assets after each main protection centre are assigned to that main protection centre. If an asset is not assigned to the main protection centre under different scenarios, that main protection centre is not established. Assets that have not been visited in the current solution under each scenario are added after the dummy node in the solution. An example with 20 assets, 3 main, and 2 backup protection centres is shown in Figure 5. In this example, the assets are assigned to the main protection centres 2 and 3 (corresponding nodes are 22 and 23, respectively), therefore main depot 1 (the corresponding node is 21) is not established. The assets 10, 12, 18, 9, and 11 are assigned to main protection centre 2, and asset 16 is allocated to main protection centre 3 under scenario 5. Node 26 is a dummy node and the set of assets NV(ξ ) that have not been visited in the current solution for each scenario are mentioned after this node. In the mentioned example NV(5) = {1, 2, 3, 4, 5, 6, 7, 8, 13, 14, 15, 17, 19, 20}.

creating an initial solution
The initial solution is created randomly. The first main protection centre is located in the first cell of the solution for all scenarios then a random permutation of the set including assets and other main protection centres is generated for each scenario to determine the routing decision. The assets allocated to each main protection centre are ordered based on their required protection activity without changing the visiting sequence of the asset with the same protection activity. The reason is that assets with different protection activities can not be served by the same vehicle. Figure 6 illustrates the initial solution for a scenario. In this solution, the assets 10, 12, 18, 9, and 11 are assigned to depot 2 (node 22). Assets 10, 12, and 18 require protection activity type 1, and assets 9 and 11 need protection activity type 2. Therefore, ordering the assets based on their required activities will lead to a reasonable allocation of assets to vehicles as shown in this figure.
The modified solution can be infeasible because of time window constraints violation. Therefore, the obtained solution should be repaired. For repairing the infeasible routes the same procedure as mentioned in Algorithm 3 for the two-phase matheuristic algorithm is used. If a feasible sequence does not exist for an asset in a route. It can be visited by another vehicle, therefore, it can be inserted into another route corresponding to that main protection centre with the same type of service. For more explanation of repairing procedure, an example is explained. It is assumed that the obtained visiting sequence for the assets which require protection Improve the best solution by a fix and optimise procedure described in Algorithm 5; activity type 1 in the modified solution of Figure 6 is infeasible. The Time window constraints are violated for asset 18. Firstly, according to Algorithm 3, it is added to different positions of the current route to obtain the   feasible visiting sequence. If any feasible visiting sequence is not found, it is added to the second route. The steps are shown in Figure 7. In this example, three routes are considered for the assets assigned to the main protection centre 2. Two routes are related to the assets which require protection activity type 1 and one route includes the assets that need protection activity type 2. Also, a route for the assets that requires protection activity type 3 is assigned to main protection centre 3. Because the type of vehicles assigned to each main protection centre is not determined yet, the maximum number of routes can be considered to serve the assets with requiring protection activity type a in each main protection centre is determined by k∈K π ak . If the feasibility can not be guaranteed for an asset assigned to the main protection, then it should be added after the dummy node in the solution.
The decisions about the establishment of protection centres, assigning vehicles to protection centres, collaborations between protection centres, and selecting the routes based on the vehicles assigned to each protection centre are determined by using the Loc-Int model.
The proposed Loc-Int model is given by: s.t.: (2)-(4), (6), (11) Constraints (35) guarantee that the main protection centre can be constructed if at least one asset is assigned to it in the current solution. Constraints (36) ensure that the kth vehicle can be allocated to the r th route in the main protection centre d if that vehicle is assigned to this main protection centre or is transferred from a backup protection centre to the dth main protection centre. Constraints (37) guarantee that each vehicle can be assigned to at most one route when it has the required equipment to protect the assets in this route. Constraints (38) state that at most one vehicle with the proper equipment can be assigned to each route. Constraints (39) guarantee that the vehicles can not be assigned to the routes that have not the proper equipment to protect them. Constraints (40) ensure that the vehicles which are transferred from the backup protection centres can be assigned to a route if the time window constraints can be met. Constraints (41) determine the domain of variables. The solution obtained by the Loc-Int model needs to be modified. The assets that are assigned to the main protection centre that is not established according to the solution of the Loc-Int model are removed from their current cells and are transferred after a dummy node in the current solution. Also, if any vehicle is not allocated to a route, the assets in this route are removed from current cells and are transferred after the dummy node. For example, if no vehicle is assigned to the route that includes asset 18, it should be removed from its current cell and is considered as a not visited asset.

Removal strategies
Two removal strategies are used in the improved ALNS algorithm which were applied by Hammami, Rekik, and Coelho (2020).
• Random removal: in this strategy, a specific number of nodes (L ro d (ξ )) are selected randomly for each main protection centre under each scenario and are removed from their corresponding routes.
• Least profit removal: in this strategy, a specific number of nodes (L ro d (ξ )) with the least profit are removed from the routes corresponding to the dth main protection centre under scenario ξ .

Insertion strategies
Five different insertion strategies are designed in this study which is mentioned as follows. The nodes are selected by following strategies and then they are inserted in a feasible cell with the cheapest cost. In the first four strategies, nodes are selected among non-visited and removed nodes, however, in the last strategy nodes are selected among only non-visited nodes.
• Random selection by considering types of assigned vehicles to main protection centres: in this strategy, a predefined number of assets (L io d (ξ )) are selected randomly from the non-visited assets (placed after dummy node) and removed assets from other main protection centres under scenario ξ . In this strategy, during the selection, it is ensured that the required protection activities by the selected assets should be compatible with the types of assigned vehicles to the dth main protection centre in the SS solution.
• Selection of highest profit assets by considering types of assigned vehicles to main protection centres: a predefined number of assets (L io d (ξ )) with top higher profits are selected from the non-visited (placed after dummy node) and removed assets from other main protection centres under scenario ξ . In this strategy, during the selection, it is ensured that the required protection activities by the selected assets should be compatible with the types of assigned vehicles to the dth main protection centre in SS solution.
• Random selection: in this strategy, a specific number of assets (L io d (ξ )) are selected randomly from the nonvisited and removed assets from other main protection centres under scenario ξ .
• Highest profit selection: L io d (ξ ) number of assets with top higher profits are selected from the non-visited and removed assets from other main protection assets under scenario ξ .
• The highest mean value of profit selection: in this strategy, the mean value of potential achievable profit by adding the ith asset to the current routes is calculated S ∀i ∈ I. Then L io d (ξ ) number of assets with top higher MV values are selected from non-visited nodes under scenario ξ .

Improvement procedure
In this phase, the obtained solution from the ALNS algorithm is improved by using a fix and optimise procedure. The pseudo-code of this procedure is illustrated in Algorithm 5. In this procedure, an asset is selected from a restricted list of non-visited assets in each iteration and we try to add it to be visited. The restricted list of assets is selected according to the MV values. And assets with positive values of MV are added to the restricted list. In each iteration, an asset with the largest value of MV is selected from the restricted list. Then the costs related to inserting the selected asset to different positions of routes corresponding to each main protection centre are calculated. If insertion of an asset to a position of a route makes it infeasible, then its cost is assumed to be a large value. This guarantees to achieve a feasible solution using the improvement procedure. Then an intermediate model called the Imp-Int model is solved for the selected asset. The proposed the Imp-Int model is given by: s.t.: (2)-(6), (11), (36)-(41) Equation (42) is the objective function that calculates the expected achieved profit by inserting assetî in the best position. Constraints (43) assure that available active main protection centres be active in the solutions after improvement. constraints (44) state that at least the vehicle assigned to each depot in the best solution of the ALNS algorithm should be assigned to the depots after the improvement phase. Constraints (45) guarantee that the selected asset can be assigned only to a position of one route of a depot. Constraints (46) ensure that the selected asset can be assigned to a route if a vehicle is allocated to that route. Constraints (47) determine the domain of variables.

Algorithm 5:
The framework of the improvement procedure.

Input: SS best ;
Calculate MV for all assets; Define the set RL that includes assets with positive MV values sorted in a descending order while RL is not empty do Select the first element of set RL namedî Calculate the insertion cost ofî in the jth position of each route; Solve Imp-Int(î) model and update the decisions (z best d and y best dk ) Remove assetî from RL

Computational results
In this section, the validity of the proposed model (SAPP-PC) and neural networks to generate the scenarios are examined through numerical examples. The efficiency of the proposed matheuristic algorithms is evaluated using the instances generated based on Barreto's benchmark (Schneider and Drexl 2017) and a real case that are available at http://prodhonc.free.fr/Instances/instances_us. htm. This benchmark has been developed for location routing problem but the considered problem in this paper includes both location and team orienteering problems, simultaneously. Also, all parameters of the  Number  of  instances   Small  20, 25, 30  5  3  5  10, 20  30  Medium  35, 40, 45  5  3  5  10, 20  30  Large  50, 55, 60, 70, 80  5  3  5  20  25   Table 7. Additional parameters generation scheme based on the benchmark data set for the numerical analysis.

Parameters
Corresponding values or distribution benchmark are deterministic however, there are some stochastic parameters in this study. Therefore, we used the above-mentioned instances with some modifications adding required parameters such as travel times, service times, time windows, and profits to be used in the proposed model. Instances in this paper are generated based on coordChrist50, coordChrist75, and coordChrist100 instances of Barreto's benchmark. To access the data related to this study has been explained in the 'Data availability statement' section. Three sizes of problems are examined in this section and five instances (Ins 1, Ins 2,. . . , Ins 5) are considered for each combination of their specifications. As a result, 85 instances are examined in this section. Their specifications are shown in Table 6. The values of some required parameters are not available in the above-mentioned benchmark data set, so the values generating scheme is introduced in Table 7. Based on the locations of nodes that are reported in Barreto's benchmark, the whole area is partitioned into 100 cells. Different scenarios are designed according to the different start points of fire and spread patterns. If asset i is not located in the cells that are affected by the fire under scenario ξ , the obtained value by protecting this asset and upper bounds of time windows of that under scenario ξ will be considered to be zero. The computations were performed on a 3.5 GHz Workstation with 32 GB RAM and 6 cores operating with Windows 10 (64-bit) and using the Julia software. CPLEX 12.7.1 has been used as a solver.

Examining the effect of considering protection centre collaboration
The collaboration between protection centres reduces the worth of damage by protecting more assets, especially in the case that the available capacity in the main depots is not enough to assign different types of vehicles to them. The expected value of obtained utility with (EV WDC ) and without (EV WODC ) considering the collaboration between the main and backup protection centres under the different number of assets are reported in Table 8. In addition, the relative growth in obtained utility (RGU) is investigated. RGU is calculated by RGU = EV WDC /EV WODC × 100%. It is assumed that the numbers of scenarios, main and backup protection centres are 10, 2, and 3, respectively. The results indicate that by considering the collaboration between protection centres, the obtained utility increases 1.34 times on average. It confirms that the collaboration between protection centres will be effective in a wildfire.

Examining the effects of GRNN and MLPNN in predicting the wildfire intensity
A paired t-test is used to test the null hypothesis that MSE for MLPNN is equal to that obtained for GRNN. For this reason, MLPNN and GRNN were run 50 times and a paired t-test was investigated to consider a onetailed alternative hypothesis. The box plot of differences between the obtained MSE values is illustrated in Figure 8. Based on the obtained p-value (.001), the null hypothesis should be rejected, therefore, it can be concluded that MLPNN has a more accurate estimation of the intensity of wildfire. According to this analysis, MLPNN was used in this study.

Efficiency evaluation of the proposed matheuristic algorithms
The efficiency of the proposed two-phase matheuristic algorithm for solving the SAPP-PC model is presented in this section. The review of previous studies on efficient solution algorithms for the orienteering problem (Hammami, Rekik, and Coelho 2020), asset protection problem (Roozbeh, Ozlen, and Hearne 2018), and   Sun et al. 2020;Hellsten, Sacramento, and Pisinger 2020) confirms that the adaptive large neighbourhood search (ALNS) based algorithms have performed efficiently for these types of problems. Therefore, in this study, the efficiency of the proposed matheuristic algorithm is examined comparing to the improved version of the adaptive large neighbourhood search algorithm as well as the best objective function value obtained by the CPLEX solver in 10,000 s. The tuned parameters for the improved ALNS are mentioned in Table 9.

Examining the efficiency of the proposed matheuristic algorithm
The results obtained from the proposed two-phase matheuristic algorithm are compared with the results achieved by using the CPLEX solver for small-size instances (up to 35 assets and considering the different number of scenarios), as reported in Table 10. The best objective function value of the SAPP-PC model (Obj C ) obtained by the CPLEX in 10,000 s is reported in this table. In some cases the optimal solutions can not be found in 10,000 s, therefore, in these cases, the best solutions provided by the commercial solver is reported as Obj C . The obtained objective function values achieved by the two-phase matheuristic algorithm with learning procedure (Obj Math−L ) and without learning procedure (Obj Math−WL ) are reported in this table as well. The improved ALNS algorithm is solved 10 times for each instance and the best obtained objective function value (Obj B ImpALNS ) and the average value of objective function values (Obj M ImpALNS ) achieved by this algorithm are reported in Table 10. Also, the average computational time for the improved ALNS (Time ImpALNS ) algorithm is reported in this table. Gap c−Math and Gap c−ImpALNS represent the deviation of the value obtained by the CPLEX from both algorithms of the two-phase matheuristic with learning procedure and improved ALNS, respectively: Gap c−Math(c−ImpALNS) = (Obj C − Obj Math−L (Obj B ImpALNS ))/Obj C × 100%. In addition, the computational times of CPLEX and the matheuristic algorithm are reported in Table 10.
As shown in this table, the achieved objective values for the two-phase matheuristic algorithm with and without the learning mechanism are almost equal but the average computational time of the two-phase matheuristic with the learning mechanism is 2 times less than the average computational time of the two-phase matheuristic without the learning mechanism. Therefore, applying learning mechanism has a great impact to reduce the computational time of the proposed algorithm. The obtained results show that the two-phase matheuristic algorithm with learning mechanism has good performance and it can find near-optimal solutions (0.21% gap on average). As reported in this table, the average computational time of the CPLEX solver is 2 times greater than the average computational time of the twophase matheuristic algorithm with learning mechanism for small-size instances. While the computational time of the two-phase matheuristic algorithm with learning mechanism for instances with less than 30 assets is greater than the computational time of the CPLEX solver, therefore, it is better to use the CPLEX solver for solving these instances instead of the two-phase matheuristic algorithm. Also, the matheuristic algorithm with the Note: Index Math-L, Math-WL, C, and ImpALNS denote the obtained results by our the proposed two-phase matheuristic with learning, the two-phase matheuristic without learning, the CPLEX, and the improved ALNS, respectively. Note: Index Math-L, Math-WL, C, and ImpALNS denote the obtained results by the proposed two-phase matheuristic with learning, the two-phase matheuristic without learning, the CPLEX, and the improved ALNS, respectively.
learning mechanism has a better performance comparing to the improved ALNS algorithm. The average gap of the obtained solutions by the improved ALNS algorithm is 31 times greater than the average gap of achieved solutions from the two-phase matheuristic algorithm with the learning mechanism. The computational time of the two-phase matheuristic algorithm with the learning mechanism is significantly less than the computational time of the improved ALNS. The efficiency of the proposed two-phase matheuristic algorithm for mediumsize instances (up to 45 assets under the different number of scenarios) is examined and the results are reported in Table 11. Based on the results reported in Table 11, the CPLEX solver can find optimal solutions for 14 instances in less than 10,000 s, while the proposed two-phase matheuristic algorithm with the learning mechanism could find good quality solutions with 0.55% gap on average. The average computational time of the CPLEX solver for the instances that optimal solutions are found is 1997.60 s, while the average computational time of the proposed two-phase matheuristic algorithm with the learning procedure is 67 s. Also, the solutions achieved by the improved ALNS have 4.40% gap on average, therefore, the proposed two-phase matheuristic algorithm with the learning mechanism has a better performance comparing to the improved ALNS algorithm. Therefore, it can be concluded that the proposed two-phase matheuristic algorithm with learning mechanism is efficient for solving medium-size instances. Also, the results show that the quality of the proposed two-phase matheuristic with and without learning mechanism is equal but the average computational time of the two-phase matheuristic algorithm without the learning mechanism is 4 times more compared to the two-phase matheuristic algorithm with the learning mechanism. It is concluded that using the learning mechanism will reduce computational time without changing the quality of the obtained solutions.
The computational time corresponding to generating the subsets is increased by increasing the number of scenarios. Therefore, in the two-phase matheuristic algorithm, the critical assets in each iteration are partitioned in a predefined number of clusters and the subsets are generated for each cluster in large-size instances. Then combinations of the feasible routes constructed based on the subsets of different clusters are examined to generate other routes. The efficiency of the two-phase matheuristic algorithm for large-size instances (up to 80 assets) are examined and results are reported in Table 12. As reported in this table, the CPLEX solver could not find the optimal solution for any of the instances during a restricted 10,000 s time. While the average computational Note: Index Math-L, C, and ImpALNS denote the obtained results by our proposed two-phase matheuristic with learning, the CPLEX, and the improved ALNS, respectively.
time of the proposed two-phase matheuristic algorithm with the learning mechanism is 1481.96 s. Also, the average gap of 1.02% confirms a good quality obtained solutions by the two-phase matheuristic algorithm. The efficiency of the proposed algorithm is examined by comparing it with the improved ALNS. The average computational time of the improved ALNS (1417.31 s) is less than the average computational time of the proposed two-phase matheuristic algorithm with the learning mechanism but the quality of the obtained solutions by the improved ALNS is less than the solutions achieved by the two-phase matheuristic algorithm. On the other hand, the average gap of obtained solutions from the improved ALNS is almost 4 times greater than the average gap of the obtained solutions by the two-phase matheuristic algorithm. Therefore, the proposed twophase matheuristic algorithm with the learning mechanism has much better performance in large-size instances comparing with the improved ALNS algorithm.

Examining the efficiency of the improved ALNS comparing to the classic ALNS algorithm
In this section efficiency of the improved ALNS algorithm is compared with the classic ALNS for small-size and medium-size instances. In the classic ALNS, all decisions are considered in the solution representation and the mathematical models Loc-Int and Imp-Int are not utilised. Each algorithm runs 10 times and the best obtained objective values for the improved ALNS (Obj B ImpALNS ) and the classic ALNS (Obj B ALNS ) are reported in Tables 13 and 14. Also, the average objective values for the improved ALNS and the classic ALNS (Obj M ImpALNS and Obj M ALNS , respectively) and the average computational time of the algorithms are reported in these tables. The gap between the best objective values obtained by these algorithms is calculated by The obtained results in Table 13 show that the average computational time of the classic ALNS is almost 12 times less than the average computational time of the improved ALNS. The gap between the best objective values by these algorithms is 22.10% on average. Therefore, the improved ALNS has a better efficiency compared with the classic ALNS for small-size instances. Also, the obtained results in Table 14 show that the achieved solutions by improved ALNS algorithm have better quality compared with the classic ALNS for medium-size  instances. The improved ALNS found better solutions compared with the classic ALNS (22.91% on average).

Examining the efficiency of the two-phase matheuristic by increasing numbers of protection centres and vehicles
In this section, the effect of an increase in the number of protection centres and vehicles on the efficiency of the two-phase matheuristic algorithm is examined by considering 50, 55, and 60 assets. Also, it is assumed that the number of protection centres and vehicles are 8 and 8, respectively. The objective function values and computational times of the matheuristic algorithm for different instances are examined by the best objective value obtained by the CPLEX solver in 10,000 s. In nine instances, the algorithm outperforms the commercial solver in less computational time. Also, the average gap between the best objective value obtained by the CPLEX solver and the objective value of the two-phase matheuristic algorithm is −0.19%. It is concluded that the two-phase matheuristic algorithm can achieve good quality solutions in a reasonable time (Table 15).

Evaluating the effects of the diversification procedure in the two-phase matheuristic algorithm
In the two-phase matheuristic algorithm, a diversification approach is used in each iteration. In this section, the effect of applying a diversification mechanism on the solution quality is examined for various instances by considering 50 assets, 5 vehicles, 5 protection centres, and 20 scenarios. The obtained objective values from the two-phase matheuristic algorithm with and without considering diversification approach (Obj div and Obj wdiv , respectively) under different values of 2 are reported in Table 16. Based on the reported results in this table,  considering diversification mechanism has a significant impact on the quality of the obtained solutions by the two-phase matheuristic algorithm especially in the case of using a large value for 2 .

Real case application
Hobart, Tasmania (Australia) is one of the most firehazardous areas in the world. In this area, many activities such as fuel treatment are implemented to reduce the risk of damage by wildfire. Therefore, the efficiency of the proposed two-phase matheuristic algorithm and the effects of considering collaboration between protection centres are examined based on a real case application in Hobart. For this reason, 20 assets are selected in Hobart and their pair distances are calculated according to their latitudes and longitudes. 5 random locations are considered for protection centres and it is assumed that 8 vehicles are available. The scenarios are generated based on the method proposed in Section 4 by using real data about wind direction, month's rainfall, and month's maximum temperature in Hobart, Tasmania (Australia). The best objective function values achieved by the CPLEX in 10,000 s and the objective values and computational times of the two-phase matheuristic algorithm are reported in Table 17. In these cases, a time limit equal to 1000 s is considered for solving the route-based version of SAPP-PC in the second phase of the two-phase matheuristic algorithm. The results show that the proposed matheuristic algorithm has a satisfactory performance for real data as well. The CPLEX solver can not find the optimal objective function values in any cases while the proposed two-phase matheuristic algorithm has found good quality solutions (0.25% gap on average) in a reasonable time (672.35 s in average). The effect of considering collaboration between protection centres in the obtained utility profit is illustrated in Figure 9. In this figure, the RGU (relative growth in obtained utility that is calculated by RGU = EV WDC /EV WODC ) is shown for the real case under a different number of scenarios. As shown in Figure 9, considering collaboration between protection centres has a significant effect to increase the obtained utility by protecting assets.

Conclusion
Protecting assets is one possible strategy during a wildfire. Therefore, proper planning of asset protection before an actual wildfire may result in efficient actions during the wildfire. Since there are some sources of uncertainty related to the asset protection problem, it is considered as two-stage stochastic programming. The more uncertain parameters there are, the more possible scenarios there are. However, there is no full data for the scenarios, so an artificial neural network was employed to extract the required data for all the scenarios, considering previous wildfire data. This made our research very close to real-world situations.
The results of our numerical studies show that the proposed approach can reduce the costs of damage by considering collaboration between protection centres. In real case applications, the protection centres have limited capacity; therefore, they do not have enough capacity for different types of vehicles and equipment for asset protection problems. So, considering collaboration between protection centres in a disaster situation is critical to reducing the costs of damage. Also, results obtained based on the real case show that the utility obtained from protecting assets by considering collaboration between protection centres is increased for increases with a large number of scenarios.
Two solution approaches, a two-phase matheuristic and an improved ALNS, were developed to solve the problem efficiently. The efficiency of the proposed algorithms is examined for instances with different sizes. The results show that the two-phase matheuristic algorithm performs well for small-size instances and can find nearoptimal solutions (0.21% gap on average). Also, the computational time of the two-phase matheuristic algorithm is 2 times less than the computation time of the CPLEX solver for small-size instances. The CPLEX solver can find optimal solutions in 10,000 s for 14 medium-size instances from 30 instances, and the average computational time for these cases is 1997.60 s, while the average gap and computational time for the two-phase matheuristic algorithm are 0.55% and 67.00 s, respectively. The CPLEX solver cannot find optimal solutions within 10,000 s in any large-size instances. But the two-phase matheuristic algorithm can find good quality solutions in 1481.96 s on average. Therefore, the numerical analysis confirms that the proposed two-phase matheuristic solution method is efficient enough regarding quality and computational time.
The results of a numerical study confirm that the proposed solution algorithm is much more efficient when the algorithm is equipped with a learning stage during the algorithm iterations. The average computational times of the two-phase matheuristic algorithm with the learning procedure for small-size and medium-size instances are 2 and 4 times, less, respectively than the average computational times of the two-phase matheuristic algorithm without the learning procedure. Moreover, the performance of the proposed two-phase matheuristic algorithm is compared with an improved version of the ALNS algorithm, and the results confirm the effectiveness of the proposed algorithm.
Since this study can also be used for the protection of production centres during wildfires, it can be mentioned as a managerial insight for wildfire management teams. Therefore, they can make protection activities more collaborative between protection centres, which may not only enhance protection effectiveness but also decrease total asset protection costs. So, it can be suggested that production centres be allocated and protected by protection centres with collaboration potential. In this study, the fire spread pattern was not determined precisely. Future studies could predict fire spread on the cells by machine learning equipped with image processing before, during, and after a wildfire in the asset protection problem. Another direction for future research could be improving the first phase of the matheuristic algorithm to extract feasible routes more efficiently.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Data availability statement
The data that support the findings of this study are openly available in GitHub at https://github.com/bashirimahdi/ SAPP-PC.

Notes on contributors
Erfaneh Nikzad received the B.Sc. degree in industrial engineering from Alzahra University, in 2011, the M.Sc. degree in industrial engineering from Shahed University, Tehran, in 2014, and the Ph.D. in industrial engineering from Shahed University, Tehran, Iran, in 2020. She currently works as a lecturer in West Tehran Branch, Islamic Azad University, Tehran, Iran. Her research interests include vehicle routing problems, matheuristic algorithms, stochastic programming, and machine learning. Mahdi Bashiri has over 17 years of academic experience with a particular interest in operations and supply chain management, operations research, transportation planning, Heuristic and Matheuristic algorithms. Also, he has participated in industrial and business projects in different levels. He has been involved in two UKRI funded projects. He has supervised more than 5 Ph.D. students and more than 60 M.Sc. students. He is an active reviewer for reputable academic journals. He is a recipient of the 2013 young national top scientist award from the Academy of Sciences of the Islamic Republic of Iran.