An efficient hybrid differential evolutionary algorithm for bilevel optimisation problems

Bilevel problems are widely used to describe decision problems with hierarchical upper-lower-level structures in many economic fields. The bilevel optimisation problem (BLOP) is intrinsically NP-hard when its objectives and constraints are complex and the decision variables are large in scale at both levels. An efficient hybrid differential evolutionary algorithm for the BLOP (HDEAB) is proposed, in which the optimal lower level value function mapping method, the differential evolutionary algorithm (DEA), k-nearest neighbours (KNN) and a nested local search are hybridised to improve the computational accuracy and efficiency. To show the performance of the HDEAB, numerical studies were conducted on SMD (Sinha, Malo and Deb) instances and on an application example of optimising a venture capital staged-financing contract. The results demonstrate that the HDEAB greatly outperforms the BLEAQ (bilevel evolutionary algorithm based on quadratic approximations) in solving BLOPs of different scales.


Introduction
Many decision-making problems in the economic, public and private sectors can be described as bilevel optimisation problems (BLOPs), such as the taxing strategy (Wei, Liang, Liu, Mei, & Tian, 2014), environmental economics (Sinha, Malo, & Deb, 2013), homeland security (Wein, 2009), the toll-setting problem (Brotcorne, Labbé, Marcotte, & Savard, 2001), the operational decision-making problem (Haghighat & Kennedy, 2012), transportation policy formulation (Sinha, Malo, & Deb, 2015), the spatial targeting of agri-environmental policy (Whittaker et al., 2017), and so on. The main characteristic of the BLOP is the hierarchical upper-lower-level structure (i.e., the leader-follower structure), where the private profit-seeking follower at the lower level could well be in conflict with the leader's objective at the upper level.
To improve the computational accuracy and efficiency of solving the BLOP, this study makes three improvements. First, the KNN method is adopted to save the calls of function evaluations (FEs) at the lower level. Second, a nested local search method is used to further save the calls of FEs (i.e., the times the evaluation of FEs is invoked) at the lower level by providing upper level candidate decision variables with higher quality. Third, the DEA is applied to further boost the computational efficiency.
The remainder of this study is organised as follows. Section 2 reviews the related literature on the BLOP in recent decades. Section 3 analyses the performances of the meta-modelling methods in reducing the bilevel problem into a single-level one. Section 4 introduces the principles and pseudo-codes of the HDEAB for solving the BLOP. Section 5 provides detailed numerical studies comparing the performances of the HDEAB and BLEAQ on SMD (Sinha, Malo, & Deb, 2018) instances and on an application example of optimising a venture capital staged-financing contract with different scales. Section 6 gives the conclusions.

Literature review
In recent decades, a large body of literature has presented various algorithms to solve BLOPs, which can be categorised into two streams: (1) using the classical algorithms; and (2) adopting the evolutionary algorithms.

The classical algorithms for solving the BLOP
The most attractive merit of the classical algorithms lies in the fact that the optimality of the results can be theoretically guaranteed. When the lower level problem is convex and sufficiently regular, the bilevel problem can be reduced to a single-level one by the Karush-Kuhn-Tucker (KKT) method, in which Lagrangian multipliers and complementarity constraints are introduced to incorporate the lower level problem into the upper level one. However, this method increases the total number of decision variables, which gives rise to three undesired phenomena: (1) non-convexity arises easily even when the lower level problem is sufficiently regular; (2) the computational burden could rise exponentially, because more decision variables have to be optimised (Sinha et al., 2018); and (3) it cannot easily handle situations where the objectives and constraints of the BLOP are non-linear or non-convex.
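As an illustration of this kind of reduction, consider a toy bilevel problem with a convex, unconstrained lower level. The functions below are our own illustrative choices, not from any cited work; since there are no lower level inequality constraints, the lower-level stationarity condition plays the role of the full KKT system:

```python
# Toy bilevel problem (illustrative functions, our own assumptions):
#   upper:  min_xu F(xu, xl) = (xu - 1)^2 + xl^2
#   lower:  min_xl f(xu, xl) = (xl - xu)^2
# Lower-level stationarity df/dxl = 2*(xl - xu) = 0 gives xl = xu,
# so the bilevel problem collapses to min_xu (xu - 1)^2 + xu^2.

def reduced_upper_objective(xu):
    xl = xu                      # the "KKT" condition of the toy lower level
    return (xu - 1.0) ** 2 + xl ** 2

# crude grid search over the reduced single-level problem
best_xu = min((i / 1000.0 for i in range(-1000, 2001)),
              key=reduced_upper_objective)
print(best_xu)                   # 0.5
```

Note how the reduction removed the nested structure entirely; with inequality constraints at the lower level, the same substitution would instead introduce multipliers and complementarity constraints, with the drawbacks listed above.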
To overcome the shortcomings of the KKT method, many researchers resorted to other approaches. For example, a descent algorithm was developed for solving a nonlinear BLOP (Savard & Gauvin, 1994). A trust region method was adopted to solve the generalised BLOP (Marcotte, Savard, & Zhu, 2001), where the bilevel problem was locally approximated with a model involving a linear program at the upper level and linear variational inequality at the lower level. In Colson, Marcotte, and Savard (2005), the BLOP was solved in two steps: first, a bilevel linear quadratic model was used to approximate the BLOP, thus it was reduced into a single-level problem; second, the reduced problem could be solved with a mixed integer programming method.
The above classical methods might be restricted in real-world applications. For a more general BLOP where the objectives and constraints (i.e., functions) involved are non-convex and non-differentiable, the classical methods are unable to solve the BLOP exactly. It is also computationally expensive to obtain a solution when the numbers of decision variables at both levels are large in scale, because the BLOP is strongly NP-hard in nature (Ben-Ayed & Blair, 1990; Vicente, Savard, & Júdice, 1994).

The evolutionary algorithms for solving the BLOP
Since most BLOPs are intrinsically NP-hard, it is realistic to seek a satisficing solution rather than the optimal one. The EA imposes few requirements on the types of objectives and constraints, so it complements the shortcomings of the classical methods. Studies with the EA contributed almost 10% of all studies on solving BLOPs (Sinha et al., 2018).
The work in Mathieu, Pittard, and Anandalingam (1994) might be the first study to solve the BLOP with an EA. In this work, the upper level problem was solved by a genetic algorithm (GA), while the lower level problem was solved by a linear programming method. This pure nested strategy solved the BLOP in a nested manner: the lower level optimisation problem was solved for every given upper level decision. More recently, researchers have considered the EA to be better than the classical methods in computational efficiency and accuracy for solving BLOPs with more complex conditions and constraints (Yin, 2000). Based on a constraint-handling scheme, an EA with a specifically-designed crossover operator was proposed to solve the non-linear BLOP (Wang, Jiao, & Li, 2005). When the objectives were linear and the constraint regions were polyhedrons at both levels, an algorithm was developed by combining classical extreme point enumeration techniques with a genetic search, which was able to solve the quasi-concave BLOP with a linear lower level function (Calvete, Galé, & Mateo, 2008). The work by Wang, Li, and Dang (2011) moved a great step forward, proposing a new EA that could handle a non-differentiable upper level objective and a non-convex lower level problem. A much simpler approach was proposed in Angelo, Krempser, and Barbosa (2013), where two differential evolutionary techniques were applied at both levels with continuous decision variables. Recently, an EA embedded with the KKT proximity measure was proposed in Sinha, Malo, and Deb (2017), which demonstrated a promising future for solving BLOPs by hybridising the EA with other approximate approaches.
There is room for improving the pure nested strategies mentioned above in solving BLOPs. It is often unnecessary to solve the lower level optimisation problem for every upper level decision (Sinha et al., 2018). There will be a great improvement in computational efficiency when the lower level calls of the FEs can be saved. Sinha and his co-authors showed that the lower level calls of the FEs could be saved greatly through meta-modelling methods. For example, they introduced the BLEAQ, in which the reaction set mapping method was adopted. This approach demonstrated its capability of handling BLOPs with different kinds of complexities by using a smaller number of FEs. Later, they improved the BLEAQ with archiving and local search techniques, and this improved BLEAQ offered a significant improvement in reducing the calls of FEs at both levels. Recently, they proposed a meta-modelling-based strategy to iteratively approximate the optimal lower level value function (Sinha, Malo, & Deb, 2016). It not only relaxed the highly restrictive assumptions on the class of bilevel problems that could be handled, but also reduced the computational burden for some complex BLOPs. Angelo, Krempser, and Barbosa (2014) reported a DEA assisted by a similarity-based surrogate model (i.e., KNN) to greatly reduce the FEs on the lower level problem.

Distinctiveness of this research
This study was motivated by Angelo et al. (2014) and Sinha et al. (2017). Although these two works did great jobs in solving different BLOPs, there were shortcomings that hampered the computational accuracy and efficiency.
First, the reaction set mapping adopted in the BLEAQ (Sinha et al., 2017) is not an ideal meta-modelling method for saving the lower level calls, because the quadratic approximation method used in the BLEAQ requires calculating the inverse of the coefficient matrix. When the number of upper level decision variables (i.e., n) is large, the computational complexity could reach O(n^6) to solve the inverse once for each variable at the lower level. Furthermore, the quadratic approximation method might perform poorly on the boundary of the original approximation region, because discontinuity can arise due to the constraints on the decision variables at both levels. Besides, the multi-valued phenomenon would introduce discontinuity into the reaction set mapping method used in Sinha et al. (2017), thus causing low computational accuracy.
Second, compared with the BLEAQ in Sinha et al. (2017), the algorithm proposed in Angelo et al. (2014) has an obvious advantage: the KNN candidate is easy to use as an approximation of the lower level optimal solution. However, the disadvantage of this method is that it randomly selects only one KNN candidate of the lower level optimal decisions for the given upper level decision variables. It might therefore pick an undesired KNN candidate while ignoring a well-behaved one, so low computational efficiency and accuracy can occur simultaneously in Angelo et al. (2014).
Third, the algorithms proposed in Angelo et al. (2014) and Sinha et al. (2017) often converge rather slowly for some BLOPs in which the multi-valued phenomenon might exist in the lower level problem. Once the meta-modelling method fails to find the real optimum for the upper level problem, the convergence of the approximation will be rather time-consuming.
To overcome the shortcomings of the algorithms in Angelo et al. (2014) and Sinha et al. (2017), we propose a more efficient algorithm called the HDEAB. First, to ensure higher computational accuracy, the optimal lower level value function mapping method is utilised to reduce the BLOPs into single-level ones. Second, the KNN method is adopted to save the lower level calls of FEs. Third, a nested local search is used to boost the computational efficiency by providing upper level candidates with higher quality. Through numerical studies on the SMD instances, we demonstrate that the HDEAB has the same robustness as the BLEAQ presented in Sinha et al. (2017). Moreover, the HDEAB performs with higher computational accuracy and efficiency, and can solve BLOPs with a larger scale of decision variables than the BLEAQ.

Reduction methods for BLOPs
Consider a general BLOP with n and m decision variables at the upper and lower level problems, respectively (i.e., x_u ∈ X_u ⊆ R^n and x_l ∈ X_l ⊆ R^m, where the subscripts u and l denote the upper and lower levels, respectively). Decision-makers at both levels are private-profit maximisers with different objectives (i.e., F and f for the upper and lower levels, respectively). A general BLOP is given in Equation (1):

    min_{x_u ∈ X_u, x_l ∈ X_l} F(x_u, x_l)
    s.t. G_k(x_u, x_l) ≤ 0, k = 1, 2, ..., K
         x_l ∈ argmin { f(x_u, x_l) : g_j(x_u, x_l) ≤ 0, j = 1, 2, ..., J }    (1)

where G_k and g_j denote the constraints for the upper and lower level problems, respectively.
For the given upper level decision variables x_u, the follower optimises the lower level decision variables x_l to maximise his/her own objective. The leader, in turn, has to incorporate x_l into his/her own objective before optimising x_u. The procedure for solving the BLOP is highly nested, hence it is computationally expensive when the numbers of decision variables at both levels increase. If the bilevel problem is reduced to a single-level one, the computational efficiency of solving the BLOP can be greatly improved.
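The nested procedure described above can be sketched as follows. The toy objectives and grids are our own illustrative choices; the point is that every upper level candidate triggers a full lower level optimisation, so the lower level FE count grows multiplicatively:

```python
def f_lower(xu, xl):                 # follower's objective (minimised), illustrative
    return (xl - xu) ** 2

def F_upper(xu, xl):                 # leader's objective (minimised), illustrative
    return (xu - 2.0) ** 2 + xl ** 2

XU = [i / 100.0 for i in range(-300, 301)]   # upper-level grid
XL = [i / 100.0 for i in range(-300, 301)]   # lower-level grid

lower_calls = 0

def solve_lower(xu):
    """One full lower-level optimisation for every upper-level candidate."""
    global lower_calls
    lower_calls += len(XL)           # every grid point costs one lower-level FE
    return min(XL, key=lambda xl: f_lower(xu, xl))

best_xu = min(XU, key=lambda xu: F_upper(xu, solve_lower(xu)))
best_xl = solve_lower(best_xu)
print(best_xu, best_xl, lower_calls)  # 1.0 1.0, and several hundred thousand FEs
```

Even for this one-dimensional toy, the nested strategy spends len(XU) × len(XL) lower level evaluations, which motivates the reduction methods discussed next.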
The reaction set mapping and the optimal lower level value function mapping methods are the meta-modelling methods most related to this study for reducing the BLOP. They follow the same reduction principle: incorporating the lower level problem into the upper level one by various mapping methods. Once the reduction is done, various optimisation techniques can be utilised to approximate the optimal solution of the BLOP. Note that even if the bilevel problem is reduced to a single-level one, the task of solving the BLOP remains NP-hard in nature as the number of decision variables increases.

Reaction set mapping method
In this method, the lower level problem can be represented by the reaction set mapping (denoted as the ψ-mapping), which is given by Equation (2):

    ψ(x_u) = argmin { f(x_u, x_l) : g_j(x_u, x_l) ≤ 0, j = 1, 2, ..., J }    (2)

Then the BLOP can be reduced into a single-level constrained optimisation problem, which is given by Equation (3):

    min_{x_u ∈ X_u, x_l ∈ X_l} F(x_u, x_l)
    s.t. x_l ∈ ψ(x_u)
         G_k(x_u, x_l) ≤ 0, k = 1, 2, ..., K    (3)

Once the value of ψ(x_u) is determined, a smaller search space of x_l can be obtained. In particular, when ψ(x_u) is single-valued, there is no need to optimise x_l because x_l = ψ(x_u) always holds. Only x_u needs to be optimised in the X_u space, which is given by Equation (4):

    min_{x_u ∈ X_u} F(x_u, ψ(x_u))
    s.t. G_k(x_u, ψ(x_u)) ≤ 0, k = 1, 2, ..., K    (4)

When ψ(x_u) is set-valued, the optimisation of x_u will face serious problems of discontinuity or local optimums. No optimisation technique can promise that approximating ψ(x_u) is easy and straightforward. For example, in Figure 1 the trace could be discontinuous and non-differentiable (e.g., traces 1 and 2) in the grey shaded area where ψ(x_u,0) is multi-valued for a certain x_u,0. The optimisation technique cannot predict whether ψ(x_u,0) is attributed to trace 1 or to trace 2.
Three shortcomings might counteract the advantages of the ψ-mapping. First, the widely used quadratic approximation method in the ψ-mapping always requires calculating the inverse of the coefficient matrix. When the number of upper level decision variables (i.e., n) is large, solving this inverse is a rather computationally expensive task, because the computational complexity could reach O(n^6) for each inversion. The complexity of approximating the optimal solution of the lower level with m decision variables then reaches O(n^6 × m) at a time, so it is extremely time-consuming to iteratively approximate the lower level optimum. There is no guarantee of obtaining the approximation within a given time budget (e.g., 5 hours) under a certain computational environment (e.g., given CPU and memory). Second, the decision variables at both levels are usually constrained, and discontinuity frequently arises on the boundaries of the decision variables at both levels. The assumption embedded in the quadratic approximation method is that the small samples come from a family of smooth or continuous functions; therefore, it might perform poorly on the boundaries where discontinuity takes place.
Third, the algorithms proposed in Angelo et al. (2014) and Sinha et al. (2017) reported only one feasible solution of the lower level problem for each iteration. However, when there are many local optimums for the lower level problem, the quadratic approximation method could fail to give a suitable lower level solution for the given upper level decision vector, thus the convergence of the algorithms might be rather slow and the computational accuracy might be doubtful.

Optimal lower level value function mapping method
This method (denoted as the φ-mapping) is another meta-modelling method to reduce the bilevel problem. For a given x_u, the minimum lower level function value φ(x_u) is given by Equation (5):

    φ(x_u) = min { f(x_u, x_l) : g_j(x_u, x_l) ≤ 0, j = 1, 2, ..., J, x_l ∈ X_l }    (5)

By Equation (5), the BLOP can be reduced into a single-level problem, which is shown in Equation (6):

    min_{x_u ∈ X_u, x_l ∈ X_l} F(x_u, x_l)
    s.t. f(x_u, x_l) ≤ φ(x_u)
         g_j(x_u, x_l) ≤ 0, j = 1, 2, ..., J
         G_k(x_u, x_l) ≤ 0, k = 1, 2, ..., K    (6)

Both x_l and x_u in Equation (6) must be predicted in the φ-mapping. This differs from the ψ-mapping, where only the prediction of x_l is needed. Therefore, more computational effort is needed to solve the BLOP by the φ-mapping than by the ψ-mapping. Interestingly, predicting both x_l and x_u overcomes the discontinuity shortcoming of the ψ-mapping when ψ(x_u) is multi-valued. For example, as shown in Figure 2, the discontinuity phenomenon of the ψ-mapping disappears at x_u,0 in the φ-mapping, because the φ-mapping is always single-valued by Equation (5). Therefore, the φ-mapping can provide a relatively higher accuracy than the ψ-mapping in solving the BLOP. Note that the value function φ(x_u) is seldom known and still has to be obtained by an approximation tool.

Now consider approximating φ(x_u), and let φ̂(x_u) be the approximation of φ(x_u). There will be errors in approximating φ(x_u). These errors might cause φ̂(x_u) to be less than the true φ(x_u) and thus exclude the true solution of the BLOP. To avoid this phenomenon, an error term e is added to φ̂(x_u), and Equation (6) can be reformulated as Equation (7):

    min_{x_u ∈ X_u, x_l ∈ X_l} F(x_u, x_l)
    s.t. f(x_u, x_l) ≤ φ̂(x_u) + e
         g_j(x_u, x_l) ≤ 0, j = 1, 2, ..., J
         G_k(x_u, x_l) ≤ 0, k = 1, 2, ..., K    (7)

The φ-mapping has the advantage of avoiding discontinuity and providing a more continuous trace for approximation than the ψ-mapping, by which a higher computational accuracy in solving the BLOP can be achieved. However, the computational efficiency might be hampered if no method can be used to accelerate the approximation of the φ-mapping.
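The value-function reduction can be sketched on a toy problem as follows. The objective functions, grids and tolerance are our own illustrative choices: the optimal lower level value is computed by one lower level solve per upper level point, and the reduced single-level problem keeps only the pairs whose lower level value is within the error term of that optimum:

```python
def f_lower(xu, xl):                         # follower's objective, illustrative
    return (xl - xu) ** 2

def F_upper(xu, xl):                         # leader's objective, illustrative
    return (xu - 2.0) ** 2 + xl ** 2

XU = [i / 50.0 for i in range(-100, 151)]    # upper-level grid
XL = [i / 50.0 for i in range(-100, 151)]    # lower-level grid
EPS = 1e-9                                   # the error term e in Equation (7)

# optimal lower-level value: one lower-level solve per upper-level point
phi = {xu: min(f_lower(xu, xl) for xl in XL) for xu in XU}

# reduced single-level problem: keep only pairs with f <= phi + e
feasible = [(xu, xl) for xu in XU for xl in XL
            if f_lower(xu, xl) <= phi[xu] + EPS]
best_xu, best_xl = min(feasible, key=lambda p: F_upper(*p))
print(best_xu, best_xl)                      # 1.0 1.0
```

Note that the value function here is always single-valued even when several x_l attain it, which is exactly the continuity advantage discussed above; in the HDEAB the exact lower level solves that build this mapping are replaced by cheap approximations.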

Algorithm description
In this study, the φ-mapping is chosen to solve the BLOP for its higher computational accuracy. To improve the computational efficiency, the KNN method is first adopted to save the calls of FEs at the lower level. Second, a nested local search is used to provide the upper level candidate with higher quality, which further saves the calls of FEs at the lower level. Third, the steps of the HDEAB are outlined, which hybridises the DEA, the KNN and the nested local search. The HDEAB's pseudo-codes are given in the online supplement.

Using KNN to save the calls of the FEs at the lower level
As shown in Equation (7), the evaluation of φ̂(x_u) requires immense approximation effort at the lower level for a given x_u. If the approximation tasks at the lower level problem (i.e., finding x_l with f(x_u, x_l) ≤ φ̂(x_u) + e) can be saved, then the computational efficiency of the φ-mapping can be greatly improved. To speed up the approximation procedure for the optimal lower level solution x_l, the KNN is used to construct a candidate x̃_l. The candidate x̃_l can be iteratively estimated by Equation (8) for an arbitrarily given upper level decision vector x_u,0:

    x̃_l = Σ_{j=1}^{k} w_j x_l,j,  w_j = (1 / d(x_u,0, x_u,j)) / Σ_{j'=1}^{k} (1 / d(x_u,0, x_u,j'))    (8)

where x_u,j is the archived jth-nearest upper level decision variable, x_l,j is the corresponding jth-nearest lower level optimal solution, and d(·) is the Euclidean distance between x_u,0 and x_u,j.
For a given x_u,0, the candidate x̃_l can be accepted as the true lower level optimal solution if f(x_u,0, x̃_l) satisfies the constraints in Equation (7). Otherwise, the neighbours of x̃_l should be searched to approximate the true x_l until f(x_u,0, x̃_l) satisfies the constraints.
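A minimal sketch of the KNN candidate construction is given below. The inverse-distance weighting is one plausible reading of Equation (8); the function and variable names are our own:

```python
import math

def knn_candidate(archive, xu0, k=3):
    """Combine the archived lower-level optima of the k nearest upper-level
    decisions, weighted by inverse Euclidean distance to xu0."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    nearest = sorted(archive, key=lambda rec: dist(rec[0], xu0))[:k]
    weights = [1.0 / (dist(xu, xu0) + 1e-12) for xu, _ in nearest]  # guard /0
    total = sum(weights)
    dim = len(nearest[0][1])
    return tuple(sum(w * xl[j] for w, (_, xl) in zip(weights, nearest)) / total
                 for j in range(dim))

# archived (x_u, x_l) pairs from earlier exact lower-level solves (toy data)
archive = [((0.0,), (0.0,)), ((1.0,), (1.0,)), ((2.0,), (2.0,)), ((5.0,), (5.0,))]
print(knn_candidate(archive, (1.1,), k=2))   # close to (1.1,)
```

Because the archived mapping in this toy data is the identity, the interpolated candidate lands near the query point, which is the sanity check one would expect; in the HDEAB the candidate is then accepted or refined against the constraints of Equation (7).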

Using a nested local search to boost the computational efficiency
In principle, if an upper level candidate x̃_u with a relatively high quality can be obtained, then the calls of FEs at the lower level can be further saved and the computational efficiency of the φ-mapping can be improved. However, this task has to be solved in a nested way: for the lower level candidate x̃_l given by Equation (7), it should first be ensured that the upper level candidate x̃_u falls into the feasible region, which is given by Equation (9). Otherwise, x̃_l cannot be expected to be a satisfactory candidate.
Recall the reduced BLOP in Equation (7). If x̃_u ∈ X_u is chosen as the feasible region, then many infeasible candidates of the upper level (x̃_u) might be brought in due to the constraints. Generally, the feasible region of x̃_u is no larger than X_u due to G_k(·) and g_j(·), and it would be computationally expensive to exclude the infeasible candidates x̃_u. Fortunately, an x̃_u with a higher quality can be obtained by solving Equation (9), whose constraints are those of Equation (7):

    min 0
    s.t. g_j(x_u, x_l) ≤ 0, j = 1, 2, ..., J
         G_k(x_u, x_l) ≤ 0, k = 1, 2, ..., K
         x_u ∈ X_u, x_l ∈ X_l    (9)

where the zero objective denotes that the objective is constant (i.e., Equation (9) is a pure feasibility problem).
The x̃_u that satisfies the constraints in Equation (9) will be considered as a feasible candidate for the upper level problem. Once x̃_u is ready, approximation tools (e.g., sequential quadratic programming (SQP) or the DEA) can be utilised to predict x̃_l. By solving Equation (9), the total calls of FEs at the lower level can be greatly saved by using an x̃_u with a relatively higher quality.
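The feasibility pre-check of Equation (9) can be sketched as a simple filter: a candidate pair survives only if it satisfies all constraints, so no lower level FEs are wasted on infeasible upper level candidates. The constraint functions below are illustrative stand-ins for G_k and g_j:

```python
def G1(xu, xl):                  # illustrative upper-level constraint, G_1 <= 0
    return xu + xl - 3.0

def g1(xu, xl):                  # illustrative lower-level constraint, g_1 <= 0
    return -xl                   # i.e. xl >= 0

def is_feasible(xu, xl, xu_box=(0.0, 5.0), xl_box=(0.0, 5.0)):
    """Feasibility test corresponding to the constraint set of Equation (9)."""
    return (xu_box[0] <= xu <= xu_box[1] and
            xl_box[0] <= xl <= xl_box[1] and
            G1(xu, xl) <= 0.0 and
            g1(xu, xl) <= 0.0)

candidates = [(0.5, 1.0), (2.5, 1.0), (4.0, 4.0), (1.0, -0.5)]
survivors = [c for c in candidates if is_feasible(*c)]
print(survivors)                 # [(0.5, 1.0)]
```

Only the surviving candidates would then be passed to the (expensive) lower level approximation step, which is where the savings in FE calls come from.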

Hybridising the DEA, the KNN and the nested local search
In this section, the steps of the HDEAB are introduced, in which the DEA is adopted as the optimisation engine for both levels. The main reason is that the DEA can handle BLOPs whose objectives are non-convex, non-differentiable or have many local optimums (Tripathy & Panda, 2017).
The HDEAB contains five steps: initialisation, mutation, crossover, selection and the termination criterion. To keep the notation consistent, let X without a subscript denote the vector of the decision variables at both levels, with M decision variables in X. Let min P(X) be the objective (e.g., P(·) can be either F(·) or f(·) in Equation (7)).
Step 1: Initialisation
Before using the DEA to optimise P(X), the population X_i,G, i = 1, ..., N, should first be constructed, which has the form:

    X_i,G = [x_1,i,G, x_2,i,G, ..., x_M,i,G]    (10)

where X_i,G denotes the ith individual vector at the Gth generation (X_i,G is called the gene), N denotes the size of the population, and x_j,i,G denotes the jth variable of the ith individual vector at the Gth generation. Suppose x_j^L and x_j^U are the lower and upper boundaries for the jth variable, respectively. The initial population X_i,0 is randomly selected from [x_j^L, x_j^U].
Step 2: Mutation
For each X_i,G, a mutant vector V_i,G+1 (also called the donor individual) can be generated at the (G+1)th generation by the following formulation:

    V_i,G+1 = X_best,G + λ (X_r1,G - X_r2,G)    (11)

where X_best,G is the best individual in the population, r1 and r2 are integer indexes randomly chosen from {1, ..., N}, and λ is the mutation factor with λ ∈ [0, 2]. r1 and r2 should be different, thus the size of the population (N) should be no less than three.
Step 3: Crossover
To increase the diversity of the individuals in the population, crossover is essential in the DEA. A new trial individual U_i,G+1 = [u_1,i,G+1, ..., u_j,i,G+1, ..., u_M,i,G+1] can be generated by the following crossover procedure:

    u_j,i,G+1 = v_j,i,G+1  if rand_j,i ≤ CR or j = I_rand
    u_j,i,G+1 = x_j,i,G    otherwise    (12)

where i = 1, 2, ..., N and j = 1, 2, ..., M; CR is the crossover probability, a constant in [0, 1]; rand_j,i is uniformly distributed in [0, 1]; and I_rand is a random integer between 1 and M, so that I_rand ensures U_i,G+1 ≠ X_i,G. Each newly generated trial individual U_i,G+1 should fall between the boundaries given in Equation (7).
Step 4: Selection
After the crossover, the new trial individual U_i,G+1 is treated as the candidate solution at the (G+1)th generation. A comparison between X_i,G and U_i,G+1 should be made to select the better candidate. The lower function value, measured by Equation (13), is used as the selection criterion for the one that enters the next generation:

    X_i,G+1 = U_i,G+1 if P(U_i,G+1) ≤ P(X_i,G); otherwise X_i,G+1 = X_i,G    (13)
First, U_i,G+1 enters the population if it satisfies Equation (14):

    P(U_i,G+1) ≤ P(X_i,G)    (14)

Second, if the termination criterion is not satisfied, then go back to the mutation step.
Step 5: Termination criterion
In this study, a variance-based termination criterion is used for both levels. The termination criterion at the Gth generation is given by α_G:

    α_G = Σ_{i=1}^{M} σ²_G,i / Σ_{i=1}^{M} σ²_0,i

where σ²_G,i and σ²_0,i denote the variances of the ith decision variable over the population at the Gth generation and the initial generation, respectively, and M is the number of decision variables in the vector X_G,i. The algorithm is terminated when α_G ≤ α_stop. The value of α_G usually lies between zero and one. By its definition, the value of α_G is closely related to σ²_0,i: if σ²_0,i is very small (e.g., 1 × 10^-3), then α_G = Σ_{i=1}^{M} σ²_G,i is applied instead. In this study, the local search is employed when α_G ≤ (α_stop)^0.1; otherwise the local search does not have to be employed, because X_G,i is still far from its optimal solution when α_G > (α_stop)^0.1.
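The five steps above can be sketched in one compact loop. The sphere objective, bounds and random seed are our own illustrative choices; the parameter names (λ, CR, α_stop) follow the text, and the mutation uses the DE/best/1 form implied by Step 2:

```python
import random

random.seed(0)

def P(X):                                    # objective to minimise (toy sphere)
    return sum(x * x for x in X)

N, M = 20, 3                                 # population size, dimensions
LOW, UP = -5.0, 5.0                          # variable boundaries
LAM, CR, ALPHA_STOP = 0.9, 0.9, 1e-6         # mutation factor, crossover prob., stop

def column_vars(pop):
    """Per-variable population variances (the sigma^2_{G,i} of Step 5)."""
    means = [sum(ind[j] for ind in pop) / N for j in range(M)]
    return [sum((ind[j] - means[j]) ** 2 for ind in pop) / N for j in range(M)]

# Step 1: initialisation
pop = [[random.uniform(LOW, UP) for _ in range(M)] for _ in range(N)]
var0 = column_vars(pop)

for G in range(2000):
    best = min(pop, key=P)
    new_pop = []
    for i, target in enumerate(pop):
        # Step 2: DE/best/1 mutation, V = X_best + lambda * (X_r1 - X_r2)
        r1, r2 = random.sample([k for k in range(N) if k != i], 2)
        donor = [best[j] + LAM * (pop[r1][j] - pop[r2][j]) for j in range(M)]
        # Step 3: binomial crossover; I_rand forces at least one donor component
        I_rand = random.randrange(M)
        trial = [min(UP, max(LOW, donor[j]))
                 if (random.random() <= CR or j == I_rand) else target[j]
                 for j in range(M)]
        # Step 4: greedy selection on the objective value
        new_pop.append(trial if P(trial) <= P(target) else target)
    pop = new_pop
    # Step 5: variance-based termination, alpha_G = sum var_G / sum var_0
    if sum(column_vars(pop)) / sum(var0) <= ALPHA_STOP:
        break

print(P(min(pop, key=P)))                    # near zero for the sphere function
```

In the HDEAB this loop runs at both levels, with the KNN candidates and the nested local search injected before the lower level loop is invoked.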

Numerical study
In this section, to compare the performance of the HDEAB and BLEAQ, numerical studies are conducted on the SMD instances and on an application example of optimising a venture capital staged-financing contract. Both approaches are run 31 times on each SMD instance, the same number of runs as in Sinha et al. (2017). All numerical studies are conducted in Matlab 2016a on a CPU i5 @ 3.20 GHz with 8 GB RAM.

Performance on the small-scale SMD instances
The SMD instances contain 12 problems, of which the first eight are unconstrained BLOPs and the rest are constrained. To compare the performance of the HDEAB and BLEAQ, the same parameters are used as in Sinha et al. (2017) to generate the small-scale SMD instances (i.e., p = 1, q = 2, r = 1), where the numbers of decision variables at the upper and lower levels are two and three, respectively (i.e., the scale is 2 × 3). For SMD6, s = 1 is used to generate the instance with a scale of 2 × 3. In this study, the scale of 2 × 3 is defined as small.
The parameters for the HDEAB in this study are as follows: for the unconstrained SMD instances, the population sizes for the upper and lower levels are N = 20 and n = 20, while for the constrained SMD instances, N = 30 and n = 30. The rest of the parameters are: α_stop^u = 10^-6, α_stop^l = 10^-6, λ = 0.9, CR = 0.9 and e = 10^-10. Both approaches provide a 100% success rate in approximating the benchmark solution for every small-scale SMD instance. Compared with the best-known solution of each SMD instance, there are 31 absolute differences and 31 counts of the calls of FEs for the upper level (UL) and lower level (LL) problems, respectively. The median absolute differences (MADs) and median function evaluations (MFEs) are used to measure the computational accuracies and efficiencies at both levels.
The numerical results of the MADs and MFEs at both levels of 12 SMD instances are given in Tables 1 and 2. The results show that the HDEAB performs with much higher computational accuracies and efficiencies than the BLEAQ on each SMD instance. The average MADs at UL and LL given by the HDEAB are 10.72% and 5.33% of those given by the BLEAQ. The average MFEs at UL and LL given by the HDEAB are only 24.8% and 18.6% of those given by the BLEAQ.

Performance on the large-scale SMD instances
To the best of our knowledge, the largest scale of the SMD instances reported is 10 × 10, and the BLEAQ could only successfully solve the first eight unconstrained instances. To investigate the largest scale that can be successfully solved by the HDEAB and BLEAQ, the numbers of decision variables are gradually doubled at both levels (i.e., 5 × 5, 10 × 10 and 20 × 20). The stopping thresholds α_stop^u = 10^-6 and α_stop^l = 10^-6 are the same for both approaches. The computational time is restricted to within 5 hours for each SMD instance, because the computation becomes extremely time-consuming when the scale reaches 20 × 20, which might be unbearable for decision-makers. One has to note, however, that different computational environments will lead to different computational costs. In this study, the iteration is stopped when either the stop criterion or the time restriction is reached.
Tables 3 and 4 give the numerical results of the performance of the two approaches on the same SMD instances. The largest scale for the BLEAQ is 10 × 10, consistent with previous reports. The BLEAQ cannot provide numerical solutions for any SMD instance within 5 hours when the MFEs at the lower level exceed 1E+07. However, the HDEAB successfully obtains numerical results for scales up to 20 × 20, even when the MFEs at the lower level reach 1E+09. No numerical solution is obtained by either the HDEAB or the BLEAQ within 5 hours when the scale of the SMD instance is larger than 20 × 20, because the MFEs at the lower level increase exponentially. The HDEAB also provides higher computational accuracy and efficiency than the BLEAQ. Taking the 10 × 10 SMD instances as an example, the average MADs at UL (LL) given by the HDEAB are 0.73% (4.04%) of those by the BLEAQ, and the average MFEs at UL (LL) given by the HDEAB are 6.2% (14.8%) of those by the BLEAQ.

An application example: optimal venture capital staged-financing contract
In this section, we consider an application example of the BLOP in which the entrepreneur (EN) and the venture capitalist (VC) enter a staged-financing contract.
In this example, the return on investment (ROI) of the start-up is r, a random variable with mean μ and variance σ². The EN is the leader, who decides the amount of the owner's capital to invest (y) and the proportion of revenue shared with the VC's equity investment (c). The VC is the follower, who invests in the EN with a mixture of equity and debt in M stages to mitigate the risk of investment. The VC's decision variables are the investment amount on equity in stage i (x_i,1, which brings the revenue c r x_i,1) and the investment amount on debt in stage i (x_i,2, which brings the revenue r_d x_i,2, where r_d is the interest rate). The VC's total amount of investment is x = Σ_{i=1}^{M} (x_i,1 + x_i,2). The total revenues of the EN and VC (R_EN and R_VC) are given in Equation (16), where d(y) and c(x) are the EN's and VC's opportunity costs, with d'(·) > 0, d''(·) > 0, c'(·) > 0 and c''(·) > 0.

Both the EN's and VC's objectives (F_EN and F_VC) are given in Equation (17), which are classical investment portfolio optimisations. Both players aim to maximise their revenues for given risks:

    F_EN = E[R_EN] - β_EN Var[R_EN]
    F_VC = E[R_VC] - β_VC Var[R_VC]    (17)

where E[·] and Var[·] are the mean and variance operators, and β_EN and β_VC are the players' attitude factors on the revenues and risks. Theoretically, F_EN and F_VC reach their maximal values at the global optimums. When x and y are unconstrained, Equation (17) can be solved analytically through backward induction, while it cannot be solved analytically when x and y are constrained. The scale of Equation (17) is 2M × 2. When the value of M increases, non-convexity arises easily.

To compare the performances of the HDEAB and BLEAQ, we set the parameters as follows: μ = 10, σ = 5, r_d = 0.1, β_EN = β_VC = 0.5, c(x) = x², d(y) = y², and M varies from 5 to 25. To save space, we only conduct numerical studies on the constrained case, where x ∈ [0, 100] and y ∈ [0, 10]. Parameters for the HDEAB are the same as in Section 5.2. The results show that the HDEAB provides higher objective values F_VC and F_EN than the BLEAQ does when the scale increases.
The reason is that the BLEAQ can easily fall into a local optimum due to the non-convexity introduced by a larger value of M and the constraints on x and y.
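As a sanity check on the mean-variance objectives of Equation (17), the snippet below evaluates a single-stage VC objective with the stated parameters. The revenue expression is our own simplified stand-in for Equation (16), with one equity and one debt component and only the equity term random:

```python
MU, SIGMA, RD, BETA = 10.0, 5.0, 0.1, 0.5    # mu, sigma, r_d, beta_VC as stated

def opportunity_cost(x):                     # the VC's opportunity cost c(x) = x^2
    return x ** 2

def F_VC(c, x1, x2):
    """Mean-variance objective for a single-stage VC position (illustrative):
    revenue c*r*x1 + r_d*x2 - c(x); only the equity term c*r*x1 is random."""
    mean = c * MU * x1 + RD * x2 - opportunity_cost(x1 + x2)
    var = (c * x1) ** 2 * SIGMA ** 2         # Var[c*r*x1] = (c*x1)^2 * sigma^2
    return mean - BETA * var

print(F_VC(0.5, 1.0, 0.0))                   # 5.0 - 1.0 - 0.5*6.25 = 0.875
```

With M stages the equity and debt amounts become vectors, the variance couples the equity terms, and the EN's choice of c and y sits on top as the upper level problem, which is exactly the bilevel structure solved numerically above.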

Conclusion
In this study, an efficient approach (HDEAB) was proposed to solve BLOPs. In the HDEAB: (i) the optimal lower level value function mapping method was adopted to provide higher computational accuracy; (ii) the KNN and a nested local search were hybridised to boost the computational efficiency; and (iii) the DEA was utilised as the optimisation engine for both levels. In the numerical studies on the SMD instances, the HDEAB demonstrated higher performance than the BLEAQ on both small- and large-scale SMD instances. Specifically, on the small-scale SMD instances (i.e., 2 × 3), the computational accuracies at the upper and lower levels given by the HDEAB were 9.3 and 18.9 times higher than those by the BLEAQ, while costing only 24.8% and 18.6% of the MFEs required by the BLEAQ. Given the computational environment, the HDEAB could solve SMD instances with scales up to 20 × 20, while the BLEAQ could only solve 10 × 10 SMD instances at most. On the 10 × 10 SMD instances, the HDEAB provided 137.4 and 24.8 times higher computational accuracy at the upper and lower levels than the BLEAQ, while costing only 6.2% and 14.8% of the MFEs required by the BLEAQ. Through an application example of the optimal VC staged-financing contract, the HDEAB outperformed the BLEAQ by providing higher objective values for both the entrepreneur and the venture capitalist.

Disclosure statement
No potential conflict of interest was reported by the authors.