A multi-objective bilevel optimisation evolutionary algorithm with dual populations lower-level search

In multi-objective bilevel optimisation problems, the upper-level performance of different lower-level optimal solutions may be very different, even though they belong to the same lower-level problem. It may lead to poor optimisation results. Therefore, the lower-level search should search lower-level non-dominated solutions that are also non-dominated in the upper-level objective space. In this paper, we use two populations in the lower-level search. The first population maintains non-dominance and diversity in the lower-level objective space and provides the second population with convergence pressure from the lower level. The second population selects the upper-level non-dominated solutions that are not dominated by the first population in the lower-level objective space, which make the second population maintain the non-dominance at both upper and lower levels. Besides, to improve the search efficiency, we set up the upper-level mating pool to generate the upper-level vectors of offsprings near the upper-level vectors of the better individuals in the current population. To balance convergence and diversity, the selection operator of a decomposition based multi-objective evolutionary algorithm is adopted. The proposed algorithm has been evaluated on a set of benchmark problems and a real-world optimisation problem. Experimental results demonstrate that the proposed algorithm is efficient and effective.


Introduction
In multi-objective bilevel optimisation problems, the upper-level performance of different optimal solutions may be very different, even though they belong to the same lower-level problem. The existing multi-objective bilevel optimisation algorithms ignore this point and may only find the lower-level non-dominated solutions with poor upper-level performance in the lower-level search, resulting in a poor quality upper-level Pareto front (PF). To solve this problem, we proposed a dual populations lower-level search to get those lower-level non-dominated solutions with good upper-level performance in this paper. In addition, we generate the upper-level vectors of offsprings near the upper-level vectors of the better individuals in the current population, which can effectively reduce the computational overhead.
In engineering design and business decision-making, there are a lot of applications of multi-objective bilevel optimisation problems (Cai et al., 2018;Chen et al., 2022;Kamal et al., 2019;Kolak et al., 2018;Lin et al., 2010;Lv et al., 2016;Ni et al., 2020;Shaikh et al., 2020;Yanhong et al., 2011;Zhang, 2014). For example, in a manufacturing corporation, the chief executive officer (CEO) aims to maximise the profits and the quality of the products. The CEO sends the design parameters (such as material and appearance), quantity and delivery date of the product to several branch heads working under him. According to these requirements of the CEO, branch heads plan the number of production lines and the scheduling of workers to manufacture one or more components of the final product, maximising their branch profits and the worker satisfaction. Obviously, a CEO's possible decisions must consider the reactions of the heads, which makes the decision-making a multi-objective bilevel optimisation problem by nature and can be described by the following general model of multi-objective bilevel problem: where F(x u , x l ) is the upper-level optimisation task (e.g. the profits and the quality of the products), and f(x u , x l ) is the lower-level optimisation task (e.g. branch profits and the worker satisfaction). The functions g(x u , x l ) and h(x u , x l ) are lower-level constraints, which define the feasible space of the lower-level optimisation task. The decision vector x of the upper-level optimisation task consists of two smaller vectors, i.e. x = [x u , x l ], where x u is the upper-level vector (e.g. the design parameters, quantity and delivery date of the product), which is defined in the upper-level variable space u , while x l is the lower-level vector (e.g. number of production lines and the scheduling of workers), which is defined in the lower-level variable space l . It is noteworthy that the lower-level optimisation task only optimises x l , while x u is fixed as parameters. Hence, the optimal solutions of the lower-level optimisation task can be expressed as functions of x u . Then, the feasible space of the upperlevel optimisation task is defined by the upper-level unequal constraints G(x u , x l ) and the upper-level equal constraint H(x u , x l ) as well as the lower-level multi-objective optimisation. In other words, a feasible solution of the upper-level optimisation task must not only satisfy G(x u , x l ) ≤ 0 and H(x u , x l ) = 0 but also belong to the Pareto optimum solution set (PS) of the lower-level optimisation task. Both the nested structure of bilevel problems and the characteristics of multiple optimum solutions of multi-objective optimisation problems make the multi-objective bilevel optimisation problems computationally demanding and difficult to be solved. In addition, although some representative multi-objective methods of multi-objective evolutionary algorithms (MOEAs) have achieved remarkable results in large-scale multi-objective and multi-objective optimisation problems  as well as feature selection problem (Zhang et al., 2017), due to the lack of consideration of nested structure, these methods may not be effective for multi-objective bilevel problems.
Given the difficulty of solving multi-objective bilevel optimisation problems, there are few related researches. In addition to the design of test problems (Deb & Sinha, 2010) and special multi-objective bilevel problems (Sinha et al., 2016), most of the existing researches focus on methods to reduce the computational overhead (i.e. the number of real function evaluations (FEs), especially the lower-level FEs). These methods can be classified into: (1) Single-level reduction methods The idea of the single-level reduction to transform the multi-objective bilevel problem into a simple single-level problem has achieved some results, but these methods often need to meet some extremely harsh conditions, which are difficult to apply in practical problems. For example, based on the theoretical progress of establishing the relationship between bilevel optimisation and multi-objective optimisation in recent years, Ruuska and Miettinen (2012) transform the multi-objective bilevel optimisation problem into a multi-objective optimisation problem and then solve it, but this method is only applicable to optimistic multi-objective bilevel optimisation problems. Pieume et al. (2013) show how to construct two artificial multi-objective programming problems so that any point which is Pareto efficient solution to the two problems is an efficient solution to the multi-objective bilevel optimisation problem, and some necessary and sufficient conditions are given for the results to be applicable. He and Lv (2017) use Kuhn-Tucker optimality condition to replace the lower-level problem of a class of multi-objective bilevel optimisation problems, and the perturbed Fischer-Burmeister function is used to smooth the complementary condition. On this basis, a particle swarm optimisation algorithm is used to solve the smooth multi-objective programming problem.
(2) Surrogate-model-based methods Surrogate-model-based methods usually adopt an approximated surrogate model to approximate the mapping from the upper-level vector to the optimal solution set of its corresponding lower-level problem. Sinha et al. (2017) introduce an approximated set-valued mapping to obtain approximate lower-level PS for any upper-level vector, greatly reducing the number of evaluations for lower-level tasks.

(3) Population migration methods
Learning from the idea of transfer learning (Cao et al., 2021;Li et al., 2022;Sundar & Sumathy, 2022), population migration methods are to migrate the PS of the adjacent lower-level problem to the population of the current lower-level problem, so as to speed up the search of the current lower-level problem. For example, Deb and Sinha (2010) established an archiving set to store the currently found upperlevel non-dominated individuals. When searching for the lower-level optimal solutions corresponding to a new upper-level vector, the lower-level initialised population is selected from the archive set according to the distance in the upper-level decision space. Said et al. (2020) study the bilevel combinatorial optimisation problem. They reduced the computational cost through multi-population strategy and migration mechanism, and introduced an index in the lower level optimisation to measure the performance of the lower-level non-dominated solution in the upper-level problem.
Although it improves the searching efficiency to some extent using the information of individuals that have been the searched, this method only simply transfers the existing data without summarising, extracting and utilising the knowledge contained in the data, so the effect may not be satisfactory.
Recently, Cai et al. (2022) use an interaction matrix to represent the variable interactions. Based on the interaction matrix, the variables are divided into different groups and are optimised in a collaborative way.

(5) Others
There are also some other methods. For example, Tao et al. (2012), Zhang et al. (2012) and Zhang et al. (2013) adopt the framework of alternately executing fixed upperlevel vector for lower-level optimisation and fixed lower-level vector for upper-level optimisation. It simplifies the optimisation process of the bilevel problem and reduces the computational cost to a certain extent, and also changes the nested structure of the bilevel problem. Therefore, the solution obtained by the algorithm is only a rough approximation of the optimal solution of the bilevel problem, and the accuracy is not good As shown in (Table 1), although the above algorithms can greatly reduce the computational overhead, they only use the lower-level objective value as the basis for environment selection in the lower-level search, and the result is to obtain a set of solutions evenly distributed on the lower-level PF. However, the upper-level performance of different optimal solutions may be very different, even they belong to the same lower-level problem. For example, Figure 1 shows the performance of the PS of a lower-level problem in the lowerlevel objective space and the upper-level objective space, respectively. We can see that only those diamonds remains non-dominant in both the upper-level and lower-level objectives. If the proportion of such points in the lower-level PS is further reduced, the lower-level search of the above algorithm is difficult to obtain these points, which may result in a poor quality upper-level PF. Therefore, the lower-level search should not only ensure the non-dominance of the final individuals in the lower-level objective space but also ensure their non-dominance in the upper-level objective space. In this paper, we propose a multiobjective bilevel optimisation evolutionary algorithm with dual populations lower-level search (MOBEA-DPL), which adopts a dual populations (population SP i and RP i ) lower-level search strategy in the lower-level search. In the early stage of lower-level search, the members of the two populations are consistent and evolve towards lower-level PF. However, in the later stage, the first population continue to select individuals according to the convergence and diversity in the lower-level objective space, providing the second population with convergence pressure from the lower level , while the second population is selected according to the upper-level objective on the premise that they are not dominated by the first population. In this way, the second population not only meet the lower-level optimality constraints but also have a good performance in the upper-level objective space.
In addition, in order to improve the search efficiency, we set up the upper-level mating pool according to the performance of the current generation of individuals in the upperlevel task. Compared with the previous practice of treating all upper-level vectors equally and ignoring the quality differences of sub-populations corresponding to different upperlevel vectors, it is obviously more reasonable and efficient that the upper-level vector with  higher quality sub-population should have more opportunities to produce offspring. However, due to the nested structure of bilevel problems, it is likely to lead to the loss of diversity of the mating pool if the upper-level vector is selected to join the mating pool according to the principle of convergence first, resulting in the population can only search the part of the upper-level PF or even premature convergence. Therefore, we adopt the principle of diversity first in MOEA/D-multi-objective to multi-objective (M2M) proposed by Liu et al. (2014) to improve the search efficiency and maintain diversity at the same time. As shown in Table 2, the contributions of this paper are as follows: (1) We design a dual populations lower-level search strategy. Individuals obtained by means of this strategy not only meets the lower-level optimality constraints but also have good performance in the upper-level objective space.
(2) We select the upper-level mating pool according to the performance of the current generation of individuals in the upper-level objective space using the principle of diversity first, so that the upper-level vector with good performance can obtain more mating opportunities. Then, the upper-level vectors of offsprings are generated near the upper-level vectors of the better individuals in the current population, which greatly improves the search efficiency and maintain diversity at the same time. The remainder of this paper is organised as follows. In Section 2, we describe the proposed MOBEA-DPL in detail. Experimental settings and the results on test problems with varying complexities and a real-world optimisation problem are presented in Section 3. Finally, conclusions are drawn in Section 4.

Proposed algorithm
The overall framework of the proposed MOBEA-DPL is presented in Algorithm 1. First, a set of upper-level vectors Ar u of size n u are randomly initialised. Next, the dual populations lower-level search is executed to optimise the lower-level multi-objective task for all x u,i in Ar u . Then, the upper-level population Pop are selected from lower-level sub-population SP through upper-level environment selection. Finally, offspring reproduction, upper-level environment selection and refine search are repeated until the termination criteria is met.

Dual populations lower-level search
In the initialisation and offspring reproduction operations, for each upper-level vector, the corresponding optimal lower-level vectors would be obtained through lower-level search. It should be noted that not all lower-level non-dominated solutions can remain non-dominated in the upper-level objective space. In the upper-level objective space, the performance of different optimal solutions may be very different, even they belong to the same lower-level problem. Thus, the lower-level search should not only ensure the non-dominance of the final individuals in the lower-level objective space but also ensure their non-dominance in the upper-level objective space. Therefore, we propose a dual populations lower-level search in this paper, which is shown in Algorithm 2.

Algorithm 2 DP-LLsearch
Sort the members in r using non-dominated sorting approach according to their lower-level objectives 7: RP i ← r /*Select the best n l individuals from r using reference-point-based non-dominated sorting approach according to their lower-level objective value*/ First, a set of lower-level vectors SP i of size n l are randomly initialised and RP i has the same members, which is shown as Figure 2). Second, the offspring (SQ) of SP i are reproduced through differential evolution (DE) operator (Song & Li, 2022;Zhao et al., 2021). Then, SP i , SQ and RP i are merged into r and all members in r are sorted using non-dominated sorting approach according to their lower-level objectives. Next, RP i is updated by the best n l individuals from r using reference-point-based non-dominated sorting approach (Jain & Deb, 2014) according to their lower-level objective value, which is followed by the update of SP i . If the number of lower-level non-dominated individuals in r is no more than n l ( Figure  2)), SP i is updated by RP i ; otherwise, SP i is updated by the best n l individuals from lowerlevel non-dominated individuals in r using reference-point-based non-dominated sorting approach according to their upper-level objective value. Finally, offspring reproduction and the update of RP i and SP i are repeated until the termination criteria are met .
It is worth noting that the proposed dual populations lower-level search can adaptively select the upper-level objectives or the lower-level objectives for optimisation. In the early stage of lower-level search, the number of lower-level non-dominated individuals in r is no more than n l , and the search focuses on optimising the lower-level objectives. The members of the two populations are consistent and evolve towards lower-level PF. As the search progresses, the number of lower-level non-dominated individuals increased gradually. Once the number of lower-level non-dominated individuals exceeds n l , the evolutionary direction of the two populations will differ. RP i continue to select individuals according to the convergence and diversity in the lower-level objective space, providing SP i with convergence pressure from the lower level, while SP i is selected according to the upper-level objectives on the premise that it is not dominated by RP i . In this way, RP i is a group of points evenly distributed on the lower-level PF, while SP i not only meet the lower-level optimality constraints but also have a good performance in the upper-level objectives space.

Upper-level environment selection
The purpose of the upper-level environment selection proposed in this paper is to update the upper-level population Pop, lower-level populations Archive and the lower-level reference points set RP. It is worth noting that Pop are the individuals with the best upper-level performance. In the next section, we will generate the upper-level vectors of offsprings around the upper-level vectors of Pop to speed up convergence and reduce computational overhead. However, it may cause the population to fall into local optimisation. As shown in Figure 3, there is a upper-level vector of a local optimal solution in the upper-level variable space and the region dominated by the local optimal solution is quite large. If the selection is based on the principle of convergence first, the upper-level vectors of Pop will converge to the local optimal upper-level vector with a high probability. If we continue to generate the upper-level vectors of offsprings near the upper-level vector of Pop at this time, individuals in Archive will completely fall into local optimisation. Therefore, we should select individuals to consist Pop based on the principle of diversity first, keeping the diversity of upper-level vectors of Pop and Archive as much as possible to avoid falling into local optimisation. The selection operator in MOEA/D-M2M is exactly the selection operator with diversity first. MOEA/D-M2M is a variant of MOEA/D (Zhang & Li, 2007). Different from MOEA/D's method of transforming a multi-objective optimisation problem into multiple single objective optimisation problems, MOEA/D-M2M transforms a multi-objective optimisation problem into multiple multi-objective optimisation problems. Specifically, MOEA/D-M2M divides the candidate solutions into multiple sub-populations according to the diversity of them in the objective space, and then selects individuals in each sub-population according to convergence and diversity to form the next generation population. Therefore, the next generation population has good convergence while retaining excellent diversity. Such solutions are suitable for us to accelerate convergence and avoid the search falling into local optimisation.
The details of upper-level environment selection are shown in Algorithm 3. First of all, k reference weight vectors W = {ω 1 , . . . , ω k } are uniformly generated, and the size of sub-population (s) is set. Next, the best n u individuals are selected from R to update Pop according to the selection operator of MOEA/D-M2M, and the upper-level vectors of individuals in Pop are grouped and stored in Pu = {Pu 1 , . . . , Pu k } according to their matching relationship with W. Then, the individuals in R that have the same upper-level vectors as the individuals in Pop are eliminated. After that, we start updating the Archive. In the beginning, an empty set Ar u is created to record the upper-level vectors corresponding to individuals that will be stored in the updated Archive. Then, for each reference weight vector ω i , do the following to fill n u = k * s different upper-level vectors into Ar u .
First, the set Pu i of the upper-level vectors of the individuals associated with ω i in Pop is assigned to Au i . Next, duplicate upper-level vectors and the same upper-level vectors from Au i as in Ar u are deleted. When the number of upper-level vectors in Au i is less than s, we need to supplement Au i . If there are individuals associated with ω i in R, the optimal individualx is selected according to the upper-level objective value; otherwise,x is picked out from R randomly. The upper-level vector ofx is added into Au i , and individuals with the same upper-level vector asx in R are deleted. After supplementing Au i , Ar u merges Au i .
After Au i is filled with n u upper-level vectors, Archive is updated with individuals in Archive and SQ that have the same upper-level vectors as Ar u , and RP is updated with individuals in RP and RQ that have the same upper-level vectors as Ar u . So far, we have updated Archive and RP, and obtained mating pools Pu = {Pu 1 , . . . , Pu k } and Au = {Au 1 , . . . , Au k }.

Offspring reproduction
After obtaining mating pools Pu and Au, offspring reproduction is executed, which is shown in Algorithm 4. For each Pu i , each upper-level vector (Pu for j = 1 : s do 3: end for 6: end for

Refine search
To ensure that each individual in Pop satisfies the lower-level optimality constraint, refine search is executed for upper-level vectors in Pop before the end of each round of upper-level search. It is worth noting that the proposed refine search is consistent with the dual populations lower-level search in design concept. While searching for better lowerlevel solutions, it prefers individuals that perform better in the upper-level objective space. The details of refine search are shown in Algorithm 5. First, the upper-level vectors corresponding to the individuals in Pop are extracted without repetition as the upper-level vectors that need refine search, which are recorded as X rs u . For each upper-level vector X rs u,i in X rs u , each corresponding individual in Pop and any two corresponding individuals in Archive generate a new lower-level offspring off i through DE operator. Next, off i and all corresponding individuals in Archive and RP are merged into r. Then, the best n l individuals are picked out from r using reference-point-based nondominated sorting approach according to their lower-level objective value and denoted as rp. After that, individuals corresponding to X rs u,i in RP are replaced by rp. If the number of lower-level non-dominated individuals in r is no less than n l , the lower-level dominated members are eliminated from r; otherwise, r is updated by rp. Subsequently, individuals corresponding to X rs u,i in Archive are replaced by r. After all vectors in X rs u have completed the above operations, upper-level environment selection is executed to update Pop, Archive, Pu and Au. It should be noted that the number of individuals corresponding to an upper vector in Archive may exceed n l , which needs to be reduced. For each upper-level vector with more than n l corresponding individuals in Archive, its corresponding individuals in Archive are sorted using reference-point-based non-dominated sorting approach according to their upper-level objective value. The best n l individuals are retained, while the others are eliminated.

Termination criteria
We adopt a hypervolume-based termination criterion proposed by Sinha et al. (2017) at both levels. At the lower-level, the maximum (H max l ) and minimum (H min l ) hypervolume are calculated according to the lower-level objective value of RP i in every generation from the previous τ l generations (with the worst encountered lower-level objective value in the if |r| non ≥ n l then 9: Eliminate the lower-level dominated members from r. if The number of individuals corresponding to X ls u,i in Archive is more than n l then 18: Individuals corresponding to X rs u,i in Archive are sorted using reference-pointbased non-dominated sorting approach according to their upper-level objective value. The best n l individuals are retained, while the others are eliminated.
If H l ≤ l , then the lower-level optimisation task is terminated indicating adequate convergence. A similar measure is used for the upper-level termination as well, except that the upper-level objective value of Pop are used to compute the hypervolumes. We have used τ l = 10, τ u = 10, l = 0.001 and u = 0.001 in our study.

Experimental results and discussion
To empirically examine the performance of the proposed MOBEA-DPL, we evaluate MOBEA-DPL on the DS test problems proposed by Deb and Sinha (2009).
There are five test problems in the DS series. In DS1, DS2 and DS3, the dimension of both upper and lower-level problems can be increased by increasing the parameter K. The parameter γ can be adjusted to cause a small proportion of lower-level Pareto-optimal points to be responsible for the upper-level Pareto-optimal front. Besides, conflict between two levels of optimisation can be introduced by choosing τ = −1. Moreover, the lowerlevel problem of DS1 is difficult to be solved to Pareto-optimality due to multi-modalities, and it can make an algorithm difficult to locate the true Pareto-optimal points of DS1 by adjusting parameter α. While the upper-level problem of DS2 has multi-modalities, thereby causing an algorithm difficulty in finding the upper-level Pareto-optimal front. DS3 is the most challenging of these three with following properties: first, the Paretooptimal fronts for both the lower and upper levels lie on constraint boundaries, thereby requiring good constraint handling strategies to solve both problems optimally. Second, not all lower-level Pareto-optimal solutions qualify as upper-level Pareto-optimal solutions. Third, every lower-level front has an unequal contribution to the upper-level Pareto-optimal front. When it comes to DS4 and DS5, the problem complexity in converging to the appropriate lower and upper-level fronts can be increased by increasing K and L. Only one Pareto-optimal point from each participating lower-level problem qualifies to be on the upper-level front in DS4. DS5 has similar difficulties as in DS4, except that only a finite number of x u qualifies at the upper-level PF and that a consecutive set of lower-level Pareto-optimal solutions now qualify to be on the upper-level PF (Deb & Sinha, 2009).
In the remainder of this section, we first present a brief introduction to the experimental settings of all the compared algorithms. Then, the results from 21 runs of each algorithm for each test problem are presented. A comparative study of the computational expense required by different approaches is also provided.

Experimental settings
(1) Genetic operators: In this paper, the DE (Vesterstrom & Thomsen, 2004;Vishwakarma & Sisaudia, 2020) is adopted in all algorithms in the experiments for offspring generation. The parameters of DE in all algorithms in the experiments are default values, which are set to C r = 1 and F = 0.5, while the expectation of number of bits doing mutation is set to proM = 1 and the distribution index of polynomial mutation is set to disM = 20. Such parameter setting is also adopted by Cai et al. (2022).
(2) Population size: In all algorithms in the behaviour study of the MOBEA-DPL, the upperlevel population size is set to n u = 50, while the lower-level population size is set to n l = 50. The number of reference weight vectors in upper-level environment selection is set to k = 10, and the size of sub-population is set to s = n u /k = 5. While in the comparisons of MOBEA-DPL with other approaches, n u is set to 20 for DS1-DS3, and 5 for DS4 and DS5. n l is set to 20 for DS1-DS3, and 40 for DS4 and DS5. These are the same as the settings in BLMOCC (Cai et al., 2022). k is set to 5.
(3) The test problems: The parameters setting of the test problem is shown in Table 3.

Behaviour study of the MOBEA-DPL
In this section, there are three comparison algorithms. The first one, denoted as MOBEA-nl, considers the quality differences of sub-populations corresponding to different upper-level vectors as MOBEA-DPL, but loses sight of the fact that the lower-level non-dominated solutions may not remain non-dominated in the upper-level objective space, which uses an evolutionary many-objective optimization algorithm usingreference-point based nondominated sorting approach (NSGAIII) proposed by Jain and Deb (2014) Table 4 presents the comparison results of the four algorithms in inverted generational distance (IGD) on problems proposed by Deb and Sinha, 2009 (the DS series), including the results of Wilcoxon rank-sum test calculated at a significance level of α = 0.05, where "+", " = " and "−" indicate that the compared algorithm is better than, similar to and inferior to the proposed MOBEA-DPL, respectively. We can see that several comparison algorithms have no great difference in the quality of upper-level PF in the first three problems. This is because for DS1 and DS2, the entire lower-level PS can remain non-dominated in the upper-level objective space. Although not all lower-level Pareto-optimal solutions qualify as upper-level Pareto-optimal solutions (Deb & Sinha, 2009), the proportion of qualified ones is still relatively large, which is not enough to make a significant difference in the results. Therefore, MOBEA-nl and MOBEA-nrl using the usual lower-level search strategy can also obtain the lower-level most of non-dominated solutions with good upper-level performance. However, when it comes to DS4 and DS5, the algorithms with dual populations lower-level search strategy (i.e. MOBEA-DPL and MOBEA-nr) have obvious advantages. We uniformly take five points on the upper-level decision variable space (i.e. x u ∈ [1, 2]) of the two problems, which are x 1 u = 1, x 2 u = 1.25, x 3 u = 1.5, x 4 u = 1.75 and x 5 u = 2. Then, we use the two different lower-level search strategies used in the comparison algorithms Note: The symbols "+", " = " and "−" denote that the compared algorithm is better than, similar to and inferior to the proposed MOBEA-DPL, respectively, under the Wilcoxon's rank sum test with a 0.05 significance level.

Comparison results in IGD
to solve the lower-level tasks corresponding to these five upper-level variables, and the results are shown in Figures 4 and 5. The left column subgraphs are the distributions of the solutions in the lower-level objective space, while the right column subgraphs are the distributions of the solutions in the upper-level objective space. We can see that although they are points on the same lower-level PF, their performance in the upper-level objective space are very different. NSGAIII in the experiment is the same as the lower-level search strategy of the existing algorithms, which only uses the lower-level objectives as the basis for the lower-level environment selection. They can find the solutions in the lower-level PS, but they cannot further screen the solutions in the lower-level PS. However, the proposed dual population lower-level search strategy not only searches the lower-level PS but also converges to those solutions with good upper-level performance in the lower-level PS. Therefore, with dual populations lower-level search strategy, MOBEA-DPL and MOBEA-nr obtain significantly better results in IGD, which proves the effectiveness of the proposed dual populations lower-level search. Table 5 shows the minimum, maximum and average number of FE required for the upper and lower-level objectives from 21 runs of the compared algorithms. At the same time, Table 5 also gives the ratio of average FEs, where the numerator is the average FEs of the comparison algorithm and the denominator is the average FEs of MOBEA-DPL. Comparing MOBEA-DPL and MOBEA-nl, MOBEA-nr and MOBEA-nrl, respectively, we can see that the proposed dual population lower-level search not only increases the upperlevel (UL) FEs on all the test problems but also increases the lower-level (LL) FEs on the test problems (i.e. DS1, DS2 and DS3) where all or most lower-level Pareto-optimal solutions qualify as upper-level Pareto-optimal solutions. As for the test problems (i.e. DS4 and DS5) where only a smaller part of lower-level Pareto-optimal solutions qualify as upperlevel Pareto-optimal solutions, the proposed dual population lower-level search can reduce their lower-level FEs. The growth rate of upper-level FEs on all the test problems is between 1.47 and 12.7. In bilevel optimisation algorithms, the lower-level FEs are often more than one order of magnitude than the upper-level FEs, so the computational overhead is mainly to count the lower-level FEs. Therefore, the increase of upper-level FEs is acceptable. On DS1, DS2 and DS3, there are obvious differences in lower-level FEs. Due to the proposed dual population lower-level search, the lower-level FEs have increased slightly on DS1 and DS3, but more than doubled on DS2. This is because the upper-level task and lower-level task of DS2 are conflict (i.e. τ = −1), which means that when dual population lower-level search selects individuals according to their upper-level objective value to form the next  generation population, solutions far from the lower-level PF among the lower-level nondominated solutions are preferred. Therefore, it leads to slower convergence and more lower-level FEs. When it comes to DS4 and DS5, the proposed dual population lowerlevel search can effectively find the upper-level Pareto-optimal solutions in the lower-level Pareto-optimal solutions, which makes each upper-level vector obtain the correct evaluation and avoid the search misled by the wrong evaluation, so MOBEA-DPL and MOBEA-nr have less lower-level FEs. However, only one Pareto-optimal point from each participating lower-level problem qualifies to be on the upper-level PF in DS4 (Deb & Sinha, 2009), which makes DS4 a more difficult problem than DS5. Without dual population lower-level search, MOBEA-nl and MOBEA-nrl are difficult to find better solutions on DS4, so premature convergence take place. Therefore, the reduction of lower-level FEs on DS4 is much less than that on DS5.

Comparison results in FEs
Comparing MOBEA-DPL and MOBEA-nr, MOBEA-nl and MOBEA-nrl, respectively, we can see that the proposed upper-level vector generation strategy has no effect on those problems with low-dimension upper-level decision space (i.e. DS4 and DS5, whose upper-level decision space dimension is only 1), because the upper-level vectors are easy to converge quickly in such a low-dimensional space. When it comes to problems with high-dimension upper-level decision space (i.e. DS1, DS2 and DS3, whose upper-level decision space dimension is 5), the situation is a little complicated. On DS1 and DS2, algorithms with the proposed upper-level vector generation strategy (i.e. MOBEA-DPL and MOBEA-nl) achieve smaller  upper-level FEs and smaller lower-level FEs with a decrease of between 19% and 30%. However, MOBEA-DPL achieved 10% less upper-level FEs and 11% less lower-level FEs than MOBEA-nr on DS3, MOBEA-nl achieved 3% more upper-level FEs and 4% more lower-level FEs than MOBEA-nrl on DS3. That is because not all lower-level Pareto-optimal solutions qualify as upper-level Pareto-optimal solutions in DS3. With the proposed dual population lower-level search, MOBEA-nr can continuously improve the quality of the upper-level PF by converging the population to those qualified lower-level Pareto-optimal solutions. Therefore, even if the probability of generating a promising upper-level vector is low, MOBEA-nr can avoid premature convergence on DS3. However, with a low probability of generating a promising upper-level vector and without the proposed dual population lower-level search, MOBEA-nrl inevitably converges prematurely on DS3. With the proposed upperlevel vector generation strategy, MOBEA-nl is more likely to generate promising upper-level vectors and continuously improve the upper-level PF, so MOBEA-nl also avoids premature convergence on DS3. MOBEA-nrl has obtained the worst IGD and the least FEs on DS3, which provide evidence for this claim.
Overall, the experimental results demonstrate that the proposed dual population lowerlevel search can effectively improve the quality of upper-level PF of problem where not all lower-level Pareto-optimal solutions qualify as upper-level Pareto-optimal solutions and greatly increased the computational overhead for problems with conflicting upper and lower-level objectives. In order to obtain a better solution, the overhead of computing resources is acceptable. In addition, the experimental results also show that the proposed upper-level vector generation strategy can effectively reduce the computational overhead for problems with high-dimension upper-level decision space. Thus, the proposed MOBEA-DPL is efficient and effective.

The comparisons of MOBEA-DPL with other approaches
In this section, we conduct experiments to compare the performance of MOBEA-DPL with a nested strategy (NS), a state-of-the-art hybrid bilevel evolutionary multi-objective optimisation (H-BLEMO) algorithm (Deb & Sinha, 2010) and a state-of-the-art cooperative coevolution with knowledge-based dynamic variable decomposition optimisation algorithm (BLMOCC) (Cai et al., 2022). The settings of test problems and comparison algorithms are the same as those in BLMOCC (Cai et al., 2022). It should be noted that we do not have the original codes of H-BLEMO and BLMOCC. We do not think we can reproduce their algorithms perfectly according to their papers. Therefore, we extract the data of IGD and FEs in the original BLMOCC. For IGD, 1000 reference points are uniformly sampled from the true PFs on all the problems, which is the same as that in BLMOCC. Similarly, the scale of the nondominated solutions used to calculate IGD is also consistent with that in BLMOCC. Thus, it can be reflected for a fair comparison.  Table 7.
We can see that the proposed MOBEA-DPL achieves the best IGD on all problems. The reason is that the proposed dual population lower-level search strategy and the refine

Application on a real-world optimisation problem
In this section, MOBEA-DPL is applied to the real-world decision-making problem we mentioned in the introduction, which is about the CEO making decisions considering the reactions of branch heads. A fuzzy version of this problem is addressed by , and a deterministic version of this problem can be mathematically formulated as follows: max x u ∈R 2 x l ∈R 3 F(x u , x l ) = x u,1 + 9x u,2 + 10x l,1 + x l,2 + 3x l,3 9x u,1 + 2x u,2 + 2x l,1 + 7x l,2 + 4x l,3 s.t. x l ∈ arg max where x u = (x u,1 , x u,2 ) ∈ R 2 is the CEO's decision variables, and x l = (x l,1 , x l,2 , x l,3 ) ∈ R 3 are the branch heads' decision variables. The CEO aims to maximise the net profits and the quality of products produced in the company, while the objectives of the branch heads are to maximise both the worker satisfaction and their branch profits. They are all subject to constraints such as the requirements of material, marking cost, labour cost and working hours.  It is not possible to solve this problem analytically.  solved this problem by converting the two objectives into a single objective with the weighted sum approach for obtaining one optimal solution. Figure 6 shows the final non-dominated set obtained by MOBEA-DPL on this problem. The single optimal solution obtained by  has also been presented in Figure 6 (marked in triangle). It can be observed that MOBEA-DPL is able to well-approximate the whole PF, including the optimal solution obtained by . The results further verify the effectiveness of MOBEA-DPL for a realworld multi-objective bilevel optimisation problem.

Conclusion
Multi-objective bilevel optimisation problems are difficult optimisation problems with two levels of multi-objective optimisation tasks. Both the nested structure of bilevel problems and the characteristics of multiple optimum solutions of multi-objective optimisation problems make the multi-objective bilevel optimisation problems computationally demanding and difficult to be solved.
Most of the existing researches focus on methods to reduce the computational overhead. Although the above algorithms can greatly reduce the computational overhead, they only use the lower-level objective value as the basis for environment selection in the lower-level search, and the result is to obtain a set of solutions evenly distributed on the lower-level PF. However, the upper-level performance of different optimal solutions may be very different, even they belong to the same lower-level problem. Not all lower-level Pareto-optimal solutions qualify as upper-level Pareto-optimal solutions. If the proportion of such points in the lower-level PS is further reduced, the lower-level search of the above algorithm is difficult to obtain these points, which may result in a poor quality upper-level PF. Therefore, we propose a MOBEA-DPL, which use two populations (RP i and SP i ) in the lower-level search. RP i always maintains non-dominance and diversity in the lower-level objective space throughout the lower-level search process, which provides SP i with convergence pressure from the lower level. SP i is consistent with RP i in the early stage of lower-level search. But in the later stage, SP i searches for the upper-level non-dominated solutions that is not dominated by RP i in the lower-level objective space. In this way, SP i not only meets the lower-level optimality constraints but also have a good performance in the upper-level objective space.
In addition, to improve the search efficiency, we set up the upper-level mating pool according to the performance of the current generation of individuals in the upper-level task. Compared with the previous practice of treating all upper-level vectors equally and ignoring the quality differences of sub-populations corresponding to different upper-level vectors, it is obviously more reasonable and efficient that the upper-level vector with higher quality sub-population should have more opportunities to produce offspring.
The proposed algorithm has been evaluated on five test problems with different kinds of complexities. The experimental results demonstrate that the proposed dual population lower-level search can effectively improve the quality of upper-level PF of problem where not all lower-level Pareto-optimal solutions qualify as upper-level Pareto-optimal solutions and greatly increased the computational overhead for problems with conflicting upper and lower-level objectives. In order to obtain a better solution, the overhead of computing resources is acceptable. In addition, the experimental results also show that the proposed upper-level vector generation strategy can effectively reduce the computational overhead for problems with high-dimension upper-level decision space. Thus, the proposed MOBEA-DPL is efficient and effective. Meanwhile, MOBEA-DPL is also applied to the real-world decision-making problem and successfully obtained the whole PF, including the optimal solution obtained by . It further verifies the effectiveness of MOBEA-DPL for a real-world multi-objective bilevel optimisation problem.
However, the FEs required for the upper and lower-level objectives of the proposed MOBEA-DPL are still large. In general, the dual population lower-level search strategy and upper-level vector generation strategy proposed in this paper are complementary to the existing researches. Combined with the existing computational overhead reduction methods, MOBEA-DPL can obtain better quality upper-level PF and consume less computing resources. In addition, due to the existence of extreme points and the reference-pointbased non-dominated sorting approach used in the upper-level environment selection, the proposed MOBEA-DPL is difficult to obtain uniformly distributed points on the upper-level PF of problems DS4 and DS5. These problems need to be solved in the next research.

Disclosure statement
No potential conflict of interest was reported by the authors.

Funding
This work was supported in part by the National Natural Science Foundation of China (62172110), in part by the Natural Science Foundation of Guangdong Province (2022A1515010130), and in part by the Programme of Science and Technology of Guangdong Province (2021A0505110004).