A mapping-based constraint-handling technique for evolutionary algorithms with its applications to portfolio optimization problems

A novel Constraint-Handling Technique (CHT) for Evolutionary Algorithms (EAs) applied to constrained optimization problems is proposed. It is assumed that the feasible region of the constrained optimization problem is defined by a convex-hull of multiple vertices. On the other hand, without loss of generality, the search space of EA is given by a hyper-cube. The proposed CHT called Convex-Hull Mapping (CHM) transforms the real vector in the search space of EA into the solution in the feasible region. It is also proven that CHM performs a surjective mapping from the search space of EA to the feasible region. Although the proposed CHM can be applied to any EAs, one of the latest EAs, or Adaptive Differential Evolution (ADE), is used in this paper. By using ADE, CHM is compared with conventional CHTs in a real-world optimization problem in the field of finance, namely the portfolio optimization problem. Portfolio optimization is the process of determining the best proportion of investment in different assets according to some objective. Specifically, to reveal the characteristic of CHM depending on the number of the above vertices, three different formulations of the portfolio optimization problem are employed to evaluate the performance of ADE using CHM. Numerical experiments show that CHM is better than conventional CHTs in most cases. Moreover, the hybrid method combining CHM with a conventional CHT outperforms the original CHT.


Introduction
Various Evolutionary Algorithms (EAs) have been proposed as practical numerical optimization methods. However, most EAs were originally designed for solving unconstrained optimization problems, whereas real-world optimization problems are formulated as constrained ones in many cases. Thus Constraint-Handling Techniques (CHTs) are essential for applying EAs effectively to real-world optimization problems.
During the last few decades, a number of CHTs have been proposed for EAs [1]. Those CHTs can be categorized into four classes: (1) penalty functions, (2) mapping-based techniques, (3) special operators, and (4) separation of objective function and constraints.
Except for mapping-based techniques, CHTs modify the procedures of original EAs to deal with constraints. Without loss of generality, the search space of EA is given by a hyper-cube. Mapping-based techniques, which are also called decoders [2], transform real vectors in the search space of EA into feasible solutions of the constrained optimization problem. The largest advantage of mapping-based CHTs is that original EAs can be applied directly to the constrained optimization problem without changing either the problem formulation or the procedures of EAs. On the other hand, it is difficult to provide a simple and generic mapping-based CHT, as discussed below.
This paper proposes a novel mapping-based CHT called Convex-Hull Mapping (CHM) for EAs applied to constrained optimization problems. Let X ⊆ ℝ^n be the feasible region of the constrained optimization problem. We assume that the feasible region X ⊆ ℝ^n is given by a convex hull of multiple vertices. Therefore, the feasible region is a convex polyhedron. As stated above, the search space of EA is defined by a hyper-cube S = [0, 1]^q. Thereby, the proposed CHM provides a mapping τ : S → X. Figure 1 shows an image of the proposed CHM. The procedure of CHM is very simple. Furthermore, it is proven that CHM performs a surjective mapping from the search space S ⊆ ℝ^q of EA to the feasible region X ⊆ ℝ^n. On the other hand, not all constrained optimization problems have a feasible region defined by a convex polyhedron. Besides, it is hard to obtain all vertices of the convex polyhedron in some cases. Therefore, a real-world optimization problem, namely the portfolio optimization problem, is used to demonstrate the usefulness of the proposed CHM.
The proposed CHM can be used with any EA. Therefore, to evaluate the effectiveness of CHM, we employed the basic Differential Evolution (DE) as an instance of EA in our previous paper [3]. DE has been shown to be a simple yet powerful EA [4][5][6]. However, the performance of DE depends on the setting of control parameters, namely the scale factor and the crossover rate. In fact, as the No Free Lunch Theorem [7] suggests, we cannot expect any EA to perform best on all problems. To change the values of the control parameters of DE adaptively, many Adaptive DE (ADE) variants have been proposed [8,9]. Among them, JADE has been reported to be one of the most powerful ADE variants [10]. To eliminate the influence of differences in the search performance of EAs and to evaluate the effectiveness of CHM as fairly as possible, we employ JADE [8] as a state-of-the-art EA.
This paper is an extended version of the paper presented in the SICE Annual Conference 2021 [3] and differs from the paper in the following three points: (1) In addition to the basic DE, the latest ADE is used as an example of EAs. (2) The performance of CHM is compared with conventional CHTs. (3) Three different formulations are used for the portfolio optimization problem.
As we will discuss later, a variety of problem formulations using various constraints are reported for the portfolio optimization problem. Each of the three formulations in this paper has constraints suitable for evaluating the characteristics of CHM. Besides, these constraints are very common in many problem formulations of the portfolio optimization problem. Experimental results also show that CHM enhances the performance of EA more than conventional CHTs do in many cases.
The remainder of this paper is organized as follows. Section 2 describes the existing works related to mapping-based CHTs and portfolio optimization problems. Section 3 formulates the constrained optimization problem in which the feasible region is defined by a convex hull of vertices. Section 3 also describes the procedure of CHM. Section 4 explains the basic DE and the latest ADE called JADE. From Section 5 to Section 7, three different formulations of the portfolio optimization problem are described respectively. Then the performance of CHM is compared with conventional CHTs through the three types of portfolio optimization problems. Section 8 discusses the characteristic of CHM in terms of constraints for the three portfolio optimization problems. Finally, Section 9 concludes this paper and provides future work.

Constraint-Handling technique
Constrained optimization problems are usually formulated as
$$ \begin{aligned} &\text{minimize} && f(x) \\ &\text{subject to} && g_i(x) \le 0, \quad i = 1, \ldots, m, \\ & && \underline{x}_j \le x_j \le \overline{x}_j, \quad j = 1, \ldots, n \end{aligned} \tag{1} $$
where x = (x_1, ..., x_n) ∈ ℝ^n is the vector of decision variables. The number of decision variables is n. The objective function and constraints can be linear or nonlinear. Equality constraints h_i(x) = 0 are usually transformed into inequality ones such as g_i(x) = |h_i(x)| − ε ≤ 0 by using a very small value ε > 0. Even though many EAs have been proposed, the search space S ⊆ ℝ^n of EA is usually restricted to a hyper-cube such as S = [0, 1]^n. Therefore, EAs cannot deal with the constraints of the constrained optimization problem in (1) except the box-constraint x̲_j ≤ x_j ≤ x̄_j, j = 1, ..., n. The box-constraint can be eliminated easily by using a simple transformation from z ∈ S to x ∈ ℝ^n as follows:
$$ x_j = (\overline{x}_j - \underline{x}_j)\, z_j + \underline{x}_j, \quad j = 1, \ldots, n \tag{2} $$
where z = (z_1, ..., z_n) ∈ S and 0 ≤ z_j ≤ 1, j = 1, ..., n.
To apply EAs to constrained optimization problems, many Constraint-Handling Techniques (CHTs) have been proposed for EAs [1]. Let X ⊆ ℝ^n be the feasible region of a constrained optimization problem. Thus every x ∈ X satisfies all constraints. Many CHTs consider a search space S ⊆ ℝ^n that contains the feasible region such that X ⊆ S. Specifically, the search space of EA is defined as S = [x̲_1, x̄_1] × ... × [x̲_n, x̄_n] by using the transformation in (2). Therefore, original EAs need to be modified to evaluate infeasible solutions x ∉ X existing in the search space S ⊆ ℝ^n.
An EA called GENOCOP [11] has been proposed for optimization problems in which the feasible regions X ⊆ ℝ^n are convex. GENOCOP uses a number of special operators to generate a new feasible solution from existing feasible ones. For example, by using a random value α ∈ (0, 1), the arithmetical crossover of GENOCOP generates a feasible solution x ∈ X from two feasible ones x_1 ∈ X and x_2 ∈ X as follows:
$$ x = \alpha\, x_1 + (1 - \alpha)\, x_2 \tag{3} $$
GENOCOP is not a generic CHT but a kind of EA for solving convex-constrained optimization problems because the types of available operators are limited.
Several mapping-based CHTs have been proposed to provide a mapping from the search space S ⊆ ℝ^n of EA to the feasible region X ⊆ ℝ^n. In Koziel's decoder [2], the search space of EA is given as S = [−1, 1]^n ⊆ ℝ^n. Then the homomorphous mapping [2] provides τ : S → X by using a reference point x_0 ∈ X. Specifically, the vector y ∈ S is transformed into the feasible solution x ∈ X as
$$ x = x_0 + \beta\, y \tag{4} $$
where the value of the scale factor β ∈ ℝ, β ≥ 0 is decided for each y ∈ S. For the homomorphous mapping in (4), the reference point x_0 ∈ X has to be chosen properly because the performance of EA depends on the selection of x_0 ∈ X. Besides, if the boundary of the feasible region X ⊆ ℝ^n is unknown, a binary search is needed to decide the value of the scale factor β ∈ ℝ for each y ∈ S = [−1, 1]^n. The homomorphous mapping has also been extended to deal with non-convex feasible regions [2,12]. However, its actual implementation is far from trivial and involves a high computational cost when it is used for practical optimization problems.
Mapping-based CHTs using grid generation and circle packing have been reported, respectively, to approximate the Riemann mapping from the search space S ⊆ ℝ^n of EA to the non-convex feasible region X ⊆ ℝ^n. However, these approximated Riemann mappings τ : S → X are demonstrated only in the case of n = 2 [13].
A Support Vector Machine (SVM) has been used to provide a mapping τ : S → X [14]. The SVM has been trained in advance to discriminate feasible solutions from infeasible ones in a higher-dimensional feature space H. Then infeasible solutions are converted into feasible ones in the feature space. Thus the mapping τ : S → X is defined as τ = ϕ^{−1} ∘ ϕ where ϕ : S → H and ϕ^{−1} : H → X. However, it is non-trivial to prepare a good training dataset for the SVM. Besides, it is hard to realize the inverse mapping ϕ^{−1} : H → X. Consequently, the SVM is not a generic CHT.
A Deep Neural Network (DNN) has also been used to provide a mapping τ : S → X where the feasible region X ⊆ ℝ^n is given by a set of non-convex areas [15]. To prepare a good training dataset for the DNN, a number of machine learning techniques are employed. Nevertheless, it is hard to provide a surjective mapping by the DNN. Thus the DNN-mapping needs to be used in combination with other CHTs.
In the portfolio replication problem, Orito et al. [16,17] have proposed mapping-based CHTs. The search space S ⊆ ℝ^q of EA is given by a hyper-cube, while the feasible region X ⊆ ℝ^n is a hyper-plane. Even though the CHTs can be used to eliminate only one equality constraint, they are effective for large-scale optimization problems. That is because the dimensionality of the feasible region X ⊆ ℝ^n is reduced in the corresponding search space S ⊆ ℝ^q such that n > q. On the other hand, since each of the mappings ϕ : S → X is defined by the product of multiple trigonometric functions, they cannot generate solutions x ∈ X uniformly on the hyper-plane.
A certain mapping-based CHT has been used widely for portfolio optimization problems to eliminate one equality constraint [18]. The mapping-based CHT is called Similarity Transformation (ST) in this paper, and it will be detailed later.

Portfolio optimization problem
Portfolio optimization is the process of determining the best proportion of investment in different assets according to some objective. Since portfolio optimization is one of the most challenging problems in the field of finance [19], a number of problem formulations are reported. Optimization methods of mathematical programming [20] have been used widely to solve portfolio optimization problems formulated based on Markowitz's model [21]. Recently, various EAs have also been applied to portfolio optimization problems which are extended variants of Markowitz's model.
The cardinality constraint limits the portfolio to have a specified number of assets. Genetic Algorithm (GA), Tabu Search (TS), and Simulated Annealing (SA) have been applied to the portfolio optimization problem with the cardinality constraint [18]. An extended GA has been proposed to solve the portfolio optimization problem considering the costs for selling and buying assets to change a portfolio structure [22]. Evolutionary Strategy (ES) and Differential Evolution (DE) have been applied to the portfolio optimization problem including the assets with long and short positions [23]. Particle Swarm Optimization (PSO) has been used to optimize the portfolio for a cardinality constrained Markowitz's model [24]. Fireworks Algorithm (FWA) [25] and Artificial Bee Colony (ABC) algorithm [26] have been proposed for solving the portfolio optimization problem based on the efficient frontier model [18]. An Estimation of Distribution Algorithm (EDA) has been proposed for the optimization problem of creating a replicate portfolio for a given portfolio with good returns [27].
All the above works have formulated portfolio optimization problems as constrained optimization problems and applied various EAs to them by using some CHTs.

Constrained optimization problem
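In this paper, the constrained optimization problem is formulated as
$$ \text{minimize } f(x) \quad \text{subject to } x \in X \subseteq \mathbb{R}^n \tag{5} $$
where X ⊆ ℝ^n denotes the feasible region. We assume that the feasible region X ⊆ ℝ^n of the optimization problem in (5) can be represented as a convex hull of multiple vertices x̂_i ∈ ℝ^n, i = 1, ..., p as
$$ X = \mathrm{conv}\{\hat{x}_1, \ldots, \hat{x}_p\} \tag{6} $$
where p is the number of vertices x̂_i ∈ X ⊆ ℝ^n and p ≥ n. By using all vertices x̂_i ∈ X, i = 1, ..., p shown in (6), every feasible solution x ∈ X of the constrained optimization problem in (5) can be represented as
$$ x = \sum_{i=1}^{p} a_i\, \hat{x}_i, \quad \sum_{i=1}^{p} a_i = 1, \quad a_i \ge 0, \quad i = 1, \ldots, p \tag{7} $$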

Procedure of CHM
EAs are applied to the constrained optimization problem in (5). We assume that all vertices x̂_i ∈ X, i = 1, ..., p are known. The search space of EA is defined by a hyper-cube S = [0, 1]^q ⊆ ℝ^q. Therefore, the vector z ∈ S has q elements as
$$ z = (z_1, \ldots, z_j, \ldots, z_q) \tag{8} $$
where 0 ≤ z_j ≤ 1, j = 1, ..., q and q = p − 1.
The proposed CHM transforms the vector z ∈ S into the feasible solution x ∈ X. The procedure of CHM, namely τ : S → X, is described as follows:
Step 1: Sort all elements z_j ∈ [0, 1], j = 1, ..., q of the vector z ∈ S in ascending order. Then generate a new vector z̃ = (z̃_1, ..., z̃_j, ..., z̃_q) such that
$$ 0 \le \tilde{z}_1 \le \cdots \le \tilde{z}_j \le \cdots \le \tilde{z}_q \le 1 \tag{9} $$
Step 2: Decide the values of the coefficients a_i, i = 1, ..., p as
$$ a_1 = \tilde{z}_1, \quad a_i = \tilde{z}_i - \tilde{z}_{i-1} \;\; (i = 2, \ldots, q), \quad a_p = 1 - \tilde{z}_q \tag{10} $$
where p = q + 1 and a_i ≥ 0, i = 1, ..., p.
Step 3: Compose the feasible solution x ∈ X as
$$ x = \sum_{i=1}^{p} a_i\, \hat{x}_i \tag{11} $$
where x̂_i ∈ ℝ^n, i = 1, ..., p are the vertices of the feasible region X ⊆ ℝ^n.
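The three steps can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes that the coefficients in Step 2 are the gaps between consecutive sorted elements (which makes them non-negative and sum to one), and the function names and the triangle used as a feasible region are hypothetical.

```python
from typing import List, Sequence


def chm_coefficients(z: Sequence[float]) -> List[float]:
    """Steps 1-2: sort z (q >= 1 elements) and take consecutive gaps (assumed rule)."""
    zs = sorted(z)                      # Step 1: ascending order
    q = len(zs)
    # Step 2: p = q + 1 non-negative coefficients that sum to one
    return [zs[0]] + [zs[i] - zs[i - 1] for i in range(1, q)] + [1.0 - zs[-1]]


def chm(z: Sequence[float], vertices: Sequence[Sequence[float]]) -> List[float]:
    """Step 3: convex combination of the p vertices with the coefficients."""
    a = chm_coefficients(z)
    if len(vertices) != len(a):
        raise ValueError("CHM needs exactly q + 1 vertices")
    n = len(vertices[0])
    return [sum(a_i * v[j] for a_i, v in zip(a, vertices)) for j in range(n)]


# Hypothetical feasible region: a triangle in the plane (p = 3, q = 2).
TRIANGLE = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
a = chm_coefficients((0.5, 0.2))        # approximately [0.2, 0.3, 0.5]
x = chm((0.5, 0.2), TRIANGLE)           # approximately [0.3, 0.5]
```

Because the result is always a convex combination of the vertices, no repair or penalty step is needed after the mapping.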

Example of CHM
Let X ⊆ ℝ^2 be a feasible region of the constrained optimization problem in (5). The feasible region X ⊆ ℝ^2 is a convex hull of three vertices x̂_1, x̂_2, x̂_3 ∈ ℝ^2. Since the number of vertices is p = 3, the search space of EA is given as S = [0, 1]^q, q = p − 1 = 2. Figure 2 shows 50 vectors z_k ∈ S, k = 1, ..., 50 generated randomly. Figure 3 shows a set of the corresponding feasible solutions x_k ∈ X generated from the vectors z_k ∈ S by using CHM such that x_k = τ(z_k). For example, from z = (0.5, 0.2) ∈ S, the sorted vector is z̃ = (0.2, 0.5) and the coefficients in (10) are calculated as a_1 = 0.2, a_2 = 0.5 − 0.2 = 0.3, and a_3 = 1 − 0.5 = 0.5. From (11), the corresponding feasible solution x ∈ X is composed as x = 0.2 x̂_1 + 0.3 x̂_2 + 0.5 x̂_3.

Optimization problem
Without loss of generality, most EAs that do not have some kind of CHT can only be applied to the unconstrained optimization problem formulated as
$$ \text{minimize } g(z) \quad \text{subject to } z \in S \tag{12} $$
where S = [0, 1]^q ⊆ ℝ^q denotes the search space of EA.
If the objective function in (12) is defined as g = f ∘ τ by using CHM τ : S → X, any EA can be applied to the constrained optimization problem in (5). Specifically, we have only to solve the unconstrained optimization problem in (12) by using EA. Then the best solution z_b ∈ S obtained by EA is converted into the best solution x_b ∈ X of the constrained optimization problem in (5) such that x_b = τ(z_b).
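The composition g = f ∘ τ can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the triangle, the quadratic objective f, and the use of plain random sampling as a stand-in for an EA. The CHM coefficients are taken as the gaps between sorted elements.

```python
import random

# Hypothetical feasible region (triangle) and objective, for illustration only.
VERTICES = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
TARGET = (0.2, 0.3)


def tau(z):
    """CHM: sort z, take consecutive gaps as coefficients, combine the vertices."""
    zs = sorted(z)
    a = [zs[0]] + [zs[i] - zs[i - 1] for i in range(1, len(zs))] + [1.0 - zs[-1]]
    return [sum(a_i * v[j] for a_i, v in zip(a, VERTICES)) for j in range(2)]


def f(x):
    """Constrained objective defined on the feasible region X."""
    return sum((x_j - t_j) ** 2 for x_j, t_j in zip(x, TARGET))


def g(z):
    """Unconstrained objective on S = [0, 1]^q, defined as g = f o tau."""
    return f(tau(z))


# Stand-in for an EA: pick the best of 2000 random points of S.
rng = random.Random(1)
candidates = [[rng.random(), rng.random()] for _ in range(2000)]
z_b = min(candidates, key=g)
x_b = tau(z_b)          # best solution mapped back into the feasible region
```

The optimizer only ever sees g and the hyper-cube S; feasibility of x_b follows from the mapping itself.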

Differential evolution
As an instance of EAs for solving the optimization problem in (12), we explain the basic DE [4]. DE has NP individuals, or vectors z k ∈ P t , k = 1, . . . , NP, in the population P t ⊆ S where t denotes the current generation. The initial population P 0 ⊆ S is generated randomly. Then each z k ∈ P t , k = 1, . . . , NP is assigned to the target vector in turn. The basic DE uses the strategy called "DE/rand/1/bin" [4] to generate a new vector u k ∈ S, which is compared with z k ∈ P t , as follows.
By using three vectors z_k1, z_k2, and z_k3 (k ≠ k1 ≠ k2 ≠ k3) selected from P_t randomly, the mutation vector v_k ∈ ℝ^q is generated as
$$ v_k = z_{k1} + F\,(z_{k2} - z_{k3}) \tag{13} $$
where the value of the scale factor F ∈ (0, 1] is given in advance.
In the case of v_k ∉ S, the elements v_{j,k} ∈ ℝ lying outside [0, 1] are corrected so that v_k ∈ S holds (14). By merging the elements of the target vector z_k ∈ P_t and v_k ∈ S in (13), a new vector u_k = (u_{1,k}, ..., u_{q,k}) ∈ S called the trial vector is generated as
$$ u_{j,k} = \begin{cases} v_{j,k}, & \text{if } \mathrm{rand}_j \le CR \text{ or } j = j_r, \\ z_{j,k}, & \text{otherwise,} \end{cases} \quad j = 1, \ldots, q \tag{15} $$
where the value of the crossover rate CR ∈ [0, 1] is given in advance and rand_j ∈ [0, 1], j = 1, ..., q denote uniform random numbers. The index of element j_r ∈ [1, q] is also selected randomly to avoid generating redundant vectors like u_k = z_k. In each generation, the trial vector u_k ∈ S, k = 1, ..., NP is compared with the corresponding target vector z_k ∈ P_t. Then either the trial vector u_k ∈ S or the target vector z_k ∈ P_t is selected as a member of the next generation z_k ∈ P_{t+1}. Specifically, u_k ∈ S is chosen if g(u_k) ≤ g(z_k). Otherwise, z_k ∈ P_t is chosen.
In this paper, the termination condition of DE is given by the maximum number of generations NG. Then the algorithm of the basic DE is described as follows: Step 1: Generate the initial P 0 ⊆ S = [0, 1] q randomly.
Step 3: If t = NG holds, output the best vector z b ∈ P t and terminate.
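For reference, a compact sketch of the basic DE with the "DE/rand/1/bin" strategy is given below. It is a minimal illustration rather than the exact implementation used in the experiments: out-of-range elements of the mutation vector are simply clipped to [0, 1] (one possible correction, assumed here), the population is updated synchronously, and the function name, default parameters, and the test objective are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple


def de_rand_1_bin(g: Callable[[Sequence[float]], float], q: int, NP: int = 30,
                  NG: int = 100, F: float = 0.5, CR: float = 0.9,
                  seed: int = 0) -> Tuple[List[float], float]:
    """Minimize g over S = [0, 1]^q with the basic DE (DE/rand/1/bin)."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(q)] for _ in range(NP)]   # initial P_0
    fit = [g(z) for z in pop]
    for _ in range(NG):
        new_pop, new_fit = [], []
        for k in range(NP):
            # three mutually distinct vectors, all different from the target
            k1, k2, k3 = rng.sample([i for i in range(NP) if i != k], 3)
            v = [pop[k1][j] + F * (pop[k2][j] - pop[k3][j]) for j in range(q)]
            v = [min(1.0, max(0.0, v_j)) for v_j in v]   # assumed correction: clip
            j_r = rng.randrange(q)                       # forced crossover index
            u = [v[j] if (rng.random() <= CR or j == j_r) else pop[k][j]
                 for j in range(q)]
            g_u = g(u)
            if g_u <= fit[k]:                            # trial replaces target
                new_pop.append(u)
                new_fit.append(g_u)
            else:
                new_pop.append(pop[k])
                new_fit.append(fit[k])
        pop, fit = new_pop, new_fit
    b = min(range(NP), key=fit.__getitem__)
    return pop[b], fit[b]


def shifted_sphere(z: Sequence[float]) -> float:
    """Hypothetical test objective with its minimum at (0.3, ..., 0.3)."""
    return sum((z_j - 0.3) ** 2 for z_j in z)


z_best, g_best = de_rand_1_bin(shifted_sphere, q=3)
```

Passing g = f ∘ τ as the objective applies the same routine to the constrained problem in (5) without any change to the algorithm.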

Adaptive differential evolution
DE has been shown to be a simple yet efficient EA for many optimization problems. Its performance, however, is still quite dependent on the setting of control parameters, namely the scale factor F ∈ (0, 1] in (13) and the crossover rate CR ∈ [0, 1] in (15). Therefore, various parameter control methods have been reported for DE [10]. Adaptive DE (ADE) employs a parameter control method in which feedback from the evolutionary search is used to dynamically change the control parameters.
In this paper, a powerful ADE called JADE [8] is chosen among various ADE variants to solve the unconstrained optimization problem in (12). As with the basic DE, JADE assigns each z_k ∈ P_t, k = 1, ..., NP to the target vector in turn. For each target vector z_k ∈ P_t, the scale factor F_k ∈ (0, 1] is generated according to a Cauchy distribution with location parameter μ_F and scale parameter 0.1 as
$$ F_k \sim \mathrm{Cauchy}(\mu_F,\, 0.1) \tag{16} $$
and then truncated to 1 if F_k > 1 or regenerated if F_k ≤ 0. JADE uses the strategy called "DE/current-to-pbest/1/bin" [8] to generate a new vector u_k ∈ S for z_k ∈ P_t. A vector z_p ∈ P_t is selected randomly from the top 100p% vectors in P_t. Besides, z_k1 and z_k2 (k ≠ k1 ≠ k2) are also selected randomly from P_t. By using F_k ∈ (0, 1] in (16), the mutation vector v_k ∈ ℝ^q is generated as
$$ v_k = z_k + F_k\,(z_p - z_k) + F_k\,(z_{k1} - z_{k2}) \tag{17} $$
For each target vector z_k ∈ P_t, the crossover rate CR_k ∈ [0, 1] is generated according to a Normal distribution with mean μ_CR and standard deviation 0.1 as
$$ CR_k \sim \mathcal{N}(\mu_{CR},\, 0.1) \tag{18} $$
and then truncated to [0, 1] if CR_k < 0 or CR_k > 1. Instead of CR, CR_k is used in (15) to generate u_k ∈ S. If u_k ∈ S is better than z_k ∈ P_t, the associated control parameters F_k and CR_k are called successful ones.
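The location μ_F in (16) and the mean μ_CR in (18) are initialized to 0.5. After that, by using the sets of successful control parameters S_F = {F_k} and S_CR = {CR_k}, they are updated, respectively, at the end of each generation as follows:
$$ \mu_F = (1 - c)\,\mu_F + c \cdot \mathrm{mean}_L(S_F) \tag{19} $$
where mean_L(S_F) denotes the Lehmer mean of F_k ∈ S_F [8], and
$$ \mu_{CR} = (1 - c)\,\mu_{CR} + c \cdot \mathrm{mean}_A(S_{CR}) \tag{20} $$
where mean_A(S_CR) denotes the arithmetic mean of CR_k ∈ S_CR [8]. The constant c is the adaptation rate of JADE as defined in [8].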
Except for the above parameter control method and the greedy strategy in (17), the algorithm of JADE is almost the same as the basic DE.
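The parameter control of JADE can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the Cauchy sample is drawn by the inverse-CDF method, and the adaptation rate c = 0.1 is an assumed value, since the text above does not state the value that was used.

```python
import math
import random
from typing import Callable, Sequence


def sample_scale_factor(mu_f: float, rng: random.Random) -> float:
    """Draw F_k from Cauchy(mu_f, 0.1); regenerate if <= 0, truncate to 1."""
    while True:
        f_k = mu_f + 0.1 * math.tan(math.pi * (rng.random() - 0.5))
        if f_k > 0.0:
            return min(f_k, 1.0)


def sample_crossover_rate(mu_cr: float, rng: random.Random) -> float:
    """Draw CR_k from Normal(mu_cr, 0.1) and truncate it to [0, 1]."""
    return min(1.0, max(0.0, rng.gauss(mu_cr, 0.1)))


def lehmer_mean(values: Sequence[float]) -> float:
    """Lehmer mean: sum of squares divided by the sum."""
    return sum(v * v for v in values) / sum(values)


def arithmetic_mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def update_location(mu: float, successes: Sequence[float],
                    mean_fn: Callable[[Sequence[float]], float],
                    c: float = 0.1) -> float:
    """mu <- (1 - c) * mu + c * mean(successes); unchanged if no success."""
    if not successes:
        return mu
    return (1.0 - c) * mu + c * mean_fn(successes)


rng = random.Random(0)
F_samples = [sample_scale_factor(0.5, rng) for _ in range(200)]
CR_samples = [sample_crossover_rate(0.5, rng) for _ in range(200)]
mu_F = update_location(0.5, [0.5, 1.0], lehmer_mean)        # about 0.5333
mu_CR = update_location(0.5, [0.8, 1.0], arithmetic_mean)   # about 0.54
```

The Lehmer mean weights larger successful scale factors more heavily than the arithmetic mean does, which pushes μ_F toward larger values.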

Adaptive DE using feasibility rule
The feasibility rule [1] is one of the most widely used CHTs because of its simplicity and efficiency. By using the feasibility rule and the transformation in (2), JADE can be applied to the constrained optimization problem formulated in (1).
From the constraints g_i(x) ≤ 0, i = 1, ..., m of the optimization problem in (1), the constraint violation φ(x) in (21) is defined for a candidate solution x ∈ ℝ^n. The search space S ⊆ ℝ^n of JADE applied to the constrained optimization problem in (1) is the hyper-cube in (22) given by the box-constraint, and it contains the feasible region X ⊆ ℝ^n such that X ⊆ S. The algorithm of JADE using the feasibility rule is the same as the original JADE except for the range of the search space and the selection method for the member of the next generation z_k ∈ P_{t+1}. As stated above, the trial vector u_k ∈ S is generated and compared with the target vector z_k ∈ P_t. The values of the objective function in (1) and the constraint violation in (21) are evaluated for u_k ∈ S and z_k ∈ P_t. Then u_k ∈ S is chosen for z_k ∈ P_{t+1} if either of the two conditions of the feasibility rule is satisfied. Otherwise, z_k ∈ P_t is chosen for z_k ∈ P_{t+1}.
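A common way to implement such a selection is sketched below. Both ingredients are assumptions made for illustration, since they are not spelled out above: the constraint violation is taken as the sum of the positive parts of g_i(x), and the trial vector is accepted if it has a smaller violation, or an equal violation and an objective value that is not worse. The function names and the example constraint are hypothetical.

```python
from typing import Callable, Sequence

Constraint = Callable[[Sequence[float]], float]


def constraint_violation(x: Sequence[float],
                         constraints: Sequence[Constraint]) -> float:
    """Assumed form: sum of max(0, g_i(x)); zero exactly when x is feasible."""
    return sum(max(0.0, g_i(x)) for g_i in constraints)


def trial_replaces_target(f_u: float, phi_u: float,
                          f_z: float, phi_z: float) -> bool:
    """Assumed two-condition rule: less violation wins, ties go to the objective."""
    return phi_u < phi_z or (phi_u == phi_z and f_u <= f_z)


# Hypothetical example with a single constraint x_1 + x_2 - 1 <= 0.
constraints = [lambda x: x[0] + x[1] - 1.0]
phi_feasible = constraint_violation([0.25, 0.25], constraints)    # 0.0
phi_infeasible = constraint_violation([0.75, 0.75], constraints)  # 0.5
```

Under this rule a feasible trial vector always replaces an infeasible target, regardless of their objective values.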

Problem formulation
We invest our money in n assets. Let x_j ∈ ℝ, j = 1, ..., n be the proportion held of asset j. The portfolio is defined as x = (x_1, ..., x_n). A long position means buying an asset, while a short position means selling an asset. In a long trade, we buy an asset and wait to sell it when the price rises. In this problem formulation, a long-only portfolio x ∈ ℝ^n is considered over a single period. Thus x ∈ ℝ^n is constrained as
$$ x_1 + \cdots + x_j + \cdots + x_n = 1, \quad 0 \le x_j, \quad j = 1, \ldots, n \tag{23} $$
Let μ_j ∈ ℝ, j = 1, ..., n be the expected return of asset j in the period. Besides, let σ_ij ∈ ℝ be the covariance between the expected returns of assets i and j.
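The return of a portfolio x ∈ ℝ^n is evaluated as
$$ \mu\, x^{\top} = \sum_{j=1}^{n} \mu_j\, x_j \tag{24} $$
where μ = (μ_1, ..., μ_n) ∈ ℝ^n and x = (x_1, ..., x_n) ∈ ℝ^n. The risk of a portfolio x ∈ ℝ^n is evaluated as
$$ x\, C\, x^{\top} = \sum_{i=1}^{n} \sum_{j=1}^{n} \sigma_{ij}\, x_i\, x_j \tag{25} $$
where C = [σ_ij] ∈ ℝ^{n×n} is the covariance matrix.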
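As a small numerical illustration, the return and the risk of a portfolio can be computed as a weighted sum and a quadratic form. The two-asset data below are made up for the example, and the function names are hypothetical.

```python
from typing import Sequence


def portfolio_return(mu: Sequence[float], x: Sequence[float]) -> float:
    """Weighted sum of the expected returns of the assets."""
    return sum(mu_j * x_j for mu_j, x_j in zip(mu, x))


def portfolio_risk(C: Sequence[Sequence[float]], x: Sequence[float]) -> float:
    """Quadratic form of the covariance matrix C with the proportions x."""
    n = len(x)
    return sum(C[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# Made-up two-asset instance with equal proportions.
mu = [0.1, 0.2]
C = [[0.04, 0.01],
     [0.01, 0.09]]
x = [0.5, 0.5]
ret = portfolio_return(mu, x)    # 0.05 + 0.10, about 0.15
risk = portfolio_risk(C, x)      # 0.01 + 0.005 + 0.0225, about 0.0375
```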
The gradient ∇f(x) of the objective function f(x) in (26) is given in (27). From the constraints in (26), we define the activity γ(x) of a feasible solution x ∈ X in (28). If γ(x) = 0 holds, at least one constraint is active at x ∈ X. From the Karush-Kuhn-Tucker (KKT) conditions [28], the optimal solution x ∈ X of the portfolio optimization problem in (26) satisfies the optimality condition in (29).
Figure 4 illustrates the feasible region X ⊆ ℝ^n of the portfolio optimization problem in (26) for the case of n = 2. The two black dots denote the vertices. From the linear constraints of the portfolio optimization problem in (26), the feasible region X ⊆ ℝ^n can be represented by a convex hull of the following p vertices x̂_i ∈ ℝ^n, p = n:
$$ \hat{x}_1 = (1, 0, \ldots, 0), \quad \hat{x}_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad \hat{x}_n = (0, \ldots, 0, 1) $$

Experimental setup
By using CHM, the portfolio optimization problem in (26) is transformed into an unconstrained optimization problem as shown in (12). Besides, the search space of EA is defined as S = [0, 1]^q, q = n − 1. Therefore, the dimension of the feasible region X ⊆ ℝ^n is reduced in the search space of EA S = [0, 1]^q ⊆ ℝ^q such that q < n.
In fact, many EAs have been applied to portfolio optimization problems that are constrained by (23). For these conventional EAs, the search space has been defined as S = [0, 1]^n. Then a real vector z ∈ S = [0, 1]^n is converted into a feasible solution x ∈ X ⊆ ℝ^n by using a mapping-based CHT as follows:
$$ x_j = \frac{z_j}{z_1 + \cdots + z_n}, \quad j = 1, \ldots, n \tag{30} $$
where z = (z_1, ..., z_n) ∈ ℝ^n and x = (x_1, ..., x_n) ∈ ℝ^n. The mapping-based CHT in (30) is called Similarity Transformation (ST) in this paper. By using ST in (30), the portfolio optimization problem in (26) can also be transformed into an unconstrained optimization problem as shown in (12).
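The ST mapping amounts to dividing each element by the sum of all elements, as in the sketch below. The function name is hypothetical, and the handling of the all-zero vector (returning equal proportions) is an assumption added only to keep the sketch total; it is not part of the description above.

```python
from typing import List, Sequence


def similarity_transformation(z: Sequence[float]) -> List[float]:
    """Normalize z in [0, 1]^n so that the proportions sum to one."""
    total = sum(z)
    if total == 0.0:
        # Assumed fallback for z = 0, where the ratio is undefined.
        return [1.0 / len(z)] * len(z)
    return [z_j / total for z_j in z]


x = similarity_transformation([0.2, 0.2, 0.4])   # approximately [0.25, 0.25, 0.5]
```

Unlike CHM, this mapping keeps the search space at n dimensions, and all vectors on the same ray from the origin are mapped to the same portfolio.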
CHM is compared with ST in terms of the diversity of solutions distributed on the feasible region X ⊆ ℝ^n. As shown in Figure 4, the feasible region X ⊆ ℝ^n is a hyper-plane. Let P_0 ⊆ S be an initial population generated randomly. The initial population P_0 ⊆ S is transformed into a set of feasible solutions X_0 ⊆ X by using CHM and ST respectively. Then the diversity of the set of feasible solutions x_k ∈ X_0, k = 1, ..., NP is measured by using the evaluation function η(X_0) defined in (31), where θ_j is the average of the j-th dimension over all solutions x_k ∈ X_0. Figure 5 compares CHM with ST in the diversity of solutions x_k ∈ X_0 evaluated by (31). The population size, namely the number of x_k ∈ X_0, is chosen as NP = 400. The horizontal axis of Figure 5 is the dimension of x_k ∈ ℝ^n. The results in Figure 5 are the average of 100 runs. From Figure 5, the proposed CHM outperforms the conventional ST in the diversity of feasible solutions x_k ∈ X_0 transformed from P_0 ⊆ S. Figure 6 compares CHM with ST in the same way as Figure 5. The dimension of x_k ∈ X_0 ⊆ ℝ^n is fixed at n = 40. The horizontal axis of Figure 6 is the population size. From Figure 6, we can confirm that CHM outperforms ST in the diversity of solutions x_k ∈ X_0. Furthermore, from Figure 6, we can see that the diversity of solutions x_k ∈ X_0, k = 1, ..., NP cannot be enhanced only by increasing the population size NP.
The basic DE and JADE are applied, respectively, to the portfolio optimization problem in (26) by using the following three methods. From the result of a preliminary experiment, the parameters NP = 100 and NG = 300 are chosen for them.
(1) CHM-based: z_k ∈ P_t is transformed into x_k ∈ X by using CHM.
(2) ST-based: z_k ∈ P_t is transformed into x_k ∈ X by using ST.
(3) CHM-ST hybrid: A set of initial solutions X_0 ⊆ X is generated by using CHM. Let P_0 = X_0. After that, z_k ∈ P_t is transformed into x_k ∈ X by using ST.
A data set of assets provided by OR-Library [29] is used for an instance of the portfolio optimization problem in (26) where the number of assets is n = 31. Two values are chosen for the risk aversion indicator λ ∈ [0, 1], namely λ = 0.1 and λ = 0.9. We can obtain a risk-loving portfolio, namely a solution x ∈ ℝ^n, if λ = 0.1 is used. On the other hand, we can obtain a risk-averse portfolio x ∈ ℝ^n if λ = 0.9 is used.

Experimental results of using DE
By using the above three methods, the basic DE is applied to the portfolio optimization problem in (26). The parameters F = 0.5 and CR = 0.9 are chosen for DE. Figure 7 shows the objective function values f(x_b) of the best solutions z_b ∈ P_t at generation t where the risk aversion indicator is chosen as λ = 0.1. The results of Figure 7 are the average of 30 runs. Similarly, Figure 8 shows the values of f(x_b) for λ = 0.9. From Figure 7 and Figure 8, the best solution z_b ∈ P_t obtained by using the CHM-based method is clearly better than those of the other methods. On the other hand, the best solutions z_b ∈ P_t obtained by the ST-based and CHM-ST hybrid methods have converged to poor solutions because of the lack of diversity in the population P_t.
Table 1 compares the best solutions x_b ∈ X obtained for λ = 0.1 by DE using the three methods respectively. Table 1 shows the objective function f(x_b) in (26), the absolute value of the gradient ∇f(x) in (27), the activity of solution γ(x) in (28), and the distance d(x_b) from the origin 0 ∈ ℝ^n to the solution x ∈ ℝ^n defined as
$$ d(x) = \sqrt{\sum_{j=1}^{n} x_j^2} \tag{32} $$
where 1/√n ≤ d(x) ≤ 1 holds for the portfolio optimization problem in (26).
The results of Table 1 are the average of 30 runs. Similarly, Table 2 compares the best solutions x b ∈ X obtained by DE using the three methods for λ = 0.9.
From the values of f(x_b) in Table 1 and Table 2, we can see that the CHM-based method is clearly better than the other methods. Furthermore, the best solution x_b ∈ X obtained by using CHM might be an optimal one because of γ(x_b) ≈ 0.

Experimental results of using JADE
By using the three methods, JADE is applied to the portfolio optimization problem in (26). Figure 9 shows the objective function values f (x b ) of the best solutions z b ∈ P t for the portfolio optimization problem with λ = 0.1. The results in Figure 9 are the average of 30 runs. Similarly, Figure 10 shows the values of f (x b ) for λ = 0.9.
From Figure 9 and Figure 10, the best solutions z_b ∈ P_t obtained by JADE seem to have converged sufficiently at t = 200 in every case. Therefore, we can confirm that the number of generations chosen for JADE, namely NG = 300, is large enough.
In the same way as Table 1, Table 3 compares the best solutions x_b ∈ X obtained by JADE using the three methods for the portfolio optimization problem in (26) with λ = 0.1. Similarly, Table 4 compares the best solutions x_b ∈ X for λ = 0.9.
Table 3. Comparison of three best solutions x_b ∈ X by JADE for the portfolio optimization problem in (26) with λ = 0.1.
From the values of γ(x_b) ≈ 0 in Table 3 and Table 4, the best solutions x_b ∈ X obtained by JADE using the three methods satisfy the optimality condition.
Unfortunately, from the objective function values f (x b ) in Table 3 and Table 4, we cannot confirm the difference between the best solutions x b ∈ X obtained by using the three methods. Therefore, by using Steel-Dwass's test, we evaluate the average ranking of the objective function values. Table 5 shows the result of Steel-Dwass's test about the value of f (x b ) in (26) obtained for λ = 0.1 in which * means p < 0.05; ** means p < 0.01; n.s. means not significant. Similarly, Table 6 shows the result of Steel-Dwass's test about the value of f (x b ) in (26) obtained for λ = 0.9.
From Table 5, the CHM-based method is better than the ST-based method for the case of λ = 0.1. From Table 6, the ST-based method is better than the CHM-based method for the case of λ = 0.9. However, the CHM-ST hybrid method is significantly better than the ST-based method and is the best method in Table 6.
From the values of d(x_b) in Table 3 and Table 4, the location of the best solution x_b ∈ X for the portfolio optimization problem in (26) clearly depends on the value of λ ∈ [0, 1]. The risk-loving portfolio obtained with λ = 0.1 is far from the center of the feasible region X ⊆ ℝ^n. On the other hand, the risk-averse portfolio obtained with λ = 0.9 exists near the center of the feasible region X ⊆ ℝ^n. From Figure 5, ST tends to generate more solutions x_k ∈ X in the center of the feasible region X ⊆ ℝ^n. As a result, the ST-based method is better than the CHM-based method for λ = 0.9.

Problem formulation
The deposit is an asset that has a constant rate r_0 ∈ ℝ. The rate r_0 ∈ ℝ is small but positive. Let x_0 ∈ ℝ be the proportion for the deposit. Considering the deposit as an investment option [30], the portfolio x ∈ ℝ^n is constrained as
$$ x_0 + x_1 + \cdots + x_n = 1, \quad 0 \le x_j, \quad j = 0, 1, \ldots, n \tag{33} $$
The proportion of the deposit x_0 ∈ ℝ can be eliminated from (33) as
$$ x_1 + \cdots + x_j + \cdots + x_n \le 1, \quad 0 \le x_j, \quad j = 1, \ldots, n \tag{34} $$
From (24), the return of the portfolio x ∈ ℝ^n including the deposit becomes
$$ \sum_{j=1}^{n} \mu_j\, x_j + r_0 \left( 1 - \mathbf{1}\, x^{\top} \right) \tag{35} $$
where 1 = (1, ..., 1) ∈ ℝ^n and 1 x^⊤ = x_1 + ... + x_n. From (34) and (35), the portfolio optimization problem is formulated as in (36), where the risk aversion indicator λ ∈ [0, 1] is given by investors.
The gradient ∇f(x) of the objective function f(x) in (36) is given in (37). From the constraints in (36), we define the activity γ(x) of a solution x ∈ ℝ^n in (38), where the solution x ∈ ℝ^n is not always a feasible one. Figure 11 illustrates the feasible region X ⊆ ℝ^n of the portfolio optimization problem in (36) for the case of n = 2. The three black dots denote the vertices. From the linear constraints of the portfolio optimization problem in (36), the feasible region X ⊆ ℝ^n can be given by a convex hull of the following p vertices x̂_i ∈ ℝ^n, p = n + 1:
$$ (0, 0, \ldots, 0), \quad (1, 0, \ldots, 0), \quad (0, 1, 0, \ldots, 0), \quad \ldots, \quad (0, \ldots, 0, 1) $$
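To make the vertex set concrete, the sketch below builds the n + 1 vertices for the deposit formulation and maps a random vector of S = [0, 1]^n into the feasible region. It is an illustration under assumptions: the ordering of the vertices (origin first) and the function names are hypothetical, and the CHM coefficients are taken as the gaps between sorted elements.

```python
import random
from typing import List, Sequence


def deposit_vertices(n: int) -> List[List[float]]:
    """Origin plus the n unit vectors: n + 1 vertices in total."""
    origin = [0.0] * n
    units = [[1.0 if j == i else 0.0 for j in range(n)] for i in range(n)]
    return [origin] + units


def chm(z: Sequence[float], vertices: Sequence[Sequence[float]]) -> List[float]:
    """CHM with coefficients taken as gaps between sorted elements (assumed)."""
    zs = sorted(z)
    a = [zs[0]] + [zs[i] - zs[i - 1] for i in range(1, len(zs))] + [1.0 - zs[-1]]
    n = len(vertices[0])
    return [sum(a_i * v[j] for a_i, v in zip(a, vertices)) for j in range(n)]


n = 4
rng = random.Random(0)
z = [rng.random() for _ in range(n)]     # q = n elements
x = chm(z, deposit_vertices(n))          # proportions of the n assets
x_0 = 1.0 - sum(x)                       # remaining proportion for the deposit
```

With the origin listed first, the coefficient of the origin becomes the deposit proportion x_0, so every mapped portfolio satisfies the inequality constraint without repair.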
By using CHM, the portfolio optimization problem in (36) is transformed into an unconstrained optimization problem as shown in (12). Besides, the search space of EA is defined as S = [0, 1]^q, q = n. As stated above, JADE using the Feasibility Rule (FR) can also be applied to the portfolio optimization problem in (36). In the case of JADE with FR, the search space of EA is given as S = [0, 1]^n. Furthermore, from the constraints in (36), the constraint violation in (21) is defined as in (39). JADE is applied to the portfolio optimization problem in (36) by using the following three methods in which NP = 100 and NG = 300 are chosen for JADE.
(1) CHM-based: z_k ∈ P_t is transformed into x_k ∈ X by using CHM.
(2) FR-based: JADE is used with FR. Hence z_k = x_k ∈ X holds if φ(z_k) = 0.
(3) CHM-FR hybrid: A set of initial solutions X_0 ⊆ X is generated by using CHM. Let P_0 = X_0. After that, JADE is used with FR to evolve the population P_t.
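The comparison used by the FR-based method can be sketched as follows. This is only an illustration under stated assumptions: the objective is taken to be minimized, the rule is the commonly used form of the feasibility rule (feasible beats infeasible, lower objective wins among feasible candidates, lower violation wins among infeasible ones), and the violation function is an assumed form for the constraints x_j ≥ 0 and x_1 + · · · + x_n ≤ 1 rather than the paper's definition. All names are hypothetical.

```python
def violation_ef_d(z):
    """Assumed violation: total amount by which z_j >= 0 and sum(z) <= 1 are broken."""
    negative_part = sum(max(0.0, -zj) for zj in z)
    budget_excess = max(0.0, sum(z) - 1.0)
    return negative_part + budget_excess

def fr_better(f_a, phi_a, f_b, phi_b):
    """Return True if candidate a is preferred over candidate b (minimization assumed)."""
    if phi_a == 0.0 and phi_b == 0.0:
        return f_a <= f_b          # both feasible: compare objective values
    if phi_a == 0.0 or phi_b == 0.0:
        return phi_a == 0.0        # exactly one feasible: the feasible one wins
    return phi_a <= phi_b          # both infeasible: compare violations

z_trial, z_target = [0.2, 0.3], [0.7, 0.6]
print(violation_ef_d(z_trial), violation_ef_d(z_target))
print(fr_better(5.0, violation_ef_d(z_trial), 1.0, violation_ef_d(z_target)))
```

In this sketch a feasible candidate is preferred even when its objective value is worse, which is why the FR-based search first drives the violation to zero before the objective values matter.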
The data set of assets provided by OR-Library [29] is also used for an instance of the portfolio optimization problem in (36), where the number of assets is n = 31. Figure 12 shows the objective function values f(x_b) of the best solutions z_b ∈ P_t at generation t obtained by JADE using the three methods for the portfolio optimization problem in (36) with λ = 0.1. The results in Figure 12 are the average of 30 runs. Similarly, Figure 13 shows the values of f(x_b) in (36) obtained for λ = 0.9. Figure 14 shows the constraint violation values φ(z_b) of the best solutions z_b ∈ P_t obtained by the FR-based method for the portfolio optimization problem in (36) with λ = 0.1 and λ = 0.9, respectively. From Figure 14, we can see that the FR-based method takes many generations to find a feasible solution z_k ∈ P_t with φ(z_k) = 0.
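One possible contributing factor, offered here as an illustration rather than as the authors' explanation, is how small the feasible region is relative to the search space S = [0, 1]^n. Assuming the constraints are x_j ≥ 0 and x_1 + · · · + x_n ≤ 1 and that the initial population is sampled uniformly from S, the feasible fraction of S is the volume of the standard simplex, 1/n!. The following sketch (hypothetical function names) computes this fraction and checks it by sampling for a small n.

```python
import math
import random

def feasible_fraction_exact(n):
    """Volume of {x in [0, 1]^n : x_1 + ... + x_n <= 1}, which equals 1 / n!."""
    return 1.0 / math.factorial(n)

def feasible_fraction_sampled(n, samples, seed=0):
    """Monte Carlo estimate of the same fraction with uniform samples from [0, 1]^n."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        if sum(rng.random() for _ in range(n)) <= 1.0:
            hits += 1
    return hits / samples

# Small case: the estimate should be close to 1/6.
print(feasible_fraction_exact(3), feasible_fraction_sampled(3, 20000))
# Case used in the experiments (n = 31): the fraction is far below 1e-33.
print(feasible_fraction_exact(31))
```

For n = 31 a uniformly sampled initial population is therefore essentially certain to contain no feasible vector, so a method that relies on the constraint violation alone has to reduce it over generations.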

Experimental results of using JADE
In the same way as Table 3, Table 7 compares the best solutions x_b ∈ X obtained by JADE using the three methods for the portfolio optimization problem in (36) with λ = 0.1. Similarly, Table 8 compares the best solutions x_b ∈ X for λ = 0.9.

Table 7. Comparison of three best solutions x_b ∈ X by JADE for the portfolio optimization problem in (36) with λ = 0.1.
From the values of d(x_b) in Table 7 and Table 8, the location of the best solution x_b ∈ X for the portfolio optimization problem in (36) clearly depends on the value of λ ∈ [0, 1]. In the case of λ = 0.9, the best solutions x_b ∈ X lie at the origin of the feasible region X ⊆ ℝ^n because d(x_b) ≈ 0 holds for them.
From the values of f(x_b) in Table 7, the CHM-based method seems to be better than the other methods. On the other hand, we cannot confirm the difference between the best solutions x_b ∈ X obtained by JADE using the three methods in Table 8. Table 9 shows the result of Steel-Dwass's test on the value of f(x_b) in (36) obtained by JADE using the three methods for λ = 0.1. Similarly, Table 10 shows the result for λ = 0.9. From Table 9 and Table 10, the proposed CHM-based method significantly outperforms the other methods regardless of the value of λ ∈ [0, 1].

Problem formulation
We consider a long-short portfolio that includes the deposit for a single period. In other words, we can sell one asset and buy another. However, the total amount of transactions is regulated [31]. Therefore, the portfolio x ∈ ℝ^n is constrained as |x_1| + · · · + |x_j| + · · · + |x_n| ≤ 1 in (41). From (35) and (41), the portfolio optimization problem is formulated as (42), where the risk aversion indicator λ ∈ [0, 1] is given by investors.
The gradient of the objective function f(x) in (42) is also given by (37). From the constraints in (42), we define the activity of a solution x ∈ ℝ^n, where the solution x is not always a feasible one. Figure 15 shows the feasible region X ⊆ ℝ^n of the portfolio optimization problem in (42) for the case of n = 2. The four black dots denote the vertices. From the linear constraints of the portfolio optimization problem in (42), the feasible region X ⊆ ℝ^n can be given by the convex hull of the following p = 2n vertices x_i ∈ ℝ^n: the positive and negative unit vectors (±1, 0, . . . , 0), (0, ±1, 0, . . . , 0), . . . , (0, . . . , 0, ±1).
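As a small illustration of this vertex set, the following Python sketch enumerates the 2n signed unit vectors for the constraint in (41) and derives the dimension q = p − 1 of the CHM search space. The function names are hypothetical, and the ordering of the vertices is an arbitrary choice made for the sketch.

```python
def ef_ls_vertices(n):
    """Return the 2n vertices +e_j and -e_j of the region |x_1| + ... + |x_n| <= 1."""
    vertices = []
    for j in range(n):
        for sign in (1.0, -1.0):
            v = [0.0] * n
            v[j] = sign
            vertices.append(v)
    return vertices

def l1_norm(x):
    """Sum of absolute values, i.e. the left-hand side of (41)."""
    return sum(abs(xj) for xj in x)

n = 31
vertices = ef_ls_vertices(n)
p = len(vertices)   # number of vertices, p = 2n
q = p - 1           # dimension of the CHM search space S = [0, 1]^q
# The equal-weight convex combination of all vertices is the origin, which is feasible.
center = [sum(v[j] for v in vertices) / p for j in range(n)]
print(p, q, l1_norm(center))
```

Every vertex lies on the boundary of the constraint in (41), and for n = 31 the sketch gives p = 62 and q = 61, in line with q = 2n − 1.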
By using CHM, the portfolio optimization problem in (42) is transformed into an unconstrained optimization problem as shown in (12). The search space of the EA is then defined as S = [0, 1]^q, q = 2n − 1. JADE using FR can also be applied to the portfolio optimization problem in (42). In the case of JADE with FR, the search space of the EA is given as S = [−1, 1]^n. Furthermore, the constraint violation in (21) is defined from the constraints in (42). JADE is applied to the portfolio optimization problem in (42) by using the above three methods, namely (1) CHM-based, (2) FR-based, and (3) CHM-FR hybrid. In the three methods, NP = 100 and NG = 300 are chosen for JADE.
The data set of assets provided by OR-Library [29] is also used for an instance of the portfolio optimization problem in (42), where the number of assets is n = 31. Figure 16 shows the objective function values f(x_b) of the best solutions z_b ∈ P_t at generation t obtained by the three methods for the portfolio optimization problem in (42) with λ = 0.1. The results in Figure 16 are the average of 30 runs. Similarly, Figure 17 shows the values of f(x_b) obtained by the three methods for λ = 0.9. Figure 18 shows the constraint violation values φ(z_b) of the best solutions z_b ∈ P_t obtained by the FR-based method for the portfolio optimization problem in (42) with λ = 0.1 and λ = 0.9, respectively. From Figure 18, the number of generations needed by the FR-based method to find a feasible solution depends on the value of λ ∈ [0, 1].

Experimental results of using JADE
In the same way as Table 7, Table 11 compares the best solutions x_b ∈ X obtained by JADE using the three methods for the portfolio optimization problem in (42) with λ = 0.1. Similarly, Table 12 compares the best solutions x_b ∈ X for λ = 0.9. Table 13 shows the result of Steel-Dwass's test on the value of f(x_b) obtained for the portfolio optimization problem in (42) with λ = 0.1. Even though the dimension of its search space S = [0, 1]^q, q = 2n − 1, is very high, the CHM-based method outperforms the FR-based method for λ = 0.1. From Table 14, which shows the corresponding test result for λ = 0.9, the FR-based method is better than the CHM-based method in the case of λ = 0.9. However, the CHM-FR hybrid method is significantly better than the FR-based method and is the best method in Table 14.

Table 11. Comparison of three best solutions x_b ∈ X by JADE for the portfolio optimization problem in (42) with λ = 0.1.
From the values of d(x_b) in Table 12, the best solutions x_b ∈ X for λ = 0.9 lie in the center of the feasible region X ⊆ ℝ^n. Therefore, finding feasible solutions is not difficult for the FR-based method, as shown in Figure 18. On the other hand, the dimension of the search space of JADE given by CHM is high: q = 2n − 1. As a result, the FR-based method is better than the CHM-based method for λ = 0.9, as shown in Table 14.

Discussion
From the viewpoint of constraint setting, we discuss the characteristics of the three types of portfolio optimization problems, namely the Efficient Frontier (EF) model in (26), the EF model including Deposit (EF_D) in (36), and the EF model of Long-Short portfolio (EF_LS) in (42). Table 15 shows the dimension of the search space S = [0, 1]^q provided by CHM for each of the portfolio optimization problems with the n-dimensional feasible region X ⊆ ℝ^n. The dimension of S = [0, 1]^q depends on the number of vertices x_i ∈ ℝ^n, i = 1, . . . , p, used to define the feasible region, as q = p − 1. From the dimension of S = [0, 1]^q in Table 15, EF_LS seems to be the most difficult for EAs using CHM. Actually, the best solutions x_b ∈ X in Table 12 (EF_LS with λ = 0.9) have not fully satisfied the optimality condition.

Figure 19. The diversity of the set of x_k ∈ X_0 ⊆ ℝ^n, k = 1, . . . , NP, for the dimension n. The population size is chosen as NP = 400. The set of feasible solutions x_k ∈ X_0 ⊆ X is generated by using CHM.

By using CHM, a set of feasible solutions x_k ∈ X_0 ⊆ X, k = 1, . . . , NP, is generated randomly for each of the three types of portfolio optimization problems. Figure 19 shows the diversity of the set of feasible solutions x_k ∈ X_0 defined by (31). The population size is chosen as NP = 400. The horizontal axis of Figure 19 is the dimension of the feasible region X ⊆ ℝ^n. The results in Figure 19 are the average of 100 runs. From Figure 19, we can see that the diversity of the set of x_k ∈ X_0 decreases as the dimension of the feasible region X ⊆ ℝ^n increases. The diversity of the set of x_k ∈ X_0 also depends on the dimension of the search space S ⊆ ℝ^q in Table 15. From Figure 19, we can confirm that EF_LS is the most difficult problem for EAs.
Next, we discuss the issue of redundancy in the solution representation by CHM. As shown in the proof of Theorem 3.1, CHM τ: S → X is composed of three different mappings as τ = τ_3 ∘ τ_2 ∘ τ_1. The first mapping τ_1: S → Z denotes the sorting of the elements of z = (z_1, . . . , z_q) ∈ ℝ^q. Thus the first mapping is surjective, and there are q! redundant representations z ∈ S for each sorted vector in Z. However, we can expect that there are no redundant vectors z_k ∈ P_t in the population, which is a set of vectors generated stochastically in the real q-space S ⊆ ℝ^q. Let A ⊆ ℝ^p be the set of a = (a_1, . . . , a_p) ∈ ℝ^p in (10). The second mapping τ_2: Z → A defined by (10) is a bijection. Therefore, the probability that the same solution is generated from different vectors, namely Pr(x_k = x_i | z_k ≠ z_i), depends on the third mapping τ_3: A → X defined by (11). As a result, the probability is proportional to the difference in dimension between the two spaces X ⊆ ℝ^n and A ⊆ ℝ^p, p ≥ n.
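The two sources of redundancy discussed above can be made concrete with a minimal Python sketch of the composition τ = τ_3 ∘ τ_2 ∘ τ_1. Since (10) and (11) are not reproduced in this section, the sketch assumes that τ_2 takes the successive differences of the sorted vector padded with 0 and 1, which yields p = q + 1 non-negative weights summing to 1, and that τ_3 is the convex combination of the vertices with these weights. The function name and the vertex ordering are hypothetical.

```python
def chm(z, vertices):
    """Map z in [0, 1]^q onto a convex combination of p = q + 1 vertices."""
    if len(vertices) != len(z) + 1:
        raise ValueError("CHM needs p = q + 1 vertices")
    z_sorted = sorted(z)                                  # tau_1: sort the elements of z
    bounds = [0.0] + z_sorted + [1.0]
    a = [hi - lo for lo, hi in zip(bounds, bounds[1:])]   # tau_2 (assumed form of (10))
    n = len(vertices[0])
    # tau_3: convex combination of the vertices with the weights a
    return [sum(ai * v[j] for ai, v in zip(a, vertices)) for j in range(n)]

# Redundancy from tau_1: permuting z does not change the solution (EF_D, n = 2).
ef_d = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
x1 = chm([0.7, 0.2], ef_d)
x2 = chm([0.2, 0.7], ef_d)
print(x1, x1 == x2)

# Redundancy from tau_3: different weights give the same solution when p > n + 1 (EF_LS, n = 2).
ef_ls = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
xa = chm([0.5, 1.0, 1.0], ef_ls)
xb = chm([0.25, 0.5, 0.75], ef_ls)
print(xa, xb)
```

In the second example two vectors with different sorted forms are mapped to the origin, which illustrates why the redundancy caused by τ_3 grows with the gap between p and n.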
Experimental results show that CHM enhances the performance of EAs more than conventional CHTs do in many cases, but evaluating the influence of the representational redundancy of solutions on the search performance of EAs is a future challenge.

Conclusion
A novel mapping-based Constraint-Handling Technique (CHT) called Convex-Hull Mapping (CHM) is proposed for EAs applied to constrained optimization problems. The search space of the EA is given by a hypercube. On the other hand, we assume that the feasible region of the constrained optimization problem is given by a convex hull of multiple vertices. The proposed CHM transforms the real vector z ∈ ℝ^q in the search space S = [0, 1]^q of the EA into the solution x ∈ ℝ^n in the feasible region X ⊆ ℝ^n.
The advantages of the proposed CHM are as follows:
• It provides a surjective mapping τ: S → X.
• It does not require additional parameters.
• It can be used directly for any EAs.
• It does not require any modifications for original EAs.
• It can also be combined with conventional CHTs.
• It can be calculated easily if all vertices are given.
The performance of CHM is compared with conventional CHTs through three types of portfolio optimization problems. The results of Steel-Dwass's test on the objective function values of the best solutions obtained by JADE show that the proposed CHM is significantly better than conventional CHTs in many cases. Furthermore, a hybrid method combining CHM with a conventional CHT outperforms the original CHT.
Since the constraints effectively handled by CHM are very common in many formulations of portfolio optimization problems, CHM has the potential to contribute to designing powerful EAs for solving various portfolio optimization problems.
Unfortunately, even if all constraints are linear, the proposed CHM cannot be used for EAs applied to an optimization problem in which
• It is hard to obtain all vertices of the convex hull.
• There are too many vertices in comparison with the number of variables.
• The feasible region forms a concave polyhedron.
To enumerate all vertices of the feasible region, a method such as the simplex algorithm [32] for linear programming may be used. Furthermore, if a feasible region defined by a concave polyhedron or by many vertices can be divided into several convex polyhedrons, CHM can be used for EAs to solve the constrained optimization problem. Specifically, CHM can be used to solve a set of unconstrained optimization problems which are defined, respectively, in each of the convex polyhedrons. Therefore, in future work, we would like to develop a divide-and-conquer method for CHM to deal with constrained optimization problems that have complex feasible regions.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Notes on contributors
Kiyoharu Tagawa
He received his Ph.D. degree from Kobe University, Japan, in 1997. He has been a Professor in the School of Science and Engineering, Kindai University, Japan, since 2007. His current research interests include evolutionary algorithms, optimization methods under uncertainties, and their applications to real-world problems. He is a member of SICE, IEEJ, IPSJ, and IEEE.

Yukiko Orito
She received her Ph.D. degree from Tokyo Metropolitan Institute of Technology in 2003. She is an Associate Professor in the Department of Economics, Hiroshima University, Japan. Her current research interests are in evolutionary algorithms and their applications to engineering, finance, and economics. She is a member of IEEJ, JSEC, IPSJ, and IEEE.