Empirical comparison of level-wise hierarchical multi-population genetic algorithm

ABSTRACT Metaheuristics are now commonly used to solve complex problems in real applications. Some scholars have used multiple populations in metaheuristic approaches to shorten the execution time needed to find nearly optimal solutions. In this paper, we revisit the properties of sub-population execution for the genetic algorithm (GA) and design a level-wise hierarchical sub-population architecture. We also conduct experiments to compare the performance of the architecture on unimodal and multimodal functions. The experimental results show that the hierarchical GA architecture yields a larger improvement on multimodal functions than on unimodal ones.


Introduction
In the past decades, many metaheuristics have been derived from very simple concepts in nature. Their sources of inspiration can be classified into the following three major types (Mirjalili, Mirjalili, & Lewis, 2014). The first type is based on the concept of evolution. In this type, optimization is done by evolving an initial group of random solutions with evolutionary algorithms. One of the most popular algorithms of this type is the genetic algorithm (GA) (Holland, 1975).
The second type of metaheuristics is based on animals' behaviours. These algorithms mainly mimic the social behaviour of swarms of creatures in nature. The search agents implemented by computer programs thus navigate towards solutions using the simulated collective and social intelligence of creatures. Examples include Particle Swarm Optimization (PSO) (Kennedy & Eberhart, 1995) and Ant Colony Optimization (ACO) (Dorigo, Birattari, & Stützle, 2006), among others.
The third type of metaheuristics is derived from the observation of physical phenomena. This kind of algorithm randomly generates a set of agents that search and move throughout the search space according to some physical phenomenon. Some famous examples are Gravitational Local Search (GLSA) (Simon, 2008) and the Gravitational Search Algorithm (GSA) (Esmat, Nezamabadi-Pour, & Saryazdi, 2009).
In this paper, we focus on designing a hierarchical execution process for GAs and compare its performance on functions with different numbers of modes. The first GA was proposed by Holland (1975). The idea came from Charles Darwin's theory of natural selection and 'survival of the fittest' (https://en.wikipedia.org/wiki/Charles_Darwin). A GA is a heuristic search procedure based on the mechanisms of natural selection, genetics, and evolution. Over the years, GAs have been widely applied in many fields such as bioinformatics (Wasserman & Sandelin, 2004), optimization design (Samii & Michielssen, 1999), machine learning (Goldberg & Holland, 1988), knowledge systems (Nara, Univ, & Shiose, 1992), and manufacturing scheduling (Wall, 1996).
Multi-population genetic algorithms (MGAs) have been recognized as being efficient and effective in finding nearly optimal solutions (Cantú-Paz, 1998). In this paper, we study the characteristics and execution of GA on a hierarchical tree structure. We first propose an algorithm called the hierarchical multi-population genetic algorithm (HMGA), in which the execution process is represented as a tree structure and conducted in a level-wise manner. The given chromosomes are divided into several sub-populations, which are placed at the leaf nodes of the tree and move up the tree when certain criteria are met. Each node executes a traditional GA on the sub-population assigned to it. We expect that the sub-populations and their layer-by-layer merging increase the search diversity and may thus improve the final performance.
The rest of the paper is organized as follows. Section 2 reviews some related studies, including GAs, MGAs, and hierarchical structures. The proposed HMGA is described in Section 3. Experiments on unimodal and multimodal functions are presented in Section 4, with comparison and discussion. Finally, the conclusion and future work are given in Section 5.

Related works
In this section, some related studies on GAs and MGAs are briefly reviewed.

Genetic algorithms (GAs)
Holland first introduced GAs in 1975 (Holland, 1975). A GA is categorized as a global search metaheuristic that mimics the process of natural selection. It is often used to find exact or approximate solutions to optimization and search problems (Mitchell, 1998). GAs are a particular class of evolutionary algorithms, which generate solutions using techniques inspired by natural evolution, such as inheritance, mutation, selection, and crossover (also called recombination). In a GA, a population composed of candidate solutions is maintained, and each candidate is formed by a set of genes, as shown in Figure 1. The genes may represent the properties of the solution and can be mutated and recombined to evolve towards better solutions (https://en.wikipedia.org/wiki/Genetic_algorithm).
Candidate solutions may be encoded in different ways. For numerical solutions, binary-coded and real-coded representations are the two primary choices. After coding, each chromosome is represented by a string of bits or numbers. A GA then applies the crossover and mutation operators to make the population evolve according to the specified evaluation function.
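As a minimal illustration of the two operators on a binary-coded chromosome, the following Python sketch may help. One-point crossover and bit-flip mutation are assumed here as common textbook choices; the paper does not state which operator variants it uses, and the function names are hypothetical.

```python
import random

def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    cut = rng.randint(1, len(parent_a) - 1)  # cut strictly inside the string
    child_a = parent_a[:cut] + parent_b[cut:]
    child_b = parent_b[:cut] + parent_a[cut:]
    return child_a, child_b

def bit_flip_mutation(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

rng = random.Random(0)
a = [0, 0, 0, 0, 0, 0]
b = [1, 1, 1, 1, 1, 1]
child_a, child_b = one_point_crossover(a, b, rng)
mutant = bit_flip_mutation(child_a, rate=0.1, rng=rng)
```

A real-coded GA would keep the same structure but replace the bit flip with a numerical perturbation of the gene.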

Multi-population genetic algorithms
The multi-population genetic algorithm (MGA), which was first proposed by Grefenstette (1981), is an extension of the traditional single-population GA. It divides a population into several isolated sub-populations, among which individuals are allowed to migrate. Compared with the single-population GA, the MGA more closely resembles evolution in nature. An MGA can raise the parallelism of genetic execution and is more resistant to premature convergence than a single-population GA.
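The exchange of individuals between sub-populations can be sketched as follows. This is only an illustration under stated assumptions: a ring topology, minimization, and a best-replaces-worst migration policy, none of which is prescribed by the description above. The function name `migrate_ring` is hypothetical.

```python
def migrate_ring(islands, fitness):
    """Copy each island's best individual over the worst of the next island."""
    # Record the best individuals first so that all migrations are simultaneous.
    bests = [min(island, key=fitness) for island in islands]
    for k, island in enumerate(islands):
        incoming = bests[k - 1]  # island 0 receives from the last island
        worst = max(range(len(island)), key=lambda i: fitness(island[i]))
        island[worst] = list(incoming)  # store a copy, not a shared reference
    return islands

# Three islands of one-gene chromosomes; lower fitness is better (assumption).
islands = [[[3.0], [1.0]], [[5.0], [4.0]], [[9.0], [7.0]]]
migrate_ring(islands, fitness=lambda c: c[0] ** 2)
```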
Multi-population genetic algorithms (MGAs) have been recognized as being efficient and effective in finding nearly optimal solutions (Cantú-Paz, 1998). Over the years, different variants of MGA have been proposed. For example, Cochran, Horng, and Fowler (2003) adopted an MGA that used a two-stage approach to solve multiobjective scheduling problems. Zegordi and Nia (2009) proposed the multi-society genetic algorithm (MSGA), which had three populations with different fitness functions for solving a transportation scheduling problem. Lin, Hong, Liu, and Lin (2012) proposed a general framework for studying MGAs with ring topology. Sefrioui and Périaux (2000) proposed a hierarchical topology with three layers. The sub-populations in all the layers ran simultaneously with different resolutions. The method used parallel GAs based on the notion of sub-populations. Its main advantage lay in the interaction of the three layers. The solutions went up and down the layers, and the best ones kept going up until they were refined. In this paper, we adopt a different hierarchical processing mechanism for GAs.

Hierarchical multi-population genetic algorithm (HMGA)
In this section, we describe the proposed HMGA. Its main idea is explained first. The basic strategy used in HMGA is divide-and-conquer, in which the whole execution is organized into a tree hierarchy. An original GA population is first divided into several sub-populations. Each sub-population is then placed in a leaf node of the tree structure and is processed by the GA individually. When the termination criterion is met, the sub-populations that belong to the same parent node are merged and conceptually placed in the parent node. The same procedure is repeated until the root node is reached. The best result from the root node is finally output as the solution.
Formally, let the height of the hierarchical tree structure be denoted by h and the branching degree by b. Also let p represent the total number of chromosomes at the nodes of the same level, and let $S_{i,j}$ denote the sub-population located at the j-th node (from the left) of the i-th level. Note that here the root is at level 1. Figure 2 illustrates the above idea with h = 3 and b = 2.
The execution process of the hierarchical MGA in Figure 2 is shown in Figure 3. In Figure 3, the N chromosomes are first divided into four sub-populations and put into the four leaf nodes of the execution tree. Each sub-population in a leaf node is then evolved by the GA. When the termination criterion for the level is met, such as reaching a certain number of generations, the sub-populations that belong to the same parent node are merged into that node at the next higher level (i.e. level 2). For example, $P_{3,1}$ and $P_{3,2}$ will be merged into $P_{2,1}$, as Figure 4 shows.
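The splitting and merging in this example can be sketched in a few lines of Python. The sketch assumes that sibling nodes are stored next to each other from left to right and uses an illustrative population of 100 integers in place of real chromosomes; the function names are hypothetical.

```python
def split_into_leaves(chromosomes, h, b):
    """Split a list into b**(h-1) leaf sub-populations of near-equal size."""
    leaves = b ** (h - 1)
    base, extra = divmod(len(chromosomes), leaves)
    parts, start = [], 0
    for j in range(leaves):
        size = base + (1 if j < extra else 0)  # first `extra` leaves get one more
        parts.append(chromosomes[start:start + size])
        start += size
    return parts

def merge_level(subpops, b):
    """Merge each run of b sibling sub-populations into their parent node."""
    return [sum(subpops[k:k + b], []) for k in range(0, len(subpops), b)]

level3 = split_into_leaves(list(range(100)), h=3, b=2)  # four leaves of 25
level2 = merge_level(level3, b=2)                       # two nodes of 50
level1 = merge_level(level2, b=2)                       # root with all 100
```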
With the above idea, the detailed steps of the proposed HMGA are described as follows.
The Hierarchical Multi-population Genetic Algorithm (HMGA):
INPUT:
1. A given fitness function F corresponding to the problem to be solved;
2. A given GA with a crossover rate Pc and a mutation rate Pm;
3. A fixed total number N of chromosomes at each level;
4. Height h and branching degree b of the hierarchical execution;
5. A total generation number T.
OUTPUT: A nearly optimal solution for the fitness function F.
STEP 1: Initially set the current generation number g to 0, where g counts the iterations from the beginning of the algorithm.
STEP 2: Randomly generate N chromosomes and allocate them as equally as possible into $b^{h-1}$ sub-populations, each with $\lfloor N/b^{h-1} \rfloor$ or $\lfloor N/b^{h-1} \rfloor + 1$ chromosomes.
STEP 3: Evaluate the fitness values of the chromosomes in each sub-population.
STEP 4: Set the current level number l to h, where l keeps track of the current level.
STEP 5: For each sub-population in level l, do the following substeps.
SUBSTEP 5.1: Execute the crossover operations on the chromosomes. Here we use the roulette wheel to select chromosomes for crossover.
SUBSTEP 5.2: Execute the mutation operations on the chromosomes.
SUBSTEP 5.3: Evaluate the fitness values of the offspring chromosomes.
SUBSTEP 5.4: Execute the selection mechanism to generate the population of the next generation.
STEP 7: If g = T, go to STEP 10 to finish the algorithm; if l = 1, go to STEP 5; if $g = \lfloor T/h \rfloor \times (h - l + 1)$, do the next step (STEP 8); otherwise, go to STEP 5 to execute the next iteration.
STEP 8: Merge the sub-populations with the same parent into one sub-population at level l − 1.
STEP 9: Set l = l − 1 and go to STEP 5.
STEP 10: Stop the execution and output the best solution in the root population to the user.
Note that different selection mechanisms can be chosen in STEP 5. For example, the elite or roulette wheel strategies can be applied here.
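The steps above can be approximated by the following compact Python sketch. It is not the authors' implementation and rests on several assumptions: a real-coded representation, minimization of a non-negative objective, roulette-wheel weights of 1/(1 + f), one-point crossover, mutation by uniform resampling of a gene, elitist survivor selection, an increment of g after every generation, and a merge every ⌊T/h⌋ generations (the interval reported in the experiments). The names `evolve_one_generation` and `hmga` and the bounds `lo` and `hi` are hypothetical.

```python
import random

def sphere(x):
    """Sphere benchmark: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)

def evolve_one_generation(pop, f, pc, pm, lo, hi, rng):
    """One GA generation (Substeps 5.1 to 5.4) on a single sub-population."""
    # Roulette wheel for minimization: the weight 1/(1+f) is an assumption
    # and requires a non-negative objective.
    weights = [1.0 / (1.0 + f(c)) for c in pop]
    offspring = []
    for _ in range(len(pop) // 2):
        a, b = rng.choices(pop, weights=weights, k=2)
        a, b = a[:], b[:]                        # work on copies of the parents
        if len(a) > 1 and rng.random() < pc:     # one-point crossover
            cut = rng.randint(1, len(a) - 1)
            a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
        for child in (a, b):                     # per-gene mutation
            for i in range(len(child)):
                if rng.random() < pm:
                    child[i] = rng.uniform(lo, hi)
            offspring.append(child)
    # Elitist selection: keep the best len(pop) of parents plus offspring.
    return sorted(pop + offspring, key=f)[:len(pop)]

def hmga(f, n, h, b, t, dim, lo, hi, pc=0.8, pm=0.005, seed=0):
    rng = random.Random(seed)
    chromosomes = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n)]
    leaves = b ** (h - 1)                        # Step 2: near-equal split
    subpops = [chromosomes[j::leaves] for j in range(leaves)]
    level = h                                    # Step 4: start at the leaves
    for g in range(1, t + 1):
        subpops = [evolve_one_generation(s, f, pc, pm, lo, hi, rng)
                   for s in subpops]
        # Steps 7 to 9: merge siblings every t // h generations until the root.
        if level > 1 and g < t and g == (t // h) * (h - level + 1):
            subpops = [sum(subpops[k:k + b], [])
                       for k in range(0, len(subpops), b)]
            level -= 1
    root = sum(subpops, [])                      # Step 10: best of the root
    return min(root, key=f)

# Small illustrative run; the sizes and bounds are not the paper's settings.
best = hmga(sphere, n=40, h=3, b=2, t=60, dim=5, lo=-10.0, hi=10.0)
print(sphere(best))
```

Because the survivor selection in this sketch is elitist, the best fitness in each sub-population never worsens, which makes the effect of each merge easy to observe in a convergence plot.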

Experimental evaluation
A series of experiments were conducted to evaluate the performance of the proposed algorithm. Four benchmark functions were used. These functions can be divided into two groups: unimodal problems and multimodal problems (Jamil & Yang, 2013). Tables 1 and 2 list the ranges of the search spaces and the best values for the unimodal and the multimodal functions, respectively (Jamil & Yang, 2013).
The Sphere and Rosenbrock functions belong to the first group. These two functions are relatively simple. It is worth mentioning that the Rosenbrock function is sometimes treated as a multimodal function. Here we put it in the first group since, in our experiments, its characteristics are closer to those of a unimodal function. The remaining two functions, the Rastrigin function and the Griewank function, are multimodal. Such functions are more difficult than unimodal ones because they have many local optima. The 2-D landscape maps of the two unimodal and the two multimodal functions (Jamil & Yang, 2013) are shown in Figures 5 and 6, respectively. These functions were used to test the proposed algorithm and to show the effects of hierarchical execution. The parameter settings for all the experiments are as follows. The dimension (number of variables) d of each test function was set at 10. The maximum total number T of generations was set at 3000. For the other GA parameters, the crossover rate Pc was set at 0.8 and the mutation rate Pm was set at 0.005. Different numbers of sub-populations were used.
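For reference, the four benchmark functions can be written in Python as below. These are the commonly used textbook forms, each with a minimum value of 0; they are given as an assumption, since the exact variants and search ranges of Tables 1 and 2 are not reproduced here.

```python
import math

def sphere(x):
    """Unimodal; minimum 0 at x = (0, ..., 0)."""
    return sum(v * v for v in x)

def rosenbrock(x):
    """Narrow curved valley; minimum 0 at x = (1, ..., 1)."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (x[i] - 1.0) ** 2
               for i in range(len(x) - 1))

def rastrigin(x):
    """Multimodal with a regular grid of local optima; minimum 0 at the origin."""
    return 10.0 * len(x) + sum(v * v - 10.0 * math.cos(2.0 * math.pi * v)
                               for v in x)

def griewank(x):
    """Multimodal; minimum 0 at the origin."""
    total = sum(v * v for v in x) / 4000.0
    product = math.prod(math.cos(v / math.sqrt(i))
                        for i, v in enumerate(x, start=1))
    return 1.0 + total - product

d = 10  # dimension used in the experiments
values_at_optimum = [sphere([0.0] * d), rosenbrock([1.0] * d),
                     rastrigin([0.0] * d), griewank([0.0] * d)]
```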
A fixed total of 100 chromosomes was used to investigate how the performance of HMGA varies. Six cases of hierarchical execution with different heights and branching degrees were adopted, as shown in Figure 7. As the algorithm of HMGA shows, the termination criterion at each level was for the sub-populations to run for T/h iterations. Figures 8-11 show the results of HMGA for the four benchmark functions.
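As a small worked example of this termination criterion, the generations at which merges take place can be computed as follows. The helper name `merge_generations` is hypothetical, and integer division is assumed when T is not a multiple of h.

```python
def merge_generations(t, h):
    """Generations at which sibling sub-populations are merged."""
    interval = t // h  # each level runs for T/h generations
    return [interval * k for k in range(1, h)]

schedule = merge_generations(3000, 3)  # [1000, 2000]
```

With T = 3000 and h = 3, the leaves run for the first 1000 generations, level 2 for the next 1000, and the root for the final 1000.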
It can be observed from the above results that hierarchical execution was especially useful on F3 and F4. According to Figures 10 and 11, HMGA performed better than the traditional GA on these two functions. This is because merging sub-populations during the execution process increases diversity.
The experimental results indicated that HMGA performed well when the function was multimodal. Figure 10 for F3 is a good example. In Figure 10, the traditional GA got trapped in a local optimum early, whereas HMGA with a large height and a high branching degree could more easily avoid being trapped in a local optimum. The reason is that HMGA could continue improving its solutions thanks to the merge operations. Since HMGA processed the sub-populations at each level for the same 1000 iterations (T/h), it found better solutions after 1000 and 2000 iterations, when the merge operations occurred.
We then show the effect of HMGA on F3 with a fixed height of 2 and different branching degrees. The results are shown in Figure 12. It can be seen that more branches brought greater benefits because the diversity was increased.
We then show the effect of HMGA on F3 with a fixed branching degree of 2 and different heights. The results are shown in Figure 13. It can be seen that a larger height brought greater benefits because of the mixing effect.

Conclusion and future work
In this paper, we have revisited multi-population mechanisms and designed a hierarchical execution architecture for GA. We have also proposed an HMGA. In the algorithm, an original GA population is first divided into several sub-populations, each of which is placed in a leaf node of the tree structure and is processed by the GA individually. When the termination criterion is met, the sub-populations that belong to the same parent node are merged and conceptually placed in the parent node. The algorithm can also be easily executed in parallel since the nodes are homogeneous and run the same code. Several experiments have been conducted to verify the performance of the proposed algorithm. The experimental results show that the proposed approach performs well when compared with single-layer execution. Besides, the proposed approach improves performance more on multimodal functions than on unimodal ones. In future work, we will conduct more experiments on other benchmarks. We will also study the effect of adding more mechanisms, such as migration, to the architecture.

Disclosure statement
No potential conflict of interest was reported by the authors.

Note on contributors
He is currently a distinguished professor at the Department of Computer Science and Information Engineering and at the Department of Electrical Engineering, National University of Kaohsiung, and a joint professor at the Department of Computer Science and Engineering, National Sun Yat-sen University, Taiwan. He got the first national flexible wage award from Ministry of Education in Taiwan. He has published more than 500 research papers in international/national journals and conferences and has planned more than 50 information systems. He is also the board member of more than 40 journals and the programme committee member of more than 500 conferences. His current research interests include knowledge engineering, data mining, soft computing, management information systems, and www applications.
Yuan-Ching Peng received his B.S. degree from the Department of Computer Science and Information Engineering, National United University, Miaoli, Taiwan, and his M.S. degree from the Department of Computer Science and Information Engineering, National Sun Yat-sen University, Kaohsiung, Taiwan, in 2008 and 2016, respectively. His research interests include data mining, evolutionary algorithms, and fuzzy theory.
Wen-Yang Lin is a professor at the Department of Computer Science and Information Engineering, National University of Kaohsiung. He received his Ph.D. in Computer Science and Information Engineering from National Taiwan University in 1994. From 2004 to 2007, he chaired the Department of Computer Science and Information Engineering at National University of Kaohsiung, and he served as the Director of Computer Science and Information Center from 2008 to 2010. His current research interests include data mining, data warehousing, and evolutionary computation. He is also interested in applying data mining techniques to the area of Healthcare and Medical Informatics. He has coedited several special issues of renowned international journals, (co-)authored more than 180 refereed publications, served as co-chair of programme committees, and organized special sessions for many international conferences, including ASONAM, IEEE SMC, WCCI, and IEA/AIE. He is a member of IEEE, the Taiwanese AI Association, and the Taiwanese Association for Social Networks.
Leon Shyue-Liang Wang received his Ph.D. from State University of New York at Stony Brook in 1984. From 1984 to 1994