Load balancing for mitigating hotspot problem in wireless sensor network based on enhanced diversity pollen

ABSTRACT This paper proposes a load balancing method to mitigate the hotspot problem in wireless sensor networks (WSNs), based on enhancing the diversity of pollen in the flower pollination algorithm (FPA). The hotspot problem in a WSN arises at spots near the base station (BS), where nodes consume more energy and drain their batteries more quickly than nodes farther from the BS. These spots are hotter than other places because of the heavy traffic from cluster members and from other cluster heads (CHs) relaying data to the BS. Enhancing pollen diversity in FPA is one way to avoid being trapped in local extrema when solving the hotspot problem. To evaluate the proposed algorithm, we first test its performance on a set of benchmark functions and then apply it to the load balancing problem in WSNs. Comparisons with several metaheuristic approaches and related clustering algorithms show that the proposed method performs better on metrics such as load balancing, execution time, energy consumption, and convergence rate.


Introduction
Rapid development in integrated circuits and information technology has led to cheap, compact sensor nodes (Chang & Huang, 2016). A Wireless Sensor Network (WSN) is made up of a set of such sensor nodes arranged in an ad-hoc fashion to monitor and interact with the physical world. Applications of WSNs have been expanding enormously as newly emerging areas are included (García-hernández, Ibargüengoytia-gonzález, García-hernández, & Pérez-díaz, 2007). For example, there are demands for environmental monitoring of landslides, earthquakes, rivers, and agriculture (Dao, Pan, & Nguyen, 2015; Nguyen, Pan, Chu, Roddick, & Dao, n.d.). In these use cases, sensor nodes are primarily energy constrained. An energy-efficient method for WSN deployment is clustering, where the sensor nodes are organized into groups termed clusters (Nguyen, Dao, Horng, & Shieh, 2016; Zhao & Chen, 2017). The node selected as the leader of a cluster is the Cluster Head (CH), and the other nodes in [...] executed by applying the enhanced diversity pollen in FPA. Finally, load balancing among the CHs to mitigate the hotspot issue in WSN is worked out, and the results are compared with other methods in the literature to evaluate the performance of the proposed algorithm.
The rest of this paper is organized as follows. Section 2 gives the related work, namely the problem statement of load balancing and a brief review of FPA. Section 3 presents the methodology of the enhanced diversity FPA. Section 4 discusses the application of the dynamic diversity FPA to load balancing for mitigating the hotspot issue. Finally, Section 5 provides the conclusion.

Energy consumption model
We assume that the traffic supported in WSNs is periodic data acquisition. A WSN consists of N sensor nodes uniformly distributed over a monitoring area. One BS is connected to the user via the Internet. Throughout this article, we let $S = \{s_1, s_2, \ldots, s_i, \ldots, s_{N-1}, s_N\}$ denote the set of nodes, where $s_i$ is node i and $|S| = N$. Sensor nodes have no geographical awareness; their positions can only be determined through information exchange in the network. Sensor nodes are identical and cannot recharge their energy. Nodes can control their transmission power level. In each round, all sensor nodes collect and transmit data packets.
A WSN uses wireless radio transceivers whose behaviour depends on various parameters, e.g. distance and energy consumption. The transmitted power attenuates as the distance between the transmitter and the receiver increases (Heinzelman, Chandrakasan, & Balakrishnan, 2002). Here, we denote by $(x_i, y_i)$ and $(x_j, y_j)$ the coordinates of nodes i and j. Suppose node i sends l bits to destination node j. The energy consumed by this transmission over distance d is calculated by Equation (1):

$$E_{Tx}(l, d) = E_{Tx\text{-}elec}(l) + E_{Tx\text{-}amp}(l, d) = \begin{cases} l E_{elec} + l \varepsilon_{fs} d^{2}, & d < d_0 \\ l E_{elec} + l \varepsilon_{mp} d^{4}, & d \ge d_0 \end{cases} \quad (1)$$

where E is the consumed energy, Tx stands for transmitting, and elec and amp indicate the electronics energy (digital coding, modulation, filtering, and signal spreading) and the amplifier energy. $d_0$ is the threshold distance between the two propagation models, and $\varepsilon_{fs}$ and $\varepsilon_{mp}$ are the power-loss coefficients of the free-space and multipath models. It can be observed from Equation (1) that when the distance between two nodes is larger than the threshold $d_0$, much more energy is consumed. This shows that multi-hop communication is more effective in WSNs. The energy consumed to receive the message can be expressed as

$$E_{Rx}(l) = l E_{elec}$$

The WSN is assumed to deploy N nodes in a two-dimensional area of size $M^2$ with k clusters. Dealing with the hotspot problem in WSN means balancing the load among the CHs, depending on the clustering technique used. Unequal clustering reduces the size of the clusters closer to the BS, and the cluster size increases as the distance between the BS and the CH increases (Soro & Heinzelman, 2005). The cluster members sense real-world parameters and transmit the sensed values to their CH. The CH receives and aggregates the data to remove redundancy and transmits the aggregated data to the BS directly or via intermediate CHs (Abbasi & Younis, 2007). In equal clustering, the cluster size is the same throughout the network, whereas in unequal clustering the cluster size is determined by the distance to the BS (Nguyen, Shieh, Dao, Wu, & Hu, 2013).
The energy consumed by a cluster for one data frame is modelled as

$$E_{cluster} = E_{CH} + \left(\frac{N}{k} - 1\right) E_{members}$$

where $N/k - 1$ is the average number of member nodes in a cluster, and $E_{CH}$ and $E_{members}$ are the energy dissipated by the CH and by a member node, respectively. The energy consumed by the WSN in one generation is then formulated from the processing energy and the transceiver energy, where $E_p$ is the energy consumed by the microcontroller and the supply voltage of a node. $E_p$ does not affect the optimization process. Therefore, the consumed energy $E_{frame}$ can be optimized through the distances used for clustering.
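The radio model above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator: the parameter values are the ones quoted later in the experimental section, the crossover distance is assumed to be $d_0 = \sqrt{\varepsilon_{fs}/\varepsilon_{mp}}$ as in the usual first-order radio model, and the function names are hypothetical.

```python
import math

# Radio parameters as quoted in the experimental section (Heinzelman et al., 2002).
E_ELEC = 50e-9        # J/bit, transceiver electronics
EPS_FS = 10e-12       # J/bit/m^2, free-space amplifier coefficient
EPS_MP = 0.0013e-12   # J/bit/m^4, multipath amplifier coefficient

# Assumption: crossover distance where both amplifier terms are equal.
D0 = math.sqrt(EPS_FS / EPS_MP)


def tx_energy(bits, d):
    """Energy (J) to transmit `bits` over distance d (m), as in Equation (1)."""
    if d < D0:
        return bits * (E_ELEC + EPS_FS * d ** 2)
    return bits * (E_ELEC + EPS_MP * d ** 4)


def rx_energy(bits):
    """Energy (J) to receive `bits`."""
    return bits * E_ELEC


def cluster_energy(e_ch, e_member, n_nodes, k):
    """Per-frame energy of one cluster: the CH plus N/k - 1 average members."""
    return e_ch + (n_nodes / k - 1) * e_member


if __name__ == "__main__":
    l = 1024
    direct = tx_energy(l, 200.0)
    # Two 100 m hops: two transmissions plus one reception at the relay.
    two_hops = 2 * tx_energy(l, 100.0) + rx_energy(l)
    print(f"d0 = {D0:.1f} m, direct = {direct:.2e} J, two hops = {two_hops:.2e} J")
```

With these values a single 200 m transmission costs several times more than two 100 m hops, which illustrates why multi-hop relaying is attractive beyond $d_0$.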

Flower pollination algorithm
FPA is a recent population-based algorithm (Yang, 2012). It draws inspiration from the two pollination processes of flowering plants: self-pollination and cross-pollination. In flowering plants, pollen is transported by pollinators according to the rules of Lévy flights, and flowers can also self-pollinate randomly. Exploration and exploitation of the search space are the two universal concepts that guide the optimization process in population-based algorithms. Self-pollination is viewed as local pollination and expresses exploitation of the search area. Because flower pollination can occur at both local and global scales, a proximity probability $p \in [0, 1]$ is used to switch between global pollination and intensive local pollination. In other words, p switches between the exploring and exploiting phases of FPA and so controls the balance of local and global pollination. Let $x_j^t$ and $x_k^t$ be solution vectors of pollen, i.e. pollen from the same plant or from different flowers. Local pollination is modelled as

$$x_i^{t+1} = x_i^t + u \left(x_j^t - x_k^t\right)$$

where u is a random variable drawn from a uniform distribution on [0, 1]. In cross-pollination, which is considered global, the gametes of flower pollen are carried by pollinators, e.g. insects, which can often fly and move over a longer range. A Lévy flight can express insects flying over long distances with steps of various lengths and is used to mimic this characteristic efficiently. Let L be drawn from a Lévy distribution:

$$L \sim \frac{\lambda \Gamma(\lambda) \sin(\pi \lambda / 2)}{\pi} \frac{1}{s^{1+\lambda}}, \quad s \gg s_0 > 0$$

where $\Gamma(\lambda)$ is the gamma function, and this distribution is valid for large steps $s > 0$. $L(\lambda)$ is the step size, a parameter that corresponds to the strength of the pollination. The solution vectors are updated by global (cross-) pollination as

$$x_i^{t+1} = x_i^t + \gamma L(\lambda) \left(g^* - x_i^t\right)$$

where $g^*$ is the best solution found so far, $\gamma$ is a scaling factor that controls the step size, and t is the current generation or iteration. When the pollination update is evaluated by the fitness function, if $x_j^t$ and $x_k^t$ come from the same plant or the same selected population, the update becomes a local random walk. The switch probability p is set to 0.8 as a preliminary value, which may work well for some applications [14]. The basic steps of FPA are described as pseudocode in Figure 1.
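The two update rules can be sketched as a small, self-contained FPA loop. This is an illustrative sketch under stated assumptions rather than the reference implementation: Lévy steps are sampled with Mantegna's algorithm, a uniform draw below p selects global pollination (as in common FPA descriptions), a greedy replacement keeps only improving candidates, and the names `levy_step` and `fpa_minimize` as well as the default `gamma=0.1` are hypothetical.

```python
import math
import random


def levy_step(lam=1.5, rng=random):
    """One Lévy-distributed step via Mantegna's algorithm (assumed sampler)."""
    sigma = (math.gamma(1 + lam) * math.sin(math.pi * lam / 2)
             / (math.gamma((1 + lam) / 2) * lam * 2 ** ((lam - 1) / 2))) ** (1 / lam)
    u = rng.gauss(0, sigma)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / lam)


def fpa_minimize(f, dim, lower, upper, n=20, iters=200, p=0.8, gamma=0.1, seed=0):
    """Minimize f over [lower, upper]^dim with a basic flower pollination loop."""
    rng = random.Random(seed)

    def clip(v):
        return min(max(v, lower), upper)

    pop = [[rng.uniform(lower, upper) for _ in range(dim)] for _ in range(n)]
    fit = [f(x) for x in pop]
    best_i = min(range(n), key=fit.__getitem__)
    best, best_fit = pop[best_i][:], fit[best_i]

    for _ in range(iters):
        for i in range(n):
            if rng.random() < p:
                # Global pollination: Lévy flight towards the current best g*.
                cand = [clip(x + gamma * levy_step(rng=rng) * (g - x))
                        for x, g in zip(pop[i], best)]
            else:
                # Local pollination: random walk between two other pollen.
                j, k = rng.sample(range(n), 2)
                u = rng.random()
                cand = [clip(x + u * (a - b))
                        for x, a, b in zip(pop[i], pop[j], pop[k])]
            c_fit = f(cand)
            if c_fit < fit[i]:  # greedy replacement (assumption)
                pop[i], fit[i] = cand, c_fit
                if c_fit < best_fit:
                    best, best_fit = cand[:], c_fit
    return best, best_fit


def sphere(x):
    return sum(v * v for v in x)


if __name__ == "__main__":
    print(fpa_minimize(sphere, dim=5, lower=-10.0, upper=10.0))
```

Because replacement is greedy, the best fitness never gets worse from one iteration to the next, which makes the sketch easy to sanity-check on a simple function such as the sphere.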

Methodology of diversity pollen
We deploy several crowds in the FPA optimization algorithm by dividing the population into subpopulations, and we use a neighbourhood topology to share the available fitness resources. Such subpopulations can evolve independently in regular iterations to locate better areas of the search space (Liang & Suganthan, 2005). Information is exchanged among the subpopulations whenever the communication strategy is triggered, to gain the benefit of cooperation and exploitation. To adjust the proportion of global and local search, a dynamic switching probability strategy is applied in each exchanging period to create dynamic diversity. Once an activation schedule is set with a stage iteration, a new configuration of small groups starts searching the promising area towards the global target (Yang, 2012).
In FPA, a switching probability $p \in [0, 1]$ controls the proportion of local and global search, and it is a constant. To make the algorithm dynamic, we use a switching parameter that shares the global and local search processes within the exchanging period. The switching probability p is updated by a formula in which t and iter are the exchanging period and the current iteration, and a and b are constants in the range [0, 1] with a > b. p stays in the range [0, 1]; a is set to 0.55 and b to 0.1. Fitness sharing of available resources is commonly used in the niching technique. Each group in FPA has its own pollen, known as search agents, and its finest agents according to the fitness evaluation function. After each running period, the finest agents among the pollen of a group are assigned to other groups, where they replace or update the poorer agents. The enhanced diversity FPA is built on both FPA optimization and the niching technique. We consider two characteristics of the region structure: small size and communication. Small size means that small groups are used to create diversity in the local search. FPA with small groups performs better on some constrained multimodal problems. The small groups are obtained by dividing the FPA population into subgroups, and each subgroup uses its members to search for a better local area of the search space. Communication means that the better information obtained by each group during optimization is exchanged among the groups, so that individuals cooperate and exploit their way through local extrema towards the global optimum. Since the small groups search using their own best historical information, they converge easily to a local optimum.
Let n be the number of pollen in a group, called the subpopulation size of the group. Let $G_j$ be a group, where j is the index of the group, j = 0, 1, 2, …, m, and m is the number of groups. The top k pollen of group $G_j$, ranked by the objective function, are copied to $G_{j+1}$ to replace the same number of its worst pollen whenever the iteration t reaches a multiple of the exchanging period R. The new configuration of the groups starts searching every R generations with the dynamic switching probability triggered. Useful information is thus obtained by exchange among the groups, and the resulting diversity of the population lets the algorithm perform better on complex multimodal problems.
The steps of the diversity agent can be described as shown in Figure 2.
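The exchange step described above, in which the best pollen of one group overwrite the worst pollen of the next, can be sketched as follows. This is a minimal sketch assuming minimization and a ring topology in which the last group feeds the first; neither detail is stated explicitly in the text, and the function name `migrate` is hypothetical.

```python
def migrate(groups, fitness, k):
    """Copy the k best pollen of each group G_j over the k worst of G_{j+1}.

    groups  : list of groups, each a list of solution vectors (lists of floats)
    fitness : callable returning the value to minimize for one vector
    k       : number of pollen exchanged per group
    Assumption: ring topology, so the last group sends to the first one.
    """
    m = len(groups)
    # Snapshot the elites first so every group sends its pre-exchange best.
    elites = [sorted(g, key=fitness)[:k] for g in groups]
    new_groups = []
    for j, g in enumerate(groups):
        incoming = elites[(j - 1) % m]
        # Keep everything except the k worst pollen, then add copies of the elites.
        survivors = sorted(g, key=fitness)[:len(g) - k]
        new_groups.append(survivors + [list(x) for x in incoming])
    return new_groups


if __name__ == "__main__":
    groups = [[[1.0], [5.0], [9.0]], [[2.0], [6.0], [8.0]]]
    print(migrate(groups, lambda x: x[0] ** 2, k=1))
```

In a full run, `migrate` would be called once every R iterations between the per-group FPA updates, so group sizes stay constant while good solutions spread around the ring.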

Simulation results
Six selected test functions (Das & Suganthan, 2011; Jamil & Yang, 2013) are used to validate the performance of the proposed enhanced diversity FPA (namely dFPA). The output values obtained on the test functions are averaged over 25 runs with different random seeds. The optimization goal is to minimize the outcome of the benchmark functions. Let $X = \{x_1, x_2, \ldots, x_n\}$ be an n-dimensional real-valued vector. The population size is the same for the proposed algorithm and the original one in all experiments. The details of the parameter settings of FPA can be found in Yang (2012). Table 1 lists the boundary initialization, dimension (Dim), and total iterations (MaxIters) for the test functions.
The parameters for FPA and dFPA are set as follows. The total population size N is set to 80 (for dFPA, the total population size n × m is 20 × 4). The initial probability p is set to 0.55, λ is set to 1.5, and the dimension d is set to 30. The constants α and β of the switching probability are set to 0.55 and 0.1, respectively. The exchanging period is set to 20. Let MaxIters be the maximum number of iterations; it can be set to 500, 1000, 2000, or 10,000 for each test function.
Table 2 compares the performance quality and execution time of the proposed algorithm with those of FPA on the selected optimization problems. The RD (ratio deviation) and metric comparison columns in Table 2 give the percentage deviation of the proposed approach from the original method in running time and performance metric, respectively. On all test problems, the proposed algorithm performs better than the original algorithm. In the best case, the improvement of the proposed method over FPA is up to 43%, whereas in the worst case it is only 5%. Averaged over all test cases, the proposed algorithm offers a 24% improvement over FPA. However, the execution times of the two are similar.
The plotted curves show the performance on a base-10 logarithmic scale. Convergence is measured from the convergence curves, and time consumption is the execution time of the dFPA and FPA methods on each optimization problem. The performance quality of the proposed dFPA is higher than that of FPA on all test functions in terms of accuracy and convergence speed. However, the time requirements of the two methods are equal.

Load balancing for mitigating the hotspot problem in WSN
The hotspot problem in WSNs with multi-hop communication is caused by unbalanced energy consumption. The nodes that consume the most energy are those close to the BS, because of the heavy traffic flowing through them. In this section, the dFPA is applied to select the optimal group of sensors as CHs and to optimize the parameters of the objective function so that clusters are formed in a way that prevents unbalanced energy consumption. Optimizing the total communication distance from the cluster members to the CHs and from the CHs to the BS can increase the energy saved. The experiments consist of the following steps: modelling the objective function, describing a proper pollen representation that maps a solution to the CHs, optimizing the CHs, and comparing the results.

Objective function
We assume a WSN model in which all sensor nodes are randomly deployed, along with a few CHs, in the desired area. A node can be assigned to any CH within its communication range. Therefore, a particular sensor node can be configured for several pre-specified CHs. Each sensor node thus has a list of candidate CHs, but it must be allocated to only one of them. The sensor nodes collect local data and send it to their corresponding CHs. On receiving the data, the CHs aggregate it to reduce redundancy within their cluster. The CHs then send the aggregated data to the BS directly or through intermediate CHs.
The terminology used in this section on balanced load clustering is as follows: the set of CHs is denoted by $H = \{h_1, h_2, \ldots, h_m\}$, with n > m; the set of sensor nodes is denoted by $S = \{s_1, s_2, \ldots, s_n\}$; $l_i$ is the traffic load contributed by sensor node $s_i$, $s_i \in S$; $H_i$ is the set of CHs to which sensor node $s_i$ may be assigned, where $s_i \in S$ and $H_i \subseteq H$. Let $L_j$ be the load of CH $h_j$. The overall load of each CH is calculated as

$$L_j = \sum_{s_i \text{ assigned to } h_j} l_i$$

The traffic load of the sensor nodes can be estimated in advance, together with the CHs of the clustering in the WSN. The load balancing problem is then addressed with the primary objective of minimizing the overall load of the CHs. Let $a_{ij}$ be a Boolean variable such that $a_{ij} = 1$ if sensor node $s_i$ is assigned to CH $h_j$ and $a_{ij} = 0$ otherwise. The optimization problem of load balanced clustering (LBC) can then be formulated as an Integer Linear Programme:

$$\text{Minimize } L = \max\{L_j \mid 1 \le j \le m\}$$

subject to

$$\sum_{h_j \in H_i} a_{ij} = 1, \quad 1 \le i \le n \quad (12)$$

and

$$\sum_{i=1}^{n} l_i a_{ij} \le L, \quad 1 \le j \le m \quad (13)$$

The constraint in Equation (12) states that a sensor node is assigned to one and only one CH, and Equation (13) states that the load of a CH, carrying the burden of all its attached sensor nodes, must not exceed the overall maximum traffic load of the CHs. The average load and the standard deviation of the CH load are optimized to balance the network load, using the clustering evaluation model to measure performance. Table 3 shows an example with 4 CHs and 10 sensor nodes in a WSN, i.e. $H = \{h_1, h_2, h_3, h_4\}$ and $S = \{s_1, s_2, \ldots, s_{10}\}$. The dimension of the pollen position vector equals the number of sensor nodes, i.e. D = 10. The table lists all the sensor nodes and the CHs to which each of them can be assigned. The initial population of pollen vectors is generated randomly. For example, the random vector indicated in the first column of Table 3 is $X^t_{ij}$ = [0.45, 0.18, 0.38, 0.66, 0.86, 0.63, 0.23, 0.41, 0.34, 0.72].
It should be noted that a pollen represents a complete clustering solution, because all 10 sensor nodes are assigned to their corresponding CHs. In this example, suppose $s_1$ can connect to one of the CHs $h_1$, $h_2$, and $h_3$. Similarly, $s_2$ selects $h_3$ or $h_2$ from $H_2$, and so on. The first element of the pollen vector is x = 0.45, and Ceiling(x · |H_1|) = Ceiling(0.45 × 3) = 2; therefore, the second CH in $H_1$, i.e. $h_2$, is selected for $s_1$, as shown in Table 3. In the same way, all the sensor nodes are allocated to a CH using the randomly generated vector. The final assignment of the sensor nodes to their corresponding CHs is shown in the last column of Table 3.
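The mapping from a pollen position to a CH assignment can be sketched as below. The ceiling rule is the one described in the text; the function name `decode_pollen`, the guard for a component exactly equal to 0, and every candidate list other than that of $s_1$ are illustrative assumptions, since Table 3 is not reproduced here.

```python
import math


def decode_pollen(position, candidates):
    """Map each component x in (0, 1] to one CH from that node's candidate list.

    position   : list of floats, one per sensor node
    candidates : list of candidate-CH lists, candidates[i] playing the role of H_i
    The index Ceiling(x * |H_i|) is 1-based, as in the text.
    """
    assignment = []
    for x, chs in zip(position, candidates):
        idx = max(1, math.ceil(x * len(chs)))  # guard: x == 0 maps to the first CH
        assignment.append(chs[idx - 1])
    return assignment


if __name__ == "__main__":
    # s_1 may join h1, h2 or h3; the second list is a hypothetical H_2.
    candidates = [["h1", "h2", "h3"], ["h3", "h2"]]
    print(decode_pollen([0.45, 0.18], candidates))
```

Because every component always selects exactly one CH from the node's own list, any real-valued pollen decodes to a feasible assignment, so the one-CH-per-node constraint holds by construction.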
We build a fitness function to evaluate the pollen of the initial population as follows. Note that load balancing of the CHs not only minimizes the maximum CH load but also concerns the load distribution among all the CHs. The average load per CH is

$$\mu = \frac{1}{m} \sum_{i=1}^{n} l_i$$

where m and n are the number of CHs and sensor nodes, respectively. We therefore construct the fitness function from the standard deviation ($\sigma$) of the CH load, which reflects how evenly the load is distributed per cluster. The standard deviation of the CH load is computed by

$$\sigma = \sqrt{\frac{1}{m} \sum_{j=1}^{m} \left(\mu - L_j\right)^2}$$

where $L_j$ is the overall load of CH $h_j$. The smaller the standard deviation, the higher the fitness value. The fitness function is therefore chosen as the reciprocal of the standard deviation of the CH load:

$$\text{Fitness} = \frac{1}{\sigma} \quad (16)$$
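A fitness of this kind, the reciprocal of the standard deviation of the CH loads, can be computed as in the sketch below. The function names are hypothetical, the population form of the standard deviation (dividing by the number of CHs) is assumed, and returning infinity for a perfectly balanced assignment is an illustrative convention rather than something the text specifies.

```python
import math


def ch_loads(assignment, node_loads, heads):
    """Sum the traffic load of the sensor nodes attached to each CH."""
    loads = {h: 0.0 for h in heads}
    for head, load in zip(assignment, node_loads):
        loads[head] += load
    return loads


def load_balance_fitness(assignment, node_loads, heads):
    """Reciprocal of the standard deviation of the CH loads (higher is better)."""
    loads = ch_loads(assignment, node_loads, heads)
    m = len(heads)
    mu = sum(node_loads) / m  # average load per CH
    sigma = math.sqrt(sum((mu - load) ** 2 for load in loads.values()) / m)
    # Assumption: a perfectly even assignment gets infinite fitness.
    return math.inf if sigma == 0 else 1.0 / sigma


if __name__ == "__main__":
    heads = ["h1", "h2"]
    node_loads = [1.0, 1.0, 1.0, 1.0]
    print(load_balance_fitness(["h1", "h1", "h1", "h2"], node_loads, heads))
    print(load_balance_fitness(["h1", "h1", "h2", "h2"], node_loads, heads))
```

In the example, three nodes on one CH and one on the other give loads of 3 and 1 around a mean of 2, so the standard deviation is 1 and the fitness is 1, while the even split is ranked strictly higher.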

Experimental results
The simulated network of N nodes (N = 100, 200, ...) is distributed in a two-dimensional problem space [0:100, 0:100]. The test platform is an n × n grid in which the N nodes are randomly distributed, together with the sink, between (x = 0, y = 0) and (x = n, y = n), with N set to 100, 200, 300, and 400 nodes.
The initial values of the communication energy parameters are $E_j$ = 0.5 J, $E_{elec}$ = 50 nJ/bit, $\varepsilon_{mp}$ = 0.0013 pJ/bit/m$^4$, $\varepsilon_{fs}$ = 10 pJ/bit/m$^2$, $E_{DA}$ = 5 nJ/bit/signal, an initial number of CHs k = 10-20, and l = 1024 bits (Heinzelman et al., 2002). Test cases with various grid sizes can affect the solution search rate. The number of sensor nodes was therefore increased or decreased according to the different values of N to evaluate effectiveness, timeliness, and reliability. Table 4 illustrates the cases of N from 100 to 400. It shows that as the density of nodes in a cluster increases, the chance for a node to become a CH is higher. The number of CHs is about 10% of the total number of nodes; the percentage may vary if the nodes are unevenly distributed.
We judge the load balancing performance by calculating the standard deviation of the CH loads and the effect of the number of sensor nodes on it. The standard deviation of the CH load indicates how evenly the load is distributed per CH. The fitness function in Equation (16) is evaluated over 2000 generations in each of 25 runs. The final results are the average of the outcomes of all runs and are compared with related clustering approaches, i.e. FPA (Yang, 2012), DE (Kuila & Jana, 2014), GA (Hussain et al., 2007), and LBC (Gupta & Younis, 2003). The algorithms are run with the same sensor node load, varying the number of sensor nodes from 100 to 300 with 15 and 20 CHs. Our proposed method produces better load balancing than the others for equal sensor node loads, as shown in Figure 6(a,b).
Figure 8(a) compares the convergence rate of our proposed method with the FPA, DE, GA, and LBC approaches. We ran our algorithm for 200 sensor nodes and 20 CHs. The proposed method converges faster than the other algorithms. Figure 8(b) compares the average residual energy of the proposed approach with the FPA, DE, GA, and LBC approaches for a network of 200 sensor nodes with 10% of the sensor nodes acting as CHs. The proposed method sustains a larger number of rounds than the other methods. Figure 9 compares the number of rounds until the first CH dies and the execution time of the proposed method with those of the FPA, DE, GA, and LBC methods. The proposed method performs better than the other methods in both execution time and the number of rounds until the first CH dies.

Conclusion
In this paper, we proposed an enhanced diversity pollen scheme in FPA for load balancing in WSN and for global optimization. The significance of a dynamically diverse population is that it avoids easily missing a nearby global optimum and converging instead to a local optimum when optimization algorithms solve complex constrained problems. Several evolving subpopulations of FPA and a neighbourhood topology technique are used to model the diversity pollen. A new promising area is created after communication among the subgroups is triggered.
Dynamic switching between global and local search in each exchanging period provides diversity in the search process. In the experimental sections, balanced load clustering for mitigating the hotspot issue in WSN and a set of selected optimization problems are used to evaluate the accuracy and the computational time of the proposed method. According to the experimental results on the test functions, the proposed method is 24% faster than FPA in terms of convergence. For the load balancing clustering problem, the proposed algorithm performs better than the other clustering algorithms, such as GA, DE, FPA, and LBC, in terms of balanced CH load for both equal and unequal clusters in WSN.