An improved search space resizing method for model identification by standard genetic algorithm

ABSTRACT In this paper, a new improved search space boundary resizing method for an optimal model's parameter identification for continuous real time transfer function by standard genetic algorithms (SGAs) is proposed and demonstrated. Premature convergence to local minima, as a result of search space boundary constraints, is a key consideration in the application of SGAs. The new method improves the convergence to global optima by resizing or extending the upper and lower search boundaries. The resizing of the search space boundaries involves two processes, first, an identification of initial value by approximating the dynamic response period and desired settling time. Second, a boundary resizing method derived from the initial search space value. These processes brought the elite groups within feasible boundary regions by consecutive execution and enhanced the SGAs in locating the optimal model's parameters for the identified transfer function. This new method is applied and examined on two processes, a third-order transfer function model with and without random disturbance and raw data of excess oxygen. The simulation results assured the new improved search space resizing method's efficiency and flexibility in assisting SGAs to locate optimal transfer function model parameters in their explorations.


Introduction
One of the most common problems that may be encountered during model's or control's parameters optimization by optimization algorithms is premature convergence due to search space boundary constraints. An optimization process has prematurely converged to a local optimum if it is no longer able to explore other parts of the search space region than the area currently being explored and there exists another region that may contain a superior solution (Ursem, 2003). Particularly, a set of transfer function parameters to be optimized for a continuous higher order model distinguishes the dynamic characteristics of the system. At present, some algorithms and techniques are in application for improving the search space boundary constraints. Figure 1 illustrates several common phenomena (factors) to take into account when the initial population is generated randomly.
The search space selection is one of the factors that lead to premature convergence. A well-selected search space region will bring the elite group within the feasible region to avoid premature convergence (Rajarathinam, Gomm, Yu, and Abdelhadi, 2015). In fact, well-selected search space regions will sustain the population diversity. Preservation of search space and population diversity is correlated with sustaining a good balance between exploration and exploitation (Weise, 2009). Exploration examines new and unknown regions in the search space, and exploitation applies the previously visited and identified information to help locate the elite solution (Rajarathinam et al., 2015).
CONTACT J. Barry Gomm j.b.gomm@ljmu.ac.uk
A brief overview of a variety of methods for sustaining population diversity and selective pressure to avoid premature convergence was given by Deepti and Shabina (2012). Nakisa, Nazri, Rastgoo, and Salwani (2014) presented a comprehensive survey of the various particle swarm optimization (PSO)-based algorithms, where PSO is a computational search and optimization method based on the social behaviour of flocking birds or schooling fish. A number of basic variations have been developed to solve the premature convergence problem and improve the quality of the solution found by PSO. Chaiwat and Prabhas (2011) proposed a self-adaptation technique to control the population diversity without explicit parameter setting. The technique is based on the competition of preference characteristics in mating. Based on simulation results, the adaptive technique has the potential to adapt the diversity of the population for a given problem without knowledge of the correct parameter settings, and it performs well in finding the solution. Suri, Rakesh, and Pardeep (2013) proposed augmenting a genetic algorithm (GA) with an elitism technique, which allows the best solution from any generation to be carried across to the new population and so be sustained. The social disaster technique (SDT) was used when premature convergence occurred; the problem of premature convergence may be avoided by creating random offspring and inserting diversity into the population (Ramadan, 2013). The cited work attempted to use both elitism and SDTs spanning various generations: a previous solution was chosen, and the performance of elitism and SDTs on the same problem was examined. Malik and Wadhwa (2014) proposed combining a dynamic genetic clustering algorithm with an elitist technique to prevent premature convergence. The proposed technique gives the mutation and crossover operators strong immunity against being trapped in local optima.
Based on the complex Box technique, a boundary search method was proposed for optimization problems in which the optimal solution lies at the boundary (Zhu, Li, and Zhang, 1984). It was demonstrated and verified for cases where an optimal solution exists at the boundary of the constraint set. Recently, a modified GA was applied to solve the n-Queens problem on a chessboard (Heris and Oskoei, 2014). Holism and random choices make it difficult for standard genetic algorithms (SGAs) to search a large space. To reduce this difficulty, the minimal conflicts algorithm was combined with SGAs. The minimal conflicts algorithm gives the SGAs a partial view through a local search of the space, but the combination of algorithms is time-consuming.
An analysis and a comparative study on the effect of applying three boundary extension methods [boundary extension by mirroring, boundary extension with extended selection by shortest distance selection, and boundary extension with extended selection by shortest distance selection with ageing] from a view point of the sampling bias is contributed (Tsutsui and Goldberg, 2001). The studies disclosed that using the smaller sampling bias had good performance on both functions which have their optimum at or near the boundaries of the search space, and functions which have their optimum at the centre of the search space. However, the named three boundary extension methods are extending the search space of lower and upper boundaries simultaneously if the optimum value is located near or at either boundary, which may not be necessary. A similar approach called the self-adaptive boundary search strategy for penalty factor selection within SGAs was proposed (Wu and Simpson, 2002). This approach guides the SGA to preserve around constraint boundaries and improves the efficiency of attaining the optimal or near-optimal solution. A penalty factor within a GA is adapted and co-evolved such that the GA population is adjusted (or forced) to search around the upper or lower boundaries of the feasible and infeasible regions. The penalty factor represents a decision variable within the population string to force the GA to search an optimal solution without altering the search space. A technique for resolving the structural optimization difficulties in quantizing the subjective uncertainties of active constraints is proposed by fuzzy logic formulation (Wu and Wang, 1992).
Another method to reduce prematurity and to sustain the population diversity was proposed using the niche genetic algorithm (NGM) associated with an isolation mechanism (Lin, Hao, Ji, and Dai, 2000). A comparison study of NGM and the annealing GA found that the annealing GA performs better with respect to premature convergence (Tu and Mei, 2008). However, the annealing GA is time-consuming because of its extra procedures. Another method, named the accelerating genetic algorithm (AGA), was proposed to resize the feasible region into the elite individual's adjacent region for better local searching and convergence (Jin, Yang, and Ding, 2001). Search space boundary reduction for the candidate diameter of each link by the pipe index vector and critical path method, along with modified genetic operator derivatives, was proposed (Mahendra, Gupta, and Bhave, 2008; Vairavamoorthy and Ali, 2005). Further, an improved AGA based on the saddle distribution, in which random individuals are added to the initial population to increase the ability to search for the optimal solution, was proposed (Xu, Zhong, and Tang, 2012).
Direct identification of continuous-time transfer function models from sampled input-output process data is considered in this paper. There are some established continuous-time identification methods, such as those using instrumental variables and frequency responses (Fengwei, Hugues, and Marion, 2015; Garnier and Young, 2004; Rao and Unbehauen, 2006). Continuous-time identification methods need special filtering functions to generate the necessary signal time derivatives for estimation from sampled data (Garnier and Young, 2004; Rao and Unbehauen, 2006). This is not a trivial task, and data usually need to be sampled faster than for discrete-time model identification (Rao and Unbehauen, 2006). Nonlinear estimation algorithms may also be required (Fengwei et al., 2015). This paper investigates the application of SGAs to the problem of identifying the parameters of continuous-time models, specifically Laplace transfer functions including time delays, without the need for additional signal filtering or particularly fast data sampling.
A literature review discloses that most GA researched techniques in general applications have an initial knowledge, or value, of search space parameters or they are randomly identified by trial and error technique at initial execution. Further, some research papers are literally not adjusting the feasible search space region to the centre if the optimum value is located near or at the boundary. Also, the discussed research information involves complex mathematical approaches and inevitably can be time-consuming for convergence. This paper proposes and investigates a new improved search space method, named the predetermined time constant approximation (PTcA), to enhance the SGAs exploration and exploitation towards the global optima for identification of continuous-time transfer functions. This method employs a novel search space boundary extension technique by PTcA, which guides the search to concentrate on optimal values within the boundaries of the feasible region of the solution space and adjusting the feasible region towards the centre according to the optimum value. Further, the proposed technique introduces a method to predetermine the initial values of continuous higher order model parameters according to the transient response, instead of an initial random selection.
The structure of this paper is as follows. First, the SGAs' convergence states for an optimal value under search space boundary constraints are discussed. Second, the approximation process of the predetermined time constant method is discussed, followed by the search space boundary extensions for better exploration and optimal exploitation. Finally, the effectiveness of the PTcA method is assessed with two processes: a third-order transfer function with and without random disturbance, and real numerical data from an excess oxygen (EO$_2$) process. In addition, a fourth order model for EO$_2$ is compared with the EO$_2$ process data and a third-order model of EO$_2$ to measure the effectiveness of the proposed methods. The proposed methods are developed and tested in simulations based on Matlab/Simulink models.

Prior knowledge of specific problem
In numerous optimization problems, the functional information related to the problem may exist, and can frequently be applied a priori to effectively assist SGAs to execute well in terms of rate of convergence. If there exists prior information about regions in the search space where the optimal points may be located, a percentage of the population can be initialized by selecting candidate solutions from these promising regions. This approach can be applied whenever one searches to improve on previously identified 'optimal' solutions.
In this way, the SGAs commence with a set of potentially above-average solutions, which can significantly improve the rate of convergence of the SGAs, whereas the crossover and mutation operators theoretically ensure that the SGAs are still able to explore different regions in the search space (Vlachos, 2000). Such heuristic initializations of the population should be applied carefully in order to avoid premature convergence, the situation where the SGAs may converge to a sub-optimal region in the search space.

Convergence constraints by search space boundary
In most situations, selecting the search space boundary regions is delicate if there is no prior knowledge of the location of the optimum value. Thus, a randomly selected search space boundary is a significant factor that often leads the SGAs to converge to and get trapped in local optima, resulting in sub-optimal solutions. This is particularly so if the optimal value lies near or outside the boundary, as illustrated in Figure 2, where $SB_{Lower}$ is the lower search boundary, $SB_{Upper}$ is the upper search boundary, $GO$ is the genetic operator for convergence precision and $X_i$ is the optimal value. The SGAs' convergence under search space boundary constraints can be classified into three states:
• State 1 - If the optimal value ($X_i$) is located within a uniformly distributed elite group around the region $[X_i - GO, X_i + GO]$, the genetic operators have a higher probability of converging to the global optimum. Thus, the randomly generated initial population within the well-distributed elite group search boundary has a higher probability of exploring and exploiting a better parent chromosome. Further, the selected parent chromosome will be evaluated by the genetic precision process (selection, crossover and mutation) to produce fitter offspring without any convergence constraint.
• State 2 - If the optimal value ($X_i$) is located near a search boundary ($SB_{Lower}$ or $SB_{Upper}$), the SGAs will possibly converge to local minima. An elite group that is distributed near the boundary may have a part located outside the boundary. If the part of the elite group outside the boundary holds the genetic information of an optimal value, the genetic operators will struggle to exploit the optimal value and the exploration process will be retarded. As a result, the search space boundary constraints will lead the SGAs to converge to local minima.
• State 3 - If the optimal value ($X_i$) is located outside the search boundaries, the SGAs will fail to explore and exploit the optimal value. The simulation may be retarded and stopped.
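The three states can be illustrated with a minimal Python sketch. It assumes that "near the boundary" means the elite-group region $[X_i - GO, X_i + GO]$ crosses a search boundary; that threshold, the function name `classify_state` and the treatment of $GO$ as a half-width are illustrative assumptions, not definitions taken from the method.

```python
def classify_state(x_opt, sb_lower, sb_upper, go):
    """Return 1, 2 or 3 for the convergence states described above.

    go is treated as the half-width of the elite-group region
    [x_opt - go, x_opt + go]. Reading "near the boundary" as "this region
    crosses a search boundary" is an assumption of this sketch.
    """
    if sb_lower > sb_upper or go < 0:
        raise ValueError("expected sb_lower <= sb_upper and go >= 0")
    if x_opt < sb_lower or x_opt > sb_upper:
        return 3  # optimum outside the search boundaries
    if x_opt - go < sb_lower or x_opt + go > sb_upper:
        return 2  # part of the elite group falls outside a boundary
    return 1      # elite group lies fully inside the boundaries


# Hypothetical numbers, for illustration only.
for x in (15.0, 6.0, 3.0):
    print(x, classify_state(x, sb_lower=5.0, sb_upper=25.0, go=2.0))
```

In practice the optimum is unknown, so a check of this kind would be applied to the best value returned by an execution rather than to the true $X_i$.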

Predetermined time constant approximation
To improve the choice of search space boundaries for optimal model parameter identification, a new boundary resizing technique without a complex mathematical constraint is introduced here, named PTcA. The proposed PTcA method provides prior knowledge of the higher order pole coefficients of a transfer function, named the initial predetermined time constant ($Ts_{p(Initial)}$) value, from the dynamic response of a process. Applying the $Ts_{p(Initial)}$ value gives an approximation of the elite group distribution within a feasible boundary region by resizing the boundary region at the initial stage. This gives the genetic operators the opportunity to locate the optimal parameter values rapidly without any constraint. Therefore, identification of the denominator polynomial coefficients, which provide a foundation for determining a system's dynamic characteristics, is primarily considered here.
Consider a system that can be modelled by the general order differential equation

$$a_n \frac{d^n y}{dt^n} + a_{n-1} \frac{d^{n-1} y}{dt^{n-1}} + \cdots + a_1 \frac{dy}{dt} + y = K_p f(t - \theta), \quad (1)$$

where $f(t - \theta)$ is the input signal or forcing function with time delay $\theta$, $y(t)$ is the output signal and $K_p$ is the process gain. Assuming zero initial conditions, $y(0) = 0$, $y'(0) = 0, \ldots$, and taking the Laplace transform of Equation (1) gives the general order transfer function of the form

$$G(s) = \frac{K_p}{a_n s^n + a_{n-1} s^{n-1} + \cdots + a_1 s + 1} e^{-\theta s}, \quad (2)$$

where $a_n, \ldots, a_1$ are the coefficients of the denominator polynomial, which defines the components of the homogeneous response. By applying the PTcA method, the coefficients of the denominator polynomial, $a_n, \ldots, a_1$ in Equation (2), are approximated by their $Ts_{p(Initial)}$ values:

$$\frac{K_p}{Ts_{p(Initial)n} s^n + Ts_{p(Initial)n-1} s^{n-1} + \cdots + Ts_{p(Initial)1} s + 1} e^{-\theta s}.$$
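To make the role of Equation (2) in the identification concrete, the following Python sketch simulates the unit-step response of a candidate parameter set and scores it by the sum of squared errors. It is an illustrative stand-in only: the fixed-step Runge-Kutta integration, the step size `dt` and the function names `step_response` and `sse` are assumptions, and the simulations reported in this paper are based on Matlab/Simulink models.

```python
def step_response(kp, coeffs, theta, t_end, dt=0.05):
    """Unit-step response of Kp*exp(-theta*s) / (a_n s^n + ... + a_1 s + 1).

    coeffs lists the denominator coefficients [a_n, ..., a_1]; the constant
    term is fixed at 1 as in Equation (2). Returns (times, outputs).
    """
    n = len(coeffs)
    a_n = coeffs[0]
    lower = coeffs[1:][::-1]  # [a_1, ..., a_(n-1)], paired with y', ..., y^(n-1)

    def deriv(t, x):
        # x = [y, y', ..., y^(n-1)]; the delayed unit step switches on at t = theta
        u = 1.0 if t >= theta else 0.0
        highest = (kp * u - x[0] - sum(a * d for a, d in zip(lower, x[1:]))) / a_n
        return x[1:] + [highest]

    def shifted(x, k, h):
        return [xi + h * ki for xi, ki in zip(x, k)]

    x = [0.0] * n  # zero initial conditions
    times, outputs = [0.0], [0.0]
    for step in range(int(round(t_end / dt))):
        t = step * dt
        # classical fourth-order Runge-Kutta step
        k1 = deriv(t, x)
        k2 = deriv(t + dt / 2, shifted(x, k1, dt / 2))
        k3 = deriv(t + dt / 2, shifted(x, k2, dt / 2))
        k4 = deriv(t + dt, shifted(x, k3, dt))
        x = [xi + dt / 6 * (a + 2 * b + 2 * c + d)
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        times.append(t + dt)
        outputs.append(x[0])
    return times, outputs


def sse(measured, simulated):
    """Sum of squared errors between two equally sampled responses."""
    return sum((m - s) ** 2 for m, s in zip(measured, simulated))
```

A candidate $(K_p, a_n, \ldots, a_1, \theta)$ proposed by a search algorithm could then be scored as `sse(measured, step_response(kp, coeffs, theta, t_end)[1])`, provided the measured data are sampled on the same time grid.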
As discussed earlier, it is difficult to approximate the higher order model's denominator polynomial coefficients without prior knowledge. However, the initial values of $K_p$ and $\theta$ can be easily approximated by observing the magnitude of a step response from $C(t) = 0$ to $C(t) = C_{ss} \pm \delta(\%)$ and the transmission delay from $t = 0$ to $t = \theta$, respectively. Therefore, only the denominator polynomial coefficients are considered here.
The PTcA method can be divided into two sub-processes. The first sub-process is the identification of $Ts_{p(Initial)}$ from a dynamic step response for the initial boundary setting. The identification process is illustrated in Figure 3 and described as follows:
• Selecting $ts_{(\delta\%)}$, where $ts$ is the settling time and $\delta$ is the settling band in % ($\delta$ = 1, 2 and 5). The desired $\delta$ is selected according to the raggedness of the dynamic response. The $ts$ is expressed through $\alpha$, the number of settling time constants after which the response remains within $\delta\%$ of the final value. This can be approximated as $\zeta \omega_n ts \cong \alpha$. Hence, $ts_{(\delta\%)}$ = 1%, 2% and 5% → $\alpha$ = 5, 4 and 3, respectively.
• Estimating the process's dynamic response period, $DR_{P(t_1 - 0)}$, from $t = 0$ to the time $t = t_1$ at which the response settles at $C_{ss} \pm \delta(\%)$, where $C_{ss}$ is the final steady-state value.
• Approximating $Ts_{p(Initial)} = DR_{P(t_1 - 0)} / \alpha_{(\delta\%)}$.
• Applying $Ts_{p(Initial)}$ according to the respective transfer function coefficients, $a_n s^n + a_{n-1} s^{n-1} + \cdots + a_1 s + 1 \rightarrow Ts_{p(Initial)n} s^n + Ts_{p(Initial)n-1} s^{n-1} + \cdots + Ts_{p(Initial)1} s + 1$.

The second sub-process of the PTcA method is the search space boundary optimization by resizing the upper and lower search boundaries based on $Ts_{p(Initial)}$. As illustrated in Figure 4, $SB_O$ is the optimum search space boundary, $SB_{Lower}$ is the lower search boundary and $SB_{Upper}$ is the upper search boundary. For an $SB_O$, the $SB_{Upper}$ and $SB_{Lower}$ are extended by 100% and 75% from $Ts_{p(Initial)}$, respectively. In particular, the 100% extension of $SB_{Upper}$ is required because the optimal solution is mostly located close to the upper boundary region. Such a search space extension is required for the SGAs to explore the elite groups, which are uniformly distributed within the boundaries, and to exploit the $X_i$.
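The two sub-processes above can be sketched in a few lines of Python. Two points are assumptions of this sketch rather than statements of the method: the per-coefficient initial values are taken from the binomial expansion of $(Ts_{p(Initial)} s + 1)^n$, as used later for the third-order case, and "extended by 100% and 75%" is read as $SB_{Upper} = Ts_p(1 + 1.00)$ and $SB_{Lower} = Ts_p(1 - 0.75)$. All function names are illustrative.

```python
from math import comb

# Settling band delta (%) -> alpha, as listed in the first sub-process.
ALPHA = {1: 5, 2: 4, 5: 3}


def ts_p_initial(t1, delta_percent=1, theta=0.0):
    """Ts_p(Initial) = DR_P / alpha, with DR_P measured from theta to t1."""
    return (t1 - theta) / ALPHA[delta_percent]


def initial_coefficients(ts, order):
    """Coefficients of s^order ... s^1 in (ts*s + 1)**order (assumed expansion)."""
    return [comb(order, k) * ts ** k for k in range(order, 0, -1)]


def search_bounds(ts, upper_ext=1.00, lower_ext=0.75):
    """(SB_Lower, SB_Upper), assuming 'extended by p' means ts*(1 -/+ p)."""
    return ts * (1.0 - lower_ext), ts * (1.0 + upper_ext)


ts = ts_p_initial(t1=700.0, delta_percent=1, theta=100.0)  # values quoted for the EO2 response
print(ts)                                       # 120.0
print(initial_coefficients(ts, 3))              # [1728000.0, 43200.0, 360.0]
print([search_bounds(c) for c in initial_coefficients(ts, 3)])
```

The boundaries are computed per coefficient, so each of the $s^n, \ldots, s^1$ terms gets its own interval around its initial value.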
As illustrated in Figure 4, the $Ts_{p(Initial)}$ is only applied for the initial search boundary resizing and the first SGAs execution. Further search space boundary resizing is decided by the previously executed sub-optimal value ($X_i$), which is presumed as the next value for $Ts_p$. The sub-process of search space boundary adjustment and optimal $X_i$ identification can be stated as follows:
• Initial attempt - The $Ts_{p(Initial)}$ values identified for the respective denominator polynomial coefficients are applied with 100% extension on $SB_{Upper}$. The $SB_{Lower}$ is extended to approximately 95% instead of 75% for better exploration at the beginning stage. Execute the SGAs.
• Second attempt - The $X_i$ values genetically identified by the initial attempt (first execution) for the respective denominator polynomial coefficients are applied for the next execution (with $Ts_p = X_i$) to extend the boundaries accordingly ($SB_{Upper}$ to 100% and $SB_{Lower}$ to 75%) and so optimize $SB_O$. Execute the SGAs.
• Subsequent attempt - Continue the SGAs execution with the boundary search approximation from the second attempt unchanged, until the optimal $X_i$ and the minimum sum of square error (SSE) are attained.
• *Subsequent attempt - If the extended boundary in the second attempt is not an $SB_O$, consecutive boundary resizing is essential until $SB_O$ is achieved. Then, continue the SGAs' execution until the optimal $X_i$ and SSE are attained.
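The attempt sequence above can be outlined as a loop for a single coefficient. This is a hedged sketch, not the authors' implementation: `run_sga` is a hypothetical callable standing in for one SGAs execution that returns the best $X_i$ found inside the given boundaries, and the rule that a boundary is accepted as $SB_O$ once $X_i$ lies more than `margin` of the interval width away from both boundaries is an assumption introduced here.

```python
def bounds_around(ts, lower_ext, upper_ext=1.00):
    """(SB_Lower, SB_Upper), assuming 'extended by p' means ts*(1 -/+ p)."""
    return ts * (1.0 - lower_ext), ts * (1.0 + upper_ext)


def ptca_resize(run_sga, ts_initial, margin=0.05, max_attempts=10):
    """Repeat SGA executions, re-centring the boundaries on the last X_i.

    run_sga(lower, upper) is a hypothetical stand-in for one SGAs execution.
    Returns a list of (attempt, lower, upper, x_i) tuples.
    """
    # Initial attempt: lower boundary extended to about 95%, upper to 100%.
    lower, upper = bounds_around(ts_initial, lower_ext=0.95)
    history = []
    for attempt in range(1, max_attempts + 1):
        x_i = run_sga(lower, upper)
        history.append((attempt, lower, upper, x_i))
        width = upper - lower
        interior = lower + margin * width <= x_i <= upper - margin * width
        if attempt > 1 and interior:
            break  # boundaries accepted; further executions would reuse them
        # Otherwise take Ts_p = X_i and resize (upper 100%, lower 75%).
        lower, upper = bounds_around(x_i, lower_ext=0.75)
    return history


# Idealized stand-in: a search that returns the point in [lower, upper]
# closest to a known optimum. A real SGA would be noisy and fitness-driven.
def ideal_search(lower, upper, optimum=15.0):
    return min(max(optimum, lower), upper)


for row in ptca_resize(ideal_search, ts_initial=24.6 ** 3):
    print(row)
```

With the idealized search, the lower boundary is pulled down on each attempt until the optimum sits inside the interval, which mirrors the repeated adjustment of $SB_{Lower}$ described for an elite group that starts near the lower boundary.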

Simulation results
To illustrate its non-complexity and effectiveness, the proposed time constant approximation method is applied to two example processes: a third-order transfer function with and without disturbance, and real numerical data from an excess oxygen (EO$_2$) process step response.

Process 1 - third-order transfer function
For the simulation study, the following transfer function of a third-order process is selected, with the process gain $K_p = 10$:

$$G(s) = \frac{10}{15s^3 + 78s^2 + 6s + 1}.$$
This third-order transfer function is selected because it has a real pole at −5.1245 and a pair of complex poles at −0.0378 ± 0.1076i, which produce a significant oscillatory response, as illustrated in Figure 5. Also, to assess the PTcA method's flexibility and effectiveness, the third-order transfer function coefficients are moderately small parameters, so an appropriate search space boundary extension is required.
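As a quick numerical check, the quoted poles can be substituted into the denominator $15s^3 + 78s^2 + 6s + 1$; residuals close to zero confirm that they are its roots to the stated precision. The damping ratio and natural frequency computed below are not quoted in the text and are included only to illustrate why the response is oscillatory.

```python
def denominator(s):
    """Denominator polynomial of the third-order process."""
    return 15 * s ** 3 + 78 * s ** 2 + 6 * s + 1


real_pole = -5.1245
complex_pole = complex(-0.0378, 0.1076)

# Residuals should be near zero if the quoted (rounded) poles are roots.
residual_real = abs(denominator(real_pole))
residual_complex = abs(denominator(complex_pole))

# Second-order quantities of the complex pair (illustrative, not from the text).
omega_n = abs(complex_pole)            # natural frequency, rad/s
zeta = -complex_pole.real / omega_n    # damping ratio

print(residual_real, residual_complex)
print(omega_n, zeta)
```

A damping ratio well below 1 for the slow complex pair, together with a real pole far to the left, is consistent with an oscillatory step response dominated by the complex poles.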
According to the third-order process step response in Figure 5, the search space boundaries for the process gain are approximated as $K_p \in [5 : 15]$ for better exploration, as $K_p = 10$. For better approximation of the polynomial coefficients, $DR_{P(t_1 - 0)} = 123 - 0$ s $= 123$ s. Selecting $ts_{(\delta\%)} = 1\%$, for which the desired $\alpha$ is 5, gives $Ts_{p(Initial)} = 24.6$ s. Therefore, the $Ts_{p(Initial)}$ values for the third-order polynomial coefficients can be approximated by

$$Ts_{p(Initial)} = 24.6: \quad (Ts_{p(Initial)} s + 1)^3 \rightarrow (Ts_{p(Initial)} s)^3 + 3(Ts_{p(Initial)} s)^2 + 3 Ts_{p(Initial)} s + 1 \rightarrow 1.489 \times 10^4 s^3 + 1.815 \times 10^3 s^2 + 7.38 \times 10^1 s + 1. \quad (7)$$

Based on Table 1, the SGAs explored the entire search space well and exploited the elite group within the chosen boundary region, $[X_i - GO, X_i + GO]$, for the $Ts_p$ values of $S^2$ and $S^1$ at the initial attempt. This can be seen from the consistency of the $Ts_p$ values of $S^2$ and $S^1$ in further executions with readjusted boundaries at the second attempt. Therefore, further resizing of the search boundary is not required, as the $X_i$ will evolve well within $SB_O$ to attain the optimal $X_i$. This has enhanced the exploitation of an optimal $X_i$ at each subsequent attempt by the SGAs for these parameters.
On the other hand, the simulation results reveal that the elite group of PTcA values of $S^3$ is distributed near the $SB_{Lower}$ region. This is clearly noticeable in the first, second and third execution results, where the $Ts_p$ value of $S^3$ remains around $SB_{Lower}$. This caused the SGAs to fail to exploit an optimal $X_i$ and to converge to local minima, as a part of the elite group is located outside $SB_{Lower}$ (state 2). As a result, three adjustments of the boundaries, especially of $SB_{Lower}$, are required to optimize the $SB_O$ and to bring the elite groups within a feasible boundary region. As expected, the boundaries are optimized and the elite groups are explored well at the fourth execution. Further SGAs execution enhanced the exploitation of an optimal $X_i$.
The flexibility and effectiveness of the PTcA method are further assessed on the third-order transfer function model with 5% random disturbance. Initially, the transfer function coefficients identified without the disturbance are applied to the third-order model with disturbance. The simulation results in Figure 6 and Table 2 reveal that the exploration of elite groups and the exploitation of an optimal $X_i$ for the third-order model with disturbance are very similar to the process without disturbance. Notice that the peak time in Figures 5 and 6 is the same for all waveforms because the imaginary part of the model poles remains the same (Table 3). Nevertheless, the identified model responses, with and without noise, closely match the response of the actual system, as shown in Figures 5 and 6. Thus, the effectiveness of the PTcA method is well demonstrated in optimizing the $SB_O$ and exploiting the $X_i$ with or without disturbance.
Based on minimum SSE, the third-order model transfer functions without (Equation (7)) and with disturbance were selected, giving

$$G(s) = \frac{9.976}{24.05s^3 + 76.33s^2 + 6.398s + 1}.$$

Figure 6. Transient responses of third-order transfer function real and model with 5% disturbance.

Table 2. Simulation results of third-order transfer function with 5% disturbance executions.
Comparing the identified $Ts_p$ coefficients with the third-order transfer function model's parameters, the $S^2$ and $S^1$ values have 98% similarity, but the $S^3$ value has only 54% similarity. According to Table 3 and Figure 7, the imaginary parts of the complex poles of all third-order models are considerably constant, but the real part is slightly moved along the real axis, causing a small change in the damping ratio for these roots. These small changes in the complex poles are consolidated with the differing position of the other real root.

Process 2 - excess oxygen (EO$_2$)
Raw numerical data of excess oxygen (EO$_2$) were collected from a real industrial furnace by an empirical technique for 1000 s at 5 s intervals. As illustrated in Figure 8, the process response of EO$_2$ exhibits an approximately first-order plus dead-time dynamic system. The data were gathered for a step input that increased the air ratio from 9.5 to 10.5 in volumetric units (ft$^3$).
As discussed earlier, the polynomial coefficients of the continuous real time transfer function are primarily considered here for optimal model identification by the PTcA method. The process gain ($K_p$) and transport delay ($\theta$) can be approximated by close observation of the EO$_2$ real plant transient response. As illustrated in the transient response of EO$_2$, $K_p \approx 1.54$ and $\theta \approx 160$ s. As a result, the search space boundaries are approximated as $K_p \in [1 : 2]$ and $\theta \in [50 : 200]$. If a process has transport delay, then the $DR_P$ needs to be calculated from $t = \theta$ to $t = t_1$. For better approximation, $\theta$ is selected as 100 s. Thus, for the EO$_2$ dynamic response, $DR_{P(t_1 - \theta)} = 700 - 100$ s $= 600$ s. Selecting $ts_{(\delta\%)} = 1\%$, for which the desired $\alpha$ is 5, gives $Ts_{p(Initial)} = 120$ s. For a third-order model, $(Ts_{p(Initial)} s + 1)^3$ with $Ts_{p(Initial)} = 120$ expands to $1.728 \times 10^6 s^3 + 4.32 \times 10^4 s^2 + 3.6 \times 10^2 s + 1$.
According to the PTcA technique, the $X_i$ genetically identified by the second execution for the respective polynomial coefficients shows that the search boundary resized from the $X_i$ identified at the first execution is $SB_O$ (Table 4). Therefore, further resizing of the search boundary after the second iteration is not required, as the $X_i$ will evolve well within $SB_O$ to attain the optimal $X_i$. As illustrated in Table 4, the distribution of elite groups within the boundary region $[X_i - GO, X_i + GO]$, the exploitation of the optimal $X_i$ and the consistency of the $X_i$ values of $S^2$ and $S^1$ in further executions by the SGAs exhibit process characteristics similar to those of the third-order transfer function model.

Table 4. Third-order model polynomial coefficient approximation by SGAs execution.

Based on the initial attempt, the elite groups of $X_i$ values of $S^3$ are uniformly distributed around the $X_i - GO$ region. The simulation results show that the $X_i$ values of $S^3$ are still continuously evolving within the $SB_O$ boundary region at each execution. Therefore, further readjustment of the $SB_O$ boundaries is not required, as the elite groups are still within the boundary range (state 1), as discussed in Section 3. So, for the third-order model of EO$_2$, the $X_i$ values from the fifth execution are selected, as the SSE and Gen (generation) are minimum and the $X_i$ is optimal. Transfer functions $G(s)$ were identified at the fifth and eighth executions.

Figure 10. Transient responses of two global optimal values with real process of EO$_2$.

Figure 11. Transient responses of third and fourth order models with real process of EO$_2$.

However, the inconsistency of $S^3$ shows that there are two optimal values of $X_i$ ($X_i$ = 8187.7; 4137.2), which frequently appear within the $SB_O$ region at the first, second, fourth, fifth, sixth, seventh and eighth executions.
According to the transfer functions of the fifth and eighth executions, the $K_p$, $\theta$ and the polynomial coefficients of $S^1$ and $S^2$ exhibit approximately 97.5% similarity, while $S^3$ exhibits 50.5% similarity. This illustrates that the $K_p$, $\theta$ and the coefficients of $S^1$ and $S^2$ are not consolidated with $S^3$ in attaining two $X_i$. This has been verified by the simulation results in Figures 9 and 10, in which both optimal $X_i$ values of $S^3$ attain a minimum SSE (iterations 5 and 8 in Figure 9 and Table 4). Furthermore, the inconsistency of $S^3$ demonstrates that the SGAs with improved boundaries sustained the population diversity well by exploring the feasible search region and exploiting the optimal $X_i$.
A similar process of optimal model parameter identification by the PTcA method was also applied to a fourth order model. As illustrated in Figure 11, the fourth order model response exhibits the effectiveness of the PTcA method in exploring the search space region to exploit the $X_i$ for $S^1$, $S^2$, $S^3$ and $S^4$. Based on the EO$_2$ model responses, the fourth order model is well fitted to the real data response. According to Table 5, the third-order model's real pole and pair of complex poles exhibit inconsequential domination in characterizing the response, which causes a rise in the error criterion compared to the fourth order model. However, the fourth order model, with another pair of complex poles, enhances the extrapolation of the real data characteristic and achieves a lower SSE.

Conclusion
The proposed predetermined time constant approximation (PTcA) method enhanced the optimization of search space boundaries for global optima convergence. The response's dynamic period and settling time provide a better presumption of an initial $Ts_p$ value for search space optimization. The extended $SB_{Upper}$ and $SB_{Lower}$ for an optimal search boundary ($SB_O$), derived from an initial $Ts_p$, brought the elite group within a feasible bounded search region. Further SGAs execution improved the exploration of elite groups to locate and exploit the optimal values for the identified model parameters. As expected, the polynomial coefficients of all estimated models are optimized well by SGAs. Further work includes assessing the performance of the method on other process data and statistical comparisons of the method with other conventional algorithms for continuous-time transfer function estimation.