Application of computational fluid dynamics and surrogate-coupled evolutionary computing to enhance centrifugal-pump performance

To reduce the total design and optimization time, numerical analysis with surrogate-based approaches is being used in turbomachinery optimization. In this work, multiple surrogates are coupled with an evolutionary genetic algorithm to find the Pareto optimal fronts (PoFs) of two centrifugal pumps with different specifications in order to enhance their performance. The two pumps were a centrifugal pump commonly used in industry (Case I) and an electrical submersible pump used in the petroleum industry (Case II). The objectives are to enhance the head and efficiency of the pumps at specific flow rates. Surrogates such as response surface approximation (RSA), Kriging (KRG), neural networks and weighted-average surrogates (WASs) were used to determine the PoFs. To obtain the objective function values and to understand the flow physics, the Reynolds-averaged Navier–Stokes equations were solved. It is found that the WAS performs better for both objectives than any individual surrogate. The best individual surrogate, or best predicted error sum of squares (PRESS) surrogate (BPS), obtained from cross-validation (CV) error estimations produced better PoFs but was still unable to compete with the WAS. The surrogate producing a high CV error produced the worst PoFs. The performance improvement in this study is due to the change in flow pattern in the passage of the impeller of the pumps.


Introduction
With the advancement of computational capabilities, the three-dimensional simulation of complex geometries such as turbomachines is possible. Even an analysis of multistage pumps is possible, although a detailed analysis takes a long time. The optimization of a turbomachine requires a large amount of performance data at different design points in order to generate an objective function. This process may take a long time to simulate, and thus the cost of optimization can be high. Derakhshan, Pourmahdavi, Abdolahnejad, Reihani, and Ojaghi (2013) reported an approach for enhancing the performance of turbomachines by optimizing their parameters via experimental and numerical methods. Hundreds of parameters of an impeller can be altered to improve the performance. A parameter such as the number of blades affects the low-pressure area at the blade inlet and the jet wake formation in the blade passages (Houlin, Yong, Shouqi, Minggao, & Kai, 2010). A change in blade angles changes the hydraulic efficiency and the cavitation formation (Kamimoto & Matsuoka, 1956; Luo, Zhang, Peng, Xu, & Yu, 2008; Sanda & Daniela, 2012). Rutter, Sheth, and O'Bryan (2013) optimized the hydraulic performance of an electrical submersible pump (ESP) handling a single-phase fluid, with head, efficiency and input power selected as the objective functions. Cao, Peng, and Yu (2004), Shi, Long, Li, Leng, and Zou (2010) and Marsis, Pirouzpanahand, and Morrison (2013) also reported an improvement in pump performance by changing the design parameters. However, the evaluation of all the parameters for optimization requires a high-dimensional analysis that is even more time consuming. To avoid this issue, a recently developed regression-based technique that generates a low-fidelity model, also known as a surrogate, is being used to help engineers optimize geometry in less time.
The 'dimensionality curse' of surrogate models is usually mitigated by a sensitivity analysis, which can reduce the number of design parameters (Samad, 2008). Here, the low-fidelity model is an approximating function that mimics the high-fidelity computational fluid dynamics (CFD) or computer-aided engineering (CAE) model.
Surrogates are problem-dependent; thus, a single surrogate cannot be applied to solving every problem. To handle this challenge, an ensemble of surrogates or the concept of the WAS was introduced. The WAS is based on the average of the predicted error sum of squares (PRESS), which is implemented from cross validation (CV) error estimations. As the WAS includes the contribution from all the surrogates, it gives quite satisfactory results.
In this paper, a strategy using multiple-surrogate-assisted multi-objective optimization is developed to enhance the performance of two pumps: an ESP and a centrifugal impeller. CFD simulations were performed to find the objective function values at different points and a PoF was generated via a genetic algorithm. The PoFs obtained from all surrogates were compared to find a robust surrogate that performs well for both cases. The results and discussion section contains an analysis of the surrogate fidelity and flow physics of the pumps. Figure 1 shows the optimization strategy used in this study. It starts with the problem formulation, which includes understanding the problem and deciding on the objective functions to be optimized and the design variables that influence these functions. In this study, two cases are used.

Problem formulation and numerical schemes
Case I involves a centrifugal pump impeller with a diameter (d_o) of 365 mm. Three design variables and two objectives were considered based on the literature (Bellary & Samad, 2014; Gulich, 2010; Luo et al., 2008). The design variables were the inlet angle (β_1), exit angle (β_2) and number of blades (G). The objective functions were to maximize efficiency and head. The geometric features of the impeller are shown in Table 1 (Lazarkiewicz & Troskolanski, 1965).
Computer-aided design (CAD) modeling and meshing of the flow domain were carried out using the Ansys BladeGen and TurboGrid modules, respectively (Ansys CFX 13.0, 2010). Ansys CFX 13.0 was used for the flow simulations. Table 2 shows the meshing and boundary conditions. Further details regarding the geometry, boundary conditions, validation and grid independence test can be found in Bellary and Samad (2014).
Case II involved an ESP with a diameter of 100 mm. The impeller and diffuser were considered for the analysis, and two variables (inlet and outlet blade angles) along with two objectives (head and efficiency) were used for the optimization in this case. Figures 2 and 3 show the geometry and the flow domain of the ESP, and the impeller features are shown in Table 1. The simulations were carried out for an impeller speed of 3000 rpm with a discharge rate of Q = 0.0037 m³/s. A grid independence test was performed and an optimum number of nodes equal to 227,416 was selected. Mass flow inlet and pressure outlet boundary conditions were used at the inlet and outlet, respectively. Further details regarding the boundary conditions and flow simulation can be found in Table 2. The turbulence model used was k-ω, which uses the scalable wall-function approach to improve robustness and accuracy as it efficiently captures near-wall boundary layer effects (Wilcox, 1994). The numerical results were validated with the experimental results given in Caridad, Asuaje, Kenyery, Tremante, and Aguillon (2008).
The working fluid for both cases was water with standard properties at normal temperature and pressure. The performance of the pumps was calculated by solving the Reynolds-averaged Navier-Stokes (RANS) equations, and the results were used as the responses to the design variables.

Design of experiment
After the completion of the problem formulation stage, a design space bounded by the lower and upper limits of the variables was created. The bounds were determined via a literature survey (Bellary & Samad, 2014; Gulich, 2010; Li, 2002; Luo et al., 2008; Marsis et al., 2013; Ohta & Aoki, 1996), followed by simulations at the corners of the design space. A three-level full factorial design (Myers & Montgomery, 1995) was employed to choose the design points from the design space. Once again, the RANS equations were solved to obtain the responses at those design points. These responses were then used to construct the surrogate models.
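A three-level full factorial design can be sketched in a few lines. The snippet below is an illustrative sketch only: the function name `three_level_full_factorial` and the bounds in `example_bounds` are hypothetical and are not the values of Table 3. It assumes the three levels are the lower bound, the midpoint and the upper bound of each variable.

```python
from itertools import product


def three_level_full_factorial(bounds):
    """Return every combination of low, mid and high levels of each variable.

    bounds maps a variable name to its (lower, upper) limits.
    """
    names = list(bounds)
    levels = []
    for name in names:
        low, high = bounds[name]
        levels.append((low, (low + high) / 2.0, high))
    return [dict(zip(names, combo)) for combo in product(*levels)]


# Hypothetical bounds for illustration only (not the values of Table 3).
example_bounds = {"beta1": (10.0, 30.0), "beta2": (20.0, 60.0), "blades": (5, 9)}
designs = three_level_full_factorial(example_bounds)
print(len(designs))  # 3 variables at 3 levels give 3**3 = 27 design points
```

With two variables, as in Case II, the same function would return 3**2 = 9 design points, each of which requires one RANS simulation.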

Formulating the surrogate modeling
The surrogates used in this study are RSA, the radial basis function (RBF) neural network, KRG and WAS. RSA is a polynomial function fitted to the responses generated from numerical calculations. The basic concept of a neural network or RBF is to learn from events. The RBF comprises a two-layer network. The first layer is a hidden layer of radial basis functions and the second is a linear output layer. The variables required to generate this surrogate model are a spread constant and a user-specified error goal, which comes from the mean input responses. The KRG (Husain & Kim, 2010) interpolation employs a known function for the large-scale variation and a function for the small-scale departure. The KRG model evaluated at some unsampled location therefore has a global model and a systematic departure. The algorithm to construct a PoF through the WAS and the best PRESS surrogate (BPS) is as follows.
Algorithm 1. Algorithm for BPS- and WAS-assisted PoF.
Step 1: Generate data from CFD analysis
Step 2: Construct surrogates
Step 3: Calculate PRESS
For BPS-BPS PoF
Step 4a: Find least error and select surrogates for objective 1 and objective 2
Step 5a: Generate population for NSGA-II using the selected surrogates
Step 6a: Generate PoF and cluster points
Step 7a: Check errors with CFD analysis
For WAS-WAS PoF
Step 4b: Find weights for the surrogates and construct WAS for objective 1 and WAS for objective 2
Step 5b: Generate population for NSGA-II using the WAS models
Step 6b: Generate PoF and cluster points
Step 7b: Check errors with CFD analysis

Weighted-average surrogate (WAS)
Generally, a large amount of data distributed in the design space fits a surrogate well, and the surrogate reflects the trend of the objectives. Any poorly fitted surrogate may lead to unreliable results. An ensemble or averaged surrogate adequately refines the response trend of the data and insulates against bad surrogates (Queipo et al., 2005). A WAS model generated from the PRESS or CV errors (Sanchez et al., 2008) of the basic surrogates (RSA, KRG and RBF) was adopted in this problem. The predicted response of the WAS model is defined in Equation (1):
$$F_{was}(x) = \sum_{i=1}^{N} w_i(x)\, F_i(x) \qquad (1)$$
where N is the number of basic surrogates, F_i is the response predicted by the i-th surrogate and w_i is its weight. In simplified form, the function can be written as
$$F_{was} = w_{rsa} F_{rsa} + w_{krg} F_{krg} + w_{rbf} F_{rbf}$$
where w_rsa, w_krg and w_rbf are the weights, and F_rsa, F_krg and F_rbf are constructed using the CFD-evaluated responses. A lower weight is assigned to the surrogate with a higher CV error, which means that the surrogate makes a reduced contribution to the construction of the WAS model. In CV, the data is divided into k subsets of equal size. A surrogate is constructed k times, each time excluding one of the subsets from training, and the excluded subset is used to evaluate the error. The k errors finally produce the error sum of squares. When each subset contains a single design point, this process is known as leave-one-out CV. The calculation of the weights used in the WAS is given in Goel et al. (2007).
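The leave-one-out PRESS calculation and the error-based weighting described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the two toy surrogates (`fit_mean`, `fit_line`) and the sample data are hypothetical stand-ins for RSA, KRG and RBF, and the weight formula in `was_weights` (with `alpha = 0.05` and `beta = -1`) is the heuristic form commonly attributed to Goel et al. (2007), assumed here rather than confirmed by the text.

```python
import math


def press_rmse(xs, ys, fit):
    """Leave-one-out PRESS error: refit without each point, then predict it."""
    squared = []
    for i in range(len(xs)):
        model = fit(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        squared.append((model(xs[i]) - ys[i]) ** 2)
    return math.sqrt(sum(squared) / len(squared))


def fit_mean(xs, ys):
    """Toy surrogate: always predicts the mean of the training responses."""
    mean = sum(ys) / len(ys)
    return lambda x: mean


def fit_line(xs, ys):
    """Toy surrogate: one-variable least-squares straight line."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: my + slope * (x - mx)


def was_weights(errors, alpha=0.05, beta=-1.0):
    """Assumed heuristic: a larger CV error gives a smaller normalized weight."""
    avg = sum(errors.values()) / len(errors)
    raw = {name: (err + alpha * avg) ** beta for name, err in errors.items()}
    total = sum(raw.values())
    return {name: value / total for name, value in raw.items()}


def was_predict(models, weights, x):
    """Weighted-average prediction, i.e. the sum of w_i * F_i(x)."""
    return sum(weights[name] * models[name](x) for name in models)


# Hypothetical sample data, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.1, 0.9, 2.1, 2.9, 4.2]
fitters = {"mean": fit_mean, "line": fit_line}
press = {name: press_rmse(xs, ys, fit) for name, fit in fitters.items()}
weights = was_weights(press)
models = {name: fit(xs, ys) for name, fit in fitters.items()}
print(press, weights, was_predict(models, weights, 2.5))
```

Because the weights are normalized to sum to one, the WAS prediction always lies between the smallest and largest predictions of the individual surrogates.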

Best PRESS surrogate (BPS)
To implement a surrogate ensemble, the best surrogate must be selected from the available options. This can be achieved using a weighting scheme in which a weight of 1 is assigned to the surrogate with the least CV error and a weight of 0 is allocated to the other models. This scheme is termed the BPS model and is based on the concept of selecting the surrogate that gives the least CV error. In the present problem, the CV errors for the objectives were calculated separately and the surrogate which produced the least error was used for the PoF generation.
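The BPS selection amounts to a 0/1 weighting per objective, which can be sketched as below. The function names and the error values in `example_press` are hypothetical and serve only to illustrate the selection rule; they are not the values of Table 4.

```python
def bps_weights(errors):
    """Give weight 1 to the surrogate with the least CV error and 0 to the rest."""
    best = min(errors, key=errors.get)
    return {name: (1.0 if name == best else 0.0) for name in errors}


def best_press_surrogates(press_by_objective):
    """Select the least-error surrogate separately for each objective."""
    return {obj: min(errs, key=errs.get) for obj, errs in press_by_objective.items()}


# Hypothetical CV errors, for illustration only (not the values of Table 4).
example_press = {
    "efficiency": {"RSA": 0.8, "KRG": 2.4, "RBF": 1.1},
    "head": {"RSA": 0.5, "KRG": 1.9, "RBF": 0.7},
}
print(best_press_surrogates(example_press))
print(bps_weights(example_press["efficiency"]))
```

Since the errors are evaluated per objective, nothing in this rule forces the same surrogate to be chosen for both objectives.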

Non-dominated sorting genetic algorithm-II (NSGA-II)
NSGA-II (Deb, 2001) is widely used in the scientific community. In the present problem, the algorithm is coupled with the Surrogates Toolbox (Viana, 2011; Viana et al., 2014) in order to make handling multiple functions easier. The relationships among the competing objectives were established via the PoF produced by NSGA-II. The problem relates to two competing objectives (for each pump case), where the enhancement of one objective results in the degradation of the other. All feasible solutions of the multi-objective problem are grouped into two types: dominated and non-dominated. The non-dominated solutions are designated as Pareto-optimal solutions. The objective functions are defined mathematically and estimated by numerical simulation. In this approach, an initial or parent population is arbitrarily established, and the objective function values are estimated for each design. Here, the term 'population' refers to the designs in the design space, and the non-domination condition and crowding distance determine the rank of this population. Intermediate populations are created via the genetic operators of selection, crossover and mutation. The populations (parent and intermediate) are combined, and the rank and crowding distance of the combined population are obtained. New populations are selected depending on the rank and crowding distance and are again ranked for the next generation based on the non-domination criteria. This process is repeated until the limiting number of generations is attained.
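The two ranking ingredients mentioned above, the non-domination condition and the crowding distance, can be illustrated with a short sketch. It covers only these two building blocks, not the full NSGA-II loop or the Surrogates Toolbox interface, and it assumes both objectives are maximized. The (efficiency, head) pairs in `candidates` are hypothetical.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def non_dominated(points):
    """Keep the points that no other point dominates (the first Pareto front)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def crowding_distance(front):
    """Sum, over objectives, of the normalized gap between each point's neighbors."""
    n = len(front)
    distance = [0.0] * n
    for k in range(len(front[0])):
        order = sorted(range(n), key=lambda i: front[i][k])
        low, high = front[order[0]][k], front[order[-1]][k]
        # Boundary points are always kept, so they get an infinite distance.
        distance[order[0]] = distance[order[-1]] = float("inf")
        if high == low:
            continue
        for j in range(1, n - 1):
            gap = front[order[j + 1]][k] - front[order[j - 1]][k]
            distance[order[j]] += gap / (high - low)
    return distance


# Hypothetical (efficiency, head) pairs, for illustration only.
candidates = [(80.0, 30.0), (85.0, 28.0), (90.0, 25.0), (84.0, 27.0), (88.0, 24.0)]
front = non_dominated(candidates)
print(front)
print(crowding_distance(front))
```

Within a front of equal rank, designs with a larger crowding distance are preferred, which keeps the selected designs spread along the PoF.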

k-means clustering
Five clusters in the global Pareto-optimal solutions were generated by the k-means clustering algorithm. This is a repeated-fitting process that generates the number of specified clusters. The clusters are distributed across the whole range of the PoF.
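The clustering step can be sketched with a plain k-means (Lloyd) iteration. This is an illustrative sketch under assumptions: the synthetic `pareto_points`, the evenly spaced deterministic initialization and the choice of the Pareto point nearest to each centroid as the representative design are all hypothetical choices, not details given in the text.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=100):
    """Plain Lloyd iteration for k >= 2; returns (centroids, labels)."""
    ordered = sorted(points)
    # Deterministic start: k points spread evenly through the sorted front.
    centroids = [ordered[round(i * (len(ordered) - 1) / (k - 1))] for i in range(k)]
    labels = None
    for _ in range(iterations):
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the fit has converged
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return centroids, labels


# Synthetic (efficiency, head) front, for illustration only.
pareto_points = [(80.0 + 0.4 * i, 30.0 - 0.2 * i) for i in range(25)]
centroids, labels = kmeans(pareto_points, k=5)
# Representative design of each cluster: the Pareto point nearest its centroid.
representatives = [
    min(pareto_points, key=lambda p: squared_distance(p, c)) for c in centroids
]
print(representatives)
```

Only the representative designs need to be re-simulated with CFD, which keeps the verification cost at one simulation per cluster.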

Case I
Initially, three variables (β_1, β_2 and G) affecting the performance of the pump (Lazarkiewicz & Troskolanski, 1965; Srinivasan, 2008) were selected and a parametric study was performed. The ranges of the angles were chosen based on the literature (Li, 2002; Ohta & Aoki, 1996). Design points within the design space were selected using a three-level full factorial design. The design space defined by the lower and upper limits of the variables is given in Table 3. The impeller passage was generated by polynomial curves at the inlet and the outlet, so that a small change in angle produces a continuous surface. The surrogate which gives the least CV error was selected as the BPS. In Table 4, RSA shows the least CV error for both objectives, so it was selected as the BPS for both. The BPS was then used to generate populations using NSGA-II to build a BPS-BPS PoF (Figure 4). Figure 4(a) shows the comparison of PoFs for different surrogates. The KRG-produced PoF designs are far from reality, which is consistent with the PRESS (Table 5), where the KRG-produced errors are the highest for both objectives. In Figure 4(b), some cluster points of the KRG-KRG PoF show an efficiency of more than 100%, which is unrealistic for this type of simulation. The WAS-WAS PoF has higher fidelity. The design variable values (Figure 5) are calculated for the cluster points. To get a higher efficiency, the number of blades should be increased and the inlet angle should be reduced. The exit angle initially increases the efficiency, but after attaining a certain efficiency it is reduced. For the head, the opposite phenomenon is observed (Figure 5(b)). Among the different cluster points for the BPS, the one with an exit blade angle (β_2) of 59.63° is chosen and compared with other cluster points of different β_2 values.
Table 6 shows that, for numbers of blades other than the design value, the error is greater even though the exit blade angle has increased. For the other cases, the same surrogate was selected for both objectives, i.e., KRG-KRG and RBF-RBF. The combined error (C_Err), which is the root mean square error (RMSE) of η_Err and H_Err, is high for the KRG-KRG results, and thus KRG made the least contribution to constructing the WAS model. The surrogates produce different values of PRESS; hence, the use of multiple surrogates or weighted surrogates is suggested. Table 5 shows the PRESS errors and RMSEs of the BPS and WAS at different cluster points. All five clustered designs were simulated for comparison.
In Figure 6, the CFD results are compared for the two extreme cases (cluster points A and E in Figure 5(b)). The shock losses, secondary losses and hydraulic losses which reduce the hydraulic efficiency are greater at the off-design points (Lazarkiewicz & Troskolanski, 1965; Srinivasan, 2008). The head generation is augmented by an increase in β_2 at the reference point. Here, the absolute flow velocity at the exit (c_2) increases with an increase in its peripheral component (c_u2), which enables a further increment in the dynamic head (Bellary & Samad, 2014; Gulich, 2010). The contours of total pressure at the two extreme cluster points show that the highest pressure is generated at the impeller exit, where the kinetic energy of the flow attains a maximum level. This is because the pressure rises continuously as the mechanical energy supplied by the rotation of the impeller is transformed into pressure energy. In the vicinity of the leading edge (LE) of the blade at the suction side, minimum pressure is observed (Figure 6). An increase in blade angle increases the pressure difference between the outlet and inlet of the blade. The peripheral component of the absolute velocity (c_u2) at the exit of the impeller increases with an increase in the discharge angle (Srinivasan, 2008). This matches the existing results (Lazarkiewicz & Troskolanski, 1965; Srinivasan, 2008), which state that an increase in β_2 decreases the relative velocity and increases the total pressure.
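The combined error can be illustrated with a short calculation. The sketch below assumes that C_Err is the root mean square of the two percentage errors and that each error is the relative difference between the surrogate prediction and the CFD value; both definitions are assumptions made for illustration, and the numbers are hypothetical.

```python
import math


def relative_error(predicted, cfd):
    """Percentage difference between a surrogate prediction and the CFD value."""
    return abs(predicted - cfd) / abs(cfd) * 100.0


def combined_error(eta_err, head_err):
    """Root mean square of the efficiency and head errors (assumed C_Err form)."""
    return math.sqrt((eta_err ** 2 + head_err ** 2) / 2.0)


# Hypothetical values, for illustration only (not taken from Table 5).
eta_err = relative_error(predicted=88.0, cfd=85.0)
head_err = relative_error(predicted=31.0, cfd=30.0)
print(round(combined_error(eta_err, head_err), 3))
```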
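The link between β_2, c_u2 and head can be illustrated with the idealized Euler turbomachinery relation. The sketch below is a textbook approximation, not the CFD model of this study: it assumes no inlet pre-swirl, no slip, and β_2 measured from the tangential direction. Only the 365 mm diameter comes from the text; the rotational speed and the meridional exit velocity `cm2` are hypothetical.

```python
import math


def exit_peripheral_velocity(diameter_m, speed_rpm, cm2, beta2_deg):
    """Idealized c_u2 = u_2 - c_m2 / tan(beta_2), from the exit velocity triangle."""
    u2 = math.pi * diameter_m * speed_rpm / 60.0  # blade tip speed, m/s
    return u2 - cm2 / math.tan(math.radians(beta2_deg))


def theoretical_head(diameter_m, speed_rpm, cm2, beta2_deg, g=9.81):
    """Euler head H = u_2 * c_u2 / g, assuming no pre-swirl at the inlet."""
    u2 = math.pi * diameter_m * speed_rpm / 60.0
    return u2 * exit_peripheral_velocity(diameter_m, speed_rpm, cm2, beta2_deg) / g


# Diameter from the text; speed and cm2 are hypothetical illustration values.
for beta2 in (20.0, 30.0, 40.0):
    head = theoretical_head(0.365, speed_rpm=1470.0, cm2=3.0, beta2_deg=beta2)
    print(beta2, round(head, 2))
```

Under these assumptions the theoretical head rises monotonically with β_2, which is the trend described above; the losses captured by the CFD model are what limit the gain in practice.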

Case II
In this case, a three-level full factorial design was also employed to find the designs from the design space (Table 3) obtained from the literature (Li, 2002; Ohta & Aoki, 1996; Srinivasan, 2008). A polynomial curve makes a smooth impeller surface. The variables and responses were used to construct the KRG and RBF surrogates, and the WAS was formulated using weighted averaging of these surrogates. The PRESS shows that the RBF is the best one for both objectives (Table 4), and thus it was selected for the BPS-BPS PoF construction. RSA and KRG produced the higher errors, and so their contributions to the construction of the WAS model were the least. Table 5 shows the CFD results of the cluster points and their respective errors. The WAS-WAS generated points show the least error, while the KRG-KRG points show the most. Figure 7(a) shows that the trend of the RSA-RSA PoF differs from those of the WAS-WAS and BPS-BPS PoFs. The errors in Table 4 show that RSA produced the highest error for the efficiency. The cluster points and respective design variable values are shown in Figures 7 and 8. The inlet angle (β_1) increases if a higher efficiency is expected, but the exit angle behaves in the opposite way. For a higher head, the opposite behavior of the variables can be observed (Figure 8(b)). Table 5 shows that the WAS-WAS PoF has the least combined error (C_Err). Table 7 shows that the error is low in the WAS-WAS PoF and the BPS-BPS PoF. Figure 9 shows the CFD calculations for two extreme points (designs A and E) from the clustered points of Figure 8(b). On the suction side and at 50% span, a low-pressure region appears in both designs. The impingement of the flow on the LE causes a sudden loss in pressure. Also, a wider low-pressure region, which causes more flow losses and reduced efficiency, is present in design A than in design E.
In Figure 9, the pressure increases from the LE to the trailing edge (TE) at different rates in the two cases (designs A and E). A more rapid pressure increase from LE to TE is observed in design A than in design E, which generates more head in design A (Gulich, 2010). The values of efficiency and pressure head for these two cases are given in Table 7. From the above analysis, it can be seen that the surrogates predict differently with different sets of data. The best surrogate for one set of data (Case I) produced the worst results for the other set (Case II). In Case I, the best surrogate was RSA, while it was RBF in Case II. The WAS model performed reliably, showing higher fidelity with respect to CFD.

Conclusion
In this paper, a multiple-surrogate-assisted multi-objective optimization procedure is presented. The capability of surrogates to produce PoFs was evaluated with two examples of pump or turbomachinery design and optimization.
It was observed that the CV error is different for different surrogates and gives an indication of the degree of fit for each surrogate. A surrogate producing a high CV error is highly unreliable for generating a PoF. The verification with the CFD model shows that the error in Pareto designs is high for the surrogates which produce a high CV error. The BPS-produced PoF is better and can be used instead of any individual surrogate. The WAS-WAS PoF, which shows a robust result, is comparable to the BPS-BPS PoF. Hence, instead of using a single surrogate, the authors suggest the use of multiple surrogates to generate PoFs. A change in exit angle influences the augmentation of head and efficiency. The improvement in efficiency was due to the reduction of flow losses, namely the disc friction losses outside the impeller shroud and hub.
The BPS performs well for both cases and a wider design space or sparsely-distributed design points may require sequential sampling or multiple surrogate analysis.

Disclosure statement
No potential conflict of interest was reported by the authors.