Energy-saving oriented optimization design of the impeller and volute of a multi-stage double-suction centrifugal pump using artificial neural network

To broaden the efficient operating zone and increase the energy efficiency of a multi-stage double-suction centrifugal pump, a multi-component and multi-condition optimization design method involving high-precision performance predictions, a flow loss visualization technique based on entropy production theory, and machine learning is proposed. First, the accuracy of the baseline pump numerical methodology is verified via a grid convergence analysis and experiments. Thereafter, nine design parameters of the impeller and double volute are selected as design variables. Subsequently, 150 designs are created according to the Latin hypercube sampling (LHS) method and numerically simulated using an automatic simulation program. A backpropagation neural network (BPNN) and a multi-objective genetic algorithm (MOGA) are adopted to maximize the efficiency at 0.6$Q_d$, 1.0$Q_d$, and 1.2$Q_d$. Finally, the optimal results are verified via numerical calculations and analyzed. The results indicate that the efficiency of the optimized pump is increased by 2.05%, 3.56%, and 5.36% at 0.6$Q_d$, 1.0$Q_d$, and 1.2$Q_d$, respectively. The comparative analysis of the energy characteristics reveals that the improved performance of the optimized pump can be attributed to the improved matching between the rotor and stator. This research further demonstrates the accuracy and reliability of the optimization method using an artificial neural network (ANN).


Introduction
As the world population increases exponentially and economies develop further, energy consumption is increasing at a staggering rate, resulting in enormous pressure on energy production and a series of environmental issues. Pumps are widely used as general-purpose power machinery in various fields to transport fluids. Their energy consumption constitutes a considerable proportion of the total economic cost (Shankar et al., 2016).
A multi-stage double-suction centrifugal pump is a special pump with axisymmetric and horizontal split structures, which is designed to satisfy cases involving large flow rates and high head (Wei et al., 2019), such as long-distance water diversion projects, agricultural irrigation, and seawater desalination. As this pump is the core unit in the system, its performance significantly affects the overall energy efficiency of the system. However, the lack of an efficient hydraulic model leads to a narrow efficient operating zone and considerable energy consumption for this type of pump. Therefore, it is of great significance to improve the performance of multi-stage double-suction centrifugal pumps by optimizing the hydraulic design. Traditional optimization design is mainly based on mathematical models constructed using empirical formulas, which provide few optimization parameters and unsatisfactory results. With the development of computational fluid dynamics (CFD), CFD-based optimization design has become the preferred alternative to avoid these problems. CFD is an interdisciplinary application in which fluid mechanics, numerical mathematics, and computational science are combined to simulate realistic fluid flow conditions through numerical solutions (Al-Obaidi, 2019c, 2021; Xu et al., 2021). The effectiveness of CFD has been proven in several investigations of the clock effect, cavitation, flow loss, and pressure fluctuations in pump design (Al-Obaidi, 2019a, 2019b, 2020a, 2020b, 2020c; Al-Obaidi & Mohammed, 2019; Li et al., 2020; Zhang et al., 2021). In recent years, CFD combined with computer-aided optimization methods has been widely implemented for the optimization design of pumps. Design of experiments (DOE), surrogate model fitting, and direct solution by intelligent algorithms are the most popular auxiliary methods based on this technology.
CONTACT Ji Pei jpei@ujs.edu.cn
The DOE approach is often used to optimize objective functions with extremely high evaluation costs because it can find relatively good solutions with a small number of function evaluations (Gan et al., 2022). Using the orthogonal design approach, Long et al. (2016) proposed an optimal design of a reactor coolant pump diffuser and confirmed the accuracy of the method with experimental results. Similarly, Yang et al. (2021) optimized the energy efficiency of an electrical submersible pump (ESP) using the Taguchi method and pointed out that this method had apparent advantages in multi-parameter optimization. Furthermore, the DOE approach demonstrates good applicability in unsteady CFD optimization (e.g. pressure pulsations) (Xiao & Tan, 2021). Nevertheless, it is difficult to obtain good results with the DOE method in terms of global optimization. In contrast, solving the optimization problem directly through intelligent algorithms and numerical simulations offers the most substantial global optimization capability (Fang et al., 2020). The genetic algorithm (GA) and particle swarm optimization (PSO) algorithm are the most commonly used algorithms in this approach. In addition, this method has demonstrated promising results in the optimization design of pumps and other turbomachinery (Song & Liu, 2018). However, owing to the high computational cost, this method has not been applied on a large scale. Meanwhile, with the rapid development of computational resources and artificial intelligence technologies, the application of CFD technologies combined with machine learning in optimizing pump profiles has demonstrated significant advantages with regard to multi-parameter support, accuracy, and cost. In particular, with state-of-the-art computational power, this approach has replaced other mathematical solutions as the mainstream optimization method for pumps.
In this approach, a sample library is constructed from the results of numerical simulations based on the DOE method, and an artificial neural network (ANN) is then used to build a surrogate model. Then, an intelligent algorithm is employed to obtain the global optimum. Compared with other commonly used surrogate models, such as the Kriging model (Wang, Zhang, et al., 2022) and the response surface model (Ghorani et al., 2020), the ANN-based surrogate model has exhibited better stability and robustness in numerous pump optimization problems (Wang et al., 2019). The impeller is a critical component of energy conversion in pumps. Therefore, many researchers have used the ANN-based surrogate model and focused on optimizing the impeller to improve the overall performance. Based on a backpropagation neural network (BPNN) and an improved genetic algorithm, i.e. the non-dominated sorting genetic algorithm II (NSGA-II), Zhao et al. (2015) optimized the blade profile and the meridional section of a low-specific-speed centrifugal pump and broadened the high-efficiency area. Furthermore, Pei et al. found that a BPNN with two hidden layers provided a higher prediction precision in centrifugal pump optimization. Similar approaches were also applied to improve the performance of mixed-flow pumps (Suh et al., 2019), axial pumps (Zhang et al., 2011), and other turbomachinery (e.g. compressors and turbines) (Cho et al., 2012; Qin et al., 2022b). However, few studies have focused on the matching optimization of the impeller and stator, especially for centrifugal pumps. In addition, under some circumstances, optimizing only the impeller is ineffective for improving the performance of the pump (Shim & Kim, 2020a). Consequently, it is of great significance to consider the matching of the impeller and volute when optimizing centrifugal pumps and to investigate the influence of these two components on the hydraulic performance.
For the design of multi-stage double-suction centrifugal pumps, Wang et al. (2017) and Zhang and Huang (2020) enhanced the hydraulic performance by modifying the structural design of the pump components. However, their methods relied significantly on the designers' experience. A systematic analysis of the effects of the main design parameters on the pump performance was not conducted, and the matching relationships between different flow components were not considered. Moreover, in these studies, performance improvement over the entire flow range was not achieved.
In summary, optimization based on the ANN-based surrogate model has matured; however, owing to the nonlinear parameters, complicated structure, mesh adjustment, and low fidelity of CFD predictions, the enhancement of the energy efficiency of the multi-stage double-suction centrifugal pump has not been substantially developed. Therefore, in this paper, considering the matching between the impeller and volute, nine main geometric parameters of these two components were selected as the design variables. The hydraulic efficiency under three selected operating conditions was set as the optimization objective to improve the energy efficiency of the pump. Then, a BPNN model was established based on the numerical results obtained using an automatic simulation program. Subsequently, a multi-objective genetic algorithm (MOGA) was used to obtain Pareto-optimal solutions. Finally, a correlation analysis was utilized to investigate the effect of the main design parameters on the hydraulic performance of the pump, and using flow loss visualization techniques based on entropy production theory, a detailed comparative analysis of the energy characteristics of the original and optimized models was conducted to investigate the optimization mechanism.

Description of baseline model
The reference model under investigation is an industrial multi-stage double-suction centrifugal pump designed and manufactured by Shangdong Shuanglun Co., Ltd., China. The design flow rate, head, and power are 0.15 m³/s, 132 m, and 269 kW, respectively, and the rotating speed is 1490 rpm. At the design flow rate, the specific speed of the pump is calculated to be 17.62 using the following expression (Fang et al., 2020).
where $n$ represents the rotating speed, $Q_d$ represents the design flow rate, and $H$ represents the head.
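The stated value can be checked numerically. The sketch below is an assumption rather than the expression cited from Fang et al. (2020): it uses the common convention of halving the flow rate for a double-suction impeller and dividing the head by the number of stages, i.e. $n_s = n\sqrt{Q_d/2}\,/\,(H/2)^{3/4}$ for this two-stage pump. The function name and its keyword arguments are illustrative. With the design data given above, this form yields approximately 17.62.

```python
import math


def specific_speed(n_rpm, q_d, head, suction_sides=2, stages=2):
    """Specific speed n_s = n * sqrt(Q_d / sides) / (H / stages) ** 0.75.

    Assumed convention: flow is split between the suction sides and
    head is split evenly between the stages.
    """
    q_per_side = q_d / suction_sides      # m^3/s entering each impeller eye
    head_per_stage = head / stages        # m of head produced per stage
    return n_rpm * math.sqrt(q_per_side) / head_per_stage ** 0.75


# Design data of the baseline pump: 1490 rpm, 0.15 m^3/s, 132 m.
n_s = specific_speed(1490.0, 0.15, 132.0)
print(round(n_s, 2))
```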
A horizontal split structure, as shown in Figure 1, is used. Semi-spiral suction chambers with baffles are arranged symmetrically on both sides to allow the fluid to flow to the first-stage impellers. The inter-stage flow channel transports the fluid from the first-stage impeller to the second-stage impeller, converting kinetic energy into pressure energy by reducing the velocity with minimal loss (Wilson et al., 2006). The second-stage impeller is a double-suction impeller. The hydraulic model of the two-stage impeller adopts the same mechanism. Fluid is discharged via the double volute, and a partition structure is employed inside the volute casing to balance the radial force. Detailed information regarding the multistage double-suction centrifugal pump is presented in Table 1.

Numerical setup
The original flow domain for the numerical simulation established by the three-dimensional (3D) modeling software NX12.0 is shown in Figure 2. Straight pipes with a length of eight times the pipe diameter were placed at the pump inlet and outlet to reduce the effects of flow separation and recirculation under off-design conditions on the convergence of the numerical calculation. The computational domain was partitioned into 11 sections: the suction pipe, suction chamber, first-stage impeller, interstage flow channel, second-stage impeller, double volute, and discharge pipe.
The commercial CFD program, ANSYS CFX, was used in this study to investigate the hydraulic performance and internal flow field of the pump, in which the finite volume method is used to discretize the 3D incompressible continuity and Reynolds-Averaged-Navier-Stokes (RANS) equations. The k-ω shear stress transport (SST k-ω) turbulence model was adopted as it has been proven to accurately predict the performance of multi-stage double-suction centrifugal pumps (Koranteng Osman et al., 2019;Wang et al., 2017).
Specifically, a high-resolution scheme was used to discretize the advection terms. The reference pressure was set as 0 atm. For the boundary conditions, the total pressure (1 atm) and a constant mass flow rate were applied at the inlet and outlet, respectively, and the turbulent intensity was set to 5%. Except for the impeller calculation domains, all other calculation domains were set as stationary regions. The frozen-rotor strategy was selected to transmit data at the interface between the rotor and stator. All the solid walls of the pump were set with smooth and no-slip conditions, and an automatic wall function was selected. Considering the balance between the time cost and numerical accuracy, the convergence criterion for the steady-state simulation was a root-mean-square (RMS) residual less than $10^{-5}$ or a maximum of 500 iteration steps.

Mesh generation
Compared with simulations based on unstructured grids, those based on structured grids typically consume less time and exhibit better convergence characteristics. In addition, they allow the number of grid nodes and the near-wall mesh to be controlled more easily. Therefore, in this study, the hexahedral structured grids in rotating and stationary domains were generated using the commercial meshing software TurboGrid and ANSYS ICEM, respectively. An H/C-type topology was adopted for the blade. To precisely capture the complex flow behaviors near the wall, the mesh on solid surfaces was refined to resolve the high pressure and velocity gradients.

Grid convergence analysis
The specific node number of each component was determined based on the grid convergence index (GCI) using the Richardson extrapolation method. The detailed procedure of the grid convergence analysis was presented by Celik and Karatekin (1997) and Wei et al. (2019). Three sets of meshes with mesh numbers of 29.43, 12.85, and 5.67 million were generated as the fine-, medium-, and coarse-grid schemes, respectively, to ensure a grid refinement factor exceeding 1.3, as shown in Table 2. The dimensionless variable $w_r/u_2$ (where $w_r$ represents the radial velocity and $u_2$ represents the impeller outlet circumferential velocity), evaluated at the design flow rate at 18 monitoring points located at 0.8 times the first-stage impeller outlet diameter (Figure 3; only the first and final points are marked, whereas the remaining points are set at equal intervals), was selected as the key variable for the grid convergence analysis. Considering the trade-off between cost and accuracy, the medium-grid scheme was selected and checked against the requirements of the GCI. Figure 4a illustrates the medium-grid convergence index, which is plotted in the form of error bars, and Figure 4b illustrates the extrapolated value of the present simulation results. The maximum discretization uncertainty was 4.95%, which is lower than the recommended value (Qin et al., 2022a). The medium-grid solution was close to the extrapolated value (Figure 4b). The local apparent order $p$ ranged from 0.867 to 3.62, with a global average value of 1.834. The results above indicate that the medium-grid scheme (with a total mesh number of 12.85 million) satisfies the requirements of the GCI. Figure 5 shows the medium-grid resolution. The near-wall region needed a fine grid because of the high velocity gradient near the solid walls. Hence, the $y^+$ value had to be considered during the meshing.
At critical solid surfaces such as those of the blade and tongue, the first grid height was set to less than 0.05 mm with an expansion ratio ranging between 1.15 and 1.35. In addition, in these regions, $y^+$ was less than 10, and the maximum $y^+$ in this case was less than 50 (refer to Figure 6). The detailed $y^+$ distribution is presented in Table 3. The maximum aspect ratio of this mesh was less than 200, and the minimum angle was greater than 14°. On the whole, the quality of the selected mesh was appropriate for predicting the pump hydraulic performance and capturing turbulent structures using the SST turbulence model (Wang, Tai, et al., 2022).
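The grid convergence procedure described in this section can be sketched as follows. This is a minimal illustration of the widely used three-grid Richardson extrapolation, not the authors' script: the function names, the safety factor of 1.25, and the fixed-point iteration for the apparent order are assumptions, and the index returned is the one for the fine/medium pair. It assumes the three solutions differ from each other and that the iteration stays at a positive order.

```python
import math


def refinement_factor(n_fine, n_coarse, dim=3):
    """Effective refinement factor between two grids of a 3D domain."""
    return (n_fine / n_coarse) ** (1.0 / dim)


def grid_convergence(phi1, phi2, phi3, r21, r32, fs=1.25, iters=50):
    """Return (apparent order p, extrapolated value, GCI of the fine grid).

    phi1, phi2, phi3 are the fine-, medium-, and coarse-grid solutions.
    """
    e21, e32 = phi2 - phi1, phi3 - phi2
    s = math.copysign(1.0, e32 / e21)
    p = 1.0  # initial guess for the fixed-point iteration
    for _ in range(iters):
        q = math.log((r21 ** p - s) / (r32 ** p - s))
        p = abs(math.log(abs(e32 / e21)) + q) / math.log(r21)
    phi_ext = (r21 ** p * phi1 - phi2) / (r21 ** p - 1.0)
    rel_err = abs((phi1 - phi2) / phi1)
    gci = fs * rel_err / (r21 ** p - 1.0)
    return p, phi_ext, gci


# Refinement factors implied by the three mesh sizes quoted above.
r21 = refinement_factor(29.43e6, 12.85e6)
r32 = refinement_factor(12.85e6, 5.67e6)
print(round(r21, 3), round(r32, 3))
```

Both factors come out slightly above 1.3, which is consistent with the requirement stated for Table 2.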

Test rig set up
To verify the reliability of the numerical results, the baseline model was investigated experimentally using an open-loop test rig at Shangdong Shuanglun Co. Ltd., China. A schematic of the test rig is shown in Figure 7. A torque meter was used to measure the shaft torque with a measurement precision within ±0.3%. An electromagnetic flowmeter with an error margin of less than ±0.5% was used to measure the flow rate. The ranges of the two pressure sensors were 0-100 kPa and 0-2.5 MPa, respectively, and the uncertainty was less than ±0.1%. The characteristic signals were transmitted by the data acquisition device to the LabVIEW platform for analysis. Data containing the shaft torque, flow rate, and inlet/outlet pressure were recorded at the nominal rotating speed for each operating condition. The test was repeated thrice for the same operating point, and the mean value was used as the final datum. The overall uncertainty in measuring the head and efficiency was less than ±0.5%.

Hydraulic performance comparison
A comparison between the hydraulic performance based on the computational results obtained using the aforementioned numerical settings and that based on the experimental results is shown in Figure 8. Nondimensional performance parameters, which can be calculated as follows, were used for the comparison.
Here, the nondimensional parameters are the head coefficient and the shaft power coefficient $\lambda$; $P$ represents the shaft power, $g$ denotes the gravitational acceleration, and $R_2$ represents the impeller outlet radius.
The error between the numerical results and the experimental results was within 5%. Because the volumetric and mechanical efficiencies were not considered in this study, the efficiency and head obtained via the numerical simulation were generally higher than the experimental values. The numerical efficiency curve exhibited good agreement with the experimental curve, and the best efficiency point was located at approximately 0.9 times the design flow rate. The deviation in the efficiency under the nominal condition was 3.8%, and as the flow rate increased, the deviation decreased. The curve depicting the head shows that the calculated data agreed well with the test data. The numerical error for the head at the nominal flow rate was 2.5%, and under off-design conditions, the error did not increase significantly. As noted from the shaft power, the calculated values were slightly lower than the experimental values, and the maximum deviation occurred under partial-load conditions. In summary, the numerical methodology used in this study exhibits high reliability and can satisfy the requirements of subsequent studies.

Entropy production theory
With regard to the multi-stage double-suction centrifugal pump, owing to the viscosity of the working medium and the presence of Reynolds stress during operation, mechanical energy is irreversibly converted into internal energy, resulting in a significant amount of irreversible energy loss. Meanwhile, the flow loss in turbomachinery is complex, as the total pressure loss cannot be used to visualize the distribution of the flow loss and locate the maximum value. Therefore, the local entropy production method based on the second law of thermodynamics is appropriate for visualizing and measuring the flow loss in a pump.
Entropy is a state variable that is essentially used to characterize the degree of chaos inside a system. In turbulent flows, the transfer equation for single-phase incompressible flows in Cartesian coordinates can be described as follows (Ji et al., 2020).
where $\rho$ represents the density of the fluid; $s$ represents the specific entropy; $u$, $v$, and $w$ represent the velocity components in the Cartesian coordinate system; $T$ represents the thermodynamic temperature; $\operatorname{div}(\vec{q}/T)$ denotes the reversible heat transfer; and $\Phi/T$ and $\Phi_\theta/T^2$ represent the entropy production caused by viscous dissipation and heat transfer, respectively.
Except for the two terms $\Phi/T$ and $\Phi_\theta/T^2$, which are always positive, the other quantities can be positive or negative depending on the directions of the flow and heat flux.
According to the Reynolds time-averaged method for the turbulent regime, the instantaneous quantities in Equation (4) can be decomposed into time-mean and fluctuating quantities, and Equation (4) can be rewritten accordingly. For this research, the temperature within the flow domain was assumed to be constant. Hence, the heat transfer during pump operation was neglected. This implies that the terms $\overline{\operatorname{div}(\vec{q}/T)}$, $\rho\left(\partial\overline{u's'}/\partial x + \partial\overline{v's'}/\partial y + \partial\overline{w's'}/\partial z\right)$, and $\overline{\Phi_\theta/T^2}$ were set to zero. The time-averaged format of the transfer equation is expressed as follows.
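For orientation, the standard form of these relations in the entropy production literature (e.g. Kock & Herwig, 2004) is sketched below. It is written to match the term descriptions in this section and is an assumption about the original typesetting, not a reproduction of it; in particular, only the number of Equation (4) is taken from the text, and the symbols $\Phi$ and $\Phi_\theta$ follow common usage.

```latex
% Equation (4): instantaneous entropy transport equation
\rho\left(\frac{\partial s}{\partial t}
        + u\frac{\partial s}{\partial x}
        + v\frac{\partial s}{\partial y}
        + w\frac{\partial s}{\partial z}\right)
  = -\operatorname{div}\!\left(\frac{\vec{q}}{T}\right)
    + \frac{\Phi}{T} + \frac{\Phi_\theta}{T^{2}}

% Reynolds-averaged form
\rho\left(\frac{\partial \bar{s}}{\partial t}
        + \bar{u}\frac{\partial \bar{s}}{\partial x}
        + \bar{v}\frac{\partial \bar{s}}{\partial y}
        + \bar{w}\frac{\partial \bar{s}}{\partial z}\right)
  = -\overline{\operatorname{div}\!\left(\frac{\vec{q}}{T}\right)}
    - \rho\left(\frac{\partial \overline{u's'}}{\partial x}
              + \frac{\partial \overline{v's'}}{\partial y}
              + \frac{\partial \overline{w's'}}{\partial z}\right)
    + \overline{\left(\frac{\Phi}{T}\right)}
    + \overline{\left(\frac{\Phi_\theta}{T^{2}}\right)}

% Isothermal case: only the dissipation term remains, split into a
% mean-flow (direct) part and a fluctuating (indirect) part
\dot{S}'''_{D} = \overline{\left(\frac{\Phi}{T}\right)}
               = \dot{S}'''_{\bar{D}} + \dot{S}'''_{D'}
```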
where $\dot{S}'''_{D}$ represents the local entropy production rate (EPR), $\dot{S}'''_{\bar{D}}$ represents the viscous (direct) dissipation induced by the time-averaged flow field, and $\dot{S}'''_{D'}$ represents the turbulent (indirect) dissipation due to the velocity fluctuation. These two components can be calculated using Equation (7).
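The two components are commonly written in the form below. This is the standard expression from the entropy production literature, given as a sketch under the assumption that Equation (7) follows it; overbars denote time-mean quantities and primes denote fluctuations.

```latex
\dot{S}'''_{\bar{D}} = \frac{\mu}{T}\left\{
    2\left[\left(\frac{\partial \bar{u}}{\partial x}\right)^{2}
         + \left(\frac{\partial \bar{v}}{\partial y}\right)^{2}
         + \left(\frac{\partial \bar{w}}{\partial z}\right)^{2}\right]
    + \left(\frac{\partial \bar{u}}{\partial y} + \frac{\partial \bar{v}}{\partial x}\right)^{2}
    + \left(\frac{\partial \bar{u}}{\partial z} + \frac{\partial \bar{w}}{\partial x}\right)^{2}
    + \left(\frac{\partial \bar{v}}{\partial z} + \frac{\partial \bar{w}}{\partial y}\right)^{2}
  \right\}

\dot{S}'''_{D'} = \frac{\mu}{T}\left\{
    2\left[\overline{\left(\frac{\partial u'}{\partial x}\right)^{2}}
         + \overline{\left(\frac{\partial v'}{\partial y}\right)^{2}}
         + \overline{\left(\frac{\partial w'}{\partial z}\right)^{2}}\right]
    + \overline{\left(\frac{\partial u'}{\partial y} + \frac{\partial v'}{\partial x}\right)^{2}}
    + \overline{\left(\frac{\partial u'}{\partial z} + \frac{\partial w'}{\partial x}\right)^{2}}
    + \overline{\left(\frac{\partial v'}{\partial z} + \frac{\partial w'}{\partial y}\right)^{2}}
  \right\}
```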
where $\mu$ represents the dynamic viscosity. The viscous dissipation ($\dot{S}'''_{\bar{D}}$) can be directly calculated during post-processing using velocity field data. Nevertheless, the component of the velocity fluctuation cannot be obtained using numerical simulations based on the RANS method. According to Kock and Herwig (2004), the dissipation caused by velocity fluctuation is closely associated with the turbulent dissipation rate in the two-equation turbulence model. Hence, for the SST k-ω model used in this study, the turbulent dissipation can be approximated using Equation (8).
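A post-processing step of this kind can be sketched in a few lines. The code below is illustrative only: it assumes the direct term takes the usual velocity-gradient form, that the turbulent term for the SST k-ω model is approximated as $\beta\rho\omega k/T$ with $\beta = 0.09$ (a commonly used closure, assumed here since the expression is not reproduced in this text), and that the domain total is a volume-weighted sum over cells. All function names and the sample values are hypothetical.

```python
def direct_epr(grad, mu, temperature):
    """Direct (viscous) entropy production rate in W/(m^3 K).

    grad[i][j] is d(u_i)/d(x_j) of the time-averaged velocity field.
    """
    normal = sum(grad[i][i] ** 2 for i in range(3))
    shear = ((grad[0][1] + grad[1][0]) ** 2
             + (grad[0][2] + grad[2][0]) ** 2
             + (grad[1][2] + grad[2][1]) ** 2)
    return mu / temperature * (2.0 * normal + shear)


def turbulent_epr(rho, omega, k, temperature, beta=0.09):
    """Turbulent (indirect) term, assumed closure: beta * rho * omega * k / T."""
    return beta * rho * omega * k / temperature


def total_entropy_production(cells):
    """Sum volume * (direct + turbulent) over (volume, direct, turbulent) cells."""
    return sum(vol * (d + t) for vol, d, t in cells)


# Hypothetical single-cell example: pure shear du/dy = 10 1/s in water.
grad = [[0.0, 10.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
s_direct = direct_epr(grad, mu=1.0e-3, temperature=300.0)
s_turb = turbulent_epr(rho=1000.0, omega=100.0, k=0.01, temperature=300.0)
print(s_direct, s_turb)
```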
Therefore, the total entropy production of each computational domain can be determined by integrating the viscous and turbulent dissipations as follows.

Figure 9 presents the definitions of the geometric parameters of the impeller and double volute. The meridional shape of the impeller was fitted using fourth-order Bézier curves with five control points. Due to the size limitation of the suction chamber, the impeller inlet diameter $D_1$ was fixed. Considering the assembly of the impeller, inter-stage flow channel, and volute, the impeller outlet diameter $D_2$ and volute inlet diameter $D_3$ were set as constants. Therefore, only the impeller outlet width $b_2$ was chosen as the design variable to control the meridional shape of the impeller. The blade angle was set to vary linearly from the hub to the shroud, and the wrap angle was consistent at the hub and shroud. To reduce the computational complexity and avoid generating invalid blade geometries, the blade thickness was kept consistent with that of the baseline model during the parametric design process. Hence, five parameters, namely the blade inlet angle at hub $\beta_{1h}$, blade inlet angle at shroud $\beta_{1s}$, blade outlet angle at hub $\beta_{2h}$, blade outlet angle at shroud $\beta_{2s}$, and blade wrap angle $\phi$, were selected as the design variables to determine the blade profile. For the double volute, a constant velocity was assumed in all cross-sections around the circumference, which was used to design the variation of the cross-sectional area. According to Stepanoff (Shim et al., 2018), the constant velocity is obtained by the equation $c_u = k_s\sqrt{2gH}$, where $k_s$ is determined by the specific speed $n_s$. A previous study (Shim et al., 2016) found that the partition start angle $\theta$ had a certain influence on the hydraulic performance of centrifugal pumps.
Thus, three geometric parameters of the double volute, namely the volute inlet width $b_3$, Stepanoff number $k_s$, and partition start angle $\theta$, were selected to be the design variables. Table 4 presents the ranges of all the design variables used for the parameterization.
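The constant-velocity rule for the volute can be illustrated with a short calculation. The sketch below evaluates $c_u = k_s\sqrt{2gH}$ and, as an additional assumption not stated in the text, sizes a cross-section so that it carries the share of the flow collected up to a given circumferential angle of a single spiral. The value of $k_s$, the head, and the function names are hypothetical; the actual range of $k_s$ is the one listed in Table 4.

```python
import math


def volute_velocity(k_s, head, g=9.81):
    """Constant volute velocity c_u = k_s * sqrt(2 * g * H) in m/s."""
    return k_s * math.sqrt(2.0 * g * head)


def section_area(flow_rate, c_u, angle_deg):
    """Cross-sectional area at a circumferential angle for a constant
    mean velocity (assumed: flow collected grows linearly with angle)."""
    return flow_rate * (angle_deg / 360.0) / c_u


# Hypothetical values for illustration only.
c_u = volute_velocity(k_s=0.5, head=100.0)
area_half = section_area(0.15, c_u, 180.0)
print(round(c_u, 3), area_half)
```

A larger $k_s$ raises the design velocity and therefore shrinks every cross-section.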

Automatic simulation and software integration
In this study, an automatic geometry modeling and numerical simulation procedure was proposed to solve the problem of sample database establishment. The framework was built using Python to integrate the 3D modeling and CFD software. Figure 10 illustrates the logic of the automatic simulation program.
The design variables were classified using a Python program to distinguish between the geometric parameters of the impeller and double volute. The flow domains of the impeller and volute were generated by the turbomachinery design software, CFturbo. The geometry was imported into the ANSYS Workbench for the final meshing and numerical calculations. The calculation results were extracted and aggregated in batch mode using Python code.
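The control flow of such a program can be sketched as below. This is a hypothetical skeleton, not the authors' code: the calls into CFturbo and ANSYS Workbench are represented by user-supplied callables (`build_geometry` and `mesh_and_solve` are placeholder names), since their actual interfaces are not described here. The parameter keys mirror the nine design variables, and failed cases are recorded rather than aborting the batch.

```python
IMPELLER_KEYS = ("b2", "beta_1h", "beta_1s", "beta_2h", "beta_2s", "phi")
VOLUTE_KEYS = ("b3", "k_s", "theta")


def split_parameters(params):
    """Separate one design into impeller and double-volute parameters."""
    impeller = {k: v for k, v in params.items() if k in IMPELLER_KEYS}
    volute = {k: v for k, v in params.items() if k in VOLUTE_KEYS}
    return impeller, volute


def run_batch(samples, build_geometry, mesh_and_solve):
    """Run every sample; return (database rows, list of failed cases)."""
    database, failed = [], []
    for idx, params in enumerate(samples):
        try:
            impeller, volute = split_parameters(params)
            geometry = build_geometry(impeller, volute)  # placeholder for the CAD step
            results = mesh_and_solve(geometry)           # placeholder for meshing + CFD
        except Exception as exc:
            failed.append((idx, str(exc)))               # keep going after a failure
            continue
        database.append({"id": idx, **params, **results})
    return database, failed
```

Keeping the failures in a separate list makes it easy to report how many of the generated designs produced valid results.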

Optimization objective and constraint
To broaden the high-efficiency area of the multi-stage double-suction centrifugal pump and ensure that the performance index satisfies the requirements of engineering application, the objective function for the multi-objective optimization is defined as follows.
where the subscripts refer to the operating conditions, and the superscript ORI refers to the baseline values.

Optimization procedure
The optimization procedure performed in this study, as shown in Figure 11, can be categorized into three stages. The main task of the first stage was to create the sample database. In this process, 150 sample points were generated using LHS, which can meet the requirements of a high-precision surrogate model trained by a BPNN. Then, the automatic simulation program was used to obtain the objective values of each sample point. In the second stage, the database was utilized for training the multi-layer feedforward neural networks to construct the approximate functional relationship between the design variables and the objective values. Finally, a MOGA was applied to solve for the Pareto-optimal solutions, and the accuracy of the optimal results was verified via CFD.

Design of experiment
LHS was introduced by McKay et al. (2000) as a statistical method; it affords the advantages of effective space filling and the ability to fit nonlinear relationships. Hence, sample points were generated using LHS in this study.
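The stratification idea behind LHS can be shown with a short standard-library sketch. It is a plain implementation of the basic method, not the tool used by the authors: each variable range is divided into as many equal strata as there are samples, one point is drawn per stratum, and the strata are shuffled independently for each variable. The bounds in the example are placeholders, since the actual ranges are those of Table 4.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Return n_samples points; bounds is a list of (low, high) per variable."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # independent permutation for each variable
        columns.append([low + (k + rng.random()) / n_samples * (high - low)
                        for k in strata])
    # Transpose the per-variable columns into one tuple per design.
    return [tuple(col[i] for col in columns) for i in range(n_samples)]


# 150 designs in a nine-dimensional unit cube (placeholder bounds).
designs = latin_hypercube(150, [(0.0, 1.0)] * 9, seed=1)
print(len(designs), len(designs[0]))
```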
Under three different pump operating conditions, 150 sample points were generated in the design space. The characteristic prediction of each design was based on the automatic simulation program described in Section 4.1.2. Owing to failures in modeling and calculation, not all cases were valid during this process. A total of 149 valid cases were included in the database. Table 5 presents the data distribution in the sample database. For the pump efficiency, the maximum values of all features in the database exceeded the baseline values, and under off-design conditions, the difference was more significant. Meanwhile, the difference between the minimum and baseline values was relatively large under off-design conditions. The standard deviation of the efficiency under the nominal condition reached a minimum, and the mean value under partial-load conditions was the most similar to the baseline value. Compared with the pump efficiency, the pump head was more sensitive to changes in the design parameters, particularly under overload conditions. Thus, it is necessary to strictly limit the variation of the head in the optimization equations.

Surrogate model
An ANN, which was proposed by McCulloch and Pitts (McCullock & Pitts, 1956), is a computing system inspired by the biological neural network that constitutes animal brains. The ANN has been widely used in turbomachinery optimization recently because of its excellent learning, updating, and nonlinear fitting abilities. The BPNN is a multi-layer feedforward neural network based on supervised learning. It is generally trained via the steepest descent method and exhibits excellent adaptivity in achieving a strong nonlinear mapping from input to output. The basic topology of the BPNN is shown in Figure 12. An ANN typically comprises an input layer, a hidden layer, and an output layer. One or more hidden layers may be used, but overfitting occurs more easily as the number of hidden layers increases.
In this study, three-layer BPNNs with six neurons in the hidden layer were used to construct a surrogate model between the decision variables and objective functions. The number of hidden-layer neurons was determined via trial and error. To ensure the high accuracy of the surrogate model, the sample database was allocated for training and testing at a ratio of 7:3.
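A three-layer network of this size is small enough to write out directly. The sketch below is a generic illustration under stated assumptions, not the authors' model: it uses a tanh hidden layer with six neurons, a single linear output, per-sample steepest-descent updates, and inputs assumed to be scaled to [0, 1]. The class and function names, learning rate, and epoch count are hypothetical.

```python
import math
import random


def split_database(samples, train_fraction=0.7, seed=0):
    """Shuffle and split (x, y) pairs into training and testing sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


class BPNN:
    """Input layer -> tanh hidden layer -> one linear output neuron."""

    def __init__(self, n_inputs, n_hidden=6, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        return sum(w * h for w, h in zip(self.w2, self._hidden(x))) + self.b2

    def train(self, data, lr=0.05, epochs=100):
        for _ in range(epochs):
            for x, y in data:
                h = self._hidden(x)
                err = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2 - y
                for j, hj in enumerate(h):
                    # Backpropagate through tanh before updating w2[j].
                    delta = err * self.w2[j] * (1.0 - hj * hj)
                    self.w2[j] -= lr * err * hj
                    self.b1[j] -= lr * delta
                    for i, xi in enumerate(x):
                        self.w1[j][i] -= lr * delta * xi
                self.b2 -= lr * err


def mse(model, data):
    """Mean squared prediction error over (x, y) pairs."""
    return sum((model.predict(x) - y) ** 2 for x, y in data) / len(data)
```

With 149 valid cases, a 7:3 split under this rounding gives 104 training and 45 testing samples; one such network would be trained per objective.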

Optimization algorithm
In this study, the enhancement in the efficiency of the pump under the three operating conditions should be categorized as a multi-objective optimization problem in high-dimensional design spaces. The non-dominated sorting genetic algorithm (NSGA) based on the controlled elitism concept is well known for its advantage of searching different solution spaces simultaneously (Deb et al., 2002). The NSGA-II is an improved version of the NSGA, and it adopts a fast non-dominated sorting mechanism, which can approximate the Pareto frontiers (optimal solutions) accurately, whereas the uniformity of the Pareto-optimal solution dispersion is guaranteed via the concept of crowding distance (congestion degree). Hence, NSGA-II has been widely used in turbomachine optimization processes. In this study, multi-objective optimization was performed using Python code. The detailed settings are listed in Table 6.
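The two ingredients named above, fast non-dominated sorting and crowding distance, can be sketched compactly. The code below is a generic textbook-style illustration for maximization problems, not the authors' implementation, and it omits selection, crossover, and mutation; the function names are illustrative.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def fast_non_dominated_sort(points):
    """Return a list of fronts, each a list of indices into points."""
    n = len(points)
    dominated = [[] for _ in range(n)]   # indices that point i dominates
    counts = [0] * n                     # how many points dominate point i
    fronts = [[]]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(points[i], points[j]):
                dominated[i].append(j)
            elif dominates(points[j], points[i]):
                counts[i] += 1
        if counts[i] == 0:
            fronts[0].append(i)
    k = 0
    while fronts[k]:
        next_front = []
        for i in fronts[k]:
            for j in dominated[i]:
                counts[j] -= 1
                if counts[j] == 0:
                    next_front.append(j)
        fronts.append(next_front)
        k += 1
    return fronts[:-1]                   # drop the trailing empty front


def crowding_distance(points, front):
    """Crowding distance of each index in one front (boundaries get inf)."""
    dist = {i: 0.0 for i in front}
    for m in range(len(points[front[0]])):
        ordered = sorted(front, key=lambda i: points[i][m])
        low, high = points[ordered[0]][m], points[ordered[-1]][m]
        dist[ordered[0]] = dist[ordered[-1]] = float("inf")
        if high == low:
            continue
        for a, b, c in zip(ordered, ordered[1:], ordered[2:]):
            dist[b] += (points[c][m] - points[a][m]) / (high - low)
    return dist
```

In the full algorithm, candidates are ranked first by front index and then by larger crowding distance, which is what keeps the Pareto set evenly spread.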

Correlation analysis
Correlation analysis can reveal the relationship between the design parameters and the hydraulic performance of a pump. Hence, a correlation analysis based on the Pearson correlation coefficient was applied in this study. The Pearson correlation coefficient, $r_{X,Y}$, can be calculated using Equation (11).
where $(X, Y)$ is a pair of random variables, cov denotes the covariance, $\sigma_X$ represents the standard deviation of $X$, $\sigma_Y$ represents the standard deviation of $Y$, and $N$ represents the sample size. The range of $r_{X,Y}$ is from −1 to 1. An $r_{X,Y}$ value of less than 0 implies that the variables are negatively correlated, 0 implies no correlation, and greater than 0 implies that the variables are positively correlated.
Figure 13 shows the results of the correlation analysis. The results reveal that $b_2$ and $k_s$ significantly affected the pump efficiency under off-design conditions. Compared with other parameters that determine the blade profile, the wrap angle $\phi$ exerted the greatest effect on the efficiency, particularly under the nominal condition. Similar to $\theta$, the volute inlet width $b_3$ only exerted a certain effect on the efficiency under the nominal condition. For the pump head, the impeller outlet width $b_2$ played an essential role under all operating conditions, whereas $\beta_{1h}$, $\beta_{1s}$, $\beta_{2h}$, and $\beta_{2s}$ could be neglected as their correlation coefficients were extremely small. The blade wrap angle $\phi$ adversely affected the head, and the corresponding magnitude of the correlation coefficient was smaller than that of the efficiency. The effects of the three geometric parameters of the double volute were negligible, except for $k_s$, which contributed positively to the pump head at the design and partial flow rates. To sum up, the impeller outlet width $b_2$ had the most significant effect on the hydraulic performance of the pump, particularly on the pump head, which agrees with the findings of Lettieri et al. (2014). Conversely, the parameters of the blade profile, namely $\beta_{1h}$, $\beta_{1s}$, $\beta_{2h}$, and $\beta_{2s}$, exerted negligible effects on the hydraulic performance, which is probably due to the small ranges of these parameters.
Therefore, when optimizing similar types of pumps in the future, the ranges of these design parameters can be appropriately expanded to obtain a model with higher hydraulic performance instead of selecting a conservative range to ensure the success rate of automatic three-dimensional modeling. In addition, Shim and Kim (2020b) discovered that the Stepanoff number $k_s$ could affect the slope of the head curve and the location of the best efficiency point of the centrifugal pump, which is consistent with the present correlation analysis.
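The coefficient used in this section follows the usual definition $r_{X,Y} = \operatorname{cov}(X, Y)/(\sigma_X \sigma_Y)$. A minimal standard-library sketch is given below; the function name and the sample values are illustrative, and in practice one column of the sample database (a design variable) would be paired with one performance column.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # The 1/N factors of the covariance and standard deviations cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical data: a head that rises linearly with outlet width.
widths = [28.0, 30.0, 32.0, 34.0]
heads = [128.0, 131.0, 134.0, 137.0]
print(pearson_r(widths, heads))
```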

Regression analysis for surrogate model
The reliability of the surrogate model has a decisive effect on the results of any optimization built on that model and the optimization algorithm. To evaluate the accuracy of the artificial neural networks used in this research, a regression analysis was performed.
The adjusted R-squared (R 2 adj ) (Gan et al., 2022) was chosen to indicate the regression performance of the surrogate model, which can be expressed as follows.
\[ R^2_{adj} = 1 - \frac{(1 - R^2)(n - 1)}{n - p - 1}, \qquad R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \]
Where n represents the total sample size; p represents the number of features; y and ŷ represent the actual and predicted values of sample points, respectively; and ȳ is the mean of the actual values. The results are shown in Figure 14. A minimum R 2 adj of 0.95858 was recorded in the fitting of the efficiency under the nominal condition (1.0Q d ). Thus, the prediction accuracy of the surrogate model is sufficient for subsequent optimization design (Suh et al., 2019).
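As a hedged illustration of the metric, the following sketch evaluates the standard adjusted R-squared definition for a set of actual and predicted values. The function name and the small numerical example are assumptions made here for illustration; they are not taken from the surrogate model of this study.

```python
import numpy as np

def adjusted_r2(y_true, y_pred, n_features):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    n = y_true.size
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return float(1.0 - (1.0 - r2) * (n - 1) / (n - n_features - 1))

# Toy values (assumption): five samples and a single feature.
y_actual = [1.0, 2.0, 3.0, 4.0, 5.0]
y_predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
score = adjusted_r2(y_actual, y_predicted, n_features=1)
```

For the present problem, n would be the number of valid samples and p the nine design variables, so the penalty factor (n − 1)/(n − p − 1) stays close to one.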

Pareto frontiers
The multi-objective optimization problem of the multi-stage double-suction centrifugal pump was solved using a MOGA, and the detailed procedures are provided in Section 4.3. The Pareto frontiers stabilized after 10,000 iterations, and 400 Pareto-optimal solutions were obtained in the 3D function space, indicating a tradeoff among the conflicting objectives, as shown in Figure 15 (where ORC denotes the original case, and OPIV represents the selected case in the Pareto frontiers). Along the frontier, improving one objective function results in the deterioration of at least one of the others. Most Pareto cases exhibited better performance than the original case under the selected operating conditions. Nevertheless, under the partial-load condition, there were still some Pareto cases that did not exhibit an efficiency enhancement compared with the original case because the specific speed of the pump was low at this time. Meanwhile, those Pareto solutions with a higher efficiency than that of the original case under the partial-load condition were usually accompanied by a decrease in head. Therefore, it is necessary to select an appropriate solution according to the practical engineering requirements. In this study, the selection criterion was to guarantee an efficiency improvement of at least 2% for the three selected operating conditions while focusing on improving the efficiency under the nominal condition. Finally, a Pareto-optimal case with uniformly distributed characteristics was selected for further research, which is denoted by the star symbol in Figure 15, and the detailed design information is listed in Table 7. The selected case was verified via CFD. A comparison between the predicted data obtained using the BPNN and the numerical verification data is presented in Table 8.
It is worth mentioning that the maximum error between the validations and predictions was 2.11%, and the mean value was less than 1%, which confirms the reliability of the surrogate models used in the optimization process. Moreover, these results illustrate the high applicability of the ANN for predicting the performance of multi-stage double-suction centrifugal pumps.
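The notion of Pareto optimality and the selection criterion described above can be sketched as follows. This is a minimal illustration and not the MOGA implementation used in this study: the efficiency values are invented, the 2% threshold is interpreted as an absolute gain in efficiency points (an assumption), and the tie-breaking rule of taking the highest nominal-condition efficiency is one possible reading of "focusing on the nominal condition".

```python
import numpy as np

def pareto_mask(objectives):
    """Boolean mask of non-dominated rows when every objective is maximized."""
    obj = np.asarray(objectives, dtype=float)
    mask = np.ones(obj.shape[0], dtype=bool)
    for i in range(obj.shape[0]):
        # Row j dominates row i if it is >= everywhere and > somewhere.
        dominators = np.all(obj >= obj[i], axis=1) & np.any(obj > obj[i], axis=1)
        if dominators.any():
            mask[i] = False
    return mask

def select_design(front, baseline, min_gain=2.0, focus=1):
    """Index of the front member that gains at least min_gain in every
    objective and has the highest value in column `focus`; None if no
    member qualifies."""
    front = np.asarray(front, dtype=float)
    gains = front - np.asarray(baseline, dtype=float)
    feasible = np.flatnonzero(np.all(gains >= min_gain, axis=1))
    if feasible.size == 0:
        return None
    return int(feasible[np.argmax(front[feasible, focus])])

# Hypothetical efficiencies (%) at 0.6Qd, 1.0Qd, and 1.2Qd.
baseline = np.array([60.0, 70.0, 65.0])
candidates = np.array([
    [62.5, 73.0, 69.0],
    [63.0, 72.5, 68.0],
    [61.0, 74.0, 70.0],
    [62.0, 72.0, 67.0],
    [60.5, 71.0, 66.0],
])
mask = pareto_mask(candidates)
front = candidates[mask]
chosen = select_design(front, baseline)
```

In this toy set the third candidate is Pareto-optimal but gains only one point at 0.6Q d, so it is excluded by the threshold, which mirrors why a solution must be picked from the frontier according to engineering requirements.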

Geometry comparison
A detailed geometry comparison between the original case (ORC) and the optimized case (OPIV) is depicted in Figure 16. The optimized blade differed significantly from the original blade, particularly at the blade trailing edge. For the double volute, the cross-sectional area of the optimized case was larger than that of the original case because of the reduction in the Stepanoff number k s . Additionally, the position of the partition structure changed significantly, which affected the pressure pulsation characteristics of the pump to some extent (Shim et al., 2016). The tongue location remained almost unchanged.

Performance comparison
The characteristic curves are compared in Figure 17. The optimized model achieved a higher efficiency than the original model over the entire flow range, with increases of 2.05%, 3.56%, and 5.36% at 0.6Q d , 1.0Q d , and 1.2Q d , respectively. In addition, the best efficiency point shifted toward the design flow rate, unlike in the original model. After optimization, the head declined at 0.6Q d and 0.7Q d , but the decrease was less than 2%, which is within the permitted range for engineering applications. Among the selected optimized working conditions, the head at 1.2Q d exhibited the most significant increase (7.49%). Moreover, the shaft power of the optimized model differed from that of the original model. Under all working conditions, the shaft power of the optimized model was lower than that of the original model, but the difference decreased as the flow rate increased. In summary, the optimization scheme can effectively alleviate the rapid drop of the efficiency and head curves under overload conditions and broaden the high-efficiency area of the multi-stage double-suction centrifugal pump. Although the maximum efficiency improvement was less than 6%, the multi-stage double-suction centrifugal pump is a typical low specific speed centrifugal pump, whose performance is more difficult to improve. Therefore, the technology proposed in this paper is still of great reference value for practical engineering applications.
To further clarify the optimization mechanism, the hydraulic head distributions are presented in Table 9. It can be seen that the hydraulic loss in the suction chamber increased after optimization because of the significant structural changes in the first-stage impeller; however, this loss was almost negligible. The working capacity of the first-stage impeller of the optimized model decreased significantly compared with that of the original model. By contrast, the capacity of the second-stage impeller improved under the nominal condition and was similar to that of the original scheme under off-design conditions. Compared with the original model, the head losses of both the inter-stage flow channel and double volute were reduced. Meanwhile, it should be noted that the loss of the optimized volute decreased more significantly under the nominal and overload conditions, i.e. by more than 30%, primarily because of the improved matching between the impeller and volute. This demonstrates the superiority and necessity of multi-component matching optimization.

Blade loading comparison
The cavitation characteristic is one of the most critical criteria for the safe operation of pumps. The cavitation phenomenon in centrifugal pumps tends to occur under overload conditions. Hence, the static pressure distributions on feature spanwise planes along the streamwise direction at 1.2 times the design flow rate are compared in Figure 18. Compared with the original scheme, the static pressure on the suction side near the leading edge (s = 0) in the optimization scheme increased to some extent on all feature spanwise planes in the first-stage impeller, where cavitation was most likely to appear. In contrast to the first-stage impeller, the pressure near the leading edge in the second-stage impeller increased significantly as the number of stages increased, which implies that cavitation was less likely to occur in this component. Moreover, after optimization, the minimum pressure value near the leading edge in the second-stage impeller increased significantly. Therefore, it can be concluded that the anti-cavitation erosion performance of the optimized model improved.

Comparison of total entropy production
A comparison of the total entropy production is shown in Figure 19. The flow loss of the pump was significantly reduced after optimization. The total energy loss of the optimization scheme was reduced by 9.4%, 20.0%, and 29.4% at 0.6, 1.0, and 1.2 times the design flow rate, respectively, which explains the considerable efficiency improvement compared with the original model from the perspective of energy conversion. Specifically, the total entropy production of the inter-stage flow channel decreased to some extent under the selected operating conditions owing to the enhanced matching between the upstream and downstream components. Moreover, the flow loss of the double volute decreased considerably as the flow rate increased because of the change in its structure and the improvement in compatibility with the upstream fluid domains. Notably, even though the energy loss of the optimized impeller increased under some operating conditions compared with the original model, the overall energy efficiency of the pump did not deteriorate, which illustrates that simply increasing the efficiency of the rotor components in practical engineering applications will not necessarily reduce the overall energy consumption of the pump. Therefore, the optimization method considering the matching of the rotor and stator is a more reliable way to optimize multi-stage centrifugal pumps.

Detailed distribution of EPR in impeller
Owing to the symmetry of the pump structure, only one-half of the components were considered for analysis. Figure 20 shows a comparison of the velocity streamline and EPR distributions in the first-stage impeller at span = 0.5. As can be seen in this figure, since the downstream component adopted a double-volute structure with a symmetric arrangement to balance the radial forces, the corresponding feature distributions exhibited symmetric characteristics. Specifically, because the incidence angle was significantly positive at the partial flow rate, a larger boundary layer was generated on the blade pressure side, which led to a greater amount of low-energy fluid along the pressure side and hence intensified the energy loss caused by the wake vortices (regions M and M'). As the flow rate increased, the losses in regions M and M' tended to decrease. In addition, separated flow structures associated with high pressure gradients were another major contributor to the energy loss. With the rise of the flow rate, the fluid began to propagate along the blade streamlines, which caused the loss to decrease gradually. By comparing the original and optimized schemes, it can be observed that the flow conditions and EPR distribution did not change significantly under the nominal and overload conditions. Nevertheless, because the incidence angle of the optimized impeller decreased under the partial-load condition, the inflow conditions improved, which reduced the size of the flow separation structures in the impeller channel and reduced the energy loss to a certain extent.
In terms of the second-stage impeller (refer to Figure 21), after optimization, the overall flow conditions deteriorated at the partial flow rate, which was primarily attributed to the increase in the positive incidence angle. In addition, the fluid tended to propagate toward the pressure side, resulting in a higher pressure gradient and hence an increase in the losses in regions K, L, and N. Compared with the original model, the optimized model indicated a significant decrease in the loss caused by wake vortices in region T under the partial-load condition, which is contrary to the phenomenon observed in region J. Moreover, the wake vortex structures in region J almost vanished under the nominal and overload conditions. In contrast to the partial flow rate, because of the reduction of the blade inlet angle, the inflow direction of the optimized model fitted better with the blade profile under the overload condition. Correspondingly, the energy loss declined significantly, particularly in region I. Meanwhile, the losses caused by the vortex structures in regions L and T decreased slightly after optimization at the design flow rate, whereas the energy characteristics in the other regions were almost identical.
To further investigate the energy characteristics in the impellers, Figures 22 and 23 depict the difference in the mass-flow-averaged EPR for the two models along the streamwise direction under different operating conditions. In general, the variation laws of the EPR curves for the two models were consistent. As the flow developed, the EPR values exhibited an overall increasing trend and reached a maximum near the trailing edge under the selected conditions, which is consistent with the phenomenon observed by Wu et al. (2022) in a single-stage centrifugal pump. In combination with the streamline distributions in Figures 20 and 21, the occurrence of the maximum EPR value could be attributed to the intense mixing of jet and wake. Meanwhile, the partial-load condition, where the jet-wake phenomenon was most pronounced, was the one with the highest entropy production value among the three selected conditions. Hence, subsequent research can further enhance the performance of low specific speed centrifugal pumps by optimizing the shape of the blade trailing edge to suppress the jet-wake phenomenon (Lin et al., 2022). In addition, different from the EPR distribution in the middle flow channel of the first-stage impeller (from s = 0.2 to s = 0.4), distinct peaks and valleys appeared in this region of the second-stage impeller, possibly because of the more intense blade-fluid energy transformations caused by the complex secondary flow (Chen et al., 2022).
A comparison between the original and optimized models shows that owing to the significant change in the optimized blade trailing edge (refer to Figure 16), the EPR curves near this region showed a downward trend, especially in the first-stage impeller under the partial-load condition. This phenomenon means that the optimized scheme can improve the inflow conditions in the downstream fluid domains of the impellers to some extent. However, the optimized scheme was ineffective for improving the EPR distribution along the streamwise direction under some operating conditions. In summary, the results of the comparison between the two models are consistent with the results of the previous quantitative analysis of the total entropy production in the impellers.
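The mass-flow-averaged EPR profiles discussed above can be illustrated with a short sketch. It assumes that, at each streamwise station, cell values of the EPR are weighted by the magnitude of the local mass flow rate; the station data are invented for illustration and do not come from the simulations in this study.

```python
import numpy as np

def mass_flow_average(values, mass_flow):
    """Mass-flow-weighted average: sum(m_i * v_i) / sum(m_i)."""
    values = np.asarray(values, dtype=float)
    # Magnitudes are used so that locally reversed flow still carries weight
    # (an assumption of this sketch).
    weights = np.abs(np.asarray(mass_flow, dtype=float))
    return float(np.sum(values * weights) / np.sum(weights))

# Hypothetical data: two streamwise stations, two cells per station.
epr_cells = [[100.0, 300.0], [200.0, 600.0]]   # EPR values per cell
mdot_cells = [[1.0, 1.0], [3.0, 1.0]]          # mass flow rate per cell

# One averaged value per streamwise station gives the EPR curve.
profile = [mass_flow_average(e, m) for e, m in zip(epr_cells, mdot_cells)]
```

With equal weights the first station reduces to an arithmetic mean, whereas the second station is pulled toward the cell carrying more mass flow.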

Detailed distribution of EPR in double volute
The EPR and streamline distributions of the double volute are compared in Figure 24. The regions of high entropy production rate were closely associated with the intense impeller-tongue interaction (refer to Figure 21a) and the reflux vortex structures in the diffusion section. In addition, the energy losses of the original model increased with the flow rate, while the optimized model had the minimum energy loss at the design flow rate. Based on a comparison between the original and optimized models, after reducing the partition start angle θ, the loss near the starting position of the partition did not decrease remarkably. However, along with the increase in the volute cross-sectional area, the reflux vortices were suppressed, particularly under the overload condition, which contributed to the apparent loss decline in this region. Meanwhile, owing to the improved matching between the impeller and volute, the flow regime in the tongue area was superior to that in the original model, and the energy losses decreased accordingly.

Conclusion
In this work, a multi-stage double-suction centrifugal pump with a specific speed of 17.62 was numerically calculated, and its performance was verified via tests. By considering the matching between different components, nine main design parameters of the impeller and double volute were selected as the design variables. Using the LHS method, 150 sample cases (149 valid cases) were generated, and they were then simulated using the automatic CFD analysis program to obtain the objective values. Then, a correlation analysis between the design parameters and the performance was carried out. Furthermore, an ANN and a MOGA were adopted to execute a multi-objective optimization for the pump. Finally, an in-depth analysis of the energy loss characteristics of the optimal results was conducted to reveal the reasons for the energy efficiency improvement. The following conclusions were drawn.
(1) According to a regression analysis, R 2 adj was greater than 0.95, which indicates the high reliability of the ANN-based surrogate model in fitting the nonlinear relationship between the nine design variables and optimization objectives. Thus, in future optimization studies, it is feasible to use a BPNN as the surrogate model.
(2) The results of a correlation analysis show that the impeller outlet width b 2 exerted the most significant effect on the hydraulic performance of the pump. By contrast, except for the wrap angle, the remaining design variables that determined the blade profile exerted negligible effects on the performance, which is probably due to the strict range constraints.
(3) According to CFD verification, the optimization scheme provided 2.05%, 3.56%, and 5.36% efficiency improvements at 0.6Q d , 1.0Q d , and 1.2Q d , respectively, and maintained a lower shaft power compared with that of the original scheme. Meanwhile, the head was improved over the entire flow range (except for a 1.24% decrease at 0.7Q d ), particularly at 1.2Q d , where the maximum improvement was 7.49%. Furthermore, the blade loading analysis demonstrates that the optimized model exhibited better anti-cavitation erosion performance. Therefore, the optimized scheme can satisfy the requirements of practical engineering applications and contribute to energy saving.
(4) A quantitative entropy production analysis reveals that the optimization scheme effectively reduced the energy loss of the pump by 9.4%, 20.0%, and 29.4% at 0.6Q d , 1.0Q d , and 1.2Q d , respectively, which explains the considerable efficiency improvement from the perspective of energy conversion. Furthermore, a comparative analysis of the energy characteristics shows that the improvement in energy efficiency could be attributed to the improved matching between the rotor and stator. Meanwhile, subsequent research should pay more attention to optimizing the shape of the blade trailing edge to further improve the performance of centrifugal pumps.
This study provides a ready reference to facilitate the matching optimization design of various turbomachines. However, there is still more scope to optimize the shape of the impeller meridional section and volute cross-section. In addition, for the multi-stage centrifugal pump, the improvement of the operational stability is also a pressing issue in practical engineering. Thus, the optimization of unsteady characteristics such as pressure pulsations, axial force, and radial force will be the focus in the future when computational resources allow.

Disclosure statement
No potential conflict of interest was reported by the author(s).