Designing a committee of machines for modeling viscosity of water-based nanofluids

Viscosity is a crucial thermophysical property of a substance that must be accurately determined before designing a system with a nanofluid as the working fluid. In this study, the committee machine intelligent system (CMIS) technique is used to establish a predictive model for the relative viscosity of water-based nanofluids. The model was developed using 1440 experimental data points for different types of water-based nanofluids containing Al2O3, SiC, SiO2, TiO2, CuO, nanodiamond, and Fe3O4 nanoparticles. The CMIS model combines three intelligent models, namely a multilayer perceptron (MLP) trained with Levenberg-Marquardt (LM), an MLP trained with Bayesian Regularization (BR), and a radial basis function (RBF) network, to estimate the relative viscosity of different water-based nanofluids. Statistical and graphical error criteria revealed that the CMIS technique successfully estimates the relative viscosity of all data points over the whole range of operational conditions with a mean absolute relative error of approximately 1.25%. In terms of precision, the established CMIS provides the best performance, followed by the BR-MLP, LM-MLP, and RBF models. Moreover, the performance and estimation capability of the CMIS model were verified against 13 theoretical and empirical models.

The dispersion of nanoparticulates into a base fluid significantly changes its thermophysical properties (Suganthi & Rajan, 2017). These properties are basic characteristics of nanofluids that can significantly affect heat transfer and fluid flow behavior (Suganthi & Rajan, 2017; Zhao et al., 2015), and many studies have focused on this subject (Bollineni et al., 2021; Dogonchi et al., 2019, 2020; Ghalambaz et al., 2020; Ishak et al., 2020; Mehryan et al., 2019; Menni et al., 2020; Molana et al., 2020; Nasrin et al., 2012; Olayiwola & Dejam, 2020; Parvin et al., 2012). Hence, calculating the thermophysical properties is of crucial importance before applying a nanofluid to a specific application. One of the important thermophysical properties is viscosity, which indicates the internal resistance of a nanofluid to flow (Mahbubul et al., 2012). The viscosity of nanofluids is influenced by a number of parameters, including volume concentration, morphology, temperature, particle size, and shear rate. Several review articles have studied the effect of these parameters on the viscosity of nanofluids (Ahmadi Nadooshan et al., 2018; Khodadadi et al., 2018; Koca et al., 2018; Munyalo & Zhang, 2018; Sezer et al., 2019). The viscosity of nanofluids can affect the convective heat transfer coefficient, pressure drop, required pumping power, and skin friction coefficient (Mahbubul et al., 2014; Tamim et al., 2016; Yang et al., 2017). In addition, viscosity can affect the velocity of nanofluids, which can change the temperature distribution and thus the heat transfer performance (Chandrasekar et al., 2010). Therefore, precise assessment of nanofluid viscosity is crucial before designing a system in which nanofluids are the working fluid.
Generally, available methods for determining the viscosity of nanofluids are categorized as experimental measurements, empirical correlations, theoretical models, and computer-aided models. In recent decades, the viscosity of different nanofluid systems, especially water-based nanofluids, has been analyzed in many experimental studies. Nguyen et al. (2007) studied the viscosity of CuO and Al2O3 water-based nanofluids at temperatures between 22°C and 75°C and nanoparticle volume fractions up to about 9.4%. Their results showed that, at a constant temperature, the dynamic viscosity of nanofluids rises with increasing nanoparticle concentration, whereas it decreases with increasing temperature at a constant nanoparticle volume concentration. They also proposed several empirical correlations for estimating the viscosity of nanofluids at low nanoparticle volume fractions. The rheological properties of alumina-water nanofluids at low particle volume fractions (0.01-1%) were studied by Sekhar and Sharma (2015). They concluded that the nanofluid viscosity has a nonlinear relationship with temperature due to the aggregation of particles. Toghraie et al. (2016) conducted an experimental study to measure the dynamic viscosity of water-based magnetic nanofluids. They reported that the viscosity of nanofluids decreases significantly as the temperature rises. Moreover, they proposed a correlation to model their experimental results. Sundar et al. (2016) surveyed the heat transfer applications of nanodiamond-water nanofluids and determined their viscosity. They showed that the nanofluid viscosity increases by a factor of about 1.57 at a temperature of 293 K and 1.8 at a temperature of 333 K compared to the viscosity of water. They also introduced a novel correlation for estimating the viscosity of nanofluids that takes the influence of volume fraction and temperature into account.
There are also other experimental studies about viscosity of nanofluids which are reviewed in (Bashirnezhad et al., 2016;Gupta et al., 2017;Koca et al., 2018;Murshed & Estellé, 2017;Yang et al., 2017).
Experimental measurements are the most accurate approach; however, they are costly, time-consuming, and difficult to carry out, which restricts their widespread application. On the other hand, theoretical and empirical correlations and computer-aided models can be applied quickly and have gained wide popularity among researchers in recent years. Einstein (1906) developed the first theoretical model used for estimating the viscosity of nanofluids, which applies to uncharged hard spheres at infinite dilution (ϕ < 2%). According to this model, viscosity is linearly related to the particle volume concentration. Brinkman (1952) improved Einstein's model so that it became appropriate for a wider range of particle volume fractions (ϕ < 4%). Batchelor (1977) proposed a second-order polynomial in the particle volume fraction. In this model, he took into account inter-particle interactions and the Brownian motion of the particles. Using a Taylor series expansion for a random bed of particles at dilute concentrations, Lundgren (1972) put forth a third-order polynomial to estimate the viscosity of nanofluids.
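As a minimal sketch of the classical models described above, the following Python snippet evaluates the relative viscosity (nanofluid viscosity divided by base fluid viscosity) as a function of the particle volume fraction ϕ. The coefficient forms are the commonly quoted textbook expressions and are an assumption here, since the equations themselves are not reproduced in this section; the function names are purely illustrative.

```python
def einstein(phi: float) -> float:
    """Einstein-type model: linear in the volume fraction (dilute hard spheres)."""
    return 1.0 + 2.5 * phi


def brinkman(phi: float) -> float:
    """Brinkman-type extension of Einstein's model to moderate concentrations."""
    return 1.0 / (1.0 - phi) ** 2.5


def batchelor(phi: float, k2: float = 6.2) -> float:
    """Batchelor-type second-order polynomial.

    Assumption: k2 = 6.2 is the second-order coefficient usually attributed
    to Batchelor; some nanofluid papers quote 6.5 instead.
    """
    return 1.0 + 2.5 * phi + k2 * phi ** 2


# Compare the three models at a few dilute volume fractions (phi as a fraction, not %).
for phi in (0.005, 0.01, 0.02, 0.04):
    print(f"phi={phi:.3f}  Einstein={einstein(phi):.4f}  "
          f"Brinkman={brinkman(phi):.4f}  Batchelor={batchelor(phi):.4f}")
```

All three expressions depend on the volume fraction only, which is consistent with the later observation that such models cannot capture the influence of temperature or particle size.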

Thomas and Muthukumar (1991) suggested a mathematical expression that considers the effects of three-body hydrodynamic interactions. Apart from the aforementioned models, numerous theoretical models have been proposed so far, among which those of Krieger and Dougherty (1959), Roscoe (1952), Simha (1952), Graham (1981), Metzner (1985), Frankel and Acrivos (1967), Brenner and Condiff (1974), and Saitô (1950) are the most prominent. Empirical models, in contrast, usually do not have any theoretical background and are developed by curve fitting to the available experimental data. For example, Maïga et al. (2004) proposed an empirical correlation for the viscosity of Al2O3-water using least-squares curve fitting. Chen et al. (2007) established a simple polynomial for the viscosity of TiO2-ethylene glycol. In another study, the viscosity of Fe3O4-water at temperatures between 20°C and 60°C and particle volume fractions of 0-2% was successfully predicted by a correlation developed by Sundar et al. (2013). Using 701 experimental viscosity data points for Al2O3-, SiO2-, TiO2-, and CuO-water, Meybodi et al. (2016) developed a model as a function of temperature, volume fraction, and the size of the nanoparticulates. In addition to the mentioned models, many other empirical models have been introduced in past years, including those of Tseng and Lin (2003), Godson et al. (2010), Nguyen et al. (2008), Garg et al. (2008), Kulkarni et al. (2006), Hosseini et al. (2010), Kole and Dey (2010), Sekhar and Sharma (2015), Chiam et al. (2017), and Elcioglu et al. (2018). Unfortunately, most of these correlations were derived from experimental data generated under restricted experimental conditions (i.e. limited ranges of temperature, size, and particle volume fraction, and only a few types of nanoparticles).
Besides, it is proven in numerous articles (Atashrouz et al., 2014;Heidari et al., 2016;Meybodi et al., 2016;Zhao et al., 2015) that these models normally do not return reliable results and are usually applicable within narrow ranges of experimental conditions. Table 1 summarizes a list of the most important empirical and analytical approaches for the viscosity of water-based nanofluids.
Recently, computer-based intelligence and optimization techniques have been utilized to predict the thermophysical characteristics of nanofluids. Bahiraei et al. (2019) presented a comprehensive review of the possible applications of various types of artificial intelligence methods to different issues related to nanofluids. Mehrabi et al. (2013) introduced a fuzzy c-means based adaptive neuro-fuzzy inference system (FCM-ANFIS) model, which considered nanoparticle volume concentration, nanoparticle size, and temperature as the input parameters and the viscosity of Al2O3, TiO2, SiO2, and CuO nanofluids as the dependent variable. The model results were in line with the experimental findings. Meybodi et al. (2015) introduced a Least Square Support Vector Machine (LSSVM) technique to estimate the viscosity of water-based nanofluids of SiO2, Al2O3, CuO, and TiO2. To propose the model, 801 experimental data points at temperatures of 10-72°C and volume concentrations of less than 13% were utilized. Nanoparticle type, nanoparticle size, temperature, nanoparticle volume fraction, and base fluid (water) viscosity were chosen as the inputs of the model. They also successfully compared their results with experimental data.
Recently, Hemmati-Sarapardeh et al. (2018) and Shateri et al. (2020) proposed accurate models using an extensive experimental data bank containing a broad range of operational parameters to estimate the viscosity of nanofluids. However, these models were for all types of nanofluid systems.
For the case of water-based nanofluids, a more accurate model is required. We separated the water-based nanofluid data from the data bank of Shateri et al. (2020) and sought to model water-based nanofluids with high accuracy, which is very important in many industrial and scientific applications. Based on what was discussed above, there are only limited attempts in the literature to predict the viscosity of water-based nanofluids. Moreover, the existing models were proposed based on a limited number of measured data and are applicable only within narrow ranges of influencing parameters such as temperature and particle volume fraction. As a result, it seems necessary to develop a reliable and comprehensive model, based on a large number of measured data points, that can estimate the viscosity of nanofluids over a broad range of input variables.
In this study, first, 1440 data points for the viscosity of water-based nanofluids are collected from the literature; they cover different types of nanoparticles dispersed in water with a broad range of nanoparticle size, temperature, and nanoparticle volume concentration. Afterward, one Radial Basis Function (RBF) neural network, three Multilayer Perceptron (MLP) networks, and one Least Square Support Vector Machine (LSSVM) are developed. Then, the three most accurate models are combined into a committee machine intelligent system (CMIS), and the parameters of the CMIS are optimized using constrained multivariable search methods, including successive linear programming (SLP) and generalized reduced gradient (GRG). Besides, the performance and estimation capability of the established model are evaluated against 13 empirical and theoretical models. A basic framework of this study is illustrated in Figure 1.

Data collection
One of the deficiencies of the previous correlations/models for the viscosity of water-based nanofluids is the limited number of data points and the narrow range of inputs. To propose a comprehensive and robust model for predicting the viscosity of water-based nanofluids, 1440 data points were extracted from open literature sources. To our knowledge, this is the most comprehensive data bank used in the literature so far to introduce a viscosity model for water-based nanofluids. The data bank includes viscosity data for seven different nanofluids over a broad range of nanoparticle density, nanoparticle size, temperature, particle volume fraction, and base fluid viscosity as the inputs, with the relative viscosity of the nanofluid as the target. Table 2 summarizes the details of the data set and the corresponding references. Table 3 shows the statistical criteria of the inputs. Skewness measures how asymmetric a distribution is, i.e. the degree of deviation from the symmetrical bell curve of the normal distribution. Skewness can be positive, negative, zero, or undefined. The skewness of a normal distribution is zero. Left-skewed distributions have negative skewness values, while right-skewed distributions have positive skewness values. Kurtosis is a statistical measure used to describe the shape of a probability distribution. A normal distribution has zero (excess) kurtosis, whereas positive and negative values correspond to more sharply peaked and flatter distributions, respectively.
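The two shape statistics described above can be computed directly from central moments. The snippet below is a small sketch, assuming the population (biased) moment estimators and the excess-kurtosis convention in which a normal distribution gives zero; the paper does not state which estimator was used for Table 3, and the sample values are made up for illustration.

```python
def central_moment(values, order):
    """Population central moment of the given order."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Third standardized moment: negative = left-skewed, positive = right-skewed."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def excess_kurtosis(values):
    """Fourth standardized moment minus 3, so a normal distribution gives 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0


# Hypothetical samples, not taken from the data bank.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 1.0, 1.0, 1.0, 10.0]
print(skewness(symmetric), skewness(right_skewed), excess_kurtosis(symmetric))
```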

Multilayer perceptron neural network (MLP-NN)
As a type of computational intelligence modeling tool, Artificial Neural Networks (ANNs) have been developed with characteristics similar to those of biological neural network systems. ANNs can be applied quickly to pattern recognition, trend identification, prediction, and forecasting because of their high capability in finding complicated nonlinear relationships between inputs and outputs without prior assumptions about the distribution and nature of the data. An ANN consists of two principal elements: (i) processing elements (neurons) and (ii) interconnections among the neurons (Mohaghegh, 2000). MLP networks are the most common type of ANN and are based on a supervised learning paradigm named backpropagation. An MLP neural network consists of three groups of layers: an input layer, an output layer, and hidden layers, which lie between the input and output layers (Lashkarbolooki et al., 2012). Elaborate processing tasks are performed in the hidden layers to determine the complex dependency between the input parameters and the desired target of the model. Each neuron in the MLP neural network contains two parameters, bias and weight, also called synaptic parameters. Optimum values of the bias and weight of each neuron are obtained during the training process by a learning algorithm. The response of the neurons in the hidden and output layers is scaled through an activation function. For the hidden and output layers of an MLP, different activation functions, which can be linear or nonlinear, are used. Examples include the binary step function as well as the linear (purelin), log-sigmoid (logsig), and hyperbolic tangent sigmoid (tansig) functions employed later in this work.
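To make the layer-by-layer computation concrete, the sketch below runs a forward pass through an MLP with a 5-20-5-1 layout using tansig, logsig, and purelin activations, matching the structure reported later in the model development section. The weights are random and untrained, and all function names are hypothetical; this only illustrates how biases, weights, and activation functions interact, not the trained networks of this study.

```python
import math
import random


def binary_step(x):
    return 1.0 if x >= 0 else 0.0


def tansig(x):
    """Hyperbolic tangent sigmoid, output in (-1, 1)."""
    return math.tanh(x)


def logsig(x):
    """Log-sigmoid, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def purelin(x):
    """Linear (identity) transfer function."""
    return x


def init_layer(n_in, n_out, rng):
    """Random weights and biases for one fully connected layer."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.uniform(-1, 1) for _ in range(n_out)]
    return weights, biases


def mlp_forward(x, layers):
    """Apply activation(W x + b) layer by layer."""
    for weights, biases, activation in layers:
        x = [activation(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(weights, biases)]
    return x


rng = random.Random(0)
sizes = [5, 20, 5, 1]                      # inputs, hidden 1, hidden 2, output
activations = [tansig, logsig, purelin]
layers = [(*init_layer(n_in, n_out, rng), act)
          for n_in, n_out, act in zip(sizes[:-1], sizes[1:], activations)]

# Five (already normalized, hypothetical) inputs: density, size, volume
# fraction, temperature, base fluid viscosity.
output = mlp_forward([0.1, 0.2, 0.3, 0.4, 0.5], layers)
print(output)
```

Training would then adjust the weights and biases to minimize the prediction error, which is the role of the learning algorithms discussed in the optimization section.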

Radial basis function neural network (RBF-NN)
The way neurons process information distinguishes RBF neural networks from MLP networks. In RBF neural networks, a nonlinear radial basis function is used as the activation function in, typically, a single hidden layer, and the output layer is obtained as a linear combination of the hidden-layer RBF responses (Broomhead & Lowe, 1988; Najafi-Marghmaleki et al., 2017; Panda et al., 2008; Sayahi et al., 2016). Compared to the input layer, the hidden layer usually has equal or higher dimensionality (Zhao et al., 2015). While the input is a vector of real numbers (x), the output is a scalar function of the input vector as follows:

$$y(x) = \sum_{j=1}^{N} \omega_j \, \varphi_j\left(\lVert x - c_j \rVert\right)$$

where N is the number of neurons in the hidden layer, which is usually lower than the total number of input data used in the training stage, $\omega_j$ denotes the connection weight of neuron j, and $\varphi_j(r)$ is the radial basis transfer function that maps the Euclidean distance $r = \lVert x - c_j \rVert$ of each neuron to the output layer, y(x). The vector $c_j$ is the center of neuron j, used to define the distance (norm) between the inputs and the center of each neuron in the hidden layer. Here, we use the most common radial basis (or activation) function, the Gaussian function defined as $\varphi(r) = \exp\left(-r^{2} / 2\sigma^{2}\right)$, where σ is the spread coefficient. These functions are local to the center vector, i.e. they reach their maximum value at the center of a neuron and approach zero at long distances from it. Importantly, the radius of each radial basis function, expressed by the spread coefficient, has to be determined empirically. Therefore, for a Gaussian RBF-NN, the spread coefficient σ and the number of neurons in the hidden layer are two key parameters that should be optimized.
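The RBF network output described above is a weighted sum of Gaussian responses, each of which decays with the distance between the input and a neuron center. The following sketch evaluates such a network for given centers and weights; the centers, weights, and function names are hypothetical, and only the spread value of 0.75 is taken from the model development section.

```python
import math


def gaussian_rbf(r, sigma):
    """Gaussian basis function: 1 at r = 0, approaching 0 for large r."""
    return math.exp(-r ** 2 / (2.0 * sigma ** 2))


def rbf_predict(x, centers, weights, sigma):
    """Linear combination of Gaussian responses to the input vector x."""
    total = 0.0
    for center, weight in zip(centers, weights):
        r = math.dist(x, center)           # Euclidean distance ||x - c_j||
        total += weight * gaussian_rbf(r, sigma)
    return total


# Two hypothetical neurons in a two-dimensional input space.
centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [2.0, -1.0]
print(rbf_predict([0.0, 0.0], centers, weights, sigma=0.75))
print(rbf_predict([10.0, 10.0], centers, weights, sigma=0.75))  # far from all centers
```

In practice the centers and weights are fitted to the training data, while the spread and the number of neurons are tuned separately.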

Least square support vector machine
One of the robust learning algorithms that can be applied for regression purposes is the support vector machine (SVM). This machine learning technique was later improved by a variant known as the least square support vector machine (LSSVM). The LSSVM is simpler and converges faster than the SVM. Moreover, the LSSVM relates input and output values through a set of linear equations, even for large datasets, whereas SVM methods require solving a nonlinear quadratic set of equations, which is computationally expensive. In LSSVM, the cost function that includes the regression errors of the N training objects ($e_k$) is defined as follows:

$$Q_{\mathrm{LSSVM}} = \frac{1}{2} w^{T} w + \gamma \sum_{k=1}^{N} e_k^{2}$$

where T stands for the transpose, and γ is the relative weight of the summation of regression errors. The cost function is subject to the following constraint (Suykens & Vandewalle, 1999):

$$y_k = w^{T} x_k + b + e_k, \qquad k = 1, \ldots, N$$

where w is the linear regression weight (regression slope), b stands for the intercept of the linear regression or bias, and y is the model output vector.
The weight coefficient (w) is calculated as follows (Eslamimanesh et al., 2012):

$$w = \sum_{k=1}^{N} \alpha_k x_k$$

where

$$\alpha_k = 2 \gamma e_k$$

Based on the LSSVM algorithm, the linear regression equation can be reformulated as (Gharagheizi et al., 2011; Hemmati-Sarapardeh et al., 2013; Pelckmans et al., 2002):

$$y = \sum_{k=1}^{N} \alpha_k x_k^{T} x + b$$

As a consequence, the Lagrange multipliers are obtained as follows (Pan et al., 2013):

$$\alpha_k = \frac{y_k - b}{x_k^{T} x + (2\gamma)^{-1}}$$

The above linear regression equation can be written in other forms using a kernel function. In this work, we employed the radial basis function (RBF) kernel, expressed as (Hemmati-Sarapardeh et al., 2013):

$$K(x, x_k) = \exp\left(-\frac{\lVert x_k - x \rVert^{2}}{\sigma^{2}}\right)$$

where $\sigma^{2}$ is a tuning parameter for optimizing the LSSVM model. In summary, $\sigma^{2}$ and γ, the two tuning parameters of the LSSVM, are optimized during the training procedure by applying an external optimization method. In the present research, the LSSVM approach proposed by Suykens and Vandewalle (1999) and Pelckmans et al. (2002) was used.
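Because the LSSVM replaces the quadratic program of the SVM with a set of linear equations, training reduces to one linear solve for the bias and the Lagrange multipliers. The sketch below illustrates this with an RBF kernel in pure Python. It assumes the widely used dual formulation in which the regularization enters the kernel matrix as 1/γ on the diagonal (other conventions differ by a constant factor in γ); the helper names and the toy data are hypothetical, and this is not the toolbox used in the study.

```python
import math


def rbf_kernel(a, b, sigma2):
    """K(a, b) = exp(-||a - b||^2 / sigma^2)."""
    return math.exp(-sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / sigma2)


def solve(A, rhs):
    """Solve A x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(M[i][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for i in range(col + 1, n):
            factor = M[i][col] / M[col][col]
            for j in range(col, n + 1):
                M[i][j] -= factor * M[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lssvm_fit(X, y, gamma, sigma2):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(X)
    A = [[0.0] + [1.0] * n]
    for i in range(n):
        row = [1.0] + [rbf_kernel(X[i], X[j], sigma2) for j in range(n)]
        row[i + 1] += 1.0 / gamma
        A.append(row)
    solution = solve(A, [0.0] + list(y))
    return solution[0], solution[1:]       # bias b, multipliers alpha


def lssvm_predict(x, X, alpha, b, sigma2):
    return b + sum(a * rbf_kernel(x, xi, sigma2) for a, xi in zip(alpha, X))


# Toy one-dimensional data (hypothetical).
X = [[0.0], [1.0], [2.0], [3.0]]
y = [1.0, 1.5, 2.5, 4.0]
b, alpha = lssvm_fit(X, y, gamma=1e6, sigma2=1.0)
print([round(lssvm_predict(x, X, alpha, b, 1.0), 4) for x in X])
```

A large γ makes the model follow the training data closely, while a small γ favors a smoother fit; this trade-off, together with the kernel width, is what the external optimizer tunes.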

Optimization techniques
For training the MLP networks, three kinds of optimization techniques were employed, namely, Scaled Conjugate Gradient (SCG), Levenberg-Marquardt (LM), and Bayesian Regularization (BR). Also, the optimized values of LSSVM tuning factors were determined utilizing Coupled Simulated Annealing (CSA). Details on the application of these optimization techniques were described in our previous studies (Hemmati-Sarapardeh et al., 2018).

Committee machine intelligent system (CMIS)
In contrast to the ordinary procedure, in which the best model is chosen and the others are disregarded, a CMIS incorporates all optimized models into one unified framework. This can be achieved by various methods that combine the answers of the parent models. As an example, simple averaging can be used to combine the solutions linearly. In this method, all solutions contribute equally and their individual performance is not considered. In this study, we employed a weighted averaging CMIS scheme. To find the best weight for each model, as in our previously published models (Ameli et al., 2016; Arabloo et al., 2014), constrained multivariable search methods, including SLP and GRG, were applied in addition to a branch-and-bound method. Further details on these methods are available elsewhere (Ameli et al., 2016; Arabloo et al., 2014).
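The difference between simple and weighted averaging can be illustrated with a small sketch. Here the weights of three parent models are found by a brute-force grid search over non-negative weights that sum to one; this is a deliberately simple stand-in for the SLP and GRG search methods used in this study, and the parent-model predictions are synthetic.

```python
def combine(predictions, weights):
    """Weighted linear combination of the parent-model predictions for one sample."""
    return sum(w * p for w, p in zip(weights, predictions))


def mean_squared_error(weights, samples, targets):
    return sum((combine(preds, weights) - t) ** 2
               for preds, t in zip(samples, targets)) / len(targets)


def grid_search_weights(samples, targets, steps=20):
    """Try every weight triple on a grid with w1 + w2 + w3 = 1 and all w >= 0."""
    best_weights, best_error = None, float("inf")
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            weights = (i / steps, j / steps, (steps - i - j) / steps)
            error = mean_squared_error(weights, samples, targets)
            if error < best_error:
                best_weights, best_error = weights, error
    return best_weights, best_error


# Synthetic example: three parent models with different constant biases.
targets = [1.0, 2.0, 3.0]
samples = [[t + 0.1, t - 0.3, t + 0.5] for t in targets]

simple_error = mean_squared_error((1 / 3, 1 / 3, 1 / 3), samples, targets)
best_weights, best_error = grid_search_weights(samples, targets)
print(simple_error, best_weights, best_error)
```

Because the weights reflect how well each parent model performs, the weighted combination can do no worse on the fitting data than the best single model or the simple average, provided those points lie on the search grid.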

Model development
In this study, 1440 data points for seven different water-based nanofluid systems were gathered from open literature sources to develop accurate models for estimating the relative viscosity of water-based nanofluids over broad ranges of working conditions. Nanoparticle density, size, and volume fraction, temperature, and base fluid viscosity were selected as the independent parameters, and the nanofluid relative viscosity as the dependent variable. Five different artificial intelligence models were proposed: three MLP models trained with SCG, LM, and BR, an RBF neural network model, and an LSSVM optimized with CSA. For the development of these models, 80% of the data was used as the training set and the remaining 20% as the testing set to analyze the reliability and precision of the systems. The data set was split into training and testing subsets randomly, and many random distributions were investigated to prevent a local aggregation of data points in specific regions of the input parameters.
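A random 80/20 partition of the 1440 data points can be sketched as follows. The function name and the use of a seed are illustrative assumptions; trying several seeds mirrors the idea of investigating many random distributions before settling on one split.

```python
import random


def train_test_split(n_samples, train_fraction=0.8, seed=0):
    """Return shuffled index lists for the training and testing subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(train_fraction * n_samples)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = train_test_split(1440, train_fraction=0.8, seed=0)
print(len(train_idx), len(test_idx))
```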
As stated in section 3.1, an MLP structure can consist of several hidden layers. In this study, it was found that two hidden layers give the most accurate results for all of the MLP models. Besides, the best transfer functions in all of the MLP models were found to be tansig for the first hidden layer and logsig for the second hidden layer. In addition, the purelin transfer function was the best choice for the output layer. Our optimization procedures revealed that 5-20-5-1 was the best structure for all three MLP models, in which the first, second, third, and last numbers show the number of inputs, the number of nodes/neurons in the first and second hidden layers, and the number of nodes/neurons in the output layer, respectively. It must be mentioned that for each of the MLP models, different initial biases and weights were randomly assigned more than 100 times, and the biases and weights providing the most appropriate outputs were selected. This was done because the performance of the MLP models depends significantly on the values of the initial weights and biases.
The accuracy of the RBF model depends on two major factors: the spread coefficient and the maximum number of neurons. To establish a reliable RBF model, the values of these parameters should be optimized. In this study, the best values of these parameters were obtained by a trial and error procedure. Different values were assigned to these two parameters, and the optimum ones were chosen by minimizing the mean square error between the model output and the experimental data. As a result, the maximum number of neurons was found to be 450 and the spread coefficient 0.75.
Using the CSA optimization algorithm, the two important parameters of the LSSVM model, σ² and γ, were found to be 4.0089 and 2.3203 × 10^7, respectively. After developing the intelligent models, the three most accurate models (LM-MLP, BR-MLP, and RBF) were integrated into a general model with greater reliability and robustness by means of a committee machine intelligent system. It should be pointed out that a CMIS normally involves all of the developed models, whereas we considered only the three most accurate ones to make the final model less complicated.
The optimum coefficients of the CMIS model were obtained by constrained multivariable search methods of GRG and SLP. The final output of the CMIS model is formulated as follows.
where a1 to a3 are as follows: a1 = 0.4573447, a2 = 0.4600937, and a3 = 0.08225137. Table 4 summarizes some statistical error functions, including Average Relative Error (ARE), Average Absolute Relative Error (AARE), Standard Deviation (SD), Root Mean Square Error (RMSE), and coefficient of determination (R²), which are defined as follows:
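For readers who want to reproduce such statistics, the sketch below implements the five error functions under commonly used definitions: relative errors are taken as (measured − predicted)/measured, so a positive ARE indicates underestimation; SD is the root of the mean squared relative error with an N − 1 denominator; and R² uses the total sum of squares about the mean of the measured values. These definitions are an assumption, since the equations are not reproduced in this section, and the sample data are made up.

```python
import math


def relative_errors(measured, predicted):
    return [(m - p) / m for m, p in zip(measured, predicted)]


def are(measured, predicted):
    """Average relative error in percent (signed)."""
    errors = relative_errors(measured, predicted)
    return 100.0 * sum(errors) / len(errors)


def aare(measured, predicted):
    """Average absolute relative error in percent."""
    errors = relative_errors(measured, predicted)
    return 100.0 * sum(abs(e) for e in errors) / len(errors)


def sd(measured, predicted):
    """Standard deviation of the relative errors (N - 1 denominator)."""
    errors = relative_errors(measured, predicted)
    return math.sqrt(sum(e ** 2 for e in errors) / (len(errors) - 1))


def rmse(measured, predicted):
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r2(measured, predicted):
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Hypothetical relative viscosity values.
measured = [1.0, 2.0, 4.0]
predicted = [0.9, 2.2, 4.0]
print(are(measured, predicted), aare(measured, predicted),
      sd(measured, predicted), rmse(measured, predicted), r2(measured, predicted))
```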

Accuracy and validity of the models
As it is evident in Table 4, among the proposed MLP models, the MLP model optimized with BR provides more accurate results (has lowest AARE) compared to LM and SCG. Table 4 also shows that both of the developed RBF and CSA-LSSVM models provide high accuracy in predicting the relative viscosity of the data points; however, their accuracy is a little lower than the accuracy of the LM-MLP and BR-MLP. Overall, according to Table 4, it can be concluded that the BR-MLP provides the best performance, followed by the LM-MLP, RBF, CSA-LSSVM and SCG-MLP models.
As mentioned earlier, three of the most accurate models (BR-MLP, LM-MLP and RBF) were selected and combined into a single model by a committee machine intelligent system. Table 4 shows that the CMIS model with AARE values of 1.263%, 1.117%, 1.246% for the training set, testing set, and all data, respectively, is the best model for estimating the relative viscosity of water-based nanofluids.
To verify the accuracy of the CMIS model, its results over the whole range of the data were compared against numerous empirical correlations and theoretical models. Table 5 compares the statistical parameters for these models. As given in this table, the CMIS model provides the lowest ARE, AARE, RMSE, and SD, which is an indication of its reliability compared with the available models. Moreover, Table 5 demonstrates that the previously available empirical and theoretical models show considerable deviation from the experimental values. The high positive values of ARE for these models indicate that they underestimate the relative viscosity of water-based nanofluids. It is worth mentioning that, among the available models, those of Maïga et al. (2004), Meybodi et al. (2016), and Buongiorno (2006) have the highest accuracy; thus, they have been considered for further analyses.

Model                           ARE (%)   AARE (%)   RMSE   SD
Einstein (1906)                 28.41     28.42      6.12   0.37
Brinkman (1952)                 28.11     28.12      6.11   0.37
Ward (1955)                     27.95     27.96      6.11   0.36
Lundgren (1972)                 28.02     28.03      6.11   0.36
Batchelor (1977)                28.00     28.02      6.11   0.36
Thomas and Muthukumar (1991)    28.08     28.10      6.11   0.37
Maïga et al. (2004)             14

Figure 2 depicts the cross plot of the predicted relative viscosity for both the training and test sets against the relevant measured data for the proposed CMIS model and the three most accurate available models. It should be noted that a few data points have a relative viscosity above 10 and are not shown in this figure, to better visualize the performance of the models. As is evident from Figure 2, for the CMIS model, the accumulation of all of the data points near the unit-slope line is a visual indication of the high ability of the model to predict the experimental values, while the deviation of the data points from the unit-slope line for Meybodi et al. (2016), Maïga et al. (2004), and Buongiorno (2006) reveals their inaccuracy in predicting relative viscosity values.
In addition to the cross-plots, 3D error distribution plots for the proposed CMIS model are depicted in Figure 3 with respect to various input variables. In these plots, the relative error is plotted against temperature and one of the other input parameters. The figure confirms the precision of the developed CMIS, as the majority of points are located near the zero-error line and only a few points lie far from it. Also, it is evident that no point is predicted with a relative error higher than 17%.
To further verify the precision of the CMIS method, the absolute deviations between the experimental data and the values predicted by the CMIS, Maïga et al. (2004), Meybodi et al. (2016), and Buongiorno (2006) models are plotted in Figure 4. As shown in this figure, the CMIS model predicts the viscosity of more than 90% of the data points with an absolute relative error lower than 3% and about 96% of the data points with an absolute relative error lower than 5%, which indicates its strong performance in predicting the relative viscosity of water-based nanofluids. Figure 4 also shows that for the Maïga et al. (2004), Meybodi et al. (2016), and Buongiorno (2006) models, 42%, 33%, and 37% of the data points have an absolute relative error lower than 5%, respectively.
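The cumulative statistic reported above, i.e. the share of data points whose absolute relative error falls below a given threshold, can be computed with a few lines of Python. The function name and the sample values are illustrative only.

```python
def fraction_within(measured, predicted, threshold_percent):
    """Fraction of points whose absolute relative error (%) is below the threshold."""
    errors = [abs((m - p) / m) * 100.0 for m, p in zip(measured, predicted)]
    return sum(e < threshold_percent for e in errors) / len(errors)


# Hypothetical data with absolute relative errors of about 1%, 4%, 10%, and 2%.
measured = [1.0, 1.0, 1.0, 1.0]
predicted = [1.01, 1.04, 1.10, 0.98]
print(fraction_within(measured, predicted, 3.0))
print(fraction_within(measured, predicted, 5.0))
```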
A more detailed accuracy analysis was done by plotting the AARE (%) versus particle volume fraction, temperature, base fluid viscosity, and particle size for the CMIS model and the three most accurate available models. The results are shown in Figures 5-8. For plotting these figures, all of the data points were divided into four categories. The AARE (%) values were computed for each category to evaluate the validity of the models at various input parameter values. Figure 5 shows the AARE profile at four different nanoparticle volume fraction ranges for the CMIS, Maïga et al. (2004), Buongiorno (2006), and Meybodi et al. (2016) models. Figure 5 indicates that the maximum value of the AARE for the proposed CMIS model is 1.82%, which demonstrates the high precision of the CMIS model over the whole range of particle volume fraction. The most accurate results of the CMIS model (AARE = 0.45%) are achieved at nanoparticle volume fractions higher than 5%. It should be noted that the other models have reasonable performance only for nanoparticle volume fractions between 0% and 0.5%. The reliability of the previously available models decreases significantly as the nanoparticle volume fraction increases, whereas the CMIS model returns precise predictions over the entire range.

Effect of base fluid viscosity
Figure 6 illustrates the AARE of the models over the various ranges of base fluid viscosity. It is clear that the literature models have a high AARE at all ranges of base fluid viscosity. Moreover, the available models show lower accuracy at higher values of base fluid viscosity; among them, Maïga et al. (2004) gives the best predictions for base fluid viscosity higher than 1 cP, with an AARE equal to 12.02%. However, for all ranges of base fluid viscosity, the proposed CMIS predicts the viscosity of water-based nanofluids with the highest reliability. The maximum value of AARE (1.43%) for the CMIS model is in the base fluid viscosity range of more than 5 cP. The most precise predictions of the CMIS model (AARE = 1.12%) are for base fluid viscosities ranging from 0.5 cP to 0.75 cP.
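The category-wise AARE used in Figures 5-8 amounts to grouping the data points by one input variable and averaging the absolute relative error within each group. The sketch below shows one way to do this; the bin edges and the sample data are hypothetical (only the 0.5% and 5% volume fraction limits are mentioned in the text), and the function name is illustrative.

```python
def aare_by_bin(values, measured, predicted, edges):
    """AARE (%) per bin of `values`; k ascending edges define k + 1 bins."""
    sums = [0.0] * (len(edges) + 1)
    counts = [0] * (len(edges) + 1)
    for v, m, p in zip(values, measured, predicted):
        index = sum(v >= e for e in edges)     # number of edges at or below v
        sums[index] += abs((m - p) / m) * 100.0
        counts[index] += 1
    # Empty bins are reported as None rather than as a misleading zero.
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical volume fractions (%) with four bins: <0.5, 0.5-2, 2-5, >=5.
volume_fraction = [0.2, 0.8, 3.0, 6.0]
measured = [1.0, 1.0, 1.0, 1.0]
predicted = [1.01, 1.02, 1.03, 1.04]
print(aare_by_bin(volume_fraction, measured, predicted, edges=[0.5, 2.0, 5.0]))
```

The same function can be reused with temperature, base fluid viscosity, or particle size as the grouping variable.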

Effect of temperature
The average absolute relative error (AARE) at four different temperature ranges is depicted in Figure 7 for the investigated models. As this figure illustrates, the CMIS approach gives much more precise results over the whole temperature range than all three literature models. Among the available models, that of Maïga et al. (2004) has the lowest AARE, 13.55%, at temperatures above 55°C, whereas the CMIS model predicts the viscosity accurately over the whole temperature range.

Figure 8 shows the AARE for all approaches at various particle sizes. Again, the CMIS performs best of all the models and returns very accurate results. The three other models perform unacceptably, especially for small nanoparticle sizes. The lowest AARE for the CMIS model is 0.79% for nanoparticle sizes above 45 nm, while the best result among the available models is that of Maïga et al. (2004), with an AARE of 6.23% in the range of 30-45 nm.

Trend analysis of the proposed models
To examine whether the proposed model follows the correct trend with respect to the input parameters, the CMIS predictions as well as those of Maïga et al. (2004), Buongiorno (2006), and Meybodi et al. (2016) have been plotted for different nanoparticles with respect to particle volume fraction and temperature in Figure 9 and Figure 10, respectively. Figure 9 presents the relative viscosity profile versus volume fraction. Figure 9(a and b) correspond to the Al2O3-water system at a temperature of 10°C and particle sizes of 8 and 43 nm (Pastoriza-Gallego et al., 2009). Figure 9(c and d) are for the SiO2-water system at a temperature of 25°C and particle sizes of 7 and 20 nm (Jia-Fei et al., 2009). As can be seen, the CMIS model follows the trend of the experimental data and accurately predicts the relative viscosity as volume fraction and particle size vary. In contrast, the three other models deviate drastically from the experimental data as the particle volume fraction increases. Figure 10 compares the model estimates with the experimental change in relative viscosity with respect to temperature and volume fraction for two different nanofluids (Pastoriza-Gallego et al., 2009;Sundar et al., 2013). The CMIS model accurately reproduces the variations in the experimental data, whereas none of the available models follows the experimental trend. The available models, except that of Meybodi et al. (2016), consider only the particle volume fraction as an input parameter and neglect the influence of other parameters such as particle size and temperature. Therefore, they show a significant deviation from the experimental data. Although the model of Meybodi et al. (2016) considers the influence of temperature, it still displays a considerable inaccuracy in viscosity prediction, which may be attributed to the insufficient number of data points used in its development.
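The limitation noted above, that most of the available models depend on the volume fraction alone, can be made concrete with a short sketch. The coefficients below are those of the Maïga et al. (2004) polynomial as it is commonly quoted in the nanofluid literature; they are not given in this section and should be checked against the original reference. The function names and the unused arguments are hypothetical and serve only to show that temperature and particle size cannot influence the result.

```python
def maiga_relative_viscosity(phi):
    """Relative viscosity from the Maiga et al. (2004) polynomial,
    as commonly quoted: mu_r = 1 + 7.3*phi + 123*phi**2.

    phi is the particle volume fraction as a fraction (0.01 = 1 vol%).
    """
    if phi < 0:
        raise ValueError("volume fraction must be non-negative")
    return 1.0 + 7.3 * phi + 123.0 * phi ** 2


def predict(phi, temperature_c=None, particle_size_nm=None):
    """Hypothetical wrapper: temperature and particle size are accepted
    but ignored, because the correlation has no such dependence."""
    return maiga_relative_viscosity(phi)
```

For a fixed volume fraction, `predict` returns the same value at any temperature or particle size, which is why a correlation of this form cannot reproduce the experimental variations shown in Figures 9 and 10.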

Outlier detection
In order to check the authenticity of the data used as well as the usability domain of the established CMIS model, one of the well-established outlier detection approaches, the so-called Leverage approach, was employed in this study (Goodall, 1993;Gramatica, 2007;Leroy & Rousseeuw, 1987). In this algorithm, three quantities, namely the Hat matrix (H), the Leverage limit (H*), and the standardized residuals (SR), are computed using the following equations (Goodall, 1993;Gramatica, 2007;Leroy & Rousseeuw, 1987):
$$H = X (X^{t} X)^{-1} X^{t}$$
$$H^{*} = \frac{3I}{N}$$
$$SR_{i} = \frac{z_{i}}{\sqrt{MSE \, (1 - H_{ii})}}$$
where z_i, MSE, and H_ii represent the error, mean square error, and hat index of the ith data point, respectively. I is the count of inputs plus one, and N denotes the total number of data points. X is a two-dimensional N×b matrix (the dimension of the model is indicated by b), while t denotes the matrix transpose (Atashrouz et al., 2016). The results of the Leverage approach are presented in the so-called Williams plot (SR versus hat value), which makes them easy to interpret. In this plot, points with SR values higher than 3 or lower than −3 and H values lower than H* are detected as outliers and called upper or lower suspected points, respectively. Points with SR values between −3 and 3 and H values lower than H* are called valid points; they are statistically valid and predicted with an acceptable error. Moreover, good high Leverage points are those with H values higher than H* and SR values between −3 and 3, whereas bad high Leverage points have H values higher than H* and SR values outside this interval. The Williams plot obtained for the proposed CMIS model is illustrated in Figure 11. As shown, only 34 points (out of 1440) were detected as outliers, which is only 2.36% of the total databank. Also, 23 points (1.6% of the whole databank) lie in the good high Leverage zone; they fall outside the usability region of the proposed CMIS model but are predicted well. Only one point is located in the bad high Leverage zone; it is not only outside the applicability domain of the proposed model but is also predicted with a large error.
According to the obtained results, both the authenticity of the data and the usability of the CMIS model are confirmed. The main advantage of the proposed CMIS model is its accuracy and validity over a wide range of nanoparticles and operational conditions. However, applying this model to nanoparticles that were not used in this study is not recommended. Because the model is data-driven, its validity and accuracy are limited to the range over which it was developed. For other ranges and nanoparticles, the model should be used with care, and its accuracy and validity are not guaranteed.
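The Leverage calculation described in this section can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the study: the function name and label strings are hypothetical, X is assumed to hold the model inputs only (one column per input, so that I = number of columns + 1), and the MSE is taken as the plain mean of the squared errors.

```python
import numpy as np


def williams_classification(X, residuals, sr_limit=3.0):
    """Classify data points with the Leverage approach.

    X         : (N, b) array of model inputs, one row per data point
    residuals : (N,) array of errors z_i
    Returns (hat_values, leverage_limit, standardized_residuals, labels).
    """
    X = np.asarray(X, dtype=float)
    z = np.asarray(residuals, dtype=float)
    n_points, n_inputs = X.shape

    # Diagonal of H = X (X^t X)^-1 X^t, without forming the N x N matrix.
    xtx_inv = np.linalg.pinv(X.T @ X)
    hat = np.einsum("ij,jk,ik->i", X, xtx_inv, X)

    # Leverage limit H* = 3 I / N, with I = number of inputs + 1.
    h_star = 3.0 * (n_inputs + 1) / n_points

    # Standardized residuals SR_i = z_i / sqrt(MSE * (1 - H_ii)).
    mse = np.mean(z ** 2)  # assumed definition of the MSE
    sr = z / np.sqrt(mse * (1.0 - hat))

    inside = np.abs(sr) <= sr_limit   # -3 <= SR <= 3
    low_leverage = hat <= h_star      # H <= H*
    labels = np.where(
        low_leverage & inside, "valid",
        np.where(
            low_leverage, "suspected",
            np.where(inside, "good high leverage", "bad high leverage"),
        ),
    )
    return hat, h_star, sr, labels
```

Plotting `sr` against `hat`, with horizontal lines at ±3 and a vertical line at `h_star`, gives the Williams plot; counting the entries of `labels` reproduces the kind of summary reported for Figure 11.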

Summary and conclusions
In this paper, five computer-aided models were proposed for predicting the relative viscosity of water-based nanofluids, based on 1440 data points covering seven nanoparticle types. Then, a single robust model was developed through the combination of these models. The following conclusions can be drawn from this study:
(1) The proposed CMIS model predicts the viscosity of water-based nanofluids with more precision than the other available models in the literature and can be used over a broader range of operational conditions.
(2) The predictions of the available models are acceptable only at very low volume concentrations; for values greater than 0.5%, they do not return reliable results.
(3) All of the models presented in this study show high capability and accuracy in predicting the relative viscosity of water-based nanofluids, and all are in satisfactory agreement with the experimental values.
(4) According to their accuracy and performance, the proposed CMIS model provides the best performance, followed by the BR-MLP, LM-MLP, RBF, CSA-LSSVM and SCG-MLP models.
(5) The CMIS model outperforms all of the available models in the literature, with an ARE of 0.007%, AARE of 1.246%, RMSE of 0.036, SD of 0.021, and R2 of 0.9999.
(6) The proposed CMIS model follows the actual trends of viscosity with respect to variations in the input parameters.
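The idea of combining several trained models into a single committee can be sketched as a weighted linear combination of the member predictions. The combination procedure actually used for the CMIS model is not described in this section, so the ordinary least-squares fit below is an assumption, and the function names are hypothetical.

```python
import numpy as np


def fit_committee(member_predictions, targets):
    """Fit y ~ w0 + sum_k w_k * f_k by ordinary least squares.

    member_predictions : (N, K) array, column k holds the predictions
                         of member model k for the N data points
    targets            : (N,) array of experimental values
    Returns the coefficient vector [w0, w1, ..., wK].
    """
    P = np.asarray(member_predictions, dtype=float)
    y = np.asarray(targets, dtype=float)
    # Prepend a column of ones so that w0 acts as a bias term.
    A = np.column_stack([np.ones(len(y)), P])
    coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coeffs


def committee_predict(coeffs, member_predictions):
    """Evaluate the committee output for new member predictions."""
    P = np.asarray(member_predictions, dtype=float)
    return coeffs[0] + P @ coeffs[1:]
```

Because each member alone is a special case of the combination (weight 1 for that member, 0 for the others), the fitted committee cannot have a larger squared error on the fitting data than any single member; its behavior on unseen data still has to be checked separately.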

Disclosure statement
No potential conflict of interest was reported by the author(s).

Funding
The author(s) reported there is no funding associated with the work featured in this article.