Optimization of performance and emission of compression ignition engine fueled with propylene glycol and biodiesel–diesel blends using artificial intelligence method of ANN-GA-RSM

The present study proposes a hybrid machine learning algorithm, artificial neural network-genetic algorithm-response surface methodology (ANN-GA-RSM), to model the performance and emissions of a single-cylinder diesel engine fueled with diesel, biodiesel and a propylene glycol additive. The models are evaluated using the correlation coefficient (CC) and the root mean square error (RMSE). The best model for predicting the dependent variables is ANN-GA, with RMSE values of 0.0398, 0.0368, 0.0529, 0.0354, 0.0509 and 0.0409 and CC values of 0.988, 0.987, 0.977, 0.994, 0.984 and 0.990 for brake specific fuel consumption (BSFC), brake thermal efficiency (BTE), CO, CO2, NOx and SO2, respectively. Compared with a single RSM, the proposed hybrid model reduces BSFC, NOx, and CO by 30.82%, 21.32%, and 11.32%, respectively, and increases the engine efficiency and CO2 emission by 17.29% and 31.05%, respectively, at the optimized level of the independent variables (69% of biodiesel's oxygen content and 32% of the oxygen content of propylene glycol).


Introduction
In recent years, pollutants from internal combustion engines have become a major environmental concern. Diesel engines have been shown to endanger human health by emitting greenhouse gases (GHG) (Krzyżanowski et al., 2005). To reduce diesel engine emissions, modifications have been made to the fuel system, the combustion chamber or the engine control system (Papagiannakis et al., 2007). The use of catalysts in modern diesel vehicles is common. Much attention has recently been paid to improving diesel fuel itself. Biodiesel from vegetable oils is a suitable alternative to diesel fuel. Biodiesel is produced as an oxygenated fuel from renewable and sustainable primary sources. Blending biodiesel into diesel fuel is an effective way to reduce emissions (Barrett, 2011), because engine emissions result from incomplete combustion of the fuel, which is mainly due to an insufficient oxygen supply (Dec, 1997).
Research results show that the heating value of oxygenated fuels decreases with increasing oxygen content (Farkade & Pathre, 2012). Therefore, the brake-specific fuel consumption (BSFC) increases when oxygenated fuels are used (Chang et al., 2013). This challenge is exacerbated as the share of oxygenated additives in diesel fuel increases (Botros, 1997; Murcak et al., 2013). In contrast, a number of studies have shown that BSFC decreases with the use of oxygenated additives (Yilmaz et al., 2014).
In addition, the use of oxygenated additives can reduce the combustion temperature, because the presence of oxygen increases the cetane number of the fuel and consequently reduces the ignition delay (Fang et al., 2013). Coniglio et al. (2013) explained that oxygenated additives reduce the ignition delay, which reduces the reactivity and accordingly the combustion temperature. Imdadul et al. (2016) showed that brake thermal efficiency (BTE) increases with the use of oxygenated additives, whereas Labeckas et al. (2014) reported that increasing the amount of oxygen in the fuel reduces BTE. Yesilyurt et al. (2020) tested blends of biodiesel and pentanol as oxygenated additives with diesel fuel in a diesel engine to examine performance and emission characteristics. According to their results, pentanol as an oxygenated additive reduced the engine emissions and brought the combustion process closer to complete combustion. Choi et al. (2015) and Labeckas et al. (2014) reported that CO emission increases with increasing levels of oxygenated additives in diesel fuel because of the low cetane number of oxygenated fuels, the increased delay in combustion and the resulting incomplete combustion of the fuel. In contrast, Ilkılıç et al. (2011) and Balamurugan and Nalini (2014) indicated that oxygenated fuels reduce CO, which they attributed to enhanced oxidation by the oxygen contained in the fuel. Balamurugan and Nalini (2014) and S. Kumar et al. (2013) also attributed this reduction to the low carbon-to-hydrogen (C/H) ratio of oxygenated fuels. Abdalla and Liu (2018) and Atmanli et al. (2015) claimed that NOx emissions from diesel engines using oxygenated additives increased slightly. This can be due to the low cetane number of oxygenated fuels and, consequently, the higher temperature inside the combustion chamber, since NOx formation occurs at high temperatures.
Some researchers have also claimed that the enthalpy of evaporation of oxygenated fuels is higher, resulting in lower adiabatic flame temperatures, and concluded that using oxygenated fuels reduces the peak temperature inside the cylinder and thus NOx emission (Armas et al., 2014; How et al., 2014; C. Kumar et al., 2019). Armas et al. (2012) and Ferreira et al. (2013) claimed that HC emission with oxygenated fuels is higher than with diesel because of the high heat of evaporation of oxygenated fuels. The high heat of evaporation slows evaporation and makes the fuel-air mixture poorer, which lowers the combustion temperature inside the cylinder and leads to incomplete combustion, leaving part of the fuel unburned. Also, Armas et al. (2014) and Hebbar and Bhat (2013) indicated that low-oxygenated fuels decrease HC and PM emissions, because the oxygen contained in these fuels promotes complete combustion and reduces the amount of PM and soot. Overall, these sources show that oxygenated additives have great potential to reduce diesel engine emissions.
In recent years, various types of these additives have been introduced. However, the case for the use of oxygenated additives is still open, and research is continuing intensively. Recently, artificial neural network (ANN)-based methods have become more practical for experimental applications (Amid & Mesri Gundoshmian, 2017). Prediction of engine performance and emission characteristics is one of the promising fields for ANN-based techniques. The main reason for using ANN to predict engine behavior is the complexity of the combustion process, which makes it difficult to investigate the relations among performance factors, emission factors and engine input factors that depend on the design of the experiment. Sometimes what happens inside a process has to be ignored. ANN-based techniques act as a black box and can perform such tasks without knowledge of the underlying nature of the process (Agatonovic-Kustrin & Beresford, 2000). The present study employs a common hybrid ANN-based method, ANN-GA, to develop a model for predicting engine emission and performance variables (the dependent variables) from the oxygen content of the fuel samples (the independent variables). This model provides a platform that RSM then uses to optimize the process in a second step. Treating the oxygen content of the fuel samples as a variable in the modeling process and optimizing on oxygen content helps to reach a proper blend of fuel and additive. This is the main novelty of the study, and it is important in two respects: producing cost-effective fuel blends, and achieving a sustainable combustion process with the lowest emissions and the highest performance.
This study's main purpose is to consider the effect of the oxygen content of the propylene glycol additive and biodiesel on the performance and emission characteristics of a diesel engine for making a prediction platform and optimization using the hybrid ANN-GA-RSM method.

Experimental tests and data sets
The biodiesel used in this research was produced from waste cooking oil (WCO) in accordance with the optimized source method and has the chemical formula C18H34O2 (Jannatkhah et al., 2019). Propylene glycol with a purity of 99.8% and the chemical formula C3H8O2 was purchased from Merck Company (CAS # 57-55-6) (http://www.merckmillipore.com/INTL/en/product/12-Propanediol; Najafi et al., 2019). Pure diesel fuel # 2, C14H24, was used as the reference fuel. Some properties of propylene glycol, biodiesel, and diesel were measured according to ASTM standards and are given in Table 1. The oxygen contents of propylene glycol (OxPG) and biodiesel (OxB) were 42.1% and 11.35%, respectively, whereas diesel fuel contained no oxygen. Therefore, the percentages of oxygen contributed by propylene glycol (OxPG) and biodiesel (OxB) were used as inputs to the optimization system. The responses, or dependent variables, examined in this study were the performance variables (BSFC and BTE) and the engine emissions (CO, CO2, SO2, and NOx). Propylene glycol was blended with diesel fuel at six levels (0, 0.2, 0.4, 0.6, 0.8 and 1%) and biodiesel at four levels (5, 10, 15 and 20%), and pure diesel fuel was used as the control sample. Propylene glycol oxygen (OxPG) in the fuel blends ranged from 0 to 0.842%, while that for biodiesel (OxB) ranged from 0.556-0.27%. Experimental tests were performed using a Kirloskar single-cylinder diesel engine. The engine specifications are presented in Table 2.
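The oxygen contents quoted above can be cross-checked from the chemical formulas alone. The following is a minimal sketch, assuming standard atomic masses; the function name `oxygen_mass_percent` is hypothetical and the calculation is not taken from the original measurement procedure.

```python
# Standard atomic masses in g/mol (assumed values, not taken from the paper).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}


def oxygen_mass_percent(n_c, n_h, n_o):
    """Return the oxygen mass percentage of a compound with n_c C, n_h H and n_o O atoms."""
    oxygen = n_o * ATOMIC_MASS["O"]
    total = n_c * ATOMIC_MASS["C"] + n_h * ATOMIC_MASS["H"] + oxygen
    return 100.0 * oxygen / total


# Formulas as given in the text.
pg = oxygen_mass_percent(3, 8, 2)           # propylene glycol, C3H8O2
biodiesel = oxygen_mass_percent(18, 34, 2)  # biodiesel, C18H34O2
diesel = oxygen_mass_percent(14, 24, 0)     # diesel, C14H24

print(f"propylene glycol: {pg:.2f}%")
print(f"biodiesel: {biodiesel:.2f}%")
print(f"diesel: {diesel:.2f}%")
```

Under these assumptions the computed values fall within about 0.1 percentage points of the reported 42.1% and 11.35%, and diesel contains no oxygen.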
Fuel consumption was measured in accordance with reference. Engine emissions of CO2, CO, SO2, NOx, and O2 were measured with a KIGAZ 210 gas analyzer. The temperature of the exhaust gases was measured using a PT100 sensor. Inlet air flow was measured with an AVM-305 anemometer. Engine tests were performed at full load at a constant speed of 1500 rpm. Figure 1 shows the schematic diagram of the engine tested. Table 3 presents the specifications and accuracies of the measuring instruments.

ANN-GA method
ANN is one of the most efficient and practical intelligent approaches for modeling, clustering, prediction and signal processing (Faizollahzadeh Ardabili et al., 2018). Inspired by the biological nervous system, ANN is suited to undefined systems because it does not require specific systematic relationships. The concept was first introduced by McCulloch and Pitts (1943), and the technique has since been employed in agricultural, engineering, and industrial research. An ANN contains input, hidden and output layers. Neurons act as the connectors between layers, and the hidden layer includes sets of neurons. Figure 2 shows the architecture of the ANN developed in this study. Based on Figure 4, the developed ANN has two inputs, the oxygen contents of biodiesel and PG. The best architecture for the hidden layers was obtained by trial and error for generating six outputs (BSFC, efficiency, CO, CO2, NOx, and SO2), and the optimal architecture was 2-6-3-6. In the ANN method, each neuron generates its output value using Equation (1) from its inputs (x_j for j = 1, 2, ..., n) and the corresponding weights (w_ij for i = 1, 2, ..., n). However, ANN has disadvantages such as long training times and no guarantee of reaching a global optimum. These issues have led researchers to use optimization algorithms to compensate for the weaknesses of the ANN method. The GA, a frequently used optimizer, was employed in the present study to improve the performance of the ANN in developing a predictive model for the performance and emission characteristics of a diesel engine fueled with biodiesel and different levels of PG additive, and the performance of the resulting models was compared.
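To make the 2-6-3-6 architecture concrete, the sketch below runs one forward pass through such a network. It assumes a common form of the neuron output, y_i = f(sum_j w_ij x_j + b_i), with tanh hidden activations, linear outputs, bias terms and random weights. None of these choices is stated in the text, and the input values are hypothetical normalized oxygen contents.

```python
import math
import random

LAYER_SIZES = [2, 6, 3, 6]  # inputs, two hidden layers, outputs (as reported in the text)


def init_network(sizes, seed=0):
    """Create random weights and biases for each layer (illustrative values only)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [rng.uniform(-1, 1) for _ in range(n_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, inputs):
    """Propagate inputs layer by layer: weighted sum plus bias, then activation."""
    activations = list(inputs)
    for index, (weights, biases) in enumerate(layers):
        sums = [
            sum(w * x for w, x in zip(row, activations)) + b
            for row, b in zip(weights, biases)
        ]
        is_output_layer = index == len(layers) - 1
        # Assumption: tanh in hidden layers, identity (linear) in the output layer.
        activations = sums if is_output_layer else [math.tanh(s) for s in sums]
    return activations


network = init_network(LAYER_SIZES)
outputs = forward(network, [0.5, 0.5])  # hypothetical normalized OxB and OxPG
print(len(outputs), "outputs:", [round(v, 3) for v in outputs])
```

The six outputs stand in for BSFC, efficiency, CO, CO2, NOx and SO2; with untrained random weights their values carry no physical meaning.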
The ANN-GA method was first developed by Whitley et al. (1990). The technique applies the genetic operators of natural selection, crossover and mutation. The algorithm works as follows. First, it generates a population of n individuals. It then estimates the correlation of each individual. Next, it selects two parents from the old population according to their correlations and generates probability values that determine whether a crossover between the two parents creates a new individual. After the new individuals are formed, there are two options: stop the algorithm and return the best solution in the current population, or repeat the procedure to search for a better solution. Selection was carried out using a uniform selection technique, which avoids bias and has minimal spread. Table 4 presents the characteristics of the GA developed in this study.
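The loop described above (population, evaluation, parent selection, crossover, mutation, repeat) can be sketched as follows. This is an illustration under stated assumptions, not the authors' configuration: it evolves the three weights of a single linear neuron on hypothetical toy data, uses RMSE as the fitness, and swaps in tournament selection and elitism in place of the uniform selection technique mentioned in the text. All rates and ranges are hypothetical.

```python
import math
import random

rng = random.Random(42)

# Hypothetical toy data: target = 2*x1 - x2 + 0.5 on a 5x5 grid in [0, 1].
DATA = [((i / 4, j / 4), 2 * (i / 4) - (j / 4) + 0.5) for i in range(5) for j in range(5)]


def rmse(chromosome):
    """Fitness: RMSE of a linear neuron with weights (w1, w2) and bias b."""
    w1, w2, b = chromosome
    squared = [(t - (w1 * x1 + w2 * x2 + b)) ** 2 for (x1, x2), t in DATA]
    return math.sqrt(sum(squared) / len(squared))


def tournament(population, k=3):
    """Pick the fittest of k randomly drawn individuals (assumed selection scheme)."""
    return min(rng.sample(population, k), key=rmse)


def crossover(parent_a, parent_b, rate=0.8):
    """Single-point crossover applied with the given probability."""
    if rng.random() < rate:
        cut = rng.randint(1, len(parent_a) - 1)
        return parent_a[:cut] + parent_b[cut:]
    return list(parent_a)


def mutate(chromosome, rate=0.2, sigma=0.1):
    """Add Gaussian noise to each gene with the given probability."""
    return [g + rng.gauss(0, sigma) if rng.random() < rate else g for g in chromosome]


def run_ga(pop_size=75, generations=100):
    population = [[rng.uniform(-3, 3) for _ in range(3)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = min(population, key=rmse)
        history.append(rmse(best))
        children = [list(best)]  # elitism: carry the best individual over unchanged
        while len(children) < pop_size:
            child = crossover(tournament(population), tournament(population))
            children.append(mutate(child))
        population = children
    best = min(population, key=rmse)
    history.append(rmse(best))
    return best, history


best, history = run_ga()
print("best weights:", [round(g, 3) for g in best])
print("RMSE first/last generation:", round(history[0], 4), round(history[-1], 4))
```

Because the best individual is copied into every new generation, the best RMSE can never get worse from one generation to the next. In an ANN-GA setting the chromosome would hold all network weights instead of three.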
In the present study, the four best models among all runs were selected as the developed ANN-GA models (Table 6). GA can compensate for the disadvantages of ANN, but it is not deterministic on its own. Therefore, many researchers have employed different optimizers, among which GA is the most prominent.

Response surface methodology (RSM)
Response surface methodology (RSM) is a collection of statistical and mathematical techniques useful for developing, improving, and optimizing processes. Its most important application is in situations where several input variables affect the performance measures or characteristics of a process. These measures are called responses. The influential variables are called independent variables or factors and are determined by the researcher (Tamilvanan et al., 2020). In RSM, statistical models are developed over the range in which the factors change. These models approximate the relationship between the factors and the responses. In other words, statistical models such as Equation (2) are created to predict the response y from the factors.
The form of the function f is unknown and may be very complex (Khuri & Mukhopadhyay, 2010). In this study, the effect of two different oxygenated fuel types on a diesel engine was modeled. The independent input variables were the percentage of oxygen in the propylene glycol additive (OxPG) and the percentage of oxygen in the biodiesel (OxB), and the responses, or dependent variables, were SO2 (ppm), CO2 (vol.%), CO (%), NOx (ppm), BTE (%) and BSFC (g/kWh). So the general form of the model is as follows:

$$(\mathrm{SO_2},\ \mathrm{CO_2},\ \mathrm{CO},\ \mathrm{NO_x},\ \mathrm{BTE},\ \mathrm{BSFC}) = f(\mathrm{OxPG},\ \mathrm{OxB})$$

The model was developed using Design-Expert 8.0 software (Stat Ease Inc., Minneapolis, USA). The Box-Behnken scheme was used in the optimization. Each variable in the Box-Behnken scheme was coded at three levels, −1, 0, and +1. The ranges of the oxygen percentage in the propylene glycol additive (OxPG) and the oxygen percentage in biodiesel (OxB) are shown in Table 5.
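The coding of a factor at the levels −1, 0 and +1 is a linear map from its actual range. The sketch below shows that conversion using the OxPG range of 0 to 0.842% quoted earlier in the text; the function names are hypothetical, and the actual ranges used in the design are those of Table 5.

```python
def to_coded(actual, low, high):
    """Map an actual factor value in [low, high] to the coded scale [-1, +1]."""
    center = (high + low) / 2
    half_range = (high - low) / 2
    return (actual - center) / half_range


def to_actual(coded, low, high):
    """Inverse map from the coded scale back to the actual factor value."""
    return (high + low) / 2 + coded * (high - low) / 2


OXPG_LOW, OXPG_HIGH = 0.0, 0.842  # OxPG range in the blends, % (from the text)

for value in (OXPG_LOW, 0.421, OXPG_HIGH):
    print(value, "->", to_coded(value, OXPG_LOW, OXPG_HIGH))
```

Working in coded units puts both factors on the same scale, so the magnitudes of the fitted model coefficients can be compared directly.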

Normalization
In statistics and related applications, normalization of data can serve different purposes. Put simply, normalization adjusts values measured on different scales to a common standard scale. In other words, normalization shifts and rescales data with different scales and ranges to eliminate the influence of scale and level in the data set [98]. In the present study, the ranges of the input and output variables differed. Therefore, all parameters were normalized to a specific range to increase the accuracy of the prediction. There are different normalization methods, such as standard score normalization, Min-Max feature scaling, Student's t-statistic normalization, coefficient of variation and standardized moment normalization. In the present study, Min-Max feature scaling, the most effective and frequently used method for rescaling, was employed (Equation 3):

$$z = \frac{x - \min(x)}{\max(x) - \min(x)} \tag{3}$$

where x is the data in the measured data set scale and z is the normalized value of x in the scale of its minimum and maximum values.
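As a minimal sketch of Min-Max feature scaling, the function below rescales one variable to the range 0 to 1. The function name and the sample values are hypothetical, and the guard against a constant column is an added assumption.

```python
def min_max_normalize(values):
    """Rescale a list of numbers to [0, 1] using its own minimum and maximum."""
    lowest, highest = min(values), max(values)
    if highest == lowest:
        raise ValueError("cannot normalize a constant variable")
    return [(v - lowest) / (highest - lowest) for v in values]


# Hypothetical readings of one response variable.
readings = [2.0, 4.0, 6.0]
print(min_max_normalize(readings))
```

Each input and output variable would be normalized separately, so that variables measured in ppm, % and g/kWh all share the same scale before training.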

Evaluation criteria
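To compare the performance of the developed ANN-GA models, two frequently used metrics, the root mean square error (RMSE) and the correlation coefficient (CC), were used (Equations 4 and 5) to quantify the differences between target and predicted data (Faizollahzadeh Ardabili et al., 2019):

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(A_i - P_i\right)^2} \tag{4}$$

$$\mathrm{CC} = \frac{n\sum_{i=1}^{n} A_i P_i - \sum_{i=1}^{n} A_i \sum_{i=1}^{n} P_i}{\sqrt{\left(n\sum_{i=1}^{n} A_i^2 - \left(\sum_{i=1}^{n} A_i\right)^2\right)\left(n\sum_{i=1}^{n} P_i^2 - \left(\sum_{i=1}^{n} P_i\right)^2\right)}} \tag{5}$$

where A denotes the target values and P the predicted values for n data.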
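The two metrics can be sketched in a few lines. The code assumes that the correlation coefficient is Pearson's correlation between target and predicted values, computed here in its mean-centered form; the sample values are hypothetical.

```python
import math


def rmse(targets, predictions):
    """Root mean square error between target values A and predicted values P."""
    n = len(targets)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(targets, predictions)) / n)


def correlation_coefficient(targets, predictions):
    """Pearson correlation between targets and predictions (assumed definition of CC)."""
    n = len(targets)
    mean_a = sum(targets) / n
    mean_p = sum(predictions) / n
    covariance = sum((a - mean_a) * (p - mean_p) for a, p in zip(targets, predictions))
    spread_a = sum((a - mean_a) ** 2 for a in targets)
    spread_p = sum((p - mean_p) ** 2 for p in predictions)
    return covariance / math.sqrt(spread_a * spread_p)


# Hypothetical normalized targets and predictions.
targets = [0.10, 0.35, 0.60, 0.90]
predictions = [0.12, 0.30, 0.65, 0.88]
print("RMSE:", rmse(targets, predictions))
print("CC:", correlation_coefficient(targets, predictions))
```

A model is better when its RMSE is closer to 0 and its CC is closer to 1, which is how the networks in the results tables are ranked.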

Results and discussions
This section first discusses the results of the modeling process using ANN-GA. The ANN was trained with the GA using population sizes from 25 to 100 in steps of 25 and 70% of the total data. For each population size, training was repeated to reach the most accurate network. The RMSE was the criterion for judging the accuracy of the networks. As presented in Table 6, the best optimized network was the ANN-GA with a population size of 75 at the 106th generation, which had the highest correlation coefficient and the lowest RMSE values. The next step was to evaluate the developed networks on the remaining 30% of the data. In this step, the networks were fed the test data, and the outputs were compared with the targets using the RMSE and correlation coefficient values. Table 7 presents the results of the testing process for the ANN-GA models. As is clear from the results, in the testing stage the main competition is between model No. 3 with a population size of 75 and model No. 4 with a population size of 100. On closer examination, however, model No. 3 wins at the 106th generation because of its lower training time (Table 6) compared with model No. 4. Figure 3 plots the predicted values against the target values for each variable in the testing step of model No. 3, indicating the linearity and deviation by means of the determination coefficient. Therefore, model No. 3 was selected as the best model for the prediction phase. This network was employed in the optimization phase to develop the proposed innovative ANN-GA-RSM technique, which was compared with single RSM (without prediction by ANN-GA).
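The data handling described above, a 70/30 split and a sweep over four population sizes, can be sketched as follows. The shuffling, the seed and the number of samples are assumptions for illustration; the text does not state how the samples were assigned to the two sets.

```python
import random


def split_70_30(samples, seed=0):
    """Shuffle the samples and split them into 70% training and 30% testing data."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(0.7 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


samples = list(range(20))  # hypothetical sample indices, not the real data set size
train, test = split_70_30(samples)

# Population sizes from 25 to 100 in steps of 25, as described in the text.
population_sizes = list(range(25, 101, 25))

print(len(train), "training samples,", len(test), "testing samples")
print("population sizes:", population_sizes)
```

Each population size would then be used for a separate training run on the training set, and the resulting networks compared on the held-out testing set.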

Optimization
The optimization was performed by importing the normalized data into a novel hybrid method that merges the ANN-GA and RSM techniques. In statistics, RSM, introduced by George E. P. Box and K. B. Wilson in 1951 (Box & Wilson, 1951), is a tool for describing the relationships between several explanatory variables and one or more response variables. The technique uses a sequence of designed experiments to obtain an optimal output (or response).
In the present study, RSM was developed using Design Expert software version 7.0. Through trial and error, the quadratic process order and manual selection were chosen to model BSFC, efficiency, CO, CO2, NOx and SO2 as functions of the oxygen contents of biodiesel and PG. The optimization searched for the fuel blend condition that gives maximum BTE and CO2 and minimum BSFC, NOx, SO2 and CO, because these targets correspond to a complete combustion condition. Figure 5 presents the optimized levels of the experimental data using single RSM. As mentioned previously, however, the main aim of the present study was to develop a novel hybrid ANN-GA-RSM. Therefore, the following platform was developed, and the outputs of the hybrid model were generated and reported in Figure 6 to be compared with those of the single RSM. Figures 5 and 6 thus present the relation between the oxygen contents of PG and biodiesel and their effects on the performance and emission factors for RSM and the hybrid ANN-GA-RSM, respectively. Figure 4 shows the schematic diagram of the developed hybrid models.
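A quadratic response surface in two coded factors has the form y = b0 + b1*x1 + b2*x2 + b12*x1*x2 + b11*x1^2 + b22*x2^2. The sketch below evaluates such a surface and locates its maximum by a simple grid search. The coefficients are hypothetical, chosen so that the optimum is known in advance, and the grid search is a stand-in for the optimizer built into Design Expert, whose internal procedure is not described in the text.

```python
def quadratic_response(x1, x2, coef):
    """Second-order response surface in two coded factors."""
    b0, b1, b2, b12, b11, b22 = coef
    return b0 + b1 * x1 + b2 * x2 + b12 * x1 * x2 + b11 * x1 ** 2 + b22 * x2 ** 2


# Hypothetical coefficients, equivalent to y = 10 - (x1 - 0.5)^2 - (x2 + 0.25)^2.
COEF = (9.6875, 1.0, -0.5, 0.0, -1.0, -1.0)

# Coded factor levels from -1 to +1 in steps of 0.25.
grid = [-1 + 0.25 * k for k in range(9)]

# Evaluate every grid point and keep the one with the largest response.
best_value, best_x1, best_x2 = max(
    (quadratic_response(a, b, COEF), a, b) for a in grid for b in grid
)
print("maximum response", best_value, "at x1 =", best_x1, "x2 =", best_x2)
```

A response to be minimized, such as BSFC or NOx, would use `min` instead of `max`. Optimizing several responses at once requires combining them into a single objective, which this sketch does not attempt.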
As is clear, the oxygen content of biodiesel affects the variation of the parameters more than the oxygen content of PG does. This can be due to the lower portion of PG in the fuel samples compared with biodiesel. Also, the optimum condition for the parameters lies in the middle range of biodiesel oxygen content. This is consistent with the findings of Ali et al. (2015) on optimizing the performance and emission characteristics of a diesel engine fueled with biodiesel. According to their claims, the middle range of biodiesel content can improve viscosity in the presence of an oxygenated additive and accordingly improve the performance and emission characteristics of the diesel engine. A similar finding was reported by Ramakrishnan et al. (2018) for pentanol and biodiesel as oxygenated additives for diesel fuel. Their results indicated a significant improvement in brake power and BSFC as well as in engine emissions, which can be due to the improvement of the combustion process by the presence of oxygen. However, increasing the oxygen content above a specific value can reduce the thermal efficiency and the engine's performance. This phenomenon was also reported in a study that claimed the maximum available energy is reached in the middle range of biodiesel portions in diesel fuel as an oxygenated additive. For an exact comparison between the single RSM and hybrid ANN-GA-RSM responses, Table 8 was prepared from Figures 5 and 6 to show the optimized responses and the optimization capability of the proposed method in comparison with single RSM. As is clear from Figure 5, the optimized condition was obtained at 69% of biodiesel's oxygen content and 32% of PG's oxygen content.
To compare the methods at the same point, the ANN-GA-RSM method was evaluated at the single-RSM optimum, and Figure 6 was prepared at 69% of the oxygen content of biodiesel and 32% of the oxygen content of PG, the same condition as for the single RSM.
As is clear from the results, the developed models coped successfully with the modeling and optimization tasks and provided very good results. Using the hybrid method improved the optimization efficiency of the system compared with the single RSM (Table 8). The optimization objectives were to reduce BSFC, increase efficiency, reduce CO emission, increase CO2 emission (to approach complete combustion), reduce NOx emission, and reduce SO2 emission. As is clear from Table 8, the proposed ANN-GA-RSM improved the condition by reducing BSFC by 30.82%, CO emission by 21.32%, NOx emission by 11.32% and SO2 emission by 41.7% and by increasing efficiency by 17.29% and CO2 emission by 31.05% in comparison with single RSM.
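The percentage improvements are relative changes of the hybrid optimum with respect to the single-RSM optimum. The sketch below shows the calculation and its sign convention, assuming the single RSM value is the baseline. The numbers are hypothetical and are not the values of Table 8.

```python
def percent_change(baseline, new):
    """Relative change of `new` with respect to `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


# Hypothetical optimized responses (not taken from Table 8).
rsm_bsfc, hybrid_bsfc = 200.0, 150.0  # g/kWh
rsm_bte, hybrid_bte = 20.0, 25.0      # %

print("BSFC change:", percent_change(rsm_bsfc, hybrid_bsfc), "%")
print("BTE change:", percent_change(rsm_bte, hybrid_bte), "%")
```

A negative change corresponds to a reduction, so a change of −25% is reported as a reduction of 25%, not as a reduction of −25%.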

Conclusion
The present study considered the oxygen contents of biodiesel and propylene glycol in order to manage the performance and emission characteristics of a single-cylinder CI engine using an innovative ANN-GA-RSM technique. The results can help researchers and policymakers who use and manage propylene glycol additive to improve the performance and emissions of diesel engines, and they can provide a perspective for related studies that employ other optimizers, machine learning techniques and additive types. The prediction model used the oxygen contents of biodiesel and propylene glycol (two independent variables) to estimate BSFC, engine efficiency and CO, CO2, SO2, and NOx emissions (the dependent variables). According to the results, ANN-GA with a population size of 75 provided the highest prediction performance. Therefore, this model was employed in the optimization process, where it reduced BSFC, NOx, and CO by 30.82, 21.32, and 11.32%, respectively, and increased the engine efficiency and CO2 emission by 17.29 and 31.05%, respectively, compared with a single RSM at the optimized level of the independent variables (69% of biodiesel's oxygen content and 32% of the oxygen content of propylene glycol). Our future perspective is to develop a hardware management device for the engine setup using hybrid machine learning techniques.