Artificial neural networks for predicting the demand and price of the hybrid electric vehicle spare parts

Abstract The hybrid electric vehicles (HEVs) market has grown tremendously in the past few years, which has led to an exponential growth in the spare parts (SPs) market. There is, therefore, a strong need to predict the demand as well as the price of these SPs. However, achieving such an aim is not as easy as it may seem because (i) the demand is highly uncertain as it depends on many uncertain variables, and (ii) the price does not follow the normal value chain methods. In this research work, the artificial neural network (ANN) is utilized to develop models that can map 15 vehicle- and SPs-related variables to the demand and the price of the HEV SPs. It has been demonstrated that the ANN models have the ability to predict both the demand and the price of the HEV SPs. In addition, the developed ANN models outperform the linear regression models, reducing the root mean square error values by factors of approximately 4 and 5 for the demand and the price, respectively.


ABOUT THE AUTHOR
Wafa' H. AlAlaween is currently an Assistant Professor at The University of Jordan. She received her PhD degree from The University of Sheffield, the UK in 2018, and since then she has been teaching various courses related to artificial intelligence and deterministic and stochastic optimization. Her research interests include artificial intelligence, biologically inspired computing and optimization, and fuzzy and neuro fuzzy systems in various applications including manufacturing, pharmaceuticals as well as healthcare. She has published many research papers in reputable journals and conferences. She has been recently working on various projects. One of these projects aims at developing systems-engineering models to predict the demand and the price of the spare parts for the hybrid electric vehicles in Jordan. Researchers from different institutions collaborated to perform such a project.

PUBLIC INTEREST STATEMENT
The hybrid electric vehicles (HEVs) and their spare parts (SPs) markets have grown tremendously in the past few years. Therefore, predicting the demand and the price of these SPs is advantageous. In this research work, the artificial neural network (ANN) is, thus, utilized to develop models that can map vehicles and SPs-related variables to the demand and the price of the HEV SPs. It has been demonstrated that the ANN models can predict both the demand and the price of the HEV SPs. The developed models represent a promising development in the automotive industry, where such models can be employed to (i) accurately predict the demand and the price of the HEV SPs; (ii) support activities related to warehouse management and inventory control; (iii) support budgeting and procurement planning activities; and (iv) improve the performance of the SPs supply chain.

Introduction
Recently, hybrid electric vehicles (HEVs) have attracted a lot of interest because of their fuel economy and their contribution to preserving the environment by minimizing greenhouse emissions. In general, the HEVs are powered by a conventional internal combustion engine and an electric propulsion system (Alalawin et al., 2020). The governments of developed countries have been regarded as pioneers in the promotion of HEVs by enacting legislation to limit the gas emissions of new vehicles (Bennett et al., 2016). Jordan, like many other nations across the world, promotes the use of hybrid vehicles not just to protect the environment but also to reduce the use of fossil fuels (Alalawin et al., 2020). Therefore, the HEVs market has recently grown tremendously, which has led to an exponential growth in the HEV spare parts (SPs) market (Lorentz et al., 2011). In addition, the availability of the HEV SPs and their reasonable price play a vital role in determining the demand of the HEVs. Thus, there is a strong need for efficient and effective management of the HEV SPs by being able to predict the demand and the price. However, achieving such a target is considered to be a challenging task. This can be attributed to (i) the uncertainties in predicting the demand of the HEV SPs, as it depends on many uncertain factors; and (ii) the fact that the price does not follow the normal value chain methods, because it is estimated based on the different types of maintenance (e.g., preventive and corrective maintenance) (Au-Yong et al., 2016; Goossens & Basten, 2015; Hassan et al., 2012; Roda et al., 2014; Syntetos et al., 2009; Wang, 2012). This has, therefore, created the need to (i) systematically collect and analyze the HEV SPs data; and (ii) develop predictive modelling paradigms that can successfully predict the demand and the price of the HEV SPs.
Predicting the demand and the price of the HEV SPs can be seen in the light of (i) minimizing the logistic cost; (ii) satisfying customer demand; and (iii) reducing the inventory cost of SP companies, in particular (Eachempati et al., 2021; Gu et al., 2015; Zhu et al., 2017).
Many research studies have been devoted to analyzing and classifying the SPs demand (Van der Auweraer et al., 2019). For instance, the SP demand patterns were classified into four categories based on the demand regularity and the various quantities. Such categories are intermittent, lumpy, smooth and erratic (Van der Auweraer et al., 2019). Based on such a classification, various demand forecasting models (e.g., Croston's, bootstrapping and judgmental forecasting) have hitherto been suggested in the literature in this field (Pennings et al., 2017). In general, the forecasting approaches can be categorized into (i) qualitative approaches which, as the name indicates, are based on participant perceptions, experiences and personal judgments; and (ii) quantitative approaches, which are based on numerical calculations (Jonsson & Mattsson, 2009). In addition, the quantitative approaches can also be classified into time-based and causal-based approaches. The former depend on historical data, while the latter depend on explanatory factors (Boylan & Syntetos, 2008). In addition to the additive and multiplicative Winters' approaches, moving average, exponential smoothing as well as double exponential smoothing paradigms have, for instance, been widely utilized in the related literature (AlAlaween et al., 2021; Lindsey & Pavur, 2008). Furthermore, the Croston approach, as a powerful method, was also employed to predict the occurrence of the demand at a specific interval via the deployment of the well-known Bernoulli process (Croston, 1972; Rao et al., 1973). Such an approach was then improved and revised to consider various cases (Pennings et al., 2017; Syntetos & Boylan, 2005; Teunter et al., 2011).
Recently, Artificial Intelligence (AI) has found its way into many areas including, but not limited to, manufacturing, supply chain, energy efficiency and pharmaceutics. This can be attributed to the recent advances in computing abilities which allow companies to employ such models in their processes such as maintenance planning, budgeting, supply chain and inventory (AlAlaween et al., 2021). Moreover, hybrid machine learning algorithms including, but not limited to, the integration of particle swarm optimization and genetic algorithm with the artificial neural network (ANN), have facilitated the development of several industrial domains due to their massive scalability and reliability at a lower cost (Ahmadi & Chen, 2020; Moosavi et al., 2019). Their applications have expanded beyond the creation of power transformer protection schemes and the petroleum sector, to predicting pandemic outbreaks (Ahmadi, Soleimani, et al., 2015; Gambhir et al., 2017; Geethanjali et al., 2008). AI has, therefore, also been utilized in the automotive SPs industry. For instance, linear regression was employed to anticipate the demand and the price of the HEV SPs in Jordan (Alalawin et al., 2020). Likewise, the SPs were classified and evaluated objectively based on their degree of importance, which was a function of eight factors (Zhang et al., 2020). Furthermore, various models (e.g., the autoregressive integrated moving average (ARIMA) model and the ANN) were employed to predict the monthly demand of the SPs, where the results of such models were compared for a transitional company in order to minimize the prediction error (Vargas & Cortés, 2017). Likewise, ANN-based hybrid models (e.g., a radial basis function) were also utilized to anticipate stock prices (Chandar, 2021).
Moreover, 8-year time-series data for the logistic department of Bosch Automotive Electronics in Portugal were utilized to develop several models (e.g., support vector regression, AI models, ARIMA and random forest) to predict the automobile demand shifts (Gonçalves et al., 2021). It was shown that the accuracy of the AI models was superior in the long run when compared to that of the traditional models (e.g., ARIMA). In addition, a faster and simpler model based on machine learning was proposed in order to forecast the intermittent demand of the automotive industry (Lolli et al., 2017). Although it was not computationally expensive, the predictive performance of the proposed model was not as good as that of the well-known ANN. In addition, a hybrid two-stage model was proposed to predict the lumpy demand of SPs in order to deal with the required computational efforts of the ANN (Rosienkiewicz et al., 2017). In such a model, the simple linear regression model was used in the first stage to predict the demand. When the predictive performance was not as expected (i.e. the error was high), an ANN model was employed in the second stage.
The majority of the predictive models proposed in the related literature have focused on predicting the demand of the SPs and its various types. Therefore, there is a strong need to predict not only the demand of the SPs but also the price, as both can determine the demand of the HEVs. Thus, in this research work, a modelling paradigm based on the ANN is employed to predict both the demand and the price of the HEV SPs for the automotive industry in Jordan. Such a model is developed by mapping 15 vehicle type and SPs-related variables to the demand and the price of the HEV SPs. The rest of this paper is structured as follows: Section 2 defines the various ways employed to collect the required data from the automotive industry in Jordan. Section 3 briefly describes the background of the ANN model utilized in this research, whereas Section 4 discusses the results obtained for both the demand and the price of the HEV SPs for the automotive industry in Jordan. Finally, Section 5 concludes the whole paper and presents some future pointers to the research.

Data set
The main purpose of this research paper is to develop predictive models based on the ANN to anticipate the demand and price of the HEV SPs in Jordan. Therefore, the related data were collected from the automotive industry in Jordan. Approximately 65 different types of HEVs can be found in Jordan. However, only four types, namely, Hyundai Sonata, Toyota Camry, Toyota Prius and Ford Fusion, represent more than 66.3% of the total number of the HEVs in Jordan according to the "Drivers and Vehicles Licensing Department" (Alalawin et al., 2020). The collected data are related to the main systems of the HEVs, which are the engine, transmission, electrical, chassis and service systems. The data set was collected via (i) questionnaires (i.e. interviewer-administered and online questionnaires); and (ii) data requisition from several sources such as Jordan Free Zone Corporation, local retail stores and repair shops, and authorized HEVs spare parts websites. It is worth mentioning at this stage that the data can be divided into two categories: vehicle types and SPs-related variables. In addition to the sources used, Table 1 summarizes the various investigated variables, which represent the inputs of the ANN model developed in this research work.
It is worth emphasizing that the CR was estimated based on a conducted questionnaire, as described in (Saunders et al., 2016). The qualitative AHP classification model developed by (Li & Kuo, 2008) was employed to calculate criticality since it categorizes spare parts based on a subjective weight. In addition, the FR for each HEV SP was estimated based on an online questionnaire conducted during January 2020 in Jordan. The FR was then estimated as described in (Alalawin et al., 2020) from x, the number of failures that occurred to each spare part; y, the total number of HEVs; and Z, the total number of years users owned the vehicle. It is noticeable that some of the variables defined in Table 1 are considered to be numerical variables (e.g., NoV, VG, RC, TMC, CR, OP, FR and VP), whereas some of them are considered to be nominal (e.g., VT, CO, OR, NoU, SP, SL and RL). Both the demand and the price of the HEV SPs, as dependent variables, are considered to be numerical variables. Because of the nature of the data, both the investigated variables, as the inputs of the ANN, and the demand and the prices of the SPs, as the outputs of the ANN, were normalized.
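The normalization step is not specified in detail above. The following is a minimal sketch, assuming min-max scaling onto [0, 1], which is one common choice for ANN inputs and outputs. The function names and the sample values are illustrative assumptions and are not taken from the collected data set.

```python
def min_max_normalize(values):
    """Scale a list of numbers linearly onto [0, 1] (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def min_max_denormalize(scaled, lo, hi):
    """Map scaled predictions back to the original units."""
    return [s * (hi - lo) + lo for s in scaled]


# Illustrative values only (not from the paper's data set).
sample = [1200.0, 450.0, 6000.0, 3225.0]
scaled = min_max_normalize(sample)
restored = min_max_denormalize(scaled, min(sample), max(sample))
```

Keeping the minimum and maximum of the training data makes it possible to convert the network outputs back to demand or price units after prediction.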
Statistical correlation analysis was carried out between the vehicle types and the SPs-related variables on the one hand, and the demand and the price of the HEV SPs on the other. In general, the correlation coefficient, as a statistical tool, indicates the strength of the association between two variables. Its values range from −1 to 1, where the sign indicates a negative or a positive association. A correlation value close to 1 or −1 indicates a strong linear relationship between the two variables, whereas a value close to zero means a weak linear relationship between the two variables. Table 2 summarizes the correlation coefficient values. It is noticeable that the vehicle types and the SPs-related variables have different effects on both the demand and price of the HEV SPs. To illustrate, the correlation coefficient that represents the strength of the linear relationship between the SP variable and the demand is smaller than the one that represents the strength of the linear relationship between the same variable and the price; in other words, the former linear relationship is weaker than the latter one. It is also apparent that, for some variables, the nature of the relationship differs between the two outputs. For example, the relationship between the FR and the demand is a direct one, whereas the relationship between the FR and the price is inverse.
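The correlation analysis described above can be sketched as follows, assuming that the Pearson coefficient is the one reported in Table 2 (the text refers to linear relationships, but the exact coefficient is not named). The function name and the sample vectors are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical vectors: a direct and an inverse linear relationship.
direct = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
inverse = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
```

A positive value corresponds to a direct relationship and a negative value to an inverse one, which is how the sign difference between the FR-demand and FR-price coefficients can be read.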

The artificial neural network
Recent key developments in computational power have been significantly utilized in several areas including, but not limited to, medicine, logistics and manufacturing (AlAlaween et al., 2021). Such developments have made it possible to employ the available data in developing data-based models that can either replace or complement physical ones in case they are too complex to use or they do not exist (AlAlaween et al., 2016). The ANN, as a data-driven model, has been successfully utilized in various disciplines because of its ability to mimic the human way of thinking (Géron, 2019). In general, the ANN consists mainly of three layers, namely, an input layer, a hidden layer and an output layer, as shown in Figure 1. Each layer consists of at least one neuron. For instance, the input and the output layers, as the names indicate, consist of input neurons (i.e. the input variables of the ANN) and output neurons (i.e. the outputs of the ANN), whereas the hidden layer consists of the hidden neurons that usually represent the transfer functions used to map the input variables to the outputs (Alshafiee et al., 2019). It is worth mentioning that various transfer functions (e.g., linear and hyperbolic tangent) have been employed in the related literature (Bishop, 2006). In this research work, the sigmoid function was employed as a transfer function for all the hidden neurons. Such a function can be written as follows (Bishop, 2006):
$$f_j(x) = \frac{1}{1 + \exp\left(-\left(\sum_{i=1}^{n} w_{ij} x_i + w_{oj}\right)\right)}$$
where $f_j(x)$ represents the transfer function of the $j$th hidden neuron and $x$ is an $n$-dimensional input vector (i.e. $x = [x_1, x_2, \ldots, x_n]$). The parameters $w_{ij}$ and $w_{oj}$ represent the coefficient connecting the $i$th input neuron to the $j$th hidden neuron and the bias of the $j$th hidden neuron, respectively. It is worth mentioning that the $l$th output of the ANN (i.e. the predicted output) is commonly represented as a linear function of the transfer functions in the hidden layer.
Therefore, the $l$th output of the ANN ($y_l$) can then be written as follows:
$$y_l = \sum_{j=1}^{m} w_{jl} f_j(x) + w_{ol}$$
where $m$ denotes the number of hidden neurons, and $w_{jl}$ and $w_{ol}$ represent the coefficient connecting the $j$th hidden neuron to the $l$th output neuron and the bias, respectively. In this research work, and in order to improve the predictive performance of the ANN, a multi-input single-output (MISO) ANN is employed; that is, two separate ANN models were developed, one for the demand and one for the price of the HEV SPs.
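To make the structure described above concrete, the following is a minimal sketch of the forward pass of a MISO network with one sigmoid hidden layer and a linear output. The function and argument names, and the weight values in the usage example, are illustrative assumptions; they are not the trained coefficients of the models reported in this paper.

```python
import math


def sigmoid(a):
    """Logistic sigmoid, written to avoid overflow for large |a|."""
    if a >= 0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)


def miso_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Single-output forward pass.

    x        : list of n (normalized) inputs
    w_hidden : one weight list per hidden neuron, each of length n
    b_hidden : one bias per hidden neuron
    w_out    : one output weight per hidden neuron
    b_out    : output bias
    """
    hidden = [
        sigmoid(sum(w * x_i for w, x_i in zip(w_j, x)) + b_j)
        for w_j, b_j in zip(w_hidden, b_hidden)
    ]
    # The output is a linear combination of the hidden activations.
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical weights: with zero hidden weights every activation is 0.5.
y = miso_forward([0.2, 0.7], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0],
                 [1.0, 3.0], 0.5)
```

Two such networks, one per output, would correspond to the MISO arrangement used for the demand and the price.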
In general, the numbers of neurons in both the input and output layers are usually determined by the case under investigation. However, the number of hidden neurons needs to be selected as the one that leads to the best predictive performance in terms of the minimum error between the predicted and the target output (AlAlaween et al., 2016). It is worth emphasizing at this stage that the coefficients connecting the input neurons with the hidden neurons and the coefficients connecting the hidden neurons with the output neurons are initialized randomly. Such coefficients are then optimized to minimize the prediction error via the use of optimization algorithms. Various optimization algorithms (e.g., Gradient Descent and Scaled Conjugate Gradient) have hitherto been developed and utilized in various disciplines (Bishop, 2006). In this research work, the Levenberg-Marquardt algorithm was employed to optimize the ANN coefficients. Various predictive performance measures (e.g., mean square error) can be employed. The root mean square error (RMSE) and the coefficient of determination ($R^2$), as performance measures, were utilized in this research paper.
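The two performance measures can be sketched as follows. The RMSE follows its standard definition; for $R^2$ the sketch assumes the usual form of one minus the ratio of the residual to the total sum of squares, which may differ from the exact computation used for the reported values. The function names and sample numbers are illustrative.

```python
import math


def rmse(targets, predictions):
    """Root mean square error between targets and predictions."""
    n = len(targets)
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(targets, predictions)) / n
    )


def r_squared(targets, predictions):
    """Coefficient of determination, assumed as 1 - SS_res / SS_tot."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - p) ** 2 for t, p in zip(targets, predictions))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


# Illustrative numbers only.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.0, 2.0, 3.0, 5.0]
error = rmse(targets, predictions)
fit = r_squared(targets, predictions)
```

Because the RMSE is expressed in the units of the output while $R^2$ is scale-free, the two measures can disagree when a data subset contains unusually large target values.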

Analysis, results & discussion
The ANN models were developed, in this research work, to map the vehicle types and SPs-related variables defined in Section 2 to the demand and the price of the HEV SPs. The ANN is considered to be one of the best paradigms to model the demand and the price of the HEV SPs in this research paper because of the nature of the inputs investigated. To illustrate, the data set comprises data for different vehicle types and generations, in addition to the unique parameters of both the vehicles and their spare parts, which differ across types and generations. Therefore, the data set is not considered as a time-series data set. Moreover, the seasonality behaviour of the demand was not included in this research study. Therefore, the ANN, as a powerful interpolator, was employed in this research work. For the price of the HEV SPs, the data collected were utilized to develop a MISO ANN model. The data, which consist of approximately 7652 data points, were divided randomly into three data sets, namely, training (70%), validation (15%) and testing (15%) sets. Commonly, the training data are used so that the model can learn the relationships between its inputs and outputs and, accordingly, adjust the connecting coefficients. The validation data are usually employed to evaluate the network generalization and, as a result, to stop the training when its performance stops improving, whereas the testing data are utilized to evaluate the network by assessing its performance on data that are kept hidden during the training process (AlAlaween et al., 2021). Various numbers of hidden neurons in the range of 1 to 20 were tried. The number of hidden neurons that was finally selected was the one that led to the optimal ANN predictive performance (i.e. minimum error) measured via the RMSE value. For the price of the HEV SPs, the optimal number of hidden neurons was 14. It is worth mentioning at this stage that the training was stopped at 286 epochs (i.e. iterations), where the best validation performance was obtained, as shown in Figure 2. It is noticeable that the RMSE values improved significantly for the three data sets during the training process where the Levenberg-Marquardt algorithm was employed. Figure 3 shows the ANN predictive performance for the price of the HEV SPs for training, validation and testing, where the RMSE values for the training, validation and testing sets are 98, 112 and 123, respectively. It is noticeable that the data points fit adequately around the best fit line. Furthermore, it is apparent that the RMSE values for the validation and testing sets are relatively higher than the training RMSE value, which can indicate over-training. However, this does not seem to be the case in this research work; the difference is due to the price values of the HEV SPs in these sets. To illustrate, the proportion of data points whose values are greater than 6000 is relatively high in the testing and validation sets; thus, the RMSE value was slightly higher. However, the error values for these points are actually less than 10%. This is supported by the $R^2$ values, which are 0.96, 0.95 and 0.96 for the training, validation and testing sets, respectively.
Similarly, an ANN model was developed for the demand of the HEV SPs. The optimal number of hidden neurons was 9. The training process was stopped at 138 epochs, where the best predictive performance was obtained. The ANN predictive performance for the demand of the HEV SPs for training, validation and testing is presented in Figure 4, where the RMSE values for the training, validation and testing sets are 43, 34 and 42, respectively. It is also apparent that the data points presented in Figure 4 fit adequately around the best fit line.
Furthermore, it is noteworthy that the RMSE value for the validation set is smaller than the ones for the training and testing sets, which is due to the demand values of the HEV SPs in the validation set. It is worth mentioning that the $R^2$ values for the training, validation and testing sets are 0.97, 0.97 and 0.96, respectively. Based on these values, the overfitting phenomenon was not noticed in this research paper.
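The data partitioning and the selection of the number of hidden neurons described above can be sketched as follows. This is a hedged illustration only: `split_indices` and `select_hidden_neurons` are hypothetical helper names, the random seed is arbitrary, and `toy_validation_rmse` is a stand-in for training a network with a given number of hidden neurons and returning its validation RMSE; it does not reproduce the reported results.

```python
import random


def split_indices(n, train=0.70, val=0.15, seed=0):
    """Randomly split range(n) into training, validation and testing sets."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(round(train * n))
    n_val = int(round(val * n))
    # Whatever remains after training and validation is the testing set.
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])


def select_hidden_neurons(candidates, validation_rmse):
    """Return the candidate with the lowest validation RMSE.

    validation_rmse is a caller-supplied function (an assumption here)
    that trains a model with h hidden neurons and returns its RMSE.
    Ties resolve to the earliest candidate in the given order.
    """
    return min(candidates, key=validation_rmse)


def toy_validation_rmse(h):
    """Stand-in error curve with a minimum at 7; not real training."""
    return (h - 7) ** 2 + 1.0


train_idx, val_idx, test_idx = split_indices(7652)
best_h = select_hidden_neurons(range(1, 21), toy_validation_rmse)
```

Passing the candidates in ascending order means that, when several sizes give the same error, the smallest network is preferred.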
For comparison purposes, the performance measures of the ANN models developed in this research work for the demand and the price of the HEV SPs were compared to the ones of the linear regression models developed in (Alalawin et al., 2020). Table 3 summarizes the performance measures represented by the RMSE and $R^2$ values of the ANN and linear regression models developed for both the demand and the price of the HEV SPs. For the price of the HEV SPs, it is demonstrated that the ANN model outperformed the linear regression model, with an overall improvement of approximately 21% in $R^2$. In addition, the RMSE value of the ANN is approximately 5 times smaller than that of the linear regression. Such a significant improvement can indicate that the relationships between the 15 investigated inputs and the price of the HEV SPs can be nonlinear. It can also be noted that the linear regression model failed to represent the complex nonlinear relationships between the investigated inputs and the demand of the HEV SPs, and failed to take into account the interrelationships among the investigated inputs. This can be demonstrated by the estimated RMSE and $R^2$ values for such a model, where the $R^2$ is approximately zero.
In summary, the ANN was successfully employed to develop models that can map the variables that are related to the SPs and the types of vehicles to the price and demand of the HEV SPs. The developed models represent a promising development in the automotive industry, where such models can be employed to (i) accurately predict the demand and the price of the HEV SPs; (ii) support activities related to warehouse management and inventory control; (iii) support budgeting and procurement planning activities; and (iv) improve the performance of the SPs supply chain.

Conclusions
The demand and the price of the spare parts (SPs) of the hybrid electric vehicles (HEV) play an important role in determining the demand of the HEVs. However, predicting the demand and the price of the HEV SPs is not a trivial task because the demand is affected by many uncertain variables and the price does not follow the normal value chain methods. In this paper, the artificial neural network (ANN) was, therefore, utilized to represent both the demand and the price of the HEV SPs as functions of 15 vehicles and SPs related variables. The two developed ANN models were able to successfully predict the demand and the price of the HEV SPs. In addition, they outperformed the linear regression models with