Using machine learning for cutting tool condition monitoring and prediction during machining of tungsten

ABSTRACT Machining of single-phase tungsten, used as a plasma facing material in fusion energy reactors, is commonly associated with rapid tool wear and short tool life. Conventional methods of monitoring tool wear or changing cutting tools after a predetermined period are inefficient and can lead to unnecessary tool changes or risk damaging the workpiece. Tool wear can adversely affect the surface finish and dimensional tolerances of machined parts. Predicting its onset can avoid this critical damage whilst ensuring maximum tool life is utilised. In this paper, firstly, tool life results in end milling single-phase tungsten using different cutting tool geometries and cutting speeds are provided for the first time. A novel method is then proposed by combining sensor signal prediction and classification machine learning models. It works by forecasting the cutting tool bending moment signal, which is then used to predict the future cutting tool condition in end milling of pure dense tungsten. A series of machining experiments, covering the whole life of a cutting tool, were performed to collect the sensor signals. The current time series signal from the sensory tool holder is employed to forecast the future signal by training a 1D convolutional neural network (1D CNN) and an artificial neural network (ANN). The forecasted signal is then used to predict the state of the cutting tool in the future. Machine learning classifiers, namely random forest (RF), support vector machine (SVM) and extreme gradient boosting (XGBoost) supervised learning models, were trained and validated on actual sensor signals to correlate the tool conditions with specific sensor signal features. The investigations revealed that the 1D CNN performed best in forecasting the time series sensor signal, achieving a mean absolute error of 3.37.
In addition, the RF, when trained on Wavelet Scattering features, resulted in the most accurate classification of sensor signals for tool condition detection. The analysis showed that the combination of 1D CNN signal forecasting, feature extraction through statistical analyses and the RF classifier performs best in predicting the state of a cutting tool in the near future. Using this method allows decisions on changing the tool to be made whilst ensuring that the maximum useful life of a cutting tool is utilised. It also helps prevent undesired damage to the machined surface caused by late detection of tool wear or delays in taking appropriate action. The application of this method can reliably reduce the manufacturing costs and resource consumption associated with cutting tools for machining tungsten and minimise tool wear-induced damage to the workpiece.


Introduction
The increasing demand for clean energy, together with the requirements for operational safety and energy security, has been a driving factor in the development of fusion energy production. Fusion energy production addresses many of the issues associated with operating fission energy reactors. However, due to high costs and technical limitations, the deployment of fusion energy reactors remains limited to research (Tikhonchuk 2020). Other constraints in the reactor interior, including high operational temperatures, enormous thermal shock from repeated plasma strikes and prolonged exposure to irradiation damage from neutron bombardment, preclude the use of the majority of conventional materials (Haag et al. 2023). Materials selected for this harsh environment must endure in the long term. Owing to their high-temperature properties, refractory metals, and specifically tungsten, are used as plasma facing material components in these reactors (Ueda et al. 2014).
In order to scale up the operation of fusion reactors, tungsten parts need to be machined to specific geometries. However, owing to the properties of tungsten, such as its brittleness and high strength and hardness, its machining is mainly characterised by short tool life (Edstrom et al. 1980). This in turn requires frequent tool changes, which can be very inefficient and uneconomical in terms of increased production time and manufacturing costs. Olsson et al. (2021) investigated different tool materials, namely ceramics, coated and uncoated tungsten carbide, cermet, PCD and PcBN, in turning single-phase tungsten. They found that all tools apart from PCD and PVD-coated tungsten carbide tools failed within a few seconds of machining. They noted poor surface quality and subsurface damage due to cracking, built-up-edge formation and surface cracking. A thorough review of the machining and processing of tungsten has indicated that there is a lack of knowledge on the machining performance of tungsten and its alloys (Omole et al. 2022). Specifically, there is no report on the machining performance of tungsten in intermittent cutting processes such as milling. In machining critical and high-value products, such as tungsten components used in fusion reactors, conservative measures are taken to prevent costly damage to the parts. This means that cutting tools are often discarded prior to reaching the end-of-life criterion, leading to high manufacturing and environmental costs. By implementing automated tool condition monitoring (TCM), this conservative approach can be avoided and cutting tool utilisation can be maximised whilst preventing damage to the parts.
Research on TCM spans many decades, covering the implementation of various sensor systems and the identification of specific features in sensor signals that may relate to certain incidents during machining, such as tool failure, wear or chipping (Byrne et al. 1995). Dynamometers, accelerometers and acoustic emission sensors have been used to collect data during machining, which are further analysed for TCM. Aitor et al. (2022) compared various sensor signals for TCM in drilling Inconel 718 and concluded that cutting forces are the best predictors for TCM. Boud and Gindy (2008) reported that cutting forces, sound pressure and machine table displacement signals were most sensitive to tool wear. In order to reduce the complexity of dealing with the large amounts of data collected from sensors during machining, various dimensionality reduction and feature extraction methods have been used. Wang et al. (2017) investigated a variety of dimensionality reduction techniques and their impact on predictive performance. The techniques included kernel principal component analysis, locally linear embedding, isometric feature mapping (ISOMAP) and minimum redundancy maximum relevance. Kernel principal component analysis performed best in terms of sensing accuracy. Kong et al. (2018) used an integrated radial basis function-based kernel principal component analysis to extract features from multi-domain signals to predict tool wear using a Gaussian process regression model. The effectiveness of the proposed method was attributed to its ability to remove the negative effects of noise in the signals. Benkedjouh et al. (2015) applied expectation-maximisation principal component analysis and ISOMAP reduction techniques to fit a nonlinear regression model using support vector regression (SVR). Although the authors did not compare the two methods to determine which performed better, the proposed approach was found to be suitable for assessing the wear evolution of cutting tools and predicting their remaining useful life.
Using machine learning methods instead of conventional statistical methods and feature detection has gained popularity in recent years (Serin et al. 2020). Machine learning methods can overcome the issues of handling and processing large volumes of sensor data. Shankar et al. (2019) trained an ANN using cutting force and sound pressure signals to detect tool condition in machining of 7075-T6 aluminium and reported successful detection of tools with flank wear in excess of 300 µm. Cho et al. (2010) extracted a range of domain-specific features from multiple sensor signals, i.e. cutting forces, vibrations, acoustic emissions and spindle power. These were combined to classify tool wear using a multilayer perceptron, a radial basis function network and a support vector machine. Hassan et al. (2021) proposed using a Wavelet Scattering convolutional neural network to extract stable representations of signals in the time-frequency domain and classify them for TCM. The authors reported 100% accuracy in detecting tool failure in machining Al7075-T6 and Ti6Al4V alloys. This network configuration produces data representations that minimise intra-class variability whilst preserving inter-class discriminability. Wu et al. (2017) extracted statistical features from cutting force, vibration and acoustic emission signals in a multi-sensor fusion approach to predict tool wear using artificial neural networks, SVR and random forests. Wang et al. (2017) performed multi-domain analyses on multi-sensory cutting force and vibration signals to improve the performance of an SVR model. The literature indicates that the two prominent sensor signals used for TCM are cutting forces from a dynamometer and acoustic emissions. Dynamometers, whilst effective, are expensive and restrict the size of the workpiece that can be machined. Acoustic emission sensors generate large volumes of data even for small cuts due to their very high frequency band, making data processing computationally expensive. Therefore, both sensors are only suitable for laboratory use.
The methods in the aforementioned studies have been utilised to estimate the current tool condition with reasonable effectiveness. However, tool wear progression tends to be nonlinear and can have detrimental effects, especially as the coating layer starts to degrade. Whilst monitoring the current tool condition is advantageous, the capability to also anticipate future tool conditions is vital. Sun et al. (2020) proposed an approach to TCM using an LSTM and a residual CNN to enable early detection of tool condition and support decision-making for tool changing. The LSTM was used to forecast tool wear values whilst the CNN was built for current tool condition monitoring. Cheng et al. (2022) developed a method which uses a parallel CNN followed by a bidirectional LSTM network for TCM. A multi-step approach was proposed in which a dense residual neural network was subsequently used to predict tool wear into the near future. Wang et al. (2019) developed a deep heterogeneous model, comprising bidirectional and unidirectional gated recurrent units (GRU), to predict future tool wear with reasonable performance. Hall et al. (2022) presented a framework for forecasting sensor signals for TCM. They implemented a deep learning method using convolutional long short-term memory to forecast sensor signals based on future-frame predictions of scalogram representations of raw acceleration signals.
The monitoring of the current tool condition has been the focus of most studies in the literature, with limited research on the capability to predict the future condition. In addition, most studies have been conducted using multiple external sensors, such as dynamometers, accelerometers and acoustic emission sensors, which may not be suitable for industrial adoption. Whilst the fusion of multiple sensors provides more insight into specific events during machining for scientific research, it does not necessarily enhance the ability to monitor tool condition. Aitor et al. (2022) suggested that cutting forces provide sufficient information for TCM. However, most studies rely on a dynamometer for TCM, which is not practical for industrial application.
In the absence of specialist tools for machining tungsten, this paper reports, for the first time, tool life results from end milling of single-phase dense tungsten used in fusion energy production across a range of rake angles and cutting speeds. A new method is presented for independently detecting the current condition and predicting the future condition of an end milling tool used for machining tungsten, based on the tool bending moment signal from a sensory tool holder. A number of machine learning models were trained and tested for predicting the future time series bending moment signal as machining, and hence tool wear, progresses. Machine learning classifiers are used to classify the current and forecasted bending moment signals for detecting the current tool condition and predicting the future tool condition, respectively. The variation of cutting tool geometry and cutting speed enhances the robustness of the models. The capability to monitor and predict tool condition helps to maximise cutting tool utilisation whilst minimising tool wear-induced damage to the workpiece.
Following this introduction, Section 2 provides the research methodology used in this study, including the experimental procedure for machining data collection as well as the frameworks of the different machine learning models used in this paper. Section 3 presents and discusses the experimental results, followed by the findings from training and testing the various algorithms for future bending moment signal prediction, current tool condition monitoring and future tool condition prediction. Finally, Section 4 presents the conclusions and possible future work arising from the findings of this paper.

Methodology
This section presents the experimental and theoretical methodologies for data collection and processing for tool condition monitoring and prediction in end milling tungsten. The experimental setup used to acquire the sensor signals during an end milling operation is first described. The collected data was then pre-processed for training and testing the machine learning models.
The proposed TCM method is illustrated in Figure 1 and includes the bending moment signal forecasting, current tool condition detection and future tool condition prediction stages. Here, current tool condition detection and future tool condition prediction are two separate tasks using the same trained classifiers. The signal forecast stage includes a multi-step forecast of the sensor signals using neural networks, i.e. a 1D CNN and an ANN. This stage comprises the following steps: data acquisition, signal preparation and pre-processing, training and validation, and forecasting. In the current tool condition detection stage, the classifiers, i.e. RF, SVM and XGBoost, were trained and validated using features extracted from the data. In the future tool condition prediction stage, features were extracted from the forecasted signals (from the signal forecast stage) and input into the already-trained classifier models to predict the future tool state.

Experimental setup for data collection
The machining experiments involved monitoring the tool flank wear and tool life in end milling of a 99.99% pure tungsten workpiece. These experiments were conducted on an XYZ vertical milling centre with a 13 kW spindle in dry conditions. A two-flute, 12 mm diameter solid carbide end mill with a Balzers Balinit Latuma PVD AlTiN coating and a 34° helix angle was used for each experiment. Four different rake angles of 8°, 10°, 12° and 14° were tested. In addition, two cutting speeds of 40 m/min and 60 m/min were used. Table 1 shows the cutting parameters used in this study. A full combination of these parameters, with at least one repeat, resulted in 306 cutting passes. The experimental setup is illustrated in Figure 2, with the Spike sensory tool holder, cutting tool and tungsten block shown. The Spike sensory tool holder wirelessly measures the bending moments in the x and y directions in the cutting tool's coordinate system at a 2.5 kHz sampling rate. The cutting tool bending moment signals in the x and y directions in each machining pass were used for training and validation of the machine learning models. Each cutting pass was performed in the longitudinal direction of the workpiece along the 40 mm length, as shown in Figure 2. The flank wear was measured routinely using a digital microscope following the ISO 8688-2 criterion for localised wear on any individual tooth.

Sensor signal forecasting
There are various methods of forecasting time series data, including statistical methods such as the autoregressive integrated moving average (ARIMA) and recurrent neural network-based algorithms, including the LSTM and GRU, which can capture long-range dependencies in time series data. Despite the prevalence of recurrent neural network-based algorithms, CNNs are also suitable for such tasks due to their ability to detect and preserve useful patterns in sequential data (Babu, Zhao, and Xiao-Li 2016). The relative simplicity of the CNN, as well as the ANN, is an advantage in time series forecasting in terms of reduced computational complexity. In this study, the widely used 1D CNN and ANN have been employed to forecast time-domain sensor signals in a supervised learning task. The structures of the networks are illustrated in Figure 3. The 1D CNN can detect patterns across a sequence by convolving the data with kernels. In this way, the network (Figure 3(a)) can learn to preserve useful information whilst ignoring unimportant details (Jin, Cruz, and Goncalves 2020). ANNs have been shown to be effective for predictive tasks due to their excellent ability to capture and map the complex relationships between input features and outputs. The network architecture is characterised by the number of layers and the number of neurons in each layer, as illustrated in Figure 3(b). The network learns through backpropagation, which involves updating the weight and bias parameters in each layer (Zheng et al. 2018). The TensorFlow framework was used for the implementation of both algorithms in this study (Abadi et al. 2016). Unlike the more complex recurrent neural network-based LSTM and GRU algorithms, the 1D CNN and ANN were chosen for their computational simplicity, which supports the aim of this study to develop a relatively straightforward method for tool condition monitoring.
Prior to training the learning algorithms and forecasting, the data was pre-processed. Two signals were collected for the tool bending moment in the x and y directions, and the resultant bending moment was calculated as their Euclidean norm, as shown in Figure 4. The raw signal from the sensors includes non-cutting data before the tool engages with the workpiece and after it disengages at the end of a cutting pass. When the tool engages with the workpiece, there is a sudden increase in the bending moment of the tool. A thresholding method was therefore used to detect the moments at which the tool engages and disengages with the workpiece, and the resultant bending moment signal was trimmed accordingly, as depicted in Figure 4. This enables the training to focus on the sensor signal during material cutting. This thresholding approach was applied consistently throughout this paper.
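As a concrete illustration, the resultant computation and engagement-based trimming described above can be sketched as follows; the threshold value used here is purely hypothetical, as the paper does not state its actual level:

```python
import numpy as np

def trim_cutting_signal(mx, my, threshold=5.0):
    """Trim non-cutting portions from bending moment signals.

    mx, my: bending moment samples in x and y; `threshold` is a
    hypothetical engagement level (not taken from the paper).
    """
    # Resultant bending moment via the Euclidean norm
    resultant = np.sqrt(np.asarray(mx, float) ** 2 + np.asarray(my, float) ** 2)
    # Indices where the tool is engaged with the workpiece
    engaged = np.where(resultant > threshold)[0]
    if engaged.size == 0:
        return resultant[:0]  # no cutting detected
    # Keep only the span between first engagement and last disengagement
    return resultant[engaged[0] : engaged[-1] + 1]
```

Applied to a signal with leading and trailing near-zero (air-cut) samples, only the material-cutting span between the first and last threshold crossings is retained.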
For signal forecasting, the sensor signals collected from the experiments described in Section 2.1 were split into 70% training, 15% validation and 15% testing. The sensor signals from previous machining passes were used to train the networks and forecast the sensor signal for the next pass. The signals were accumulated over the cutting passes and pre-processed in the same way, i.e. the signals in pass 1 were pre-processed and used to train the neural networks to forecast the signals in pass 2; the signals from passes 1 and 2 were merged, pre-processed and used for training to forecast the signals in pass 3; the signals from passes 1, 2 and 3 were merged to forecast the pass 4 signals; and so on up to the final pass, as depicted in Figure 5. The pre-processing step specific to signal forecasting involved the creation of a variable windowed dataset using TensorFlow, such that the input features are the data points within a predefined window and the next data point in the series is the output. The window sizes were 100, 200, 300 and 400 data points, representing 0.04 s, 0.08 s, 0.12 s and 0.16 s of machining time, respectively. These sizes were chosen to test the effect of the input feature length and the stability of the algorithms in making predictions. Furthermore, each window was shifted by one step at a time across the entire signal to collate input features and output values, which were stacked vertically. This shifting was repeated until there were not enough data points left to fill the window.
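The windowing step above can be sketched in plain NumPy (the study used TensorFlow's windowed dataset utilities; `make_windowed_dataset` is an illustrative name, not the paper's code):

```python
import numpy as np

def make_windowed_dataset(signal, window=100):
    """Build (features, target) pairs by sliding a window one step at a
    time: each window of `window` points predicts the next point."""
    signal = np.asarray(signal, dtype=float)
    X, y = [], []
    for start in range(len(signal) - window):
        X.append(signal[start : start + window])     # input features
        y.append(signal[start + window])             # next data point
    return np.stack(X), np.array(y)                  # stacked vertically
```

For a signal of length N this yields N − window training pairs, one per single-step shift of the window.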
Training during the forecast stage was performed on the data available up to the current cutting pass. The forecast follows a multi-step sequential approach in which the last data points of the accumulated signal up to the current cutting pass, with a length equal to the window size, were fed into each trained model to predict the next value. The window was then shifted by one step to include the newly forecasted value. The new windowed data points were used to make another prediction, and this was repeated until a sequence of data points with a length equal to the window size had been forecasted. In this way, the cutting pass signals were forecasted, starting from the second cutting pass up to the last, as illustrated in Figure 5.
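The recursive multi-step procedure can be sketched as below, with a stand-in callable in place of the trained 1D CNN or ANN:

```python
import numpy as np

def multi_step_forecast(model_predict, seed_window, n_steps):
    """Recursive multi-step forecast: predict one point, append it to
    the window, drop the oldest point, and repeat for n_steps.

    model_predict: any callable mapping a 1-D window to the next value
    (stands in for the trained network).
    """
    window = list(seed_window)
    forecast = []
    for _ in range(n_steps):
        next_value = model_predict(np.array(window))
        forecast.append(next_value)
        window = window[1:] + [next_value]  # shift window by one step
    return np.array(forecast)
```

For example, a toy "model" that returns the last value plus one, seeded with [1, 2, 3], forecasts [4, 5, 6, 7] over four steps.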

Sensor signal classification
Supervised learning with classifiers can learn specific features in a sensor signal and associate the signal with a specific class. There are many different classifier models, such as logistic regression, decision trees, K-nearest neighbours, naïve Bayes, SVM, RF, XGBoost and even ANNs and CNNs, with different levels of complexity, training data and computational requirements. In this study, the RF, SVM and XGBoost were selected due to their relative simplicity, reliability and ease of computation. Figure 6 shows the frameworks of these classifiers. The RF and XGBoost were specifically chosen for the classification tasks in this study due to their ensemble learning method of combining multiple predictors.
The RF is an ensemble learning method consisting of several decision trees, with each predictor tree trained on a random subset of the dataset. The core idea is that performance improves when multiple predictors are combined, compared with a single-predictor algorithm (Breiman 2001). In training the RF, different subsets of the data are generated through bootstrap aggregating, where sampling is performed with replacement (Sheykhmousa et al. 2020). The RF algorithm also introduces extra randomness when growing the trees by selecting the best feature among a random subset of features; this creates greater tree diversity, which helps reduce variance and avoid overfitting (Wu et al. 2017). The RF was chosen in particular because of its versatility in handling different tasks with few pre-processing requirements and efficient computation. It is also robust to outliers and non-linear data.
The SVM takes an entirely different approach, its main motivation being to separate a dataset into classes with a surface that maximises the margin between them. The algorithm does this by transforming the original feature space into a higher-dimensional space such that an optimal hyperplane, which maximises the separation distances among the classes, can be determined. It is one of the most widely used algorithms for classification tasks as it can handle very large feature spaces with acceptable generalisation capability (Cervantes et al. 2020). In this study, the SVM was used to separate the tool conditions into different classes through a kernel function chosen by grid search optimisation using the Scikit-learn library (Pedregosa et al. 2011). Aside from being effective in classification tasks, the algorithm is also memory efficient since it only uses a subset of the training data as support vectors; however, scaling of the input features is required, which was done in this study through standardisation of the data.
Like the RF, a boosting algorithm is an ensemble learning method, combining several weak predictors into a strong learner. The main idea is to train predictors sequentially, with each predictor trying to correct its predecessor by fitting itself to the residual errors of the previous predictor (Gao et al. 2019). The XGBoost overcomes the speed limitations of conventional boosting by parallelising parts of the algorithm, making training faster. Unlike the RF, the XGBoost inherently introduces a regularisation term to improve generalisation and prevent overfitting (Chen and Guestrin 2016). The implementation of all classifiers, including hyperparameter tuning and optimisation, was carried out using the Scikit-learn library (Pedregosa et al. 2011).

Feature extraction
Machine learning classifiers learn specific features from data which can be used for classifying the signals. There are various methods of extracting features from time series signals. These include statistical analyses in the original time domain and transformation of the signals into the time-frequency domain using methods such as the continuous wavelet transform (CWT) and wavelet scattering (WS).

i-Statistical parameters. Statistical parameters, including the root mean square (RMS), were computed from the original cutting tool bending moment signal, $x_i$, as defined in Table 2. Furthermore, $\varphi_{x1}$, $\varphi_{x2}$, $\varphi_{y1}$ and $\varphi_{y2}$ were extracted as features representing the phase differences in the signals, as a function of the spindle speed, for tooth 1 in the x-direction, tooth 2 in the x-direction, tooth 1 in the y-direction and tooth 2 in the y-direction, respectively. The features were extracted from the varied window sizes, with the rake angle and cutting speed also encoded as features using one-hot encoding. Standardisation was used to rescale each feature to zero mean and unit variance.
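A minimal sketch of time-domain feature extraction for one signal window is given below; only representative statistics are shown, since Table 2 and the phase-difference features are not reproduced here:

```python
import numpy as np

def extract_statistical_features(window):
    """Example time-domain features for one signal window. The exact
    feature set of Table 2 is not reproduced; these are representative
    statistics, including the RMS named in the text."""
    w = np.asarray(window, dtype=float)
    return {
        "rms": float(np.sqrt(np.mean(w ** 2))),   # root mean square
        "mean": float(np.mean(w)),
        "std": float(np.std(w)),
        "peak": float(np.max(np.abs(w))),
        "median": float(np.median(w)),
    }
```

Each windowed segment of the bending moment signal would yield one such feature vector, to which encoded cutting parameters can be appended before standardisation.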
Here, $N_{spindle}$ denotes the spindle speed in revolutions per minute, used in computing the phase-difference features.

ii-Continuous wavelet transform. The CWT compares the signal to a shifted and compressed or stretched wavelet to determine the transform coefficients for time-frequency localisation. For a signal $f(t)$, wavelet analysing function $\psi(t)$, position parameter $u$ and scale parameter $s\,(>0)$, the CWT is defined as (Zhu, San Wong, and Soon Hong 2009):

$$CWT(u, s) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} f(t)\, \psi^{*}\!\left(\frac{t - u}{s}\right) dt$$

The CWT was implemented using PyWavelets (Lee et al. 2019), an open-source wavelet transform package for Python. The Morlet wavelet with scales ranging from 1 to 100 was used to compute the coefficients. The first two principal components were extracted which, taken together across the dataset, account for an average explained variance of 96%. These two components were then flattened and stacked across the set to train the classifier algorithms with the tool conditions as labels.
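The principal component reduction applied to the coefficient matrix can be sketched with an SVD, as below. This is only the reduction step (the CWT coefficients themselves were computed with PyWavelets and are not reproduced here), and the matrix orientation, with time samples as rows, is an assumption:

```python
import numpy as np

def first_principal_components(coeffs, n_components=2):
    """Project a (samples x features) coefficient matrix onto its first
    principal components via SVD of the column-centred matrix.
    Returns the component scores and the explained variance ratio."""
    C = np.asarray(coeffs, dtype=float)
    C = C - C.mean(axis=0, keepdims=True)       # centre each column
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    explained = float((s[:n_components] ** 2).sum() / (s ** 2).sum())
    return scores, explained
```

For data that is effectively two-dimensional, the first two components recover almost all of the variance, mirroring the 96% average reported for the CWT features.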
iii-Wavelet scattering. The WS transform provides stable representations of the acquired signals in the time-frequency domain such that low-variance features, which are insensitive to translations of the inputs, are extracted. Features are generated iteratively, where the output of one stage is fed as input to the next. The three operations performed in each stage are: convolution, where the input signal is transformed by each wavelet filter; a nonlinearity, obtained by taking the modulus of the filtered outputs; and averaging of each modulus with a scaling filter to generate the scattering coefficients. For the first stage, these operations can be represented mathematically as (Hassan, Sadek, and Attia 2021):

$$S_1 x(t, \lambda_1) = \left| x * \psi_{\lambda_1} \right| * \phi_J$$

where $S_1 x(t, \lambda_1)$ are the corresponding first-order coefficients, $x$ is the input signal, $\psi_{\lambda_1}$ is the set of wavelet filters in the first filter bank, and $\phi_J$ is the scaling function. The second-order coefficients are generated similarly in the second stage by applying the same operations to each filtered output from the first stage using the set of filters $\psi_{\lambda_2}$, producing the second-order coefficients $S_2 x(t, \lambda_1, \lambda_2)$ (Hassan, Sadek, and Attia 2021):

$$S_2 x(t, \lambda_1, \lambda_2) = \left| \left| x * \psi_{\lambda_1} \right| * \psi_{\lambda_2} \right| * \phi_J$$

The energy of the coefficients dissipates rapidly with the stage iterations through the network and converges to 0, i.e. higher-order scattering coefficients have lower energies compared with lower-order coefficients (Bruna and Mallat 2013). Networks with two wavelet filter banks have been found to be sufficient for most applications; as a result, second-order scattering coefficients were chosen to extract features in this study.
The WS transform was implemented using the Kymatio package (Andreux et al. 2020) in Python. The NumPy frontend was called to compute the 1D first-order scattering coefficients, with the plot shown in Figure 7. The parameters for the transform include: the number of samples, given by the length of the original signal; the averaging scattering scale, specified as a power of 2 and set to 6 to give an averaging (maximum) scale of 64; and the number of wavelets per octave, set to 16 to resolve frequencies at a resolution of 1/16 octave. Similarly, the first two principal components, which account for an average explained variance of 89%, were extracted and used for training.
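The convolution-modulus-averaging pipeline of a single scattering stage can be illustrated with a simplified NumPy sketch; this is not the Kymatio implementation used in the paper, and the toy filters used in practice would be a proper Morlet wavelet filter bank:

```python
import numpy as np

def scattering_first_order(x, wavelet_filters, scaling_filter):
    """One scattering stage: convolve the signal with each wavelet
    filter, take the modulus as the nonlinearity, then average the
    modulus with the scaling (low-pass) filter."""
    coeffs = []
    for psi in wavelet_filters:
        modulus = np.abs(np.convolve(x, psi, mode="same"))
        coeffs.append(np.convolve(modulus, scaling_filter, mode="same"))
    return np.stack(coeffs)  # shape: (n_filters, len(x))
```

Feeding each modulus output back through a second filter bank and repeating the same operations would yield the second-order coefficients used in this study.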

Classifier training and testing for current tool condition detection and future tool condition prediction
The tool condition monitoring approach in this study is divided into two independent stages: i) detecting the current state of the cutting tool based on the sensor signal as it is collected during machining and ii) determining the future state of the tool based on the forecasted signal, as explained in Section 2.2. As shown earlier in Figure 1 in Section 2, both stages are classification tasks which require the training of classifier models.
The sensor signals from the machining experiments in Section 2.1 were labelled and used for training the classifier models, namely the RF, SVM and XGBoost. The signals for each tool condition were classed, based on the maximum flank wear ($V_{Bmax}$), as 'minor', 'medium' and 'severe' for the purpose of training. The thresholds used to class the signals based on the wear states were $V_{Bmax} \le 100\,\mu\mathrm{m}$ for 'minor', $100\,\mu\mathrm{m} < V_{Bmax} \le 300\,\mu\mathrm{m}$ for 'medium', and $V_{Bmax} > 300\,\mu\mathrm{m}$ for 'severe'. In this way, the classifier algorithms were trained by mapping the features extracted from the signals to the wear classes, as indicated earlier in Figure 7.
The dataset includes the 306 cutting pass instances collected from the experiments described in Section 2.1. This data was split into 80% training and 20% testing. Prior to training each classifier algorithm, the GridSearchCV method in Scikit-learn was used to determine the optimal hyperparameters within predefined search spaces, with cross-validation used to establish the optimal values. The hyperparameters of the RF are the number of estimators and the bootstrap definition; the GridSearchCV method determined the number of estimators as 2000, with bootstrapping set to False. The kernel of the SVM was found to be linear, with the C hyperparameter value determined as unity. Similarly, the optimal number of estimators for the XGBoost was found to be 6000.
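The hyperparameter search can be sketched with Scikit-learn's GridSearchCV. The data and search space below are synthetic stand-ins for illustration only, not the study's dataset or full search ranges:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in features and binary labels; the actual study used
# features extracted from 306 cutting passes with three wear classes.
rng = np.random.default_rng(42)
X = rng.normal(size=(120, 6))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Deliberately small search space for illustration; the paper's search
# found 2000 estimators with bootstrap=False to be optimal for the RF.
param_grid = {"n_estimators": [50, 100], "bootstrap": [True, False]}
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,  # 5-fold cross-validation, as used in the study
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```

The same pattern applies to the SVM (searching over kernels and C) and the XGBoost (searching over the number of estimators).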
A 5-fold cross-validation was first used to assess performance and test convergence of the classifier models before testing on the held-out test set. The prediction of the severe wear condition and, especially, the ability to determine its onset are the most important from a tool condition monitoring perspective. The predictions are summarised in a confusion matrix, from which the precision, recall, F1-score and overall accuracy can be computed as concise metrics for each tool condition class. The F1-score combines the precision and recall into a single metric and is more representative in evaluating predictive performance. As a result, the F1-score for the severe tool condition has been given more importance and used to assess the performance of the classifier models. The metrics are computed as follows:

$$Precision = \frac{TP}{TP + FP}, \qquad Recall = \frac{TP}{TP + FN}$$

$$F1\text{-}score = \frac{2 \times Precision \times Recall}{Precision + Recall}, \qquad Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$

where TP is the number of true positives, TN is the number of true negatives, FN is the number of false negatives and FP is the number of false positives. The same classifier models that were used for current tool condition monitoring were used for future tool condition prediction. The models were trained on actual sensor signals as explained earlier; however, for the future prediction, the forecasted signals from Section 2.2 were classified. In the future tool condition prediction stage, statistical features were extracted from the signals forecasted for each pass and used to predict the tool wear state with the trained classifiers. The features extracted from the forecasted signals of a particular window size were input into classifiers trained on data of the corresponding window size. For example, the features from a forecasted signal of 0.04 s window size were input into a classifier trained on the equivalent 0.04 s window size to predict the future tool state.
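The metric definitions above can be written directly as a small helper computing all four values from the confusion matrix counts of a single class:

```python
def classification_metrics(tp, tn, fp, fn):
    """Precision, recall, F1-score and accuracy from the confusion
    matrix counts of a single class (one-vs-rest)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return precision, recall, f1, accuracy
```

For example, with 8 true positives, 8 true negatives, 2 false positives and 2 false negatives, all four metrics evaluate to 0.8.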

Results and discussion
Some of the results from the machining experiments are presented in Section 3.1, with the following sections showing and discussing the results for the tool monitoring tasks. The tool bending moment signals from the machining experiments were used to train an ANN and a 1D CNN to forecast the future sensor signals. The forecasted signals were then validated against the actual signals (ground truth) from the machining experiments. The results from training, testing and forecasting the bending moment signals are detailed in Section 3.2. The bending moment signals from the machining experiments were labelled, and different feature extraction methods, namely statistical parameters, CWT and WS, were used to train a number of machine learning classifiers (RF, SVM and XGBoost) as explained in Section 2.3. The trained models were used for detecting the current state of the cutting tools during machining using the actual bending moment signals and for predicting the future conditions of the cutting tools based on the forecasted signals from Section 2.2. The results from training and validating the classifier models for detecting the current condition and predicting the future condition of the cutting tools are explained in detail in Section 3.3.

Machining experiments
The tool wear growth in each machining experiment is shown in Figure 8. The longest tool life for machining pure dense tungsten was about 14 min, achieved with the cutting tool with a 14° rake angle when machining at 40 m/min cutting speed. Irrespective of the cutting speed, the tool with the 14° rake angle performed best.
The average bending moment signal for the first machining pass of each experiment is shown in Figure 9. The results indicate that the tool with a 12° rake angle when machining at 60 m/min has the lowest bending moment, and thus cutting force, during machining. This means that this tool experienced the least impact during machining, and it would be reasonable to expect that this experiment would also result in the longest tool life. However, from Figure 8, this is not the case. The tool life for this experiment was about 8 min, even though its average bending moment was about 12% lower than when machining with the best rake angle and cutting speed combination (14° and 40 m/min). In addition, the experiment with the highest average bending moment (8° and 40 m/min) had an eventual tool life of about 11 min, which is higher than when machining with the lowest initial bending moment signal (12° and 60 m/min).
This shows that a visual plot of cutting signals is not sufficient to understand the machining of tungsten.These signals cannot merely be extrapolated as the basis to monitor the tool condition at any point during machining.The approach proposed in this paper helps to mitigate this challenge as the signals are correlated to the actual tool performance by using machine learning algorithms to learn useful features from the signals.

Forecasts of bending moment signals
The bending moment signals from the machining experiments were used to train the 1D CNN and ANN to predict the future bending moment signals.
In this scenario, the bending moment signal from the 1st pass is used to predict the bending moment in the 2nd pass. Thereafter, the signals from the 1st and 2nd passes are used to predict the signal in the 3rd pass, and so on. Figure 10 shows the evaluation of the training, validation and testing of the 1D CNN and ANN, with the MAE used as the metric for comparing predictions of the bending moment signal with the ground truth signal during training and validation. Additionally, an example is provided of the ability of each network to predict the bending moment signals compared with the experimental results during testing.
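The expanding-history scheme above, where all passes seen so far are used to forecast the next pass, can be framed as a supervised learning dataset. A minimal sketch with NumPy, where the per-pass signals and the MAE scoring are toy stand-ins for the actual bending moment data:

```python
import numpy as np

def expanding_pairs(passes):
    """Build (history, target) training pairs from per-pass signals:
    pass 1 -> pass 2, passes 1-2 -> pass 3, and so on."""
    pairs = []
    for k in range(1, len(passes)):
        history = np.concatenate(passes[:k])  # all signals seen so far
        target = passes[k]                    # next pass to forecast
        pairs.append((history, target))
    return pairs

def mae(forecast, truth):
    """Mean absolute error, the metric used to score the forecasts."""
    return np.mean(np.abs(forecast - truth))

# Toy example with three short passes of a bending-moment-like signal
passes = [np.full(4, 10.0), np.full(4, 11.0), np.full(4, 12.0)]
pairs = expanding_pairs(passes)
print(len(pairs))                        # 2 training pairs
print(mae(np.full(4, 11.5), passes[2]))  # 0.5
```

The 1D CNN and ANN in the study are trained on pairs of exactly this shape, mapping the accumulated history to the next pass's signal.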
A representative evaluation of the 1D CNN on the different datasets is shown in Figure 10. The results of each network's predictions on selected passes (2, 5, 8, 11, 14, 17, 19 and 20) are shown in Figures 11 and 12. They clearly demonstrate that the 1D CNN can model the variability of the original signals for all passes. This is also evidenced in Figure 10(b), where the algorithm's prediction matched the test set signal undulations best. Whilst the ANN shows relatively poor variability in its predictions, its MAE per pass is comparable to that of the 1D CNN.
Figure 13(a) shows the ground truth mean resultant bending moment per pass compared to the mean forecasts per algorithm. Figure 13(b) further indicates the errors made against the ground truth per pass. The capability of the 1D CNN to match the signal variability, coupled with its low prediction errors, makes it suitable for further trials, hence justifying its selection for the classification tasks in Section 3.3.3. Nevertheless, based on the predicted data presented in Figures 11 and 12 and the mean bending moments shown in Figure 13(a), both the 1D CNN and ANN underpredict the bending moment in comparison with the experimental data.
This performance of the 1D CNN agrees with the findings of Van et al. (2020). The authors compared the performance of the 1D CNN with the ANN and LSTM as well as with traditional models such as the autoregressive integrated moving average (ARIMA) and seasonal ARIMA. They showed that the predictions of the 1D CNN outperformed these algorithms and concluded that it may be more suitable for time series modelling. In this study, the 1D CNN was also able to better match the peaks in the observed data despite its underestimations. Based on these results, the 1D CNN was chosen for forecasting the future bending moment signal for future tool condition prediction in Section 3.3.3.

Tool wear classification
This section details the results from training the classifier algorithms, namely the RF, SVM and XGBoost, for current tool condition monitoring as well as for the prediction of the future tool conditions based on forecasted signals. In Section 3.3.1, the results from training and testing the classifier algorithms for detecting the current tool conditions are presented. The impact of signal length (window size) on the detection performance was investigated for four window sizes of 0.04 s, 0.08 s, 0.12 s and 0.16 s. Statistical features were extracted from the signals and formed the basis of the training and testing in this section. The performances of the classifier algorithms in classifying the current tool conditions, when trained on features extracted using the CWT and WS of the whole signal, are discussed in Section 3.3.2. In addition, the best performing feature extraction method is identified. In Section 3.3.3, the classifier algorithms trained on experimental data are tested for predicting the future tool conditions using the forecasted sensor signals from the 1D CNN, as explained in Section 3.2.
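The windowed statistical feature extraction used throughout this section can be sketched as follows. This is a minimal illustration: the sampling rate and the particular feature set are assumptions for the example (the study's full feature list is given in Table 2).

```python
import numpy as np
from scipy import stats

def statistical_features(window):
    """Illustrative statistical features of one windowed signal segment;
    the paper's complete feature list appears in Table 2."""
    return {
        "mean": np.mean(window),
        "std": np.std(window),
        "rms": np.sqrt(np.mean(window ** 2)),
        "max": np.max(window),
        "skewness": stats.skew(window),
        "kurtosis": stats.kurtosis(window),
    }

def window_signal(signal, fs, window_s):
    """Split a signal sampled at fs Hz into non-overlapping windows of
    window_s seconds (0.04, 0.08, 0.12 or 0.16 s in the study)."""
    n = int(fs * window_s)
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]

fs = 1000  # assumed sampling rate for illustration
signal = np.sin(np.linspace(0, 10, 2 * fs))  # 2 s toy signal
windows = window_signal(signal, fs, 0.04)
print(len(windows))                           # 50 windows of 40 samples
print(len(statistical_features(windows[0])))  # 6 features per window
```

Each window thus becomes one fixed-length feature vector, which is what allows the same classifiers to be trained and compared across the four window sizes.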

Current tool condition detection using statistical features
The results from training and testing the classifier algorithms on statistical features for detecting current tool conditions are presented in this section. The confusion matrices in Figures 14, 15 and 16 show the comparisons of the test set predictions and the actual tool conditions for the RF, SVM and XGBoost, respectively. All classifiers correctly detect all the severe cases for the 0.16 s window; however, the SVM has the highest F1-score of 96%. The maximum F1-scores of the RF and XGBoost are 93% and 90%, respectively. In all cases, there were no instances where a classifier detected a severe tool condition when the actual condition was minor, or vice versa. Furthermore, both minor and severe conditions were only misclassified as the intermediate medium condition.
The classification results suggest that extracting features from a larger window might improve performance. However, as shown in Figure 17, the models trained on the statistical features extracted from the whole signal resulted in less accurate classification relative to those trained on the much shorter window sizes presented in Figures 14, 15 and 16. Training the RF, SVM and XGBoost models on the features extracted from the whole signals resulted in F1-scores of 85%, 86% and 86%, respectively.
The inclusion of the cutting tool rake angle and cutting speed as one-hot encoded features (four encodings for the rake angle and two for the cutting speed) shows that these bear relatively low importance and have the lowest scores of all input features. The combined feature importance of the six encoded features, using the in-built feature importance attribute in Scikit-learn, is 3.49% (0.58% on average) compared to an average of 3.69% per contributing phase difference feature. On the other hand, x_max has the maximum feature importance of 23.19%.
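This encoding-and-importance analysis can be sketched as below. The data, feature names and label rule are entirely hypothetical; the sketch only illustrates how categorical machining parameters are one-hot encoded alongside signal features and how the fitted random forest's importance attribute is read back per feature group.

```python
# Hypothetical sketch: one-hot encoding rake angle and cutting speed,
# then inspecting feature importances from a fitted random forest.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 200
sig_feats = rng.normal(size=(n, 4))         # e.g. mean, std, rms, x_max
rake = rng.choice([8, 10, 12, 14], size=n)  # four rake angles (assumed)
speed = rng.choice([40, 60], size=n)        # two cutting speeds
y = (sig_feats[:, 3] > 0).astype(int)       # toy label driven by x_max only

enc = OneHotEncoder()
onehot = enc.fit_transform(np.c_[rake, speed]).toarray()  # 4 + 2 = 6 cols
X = np.hstack([sig_feats, onehot])          # signal + encoded features

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
imp = rf.feature_importances_               # sums to 1 across all inputs
print(round(imp[:4].sum(), 2))              # importance of signal features
print(round(imp[4:].sum(), 2))              # importance of the 6 encodings
```

Because the toy label depends only on the signal features, the six encoded columns receive little importance, mirroring the low scores reported above.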
Retraining each algorithm with the rake angle and cutting speed excluded as features shows that the performances are not significantly degraded, even with the potential loss of information. The results are shown in Figure 17(d-f), where the F1-scores are 83%, 87% and 77% for the RF, SVM and XGBoost, respectively.
The algorithms trained on the whole signal were further validated on unseen data. The results are presented in Figure 18, where the RF has the best performance with a 100% F1-score. The algorithm also correctly detects the onset of severity at the 15th machining pass. The SVM and XGBoost, on the other hand, were not able to detect the onset of severity.

Current tool condition detection using continuous wavelet transform and wavelet scattering features
In addition to statistical feature extraction, the effect of feature extraction has been tested with the application of continuous wavelet transforms (CWT) and wavelet scattering (WS) to the whole signals. Equivalent comparisons have been made with the performances of the algorithms when trained on statistical features as in Section 3.3.1, with the exclusion of the rake angle and cutting speed as features. The confusion matrices in Figure 19 demonstrate the capabilities of the different models in detecting the tool condition against the actual condition. The results from the CWT features are shown in Figure 19(a-c). Both the RF and XGBoost have F1-scores of 89%, which are higher than their performances when trained on statistical features (83% and 77%, respectively). However, the F1-score of the SVM is 85% compared to 87% on the statistical features. Figure 19(d-f) presents the performances of the RF, SVM and XGBoost algorithms when trained on WS features. The performances have greatly improved compared with training on either statistical or CWT features. The F1-scores have increased to 92% for all classifiers, which signifies that they can be more reliable when severe tool conditions are detected.
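For intuition on how the CWT turns a one-dimensional signal into a time-frequency coefficient map from which features can be pooled, a heavily simplified real-Morlet transform implemented by direct convolution is sketched below. This is illustrative only and is not the paper's implementation; production code would use a dedicated library such as PyWavelets.

```python
import numpy as np

def simple_cwt(signal, scales, w0=5.0):
    """Very simplified continuous wavelet transform with a real Morlet
    mother wavelet, computed by direct convolution. Illustrative only."""
    out = np.empty((len(scales), len(signal)))
    for i, s in enumerate(scales):
        t = np.arange(-4 * s, 4 * s + 1)
        wavelet = np.cos(w0 * t / s) * np.exp(-(t / s) ** 2 / 2)
        wavelet /= np.sqrt(s)                     # scale normalisation
        out[i] = np.convolve(signal, wavelet, mode="same")
    return out

# Features for a classifier can then be pooled from the coefficient map
fs = 500
t = np.arange(0, 1, 1 / fs)
signal = np.sin(2 * np.pi * 25 * t)               # toy 25 Hz signal
coeffs = simple_cwt(signal, scales=np.arange(1, 17))
features = np.abs(coeffs).mean(axis=1)            # one energy per scale
print(coeffs.shape)    # (16, 500)
print(features.shape)  # (16,)
```

The key property exploited in this section is visible here: each row of the coefficient map preserves temporal localisation at one scale, so the pooled features carry both spectral and temporal information simultaneously.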
The performance of each algorithm was also tested on unseen data from the validation set to further assess the reliability of the prior predictions. The results are presented in Figure 20 for each algorithm when trained on CWT and WS features. With the CWT features in Figure 20(a-c), the RF and XGBoost have 100% F1-scores compared to 73% for the SVM. The RF and XGBoost were also able to detect the onset of severity, unlike the SVM, which could not. Figure 20(d-f) shows the results with the WS features, where the RF and SVM have F1-scores of 100% whilst the XGBoost has 92%. Based on these performances, the RF and SVM could detect the onset of severity.
Whilst the performance of the SVM significantly increased when using WS, the RF trained on WS features demonstrated the best performance in correctly detecting the current cutting tool condition. Similar findings have been reported in the literature on the superior performance of the RF. Wu et al. (2017) found that the RF produced more accurate predictions of tool wear than the SVM. In the study by Park et al. (2019), the RF and XGBoost outperformed the SVM in classifying the tool wear state. Das et al. (2019) also found that the RF showed better classification results than the SVM. The ensemble approach of combining multiple decision trees in the RF could explain this superior performance compared to other non-ensemble methods.
It can also be deduced that training the classifiers on WS-extracted features offers stability in performance. Compared to training on statistical features as in Section 3.3.1, the CWT and WS features help the classifiers perform better, as the F1-scores and overall accuracies are increased. The capability of the CWT and WS transforms to provide features which simultaneously preserve both spectral and temporal information explains the improved performances. Furthermore, the superior performances of the classifiers on WS features can be attributed to the stable transformations of the signals as features, which limit the intra-class variabilities whilst also maximising inter-class differences. This stable representation of signals by the WS transform was the subject of a study by Hassan et al. (2021), where distortions in the signals were minimised to achieve a 98% accuracy of tool condition predictions.

Future tool condition prediction from forecasted signals
The results obtained from predicting tool conditions ahead of time are presented in this section where the classification results are shown from pass 2. Being able to predict the condition of the tool ahead of machining enables decision making for changing the cutting tool or adjusting the cutting parameters.
The forecasted signals in Section 3.2 are of finite length, and it is not practical to transform them into the time-frequency domain using the CWT, WS or other transformation methods. This is because of boundary effects or distortions which are common in processing time series signals of finite length (Su, Liu, and Jingsong 2012). The signals need to be extended to alleviate these distortions. Common extension methods, including zero padding, periodic extension and symmetric extension, are based on assumptions about the signal characteristics. They fail to produce satisfactory results in terms of preserving the time-varying characteristics of the signals (Strand and Nguyen 1996). As a result of this challenge, only statistical analyses have been performed to extract features from the forecasted signals to make predictions about the future tool conditions.
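The three common extension methods mentioned above can be illustrated with numpy.pad on a short finite-length signal:

```python
# Sketch of the signal-extension methods discussed above.
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0])

zero_padded = np.pad(x, 2, mode="constant")  # pad with zeros
periodic = np.pad(x, 2, mode="wrap")         # periodic extension
symmetric = np.pad(x, 2, mode="symmetric")   # mirror at the boundaries

print(zero_padded)  # [0. 0. 1. 2. 3. 4. 0. 0.]
print(periodic)     # [3. 4. 1. 2. 3. 4. 1. 2.]
print(symmetric)    # [2. 1. 1. 2. 3. 4. 4. 3.]
```

Each extension imposes an assumption (a zero baseline, exact periodicity, or boundary symmetry) that a time-varying machining signal generally does not satisfy, which is why only statistical features were extracted from the forecasted signals.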
Statistical features were extracted from the signals forecasted per cutting pass by the 1D CNN in Section 3.2 for a complete set of experiments. These features were input into each trained classifier of the corresponding window size. The results are presented in Figures 21, 22 and 23 for the RF, SVM and XGBoost, respectively. In Figure 21, the RF has similar predictions of the severe cases for both the 0.04 s and 0.12 s windows. Similar observations can be made for the 0.08 s and 0.16 s windows. The SVM in Figure 22 has similar predictions of the severe cases across all windows, suggesting a more invariant response to window size. The XGBoost in Figure 23 has similar performances to the RF and SVM for all window sizes except the 0.16 s window, which has the worst prediction, missing three of the severe cases. These analyses show that none of the algorithms has its performance significantly altered by the window size.
However, the RF with the 0.16 s window resulted in the best performance in terms of the overall accuracy, the prediction of severe cases and the detection of the onset of severity at pass 14. The SVM with the 0.04 s window was also able to detect the onset of severity, but with a relatively poorer performance. These results show that the RF performs better than both the SVM and XGBoost and can be reliably deployed for future tool condition prediction tasks such as in this study.
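The prediction stage described in this section can be sketched end to end: statistical features are extracted from a forecasted window and classified by a model trained on measured windows of the same size. All data, amplitudes and class labels here are synthetic assumptions for illustration.

```python
# End-to-end sketch of the forecast-then-classify stage. Synthetic data:
# window amplitude is assumed to grow with wear severity (toy assumption).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def features(window):
    """Small illustrative statistical feature vector for one window."""
    return [window.mean(), window.std(), np.abs(window).max()]

rng = np.random.default_rng(1)
# Training data from "measured" windows: 0=minor, 1=medium, 2=severe
X_train, y_train = [], []
for label, amp in [(0, 1.0), (1, 2.0), (2, 4.0)]:
    for _ in range(40):
        X_train.append(features(amp * rng.normal(size=80)))
        y_train.append(label)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
clf.fit(X_train, y_train)

# A "forecasted" window (synthetic, high amplitude) is classified to
# predict the future tool state before that pass is actually machined
forecasted = 4.0 * rng.normal(size=80)
future_state = clf.predict([features(forecasted)])[0]
print(future_state)
```

The classifier and the forecasted window must share the same window size, as in the study: a 0.04 s forecast feeds the 0.04 s-trained classifier, and so on.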

Discussion and future work
Machining of single-phase dense tungsten is increasingly required to scale fusion energy reactors and reduce the overall costs associated with fusion energy generation. Beyond fusion energy generation, tungsten has applications in various fields including aviation, automotive, aerospace, electronics, medicine, chemicals and sports (Lassner and Schubert 2005). A series of end milling experiments were performed for the first time using 12 mm diameter solid carbide tools with AlTiN coating and various rake angles. The investigations showed that a tool with a 14° rake angle performed best in terms of tool life among the tested geometries. Whilst the variation of cutting tool geometry exposes the learning models to variations in cutting tools, other tool microgeometries can be further explored to enhance the machining performance for milling single-phase tungsten.
Cutting forces have been identified by various researchers as the best predictor for TCM (Aitor et al. 2022; Cho, Binsaeid, and Asfour 2010). However, using a dynamometer is not practical in industrial applications, and its use is limited to laboratory tests. Wireless sensory tool holders allow for unintrusive data collection for TCM, which can be applied in production scenarios and retrofitted into existing machine tools. In this study, a Spike sensory tool holder was used for collecting cutting tool bending moment signals during machining. The bending moment signals were used for detecting and predicting the cutting tool condition, both of which are independent tasks. Instead of predicting the future tool wear or tool condition directly, as is common in the literature, the cutting tool bending moment signal was first forecasted using 1D CNN and ANN models, and the 1D CNN was selected for further investigation. Whilst this model achieved an MAE of 3.37 in predicting the future bending moment signals, there are many other learning algorithms that can also be tested. For instance, Hall et al. (2022) used a ConvLSTM to predict future transformed signal frames rather than the actual sensor signals. Other potential learning algorithms for time series prediction are sequence models such as the LSTM and GRU, which can capture long-range dependencies in the data. These models could enable the measured signals to be better matched to further minimise the errors of the forecasted signals. However, they require large datasets for training and validation and are computationally expensive.

Three classifier networks, namely the RF, SVM and XGBoost, were used for classifying sensor signals for current tool condition detection as well as future condition prediction. Amongst these, the RF performed best in detecting the current tool conditions with an accuracy of 95%. The RF also achieved 89.5% accuracy in predicting the future conditions. There are other deep learning classifier models, such as the CNN and ANN, for detecting small features in sensor signals. However, these models also require large datasets for training and validation. Future work will focus on assessing and comparing the performance of these models in detecting specific events during machining such as chipping and nose breakage. Feature engineering could also benefit future studies, where new features can be engineered from the signals to aid training and improve the performance of the classifiers.

The short tool life experienced in machining tungsten makes conventional methods of tool wear monitoring for deciding when to change the tool extremely inefficient. Detecting the current state of the tool might not leave enough time to take appropriate actions before damaging the workpiece with a worn tool. Being able to predict the condition of the tool in the near future enables timely decision making for changing the tool.

Conclusion
Tool life results from end milling single-phase fully dense tungsten used in fusion energy reactors have been reported for the first time, and tool bending moment signals were collected for tool condition monitoring using a sensory tool holder. A method for independently detecting the current tool condition as well as predicting the future condition in end milling of single-phase tungsten has been tested and validated. In this method, the cutting tool bending moment from a sensory tool holder during machining is used for detecting and predicting the cutting tool condition. The sensor signals from the experimental data have been used to train a number of classifier models, namely random forest (RF), support vector machine (SVM) and extreme gradient boosting (XGBoost). For detecting the current tool condition, the trained models have been used to classify the actual measured signals to determine the tool condition. The method also combines two stages for predicting the future cutting tool condition. Firstly, a 1D convolutional neural network (1D CNN) and an artificial neural network (ANN) have been trained and validated on historical data to forecast the time series tool bending moment in the future. Secondly, the same trained classifier models have been used to classify the forecasted bending moment signal to predict the future condition of the cutting tool. In this way, the capabilities of different classifiers in detecting the current tool condition based on the actual bending moment, and in predicting the future tool condition based on the forecasted bending moment signal, have been assessed. Overall, the following conclusions were drawn:
• The RF achieved 95% accuracy in detecting the current cutting tool condition when it was trained on features extracted from the bending moment signal using wavelet scattering.
• Classifying the forecasted sensor signal to define the cutting tool condition in the future has been shown to successfully predict the future tool condition with 89.5% accuracy. Future tool conditions can be predicted by extracting statistical features from the forecasted signals. In terms of predicting the severe wear condition of the cutting tool, the combination of the 1D CNN for forecasting the bending moment signal and the RF classifier can be used to classify the tool states whilst also being able to predict the onset of severity.
Cost-effective and reliable manufacturing of parts made from refractory metals such as tungsten is necessary for the successful implementation of fusion energy reactors. The tools and methods presented in this paper can help industry in machining tungsten parts and in predicting when a cutting tool is expected to suffer from severe tool wear prior to damaging the workpiece, hence replacing the need for constant process interruption or changing tools prematurely.

Figure 1. The proposed approach for detecting current tool conditions and predicting future conditions.

Figure 2. Illustration of experimental setup for data collection.

Figure 4. Example bending moment signal from machining in (a) x-direction, (b) y-direction and (c) the resultant bending moment indicating the cutting tool engagement with the workpiece.

Figure 5. Illustration of sensor signals measured and forecasted in each cutting pass.

Figure 7. Illustration of the tool condition classification with the original and transformed signals.

Figure 8. Tool wear plots during machining of tungsten at (a) 40 m/min and (b) 60 m/min cutting speed.

Figure 9. Peak bending moment for the first machining pass.

Figure 10. Performance evaluations of the 1D CNN and ANN on the validation and test sets: (a) learning curve of the 1D CNN; (b) test set prediction of the 1D CNN; (c) learning curve of the ANN; (d) test set prediction of the ANN.

Figure 13. Mean bending moment for each machining pass for (a) ground truth and prediction per algorithm; (b) prediction error per algorithm.

Figure 14. Confusion matrices for the classification of tool conditions by the RF for (a) 0.04 s window; (b) 0.08 s window; (c) 0.12 s window; and (d) 0.16 s window.

Figure 15. Confusion matrices for the classification of tool conditions by the SVM for (a) 0.04 s window; (b) 0.08 s window; (c) 0.12 s window; and (d) 0.16 s window.

Figure 16. Confusion matrices for the classification of tool conditions by the XGBoost for (a) 0.04 s window; (b) 0.08 s window; (c) 0.12 s window; and (d) 0.16 s window.

Figure 17. Confusion matrices with/without rake angle and cutting speed as encoded features for RF, SVM and XGBoost when trained on statistical features.

Figure 18. Confusion matrices on the validation set for (a) RF, (b) SVM, and (c) XGBoost when trained on statistical features.

Figure 19. Confusion matrices of each algorithm on the test set when trained on (a)-(c) CWT features and (d)-(f) WS features.

Figure 20. Confusion matrices on the validation set for each algorithm when trained on (a)-(c) CWT features and (d)-(f) WS features.

Figure 21. Prediction of tool conditions by the RF per cutting pass for all window sizes.

Figure 22. Prediction of tool conditions by the SVM per cutting pass for all window sizes.

Figure 23. Prediction of tool conditions by the XGBoost per cutting pass for all window sizes.

• Tool bending moment signals collected from a sensory tool holder have been successfully used for detecting the current tool condition by training classifiers to map the measured signals to the actual tool conditions during machining.
• Both the 1D CNN and ANN can be used for forecasting time series bending moment signals in the near future. However, the 1D CNN outperformed the ANN in its forecasts by closely matching the actual signals with an MAE of 3.37.
• All the classifier models tested can detect the current tool condition with different degrees of accuracy.

Table 1. The geometric features of the cutting tool, workpiece dimensions and cutting parameters.

Table 2. List of statistical features extracted from signals.