Prediction of Ethereum gas prices using DeepAR and probabilistic forecasting

ABSTRACT Ethereum is a major public blockchain. Besides hosting Ether (Ξ), the second-largest cryptocurrency by market capitalization, it is also the foundation of Web3 and of decentralized applications, or DApps, that are fuelled by Smart Contracts. At the time of this writing, Ethereum still uses the Proof of Work (PoW) consensus algorithm to ensure the integrity of the blockchain and to prevent double spending. PoW requires the participation of miners, who are incentivized to assemble blocks of transactions by being rewarded with cryptocurrency paid by transaction originators and by the blockchain network itself via newly minted Ξ. Network fees for transaction submissions are called gas, by analogy to the fuel used by cars, and are negotiable. They are also highly volatile, and hence it is critical to predict the direction in which they are heading, so that one can time transaction submissions when feasible. There have been several efforts to predict gas prices, including the use of large Mempools, the analysis of committed blocks, and more recent ones using Facebook's Prophet model [Taylor, S. J., & Letham, B. (2017). Forecasting at scale. PeerJ Preprints, 5, e3190v2. https://doi.org/10.7287/peerj.preprints.3190v2]. In this study, we introduce an innovative approach that employs the DeepAR [Salinas, D., Flunkert, V., Gasthaus, J., & Januschowski, T. (2020). DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting, 36(3), 1181–1191. https://doi.org/10.1016/j.ijforecast.2019.07.001] model, known for its superior forecasting accuracy over conventional methods by virtue of its ability to learn from multiple related time series. This methodology not only offers immediate advantages but also holds promise for ongoing enhancements. We substantiate our claims through empirical testing, utilizing data extracts from the Ethereum blockchain and cryptocurrency price feeds.
This document is an extended version of our ICCS 2022 paper on the same topic. In this paper, we dive deeper into the internals of the DeepAR forecasting algorithm [Salinas, D., Flunkert, V., Gasthaus, J., & Januschowski, T. (2020). DeepAR: Probabilistic forecasting with autoregressive recurrent networks. International Journal of Forecasting, 36(3), 1181–1191. https://doi.org/10.1016/j.ijforecast.2019.07.001], analyse the correlation between the on-chain and off-chain sample data, describe additional experiments that empirically support our findings and, finally, compare our outputs with those from the Prophet [Taylor, S. J., & Letham, B. (2017). Forecasting at scale. PeerJ Preprints, 5, e3190v2. https://doi.org/10.7287/peerj.preprints.3190v2] model.


Introduction
Ethereum is the second largest public blockchain by market capitalization, after Bitcoin. Ethereum's native currency, the Ether (Ξ), has soared in value since the network's inception.
When it was launched in July 2015, the Ethereum blockchain introduced a new concept: Smart Contracts. These are programs written in a Turing-complete language (e.g. Solidity) that are stored on the blockchain and executed when specific conditions are met (Zheng et al., 2020). Smart contracts are the building blocks of Decentralized Applications, or DApps, that made the Ethereum blockchain so useful and popular. For example, Smart Contracts are the foundation of tokens that can represent digital assets or even real-world objects (Simionescu et al., 2021). ERC-20 (Wackerow, 2021) is a technical standard that defines all the software artefacts for implementing Ethereum tokens using smart contract languages, such as Solidity and Vyper. Gas prices fluctuate widely with network activity, and token transfers are a key factor. This is why we decided to feed ERC-20 token transfer activity into our model. Like Bitcoin, Ethereum, as of this writing, still mainly uses Proof of Work (PoW), a.k.a. mining, to maintain the integrity of the blockchain. This activity is critical to prevent double spending, i.e. to block users from spending their coins multiple times. Mining also makes it extremely difficult for adversaries to attack and manipulate the data on the ledger. It is also the sole method to mint new coins. As of this writing, 2 Ξ are minted every time a block is created. These coins are awarded to the successful miner. In addition to gaining ownership of new digital coins, miners are also compensated for incorporating data entries into the blockchain. In Ethereum, these fees are called 'gas'.
The term comes from the analogy that a car needs fuel to run: gas is the fuel that enables the recording of transactions on the distributed ledger. Gas is the unit of measurement of the computational power required for a miner to process a transaction and is priced in WEI (1 WEI = $10^{-18}$ ETH). The price of executing a transaction or a contract, or of deploying a smart contract, is GasCost × GasPrice (Chuang & Lee, 2021). Just like fuel prices in the real world, the gas price may vary, being subject to a negotiation process. The sender of a transaction specifies the maximum amount they are willing to pay, and the miner has the option of accepting, partially refunding or rejecting the offer. This also means that prices may increase drastically during peak times, and transactions offering a lower maximum amount may have to wait longer before they are added to the blockchain. For this reason, predicting gas prices is an extremely useful exercise.
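As a small worked example of the GasCost × GasPrice relation, the sketch below computes the fee of a plain Ether transfer, which consumes 21,000 gas units. The gas price of 100 GWEI is a made-up illustrative value, and the function names are hypothetical.

```python
WEI_PER_GWEI = 10**9
WEI_PER_ETH = 10**18


def transaction_fee_wei(gas_cost: int, gas_price_gwei: int) -> int:
    """Fee in wei = GasCost x GasPrice, kept in integers to avoid rounding."""
    return gas_cost * gas_price_gwei * WEI_PER_GWEI


def wei_to_eth(amount_wei: int) -> float:
    """Convert an amount in wei to ETH (1 wei = 10^-18 ETH)."""
    return amount_wei / WEI_PER_ETH


# Plain transfer (21,000 gas) at an assumed, illustrative price of 100 GWEI.
fee_wei = transaction_fee_wei(gas_cost=21_000, gas_price_gwei=100)
fee_eth = wei_to_eth(fee_wei)
print(fee_wei, fee_eth)
```

Because the fee scales linearly with the gas price, a tenfold price spike makes the same transfer ten times more expensive, which is why the timing of a submission matters.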
This paper is organized as follows: in Section 2, we go over related works. Section 3 explains parts of the Ethereum blockchain that are important for our study. We introduce our new estimation model in Section 4. Section 5 describes the data we used and how we analysed it for our experiment. Our experiments and their results are discussed in Section 6. We wrap up the paper in Section 7, where we also suggest areas for future research.

Literature review
To date, to the best of our knowledge, three methods to analyse and predict gas prices have been proposed, as highlighted in Chuang (n.d.). The first method relies on the analysis of pending transactions in large Mempools (Blocknative, n.d.). A Mempool is a buffer area where pending transactions sent by Ethereum clients are stored before they are added to the Ethereum blockchain. This method proves to be resource intensive and complex to implement, as it requires access to multiple Mempools and assumes that the owners of these Mempools are honest.
A second method consists of analysing recently committed blocks. Oracles of this kind include the Ethereum client Geth (Chuang, n.d.), the EthGasStation oracle (Ethereum Gas Station, n.d.) and Gas Station Express (Gas Station Express Oracle, n.d.). Note that oracles are systems that connect the blockchain to the outside world; specifically, gas price oracles provide guidance to users regarding the gas price to pay to ensure that miners will accept the fees and commit the submitted transactions into subsequent blocks (Smith, 2022).
A forecasting model based on Gated Recurrent Unit (GRU) (Cho et al., 2014) and a Gas Recommendation engine that leverages the output of the forecasting model was proposed by Werner et al. (2020). Their approach used a neural net forecasting model that also included an additional parameter that reflects the urgency of the transaction (the higher the gas price, the faster the transaction is committed). The model reduced fees by more than 50% while increasing the waiting time by 1.3 blocks, when compared to the Geth oracle. Mars et al. (2021) evaluated the LSTM, GRU and Prophet models to anticipate gas prices. In their empirical evaluation, the LSTM and GRU models produced better outcomes than the Prophet model and the Geth oracle.
A Gaussian process model to infer the minimum gas price is presented in Chuang and Lee (2021). A Gaussian process is a non-parametric Bayesian approach that estimates a posterior over functions from a prior over functions and observed data. This model performs better than GasStation-Express and Geth only when gas prices fluctuate widely. For this reason, they propose a hybrid solution combining GasStation-Express with their model.
A closely related problem concerns the estimation of the time for confirming a transaction using ML models. Like gas prices, transaction confirmation times can be inferred based on historical blockchain data and temporal patterns.
Two regression models based on Random Forest and Multilayer Perceptron are evaluated and compared with each other, as well as with other regression models, in Singh and Hafid (2019). The problem was also approached using Decision Tree, Random Forest, Logistic Regression and Support Vector Machine (with linear, Gaussian and sigmoid kernels) in Oliveira et al. (2021).
In conclusion, the literature presents various approaches to predicting Ethereum gas prices, including analysing pending transactions in large Mempools, examining recently committed blocks and employing machine learning models such as Gated Recurrent Unit (GRU), LSTM and Prophet. Our work introduces a novel application of the DeepAR model, known for its superior forecasting accuracy due to its ability to learn from multiple related time series. We also draw insights from the related field of estimating transaction confirmation times, providing valuable context for our study. Our methodology offers immediate advantages and the potential for ongoing enhancements, substantiated through empirical testing with Ethereum blockchain and cryptocurrency price feed data. This study contributes a new perspective to the research community, aiming to improve the accuracy of gas price predictions and further the development of this important area of research.

The Ethereum blockchain
The Ethereum blockchain is a linked list of blocks. A block contains data records, known as transactions, and a number of other fields, such as the previous block's hash. Snapshots of the Ethereum database are stored on multiple nodes. Ethereum nodes vie with each other in a process known as mining to add new blocks to the blockchain. The agreement on the state of the blockchain, or consensus, is achieved using distributed algorithms, such as the Proof of Work (PoW) algorithm currently used by Ethereum, or the upcoming Proof of Stake (PoS) algorithm planned for Ethereum 2.0. Adding a block entails executing its transactions and updating the Ξ balances and values of Ethereum accounts. These are kept separately from the blockchain in a Merkle Patricia Trie structure (Ward, 2020). This structure maintains the coherence of the blocks' cryptographic hashes and provides efficient storage and retrieval operations.
Table 1 presents the structure of block headers, to be used as our sample data. The Previous Block Hash, Transaction Hash Root and Receipt Root Hash presented in Table 1 are generated by the corresponding Merkle Patricia Trie structure.
Transactions in the block have a source address, a target address, the amount of cryptocurrency moved, and an array of input bytes. The fields of a transaction that are relevant for our experiments are listed in Table 2.

The proposed model
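For this experiment, we decided to use DeepAR (Salinas et al., 2020). Amazon SageMaker's DeepAR is a supervised learning algorithm for forecasting scalar (one-dimensional) time series using recurrent neural networks (RNNs) (Figure 1).
Unlike other forecasting methods, such as autoregressive integrated moving average (ARIMA) and exponential smoothing (ETS), DeepAR can learn a global model from multiple time series. The empirical experimental results produced by Amazon (Salinas et al., 2020) show an improvement on standard metrics of up to 15% compared to state-of-the-art methods, such as Facebook's Prophet. Deep Autoregressive Networks (DARNs) are a type of generative neural network used for modelling sequential or time-series data. DARNs are 'generative' just like Generative Adversarial Networks (GAN; Goodfellow et al., 2014), 'deep' because they use multiple layers of neural networks, and 'autoregressive' because they predict future values based on past values in a sequence. DeepAR uses specific types of RNN, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), to model time-series data. Both architectures are designed to help fix the vanishing gradient problem in standard RNNs (Hochreiter & Schmidhuber, 1997).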
Equation (1) is based on the chain rule applied over a joint distribution of $T$ variables. $P(z_{T+1})$ is the probability of the next value in a sequence and $z_t$ is the value of the time series at time $t$. To avoid information leakage, the model is always trained with data from previous steps. A similar equation applies for multiple time series, with $z_{i,t}$ representing the value of time series $i$ at time $t$. In this case, the conditional distribution is modelled as follows:

$$Q_\Theta(z_{i,t_0:T} \mid z_{i,1:t_0-1}, x_{i,1:T}) = \prod_{t=t_0}^{T} Q_\Theta(z_{i,t} \mid z_{i,1:t-1}, x_{i,1:T}) \tag{2}$$

where $Q_\Theta$ is the model distribution, or the approximate posterior distribution, for model parameters $\Theta$. $z_{i,t_0:T}$ is the data we are trying to predict: a sequence of values for the $i$th time series from time $t_0$ to $T$. $z_{i,1:t_0-1}$ are the past values of the $i$th time series: the data points that the model uses to make its predictions. $x_{i,1:T}$ are the covariates. Each factor $Q_\Theta(z_{i,t} \mid z_{i,1:t-1}, x_{i,1:T})$ is the likelihood factor for time $t$: the conditional probability of $z_{i,t}$ given the past values $z_{i,1:t-1}$ and the covariates $x_{i,1:T}$. Basically, the model's distribution over the future values of the time series, given the past values and the covariates, is equal to the product of the likelihood factors for each time step from $t_0$ to $T$.
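To make the product of likelihood factors concrete, the following minimal sketch evaluates it in log space for a toy series. It assumes, purely for illustration, a Gaussian conditional whose mean is the previous value plus the current covariate; `toy_conditional` is a hypothetical stand-in for the learned network, not the model used in this paper.

```python
import math


def gaussian_log_pdf(z, mu, sigma):
    """Log-density of a Gaussian with mean mu and standard deviation sigma."""
    return -0.5 * math.log(2 * math.pi * sigma**2) - (z - mu) ** 2 / (2 * sigma**2)


def toy_conditional(past, covariate):
    """Hypothetical stand-in for the model: mean = last value + covariate."""
    return past[-1] + covariate, 1.0


def log_likelihood(series, covariates, t0):
    """Sum of log-factors for steps t0..T (0-based), each conditioned on the past."""
    total = 0.0
    for t in range(t0, len(series)):
        # Only values strictly before step t are visible: no information leakage.
        mu, sigma = toy_conditional(series[:t], covariates[t])
        total += gaussian_log_pdf(series[t], mu, sigma)
    return total


series = [10.0, 11.0, 13.0, 12.0]      # toy values of one time series
covariates = [0.0, 1.0, 2.0, -1.0]     # toy covariate, known for all steps
ll = log_likelihood(series, covariates, t0=2)
print(ll)
```

Summing log-factors is numerically safer than multiplying the factors themselves, which is why likelihood-based training is usually expressed in this form.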
Each likelihood factor is the model's estimate of the probability of the value at that time step, given the past values and the covariates. The right-hand side of Equation (2) can be written as

$$\prod_{t=t_0}^{T} \ell(z_{i,t} \mid \theta(h_{i,t}, \Theta))$$

where $\ell$ is the likelihood corresponding to a fixed parameterized distribution, such as Gaussian or negative binomial, with parameters captured by a function $\theta$ of the hidden outputs:

$$h_{i,t} = h(h_{i,t-1}, z_{i,t-1}, x_{i,t}, \Theta)$$

Function $h$ is implemented via recurrent neural networks using LSTM cells. Per Hochreiter and Schmidhuber (1997), the LSTM 'memory cell' state, $c_{i,t-1}$, is fed in along with the prior step hidden state $h_{i,t-1}$ and the inputs $x_{i,t}$ and $z_{i,t-1}$, and the cell outputs $h_{i,t}$, $c_{i,t}$ and $z_{i,t}$. The update for the cell state $c_{i,t}$ is given by the formula below, as provided in the same paper:

$$c_{i,t} = f_{i,t} \odot c_{i,t-1} + i_{i,t} \odot \tilde{c}_{i,t}$$

where $f_{i,t}$ represents the forget gate, $i_{i,t}$ the input gate and $\tilde{c}_{i,t}$ the candidate memory cell.
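The cell-state update can be illustrated with a scalar toy version. In a real LSTM the three pre-activations are learned affine functions of the previous hidden state and the current inputs; here they are passed in directly as hypothetical numbers so that only the gating arithmetic is shown.

```python
import math


def sigmoid(x):
    """Logistic function used by the forget and input gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_cell_state_update(c_prev, forget_pre, input_pre, candidate_pre):
    """New cell state = forget gate * old state + input gate * candidate."""
    f = sigmoid(forget_pre)              # forget gate, in (0, 1)
    i = sigmoid(input_pre)               # input gate, in (0, 1)
    c_tilde = math.tanh(candidate_pre)   # candidate memory, in (-1, 1)
    return f * c_prev + i * c_tilde


# With all pre-activations at zero, half of the old state is kept and the
# candidate contributes nothing.
c_half = lstm_cell_state_update(2.0, 0.0, 0.0, 0.0)

# A wide-open forget gate and a closed input gate carry the state through
# almost unchanged, which is how long-range information is preserved.
c_kept = lstm_cell_state_update(2.0, 50.0, -50.0, 3.0)
print(c_half, c_kept)
```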
The LSTM cells allow simultaneous training of related time series. An encoder-decoder setup, common in sequence-to-sequence models, is used to transfer states between the conditioning range and the prediction range. In DeepAR, both the encoder and the decoder have the same architecture.

Data collection and pre-processing
According to Salinas et al. (2020), the covariates can be item and/or time dependent. We assumed that gas prices would be influenced by the volume of transactions, the exchange rate of Ether, and smart contract activity (e.g. tokens being minted, transferred, etc.).
Distinct from previous models in this field that we are aware of, which rely solely on on-chain data, our approach incorporates both on-chain and off-chain information. For instance, we utilize off-chain data such as cryptocurrency price streams and Ethereum node log dumps.
We developed a Jupyter notebook for extracting timestamps and per-minute averages for gas prices, transaction values, committed transactions (i.e. added to blocks), token transfers, events emitted by ERC20 tokens, and gas used by transactions. For our experiment, we used the data as seen in Table 3. Figure 2 shows the normalized data for the test time range. The correlation matrix showed a strong negative correlation between Gas Prices and most other features, except MATIC Prices. The correlogram in Figure 3 is based on the absolute values of the correlation coefficients, reinforcing the fact that some features (i.e. MATIC Price) may not be related to Gas Prices.
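The relationship-strength measure behind the correlogram can be sketched as the absolute Pearson correlation between the target and each feature. The series below are made-up toy values, not the data of Table 3, and the feature names are hypothetical labels.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = (sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys)) ** 0.5
    return cov / spread


def abs_correlations(target, features):
    """Absolute correlation of each feature series with the target series."""
    return {name: abs(pearson(target, series)) for name, series in features.items()}


# Toy per-minute averages (illustrative only).
gas_price = [50.0, 60.0, 80.0, 70.0, 90.0]
features = {
    "committed_tx": [200.0, 180.0, 140.0, 160.0, 120.0],  # moves against gas price
    "matic_price": [1.2, 1.2, 1.3, 1.1, 1.2],             # only weakly related
}
strength = abs_correlations(gas_price, features)
print(strength)
```

Taking the absolute value treats a strong negative correlation as just as informative as a strong positive one, which is the reading a correlogram of this kind supports.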
Adding related time series to the training data in DeepAR can enhance the model's performance by enabling shared learning across series, improving the handling of rare events, increasing prediction accuracy, and providing more reliable uncertainty estimates. Our experiments confirmed these assumptions.
For the training and validation phase, we processed the mean of all the time series in Figure 4 for every 20 minutes. After experimenting with various time series frequencies, we chose 20 minute intervals as the best fit for this type of data. Ethereum gas prices fluctuate widely and hence the data is very noisy. It is noteworthy that outliers did not have an impact on the model prediction, unlike with Prophet, where we had to truncate them for better results. In the end, we had data processed at 20 minute intervals for 291 days (1 January 2021 to 18 October 2021). We used 80% of the data for training and 20% for validation.
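The aggregation and split described above can be sketched as follows. This is a minimal illustration on toy data, assuming gap-free one-minute inputs and a chronological (not shuffled) split; the function names are hypothetical.

```python
from statistics import mean


def aggregate_means(values, window):
    """Mean of each complete block of `window` consecutive values."""
    usable = len(values) - len(values) % window
    return [mean(values[i:i + window]) for i in range(0, usable, window)]


def chronological_split(series, train_fraction=0.8):
    """Keep the earliest part for training and the most recent for validation."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


minute_values = [float(v) for v in range(200)]           # 200 one-minute points
twenty_min = aggregate_means(minute_values, window=20)   # 10 twenty-minute means
train, valid = chronological_split(twenty_min)
print(len(twenty_min), len(train), len(valid))
```

At this frequency a day contributes 72 points, so 291 days correspond to 291 × 72 = 20,952 twenty-minute intervals when no data is missing.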

Model configuration and setup
We built a Python Jupyter notebook (Ferenczi, n.d.) and used Gluon Time Series (GluonTS) (Alexandrov et al., 2019) for probabilistic time series modelling. The DeepAREstimator is an implementation of the model described by Salinas et al. (2020). We have configured the estimator as follows:
• Prediction length of 40. The samples represent the means at 20 minute intervals, thus providing a horizon of 40 × 20 = 800 minutes (13 hours and 20 minutes). Prediction length is the number of subsequent data points inferred by the model.
• Architecture of 4 layers with 40 cells per layer.
• Dropout rate = 0.1.
• Context length (number of steps) of 80 (2 × prediction length). Context length is the number of points provided to the model to make the prediction. The target time series are the Gas Prices and there are 7 additional dynamic features.
• Cell type GRU. Note that we experimented with LSTM cells as well, although we did not notice any significant improvement, but rather a slight slow-down of the training.
• The learning rate callback had the following settings: patience = 10, base LR = $10^{-3}$, decay factor = 0.5.
• Training was configured to run for 200 epochs.
• We selected the checkpoints from 2 models, based on the best metric values.
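For reference, the settings listed above can be collected in a plain mapping, from which the forecast horizon and the context window follow directly. The key names are hypothetical labels chosen for this sketch; they are loosely modelled on common hyperparameter names and are not checked against any specific GluonTS release.

```python
from datetime import timedelta

# Hypothetical key names; the values are the ones listed in the text.
deepar_config = {
    "freq_minutes": 20,
    "prediction_length": 40,
    "context_length": 80,
    "num_layers": 4,
    "num_cells": 40,
    "cell_type": "gru",
    "dropout_rate": 0.1,
    "num_dynamic_features": 7,
    "epochs": 200,
    "lr_patience": 10,
    "base_learning_rate": 1e-3,
    "lr_decay_factor": 0.5,
}


def span(steps, freq_minutes):
    """Wall-clock duration covered by `steps` points at the given frequency."""
    return timedelta(minutes=steps * freq_minutes)


horizon = span(deepar_config["prediction_length"], deepar_config["freq_minutes"])
context = span(deepar_config["context_length"], deepar_config["freq_minutes"])
print(horizon, context)
```

With these values the model looks back over 26 hours and 40 minutes of history to forecast the next 13 hours and 20 minutes.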
The experiments were performed on a desktop computer equipped with an Intel Corporation Xeon E3-1200 v6/7th Gen Core Processor with 32 GB RAM and a 240 GB SSD, running Ubuntu 18.04.
To gauge the effect of covariates, also known as dynamic features, from Table 3, we employed an iterative approach for training and inference, integrating each covariate one by one. Our tests spanned a variety of date and time targets. We empirically observed a consistent enhancement in our prediction metrics as we incorporated more features.
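Integrating covariates one by one amounts to a greedy forward selection. The sketch below is a minimal illustration under stated assumptions: `evaluate` is a hypothetical callback that would train a model on a feature subset and return an error, and it is replaced here by a toy scoring function. The feature names are placeholder labels, and the option to retire a feature after the first round mirrors the trial count reported in the next paragraph.

```python
def greedy_forward_selection(features, evaluate, drop_after_first_round=()):
    """Add one covariate per round, keeping the one with the lowest error."""
    selected, remaining, trials = [], list(features), 0
    while remaining:
        scores = {}
        for candidate in remaining:
            scores[candidate] = evaluate(selected + [candidate])
            trials += 1
        best = min(scores, key=scores.get)
        selected.append(best)
        remaining.remove(best)
        if len(selected) == 1:
            # Features judged redundant after round one are retired here.
            remaining = [f for f in remaining if f not in drop_after_first_round]
    return selected, trials


# Toy usefulness weights (made up); a real run would train and score a model.
weights = {
    "eth_price": 3.0, "matic_price": 0.0, "tx_values": 2.5, "committed_tx": 2.0,
    "token_transfers": 1.5, "erc20_events": 1.0, "gas_used": 0.5,
}


def toy_error(subset):
    """Hypothetical error: lower when more useful features are included."""
    return 10.0 - sum(weights[name] for name in subset)


order, trials = greedy_forward_selection(
    weights, toy_error, drop_after_first_round=("matic_price",)
)
print(order, trials)
```

Greedy selection needs far fewer training runs than testing every subset of seven features, at the cost of possibly missing a combination that only works well together.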
We constructed the model and generated forecasts for a duration of 13 hours and 20 minutes by utilizing historical gas prices and the feature data displayed in Table 3. This was accomplished through a greedy approach. Since using the MATIC prices appeared to be redundant when used in tandem with Ξ prices, we dropped the former from subsequent tests. Once we found the feature with the best results, we ran the training and inference with the remaining five features and again selected the one giving the best results. We continued the process for the rest of the features, thus performing a total of 7 + 5 + 4 + 3 + 2 + 1 = 22 trials. Given $y_i$ as the observed value, $\hat{y}_i$ as the predicted value and $n$ as the number of samples, we computed the following metrics using the sklearn package.
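(1) Mean Absolute Error (MAE):

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i|$$

(2) Mean Squared Error (MSE):

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2$$

(3) Quantile Loss (QL) for a given quantile $q$ is defined as

$$\mathrm{QL}_q(y_i, \hat{y}_i) = \max\big(q\,(y_i - \hat{y}_i),\ (q - 1)\,(y_i - \hat{y}_i)\big)$$

This value is averaged across all predictions. We compared the values obtained for the following quantiles: $q \in \{0.1, 0.5, 0.9\}$.

Results and discussion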

Evaluation of experimental outcomes
The results of our experiments are graphically presented in Figures 4-6, which show the dynamics of the prediction when using various sets of dynamic features. The actual numerical values obtained for the performance metrics are presented in Table 4. Note that, to calculate the RMSE metric, we used Mean Normalization, re-scaling the test and predicted data to have a mean of 0 and a variance of 1 (Figures 7 and 8).
We show empirically that the usage of covariates has a positive influence on the quality of the prediction. To do so, we performed an experiment by generating predictions for 30 distinct time periods with 0 and 5 covariates, respectively. We aggregated the sample data for every 5 minutes and performed predictions of 3 h 20 m every 9 h 20 m between 7 September and 18 October 2021. The result can be seen in Table 5.
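The error metrics and the re-scaling step can be sketched in plain Python. This is an illustration only: the quantile loss uses the common pinball form, the re-scaling is assumed to standardise each series with its own mean and population standard deviation, and the sample values are made up.

```python
from statistics import mean, pstdev


def mae(y, y_hat):
    """Mean absolute error."""
    return mean(abs(a - b) for a, b in zip(y, y_hat))


def mse(y, y_hat):
    """Mean squared error."""
    return mean((a - b) ** 2 for a, b in zip(y, y_hat))


def rmse(y, y_hat):
    """Root mean squared error."""
    return mse(y, y_hat) ** 0.5


def quantile_loss(y, y_hat, q):
    """Pinball loss for quantile q, averaged over all predictions."""
    return mean(max(q * (a - b), (q - 1) * (a - b)) for a, b in zip(y, y_hat))


def standardise(values):
    """Re-scale to mean 0 and variance 1 (assumed per-series statistics)."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


y = [100.0, 120.0, 90.0, 110.0]       # toy observed values
y_hat = [110.0, 115.0, 95.0, 100.0]   # toy predictions
print(mae(y, y_hat), mse(y, y_hat), quantile_loss(y, y_hat, 0.9))
print(rmse(standardise(y), standardise(y_hat)))
```

For q = 0.5 the pinball loss reduces to half the absolute error, while q = 0.1 and q = 0.9 penalise over- and under-prediction asymmetrically.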

Analysis and interpretation
Time series forecasting has carved its niche as a pivotal tool in the commercial landscape, extending a diverse array of frameworks for deployment by data scientists. Nevertheless, it is paramount to acknowledge that no single framework acts as a 'one-size-fits-all' solution. This point is emphasized by the empirical work of Žunić et al. (2021), which demonstrated that DeepAR (AWS) models exhibit superiority over traditional methods when a large volume of signals is available for model development, especially in cases with brief data history. Given the wealth of covariates associated with cryptocurrency prices, including Ethereum (Ξ), the possibility of generating precise predictions with a shorter historical basis is viable. This capability is particularly beneficial for low-power devices, such as those operating on the Internet of Things (IoT), which require optimal transaction timing on the blockchain. Our analysis spanned the period from 1 January 2021 to 18 October 2021. During this interval, gas prices varied between 10 and 4315 GWEI, a staggering 431.5-fold difference between the lowest and the highest price.
At the prevailing exchange rate of 1 Ξ to 4189 USD, this volatility highlights the significant financial implications of transaction timing and emphasizes the critical importance of accurate predictive mechanisms.
Despite scope for further optimization, our experiment has underscored the efficacy of probabilistic forecasting through DeepAR. In the face of a noisy dataset, accurate predictions were achieved, necessitating minimal feature engineering. Interestingly, our model required normalization of features for convergence, contradicting some literature. Importantly, the incorporation of dynamic features led to an enhancement in predictive metrics, as illustrated in Table 4. The precision of DeepAR's Monte Carlo sampling-based quantile estimates, as depicted in the figures, holds considerable potential for practical applications. By contrast, Prophet was unable to predict the price spike after 6 p.m. on the test day (Figure 8), which is reflected in the sudden increase in RMSE around that time.
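Sampling-based quantile estimates of the kind mentioned above can be sketched as follows: given a set of sampled forecast paths, the quantile at each step is read off the sorted samples for that step. The paths below are made-up toy numbers rather than model output, and the nearest-rank rule is one simple choice among several.

```python
def empirical_quantile(samples, q):
    """Nearest-rank style quantile of a collection of sampled values."""
    ordered = sorted(samples)
    index = min(int(q * len(ordered)), len(ordered) - 1)
    return ordered[index]


def quantile_forecast(sample_paths, quantiles=(0.1, 0.5, 0.9)):
    """sample_paths: equally long sampled paths; returns {q: value per step}."""
    steps = list(zip(*sample_paths))  # regroup the samples by time step
    return {q: [empirical_quantile(step, q) for step in steps] for q in quantiles}


# Ten toy sample paths over two forecast steps (illustrative only).
paths = [[float(k), float(2 * k)] for k in range(10)]
forecast = quantile_forecast(paths)
print(forecast)
```

The band between the 0.1 and 0.9 quantiles gives a user a range of plausible gas prices rather than a single point, which is what makes a probabilistic forecast useful for timing a transaction.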

Conclusion and future directions
Our empirical exploration of Ethereum gas price prediction utilizing DeepAR supports the assertion that strategic selection of covariates can substantially bolster model performance. Influences on gas prices have been identified to include seasonal variations, transaction volumes, transaction values, the number of token transactions, Ξ price, and the amount of gas used per block. Similarly, gas prices, akin to cryptocurrencies, are indirectly affected by external factors such as market trends, regulatory developments and investor sentiment.
In the future, our focus will lie in the discovery and identification of additional influential features that could augment the performance of our model. Notwithstanding the shift from proof-of-work in Ethereum, the insights obtained from this study could be instrumental in other facets of blockchain and cryptocurrency. The possible applications of this solution are vast, potentially extending to token price prediction, offering valuable preemptive insights to regulatory bodies prior to significant market events.
The concept of developing an automated system capable of processing a wide array of on-chain and off-chain feeds, identifying correlations, and subsequently selecting an optimal set for model generation presents an exciting direction for future research. Given DeepAR's superior performance over Facebook's Prophet with a smaller dataset, as indicated by Žunić et al. (2021), exploring the potential for deploying our models on low-power connected devices holds significant promise.

Figure 2. Snapshot of the normalized data used for the experiment.

Figure 3. Relationship strength: the absolute value of the correlation coefficient. Lighter colours mean a stronger relationship. Clearly, MATIC Prices are the least related to Gas Prices, as we also established empirically.

Figure 5. Prediction with three dynamic feature inputs: Transaction Values, Committed Transactions and Gas Used.

Figure 6. Prediction with five dynamic feature inputs: Ξ prices, Transaction Values, Committed Transactions, Token Transfers and Gas Used.

Figure 7. Prediction for the same time interval using a time series model generated with Facebook's Prophet. The blue line is the y value. The green line is ŷ, similar to Figure 6.

Figure 8. Performance metrics (RMSE) of the Prophet model for the testing timeframe (27 hours). Prophet was unable to predict the price spike after 6 p.m. that day. This is reflected in the sudden increase in RMSE around that time.

Table 1. A block in the Ethereum blockchain has a header with these fields.

(Salinas et al., 2020) show an improvement on standard metrics of up to 15% compared to state-of-the-art methods, such as Facebook's Prophet. Deep Autoregressive Networks (DARNs) are a type of generative neural network used for modelling sequential or time-series data. DARNs are 'generative' just like Generative Adversarial Networks (GAN; Goodfellow et al., 2014), 'deep' because they use multiple layers of neural networks, and 'autoregressive' because they predict future values based on past values in a sequence. DeepAR uses specific types of RNN, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), to model time-series data. Both architectures are designed to help fix the vanishing gradient problem in standard RNNs

Figure 1. DeepAR model training (Salinas et al., 2020) with inputs $z_{i,t}$, covariates $x_{i,t}$, outputs $h_{i,t}$, and the likelihood $\ell(z_{i,t} \mid \theta_{i,t})$.

Table 3. Features used for training and inference (minute-by-minute intervals).

Table 4. Comparison of metrics by number of dynamic features involved.

Table 5. Averaging of predictions for 30 distinct time periods with and without dynamic features.
is paramount to acknowledge that no single framework acts as a 'one-size-fits-all' solution. This point is emphasized by the empirical work of Žunić et al. (2021), which demonstrated that DeepAR (AWS) models exhibit superiority over traditional methods when a large volume of signals is available for model development, especially in cases with brief data history.