Forecasting salmon market volatility using long short-term memory (LSTM)

Abstract Forecasting salmon market volatility is crucial for reducing future uncertainty for market participants. This study explores the efficacy of the Long Short-term Memory (LSTM) network, a deep learning technique, in forecasting multi-step ahead salmon market volatility. The performance of the LSTM is assessed against a constructed volatility proxy and the Autoregressive Moving Average (ARMA) model, a traditional benchmark in time-series analysis. Evaluation is performed across various forecasting horizons using different forecast error measures. Our findings indicate that the ARMA model outperforms the LSTM in predicting salmon market volatility, suggesting that any non-linear patterns in the salmon market volatility might be too insignificant for an LSTM model to exploit effectively. However, we observed a significant discrepancy between the actual volatility values and the forecasts obtained by both models, indicating the complexity of accurately predicting salmon market volatility.


Introduction
Salmon is a volatile commodity, with the increased demand over recent decades resulting in high salmon prices and increased volatility, especially since the mid-2000s (Asche et al., 2019; Bloznelis, 2016; Guttormsen, 1999; Oglend, 2013). This high price volatility presents challenges for all market participants, including farmers, processors, traders, and intermediaries. Farmers, who control production, can adjust harvesting to maximize their profits or to meet biomass limitations (Asche, 2008; Forsberg & Guttormsen, 2006; Guttormsen, 2008).
As the industry evolves, with aquaculture becoming more industrialized, firms are increasing their scale of production and capital intensity. This shift does not affect all firms uniformly, with some seeking public listing at stock exchanges. This offers improved access to cheaper financing but also alters profitability dynamics, with listed companies potentially benefiting from working capital optimization, yet also facing heightened risks from operating leverage and liquidity (Sikveland et al., 2021). This increasing industrialization and capitalization have also pulled a growing number of investors into the salmon aquaculture industry, further highlighting the importance of climate-related financial disclosure. Firms that comply with this increased demand for transparency do not just minimize their carbon impact; they also attract a growing number of environmentally conscious investors, further demonstrating the crucial role of climate change consciousness in the industry (Zitti & Guttormsen, 2022).
This complex interplay of industrialization, capitalization, and environmental consciousness has been further complicated by the rapid growth in aquaculture, which now accounts for roughly half of the global seafood supply (Asche et al., 2022), a dynamic sector shaped strongly by government policies in terms of geographic distribution, species types, technology, management practices, and infrastructure (Naylor et al., 2023). This rapid growth and government involvement in aquaculture can lead to fluctuations in the salmon spot price and, in turn, increased volatility. On the other hand, processors face thin profit margins and customers demanding lower prices, operating based on their expectations of the future salmon spot price (Bergfjord, 2007; Kvaløy & Tveterås, 2008).
There are hedging opportunities available in the salmon market to mitigate this volatility; however, they remain thin due to a lack of speculative traders (Andersen & de Lange, 2021; Asche et al., 2016; Ewald et al., 2022). Fish Pool, a futures exchange for salmon, was established in 2006 to provide these hedging opportunities. However, its role has been controversial among aquaculture economists, with some arguing that its launch increased salmon price volatility (Bloznelis, 2016). Further studies found that shorter futures contracts contribute to more volatility (Ankamah-Yeboah et al., 2017), that futures prices are efficient in the long run but not in the short run (Andersen & de Lange, 2021), and that Fish Pool futures can be considered a hedging instrument but not an investment asset (Ewald et al., 2022). In support of Fish Pool, other researchers found that the contract settlement price used is representative of salmon transaction prices (Oglend & Straume, 2019), and that stock prices reflect salmon price information earlier than the Fish Pool Index (Dahl et al., 2021). Additionally, studies on Norwegian salmon export transactions suggest a high rate of price revisions and an informative salmon price index, indicating that price revisions are more likely when transaction prices are below the reference price in the market (Oglend et al., 2022). Despite these debates, the salmon futures market is characterized by low liquidity, with infrequent trades that account for less than 10% of the physical market volume (Fish Pool, 2020).
Given these challenges, this study aims to fill a critical gap in the literature by implementing a volatility forecasting model that could significantly aid salmon market participants and provide valuable insights for both the academic field and practitioners. We examine whether further exploration of neural networks for forecasting salmon market volatility is warranted, or whether reliance on traditional time-series forecasting models suffices. Despite several studies analyzing salmon price volatility, there has been limited effort towards forecasting this volatility (Asche et al., 2015; Asche et al., 2019; Asche & Oglend, 2016; Bloznelis, 2016; Dahl & Oglend, 2014; Dahl & Jonsson, 2018; Oglend, 2013; Oglend & Sikveland, 2008; Solibakke, 2012; Steen & Jacobsen, 2020).
Deep learning techniques such as neural networks have shown great potential in forecasting financial data, including commodity prices (Hamid & Iqbal, 2004; Manogna & Mishra, 2021; Verma, 2021; Xu & Zhang, 2021, 2022). Among these techniques, Recurrent Neural Networks (RNNs) have been found particularly suitable for predicting financial market volatility due to their ability to learn temporal dependencies in time-series data (Selvin et al., 2017). Specifically, Long Short-Term Memory (LSTM) networks, a type of RNN capable of capturing long-term dependencies, have yielded promising results in various forecasting tasks (Kim & Won, 2018; Nelson, 1991). Despite these promising results, the application of LSTM networks to forecasting salmon market volatility remains unexplored.
In this study, we implement LSTM networks to forecast salmon market volatility, filling both a gap in the literature and a practical need for robust forecasting models in the salmon industry. Our approach involves creating a volatility proxy based on the standard deviation of the logarithmic returns over a rolling 4-week window. This proxy, although a mere estimation of volatility, is forecast using the ARMA model, against which the forecasting ability of the LSTM is assessed. Each model's forecasting performance is evaluated under different, multi-step ahead forecast horizons using various forecast error measures, with the model yielding the lowest errors deemed the best performing one. The expected value of the forecast losses generated by each model is compared using the Diebold-Mariano (DM) test to assess robustness, and the forecasts obtained by the LSTM are examined for significant difference against actual volatility values.
Our study is structured as follows: Section "Methodology" discusses the specifications of neural networks and the LSTM model. Section "Data" demonstrates seasonality and structural changes in the salmon spot price series. Section "Measurement and model assessment" outlines the statistical metrics and tests used to evaluate the forecasting performance of the LSTM model compared to a benchmark model. Section "Proposed models" describes the application of the proposed models and Section "Empirical results" reports their results. Finally, Section "Concluding remarks" concludes and discusses potential future research directions.

Neural networks (NN)
NNs are flexible, non-linear methods that connect a set of input variables $\{x_t^i\}$, $i = 1, \dots, n$, where $n$ is the number of inputs, to an output $\{\tilde{y}_t\}$. They consist of three different types of layers: the input, the hidden, and the output layers. The input layer includes the input nodes, and each node represents a different variable. When applied to univariate time-series data, a lagged version of the data is used, and the nodes correspond to each lag. The output layer is usually formed with one node, which represents the output of the NN. The hidden layer(s), $\{h_t^i\}$, $i = 1, \dots, m$, where $m$ is the number of nodes, separate the input from the output layer and define the amount of complexity the model is capable of fitting. The number of hidden layers and the number of nodes in each layer are based on the complexity of the model under study and a trial-and-error approach. An illustration of a type of NN is presented in Figure 1.
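To make the layer structure concrete, a minimal numpy sketch of one forward pass through a single-hidden-layer network on lagged inputs is given below. The weights are random placeholders rather than trained values, and the tanh activation is an illustrative choice.

```python
import numpy as np

def nn_forward(x_lags, W_h, b_h, W_o, b_o):
    """One forward pass: n lagged inputs -> m hidden nodes -> 1 output."""
    h = np.tanh(W_h @ x_lags + b_h)   # hidden-layer activations
    return W_o @ h + b_o              # single output node

rng = np.random.default_rng(0)
n, m = 4, 8                           # 4 input lags, 8 hidden nodes
W_h, b_h = rng.normal(size=(m, n)), np.zeros(m)
W_o, b_o = rng.normal(size=(1, m)), np.zeros(1)
y_hat = nn_forward(np.array([0.1, -0.2, 0.05, 0.0]), W_h, b_h, W_o, b_o)
```

In practice the weights would be fitted by backpropagation; the sketch only shows how lagged observations flow through the layers.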

Recurrent neural network (RNN)
The RNN, also known as the Elman network (Elman, 1990), is a class of ANNs that incorporates a recurrent hidden state, consisting of hidden layers whose activation after each iteration depends on previous states and the current input. The RNN's defining feature is a "memory" mechanism that saves a copy of the previous values of the layer containing the recurrent nodes and uses them as an additional input for the next step (Makridakis et al., 2018). The weights of an RNN determine how much significance to give to the present input and the past hidden state. The weights are adjusted via backpropagation until the error function (SSE) is minimized. The network's "memory" feature allows it to exhibit dynamic temporal behavior. The illustration in Figure 2 portrays this procedure.

Long Short-term memory (LSTM)
The training process of RNNs suffers from the vanishing gradient problem (Bengio et al., 1994). RNNs are only able to store short-term information and have difficulty carrying information over longer periods. Hochreiter and Schmidhuber (1997) developed the Long Short-term Memory (LSTM) as a solution. It is an advanced type of recurrent neural network and is applied in a number of different areas (e.g. handwriting recognition and speech recognition, see Graves et al., 2013). LSTM is a network architecture that, in combination with an appropriate gradient-based algorithm, can use memory cells and gates to store information for long periods. Gers et al. (2000) specify that the cell state is the core of the LSTM model because it represents the "memory" feature. Put simply, the cell state behaves like a "transport line" that captures and stores information. The information passing through the cell state is filtered by the gates. The gates have the power to add or remove information to and from the cell state.
There are three gates: the forget gate, the input gate, and the output gate. The forget gate takes as inputs information carried from the last hidden state, $h_{t-1}$, and the current input, $x_t$. It passes these inputs through a sigmoid function that returns values between zero and one, where zero means "nothing goes through" and one "everything goes through". The output of the first gate, $f_t$, is:

$$f_t = \sigma(W_f [h_{t-1}, x_t] + b_f) \quad (1)$$

The input gate decides which information to store in the cell state. This gate has two parts. First, it also receives inputs $h_{t-1}$ and $x_t$ and passes them through a sigmoid function:

$$i_t = \sigma(W_i [h_{t-1}, x_t] + b_i) \quad (2)$$

Next, it passes the same inputs through a hyperbolic tangent function that creates a vector of new information:

$$\tilde{C}_t = \tanh(W_C [h_{t-1}, x_t] + b_C) \quad (3)$$

The two outputs, $i_t$ and $\tilde{C}_t$, are combined and added to the cell state. The current period cell state is created using the outputs from the first and second gates as follows:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \quad (4)$$

The forget gate's output, $f_t$, is multiplied with the information carried from the previous period cell state, $C_{t-1}$. Then, the product from the input gate, $i_t \odot \tilde{C}_t$, is added and the new cell state, $C_t$, is created.
The output gate also receives inputs $h_{t-1}$ and $x_t$ and passes them through a sigmoid activation. The output, $o_t$, is denoted as follows:

$$o_t = \sigma(W_o [h_{t-1}, x_t] + b_o) \quad (5)$$

In the meantime, the cell state, $C_t$, passes through a hyperbolic tangent function and the output is multiplied with $o_t$ to decide which information the new hidden state, $h_t$, should carry to the next period:

$$h_t = o_t \odot \tanh(C_t) \quad (6)$$

The structure of the LSTM layer is shown in Figure 3. It is evident from the figure that the three sigmoid functions, $\sigma$, and the hyperbolic tangent function, $\tanh$, control the three gates. Element-wise multiplication is denoted as $\odot$ and addition as $+$. Overall, the LSTM updates the cell state from $C_{t-1}$ to $C_t$, filters important and non-important information via the three gates, and generates $h_t$.
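The gate operations described above can be written as a compact numpy sketch of a single LSTM time step. The weight matrices and biases here are random, untrained placeholders; each acts on the concatenation of $h_{t-1}$ and $x_t$, following the standard formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, C_prev, W, b):
    """One LSTM time step: forget, input, and output gates plus cell update."""
    z = np.concatenate([h_prev, x_t])           # [h_{t-1}, x_t]
    f_t = sigmoid(W["f"] @ z + b["f"])          # forget gate
    i_t = sigmoid(W["i"] @ z + b["i"])          # input gate
    C_tilde = np.tanh(W["C"] @ z + b["C"])      # candidate new information
    C_t = f_t * C_prev + i_t * C_tilde          # updated cell state
    o_t = sigmoid(W["o"] @ z + b["o"])          # output gate
    h_t = o_t * np.tanh(C_t)                    # new hidden state
    return h_t, C_t

rng = np.random.default_rng(1)
n_in, n_hid = 1, 5
W = {k: rng.normal(scale=0.1, size=(n_hid, n_hid + n_in)) for k in "fiCo"}
b = {k: np.zeros(n_hid) for k in "fiCo"}
h, C = np.zeros(n_hid), np.zeros(n_hid)
h, C = lstm_step(np.array([0.3]), h, C, W, b)
```

Because $h_t$ is the product of a sigmoid output and a tanh of the cell state, its elements are bounded between −1 and 1.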

Data
Our analysis employs weekly salmon spot prices sourced from the NASDAQ Salmon Index. This data extends from week 27 of 2007 until week 51 of 2019. The prices represent averages for all weight classes and are given in Norwegian Krone (NOK) per kilogram (kg).
The decision to use spot prices as opposed to futures contracts prices was informed by the relatively low liquidity and infrequent trades of the salmon futures market (Bergfjord, 2007; Bloznelis, 2018a). Further supporting this choice, Asche et al. (2016) discovered that innovations in the spot price impact futures prices, suggesting that futures prices do not provide a price discovery function. This is an expected finding considering the immaturity of the salmon futures market and provides a compelling rationale for choosing the salmon spot prices over the futures prices in our volatility forecasting study.
An important note on the choice of our dataset timeframe is necessary here, as we have intentionally chosen not to include data from 2020 onwards. The primary reason for this decision is the onset of the COVID-19 pandemic, which introduced a structural breakpoint in the data and significantly disrupted market dynamics. The pandemic represents an outlier event that, while undeniably impactful, may not be indicative of typical market behavior. Including data from this period could skew our model and undermine the reliability of the forecasts by potentially overfitting to the exceptional conditions brought on by the pandemic. Consequently, limiting our analysis to pre-2020 data allows us to avoid this issue and provide forecasts that are grounded in more standard market conditions, therefore offering a more reliable prediction of typical salmon market volatility.

Seasonality
Before obtaining the logarithmic return series, we considered the seasonal patterns in salmon spot prices. As a result of factors related to supply and demand, a key characteristic of salmon production is seasonality. Seasonality in supply does not match the seasonality in demand, and that generates seasonal patterns in the salmon price. Modeling seasonality in a weekly time-series is complicated. In the existing literature, the most common technique is the Fourier series, that is, sums of trigonometric functions (Bloznelis, 2016, 2018b; Oglend, 2013). Here, we follow a technique introduced by Hyndman (2014) and also applied by Bloznelis (2018b), which uses a regression with ARMA errors, having Fourier terms as regressors. The number of Fourier terms could be up to 26 pairs for weekly data. However, the number of Fourier terms for the fitted model was selected by minimizing the Akaike information criterion (AIC), choosing between none and 26 pairs; the same applies to selecting the order of the ARMA model (Hyndman & Athanasopoulos, 2018). The salmon spot price exhibits large movements around Christmas and Easter, as shown by Bloznelis (2018b). Therefore, we also incorporate the possibility of deterministic seasonality by adding four dummy variables to specify the weeks before and after Christmas (including Christmas week) and four more to specify the weeks before and after Easter (including Easter week). The deterministic seasonality is expressed by means of these eight seasonal dummy variables. The combination of the Fourier terms and these eight seasonal dummy variables forms the seasonal component, which is subtracted from the data before proceeding. The seasonally adjusted version of the salmon spot price will be used in place of the original series without explicit reference. The development of the seasonally adjusted salmon spot price series is depicted in Figure 4.
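For illustration, the Fourier regressors used in such a regression with ARMA errors can be generated as below. The period of 52.18 weeks per year is a common convention for weekly data and, like the choice of three pairs, is an assumption made only for this sketch.

```python
import numpy as np

def fourier_terms(n_weeks, n_pairs, period=52.18):
    """Fourier seasonality regressors: sin/cos pairs of increasing
    frequency, to be used as exogenous terms alongside ARMA errors."""
    t = np.arange(1, n_weeks + 1)
    cols = []
    for k in range(1, n_pairs + 1):
        cols.append(np.sin(2 * np.pi * k * t / period))
        cols.append(np.cos(2 * np.pi * k * t / period))
    return np.column_stack(cols)

X = fourier_terms(n_weeks=652, n_pairs=3)   # one column per sin/cos term
```

In the article, the number of pairs (from none up to 26) and the ARMA error order are chosen jointly by minimizing the AIC.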
Given that our aim is to forecast salmon market volatility, the parametric elimination of seasonality prior to forecasting proves unfeasible. This stems from the fact that, based on the information available at the time of forecasting, we cannot predict the precise timings and occurrences of future seasonal patterns, making the removal of seasonality to use the seasonally adjusted series for forecasting inappropriate. Ideally, seasonality should be considered only up to the point of initiating forecasting; specifically, it should be controlled only within the in-sample (training) data and not extended to the out-of-sample (testing) data. Nevertheless, the influence of the seasonal component is minimal (refer to Figure A1): the original series has a standard deviation of 14.703, while the seasonally adjusted series has a standard deviation of 14.339. Since we determined the seasonal component to be minimal, we opted to forgo the seasonal adjustment and conduct our volatility analysis and forecasting on the original, unadjusted series. These minor differences between the two series can be observed in Figures 4 and 5.
Moreover, we apply a logarithmic transformation to the original spot price series, as is common in volatility analysis and forecasting. The gross return from week to week is $Y_t = W_t / W_{t-1}$, where $W_t$ is the spot price at time $t$ (the current week's price) and $W_{t-1}$ is the spot price at time $t-1$ (the previous week's price). To account for proportional changes in the returns, we apply a logarithmic transformation to the price ratio, $r_t = \ln(W_t / W_{t-1})$. Figure 6 presents the development of the logarithmic returns.
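The return transformation takes only a couple of lines; the price values below are hypothetical, for illustration only.

```python
import numpy as np

# Hypothetical weekly spot prices in NOK/kg
prices = np.array([60.0, 62.5, 58.0, 59.3, 61.0])
log_returns = np.log(prices[1:] / prices[:-1])   # r_t = ln(W_t / W_{t-1})
```

A series of $k+1$ prices yields $k$ logarithmic returns.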

Structural changes
We choose to analyze the sample period starting from 2007 week 27 based on the findings of Bloznelis (2016), who estimated the salmon spot price volatility of different salmon weight classes and found a structural breakpoint in the logarithmic returns series, examining the periods from 1996 week 1 until 2005 week 45 and from 2007 week 27 until 2013 week 13. From Figure 6 there is a noticeable change in the variability of the logarithmic returns from mid-2012 onward. Therefore, to test whether the variability of the logarithmic returns differs before mid-2012, we split the sample into two periods: one from 2007 week 27 until 2012 week 17 and one from 2012 week 18 until 2019 week 51. We use an F-test to examine whether the two sub-samples have equal variance. The F-test assumes that the two sub-samples are independent of each other and hence independent across time. For simplicity of the analysis we ignore any potential time dependence at this point, assuming that it is not strong enough to invalidate the results. The second main assumption of the F-test is that the two sub-samples must be normally distributed. We test for normality of each of the two sub-samples using the Shapiro-Wilk normality test and find that they are likely normally distributed; hence we are not violating the normality assumption of the F-test. The results strongly reject that the two sub-samples have equal variances, with p-values well below 0.01. However, as we are interested in forecasting using an in-sample (train) and a holdout (test) sample, forecasting using two sub-samples, before and after the breakpoint, is not feasible. Therefore, for forecasting salmon returns volatility we use the sample from 2012 week 18 until 2019 week 51. This is the main sample which we refer to throughout the remainder of the article.
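The variance comparison can be reproduced as follows, assuming scipy is available. The two sub-samples below are synthetic stand-ins for the pre- and post-break return series, not the actual data.

```python
import numpy as np
from scipy import stats

def equal_variance_F(x, y):
    """Two-sided F-test for equal variances of two independent samples."""
    F = np.var(x, ddof=1) / np.var(y, ddof=1)
    dfn, dfd = len(x) - 1, len(y) - 1
    p_one = stats.f.sf(F, dfn, dfd) if F > 1 else stats.f.cdf(F, dfn, dfd)
    return F, min(1.0, 2 * p_one)          # two-sided p-value

rng = np.random.default_rng(2)
early = rng.normal(0, 0.05, 250)   # stand-in for 2007w27-2012w17 returns
late = rng.normal(0, 0.08, 400)    # stand-in for 2012w18-2019w51 returns
F, p = equal_variance_F(early, late)
W_stat, p_norm = stats.shapiro(early)   # normality check per sub-sample
```

As in the article, the Shapiro-Wilk check on each sub-sample should precede the F-test, since the F-test assumes normality.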

Descriptive statistics
Table 1 shows descriptive statistics such as mean, standard deviation, skewness, and kurtosis of the salmon spot prices and their logarithmic returns, as well as the results of the augmented Dickey-Fuller (ADF) test, a unit-root test, and the Shapiro-Wilk test, a normality test. The ADF test was used to assess the stationarity of the time-series (Dickey & Fuller, 1979). For the ADF test, a negative value whose absolute value exceeds the critical value suggests stationarity. From the ADF statistic in Table 1, we infer that the spot price series contains a unit root at all significance levels (1%, 5%, and 10%), implying it is non-stationary, while the logarithmic returns series does not contain a unit root and is thus stationary. Figure 6 illustrates the logarithmic returns, where the stationarity of the series can be observed.
The logarithmic returns are centered at zero with a standard deviation of 6.4%. As indicated by the Shapiro-Wilk test statistic of 0.996, the logarithmic returns closely align with a normal distribution. Although commodity prices, including salmon, often exhibit asymmetry and non-normal distributions, our data reveal different characteristics. The skewness of the returns is slightly above zero, hinting that large positive returns might be marginally more common than large negative ones. However, this near-zero value suggests a symmetric distribution. The balance between the largest negative return (12%) and the largest positive return (17.1%) corroborates this symmetry, thereby indicating normality in the returns distribution.
Moreover, the kurtosis indicator is less than zero, a sign of platykurtosis. This implies that the returns are less densely populated in the tails and more concentrated around the mean than one would expect from a normal distribution. Given its relative proximity to zero, we interpret this platykurtosis as insignificant, reinforcing our assumption of normality. In the context of the LSTM and ARMA models, both of which are applied to forecast the volatility of the salmon market in this study, the absence of significant asymmetry and non-normality in our data offers a less complex environment for these models to capture the underlying patterns. As such, the characteristics of our data provide a more straightforward foundation for evaluating the effectiveness of the LSTM and ARMA models in predicting salmon market volatility.

Volatility measure
For a measure of volatility, which serves as the target value for the supervised learning process in the neural network (refer to Section "LSTM experiment"), we utilize the sample standard deviation of logarithmic returns, computed over 4-week intervals using a rolling approach.
The variance calculation incorporates the mean of returns from the same rolling period. Consequently, the variance proxy is calculated as follows:

$$\hat{\sigma}_t^2 = \frac{1}{T-1} \sum_{j=t}^{t+T} (r_j - \bar{r}_t)^2 \quad (7)$$

where $r_j$ represents the logarithmic return at time $j$, and $\bar{r}_t$ signifies the average of the logarithmic returns from time $t$ to $t+T$. This ensures that the mean return value utilized in the variance calculation is derived from the same period for which the variance is computed. The volatility is estimated using a rolling-window approach, reducing the length of the logarithmic returns to $(k-4)$, where $k$ represents the length of the logarithmic return series.
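A minimal numpy sketch of the rolling volatility proxy follows. Note that the exact number of retained observations depends on the indexing convention: the version below yields $k-3$ values for a 4-week window, slightly different from the $k-4$ reported in the text; the synthetic returns are placeholders.

```python
import numpy as np

def rolling_vol(returns, window=4):
    """Rolling sample standard deviation of log-returns (volatility proxy).
    Uses the window's own mean, as in the variance proxy definition."""
    r = np.asarray(returns)
    return np.array([r[t:t + window].std(ddof=1)
                     for t in range(len(r) - window + 1)])

rng = np.random.default_rng(3)
r = rng.normal(0, 0.064, 100)          # synthetic weekly log-returns
vol = rolling_vol(r, window=4)
```

This rolling series is the target variable for both the ARMA benchmark and the LSTM.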
Even though implied volatility would serve as an ideal forward-looking measure, the absence of a robust market for options contracts on spot prices at Fish Pool requires the employment of historical volatility measures.
Despite this constraint, we are confident that the volatility proxy introduced in Equation (7) provides a reliable measure for our analysis.

Benchmark model: ARMA
To provide a meaningful comparison for the performance of the neural network in forecasting the volatility measure presented in Equation (7), it is crucial to employ a benchmark model. Based on insightful suggestions from a reviewer, we have chosen the ARMA model as this benchmark.
The ARMA(p, q) process that generates the volatility series $\{V_t\}_{t=1}^{a}$ is formulated as follows:

$$V_t = \mu + \sum_{i=1}^{p} \phi_i V_{t-i} + \epsilon_t + \sum_{j=1}^{q} \theta_j \epsilon_{t-j}$$

In this formulation, $V_t$ is the actual value at time $t$, and $\epsilon_t$ is the random error at the same time point. The parameter $\mu$ represents the intercept, and $\phi_i$ $(i = 1, 2, \dots, p)$ and $\theta_j$ $(j = 1, 2, \dots, q)$ are the model parameters. The quantity $p$ denotes the number of autoregressive terms, and $q$ denotes the number of random error terms, also known as moving average terms.
The selection of the lag order values $p$ and $q$ is done using the Akaike Information Criterion (AIC), a well-known method for model selection. Maximum allowable lag orders are specified as $(p_{max}, q_{max})$, and the optimal $p$ and $q$ are those for which the AIC is minimized.
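The AIC-based order selection can be sketched as follows. For brevity, this sketch fits pure AR(p) models by conditional least squares rather than full maximum likelihood, so the AIC values are approximations to those used in the article, and the simulated series is illustrative.

```python
import numpy as np

def fit_ar(y, p):
    """Fit AR(p) with intercept by conditional least squares; return
    coefficients and an approximate AIC."""
    Y = y[p:]
    X = np.column_stack([np.ones(len(Y))] +
                        [y[p - i:len(y) - i] for i in range(1, p + 1)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    n = len(Y)
    sigma2 = resid @ resid / n
    aic = n * np.log(sigma2) + 2 * (p + 2)   # intercept + p phis + sigma^2
    return beta, aic

def select_order(y, p_max=8):
    """Pick the AR order with the lowest AIC."""
    aics = {p: fit_ar(y, p)[1] for p in range(1, p_max + 1)}
    return min(aics, key=aics.get)

# Synthetic AR(2) series for illustration
rng = np.random.default_rng(4)
y = np.zeros(500)
for t in range(2, 500):
    y[t] = 0.02 + 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.normal(0, 0.01)
p_hat = select_order(y)
```

In practice a library such as statsmodels would perform the full ML estimation over the $(p, q)$ grid; the sketch only illustrates the minimize-AIC logic.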
Estimation of the ARMA model parameters is performed via the method of maximum likelihood, which aims to find the parameters that make the observed data most probable.
This ARMA benchmark model will allow us to conduct a rigorous and fair comparison of the forecasting capabilities of the neural network, thereby highlighting the strengths and potential areas of improvement in our approach.
The ARMA model is chosen for its simplicity, interpretability, and the flexibility it offers in modeling various types of temporal dependencies, making it a robust choice for a benchmark model.

Model assessment
To evaluate the accuracy of the models' forecasting performance, we employ four statistical error measures: the popular mean squared error (MSE), root mean squared error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). These measures are commonly used in forecasting studies and provide a comprehensive evaluation of the models' performance. The error measures are defined as follows:

$$\text{MSE} = \frac{1}{N} \sum_{t=1}^{N} (V_t - x_t)^2$$

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (V_t - x_t)^2}$$

$$\text{MAE} = \frac{1}{N} \sum_{t=1}^{N} |V_t - x_t|$$

$$\text{MAPE} = \frac{100}{N} \sum_{t=1}^{N} \left| \frac{V_t - x_t}{V_t} \right|$$

where $V_t$ is the actual volatility value and $x_t$ is the predicted volatility value. These measures provide a holistic assessment of prediction accuracy, taking into account both the magnitude of the errors (through the MSE, RMSE, and MAE) and their size relative to the actual values (through the MAPE).
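The four measures are straightforward to compute; the actual and forecast values below are hypothetical.

```python
import numpy as np

def forecast_errors(V, x):
    """MSE, RMSE, MAE, and MAPE for actuals V and forecasts x."""
    V, x = np.asarray(V, float), np.asarray(x, float)
    e = V - x
    mse = np.mean(e ** 2)
    return {"MSE": mse,
            "RMSE": np.sqrt(mse),
            "MAE": np.mean(np.abs(e)),
            "MAPE": 100 * np.mean(np.abs(e / V))}

m = forecast_errors([0.05, 0.06, 0.04], [0.045, 0.065, 0.05])
```

By construction the RMSE is the square root of the MSE, so the two always rank models identically; MAPE can differ because it weights errors by the level of the actual series.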

Diebold-Mariano test
The performance of any pair of forecasts can be measured utilizing the DM test (Diebold & Mariano, 2002).The null hypothesis of the test is that the expected loss due to forecast errors is equal for both forecast models, implying that both underlying models are equally accurate.The forecast errors from the two different forecasts are transformed into corresponding losses using selected loss functions.If the expected value of the loss function from each forecast is equal, the population mean of the loss differential series should be equal to zero.
For the DM test, we produce two series of volatility forecasts, $\hat{\sigma}_{i,1}, \dots, \hat{\sigma}_{i,N}$ with $i = 1, 2$, from two different forecasting models. Next, we evaluate the accuracy of these forecasts against the series of volatility proxies, $\sigma_1, \dots, \sigma_N$, with a loss function $L(\cdot)$. Because Laurent and Violante (2012) showed that loss functions such as MAE are not suitable for comparing volatility models, we choose MSE and RMSE as the loss functions for the DM test. The null hypothesis of a DM test is $E(d_t) = 0$, where $d_t = L(\hat{\sigma}_{1,t}, \sigma_t) - L(\hat{\sigma}_{2,t}, \sigma_t)$ is the loss differential sequence for a given loss function $L(\cdot)$. The DM statistic is:

$$\text{DM} = \frac{\bar{d}}{\sqrt{2\pi \hat{f}_d(0)/N}}$$

where $N$ is the sample size, $\bar{d}$ is the sample mean of the loss differential, and $\hat{f}_d(0)$ is a consistent estimate of $f_d(0)$, the spectral density of the loss differential at frequency zero.
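A minimal implementation of the DM statistic is sketched below, using the squared-error loss and a truncated autocovariance estimate of the long-run variance of the loss differential; the forecast-error series are synthetic placeholders.

```python
import numpy as np

def dm_statistic(e1, e2, h=1, loss=lambda e: e ** 2):
    """Diebold-Mariano statistic for two forecast-error series.
    The long-run variance of the loss differential is estimated from
    autocovariances up to lag h-1 (h = forecast horizon)."""
    d = loss(np.asarray(e1)) - loss(np.asarray(e2))
    N = len(d)
    d_bar = d.mean()
    gammas = [np.mean((d[k:] - d_bar) * (d[:N - k] - d_bar))
              for k in range(h)]
    lrv = gammas[0] + 2 * sum(gammas[1:])
    if lrv <= 0:                       # guard against truncation artifacts
        lrv = gammas[0]
    return d_bar / np.sqrt(lrv / N)

rng = np.random.default_rng(5)
e_model1 = rng.normal(0, 0.010, 80)    # stand-in forecast errors, model 1
e_model2 = rng.normal(0, 0.012, 80)    # stand-in forecast errors, model 2
dm = dm_statistic(e_model1, e_model2, h=4)
```

Under the null of equal expected loss, the statistic is compared against the standard normal distribution.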

Proposed models
As discussed in the introduction, the literature available on salmon price volatility mainly utilizes autoregressive financial time-series models. Deep learning techniques have not been applied before in the context of salmon price volatility. A number of academic studies investigating financial market volatility have integrated feed-forward neural network techniques and found significant evidence that they strengthen volatility predictions (Kristjanpoller & Minutolo, 2016; Roh, 2007; Tseng et al., 2008). We acknowledge these findings and aim to examine their validity for the salmon market. To do so, we assess the forecasting performance of each model individually and examine which one (if any) is able to accurately forecast salmon spot price volatility over a given forecast horizon.

Implementation of ARMA model
Initially, we calculate a proxy for volatility using the 4-week measure defined in Equation (7). Subsequently, we employ an ARMA model to forecast this volatility proxy series. Before setting up the ARMA model, we test whether the volatility proxy series contains a unit root using the ADF test. The results indicate that the series does not contain a unit root, suggesting it is stationary and that employing an ARMA model is therefore feasible. The model parameters are estimated by Maximum Likelihood (ML), and model selection is guided by the Akaike Information Criterion (AIC), with the model yielding the lowest AIC deemed the optimal fit (Akaike, 1969). Our model selection process identifies an autoregressive model of order 5, or AR(5). The selected AR(5) model is further validated by diagnosing the behavior of the residual series. The Ljung-Box test is used to confirm the linear independence of the residuals across time, while the Shapiro-Wilk normality test establishes that the residuals follow a normal distribution.
In terms of forecasting, the volatility proxy series is divided into a training set, including 80% of the data, and a testing set containing the remaining 20%.The AR(5) model is fitted on the training data, and walk-forward validation over the testing set is used to produce multi-step ahead forecasts.
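The walk-forward scheme can be sketched generically as follows. The persistence forecaster used here is only a placeholder for the fitted AR(5), and the volatility series is synthetic.

```python
import numpy as np

def walk_forward(series, train_frac=0.8, horizon=1, forecaster=None):
    """Walk-forward validation: at each test origin, forecast the next
    `horizon` values from all data observed up to that point."""
    if forecaster is None:                      # placeholder: persistence
        forecaster = lambda hist, h: np.repeat(hist[-1], h)
    split = int(len(series) * train_frac)       # 80%-20% split
    preds, actuals = [], []
    for t in range(split, len(series) - horizon + 1):
        history = series[:t]                    # data available at origin t
        preds.append(forecaster(history, horizon))
        actuals.append(series[t:t + horizon])
    return np.array(preds), np.array(actuals)

rng = np.random.default_rng(6)
vol = np.abs(rng.normal(0.05, 0.01, 120))       # synthetic volatility proxy
preds, actuals = walk_forward(vol, horizon=4)
```

Substituting a function that refits the AR(5) on `history` and returns its multi-step forecast reproduces the validation scheme described above.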
Finally, to evaluate the performance of the AR(5) model, we calculate forecast error measures. These are then compared against those of the volatility forecasts generated by the LSTM, with the AR(5) serving as the benchmark for the comparison.

LSTM experiment
The initial step in our methodology involves establishing the LSTM network, where we employ the volatility proxy series as the input variable.The series' stationarity is confirmed through the application of the ADF test, resulting in evidence against the presence of a unit root.
The LSTM deep learning method employed in this study is characterized by supervised learning, an automatic search process for superior representations. We transform the input series into a supervised learning format via a lag transformation, whereby the value at time $(t-k)$, where $k$ signifies the number of lags, represents the input variable. The input variables from time $(t-k)$ to time $t$ are then fed through an LSTM layer, pushed forward via one fully connected dense layer, and utilized to forecast volatility at $(t+n)$, where $n$ represents the forecasting horizon. The optimal number of lags was informed by the autocorrelation function (ACF). Given that the volatility series is constructed from a 4-week rolling standard deviation, it was not unexpected to observe autocorrelation up to lag 4 (see Figure A2 for more details). However, in order to fully leverage the long-term memory characteristic inherent in the LSTM model, we opted for a more extended lag length. Specifically, we set the timesteps to 52, corresponding roughly to one year's worth of data. This decision allows the LSTM to capture and utilize long-term temporal dependencies, a feature that is central to its design and operation.
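The lag transformation into a supervised format can be sketched as follows; the 52-lag window and 4-step horizon follow the text, while the input series is a placeholder.

```python
import numpy as np

def to_supervised(series, n_lags=52, horizon=1):
    """Turn a series into (X, y): each X row holds the previous n_lags
    values, each y row the next `horizon` values to forecast."""
    X, y = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        X.append(series[t - n_lags:t])
        y.append(series[t:t + horizon])
    X, y = np.array(X), np.array(y)
    return X[..., np.newaxis], y   # LSTM expects (samples, timesteps, features)

series = np.linspace(0.0, 1.0, 120)            # placeholder input series
X, y = to_supervised(series, n_lags=52, horizon=4)
```

With 120 observations, 52 lags, and a 4-step horizon, the transformation yields 65 training samples of shape (52, 1).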
Prior to training the network, both the input and output variables are scaled to the range between −1 and 1 to enhance the training process. The data series is then split into training and testing samples, with the same 80%-20% split as the benchmark model. We utilize the Random Search tuning technique for the optimization of the network's hyperparameters, with candidate hyperparameters presented in Table 2. The random search technique samples randomly from the specified pool of hyperparameters. Rather than undergoing exhaustive training and evaluation with each sampled set, the model is trained for a restricted number of iterations (epochs) based on these sampled hyperparameters. Optimal hyperparameters are then determined according to the results of these limited iterations. The Random Search method is preferred due to its effectiveness in hyperparameter tuning, particularly in scenarios with limited computational resources or time.
The selection range for the hyperparameters has been carefully chosen based on their potential impact on the LSTM model. The units in the LSTM layer represent the dimensionality of the output space, and the selection range provides the model with enough flexibility to capture complex patterns in the data. The dropout layer helps in preventing overfitting by ignoring randomly selected neurons during training, and the activation function determines the output of a neuron given an input. The learning rate influences how much the model changes in response to the estimated error each time model weights are updated, and the epsilon parameter aids in maintaining numerical stability.
As a result of the Random Search technique, the neuron specification of the input LSTM layer is set to 30, the dropout value is adjusted to 0.4, and the activation function is determined to be the hyperbolic tangent function. To further improve model convergence during training, we use kernel, recurrent, and bias regularizers. Regularization is a method used to avoid overfitting by adding a penalty term to the loss function. The penalty term grows with large weights in the model, forcing the model weights to be small, and therefore simpler. This enhances generalization and model performance on unseen data. The dense layer was specified with neurons equal to the forecasting horizon, which leads to a varying number of neurons based on the forecasting horizon n = 1, 4, 8, 12. Changing the number of neurons of the fully connected layer can impact the convergence of the neural network (see Figure A3).
We then define a loss function and an optimization algorithm. The mean squared error and "ADAM" are used as the loss function and the optimization algorithm, respectively. The "ADAM" optimization method is chosen due to its efficient performance as a stochastic optimization algorithm (Kingma & Ba, 2014). The hyperparameters for the "ADAM" optimization algorithm are also tuned with the Random Search technique, which sets the learning rate at 0.0001. The learning rate determines how much the model changes in response to the estimated error each time the model weights update. The epsilon hyperparameter, which is used for numerical stability in the "ADAM" optimizer, is also tuned during this process. Numerical stability is crucial in preventing potential divisions by zero during the optimization process, hence tuning epsilon contributes to the robustness and stability of the learning process.
The model is trained using 200 epochs, where the training sample is passed through the network to update the weights and to develop a more precise prediction model. During the training process, we employ an Early Stopping strategy, which monitors a designated metric on the validation data and halts the training procedure once the performance stops improving. Specifically, we observe the validation loss and set a patience level equal to 20 to control the number of epochs with no improvement after which training will be stopped. The necessary number of epochs for training the network can vary based on the characteristics and behavior of the underlying data series.
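Putting the tuned pieces together, the architecture and training setup can be sketched in Keras as follows. This is an illustrative sketch, not the authors' code: the L2 penalty strength and the epsilon value are assumptions, as the study reports that these regularizers and epsilon were used and tuned but does not state the final values.

```python
import numpy as np
import tensorflow as tf

def build_model(n_lags=52, horizon=4):
    """Sketch of the tuned setup: 30 LSTM units, tanh activation,
    dropout of 0.4, kernel/recurrent/bias regularizers, and a dense
    output layer with one neuron per forecast step."""
    reg = tf.keras.regularizers.l2(1e-4)   # penalty strength is an assumption
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_lags, 1)),
        tf.keras.layers.LSTM(30, activation="tanh",
                             kernel_regularizer=reg,
                             recurrent_regularizer=reg,
                             bias_regularizer=reg),
        tf.keras.layers.Dropout(0.4),
        tf.keras.layers.Dense(horizon),    # neurons equal the forecast horizon
    ])
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4, epsilon=1e-7),
        loss="mean_squared_error",
    )
    return model

# Early stopping on validation loss with a patience of 20 epochs.
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=20)
model = build_model()
# model.fit(X_train, y_train, epochs=200, validation_split=0.2, callbacks=[early_stop])
```

The `fit` call is commented out because it requires the prepared training arrays; with early stopping active, training may halt well before the 200-epoch limit.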

Empirical results
The main focus of this study has been an investigation into the predictive capability of the LSTM model regarding salmon market volatility. We have benchmarked these forecasts against the traditional ARMA model, employing various error measures to critically assess the performance of each. This section discusses the findings, emphasizing the utility of LSTM networks for volatility forecasting in the context of the salmon market.
Table 3 presents the out-of-sample multi-step ahead volatility forecasts, evaluated using MSE, MAE, RMSE, and MAPE. It is evident that the ARMA model outperforms the LSTM model across all forecasting horizons, showing lower errors on all four metrics. This holds for the 1-step-ahead forecasts, and even though the errors increase for both models at the 4-steps-ahead horizon, the ARMA model remains superior to the LSTM. The results for the 8-steps-ahead and 12-steps-ahead forecasts show a similar pattern: despite an increase in errors for both models at these longer horizons, the ARMA model consistently outperforms the LSTM. This indicates that ARMA might be a better option for forecasting short-term salmon market volatility, potentially due to its simplicity and efficiency in situations with less complex data patterns.
To further examine these results, we analyzed the percentage changes in forecasting error measures when comparing the LSTM model against the benchmark ARMA model. These changes are presented in Figure 7.
Evidently, the most significant disparities between the two models are observed in 1-step-ahead forecasting, where the LSTM model reports a 70% larger MSE score compared to the ARMA model. Although the ARMA model consistently outperforms the LSTM across all forecast horizons, we notice a diminishing divergence between the two models as the forecast horizon increases. For instance, when forecasting 12 steps ahead, the percentage difference in error measures between the two models is minimal.
As the forecast horizon extends to 12 steps, the differences in MSE, MAE, and RMSE between the two models diminish, indicating that the two models start to converge in their predictive performance. However, the MAPE value shows the largest relative discrepancy between the models at this horizon. This behavior could be attributed to the MAPE metric's sensitivity to situations where the actual observations are close to zero. Given that we are predicting market volatility, which inherently involves values that can be close to zero, it is likely that longer-term forecasts, which involve greater uncertainty and more instances of small values, would result in more pronounced relative differences in MAPE between the LSTM and ARMA models.
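A small numeric illustration of this sensitivity (the values are our own example, not data from the study): the same absolute error of 0.05 produces a ten-fold larger MAPE when the actuals sit near zero.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

high_level = mape([1.00, 1.00], [1.05, 0.95])  # actuals far from zero -> 5%
near_zero = mape([0.10, 0.10], [0.15, 0.05])   # same absolute error -> 50%
```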
In Figure 8, we illustrate the development of the forecast error metrics across the various forecasting horizons for each metric separately. It is evident that the overarching trends across all error metrics exhibit similarity. We observe a sharp surge in forecast errors when moving from a 1-step to a 4-step-ahead forecast for all metrics. However, all metrics reveal a subtle decrease in forecast errors when progressing from a 4-step to an 8-step forecast horizon, only to rise again when forecasting 12 steps ahead.
The slight dip in forecast errors for the 8-step horizon, as compared to the 4-step and 12-step horizons, could be attributed to certain temporal characteristics inherent in the salmon market volatility data. It is plausible that the dataset contains information patterns that resonate more with an 8-step-ahead forecasting cycle, possibly due to underlying economic or seasonality cycles associated with the salmon market.
Moreover, the discrepancies between the error metrics of the ARMA and LSTM models diminish progressively as the forecast horizon expands. When forecasting 12 steps ahead, the differences become rather marginal. This suggests that as we forecast further into the future, the ability of the two models to predict salmon market volatility begins to converge. In other words, both the traditional ARMA and the more advanced LSTM methods prove to be comparable in their volatility forecasting performance for longer-term predictions.
To further investigate the forecasting capabilities of each model, and to shed light on the observed drop in error measures from forecasting 4 steps ahead to 8 steps ahead, we opt to visualize the forecasts. Figure 9 showcases the predictive performance of each model at different forecasting horizons compared against the actual volatility measure (as represented by the volatility proxy in Equation (7)).
The ARMA model evidently performs better than the LSTM in capturing the spikes in the actual volatility series when forecasting 1 step ahead, which also corresponds to the forecasting horizon with the most marked differences between the two models. However, as we expand the forecasting horizon, the forecasting abilities of the two models begin to converge, and the forecasts tend to level off when forecasting 12 steps ahead, particularly for the LSTM model.
Interestingly, both models, ARMA and LSTM, report lower error metrics for 8-step-ahead forecasts than for 4-step-ahead ones (see Table 3). In Figure 9, it is evident that in the case of the ARMA model, the 4-step-ahead forecast exhibits more fluctuations, but these often run counter to the direction of the actual volatility. Its 8-step-ahead forecast, while appearing flatter, more accurately mirrors the direction of the actual volatility.
For the LSTM model, the 8-step-ahead forecast is even more flattened and fluctuates minimally, mostly moving around the mean. Despite its apparently reduced dynamism, this relatively steady, mean-reverting forecast aligns better with the actual volatility series than its 4-step counterpart, which might explain the lower error metrics at the 8-step horizon.
These visualizations suggest that, despite their differences, both models' forecasts better align with the actual volatility when forecasting 8 steps ahead, potentially due to inherent cycles in the salmon market data. While the ARMA model manages to capture the direction of volatility more accurately, the LSTM model's forecasts stay closer to the mean, providing a smoother, albeit less volatile, estimation that still manages to lower the error metrics compared to the 4-step-ahead forecasts.
To further establish the predictive ability of each model, we applied the Diebold-Mariano (DM) test, a statistical tool for comparing the predictive accuracy of two forecasting methods, over different horizons. The results are displayed in Table 4.
The DM test was conducted using two loss functions: the Mean Squared Error (MSE) and the Mean Absolute Error (MAE). The DM test statistic and corresponding p-values were calculated for each combination of loss function and horizon.
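The DM comparison can be sketched as follows. This is our own minimal implementation for illustration; truncating the long-run variance at h − 1 lags and using normal critical values are standard conventions, not details taken from the study.

```python
import numpy as np
from scipy import stats

def dm_test(e1, e2, h=1, loss="mse"):
    """Diebold-Mariano test on two forecast-error series for an
    h-step horizon. Returns the DM statistic and a two-sided p-value."""
    e1, e2 = np.asarray(e1, float), np.asarray(e2, float)
    # Loss differential under the chosen loss function.
    d = e1**2 - e2**2 if loss == "mse" else np.abs(e1) - np.abs(e2)
    n = len(d)
    # HAC (Newey-West style) long-run variance, truncated at h-1 lags.
    gamma = [d.var()] + [np.cov(d[k:], d[:n - k])[0, 1] for k in range(1, h)]
    var_d = (gamma[0] + 2 * sum(gamma[1:])) / n
    dm = d.mean() / np.sqrt(var_d)
    p_value = 2 * (1 - stats.norm.cdf(abs(dm)))
    return dm, p_value
```

Under the MSE loss, a significantly positive statistic indicates that the first model's forecasts carry larger squared errors than the second's.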
The results varied across horizons and loss functions. For a horizon of 1, neither MSE nor MAE showed a significant difference between the two forecast models, with p-values of .7848 and .6841, respectively. This trend continued for horizon 4 with the MSE loss function, with a p-value of .2167 indicating no significant difference.
However, at horizon 4 with the MAE loss function, the DM test statistic was highly significant, with a p-value of .0000. This suggests that there is a significant difference in the accuracy of the two models' forecasts at this horizon when assessed using MAE.
The DM test results were also significant for the MSE loss function at horizons 8 and 12, with p-values of .0000, again indicating a significant difference in the forecast accuracy of the two models at these horizons. In contrast, the MAE results at these horizons were not significant.
The results of the Diebold-Mariano test emphasize the substantial influence the selection of the loss function can have on the comparative evaluation of model predictions. The difference in predictive accuracy between the LSTM and ARMA models can notably fluctuate or remain consistent, depending on the chosen loss function. To ensure a more robust analysis, this study employed both the Mean Squared Error (MSE) and the Mean Absolute Error (MAE) as loss functions. These choices help incorporate the potential impacts of both squared and absolute errors on the evaluation of our forecasting models.
To conclude our analysis, we examine whether a significant difference exists between the volatility predicted by the ARMA model and the LSTM model, compared to the actual volatility values. A method similar to that used by Fritz and Berger (2015) is adopted here, involving the use of a paired sample t-test. However, prior to implementing this test, we must ensure that the forecasts produced by each model across all forecasting horizons are normally distributed. To verify this, we utilize both the Shapiro-Wilk test and Q-Q plots (see Figures A4 and A5).
Our investigations indicate that the forecasts predominantly follow a normal distribution, with the exception of the 1-step-ahead forecast generated by the ARMA model. Consequently, for this particular case, we resort to the Wilcoxon signed-rank test, a non-parametric test that does not require the forecasts to be normally distributed and is therefore suitable for the ARMA 1-step-ahead forecasts. The outcomes of these statistical tests are presented in Table 5. A clear observation from these results is that both the ARMA and LSTM models exhibit a statistically significant difference from the actual volatility values across all forecast horizons, as evidenced by p-values below the standard thresholds of .1, .05, and .01.
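The normality-gated testing procedure can be sketched as follows (an illustrative helper of our own; the function and variable names are not from the paper):

```python
import numpy as np
from scipy import stats

def compare_to_actual(forecast, actual, alpha=0.05):
    """Test whether forecasts differ significantly from actual values:
    paired t-test if the forecasts pass a Shapiro-Wilk normality check,
    otherwise the Wilcoxon signed-rank test."""
    forecast, actual = np.asarray(forecast, float), np.asarray(actual, float)
    _, p_normal = stats.shapiro(forecast)
    if p_normal > alpha:                        # forecasts look normally distributed
        stat, p = stats.ttest_rel(forecast, actual)
        return "paired t-test", stat, p
    stat, p = stats.wilcoxon(forecast, actual)  # non-parametric fallback
    return "wilcoxon", stat, p
```

A small p-value from either branch indicates a systematic gap between the forecasts and the actual volatility series.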
However, the extent of the difference varies with the forecast horizon and between the two models. For instance, the ARMA model shows a relatively larger test statistic for the 1-step-ahead forecast, suggesting a greater discrepancy between its predictions and the actual values. This difference narrows as the forecasting horizon extends, with the LSTM model showing slightly better alignment with the actual values, especially at the 12-steps-ahead forecast.
The results underscore the complex nature of volatility forecasting and the nuanced performances of the ARMA and LSTM models across different forecast horizons.

Concluding remarks
An accurate multi-step ahead salmon market volatility forecasting model holds considerable value for various participants in the salmon market. Despite the existence of hedging opportunities, these can often be limited due to the lack of speculative traders. Consequently, a predictive model for salmon market volatility, one that can reliably anticipate market fluctuations, could be a significant asset for all market participants.
To date, there has been no exploration of the use of deep-learning techniques for predicting salmon market volatility. Recognizing this gap, this study examines the application of such advanced methodologies in this domain, aiming to improve forecasting performance and provide valuable insights for stakeholders. We therefore explored and compared the forecasting capabilities of two time-series prediction models, ARMA and LSTM, with respect to predicting salmon market volatility. The analysis presented a clear, albeit complex, picture of their comparative effectiveness across different forecast horizons.
Our results indicate that the ARMA model has a slight edge over the LSTM model in terms of capturing the spikes in volatility at the 1-step-ahead forecasting horizon. However, as the forecast horizon expands, the performance differences between the two models begin to narrow, and the forecasts from both models tend to level off when forecasting 12 steps ahead.
The results indicated a reduction in error measures when transitioning from a 4-step to an 8-step-ahead forecast for both models. This behavior, further illustrated by visualizing the forecast results, suggested a potential alignment between the 8-step forecasting cycle and inherent patterns present in the salmon market volatility data. It is plausible that this may be attributed to the seasonal patterns reflected within the out-of-sample series over a mid-term future period. Although our analysis demonstrated that seasonally adjusting the salmon spot price series is not required for the purpose of this study, these findings emphasize the potential significance of seasonality when conducting mid-term forecasting studies.
To establish whether the forecasts generated by the two models were significantly different, we employed the Diebold-Mariano (DM) test. Both the Mean Squared Error (MSE) and the Mean Absolute Error (MAE) loss functions were employed to present a comprehensive understanding of the predictive accuracy of both models. The results indicated discrepancies between the two forecasts for all horizons except the 1-step-ahead forecast. As these results varied based on the underlying error metrics, we argue that the choice of loss function plays a significant role in the comparative evaluation of model predictions.
Last, the paired t-test and Wilcoxon signed-rank test results emphasized the nuanced performances of the models across different forecast horizons. Both models exhibited a statistically significant difference from the actual volatility values across all forecast horizons, indicating the intricate nature of volatility forecasting.
This study highlights the importance of using a range of statistical methods and taking a comprehensive approach when analyzing and comparing forecast models. While both the ARMA and LSTM models demonstrate their unique strengths, their efficacy is significantly influenced by the characteristics of the data and the chosen forecast horizon. Moreover, despite the ability of the LSTM to model complex non-linear relationships, the ARMA model proved superior in predicting salmon market volatility. Thus, we suggest that the salmon market may be more linear than expected, with negligible or even no non-linear volatility patterns for an LSTM model to exploit.
Future research could build on this study by exploring additional factors that might impact the accuracy of these models, such as the influence of different market dynamics or the impact of macroeconomic variables. Further studies could also consider the implementation of hybrid models, combining the strengths of ARMA and LSTM, to improve forecasting accuracy. Moreover, given that our study period concludes before the COVID-19 pandemic, it would be interesting to test this framework with adequate post-COVID data.
Our findings contribute to the ongoing dialogue around best practices in volatility forecasting, providing valuable insights for stakeholders in the salmon market. Furthermore, the framework of this study could be adapted for use in other commodity markets where accurate volatility forecasting is also critical.

Figure 1 .
Figure 1. A feed-forward artificial neural network (ANN) with three inputs and one hidden layer with two hidden nodes.

Figure 2 .
Figure 2. An RNN with one input layer consisting of two neurons, two hidden layers consisting of two and three neurons, respectively, and an output layer.

Figure 3 .
Figure 3. Structure of a long short-term memory (LSTM) layer with a forget gate as introduced by Gers et al. (2000).

Figure 4 .
Figure 4. Seasonally adjusted time-series of weekly salmon spot prices in NOK/kg.

Figure 5 .
Figure 5. Original time-series of weekly salmon spot prices in NOK/kg.

Figure 7 .
Figure 7. Percentage change in error metrics when forecasting with the LSTM model compared to the benchmark ARMA model. (A) One-step-ahead forecasting; (B) Four-step-ahead forecasting; (C) Eight-step-ahead forecasting; (D) Twelve-step-ahead forecasting.

Figure 9 .
Figure 9. Forecasts of the ARMA and LSTM models against the actual volatility. (A) One-step-ahead forecasting; (B) Four-step-ahead forecasting; (C) Eight-step-ahead forecasting; (D) Twelve-step-ahead forecasting.

Table 2 .
Candidates for the hyperparameters to be tuned.

Table 3 .
Results of the out-of-sample multi-step ahead forecasts with the MSE, MAE, RMSE, and MAPE loss functions.

Table 4 .
Results of the Diebold-Mariano (DM) test comparing the forecasting accuracy of the ARMA and LSTM models for salmon market volatility using MSE and MAE loss functions.

Table 5 .
Paired two-tailed t-test and Wilcoxon signed-rank test results for the ARMA and LSTM models at different forecasting horizons. The null hypothesis for the tests is that there is no significant difference between the forecasts produced by the underlying model and the actual volatility values. a The Wilcoxon signed-rank test is applied for the forecasts generated by the ARMA model when forecasting 1 step ahead, as they do not follow a normal distribution. ***, **, and * denote significance at the 1%, 5%, and 10% significance levels, respectively.