Subordinated affine structure models for commodity future prices

Abstract The existence of jumps in different sectors of the financial market is by now well established, and the commodity market is no exception. While there are various models in the literature for capturing these jumps, we restrict ourselves to using Brownian motion subordinated by an α-stable process, α ∈ (0, 1), as the source of randomness in the spot price model to determine commodity future prices, a concept which is not new either. However, the key feature of our pricing approach is the simple technique derived from our novel theory for subordinated affine structure models. Differently from existing filtering methods for models with latent variables, we show that the commodity future price under a one-factor model with a subordinated random source driver can be expressed in terms of the subordinator, which can then be reduced to the latent regression models commonly used in population dynamics, with their parameters easily estimated using the expectation maximisation method. In our case, the underlying joint probability distribution is a combination of the Gaussian and stable densities.

ABOUT THE AUTHOR M. Kateregga holds a Ph.D. in Mathematical Finance from the University of Cape Town. Michael is a Software Developer at Mira Networks in South Africa. His main interests include financial markets research and functional programming particularly Haskell.

PUBLIC INTEREST STATEMENT
This paper is entitled Subordinated Affine Structure Models for Commodity Futures Prices. The aim is to provide a mathematical background on a special family of processes that can capture unusual patterns, such as extreme events, in the commodities market. Such events range from wild fires on farms to tsunamis, and as a result commodity spot prices can be affected significantly. This realisation provides the motivation for the study in this paper. As part of risk mitigation, investors in the commodities market are naturally interested in how this kind of risk can be captured in the futures pricing model and, moreover, in how the model can be calibrated to historical data. The current paper seeks to answer these questions and establishes a novel approach to pricing commodity futures.

Literature
A large volume of literature on the commodities market has been published since the invention of the continuous benchmark model of Black and Scholes (1973) for pricing options and corporate liabilities. Among the many models developed, the most widely used and referenced study on commodities is the work by Schwartz (1997). From the latter, numerous models have been developed as a result of the growing commodities market, in terms of both volumes traded and the complexity of contracts, over the years. We give an account of the various strands of literature relevant to this research in Table 1.
The current research builds on findings from Kyriakou, Nomikos, Pouliasis, and Papapostolou (2016) by extending the results to subordinated Brownian motion.
Table 1. A selection of key commodity jump models that have been developed over time.

Introduction
Commodities exhibit distinctive features that a good model ought to capture. For instance, to mimic the commodities market as closely as possible, one has to factor in jumps in the underlying spot price. However, models designed with a jump component are non-trivial to estimate. In this research, we derive a relatively easy estimation method for commodity prices using subordination as a proxy for introducing jumps. Other features include mean-reversion, contango, backwardation and seasonality. Commodities also experience extreme volatility and price spikes, resulting in heavy-tailed distributions of the returns. Commodity markets are unique compared with other markets, such as equity, bond, currency or interest rate markets, in the sense that most commodities are real physical assets that are produced, transported, stored and consumed; they are not claims on long-lived companies as in equity markets.
As indicated in Fama and French (1988), commodity pricing can be approached from two perspectives. The first is the theory of storage, which explains why high supplies and ample inventories result in contango, low futures and spot price volatilities and futures premiums equivalent to full storage costs, whereas low supplies and inventories running at minimum yield backwardation and a rise in the volatilities of the spot and nearby future prices. Another feature explained by this theory is the continuously compounded convenience yield (usually denoted by δ) on inventory, which is the benefit of holding a physical commodity as opposed to holding a contract for its delivery at some future time, and, second, the cost of storage. The future price motivated by the theory of storage is given by

F(t, T) = S_t e^{(r + c − δ)(T − t)},  (1)

where c accounts for the storage costs, r is the continuously compounded interest rate, S_t is the current spot price and T is the maturity date of the future contract.
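For illustration, equation (1) is straightforward to evaluate. The following minimal Python sketch (function name and parameter values are ours, with all rates quoted in continuously compounded annual terms) computes the cost-of-carry price and shows how the sign of r + c − δ separates contango from backwardation:

```python
import math

def storage_future_price(spot, r, c, delta, tau):
    """Cost-of-carry price of equation (1): F = S_t * exp((r + c - delta) * (T - t)).

    spot  -- current spot price S_t
    r     -- continuously compounded interest rate
    c     -- storage cost rate
    delta -- convenience yield
    tau   -- time to maturity T - t, in years
    """
    return spot * math.exp((r + c - delta) * tau)

# r + c > delta pushes the curve upward (contango, F > S_t); a high
# convenience yield (delta > r + c) pushes it into backwardation (F < S_t).
contango = storage_future_price(100.0, 0.05, 0.02, 0.01, 1.0)
backwardation = storage_future_price(100.0, 0.05, 0.02, 0.10, 1.0)
```
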
The second perspective is the theory of the expected risk premium, discussed in Keynes (1930) and Hicks (1939). It asserts that the future price is the expected future spot price discounted by the risk premium:

F(t, T) = E_t[S_T] e^{−γ(T − t)},

where γ is the risk premium and E_t[·] = E[·|F_t], with F_t the filtration up to time t.
A number of models based on the latter approach have been developed over the years to mimic the market as closely as possible for various commodities. These include Schwartz's continuous stochastic factor models (Schwartz, 1997; Schwartz & Smith, 2000) and the jump models of Kyriakou et al. (2016).
The motivation and contribution of this paper are based on the erratic features observed in electricity and energy markets, where jumps are evident, resulting in skewed distributions of the spot prices. We consider Brownian motion subordinated by an α-stable process, α ∈ (0, 1), as the source of randomness in the underlying asset to model commodity future prices. The key feature of our pricing approach is the simple technique derived from our novel treatment of subordinated affine structure models.
We show that the affine property is attainable and applicable to generalised commodity spot models and, as an illustration, we consider a stochastic differential equation with subordinated Brownian motion as the source of randomness to derive commodity future prices. It is argued in some existing literature that the likelihood function exists in integrated form for models with singular noise, whereas for partially observed processes a filtering technique is required; see, for instance, Date and Ponomareva (2010) and Yang, Lin, Luk, and Nahar (2014). However, the work presented in this paper provides a new approach to pricing commodity futures for models with latent variables using the expectation maximisation method. We show that the commodity future price under a one-factor model with a subordinated random source driver can be expressed in terms of the subordinator, which can then be reduced to the latent regression models commonly used in population dynamics, with their parameters estimated easily using the expectation maximisation method. In our case, the underlying joint probability distribution is a combination of Gaussian and stable densities.
The rest of the paper is organised as follows. Section 3 introduces some features of stable processes essential to this work. In Section 4, we review the concept of affine models and extend the idea of obtaining Laplace transforms of random processes to subordinated processes. In Section 5, we derive our pricing formulas for commodity futures using the results of Section 4. In Section 6, we discuss the numerical implementation of our one-factor commodity future models. Section 7 concludes.

Stable processes
The discussion in this section is mainly based on Kateregga et al. (2017). A stable or α-stable process, α ∈ (0, 2], belongs to the general class of Lévy distributions. It has a limiting distribution with a definitive exponent parameter α that determines its shape. The following two definitions follow from Rachev (2003).
Definition 3.1 Let X_1, X_2, ..., X_n be independent and identically distributed random variables, and suppose a random variable S is defined by

(1/a_n) ∑_{i=1}^n X_i − b_n → S,

where "→" represents weak convergence in distribution, a_n is a positive constant and b_n is real. Then S is a stable process. The constants a_n and b_n need not be finite.
Definition 3.1 allows the modelling of a number of natural phenomena beyond normality using stable distributions. The fact that a_n and b_n do not need to be finite leads to the generalised central limit theorem.
Definition 3.2 (Generalised Central Limit Theorem) Suppose X_1, X_2, ... denotes a sequence of identically distributed independent random variables from some arbitrary distribution, and let a_n ∈ R and b_n ∈ R_+ be sequences. Then we define a sequence of sums

Z_n := (∑_{i=1}^n X_i − a_n)/b_n

such that their distribution functions weakly converge to some limiting distribution, that is, P(Z_n < x) → H(x) as n → ∞, where H(x) is some limiting distribution. The traditional central limit theorem assumes a finite mean a := E[X_i] and finite variance σ² := Var[X_i], and defines the sequence of sums with a_n = na and b_n = σ√n, so that the distribution functions of Z_n weakly converge to the standard Gaussian law with density h_sG(x). Suppose instead that the identically distributed independent random variables X_i equal a positive constant c almost surely, and that the sequences a_n and b_n in (4) are defined by a_n = (n − 1)c and b_n = 1; then Z_n is also equal to c for all n > 0 almost surely. In this case, the random variables X_i are mutually independent and, as a consequence, the limiting distribution of the sums Z_n belongs to the stable family of distributions by definition. Therefore, stable distributions behave similarly to the central limit theorem for distributions with a finite second-order moment (the Gaussian); see Crosby (2008). This is one reason why they are regarded as stable. They are also preferred to all other laws, such as the Normal Inverse Gaussian (NIG), Variance Gamma (VG) and other distributions from the generalised hyperbolic family, because of their heavier tails.
Definition 3.3 (Samoradnitsky & Taqqu, 1994) An α-stable distribution is a four-parameter family of distributions denoted by S(α, β, ν, μ), where
(1) α ∈ (0, 2] is the characteristic exponent, responsible for the tail of the distribution;
(2) β ∈ [−1, 1] is the skewness parameter;
(3) ν > 0 is the scale parameter;
(4) μ ∈ R is the location (sometimes referred to as the mean).
The analysis of stable processes usually proceeds through their characteristic functions and Laplace or Fourier transforms: unlike their densities, their characteristic functions always exist. Literature on their integral representations and density functions is provided in Zolotarev (1964, 1980, 1986). The distribution functions for different α values have been tabulated in Dumouchel (1971), Fama and Roll (1968) and Holt and Crow (1973).
Definition 3.4 (Gajda and Wyłomańska (2012)) Let S_t denote an α-stable process and L_t its inverse. Then, for t ∈ [0, T], we define the process L_t as

L_t = inf{u > 0 : S_u > t}.

S_t and L_t are non-decreasing and càdlàg, with their graphical representations given in Figure 1.

Density and characteristic functions
Let (X_t, t ≥ 0) denote a Lévy process. The characterisation of X_t is usually given by the Lévy-Khintchine formula.
Definition 3.5 (Lévy-Khintchine) (Applebaum, 2004) Consider a Lévy process X = (X_t)_{t≥0}. There exist b ∈ R and σ ≥ 0 such that the characteristic function of X is

E[e^{iuX_t}] = exp( t ( ibu − σ²u²/2 + ∫_{R∖{0}} (e^{iux} − 1 − iux I_{|x|<1}) λ(dx) ) ),  (10)

where I is the indicator function and λ is a σ-finite measure satisfying the constraint

∫_{R∖{0}} min(1, x²) λ(dx) < ∞.  (11)

Figure 1. The top graph shows a plot of a stable process S_t and the bottom graph shows its inverse process L_t, simulated using exponent parameter value α = 0.8, plotted against time on the horizontal axis.
Definition 3.6 (The Lévy-Itô Decomposition) (Applebaum, 2004) If X_t is a Lévy process, there exist b ∈ R, a Brownian motion B_σ(t) with variance σ ∈ R_+ and an independent Poisson random measure N on R_+ × (R∖{0}) such that, for each t ≥ 0,

X_t = bt + B_σ(t) + ∫_{|x|<1} x Ñ(t, dx) + ∫_{|x|≥1} x N(t, dx).  (12)

To preserve the martingale property, the Poisson random measure is compensated as Ñ = N − tλ, where λ is a Lévy measure satisfying (11).
For the process S_t, we require σ = 0 in (10) (or B_σ = 0 in (12)) and the Lévy measure in (11) given by

λ(dx) = (α/Γ(1 − α)) x^{−α−1} dx, x > 0.

The characteristic function Φ of S_t is then obtained using the domain of attraction of stable random variables (see Grigelionis, Kubilius, Paulauskas, Statulevicius, & Pragarauskas, 1999) and the Lévy-Khintchine representation formula (see Definition 3.5, or Applebaum (2004) for a detailed explanation).
Using Fourier inversion, the density function of S_t is given by

h_S(t, x) = (1/(2π)) ∫_R e^{−iux} Φ(t, u) du.

From Definition 3.4, it is easy to see the equivalence relation

{L_t ≤ u} = {S_u ≥ t}.

It follows that F(t, u) := P(L_t ≤ u) = P(S_u ≥ t), and h(t, u) = ∂F(t, u)/∂u denotes the probability density function of L_t. According to Meerschaert and Straka (2013), the density h(t, u) can also be given by

h(t, u) = (t/α) u^{−1−1/α} h̄(t u^{−1/α}),

where h̄ is the density of a standard stable process, with Laplace transform ∫₀^∞ e^{−τs} h̄(s) ds = exp(−τ^α). This follows from the fact that S_u has the same distribution as u^{1/α} S_1. As a result, the density of the inverse stable process L_t can be given in terms of the standard stable density. Figure 2 shows density graphs of S_t for different exponent parameter values.
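The Meerschaert-Straka form of the inverse-stable density can be checked numerically in the one case where the standard stable density is available in closed form, namely α = 1/2 (the Lévy law). The Python sketch below (all function names are ours) implements the general formula; for α = 1/2 it collapses to a half-normal density in u:

```python
import math

def std_stable_density_half(x):
    """Standard 1/2-stable (Levy) density, with Laplace transform exp(-sqrt(tau))."""
    if x <= 0.0:
        return 0.0
    return x ** -1.5 * math.exp(-1.0 / (4.0 * x)) / (2.0 * math.sqrt(math.pi))

def inverse_stable_density(t, u, alpha, g):
    """Meerschaert-Straka form h(t, u) = (t/alpha) u^(-1-1/alpha) g(t u^(-1/alpha)),
    where g is the standard stable density."""
    return (t / alpha) * u ** (-1.0 - 1.0 / alpha) * g(t * u ** (-1.0 / alpha))

# For alpha = 1/2 this reduces to the half-normal density
# h(t, u) = exp(-u^2/(4t)) / sqrt(pi*t), u > 0, so h(t, .) integrates to one.
t, alpha, du = 1.0, 0.5, 1e-3
total = sum(inverse_stable_density(t, (k + 0.5) * du, alpha, std_stable_density_half) * du
            for k in range(20_000))
```

The midpoint sum `total` approximates ∫₀^∞ h(t, u) du over (0, 20], which should be close to one, as required of a density.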

Laplace transforms
Definition 3.7 Let X_u be a subordinator. The Laplace transform of X_u is defined by

E[e^{−τ X_u}] = e^{−u ϕ(τ)},

where ϕ, the Laplace exponent of X_u, is a Bernstein function represented by

ϕ(τ) = a + bτ + ∫₀^∞ (1 − e^{−τx}) λ(dx),

where a, b ≥ 0 and λ is the Lévy measure on (0, ∞), such that

∫₀^∞ x/(1 + x) λ(dx) < ∞.

The Laplace transform of the stable process S_u is given in Meerschaert and Straka (2013) in terms of a constant C and the skewness parameter 0 ≤ β ≤ 1. For C = Γ(1 − α) (alternatively β = 0), it simplifies to that of a standard stable process:

E[e^{−τ S_u}] = e^{−u τ^α}.  (23)

The Laplace transform h̃_L(u, τ) of the inverse stable process L_t is obtained from (18):

h̃_L(u, τ) = τ^{α−1} e^{−u τ^α},  (25)

where the Laplace transform of ∫₀^t f(y) dy is τ^{−1} f̃(τ), and h_L(u, τ) := 0 for u < 0 or τ < 0. Since (25) does not have the general form of a Laplace transform of a Lévy process, L_t is not a Lévy process.
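The standard-case transform E[e^{−τ S_u}] = e^{−u τ^α} can also be verified by simulation. The Python sketch below draws from the positive stable law using Kanter's representation (with U uniform on (0, π) and W a unit exponential); the sampler is our choice of method, not one prescribed by the paper:

```python
import math
import random

def sample_standard_stable(alpha, rng):
    """Kanter's representation of the positive stable law with
    E[exp(-tau * S)] = exp(-tau**alpha), for 0 < alpha < 1."""
    u = rng.uniform(0.0, math.pi)   # uniform angle on (0, pi)
    w = rng.expovariate(1.0)        # unit exponential
    factor = (math.sin((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha)
    return factor * math.sin(alpha * u) / math.sin(u) ** (1.0 / alpha)

rng = random.Random(42)
alpha, n = 0.7, 50_000
samples = [sample_standard_stable(alpha, rng) for _ in range(n)]

# Monte Carlo estimate of E[exp(-tau * S_1)], to be compared with exp(-tau**alpha).
mc = {tau: sum(math.exp(-tau * s) for s in samples) / n for tau in (0.5, 1.0, 2.0)}
```

With u = 1, each Monte Carlo average mc[τ] should be close to exp(−τ^α), which is what makes the sampler a useful consistency check on (23).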

Moment-generating function
There is a relationship between the moment-generating function of a random variable and its Laplace transform.

Lemma 3.8 Let X be a random variable with Laplace transform h̃_X(τ) = E[e^{−τX}]. Whenever the moment-generating function M_X exists, M_X(u) = h̃_X(−u).
Proof 3.9 This relationship is verified in Miller (1951).
As a consequence of Lemma 3.8 and the explicit Laplace transform given by (23), we can deduce the first and second moments of S_t.

Subordination
Let B_{S_t} denote Brownian motion subordinated by the α-stable process S_t introduced before, with α ∈ (0, 1), where B_· represents standard Brownian motion with mean zero. We require B_t and S_t to be independent. To ensure a complete working environment, we introduce a probability space,
where F_t^B and F_t^S are the filtrations generated by B_t and S_t, respectively.
The transition density p(x, y; t) of B is given by

p(x, y; t) = (2πt)^{−1/2} exp(−(x − y)²/(2t)).

The semi-group (P_t : t ≥ 0) of B is given by

P_t f(x) = ∫_R p(x, y; t) f(y) dy,

where f is a non-negative Borel function on R satisfying the corresponding generator equation. The semi-group Q_t has a transition density q(y, z; t) = q(y − z; t) defined by

q(y; t) = ∫₀^∞ p(y; u) h_S(t, u) du.

Lemma 3.12 The mean and variance of B_{S_t} are given by

E_x[B_{S_t}] = 0 and Var_x[B_{S_t}] = E_x[S_t].

Proof. Suppose f in Definition 3.10 and Definition 3.11 is such that f(z, t) = z. Using (32) and (34) and partitioning the time interval [0, T] such that 0 < τ_1 < ··· < τ_n < T, where the τ_i are the jump times of the process B_{S_t}, we observe that (36) and (37) hold on every interval [τ_i, τ_{i+1}). Thus, in the limit, their sums converge, respectively, to 0 and E_x[S_t] on [0, T]. ∎
Lemma 3.13 Let X be a Lévy process with characteristic exponent Ψ and S an independent subordinator with Laplace exponent Φ. Then the subordinated process X_{S_t} is a Lévy process with characteristic exponent m(·) = Φ(Ψ(·)).

Proof 3.14 The proof is given in Bochner (2012).
It is known that any Lévy process X_s, t < s ≤ T, with drift μ is fully determined by its characteristic function, given by (see Fusai & Roncoroni, 2007)

E[e^{iλX_s}] = e^{μΔ + Ψ(λ)Δ},

where Δ = s − t, μ is the drift parameter and Ψ(λ) is the characteristic exponent. A typical example is Brownian motion, whose characteristic function is given by

E[e^{iλB_t}] = e^{−λ²t/2}.  (40)

The characteristic exponent of subordinated Brownian motion B_{S_t} is deduced from (23), (38) and (40), which yield

E[e^{iλB_{S_t}}] = e^{−t(λ²/2)^α}.
Affine models
Throughout this section, X will denote a closed state space.
Definition 4.1 (Keller-Ressel, 2008) A process is stochastically continuous if, for any sequence t_n → t in R_{≥0}, the random variables X_{t_n} converge to X_t in probability with respect to (P_x)_{x∈D}.
Definition 4.2 (Keller-Ressel, 2008) An affine process is a stochastically continuous, time-homogeneous Markov process (X_t, P_x) with a state space D whose characteristic function is an exponentially affine function of the state vector. In other words, there exist functions ψ_0 : R_{≥0} × iR^d → C and ψ_1 : R_{≥0} × iR^d → C^d such that

E_x[e^{⟨u, X_t⟩}] = exp(ψ_0(t, u) + ⟨ψ_1(t, u), x⟩)

for all x ∈ D and all (t, u) ∈ R_{≥0} × iR^d. (1) ψ_0 maps O to C_−, where C_− := {u ∈ C : Re u ≤ 0}.
The following theorem is one of the contributions in this paper.
THEOREM 4.5 Suppose X_t is a regular affine solution to (46). Then b and σ can be expressed as affine functions of the state:

b(x) = b_0 + ∑_{i=1}^d b_i x_i,  (σσ^T)(x) = a_0 + ∑_{i=1}^d a_i x_i,

where i = 0, ..., d. Moreover, the characteristic function of X_t has a log-linear form,

E_x[e^{⟨u, X_t⟩}] = exp(ψ_0(t, u) + ∑_{i=1}^d ψ_i(t, u) x_i),

where u_i ∈ R. The coefficients ψ_i are obtained by solving a system of Riccati equations.

Proof. The proof is a generalisation of the two-dimensional case in Rouah and Heston (2015). ∎

There is extensive literature on affine processes X_t driven by Brownian motion together with a Poisson jump process. Here, we are interested in the solution to (46) when the source of randomness is subordinated. Another contribution follows in the next theorem, where we show that Y_t = X_{S_t} is affine, with d = 1.
Proof 4.7 The Markov property of S_t and the definition of its Laplace transform yield the claim, where the last equality follows from the affine property of S_t (see Definition 4.2).

Introduction
We develop representation formulas for future prices using the concepts introduced before. The source of randomness in the models developed in this section is Brownian motion subordinated by a non-decreasing α-stable process with α ∈ (0, 1). The aim is to obtain future price formulas for commodity spot price models that incorporate stochastic volatility, jumps, seasonality and mean-reversion effects.
The following theorem presents the first main contribution of this paper.
THEOREM 5.1 Suppose, without seasonality (i.e. f = 0), the commodity spot price z given by (55) satisfies the stochastic differential equation

dz_t = κ(θ − z_t) dt + σ dB_{S_t},

where S_t is an independent, non-decreasing stable process with α ∈ (0, 1). Then the future price is given in terms of γ := θ − σ²/(2κ) and the skewness parameter β ∈ [0, 1] of the subordinator S_t.

Proof 5.2 See Appendix A
One of the advantages of the future price above is that we can determine the price by estimating the distribution of the latent variable S_T. Moreover, this latency could manifest as jumps, volatility or extreme events such as a tsunami or fire. However, since two-factor models have proven over the years to provide better fits, we propose one in the framework of subordination. This way, we can represent volatility and jumps or extreme events separately.

The two-factor commodity spot model
In the two-factor spot model, the volatility is modelled as a stochastic process while jumps are retained in the spot model. The future price model is given in the following theorem, which presents the second main contribution of the paper.
THEOREM 5.3 Suppose, without seasonality (i.e. f = 0), the commodity spot price X satisfies a set of subordinated stochastic differential equations driven by B_{S_t} and some random process B. Then the future price is expressed through the coefficients ψ_0, ψ_1 and ψ_2, whose details are given in Appendix A.

Numerical implementation
We focus on the one-factor model to explain our approach to estimating the model parameters. The data used in this section are obtained from the US Energy. The seasonality component is modelled as

f(t) = δ_0 t + δ_1 sin(δ_2 [t + 2π/264]) + δ_3 sin(δ_4 [t + 4π/264]).

The accuracy of the fit in Figure 3 depends on the choice of the initial parameters δ_0, δ_1, δ_2, δ_3 and δ_4. The best fit of f(t) is obtained by searching for an optimal set of the δ parameters.
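As a sketch of the fitting step, note that once the frequency parameters δ_2 and δ_4 are held fixed at initial guesses, f(t) is linear in δ_0, δ_1 and δ_3, so these can be estimated by ordinary least squares. The following Python example (synthetic data; all names and values are ours) recovers the linear coefficients from noiseless observations:

```python
import math

def design_row(t, d2, d4):
    """Regressors of f(t) for fixed frequencies d2, d4; the model is then
    linear in delta0, delta1, delta3."""
    return [t,
            math.sin(d2 * (t + 2.0 * math.pi / 264.0)),
            math.sin(d4 * (t + 4.0 * math.pi / 264.0))]

def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for k in range(col, 4):
                m[r][k] -= f * m[col][k]
    x = [0.0] * 3
    for r in (2, 1, 0):
        x[r] = (m[r][3] - sum(m[r][k] * x[k] for k in range(r + 1, 3))) / m[r][r]
    return x

def fit_seasonality(ts, ys, d2, d4):
    """Least-squares estimates of (delta0, delta1, delta3) via the normal equations."""
    xtx = [[0.0] * 3 for _ in range(3)]
    xty = [0.0] * 3
    for t, y in zip(ts, ys):
        row = design_row(t, d2, d4)
        for i in range(3):
            xty[i] += row[i] * y
            for j in range(3):
                xtx[i][j] += row[i] * row[j]
    return solve3(xtx, xty)

# Noiseless synthetic check: the fit should recover delta0, delta1, delta3.
d2, d4 = 2.0 * math.pi / 264.0, 4.0 * math.pi / 264.0   # illustrative frequencies
truth = (0.001, 0.5, 0.25)
ts = [float(t) for t in range(500)]
ys = [sum(c * r for c, r in zip(truth, design_row(t, d2, d4))) for t in ts]
est = fit_seasonality(ts, ys, d2, d4)
```

In practice δ_2 and δ_4 would then be refined by an outer search, re-running this linear fit at each candidate frequency pair.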

Equivalent latent regression model
The de-seasonalised one-factor future price given by (58) can be written as

y = a + bx + ε,  (63)

where y = ln F and x = S_t, the coefficients a and b are determined by the parameters of (58), and ε is an independent random error distributed as N(0, Θ), with zero mean and variance Θ.
Clearly, (63) belongs to the class of latent regression models, since x = S_t is not observable.
There is literature where this kind of problem is handled using expectation-maximisation (EM) algorithms (see Dempster, Laird, and Rubin (1977)) to estimate the model parameters. The latent variable x can be considered binary, in which case the EM algorithm gives estimates for a two-component normal mixture model. On the other hand, x can be allowed to be continuous between 0 and 1 with a beta distribution, as in Tarpey and Petkova (2010). The EM algorithm for estimating the model parameters in this case is more involved than for the two-component mixture model and more computationally challenging, but can be implemented nonetheless. Basically, the latent or unobserved x variables are imputed by their conditional expectations given the outcomes y. Our approach is adaptable to the latter setting through the Dynkin-Lamperti theorem (see Gupta and Nadarajah (2004)), where the unobserved variable follows a stable distribution defined on (−∞, ∞) with α ∈ (0, 2] and the observable variable y represents the log-returns of the future prices. The algorithm is applied to the joint likelihood of the response y. We assume the error ε is independent of the latent predictor x. The joint density for x and y is given by

h(x, y) = h(y | x; a, b, Θ) S(x; α, β, ν, μ) = N(y; a + bx, Θ) S(x; α, β, ν, μ),  (66)

where S(x; α, β, ν, μ) is the α-stable density, N(y; A, B) denotes the normal density of a random variable y with mean A and variance B, and Θ is the variance of the outcome sample data. The marginal density of the response y is

p(y) = ∫ (2πΘ)^{−1/2} exp(−(y − a − bx)²/(2Θ)) S(x; α, β, ν, μ) dx.  (67)

Density (67) is an example of the infinite mixture models used in ecological statistics (Fox, Negrete-Yankelevich, and Sosa (2015)).

The EM algorithm
For a data set (x_1, y_1), ..., (x_n, y_n) in (63), the log-likelihood is derived from (66) as

L(α, β, ν, μ, Θ, a, b) = ∑_{i=1}^n [ln N(y_i; a + bx_i, Θ) + ln S(x_i; α, β, ν, μ)].

Since x is not observable, the EM algorithm requires maximising the conditional expectation of the log-likelihood given the response vector y, that is,

E[L(α, β, ν, μ, Θ, a, b) | y],

where at each iteration of the EM algorithm the above conditional expectation is computed using the current parameter estimates. The present expectation-maximisation problem is similar to the one handled in Tarpey and Petkova (2010), which implies that a similar EM algorithm can be applied here. That is, suppose ρ = (a, b)' is a 2 × p coefficient matrix, where each of the p columns of ρ provides the intercept and slope regression coefficients for each of the p response variables. The design matrix is denoted by X, whose first column consists of ones for the intercept and whose second column consists of the latent predictors x_i, i = 1, ..., n. The multivariate regression model follows:

Y = Xρ + ε.

Moreover, as indicated in Tarpey and Petkova (2010), the likelihood for the multivariate normal regression model can be given as

L(ρ, Θ) = (2π)^{−np/2} |Θ|^{−n/2} exp(−tr[Θ^{−1}(Y − Xρ)'(Y − Xρ)]/2).  (71)

The EM approach requires that we maximise the expectation of the logarithm of (71), conditional on Y, with respect to ρ and Θ. This leads to the optimal estimates

ρ̂ = ((X'X)‾)^{−1} X̄'Y,

where X̄ = E[X | Y] and (X'X)‾ = E[X'X | Y].
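The E- and M-steps above can be sketched concretely. Since general stable densities are not available in closed form, the sketch below substitutes the one tractable case, the Lévy law (α = 1/2, β = 1), as a stand-in prior for the latent predictor; the E-step posterior moments are computed by grid quadrature and the M-step solves the normal equations with E[x|y] and E[x²|y] in place of x and x². All function names and values are ours, and the example is in Python, whereas the paper works in R:

```python
import math

def levy_density(x):
    """Levy density, the closed-form alpha = 1/2, beta = 1 stable law, used here
    as a stand-in prior for the latent predictor x."""
    if x <= 0.0:
        return 0.0
    return math.sqrt(1.0 / (2.0 * math.pi)) * x ** -1.5 * math.exp(-0.5 / x)

def e_step_moments(y, a, b, theta, grid):
    """E[x | y] and E[x^2 | y] for the posterior p(x | y) proportional to
    N(y; a + b*x, theta) * prior(x), computed by quadrature on a grid."""
    w = [math.exp(-(y - a - b * x) ** 2 / (2.0 * theta)) * levy_density(x)
         for x in grid]
    z = sum(w)
    m1 = sum(x * wi for x, wi in zip(grid, w)) / z
    m2 = sum(x * x * wi for x, wi in zip(grid, w)) / z
    return m1, m2

def m_step(ys, m1s, m2s):
    """Minimise E[sum (y - a - b x)^2 | y]: the 2x2 normal equations for (a, b)
    need only E[x | y] and E[x^2 | y] as sufficient statistics."""
    n = float(len(ys))
    sx, sxx = sum(m1s), sum(m2s)
    sy, sxy = sum(ys), sum(m * y for m, y in zip(m1s, ys))
    b = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    a = (sy - b * sx) / n
    return a, b

grid = [0.01 + 0.005 * k for k in range(4000)]   # covers (0, 20]
# With small noise the posterior concentrates near x = (y - a)/b, so the
# E-step moments should sit close to that point.
m1, m2 = e_step_moments(3.0, 0.0, 1.0, 0.01, grid)

# M-step sanity check: with exactly known moments it reduces to plain OLS.
xs = (0.5, 1.0, 2.0, 4.0)
a_hat, b_hat = m_step([2.0 + 3.0 * x for x in xs], list(xs), [x * x for x in xs])
```

A full EM run alternates these two steps until the estimates of (a, b, Θ) stabilise; replacing `levy_density` with a numerical stable density (e.g. `dstable()` from the R workflow in the text) generalises the sketch to arbitrary α.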
To implement the EM method in the R programming language, we first highlight that there are a few minor differences to bear in mind, as we explain in the following.
First and foremost, the density of the predictor in our case comes from a stable distribution. Recall that, in general, the densities of stable processes cannot be expressed analytically, which makes it difficult to compute the log-likelihood. However, with the help of built-in packages in R, including stabledist and StableEstim, the log-likelihood can be satisfactorily estimated, using Estim() to obtain the stable parameters of S_t and dstable() for its corresponding stable density.
Second, from the log-likelihood expression, we notice that we only require estimates of the conditional expectations of x, x 2 and log x with respect to the joint probability density given the response vector y.
On the other hand, we retain some of the steps in Tarpey and Petkova (2010). The initial values for the regression parameters a and b can be obtained from fitting a two-component finite mixture model or by a preliminary search over the parameter space. Initial values for Θ can be obtained using the sample covariance matrix from the raw data.

Data for the EM algorithm
The data used are stored in a data frame with three columns containing futures log-returns, spot price log-returns and binary data of 1's and 0's representing whether or not a jump has occurred within a given window size (see Table 4). Table 2 shows the resulting estimated parameters, obtained from 5,000 data points of crude oil log-returns arranged as in Table 4. We have displayed results from only two iterations because, for large data sets, the code tends to be slow, in addition to suffering convergence issues. However, this can be improved by optimising the code and using faster machines.
The jump occurrences due to S_t are determined by the method discussed in Lee and Mykland (2008) (also see Maslyuka et al., 2013). That is, the realised return at any given time is compared with a continuously estimated instantaneous volatility σ_{t_i} to measure the local variation arising from the continuous part of the process. The volatility σ_{t_i} is estimated using a modified version of realised bipower variation, calculated as the sum of products of consecutive absolute returns in the local window (see Barndorff-Nielsen & Shephard, 2004). The jump detection statistic L_i, testing for jumps in returns occurring at time t_i within a window of size K, is calculated as the ratio of the realised return to the estimated instantaneous volatility:

L_i = ln(Y_{t_i}/Y_{t_{i−1}}) / σ̂_{t_i},

where Y_t, t ≥ 0, represents the commodity spot price and σ̂_{t_i} is estimated by

σ̂²_{t_i} = (1/(K − 2)) ∑_{j=i−K+2}^{i−1} |ln(Y_{t_j}/Y_{t_{j−1}})| |ln(Y_{t_{j−1}}/Y_{t_{j−2}})|.

Care must be taken in choosing K: it must be large enough to estimate the integrated volatility accurately, but small enough for the variance to be approximately constant. In other words, K should be large but smaller than N, the number of observations, so that the effect of jumps on the estimated instantaneous volatility disappears. Some authors recommend computing K as K = √(252 · n), where n is the daily number of observations and 252 is the number of days in the (financial) year. Moreover, the window size should be such that K = O(Δt^λ) with −1 < λ < −0.5. For high-frequency data, Lee and Mykland (2008) recommend, for returns sampled at frequencies of 60, 30, 15 and 5 minutes, the corresponding values of K to be 78, 110, 156 and 270. For our case, we choose K = 4 for crude oil future prices with returns sampled daily.

Detection of jumps in the data
The test statistic L follows approximately a normal distribution when the data set has no jumps, and its value becomes large otherwise. According to Lee and Mykland (2008), the rejection region for L is chosen based on the distribution of its maximum. For instance, suppose a particular interval (t_{i−1}, t_i] has no jumps and the distance between two consecutive observations in this interval is small (i.e. Δ → 0); then the standardised maximum converges to a Gumbel variable:

(max_{i∈A_N} |L_i| − c_N)/s_N → ξ,

where ξ has cumulative distribution function P(ξ ≤ x) = exp(−e^{−x}), A_N is the set of i ∈ {1, 2, ..., N} such that there is no jump in (t_{i−1}, t_i], and c_N, s_N are defined as

c_N = (2 ln N)^{1/2}/c − [ln π + ln(ln N)]/(2c(2 ln N)^{1/2}),  s_N = 1/(c(2 ln N)^{1/2}),  c = (2/π)^{1/2}.

The test is conducted by comparing the standardised maximum of L_i in (76) with the critical values of the Gumbel distribution: the null hypothesis of no jump is rejected when

|L_i| > G^{−1}(1 − λ) s_N + c_N,

where G^{−1}(1 − λ) is the (1 − λ) quantile function of the standard Gumbel distribution. Suppose λ = 0.1; then we reject the null hypothesis of no jump when |L_i| > s_N η* + c_N, where η* is such that exp(−e^{−η*}) = 1 − λ = 0.9, that is, η* = −log(−log(0.9)) ≈ 2.25. Figure 5 shows a graph of jumps detected in crude oil future prices, where we have used 1's to record a jump occurrence and 0's for no jump.
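The detection rule can be sketched end to end as follows, in a Python illustration with synthetic returns and one injected jump. The window size, jump size and random seed are our choices, and the paper's K = 4 is replaced by a wider window for a steadier bipower estimate:

```python
import math
import random

def lee_mykland_flags(prices, k, significance=0.1):
    """Flag jump times: L_i = r_i / sigma_hat_i, with bipower-variation
    volatility over the trailing window of size k, thresholded by the
    Gumbel critical value s_N * eta + c_N (after Lee & Mykland, 2008)."""
    r = [math.log(prices[i] / prices[i - 1]) for i in range(1, len(prices))]
    n = len(r)
    c = math.sqrt(2.0 / math.pi)
    root = math.sqrt(2.0 * math.log(n))
    c_n = root / c - (math.log(math.pi) + math.log(math.log(n))) / (2.0 * c * root)
    s_n = 1.0 / (c * root)
    eta = -math.log(-math.log(1.0 - significance))   # ~2.25 for lambda = 0.1
    threshold = s_n * eta + c_n
    flags = []
    for i in range(k, n):
        # Trailing bipower variation: k - 2 products of consecutive absolute returns.
        bipower = sum(abs(r[j]) * abs(r[j - 1]) for j in range(i - k + 2, i)) / (k - 2)
        if abs(r[i]) / math.sqrt(bipower) > threshold:
            flags.append(i)
    return flags

# Synthetic daily log-returns with one large injected jump (sizes illustrative).
rng = random.Random(7)
sigma = 0.01
prices = [100.0]
for i in range(500):
    jump = 0.15 if i == 250 else 0.0           # ~15-sigma jump at step 250
    prices.append(prices[-1] * math.exp(rng.gauss(0.0, sigma) + jump))
flags = lee_mykland_flags(prices, k=10)
```

The injected jump produces |L_i| far above the Gumbel threshold and is flagged, while the purely diffusive returns should pass the test almost everywhere.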

Summary
We have shown that the affine property is attainable and applicable to generalised spot models. We considered a stochastic differential equation with subordinated Brownian motion as the source of randomness as a specific example to derive the futures price. Moreover, it has been argued in some existing literature that the likelihood function exists in integrated form for models with singular noise, whereas for partially observed processes a filtering technique is required. However, the work presented in this paper provides a new approach to pricing commodity futures for models with latent variables using the expectation-maximisation method, without any filtering. Our approach is easy to implement once the joint probability density is established. The numerical implementation of the two-factor model is left for future work.
All the numerical results were generated using MATLAB and the Robust Analysis software package in R accessible at www.RobustAnalysis.com.

Funding
This work was supported by the University of Cape Town.

Author details: M. Kateregga.

Figure 5. The large spikes in the jump statistic graph reflect the extreme events in the price, whereas the blue strips represent the smaller jumps that might go unnoticed. The former jumps are easily detected in returns, yet the latter are not as visible.
The factor Cðu 1 ; u 2 Þ is defined as Finally, the integrating factor I f is such that We provide the proof using the following proposition and subsequent lemmas.
Proposition 8.2 The mean and variance of the model (60)-(61) are given by Moreover, their affine forms can be given as linear models of both X and V: As a consequence, we deduce the following system of Riccati equations: Finally, the constant of integration (126) can be re-written as This completes the proof.