What drives risk in China’s soybean futures market? Evidence from a flexible GARCH-MIDAS model

ABSTRACT Modeling futures market risk simultaneously influenced by macro low-frequency information and daily risk factors is a valuable challenge. We propose a new general framework for it based on the flexible GARCH-MIDAS model. It uses a skewed t distribution to describe the asymmetry of long and short trading positions, allows for a different number of trading days per month, and can identify the optimal combination of risky factors. We also derive its impact response function on how low-frequency factors directly influence the high-frequency futures market risk. Through an exhaustive empirical analysis of the Chinese soybean futures market, we not only find its excellent out-of-sample market risk forecasting performance but also offer systematic recommendations for improving risk management.


Introduction
Due to the influence of financialisation, since the mid-2000s almost all commodity markets have witnessed turbulent periods and become more vulnerable to unconventional shocks such as market sentiment, policy uncertainty, and other unexpected events (Zhang & Ji, 2019). Price volatility has a significant impact on production and investment decisions. Producers and consumers face increased price uncertainty and may suffer higher borrowing costs and greater cash flow volatility. In practice, advanced risk measurement models are needed to manage risk and avoid losses.
As the largest emerging economy, China ranks among the top commodity traders in the world. Among agricultural commodities, soybean products are of significant importance since they represent a major component of household consumption and have a critical influence on food security (Ordu, Oran, & Soytas, 2018). Compared to other international soybean futures markets, China's soybean futures market has its own peculiarities. According to the China Futures Association (CFA), China imports more than eighty percent of its total soybean consumption, which reached 85.511 million tons in 2019, accounting for 60% of global soybean trade. Given the huge domestic demand, the Ministry of Agriculture and Rural Affairs of China has decided to implement [...] the MIDAS model using a skewed t-distribution. We construct a new complete risky factor set from the aspects of supply and demand, substitutes, downstream products, related financial markets, and economic conditions. Then, we identify the best factor combination by eliminating strongly correlated factors and comparing their significance using variance ratios. More importantly, we derive for the first time a dynamic equation describing how low-frequency variables directly affect the volatility or VaR of high-frequency variables. It can help us understand how macro factors affect daily risk. In addition, our model is computationally flexible, allowing for a different number of trading days per month in practice. Finally, backtesting of out-of-sample VaR predictions shows that our model is more effective than the benchmark models.
The remainder of this paper is organized as follows: Section 2 presents the details of the multi-factor GARCH-MIDAS-Skewed t model; Section 3 performs an empirical analysis; Section 4 gives the conclusion.

The new measurement framework for market risk
The classical GARCH-MIDAS model (Engle et al., 2013) rests on assumptions, such as normally distributed errors, that are too simple for analyzing futures markets. Therefore, drawing on the practice of risk management in futures markets, we propose a new risk measurement framework to improve performance.

The flexible GARCH-MIDAS model
First, our model supports unbalanced mixed-frequency data structures, which are very common in practice but rarely discussed in the literature. That is, it allows the number of trading days to vary from month to month, because holidays are unevenly distributed across the twelve months in China. We denote the logarithmic return of the soybean future on day $i$ in month $t$ as $r_{i,t}$, where $t = 1, \ldots, T$ and $i = 1, \ldots, n_t$, and $n_t$ is the number of trading days in month $t$. The return $r_{i,t}$ obeys the following process:

$$ r_{i,t} = \mu + \sqrt{\tau_t g_{i,t}}\,\varepsilon_{i,t}, \qquad \varepsilon_{i,t} \mid \Psi_{i-1,t} \sim F(0,1), \tag{1} $$

where $\Psi_{i-1,t}$ is the information set up to day $i-1$ of month $t$. The conditional variance $\sigma^2_{i,t} = \tau_t g_{i,t}$ is the product of the short-term component of volatility $g_{i,t}$ on day $i$ in month $t$ and the long-term component of volatility $\tau_t$ in month $t$. Like Wang et al. (2021), we take the standardized skewed t distribution with zero mean and unit variance (Hansen, 1994) for the error term $\varepsilon_{i,t}$, which is denoted as $F(0,1)$. Its density function is expressed as follows:

$$ f(z \mid \eta, \lambda) = \begin{cases} bc \left( 1 + \dfrac{1}{\eta - 2} \left( \dfrac{bz + a}{1 - \lambda} \right)^2 \right)^{-(\eta+1)/2}, & z < -a/b, \\[2ex] bc \left( 1 + \dfrac{1}{\eta - 2} \left( \dfrac{bz + a}{1 + \lambda} \right)^2 \right)^{-(\eta+1)/2}, & z \ge -a/b, \end{cases} \tag{2} $$

where $\eta$ and $\lambda$ are the degrees of freedom and skewness, respectively, $2 < \eta < \infty$, and $-1 < \lambda < 1$. The constants $a$, $b$, and $c$ are given by

$$ a = 4 \lambda c \, \frac{\eta - 2}{\eta - 1}, \qquad b^2 = 1 + 3\lambda^2 - a^2, \qquad c = \frac{\Gamma\left(\frac{\eta+1}{2}\right)}{\sqrt{\pi(\eta - 2)}\,\Gamma\left(\frac{\eta}{2}\right)}. \tag{3} $$

Second, the volatility on the first day of a month differs from that on other days because, in a GARCH(1,1) process, it is influenced by the macro factors of the previous month. We therefore put forth an exact formula for the short-term volatility dynamics, which is not adequately stated in Engle et al. (2013). If $i = 1$, the short-term component of volatility on the first day of the month depends on the squared residual on the last day of the previous month as follows:

$$ g_{1,t} = (1 - \alpha - \beta) + \alpha \frac{(r_{n_{t-1},t-1} - \mu)^2}{\tau_{t-1}} + \beta g_{n_{t-1},t-1}. \tag{4} $$

Starting from the second day of a month, that is, if $i > 1$, the short-term component of volatility is given by

$$ g_{i,t} = (1 - \alpha - \beta) + \alpha \frac{(r_{i-1,t} - \mu)^2}{\tau_t} + \beta g_{i-1,t}, \tag{5} $$

where $\alpha + \beta < 1$ holds in Eqs. (4) and (5).

Third, it is important to analyze the combined impacts of multiple factors on commodity futures volatility in the framework of GARCH-MIDAS. However, to our knowledge, no literature so far discusses how to determine the optimal combination of factors. We propose a method to do so. We first determine the contributions of the risky factors, measured by the variance ratio (Engle et al., 2013), that is, $\mathrm{Var}(\log \tau_t) / \mathrm{Var}(\log(\tau_t g_{i,t}))$. Then, we use stepwise regression to add them to the equation in order of importance from highest to lowest. A newly added factor is retained if it is significant and improves the variance contribution, and so on. For a set of variables with correlations larger than 0.5, we first add the one with the highest variance contribution to the equation. If it is retained, the other variables in this group are eliminated; if it is eliminated, the variable with the next highest variance contribution is added, and the process is repeated until the end.
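The selection procedure described above can be sketched in a few lines. This is only an illustrative sketch: `fit` is a hypothetical callback that would estimate the multi-factor model for a given factor list and return its variance ratio together with a flag for whether the newest factor is significant. The factor names, correlations, and the toy scoring rule are made up so that the example runs on its own.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def select_factors(
    single_vr: Dict[str, float],
    corr: Dict[Tuple[str, str], float],
    fit: Callable[[Sequence[str]], Tuple[float, bool]],
    threshold: float = 0.5,
) -> List[str]:
    """Forward selection ordered by single-factor variance ratio."""

    def correlated(a: str, b: str) -> bool:
        rho = corr.get((a, b), corr.get((b, a), 0.0))
        return abs(rho) > threshold

    kept: List[str] = []
    best_vr = 0.0
    # Candidates are visited from the highest to the lowest variance ratio.
    for name in sorted(single_vr, key=single_vr.get, reverse=True):
        # A candidate strongly correlated with a retained factor is dropped.
        if any(correlated(name, k) for k in kept):
            continue
        vr, significant = fit(kept + [name])
        # Retain only if significant and the variance ratio improves.
        if significant and vr > best_vr:
            kept.append(name)
            best_vr = vr
    return kept


# Toy stand-in for model estimation (hypothetical numbers).
GAIN = {"A": 0.5, "B": 0.4, "C": 0.2, "D": 0.0}


def toy_fit(names: Sequence[str]) -> Tuple[float, bool]:
    return sum(GAIN[n] for n in names), GAIN[names[-1]] > 0.05


selected = select_factors(
    single_vr={"A": 0.5, "B": 0.4, "C": 0.3, "D": 0.1},
    corr={("A", "B"): 0.8},
    fit=toy_fit,
)
print(selected)
```

In the toy run, B is skipped because it is strongly correlated with the retained factor A, and D is rejected because it is not significant. If A had been rejected instead, B would have been tried next, which mirrors the fallback rule for correlated groups.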
For instance, we introduce one low-frequency exogenous factor $Z_t$ and one high-frequency exogenous factor $x_{i,t}$, measured by logarithmic returns, into $\tau_t$ in turn as follows:

$$ \log \tau_t = m + \theta_1 \sum_{k=1}^{K} \varphi_k(\omega_1) Z^{l(v)}_{t-k} + \theta_2 \sum_{k=1}^{K} \varphi_k(\omega_2) RV_{t-k}, \tag{6} $$

where $Z^{l}_{t-k}$ and $Z^{v}_{t-k}$ are the level and volatility, respectively, of the exogenous variable at lag $k$, and $RV_t$ is the monthly realized volatility, which equals $\sum_{i=1}^{n_t} x^2_{i,t}$. In Eq. (6), $Z^{l(v)}_{t-k}$ and $RV_t$ are smoothed through the MIDAS regression. The single-parameter beta weight function is defined in Engle et al. (2013), where we give the most recent information a larger weight:

$$ \varphi_k(\omega) = \frac{(1 - k/K)^{\omega - 1}}{\sum_{j=1}^{K} (1 - j/K)^{\omega - 1}}. \tag{7} $$

If the coefficient $\theta_2$ is significant and the variance contribution increases after adding $x_{i,t}$, we keep it. Other variables that are highly correlated with $x_{i,t}$, if present, are eliminated.
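As a minimal sketch of the MIDAS smoothing step, the snippet below computes the single-parameter beta weights of Eq. (7) and the log long-term component for one factor. The function names and the toy inputs are illustrative assumptions, and the code assumes $\omega \ge 1$ so that the weight at lag $K$ is well defined.

```python
def beta_weights(K, omega):
    """Single-parameter beta weights phi_1..phi_K (assumes omega >= 1)."""
    raw = [(1.0 - k / K) ** (omega - 1.0) for k in range(1, K + 1)]
    total = sum(raw)
    return [w / total for w in raw]


def log_tau(m, theta, omega, lags):
    """log(tau_t) for one factor; lags[0] is Z_{t-1}, lags[1] is Z_{t-2}, ..."""
    weights = beta_weights(len(lags), omega)
    return m + theta * sum(w * z for w, z in zip(weights, lags))


w = beta_weights(12, 3.0)
print([round(x, 4) for x in w])              # weights decay with the lag
print(log_tau(0.1, 2.0, 1.0, [1.0, 1.0, 1.0]))  # omega = 1 gives equal weights
```

With $\omega > 1$ the weights decline monotonically in the lag, so the most recent month receives the largest weight, and the weight on lag $K$ is exactly zero.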
Our model can use the likelihood-based approach to estimate the parameters. In the case of a single-factor model, we denote the parameter vector as $\Theta = \{\mu, \alpha, \beta, m, \theta, \omega, g_0, \eta, \lambda\}$, where $g_0$ is the initial parameter in the recursive Eq. (5). The log-likelihood function (LLF) is presented as follows:

$$ \mathrm{LLF}(\Theta) = \sum_{t=1}^{T} \sum_{i=1}^{n_t} \log \left[ \frac{1}{\sqrt{\tau_t(\Theta) g_{i,t}(\Theta)}} \, f\left( \frac{r_{i,t} - \mu}{\sqrt{\tau_t(\Theta) g_{i,t}(\Theta)}} \right) \right], \tag{8} $$

where $f(\cdot)$ is the standardized skewed t density function in Eq. (2). We obtain the optimal estimated parameters by minimizing the negative LLF. The optimal lag order $K$ is determined by the Bayesian information criterion (BIC).
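The negative log-likelihood can be sketched as follows, using Hansen's (1994) standardized skewed t density and a short-term recursion that handles months of unequal length. Several choices here are assumptions for illustration: the long-term components `tau` are passed in precomputed, `g0` is used as the short-term component of the very first observation, and on the first day of a month the lagged squared residual is divided by the previous month's long-term component.

```python
import math


def skewt_constants(eta, lam):
    """Constants a, b, c of Hansen's (1994) standardized skewed t density."""
    c = math.exp(math.lgamma((eta + 1) / 2) - math.lgamma(eta / 2))
    c /= math.sqrt(math.pi * (eta - 2))
    a = 4 * lam * c * (eta - 2) / (eta - 1)
    b = math.sqrt(1 + 3 * lam ** 2 - a ** 2)
    return a, b, c


def skewt_logpdf(z, eta, lam):
    """Log density of the standardized skewed t at z."""
    a, b, c = skewt_constants(eta, lam)
    scale = 1 - lam if z < -a / b else 1 + lam
    u = (b * z + a) / scale
    return math.log(b * c) - 0.5 * (eta + 1) * math.log1p(u * u / (eta - 2))


def neg_loglik(returns_by_month, tau, mu, alpha, beta, g0, eta, lam):
    """Negative LLF; returns_by_month[t] may have a different length per month."""
    nll = 0.0
    g_prev, r_prev, tau_prev = g0, None, None
    for r_month, tau_t in zip(returns_by_month, tau):
        for i, r in enumerate(r_month):
            if r_prev is None:
                g = g0  # assumed initial value of the short-term component
            else:
                # First day of a month: scale by last month's tau.
                tau_lag = tau_prev if i == 0 else tau_t
                g = (1 - alpha - beta) + alpha * (r_prev - mu) ** 2 / tau_lag + beta * g_prev
            sigma = math.sqrt(tau_t * g)
            nll -= skewt_logpdf((r - mu) / sigma, eta, lam) - math.log(sigma)
            g_prev, r_prev = g, r
        tau_prev = tau_t
    return nll


toy_returns = [[0.010, -0.020], [0.005, 0.000, -0.010]]  # two months, 2 and 3 days
value = neg_loglik(toy_returns, [1e-4, 2e-4], 0.0, 0.05, 0.90, 1.0, 6.0, -0.1)
print(value)
```

In practice this function would be handed to a numerical optimizer over $\Theta$, with $\tau_t$ recomputed from $m$, $\theta$, and $\omega$ at each evaluation; that step is omitted here.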

The impact response function
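Fourth, it is important to discover the direct responses of high-frequency volatility and VaR to low-frequency information, also called the impact response function. In the original GARCH-MIDAS model, the daily volatility is decomposed into the product of the short-term volatility component and the long-term volatility component, so it cannot directly reveal how low-frequency variables affect daily volatility or VaR. Here, we present for the first time the response functions of daily volatility and VaR to the impacts of low-frequency information. Without loss of generality, we take the single-factor model to derive a direct expression of $\sigma_{i,t}$, in which the long-term component is

$$ \tau_t = \exp\left( m + \theta \sum_{k=1}^{K} \varphi_k(\omega) Z^{l(v)}_{t-k} \right). \tag{9} $$

Let $\gamma = (1 - \alpha - \beta)\exp(m)$ and $\Delta Z^{l(v)}_{t-k} = Z^{l(v)}_{t-k} - Z^{l(v)}_{t-k-1}$. We substitute Eq. (9) into Eq. (4), and then we have the conditional variance $\sigma^2_{1,t}$ as follows:

$$ \sigma^2_{1,t} = \gamma \exp\left( \theta \sum_{k=1}^{K} \varphi_k(\omega) Z^{l(v)}_{t-k} \right) + \exp\left( \theta \sum_{k=1}^{K} \varphi_k(\omega) \Delta Z^{l(v)}_{t-k} \right) \left[ \alpha (r_{n_{t-1},t-1} - \mu)^2 + \beta \sigma^2_{n_{t-1},t-1} \right]. \tag{10} $$

We substitute Eq. (9) into Eq. (5), and then we have the conditional variance $\sigma^2_{i,t}$ for $i > 1$ as follows:

$$ \sigma^2_{i,t} = \gamma \exp\left( \theta \sum_{k=1}^{K} \varphi_k(\omega) Z^{l(v)}_{t-k} \right) + \alpha (r_{i-1,t} - \mu)^2 + \beta \sigma^2_{i-1,t}. \tag{12} $$

Therefore, we derive a reparameterized GARCH-MIDAS model based on Eqs. (1), (10), and (12) as follows:

$$ r_{i,t} = \mu + \sigma_{i,t} \varepsilon_{i,t}, \qquad \sigma^2_{i,t} = \begin{cases} \gamma \exp\left( \theta \sum_{k=1}^{K} \varphi_k(\omega) Z^{l(v)}_{t-k} \right) + \exp\left( \theta \sum_{k=1}^{K} \varphi_k(\omega) \Delta Z^{l(v)}_{t-k} \right) \left[ \alpha (r_{n_{t-1},t-1} - \mu)^2 + \beta \sigma^2_{n_{t-1},t-1} \right], & i = 1, \\[2ex] \gamma \exp\left( \theta \sum_{k=1}^{K} \varphi_k(\omega) Z^{l(v)}_{t-k} \right) + \alpha (r_{i-1,t} - \mu)^2 + \beta \sigma^2_{i-1,t}, & i > 1. \end{cases} \tag{14} $$

Compared with the original GARCH-MIDAS model, Model (14) clearly reveals how low-frequency variables directly affect the conditional variance of the high-frequency dependent variable. Specifically, the impacts of low-frequency variables on the volatility on the first day of a month follow a different pattern from their impacts on the other trading days. The impacts of the low-frequency variables are reflected in the monthly adjusted intercepts $\gamma \exp(\theta \sum_{k=1}^{K} \varphi_k(\omega) Z^{l(v)}_{t-k})$. In addition, from the second trading day of a month onward the slopes remain unchanged at $\alpha$ and $\beta$. On the first day of a month, however, such impacts also change the slopes, which become $\alpha$ and $\beta$ multiplied by $\exp(\theta \sum_{k=1}^{K} \varphi_k(\omega) \Delta Z^{l(v)}_{t-k})$.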
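The equivalence between the product form $\tau_t g_{i,t}$ and the direct recursion can be checked numerically. The sketch below assumes that on the first day of a month the lagged squared residual in the short-term recursion is divided by the previous month's long-term component; all parameter values, factor values, and returns are made-up toy numbers.

```python
import math


def beta_weights(K, omega):
    raw = [(1 - k / K) ** (omega - 1) for k in range(1, K + 1)]
    total = sum(raw)
    return [x / total for x in raw]


mu, alpha, beta, m, theta, omega, K = 0.0, 0.08, 0.85, -9.0, -0.6, 3.0, 3
Z = [0.2, -0.1, 0.4, 0.3, -0.2, 0.1, 0.5]        # toy monthly factor, Z[t] is month t
dZ = [0.0] + [Z[j] - Z[j - 1] for j in range(1, len(Z))]  # dZ[j] = Z[j] - Z[j-1]
returns = {4: [0.004, -0.012, 0.007],            # months with unequal numbers of days
           5: [-0.003, 0.010],
           6: [0.006, -0.001, -0.008, 0.002]}
w = beta_weights(K, omega)
gamma = (1 - alpha - beta) * math.exp(m)


def smooth(t, series):
    """MIDAS-weighted sum over lags 1..K of a monthly series."""
    return sum(w[k - 1] * series[t - k] for k in range(1, K + 1))


product_path, direct_path = [], []
g_prev, r_prev, s2_prev, tau_prev = 1.0, None, None, None
for t in sorted(returns):
    tau_t = math.exp(m + theta * smooth(t, Z))
    intercept = gamma * math.exp(theta * smooth(t, Z))
    for i, r in enumerate(returns[t]):
        if r_prev is None:
            g, s2 = 1.0, tau_t                   # common starting value
        else:
            tau_lag = tau_prev if i == 0 else tau_t
            g = (1 - alpha - beta) + alpha * (r_prev - mu) ** 2 / tau_lag + beta * g_prev
            # Slope multiplier applies only on the first day of a month.
            mult = math.exp(theta * smooth(t, dZ)) if i == 0 else 1.0
            s2 = intercept + mult * (alpha * (r_prev - mu) ** 2 + beta * s2_prev)
        product_path.append(tau_t * g)
        direct_path.append(s2)
        g_prev, r_prev, s2_prev = g, r, s2
    tau_prev = tau_t

print(all(math.isclose(p, d, rel_tol=1e-9) for p, d in zip(product_path, direct_path)))
```

The two paths agree because the ratio of consecutive long-term components equals the exponential of the MIDAS-weighted increments, which is exactly the first-day slope multiplier.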

The estimation of VaR
To analyze the response of market risk or VaR to low-frequency information, we can easily obtain the quantile expression of Eq. (1) as follows:

$$ Q_{r_{i,t}}(p \mid \Omega_{t-1}) = \mu + \sigma_{i,t} \, Q_{\varepsilon_{i,t}}(p), \tag{15} $$

where $Q_{r_{i,t}}(p \mid \Omega_{t-1})$ is the quantile, or VaR, of the return $r_{i,t}$ at level $p$, and $Q_{\varepsilon_{i,t}}(p)$ is the quantile of the standardized skewed t distribution at level $p$, which remains unchanged for a fixed $p$. Thus, the dynamic market risk is a linear function of the time-varying volatility $\sigma_{i,t}$. We put forth Model (14) and Eq. (15) to explain how low-frequency variables impact the volatility and VaR, respectively. To verify the accuracy of our risk measures, we use several different backtesting methods discussed in the empirical section.
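A minimal sketch of the VaR calculation follows. The standardized skewed t quantile is obtained by numerically inverting the CDF (Simpson integration plus bisection), which is one simple option rather than the method used in the paper. Treating the short-position VaR as the upper quantile at level $1 - p$ is an assumption made here for illustration, and the parameter values are made up.

```python
import math


def skewt_pdf(z, eta, lam):
    """Hansen's (1994) standardized skewed t density."""
    c = math.exp(math.lgamma((eta + 1) / 2) - math.lgamma(eta / 2))
    c /= math.sqrt(math.pi * (eta - 2))
    a = 4 * lam * c * (eta - 2) / (eta - 1)
    b = math.sqrt(1 + 3 * lam ** 2 - a ** 2)
    scale = 1 - lam if z < -a / b else 1 + lam
    u = (b * z + a) / scale
    return b * c * (1 + u * u / (eta - 2)) ** (-(eta + 1) / 2)


def skewt_cdf(z, eta, lam, lower=-50.0, n=2000):
    """Composite Simpson approximation of the CDF (n must be even)."""
    h = (z - lower) / n
    total = skewt_pdf(lower, eta, lam) + skewt_pdf(z, eta, lam)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * skewt_pdf(lower + j * h, eta, lam)
    return total * h / 3


def skewt_quantile(p, eta, lam, iterations=40):
    """Invert the CDF by bisection on [-50, 50]."""
    lo, hi = -50.0, 50.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if skewt_cdf(mid, eta, lam) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def var_forecast(mu, sigma, p, eta, lam, position="long"):
    """VaR = mu + sigma * quantile; short positions use the upper tail."""
    level = p if position == "long" else 1 - p
    return mu + sigma * skewt_quantile(level, eta, lam)


long_var = var_forecast(0.0, 0.01, 0.05, 6.0, -0.1, "long")
short_var = var_forecast(0.0, 0.01, 0.05, 6.0, -0.1, "short")
print(long_var, short_var)
```

Because the standardized quantile is fixed for a given level, only $\sigma_{i,t}$ needs updating from day to day, so in a real application the quantile would be computed once per level and reused.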

Risky factors and data description
Soy-related products play an important role in the lives of Chinese people. China is the world's largest soybean consumer market, but it has a low self-sufficiency rate and relies heavily on imports. China imported a cumulative 100,328,200 tons of soybeans in 2020, an increase of 11.7% year-over-year, with imports reaching $39.528 billion. China is therefore keen to manage soybean market risk well and to guarantee the supply chain security of soybean-related industries. In the empirical study, we focus on the representative Soybean No. 1 Futures on the Dalian Commodity Exchange. The Dalian Commodity Exchange is the largest agricultural futures market in China and the second-largest soybean futures market in the world. Chinese Soybean No. 1 Futures uses food-quality non-GMO soybeans as the underlying asset, and imported non-GMO soybeans of acceptable quality can be used as standard or substitute goods for delivery. The logarithmic returns calculated from the closing price of Soybean No. 1 Futures (SFP) are adopted as the dependent variable. We next turn to systematically identifying the risk factors from a comprehensive economic perspective.

Risky factors
Generally speaking, agricultural futures price fluctuations are mainly affected by a combination of factors such as supply and demand, inventories, substitutes, freight rates, trading positions and volumes, spot prices, price fluctuations in international markets, and macroeconomic trends.
Second, the information spillover among commodities is a core driver of commodity prices (Matesanz et al., 2014), so changes in the prices of downstream soybean products, namely the China soybean meal futures price (MEAL) and the China soybean oil futures price (SOFP), can trigger changes in demand and have a knock-on effect on soybean futures volatility. The connections among commodities may stem from the mutual substitution between agricultural products and fossil energy (Baffes, 2007), commodity financialization (Tang & Xiong, 2012), and common macroeconomic factors (Gleich, Achzet, Mayer, & Rathgeber, 2013). Thus, changes in the China corn futures price (CORN), a major soybean substitute, will undoubtedly have a more direct impact on soybean futures.
Third, Natanelov, Alam, McKenzie, and Van-Huylenbroeck (2011) indicate that agricultural commodity futures prices exhibit a long-term co-movement with crude oil prices. Liu, Pan, Yuan, and Chen (2019) find that 11 kinds of agricultural commodity futures prices are positively correlated with crude oil futures prices. Transportation costs have a significant impact on commodity futures prices, and shipping costs are directly affected by international crude oil prices. We use the Brent crude oil spot price (BRENT) as a risk factor. In addition, we use the Baltic Dry Index (BDI), which reflects spot freight rates on major routes.
Fourth, due to the strong demand in the domestic soybean market, price fluctuations on the Chicago Board of Trade, which serves as a bellwether, can be transmitted to the Chinese soybean market. In the short term, domestic spot prices as well as trading volume and trading positions in the futures market will have a more direct impact on futures prices. Therefore, we select four short-term risky factors, namely the soybean futures closing price on the Chicago Board of Trade (CBOT), the China soybean spot price (SPOT), the trading volume of China soybean futures (VOLUME), and the positions of China soybean futures (POSITION).
Fifth, in the long term, the macroeconomic situation affects commodity futures price trends. Batten, Ciner, and Lucey (2010) suggest that macroeconomic determinants, including the business cycle, monetary environment, and financial market sentiment, have spillover effects on commodity prices. The exchange rate of the US dollar as well as interest rates (Chiang, Chen, & Huang, 2019; Gruber & Vigfusson, 2018), economic activity (Klotz, Lin, & Hsu, 2014), and stock levels and yields (Balcombe, 2011) are also important drivers of commodity price fluctuations. Recently, using a generalized dynamic factor model, Kagraoka (2016) suggests that the US inflation rate, world industrial production, the world stock index, and the price of crude oil are four common dynamic factors that determine commodity prices. Uncertainty and extreme events are also a source of volatility. Joets, Mignon, and Razafindrabe (2017) find that agricultural and industrial markets are more sensitive to changes in macroeconomic uncertainty. Hamadi, Bassil, and Nehme (2017) propose that macroeconomic news surprises have significant impacts on the volatility of agricultural commodities. Prokopczuk, Stancu, and Symeonidis (2019) find that credit risk, financial market stress, and fluctuations in business conditions are important predictors of commodity volatility. Some researchers, such as Tzeng and Shieh (2016) and Vercammen (2020), examine the performance of commodity markets in extreme cases and find clear differences from normal periods. Furthermore, Hu, Zhang, Ji, and Wei (2020) show how macro factors contribute to the volatility of soybeans, gold, and West Texas Intermediate (WTI) markets through a dynamic connectedness network. Based on these analyses, we choose four macro variables as risk factors: the exchange rate (ER), the China consumer price index (CPI), the China agricultural production price index (APPI), and the China money supply (M2).

The data descriptive analysis
The data set is obtained from China's WIND financial database and ranges from 16 May 2008 to 24 December 2019. We consider 18 risky factors as discussed above. Table 1 reports their type, frequency, and definition. For daily variables, we take the form of logarithmic return or increment. The sample data is divided into two subsamples. The in-sample data, covering May 2008 to December 2017, is used for volatility modeling. The out-of-sample data, from January 2018 to December 2019, is reserved for prediction.
Table 2 reports the descriptive statistics. We have 2652 daily observations and 140 monthly observations. The negative skewness for SFP indicates that its empirical distribution is left-skewed. The kurtosis of SFP is larger than 3 and all Jarque-Bera statistics are significant, which indicates that the empirical distribution has a sharper peak and thicker tails than the normal distribution. This supports using the skewed t distribution instead of the normal distribution.
The correlations between variables are reported in Table 3. Two factors are strongly correlated if the correlation coefficient exceeds 0.5 and is statistically significant at the 5% level. We find that 39% of all the coefficients show a strong correlation. To avoid multicollinearity, only the factor with the largest variance contribution from a set of strongly correlated factors is retained. Coefficients in bold are statistically significant at the 5% level.

Empirical results
In this section, we demonstrate in detail the process of measuring China's soybean futures market risk using the new model, presenting the key empirical findings.

Identifying important factors
We pre-process the data for each risk factor. The level and variance of monthly factors are calculated using the error and squared error, respectively, from an autoregressive process (Engle et al., 2013). For daily factors, we use their monthly realized volatility to explain the long-term component of soybean volatility. Engle et al. (2013) adopt the variance ratio, that is, $\mathrm{Var}(\log \tau_t) / \mathrm{Var}(\log(\tau_t g_{i,t}))$, to measure the contributions of economic sources to the expected volatility. We use it to evaluate the importance of factors for explaining volatility. The variance ratios obtained from the levels of the factors are higher than those obtained from their variances, except for the corn futures price, money supply, consumer price index, and market trading volume. Each factor has two variance ratios, and we define the larger one as its contribution. We rank all factors in descending order of variance ratio, that is, BRENT(level), CORN(variance), SOFP(level), MEAL(level), ER(level), POSITION(level), OUTPUT(level), CBOT(level), SPOT(level), CPI(variance), M2(variance), IMPORT(level), VOLUME(variance), CONSUME(level), OILE(level), and BDI(level). The last two factors (APPI and STOCK) have weaker contributions, with variance ratios lower than 2%.
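The variance ratio itself is straightforward to compute once the fitted components are available. The sketch below assumes that the long-term component has been expanded to daily frequency (each month's value repeated for every trading day) and uses made-up numbers.

```python
import math


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def variance_ratio(tau_daily, g_daily):
    """Var(log tau) / Var(log(tau * g)) over the daily sample."""
    log_tau = [math.log(t) for t in tau_daily]
    log_total = [math.log(t * g) for t, g in zip(tau_daily, g_daily)]
    return variance(log_tau) / variance(log_total)


# Two toy months (3 and 2 trading days) with made-up fitted values.
tau_daily = [1.0e-4, 1.0e-4, 1.0e-4, 2.5e-4, 2.5e-4]
g_daily = [0.9, 1.2, 1.0, 1.1, 0.8]
print(variance_ratio(tau_daily, g_daily))
```

A ratio near one means the long-term component explains almost all of the variation in the log conditional variance, while a ratio near zero means the factor contributes little.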
Further, Table 5 reports the estimators in all single-factor GARCH-MIDAS-Skewed t models. The estimators of $\theta$ are statistically significant at the 5% level for all factors except four, namely OILE, BDI, APPI, and STOCK. For the related international commodity markets, we find that: (1) the Brent crude oil price (BRENT) has a negative impact on the volatility of SFP because it directly affects the production and transportation costs of soybeans; (2) the international soybean futures price (CBOT), as a bellwether, has a positive impact on the volatility of SFP; (3) the position (POSITION) and market trading volume (VOLUME) have significant negative impacts on it, but the Chinese spot soybean price level (SPOT) plays the opposite role. For the substitutes and downstream products of interest, rising prices of corn futures (CORN), soybean oil futures (SOFP), and soybean meal futures (MEAL) significantly increase the volatility of SFP.
On the side of supply and demand, it is found that: (1) the higher the domestic soybean yield (OUTPUT) in China, the lower the volatility of SFP. However, the greater the volatility of soybean import volume (IMPORT), the greater the volatility of SFP. This finding is questionable. China's soybean imports are mainly GMO soybeans but also contain a small proportion of non-GMO soybeans, a proportion that has been increasing in recent years, while the underlying asset of SFP is non-GMO soybeans. We therefore speculate that as the volume of imported soybeans expands, especially non-GMO soybeans, it will help alleviate demand for edible-grade non-GMO soybeans and thereby reduce price volatility. Therefore, we will re-examine this relationship in the multi-factor model by introducing additional control variables.
(2) the impacts of soybean consumption (CONSUME) and demand for crushing oil (OILE) are both positive but much weaker than those of soybean supply.
In terms of macro factors, it is shown that: (1) the variance of the consumer price index (CPI) has a significant negative impact on the volatility of SFP, but the variance of money supply (M2) contributes positively to it; (2) the level of the freight rate (BDI) has a negative impact with a low variance ratio of 4.88%. The APPI and STOCK are much weaker and can be ignored.

The direct impacts of low-frequency factors on the market risk
Based on Model (14) and Eq. (15), we discuss the dynamic path through which low-frequency factors influence high-frequency volatility or market risk. To be concise, we take the most influential factor, the BRENT level, as an example to analyze the time-varying impact mechanism. We plot the dynamic intercepts for each month and the dynamic slope multipliers, which apply only on the first day of each month, in Figures 1 and 2, respectively. According to the results in Table 5, the estimated coefficient $\theta$ is negative for the BRENT level factor ($Z^l$), while $\gamma = (1 - \alpha - \beta)\exp(m)$ and the weights $\varphi_k(\omega)$ are always positive, so the dynamic intercept $\gamma \exp(\theta \sum_{k=1}^{K} \varphi_k(\omega) Z^l_{t-k})$ is nonlinearly negatively correlated with the BRENT level factor, as shown in Figure 1. For instance, the intercept is lower during the high oil price period of 2011-2014 and higher during the low and sluggish oil price period of 2016-2018.
The slope multiplier $\exp(\theta \sum_{k=1}^{K} \varphi_k(\omega) \Delta Z^l_{t-k})$ on the first day of each month in Eq. (14) also has a nonlinear negative correlation with the MIDAS-weighted increments $\sum_{k=1}^{K} \varphi_k(\omega) \Delta Z^l_{t-k}$. We are more interested in whether the slope multiplier is greater than one. Since the weights $\varphi_k(\omega)$ are positive, this depends mainly on the crude oil price change series $\Delta Z^l_t$. The slope multiplier is less than one when the price of crude oil continues to rise and greater than one when it continues to fall. This can be confirmed from Figure 2. The slope multiplier is less than one during the sustained oil price climb of 2010-2012, while it is greater than one during the sustained oil price decline of 2014-2016. In short, in the long run, high Brent oil prices have a lower impact on the volatility of Chinese soybean futures returns, or market risk, than low oil prices. This may reflect the asymmetric response of investors in the Chinese soybean futures market to expectations about the correlation between international oil price movements and Chinese soybean futures returns. That is, low oil prices could reduce the transportation costs and prices of imported soybeans, which in turn could put pressure on Chinese soybean futures prices and trigger strong volatility, and vice versa. A similar analysis can be performed for the other factors, which is omitted here.

The multi-factor GARCH-MIDAS-Skewed t model
Because multiple factors of different frequencies simultaneously impact soybean futures volatility, it is necessary to construct a GARCH-MIDAS-Skewed t model with more than one factor. As shown in Table 3, the correlations between factors are quite high, which might cause a multicollinearity issue. We take two simple steps to overcome this problem. First, if several factors are highly correlated, we keep only the factor with the highest variance ratio. For instance, SPOT, MEAL, and CBOT are highly correlated with SOFP (over 0.7), and the correlation between CONSUME and OILE reaches 0.9993. Therefore, we can exclude SPOT, MEAL, CBOT, and OILE from the multi-factor GARCH-MIDAS-Skewed t model, because their variance ratios are relatively lower. Second, stepwise regression is used to select variables by adding one factor at a time in order of variance ratio from highest to lowest. Specifically, a new factor is retained only when it is statistically significant and raises the variance ratio. Finally, ten factors remain, namely BRENT(level), CORN(variance), SOFP(level), MEAL(level), CPI(variance), M2(variance), IMPORT(level), VOLUME(variance), CONSUME(level), and BDI(level). The estimated results of the multi-factor GARCH-MIDAS-Skewed t model are shown in Table 6.
The estimated dynamic long-term component of volatility and the daily conditional variances of the soybean futures price are shown in Figure 3. The trends of the estimated long-term volatility and the daily volatility of soybean futures are strongly consistent. Overall, the multi-factor GARCH-MIDAS-Skewed t model has a higher variance ratio of 80.79% (see Table 6) than the maximum of 50.29% among all single-factor models (see Table 4), which means that the selected factors jointly make a stronger contribution to the volatility of SFP. The results indicate that international crude oil prices, prices of downstream products, the consumer price index, money supply, China's soybean imports, the volume of China's soybean futures trading, China's soybean consumption, and the freight rate are significant determinants of the long-term volatility of China's soybean futures.

Backtesting of VaR
As a new parametric model for measuring market risk, the multi-factor GARCH-MIDAS-Skewed t model works well for analyzing the volatility of Chinese soybean futures. We next evaluate its performance in predicting VaR, compared with the GARCH and GARCH-MIDAS (Engle et al., 2013) models with a normal distribution. We focus on assessing the out-of-sample predictive power, covering 458 daily observations from January 2018 to December 2019. Market positions in the soybean futures market are divided into short and long positions. Holding a long position results in losses when the futures price falls, while holding a short position results in losses when the price rises. Our model uses a skewed t distribution that can describe the asymmetry in the distribution of soybean futures returns and therefore allows for a precise assessment of market risk for both long and short positions. When the return on a long (short) position is lower (higher) than the estimated VaR, a VaR violation, or hit, occurs. The hit ratio (HR) is a simple measure of predictive quality, defined as the number of hits divided by the total number of observations. A good model should have an HR close to the significance level ($P$) of VaR, so we define a relative percentage index as $PI = |HR - P| / P$. The smaller the PI, the better the model.
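The hit ratio and the percentage index can be computed as in the following sketch. The returns and VaR forecasts are made-up numbers, and the function names are illustrative.

```python
def hit_ratio(returns, var_forecasts, position="long"):
    """Share of days on which the return breaches the VaR forecast."""
    if position == "long":
        hits = [r < v for r, v in zip(returns, var_forecasts)]   # loss when price falls
    else:
        hits = [r > v for r, v in zip(returns, var_forecasts)]   # loss when price rises
    return sum(hits) / len(hits)


def percentage_index(hr, p):
    """Relative distance between the hit ratio and the VaR level."""
    return abs(hr - p) / p


toy_returns = [-0.03, 0.01, 0.00, 0.02, -0.01]
long_var = [-0.02] * 5       # lower-tail VaR forecasts for a long position
short_var = [0.025] * 5      # upper-tail VaR forecasts for a short position

hr_long = hit_ratio(toy_returns, long_var, "long")
hr_short = hit_ratio(toy_returns, short_var, "short")
print(hr_long, percentage_index(hr_long, 0.05))
print(hr_short, percentage_index(hr_short, 0.05))
```

With only five observations the ratios are of course not meaningful; the point is the direction of the comparison for each position and the normalization by the VaR level.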
The backtesting of VaR forecasts is usually performed with the likelihood ratio (LR) tests proposed by Kupiec (1995) and Christoffersen (1998), which assess the unconditional coverage (UC), the conditional coverage (CC), and the independence (ind) of VaR exceedances or violations. We also adopt the popular dynamic quantile (DQ) test proposed by Engle and Manganelli (2004), which links the violations to a set of explanatory variables that include a constant, the VaR forecast, and the first four lagged hits.
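As an illustration of the unconditional coverage test, the sketch below computes Kupiec's LR statistic and its p-value from a chi-squared distribution with one degree of freedom. The hit counts are hypothetical; only the sample size of 458 days is taken from the text. The independence, conditional coverage, and DQ tests are not covered here.

```python
import math


def kupiec_uc(hits, n, p):
    """Return (LR statistic, p-value) of the unconditional coverage test."""

    def loglik(prob):
        # Binomial log-likelihood with the convention 0 * log(0) = 0.
        out = 0.0
        if hits > 0:
            out += hits * math.log(prob)
        if hits < n:
            out += (n - hits) * math.log(1 - prob)
        return out

    lr = max(-2.0 * (loglik(p) - loglik(hits / n)), 0.0)
    # Survival function of chi-squared(1): P(X > lr) = erfc(sqrt(lr / 2)).
    p_value = math.erfc(math.sqrt(lr / 2.0))
    return lr, p_value


lr_ok, pv_ok = kupiec_uc(hits=5, n=458, p=0.01)     # close to the expected 4.58 hits
lr_bad, pv_bad = kupiec_uc(hits=20, n=458, p=0.01)  # far too many hits
print(round(lr_ok, 4), round(pv_ok, 4))
print(round(lr_bad, 4), pv_bad)
```

A model passes the test at a given significance level when the p-value exceeds that level, as in the first hypothetical case.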
We compare the forecasting performance of the models of interest, considering long and short positions and two significance levels (1% and 5%). The multi-factor GARCH-MIDAS-normal and multi-factor GARCH-MIDAS-Skewed t models have the same explanatory variables as in Section 4.3. The backtesting of out-of-sample VaR forecasts and the model comparison are shown in Table 7.
First, each model has four PIs in Table 7. Their means are 53.57%, 45.27%, and 34.80% for the GARCH(1,1), GARCH-MIDAS-Normal, and GARCH-MIDAS-Skewed t models, respectively. The GARCH and GARCH-MIDAS models with a normal distribution do not yield accurate forecasts, especially at the 5% level. Therefore, our model, which has the smallest PI, performs better in VaR forecasting than the other two. In particular, our model performs better for short VaR forecasts than for long VaR forecasts, because the PIs are lower in the right tail than in the left tail. Second, it should be noted that, to distinguish between long and short positions, our model allows VaR to be negative, although in practice it is customary to express risk in terms of the absolute amount of VaR. In the financial industry, according to the Basel Accord, capital adequacy depends on the absolute value of the predicted VaR. With the same quality of backtesting, models with lower absolute VaRs have a competitive advantage because they achieve the same risk prevention effect while maintaining a lower level of capital adequacy. To assess the economic value of risk models, we [...] Third, it is important for the backtesting process that only the GARCH-MIDAS-Skewed t model passes all LR tests and DQ tests at the 1% and 5% significance levels for both long and short positions. We plot the predicted VaRs for soybean futures long and short positions at the 1% significance level using the multi-factor GARCH-MIDAS-Skewed t model in Figure 4.
Overall, the multi-factor GARCH-MIDAS-Skewed t model has obvious advantages over other investigated models in predicting VaR.

Conclusions
China has long experienced very high market demand for soybeans, and its dependence on foreign soybean imports has increased in the past decade. This has triggered dramatic fluctuations in soybean futures prices. Therefore, risk management in the soybean futures market has become extremely important for stakeholders, a concern that is also prevalent in other agricultural futures markets. As a result, there is an urgent need for more sophisticated risk measurement tools on a global scale. Traditional risk measurement models are flawed in their inability to use mixed-frequency information. The contribution of this paper includes two aspects. We first put forth an improved GARCH-MIDAS-Skewed t model to overcome this dilemma, which, as described in Section 2, has more flexible settings than the GARCH-MIDAS model (Engle et al., 2013). The multi-factor mixed-frequency GARCH-MIDAS-Skewed t model has a higher variance ratio than all single-factor models. It also produces better VaR forecasts for long and short positions than the GARCH and GARCH-MIDAS-normal models.
In addition, we provide a practical and instructive analysis of volatility and market risk in the Chinese soybean futures market.We believe this approach is equally applicable to other agricultural futures.We systematically demonstrate how to identify the important factors, find the best combination of factors, and perform a backtesting of VaR forecasts.
Based on the findings of the empirical analysis, we give five useful recommendations for risk management in the Chinese soybean futures market as follows:
(1) Investors should pay particularly close attention to the volatility of international crude oil prices. Brent crude oil prices are linked to soybean production and transportation costs, and when crude oil prices fall, soybean futures prices fall in tandem, with increased demand triggering increased volatility. The Baltic Dry Index freight rates play a similar role, but have a much weaker impact than oil prices.
(2) It is important to monitor price fluctuations of substitutes such as corn and downstream products such as soybean oil, which can trigger a homogeneous linkage of risks in the soybean futures market.
(3) The impacts of CPI and M2 cannot be ignored in the long term. Increased CPI volatility and higher price instability will increase investor uncertainty about commodity price expectations, thereby dampening trading sentiment and reducing price volatility. When the money supply fluctuates more, it is important to guard against increased risk in the soybean futures market.
(4) Market participants must pay attention to changes in the volume and structure of China's soybean imports (GMO and non-GMO soybeans). China's import dependence on soybeans has been above 80%. China imports soybeans mainly to meet protein feed demand, and in recent years it has been increasing its imports of non-GMO soybeans to meet edible vegetable oil demand. More than 80% of China's homegrown soybeans are processed into food. As a result, imported and domestically produced soybeans are substitutable to a certain degree in terms of usage, so an increase in soybean imports will, to a certain extent, reduce volatility and market risk in China's soybean futures market. Of course, increased soybean consumption in China will directly exacerbate volatility and market risk in the soybean futures market.
(5) China should develop a reasonable subsidy policy to encourage farmers to plant soybeans, increase the domestic supply of soybeans, and reduce dependence on soybean imports. At the same time, it should diversify the countries from which it imports soybeans; establish reasonable trade mechanisms, such as mutually lowering tariffs and increasing quotas; expand the share of non-GMO soybeans in imports; and maintain stable domestic soybean spot and futures prices.
In summary, we propose a new systematic framework for measuring futures market risk that integrates mixed-frequency information into volatility modeling and market risk measurement. The framework effectively addresses the case where the influencing factors are observed at different frequencies and provides a method for identifying the optimal factor combination. In particular, combined with the empirical findings, we propose effective recommendations for improving risk management in the Chinese soybean futures market.

Disclosure statement
No potential conflict of interest was reported by the authors.

Figure 1. The dynamic relationship between the time-varying intercepts $\gamma \exp\left(\theta \sum_{k=1}^{K} \varphi_k(\omega) Z^{l}_{t-k}\right)$ and the BRENT crude oil price. The latter is plotted on the right vertical axis.

Figure 2. The dynamic relationship between the slope multipliers on the first day of each month, $\exp\left(\theta \sum_{k=1}^{K} \varphi_k(\omega) \Delta Z^{l}_{t-k}\right)$, and the BRENT crude oil price. The latter is plotted on the right vertical axis.

Figure 3. The estimated dynamic long-term component of volatility and daily conditional variances of the soybean futures price.

Figure 4. The estimated VaR for long and short soybean futures assets at a 1% confidence level.

Table 2. Descriptive statistics of the variables.
Table 4 reports the variance ratios of the level and variance of all factors.
P values are reported below the correlations. Bold estimators are statistically significant at the 5% level.

Table 4. The variance ratios of all single-factor models.
Bold letters indicate the maximum of variance ratios among level and variance effects.

Table 5. Estimators of all single-factor GARCH-MIDAS-Skewed t models.

Table 6. Estimators of the multi-factor GARCH-MIDAS-Skewed t model.

Table 7. Backtesting of out-of-sample VaR forecasts and the model comparison.
P values are below the estimators. Bold estimators indicate rejections from the LR tests and DQ test at a 1% significance level.

... calculate the mean of the absolute value of out-of-sample VaRs (MAVaR) in each model, as shown in Table 7. In Panels B, C, and D of Table 7, the GARCH-MIDAS-Skewed t model presents lower PIs and lower MAVaRs than the GARCH-MIDAS-Normal model. In Panel A, although the GARCH-MIDAS-Normal model has a lower MAVaR than the GARCH-MIDAS-Skewed t model, it does not pass the DQ test at a 1% significance level.