A multivariate cointegration time series model and its applications in analysing stock markets in China

Abstract This paper explores nonlinear cointegration between Chinese mainland stock markets and Hong Kong stock market in a multivariate framework for the period January, 1998 to December, 2014 by a nonparametric method. The local linear kernel smoothing method is developed to estimate the unknown function, and the practical problem of implementation is also addressed. Then, a simple nonparametric version of a bootstrap test is adapted for testing misspecification. Furthermore, Some Monte Carlo experiments are presented to examine the finite sample performance of the proposed procedure. Finally, the stock markets data set is discussed in detail by using proposed procedures, showing that Shanghai Stock Index (SHSI) and Shenzhen Component Index (SZCI) can affect Hang Seng Index (HSI), and the influence appears to be a strong nonlinear characteristics.


Introduction
China is one of the fastest-growing economies in the world. According to International Monetary Fund (IMF) and purchasing power parity (PPP), Chinese economy is the second largest in the world by nominal GDP. China's financial sector, especially stock market has complemented the growth process significantly since the onset of financial reforms in the late nineties. Over the past two decades, the stock market has grown stupendously in terms of increase in size and volume of investment by both domestic and international investors. The relationship between stock prices and macroeconomic variables has been discussed all over the world. The growing linkages between macroeconomic variables and the movement of stock prices have well been documented in the literature over the last several years (Sadorsky, 2003;Chen, 2003;Hadi, Katircio glu, & Adaoglu, 2019;Katircio glu, Alkhazaleh, & Katircio glu, 2018;Katircio glu & Zabolotnov, 2019;Rezgallah, € Ozataç, & Katircio glu, 2019;Shaeri & Katircio glu, 2018). Campbell (1987Campbell ( , 1991, Fama and Schwert (1997) showed that short-and long-term interest rates have a modest degree of forecasting power for excess stock returns. Similarly, other studies, such as Campbell and Shiller (1991) and Fama (1984), have shown that the slope of the term structure of interest rates helps to forecast excess stock returns. On the other hand, Campbell and Ammer (1993) showed that short-term interest rates affect stock prices. Daily stock prices are determined by many factors, including enterprise performance, dividends, other countries stock prices, gross domestic product, exchange rates, interest rates, current account, money supply, employment, and so on. Countless factors impact stock prices. In China, the regional stock exchanges from Shanghai, Shenzhen and Hong Kong simultaneously reflect the country's economy. For the different locations, economic development backgrounds and trading rules, the three stock markets are showing different operating characteristics. To better explore the susceptibility the movements of of stock market in relation to country's macroeconomic indicators and fashion investment decisions and policy decisions for investors and policy makers, it is necessary to study the contact of the three stock markets.
The study of linear nonstationary time series and linear cointegration relationships has led to a huge development in macro econometrics and financial econometrics, since most of macro and financial time series exhibit nonstationary features. One of the main disadvantages of conventional linear cointegration methodologies is that it assumes that the cointegrating relationship does not change over the entire data span, which is too unrealistic to be true especially when the time series is long, such as Johansen (1988Johansen ( , 1991Johansen ( , 1995, Johansen and Juselius (1990), Shin (1998), andSmith (2001) for more details. Recently, there is a growing interest on the nonlinear properties of nonstationary time series. In this article, we investigate the local linear estimation of a nonparametric multivariate cointegration model of the form where X t is d-dimensional predictor variable with each element I(1) process, f ðÁÞ is an unknown to be estimated with the observed data fX t , Y t g n t¼1 but sufficiently smooth function, and u t is an I(0) process.
When X t is a univariate I(1) process, model (1) has been studied by many researchers. For instance, Kanas (2003) used the alternating conditional expectations (ACE) algorithm to nonlinearly transform stock prices and dividends for the USA for the period 1871-1999 and found strong evidence of cointegration between the transformed variables, which could be characterized as nonlinear cointegration. Karlsen, Myklebust, and Tjøstheim (2007) discussed the estimation of model (1) by the local constant kernel estimator and derive the asymptotic result under the framework of null recurrent Markov chains which include the unit root process as a special case. Based on the local constant kernel estimator, Wang and Phillips (2009a) developed the asymptotic theory for local time density estimation for a general class of functionals of integrated and fractionally integrated time series. Wang and Phillips (2009b) investigated model (1) with endogeneity, i.e., there is dependence between the nonstationary regressor X t and the stationary error u t . They show that the local constant kernel estimator is consistent and the limit distribution is mixed normal. Furthermore, by using local linear method, Liang, Lin, and Hsiao (2015) studied model (1), and derived the properties of the local linear estimator in the nonstationary setting. They also considered a data-driven method to select the bandwidth, and found a drastic improvement of the mean squared errors of the local linear estimator compared to the local constant estimator. Other papers include Breitung (2002); Ma and Kanas (2004); Bae, Kakkar, and Ogaki (2006) ;Jawadi, Bruneau, and Sghaier (2009);Cross and Worden (2011); and Ghosh and Kanjilal (2014) and reference therein. To our best known, however, a study on the estimation and application of multivariate nonlinear cointegration has been missing.
The development of the multivariate cointegration time series models was motivated by structural modeling of stock markets data of stock data of China. The data set consisting of 204 observations from Shanghai Stock Market, Shenzhen Stock Market and Hong Kong Stock Market, respectively, which can be obtained from http://finance.sina. com.cn/. The data is from January, 1998 to December, 2014 and the monthly closing prices of three market indexes are used. The interest of our study is to analyze the long-term relationship among SHSI (fP 1t g), SZCI (fP 2t g) and HSI (fP 3t g). Some financial time series such as quarterly earning per share of a company exhibits certain seasonality. In some applications, seasonality is of secondary importance and is removed from the data, resulting in a seasonally adjusted time series that is then used to make inference. Detailed analysis of the data set is reported in Section 4.
It is well known that in the stationary setup the local linear kernel estimator is superior to the local constant kernel estimator because it has less finite sample bias and no boundary effect, see Fan and Gijbels (1996). Therefore, in this article, we extend the local linear estimator to deal with the estimation of multivariate cointegration models. Also, we consider a data-driven method to select the bandwidth matrix. Further, the proposed method is applied to investigate the relationship among three stock markets in China, which shows that the proposed method is valid and accurate.
The plan of the article is as follows. Section 2 gives the selection and pre-processing of data. Section 3 discusses the analysis of stock index and the Granger causality test. Section 4 develops the proposed estimation method and presents an empirical application to stock index data. Section 5 contains the conclusion and discussion.

Methodologies and practical issues in model building
This section consists of two parts. In Section 2.1, the proposed local linear estimation is described. Some practical issues in model building are discussed in Section 2.2.

Local linear estimation
We consider the nonparametric cointegration model (1), where f ðÁÞ is a smooth function without other restrictions on its specific functional form. X t follows an I(1) process, i.e., X t ¼ X tÀ1 þ e t , t ¼ 1, . . . , n, with X 0 ¼ 0 and e t being a weakly dependent stationary process. The disturbance u t is a stationary process, i.e., an I(0) process.
For nonparametric regression models with independent and weakly dependent stationary data, it is well established that the local linear estimation method enjoys some optimality properties such as boundary estimation accuracy and minimax efficiency.
In this paper, we develop a class of kernel-type nonparametric regression estimators based on local least squares fitting using kernel weights. Much of our attention will be devoted to the local linear least squares kernel estimator of f and its derivative function f 0 which isâ andb, respectively, the solution for a and b to the following problem: where x is a given point in a neighborhood of X t , and H is a d Â d symmetric positive definite matrix depending on n, and KðÁÞ is d-variate kernel such that Ð KðuÞdu ¼ 1 and K H ðuÞ ¼ jHj À1=2 KðH À1=2 uÞ: We call H 1=2 the bandwidth matrix since it is the multivariate extension of the usual bandwidth parameter. Problem (2) is a straightforward weighted least squares problem, and assuming that X > WX is nonsingular, (2) has solutionf where ðX n ÀxÞg, and D f ðxÞ denotes the d Â 1 vector of first-order partial derivatives. Then, the least squares estimator of f ðÁÞ is given bŷ where e 1 is the ðd þ 1Þ Â 1 vector having 1 in the first entry and all other entries 0. It is well known that the bandwidth plays an essential role in the trade-off between reducing bias and variance. Similar to Cai and Tiwari (2000) and Cai (2007), we suggest a nonparametric version of Akaike Information Criterion (AIC) to select the bandwidth matrices. The basic idea is described as follows. For the observed values fY t g n t¼1 , the fitted values fŶ t g can be expressed asŶ ¼ S H Y, where S H is a smoother (or hat) matrix associated with the smoothing parameter matrix H. By recalling the AIC for linear models under the likelihood setting À2ðmaximized log likelihoodÞ þ 2ðnumberofestimatedparametersÞ, we select the optimal bandwidth matrix H opt by minimizing wherer 2 is an estimate of r 2 and trðS H Þ is the trace of S H , regarded as the nonparametric version degrees of freedom. This selection criterion counteracts the over/ under-fitting tendency of the generalized cross-validation and the AIC; see Cai and Tiwari (2000) and Cai (2002) for more details. Alternatively, one might use other existing methods although they may be required more computing; see Yao and Tong (1994), Fan and Gijbels (1996), and Tschernig and Yang (2000). In practice, it is desirable to have a quick and easy implementation to estimate the asymptotic variance off ðxÞ to construct a pointwise confidence interval. Specially, we first compute the residualsû t , using the local linear estimation, and then apply a simple moment estimation to obtain a direct and naive estimator of r 2 , that is, It might be shown thatr 2 is a consistent estimate of r 2 : Therefore,R ¼r 2m 0 ðKÞ is a consistent estimate of R, wherem 0 ðKÞ is a consistent estimator of m 0 ðKÞ with m 0 ðKÞ ¼ Ð K 2 ðuÞdu: Other alternative methods might also be applicable here.

Testing misspecification
In applications, it might be important and interesting to test whether model (1) holds with a specified parametric form. Inspired by Fan, Zhang, and Zhang (2001) and Juhl (2005), we consider the following testing problem where gðX t , hÞ is a given function indexed by an unknown parameter vector h.
For an easy implementation purpose, we adapt a misspecification test by comparing the residual sum of squares (RSS) from both parametric and nonparametric fittings. The testing procedure is described as follows. Letĥ be an estimator of h. Then, the RSS under H 0 is RSS 0 ¼ n À1 P n t¼1û 2 t, 0 , whereû t, 0 ¼ Y t ÀgðX t ,ĥÞ and the RSS under H a is RSS 1 ¼ n À1 P n t¼1û 2 t, 1 , whereû t, 1 ¼ Y t Àf ðX t Þ: Further, the test statistic is defined as The null hypothesis (5) is rejected for a large value of T n . For the sake of simplicity, we evaluate the p-value by using the following nonparametric bootstrap approach, which can be described as follow.
Step 1. Obtain the residualsû t, 1 ¼ Y t Àf ðX t Þ, wheref ðÁÞ is the local linear estimate of f ðÁÞ: Step 2. Obtain the bootstrap residuals fu Ã t g n t¼1 by randomly resampling with replacement from the set fû t, 1 À u t, 1 g n t¼1 with u t, 1 ¼ n À1 P n t¼1û t, 1 : Then, Y Ã t ¼ gðX t ,ĥÞ þ u Ã t : Step 3. Regress Y Ã t on X t to obtain the estimatorĥ Ã under H 0 , and Regress Y Ã t on X t to obtainf ðÁÞ under H a : Calculate the bootstrap residualsû Ã respectively. The test statistic T Ã n is constructed based on fû Ã t, 0 ,û Ã t, 1 g n t¼1 : Step 4. Repeat steps 2-3 B times and denote the bootstrap statistics as fT Ã n, b g B b¼1 : The bootstrap p-value is computed by p Ã ¼ B À1 P B b¼1 IðT Ã n, b ! T n Þ, where IðÁÞ is an indicator function.
We bootstrap the residuals from the nonparametric fitting instead of the parametric fitting, since the nonparametric estimation is always consistent, no matter whether the null or the alternative hypothesis is correct. Therefore, the method should provide a consistent estimator of the null distribution even when the null hypothesis does not hold. This test is similar to the generalized likelihood ratio test proposed by Fan et al. (2001) except that the samples used here are not independent.

Monte Carlo design
In order to illustrate our modeling procedure, we consider some simulated experiments. In the Monte Carlo experiments, the Epanechnikov kernel KðuÞ ¼ 0:75ð1Àu 2 Þ þ is used and the optimal bandwidth matrix H opt is selected as follows. For a predetermined sequence of H's from a wide square range, say both horizontal axis and vertical axis from 0.1 to 0.7 with an increment 0.025, based on the AIC bandwidth selector described in Section 2.2, we compute AICðHÞ for each H and choose H opt to minimize AICðHÞ: The performance of the proposed estimators is evaluated by the mean absolute deviation error (MADE) which is a common criteria and also is employed in Ye, Lin, Zhao, and Hao (2015).
We assess the finite sample performances of the local linear estimator and test with an example of the multivariate cointegration time series model. The generating mechanism follows (1) and has the explicit form where X 1t is simulated from the model X 1t ¼ q 1 X 1, tÀ1 þ e 1t with e 1t generated from Nð0, 0:5 2 Þ independently, X 2t is generated from the model X 2t ¼ q 2 X 2, tÀ1 þ e 2t with e 2t generated from Nð0, 0:3 2 Þ independently, u t is drawn from Nð0, 0:25 2 Þ, and fX 1t g, fX 2t g and u t are independent.

Monte Carlo results
The simulation is repeated 500 times for each of sample sizes n ¼ 100, 200, and 500 and we compute the optimal bandwidth matrix for each replication and each sample size. The median and standard deviation (in parentheses) of the 500 MADE values of for local linear estimate are summarized in Table 1. We can observe from Table 1, that all MADE values decrease as n increases. The values of MADE and standard deviation for the case that fX 1t g is i.i.d. (q 1 ¼ 1, q 2 ¼ 0) are both smaller than those for the dependent cases (q 1 ¼ 1, q 2 ¼ 0:3, 0:7), as one would have expected. The MADE values increase as q 1 decreases when q 2 ¼ 1, but the standard deviation values are opposite. Furthermore, for the case that both the time series fX 1t g and fX 2t g are nonstationary, i.e., q 1 ¼ 1, q 2 ¼ 1, the performance of proposed estimator is also satisfactory. In addition, we can see clearly that the MADE and the standard deviations are both small, which makes the asymptotic theory more relevant. For the sample size n ¼ 200, q 1 ¼ 1 and q 2 ¼ 1, we can easily compute the true value for R for this model, which is given by R ¼ m 0 ðKÞr 2 andR ¼ 0:0172: The selected optimal bandwidth is H opt ¼ 0:325 0 0 0:275 for this sample. In summary, we can conclude that the proposed estimation procedures perform fairly well.

Model misspecification checks
The aim of this set of experiments is to investigate the extent in which the proposed testing procedure is effective in dealing with misspecification. We use the null hypothesis H 0 : f ðÁ, ÁÞ ¼ g (a constant) versus the alternative H a : f ðÁ, ÁÞ 6 ¼ g to demonstrate the power of the proposed misspecification test. The power function is evaluated under a family of alternative models indexed by c, H a : f ðÁ, ÁÞ ¼ g þ cðf Ã ðÁ, ÁÞÀgÞ and 0 c 1, where f Ã ðÁ, ÁÞ is the true surface and g is the average height of f Ã ðÁ, ÁÞ (indeed, g ¼ 0:6667). The other type of tests can be considered in the same way. For n ¼ 100 and 200, the misspecification test described in Section 2.2 with 500 replications is employed and for each realization, the bootstrap sampling is 1000. For each replication and each sample size, according to the AIC bandwidth selector described in Section 4.2, the optimal bandwidth is computed and for simplicity, this optimal bandwidth is used in calculating the test statistics for both original sample and bootstrap samples.
The simulated power values against c are presented in Table 2 for different sample sizes. It is clear from Table 2 that the power for n ¼ 500 is lower than that for n ¼ 100 and 200. When c ¼ 0, the specified alternative hypothesis collapses into the null hypothesis. For instance, when n ¼ 500, the empirical power is 0.048, which is close to the significance level of 5%. This demonstrates that the bootstrap estimate of the null distribution is approximately correct. The power function shows that our test is indeed powerful.

Analysis for the closing price data of stock market in China
We apply the proposed model and its modeling procedures to analyze the long-term relationship among Shanghai Stock Index (SHSI, fP 1t g), Shenzhen Component Index (SZCI, fP 2t g) and Hang Seng Index (HSI, fP 3t g). Let fp 1t g, fp 2t g and fp 3t g represent the monthly closing value of SHSI, SZCI and HSI after logarithmic transformation. The stock index data concerned here can be seen as time series data. In order to get a better insight, the time series plots of fp 1t g, fp 2t g and fp 3t g from January, 1998 to December 2011 are depicted in Figures 1a-c, respectively. Following the convention in the finance literature, we consider the simple monthly positive returns r it ¼ log P it P i, tÀ1 for i ¼ 1, 2, 3, corresponding to SHSI, SZCI and HSI. The time series plots of r 1t , r 2t and r 3t are displayed in Figures 1d-f, respectively. In order to check the stationarity of fp it g and fr it g, the augmented Dickey-Fuller (ADF; Dickey & Fuller, 1979) test is used for the unit root of the series fp it g and fr it g: Here, other criteria introduced in Phillips and Perron (1988), Kwiatkowski, Phillips, Schmidt, and Shin (1992), Elliot, Rothenberg, and Stock (1996) or Ng and Perron (2001) could also be employed. The test results are presented in Table 3. We can observe from Table 3, that the series fp it g are nonstationary while the series fr it g are stationary, which implies that fp it g are I(1), that is, they are one-order integratedness. Furthermore, the Granger causality test shows that the current SHSI (p 1t ) and SZCI (p 2t ) can affect the current HSI p 3t : To establish the empirical relationship among the series of SHSI, SZCI and HSI, similar to Tsay (2005) who considered the linear relationship between the one-year Treasury constant maturity rate and the three-year Treasury constant maturity rate, we first fit a simple linear regression model based on the above results of Granger causality test p 3t ¼ a 0 þ a 1 p 1t þ a 2 p 2t þ u t : As a result, the least-square estimates of a 0 , a 1 and a 2 are 5.7663(0.2807), À0.1514(0.0905) and 0.5835(0.0590), respectively. The adjusted R-squared is 0.6290, and the values of F-statistic and Durbin-Waston statistic are 261.9054 and 0.1617. By comparing these results with those from standard linear model, we might suspect that the considered model might be nonlinear. To this end, we examine the residual series derived from (6) and find that it does depend on time; see Figure 2a. Therefore, we have sufficient reasons to fit the following nonparametric model: The local linear estimatorf ðÁ, ÁÞ is computed based on the optimal bandwidth matrix H opt ¼ 0:325 0 0 0:300 , which is selected by the AIC bandwidth matrix selector described in Section 2.2. The residual plot in Figure 2b shows that the errors are close to normal, thus the local linear estimates are expected to be efficient. To further study the effect of SHCI and SZCI on HSI, we computer the first-order partial derivatives of HSI over SHCI and SZCI, which are depicted in Figure 3 (solid curve represents the derivative of p 3t with respect to p 1t , dashed curve represents the derivative of p 3t with respect to p 2t ). We can observe from Figure 3 that the influence of SHCI and SZCI to HSI appears to be a strong nonlinear characteristics. Meanwhile, this effect presents a significant period, and it can be roughly divided into two periods: the first period of the trading months from 1 to 104 (The corresponding calendar months are from January, 1998-July, 2006.), the effect of SZCI p 2t and HSI p 3t on HSI p 3t is synchronous. The positive and negative roles appear alternately; the second period of the trading months from 105 to 204 (The corresponding calender months are from August, 2006to December, 2014, the SZCI p 2t shows a strong positive impact on HSI p 3t , while the SHCI p 1t has a negative impact on HSI p 3t : Therefore, one might conclude that three is a nonparametric cointegration relationship among SHCI, SZCI and HSI. Finally, to support our conclusions statistically, we consider the testing null hypothesis H 0 : f ðÁ, ÁÞ ¼ g, and the testing procedure  described in Section 2.2 is used with the bootstrap sampling 1000 times. As a result, the p-value is less than 0.001. Thus, this test result further supports our finding that the relationship among SHCI, SZCI and HSI can be described by a nonparametric cointegration model.

Conclusion and discussion
In this article, we developed a multivariate cointegration time series model to model nonlinear and nonstationary time series. A nonparametric local linear kernel smoothing method was proposed for estimating the model function. Then, we adapted a testing procedure for testing misspecification, based on the comparison of the RSS and suggested using a bootstrap to estimate the p-value. Furthermore, We obtained some insights about the modeling methods via some Monte Carlo experiments and demonstrated that the local linear estimator performed well. Finally, the usefulness of the considered model was demonstrated by an empirical application.
Some further research can be conducted in several directions. Firstly, we would consider models allowing e i to be heteroscedastic, and it would be of interest to extend this study to models containing lagged dependent variables. Secondly, an extension of the semiparametric modeling and estimation to multivariate cointegration time series models would also be interesting. Thirdly, the predictive utility of multivariate cointegration time series models needs further investigation.

Disclosure statement
No potential conflict of interest was reported by the authors. Figure 3. The estimated first-order partial derivative: solid curve represents the derivative of p 3t with respect to p 1t , dashed curve represents the derivative of p 3t with respect to p 2t : Source: The Authors.