Matching Kollo measures

Abstract We motivate the advantages of using the Kollo measures, relative to other types of third and fourth moments of multivariate systems, and explore their Monte Carlo simulation and bootstrapping errors. Then we derive necessary and sufficient conditions for simultaneously matching any given mean vector, covariance matrix, Kollo skewness, and Kollo kurtosis. The specification of a suitable orthonormal basis greatly simplifies these moment-matching conditions. We offer semi-closed-form solutions to increase computational efficiency. In this respect, we compare our approach to two competing methods, which in any case can only match Kollo skewness, not the kurtosis as well. Ours is the first method for exactly matching all four multivariate moments simultaneously.


Introduction
Consider an n-dimensional random variable X_n = [X_1, ..., X_n]′ with a mean vector μ_n = E[X_n] and a covariance matrix Σ_n = V[X_n]. Mitchell (1989) introduced the third and fourth moments as the skewness tensor E[Y_n Y_n′ ⊗ Y_n] and kurtosis tensor E[Y_n Y_n′ ⊗ Y_n Y_n′], where Y_n is the standardized version of X_n, i.e., Y_n = Σ_n^{−1/2}(X_n − μ_n), and ⊗ denotes the Kronecker product. In other words, the elements of the skewness tensor take the form E[Y_i Y_j Y_k] and those of the kurtosis tensor take the form E[Y_i Y_j Y_k Y_t], for i, j, k, t = 1, ..., n. The tensors contain the third- and fourth-order central moments and all the co-moments of the marginal variables, comprising n(n+1)(n+2)/6 and n(n+1)(n+2)(n+3)/24 distinct elements, respectively. The number of elements increases rapidly with n. For that reason, these full tensor moments are impractical for use in large dimensions, and several more parsimonious representations of multivariate skewness and kurtosis have been proposed.
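To make the element counts concrete, the following sketch (ours, not from the paper) computes the full sample skewness and kurtosis tensors with NumPy for a small n, and checks the number of distinct elements implied by symmetry; the helper name `sample_tensors` is illustrative.

```python
import numpy as np
from math import comb

def sample_tensors(Y):
    """Full sample skewness and kurtosis tensors of standardized data Y (m x n)."""
    m = len(Y)
    skew = np.einsum('ti,tj,tk->ijk', Y, Y, Y) / m
    kurt = np.einsum('ti,tj,tk,tl->ijkl', Y, Y, Y, Y) / m
    return skew, kurt

n = 3
rng = np.random.default_rng(0)
Y = rng.standard_normal((500, n))
skew, kurt = sample_tensors(Y)

# By full symmetry, the distinct elements number C(n+2,3) = n(n+1)(n+2)/6 for the
# skewness tensor and C(n+3,4) = n(n+1)(n+2)(n+3)/24 for the kurtosis tensor.
distinct_skew = comb(n + 2, 3)
distinct_kurt = comb(n + 3, 4)
```

For n = 3 there are already 10 and 15 distinct elements, and both counts grow as n³ and n⁴, which illustrates why the full tensors quickly become impractical.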
Many authors advocate the need for models to match the information contained in measures of skewness and kurtosis, for instance in finance (Ryoo, 2007), biotech (Kuruvilla & Gunavathi, 2014), and engineering (Love et al., 2013). Indeed, moment-matching techniques have been applied in a variety of disciplines: Heath et al. (2018) use moment-matching simulation for probabilistic sensitivity analysis in economics; Pellegrino and Sabino (2014) apply moment-matching to reduce the dimension of the pricing problem in finance; and Ching et al. (2009) improve the accuracy of estimation by using moment-matching in engineering. Yet none can help match even the mean vector and covariance matrix exactly, let alone the higher moments.
Some moment-matching techniques are specifically designed to reduce the sampling error in multivariate moments. In this strand of the literature, Bardsley (2017), Høyland et al. (2003), and Looney (1983) each propose methods to generate data based on moments; Foldnes and Olsson (2016) introduce a simple copula structure to target the mean vector, the covariance matrix, and both the skewness and the kurtosis of each marginal; and Qu et al. (2020) propose a method to target the Mardia (1970) skewness and kurtosis. However, none of these use the tensor measures defined above, because they are intractable for practical applications except when n is small.
Many different summary measures have been advocated to reduce the size of the skewness and kurtosis tensors, including those proposed by Koizumi et al. (2014), Kollo (2008), Mardia (1970), and Móri et al. (1994). Several other measures are not even directly related to the full tensors. All these measures are listed in chronological order in Table 1, and each column refers to a property that is not fulfilled by the measure.
First note that most of the measures that are not based on the full tensors require computation of the multivariate probability density function, which is rather complex, and cannot be used to establish solvable systems of moment-matching constraints for simulations targeting skewness and kurtosis. Secondly, of the four methods that are derived from the full tensor moments, all but one ignore information by excluding some of the cross moments. This includes the popular Mardia (1970) skewness and kurtosis, which are easy to use but, like several other measures, are only scalar quantities. As such, they are unable to carry as much information about the full multivariate density as vector or matrix measures.
From Table 1 it is clear that the only measures that suffer from none of these drawbacks are those introduced by Kollo (2008). In the form of an n × 1 vector and an n × n symmetric matrix, respectively, they may be written:

s_n = E[(1_n′ Y_n)² Y_n],   K_n = E[(1_n′ Y_n)² Y_n Y_n′].   (1)

Unlike tensor moments, the Kollo measures are easy to compute for applications. But so are other measures, such as those of Mardia and Móri et al. In Appendix A, we provide some real-world examples that motivate the use of Kollo skewness and kurtosis in a practical setting, rather than the measures introduced by Mardia. Kollo measures also have many theoretical advantages. By retaining so many co-moments, the Kollo measures are the most tractable attempt to retain as much information as possible from the full skewness and kurtosis tensors. The Mardia and Koizumi et al. measures suffer from a substantial loss of information because they are scalars. For example, Mardia skewness can only be positive, so it cannot indicate whether the data are negatively or positively skewed. A similar loss of information affects the Móri et al. measures, which are a multivariate extension of the Mardia measures but still omit many third- and fourth-order co-moments. Moreover, like the tensors themselves and several other related measures, the Kollo measures are invariant under affine transformations. Later we shall rely on this property to generate repeated simulations that exactly match some target values for Kollo skewness and kurtosis. This invariance property also enables concatenation of repeated samples; that is, the concatenated sample has the same moments as its sub-samples.
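For concreteness, the Kollo measures can be estimated from a sample along the following lines. This is our own NumPy sketch: it assumes the forms s_n = E[(1_n′Y_n)²Y_n] and K_n = E[(1_n′Y_n)²Y_nY_n′] and uses the symmetric inverse square root for standardization. It also checks numerically that a location shift and a positive rescaling of the data leave both measures unchanged, since standardization absorbs them exactly.

```python
import numpy as np

def kollo_measures(X):
    """Sample Kollo skewness vector and kurtosis matrix of data X (m x n)."""
    m, n = X.shape
    Xc = X - X.mean(axis=0)
    Sigma = Xc.T @ Xc / m
    w, V = np.linalg.eigh(Sigma)
    S_inv_half = V @ np.diag(w ** -0.5) @ V.T   # symmetric Sigma^{-1/2}
    Y = Xc @ S_inv_half                          # standardized observations
    z = Y.sum(axis=1)                            # 1_n' y_t for each observation t
    s = (z ** 2) @ Y / m                         # E[(1'Y)^2 Y]
    K = (Y * (z ** 2)[:, None]).T @ Y / m        # E[(1'Y)^2 Y Y']
    return s, K

rng = np.random.default_rng(0)
X = rng.gamma(2.0, size=(5000, 4))               # a skewed toy sample
s1, K1 = kollo_measures(X)
s2, K2 = kollo_measures(3.0 * X + 7.0)           # shifted and rescaled data
```

Because Y is unchanged under X → cX + b with c > 0, the two calls return identical measures up to floating-point error.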
Our exact moment-matching methods are achieved by creating an orthonormal basis, the span of which includes a particular subspace containing all samples that have the exact Kollo skewness vector and Kollo kurtosis matrix.The benefit of using such a set of orthonormal vectors is huge: mathematically, matching the sample moments exactly means solving a large system of non-linear equations in which the number of unknowns greatly exceeds the number of constraints.The orthonormal basis also underpins a significant reduction in the size of the system to be solved, and we are able to derive analytic solutions that greatly enhance the computational efficiency of our method.It is also, to our knowledge, the only method for exactly matching both Kollo skewness and kurtosis in every simulation of the distribution.
Common techniques, such as Monte Carlo (MC) simulation (Hertz, 1964) or the statistical bootstrap (Efron, 1979), fail to ensure that each sample has the same moments as the target distribution. There is a large literature on their simulation errors in covariance matrices-see Jobson and Korkie (1980), Ledoit and Wolf (2004), and many others-but no previous study on MC or bootstrap simulation errors in the Kollo measures. Section 2 motivates the need to target higher moments exactly by demonstrating that the Kollo moment sampling errors from standard Monte Carlo and the statistical bootstrap can be very significant. This well-known property of sampling error is unlikely to be resolved completely by increasing the number of simulations alone (Ferraz & Moura, 2012; Oberkampf et al., 2002), and variance-reduction techniques (Saliby & Paul, 2009) come at the expense of model complexity.
Then Section 3 introduces the Ledermann et al. (2011) Random Orthogonal Matrix (ROM) simulation method, which encompasses MC and the bootstrap (as special cases of the L matrices which underpin this approach) and was originally developed as a simulation method for matching exact covariance matrices. Two different extensions of the ROM algorithm, by Hanke et al. (2017) and Alexander et al. (2022), can target exact Kollo skewness by generating arbitrary values to pad the sample so that the Kollo skewness matching conditions are solvable numerically. But a trial-and-error process is needed to find suitable arbitrary values and, even when such values are found, there is no analytical solution and a numerical solution cannot be guaranteed, so the algorithms can fail. Moreover, as Alexander et al. (2022) demonstrate, these methods risk generating simulated samples with undesirable marginal densities, depending on the arbitrary values that are found.
Having derived the necessary and sufficient conditions for a sample generated to match a target Kollo skewness and Kollo kurtosis in Section 3, Section 4 uses these conditions to derive theoretical results which first define an orthonormal basis, and then find analytic solutions for the coefficients which generate the required subspace of the span of this orthonormal basis.Here we also prove the invariance under sample concatenation.We conclude the paper with a discussion of our results in Section 5, in the context of the previous literature.

Sampling errors in Kollo skewness and kurtosis
This section presents two experiments that demonstrate the need to use a simulation method that can target Kollo moments exactly. Let μ_i, σ_ij, s_i, and κ_ij denote the elements of the target mean μ_n, covariance Σ_n, Kollo skewness s_n, and Kollo kurtosis K_n, respectively, and let μ̂_i, σ̂_ij, ŝ_i, and κ̂_ij denote the corresponding elements of the simulated sample. Then the RMSEs for each moment are given by:

RMSE_μ = √( n⁻¹ Σ_{i=1}^n (μ̂_i − μ_i)² ),   RMSE_Σ = √( n⁻² Σ_{i,j=1}^n (σ̂_ij − σ_ij)² ),
RMSE_s = √( n⁻¹ Σ_{i=1}^n (ŝ_i − s_i)² ),   RMSE_K = √( n⁻² Σ_{i,j=1}^n (κ̂_ij − κ_ij)² ),   (2)

for the mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix, respectively. Now, for a given dimension n of the system of random variables and a given size m for the simulations, we generate a sample, compute the mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix, and record the RMSEs using (2). We repeat this many times and then record the average RMSE and its standard error. Tables 2 and 3 tabulate the results of the first, bootstrapping experiment using two real data sets, viz. the U.S.
stock sector returns data described above and a temperature data sample from the "Global and City Yearly Average Temperatures 1820-2015" data set available on Data.world. It includes the yearly average temperatures between 1849 and 2013 of 11 cities in the United States (viz. Austin, Boston, Chicago, Kansas City, Las Vegas, Los Angeles, New Orleans, New York, Philadelphia, San Francisco, and Washington). In both cases n = 11, with the target mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix derived from (1) the first 1000 observations in the sample, from 1 December 2018 to 31 December 2021, for sector returns and (2) 166 observations from 1849 to 2013 for temperature data, respectively. The moments are reported in the upper part of the tables. Then we apply the statistical bootstrap to these samples, to generate random samples of size m, for m = 100, 500, and 1000. For any given m, we generate 100,000 samples of this size. For each sample, we record the RMSEs calculated using (2) and then compute the average and standard error of these RMSEs over all 100,000 repetitions and report the results in the lower part of the tables.
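The structure of the bootstrap experiment can be sketched as follows (our illustration with synthetic data standing in for the real data sets, and with the RMSE of the mean vector only; the same loop applies to the other moments):

```python
import numpy as np

rng = np.random.default_rng(42)
data = rng.standard_t(df=5, size=(1000, 3))   # stand-in for a real data set
mu = data.mean(axis=0)                         # target mean from the original sample

def bootstrap_rmse_mean(m, reps=2000):
    """Average and dispersion of the RMSE of the sample mean over bootstraps of size m."""
    errs = np.empty(reps)
    for r in range(reps):
        idx = rng.integers(0, len(data), size=m)   # resample with replacement
        mu_hat = data[idx].mean(axis=0)
        errs[r] = np.sqrt(np.mean((mu_hat - mu) ** 2))  # RMSE as in (2)
    return errs.mean(), errs.std()

avg100, sd100 = bootstrap_rmse_mean(100)
avg1000, sd1000 = bootstrap_rmse_mean(1000)
```

As in the tables, both the average RMSE and its dispersion shrink as m grows, but their ratio need not.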
As expected, both the average RMSE and its standard deviation decrease with m. However, their ratio (shown in the right-hand panel) does not. The errors in moments that arise when using the bootstrap are particularly apparent when only 100 observations are randomly selected. When m = 100, the ratio of the average RMSE on the Kollo skewness to its standard deviation is 2.632 for the stock returns data and 3.856 for the temperature data, and for the Kollo kurtosis the corresponding ratio is 7.099 and 2.702, respectively. But even when m = 1000, so we bootstrap 100,000 samples having the same size as the original one, the errors relative to the target moments derived from the original sample remain substantial (with ratios of 2.416 and 2.469 for the stock returns data and 3.288 and 1.782 for the temperature data on the Kollo skewness and kurtosis, respectively). It is worth noting that much of the research on sampling errors in simulation focuses on errors in the covariance matrix, but looking at the ratios in the right-hand panel it appears that the covariance matrix is the least inaccurate of the four moments.
Table 4 reports the results of the second experiment, which assumes an n-dimensional N(0_n, I_n) distribution for n = 5 or 10 and applies standard Monte Carlo simulation to generate samples of size m for m = 10³ or 10⁶. As expected, both the average RMSE and its standard error decrease as m increases. However, again their ratio (shown in the right panel) does not. It is clear that all simulations yield inaccurate values for the mean vector, covariance matrix, Kollo skewness vector, and Kollo kurtosis matrix. It also becomes more difficult to simulate accurate values as the dimension of the system increases, especially for the covariance matrix. This observation was used to motivate the introduction of moment-matching methods (such as Ledermann et al., 2011), and Table 4 demonstrates that similar inaccuracies occur when using Monte Carlo simulation in large systems for which accurate third and fourth moments are also important.

Conditions for matching Kollo skewness and kurtosis
Consider the simple case of generating a sample X_mn of size m from an n-dimensional distribution that has the exact n × 1 mean vector μ_n and covariance matrix Σ_n. Let A_n be a fixed square matrix such that A_n A_n′ = Σ_n. Following Ledermann et al. (2011), we can select any "scaled L-matrix" M_mn that is defined by the properties:

1_m′ M_mn = 0_n′,   M_mn′ M_mn = m I_n,   (3)

where 1_m denotes an m × 1 vector of 1s, 0_n an n × 1 vector of 0s, and I_n is the n × n identity matrix. The matrix M_mn is just an L matrix scaled by m^{1/2}. See Ledermann and Alexander (2012) for more details about the different types of L matrices that can be used to target the Mardia measures exactly, either skewness or kurtosis, but not both. The original concept of ROM simulation selects any random n × n rotation matrix R_n, and possibly also a random m × m permutation matrix Q_m, so that by (3) every sample of the form:

X_mn = 1_m μ_n′ + Q_m M_mn R_n A_n′   (4)

has mean μ_n and covariance matrix Σ_n. The problem with such a construction is that these ROM simulations can have quite different Kollo skewness and kurtosis, because they depend on the choice of R_n. For this reason, we set R_n = I_n in Equation (4) and introduce another source of randomness. First, we note that M_mn must satisfy more than just (3) for the simulation X_mn given by:

X_mn = 1_m μ_n′ + M_mn A_n′   (5)

to match the target Kollo skewness and kurtosis; these additional necessary and sufficient conditions on M_mn are stated in (6).

[Table 2.] The upper table reports the mean (μ_11), variance (Σ_11), Kollo skewness (s_11), and kurtosis (K_11) defined in (1), derived from a sample of 1000 returns on each of the n = 11 U.S. stock sector indices. In the lower table, the first column states the sample size of each bootstrap from which we calculate the RMSEs for the mean, covariance, Kollo skewness, and kurtosis, using (2). Repeating the sample and again computing the RMSE 10,000 times, we report here the average and standard error (in parentheses, below) for each moment in columns 2-5. The last four columns are the average RMSE divided by its standard error.
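A scaled L-matrix satisfying the properties (3), and the resulting sample with exact mean and covariance, can be constructed as in the following sketch (ours; one simple construction uses QR factorization of a centered random matrix, with illustrative target values):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 200, 4

# Scaled L-matrix: columns orthogonal to 1_m, mutually orthogonal, norm sqrt(m)
G = rng.standard_normal((m, n))
G -= G.mean(axis=0)                    # each column now sums to zero
Q, _ = np.linalg.qr(G)                 # orthonormal columns, still orthogonal to 1_m
M = np.sqrt(m) * Q                     # satisfies 1'M = 0 and M'M = m I, as in (3)

mu = np.array([1.0, -2.0, 0.5, 3.0])   # illustrative targets
Sigma = np.array([[2.0, 0.5, 0.0, 0.2],
                  [0.5, 1.0, 0.3, 0.0],
                  [0.0, 0.3, 1.5, 0.4],
                  [0.2, 0.0, 0.4, 1.2]])
A = np.linalg.cholesky(Sigma)          # A A' = Sigma
X = mu + M @ A.T                       # sample with exact mean and covariance
```

Every sample built this way reproduces μ_n and Σ_n exactly (with divisor m), whatever the random draw of G; only the higher moments vary.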
The rotation matrix is used to convert M_mn to a new matrix S_mn, i.e., S_mn = M_mn X_n, and so M_mn = S_mn X_n′. The sample construction (5) may now be rewritten:

X_mn = 1_m μ_n′ + S_mn X_n′ A_n′,   (7)

and the necessary and sufficient conditions (6) become the conditions (8) on S_mn. The above formulation explains why we require 1_n′ X_n = √n [1, 0, ..., 0]. We can see that the most complex terms (1_n′ m_i^{r}′)² in (6b) and (6c) are reduced to terms of the form n s_{i1}² in (8b) and (8c), with s_{i1} denoting the first element in the ith row of S_mn. This way, the focus moves from simulating (5) using different M_mn to simulating (7) using different S_mn which satisfy (8).
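An orthogonal matrix X_n with the required property 1_n′ X_n = √n [1, 0, ..., 0] can be obtained, for example, as the Householder reflection mapping 1_n/√n to the first standard basis vector (our construction; the paper does not prescribe a particular one). The sketch below builds it and checks the reduction (1_n′ m_i)² = n s_{i1}² quoted in the text:

```python
import numpy as np

n = 5
v = np.ones(n) / np.sqrt(n)
e1 = np.zeros(n); e1[0] = 1.0
u = v - e1
u /= np.linalg.norm(u)
Xn = np.eye(n) - 2.0 * np.outer(u, u)   # symmetric orthogonal; maps v to e1

lhs = np.ones(n) @ Xn                   # should equal sqrt(n) * [1, 0, ..., 0]

# Reduction of the complex terms: with M = S Xn', each row satisfies
# (1_n' m_i)^2 = n * s_{i1}^2
rng = np.random.default_rng(2)
S = rng.standard_normal((50, n))
M = S @ Xn.T
```

Because Xn is symmetric, Xn′ 1_n = Xn 1_n = √n e_1, so M 1_n = √n S[:, 0] and the squared row sums reduce exactly as claimed.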

Theoretical results
For convenience, we now rewrite the conditions (8) as functions of each column of S_mn, using the Hadamard product ∘. Let s^c_k denote the m × 1 vector that is the kth column of S_mn. The following formulation of the necessary and sufficient conditions for the simulated sample to match the exact mean, covariance, Kollo skewness, and Kollo kurtosis will be useful in this section:

1_m′ s^c_k = 0,   (9a)
s^c_i′ s^c_k = 0,  for i < k,   (9b)
s^c_k′ s^c_k = m,   (9c)
(s^c_1 ∘ s^c_1)′ s^c_k = s̃_k,   (9d)
(s^c_1 ∘ s^c_1)′ (s^c_i ∘ s^c_k) = κ̃_ik,  for i = 1, ..., k,   (9e)

for k = 1, ..., n, where s̃_k is the kth element of the scaled and rotated Kollo skewness vector s̃_n and κ̃_ik is the element in the ith row and kth column of the scaled and rotated Kollo kurtosis matrix K̃_n. Matching the exact mean, covariance, and Kollo skewness relies on solving conditions (9a)-(9d) for S_mn. The additional constraints (9e) set out the k restrictions on the kth column of S_mn that must hold in order that each simulation X_mn matches the target Kollo kurtosis.
The system (9) is not easy to solve. When k = 1 we have to solve a system of linear equations for the mean constraint (9a), a system of quadratic equations for the variance constraint (9c), a system of cubic equations for the Kollo skewness constraint (9d), and a system of quartic equations for the Kollo kurtosis constraint (9e). When k > 1 we have four different systems of linear equations, for the mean constraint (9a), the covariance constraint (9b), the Kollo skewness constraint (9d), and the Kollo kurtosis constraints (9e) when i < k, and two different systems of quadratic equations, for the variance constraint (9c) and the Kollo kurtosis constraint (9e) when i = k. In total, there are n(n+5)/2 equations to solve for matching the first three moments, or n(n+3) equations for matching the first four moments. For example, in a 10-dimensional system, we need to solve 75 equations for matching the first three moments, or 130 equations to match the first four moments.
In this section we show how to solve (9) for S_mn column by column, by (a) finding a particular type of orthonormal basis for the sample space of the n random variables; and then (b) solving for the coefficients which generate a subspace of its span which contains samples having exactly the target Kollo moments. That is, for each column k and for some column dimension l_k, with 1 ≤ l_k ≤ m, we first find a specific m × l_k rectangular orthonormal matrix H^k_{m,l_k}, i.e., a matrix having the property

(H^k_{m,l_k})′ H^k_{m,l_k} = I_{l_k}.

Then we set

s^c_k = H^k_{m,l_k} c_k,   (10)

where c_k is an l_k × 1 column vector with at least one non-zero element. The decomposition (10) allows one to select the orthonormal matrix H^k_{m,l_k} in such a way that s^c_k always satisfies Equations (9a) and (9b). That is, instead of solving (9) for s^c_k directly, we may solve for H^k_{m,l_k} and c_k iteratively, for k = 1, ..., n, using the following result:

Lemma 1. Suppose the first k − 1 columns in the matrix S_mn satisfy conditions (9a) and (9b), and the kth column s^c_k is constructed using (10). Then the conditions (9a) and (9b) will be satisfied if the column vectors in H^k_{m,l_k} are also orthogonal to the vectors 1_m and s^c_1, ..., s^c_{k−1}.

Generating S_mn column by column this way considerably reduces the number of constraints that must be satisfied. Later we shall show that, in addition to the mean and covariance, we can match the Kollo skewness and kurtosis by placing only k + 2 constraints on the kth column of S_mn. And if we only wish to target the first three moments, there are just two constraints on each column of S_mn. In total, instead of n(n+5)/2 equations to solve for matching the first three moments, or n(n+3) equations for the first four moments, now we only need to solve 2n and n(n+5)/2 equations, respectively. For instance, in a 10-dimensional system, we need to solve 20 equations to match the mean, covariance, and Kollo skewness, or 75 equations to additionally match the Kollo kurtosis. Of course, the dimension
reduction increases with n; e.g., when n = 20 the first-three-moment constraints reduce from 250 to 40, and the first-four-moment constraints reduce from 460 to 250.
Our method of solution requires that we fix the first k − 1 columns of S_mn before we generate the matrix H^k_{m,l_k}. We then generate a rectangular orthonormal matrix with dimensions m × N whose first k columns are equal to m^{−1/2} 1_m and m^{−1/2} s^c_1, ..., m^{−1/2} s^c_{k−1}. The columns of the matrix H^k_{m,l_k} are then randomly selected from the other N − k columns of this matrix. That is, we take a random sample (without replacement) of the columns h_j for j > k, and the selected columns are used to form the orthonormal matrix H^k_{m,l_k}. Note that we require N ≤ m, because there are m elements in each column and so no more than m columns can be mutually orthogonal. This places an upper bound on l_k, because l_k ≤ N − k ≤ m − k. There is a close relationship between H^k_{m,l_k} and the L matrices (Ledermann et al., 2011). In fact, we can immediately obtain H^k_{m,l_k} by randomly extracting l_k columns from the last N − (k − 1) columns of an L matrix whose first k − 1 columns are m^{−1/2} s^c_1, ..., m^{−1/2} s^c_{k−1}.

Now we turn to the actual solution of (9) using the construction (10), starting with the simpler case of matching the first three moments only. Recall that s^c_1 is different from the other columns because it involves a cubic equation, which is easy to see by rewriting the conditions (9a)-(9d) for s^c_1 as:

1_m′ s^c_1 = 0,   s^c_1′ s^c_1 = m,   (s^c_1 ∘ s^c_1)′ s^c_1 = s̃_1.   (11)

The following Theorem explains how to solve for s^c_1, the first column of S_mn, from which the other columns are generated in an iterative manner:

Theorem 1. Suppose the kth column of S_mn satisfies (10), H^k_{m,l_k} satisfies the conditions of Lemma 1, and s^c_1, ..., s^c_{k−1} satisfy conditions (9a) to (9d). Then: by Lemma 1, s^c_k satisfies conditions (9a) and (9b); it also satisfies condition (9c) if

c_k′ c_k = m,   (12)

and the condition (9d) also holds if

((H^1_{m,l_1} c_1) ∘ (H^1_{m,l_1} c_1))′ H^1_{m,l_1} c_1 = s̃_1,  for k = 1,   (13a)
(s^c_1 ∘ s^c_1)′ H^k_{m,l_k} c_k = s̃_k,  for k = 2, ..., n.   (13b)

The pseudocode in Algorithm 1 summarizes how to generate a sample X_mn that matches any given mean vector, covariance matrix, and Kollo skewness vector.
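The Lemma 1 requirement, that the columns of H^k_{m,l_k} be orthonormal and orthogonal to 1_m and to the previously fixed columns, can be implemented by projecting random vectors off the fixed subspace and orthonormalizing the residual. A sketch of ours, with a single stand-in previous column:

```python
import numpy as np

rng = np.random.default_rng(3)
m, lk = 100, 4

# Columns already fixed: 1_m and (here) one previous column s_1
# with 1' s_1 = 0 and s_1' s_1 = m
s1 = rng.standard_normal(m)
s1 -= s1.mean()
s1 *= np.sqrt(m) / np.linalg.norm(s1)
P = np.column_stack([np.ones(m), s1])

# H: lk orthonormal columns orthogonal to everything in P (Lemma 1)
G = rng.standard_normal((m, lk))
G -= P @ np.linalg.lstsq(P, G, rcond=None)[0]   # project out span(P)
H, _ = np.linalg.qr(G)                          # orthonormalize the residual
```

Any column s^c_k = H c_k built from this H automatically satisfies conditions (9a) and (9b), leaving only the remaining constraints on the low-dimensional vector c_k.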
Algorithm 1. Matching the First Three Moments
1: function ROM3(m, μ, Σ, s)   ▷ m: number of observations; μ, Σ, s: targeted first three moments
2: n = length(μ)
3: Obtain A_n such that A_n A_n′ = Σ_n
4: Obtain the rotation matrix X_n
5: Obtain the rotated Kollo skewness s̃_n
6: while Theorem 1 conditions (12) and (13a) for c_1 do not hold do   ▷ Solve for the first column
7:   Generate l_1 orthonormal vectors to form H^1_{m,l_1} satisfying Lemma 1
8: Obtain real solutions for the coefficient c_1 restricted by Equations (12) and (13a)
9: Obtain s^c_1
10: for k = 2, ..., n do
11:   while Theorem 1 conditions (12) and (13b) for c_k do not hold do
12:     Generate l_k orthonormal vectors to form H^k_{m,l_k} satisfying Lemma 1
13:   Obtain real solutions for the coefficients c_k restricted by Equations (12) and (13b)
14:   Obtain s^c_k
15: S_mn = [s^c_1, ..., s^c_n]
16: X_mn = 1_m μ_n′ + S_mn X_n′ A_n′   ▷ Generate a sample with exact first three moments
17: return X_mn

The algorithm solves for S_mn iteratively, column by column, while building the columns of an orthonormal matrix H^k_{m,l_k} that yields a real-valued solution for c_k. This means that, to obtain s^c_k satisfying conditions (9a)-(9d), the first k − 1 columns of S_mn must already be known and satisfy conditions (9a)-(9d). So we first obtain an orthonormal matrix H^1_{m,l_1} satisfying Lemma 1 and then obtain the coefficients c_1 by solving the system of quadratic Equations (12) and cubic Equations (13a) simultaneously. Next, we generate s^c_k for k = 2, ..., n. Now c_k should satisfy the quadratic system (12) and the linear system (13b) simultaneously.
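The first-column step can be sketched as follows (our illustration, based on the conditions as reconstructed above, not the paper's implementation; `first_column` is a hypothetical helper, l_1 = 2, and the rotated target s̃_1 is chosen arbitrarily). With l_1 = 2, condition (12) restricts c_1 to a circle of radius √m, so the cubic condition (13a) becomes a one-dimensional root search:

```python
import numpy as np

def first_column(m, s1_tilde, rng, max_tries=50):
    """Find s^c_1 = H c with 1' s = 0, s's = m and sum_j s_j^3 = s1_tilde,
    parametrizing c = sqrt(m) (cos t, sin t) so (12) holds automatically."""
    for _ in range(max_tries):
        G = rng.standard_normal((m, 2))
        G -= G.mean(axis=0)                   # orthogonal to 1_m (Lemma 1)
        H, _ = np.linalg.qr(G)

        def g(t):
            s = H @ (np.sqrt(m) * np.array([np.cos(t), np.sin(t)]))
            return np.sum(s ** 3) - s1_tilde  # skewness gap, cf. (13a)

        ts = np.linspace(0.0, 2.0 * np.pi, 721)
        vals = np.array([g(t) for t in ts])
        idx = np.where(vals[:-1] * vals[1:] < 0)[0]
        if len(idx) == 0:
            continue                           # no sign change: regenerate H (trial and error)
        lo, hi = ts[idx[0]], ts[idx[0] + 1]
        for _ in range(100):                   # bisection on the bracketed root
            mid = 0.5 * (lo + hi)
            if g(lo) * g(mid) <= 0.0:
                hi = mid
            else:
                lo = mid
        return H @ (np.sqrt(m) * np.array([np.cos(lo), np.sin(lo)]))
    raise RuntimeError("no real-valued solution found; increase max_tries or l_1")

rng = np.random.default_rng(7)
s1 = first_column(60, 2.0, rng)
```

The retry loop mirrors the trial-and-error element discussed below: not every H^1 admits a real solution, so the matrix is regenerated until one does.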
We require l_k ≥ 2 because the elements of c_k are always restricted by two equations: for k = 1 the systems (12) and (13a), and for k > 1 the systems (12) and (13b), are both exactly determined when l_k = 2; but when l_k > 2 the systems are under-determined, which makes it easier to find a real-valued solution for c_k. While Algorithm 1 is both general and flexible, there is an element of trial and error in finding a real-valued solution. The following Corollary 1 considers the special case of Theorem 1 with l_k = 2 and k ≥ 2. It is useful because it derives a single criterion for the existence of a real-valued solution to the system of (12) and (13b), and an analytical formula for each 2 × 1 vector c_k when such a solution exists.
Corollary 1. When l_k = 2 and k ≥ 2, the system of Equations (12) and (13b) has a real solution if and only if condition (14) holds, and in that case the solution (15) is given in terms of b̃_{k1} and b̃_{k2}, the first and second elements of 1_m′ B^k_{m,2}.

Corollary 1 greatly improves the computational efficiency of the new method. Setting l_k = 2 for k > 1 means that the matrix has only two columns, which are simple to orthogonalize. Moreover, it is very quick to repeatedly generate orthonormal matrices H^k_{m,2} for k > 1 until we find one which satisfies the criterion (14), and then we can immediately obtain values for c_k using the analytical formula (15). In this way, we do not need to search for a numerical solution to the simultaneous Equations (12) and (13b).
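With l_k = 2 the two constraints are a circle (c′c = m) and a line (b′c = s̃_k, with b = H′(s^c_1 ∘ s^c_1) under the conditions as reconstructed above), so both the existence criterion and the solution are elementary geometry. The sketch below is our hedged analogue of Corollary 1, not the paper's exact formulas (14)-(15):

```python
import numpy as np

def solve_column_analytic(H, s1, target, m):
    """Analytic c with c'c = m (condition (12)) and b'c = target, where
    b = H'(s1 * s1): the intersection of a circle and a line in R^2."""
    b = H.T @ (s1 * s1)
    nb2 = b @ b
    if target ** 2 > m * nb2:                 # real-solution criterion, cf. (14)
        return None
    b_perp = np.array([-b[1], b[0]])          # orthogonal to b, same length
    c = (target / nb2) * b + np.sqrt(m - target ** 2 / nb2) * b_perp / np.sqrt(nb2)
    return H @ c

rng = np.random.default_rng(9)
m = 100
s1 = rng.standard_normal(m)                   # stand-in first column
s1 -= s1.mean()
s1 *= np.sqrt(m) / np.linalg.norm(s1)
P = np.column_stack([np.ones(m), s1])

s2 = None
while s2 is None:                             # regenerate H until (14) holds
    G = rng.standard_normal((m, 2))
    G -= P @ np.linalg.lstsq(P, G, rcond=None)[0]
    H, _ = np.linalg.qr(G)
    s2 = solve_column_analytic(H, s1, 1.5, m)
```

As in the text, checking the scalar criterion and evaluating the closed form is far cheaper than a numerical solve, so H^k_{m,2} can be regenerated many times per second until the criterion holds.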
When matching only the first three moments, there is no reason why we should not always set l_k = 2 and use Corollary 1. Then the moment-matching algorithm is very fast, being fully analytic. However, the next Theorem, which explains how to match the first four moments, requires l_k ≥ k + 2. Again we begin by generating the first column of S_mn, because it requires higher-order conditions than the subsequent columns. To target the first four moments, s^c_1 should satisfy not only conditions (11) but also a quartic condition for matching Kollo kurtosis:

(s^c_1 ∘ s^c_1)′ (s^c_1 ∘ s^c_1) = κ̃_11,   (16)

where κ̃_11 is the first element in the first row of K̃_n. Theorem 2 provides the necessary and sufficient conditions on c_k for S_mn to satisfy conditions (9).
Theorem 2. Suppose the kth column of S_mn satisfies (10), H^k_{m,l_k} satisfies the conditions of Lemma 1, and s^c_1, ..., s^c_{k−1} satisfy conditions (9). Then s^c_k satisfies: i. the conditions (9a) and (9b), by Lemma 1; ii. the condition (9c) if condition (12) holds; iii. the condition (9d) if condition (13) holds; iv. the condition (9e) if condition (17a) holds for k = 1, or conditions (17b) and (17c) hold for k = 2, ..., n, where the (k − 1) × 1 column vector κ̃_{k−1} contains the first k − 1 elements of the kth column of K̃_n.

Algorithm 2 is the pseudocode for implementing the above results. It yields a realisation X_mn of m observations on n random variables which matches any given target mean, covariance, Kollo skewness, and Kollo kurtosis. To obtain s^c_k satisfying all the conditions (9), the first k − 1 columns of S_mn must already be known and satisfy conditions (9). For this, we first obtain an orthonormal matrix H^1_{m,l_1} satisfying Lemma 1 and then obtain the coefficients c_1 by solving a quadratic (12), a cubic (13a), and a quartic (17a) simultaneously. Next, we generate s^c_k in an iterative manner from k = 2 to k = n. Now the vector c_k should satisfy k + 2 equations simultaneously: the two quadratic Equations (12) and (17b), and the k linear Equations (13b) and (17c).
Algorithm 2. Matching the First Four Moments
1: function ROM4(m, μ, Σ, s, K)   ▷ m: number of observations; μ, Σ, s, K: targeted first four moments
2: n = length(μ)
3: Obtain A_n where A_n A_n′ = Σ_n
4: Obtain the rotation matrix X_n
5: Obtain the rotated Kollo skewness s̃_n
6: Obtain the rotated Kollo kurtosis K̃_n
7: while Theorem 2 does not hold do   ▷ Solve for the first column
8:   Generate l_1 orthonormal vectors and form H^1_{m,l_1} satisfying Lemma 1
9:   Obtain real solutions for the coefficient c_1 restricted by Equations (12), (13a), and (17a)
10: Obtain s^c_1
11: for k = 2, ..., n do   ▷ Solve for the other columns
12:   while Theorem 2 does not hold do
13:     Generate l_k orthonormal vectors to form H^k_{m,l_k} satisfying Lemma 1
14:     Obtain real solutions for the coefficients c_k restricted by Equations (12), (13b), (17b), and (17c)
15:   Obtain s^c_k
16: S_mn = [s^c_1, ..., s^c_n]
17: X_mn = 1_m μ_n′ + S_mn X_n′ A_n′   ▷ Generate a sample with exact first four moments
18: return X_mn

We require l_k ≥ k + 2 because the elements of c_k are always restricted by k + 2 equations: for k = 1 the systems (12), (13a), (17a), and for k > 1 the systems (12), (13b), (17b), (17c), are both exactly determined when l_k = k + 2; but when l_k > k + 2 the systems are under-determined, which makes it easier to find a real-valued solution for c_k.

Finally, we note the following property of Kollo kurtosis. Let M^{(k)} be a matrix of dimension m_k × n, with m_k ≥ n + 2, satisfying conditions (3), for k = 1, ..., N, where N is the number of concatenations. Then the Kollo kurtosis of the concatenated sample is the weighted sum of the Kollo kurtoses of the sub-samples. Hence, Kollo kurtosis is invariant under sample concatenation if each sub-matrix M^{(k)} has identical Kollo kurtosis.
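The weighted-sum property under concatenation can be verified numerically, as in the following sketch (ours; it assumes the sample Kollo kurtosis E[(1′Y)²YY′] applied to already-standardized sub-samples satisfying conditions (3)):

```python
import numpy as np

def kollo_kurtosis(Y):
    """Sample Kollo kurtosis E[(1'Y)^2 Y Y'] for already-standardized rows of Y."""
    z = Y.sum(axis=1)
    return (Y * (z ** 2)[:, None]).T @ Y / len(Y)

def scaled_L(m, n, rng):
    """A matrix satisfying conditions (3): 1'M = 0 and M'M = m I."""
    G = rng.standard_normal((m, n))
    G -= G.mean(axis=0)
    Q, _ = np.linalg.qr(G)
    return np.sqrt(m) * Q

rng = np.random.default_rng(11)
n = 3
M1, M2 = scaled_L(80, n, rng), scaled_L(120, n, rng)
K1, K2 = kollo_kurtosis(M1), kollo_kurtosis(M2)
K_cat = kollo_kurtosis(np.vstack([M1, M2]))   # concatenation is still standardized
```

Because each sub-sample has zero mean and identity covariance (with divisor m_k), so does the concatenation, and its Kollo kurtosis equals (m_1 K^{(1)} + m_2 K^{(2)})/(m_1 + m_2) exactly; in particular it is unchanged when the sub-sample kurtoses coincide.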

Discussion
This paper derives necessary and sufficient analytic conditions for exactly matching the mean, covariance, Kollo skewness, and Kollo kurtosis of multivariate random variables. The necessary and sufficient conditions (9) are a highly under-determined system. Even without the condition for matching Kollo kurtosis (9e), it is already a complex problem to find columns of S_mn which satisfy (9a)-(9d), because of the skewness condition (9d). This explains why Kollo skewness matching methods, such as the algorithm proposed by Alexander et al. (2022), are much slower than those matching only the first two moments exactly, such as Ledermann et al. (2011). Adding the conditions (9e) creates even more of a challenge: there are already k + 2 constraints on the kth column for Kollo skewness matching and an extra set of k conditions for Kollo kurtosis matching, so that in total for each column k, for k = 1, ..., n, there are 2k + 2 constraints on its m elements. In response to this issue, we propose the novel idea of decomposing S_mn in (7) into the product of an orthonormal matrix and a vector of coefficients. We use the necessary and sufficient conditions to solve for the coefficients which generate the columns of S_mn as linear combinations of the orthonormal column vectors, in such a way that the moment-matching conditions are satisfied. The orthonormal
column vectors are restricted by the conditions provided in Lemma 1, which helps to reduce the number of equations in the non-linear system. With such a construction, simulations matching Kollo skewness become highly efficient; moreover, simulations matching Kollo kurtosis become practically possible, although in high-dimensional systems the kurtosis matrix contains many elements to match, which greatly increases the difficulty of exact simulation.

Our method differs from Alexander et al. (2022) and Hanke et al. (2017), which solve conditions (9) for simulations matching up to Kollo skewness. Hanke et al. (2017) set the undetermined elements in the equation system equal to arbitrary values, so the equation system becomes a determined system that is potentially solvable. Their method thus requires a trial-and-error process to find suitable arbitrary values, which can be very slow. Moreover, it can fail completely, and even when it works the simulations may have marginal densities that are undesirable for most real-world applications. Noting these issues, Alexander et al. (2022) propose the Random Orthogonal Matrix simulation method with exact Kollo skewness, which uses either a statistical bootstrap from real-world data or parametric multivariate distributions to supply the arbitrary values. They also provide the necessary and sufficient conditions on the arbitrary values for Kollo skewness matching to be possible. Such modifications significantly improve the theoretical and computational properties of the simulated samples, but their algorithms require arbitrary values which often produce marginals with strange characteristics. This problem arises because the choice of distribution used to generate random values has a significant impact on the success of solving for the remaining elements. This problem was discussed in depth by Alexander et al.
(2022), who show that reducing the variance of the chosen distribution will ensure a low failure rate. But then, by definition, the remaining elements will have more extreme values, so that the first few columns, viewed as samples from the marginal distributions of the first few random variables, could be overly leptokurtic. In this case the sample characteristics may not be ideal for many applications, especially when the target values for skewness or kurtosis are large. Our method bypasses this problem completely. With the use of an orthonormal basis, we remove the need to allocate arbitrary values to elements of S_mn. Instead, we can obtain the orthonormal matrices H^k_{m,l_k} in such a way that S_mn has consistent characteristics, which provides better quality data, as will be demonstrated below.
Furthermore, direct simulations matching kurtosis become practically impossible when n is large. For instance, when n = 30 the Alexander et al.
algorithm requires the solution of 30 different systems of non-linear equations, the smallest being a system of 3 equations (k = 1, i.e., for the first column) and the largest being a system of 32 equations (k = 30, i.e., for the last column). In total one needs to solve Σ_{k=1}^{n} (k + 2) = 525 non-linear equations to target Kollo skewness. But if we were to extend the system to target Kollo kurtosis directly, when n = 30 there would be Σ_{k=1}^{n} (2k + 2) = 990 non-linear equations to solve, which may even be beyond the reach of the high-performance computers in use today. Our method provides a practical solution to this problem because the orthonormal basis greatly reduces the number of equations that need to be solved. When matching Kollo skewness, our algorithm only needs to solve one quadratic equation and one cubic (k = 1) or linear (k = 2, ..., n) equation. In addition, Corollary 1 provides an analytical solution for each column of S_mn once the first column of S_mn is available. By contrast, the algorithm proposed by Alexander et al. requires a numerical solution for one quadratic equation and k + 1 linear equations. When matching both Kollo skewness and Kollo kurtosis, our algorithm is much slower because it involves solving two quadratic equations and k linear equations for each column of S_mn, and it requires numerical methods to solve a higher-dimensional, non-linear system. Nevertheless, it is the only algorithm that can target the first four multivariate moments exactly.
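The equation count for the skewness case is easy to verify directly from the system sizes k + 2 quoted above for n = 30:

```python
n = 30
sizes = [k + 2 for k in range(1, n + 1)]  # system size for column k
print(min(sizes), max(sizes), sum(sizes))  # → 3 32 525
```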
We finish with a brief demonstration of the flexibility and efficiency of our moment-matching results. Not all matrices H^1_{m,l_1} admit a solution to the conditions of Theorem 1 or 2, so the speed of the general Algorithms 1 and 2 depends on the time it takes to find matrices H^k_{m,l_k} for which real-valued vectors c_k satisfying Theorem 1 or 2 exist. Similarly, for k = 2, ..., n each H^k_{m,l_k} may need to be regenerated several times until we obtain a real-valued c_k that satisfies the conditions of Theorem 1 or 2. Although l_k need only be at least 2 when matching the first three moments, and at least k + 2 when matching the first four moments, if computational speed becomes an issue we prefer to use a greater value of l_k, because it is easier to find a real-valued solution for an under-determined system than for an exactly determined one. For matching the first three moments, Corollary 1 provides a simple set of conditions to check if we set l_k = 2, as well as the analytical expressions for the elements of c_k. The computation times reported in Figure 1 use l_k = 2 for matching the first three moments and l_k = k + 4 for matching the first four moments.
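The remark about under-determined systems can be illustrated with a generic toy example (our own, not the actual equations of Theorem 1 or 2): when a linear block A c = b has more unknowns than equations and A has full row rank, the pseudoinverse gives a minimal-norm exact solution, and the null space of A supplies leftover degrees of freedom that make it easier to satisfy the remaining non-linear conditions.

```python
import numpy as np

# A toy under-determined linear block: 2 equations in 4 unknowns.
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 3.0]])
b = np.array([1.0, 2.0])

# Minimal-norm solution; it exists whenever A has full row rank.
c = np.linalg.pinv(A) @ b

# Any null-space direction can be added without breaking A c = b,
# leaving free parameters for the other moment conditions.
_, _, Vt = np.linalg.svd(A)
z = Vt[-1]                      # a unit basis vector of the null space
assert np.allclose(A @ c, b)
assert np.allclose(A @ (c + 2.5 * z), b)
```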
Figure 1 presents an empirical comparison of the cumulative distribution functions (CDFs) generated by our algorithms, matching either the Kollo skewness only or both the Kollo skewness and kurtosis, against those generated by the algorithm of Alexander et al. (2022).⁹ The Kollo skewness and kurtosis targets are derived empirically from hourly cryptocurrency returns data. These moments have moderately large elements throughout, which makes them suitable for illustrating the two problems discussed above. The figure also reports the results of the Kolmogorov-Smirnov (KS) test, which quantifies the distance between the simulated and target samples, and the computation time for simulations matching the Kollo skewness only, since we have no benchmark for comparison when matching Kollo kurtosis.
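The KS comparison just described can be reproduced in a few lines. Below is a self-contained sketch of the two-sample KS statistic; in practice one would use `scipy.stats.ks_2samp`, which also returns the p-value needed for the 10% acceptance rule used later.

```python
import numpy as np

def ks_2sample_stat(x, y):
    """Two-sample Kolmogorov-Smirnov statistic: sup_t |F_x(t) - F_y(t)|."""
    x, y = np.sort(np.asarray(x)), np.sort(np.asarray(y))
    grid = np.concatenate([x, y])
    # Empirical CDFs of both samples evaluated on the pooled sample.
    Fx = np.searchsorted(x, grid, side="right") / len(x)
    Fy = np.searchsorted(y, grid, side="right") / len(y)
    return np.abs(Fx - Fy).max()

print(ks_2sample_stat([0.0, 1.0], [10.0, 11.0]))  # → 1.0 (disjoint samples)
```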
First, the results replicate Alexander et al.'s finding that a more constrained selection of arbitrary values (i.e., σ² = 0.6, in blue, relative to σ² = 0.8, in red) generates much more leptokurtic samples but is much faster for sample generation. So users must choose between a better-fitted sample and a faster simulation. Note that the example distribution functions reported in Figure 1 are selected from "qualified" trials, i.e., trials that pass the KS test at the 10% significance level. Yet the distribution functions generated using Alexander et al.'s algorithm with both σ² = 0.6 and σ² = 0.8 still appear leptokurtic. By comparison, samples generated by our algorithms for matching Kollo skewness and kurtosis do not suffer from this leptokurtic problem. In addition, our algorithm for matching Kollo skewness can be a hundred times faster when targeting suitably distributed samples: it takes 0.08 s per simulation for our algorithm versus 8.04 s per simulation for Alexander et al.'s algorithm with σ² = 0.8. The computation time for matching all four moments is not reported because it is always much slower than the other algorithms. This is in line with expectations because our algorithm requires at least k + 2 degrees of freedom when solving the kth column of S_mn. Finally, we emphasize that the more multivariate moments are matched exactly, the more information is preserved by ROM simulation. Our research thus suggests an interesting avenue for future work on matching Kollo kurtosis as well as Kollo skewness: to look for an analytical solution for each column of S_mn once its first column is available. Similar to the impact of Corollary 1, which significantly speeds up our algorithm for matching Kollo skewness, such a solution should further enhance our algorithm for matching all four moments, so that such simulations become not just practically possible, as this paper demonstrates, but much faster to use in
practice.

… upward surge in oil prices. Similarly, the upward spike in s_7 and in most elements of the Kollo kurtosis on 16 March 2020 resulted from extreme negative returns across all financial markets following the outbreak of Covid-19.
Almost all elements of the Kollo skewness vector lie in the range [−10, 10], with the most sampling variability in s_1 and s_3. The Kollo kurtosis ranges from 0 to 200. Spikes in Kollo skewness and Kollo kurtosis occur simultaneously. In particular, the global crash in risky assets after the outbreak of Covid in January 2020 is evident from the dip in Kollo skewness, down to almost −10, in both data sets. The Kollo kurtosis takes a narrower range in the stock sector data than in the cryptocurrency data.
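For concreteness, the sample Kollo measures plotted here can be computed directly from standardized returns. The sketch below assumes the usual Kollo definitions in which the skewness vector sums E[Y_i Y_j Y_k] over i, j and the kurtosis matrix sums E[Y_i Y_j Y_k Y_t] over i, j, so that s = E[(1'_n Y)² Y] and K = E[(1'_n Y)² YY'] for standardized Y; the function name is our own.

```python
import numpy as np

def kollo_moments(Y):
    """Sample Kollo skewness vector and kurtosis matrix.

    Assumes Kollo's definitions s = E[(1'Y)^2 Y] and K = E[(1'Y)^2 YY'].
    Y is a T x n array of standardized observations (zero mean, unit
    variance); rows are observations.
    """
    w = Y.sum(axis=1) ** 2                      # (1'Y)^2 per observation
    s = (w[:, None] * Y).mean(axis=0)           # skewness vector, length n
    K = (w[:, None, None] * (Y[:, :, None] * Y[:, None, :])).mean(axis=0)
    return s, K

# Symmetric two-point sample in one dimension: zero skewness, kurtosis 1.
s, K = kollo_moments(np.array([[1.0], [-1.0]]))
print(s, K)  # → [0.] [[1.]]
```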
Compared with the Mardia measures, the Kollo measures are more sensitive to the extremes of the distribution. Unlike the Mardia skewness, the Kollo measure also captures the direction of asymmetry in the marginals. The Kollo measures also capture more information from co-moments, not only from the marginals. They can exhibit substantial variability even when the Mardia measures remain relatively stable over time. Furthermore, the Kollo measures can indicate which variable contributes most to an extreme value of skewness or kurtosis, whereas the Mardia measures only capture the overall information in a scalar value. Then, since H^k_{m,l_k} is orthogonal to 1_m, we have 1'_m h^k_i = 0 and hence 1'_m s^c_k = 1'_m H^k_{m,l_k} c_k = 0; similarly, for j < k, (s^c_j)' s^c_k = 0.

B.2. Proof of Theorem 1
When k = 1 the conditions for s^c_1 are set out in Equation (11), and by Lemma 1 we have s^c_1 = H^1_{m,l_1} c_1, so we only need to solve the last two equations in (11). The quadratic equation can be written as:

Figure 1. Empirical cumulative distribution functions of simulations matching cryptocurrency data. The empirical distribution functions reported in the left panels are based on simulations using the algorithm of Alexander et al. (2022) (KROM) and our algorithms (Algorithm 1 for targeting the first three moments and Algorithm 2 for targeting all first four moments) with 1000 observations on three random variables. The target moments are derived from a sample of cryptocurrency data with 720 observations, from 08:00 4 April to 08:00 4 May 2021, UTC time, with Kollo skewness s = [−1.02, −0.03, 0.93]' and Kollo kurtosis K = [7.79, 3.97, 2.75; 3.97, 9.30, 2.44; 2.75, 2.44, 13.08]. For simplicity, we standardize all data before setting the targets for simulation and therefore set μ_n = 0_n, Σ_n = I_n and Q_m = I_n, so that X_mn = S_mn X'_n. We also employ the rotation matrix shown in Section 3 for all simulations. The black lines in the CDF plots are the empirical distribution functions of the real data on the corresponding variable, fitted using a Gaussian kernel. We treat these lines as the target marginal densities for the KS tests. The example simulations are a typical case of simulation trials that pass the KS test at the 10% significance level. For simulations using our algorithms, the orthonormal basis is generated from the bootstrapped sample. For simulations using the Alexander et al. algorithm, the arbitrary values in each column are also randomly drawn from the real data, and we set the parameter σ² = 0.8 or 0.6. In the right panel, the key characteristics table reports the KS statistics and p-values for the marginals depicted on the left. The table also reports the mean and standard deviation (in brackets) of the average computation time (ACT, in seconds per simulation), computed from the total computation time over 100 trials and depicted in the computation-time figure below the table; the red line depicts the computation time of KROM_0.8 and is set against the left axis; the blue and green lines depict the computation times of KROM_0.6 and of our algorithm matching the first three moments, and are set against the right axis. The machine used is an Intel Core i5-8250U CPU, 3.41 GHz, with 8 GB RAM, running Python under Windows.
c_k = [c_1, ..., c_{l_k}]', where at least one element is non-zero. Now if the column vectors in H^k_{m,l_k} are also orthogonal to the vectors s^c_1, ..., s^c_{k−1}, we have (s^c_h)' h^k_j = 0 for j = 1, ..., l_k and all h < k.

Figure A1. Time series of financial returns. (a) Exhibits the hourly returns on three cryptocurrencies: bitcoin (BTC), ether (ETH), and litecoin (LTC), from 1 January 2017 to 31 December 2021; (b) exhibits the daily returns on three U.S. stock sector indices, viz. Energy, Finance, and Real Estate, from 16 October 2001 to 31 December 2021.

Figure A2. Kollo skewness and kurtosis in financial data in a rolling-window framework. We plot all elements of the Kollo skewness vector and only the unique elements of the Kollo kurtosis matrix. (a) Exhibits the Kollo skewness and kurtosis using a rolling window of 720 hourly returns for cryptocurrency prices of bitcoin (BTC), ether (ETH), and litecoin (LTC), from 1 January 2017 to 31 December 2021; (b) exhibits the Kollo skewness and kurtosis using a rolling window of 250 daily returns for three U.S. stock sector indices, viz. Energy, Finance, and Real Estate, from 16 October 2001 to 31 December 2021. The grey dotted line depicts the rolling Mardia skewness and kurtosis, using the right-hand scale.

Table 1. Weaknesses of popular skewness and kurtosis measures for multivariate variables. All cross-moments are considered for the skewness but not for the kurtosis.
‡ Applicable for distributions without finite first four moments.

Table 2. Sampling errors in multivariate moments for statistical bootstrapping of financial data.

Table 3. Sampling errors in multivariate moments for statistical bootstrapping of temperature data.

Table 4. RMSE of multivariate moments in Monte Carlo simulation.