Improving the volatility of the optimal weights of the Markowitz model

Abstract The main practical problems faced by portfolio optimisation under the Markowitz model are (i) its lower out-of-sample performance than the naive rule, (ii) asset weights with extreme values, and (iii) the high sensitivity of those weights to small changes in the data. In this study, we aim to overcome these problems by using a computation method that shifts the smaller eigenvalues of the covariance matrix into the region occupied by the eigenvalue spectrum of a random matrix. We evaluate this new method using a rolling sample approach. We obtain portfolios that show both more stable asset weights and better performance than the naive rule. We expect that this computation method can be extended to several problems in portfolio management, thereby improving their consistency and performance.


Introduction
Portfolio theory, which was developed by Markowitz (1952, 1959), is one of the most important pillars of financial theory. Part of the popularity of Markowitz's model, which is taught in finance courses globally, is due to its simplicity. Nevertheless, despite the model's popularity, investors' empirical behaviour tends to differ from its predictions (Benartzi & Thaler, 2001).
Portfolio theory addresses the problem of how investors should spread their money across $n + 1$ financial assets, where one asset is risk-free and $n$ assets are risky. It shows that investors can achieve a higher expected return for a given level of risk by investing in efficient portfolios. In Figure 1, the efficient portfolios are shown by the blue line. The theory indicates that all investors should divide their money between the risk-free asset and a tangent portfolio, which is an efficient portfolio that includes only risky assets (Tobin, 1958).
The set of points that contains this linear combination is called the capital market line. Investors choose one of the portfolios on the capital market line according to their degree of risk aversion. To determine the optimal weights of the tangent portfolio, it is necessary to consider the expected returns ($\mu$), the return of the risk-free asset ($r_f$), and the covariance matrix ($\Sigma$). The optimal weights ($w^{*}$) are obtained as

$$w^{*} = \frac{1}{\gamma}\,\Sigma^{-1}\left(\mu - e\,r_f\right),$$

where $\gamma$ is the investor's coefficient of risk aversion and $e$ is a column vector of ones. The practical application depends on obtaining good estimates of $\mu$ and $\Sigma$. However, this is not simple in practice. In fact, Markowitz's model has received serious criticism, including its poor out-of-sample performance (Bloomfield et al., 1977; DeMiguel et al., 2009; Kritzman et al., 2010) and the extreme weights of the resulting optimal portfolios (Black & Litterman, 1991; Papp et al., 2005). It has also been criticised because its optimal portfolio weights are highly sensitive to the dataset that is used to estimate the parameters, which generates high transaction costs (Best & Grauer, 1991; Chopra, 1993; Jobson & Korkie, 1981).

Several researchers have studied the impact of parameter estimation errors on portfolio optimisation. For instance, Chopra (1993) states that the major problems of the mean/variance (MV) framework are the errors associated with the estimation of the means, variances, and correlations of asset returns, as well as the fact that the optimisation process that underlies this methodology maximises these errors. In addition, Chopra and Ziemba (1993) indicate that the estimation errors in means are approximately 11 times more important than those in variances, and that the estimation errors in variances are about twice as important as those in covariances. Merton (1980) states that a long-term series of data is required to estimate expected returns accurately. Ng et al. (2020) explain the difficulty of improving the accuracy of estimates of the means of returns, even as the sample size increases. Moreover, Best and Grauer (1991) find that the weights, mean, and variance of an MV-efficient portfolio can be extremely sensitive to changes in asset means, which has implications for portfolio management.

ECONOMIC RESEARCH-EKONOMSKA ISTRAŽIVANJA

Researchers have also studied the effect of risk estimation on the performance of the Markowitz rule and the conditions required for good performance. For instance, Jobson and Korkie (1981), using Monte Carlo simulations, find that the mean, variance, and covariance parameters do not lend themselves to making inferences in small samples, even when 313 months (about 26 years) of data are used to calculate these parameters. Similarly, Jorion (1986) shows that, as for local portfolios, the optimal asset allocation is sensitive to estimation risk in international cases. Michaud (1989) states that MV models are highly sensitive to changes in the parameters, which are hard to estimate; moreover, these models magnify the effect of estimation errors.
The most popular approach to managing the effect of estimation risk is the use of Bayesian shrinkage estimators, which shrink the sample estimator towards a certain target under the premise that the resulting shrinkage estimator contains less estimation error than the sample estimator. For example, Jorion (1986) and Frost and Savarino (1986) use a shrinkage estimator of the means vector, Ledoit and Wolf (2003, 2004b, 2020) and Ollila and Raninen (2019) consider shrinking the covariance matrix, and DeMiguel et al. (2013) consider different combinations of the shrinkage of the means vector and covariance matrix. However, these proposed amendments have not eliminated the main practical drawbacks of the Markowitz model. Despite these amendments, out-of-sample performance is not superior to the naive 1/n rule and the weights of optimal portfolios obtained with these amendments remain extremely sensitive.
In this study, we contribute to the literature on this topic by proposing a new approach that largely overcomes all of these drawbacks. In addition to decreasing the effect of the estimation risk of the covariance matrix, we focus on reducing the negative effects of the presence of eigenvalues close to zero in the sample covariance matrix.
Our approach compresses the eigenvalue spectrum of a covariance matrix towards the eigenvalue spectrum of a diagonal matrix, which only contains the estimated values of the variances. This diagonal matrix target represents a multivariate process in which the variables are not correlated. Our approach is based on Ledoit and Wolf (2004b) and Schäfer and Strimmer (2005). In particular, the justification for our approach stems from random matrix theory (RMT). The application of RMT to portfolio optimisation suggests that the estimation risk of the correlation (or covariance) matrix plays an important role in this problem. Using these approaches, Laloux et al. (1999) establish that the smallest eigenvalues of this matrix are sensitive to estimation risk, whereas it is precisely those eigenvectors that correspond to the smallest eigenvalues that determine (in Markowitz's theory) the least risky portfolios.
To take advantage of this approach, we must determine the optimal shrinkage degree of the covariance matrix before choosing the optimal portfolio. Therefore, we propose a method that makes the smallest eigenvalue of the fitted correlation matrix larger than the minimum corresponding to a random matrix. Next, we compare the performance of our proposed method with the Markowitz method without adjustments, the 1/n rule, the Ledoit and Wolf (2003, 2004a, 2004b, 2020) methods, and the two methods proposed by Ollila and Raninen (2019). We find that the proposed method delivers better out-of-sample performance than all of the other methods for sample sizes larger than 165 months and maintains more stable and less volatile financial asset weights. Our results show that the proposed approach permits the Markowitz rule to clearly outperform the 1/n rule, with a notable decrease in the extreme values of the optimal portfolio weights and, at the same time, a significant decrease in their volatility. Moreover, these results are obtained when using relatively small sample sizes to estimate the parameters.
The traditional corrections to the MV model have typically centred on raising its performance by improving the estimations of the parameters and diminishing the risk of the estimation. However, in most of these corrections, the weights of the resulting portfolios continue to be highly volatile. Unfortunately, this makes the practical application of the unrestricted model unviable because of the high transaction costs implied by the extreme changes in the portfolio weights. For example, using the methodologies of Ledoit and Wolf (2003, 2004a, 2004b, 2020) and Ollila and Raninen (2019), we show that it is possible to improve the performance of the MV model but that the high volatility of the optimal weights prevails. The sensitivity of the optimal weights is only strongly attenuated when near-zero eigenvalues of the covariance matrix are shifted into the space that those of a random matrix occupy.
The remainder of this paper is organised as follows. Section 2 presents the eigenvalue shrinkage model. Section 3 describes the procedures that we followed to perform the out-of-sample comparisons of the portfolio selection rules. Section 4 shows the preliminary results, while highlighting the potential advantages. Section 5 presents the advantages of applying a new portfolio selection rule and compares the out-of-sample performance with other covariance matrix shrinkage proposals. Finally, Section 6 concludes.

The model
In this section, we will develop a covariance matrix estimate based on the shrinkage of the eigenvalue spectrum.

Traditional Markowitz approach
Consider a set of $T$ observations of $n$ financial assets with returns $r_{it}$ for $i = 1, 2, 3, \ldots, n$ and $t = 1, 2, 3, \ldots, T$, and a risk-free asset at whose rate $r_f$ investors can lend and borrow without limit.
We assume that investors who wish to determine their optimal portfolios use a fixed sample size of the latest $m$ observations to estimate the parameter values: the risk premiums of each asset (the average return less the return of the risk-free asset), $\mu_i - r_f$, and the covariance matrix $\Sigma$.
Typically, in the portfolio selection model, it is assumed that investors have utility functions of the following type:

$$U = \mu_p - \frac{\gamma}{2}\,\sigma_p^{2},$$

where $\mu_p$ is the expected return of the portfolio chosen by the investor, $\sigma_p^{2}$ is the variance of the portfolio, and $\gamma$ represents the risk aversion of the investor. This theory assumes that investors seek to maximise their utility function, with the restriction that the invested money is divided among a set of financial assets that contains $n$ risky assets and one risk-free asset. Hence, $\mu_p = w^{t}\mu + (1 - w^{t}e)\,r_f$ and $\sigma_p^{2} = w^{t}\Sigma w$, where $w^{t} = [w_1, w_2, w_3, \ldots, w_n]$ is the weight assigned to each risky asset and $e$ is the column vector of ones of dimension $n$. Therefore, the problem for the investor is

$$\max_{w}\; w^{t}\mu + (1 - w^{t}e)\,r_f - \frac{\gamma}{2}\,w^{t}\Sigma w.$$

The solution is

$$w^{*} = \frac{1}{\gamma}\,\Sigma^{-1}\left(\mu - e\,r_f\right).$$

Investors can invest money in a risk-free asset that has a return of $r_f$, and in $n$ risky assets that have expected returns $\mu$ and a covariance matrix $\Sigma$. They can assign a percentage of their wealth to each risky asset, $w^{t} = [w_1, w_2, w_3, \ldots, w_n]$, and $(1 - w^{t}e)$ to the risk-free asset. Following this reasoning, Tobin (1958) demonstrates that investors can achieve optimal portfolios by investing a proportion of their wealth in the risk-free asset and the remainder in a portfolio that contains only risky assets, $w_T$, which can be determined as

$$w_T = \frac{\Sigma^{-1}\left(\mu - e\,r_f\right)}{e^{t}\,\Sigma^{-1}\left(\mu - e\,r_f\right)}.$$

Therefore, using this approach to determine how investors have to invest entails inverting the covariance matrix $\Sigma$ and multiplying the inverse by the expected excess returns of the risky assets over the risk-free asset, $\mu - e\,r_f$. However, it is difficult in practice to estimate the true value of these parameters given that they are mainly obtained using limited historical data. Thus, the estimation risk of these parameters sharply affects the performance of the optimal rule of investment proposed by Markowitz (1952).
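If investors consider the $m$ previous observations of $n$ assets, $r_{it}$ for $i = 1, 2, \ldots, n$ and $t = 1, 2, \ldots, m$, then the means and covariances are estimated by the sample mean vector $\hat{\mu}$ and the sample covariance matrix $\hat{\Sigma}$ of those observations. Consequently, the empirical optimal portfolio weights $\hat{w}_{opt}$ are determined by replacing $\mu$ and $\Sigma$ in the solution above with $\hat{\mu}$ and $\hat{\Sigma}$.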
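As a minimal sketch of the estimation step described in this subsection, the following Python code computes sample estimates from a window of returns and plugs them into the mean-variance solution. The function names, the synthetic data, the value of the risk aversion coefficient, and the `ddof=1` convention of `np.cov` are illustrative assumptions, not details taken from this paper.

```python
import numpy as np

def markowitz_weights(returns, r_f, gamma=1.0):
    """Plug-in weights (1/gamma) * inv(Sigma) @ (mu - e * r_f).

    returns: array of shape (m, n), m observations of n risky assets.
    """
    mu_hat = returns.mean(axis=0)                # sample mean vector
    sigma_hat = np.cov(returns, rowvar=False)    # sample covariance (ddof=1, an assumption)
    excess = mu_hat - r_f                        # mu - e * r_f
    # Solve Sigma w = excess instead of forming the inverse explicitly.
    return np.linalg.solve(sigma_hat, excess) / gamma

def tangent_weights(returns, r_f):
    """Risky-asset-only portfolio: the same direction, rescaled to sum to one."""
    raw = markowitz_weights(returns, r_f, gamma=1.0)
    return raw / raw.sum()

# Synthetic window: 200 months of returns on 5 hypothetical assets.
rng = np.random.default_rng(0)
window = rng.normal(0.01, 0.05, size=(200, 5))

w = markowitz_weights(window, r_f=0.002, gamma=3.0)
w_tan = tangent_weights(window, r_f=0.002)
print("mean-variance weights:", np.round(w, 3))
print("tangent weights:      ", np.round(w_tan, 3))
```

The risk aversion coefficient only rescales the weight vector, so the tangent portfolio does not depend on it.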

Shrinkage estimation of the eigenvalues of the covariance matrix
In this study, we propose a new approach that compresses the eigenvalue spectrum of a covariance matrix towards the eigenvalue spectrum of a diagonal matrix that contains only the estimated variances, in order to move the eigenvalues of the covariance matrix away from zero. This diagonal target matrix represents a multivariate process in which the variables are not correlated among themselves. This approach is similar to the one proposed by Ledoit and Wolf (2004b) and Schäfer and Strimmer (2005). The justification for our approach stems from RMT. The application of RMT to portfolio optimisation suggests that the estimation risk of the correlation (or covariance) matrix plays an important role in this problem. Using this approach, Laloux et al. (1999) find that the smallest eigenvalues of this matrix are sensitive to estimation risk, whereas it is precisely the eigenvectors corresponding to the smallest eigenvalues that determine (in Markowitz's theory) the least risky portfolios. Thus, as stated by Laloux et al. (1999), 'one should be careful when using this correlation matrix in applications.'

RMT establishes that for a dataset $X$ of dimension $n \times T$, which has $T$ observations of $n$ random variables and whose components are all independent random variables, the spectrum of eigenvalues of its covariance matrix, $C = \frac{1}{T}X^{t}X - \mu\mu^{t}$, lies between a minimum limit $a$ and a maximum limit $b$ (see Marchenko & Pastur, 1967), where, for components with unit variance (as in a correlation matrix),

$$a = \left(1 - \sqrt{n/T}\right)^{2}, \qquad b = \left(1 + \sqrt{n/T}\right)^{2}. \tag{7}$$

As we will show later, the eigenvalues below the lower limit $a$ contain problematic noise that complicates the portfolio optimisation process. Furthermore, smaller eigenvalues determine (in Markowitz's theory) the least risky portfolios. Eigenvalues between $a$ and $b$ are noise belonging to random matrix-type behaviour, whereas values greater than $b$ contain useful information.
In this study, we propose an adjustment of the covariance matrix by shrinking the covariance matrix eigenvalue spectrum towards a target eigenvalue spectrum. The matrix associated with this target is a diagonal matrix containing the values of the estimated variances on its diagonal. This target matrix mimics a random matrix because the covariances between the variables considered are zero.
This methodology aims to move the part of the eigenvalue spectrum of the covariance matrix that lies between zero and $a$ into the region of the eigenvalue spectrum given by RMT, that is, the region $(a, b)$. The benefit of applying this adjustment to the covariance matrix is a noticeable reduction in the dispersion of the optimal portfolio weights without placing any restrictions on them. This allows us to obtain stable portfolios that imply low transaction costs for investors and a considerable improvement in the performance of the Markowitz rule. In fact, the out-of-sample Sharpe ratio surpasses that of the 1/n rule, which permits investors to obtain higher returns for each risk level.
We propose the following correction of the covariance matrix to calculate the optimal weights:

$$\hat{Y} = (1 - g)\,\hat{\Sigma} + g\,\hat{S},$$

where $\hat{S}$ is the diagonal matrix of $\hat{\Sigma}$ and $g$ is an adjustment factor of the shrinkage. This is the same expression that Schäfer and Strimmer (2005) employ when two models are used to estimate the covariance matrix, the first with many free parameters and the second with little bias. Consequently, the optimal weights are obtained by using $\hat{Y}$ in place of $\hat{\Sigma}$ in the expression for the empirical optimal weights.

2.2.1. Shrinkage of the eigenvalue spectrum

Each of the $n$ eigenvalues of the covariance matrix $\hat{\Sigma}$, $\lambda_{\Sigma}$, can be obtained from

$$\det\left(\lambda_{\Sigma(i)} I - \hat{\Sigma}\right) = 0.$$

Furthermore, each of the $n$ eigenvalues of the matrix $\hat{Y}$, $\lambda_{Y}$, can be obtained from

$$\det\left(\lambda_{Y(i)} I - (1 - g)\hat{\Sigma} - g\hat{S}\right) = 0.$$

The eigenvalues of the matrix $\hat{Y}$, $\lambda_{Y}$, are those of the matrix $\hat{\Sigma}$ when $g = 0$; as $g$ increases, they approach those of the diagonal matrix $\hat{S}$, and for $g = 1$ they are equal to those of $\hat{S}$. Consequently, by changing $g$, we shift the eigenvalues of the matrix $\hat{Y}$.
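The effect of the adjustment factor on the eigenvalues can be illustrated numerically. The sketch below builds the shrunk matrix for a synthetic covariance matrix and prints its extreme eigenvalues for three values of the factor; the data, the seed, and the function name are hypothetical and serve only as an illustration.

```python
import numpy as np

def shrink_covariance(sigma_hat, g):
    """Return (1 - g) * Sigma + g * S, where S keeps only the diagonal of Sigma."""
    s_hat = np.diag(np.diag(sigma_hat))
    return (1.0 - g) * sigma_hat + g * s_hat

# Synthetic, correlated data: 40 observations of 10 variables.
rng = np.random.default_rng(1)
x = rng.normal(size=(40, 10)) @ rng.normal(size=(10, 10))
sigma_hat = np.cov(x, rowvar=False)

# eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
eig_0 = np.linalg.eigvalsh(shrink_covariance(sigma_hat, 0.0))
eig_half = np.linalg.eigvalsh(shrink_covariance(sigma_hat, 0.5))
eig_1 = np.linalg.eigvalsh(shrink_covariance(sigma_hat, 1.0))

for g, eig in ((0.0, eig_0), (0.5, eig_half), (1.0, eig_1)):
    print(f"g={g:.1f}  smallest={eig[0]:.4f}  largest={eig[-1]:.4f}")
```

At a factor of zero the eigenvalues coincide with those of the sample covariance matrix, and at a factor of one they coincide with the sorted sample variances. In between, the smallest eigenvalue cannot fall below its unshrunk value, because the smallest eigenvalue is a concave function of the matrix and the smallest variance is never below the smallest eigenvalue.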

Eigenvalue spectrum of a typical large covariance matrix
The random matrix eigenvalues have the following distribution (stated for components with unit variance):

$$f(\lambda) = \frac{1}{2\pi y \lambda}\sqrt{(b - \lambda)(\lambda - a)}, \qquad a \le \lambda \le b,$$

where $a$ and $b$ are given by Eq. (7) and $y = n/T$. The red line in Figure 2 shows the theoretical density of the eigenvalues of a random correlation matrix of dimension 30. The blue line shows the empirical density of the eigenvalues of the correlation matrix of the 30 equally weighted industrial portfolios, considering a sample size of 200 monthly observations. In total, the eigenvalues of the 938 correlation matrices are used to construct this density function.
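As an illustration, the limits and the density of the random matrix spectrum can be evaluated numerically. The sketch below assumes the standard Marchenko-Pastur form for unit-variance components, with the ratio of the number of variables to the number of observations as its only parameter; the function names are hypothetical.

```python
import numpy as np

def mp_limits(n, T):
    """Lower and upper limits of the Marchenko-Pastur spectrum (unit variance)."""
    y = n / T
    return (1.0 - np.sqrt(y)) ** 2, (1.0 + np.sqrt(y)) ** 2

def mp_density(lam, n, T):
    """Marchenko-Pastur density; zero outside the interval (a, b)."""
    y = n / T
    a, b = mp_limits(n, T)
    lam = np.asarray(lam, dtype=float)
    dens = np.zeros_like(lam)
    inside = (lam > a) & (lam < b)
    dens[inside] = np.sqrt((b - lam[inside]) * (lam[inside] - a)) / (
        2.0 * np.pi * y * lam[inside]
    )
    return dens

# 30 variables and 200 observations, as in the setting of Figure 2.
a, b = mp_limits(n=30, T=200)
grid = np.linspace(a, b, 20001)
dens = mp_density(grid, n=30, T=200)

# Trapezoidal rule: the density should integrate to approximately one.
area = float(np.sum(0.5 * (dens[1:] + dens[:-1]) * np.diff(grid)))
print(f"lower limit a = {a:.4f}, area under the density = {area:.4f}")
```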
A high proportion of the eigenvalues are below the lower limit of an associated random matrix. As we will show later, the presence of these eigenvalues in this area causes serious problems for the Markowitz model.
Although not shown in Figure 2, the empirical density of the eigenvalues of $\Sigma$ reaches values up to 25. According to these observed values and principal component theory, this means that the highest eigenvalue explains around 83% of the variance, while smaller eigenvalues explain a minimal part of the variance of the covariance matrix.
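The 83% figure can be checked with a short calculation, assuming that the eigenvalues are those of the $30 \times 30$ correlation matrix, whose eigenvalues sum to its trace, 30:

```latex
\[
\frac{\lambda_{\max}}{\sum_{i=1}^{30} \lambda_i}
  = \frac{25}{30}
  \approx 0.833 .
\]
```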
According to RMT, the presence of eigenvalues below the lower limit ($a = 0.3754$ for $n = 30$) is highly problematic when determining the optimal portfolio according to Markowitz (Bai et al., 2009). By contrast, eigenvalues above the upper limit $b = 1.7325$ contain valuable information, while eigenvalues within these limits $[a, b]$ are the products of a random matrix. The proposed approach allows us to move the spectrum of eigenvalues towards a space in which all of these values are above the lower limit of the spectrum corresponding to a random matrix.
2.2.3. Another explanation of the optimal weight sensitivity from the spectral decomposition

A square symmetric matrix such as a covariance matrix can be expressed as

$$\Sigma = U D U^{t},$$

where $U$ is defined as

$$U = [u_1, u_2, \ldots, u_n].$$

Here, $u_i$ represent the eigenvectors of the matrix $\Sigma$ and $D$ is a diagonal matrix containing the eigenvalues $\lambda_{\Sigma}$ of the matrix $\Sigma$. Therefore, we can express

$$\Sigma = \sum_{i=1}^{n} \lambda_i\, u_i u_i^{t},$$

where $u_i u_i^{t}$ are matrices with a rank equal to one. In a similar way, we can express the inverse of a covariance matrix as

$$\Sigma^{-1} = \sum_{i=1}^{n} \frac{1}{\lambda_i}\, u_i u_i^{t}. \tag{16}$$

Eq. (16) shows that when we invert a matrix, smaller eigenvalues and their corresponding eigenvectors are more important than larger eigenvalues. Furthermore, the estimation error of the smallest eigenvalue is greater than that of the highest eigenvalue. This shows the sensitivity of the Markowitz model to errors in the assessment of smaller eigenvalues. In Figure 2, the eigenvectors associated with the smallest eigenvalues, which have a mean of 0.06, are the ones that determine the inverse matrix of Eq. (16), whereas the eigenvector associated with the highest eigenvalue, whose value is equal to 25, has minimal importance.
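The spectral argument can be verified numerically. The following sketch rebuilds a synthetic covariance matrix and its inverse from rank-one terms and reports the relative weight that each term receives in the inverse; the data and variable names are illustrative assumptions.

```python
import numpy as np

# Synthetic covariance matrix: 60 observations of 6 variables.
rng = np.random.default_rng(2)
x = rng.normal(size=(60, 6))
sigma = np.cov(x, rowvar=False)

# eigh returns ascending eigenvalues and the eigenvectors as columns of u.
lam, u = np.linalg.eigh(sigma)

# Sum of rank-one matrices lambda_i * u_i u_i^t reproduces the matrix ...
sigma_rebuilt = sum(lam[i] * np.outer(u[:, i], u[:, i]) for i in range(len(lam)))
# ... and the same terms weighted by 1 / lambda_i reproduce its inverse.
sigma_inv = sum(np.outer(u[:, i], u[:, i]) / lam[i] for i in range(len(lam)))

# Relative weight of each rank-one term in the inverse.
inv_weights = (1.0 / lam) / np.sum(1.0 / lam)
print("eigenvalues:          ", np.round(lam, 3))
print("weights in the inverse:", np.round(inv_weights, 3))
```

Because the terms of the inverse are weighted by the reciprocal of each eigenvalue, the term belonging to the smallest eigenvalue receives the largest weight, which is why errors in the small eigenvalues propagate so strongly into the optimal weights.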
Traditional methodologies aim to reduce the effects of risk in the estimation of the parameters. However, if, after these corrections, the covariance matrix that is finally used continues to have eigenvalues close to zero, then the high sensitivity of the optimal weights will persist. Thus, for the MV method to be viable, it must both diminish the estimation risk of the parameters and shift the eigenvalues of the covariance matrix towards the region in which those of a random matrix are located.

Data and experiment
The data correspond to 1138 monthly observations of the excess returns over the risk-free asset of 30 industrial portfolios from July 1926 to April 2021. These are the 30 equally weighted industrial portfolio sectors taken from Kenneth French's website. We use the series of one-month Treasury bills taken from the Federal Reserve Economic Data as the risk-free rate.
To compare the relative empirical performance of the Markowitz rule with the shrinkage of eigenvalues, we use the naive 1/n rule as a benchmark. This investment rule invests a proportion 1/n of wealth in each of the $n$ assets available at each rebalancing date. We use this naive rule as the benchmark because it is easy to implement: it involves no parameter estimation or optimisation process, so the data do not matter. Moreover, this rule has not been outperformed by more complex optimisation rules when small sample sizes are used to estimate the parameters. Its weights are

$$w_i = \frac{1}{n}, \qquad i = 1, 2, \ldots, n.$$

We use a rolling sample approach that is similar to the one used by DeMiguel et al. (2009). Given a dataset of $T = 1138$ monthly observations, we use sample sizes ($m$) equal to 200, 300, 400, and 500 monthly observations. For the case in which the investor uses a sample size of $m$, we thus have $(T - m)$ observations of realised returns obtained using the investment rule suggested by each model.
The observations of out-of-sample returns are used to analyse the performance of these investment rules. The assumptions of the empirical models are as follows:

1. Investors use the first $m$ observations to estimate the parameter values (mean and covariance matrix).
2. Using these parameters, investors compute the optimal portfolio weights.
3. Then, they use this asset allocation to build their portfolios.
4. Investors hold their investments in these portfolios for one month.
5. At the end of the month, they calculate their realised returns.
6. At the beginning of the following month, investors choose the last $m$ months to recalculate the parameters by dropping the earliest return and adding the return of the following month. In this way, the number of monthly observations that is used to calculate the parameters is always equal to $m$.
7. Steps 2 to 6 are repeated until investors make the last allocation in month $T - m$.

As explained earlier, for each sample size of length $m$, we have $(T - m)$ observations of realised excess returns, which are calculated as follows:

$$\hat{r}_{rule,\,t+1} = \hat{w}_{rule}^{t}\, r_{t+1},$$

where $r_{t+1}$ is the vector of excess returns in the month that follows the estimation window and the value of $\hat{w}_{rule}$ depends on the investment rule used:

1. If we use the Markowitz rule, $\hat{w}_{rule} = \hat{w}_{opt}$.
2. If we use the Markowitz rule with the proposed adjustment, $\hat{w}_{rule} = \hat{w}_{optR}$.
3. If we use the 1/n rule, $\hat{w}_{rule}(i) = 1/n$ for $i = 1, 2, \ldots, n$.

Thus, we have $(T - m)$ observations of realised excess returns for the three rules. Using these realised excess returns, we calculate their out-of-sample mean $\hat{\mu}_{rule}$ and standard deviation $\hat{\sigma}_{rule}$ for the three rules. Then, we calculate the realised Sharpe ratio

$$\widehat{SR}_{rule} = \frac{\hat{\mu}_{rule}}{\hat{\sigma}_{rule}}.$$

For $m = 200$, we have 938 observations to compute the out-of-sample Sharpe ratio; for the other extreme ($m = 500$), we have 638 observations. This allows us to assess the performance of the three investment rules for sample sizes of 200, 300, 400, and 500 months.
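The rolling sample procedure described above can be sketched in a few lines of Python. The code below is only an illustration on synthetic excess returns: the function names, the data, the choice of a unit risk aversion coefficient, and the `ddof=1` convention for the standard deviation are assumptions rather than details taken from this study.

```python
import numpy as np

def rolling_sharpe(excess, m, weight_rule):
    """Out-of-sample Sharpe ratio of a rule under a rolling window of length m.

    excess: array of shape (T, n) with monthly excess returns.
    weight_rule: function mapping an (m, n) window to a vector of n weights.
    """
    T = excess.shape[0]
    realised = np.empty(T - m)
    for k in range(T - m):
        window = excess[k:k + m]            # the latest m observations
        w = weight_rule(window)             # estimate and optimise
        realised[k] = w @ excess[k + m]     # hold for the following month
    return realised.mean() / realised.std(ddof=1), realised

def naive_rule(window):
    """The 1/n rule: no estimation, equal weights."""
    n = window.shape[1]
    return np.full(n, 1.0 / n)

def markowitz_rule(window):
    """Plug-in mean-variance weights; the inputs are already excess returns."""
    mu_hat = window.mean(axis=0)
    sigma_hat = np.cov(window, rowvar=False)
    return np.linalg.solve(sigma_hat, mu_hat)

# Synthetic data: 400 months of excess returns on 5 hypothetical assets.
rng = np.random.default_rng(3)
excess = rng.normal(0.005, 0.04, size=(400, 5))

sr_naive, r_naive = rolling_sharpe(excess, 200, naive_rule)
sr_mv, r_mv = rolling_sharpe(excess, 200, markowitz_rule)
print(f"1/n rule: {sr_naive:.4f}   Markowitz rule: {sr_mv:.4f}")
```

The Sharpe ratio is invariant to a positive rescaling of the weights, so the value chosen for the risk aversion coefficient does not affect the comparison.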
Recall that we assume that investors choose their portfolios at the beginning of each month and then rebalance their portfolios by considering the results obtained and the new optimal weightings chosen at the end of the month. It would thus be interesting to explore the optimal length of portfolio rebalancing (e.g. Fahmy, 2020).

Preliminary results of the shrinkage estimations
4.1. Results using the 30 equally weighted industrial portfolios

Figure 3 shows the out-of-sample Sharpe ratio for the different investment rules, obtained using a rolling sample size of 200 monthly observations. The green line shows the Sharpe ratio of the naive 1/n rule and the blue line shows the out-of-sample Sharpe ratio of the modified Markowitz rule for values of $g$ ranging from 0 to 1, with an increment of 0.01. The Markowitz rule without shrinkage corresponds to $g$ equal to zero. The Sharpe ratio of the unadjusted Markowitz rule, for a sample size of 200, is 0.011. This is well below that obtained using the 1/n rule, which has an out-of-sample Sharpe ratio of 0.1725. The out-of-sample Sharpe ratio rises as $g$ increases until it peaks at 0.1990, which is obtained for $g$ equal to 0.81. Thus, with the optimum shrinkage, the out-of-sample Sharpe ratio increases from 0.011 to 0.1990 (i.e. an 18-fold increase).

Figure 4 shows the out-of-sample Sharpe ratio for a sample size of 300. The Sharpe ratio without adjustment is 0.1641, which is higher than that obtained using the 1/n rule (0.1581). In this case, the optimal adjustment is obtained for $g$ equal to 0.47, with a Sharpe ratio of 0.2117; that is, the out-of-sample Sharpe ratio increases by 29% with respect to the Markowitz rule without adjustment.

Figure 5 shows that, for a sample size of 400, the Sharpe ratio without adjustment is 0.1480, which is higher than that obtained using the 1/n rule (0.1424). In this case, the optimal adjustment is obtained for $g$ equal to 0.43, with a Sharpe ratio of 0.1945; that is, the out-of-sample Sharpe ratio increases by 31% with respect to the Markowitz rule without adjustment.

Figure 6 shows that, for a sample size of 500, the Sharpe ratio without adjustment is 0.1412, which is higher than that obtained using the 1/n rule (0.1305). In this case, the optimal adjustment is obtained for $g$ equal to 0.37, with a Sharpe ratio of 0.1889; that is, the out-of-sample Sharpe ratio increases by 34% with respect to the Markowitz rule without adjustment.
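The reported improvements follow directly from the ratios of the adjusted to the unadjusted Sharpe ratios quoted in this subsection:

```latex
\[
\frac{0.1990}{0.011} \approx 18.1, \qquad
\frac{0.2117}{0.1641} \approx 1.29, \qquad
\frac{0.1945}{0.1480} \approx 1.31, \qquad
\frac{0.1889}{0.1412} \approx 1.34 .
\]
```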
In summary, the compression of the covariance matrix to a random matrix improves the out-of-sample Sharpe ratio for all of the sample sizes that we considered. However, its effect is stronger for smaller sample sizes.

Effects of the optimal weights
Recalling that the low performance of the Markowitz rule is only one of its drawbacks, we now focus on the second drawback (i.e. its extreme and volatile portfolio weights).
To illustrate this issue, the following exercise is carried out:

1. We calculate the 30 portfolio weights obtained with the Markowitz rule without adjustment for each of the $(T - m)$ portfolios.
2. We calculate the 30 portfolio weights obtained with the Markowitz rule with the optimal adjustment for each of the $(T - m)$ portfolios.
3. We calculate the means and standard deviations of each portfolio weight.

Table 1 shows the means and standard deviations of the optimal weights of the Markowitz rule with and without the adjustment for each of the 30 equally weighted industrial portfolios. Columns 2 and 4 show the means of the optimal weights obtained with the Markowitz rule without adjustment for sample sizes of 200 and 500 monthly observations, and columns 3 and 5 show their respective standard deviations. Similarly, columns 6 and 8 show the means of the optimal weights obtained using the Markowitz rule with the optimum adjustment to the covariance matrix for sample sizes of 200 and 500 monthly observations, and columns 7 and 9 show their respective standard deviations.

The observed means and standard deviations of the optimal weights obtained using the traditional Markowitz rule show that the weights are extremely high and volatile. For example, for a sample size of 200, the mean weight of industrial portfolio number 11 is $-1.11$. Therefore, investors who want to use this rule would have to take a short position in this asset equal to 1.11 times their wealth. Additionally, the standard deviation of this weight is 5.59 times the value of investors' wealth, which implies high transaction costs for investors. These facts make it infeasible to use the Markowitz rule under these conditions. Certainly, these values are the most extreme when the sample size is small. However, increasing the number of observations in the sample does not eliminate these drawbacks. This is consistent with the findings of Jobson and Korkie (1981), Chopra and Ziemba (1993), Chopra (1993), and Best and Grauer (1991).

By contrast, using the Markowitz rule with the optimum shrinkage of the eigenvalue spectrum of the covariance matrix, we obtain smaller values with a lower dispersion for each asset weight. For a sample size of 200 monthly observations, the dispersion of some of the weights decreases by a factor of more than 416.
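The exercise above amounts to collecting the weight vector of every rolling window and summarising each asset's weight across windows. A minimal sketch on synthetic data follows; the common-factor data design, the window length, the adjustment factor of 0.8, and the function names are illustrative assumptions.

```python
import numpy as np

def markowitz_rule(window, g=0.0):
    """Plug-in weights using the covariance matrix shrunk towards its diagonal."""
    mu_hat = window.mean(axis=0)
    sigma_hat = np.cov(window, rowvar=False)
    y_hat = (1.0 - g) * sigma_hat + g * np.diag(np.diag(sigma_hat))
    return np.linalg.solve(y_hat, mu_hat)

def weight_statistics(excess, m, weight_rule):
    """Mean and standard deviation of each asset weight across rolling windows."""
    T = excess.shape[0]
    weights = np.array([weight_rule(excess[k:k + m]) for k in range(T - m)])
    return weights.mean(axis=0), weights.std(axis=0, ddof=1)

# Synthetic excess returns driven by one common factor, so that the
# sample covariance matrix has several eigenvalues close to zero.
rng = np.random.default_rng(4)
factor = rng.normal(0.005, 0.04, size=(300, 1))
excess = factor + 0.01 * rng.normal(size=(300, 6))

mean_raw, sd_raw = weight_statistics(excess, 60, lambda w: markowitz_rule(w, g=0.0))
mean_shr, sd_shr = weight_statistics(excess, 60, lambda w: markowitz_rule(w, g=0.8))
print("std of weights, no adjustment:", np.round(sd_raw, 2))
print("std of weights, g = 0.8:      ", np.round(sd_shr, 2))
```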
In conclusion, using this approach improves the out-of-sample Sharpe ratio and at the same time lowers the volatility of the portfolio weights markedly. This could allow the Markowitz rule to overcome its main practical drawbacks.

Effects of the eigenvalues of the covariance matrix
To assess whether the eigenvalue spectrum is affected by this approach, we calculate the eigenvalues $\lambda_i$ for all of the samples and determine the percentage of times that they are within the limits associated with a random matrix. For a sample size of 200 monthly observations, we calculate the 30 eigenvalues of the 938 correlation matrices with and without shrinkage. Table 2 shows the results, where the values presented are averages. This shows that the main effect of the adjustment is to reduce the percentage of times that the eigenvalues are below the minimum limit.
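As a sketch of this classification, the code below computes the share of eigenvalues of a correlation matrix that fall below, inside, and above the random matrix limits. It assumes the unit-variance Marchenko-Pastur limits and uses synthetic one-factor data; the function name is hypothetical.

```python
import numpy as np

def eigenvalue_shares(corr, T):
    """Shares of eigenvalues below a, within [a, b], and above b."""
    n = corr.shape[0]
    y = n / T
    a, b = (1.0 - np.sqrt(y)) ** 2, (1.0 + np.sqrt(y)) ** 2
    lam = np.linalg.eigvalsh(corr)
    below = float(np.mean(lam < a))
    inside = float(np.mean((lam >= a) & (lam <= b)))
    above = float(np.mean(lam > b))
    return below, inside, above

# Synthetic data: 200 observations of 30 variables with a strong common factor.
rng = np.random.default_rng(5)
T, n = 200, 30
factor = rng.normal(size=(T, 1))
x = factor + 0.3 * rng.normal(size=(T, n))
corr = np.corrcoef(x, rowvar=False)

shares = eigenvalue_shares(corr, T)
print("below a: {:.2%}, within [a, b]: {:.2%}, above b: {:.2%}".format(*shares))
```

Averaging these shares over all rolling windows, with and without shrinkage, gives a summary of the kind reported in Table 2.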
The abundant noise in the estimation of the covariance matrix triggers the greater presence of eigenvalues below the lower limit established by the spectral density of Marchenko and Pastur (1967). Larger eigenvalues remain relatively unaffected by the noise. Figure 7 shows that the eigenvalue spectrum moves towards the right as the compression of the covariance matrix towards a random matrix increases. Initially, a large proportion of eigenvalues are below the lower random matrix limit. As we reach the optimal shrinkage level, no eigenvalues are below the lower limit.

Figure 7 shows the effect of the shrinkage degree (increasing $g$) on the eigenvalue spectrum. The random matrix density is shown in brown in all of the figures. As the value of $g$ increases, the eigenvalue spectrum of the shrunk matrix shifts towards the interior of the random matrix spectrum. For $g = 0.81$, the lowest eigenvalues of the compressed matrix are all inside the random matrix spectrum.

Estimating the optimal covariance matrix ex-ante shrinkage level
So far, we have shown that the shrinkage of the covariance matrix can significantly improve the performance of the Markowitz rule. However, we now want to explore how effective this method is under more realistic conditions. In practice, an investor must simultaneously determine both the degree of compression of the covariance matrix and the portfolio in which they will invest, using only the information available up to that point. Thus, the investor must determine the ex-ante shrinkage level. A comparison with other approaches used to shrink the covariance matrix is also required. For example, we compare our method with the one proposed in Ledoit and Wolf (2004b), followed by a comparison of the optimal shrinkage with other more recent approaches.

Ledoit-Wolf approach
The Ledoit and Wolf (2004b) method minimises the estimation risk of a set of parameters by shrinking them to a target set value. The logic of this approach relies on the fact that the sample covariance matrix has a high estimation risk due to the presence of many degrees of freedom. Therefore, shrinking it towards another, biased estimator (with a lower estimation risk and fewer degrees of freedom) reduces the effects of the estimation risk.
Using Ledoit and Wolf (2004b), we estimate the optimal shrinkage for the 938 samples of size 200. Figure 8 shows the optimal shrinkage values for each of the 938 samples considered (each of 200 observations). It shows that the shrinkage degree decreases when the samples use more recent data. Using these values, we calculate the Markowitz out-of-sample Sharpe ratio and obtain a value of 0.0835 compared with 0.1564 for the 1/n rule. This result shows that although the estimation risk decreases, it is not sufficient to ensure that the Markowitz rule performs better than the 1/n rule.

Shrinkage of the eigenvalues of the covariance matrix
Considering this discussion and recalling that the eigenvalues of the covariance matrix below the lower limit of the eigenvalues of a random matrix are those that cause greater problems to the Markowitz rule, we propose an alternative shrinkage approach. This approach consists of moving the smaller eigenvalues within the limits of the respective random matrix. Specifically, we calculate the eigenvalues $\lambda_i$ of the correlation matrix. We then take the minimal eigenvalue $\lambda_{\min}$ and move it towards the random matrix spectrum (equation 22). From equation 22, we choose the shrinkage factor $g^{*}$. Henceforth, we call this approach the optimal shrinkage of the covariance matrix eigenvalues (OSME). We calculate the optimal $g^{*}$ for the 938 samples and show the result in Figure 9. Thus, for each of the 938 samples with 200 monthly observations, a new correlation matrix is determined using $g^{*}$. Figure 9 shows that the shrinkage levels have higher and more stable values than under the Ledoit and Wolf (2004b) approach. Next, we calculate the out-of-sample performance of the Markowitz rule using this approach. We find that its Sharpe ratio is equal to 0.1928, which is above that of the 1/n rule (0.1564).
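One possible reading of this idea can be sketched under a stated assumption: if a correlation matrix is shrunk towards the identity matrix (its own diagonal), each eigenvalue $\lambda_i$ becomes $(1 - g)\lambda_i + g$, so the smallest factor that lifts $\lambda_{\min}$ to the lower limit $a$ is $(a - \lambda_{\min})/(1 - \lambda_{\min})$. The exact expressions used by the authors may differ; the function name and the synthetic data below are hypothetical.

```python
import numpy as np

def osme_like_shrinkage(corr, T):
    """Shrink a correlation matrix towards the identity until its smallest
    eigenvalue reaches the lower Marchenko-Pastur limit (illustrative only)."""
    n = corr.shape[0]
    a = (1.0 - np.sqrt(n / T)) ** 2
    lam_min = np.linalg.eigvalsh(corr)[0]
    if lam_min >= a:
        return 0.0, corr.copy(), a        # nothing to adjust
    # Eigenvalues of (1 - g) * corr + g * I are (1 - g) * lambda_i + g.
    g_star = (a - lam_min) / (1.0 - lam_min)
    shrunk = (1.0 - g_star) * corr + g_star * np.eye(n)
    return g_star, shrunk, a

# Synthetic correlation matrix with many eigenvalues close to zero.
rng = np.random.default_rng(6)
T, n = 200, 30
factor = rng.normal(size=(T, 1))
x = factor + 0.3 * rng.normal(size=(T, n))
corr = np.corrcoef(x, rowvar=False)

g_star, shrunk, a = osme_like_shrinkage(corr, T)
new_min = np.linalg.eigvalsh(shrunk)[0]
print(f"g* = {g_star:.4f}, smallest eigenvalue after shrinkage = {new_min:.4f}")
```

In a rolling sample setting, this factor would be recomputed for every estimation window before the portfolio weights are calculated.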

Comparison between OSME, Ledoit-Wolf, and the 1/n rule
In this subsection, we first compare OSME and the Ledoit and Wolf (2004a) method with the 1/n rule. Figure 10 shows the Sharpe ratio of the 1/n rule in red, that obtained using the Ledoit-Wolf method in brown, and that obtained using OSME in blue for the different sample sizes. This figure shows that the Sharpe ratio of the 1/n rule is the highest for sample sizes of less than 165 months. OSME performs better for sample sizes greater than 165 months. The Ledoit-Wolf method begins to outperform the 1/n rule for sample sizes over 280 months. However, it is never superior to OSME. Table 3 shows the means and standard deviations of the optimal weights obtained using these three approaches. Although the standard deviation of the weights decreases, it remains high. The volatility of the optimal weights under the OSME approach is the lowest. The results show notably better performance for the OSME-adjusted Markowitz rule than for the naive 1/n rule, and not only for a sample size of 200 months. This third approach allows the Markowitz rule to perform better than the 1/n rule for sample sizes greater than 165 monthly observations.

Comparison with other approaches
Finally, to demonstrate the robustness of our results, we compare the proposed approach with recent approaches related to the shrinkage of covariance matrices. The methods that we consider are those of Ledoit and Wolf (2003, 2004a,b, 2020) and Ollila and Raninen (2019). All of these methods seek to decrease the estimation error in the covariance matrix by shrinking the sample covariance matrix towards an identity matrix to decrease the mean squared error. Table 4 shows the out-of-sample performance of several methods. Two methods developed by Ollila and Raninen (2019) are considered (Ell1-RSCM and Ell2-RSCM). The table shows that the 1/n rule predominates for sample sizes less than 200 months. Meanwhile, the OSME approach predominates for sample sizes of 150 months and over. The Ledoit-Wolf and Ollila-Raninen methods exceed the 1/n rule for a sample size greater than 300 months.
The Sortino ratio is used as a performance measure to check the robustness of the results (Sortino & Van Der Meer, 1991). Whereas the Sharpe ratio considers the standard deviation σ as the risk measure, the Sortino ratio considers as risk only the part of σ that arises from returns below a minimum acceptable return. We use the one-month Treasury bill rate as the minimum acceptable return to calculate the Sortino ratio. Table 5 shows the performance measures using the Sharpe and Sortino ratios for a sample size of 200 monthly observations. As before, the results continue to show the superior performance of the OSME method.
Table 6 shows the standard deviations of the weights (for 30 assets) that we obtained using each method. The last row provides the respective mean values. The OSME method decreases the mean volatility by a factor of 43 compared with the Markowitz rule. In addition, it decreases the mean volatility by a factor of 12.6, on average, with respect to the Ledoit and Wolf (2003, 2004a,b, 2020) and Ollila and Raninen (2019) methods.
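The difference between the two performance measures can be illustrated with a minimal sketch. It assumes per-period returns supplied as plain lists, a sample standard deviation for the Sharpe ratio, and a downside deviation computed over all periods for the Sortino ratio; other conventions exist, and the function names are hypothetical.

```python
import math


def sharpe_ratio(returns, risk_free):
    """Mean excess return divided by the sample standard deviation of excess returns."""
    excess = [r - f for r, f in zip(returns, risk_free)]
    mean = sum(excess) / len(excess)
    variance = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(variance)


def sortino_ratio(returns, mar):
    """Mean return above the minimum acceptable return (MAR) over downside deviation."""
    gaps = [r - m for r, m in zip(returns, mar)]
    mean = sum(gaps) / len(gaps)
    # Only shortfalls below the MAR contribute to the risk measure.
    downside = math.sqrt(sum(min(g, 0.0) ** 2 for g in gaps) / len(gaps))
    if downside == 0.0:
        return math.inf  # no return fell below the MAR
    return mean / downside


# Illustrative monthly returns with a zero risk-free rate and zero MAR.
monthly = [0.02, -0.01, 0.03, -0.02]
zero = [0.0] * len(monthly)
sharpe = sharpe_ratio(monthly, zero)
sortino = sortino_ratio(monthly, zero)
```

In this toy series the Sortino ratio exceeds the Sharpe ratio because the two positive months add to the standard deviation but not to the downside deviation.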
These results are useful for portfolio managers because a portfolio selection method must perform well and incur low transaction costs. High volatility in asset weightings implies high transaction costs when rebalancing portfolios periodically.
Therefore, we recommend that portfolio managers, instead of using traditional covariance matrix estimates, shrink these matrices towards a diagonal matrix that contains only the variances of the individual assets. The shrinkage factor must be chosen so that the smallest eigenvalue lies between the lower and upper bounds of the random matrix spectrum (the corresponding shrinkage factor can be calculated by applying equation 23). By applying this modified Markowitz rule, better performance and lower transaction costs could be obtained.
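The link between weight volatility and transaction costs can be made concrete with a simple turnover measure. The sketch below is an assumption for illustration: it defines turnover as the average, over rebalancing dates, of the summed absolute changes in the target weights, and it ignores the drift of weights between rebalancing dates. The function name `average_turnover` and the example weights are hypothetical.

```python
def average_turnover(weight_history):
    """Average over rebalancing dates of the summed absolute weight changes.

    weight_history: list of weight vectors, one per rebalancing date.
    """
    changes = [
        sum(abs(new - old) for new, old in zip(w_next, w_prev))
        for w_prev, w_next in zip(weight_history, weight_history[1:])
    ]
    return sum(changes) / len(changes)


# Stable weights trade little; extreme, sign-flipping weights trade a lot.
stable = [[0.5, 0.5], [0.5, 0.5], [0.6, 0.4]]
unstable = [[2.0, -1.0], [-1.0, 2.0], [2.0, -1.0]]
stable_turnover = average_turnover(stable)
unstable_turnover = average_turnover(unstable)
```

If costs are proportional to the amount traded, multiplying the turnover by a cost rate gives a rough per-period cost, so a method that lowers the standard deviation of the weights would also be expected to lower this figure.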

Conclusions
In the literature, classical methods of covariance matrix shrinkage use a matrix proportional to the identity as a target. In this paper, we propose a modification of this method that uses a diagonal matrix formed by the variances of the returns as a target. In addition, we shift the eigenvalue spectrum of the shrunk covariance matrix such that its lower part is contained within the spectrum of a random matrix.
Our results suggest that the poor performance of the Markowitz rule is partly driven by the presence of eigenvalues close to zero in the covariance matrix. Using our approach, the corresponding portfolio performance is better than that using the 1=n rule. Additionally, the obtained optimal portfolio weights are more stable.
A further comparison with six other methods shows that our approach delivers better out-of-sample performance, while decreasing the volatilities of the optimal weights. Our methodology also works for small sample sizes.
The limitations of this study should be noted. We only used a monthly dataset of 30 equally weighted industry portfolios from Kenneth French's database. The performance of this approach using other datasets (e.g. daily and individual asset data) was not explored. We also assumed a monthly portfolio rebalancing period with an investment horizon of one month; therefore, different rebalancing and investment periods should also be explored in the future. However, applying optimal portfolio selection with different schemes always faces the problem of correctly estimating the covariance matrices. The effects of considering transaction costs on the out-of-sample performance of this approach should also be explored. This aspect is of particular importance for emerging economies.
These shrinkage methods decrease the correlations between asset returns in the covariance matrix. This improves predictability and decreases the volatility of the optimal portfolio weights, and implies (in part) that the empirical correlations are overestimated; that is, the data contain spurious correlations that alter the estimation process. The causes of this phenomenon remain to be explored. Although these results are specific to the dataset that we used in this study, they could be considered an opportunity to improve the work of portfolio managers, especially those working with small datasets (e.g. oriented towards emerging markets).

Disclosure statement
No potential conflict of interest was reported by the authors.

Notes
1. In this study, we consider that all individuals are risk averse. However, Markowitz's model could incorporate greater flexibility by allowing utility functions that accommodate risk-loving individuals. See Georgalos et al. (2021) for a detailed discussion.
2. Here, e is an n-dimensional column vector whose elements consist of ones.
3. See Bai et al. (2009).
4. Taken from Kenneth French's website.
5. We use the eigenvalues of the correlation matrices as the standardisation criterion because the sum of the eigenvalues of the correlation matrix, with a rank equal to n, is equal to n.
6. See, for example, DeMiguel et al. (2009, 2013).
7. For details, see Papp et al. (2005).