An unbiased estimator with prior information

Abstract The ordinary least squares (OLS) estimator breaks down in the presence of multicollinearity: it remains unbiased but its variance becomes inflated. In this study, we propose an unbiased modified ridge-type estimator as an alternative to the OLS estimator and to the biased estimators commonly used to handle multicollinearity in linear regression models. The properties of this new estimator are derived; it is unbiased and possesses minimum variance. A real-life application to the higher heating value of poultry waste obtained from proximate analysis, together with a simulation study, supports the theoretical findings.


Introduction
Consider the linear regression model
$$y = X\beta + \varepsilon, \qquad \varepsilon \sim N(0, \sigma^2 I), \tag{1}$$
where $y$ is an $n \times 1$ vector of the dependent variable, $X$ is a known $n \times p$ full-rank matrix of explanatory variables, $\beta$ is a $p \times 1$ vector of regression coefficients, $\varepsilon$ is an $n \times 1$ vector of errors and $I$ is an $n \times n$ identity matrix. The ordinary least squares (OLS) estimator of $\beta$ in model (1) is defined as
$$\hat{\beta}_{OLS} = S^{-1}X'y, \tag{2}$$
where $S = X'X$. This estimator is the most widely used method of estimating the parameters of a linear regression model, and it performs best when certain assumptions are satisfied. One of these is that the explanatory variables are not linearly related. In practice, however, strong or near-perfect linear relationships often exist among the explanatory variables; this situation is called multicollinearity. The OLS estimator breaks down in the presence of multicollinearity: it remains unbiased but possesses an inflated variance (Ayinde, Lukman, Samuel, & Attah, 2018). Different approaches are available in the literature to handle this problem. These include Hoerl and Kennard (1970), Swindel (1976), Farebrother (1976), Liu (1993), Sakallioglu and Akdeniz (2003), Ozkale and Kaciranlar (2007), Yang and Chang (2010), Li and Yang (2012), Wu and Yang (2013), Wu (2014) and, more recently, Arumairajan and Wijekoon (2017), Ayinde et al. (2018) and Lukman, Ayinde, Binuomote, and Onate (2019). The estimators proposed by these authors are biased. Crouse, Jin, and Hanumara (1995) and Sakallioglu and Akdeniz (2003) proposed unbiased versions of the ridge estimator and the Liu estimator, respectively, by incorporating prior information. These methods handle multicollinearity effectively while eliminating bias.
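To illustrate the problem the paper addresses, the following minimal sketch (ours, not the authors'; the sample size, correlation levels and variable names are illustrative assumptions) simulates a design with two nearly collinear regressors and shows how the sampling variance of the OLS coefficients inflates as the correlation grows, even though the estimator stays unbiased.

```python
import numpy as np

rng = np.random.default_rng(0)
n, reps = 50, 2000
beta = np.array([1.0, 1.0])

def ols_coef_variance(rho):
    """Empirical variance of the OLS slopes when corr(x1, x2) = rho."""
    estimates = []
    for _ in range(reps):
        z1, z2 = rng.standard_normal(n), rng.standard_normal(n)
        x1 = z1
        x2 = rho * z1 + np.sqrt(1 - rho**2) * z2   # corr(x1, x2) = rho
        X = np.column_stack([x1, x2])
        y = X @ beta + rng.standard_normal(n)
        estimates.append(np.linalg.solve(X.T @ X, X.T @ y))  # OLS: (X'X)^{-1} X'y
    return np.var(estimates, axis=0)

for rho in (0.0, 0.9, 0.99):
    print(rho, ols_coef_variance(rho))   # variances inflate as rho -> 1
```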
In this article, we propose an unbiased modified ridge-type (UMRT) estimator with prior information and derive its properties. Furthermore, we discuss the performance of the proposed estimator relative to the OLS estimator, the ridge estimator (RE) and the modified ridge-type estimator (MRT) using the mean square error matrix (MSEM) criterion.
The remainder of this article is organized as follows. In Section 2, we propose the unbiased modified ridge-type estimator, and in Section 3 we compare its performance with some existing estimators using the mean square error matrix (MSEM) criterion. We estimate the biasing parameters k and d in Section 4. We conduct a simulation study and a real-life data application in Section 5. Finally, we provide some concluding remarks in Section 6.
2. Unbiased modified ridge-type estimator with prior information
Hoerl and Kennard (1970) defined the ridge estimator of $\beta$ as
$$\hat{\beta}_{RE}(k) = (S + kI)^{-1}X'y, \qquad k > 0, \tag{3}$$
where $k$ is the biasing parameter. Swindel (1976) defined the ridge estimator with prior information $b$ as
$$\hat{\beta}_{MRE}(k, b) = (S + kI)^{-1}(X'y + kb). \tag{4}$$
Crouse et al. (1995) introduced the unbiased ridge estimator based on the ridge estimator and prior information $J$. This is defined as
$$\hat{\beta}_{UMRE} = (S + kI)^{-1}(X'y + kJ), \tag{5}$$
where $J$ and $\hat{\beta}_{OLS}$ are uncorrelated and $J \sim N(\beta, V)$ such that $V = \frac{\sigma^2}{k}I_p$ and $I_p$ is the $p \times p$ identity matrix. $J$ is estimated by $J = \frac{1}{p}\sum_{i=1}^{p}\hat{\beta}_i$. Lukman et al. (2019) proposed the modified ridge-type (MRT) estimator, which is defined as
$$\hat{\beta}_{MRT}(k, d) = (S + k(1+d)I)^{-1}X'y, \qquad k > 0, \; 0 < d < 1. \tag{6}$$
Following Crouse et al. (1995), consider the convex combination
$$\hat{\beta}(C, J) = C\hat{\beta}_{OLS} + (I - C)J,$$
where $C$ is a $p \times p$ matrix and $I$ is a $p \times p$ identity matrix. Since $E(\hat{\beta}(C, J)) = C\beta + (I - C)\beta = \beta$, the convex estimator $\hat{\beta}(C, J)$ is an unbiased estimator of $\beta$, and its mean square error attains a minimum for the optimal value of $C$. Therefore, the new estimator in this study is defined as
$$\hat{\beta}_{UMRT}(F_{kd}, J) = F_{kd}\hat{\beta}_{OLS} + (I - F_{kd})J = (S + k(1+d)I)^{-1}(X'y + k(1+d)J),$$
where $F_{kd} = (S + k(1+d)I)^{-1}S$ and $J \sim N\!\left(\beta, \frac{\sigma^2}{k(1+d)}I_p\right)$ is uncorrelated with $\hat{\beta}_{OLS}$. It is easy to show that $\hat{\beta}_{UMRT}(F_{kd}, J)$ is an unbiased estimator of $\beta$. The expectation vector, bias vector, dispersion matrix and mean square error matrix of the proposed estimator are
$$E(\hat{\beta}_{UMRT}(F_{kd}, J)) = \beta, \qquad \mathrm{Bias}(\hat{\beta}_{UMRT}(F_{kd}, J)) = 0,$$
$$D(\hat{\beta}_{UMRT}(F_{kd}, J)) = F_{kd}D(\hat{\beta}_{OLS})F_{kd}' + (I - F_{kd})V(I - F_{kd})' = \sigma^2(S + k(1+d)I)^{-1},$$
$$MSEM(\hat{\beta}_{UMRT}(F_{kd}, J)) = \sigma^2(S + k(1+d)I)^{-1}, \tag{14}$$
where the dispersion simplifies because $I - F_{kd} = k(1+d)(S + k(1+d)I)^{-1}$. Consequently, the estimator $\hat{\beta}_{UMRT}(F_{kd}, J)$ is an unbiased estimator of $\beta$.
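As a computational companion to the definitions above, here is a minimal sketch (ours; the function name is an assumption, and the choice of J as the mean of the OLS coefficients follows Eq. (5)) of the OLS, ridge, MRT and UMRT estimators in their closed forms.

```python
import numpy as np

def estimators(X, y, k, d):
    """Return OLS, ridge, MRT and UMRT estimates for given biasing parameters k, d."""
    p = X.shape[1]
    S = X.T @ X
    Xty = X.T @ y
    I = np.eye(p)
    b_ols = np.linalg.solve(S, Xty)
    J = np.full(p, b_ols.mean())             # prior information J, Eq. (5)
    b_re = np.linalg.solve(S + k * I, Xty)   # ridge estimator, Eq. (3)
    b_mrt = np.linalg.solve(S + k * (1 + d) * I, Xty)                     # MRT, Eq. (6)
    b_umrt = np.linalg.solve(S + k * (1 + d) * I, Xty + k * (1 + d) * J)  # UMRT
    return b_ols, b_re, b_mrt, b_umrt
```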
Suppose there exists an orthogonal matrix $Q$ such that $Q'X'XQ = \Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$, where $\lambda_i$ is the $i$th eigenvalue of $X'X$; $\Lambda$ and $Q$ are the matrices of eigenvalues and eigenvectors of $X'X$, respectively. Model (1) can be written in canonical form as
$$y = Z\alpha + \varepsilon, \tag{15}$$
where $Z = XQ$, $\alpha = Q'\beta$ and $Z'Z = \Lambda$. For model (15), we get the following representations:
$$\hat{\alpha}_{OLS} = \Lambda^{-1}Z'y, \qquad MSEM(\hat{\alpha}_{OLS}) = \sigma^2\Lambda^{-1}, \tag{21}$$
$$\hat{\alpha}_{UMRT}(F_{kd}, J) = (\Lambda + k(1+d)I)^{-1}(Z'y + k(1+d)J), \qquad MSEM(\hat{\alpha}_{UMRT}(F_{kd}, J)) = \sigma^2(\Lambda + k(1+d)I)^{-1}.$$
Lemma 2.1. Let $M$ be an $n \times n$ positive definite matrix, that is $M > 0$, and $a$ be some vector. Then $M - aa' \geq 0$ if and only if $a'M^{-1}a \leq 1$ (Farebrother, 1976).
Lemma 2.2. Let $\hat{\beta}_1$ and $\hat{\beta}_2$ be two estimators of $\beta$ with bias vectors $b_1$ and $b_2$, and suppose that $D = D(\hat{\beta}_1) - D(\hat{\beta}_2) > 0$. Then $MSEM(\hat{\beta}_1) - MSEM(\hat{\beta}_2) \geq 0$ if and only if $b_2'(D + b_1b_1')^{-1}b_2 \leq 1$ (Trenkler & Toutenburg, 1990).
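The canonical transformation is easy to verify numerically. The following sketch (ours, with a random design standing in for X and illustrative k, d) checks that $Z'Z$ is diagonal and that the canonical UMRT estimate maps back to the original coordinates via $\hat{\beta} = Q\hat{\alpha}$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, k, d = 40, 3, 0.5, 0.3
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

lam, Q = np.linalg.eigh(X.T @ X)            # eigenvalues and eigenvectors of X'X
Z = X @ Q                                   # canonical regressors, Z'Z = diag(lam)
print(np.allclose(Z.T @ Z, np.diag(lam)))   # True

t = k * (1 + d)
J = np.full(p, np.linalg.solve(X.T @ X, X.T @ y).mean())
a_umrt = (Z.T @ y + t * (Q.T @ J)) / (lam + t)   # canonical UMRT: diagonal solve
b_umrt = np.linalg.solve(X.T @ X + t * np.eye(p), X.T @ y + t * J)
print(np.allclose(Q @ a_umrt, b_umrt))           # True: back-transform recovers beta-hat
```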

Comparison of the OLS estimator and the unbiased modified ridge-type estimator
Theorem 3.1. The unbiased modified ridge-type estimator $\hat{\alpha}_{UMRT}(F_{kd}, J)$ is superior to the OLS estimator in the mean square error sense for $k > 0$ and $0 < d < 1$.
Proof: The MSEM difference between Eqs. (21) and (14) is
$$\Delta_1 = MSEM(\hat{\alpha}_{OLS}) - MSEM(\hat{\alpha}_{UMRT}(F_{kd}, J)) = \sigma^2\left[\Lambda^{-1} - (\Lambda + k(1+d)I)^{-1}\right] = \sigma^2\,\mathrm{diag}\left\{\frac{k(1+d)}{\lambda_i(\lambda_i + k(1+d))}\right\}_{i=1}^{p},$$
which is positive definite since $(\lambda_i + k(1+d)) - \lambda_i = k(1+d) > 0$ for $k > 0$ and $0 < d < 1$. By Lemma 2.2, the proof is completed.
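A quick numerical illustration of Theorem 3.1 (a sketch under assumed values of the eigenvalues, k, d and the error variance; none of these numbers come from the paper):

```python
import numpy as np

# Assumed illustrative values; the small eigenvalue mimics multicollinearity.
lam = np.array([10.0, 1.0, 0.01])
k, d, sigma2 = 0.5, 0.3, 1.0

mse_ols = sigma2 / lam                    # diagonal of MSEM(OLS), Eq. (21)
mse_umrt = sigma2 / (lam + k * (1 + d))   # diagonal of MSEM(UMRT), Eq. (14)
print(mse_ols - mse_umrt)                 # every entry positive, as Theorem 3.1 asserts
```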

Comparison of the ridge estimator and the unbiased modified ridge-type estimator
From the representation $\hat{\alpha}_{RE}(k) = (\Lambda + kI)^{-1}Z'y$, the mean square error matrix is
$$MSEM(\hat{\alpha}_{RE}(k)) = \sigma^2(\Lambda + kI)^{-1}\Lambda(\Lambda + kI)^{-1} + k^2(\Lambda + kI)^{-1}\alpha\alpha'(\Lambda + kI)^{-1}, \tag{23}$$
where the first term is the dispersion matrix and the second term is the squared bias, with bias vector $b_1 = -k(\Lambda + kI)^{-1}\alpha$.
Theorem 3.2. Let $k > 0$ and $0 < d < 1$. The estimator $\hat{\alpha}_{UMRT}(F_{kd}, J)$ is superior to the estimator $\hat{\alpha}_{RE}(k)$ in the MSEM sense if and only if the difference between Eqs. (23) and (14),
$$\Delta_2 = \sigma^2\left[(\Lambda + kI)^{-1}\Lambda(\Lambda + kI)^{-1} - (\Lambda + k(1+d)I)^{-1}\right] + k^2(\Lambda + kI)^{-1}\alpha\alpha'(\Lambda + kI)^{-1},$$
is non-negative definite.
Proof: The difference between Eqs. (23) and (14) is $\Delta_2$ above. We observe that $(\Lambda + kI)^{-1}\Lambda(\Lambda + kI)^{-1} - (\Lambda + k(1+d)I)^{-1}$ will be positive definite if and only if $\lambda_i(\lambda_i + k(1+d)) - (\lambda_i + k)^2 > 0$ for all $i$, where $k > 0$ and $0 < d < 1$.
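The condition in Theorem 3.2 is easy to probe numerically. The following sketch (ours; the eigenvalues, the normalized coefficient vector alpha, and k, d are all assumed illustrative values) evaluates the eigenvalues of the difference matrix to decide superiority for a given setting.

```python
import numpy as np

lam = np.array([10.0, 1.0, 0.01])
alpha = np.array([0.6, 0.6, 0.52])   # assumed coefficients with alpha'alpha close to 1
k, d, sigma2 = 0.5, 0.3, 1.0

L = np.diag(lam)
A = np.linalg.inv(L + k * np.eye(3))
disp_diff = sigma2 * (A @ L @ A - np.linalg.inv(L + k * (1 + d) * np.eye(3)))
bias_term = k**2 * A @ np.outer(alpha, alpha) @ A
delta2 = disp_diff + bias_term
print(np.linalg.eigvalsh(delta2))   # nonnegative eigenvalues <=> UMRT superior to RE here
```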

Comparison of the modified ridge-type estimator and the unbiased modified ridge-type estimator
From the representation $\hat{\alpha}_{MRT}(k, d) = (\Lambda + k(1+d)I)^{-1}Z'y$, the dispersion matrix and MSEM are defined as follows:
$$D(\hat{\alpha}_{MRT}(k, d)) = \sigma^2(\Lambda + k(1+d)I)^{-1}\Lambda(\Lambda + k(1+d)I)^{-1},$$
$$MSEM(\hat{\alpha}_{MRT}(k, d)) = \sigma^2(\Lambda + k(1+d)I)^{-1}\Lambda(\Lambda + k(1+d)I)^{-1} + k^2(1+d)^2(\Lambda + k(1+d)I)^{-1}\alpha\alpha'(\Lambda + k(1+d)I)^{-1}.$$
Theorem 3.3. The unbiased modified ridge-type estimator always dominates the modified ridge-type estimator in the MSEM sense for $k > 0$ and $0 < d < 1$.

Estimation of the biasing parameters k and d
In this section, we discuss the estimation of the biasing parameters k and d.

The estimation of the parameter d
In the definition of the new estimator, $J$ and $\hat{\alpha}_{OLS}$ are uncorrelated. Therefore,
$$(\hat{\alpha}_{OLS} - J) \sim N\!\left(0, \frac{\sigma^2}{k(1+d)}\left[k(1+d)\Lambda^{-1} + I\right]\right),$$
and
$$E\left[(\hat{\alpha}_{OLS} - J)'(\hat{\alpha}_{OLS} - J)\right] = \sigma^2\,\mathrm{tr}(\Lambda^{-1}) + \frac{p\sigma^2}{k(1+d)}. \tag{30}$$
From (30), if $\sigma^2$ is known for a fixed $k$, we can get an unbiased estimator of $d$ as follows:
$$\hat{d} = \frac{p\sigma^2}{k\left[(\hat{\alpha}_{OLS} - J)'(\hat{\alpha}_{OLS} - J) - \sigma^2\,\mathrm{tr}(\Lambda^{-1})\right]} - 1.$$
When $\sigma^2$ is unknown, $s^2$ is used as an estimate of $\sigma^2$. Consequently,
$$\hat{d} = \frac{p s^2}{k\left[(\hat{\alpha}_{OLS} - J)'(\hat{\alpha}_{OLS} - J) - s^2\,\mathrm{tr}(\Lambda^{-1})\right]} - 1, \tag{33}$$
where $\mathrm{tr}(\Lambda^{-1}) = \sum_{i=1}^{p} \frac{1}{\lambda_i}$ and $\lambda_i$ is the $i$th eigenvalue of $X'X$. The estimator of $d$ in Eq. (33) can return a negative value. To eliminate the negative value, Wu (2014) suggests replacing $d$ with one (1) when its estimate is negative. In this study, when the estimate from Eq. (33) is negative, we instead adopt the estimator of $d$ suggested by Ozkale and Kaciranlar (2007):
$$\hat{d} = \min_i\left(\frac{\hat{\alpha}_i^2}{\frac{\hat{\sigma}^2}{\lambda_i} + \hat{\alpha}_i^2}\right).$$
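A direct transcription of Eqs. (30)-(33) into code (a sketch; the function name is ours, s2 stands for the usual residual-based estimate of the error variance, and the fallback mirrors the Ozkale-Kaciranlar rule quoted above):

```python
import numpy as np

def estimate_d(alpha_ols, J, lam, s2, k, p):
    """Estimate d from Eq. (33); fall back to the Ozkale-Kaciranlar rule if negative."""
    q = (alpha_ols - J) @ (alpha_ols - J)
    d = p * s2 / (k * (q - s2 * np.sum(1.0 / lam))) - 1.0
    if d < 0:
        d = np.min(alpha_ols**2 / (s2 / lam + alpha_ols**2))  # Ozkale-Kaciranlar (2007)
    return d
```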

Estimating the biasing parameter k
From Eq. (30), if $\sigma^2$ is known and $d$ is assumed to be fixed, an unbiased estimate of $k$ is defined as follows:
$$\hat{k} = \frac{p\hat{\sigma}^2}{(1+d)\left[(\hat{\alpha}_{OLS} - J)'(\hat{\alpha}_{OLS} - J) - \hat{\sigma}^2\,\mathrm{tr}(\Lambda^{-1})\right]}.$$
When this estimate is negative, a non-negative estimator such as that of Hoerl and Kennard (1970), $\hat{k} = \hat{\sigma}^2 / \max_i(\hat{\alpha}_i^2)$, is used instead.
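The corresponding code (a sketch; names are ours, and the non-negative fallback is the Hoerl-Kennard-type estimator mentioned above):

```python
import numpy as np

def estimate_k(alpha_ols, J, lam, s2, d, p):
    """Estimate k from Eq. (30) with d fixed; non-negative fallback if needed."""
    q = (alpha_ols - J) @ (alpha_ols - J)
    k = p * s2 / ((1 + d) * (q - s2 * np.sum(1.0 / lam)))
    if k < 0:
        k = s2 / np.max(alpha_ols**2)   # Hoerl-Kennard (1970) type fallback
    return k
```

Since each estimator takes the other parameter as fixed, in practice one can alternate the two updates (estimate k with d fixed, then d with k fixed) until the values stabilize.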

Application to poultry waste data
The theoretical results are illustrated with real-life data analyzed in the study of Qian, Lee, Soto, and Chen (2018). A total of 48 samples of poultry waste were collected from published open literature to form a database for the derivation, evaluation and validation of proximate-based higher heating value (HHV) models. Six samples (#43, 44, 45, 46, 47 and 48) were deleted due to incomplete information. The linear regression model is
$$HHV = \beta_0 + \beta_1 FC + \beta_2 VM + \beta_3 A + \varepsilon,$$
where HHV denotes the higher heating value, FC denotes fixed carbon, VM denotes volatile matter, A denotes ash and $\varepsilon$ is the random error term, which is expected to be normally distributed. The relationships among the variables were examined through the correlation matrix (Table 1).
From Table 1, there is a strong positive relationship between the higher heating value and fixed carbon, while negative relationships exist between HHV and VM and between HHV and ash. To check the distribution of the error term, we used the Jarque-Bera (JB) test. The test statistic and the corresponding p value are JB = 0.6409 and p = 0.7258, respectively. Since this p value is larger than conventional significance levels, we conclude that the error term follows the normal distribution. We then diagnosed the model for the possible presence of multicollinearity. The variance inflation factor (VIF) values are VIF_FC = 997.819, VIF_VM = 2163.504 and VIF_ASH = 1533.782. The literature indicates that a model suffers from multicollinearity when VIF_i > 10. Since all the VIF values in the above model are higher than 10, we conclude that the model suffers from severe multicollinearity. Alternatively, we can use the condition number (CN) to examine whether the explanatory variables are related, where CN = maximum(eigenvalue)/minimum(eigenvalue). If CN is between 100 and 1000 there is moderate to strong multicollinearity, and if it exceeds 1000 there is severe multicollinearity (Arumairajan & Wijekoon, 2017; Gujarati, 1995). The condition number is 581,291.39, which indicates the presence of severe multicollinearity. Therefore, it is appropriate to predict the higher heating value with an alternative unbiased estimator possessing minimum variance.
We adopt K-fold cross-validation to validate the performance of the estimators. The data are partitioned into K equal-size folds (K = 10 in this study). One fold is treated as the test set, and the remaining K − 1 = 9 folds are used as the training set. The MSE is computed on the observations in the held-out fold. The process is repeated ten times, holding out a different fold each time. The validation (test) error is obtained by averaging the K estimates of the test error, giving an estimated error rate for new observations. The estimator with the lowest validation MSE is the best. The average MSE of the validation error in this study is defined as
$$CV_{(K)} = \frac{1}{K}\sum_{k=1}^{K} MSE_k, \qquad MSE_k = \frac{1}{n_k}\sum_{i \in \text{fold } k}\left(y_i - \hat{y}_i\right)^2,$$
where $n_k$ is the number of observations in fold $k$ and $\hat{y}_i$ is the fitted value for observation $i$, obtained from the model trained with fold $k$ removed. The results are presented in Table 2. They show that the unbiased modified ridge-type estimator (UMRT) produced the same coefficient estimates as the OLS estimator while circumventing the problem of large variance that is peculiar to the OLS estimator. The proposed estimator has the smallest mean square error and prediction error.
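A sketch of the validation scheme just described (ours; it assumes a design matrix X and response y are already loaded, and reuses the closed form of the UMRT estimator from Section 2):

```python
import numpy as np

def umrt_fit(X, y, k, d):
    """UMRT estimate: (S + k(1+d)I)^{-1} (X'y + k(1+d)J)."""
    p = X.shape[1]
    S, Xty = X.T @ X, X.T @ y
    b_ols = np.linalg.solve(S, Xty)
    J = np.full(p, b_ols.mean())          # prior information, Eq. (5)
    t = k * (1 + d)
    return np.linalg.solve(S + t * np.eye(p), Xty + t * J)

def kfold_cv_mse(X, y, k, d, K=10, seed=0):
    """Average held-out MSE over K folds, as in the validation criterion above."""
    n = len(y)
    idx = np.random.default_rng(seed).permutation(n)
    folds = np.array_split(idx, K)
    mses = []
    for fold in folds:
        train = np.setdiff1d(idx, fold)
        b = umrt_fit(X[train], y[train], k, d)
        resid = y[fold] - X[fold] @ b
        mses.append(np.mean(resid**2))
    return np.mean(mses)
```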

Monte-Carlo simulation
We carried out a Monte-Carlo simulation to investigate the performance of the estimators. The explanatory variables were generated in line with the studies of McDonald and Galarneau (1975), Liu (1993) and Lukman and Ayinde (2017):
$$x_{ij} = (1 - c^2)^{1/2} z_{ij} + c\, z_{i,p+1}, \qquad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, p,$$
where the $z_{ij}$ are independent standard normal variates with mean zero and unit variance, $c^2$ is the correlation between any two explanatory variables and $p$ is the number of explanatory variables. The values of $c$ were taken as 0.85, 0.95 and 0.99, and the number of explanatory variables ($p$) was taken to be three and six. The response variable is defined as
$$y_i = \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + e_i, \qquad e_i \sim N(0, \sigma^2),$$
where the values of $\beta$ were chosen such that $\beta'\beta = 1$ (Newhouse & Oman, 1971). The sample sizes used are 30 and 50, with two different values of $\sigma$: 1 and 5. The experiment is repeated 1000 times, and the estimated MSE is calculated as
$$MSE(\hat{\beta}) = \frac{1}{1000}\sum_{j=1}^{1000}\sum_{i=1}^{p}\left(\hat{\beta}_{ij} - \beta_i\right)^2,$$
where $\hat{\beta}_{ij}$ denotes the estimate of the $i$th parameter in the $j$th replication and $\beta_i$ is the true parameter value. The estimated MSEs of the estimators for different values of n, k, d, $\sigma$ and c are shown in Tables 3-18.
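The design above translates directly into code. This sketch (ours; the estimator helpers follow Section 2, and fixed illustrative values of k and d are used rather than the data-driven estimates of Section 4) reproduces the structure of one simulation cell:

```python
import numpy as np

rng = np.random.default_rng(1)

def simulate_cell(n=30, p=3, c=0.99, sigma=1.0, reps=1000, k=0.5, d=0.5):
    """Estimated MSE of OLS and UMRT for one (n, p, c, sigma) setting."""
    beta = np.ones(p) / np.sqrt(p)   # beta'beta = 1 (Newhouse & Oman, 1971)
    sse = np.zeros(2)                # accumulators for OLS, UMRT
    for _ in range(reps):
        z = rng.standard_normal((n, p + 1))
        X = np.sqrt(1 - c**2) * z[:, :p] + c * z[:, [p]]   # McDonald-Galarneau design
        y = X @ beta + sigma * rng.standard_normal(n)
        S, Xty = X.T @ X, X.T @ y
        b_ols = np.linalg.solve(S, Xty)
        J = np.full(p, b_ols.mean())
        t = k * (1 + d)
        b_umrt = np.linalg.solve(S + t * np.eye(p), Xty + t * J)
        sse += [np.sum((b_ols - beta)**2), np.sum((b_umrt - beta)**2)]
    return sse / reps

print(simulate_cell())   # [MSE(OLS), MSE(UMRT)]
```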
The following observations were made:
1. The unbiased estimator is superior to OLS in all cases; the OLS estimator performs worst in the presence of multicollinearity.
2. The unbiased estimator also consistently outperforms the ridge and modified ridge-type estimators, even though these two estimators themselves dominate OLS in all cases.
3. As the sample size increases, the MSE decreases, even when the correlation between the explanatory variables increases.
4. For a constant sample size, increasing the value of σ increases the mean square error of each estimator.
5. As the number of explanatory variables increases, the mean square error of every estimator increases for a given level of multicollinearity and σ.
Generally, we confirm the superiority of the unbiased estimator over the other estimators at all the levels of multicollinearity and error variance considered. The modified ridge-type estimator, in turn, dominates the ridge estimator and the OLS estimator.

Conclusion
The OLS estimator breaks down in the presence of multicollinearity: it remains unbiased but possesses an inflated variance. In this study, we proposed an alternative estimator, the unbiased modified ridge-type estimator with prior information. This estimator was proved to be unbiased and to possess minimum variance theoretically. A simulation study and a real-life application were also conducted to establish the superiority of this estimator over the existing estimators in terms of the MSEM criterion and the cross-validation prediction error. The new estimator performs better than the OLS and ridge estimators for all degrees of multicollinearity considered, and it circumvents the problem of inflated variance that plagues the OLS estimator. We therefore recommend it as a replacement for the OLS estimator and the biased estimators when multicollinearity is present in a linear model.

Table 1. Correlation matrix of the variables.