The Traditional Ordinary Least Squares Estimator under Collinearity

In multiple regression analysis, it is usually difficult to interpret the estimates of the individual coefficients if the explanatory variables are highly inter-correlated. This is often referred to as the multicollinearity problem. There are several ways to address it, one of which is ridge regression. Two approaches to estimating the shrinkage ridge parameter k are proposed and compared with other ridge-type estimators. To compare the performance of the proposed methods with the traditional ordinary least squares (OLS) estimator and with the other approaches for estimating the parameters of the ridge regression model, we calculate the mean squared error (MSE) by simulation. The results of the simulation study show that the suggested ridge regression estimators outperform both the OLS estimator and the other ridge-type estimators in all of the situations evaluated in this paper.
Journal of Biometrics & Biostatistics


Introduction
Consider the standard multiple linear regression model
$$y = X\beta + e, \tag{1}$$
where $y$ is an $(n \times 1)$ vector of responses, $X$ is an $(n \times p)$ matrix of the explanatory variables of full rank $p$, $\beta$ is a $(p \times 1)$ vector of unknown regression coefficients, and finally, $e \sim N(0, \sigma^2 I)$ is an $(n \times 1)$ vector of error terms.
The OLS estimator is often used to estimate the regression coefficients $\beta$ as
$$\hat{\beta} = (X'X)^{-1}X'y. \tag{2}$$
The standard assumption in linear regression analysis is that the explanatory variables are linearly independent. When this assumption is violated, multicollinearity is present in the data and inflates the variance of the OLS estimator of the regression coefficients. Obtaining estimators for multicollinear data is an important problem in the literature. In fact, when multicollinearity is present in data subject to measurement error, an important issue is how to obtain consistent estimators of the regression coefficients. One of the most popular estimators for combating multicollinearity is the ridge estimator, originally proposed by Hoerl and Kennard [1]. They suggested adding a small positive number k (k > 0) to the diagonal elements of the $X'X$ matrix from the multiple regression; the resulting estimator is
$$\hat{\beta}(k) = (X'X + kI)^{-1}X'y, \tag{3}$$
which is known as the ridge regression estimator. For a suitable positive value of k, this estimator provides a smaller MSE than the OLS estimator, i.e., $MSE(\hat{\beta}(k)) < MSE(\hat{\beta})$.
Most of the later efforts in this area have concentrated on estimating the value of the ridge parameter k. Many different techniques for estimating k have been proposed by different researchers, for example, Hoerl and Kennard [1], Hoerl et al. [2], Dempster et al. [3], Gibbons [4], Kibria [5], Khalaf et al. [6], Alkhamisi et al. [7], Khalaf [8] and Khalaf [9].
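As a rough illustration of the two estimators just described, the following sketch computes the OLS and ridge estimates with NumPy. The function names, the toy data and the value k = 0.5 are assumptions made for this example only; they do not come from the paper.

```python
import numpy as np

def ols_estimator(X, y):
    """OLS estimate (X'X)^{-1} X'y, computed with a linear solve."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge_estimator(X, y, k):
    """Ridge estimate (X'X + kI)^{-1} X'y for a biasing parameter k >= 0."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

# Illustrative data only (an assumption): two nearly collinear columns.
rng = np.random.default_rng(0)
z = rng.standard_normal(50)
X = np.column_stack([z, z + 0.01 * rng.standard_normal(50)])
y = X @ np.array([1.0, 1.0]) + rng.standard_normal(50)

beta_ols = ols_estimator(X, y)
beta_ridge = ridge_estimator(X, y, k=0.5)  # k = 0.5 is an arbitrary choice
```

Setting k = 0 reproduces the OLS estimate, and any k > 0 shrinks the length of the coefficient vector; whether the MSE actually improves depends on how k is chosen.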
The plan of the paper is as follows: in Section 2, we present different methods for estimating the ridge parameter, together with our proposed estimators. A simulation study is described in Section 3. The simulation results are discussed in Section 4. In Section 5 we give a brief summary and conclusions.

The Proposed Ridge Regression Parameter
In the case of ordinary ridge regression, many researchers have suggested different ways of estimating the ridge parameter. Hoerl and Kennard [1] showed, by letting $\beta_{\max}$ denote the maximum of the $\beta_i$, that choosing
$$\hat{k}_{HK} = \frac{\hat{\sigma}^2}{\hat{\beta}_{\max}^2} \tag{4}$$
implies that $MSE(\hat{\beta}(k)) < MSE(\hat{\beta})$. The ridge estimator using $\hat{k}_{HK}$ will be denoted by HK.
Hoerl et al. [2] suggested choosing k small enough that the MSE of the ridge estimator is less than the MSE of the OLS estimator. They showed, through simulation, that the ridge estimator with biasing parameter
$$\hat{k}_{HKB} = \frac{p\,\hat{\sigma}^2}{\hat{\beta}'\hat{\beta}} \tag{5}$$
has a probability greater than 0.50 of producing an estimator with a smaller MSE than the OLS estimator, where $\hat{\sigma}^2$ is the usual estimator of $\sigma^2$. The ridge estimator using Eq. (5) will be denoted by HKB.
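The two ridge parameters above can be computed from an OLS fit along the following lines. This is a minimal sketch under stated assumptions: the residual variance uses n - p degrees of freedom (no intercept is fitted), "the maximum of the $\beta_i$" is read as the coefficient of largest magnitude, and the function name and toy data are hypothetical.

```python
import numpy as np

def ridge_parameters(X, y):
    """Return (k_HK, k_HKB) computed from the OLS fit of y on X."""
    n, p = X.shape
    beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ beta_ols
    # Assumption: residual variance with n - p degrees of freedom.
    sigma2_hat = resid @ resid / (n - p)
    # HK: sigma^2 divided by the largest squared coefficient.
    k_hk = sigma2_hat / np.max(beta_ols ** 2)
    # HKB: p * sigma^2 divided by the squared length of the OLS vector.
    k_hkb = p * sigma2_hat / (beta_ols @ beta_ols)
    return k_hk, k_hkb

# Illustrative data only (an assumption, not the paper's design).
rng = np.random.default_rng(1)
X = rng.standard_normal((30, 4))
y = X @ np.array([0.5, -0.5, 0.5, -0.5]) + rng.standard_normal(30)
k_hk, k_hkb = ridge_parameters(X, y)
```

Because the squared length of the OLS vector is at most p times its largest squared element, the HKB value is never smaller than the HK value computed from the same fit.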
The purpose of this study is to modify the approaches to estimating k of Hoerl and Kennard [1] and Hoerl et al. [2], given in equations (4) and (5), to suggest two estimators of k, denoted $\hat{k}_1$ and $\hat{k}_2$; in their definitions, p denotes the number of parameters (excluding the intercept). The ridge estimators using $\hat{k}_1$ and $\hat{k}_2$ will be denoted by KI 1 and KI 2, respectively.
The performance of these proposed estimators will then be compared with the traditional OLS estimator and with the HK and HKB estimators in terms of MSE. This will mainly be done by means of simulations in which the sample size n, the number of explanatory variables p and the strength of the correlations between the explanatory variables are varied.

The Simulation Study
This section briefly describes how the data are generated and discusses the factors varied in the simulation study. The criteria for judging the performance of the different estimation methods are also presented.

The design of the experiment
Following McDonald et al. [10], the explanatory variables are generated by
$$x_{ij} = (1 - \rho^2)^{1/2} z_{ij} + \rho z_{i,p+1}, \quad i = 1, 2, \ldots, n, \quad j = 1, 2, \ldots, p,$$
where $z_{ij}$ are independent standard normal pseudo-random numbers, and $\rho$ is specified so that the correlation between any two explanatory variables is given by $\rho^2$.
The explanatory variables are then standardized so that $X'X$ is in correlation form.
Observations on the dependent variable are determined by
$$y_i = \beta_0 + \beta_1 x_{i1} + \cdots + \beta_p x_{ip} + e_i, \quad i = 1, 2, \ldots, n,$$
where the $e_i$ are independent $N(0, \sigma^2)$ error terms and $\beta_0$ is taken to be identically zero. Four values of $\sigma^2$ are considered: 0.8, 0.9, 0.95 and 0.99. The dependent variable is then standardized so that $X'y$ is the vector of correlations of the dependent variable with each explanatory variable. In this experiment, we choose p = 7 and 10 for n = 15, 25, 80 and 200. The experiment is then replicated 5000 times by generating new error terms.

Judging the performance of the estimators
To investigate the performance of the different proposed ridge regression estimators and the OLS method, we calculate the MSE using the following equation:
$$MSE = \frac{1}{R} \sum_{i=1}^{R} (\hat{\beta}_i - \beta)'(\hat{\beta}_i - \beta),$$
where $\hat{\beta}_i$ is the estimator of $\beta$ obtained in the i-th replication from OLS or from one of the different ridge parameters, and R equals 5000, the number of replications used in the simulation.

The Simulation Results
Ridge estimators are constructed with the aim of having a smaller MSE than the OLS estimator. Improvement, if any, can therefore be studied by looking at the size of the MSE. These MSEs are reported in Tables 1 and 2. The results of our simulation study indicate that the ridge estimators outperform the OLS estimator in all cases and that the suggested estimators KI 1 and KI 2 perform very well. They appear to offer an opportunity for a large reduction in MSE, especially when the sample size and the correlation between the explanatory variables are high (Table 1).
If we focus on the values of $\sigma^2$, $\rho$ and the sample size n, we find that among the ridge estimators considered, KI 1 is the best, followed by HKB and KI 2.
Comparing Tables 1 and 2, which involve p = 7 and p = 10, respectively, we find that the MSEs are lowest in Table 1. That is, the ridge estimators are more helpful when high multicollinearity exists and the number of explanatory variables is not large.
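The simulation design and the MSE criterion described in the section on the simulation study can be sketched roughly as follows. This is only an illustration under several assumptions that do not come from the paper: the true coefficient vector is taken as a unit-length vector of equal entries, the dependent variable is not standardized, only OLS, HK and HKB are compared (the proposed KI 1 and KI 2 are not reproduced), and R is kept small so the sketch runs quickly.

```python
import numpy as np

def generate_X(n, p, rho, rng):
    """x_ij = sqrt(1 - rho^2) z_ij + rho z_{i,p+1}, scaled to correlation form."""
    z = rng.standard_normal((n, p + 1))
    X = np.sqrt(1.0 - rho**2) * z[:, :p] + rho * z[:, [p]]
    X = X - X.mean(axis=0)
    return X / np.sqrt((X**2).sum(axis=0))  # X'X now has a unit diagonal

def ridge(X, y, k):
    """Ridge estimate (X'X + kI)^{-1} X'y; k = 0 gives OLS."""
    return np.linalg.solve(X.T @ X + k * np.eye(X.shape[1]), X.T @ y)

def simulate_mse(n=25, p=7, rho=0.95, sigma=1.0, R=200, seed=0):
    """Average squared estimation error over R replications."""
    rng = np.random.default_rng(seed)
    X = generate_X(n, p, rho, rng)      # X is held fixed across replications
    beta = np.ones(p) / np.sqrt(p)      # assumption: true beta with beta'beta = 1
    sq_err = {"OLS": 0.0, "HK": 0.0, "HKB": 0.0}
    for _ in range(R):
        y = X @ beta + sigma * rng.standard_normal(n)  # new error terms
        b_ols = ridge(X, y, 0.0)
        resid = y - X @ b_ols
        s2 = resid @ resid / (n - p)    # assumption: n - p degrees of freedom
        ks = {
            "OLS": 0.0,
            "HK": s2 / np.max(b_ols**2),
            "HKB": p * s2 / (b_ols @ b_ols),
        }
        for name, k in ks.items():
            d = ridge(X, y, k) - beta
            sq_err[name] += d @ d
    return {name: total / R for name, total in sq_err.items()}

mse = simulate_mse()
```

Calling `simulate_mse` with other values of n, p and rho gives a feel for how the factors varied in the experiment affect each estimator, although the numbers will not match Tables 1 and 2 because the setup differs in the ways noted above.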

Conclusions
In this article, we introduce two alternative ridge estimators and study their performance using simulation techniques. Comparisons are made with other ridge-type estimators evaluated elsewhere. The results from the simulation study show that the sample size, the correlation between the explanatory variables and the number of explanatory variables are important factors for the performance of the different estimation methods. In most cases, the MSE decreases when the first two factors increase.
The results also show that, with respect to the MSE criterion, the proposed ridge regression methods outperform both the OLS estimator and the estimators of Hoerl and Kennard [1] and Hoerl et al. [2] in all cases investigated. The use of the proposed estimators is recommended since they reduce the MSE substantially in all of the situations investigated in this paper.