Monday, January 10, 2011

Ridge Regression

Ridge regression is a shrinkage method. It shrinks the regression coefficients with respect to the orthonormal basis formed by the principal components of the inputs, applying the greatest shrinkage to the directions of smallest variance; equivalently, it fits a linear surface whose coordinates along those components are damped. The coefficient estimates are produced by minimizing a 'penalized' residual sum of squares.
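The principal-component view can be made concrete with the SVD. A minimal numpy sketch (toy data of my own, assuming a centered X): writing X = UDV', the ridge fit shrinks the coordinate along the j-th principal component by the factor d_j^2 / (d_j^2 + λ), and this reproduces the closed-form fitted values exactly.

```python
import numpy as np

# Toy centered design matrix and response (illustrative, not from the post).
rng = np.random.default_rng(2)
X = rng.normal(size=(60, 4))
X -= X.mean(axis=0)
y = rng.normal(size=60)
lam = 3.0

# SVD of X: columns of U span the principal-component directions of X.
U, d, Vt = np.linalg.svd(X, full_matrices=False)

# Fitted values two ways: the closed form X(X'X + lam I)^{-1} X'y,
# and the per-component shrinkage factors d_j^2 / (d_j^2 + lam).
fit_closed = X @ np.linalg.solve(X.T @ X + lam * np.eye(4), X.T @ y)
fit_svd = U @ np.diag(d**2 / (d**2 + lam)) @ U.T @ y

print(np.allclose(fit_closed, fit_svd))
```

Components with small singular values d_j are shrunk hardest, which is why ridge stabilizes directions the data barely determine.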


By its nature, ridge regression addresses multicollinearity. The estimates have increased bias but lower variance. Ridge regression is an example of a case where a biased estimator can outperform an unbiased estimator, provided the reduction in variance (or the gain in efficiency) is large enough.
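The bias-variance trade can be checked with a small simulation (my own toy setup, not from the post): with two nearly collinear predictors, OLS coefficients swing wildly from sample to sample, while ridge estimates are far more stable.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two nearly collinear predictors (illustrative values chosen here).
n, lam, true_beta = 50, 10.0, np.array([1.0, 1.0])
x1 = rng.normal(size=n)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=n)])

def ols(X, y):
    return np.linalg.solve(X.T @ X, X.T @ y)

def ridge(X, y, lam):
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Refit on many simulated responses and compare coefficient variances.
ols_betas, ridge_betas = [], []
for _ in range(500):
    y = X @ true_beta + rng.normal(size=n)
    ols_betas.append(ols(X, y))
    ridge_betas.append(ridge(X, y, lam))

print(np.var(ols_betas, axis=0))    # huge under collinearity
print(np.var(ridge_betas, axis=0))  # much smaller: bias traded for variance
```

The ridge estimates are biased toward zero, but their sampling variance is orders of magnitude smaller here, which is the trade the post describes.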


 min{e'e + λβ'β}          the 'penalized residual sum of squares', where λ is a penalty or shrinkage factor

βridge = (X'X + λI)^-1 X'y,   so the fitted values are ŷ = Xβridge = X(X'X + λI)^-1 X'y
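The closed form above translates directly into a few lines of numpy. A minimal sketch on toy data (the design, coefficients, and λ below are my own illustrative choices): adding λ to the diagonal of X'X before solving gives the ridge estimate, and its norm is always smaller than the OLS estimate's.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(size=100)

lam = 5.0

# beta_ridge = (X'X + lam*I)^{-1} X'y  -- the closed-form ridge estimate.
beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)
y_hat = X @ beta_ridge  # fitted values X(X'X + lam*I)^{-1} X'y

# As lam -> 0 this recovers OLS; as lam grows it shrinks toward zero.
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_ridge)
print(np.linalg.norm(beta_ridge) < np.linalg.norm(beta_ols))
```

Note that unlike OLS, the system is solvable even when X'X is singular, since X'X + λI is positive definite for any λ > 0.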

See also Comparing OLS, Ridge Regression, LAR and LASSO.

References:

Hastie, Tibshirani, and Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition (link)
