Monday, October 31, 2011

Deriving Heteroskedasticity Consistent Standard Errors

Under heteroskedasticity the usual OLS standard errors in linear regression are inaccurate, but they can be corrected by using robust, or heteroskedasticity-consistent corrected (HCC), standard errors.

Recall the matrix form of the regression equation:

y = XB + ϵ

with the OLS estimator b = (X'X)^{-1} X'y. Substituting for y, we get
b = (X'X)^{-1} X'(XB + ϵ)
= (X'X)^{-1} [X'XB + X'ϵ]
= (X'X)^{-1} X'XB + (X'X)^{-1} X'ϵ
= B + (X'X)^{-1} X'ϵ
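
The decomposition above can be checked numerically. This is a minimal sketch on simulated data, assuming NumPy; the sample size, the coefficient values and the variable names (`sampling_error`, `XtX_inv`) are made up for illustration and are not from the original post.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, illustrative only
n, k = 200, 3                    # hypothetical sample size and number of coefficients

# Design matrix: an intercept column plus two random regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
B = np.array([1.0, 2.0, -0.5])   # "true" coefficients, chosen arbitrarily
eps = rng.normal(size=n)         # error term
y = X @ B + eps                  # y = XB + eps

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                  # b = (X'X)^{-1} X'y
sampling_error = XtX_inv @ X.T @ eps   # (X'X)^{-1} X'eps

# b and B + (X'X)^{-1} X'eps should agree up to floating-point error
print(np.allclose(b, B + sampling_error))
```

In real data B and ϵ are unobserved, so this identity cannot be computed directly; the simulation only illustrates the algebra.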

Assuming E[ϵ] = 0 and treating X as fixed, E[b] = B, so the variance of b is

VAR(b) = E[(b-B)(b-B)']

From the above we know that b = B + (X'X)^{-1} X'ϵ, so
(b-B) = B + (X'X)^{-1} X'ϵ - B
= (X'X)^{-1} X'ϵ

Therefore VAR(b) = E[(b-B)(b-B)'] = E[((X'X)^{-1} X'ϵ)((X'X)^{-1} X'ϵ)']

= E[(X'X)^{-1} X'ϵϵ'X(X'X)^{-1}]   (since (X'X)^{-1} is symmetric)
= (X'X)^{-1} X'E[ϵϵ']X(X'X)^{-1}   (treating X as fixed)

Let E[ϵϵ'] = Φ. Then we have VAR(b) = (X'X)^{-1} X'ΦX(X'X)^{-1}   'general form'

If Φ = σ^2 I, then VAR(b) = (X'X)^{-1} X'(σ^2 I)X(X'X)^{-1} = σ^2 (X'X)^{-1} X'X(X'X)^{-1} = σ^2 (X'X)^{-1}   'case of homoskedasticity'
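
As a quick sanity check, the collapse of the general form to the homoskedastic case can be reproduced numerically. This is a sketch assuming NumPy; the design matrix is simulated and the value of `sigma2` is arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)   # fixed seed, illustrative only
n = 50                           # hypothetical sample size
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # intercept + one regressor
sigma2 = 2.5                     # hypothetical constant error variance

XtX_inv = np.linalg.inv(X.T @ X)

# General form: (X'X)^{-1} X' Phi X (X'X)^{-1} with Phi = sigma^2 I
Phi = sigma2 * np.eye(n)
general = XtX_inv @ X.T @ Phi @ X @ XtX_inv

# Homoskedastic form: sigma^2 (X'X)^{-1}
simple = sigma2 * XtX_inv

# The two matrices should match up to floating-point error
print(np.allclose(general, simple))
```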

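For heteroskedasticity-corrected standard errors, Φ is unknown and is replaced by an estimate built from the OLS residuals e_i:

VAR(b) = N/(N-k) (X'X)^{-1} X'Φ̂X(X'X)^{-1},   where Φ̂ = diag(e_i^2)

Here N is the number of observations and k is the number of estimated coefficients. This degrees-of-freedom-adjusted version is the estimator that Long and Ervin call HC1. The corrected standard errors are the square roots of the diagonal elements of this matrix.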
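
The corrected variance formula can be sketched in a few lines of NumPy. The function name `ols_with_hc1` and the simulated data (error standard deviation growing with x) are illustrative assumptions, not part of the original post or the cited references.

```python
import numpy as np

def ols_with_hc1(X, y):
    """Return OLS coefficients, classical SEs, and N/(N-k)-scaled robust SEs."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y              # b = (X'X)^{-1} X'y
    e = y - X @ b                      # OLS residuals e_i

    # Classical variance: s^2 (X'X)^{-1}, with s^2 = e'e / (n - k)
    s2 = (e @ e) / (n - k)
    var_classical = s2 * XtX_inv

    # X' diag(e_i^2) X, computed without building the n x n diagonal matrix
    meat = (X * e[:, None] ** 2).T @ X
    var_robust = n / (n - k) * XtX_inv @ meat @ XtX_inv

    return b, np.sqrt(np.diag(var_classical)), np.sqrt(np.diag(var_robust))

# Simulated heteroskedastic data (hypothetical): error sd proportional to x
rng = np.random.default_rng(42)
n = 500
x = rng.uniform(1.0, 10.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(scale=x)

b, se_classical, se_hc1 = ols_with_hc1(X, y)
print("coefficients:", b)
print("classical SE:", se_classical)
print("robust SE:   ", se_hc1)
```

If statsmodels is available, fitting the same data with `sm.OLS(y, X).fit(cov_type="HC1")` and reading the `bse` attribute should be a reasonable cross-check on the robust standard errors.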

Greene, William H. Econometric Analysis. 1990.
Long, J. Scott, and Laurie H. Ervin. Using Heteroskedasticity Consistent Standard Errors in the Linear Regression Model. The American Statistician, Vol. 54, No. 3 (Aug. 2000), pp. 217-224.

Chi-Square Test of Independence (Association)

I watched this a while back; it is not a bad step-by-step overview of chi-square tests, and it is great for thinking about decision tree logic.