Heteroskedasticity makes the usual OLS standard errors in linear regression inaccurate, but this can be corrected by using robust, or heteroskedasticity-consistent (HCC), standard errors.
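Recall the matrix form of the regression equation:
$$y = XB + \epsilon$$
with the OLS estimator $b = (X'X)^{-1}X'y$. Substituting for $y$, we get
$$
\begin{aligned}
b &= (X'X)^{-1}X'(XB + \epsilon) \\
  &= (X'X)^{-1}[X'XB + X'\epsilon] \\
  &= (X'X)^{-1}X'XB + (X'X)^{-1}X'\epsilon \\
  &= B + (X'X)^{-1}X'\epsilon
\end{aligned}
$$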
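The decomposition above can be checked numerically. The following is a minimal sketch using NumPy; the sample size, the seed, and the "true" coefficient vector `B` are arbitrary values chosen for illustration and do not come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)          # arbitrary seed (assumption)
n, k = 200, 3                           # illustrative sample size and number of columns

# Design matrix with an intercept column and two random regressors
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
B = np.array([1.0, 2.0, -0.5])          # hypothetical true coefficients
eps = rng.normal(size=n)                # error term
y = X @ B + eps                         # y = XB + eps

XtX_inv = np.linalg.inv(X.T @ X)
b = XtX_inv @ X.T @ y                   # OLS estimator b = (X'X)^-1 X'y
b_decomposed = B + XtX_inv @ X.T @ eps  # B + (X'X)^-1 X' eps

# The two expressions should agree up to floating-point error
identity_holds = np.allclose(b, b_decomposed)
print(identity_holds)
```

In applied work `B` and `eps` are unobserved, so this check is only possible in a simulation where both are known.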
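The variance of $b$ is
$$\operatorname{Var}(b) = E[(b-B)(b-B)']$$
From the above we know $b = B + (X'X)^{-1}X'\epsilon$, so
$$(b-B) = B + (X'X)^{-1}X'\epsilon - B = (X'X)^{-1}X'\epsilon$$
Therefore, treating $X$ as fixed (or conditioning on $X$) so that it can be moved outside the expectation,
$$
\begin{aligned}
\operatorname{Var}(b) &= E[(b-B)(b-B)'] = E\left[\left((X'X)^{-1}X'\epsilon\right)\left((X'X)^{-1}X'\epsilon\right)'\right] \\
  &= E[(X'X)^{-1}X'\epsilon\epsilon'X(X'X)^{-1}] \\
  &= (X'X)^{-1}X'E[\epsilon\epsilon']X(X'X)^{-1}
\end{aligned}
$$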
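Let $E[\epsilon\epsilon'] = \Phi$. Then we have the general form:
$$\operatorname{Var}(b) = (X'X)^{-1}X'\Phi X(X'X)^{-1}$$
If $\Phi = \sigma^2 I$ (the case of homoskedasticity), then
$$\operatorname{Var}(b) = (X'X)^{-1}X'\sigma^2 I X(X'X)^{-1} = \sigma^2(X'X)^{-1}$$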
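As a rough illustration of the two cases, the sketch below evaluates the general form for a known error covariance matrix and compares it with the homoskedastic shortcut. The helper name `sandwich`, the value of `sigma2`, and the heteroskedastic variance pattern are all assumptions made for the example.

```python
import numpy as np

def sandwich(X, Phi):
    """General form (X'X)^-1 X' Phi X (X'X)^-1 for a known error covariance Phi."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ X.T @ Phi @ X @ XtX_inv

rng = np.random.default_rng(1)          # arbitrary seed (assumption)
n = 50
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])    # intercept plus one regressor
XtX_inv = np.linalg.inv(X.T @ X)

# Homoskedastic case: Phi = sigma^2 I collapses to sigma^2 (X'X)^-1
sigma2 = 2.5                            # illustrative error variance
collapses = np.allclose(sandwich(X, sigma2 * np.eye(n)), sigma2 * XtX_inv)

# Heteroskedastic case: the error variance depends on x (hypothetical pattern),
# so the shortcut with an "average" variance no longer matches the general form
variances = 0.5 + x**2
general = sandwich(X, np.diag(variances))
shortcut = variances.mean() * XtX_inv
differs = not np.allclose(general, shortcut)

print(collapses, differs)
```

The second comparison is why the usual formula gives inaccurate standard errors when the error variance is not constant.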
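For heteroskedasticity-corrected standard errors, the unknown $\Phi$ is replaced by an estimate built from the squared OLS residuals $e_i^2$:
$$\widehat{\operatorname{Var}}(b) = \frac{N}{N-k}(X'X)^{-1}X'\hat{\Phi}X(X'X)^{-1}, \quad \hat{\Phi} = \operatorname{diag}(e_i^2)$$
where $N$ is the number of observations and $k$ is the number of estimated coefficients. This is the version Long and Ervin (2000) label HC1. The corrected standard errors are the square roots of the diagonal elements of this matrix.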
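The corrected estimator can be computed directly from the data. The following NumPy sketch is one way to do it, not a reference implementation; the function name `hc1_covariance` and the simulated data are hypothetical.

```python
import numpy as np

def hc1_covariance(X, y):
    """OLS coefficients and N/(N-k) (X'X)^-1 X' diag(e_i^2) X (X'X)^-1."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    b = XtX_inv @ X.T @ y                    # OLS estimate
    e = y - X @ b                            # OLS residuals
    # X' diag(e^2) X, computed row-wise to avoid building an n-by-n matrix
    meat = X.T @ (X * (e**2)[:, None])
    cov = (n / (n - k)) * (XtX_inv @ meat @ XtX_inv)
    return b, cov

# Simulated data whose error spread grows with x (assumption for illustration)
rng = np.random.default_rng(2)               # arbitrary seed
n = 500
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + rng.normal(size=n) * x   # error standard deviation proportional to x

b, cov = hc1_covariance(X, y)
robust_se = np.sqrt(np.diag(cov))            # corrected standard errors

# Conventional standard errors for comparison: s^2 (X'X)^-1
e = y - X @ b
s2 = e @ e / (n - X.shape[1])
classical_se = np.sqrt(np.diag(s2 * np.linalg.inv(X.T @ X)))

print(robust_se, classical_se)
```

For routine work an established package is usually preferable. In statsmodels, for example, `OLS(...).fit(cov_type="HC1")` requests the same correction.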
References:
Greene. Econometric Analysis. 1990.
Long, J. Scott, and Laurie H. Ervin. Using Heteroskedasticity Consistent Standard Errors in the Linear Regression Model. The American Statistician, Vol. 54, No. 3 (Aug. 2000), pp. 217-224.