## Saturday, January 15, 2011

### Simultaneous Equations, 2SLS, & IVs

Simultaneous Equations

Simplifying the notation, let’s assume we have the following system of simultaneous/structural equations:

Y1 = f(x1,Y2) +e1  (1)
Y2 = f(x2,Y1) +e2  (2)

Both equations are necessary to characterize the relationships between the variables; they must be viewed as an entire system to grasp all of the feedback mechanisms involved.

endogenous variables: those jointly determined by the system. In this case we have two endogenous variables, Y1 and Y2.

exogenous variables: those variables that are not jointly determined by the system, in this case x1 and x2. In other words, these variables are determined outside the system, or are ‘exogenous’ to the system.

predetermined variables: exogenous variables and lagged endogenous variables (there are no lagged endogenous variables in this system, but an example would be Y1,t-1).

simultaneity bias: In the system represented above, a change in e1 leads to a change in Y1. Because Y1 is a predictor of Y2 in equation (2), this causes a change in Y2, which then feeds back into equation (1), where Y2 appears as an explanatory variable. As a result, Y2 is correlated with e1:

Cov(Y2,e1) ≠  0

Recall that the assumptions of classical regression require the explanatory variables to be uncorrelated with the error term. Simultaneity violates that assumption, so the OLS estimates of the regression coefficients become biased:

E(b) ≠  β, where b is the OLS estimate and β is the true coefficient.

This can be corrected using 2-stage least squares (2SLS) and instrumental variables.
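The simultaneity bias described above can be seen in a small simulation. This is only a sketch under assumptions that are not in the original notation: a linear version of equations (1) and (2), with made-up coefficients `a1`, `a2`, `b1`, `b2` and independent standard normal errors.

```python
import numpy as np

# Hypothetical linear version of equations (1) and (2):
#   Y1 = a1*Y2 + b1*x1 + e1
#   Y2 = a2*Y1 + b2*x2 + e2
# The coefficient values are arbitrary choices for illustration.
a1, a2, b1, b2 = 0.5, 0.4, 1.0, 1.0
n = 20_000

rng = np.random.default_rng(0)
x1, x2, e1, e2 = rng.normal(size=(4, n))  # independent standard normals

# Solve the two equations jointly so that both hold for every observation.
d = 1.0 - a1 * a2
y1 = (b1 * x1 + a1 * b2 * x2 + e1 + a1 * e2) / d
y2 = (a2 * b1 * x1 + b2 * x2 + a2 * e1 + e2) / d

# Sample covariance between the regressor Y2 and the error term e1.
cov_y2_e1 = np.cov(y2, e1)[0, 1]

# Naive OLS on equation (1): regress Y1 on a constant, x1 and Y2.
X = np.column_stack([np.ones(n), x1, y2])
b_ols, *_ = np.linalg.lstsq(X, y1, rcond=None)

print(f"Cov(Y2, e1)            = {cov_y2_e1:.3f}")
print(f"true coefficient on Y2 = {a1:.3f}")
print(f"OLS estimate on Y2     = {b_ols[2]:.3f}")
```

With these settings the population covariance is a2/(1 - a1*a2) = 0.5 rather than zero, and the OLS coefficient on Y2 should land noticeably above the true value of 0.5, no matter how large the sample is.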

Two-Stage Least Squares

If we could find a variable ‘z’ that is highly correlated with Y2 but uncorrelated with the error term e1, we could use it in place of Y2 when estimating equation (1) and avoid the problem of simultaneity bias. The same logic applies to Y1 and e2 in equation (2). This variable ‘z’ is referred to as an ‘instrumental variable.’

Studenmund defines 2SLS as ‘a method of systematically creating instrumental variables to replace endogenous variables where they appear as explanatory variables in simultaneous equation systems.’

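Equations (1) and (2) can be rewritten in what’s referred to as reduced form, where each endogenous variable is expressed in terms of the exogenous variables only:

Y1 = f(x1,x2) +v1  (3)
Y2 = f(x2,x1) +v2  (4)

Here v1 and v2 are the reduced-form error terms, which are in general combinations of the structural errors e1 and e2.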
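To see where the reduced form comes from, it helps to work through a linear case. The coefficients α and β below are an assumption made for illustration (the notation above only says f), and the derivation requires α1·α2 ≠ 1. Substituting each structural equation into the other and solving gives:

```latex
\begin{aligned}
\text{Structural form:}\quad
Y_1 &= \alpha_1 Y_2 + \beta_1 x_1 + e_1 \\
Y_2 &= \alpha_2 Y_1 + \beta_2 x_2 + e_2 \\[6pt]
\text{Reduced form:}\quad
Y_1 &= \frac{\beta_1 x_1 + \alpha_1 \beta_2 x_2}{1 - \alpha_1 \alpha_2}
       + \frac{e_1 + \alpha_1 e_2}{1 - \alpha_1 \alpha_2} \\
Y_2 &= \frac{\alpha_2 \beta_1 x_1 + \beta_2 x_2}{1 - \alpha_1 \alpha_2}
       + \frac{\alpha_2 e_1 + e_2}{1 - \alpha_1 \alpha_2}
\end{aligned}
```

Only exogenous variables appear on the right-hand side, so OLS on the reduced form does not suffer from simultaneity bias. Note also that the error term of the Y2 equation contains e1, which is exactly why Cov(Y2,e1) ≠ 0 in the structural system.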

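Stage 1: Run OLS on (3) and (4) to get fitted values z1 and z2 for Y1 and Y2 respectively. Because these are predictions built from the exogenous variables alone, they carry no error term:

z1 = f̂(x1,x2)  (5)
z2 = f̂(x2,x1)  (6)

where f̂ denotes the estimated reduced-form function.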

Stage 2: Run OLS on (1) and (2) using instruments z1 and z2 to replace Y1 and Y2 where they appear as explanatory variables.

Y1 = f(x1,z2) +e1  (7)
Y2 = f(x2,z1) +e2  (8)
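The two stages above can be sketched in a few lines of NumPy. This is an illustrative sketch, not a production estimator: it assumes a linear system with made-up coefficients (`a1`, `a2`, `b1`, `b2`), and the helper names `ols` and `two_stage_least_squares` are hypothetical.

```python
import numpy as np

def ols(y, *regressors):
    """OLS with an intercept; returns (coefficients, fitted values)."""
    X = np.column_stack([np.ones_like(y), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef

def two_stage_least_squares(y1, y2, x1, x2):
    # Stage 1: reduced-form regressions (3) and (4) on all exogenous
    # variables; the fitted values z1 and z2 are the instruments.
    _, z1 = ols(y1, x1, x2)
    _, z2 = ols(y2, x1, x2)
    # Stage 2: equations (7) and (8), with z2 and z1 replacing the
    # endogenous explanatory variables Y2 and Y1.
    coef_eq7, _ = ols(y1, x1, z2)
    coef_eq8, _ = ols(y2, x2, z1)
    return coef_eq7, coef_eq8

# Simulated data from an assumed linear system (values are arbitrary):
#   Y1 = a1*Y2 + b1*x1 + e1
#   Y2 = a2*Y1 + b2*x2 + e2
a1, a2, b1, b2 = 0.5, 0.4, 1.0, 1.0
n = 20_000
rng = np.random.default_rng(1)
x1, x2, e1, e2 = rng.normal(size=(4, n))
d = 1.0 - a1 * a2
y1 = (b1 * x1 + a1 * b2 * x2 + e1 + a1 * e2) / d
y2 = (a2 * b1 * x1 + b2 * x2 + a2 * e1 + e2) / d

coef_eq7, coef_eq8 = two_stage_least_squares(y1, y2, x1, x2)
print("eq (7) [const, x1, Y2]:", np.round(coef_eq7, 3))
print("eq (8) [const, x2, Y1]:", np.round(coef_eq8, 3))
```

In this setup the stage 2 coefficients on the instrumented variables should be close to the true values of 0.5 and 0.4. The approach works here because each equation excludes one exogenous variable (x2 from the first, x1 from the second), which gives the instrument variation that is not already in that equation.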

References:
Studenmund, A. H. (2001). Using Econometrics: A Practical Guide.