*"As many readers of this blog know, disentangling causal relationships from mere correlations is the goal of modern science, social or otherwise, and though it is easy to test whether two variables x and y are correlated, it is much more difficult to determine whether x causes y. So while it is easy to test whether increases in the level of food prices are correlated with episodes of social unrest, it is much more difficult to determine whether food prices cause social unrest."*

*"In my work, I try to do so by conditioning food prices on natural disasters. To make a long story short, if you believe that natural disasters only affect social unrest through food prices, this ensures that if there is a relationship between food prices and social unrest,* **that relationship is cleaned out of whatever variation which is not purely due to the relationship flowing from food prices to social unrest.** *In other words, this ensures that the estimated relationship between the two variables is causal. This technique is known as instrumental variables estimation."*

The idea of 'cleaning' out the bias or endogeneity is consistent with how I previously tried to build intuition for IVs, depicting an instrumental variable as a 'filter' that picks up only the variation in the treatment (CAMP) that is unrelated to an omitted variable (INDEX) or selection bias.

*"A very non-technical way to think about this is that we are taking Z and going through CAMP to get to Y, and bringing with us only those aspects of CAMP that are unrelated to INDEX. Z is like a filter that picks up only the variation in CAMP (what we may refer to as ‘quasi-experimental variation’) that we are interested in and filters out the noise from INDEX. Z is technically related to Y only through CAMP."*

Z →CAMP→Y

(you can read the full post for more context)

See also: http://econometricsense.blogspot.com/2013/06/unobserved-heterogeneity-and-endogeneity.html

Below are some more examples of discussions and descriptions of instrumental variables that have been the most beneficial to my understanding:

**Kennedy:**

*“The general idea behind this estimation procedure is that it takes the variation in the explanatory variable that matches up with variation in the instrument (and so is uncorrelated with the error), and uses only this variation to compute the slope estimate. This in effect circumvents the correlation between the error and the troublesome variable, and so avoids the asymptotic bias”*
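Kennedy's description can be sketched as a two-step calculation. The snippet below is only an illustration on simulated data: the variable names (`u`, `z`, `x`, `y`), the coefficients, and the data-generating process are all my own assumptions, not anything taken from Kennedy. The first stage keeps the part of `x` that moves with the instrument `z`, and the second stage uses only that part to compute the slope.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (every number here is an assumption).
u = rng.normal(size=n)                        # unobserved factor that ends up in the error
z = rng.normal(size=n)                        # instrument, generated independently of u
x = 0.8 * z + u + rng.normal(size=n)          # troublesome explanatory variable (depends on u)
y = 2.0 * x + 3.0 * u + rng.normal(size=n)    # true slope on x is 2.0


def slope(a, b):
    """Slope from a simple regression of b on a (with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


# Naive OLS of y on x: biased because x is correlated with the error (through u).
ols = slope(x, y)

# First stage: regress x on z and keep the fitted values, i.e. the variation
# in x that "matches up with" variation in the instrument.
first = slope(z, x)
x_hat = x.mean() + first * (z - z.mean())

# Second stage: regress y on the fitted values only.
iv = slope(x_hat, y)

print(f"OLS slope: {ols:.3f}")
print(f"IV (two-stage) slope: {iv:.3f}")
```

Under these assumptions the two-stage slope should land close to the true value of 2.0, while the naive OLS slope is pushed upward by the omitted `u`. The standard errors from a hand-rolled second stage like this are not correct, so in practice a dedicated IV routine is the safer choice.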

**Mastering 'Metrics:**

*“The instrumental variables (IV) method harnesses partial or incomplete random assignment, whether naturally occurring or generated by researchers…"*

*“The IV method uses these three assumptions to characterize a chain reaction leading from the instrument to student achievement. The first link in this causal chain-the first stage-connects randomly assigned offers with KIPP attendance, while the second link-the one we’re after-connects KIPP attendance with achievement.”*

**Andrew Gelman, with comments from Hal Varian:** How to think about instrumental variables when you get confused

*“Suppose z is your instrument, T is your treatment, and y is your outcome. So the causal model is z -> T -> y……. when I get stuck, I find it extremely helpful to go back and see what I've learned from separately thinking about the correlation of z with T, and the correlation of z with y. Since that's ultimately what instrumental variables analysis is doing.”*

*"You have to assume that the only way that z affects Y is through the treatment, T. So the IV model is*

T = az + e

y = bT + d

It follows that

E(y|z) = b E(T|z) + E(d|z)

*Now if we*

1) assume E(d|z) = 0

2) verify that E(T|z) != 0

we can solve for b by division. Of course, assumption 1 is untestable.

*An extreme case is a purely randomized experiment, where e=0 and z is a coin flip."*
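The "solve for b by division" step in the quote above can be illustrated with a small simulation. This is a rough sketch under my own assumptions: the coefficients `a` and `b`, the coin-flip instrument, and the unobserved `u` are all made up for illustration, and I am reading condition 2 as requiring that E(T|z) actually changes with z. With a binary instrument, the division amounts to the difference in mean `y` across the two values of `z` divided by the difference in mean `T`.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical setup (all values are assumptions for illustration).
a, b = 0.5, 1.5                             # first-stage and causal coefficients
z = rng.integers(0, 2, size=n)              # instrument: a coin flip
u = rng.normal(size=n)                      # unobserved factor affecting both T and y
T = a * z + u                               # T = az + e, with e = u
d = 2.0 * u + rng.normal(size=n)            # error in y: correlated with T, independent of z
y = b * T + d                               # y = bT + d

# E(y|z=1) - E(y|z=0): how much the outcome moves with the instrument.
reduced_form = y[z == 1].mean() - y[z == 0].mean()

# E(T|z=1) - E(T|z=0): how much the treatment moves with the instrument.
first_stage = T[z == 1].mean() - T[z == 0].mean()

# Solve for b by division, which is valid here because E(d|z) does not depend on z.
b_hat = reduced_form / first_stage

print(f"first stage: {first_stage:.3f}")
print(f"reduced form: {reduced_form:.3f}")
print(f"estimate of b: {b_hat:.3f}")
```

In this simulation `d` is independent of `z` by construction, so assumption 1 holds automatically. With real data that assumption cannot be checked this way, which is the point of the "untestable" remark in the quote.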

**References:**

*A Guide to Econometrics. Peter Kennedy.*

*Mastering 'Metrics. Joshua Angrist and Jörn-Steffen Pischke.*
