Previously I discussed several of the most useful descriptions of instrumental variables that I have encountered through various sources. I was recently reviewing some of Lawlor's work related to Mendelian instruments and realized this was the first place I had seen the explicit use of directed acyclic graphs to describe how instrumental variables work.
In describing the application of Mendelian instruments, Lawlor et al. define an instrumental variable (Z), as depicted above, in terms of three major assumptions:
(1) Z is associated with the treatment or exposure of interest (X)
(2) Z is independent of the unobserved confounding factors (U) that impact both X and the outcome of interest (Y).
(3) Z is independent of the outcome of interest (Y) given X and the unobserved confounding factors U (i.e. this is the ‘exclusion restriction’: Z impacts Y only through X).
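To make the three assumptions concrete, here is a minimal simulation sketch of data generated according to the graph. Everything in it is an assumption made for illustration and is not taken from Lawlor et al.: the function name simulate_dag, the coefficients, the sample size, and the choice of a 0/1/2 allele count for Z. Because the data are simulated, U is visible here, which is never the case in a real study.

```python
import numpy as np


def simulate_dag(n=100_000, effect_x_on_y=0.5, seed=0):
    """Draw (Z, U, X, Y) from a toy model matching the IV graph (illustrative values)."""
    rng = np.random.default_rng(seed)
    z = rng.binomial(2, 0.3, size=n).astype(float)  # instrument, e.g. an allele count
    u = rng.normal(size=n)                          # unobserved confounder
    x = 1.0 * z + 1.0 * u + rng.normal(size=n)      # exposure depends on Z and U
    # Outcome depends on X and U only: there is no direct Z term (assumption 3).
    y = effect_x_on_y * x + 1.0 * u + rng.normal(size=n)
    return z, u, x, y


z, u, x, y = simulate_dag()

# Assumption (1): Z is associated with X.
corr_zx = np.corrcoef(z, x)[0, 1]

# Assumption (2): Z is independent of U (checkable only because U is simulated).
corr_zu = np.corrcoef(z, u)[0, 1]

# Assumption (3): given X and U, Z carries no extra information about Y,
# so its coefficient in a regression of Y on (1, X, U, Z) should be near zero.
design = np.column_stack([np.ones_like(z), x, u, z])
coef, *_ = np.linalg.lstsq(design, y, rcond=None)
z_coef_given_x_u = coef[3]

# A naive regression of Y on X alone is confounded by U and overstates the effect.
naive_slope = np.cov(x, y)[0, 1] / np.var(x, ddof=1)

print(f"corr(Z, X) = {corr_zx:.3f}")
print(f"corr(Z, U) = {corr_zu:.3f}")
print(f"Z coefficient given X and U = {z_coef_given_x_u:.3f}")
print(f"naive slope of Y on X = {naive_slope:.3f} (simulated effect is 0.5)")
```

In real data only assumption (1) can be checked directly; (2) and (3) involve the unobserved U and have to be argued from subject-matter knowledge.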
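Our instrumental variable estimate, βIV, is the ratio of the slope from regressing Y on Z to the slope from regressing X on Z, i.e. Cov(Y,Z)/Cov(X,Z). For a binary instrument this is the difference in mean Y across the two levels of Z divided by the corresponding difference in mean X. It can be estimated by two-stage least squares: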
X = α0 + α1 Z + v   (first stage; X* denotes the fitted values of X)
Y = β0 + βIV X* + e   (second stage)
The first regression isolates the variation in our treatment or exposure of interest that is related to Z (the fitted values X*) and leaves the variation related to U in the residual term. The second regression estimates βIV using only this ‘quasi-experimental’ variation in X related to the instrument Z.
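The two stages can be sketched with plain NumPy on simulated data. This is a rough illustration under assumed values, not code from any of the cited papers: the helper ols, the coefficients, and the sample size are all made up for the example, and the simulated effect of X on Y is set to 0.5.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
true_effect = 0.5  # assumed causal effect of X on Y in this simulation

z = rng.normal(size=n)                         # instrument
u = rng.normal(size=n)                         # unobserved confounder
x = 0.8 * z + u + rng.normal(size=n)           # exposure
y = true_effect * x + u + rng.normal(size=n)   # outcome, no direct Z effect


def ols(design, target):
    """Least-squares coefficients for target regressed on the design matrix."""
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


ones = np.ones(n)

# Stage 1: regress X on Z and keep the fitted values X*.
stage1_design = np.column_stack([ones, z])
x_star = stage1_design @ ols(stage1_design, x)

# Stage 2: regress Y on X*; the slope is the IV estimate.
beta_iv = ols(np.column_stack([ones, x_star]), y)[1]

# Ratio form of the same estimate: Cov(Y, Z) / Cov(X, Z).
ratio_estimate = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

# Naive OLS of Y on X for comparison; it is biased by U.
beta_ols = ols(np.column_stack([ones, x]), y)[1]

print(f"2SLS estimate:  {beta_iv:.3f}")
print(f"ratio estimate: {ratio_estimate:.3f}")
print(f"naive OLS:      {beta_ols:.3f}")
```

With a single instrument the two-stage slope and the ratio estimate coincide. Note that running the second stage by hand like this gives the right point estimate but not valid standard errors, since ordinary OLS standard errors ignore that X* was itself estimated; a dedicated IV routine is the safer choice for inference.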
Lawlor DA, Harbord RM, Sterne JA, Timpson N, Davey Smith G. Mendelian randomization: using genes as instruments for making causal inferences in epidemiology. Stat Med. 2008 Apr 15;27(8):1133-63. Link: http://www.ncbi.nlm.nih.gov/pubmed/17886233
Pearl J. Causal diagrams for empirical research. Biometrika. 1995;82(4):669-710.