Econometric Sense: Selection Bias and the Rubin Causal Model and Potential Outcomes Framework

Friday, May 31, 2013

Selection Bias and the Rubin Causal Model and Potential Outcomes Framework

The problem of selection bias is best characterized within the Rubin Causal Model, or potential outcomes framework (Rubin, 1974; Angrist and Pischke, 2008; Imbens and Wooldridge, 2009; Klaiber and Smith, 2009).

Suppose Y_i is the measured outcome of interest. This can be written in terms of potential outcomes as:

Y_i = Y_1i if d_i = 1
    = Y_0i if d_i = 0

which can be written compactly as

Y_i = Y_0i + (Y_1i - Y_0i) d_i

The causal effect of interest for individual i is Y_1i - Y_0i, but it is unobservable because we never see both potential outcomes for the same individual. Reality forces us to compare outcomes across different individuals (those treated vs. those untreated).

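In this notation:

d_i = treatment indicator (choice or selection into treatment): 1 if treated, 0 if untreated

Y_0i = baseline potential outcome (the outcome without treatment)

Y_1i = potential treatment outcome (the outcome with treatment)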
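To make the notation concrete, here is a minimal Python sketch of the switching equation Y_i = Y_0i + (Y_1i - Y_0i) d_i. The `Unit` class, the `observed_outcome` helper, and the numbers are hypothetical and chosen only for illustration; in real data we would never have both potential outcomes for the same individual.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    y0: float  # baseline potential outcome Y_0i
    y1: float  # potential treatment outcome Y_1i
    d: int     # treatment indicator d_i (1 = treated, 0 = untreated)


def observed_outcome(u: Unit) -> float:
    """Switching equation: Y_i = Y_0i + (Y_1i - Y_0i) * d_i."""
    return u.y0 + (u.y1 - u.y0) * u.d


# Two made-up individuals: the first is treated, the second is not.
units = [Unit(y0=10.0, y1=12.0, d=1), Unit(y0=8.0, y1=11.0, d=0)]
observed = [observed_outcome(u) for u in units]
print(observed)  # [12.0, 8.0]
```

For the treated individual only Y_1i = 12 is revealed, and for the untreated individual only Y_0i = 8, so neither individual-level effect (2 and 3 here) can be computed from the observed data alone.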

What we actually measure is E[Y_i|d_i=1] - E[Y_i|d_i=0], the observed difference in mean outcomes between the treated and untreated groups. The problem of non-random selection into treatment can be characterized as follows:

E[Y_i|d_i=1] - E[Y_i|d_i=0] = E[Y_1i - Y_0i|d_i=1] + {E[Y_0i|d_i=1] - E[Y_0i|d_i=0]}

This follows from writing the observed difference as E[Y_1i|d_i=1] - E[Y_0i|d_i=0] and then adding and subtracting E[Y_0i|d_i=1]. The observed difference is equal to the average treatment effect on the treated (ATT), E[Y_1i - Y_0i|d_i=1], plus the bracketed term for selection bias. The ATT coincides with the population average treatment effect (ATE), E[Y_1i - Y_0i], when the treatment effect is the same for everyone or, more generally, does not differ on average between those who select into treatment and those who do not. If the baseline potential outcomes Y_0i of those who are treated (d_i=1) differ on average from the baseline potential outcomes Y_0i of those who are not treated or do not self-select (d_i=0), then the term {E[Y_0i|d_i=1] - E[Y_0i|d_i=0]} takes a positive or negative value, creating selection bias. When we calculate the observed difference between the treated and untreated groups, selection bias becomes confounded with the actual treatment effect. Note that if the average baseline potential outcomes of the treated and untreated groups were the same, as random assignment ensures in expectation, then the selection bias term would equal zero and the observed difference would recover the treatment effect.
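The decomposition can be checked numerically. The sketch below simulates both potential outcomes for every unit, which is only possible in a simulation, and assumes a constant treatment effect of 5 so that the average effect among the treated equals the population average effect. The distributions, the selection rule, and all variable names are assumptions made for illustration, not anything taken from the cited papers.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable


def avg(values):
    values = list(values)
    return sum(values) / len(values)


rows = []
for _ in range(20_000):
    y0 = random.gauss(50.0, 10.0)   # baseline potential outcome Y_0i
    y1 = y0 + 5.0                   # assumed constant treatment effect of 5
    # Assumed selection rule: a higher baseline makes opting in more likely.
    d = 1 if y0 + random.gauss(0.0, 10.0) > 50.0 else 0
    y = y0 + (y1 - y0) * d          # observed outcome Y_i
    rows.append((y0, y1, d, y))

treated = [r for r in rows if r[2] == 1]
untreated = [r for r in rows if r[2] == 0]

# E[Y_i|d_i=1] - E[Y_i|d_i=0]
observed_diff = avg(r[3] for r in treated) - avg(r[3] for r in untreated)
# Average of Y_1i - Y_0i among the treated (equals 5 by construction)
effect_on_treated = avg(r[1] - r[0] for r in treated)
# E[Y_0i|d_i=1] - E[Y_0i|d_i=0]
selection_bias = avg(r[0] for r in treated) - avg(r[0] for r in untreated)

print(f"observed difference: {observed_diff:.2f}")
print(f"treatment effect:    {effect_on_treated:.2f}")
print(f"selection bias:      {selection_bias:.2f}")
print(f"effect + bias:       {effect_on_treated + selection_bias:.2f}")
```

Because units with a high baseline select into treatment here, the selection bias term is positive and the observed difference overstates the true effect of 5. The last two printed lines show that the observed difference equals the treatment effect plus the selection bias, up to floating-point error.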

If the term {E[Y_0i|d_i=1] - E[Y_0i|d_i=0]} representing selection bias is large enough, it can overpower the actual treatment effect and lead the naïve researcher to conclude (based on the observed difference E[Y_i|d_i=1] - E[Y_i|d_i=0]) that the intervention or treatment was ineffectual, or to under- or overestimate the true treatment effect, depending on the direction of the bias.
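A small hand-checkable example shows how this can happen. The six individuals below are hypothetical: everyone benefits from treatment by exactly 2, but those who select into treatment start from a lower baseline, so the observed difference has the wrong sign.

```python
# (Y_0i, Y_1i, d_i) for six made-up individuals; the effect is +2 for everyone.
people = [
    (4.0, 6.0, 1),
    (5.0, 7.0, 1),
    (6.0, 8.0, 1),
    (7.0, 9.0, 0),
    (8.0, 10.0, 0),
    (9.0, 11.0, 0),
]


def avg(values):
    values = list(values)
    return sum(values) / len(values)


# The treated reveal Y_1i and the untreated reveal Y_0i.
treated_mean = avg(y1 for y0, y1, d in people if d == 1)    # (6 + 7 + 8) / 3
untreated_mean = avg(y0 for y0, y1, d in people if d == 0)  # (7 + 8 + 9) / 3

observed_diff = treated_mean - untreated_mean
true_effect = avg(y1 - y0 for y0, y1, d in people if d == 1)
selection_bias = (avg(y0 for y0, y1, d in people if d == 1)
                  - avg(y0 for y0, y1, d in people if d == 0))

print(observed_diff, true_effect, selection_bias)  # -1.0 2.0 -3.0
```

The observed difference of -1 would suggest the treatment is harmful, even though the true effect is +2 for every individual; the selection bias of -3 more than offsets it.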

References:

Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688-701.

Angrist, J. D. & Pischke, J.-S. (2008). Mostly harmless econometrics: An empiricist's companion. Princeton University Press.

Imbens, G. W. & Wooldridge, J. M. (2009). Recent developments in the econometrics of program evaluation. Journal of Economic Literature, 47(1), 5–86.

Klaiber, H. A. & Smith, V. K. (2009). Evaluating Rubin's causal model for measuring the capitalization of environmental amenities. NBER Working Paper No. 14957. National Bureau of Economic Research.
