Thursday, September 5, 2013

Propensity Score Matching

The problem of selection bias can be well characterized within the Rubin causal model, or potential outcomes framework (Rubin, 1974; Angrist & Pischke, 2008; Imbens & Wooldridge, 2009; Klaiber & Smith, 2009). In a previous post I explained how selection bias can overpower the actual treatment effect, leading the naïve researcher to conclude that the intervention or treatment was ineffectual, or to under- or overestimate the true treatment effect, depending on the direction of the bias.

However, according to the conditional independence assumption (CIA) (Rubin, 1973; Rosenbaum & Rubin, 1983; Angrist & Hahn, 2004; Angrist & Pischke, 2008), comparisons made conditional on covariates may remove selection bias, giving us the estimate of the treatment effect we need:

E[Yi | xi, di=1] - E[Yi | xi, di=0] = E[Y1i - Y0i | xi]   or   (Y1i, Y0i) ⊥ di | xi
The last term states that treatment assignment (di) and the potential responses (Y1i, Y0i) are conditionally independent given covariates xi. This provides the justification and motivation for using matched comparisons to estimate treatment effects. Matched comparisons imply balance on observed covariates, which ‘recreates’ a situation similar to a randomized experiment, where all subjects are essentially the same except for the treatment (Thoemmes & Kim, 2011).
However, matching on covariates can be complicated and cumbersome. An alternative is to match on an estimate of the probability of receiving treatment, or of selection. This probability is referred to as a propensity score. Given estimated propensity scores, comparisons can then be made between observations matched on those scores. This is in effect a two-stage process: first a model is specified and estimated to derive the propensity scores, and then matched comparisons are made on the basis of those scores.
Rosenbaum and Rubin’s propensity score theorem (1983) states that if the CIA holds, then matching or conditioning on propensity scores (denoted p(xi)) will also eliminate selection bias, i.e. treatment assignment (di) and response (Y1i, Y0i) are conditionally independent given propensity scores p(xi):
(Y1i, Y0i) ⊥ di | xi   ⇒   (Y1i, Y0i) ⊥ di | p(xi)
In fact, propensity score matching can provide an asymptotically more efficient estimator of treatment effects than covariate matching (Angrist & Hahn, 2004).
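To make the role of the CIA concrete, here is a minimal simulation sketch. Everything in it is an assumption made for illustration and is not taken from the cited papers: a single binary covariate x, the selection probabilities 0.8 and 0.2, the outcome equation, and the true effect of 2.0. It compares the naive difference in means with a comparison made conditional on x.

```python
import random

random.seed(0)

# Hypothetical data-generating process (illustration only): x is a binary
# confounder that raises both the chance of treatment and the outcome.
TRUE_EFFECT = 2.0
rows = []  # each row is (x, d, y)
for _ in range(20000):
    x = 1 if random.random() < 0.5 else 0
    p_treat = 0.8 if x == 1 else 0.2      # selection depends on x only
    d = 1 if random.random() < p_treat else 0
    y = 5.0 * x + TRUE_EFFECT * d + random.gauss(0, 1)
    rows.append((x, d, y))

def mean(values):
    return sum(values) / len(values)

def diff_in_means(data):
    """Treated mean minus control mean of the outcome."""
    treated = [y for _, d, y in data if d == 1]
    control = [y for _, d, y in data if d == 0]
    return mean(treated) - mean(control)

# Naive comparison ignores x and is contaminated by selection bias.
naive = diff_in_means(rows)

# Conditional comparison: difference within each x cell, weighted by cell size.
conditional = 0.0
for value in (0, 1):
    cell = [r for r in rows if r[0] == value]
    conditional += diff_in_means(cell) * len(cell) / len(rows)

print(f"naive: {naive:.2f}  conditional on x: {conditional:.2f}")
```

Because selection here depends only on x, the within-cell comparison should land close to the assumed effect, while the naive difference also picks up the effect of x on the outcome.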
So the idea is to first generate propensity scores by specifying a model that predicts the probability of receiving treatment given covariates xi:
p(xi) = P(di=1 | xi)
There are many possible functional forms for estimating propensity scores. Logit and probit models with the binary treatment indicator as the dependent variable are commonly used. Hirano et al. (2003) find that an efficient estimator can be achieved by weighting by a nonparametrically estimated propensity score. Millimet and Tchernis (2009) find evidence that more flexible, over-specified estimators perform better in propensity score applications. A comparative study of propensity score estimation using logistic regression, support vector machines, decision trees, and boosting algorithms can be found in Westreich et al. (2010).
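As a rough sketch of the first stage, the following fits a logit model for the treatment indicator using plain gradient ascent on the log-likelihood, with only the Python standard library. The simulated data, the coefficient values -0.5 and 1.0, and the helper name fit_logit are hypothetical choices for illustration; in practice a statistics package would normally be used for this step.

```python
import math
import random

random.seed(1)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical data: one covariate x, with the true selection model
# logit P(d=1 | x) = -0.5 + 1.0 * x (values assumed for this sketch).
data = []  # each entry is (x, d)
for _ in range(2000):
    x = random.gauss(0, 1)
    d = 1 if random.random() < sigmoid(-0.5 + 1.0 * x) else 0
    data.append((x, d))

def fit_logit(data, steps=500, lr=0.5):
    """Estimate intercept b0 and slope b1 by gradient ascent
    on the mean log-likelihood of a logit model."""
    b0, b1 = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, d in data:
            resid = d - sigmoid(b0 + b1 * x)  # score contribution
            g0 += resid
            g1 += resid * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

b0, b1 = fit_logit(data)

# Estimated propensity score p(xi) for every individual.
propensity = [sigmoid(b0 + b1 * x) for x, _ in data]

print(f"b0 = {b0:.2f}, b1 = {b1:.2f}")
```

The list propensity then holds one estimated score per individual, which is the input the second-stage matching or stratification works from.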
Once these probabilities, or ‘propensity scores’, are generated for each individual, matching is accomplished by identifying individuals in the control group with propensity scores similar to those in the treated group. Matching algorithms include 1:1 and nearest neighbor methods. Differences in outcomes between matched cases are calculated and then combined to estimate an average treatment effect. Another approach based on propensity scores is stratified comparison. In this case the treatment and control groups are divided into strata, or bins, of propensity scores. Comparisons are then made within each stratum and combined across strata to estimate an average treatment effect. Matched comparisons based on propensity score strata are discussed in Rosenbaum and Rubin (1984). This method can remove approximately 90% of the bias due to the covariates used in the propensity score model with as few as five strata (Rosenbaum & Rubin, 1984).
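The two second-stage estimators described above can be sketched as follows. This is an illustration under stated assumptions rather than a reference implementation: the data are simulated, the true effect is set to 2.0, the propensity score p(x) = 0.2 + 0.6x is treated as known instead of estimated, matching is 1:1 nearest neighbor with replacement, and the strata are five equal-sized bins of the score.

```python
import random

random.seed(2)

# Hypothetical data (illustration only): a constant true effect of 2.0 and
# a known propensity score; in practice the score would be estimated.
TRUE_EFFECT = 2.0
units = []  # each unit is (score, d, y)
for _ in range(3000):
    x = random.random()
    score = 0.2 + 0.6 * x
    d = 1 if random.random() < score else 0
    y = 3.0 * x + TRUE_EFFECT * d + random.gauss(0, 1)
    units.append((score, d, y))

def mean(values):
    return sum(values) / len(values)

treated = [u for u in units if u[1] == 1]
control = [u for u in units if u[1] == 0]
naive = mean([u[2] for u in treated]) - mean([u[2] for u in control])

# 1:1 nearest-neighbor matching with replacement: pair each treated unit
# with the control whose score is closest, then average the differences.
def nearest_control(score):
    return min(control, key=lambda c: abs(c[0] - score))

att_match = mean([t[2] - nearest_control(t[0])[2] for t in treated])

# Stratification: sort by score, cut into five equal-sized strata, and
# combine within-stratum differences in means weighted by stratum size.
ordered = sorted(units, key=lambda u: u[0])
size = len(ordered) // 5
ate_strata = 0.0
for k in range(5):
    stratum = ordered[k * size:(k + 1) * size]
    y1 = [u[2] for u in stratum if u[1] == 1]
    y0 = [u[2] for u in stratum if u[1] == 0]
    ate_strata += (mean(y1) - mean(y0)) * len(stratum) / len(ordered)

print(f"naive {naive:.2f}, matched {att_match:.2f}, stratified {ate_strata:.2f}")
```

Matching each treated unit to a control estimates the effect on the treated, while weighting strata by their size estimates the effect for the whole sample. With the constant effect assumed here the two targets coincide, so both estimates can be compared against the same number.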

References:

Angrist, J. D., & Hahn, J. (2004). When to control for covariates? Panel asymptotics for estimates of treatment effects. Review of Economics and Statistics, 86(1), 58-72.

Angrist, J. D., & Pischke, J.-S. (2008). Mostly harmless econometrics: An empiricist's companion. Princeton University Press.

Hirano, K., Imbens, G. W., & Ridder, G. (2003). Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 71(4), 1161-1189.

Imbens, G. W., & Wooldridge, J. M. (2009). Recent developments in the econometrics of program evaluation. Journal of Economic Literature, 47(1), 5-86.

Klaiber, H. A., & Smith, V. K. (2009). Evaluating Rubin's causal model for measuring the capitalization of environmental amenities. NBER Working Paper No. 14957. National Bureau of Economic Research.

Millimet, D. L., & Tchernis, R. (2009). On the specification of propensity scores, with applications to the analysis of trade policies. Journal of Business & Economic Statistics, 27(3).

Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41-55.

Rosenbaum, P. R., & Rubin, D. B. (1984). Reducing bias in observational studies using subclassification on the propensity score. Journal of the American Statistical Association, 79(387), 516-524.

Rubin, D. B. (1973). Matching to remove bias in observational studies. Biometrics, 29, 159-183.

Rubin, D. B. (1974). Estimating causal effects of treatments in randomized and nonrandomized studies. Journal of Educational Psychology, 66(5), 688-701.

Thoemmes, F. J., & Kim, E. S. (2011). A systematic review of propensity score methods in the social sciences. Multivariate Behavioral Research, 46(1), 90-118.

Westreich, D., Lessler, J., & Funk, M. J. (2010). Propensity score estimation: machine learning and classification methods as alternatives to logistic regression. Journal of Clinical Epidemiology, 63(8), 826-833.
