In a previous post, I discussed the use of instrumental variables to address selection bias. Another method of correcting for selection bias is propensity score matching. Heuristically, it involves first estimating (using a probit or logit model) each individual's propensity to self-select into the program, then matching or weighting subjects with similar predicted probabilities, or 'propensities,' and carrying out the analysis on the matched or weighted data. (Minimal sketches of this two-step procedure, using simulated data, appear below.) A good worked example using SAS can be found in the SAS Global Forum paper by Leslie and Ghomrawi listed in the references.

Why would we want to derive matching estimators, and how would we do it? In general, we employ matching when we want to compare like cases, for example when we want to estimate the average effect of a treatment or intervention by comparing participants to non-participants in a program. Comparisons are made between participants and non-participants based on the similarity of their observed characteristics. There are two major assumptions behind matching schemes:

1) We have sufficient observable data (X) such that, conditional on X, outcomes or treatment effects (Y) are independent of program participation (D). This is also referred to as the conditional independence assumption (CIA).
The idea is that, conditional on observed characteristics (X), selection bias disappears: we want to make conditional-on-X comparisons of Y that mimic, as closely as possible, the experimental benchmark of random assignment. Let Y be some measure of success (the outcome of interest), D an indicator for participation in some program, and X a matrix of control variables. It follows from the CIA that

E[Y|X, D=1] – E[Y|X, D=0] = E[Y(1) – Y(0)|X]

where Y(1) is the value of Y when D=1 and Y(0) is the value of Y when D=0. The intermediate step is that, under conditional independence, E[Y(1)|X, D=1] = E[Y(1)|X] and E[Y(0)|X, D=0] = E[Y(0)|X], so the observable difference on the left identifies the conditional average treatment effect on the right.
2) 0 < P(D=1|X) < 1, i.e., at every value of X there is a positive probability of both participating and not participating.
This second condition relates to the idea of 'common support,' which is also central to matching: support is essentially the overlap between the values of X (or of the propensity score) for the comparison groups defined by D=1 and D=0.
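To make the first step (estimating the propensity to self-select) concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the simulated data-generating process, the variable names (x1, x2, D, Y, pscore), and the use of statsmodels are my own illustration, not taken from any of the papers cited here.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(42)
n = 2000

# Two observed covariates (X) that drive both participation (D) and the outcome (Y)
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)

# Selection into the program depends on X; this is what creates selection bias
p_true = 1 / (1 + np.exp(-(0.5 * x1 + 0.8 * x2)))
D = rng.binomial(1, p_true)

# Outcome with a true treatment effect of 2.0
Y = 2.0 * D + 1.5 * x1 + 1.0 * x2 + rng.normal(size=n)

df = pd.DataFrame({"Y": Y, "D": D, "x1": x1, "x2": x2})

# Step 1: estimate the propensity score P(D=1|X) with a logit model
X = sm.add_constant(df[["x1", "x2"]])
logit_fit = sm.Logit(df["D"], X).fit(disp=0)
df["pscore"] = logit_fit.predict(X)

# The naive difference in means overstates the effect because participants
# have systematically higher x1 and x2 than non-participants
print("Naive difference in means:",
      df.loc[df.D == 1, "Y"].mean() - df.loc[df.D == 0, "Y"].mean())

The second step, matching on these estimated propensities, is sketched after the discussion of matching methods below.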
How can we match observations? Several methods are discussed in the literature, including nearest neighbor, caliper and radius, stratification, and kernel matching. (A bare-bones nearest-neighbor example follows this paragraph.) In practice, matching has some complications. By definition, matching only utilizes observations in the region of common support, where matched observations can actually be found; unmatched observations are excluded from the analysis. Therefore we cannot estimate treatment effects outside the region of common support, and estimated treatment effects may vary substantially based on the matching method employed. The matching process itself also adds variation, beyond normal sampling variation, that must be accounted for in the estimation of standard errors. Some approaches to this involve bootstrapping (Lechner, 2002), although Imbens (2004) finds a lack of empirical support for the necessity of bootstrapping.
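Continuing the sketch above, the function below is a stripped-down illustration of one-to-one nearest-neighbor matching (with replacement) on the estimated propensity score, with a caliper and trimming to the region of common support. It is meant only to show the mechanics; the caliper of 0.05 is an arbitrary choice, and in practice a dedicated matching package would be preferable.

import numpy as np

def nearest_neighbor_att(pscore, D, Y, caliper=0.05):
    """Estimate the average treatment effect on the treated (ATT) by matching
    each treated unit to the control unit with the closest propensity score.

    Treated units outside the region of common support, or with no control
    within `caliper`, are dropped, so the estimate applies only to matched units.
    """
    pscore, D, Y = map(np.asarray, (pscore, D, Y))
    treated = np.where(D == 1)[0]
    control = np.where(D == 0)[0]

    # Common support: the overlap of the two groups' propensity score ranges
    lo = max(pscore[treated].min(), pscore[control].min())
    hi = min(pscore[treated].max(), pscore[control].max())

    effects = []
    for i in treated:
        if not (lo <= pscore[i] <= hi):
            continue  # outside common support -> excluded from the analysis
        gaps = np.abs(pscore[control] - pscore[i])
        if gaps.min() > caliper:
            continue  # no acceptable match within the caliper
        j = control[np.argmin(gaps)]
        effects.append(Y[i] - Y[j])

    return np.mean(effects), len(effects)

# Example usage, with df from the propensity score sketch above:
# att, n_matched = nearest_neighbor_att(df["pscore"], df["D"], df["Y"])
# print("Matched ATT estimate:", att, "from", n_matched, "matched treated units")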
Why not just run a regression?
The discussion above of conditional independence and conditional-on-X comparisons of Y may sound a lot like regression, which also estimates a conditional expectation of Y:

Y = B0 + B1X + B2D + e, so that E(Y|X, D) = B0 + B1X + B2D

i.e., we estimate the coefficient on D holding constant the other factors in the model. In fact, regression and matching have a very strong connection. In Equivalence between Fixed Effects and Matching Estimators for Causal Inference, Kosuke Imai and In Song Kim (as a prelude to their discussion of fixed effects models and matching in panel studies) state:
"It is well known that the least squares estimate of β is algebraically equivalent to a matching estimator for the average treatment effect."
They go on to illustrate this algebraic equivalence between regression and matching estimators; the full derivation is in the linked paper.
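To convey the flavor of that equivalence in simplified notation (mine, not theirs), suppose X takes on a finite number of values, so that observations fall into discrete covariate cells indexed by s. Let d_s be the treated-minus-control difference in mean outcomes within cell s, n_s the number of observations in the cell, and p_s the share of the cell that is treated. Then the matching estimator of the effect of treatment on the treated and the least squares coefficient on D (from a regression of Y on D plus a full set of cell dummies) are both weighted averages of the same cell-specific differences:

ATT (matching):   Σ_s [ n_s p_s d_s ] / Σ_s [ n_s p_s ]

OLS (regression): Σ_s [ n_s p_s (1 – p_s) d_s ] / Σ_s [ n_s p_s (1 – p_s) ]

They differ only in the weights applied to d_s, which is exactly the point Angrist and Pischke make in the passage quoted further below.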
Morgan and Harding (2006) cite an abundance of literature supporting the connection between regression and matching estimators, including Hahn (1998); Heckman, Ichimura, and Todd (1998); Hirano et al. (2003); Imbens (2004); and Angrist and Krueger (1999). In particular, they point to the findings of Heckman and Vytlacil (2004), noting that 'all average causal effect estimators can be interpreted as weighted averages of marginal treatment effects whether generated by matching, regression, or local instrumental variable estimators.'
In their particular work, Morgan and Harding find:
“For comparison, we then provide analogous regression estimates in the second panel of the table. In some cases, these estimates outperform some of the matching estimates.”
This is in line with more recent work by Angrist and Pischke (2009) as presented in their book Mostly Harmless Econometrics:
"Our view is that regression can be motivated as a particular sort of weighted matching estimator, and therefore the differences between regression and matching estimates are unlikely to be of major empirical importance…The reason the regression and matching estimates are similar is that regression, too, can be seen as a sort of matching estimator: the regression estimand differs from the matching estimands only in the weights used to combine the covariate-specific effects into a single average effect. In particular, while matching uses the distribution of covariates among the treated to weight covariate-specific estimates into an estimate of the effect of treatment on the treated, regression produces a variance-weighted average of these effects." (Chapter 3, p. 70,73-74)
Angrist and Pischke seem to imply that regression can be a 'mostly harmless' substitute or competitor to matching.
Additional Notes
If we stratify our data, matching places more emphasis on cells most likely to be treated, while regression places more emphasis on cells with similar numbers of treated and untreated cases. Matching tries to address heterogeneity (the differences in the groups we are comparing), but it can do this only to the extent that those differences are captured by observable characteristics (X). In this respect it offers no improvement over regression, which also 'matches' observations on X. The short simulation below illustrates the two weighting schemes.
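The following sketch, again with made-up numbers (three hypothetical cells with different treatment shares and different cell-specific effects), computes the within-cell treated-control differences and combines them with the two sets of weights described above. The variance-weighted average reproduces the OLS coefficient on D from a regression of Y on D and a full set of cell dummies, while the matching (ATT) weights put more weight on the heavily treated cells.

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(7)

# Three hypothetical covariate cells with different shares treated (p_treated)
# and different cell-specific treatment effects
cells = pd.DataFrame({
    "cell":      [0, 1, 2],
    "n":         [500, 500, 500],
    "p_treated": [0.1, 0.5, 0.9],
    "effect":    [1.0, 2.0, 3.0],
})

rows = []
for _, c in cells.iterrows():
    n = int(c["n"])
    D = rng.binomial(1, c["p_treated"], size=n)
    Y = c["effect"] * D + rng.normal(size=n)
    rows.append(pd.DataFrame({"cell": int(c["cell"]), "D": D, "Y": Y}))
df = pd.concat(rows, ignore_index=True)

# Within-cell treated-control differences, cell sizes, and shares treated
stats = df.groupby("cell").apply(
    lambda g: pd.Series({
        "delta": g.loc[g.D == 1, "Y"].mean() - g.loc[g.D == 0, "Y"].mean(),
        "n": len(g),
        "p": g["D"].mean(),
    })
)

att_w = stats["n"] * stats["p"]                      # matching: number treated per cell
reg_w = stats["n"] * stats["p"] * (1 - stats["p"])   # regression: variance of D per cell

print("ATT (matching weights):    ", np.average(stats["delta"], weights=att_w))
print("Variance-weighted average: ", np.average(stats["delta"], weights=reg_w))

# OLS of Y on D plus cell dummies reproduces the variance-weighted average
X = pd.get_dummies(df["cell"], prefix="cell", drop_first=True).astype(float)
X["D"] = df["D"]
X = sm.add_constant(X)
print("OLS coefficient on D:      ", sm.OLS(df["Y"], X).fit().params["D"])

With these particular numbers, the matching weights lean toward the heavily treated third cell while the regression weights lean toward the evenly split middle cell, so the two averages diverge even though both are built from the same cell-specific differences.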
References:
Gelman, Andrew. 2009. ‘‘A Statistician's Perspective on Mostly Harmless Econometrics: An Empiricist's Companion, by Joshua D. Angrist and Jörn-Steffen Pischke.’’ The Stata Journal 9(2):315-320.
Angrist, Joshua D. and Jörn-Steffen Pischke. 2009. Mostly Harmless Econometrics: An Empiricist's Companion. Princeton, NJ: Princeton University Press.
Hahn, Jinyong. 1998. ‘‘On the Role of the Propensity Score in Efficient Semiparametric Estimation of Average Treatment Effects.’’ Econometrica 66:315-31.
Heckman, James J., Hidehiko Ichimura, and Petra Todd. 1997. ‘‘Matching as an Econometric Evaluation Estimator: Evidence From Evaluating a Job Training Programme.’’ Review of Economic Studies 64:605-54.
Hirano, Keisuke, Guido W. Imbens, and Geert Ridder. 2003. ‘‘Efficient Estimation of Average Treatment Effects Using the Estimated Propensity Score.’’ Econometrica 71:1161-89.
Imbens, Guido W. 2000. ‘‘The Role of the Propensity Score in Estimating Dose-Response Functions.’’ Biometrika 87:706-10.
Imbens, Guido W. 2004. ‘‘Nonparametric Estimation of Average Treatment Effects Under Exogeneity: A Review.’’ Review of Economics and Statistics 86:4-29.
Angrist, Joshua D. and Alan B. Krueger. 1999. ‘‘Empirical Strategies in Labor Economics.’’ Pp. 1277-1366 in Handbook of Labor Economics, vol. 3, edited by O. C. Ashenfelter and D. Card. Amsterdam: Elsevier.
Leslie, R. Scott and Hassan Ghomrawi. 2008. ‘‘The Use of Propensity Scores and Instrumental Variable Methods to Adjust for Treatment Selection Bias.’’ Paper 366-2008, SAS Global Forum. MedImpact Healthcare Systems, Inc., San Diego, CA.
Lechner, M. 2002. ‘‘Some Practical Issues in the Evaluation of Heterogeneous Labour Market Programmes by Matching Methods.’’ Journal of the Royal Statistical Society: Series A 165:59-82.
Imai, Kosuke and In Song Kim. ‘‘Equivalence between Fixed Effects and Matching Estimators for Causal Inference.’’ http://imai.princeton.edu/research/files/FEmatch.pdf
Morgan, Stephen L. and David J. Harding. 2006. ‘‘Matching Estimators of Causal Effects: Prospects and Pitfalls in Theory and Practice.’’ Sociological Methods & Research 35(1).