Thursday, January 3, 2013

Propensity Score Matching in Higher Ed Research

See also: Causal Inference in A Nutshell, and this example using Instrumental Variables to evaluate First Year Programs, as well as my previous discussion of matching estimators here and here.

And this really good presentation: Why Propensity Score Matching Should Be Used to Assess Programmatic Effects by Forrest Lane at the University of North Texas.

Below are some great references that serve both as higher education research and as good examples of applied quasi-experimental methods, particularly propensity score matching:

Estimating the influence of financial aid on student retention: A discrete-choice propensity score-matching model
Education Working Paper Archive
January 17, 2008
Serge Herzog, Ph.D.
Director, Institutional Analysis
Consultant, CRDA StatLab
University of Nevada, Reno

Estimates the effect of financial aid on freshmen retention using propensity score-matching. Found that higher income students accrue a retention benefit from financial aid while retention of low-income freshmen is more likely due to academic

Assessing the Effectiveness of a College Freshman Seminar Using Propensity Score Adjustments

M. H. Clark and Nicole L. Cundiff
Res High Educ (2011) 52:616–639

Without accounting for selection bias, those who took the course had similar retention rates
and lower GPAs than those who did not take the course. After matching on propensity
scores, the negative effects of the program on GPA were nullified and those in the program
were more likely to enroll for a second year.

An, B. P. (In press). The influence of dual enrollment on academic performance and college readiness: Differences by socioeconomic status. Research in Higher Education. 2012.

Employed a propensity score matching model to assess the impact of dual enrollment
on academic performance and college readiness. Found that dual enrollment continues to
influence positively academic performance and college readiness.
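
For readers new to the method these papers share, here is a minimal sketch of greedy 1:1 nearest-neighbor matching on the propensity score, written in plain Python. It is not taken from any of the studies above. It assumes the propensity scores have already been estimated (for example, from a logistic regression of program participation on pre-treatment covariates); the function name `greedy_match`, the caliper of 0.05, and the data are all hypothetical.

```python
def greedy_match(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    treated, controls: dicts mapping a unit id to its estimated
    propensity score. Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(controls)  # controls that have not been used yet
    pairs = []
    # Process treated units from highest to lowest score. This ordering is
    # one possible choice; greedy results depend on the order used.
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: -kv[1]):
        if not available:
            break
        # Closest remaining control on the propensity score.
        c_id = min(available, key=lambda c: abs(available[c] - t_ps))
        # Accept the pair only if it falls within the caliper.
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical propensity scores for three participants and four
# non-participants.
treated = {"t1": 0.80, "t2": 0.55, "t3": 0.30}
controls = {"c1": 0.78, "c2": 0.52, "c3": 0.50, "c4": 0.10}

pairs = greedy_match(treated, controls)
print(pairs)
```

In this toy data the third participant has no control within the caliper and is left unmatched, which is the usual trade-off: a tighter caliper gives closer matches at the cost of dropping some treated cases.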


  1. Hi Matt. Thanks for the great PSM resources! I have been doing a lot of reading about this and thought I might as well ask an expert...

    When coupling propensity score matching with survival analysis, do varying follow-up periods need to be considered during the matching process or does the survival analysis account for this?

    I am inclined to say that the varying follow-up period problem can be tackled during the survival analysis and can be ignored during the matching process, because I have used the stset command with the origin() and failure() options to tell Stata when the study period is to begin for each case and whether or not they were censored. Once matched, I can then use the stpower exponential command for uniform accrual. Basically Stata is now aware that there are varying entry and exit times and can control for that fact.

    Do you have any advice here? Thanks!

    1. A colleague and I were discussing something very similar just last week. I really don't know, but in the meantime I will work on a solution. It may take a while before I find out. I think it depends. If you have a selection-on-observables situation that lends itself to attaching, I might venture to say that as long as the observables don't change during follow-up periods, then the propensity scores estimated and the initial matching should hold throughout. If the observables, and hence the propensity scores, could somehow be perceived as time varying, then I would think you would have something similar to a survival analysis with time-varying covariates, and may require different matches and PS estimates for each period. And another challenge, regardless of time-varying PS or not: what do you do with an initial matched comparison if one of the matches becomes censored and the other doesn't? This may be more of an issue with 1:1 matches. I really need to think more about this. The scenarios I have mentioned would seem to be difficult to code, and I'm not sure that any procedures in SAS would handle them by default. I'm not a Stata user, so I'm unsure there. If these are truly issues, I would not doubt someone has written an R package to deal with it.

  2. Sorry for the typos "attaching" vs "matching" above. I'm commenting via my iPad which tends to have latency issues. Regardless, time dependent covariates are challenging enough at least in SAS:

  3. I've also discussed IPTW regression on this blog before, as a much more computationally tractable alternative to PS matching. My first thought when confronted with this issue was to implement some sort of IPTW-based survival analysis if possible. This would (possibly) take care of the issue related to censored matches, since IPTW methods utilize PS adjustments without explicit matching. This is discussed using R here:

  4. Also, in the Stata discussion I previously linked to, one commenter mentioned that time-varying propensity scores could introduce bias or other issues if the treatment itself impacts the observables and hence the PS in later periods. So, in this regard it would make sense to estimate the PS and implement the matches prior to the analysis without later updates. Again, an IPTW approach may be useful if you are concerned with censoring breaking down your matches.

  5. I always thought a general rule of blogging was to not dominate the comment section. I've violated that convention. But I wanted to point to a recent post that cites some articles that involve PS matching in the context of survival analysis:

  6. Hi Matt. Wow! Thank you for all the helpful resources. I think the easiest - but not entirely ideal - solution for me is to first determine the smallest follow-up period for my treatment group, and then observe all cases based on that follow-up period (first matching, then analysis). I can then move on to a longer follow-up period and take a look at who is left. It's not the cleanest method, but it appears to be the simplest. If I come up with something nicer, I will be sure to post here! Thanks again for all your help!

    Victoria from Rutgers SCJ

  7. This may also be useful to your readers:

    The Last Shall Be First: Matching Patients by Choosing the Least Popular First
    Stefanie J. Millar and David J. Pasta
    ICON Clinical Research, San Francisco, CA
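
The IPTW-based survival analysis raised in the comments above can be sketched as follows. This is a rough illustration in plain Python, not the approach from any of the linked posts: it assumes the propensity scores are already estimated, uses the standard inverse-probability weights (1/ps for treated cases, 1/(1-ps) for untreated cases), and plugs them into a weighted Kaplan-Meier estimate. The function names `iptw_weights` and `weighted_km` and all of the data are hypothetical.

```python
def iptw_weights(treated, ps):
    """Inverse-probability weights: 1/ps if treated, 1/(1-ps) if not."""
    return [1.0 / p if t else 1.0 / (1.0 - p) for t, p in zip(treated, ps)]


def weighted_km(times, events, weights):
    """Weighted Kaplan-Meier curve as a list of (time, survival) points.

    events[i] is 1 if the outcome was observed at times[i], 0 if the
    case was censored at times[i].
    """
    surv = 1.0
    curve = []
    # Step through the distinct times at which an event was observed.
    for t in sorted({tm for tm, e in zip(times, events) if e}):
        # Weighted number still at risk just before time t.
        at_risk = sum(w for tm, w in zip(times, weights) if tm >= t)
        # Weighted number of events occurring exactly at time t.
        failed = sum(w for tm, e, w in zip(times, events, weights)
                     if tm == t and e)
        surv *= 1.0 - failed / at_risk
        curve.append((t, surv))
    return curve


# Hypothetical data: treatment flag, estimated propensity score,
# follow-up time, and event indicator (0 = censored) for six cases.
treated = [1, 1, 1, 0, 0, 0]
ps = [0.8, 0.5, 0.5, 0.5, 0.2, 0.2]
times = [5, 3, 4, 2, 3, 5]
events = [1, 1, 0, 1, 1, 0]

w = iptw_weights(treated, ps)

# One weighted survival curve per arm; no matched pairs are formed, so a
# censored case simply leaves the risk set without breaking any match.
for arm in (1, 0):
    idx = [i for i, t in enumerate(treated) if t == arm]
    curve = weighted_km([times[i] for i in idx],
                        [events[i] for i in idx],
                        [w[i] for i in idx])
    print("treated" if arm else "control", curve)
```

Because cases enter the analysis through weights rather than pairs, varying follow-up is handled by the risk set in the usual way. The sketch omits standard errors, which in a weighted analysis would generally call for a robust or bootstrap estimator.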