Nice discussion on regression here (one of my favorite blogs):

http://andrewgelman.com/2013/01/understanding-regression-models-and-regression-coefficients/

I particularly like Gelman's comment:

"It's all about comparisons, nothing about how a variable "responds to change." Why? Because, in its most basic form, regression tells you nothing at all about change. It's a structured way of computing average comparisons in data."

This 'computing average comparisons in data' interpretation is why regression works as a sort of matching estimator, as Angrist and Pischke argue, and as we have all (including Dr. Gelman) discussed before here:

http://econometricsense.blogspot.com/2011/07/more-discussion-on-matching-estimators.html
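To make the 'average comparisons' reading concrete, here is a minimal sketch in Python. It assumes the simplest case, a single binary regressor with no other covariates. The data, the variable names and the coefficient values are made up for illustration and do not come from either blog post. In this case the OLS slope is exactly the difference between the two group averages:

```python
import random
from statistics import mean

random.seed(0)

# Hypothetical data: a binary indicator d and an outcome y.
n = 1000
d = [random.randint(0, 1) for _ in range(n)]
y = [2.0 + 1.5 * di + random.gauss(0, 1) for di in d]

# OLS slope for y = a + b*d, from the closed form cov(d, y) / var(d).
d_bar, y_bar = mean(d), mean(y)
cov_dy = sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
var_d = sum((di - d_bar) ** 2 for di in d)
slope = cov_dy / var_d

# Plain comparison of group averages, no regression involved.
mean_1 = mean(yi for di, yi in zip(d, y) if di == 1)
mean_0 = mean(yi for di, yi in zip(d, y) if di == 0)
diff_means = mean_1 - mean_0

print(slope, diff_means)  # the two numbers agree up to floating-point error
```

Nothing in this calculation refers to change in any unit over time. The coefficient only compares averages across groups, which is the point of Gelman's comment.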

Below is one of the quotes attributed to Terry:

*"Think of the world of difference between using a regression model for prediction and using one for estimating a parameter with a causal interpretation, for example, the effect of class size on school children's test scores. With prediction, we don't need our relationship to be causal, but we do need to be concerned with the relation between our training and our test set. If we have reason to think that our future test set may differ from our past training set in unknown ways, nothing, including cross-validation, will save us. When estimating the causal parameter, we do need to ask whether the children were randomly assigned to classes of different sizes, and if not, we need to find a way to deal with possible selection bias. If we have not measured suitable covariates on our children, we may not be able to adjust for any bias."*

If we are talking about the following specification:

$$E[Y_{i} \mid c_{i}=1] - E[Y_{i} \mid c_{i}=0] = E[Y_{1i} - Y_{0i} \mid c_{i}=1] + \{E[Y_{0i} \mid c_{i}=1] - E[Y_{0i} \mid c_{i}=0]\}$$

*Observed effect = treatment effect on the treated + {selection bias}*
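The decomposition (observed difference = treatment effect on the treated + selection bias) can be checked on simulated data, where both potential outcomes are visible. This is only a sketch under made-up assumptions: a hypothetical confounder called `ability`, a constant treatment effect of 2.0, and selection into treatment that depends on `ability`. None of these come from the original discussion.

```python
import random
from statistics import mean

random.seed(1)

n = 10000
units = []
for _ in range(n):
    ability = random.gauss(0, 1)              # hypothetical confounder
    y0 = 1.0 + ability + random.gauss(0, 1)   # potential outcome if untreated
    y1 = y0 + 2.0                             # potential outcome if treated (assumed constant effect)
    # Non-random assignment: higher-ability units are more likely to be treated.
    c = 1 if ability + random.gauss(0, 1) > 0 else 0
    units.append((c, y0, y1))

treated = [u for u in units if u[0] == 1]
control = [u for u in units if u[0] == 0]

# Left-hand side: what we can actually observe, E[Y|c=1] - E[Y|c=0].
observed = mean(y1 for _, _, y1 in treated) - mean(y0 for _, y0, _ in control)

# Right-hand side: only computable here because the simulation reveals both potential outcomes.
att = mean(y1 - y0 for _, y0, y1 in treated)
selection_bias = mean(y0 for _, y0, _ in treated) - mean(y0 for _, y0, _ in control)

print(observed, att, selection_bias)
```

In this setup the observed difference overstates the treatment effect, because the treated units would have had higher outcomes even without treatment. With real data the selection bias term is not observable, which is why random assignment or suitable covariates matter.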

I think that framework is the most useful for characterizing and understanding selection bias. I could be missing something, but I don't see how the block quote from Terry above is really inconsistent with the potential outcomes framework of causal inference, unless maybe you completely refuse to think of regression as a matching estimator. I think he does a good job of pointing out what most people don't see as different applications of regression. As Dr. Gelman says, inference may be a special case of prediction, but when I hear this distinction I can't help but think of this comment from Greene:

*"It remains an interesting question for research whether fitting y well or obtaining good parameter estimates is a preferable estimation criterion. Evidently, they need not be the same thing."*

p. 686, Greene, Econometric Analysis, 5th ed.