## Sunday, April 13, 2014

### Intuition for Fixed Effects

I've written about fixed effects before in the context of mixed models. But how are FE useful in the context of causal inference? What can we learn from panel data using FE that we can't get from a standard regression with cross-sectional data? Let's view this through a sort of parable, based largely on a very good set of notes produced by J. Blumenstock for a management statistics course (see the reference below).

Suppose we have a restaurant chain and have gathered some cross sectional data on the pricing and consumption of large pizzas for some portion of the day for some period 1 across three cities, as pictured below:

Now, if we try to infer the relationship between price and quantity demanded using this data, we notice something odd. The negative relationship implied by theory does not appear. In fact, if we plot the points, the pattern looks more like a supply curve than a demand curve:

What's going on that could explain this? One explanation could be city-specific differences related to taste and quality. Perhaps in Chicago, customers' tastes and preferences run toward much more expensive, higher-quality pizza, and they really like pizza a lot. They may be willing to pay more for more pizzas aligned with their specific tastes and preferences. Perhaps this is also true for San Francisco, but to a lesser extent, and in Atlanta maybe not so much.

What we have is unobserved heterogeneity related to these city-specific effects. How can we account for this? Suppose we instead collected the same data for two periods, essentially creating a panel of data for pizza consumption:

Now, if we look 'within' each city, the data reveal the theoretically implied negative relationship between price and quantity demanded. Take San Francisco for example:

This is essentially what fixed effects estimators using panel data can do. They allow us to exploit the 'within' variation to 'identify' causal relationships. Including a dummy variable in the regression for each city (or group, or type, to generalize beyond this example) holds constant, or 'fixes', the city-specific effects that we can't directly measure or observe, as long as those effects do not change over time. Controlling for these differences removes the 'cross-sectional' variation related to unobserved heterogeneity (like tastes, preferences, and other unobserved individual-specific effects). The remaining variation, or 'within' variation, can then be used to 'identify' the causal relationships we are interested in.
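To make the parable concrete, here is a minimal sketch in Python with NumPy. The numbers are made up for illustration (they are not the figures from the tables in this post), and the helper names `ols_slope` and `demean_by_group` are my own. It compares the pooled cross-sectional slope with the 'within' slope, and checks that a regression with one dummy per city gives the same price coefficient as the within estimator.

```python
import numpy as np

# Hypothetical panel: 3 cities x 2 periods (illustrative numbers only,
# not the data shown in the post).
city = np.array([0, 0, 1, 1, 2, 2])  # 0 = Chicago, 1 = San Francisco, 2 = Atlanta
price = np.array([15.0, 13.0, 12.0, 10.0, 9.0, 7.0])
quantity = np.array([35.0, 39.0, 25.0, 29.0, 15.0, 19.0])


def ols_slope(x, y):
    """Slope from a simple regression of y on x with an intercept."""
    xd, yd = x - x.mean(), y - y.mean()
    return (xd @ yd) / (xd @ xd)


def demean_by_group(v, g):
    """Subtract each group's mean from that group's observations."""
    _, idx = np.unique(g, return_inverse=True)
    group_means = np.bincount(idx, weights=v) / np.bincount(idx)
    return v - group_means[idx]


# Pooled regression: mixes within-city and between-city variation.
pooled_slope = ols_slope(price, quantity)

# Within estimator: regress city-demeaned quantity on city-demeaned price.
within_slope = ols_slope(demean_by_group(price, city),
                         demean_by_group(quantity, city))

# Dummy-variable version: price plus one indicator column per city
# (no separate intercept, since the dummies already span it).
X = np.column_stack([price, np.eye(3)[city]])
coef, *_ = np.linalg.lstsq(X, quantity, rcond=None)
lsdv_slope = coef[0]

print("pooled slope:", round(pooled_slope, 2))   # positive, looks like supply
print("within slope:", round(within_slope, 2))   # negative, looks like demand
print("dummy-variable slope:", round(lsdv_slope, 2))
```

With these made-up numbers the pooled slope is positive while the within and dummy-variable slopes are both negative and equal to each other, which is the pattern the pizza story describes. A real analysis would more likely use a regression package with proper standard errors; this is only meant to show where the sign flip comes from.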

See also: difference-in-differences models. These are a special case of fixed effects that is also used in causal inference.

Reference:
Fixed Effects Models (Very Important Stuff)
www.jblumenstock.com/courses/econ174/FEModels.pdf