Saturday, March 11, 2017

Basic Econometrics of Counts

As Cameron and Trivedi state, Poisson regression is "the starting point for count data analysis, though it is often inadequate" (Cameron and Trivedi, 1999). The "main focus is the effect of covariates on the frequency of an event, measured by non-negative integer values or counts" (Cameron and Trivedi, 1996).

Examples of counts they reference are related to medical care utilization such as office visits or days in the hospital.

"In all cases the data are concentrated on a few small discrete values, say 0, 1 and 2; skewed to the left; and intrinsically heteroskedastic with variance increasing with the mean. These features motivate the application of special methods and models for count regression." (Cameron and Trivedi, 1999).

From Cameron and Trivedi (1999): "The Poisson regression model is derived from the Poisson distribution by parameterizing the relation between the mean parameter μ and covariates (regressors) x. The standard assumption is to use the exponential mean parameterization"

μ = exp(xβ)

Slightly abusing partial derivative notation, we derive the marginal effect of x on the expected value of y as follows:

dE[y|x]/dx = β*exp(xβ)

What this implies is that if β = .10 and exp(xβ) = 2, then a 1 unit change in x will change the expectation of y by approximately .20 units.
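As a quick numerical check on the derivative above, here is a minimal Python sketch. The values β = .10 and exp(xβ) = 2 come from the example; the function names are hypothetical and chosen only for illustration.

```python
import math

def expected_count(x, beta):
    """Exponential mean: E[y|x] = exp(x * beta)."""
    return math.exp(x * beta)

def marginal_effect(x, beta):
    """Analytic derivative dE[y|x]/dx = beta * exp(x * beta)."""
    return beta * math.exp(x * beta)

beta = 0.10
x = math.log(2) / beta          # chosen so that exp(x * beta) = 2

analytic = marginal_effect(x, beta)

# Central finite difference as a sanity check on the analytic derivative.
h = 1e-6
numeric = (expected_count(x + h, beta) - expected_count(x - h, beta)) / (2 * h)

# Exact change in E[y|x] from a full one-unit increase in x.
one_unit_change = expected_count(x + 1, beta) - expected_count(x, beta)

print(round(analytic, 4), round(numeric, 4), round(one_unit_change, 4))
```

The derivative is an instantaneous rate, so the .20 figure is an approximation for a full one-unit change. Because the exponential mean is convex, the exact one-unit change here is slightly larger, about .21.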

Another interpretation is to approximate the average response by:


(see also Wooldridge, 2010)

Another way this is interpreted is through exponentiation. From Penn State's STATS 504 course:

"with every unit increase in X, the predictor variable has multiplicative effect of exp(β) on the mean (i.e. expected count) of Y."

This implies (as noted in the course notes):

    If β = 0, then exp(β) = 1  and Y and X are not related.
    If β > 0, then exp(β) > 1, and the expected count μ = E(y) is exp(β) times larger than when X = 0
    If β < 0, then exp(β) < 1, and the expected count μ = E(y) is exp(β) times smaller than when X = 0

For example, when comparing group A (X = 1) vs group B (X = 0), if exp(β) = .5, group A has an expected count that is .5 times that of B (half as large). On the other hand, if exp(β) = 1.5, group A has an expected count 1.5 times as large as that of B.

This could also be interpreted in percentage terms (similar to odds ratios in logistic regression).

For example, comparing group A (X = 1) vs group B (X = 0), if exp(β) = .5, that implies a change of (.5-1)*100% = -50%, i.e. group A has a 50% lower expected count than group B. On the other hand, if exp(β) = 1.5, this implies that group A has a (1.5-1)*100% = 50% larger expected count than B.
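The two readings of a coefficient, as a multiplicative effect and as a percentage change, can be sketched in a few lines of Python. The helper names `rate_ratio` and `percent_change` are hypothetical, not from any library.

```python
import math

def rate_ratio(beta):
    """Multiplicative effect on the expected count for a one-unit increase in x."""
    return math.exp(beta)

def percent_change(beta):
    """Percentage change in the expected count: (exp(beta) - 1) * 100."""
    return (math.exp(beta) - 1.0) * 100.0

for ratio in (0.5, 1.5):
    beta = math.log(ratio)      # coefficient that yields this rate ratio
    print(f"exp(beta) = {rate_ratio(beta):.2f} -> {percent_change(beta):+.0f}%")
```

A negative result is a reduction and a positive one an increase, which mirrors the group A vs group B comparison above.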

A simple rule of thumb or shortcut is that for small values of β we can interpret 100*β as the approximate percent change in the expected count of y for a one-unit change in x (Wooldridge, 2nd ed., 2010).

β       exp(β)          (exp(β)-1)*100%
0.01    1.0100501671     1.0050167084
0.02    1.0202013400     2.0201340027
0.03    1.0304545340     3.0454533954
0.04    1.0408107742     4.0810774192
0.05    1.0512710964     5.1271096376
0.06    1.0618365465     6.1836546545
0.07    1.0725081813     7.2508181254
0.08    1.0832870677     8.3287067675
0.09    1.0941742837     9.4174283705
0.10    1.1051709181    10.5170918076
0.11    1.1162780705    11.6278070459
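The table can be regenerated with a short Python sketch. The fourth column, 100*β, is not in the table above; it is added here only to make the comparison with the shortcut explicit. The function name `shortcut_table` is hypothetical.

```python
import math

def shortcut_table(betas):
    """Return rows of (beta, exp(beta), exact % change, 100*beta shortcut)."""
    rows = []
    for b in betas:
        exact = (math.exp(b) - 1.0) * 100.0
        rows.append((b, math.exp(b), exact, 100.0 * b))
    return rows

betas = [i / 100 for i in range(1, 12)]   # 0.01, 0.02, ..., 0.11
table = shortcut_table(betas)
for b, ratio, exact, approx in table:
    print(f"{b:5.2f} {ratio:12.10f} {exact:13.10f} {approx:6.2f}")
```

The gap between the exact percentage and 100*β widens as β grows, which is why the shortcut is best reserved for small coefficients.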

In Stata, the margins command can be used to get predicted (average) counts at each specified level of a covariate. As I understand it, this is similar to getting marginal effects at the mean for logistic regression. See UCLA Stata examples. Similarly, this can be done in SAS using the ilink option with lsmeans.

An Applied Example: Suppose we have some count outcome y that we want to model as a function of some treatment 'TRT.' Maybe we are modeling hospital admission rate differences between a treated and a control group for some intervention, or maybe this is the number of weeds in an acre plot for a treated vs control group in an agricultural experiment. Using Python I simulated a toy count data set for two groups, treated (TRT = 1) and untreated (TRT = 0). Descriptive statistics indicate a treatment effect.

Mean (treated): 2.6
Mean (untreated): 5.4

However, if I want to look at the significance of this I can model the treatment effect by specifying a Poisson regression model. Results are below:

E[y|TRT]  = exp(xβ) where x = TRT our binary indicator for treatment

From the regression output, our estimate is β = -.7309. This is rather large in magnitude, so the direct percentage approximation above is unlikely to hold. However, we can interpret the significance and direction of the effect to imply that the treatment significantly reduces the expected count of y, our outcome of interest. The table presented previously indicates that as β grows in magnitude the direct percentage shortcut drifts away from the exact effect, and for a negative β it overstates the size of the reduction. This implies that the treatment reduces the expected count by something less than 73%. If we take the path of exponentiation we get:

exp(-.7309) = .48

This implies the treatment group has an expected count that is .48 times that of the control group. In percentage terms the treatment group has an expected count that differs by (.48-1)*100 = -52%, i.e. it is 52% lower than the control group.

Interestingly, with a single variable Poisson regression model we can derive these results from the descriptive data.

If we take the ratio of average counts for treated vs untreated groups we get 2.6/5.4 = .48, which matches our exponentiated result exp(β). And if we calculate the relative difference in raw means between treated and untreated groups, (2.6-5.4)/5.4 = -.52, we see that in fact the treatment group has an average count that is about 52% lower than the control group.
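The equivalence between exp(β) and the ratio of group means can be checked with a small simulation. This is a minimal sketch using only the Python standard library, not the code from the gist referenced at the end of the post. It assumes true rates of 2.6 and 5.4 (borrowed from the sample means above), an arbitrary seed and sample size, and a hand-rolled Newton-Raphson fit; the helpers `draw_poisson` and `fit_poisson_binary` are hypothetical names.

```python
import math
import random

def draw_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1

def fit_poisson_binary(y, trt, iterations=25):
    """Newton-Raphson MLE for E[y|trt] = exp(b0 + b1*trt), trt in {0, 1}."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0   # start at the pooled mean
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for yi, ti in zip(y, trt):
            mu = math.exp(b0 + b1 * ti)
            g0 += yi - mu            # score for the intercept
            g1 += (yi - mu) * ti     # score for the treatment coefficient
            h00 += mu                # information matrix entries
            h01 += mu * ti
            h11 += mu * ti * ti
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = score.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

rng = random.Random(2017)            # arbitrary seed for reproducibility
n = 200                              # assumed group size
trt = [1] * n + [0] * n              # treated first, then untreated
y = ([draw_poisson(2.6, rng) for _ in range(n)]
     + [draw_poisson(5.4, rng) for _ in range(n)])

mean_treated = sum(y[:n]) / n
mean_control = sum(y[n:]) / n
b0, b1 = fit_poisson_binary(y, trt)

print("ratio of means:", mean_treated / mean_control)
print("exp(b1):       ", math.exp(b1))
print("percent change:", (math.exp(b1) - 1.0) * 100.0)
```

With one binary regressor and an intercept the model is saturated, so the fitted means reproduce the two group means and exp(b1) equals their ratio up to numerical precision. In practice a package such as statsmodels, Stata, or SAS would also report standard errors, which this sketch omits.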

Extensions of the Model 

As stated at the beginning of this post, the Poisson model is just the benchmark or starting point for count models. One assumption is that the mean and variance are equal. This is known as 'equidispersion.' If the variance exceeds the mean, that is referred to as overdispersion, and negative binomial models are often specified (but interpretation of the coefficients is unchanged). Overdispersion is more often the case (Cameron and Trivedi, 1996). Other special cases consider the proportion of zeros, sometimes accounted for by zero-inflated Poisson (ZIP) or zero-inflated negative binomial (ZINB) models. As noted in Cameron and Trivedi (1996), count models and duration models can be viewed as duals. If observed units have different levels of exposure or duration, this is accounted for in count models through inclusion of an offset. More advanced treatment and references should be considered in these cases.
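A crude first look at equidispersion is to compare the sample variance with the sample mean. The sketch below uses two small made-up count vectors (hypothetical data, not from any of the cited studies). It is an unconditional check only; a formal assessment would look at dispersion conditional on the covariates in the fitted model.

```python
import statistics

def dispersion_ratio(counts):
    """Sample variance divided by sample mean; about 1 under equidispersion."""
    return statistics.variance(counts) / statistics.mean(counts)

# Hypothetical counts whose variance equals their mean.
equi = [0, 1, 1, 1, 2, 2, 2, 3, 3, 5]

# Hypothetical counts with many zeros and a long right tail.
over = [0, 0, 0, 0, 1, 1, 2, 5, 9, 12]

print("equidispersed ratio:", dispersion_ratio(equi))
print("overdispersed ratio:", dispersion_ratio(over))
```

A ratio well above 1 is the pattern that typically motivates a negative binomial specification.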

Applied Examples from Literature

Some examples where counts are modeled in the applied economics literature include the following:

The Demand for Varied Diet with Econometric Models for Count Data. Jonq-Ying Lee. American Journal of Agricultural Economics, vol 69 no 3 (Aug, 1987)

Standing on the shoulders of giants: Coherence and biotechnology innovation performance. Sanchez and Ng. Selected Poster 2015 Agricultural and Applied Economics Association and Western Agricultural Economics Association Joint Annual Meeting. San Francisco CA July 26-28

Adoption of Best Management Practices to Control Weed Resistance by Corn, Cotton, and Soybean Growers. Frisvold, Hurley, and Mitchell. AgBioForum 12(3&4) 370-381. 2009.

In all cases, as is common in the limited amount of literature I have seen in applied economics, the results of the count regressions are interpreted in terms of direction and significance, but not much consideration is given to an interpretation of results based on exponentiation of coefficients. 


Essentials of Count Data Regression. A. Colin Cameron and Pravin K. Trivedi. (1999)
Count Data Models for Financial Data. A. Colin Cameron and Pravin K. Trivedi. (1996)
Econometric Analysis of Cross Section and Panel Data. Wooldridge. 2nd Ed. 2010.

See also:
Count Models with Offsets
Do we Really Need Zero Inflated Models
Quantile Regression with Count Data

For Python code related to the applied example see the following gist.
