Saturday, March 11, 2017

Count Models with Offsets

See also: Count Models with Offsets: Practical Applications using R

Principles of Count Model Regression

Often we want to model the impact of some intervention, or differences between groups, in relation to an outcome that is a count. Examples include measures of medical care utilization such as office visits, days in the hospital, or total hospital admissions.

"In all cases the data are concentrated on a few small discrete values, say 0, 1 and 2; skewed to the left; and intrinsically heteroskedastic with variance increasing with the mean. These features motivate the application of special methods and models for count regression." (Cameron and Trivedi, 1999).

Poisson regression is "the starting point for count data analysis, though it is often inadequate" (Cameron and Trivedi, 1999). The "main focus is the effect of covariates on the frequency of an event, measured by non-negative integer values or counts" (Cameron and Trivedi, 1996).

From Cameron and Trivedi (1999): "The Poisson regression model is derived from the Poisson distribution by parameterizing the relation between the mean parameter μ and covariates (regressors) x. The standard assumption is to use the exponential mean parameterization"

E(Y|x) = μ = exp(xβ), where xβ = β0 + β1x

Equivalently:

Log(μ) = xβ

Where β is the change in the log of the expected count given a one-unit change in x. Alternatively, we can say that E(Y|x) changes by a factor of exp(β). Often exp(β) is interpreted as an 'incident rate ratio,' although in many cases 'rate' and 'mean' are used interchangeably (Williams, 2016).
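To make this concrete, here is a minimal sketch in R using simulated data (the variable names and 'true' coefficient values are made up for illustration), fitting a Poisson regression and exponentiating the coefficients to obtain incident rate ratios:

set.seed(123)
n     <- 1000
treat <- rbinom(n, 1, 0.5)              # hypothetical treatment/group indicator
mu    <- exp(0.5 + 0.7 * treat)         # exponential mean parameterization
y     <- rpois(n, mu)                   # simulated counts

fit <- glm(y ~ treat, family = poisson(link = "log"))
coef(fit)                               # estimates of beta0 and beta on the log scale
exp(coef(fit))                          # exp(beta) interpreted as an incident rate ratio

With roughly 1,000 simulated observations, exp(coef(fit)) for the treatment indicator should come out close to exp(0.7), i.e. about 2.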

Are we modeling mean counts or rates, and does it matter?

We might often think of our data as counts, but count models assume that counts are always observed within some interval of time or space. This gives rise to an implicit rate interpretation of counts. For example, the probability mass function for a Poisson process can be specified as:

P(y|μ) = exp(-μ) * μ^y / y!
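As a quick check, a small sketch in R (the value of μ is arbitrary) confirms that this mass function matches R's built-in dpois():

mu <- 2                                      # expected count per interval (arbitrary value)
y  <- 0:5                                    # a few possible counts

p_manual <- exp(-mu) * mu^y / factorial(y)   # the pmf written out as above
p_dpois  <- dpois(y, lambda = mu)            # R's built-in Poisson pmf
all.equal(p_manual, p_dpois)                 # TRUE -- same probabilities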

The Poisson process described above gives us the probability of y events occurring during a given interval. The parameter μ is the expected count, or average number of occurrences, in the specified or implied interval. Often, but not always, this interval is measured in units of time (e.g. ER visits per year or total leaks in 1 mile of pipeline). So even if we think of our outcome as counts, we are actually modeling a rate per some implicit interval of observation, whether we think of that explicitly or not. This is what gives us the incident rate ratio (IRR) interpretation of exponentiated coefficients. If we think of rates as:

Rate = count/t

Where t is the time, interval of observation, or exposure. If all participants are assumed to have a common t, this is essentially like dividing by 1, and our rate is, for all practical purposes, interpreted as a count.

Equivalently, as noted in the STAT 504 course notes (Penn State), when we model rates the mean count is proportional to t, and the interpretation of the parameter estimates α and β stays the same as for the model of counts; you just need to multiply the expected counts by t.
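A small numeric sketch of this proportionality (the rate and exposure values below are made up):

rate <- exp(0.5 + 0.7 * 1)       # hypothetical rate per unit interval, exp(x*beta)
t    <- c(1, 2, 5)               # different exposures (e.g., years observed)
rate * t                         # expected counts scale directly with t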

If t = 1, or everyone is observed for the same period of time, then we are back to just thinking about counts with an implied rate.

As noted in Cameron and Trivedi (1996), count models and duration (survival) models can be viewed as duals. If observed units have different levels of exposure, duration, or intervals of observation, this is accounted for in count models through the inclusion of an offset. Inclusion of an offset creates a model that explicitly considers the interval 'tx', where tx represents exposure time for individuals with covariate value x:

Log(μ/tx) = xβ; here we are explicitly specifying a rate based on time 'tx'.

Rearranging terms, we get:

Log(μ) – Log(tx) = xβ 
Log(μ) = xβ + Log(tx)
The term Log(tx) is referred to as an ‘offset.’
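In R, one common way to include this offset is through offset() in the glm() formula. Here is a minimal sketch using simulated data (the variable names, exposure range, and coefficient values are hypothetical):

set.seed(123)
n  <- 1000
x  <- rbinom(n, 1, 0.5)                    # hypothetical covariate
t  <- runif(n, 0.5, 5)                     # unequal exposure (e.g., years observed)
mu <- t * exp(0.5 + 0.7 * x)               # mean count proportional to exposure
y  <- rpois(n, mu)

# log(t) enters the linear predictor with its coefficient fixed at 1
fit <- glm(y ~ x + offset(log(t)), family = poisson(link = "log"))
exp(coef(fit))                             # incident rate ratios per unit of exposure

# an equivalent specification uses glm()'s offset argument
fit2 <- glm(y ~ x, offset = log(t), family = poisson)

Because the coefficient on log(t) is constrained to 1, the exponentiated coefficients are interpreted as rate ratios per unit of exposure rather than per (varying) observation period.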

When should we include an offset?

Karen Grace-Martin at The Analysis Factor gives a great explanation of modeling offsets or exposure in count models. Here is an excerpt:

"What this means theoretically is that by defining an offset variable, you are only adjusting for the amount of opportunity an event has.....A patient in for 20 days is twice as likely to have an incident as a patient in for 10 days....There is an assumption that the likelihood of events is not changing over time."

In another post Karen states:
"It is often necessary to include an exposure or offset parameter in the model to account for the amount of risk each individual had to the event."

So if there are differences in exposure or observation times across observations that are relevant to the outcome of interest, it makes sense to account for this by including an offset as specified above. By explicitly specifying tx we can account for differences in exposure time or observation periods unique to each observation. Often the relevant interval of exposure may be something other than time. Karen gives one example where it might not make sense to include an offset or account for time, such as the number of words a toddler can say. Another example might be the number of words spelled correctly in a spelling bee. In fact, in this case time may be endogenous: more correct words spelled by a participant imply a longer interval of observation, duration, or 'exposure'. We would not decide who is the better speller on the basis of time or a rate such as total correct words per minute. Since all count models implicitly model rates, the implicit and most relevant interval here is the contest itself. In practical terms this simply reverts to a comparison of raw counts.

Summary: Counts always occur within some interval of time or space and therefore always have an implicit 'rate' interpretation. If counts are observed across different intervals of time or space for different observations, then these differences should be accounted for through the specification of an offset. Whether to include an offset really depends on answering two questions: (1) What is the relevant interval in time or space upon which our counts are based? (2) Is this interval different across our observations of counts?

References:
Essentials of Count Data Regression. A. Colin Cameron and Pravin K. Trivedi. (1999)

Count Data Models for Financial Data. A. Colin Cameron and Pravin K. Trivedi. (1996)

Models for Count Outcomes. Richard Williams, University of Notre Dame, http://www3.nd.edu/~rwilliam/. Last revised February 16, 2016.

Econometric Analysis of Count Data. Rainer Winkelmann. 2nd Edition.

Notes: This post ignores any discussion related to overdispersion or excess zeros, which relate to other possible model specifications including negative binomial, zero-inflated Poisson (ZIP), and zero-inflated negative binomial (ZINB) models.
