See also: Count Models with Offsets: Practical Applications using R 
Principles of Count Model Regression
Often we want to model the impact of some intervention, or differences between groups, on an outcome that is a count. Examples of counts related to medical care utilization include office visits, days in the hospital, or total hospital admissions.
"In all cases the data are concentrated on a few
 small discrete values, say 0, 1 and 2; skewed to the left; and 
intrinsically heteroskedastic with variance increasing with the mean. 
These features motivate the application of special
 methods and models for count regression." (Cameron and Trivedi, 1999).
Poisson regression is "the starting point for count data analysis, though it is often inadequate" (Cameron and Trivedi, 1999). The "main focus is the effect of covariates on the frequency of an event, measured by non-negative integer values or counts" (Cameron and Trivedi, 1996).
From Cameron and Trivedi (1999) "The Poisson 
regression model is derived from the Poisson distribution by 
parameterizing the relation between the mean parameter μ and covariates 
(regressors) x. The standard assumption is to use the exponential
 mean parameterization"
E(Y|x) = μ = exp(xβ), where xβ = β0 + βx
Equivalently:
Log(μ) = xβ
Here β is the change in the log of the expected count for a one-unit change in x. Alternatively, we can say that E(Y|x) changes by a factor of exp(β) for each one-unit increase in x. Often exp(β) is interpreted as an ‘incident rate ratio’, although in many cases ‘rate’ and ‘mean’ are used interchangeably (Williams, 2016).
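The multiplicative interpretation can be checked numerically. The sketch below is only an illustration: the coefficient values and the helper name `expected_count` are hypothetical and do not come from any fitted model.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
beta0 = 0.5   # intercept
beta1 = 0.3   # coefficient on a single covariate x

def expected_count(x):
    """Exponential mean: E(Y|x) = exp(beta0 + beta1 * x)."""
    return math.exp(beta0 + beta1 * x)

# A one-unit increase in x multiplies the expected count by exp(beta1),
# whatever the starting value of x.
irr = math.exp(beta1)
for x in (0.0, 1.0, 2.5):
    ratio = expected_count(x + 1) / expected_count(x)
    print(f"x = {x}: E(Y|x+1) / E(Y|x) = {ratio:.4f}")

print(f"exp(beta1) = {irr:.4f}")
```

The ratio is the same at every starting value of x, which is why a single exponentiated coefficient can summarize the effect.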
Are we modeling mean or average counts or rates or does this matter?
We might often think of our data as counts, but 
count models assume that counts are always observed within time or 
space. This gives rise to an implicit rate interpretation of counts. For
 example, the probability mass function for a Poisson
 process can be specified as:
P(Y = y|μ) = exp(-μ) * μ^y / y!
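As a quick sanity check, the Poisson probabilities exp(-μ) * μ^y / y! can be computed directly. In this minimal sketch the value of `mu` is hypothetical, and the sums are truncated at y = 49, where the remaining probability is negligible for a small mean.

```python
import math

def poisson_pmf(y, mu):
    """P(Y = y | mu) = exp(-mu) * mu**y / y! for y = 0, 1, 2, ..."""
    return math.exp(-mu) * mu ** y / math.factorial(y)

mu = 1.2  # hypothetical expected count per interval
probs = [poisson_pmf(y, mu) for y in range(50)]

total = sum(probs)                                   # should be close to 1
mean = sum(y * p for y, p in enumerate(probs))       # should be close to mu
variance = sum((y - mean) ** 2 * p for y, p in enumerate(probs))

print(f"sum of probabilities = {total:.6f}")
print(f"mean = {mean:.6f}, variance = {variance:.6f}")
```

The mean and variance both come out equal to μ, which is the source of the "variance increasing with the mean" feature noted earlier.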
The Poisson distribution described above gives us the probability of ‘y’ events occurring during a given interval. The parameter μ is the expected count, or average number of occurrences, in the specified or implied interval. Often this interval is measured in units of time (e.g. ER visits per year), but it can also be a unit of space (e.g. total leaks in 1 mile of pipeline). So even if we think of our outcome as counts, we are actually modeling a rate per some implicit interval of observation, whether we think of that explicitly or not. This is what gives us the incident rate ratio (IRR) interpretation of exponentiated coefficients. If we think of rates as:
Rate = count/t
Where t = time, interval of observation, or exposure. If all participants share a common t (for example t = 1, or everyone is observed for the same period of time), this is essentially like dividing by 1, and we are back to just thinking about counts with an implied rate.
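A small numeric illustration of this point, using made-up counts and exposure times (the `rate` helper is hypothetical):

```python
def rate(count, t):
    """Rate = count / t, where t is the interval of observation."""
    if t <= 0:
        raise ValueError("exposure t must be positive")
    return count / t

# Common exposure (t = 1 for both): the rate ratio equals the count ratio.
same_t = rate(6, 1.0) / rate(3, 1.0)

# Unequal exposure: same counts, but the second person was observed
# for only half as long, so the two rates are actually equal.
diff_t = rate(6, 1.0) / rate(3, 0.5)

print(f"rate ratio with common t:   {same_t}")
print(f"rate ratio with unequal t:  {diff_t}")
```

With a common t the comparison of raw counts and the comparison of rates agree; once exposure differs, they can tell different stories.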
As noted in Cameron and Trivedi (1996) count models
 and duration (survival) models can be viewed as duals. If observed 
units have different levels of exposure or duration  or intervals of 
observation this is accounted for in count models
 through inclusion of an offset. Inclusion of an offset creates a model 
that explicitly considers interval ‘tx’ where tx represents exposure time for individuals with covariate value x:
Log(μ/tx) = xβ  (here we are explicitly specifying a rate based on time ‘tx’)
Re-arranging terms we get:
Log(μ) – Log(tx) = xβ  
Log(μ) = xβ + Log(tx)
The term Log(tx) is referred to as an ‘offset.’
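To make the role of the offset concrete, here is a minimal sketch of fitting Log(μ) = xβ + Log(t) by Newton-Raphson in plain Python. The data, the function name `fit_poisson_offset`, and the choice of a single binary covariate are all assumptions made for illustration; in practice a statistical package would be used instead.

```python
import math

# Hypothetical data: event counts y, exposure times t, binary covariate x.
y = [2, 1, 4, 0, 3, 5, 2, 6]
t = [1.0, 0.5, 2.0, 0.5, 1.0, 1.0, 0.5, 1.5]
x = [0, 0, 0, 0, 1, 1, 1, 1]

def fit_poisson_offset(y, x, t, max_iter=50, tol=1e-10):
    """Newton-Raphson fit of log(mu_i) = b0 + b1 * x_i + log(t_i).

    The offset log(t_i) enters the linear predictor with its coefficient
    fixed at 1, so only b0 and b1 are estimated.
    """
    b0 = math.log(sum(y) / sum(t))  # start from the pooled rate
    b1 = 0.0
    for _ in range(max_iter):
        mu = [ti * math.exp(b0 + b1 * xi) for xi, ti in zip(x, t)]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        # Information matrix (negative Hessian), a symmetric 2x2.
        h00 = sum(mu)
        h01 = sum(xi * mi for xi, mi in zip(x, mu))
        h11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) < tol and abs(d1) < tol:
            break
    return b0, b1

b0, b1 = fit_poisson_offset(y, x, t)

# With one binary covariate, the fitted rates equal total events divided
# by total exposure within each group.
rate0 = sum(yi for yi, xi in zip(y, x) if xi == 0) / sum(ti for ti, xi in zip(t, x) if xi == 0)
rate1 = sum(yi for yi, xi in zip(y, x) if xi == 1) / sum(ti for ti, xi in zip(t, x) if xi == 1)

print(f"baseline rate exp(b0) = {math.exp(b0):.4f} (observed {rate0:.4f})")
print(f"rate ratio exp(b1)    = {math.exp(b1):.4f} (observed {rate1 / rate0:.4f})")
```

Because the offset has no coefficient of its own, exp(b1) is a ratio of events per unit of exposure rather than a ratio of raw counts. In R, the same kind of model is usually specified with something like glm(y ~ x + offset(log(t)), family = poisson).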
When should we include an offset?
Karen Grace-Martin at The Analysis Factor gives a great explanation of modeling offsets or exposure in count models. Here is an excerpt:
"What this means theoretically is that by 
defining an offset variable, you are only adjusting for the amount of 
opportunity an event has.....A patient in for 20 days is twice as likely
 to have an incident as a patient in for 10 days....There
 is an assumption that the likelihood of events is not changing over 
time."
In another post Karen states: 
"It is often necessary to include an exposure or
 offset parameter in the model to account for the amount of risk each 
individual had to the event."
So if there are differences in exposure or observation times across observations that are relevant to the outcome of interest, then it makes sense to account for this by including offsets as specified above. By explicitly specifying tx we can account for differences in exposure time or observation periods unique to each observation. Often the relevant interval of exposure may be something other than time. Karen gives one example where it might not make sense to include an offset or account for time: the number of words a toddler can say. Another example might be the number of words spelled correctly in a spelling bee. In fact, in this case time may be endogenous: more words spelled correctly by a participant imply a longer interval of observation, duration, or ‘exposure’. We would not make our decision about who is a better speller on the basis of time or a rate such as total correct words per minute. As all count models implicitly model rates, the implicit and most relevant interval here would be the contest itself. In practical terms this simply reverts to being a comparison of raw counts.
References:
Essentials of Count Data Regression. A. Colin 
Cameron and Pravin K. Trivedi. (1999)
Count Data Models for Financial Data. A. Colin Cameron and Pravin K. Trivedi. (1996)
Models for Count Outcomes. Richard Williams, University of Notre Dame,
http://www3.nd.edu/~rwilliam/ . Last revised February 16, 2016
Econometric Analysis of Count Data. By Rainer Winkelmann. 2nd Edition.