Tuesday, February 1, 2011

Interaction Models

Given a model of the form:

Y = β0 + β1 X + β2 Z + β3 XZ + e

the relationship between X and Y is conditional on Z. The coefficient on the interaction term, β3, measures how the effect of X on Y changes with the value of Z.

In ‘Understanding Interaction Models: Improving Empirical Analyses’ by Brambor, Clark, and Golder, the following schematic is presented:

As the schematic shows, when Z is a 0/1 variable, β2 represents the difference in intercepts between the two regression lines (Z = 0 and Z = 1).


Marginal Effect of X on Y: ∂Y/∂X = β1 + β3 Z

β1 = effect of X on Y when Z = 0
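
To make this concrete, here is a minimal Python sketch that regroups the model as Y = (β0 + β2 Z) + (β1 + β3 Z) X + e and returns the intercept and slope of the regression line at a given Z. The function name conditional_line and the coefficient values are hypothetical, chosen only for illustration.

```python
def conditional_line(b0, b1, b2, b3, z):
    """Intercept and slope of the Y-on-X regression line at a given Z.

    Regrouping Y = b0 + b1*X + b2*Z + b3*X*Z gives
    Y = (b0 + b2*Z) + (b1 + b3*Z) * X.
    """
    intercept = b0 + b2 * z
    slope = b1 + b3 * z  # the marginal effect of X on Y at this Z
    return intercept, slope


# Hypothetical coefficients, for illustration only.
b0, b1, b2, b3 = 1.0, 0.5, 2.0, 1.5

for z in (0, 1):
    intercept, slope = conditional_line(b0, b1, b2, b3, z)
    print(f"Z = {z}: intercept = {intercept}, slope = {slope}")
```

With a 0/1 Z, the two intercepts differ by β2 and the two slopes differ by β3. At Z = 0 the slope is just β1.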

If the XZ term is significant, this implies that the relationship between X and Y differs significantly across classes or values of Z.

It is possible that the effect of X on Y is significant for some values of Z even if the interaction term is not. Hence you cannot base the decision to include XZ in the model on the significance of the interaction term alone (Brambor et al., 2006).
When determining significance, the basic regression output typically does not provide sufficient information, and additional calculations, such as the standard error of β1 + β3 Z, are required (Brambor et al., 2006).

Kmenta (1971) provides the following comments regarding the significance of interactions and constitutive terms:

“When there are interaction terms in the equation, then any given explanatory variable may be represented not by one but several regressors. The hypothesis that this variable does not influence Y means that the coefficients of all regressors involving this variable are jointly zero”

As a result, the joint significance of X and the XZ term is given by the following F-test:

F = [(R₂² − R₁²) / (k₂ − k₁)] / [(1 − R₂²) / (N − k₂ − 1)]

k₁, k₂ = number of variables in the restricted model (excluding X and the XZ term) and in the full model (including them), respectively
R₁², R₂² = R-square for each respective model
N = total observations
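
As a rough sketch of the arithmetic, the F statistic can be computed directly from the two R-square values. The function name and the numbers below are hypothetical, made up for illustration. The result would be compared against an F distribution with (k₂ − k₁, N − k₂ − 1) degrees of freedom.

```python
def nested_f_statistic(r2_restricted, r2_full, k_restricted, k_full, n):
    """F statistic comparing a restricted model against the full model.

    r2_restricted, k_restricted: R-square and number of variables without X and XZ
    r2_full, k_full:             R-square and number of variables with X and XZ
    n:                           total observations
    """
    numerator = (r2_full - r2_restricted) / (k_full - k_restricted)
    denominator = (1.0 - r2_full) / (n - k_full - 1)
    return numerator / denominator


# Hypothetical example: the restricted model has Z only (k = 1),
# the full model has X, Z, and XZ (k = 3), with N = 104 observations.
f_stat = nested_f_statistic(r2_restricted=0.30, r2_full=0.40,
                            k_restricted=1, k_full=3, n=104)
print(f"F = {f_stat:.2f} on (2, 100) degrees of freedom")
```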

The standard error of β1 + β3 Z = sqrt(Var(β1) + Z² Var(β3) + 2 Z Cov(β1, β3))
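
A small Python sketch of this standard error follows. It assumes you have already pulled the two variances and the covariance out of the estimated coefficient covariance matrix. The function name and the input values are hypothetical.

```python
import math


def marginal_effect_se(var_b1, var_b3, cov_b1_b3, z):
    """Standard error of the marginal effect b1 + b3*z at a given Z."""
    variance = var_b1 + z ** 2 * var_b3 + 2 * z * cov_b1_b3
    return math.sqrt(variance)


# Hypothetical variances and covariance, for illustration only.
var_b1, var_b3, cov_b1_b3 = 0.04, 0.09, -0.02

for z in (0, 1, 2):
    se = marginal_effect_se(var_b1, var_b3, cov_b1_b3, z)
    print(f"Z = {z}: SE = {se:.3f}")
```

Note that the standard error changes with Z, so the marginal effect can be significant at some values of Z and not at others.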

Constructing Odds Ratios from Logistic Models: exp(β1 + β3 Z)
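
The sketch below checks this numerically, assuming a logit model with the same right-hand side as above: the ratio of the odds at X + 1 to the odds at X, holding Z fixed, equals exp(β1 + β3 Z). The function names and coefficient values are hypothetical.

```python
import math


def odds(x, z, b0, b1, b2, b3):
    """Odds of the outcome implied by a logit model with an XZ interaction."""
    return math.exp(b0 + b1 * x + b2 * z + b3 * x * z)


def odds_ratio_for_x(z, b1, b3):
    """Odds ratio for a one-unit increase in X at a given Z."""
    return math.exp(b1 + b3 * z)


# Hypothetical logit coefficients, for illustration only.
b0, b1, b2, b3 = -1.0, 0.2, 0.3, 0.5

for z in (0, 1):
    # Odds at X = 3 divided by odds at X = 2, holding Z fixed.
    direct = odds(3, z, b0, b1, b2, b3) / odds(2, z, b0, b1, b2, b3)
    formula = odds_ratio_for_x(z, b1, b3)
    print(f"Z = {z}: direct = {direct:.4f}, formula = {formula:.4f}")
```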


Understanding Interaction Models: Improving Empirical Analyses. Thomas Brambor, William Roberts Clark, Matt Golder. Political Analysis (2006) 14:63-82

Elements of Econometrics. Jan Kmenta. Macmillan (1971)


  2. Thanks! This discussion helped me. But at the y axis, should the lines have vertical offset beta_2*z?

  3. What you say makes sense to me! I did review the paper I referenced (https://files.nyu.edu/mrg217/public/pa_final.pdf), and that graph specifically depicts the conditional hypothesis H0: a relationship between X and Y exists only when Z ≠ 0.

    The alternative being that, if Z = 0, no relationship exists, implying a slope of 0, which technically would leave you with Y = B0. I guess in their graphic B1 is implied to be zero (which I realize now makes my post confusing, since I left out the specific hypothesis they were illustrating). I think this example illustrates very clearly that the interaction can impact both the slope and the intercept.

    They actually depict the equation of the upper line as Y = (B0 + B2) + (B1 + B3)X + e for Z = 1 (in this case Z is a 0/1 variable), so the slope of the upper line when Z = 1 is (B1 + B3) and the slope of the line when Z = 0 is B1 (which technically is zero for a horizontal line, implying no relationship between X and Y when Z is zero).

    I think this is a very specific case, but your interpretation should hold in general. Thanks for pointing this out, I actually feel like I have a better understanding now than before. And I'd encourage you to read their paper.

    But, I think your visualization holds true.

  4. Yes, you are correct about the offset as far as I know.

  5. Z is usually a dummy variable in situations like this, which is by definition binary. If Z isn't a dummy variable, you would have to pick Z to be at an arbitrary point, so an offset of b2*Z isn't terribly useful.