In another post we talked about regression analysis with interaction effects. There, the variables only had two values each. But what do we do when one or both of the variables in the interaction are continuous, with many values? In general we do the same thing, but we have to present and interpret the results in a slightly different way. That is the subject of this post. Anyone who wants to dig deeper can read this (paywalled) article.
We will use data from the American General Social Survey, a survey of ordinary citizens with questions on a wide range of subjects. We will use the 2016 version, and ask what effect having kids has on income. If you want to follow along, download the data and put it in your project folder. I have put it in a sub-folder called "data", which I also state when loading the file.
In the code below, we first load the data, and then do a recoding to create a "woman" variable.
cd "/Users/xsunde/Dropbox/Jupyter/stathelp"
use "data/GSS2016.dta", clear
recode sex (1=0) (2=1), generate(woman)
In the previous post we could see that the effect of having kids on income was different for men and women. But the variable we used for kids was a dummy variable, with the values 0 (no kids) and 1 (one or more kids). Now we will instead use a continuous variable, "childs", that shows how many kids the respondent has. The variable is however capped at 8 - the value 8 signifies 8 or more kids.
First we run a normal regression, with income as the dependent variable and "woman" and "childs" as independent variables, also controlling for age, since it is strongly related to both income and the number of kids.
reg realrinc woman childs age
The coefficient for "childs" shows the expected effect of increasing the variable one step, which means having another child. It is weakly negative and insignificant. There is thus no apparent difference between respondents with few and many children. The effect of woman is negative - women earn less than men.
Now we will add the interaction between woman and childs. We can do that directly in the regression command by connecting the two variables with ##. But we must also add c. in front of the childs variable, to show that it is continuous.
reg realrinc woman##c.childs age
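A side note: the ## operator is shorthand that includes both main effects and the interaction term, so the command above is equivalent to writing everything out in full:

* equivalent specification with the interaction spelled out
reg realrinc i.woman c.childs i.woman#c.childs age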
It is now important to remember that we cannot interpret the coefficients in an interaction model in the regular way: each main effect now shows the effect of that variable when the other variable in the interaction is zero.
To calculate the effect of having another child for both values of woman (that is, men and women), we take the main effect of childs (855.1288) and then add the coefficient for the interaction term, times the value of the woman variable:
For woman = 0 (men): $855.1288 - 2060.054 \times 0 = 855.1288$
For woman = 1 (women): $855.1288 - 2060.054 \times 1 = -1204.9252$
Men who have another child increase their income by 855.1288, while women decrease their income by 1204.9252. As the interaction term is statistically significant, we know that the difference in effect between the groups is significant.
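If you want to let Stata reproduce this arithmetic, one way (assuming the estimates from the interaction model are still in memory) is to display the stored coefficients:

* effect of another child, computed from the stored estimates
display _b[childs]                           // for men (woman = 0)
display _b[childs] + _b[1.woman#c.childs]    // for women (woman = 1)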
However, this does not necessarily mean that the effect of childs is significant within each of the groups. For instance, among women, is the effect of having another child significantly negative? To calculate that we can use the margins command, and then immediately after it the marginsplot command to show the coefficients graphically.
margins, dydx(childs) at(woman=(0 1))
marginsplot, yline(0)
We now see graphically what we calculated manually: the effect of "childs" is positive for people with a 0 on the woman variable (the men) and negative for the women. But neither effect is in itself significantly different from zero. That means that we can be pretty sure that the effect of having more children is different for women than for men, but at the same time we cannot be sure that either of the two effects is different from zero! This might be a little difficult to grasp.
But what we need to keep in mind is that "not significantly different from zero" does not mean that the effect definitely is zero - it just means that we cannot rule out zero at the conventional five percent level. And if we look at the graph we can see that the confidence intervals only overlap slightly. For the two effects to be the same, the effect for men would have to be in the lowest part of its interval at the same time as the effect for women is in the highest part of its interval. The chance of both of those happening at the same time is less than five percent. A bit tricky, but it actually makes sense.
Now we will calculate the difference between men and women over different values of the childs variable. We do so by taking the main effect of "woman" (-6100.754) and adding the interaction term (-2060.054) times different numbers of children.
0 kids: -6100.754 - 2060.054 * 0 = -6100.754
1 kid: -6100.754 - 2060.054 * 1 = -8160.808
2 kids: -6100.754 - 2060.054 * 2 = -10220.862
3 kids: -6100.754 - 2060.054 * 3 = -12280.916
4 kids: -6100.754 - 2060.054 * 4 = -14340.97
5 kids: -6100.754 - 2060.054 * 5 = -16401.024
6 kids: -6100.754 - 2060.054 * 6 = -18461.078
7 kids: -6100.754 - 2060.054 * 7 = -20521.132
8 kids: -6100.754 - 2060.054 * 8 = -22581.186
We thus get nine coefficients, each 2060.054 smaller than the previous one. The difference between men and women grows larger for each additional kid. Women with 8 kids earn on average 22581.186 less than men with 8 kids!
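The same calculation can also be scripted. A small sketch, assuming the estimates from the interaction model are still in memory:

* the estimated gender gap at each number of kids
forvalues k = 0/8 {
    display "`k' kids: " _b[1.woman] + `k'*_b[1.woman#c.childs]
}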
We can't have 3.5 kids, but we can use the same method for interactions with variables that have decimal values.
Now we will calculate the same coefficients with the margins command, because we then also get the significance values and confidence intervals:
margins, dydx(woman) at(childs=(0/8))
marginsplot, yline(0)
The nine points with confidence intervals thus show the nine coefficients we just calculated. We can see that the coefficient for woman - that is, the difference between women and men - grows more and more negative as the number of kids increases. None of the confidence intervals cover zero, which means that the gender difference is always statistically significant.
The fact that the confidence intervals are wider in some places than others has to do with the distribution of values on the childs variable. The regression line is drawn through the center of gravity of the observations, and therefore varies more towards the ends of the data. The intervals are generally narrowest where there are the most observations - between 1 and 3 kids. The average of the childs variable in the data is 1.8 kids.
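If you want to check the distribution yourself, a quick look at the variable shows where the observations are concentrated:

tab childs    // frequencies at each number of kids
sum childs    // the mean is about 1.8 kids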
Finally we predict values of the income variable, using the margins command. The easiest way to do so is to show the expected income for women and men at different numbers of kids. But keep in mind that the order of the variables in the at() option matters: the variable that is entered first will be on the x axis, and the second will govern the colors. The table is also large, since we now have 18 (9 * 2) coefficients. The table is not usually reported, only the graph.
margins, at(childs=(0/8) woman=(0 1))
marginsplot
Now it is time to take things a step further. What if we have two continuous variables in the interaction? We basically do the same thing, but just have to interpret and present a little bit differently.
Now let's say that we want to look at the relationship between income, age and having another child. It is a bit strange, since the number of kids you have is tightly connected with age, but let us still try, for the sake of the example. We then run a regression where we interact the number of kids with age (and write c. in front of both variables, since they are both continuous).
reg realrinc woman c.childs##c.age
There is a significant interaction effect (-138.2845). Since the term is negative, the interpretation is that the effect of having more kids becomes more negative the older you are. Conversely, the effect of age is more negative the more kids you have.
To calculate, for instance, the effect of having another child at different ages, we do the following:
20 years old: 6347.314 - 138.2845 * 20 = 3581.624
30 years old: 6347.314 - 138.2845 * 30 = 2198.779
40 years old: 6347.314 - 138.2845 * 40 = 815.934
50 years old: 6347.314 - 138.2845 * 50 = -566.9108
60 years old: 6347.314 - 138.2845 * 60 = -1949.756
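If we want a standard error and confidence interval for one of these conditional effects, the lincom command can be used. For example, the effect of another child for a 40-year-old (a sketch, assuming the estimates from the model above are still in memory):

* effect of another child at age 40, with standard error
lincom childs + 40*c.childs#c.age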
We can as usual illustrate it with margins and marginsplot:
margins, dydx(childs) at(age=(20(10)60))
marginsplot, yline(0)
We can see in the graph that the effect of having kids is positive for people that are 20, 30 and 40, and negative for people that are 50 or 60.
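We can also solve for the age where the effect changes sign, by setting the conditional effect to zero: $6347.314 - 138.2845 \times \text{age} = 0$ gives $\text{age} = 6347.314/138.2845 \approx 45.9$. The estimated effect of having another child thus turns negative at around 46 years of age.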
And to get the effect of getting one year older at different numbers of kids, we do the following:
0 kids: 474.4715 - 138.2845 * 0 = 474.4715
2 kids: 474.4715 - 138.2845 * 2 = 197.9026
4 kids: 474.4715 - 138.2845 * 4 = -78.6664
6 kids: 474.4715 - 138.2845 * 6 = -355.2354
8 kids: 474.4715 - 138.2845 * 8 = -631.8044
margins, dydx(age) at(childs=(0(2)8))
marginsplot, yline(0)
The effect of age is positive for people with 0 or 2 kids, but negative for people with 4, 6 or 8 kids.
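The break-even point here is $474.4715/138.2845 \approx 3.4$, so the estimated effect of getting older switches from positive to negative at around 3.4 kids.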
So far we have looked at the effects, but it gets trickier if we want to show predicted values. Let's say that we want to have age on the x axis. Then we will have one line for each number of kids. But in order not to make the graph too cluttered we will only display a few values, for instance 0, 3 and 6 kids. The other lines would be drawn in between anyway.
margins, at(age=(20(10)70) childs=(0 3 6))
marginsplot
Here we can see in a different way that the effect of getting older is positive for people with 0 kids, close to zero for people with 3 kids, and negative for people with 6 kids.
But we also have to remember that we have now divided the data into a lot of different subgroups, and we only had about 1600 observations to work with from the beginning. The more interactions we include, the more sensitive the analysis is to outliers. If there for instance is a person who is very old or has a lot of kids, that person will have a great deal of leverage. It is therefore often better to combine values, so that we for instance compare people with and without kids, or people above and below 40.
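Such a recoding could look something like the following (a sketch - the variable names are just suggestions):

* hypothetical recodings: any kids at all, and above/below 40
recode childs (0=0) (1/8=1), generate(haskids)
generate over40 = age >= 40 if !missing(age)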
Interaction analyses are often theoretically interesting, and can show important differences in the data. But remember that the goal is seldom to make a 1:1 scale map of reality. Instead, we want to sift through large amounts of data to get at the big patterns. The fact that it is possible to find a significant interaction does not imply that it is interesting. There is always a risk of overfitting, that is, building a model that fits our sample perfectly, but not the wider population.