Purpose

Factorial ANOVA refers to a special case of the general linear model in which there is an interaction of two or more categorical variables (i.e. factors). A factorial design is used when there is an interest in how two or more variables (or factors) affect some outcomes variable. Rather than conduct separate one-way ANOVAs for each factor, they are all included in one analysis. Today we will review how to run factorial ANOVA models in R and how to interpret and visualize the results.

  1. Research scenario
  2. Tables of means
  3. Running the model
  4. Plotting
  5. Simple effects

Be sure to have the following packages installed and loaded:

library(tidyverse) # for plotting and data wrangling
library(rio) # for importing data
library(psych) # for descriptives
library(lsr) # for eta squared functions
library(emmeans) # for marginal means and simple effects
library(sjPlot) # for plotting model results 
library(apaTables) # for tables of means
library(car) # for testing model assumptions
library(broom) # for tidying model output

Research scenario

  • Based on subjects’ self-reports of rejection sensitivity (N = 80), a researcher divides subjects into two equal groups (low RS and high RS). Whereas half of the subjects in each group interact with a partner who displays a happy emotional expression during the interaction, the other half of the subjects in each group interact with a partner who displays a neutral emotional expression during the interaction. After the interaction, subjects are asked to rate the statement, “My interaction partner likes me”, on a scale from 1 (strongly disagree) to 7 (strongly agree).

Question: What type of factorial ANOVA design is this?

Data

  • Import the data
reject <- import("https://raw.githubusercontent.com/uopsych/psy612/master/labs/lab-8/data/reject.csv")


  • Check out the data

Code
str(reject)
Output
## 'data.frame':    80 obs. of  3 variables:
##  $ rs     : chr  "Low" "Low" "Low" "Low" ...
##  $ partner: chr  "Neutral" "Neutral" "Neutral" "Neutral" ...
##  $ liking : int  4 4 4 5 5 5 5 5 5 5 ...


  • Look at the first few rows

Code
head(reject)
Output


  • It looks like rs and partner are both being read in as character variables. Let’s go ahead and change those to factors.
reject <- reject %>% 
  mutate(rs = as.factor(rs),
         partner = as.factor(partner))


  • Check the structure again. Notice that by default R orders factor levels alphabetically. In our case, this means that High will be the reference group of rejection sensitivity and Happy will be the reference group of interaction partner’s emotional expression. However, it might be more intuitive to have Low and Neutral be the reference groups, respectively.

Code
str(reject)
Output
## 'data.frame':    80 obs. of  3 variables:
##  $ rs     : Factor w/ 2 levels "High","Low": 2 2 2 2 2 2 2 2 2 2 ...
##  $ partner: Factor w/ 2 levels "Happy","Neutral": 2 2 2 2 2 2 2 2 2 2 ...
##  $ liking : int  4 4 4 5 5 5 5 5 5 5 ...


  • To accomplish this, we can simply re-order the levels of our factor variables with fct_relevel().
# manually specify order of factor levels
reject <- reject %>% 
  mutate(rs = fct_relevel(rs, c("Low", "High")), 
         partner = fct_relevel(partner, c("Neutral", "Happy")))


  • To make sure this work, check the structure again.

Code
str(reject)
Output
## 'data.frame':    80 obs. of  3 variables:
##  $ rs     : Factor w/ 2 levels "Low","High": 1 1 1 1 1 1 1 1 1 1 ...
##  $ partner: Factor w/ 2 levels "Neutral","Happy": 1 1 1 1 1 1 1 1 1 1 ...
##  $ liking : int  4 4 4 5 5 5 5 5 5 5 ...


  • Do we have a balanced design?

Code
reject %>%
  group_by(rs, partner) %>% 
  summarize(n = n())
Output



Tables of means

Using summarize()

  • We’ll create 4 tables of means:
  1. cell means (i.e., for each combination of the two factors)
  2. marginal means for interaction partner (averaged across levels of rejection sensitivity)
  3. marginal means for RS (averaged across levels of interaction partner)
  4. grand mean (averaged across both rejection sensitivity and interaction partner)

Cell Means

Code
reject %>%
  group_by(rs, partner) %>% 
  summarize(mean = mean(liking, na.rm = TRUE),
            sd = sd(liking, na.rm = TRUE)) %>% 
knitr::kable(digits = c(NA, NA, 2, 2, 2),
             caption = "Cell Means & SD")
Output
Cell Means & SD
rs partner mean sd
Low Neutral 5.55 1.05
Low Happy 6.00 0.73
High Neutral 1.80 0.70
High Happy 6.45 0.69


Marginal Means: Interaction Partner

Code
reject %>%
  group_by(partner) %>% # instead of grouping by both factors, we just group by interaction partner
  summarize(mean = mean(liking, na.rm = TRUE),
            sd = sd(liking, na.rm = TRUE)) %>% 
knitr::kable(digits = c(NA, 2, 2, 2),
             caption = "Marginal Means & SD for Interaction Partner")
Output
Marginal Means & SD for Interaction Partner
partner mean sd
Neutral 3.67 2.09
Happy 6.22 0.73


Marginal Means: Rejection Sensitivity

Code
reject %>%
  group_by(rs) %>% # just group by rs for this one
  summarize(mean = mean(liking, na.rm = TRUE),
            sd = sd(liking, na.rm = TRUE)) %>% 
knitr::kable(digits = c(NA, 2, 2, 2),
             caption = "Marginal Means & SD for Rejection Sensitivity")
Output
Marginal Means & SD for Rejection Sensitivity
rs mean sd
Low 5.78 0.92
High 4.12 2.45


Grand Mean

Code
reject %>% # note that we don't need group_by for this one
  summarize(mean = mean(liking, na.rm = TRUE),
            sd = sd(liking, na.rm = TRUE)) %>% 
knitr::kable(digits = c(2, 2, 2),
             caption = "Grand Mean")
Output
Grand Mean
mean sd
4.95 2.02


An easier way

  • The apa.2way.table() function from {apaTables} is a much more convenient way to get our cell means and marginal means. This function works for any type of 2-way ANOVA, regardless of the number of levels of your factors, e.g. it would work for a 3 x 3 ANOVA. All you need to do is indicate what the IV’s and DV are and specify show.marginal.means = TRUE.

Code
apa.2way.table(iv1 = rs, 
               iv2 = partner, 
               dv = liking, 
               data = reject,
               show.marginal.means = TRUE)
Output
## 
## 
## Means and standard deviations for liking as a function of a 2(rs) X 2(partner) design 
## 
##           partner                              
##           Neutral      Happy      Marginal     
##        rs       M   SD     M   SD        M   SD
##       Low    5.55 1.05  6.00 0.73     5.78 0.92
##      High    1.80 0.70  6.45 0.69     4.12 2.45
##  Marginal    3.67 2.09  6.22 0.73              
## 
## Note. M and SD represent mean and standard deviation, respectively. 
## Marginal indicates the means and standard deviations pertaining to main effects.


  • An added plus is that you can easily export these tables to word (see ?apa.2way.table).

Code
apa.2way.table(iv1 = rs,
               iv2 = partner,
               dv = liking,
               data = reject,
               show.marginal.means = TRUE,
               table.number = 1,
               filename = here::here("/labs/lab-8/images/means.doc")) # example file path
Output


  • Remember, a main effect is the effect of one IV on the DV completely ignoring the other variable(s).


Question: Which means are being compared in the main effect of rejection sensitivity?

Question: Which means are being compared in the main effect of interaction partner?

Question: Which means are involved in the interaction?


Running the model

  • Factorial ANOVA is the method by which we can examine whether two (or more) categorical IVs have joint effects on a continuous outcome of interest. Like all general linear models, factorial ANOVA is a specific case of multiple regression. However, we may choose to use an ANOVA framework for the sake of interpretability.


  • We can specify the factorial ANOVA model using lm() the same way we specified
model <- lm(liking ~ rs * partner, data = reject) 
  • We can look at the regression coefficients from the model. However, in the case of factorial ANOVA, these are less useful to interpret when thinking about main effects.

Code
summary(model)
Output
## 
## Call:
## lm(formula = liking ~ rs * partner, data = reject)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -1.55  -0.55   0.00   0.55   1.45 
## 
## Coefficients:
##                     Estimate Std. Error t value            Pr(>|t|)    
## (Intercept)           5.5500     0.1797   30.88 <0.0000000000000002 ***
## rsHigh               -3.7500     0.2542  -14.75 <0.0000000000000002 ***
## partnerHappy          0.4500     0.2542    1.77              0.0807 .  
## rsHigh:partnerHappy   4.2000     0.3595   11.68 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8038 on 76 degrees of freedom
## Multiple R-squared:  0.8474, Adjusted R-squared:  0.8414 
## F-statistic: 140.7 on 3 and 76 DF,  p-value: < 0.00000000000000022


ANOVA table

  • Instead, we will focus on the ANOVA table output. We can get this using anova().

Code
anova(model)
Output


Question: Is there a main effect of rejection sensitivity on perceived liking?

Question: Is there a main effect of interaction partner on perceived liking?

Question: Is there an interaction between rejection sensitivity and interaction partner?

  • We can tidy the ANOVA table output with broom::tidy()

Code
anova(model) %>% 
  broom::tidy()
Output


  • apaTables::apa.aov.table() will generate an APA-formatted comprehensive ANOVA table for you

Code
# might need to have the {MBESS} package installed to use this function
# install.packages("MBESS")

apa.aov.table(lm_output = model,
              conf.level = .95)
Output
## 
## 
## ANOVA results using liking as the dependent variable
##  
## 
##     Predictor     SS df     MS      F    p partial_eta2 CI_95_partial_eta2
##   (Intercept) 616.05  1 616.05 953.56 .000                                
##            rs 140.62  1 140.62 217.67 .000          .74         [.64, .80]
##       partner   2.03  1   2.03   3.13 .081          .04         [.00, .15]
##  rs x partner  88.20  1  88.20 136.52 .000          .64         [.51, .73]
##         Error  49.10 76   0.65                                            
## 
## Note: Values in square brackets indicate the bounds of the 95% confidence interval for partial eta-squared


  • This also allows you to output tables to Word.

Code
apa.aov.table(lm_output = model,
              conf.level = .95,
              table.number = 2,
              filename = here::here("/labs/lab-8/images/anova_table.doc"))
Output


Checking assumptions

  • You can check the assumptions of the factorial ANOVA in much the same way you check them for multiple regression; but given the categorical nature of the predictors, some assumptions are easier to check.


  • Homogeneity of variance, for example, can be tested using Levene’s test, instead of examining a plot.

Code
car::leveneTest(liking ~ rs * partner, data = reject)
Output


Question: Have we met the homogeneity of variance assumption?

Effect size

  • All of the effects in the ANOVA are statistically significant, but how big are they? An effect size, \(\eta^2\), provides a simple way of indexing effect magnitude for ANOVA designs. You can think of it the same way as \(R^2\) (proportion of variance explained).


  • To calculate \(\eta^2\), we’ll use etaSquared from {lsr} (the companion package to your textbook).

Code
lsr::etaSquared(model)
Output
##               eta.sq eta.sq.part
## rs         0.1692045   0.5258329
## partner    0.4041330   0.7259280
## rs:partner 0.2740833   0.6423889


Question: What does partial eta squared represent? Why might be prefer partial eta squared over eta squared?


Plotting

Main effects

Main effect of rejection sensitivity

  • Remember that main effects correspond to differences in margial means. To plot main effects, we can use sjPlot::plot_model(). As usual, the first argument will be the fitted model object. Because we want to plot marginal means, we specify type = emm (which stands for “estimated marginal means”). Lastly, because we want to plot the main effect for just rejection sensitivty, we will only specify terms = "rs".

Code
# main effect of rejection sensitivity
plot_model(model, type = "emm", terms = "rs") 
Output
## NOTE: Results may be misleading due to involvement in interactions


  • Note the warning when plotting this main effect: “Results may be misleading due to involvement in interactions”. As we’ve already discussed, we have to be careful in interpreting main effects when there is also a significant interaction. For example, in this case, we would say that there clearly is a main effect of rejection sensitivity (mean liking is higher for low compared to high rejection sensitivity), BUT this effect depends on the emotional expression of the interaction partner because the interaction between these variables is significant.


Main effect of interaction partner

  • Now, to plot the main effect of interaction partner, all we need to change is terms = "partner".

Code
# main effect of rejection sensitivity
plot_model(model, type = "emm", terms = "partner") 
Output
## NOTE: Results may be misleading due to involvement in interactions


Interaction

  • Now, to visualize the interaction between rejection sensitivity and interaction partner, we will still use plot_model, but instead of plotting estimated marginal means, we will tell it to plot predicted values using type = "pred". Since the interaction involves both variables, we will also specify terms = c("rs", "partner").

Code
# plot the interaction 
plot_model(model, type = "pred", terms = c("rs", "partner"))
Output


  • Switch how the interaction is visualized by switching the order of terms.

Code
# switch the order of the terms
plot_model(model, type = "pred", terms = c("partner", "rs"))
Output


  • The following ggplot code will generate a bar plot version of the interaction plot. The emmeans() function at the beginning calculates cell means in order to plot them (more on this below).

Code
emmeans(model, specs = c("partner", "rs")) %>% 
  broom::tidy() %>% 
  ggplot(aes(x = rs, y = estimate, fill = partner)) + 
  geom_bar(stat = "identity", 
           position = "dodge", 
           alpha = 0.7) + 
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), 
                width = 0.20, 
                position = position_dodge(0.9)) +
  labs(title = "Effects of Rejection Sensitivity and Interaction Partner Expression \non Perceived Liking",
       x = "Rejection Sensitivity",
       y = "Perceived liking") +
  guides(fill=guide_legend(title = "Partner's Expression")) + 
  theme_minimal()
Output



Simple effects

  • Since we had a significant interaction above, we should look at the simple main effects. Recall that simple effects are the effect of some factor (e.g., interaction partner’s expression) at each level of another factor (e.g., at high and low RS separately).


  • We’ll look at the simple effect of interaction partner having a neutral vs. happy expression on perceived liking at different levels of rejection sensitivity. We’ll use the {emmeans} package, which Sara will go over more in lecture next week.


  • To get simple effects, we combine the emmeans() function with the contrast() function (both from {emmeans}). emmeans() works by passing it a model and then specifying which variables you’re looking at. Then, we pass that along to contrast(), which can give us a variety of different contrasts. If we want simple effects for interaction partner expression at each level of rejection sensitivity we can use by = "rs" and simple = "partner".

Code
model %>% emmeans(specs = c("partner", "rs")) %>% # specify our two factors
  contrast(by = "rs", # by is the variable we are looking at each level of
           simple = "partner") # simple is what we want simple effects of.
Output
## rs = Low:
##  contrast       estimate    SE df t.ratio p.value
##  Neutral effect   -0.225 0.127 76  -1.770 0.0807 
##  Happy effect      0.225 0.127 76   1.770 0.0807 
## 
## rs = High:
##  contrast       estimate    SE df t.ratio p.value
##  Neutral effect   -2.325 0.127 76 -18.294 <.0001 
##  Happy effect      2.325 0.127 76  18.294 <.0001 
## 
## P value adjustment: fdr method for 2 tests


  • The output has two lines per simple effect, which just shows you what it looks like in either direction (note that they are equivalent), so you would just pick the direction that makes the most sense.


  • Now let’s look at simple effects for rejection sensitivity at each level of partner expression.

Code
model %>% emmeans(specs = c("partner", "rs")) %>% # specify our two factors
  contrast(by = "partner", # by is the variable we are looking at each level of
           simple = "rs") # simple is what we want simple effects of.
Output
## partner = Neutral:
##  contrast    estimate    SE df t.ratio p.value
##  Low effect     1.875 0.127 76  14.754 <.0001 
##  High effect   -1.875 0.127 76 -14.754 <.0001 
## 
## partner = Happy:
##  contrast    estimate    SE df t.ratio p.value
##  Low effect    -0.225 0.127 76  -1.770 0.0807 
##  High effect    0.225 0.127 76   1.770 0.0807 
## 
## P value adjustment: fdr method for 2 tests