Purpose

Factorial ANOVA refers to a special case of the general linear model in which there is an interaction of two or more categorical variables (i.e. factors). A factorial design is used when there is an interest in how two or more variables (or factors) affect some outcomes variable. Rather than conduct separate one-way ANOVAs for each factor, they are all included in one analysis. Today we will review how to run factorial ANOVA models in R and how to interpret and visualize the results.

Research scenario
Tables of means
Running the model
Plotting
Simple effects

Be sure to have the following packages installed and loaded:

library(tidyverse) # for plotting and data wrangling
library(rio) # for importing data
library(psych) # for descriptives
library(lsr) # for eta squared functions
library(emmeans) # for marginal means and simple effects
library(sjPlot) # for plotting model results 
library(apaTables) # for tables of means
library(car) # for testing model assumptions
library(broom) # for tidying model output

Research scenario

Based on subjects’ self-reports of rejection sensitivity (N = 80), a researcher divides subjects into two equal groups (low RS and high RS). Whereas half of the subjects in each group interact with a partner who displays a happy emotional expression during the interaction, the other half of the subjects in each group interact with a partner who displays a neutral emotional expression during the interaction. After the interaction, subjects are asked to rate the statement, “My interaction partner likes me”, on a scale from 1 (strongly disagree) to 7 (strongly agree).

Question: What type of factorial ANOVA design is this?

Data

Import the data

reject <- import("https://raw.githubusercontent.com/uopsych/psy612/master/labs/lab-8/data/reject.csv")

Check out the data

Code

str(reject)

Output

## 'data.frame':    80 obs. of  3 variables:
##  $ rs     : chr  "Low" "Low" "Low" "Low" ...
##  $ partner: chr  "Neutral" "Neutral" "Neutral" "Neutral" ...
##  $ liking : int  4 4 4 5 5 5 5 5 5 5 ...

Look at the first few rows

Code

head(reject)

Output

It looks like rs and partner are both being read in as character variables. Let’s go ahead and change those to factors.

reject <- reject %>% 
  mutate(rs = as.factor(rs),
         partner = as.factor(partner))

Check the structure again. Notice that by default R orders factor levels alphabetically. In our case, this means that High will be the reference group of rejection sensitivity and Happy will be the reference group of interaction partner’s emotional expression. However, it might be more intuitive to have Low and Neutral be the reference groups, respectively.

Code

str(reject)

Output

## 'data.frame':    80 obs. of  3 variables:
##  $ rs     : Factor w/ 2 levels "High","Low": 2 2 2 2 2 2 2 2 2 2 ...
##  $ partner: Factor w/ 2 levels "Happy","Neutral": 2 2 2 2 2 2 2 2 2 2 ...
##  $ liking : int  4 4 4 5 5 5 5 5 5 5 ...

To accomplish this, we can simply re-order the levels of our factor variables with fct_relevel().

# manually specify order of factor levels
reject <- reject %>% 
  mutate(rs = fct_relevel(rs, c("Low", "High")), 
         partner = fct_relevel(partner, c("Neutral", "Happy")))

To make sure this work, check the structure again.

Code

str(reject)

Output

## 'data.frame':    80 obs. of  3 variables:
##  $ rs     : Factor w/ 2 levels "Low","High": 1 1 1 1 1 1 1 1 1 1 ...
##  $ partner: Factor w/ 2 levels "Neutral","Happy": 1 1 1 1 1 1 1 1 1 1 ...
##  $ liking : int  4 4 4 5 5 5 5 5 5 5 ...

Do we have a balanced design?

Code

reject %>%
  group_by(rs, partner) %>% 
  summarize(n = n())

Output

Tables of means

Using `summarize()`

We’ll create 4 tables of means:

cell means (i.e., for each combination of the two factors)
marginal means for interaction partner (averaged across levels of rejection sensitivity)
marginal means for RS (averaged across levels of interaction partner)
grand mean (averaged across both rejection sensitivity and interaction partner)

Cell Means

Code

reject %>%
  group_by(rs, partner) %>% 
  summarize(mean = mean(liking, na.rm = TRUE),
            sd = sd(liking, na.rm = TRUE)) %>% 
knitr::kable(digits = c(NA, NA, 2, 2, 2),
             caption = "Cell Means & SD")

Output

Cell Means & SD
rs	partner	mean	sd
Low	Neutral	5.55	1.05
Low	Happy	6.00	0.73
High	Neutral	1.80	0.70
High	Happy	6.45	0.69

Marginal Means: Interaction Partner

Code

reject %>%
  group_by(partner) %>% # instead of grouping by both factors, we just group by interaction partner
  summarize(mean = mean(liking, na.rm = TRUE),
            sd = sd(liking, na.rm = TRUE)) %>% 
knitr::kable(digits = c(NA, 2, 2, 2),
             caption = "Marginal Means & SD for Interaction Partner")

Output

Marginal Means & SD for Interaction Partner
partner	mean	sd
Neutral	3.67	2.09
Happy	6.22	0.73

Marginal Means: Rejection Sensitivity

Code

reject %>%
  group_by(rs) %>% # just group by rs for this one
  summarize(mean = mean(liking, na.rm = TRUE),
            sd = sd(liking, na.rm = TRUE)) %>% 
knitr::kable(digits = c(NA, 2, 2, 2),
             caption = "Marginal Means & SD for Rejection Sensitivity")

Output

Marginal Means & SD for Rejection Sensitivity
rs	mean	sd
Low	5.78	0.92
High	4.12	2.45

Grand Mean

Code

reject %>% # note that we don't need group_by for this one
  summarize(mean = mean(liking, na.rm = TRUE),
            sd = sd(liking, na.rm = TRUE)) %>% 
knitr::kable(digits = c(2, 2, 2),
             caption = "Grand Mean")

Output

Grand Mean
mean	sd
4.95	2.02

An easier way

The apa.2way.table() function from {apaTables} is a much more convenient way to get our cell means and marginal means. This function works for any type of 2-way ANOVA, regardless of the number of levels of your factors, e.g. it would work for a 3 x 3 ANOVA. All you need to do is indicate what the IV’s and DV are and specify show.marginal.means = TRUE.

Code

apa.2way.table(iv1 = rs, 
               iv2 = partner, 
               dv = liking, 
               data = reject,
               show.marginal.means = TRUE)

Output

## 
## 
## Means and standard deviations for liking as a function of a 2(rs) X 2(partner) design 
## 
##           partner                              
##           Neutral      Happy      Marginal     
##        rs       M   SD     M   SD        M   SD
##       Low    5.55 1.05  6.00 0.73     5.78 0.92
##      High    1.80 0.70  6.45 0.69     4.12 2.45
##  Marginal    3.67 2.09  6.22 0.73              
## 
## Note. M and SD represent mean and standard deviation, respectively. 
## Marginal indicates the means and standard deviations pertaining to main effects.

An added plus is that you can easily export these tables to word (see ?apa.2way.table).

Code

apa.2way.table(iv1 = rs,
               iv2 = partner,
               dv = liking,
               data = reject,
               show.marginal.means = TRUE,
               table.number = 1,
               filename = here::here("/labs/lab-8/images/means.doc")) # example file path

Output

Remember, a main effect is the effect of one IV on the DV completely ignoring the other variable(s).

Question: Which means are being compared in the main effect of rejection sensitivity?

Question: Which means are being compared in the main effect of interaction partner?

Question: Which means are involved in the interaction?

Running the model

Factorial ANOVA is the method by which we can examine whether two (or more) categorical IVs have joint effects on a continuous outcome of interest. Like all general linear models, factorial ANOVA is a specific case of multiple regression. However, we may choose to use an ANOVA framework for the sake of interpretability.

We can specify the factorial ANOVA model using lm() the same way we specified

model <- lm(liking ~ rs * partner, data = reject)

We can look at the regression coefficients from the model. However, in the case of factorial ANOVA, these are less useful to interpret when thinking about main effects.

Code

summary(model)

Output

## 
## Call:
## lm(formula = liking ~ rs * partner, data = reject)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
##  -1.55  -0.55   0.00   0.55   1.45 
## 
## Coefficients:
##                     Estimate Std. Error t value            Pr(>|t|)    
## (Intercept)           5.5500     0.1797   30.88 <0.0000000000000002 ***
## rsHigh               -3.7500     0.2542  -14.75 <0.0000000000000002 ***
## partnerHappy          0.4500     0.2542    1.77              0.0807 .  
## rsHigh:partnerHappy   4.2000     0.3595   11.68 <0.0000000000000002 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.8038 on 76 degrees of freedom
## Multiple R-squared:  0.8474, Adjusted R-squared:  0.8414 
## F-statistic: 140.7 on 3 and 76 DF,  p-value: < 0.00000000000000022

ANOVA table

Instead, we will focus on the ANOVA table output. We can get this using anova().

Code

anova(model)

Output

Question: Is there a main effect of rejection sensitivity on perceived liking?

Question: Is there a main effect of interaction partner on perceived liking?

Question: Is there an interaction between rejection sensitivity and interaction partner?

We can tidy the ANOVA table output with broom::tidy()

Code

anova(model) %>% 
  broom::tidy()

Output

apaTables::apa.aov.table() will generate an APA-formatted comprehensive ANOVA table for you

Code

# might need to have the {MBESS} package installed to use this function
# install.packages("MBESS")

apa.aov.table(lm_output = model,
              conf.level = .95)

Output

## 
## 
## ANOVA results using liking as the dependent variable
##  
## 
##     Predictor     SS df     MS      F    p partial_eta2 CI_95_partial_eta2
##   (Intercept) 616.05  1 616.05 953.56 .000                                
##            rs 140.62  1 140.62 217.67 .000          .74         [.64, .80]
##       partner   2.03  1   2.03   3.13 .081          .04         [.00, .15]
##  rs x partner  88.20  1  88.20 136.52 .000          .64         [.51, .73]
##         Error  49.10 76   0.65                                            
## 
## Note: Values in square brackets indicate the bounds of the 95% confidence interval for partial eta-squared

This also allows you to output tables to Word.

Code

apa.aov.table(lm_output = model,
              conf.level = .95,
              table.number = 2,
              filename = here::here("/labs/lab-8/images/anova_table.doc"))

Output

Checking assumptions

You can check the assumptions of the factorial ANOVA in much the same way you check them for multiple regression; but given the categorical nature of the predictors, some assumptions are easier to check.

Homogeneity of variance, for example, can be tested using Levene’s test, instead of examining a plot.

Code

car::leveneTest(liking ~ rs * partner, data = reject)

Output

Question: Have we met the homogeneity of variance assumption?

Effect size

All of the effects in the ANOVA are statistically significant, but how big are they? An effect size, \(\eta^2\), provides a simple way of indexing effect magnitude for ANOVA designs. You can think of it the same way as \(R^2\) (proportion of variance explained).

To calculate \(\eta^2\), we’ll use etaSquared from {lsr} (the companion package to your textbook).

Code

lsr::etaSquared(model)

Output

##               eta.sq eta.sq.part
## rs         0.1692045   0.5258329
## partner    0.4041330   0.7259280
## rs:partner 0.2740833   0.6423889

Question: What does partial eta squared represent? Why might be prefer partial eta squared over eta squared?

Plotting

Main effects

Main effect of rejection sensitivity

Remember that main effects correspond to differences in margial means. To plot main effects, we can use sjPlot::plot_model(). As usual, the first argument will be the fitted model object. Because we want to plot marginal means, we specify type = emm (which stands for “estimated marginal means”). Lastly, because we want to plot the main effect for just rejection sensitivty, we will only specify terms = "rs".

Code

# main effect of rejection sensitivity
plot_model(model, type = "emm", terms = "rs")

Output

## NOTE: Results may be misleading due to involvement in interactions

Note the warning when plotting this main effect: “Results may be misleading due to involvement in interactions”. As we’ve already discussed, we have to be careful in interpreting main effects when there is also a significant interaction. For example, in this case, we would say that there clearly is a main effect of rejection sensitivity (mean liking is higher for low compared to high rejection sensitivity), BUT this effect depends on the emotional expression of the interaction partner because the interaction between these variables is significant.

Main effect of interaction partner

Now, to plot the main effect of interaction partner, all we need to change is terms = "partner".

Code

# main effect of rejection sensitivity
plot_model(model, type = "emm", terms = "partner")

Output

## NOTE: Results may be misleading due to involvement in interactions

Interaction

Now, to visualize the interaction between rejection sensitivity and interaction partner, we will still use plot_model, but instead of plotting estimated marginal means, we will tell it to plot predicted values using type = "pred". Since the interaction involves both variables, we will also specify terms = c("rs", "partner").

Code

# plot the interaction 
plot_model(model, type = "pred", terms = c("rs", "partner"))

Output

Switch how the interaction is visualized by switching the order of terms.

Code

# switch the order of the terms
plot_model(model, type = "pred", terms = c("partner", "rs"))

Output

The following ggplot code will generate a bar plot version of the interaction plot. The emmeans() function at the beginning calculates cell means in order to plot them (more on this below).

Code

emmeans(model, specs = c("partner", "rs")) %>% 
  broom::tidy() %>% 
  ggplot(aes(x = rs, y = estimate, fill = partner)) + 
  geom_bar(stat = "identity", 
           position = "dodge", 
           alpha = 0.7) + 
  geom_errorbar(aes(ymin = conf.low, ymax = conf.high), 
                width = 0.20, 
                position = position_dodge(0.9)) +
  labs(title = "Effects of Rejection Sensitivity and Interaction Partner Expression \non Perceived Liking",
       x = "Rejection Sensitivity",
       y = "Perceived liking") +
  guides(fill=guide_legend(title = "Partner's Expression")) + 
  theme_minimal()

Output

Simple effects

Since we had a significant interaction above, we should look at the simple main effects. Recall that simple effects are the effect of some factor (e.g., interaction partner’s expression) at each level of another factor (e.g., at high and low RS separately).

We’ll look at the simple effect of interaction partner having a neutral vs. happy expression on perceived liking at different levels of rejection sensitivity. We’ll use the {emmeans} package, which Sara will go over more in lecture next week.

To get simple effects, we combine the emmeans() function with the contrast() function (both from {emmeans}). emmeans() works by passing it a model and then specifying which variables you’re looking at. Then, we pass that along to contrast(), which can give us a variety of different contrasts. If we want simple effects for interaction partner expression at each level of rejection sensitivity we can use by = "rs" and simple = "partner".

Code

model %>% emmeans(specs = c("partner", "rs")) %>% # specify our two factors
  contrast(by = "rs", # by is the variable we are looking at each level of
           simple = "partner") # simple is what we want simple effects of.

Output

## rs = Low:
##  contrast       estimate    SE df t.ratio p.value
##  Neutral effect   -0.225 0.127 76  -1.770 0.0807 
##  Happy effect      0.225 0.127 76   1.770 0.0807 
## 
## rs = High:
##  contrast       estimate    SE df t.ratio p.value
##  Neutral effect   -2.325 0.127 76 -18.294 <.0001 
##  Happy effect      2.325 0.127 76  18.294 <.0001 
## 
## P value adjustment: fdr method for 2 tests

The output has two lines per simple effect, which just shows you what it looks like in either direction (note that they are equivalent), so you would just pick the direction that makes the most sense.

Now let’s look at simple effects for rejection sensitivity at each level of partner expression.

Code

model %>% emmeans(specs = c("partner", "rs")) %>% # specify our two factors
  contrast(by = "partner", # by is the variable we are looking at each level of
           simple = "rs") # simple is what we want simple effects of.

Output

## partner = Neutral:
##  contrast    estimate    SE df t.ratio p.value
##  Low effect     1.875 0.127 76  14.754 <.0001 
##  High effect   -1.875 0.127 76 -14.754 <.0001 
## 
## partner = Happy:
##  contrast    estimate    SE df t.ratio p.value
##  Low effect    -0.225 0.127 76  -1.770 0.0807 
##  High effect    0.225 0.127 76   1.770 0.0807 
## 
## P value adjustment: fdr method for 2 tests

Lab 8: Factorial ANOVA

Purpose

Research scenario

Data

Code

Output

Code

Output

Code

Output

Code

Output

Code

Output

Tables of means

Using summarize()

Cell Means

Code

Output

Marginal Means: Interaction Partner

Code

Output

Marginal Means: Rejection Sensitivity

Code

Output

Grand Mean

Code

Output

An easier way

Code

Output

Code

Output

Running the model

Code

Output

ANOVA table

Code

Output

Code

Output

Code

Output

Code

Output

Checking assumptions

Code

Output

Effect size

Code

Output

Plotting

Main effects

Main effect of rejection sensitivity

Code

Output

Main effect of interaction partner

Code

Output

Interaction

Code

Output

Code

Output

Code

Output

Simple effects

Code

Output

Code

Output

Using `summarize()`