Purpose
Factorial ANOVA refers to a special case of the general linear model in which there is an interaction of two or more categorical variables (i.e. factors). A factorial design is used when there is an interest in how two or more variables (or factors) affect some outcomes variable. Rather than conduct separate one-way ANOVAs for each factor, they are all included in one analysis. Today we will review how to run factorial ANOVA models in R and how to interpret and visualize the results.
- Research scenario
- Tables of means
- Running the model
- Plotting
- Simple effects
Be sure to have the following packages installed and loaded:
library(tidyverse) # for plotting and data wrangling
library(rio) # for importing data
library(psych) # for descriptives
library(lsr) # for eta squared functions
library(emmeans) # for marginal means and simple effects
library(sjPlot) # for plotting model results
library(apaTables) # for tables of means
library(car) # for testing model assumptions
library(broom) # for tidying model output
Research scenario
- Based on subjects’ self-reports of rejection sensitivity (N = 80), a researcher divides subjects into two equal groups (low RS and high RS). Whereas half of the subjects in each group interact with a partner who displays a happy emotional expression during the interaction, the other half of the subjects in each group interact with a partner who displays a neutral emotional expression during the interaction. After the interaction, subjects are asked to rate the statement, “My interaction partner likes me”, on a scale from 1 (strongly disagree) to 7 (strongly agree).
Question: What type of factorial ANOVA design is this?
Data
reject <- import("https://raw.githubusercontent.com/uopsych/psy612/master/labs/lab-8/data/reject.csv")
Output
## 'data.frame': 80 obs. of 3 variables:
## $ rs : chr "Low" "Low" "Low" "Low" ...
## $ partner: chr "Neutral" "Neutral" "Neutral" "Neutral" ...
## $ liking : int 4 4 4 5 5 5 5 5 5 5 ...
- Look at the first few rows
- It looks like
rs
and partner
are both being read in as character variables. Let’s go ahead and change those to factors.
reject <- reject %>%
mutate(rs = as.factor(rs),
partner = as.factor(partner))
- Check the structure again. Notice that by default R orders factor levels alphabetically. In our case, this means that
High
will be the reference group of rejection sensitivity and Happy
will be the reference group of interaction partner’s emotional expression. However, it might be more intuitive to have Low
and Neutral
be the reference groups, respectively.
Output
## 'data.frame': 80 obs. of 3 variables:
## $ rs : Factor w/ 2 levels "High","Low": 2 2 2 2 2 2 2 2 2 2 ...
## $ partner: Factor w/ 2 levels "Happy","Neutral": 2 2 2 2 2 2 2 2 2 2 ...
## $ liking : int 4 4 4 5 5 5 5 5 5 5 ...
- To accomplish this, we can simply re-order the levels of our factor variables with
fct_relevel()
.
# manually specify order of factor levels
reject <- reject %>%
mutate(rs = fct_relevel(rs, c("Low", "High")),
partner = fct_relevel(partner, c("Neutral", "Happy")))
- To make sure this work, check the structure again.
Output
## 'data.frame': 80 obs. of 3 variables:
## $ rs : Factor w/ 2 levels "Low","High": 1 1 1 1 1 1 1 1 1 1 ...
## $ partner: Factor w/ 2 levels "Neutral","Happy": 1 1 1 1 1 1 1 1 1 1 ...
## $ liking : int 4 4 4 5 5 5 5 5 5 5 ...
- Do we have a balanced design?
Code
reject %>%
group_by(rs, partner) %>%
summarize(n = n())
Tables of means
Using summarize()
- We’ll create 4 tables of means:
- cell means (i.e., for each combination of the two factors)
- marginal means for interaction partner (averaged across levels of rejection sensitivity)
- marginal means for RS (averaged across levels of interaction partner)
- grand mean (averaged across both rejection sensitivity and interaction partner)
Cell Means
Code
reject %>%
group_by(rs, partner) %>%
summarize(mean = mean(liking, na.rm = TRUE),
sd = sd(liking, na.rm = TRUE)) %>%
knitr::kable(digits = c(NA, NA, 2, 2, 2),
caption = "Cell Means & SD")
Output
Cell Means & SD
Low |
Neutral |
5.55 |
1.05 |
Low |
Happy |
6.00 |
0.73 |
High |
Neutral |
1.80 |
0.70 |
High |
Happy |
6.45 |
0.69 |
Marginal Means: Interaction Partner
Code
reject %>%
group_by(partner) %>% # instead of grouping by both factors, we just group by interaction partner
summarize(mean = mean(liking, na.rm = TRUE),
sd = sd(liking, na.rm = TRUE)) %>%
knitr::kable(digits = c(NA, 2, 2, 2),
caption = "Marginal Means & SD for Interaction Partner")
Output
Marginal Means & SD for Interaction Partner
Neutral |
3.67 |
2.09 |
Happy |
6.22 |
0.73 |
Marginal Means: Rejection Sensitivity
Code
reject %>%
group_by(rs) %>% # just group by rs for this one
summarize(mean = mean(liking, na.rm = TRUE),
sd = sd(liking, na.rm = TRUE)) %>%
knitr::kable(digits = c(NA, 2, 2, 2),
caption = "Marginal Means & SD for Rejection Sensitivity")
Output
Marginal Means & SD for Rejection Sensitivity
Low |
5.78 |
0.92 |
High |
4.12 |
2.45 |
Grand Mean
Code
reject %>% # note that we don't need group_by for this one
summarize(mean = mean(liking, na.rm = TRUE),
sd = sd(liking, na.rm = TRUE)) %>%
knitr::kable(digits = c(2, 2, 2),
caption = "Grand Mean")
An easier way
- The
apa.2way.table()
function from {apaTables}
is a much more convenient way to get our cell means and marginal means. This function works for any type of 2-way ANOVA, regardless of the number of levels of your factors, e.g. it would work for a 3 x 3 ANOVA. All you need to do is indicate what the IV’s and DV are and specify show.marginal.means = TRUE
.
Code
apa.2way.table(iv1 = rs,
iv2 = partner,
dv = liking,
data = reject,
show.marginal.means = TRUE)
Output
##
##
## Means and standard deviations for liking as a function of a 2(rs) X 2(partner) design
##
## partner
## Neutral Happy Marginal
## rs M SD M SD M SD
## Low 5.55 1.05 6.00 0.73 5.78 0.92
## High 1.80 0.70 6.45 0.69 4.12 2.45
## Marginal 3.67 2.09 6.22 0.73
##
## Note. M and SD represent mean and standard deviation, respectively.
## Marginal indicates the means and standard deviations pertaining to main effects.
- An added plus is that you can easily export these tables to word (see
?apa.2way.table
).
Code
apa.2way.table(iv1 = rs,
iv2 = partner,
dv = liking,
data = reject,
show.marginal.means = TRUE,
table.number = 1,
filename = here::here("/labs/lab-8/images/means.doc")) # example file path
Output
- Remember, a main effect is the effect of one IV on the DV completely ignoring the other variable(s).
Question: Which means are being compared in the main effect of rejection sensitivity?
Question: Which means are being compared in the main effect of interaction partner?
Question: Which means are involved in the interaction?
Running the model
- Factorial ANOVA is the method by which we can examine whether two (or more) categorical IVs have joint effects on a continuous outcome of interest. Like all general linear models, factorial ANOVA is a specific case of multiple regression. However, we may choose to use an ANOVA framework for the sake of interpretability.
- We can specify the factorial ANOVA model using
lm()
the same way we specified
model <- lm(liking ~ rs * partner, data = reject)
- We can look at the regression coefficients from the model. However, in the case of factorial ANOVA, these are less useful to interpret when thinking about main effects.
Output
##
## Call:
## lm(formula = liking ~ rs * partner, data = reject)
##
## Residuals:
## Min 1Q Median 3Q Max
## -1.55 -0.55 0.00 0.55 1.45
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 5.5500 0.1797 30.88 <0.0000000000000002 ***
## rsHigh -3.7500 0.2542 -14.75 <0.0000000000000002 ***
## partnerHappy 0.4500 0.2542 1.77 0.0807 .
## rsHigh:partnerHappy 4.2000 0.3595 11.68 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 0.8038 on 76 degrees of freedom
## Multiple R-squared: 0.8474, Adjusted R-squared: 0.8414
## F-statistic: 140.7 on 3 and 76 DF, p-value: < 0.00000000000000022
ANOVA table
- Instead, we will focus on the ANOVA table output. We can get this using
anova()
.
Question: Is there a main effect of rejection sensitivity on perceived liking?
Question: Is there a main effect of interaction partner on perceived liking?
Question: Is there an interaction between rejection sensitivity and interaction partner?
- We can tidy the ANOVA table output with
broom::tidy()
Code
anova(model) %>%
broom::tidy()
apaTables::apa.aov.table()
will generate an APA-formatted comprehensive ANOVA table for you
Code
# might need to have the {MBESS} package installed to use this function
# install.packages("MBESS")
apa.aov.table(lm_output = model,
conf.level = .95)
Output
##
##
## ANOVA results using liking as the dependent variable
##
##
## Predictor SS df MS F p partial_eta2 CI_95_partial_eta2
## (Intercept) 616.05 1 616.05 953.56 .000
## rs 140.62 1 140.62 217.67 .000 .74 [.64, .80]
## partner 2.03 1 2.03 3.13 .081 .04 [.00, .15]
## rs x partner 88.20 1 88.20 136.52 .000 .64 [.51, .73]
## Error 49.10 76 0.65
##
## Note: Values in square brackets indicate the bounds of the 95% confidence interval for partial eta-squared
- This also allows you to output tables to Word.
Code
apa.aov.table(lm_output = model,
conf.level = .95,
table.number = 2,
filename = here::here("/labs/lab-8/images/anova_table.doc"))
Output
Checking assumptions
- You can check the assumptions of the factorial ANOVA in much the same way you check them for multiple regression; but given the categorical nature of the predictors, some assumptions are easier to check.
- Homogeneity of variance, for example, can be tested using Levene’s test, instead of examining a plot.
Code
car::leveneTest(liking ~ rs * partner, data = reject)
Question: Have we met the homogeneity of variance assumption?
Effect size
- All of the effects in the ANOVA are statistically significant, but how big are they? An effect size, \(\eta^2\), provides a simple way of indexing effect magnitude for ANOVA designs. You can think of it the same way as \(R^2\) (proportion of variance explained).
- To calculate \(\eta^2\), we’ll use
etaSquared
from {lsr}
(the companion package to your textbook).
Code
lsr::etaSquared(model)
Output
## eta.sq eta.sq.part
## rs 0.1692045 0.5258329
## partner 0.4041330 0.7259280
## rs:partner 0.2740833 0.6423889
Question: What does partial eta squared represent? Why might be prefer partial eta squared over eta squared?
Plotting
Main effects
Main effect of rejection sensitivity
- Remember that main effects correspond to differences in margial means. To plot main effects, we can use
sjPlot::plot_model()
. As usual, the first argument will be the fitted model object. Because we want to plot marginal means, we specify type = emm
(which stands for “estimated marginal means”). Lastly, because we want to plot the main effect for just rejection sensitivty, we will only specify terms = "rs"
.
Code
# main effect of rejection sensitivity
plot_model(model, type = "emm", terms = "rs")
Output
## NOTE: Results may be misleading due to involvement in interactions
- Note the warning when plotting this main effect: “Results may be misleading due to involvement in interactions”. As we’ve already discussed, we have to be careful in interpreting main effects when there is also a significant interaction. For example, in this case, we would say that there clearly is a main effect of rejection sensitivity (mean liking is higher for low compared to high rejection sensitivity), BUT this effect depends on the emotional expression of the interaction partner because the interaction between these variables is significant.
Main effect of interaction partner
- Now, to plot the main effect of interaction partner, all we need to change is
terms = "partner"
.
Code
# main effect of rejection sensitivity
plot_model(model, type = "emm", terms = "partner")
Output
## NOTE: Results may be misleading due to involvement in interactions
Interaction
- Now, to visualize the interaction between rejection sensitivity and interaction partner, we will still use
plot_model
, but instead of plotting estimated marginal means, we will tell it to plot predicted values using type = "pred"
. Since the interaction involves both variables, we will also specify terms = c("rs", "partner")
.
Code
# plot the interaction
plot_model(model, type = "pred", terms = c("rs", "partner"))
Output
- Switch how the interaction is visualized by switching the order of
terms
.
Code
# switch the order of the terms
plot_model(model, type = "pred", terms = c("partner", "rs"))
Output
- The following ggplot code will generate a bar plot version of the interaction plot. The
emmeans()
function at the beginning calculates cell means in order to plot them (more on this below).
Code
emmeans(model, specs = c("partner", "rs")) %>%
broom::tidy() %>%
ggplot(aes(x = rs, y = estimate, fill = partner)) +
geom_bar(stat = "identity",
position = "dodge",
alpha = 0.7) +
geom_errorbar(aes(ymin = conf.low, ymax = conf.high),
width = 0.20,
position = position_dodge(0.9)) +
labs(title = "Effects of Rejection Sensitivity and Interaction Partner Expression \non Perceived Liking",
x = "Rejection Sensitivity",
y = "Perceived liking") +
guides(fill=guide_legend(title = "Partner's Expression")) +
theme_minimal()
Output
Simple effects
- Since we had a significant interaction above, we should look at the simple main effects. Recall that simple effects are the effect of some factor (e.g., interaction partner’s expression) at each level of another factor (e.g., at high and low RS separately).
- We’ll look at the simple effect of interaction partner having a neutral vs. happy expression on perceived liking at different levels of rejection sensitivity. We’ll use the
{emmeans}
package, which Sara will go over more in lecture next week.
- To get simple effects, we combine the
emmeans()
function with the contrast()
function (both from {emmeans}
). emmeans()
works by passing it a model and then specifying which variables you’re looking at. Then, we pass that along to contrast()
, which can give us a variety of different contrasts. If we want simple effects for interaction partner expression at each level of rejection sensitivity we can use by = "rs"
and simple = "partner"
.
Code
model %>% emmeans(specs = c("partner", "rs")) %>% # specify our two factors
contrast(by = "rs", # by is the variable we are looking at each level of
simple = "partner") # simple is what we want simple effects of.
Output
## rs = Low:
## contrast estimate SE df t.ratio p.value
## Neutral effect -0.225 0.127 76 -1.770 0.0807
## Happy effect 0.225 0.127 76 1.770 0.0807
##
## rs = High:
## contrast estimate SE df t.ratio p.value
## Neutral effect -2.325 0.127 76 -18.294 <.0001
## Happy effect 2.325 0.127 76 18.294 <.0001
##
## P value adjustment: fdr method for 2 tests
- The output has two lines per simple effect, which just shows you what it looks like in either direction (note that they are equivalent), so you would just pick the direction that makes the most sense.
- Now let’s look at simple effects for rejection sensitivity at each level of partner expression.
Code
model %>% emmeans(specs = c("partner", "rs")) %>% # specify our two factors
contrast(by = "partner", # by is the variable we are looking at each level of
simple = "rs") # simple is what we want simple effects of.
Output
## partner = Neutral:
## contrast estimate SE df t.ratio p.value
## Low effect 1.875 0.127 76 14.754 <.0001
## High effect -1.875 0.127 76 -14.754 <.0001
##
## partner = Happy:
## contrast estimate SE df t.ratio p.value
## Low effect -0.225 0.127 76 -1.770 0.0807
## High effect 0.225 0.127 76 1.770 0.0807
##
## P value adjustment: fdr method for 2 tests