class: center, middle, inverse, title-slide # Correlations --- ## Recap Correlations are: - Standardized covariances + Range from -1 to 1 - an effect size + Measure of the strength of association between two continuous variables --- ### Recap: testing the significance of a correlation .pull-left[ If the null hypothesis is the **nil hypothesis**: - test significance using a *t*-distribution, where `$$\large t = \frac{r}{SE_r}$$` `$$\large SE_r = \sqrt{\frac{1-r^2}{N-2}}$$` `$$DF = N-2$$` ] .pull-right[ If null hypothesis is not 0 `\((\text{e.g., }H_0:\rho_{xy} = .40)\)` - Transform statistic and null using Fisher's r to Z `$$\large z^{'} = {\frac{1}{2}}ln{\frac{1+r}{1-r}}$$` `$$\large SE = \frac{1}{\sqrt{N-3}}$$` ] --- ### Example In PSY 302, the correlation between midterm exam grades and final exam grades was .56. The class size was 104. Is this statistically significant? -- ### Using t-method `$$\large SE_r = \sqrt{\frac{1-r^2}{N-2}} = \sqrt{\frac{1-.56^2}{104-2}} = 0.08$$` `$$\large t = \frac{r}{SE_r} = \frac{0.56}{0.08} = 6.83$$` --- .left-column[ Probability of getting a *t* statistic of 6.83 or greater is 3.1921793\times 10^{-10}. Probability of getting *t* statistic of 6.83 or more extreme is 6.3843586\times 10^{-10}. ] ![](2-correlation_files/figure-html/unnamed-chunk-2-1.png)<!-- --> --- ### Example In PSY 302, the correlation between midterm exam grades and final exam grades was .56. The class size was 104. Is this statistically significantly different from .40? -- `$$\large z^{'} = {\frac{1}{2}}ln{\frac{1+r}{1-r}}= {\frac{1}{2}}ln{\frac{1+0.56}{1-0.56}} = 0.63$$` `$$\large z^{'}_{H_0} = {\frac{1}{2}}ln{\frac{1+r}{1-r}}= {\frac{1}{2}}ln{\frac{1+0.4}{1-0.4}} = 0.42$$` $$ SE_z = \frac{1}{\sqrt{104-3}} = 0.1$$ --- `$$Z_{\text{statistic}} = \frac{z'-\mu}{SE_z}=\frac{0.63-0.42}{0.1} = 2.1$$` ```r stat ``` ``` ## [1] 2.102276 ``` ```r pnorm(stat, lower.tail = F) ``` ``` ## [1] 0.01776456 ``` ```r pnorm(stat, lower.tail = F)*2 ``` ``` ## [1] 0.03552913 ``` --- ## Today - correlation matrices - visualizing correlations - reliability --- ## Correlation matrices Correlations are both a descriptive and an inferential statistic. As a descriptive statistic, they're useful for understanding what's going on in a larger dataset. Like we use the `summary()` or `describe()` (psych) functions to examine our dataset _before we run any infernetial tests_, we should also look at the correlation matrix. --- ```r library(psych) ``` ``` ## ## Attaching package: 'psych' ``` ``` ## The following objects are masked from 'package:ggplot2': ## ## %+%, alpha ``` ```r data(bfi) head(bfi) ``` ``` ## A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O2 O3 O4 ## 61617 2 4 3 4 4 2 3 3 4 4 3 3 3 4 4 3 4 2 2 3 3 6 3 4 ## 61618 2 4 5 2 5 5 4 4 3 4 1 1 6 4 3 3 3 3 5 5 4 2 4 3 ## 61620 5 4 5 4 4 4 5 4 2 5 2 4 4 4 5 4 5 4 2 3 4 2 5 5 ## 61621 4 4 6 5 5 4 4 3 5 5 5 3 4 4 4 2 5 2 4 1 3 3 4 3 ## 61622 2 3 3 4 5 4 4 5 3 2 2 2 5 4 5 2 3 4 4 3 3 3 4 3 ## 61623 6 6 5 6 5 6 6 6 1 3 2 1 6 5 6 3 5 2 2 3 4 3 5 6 ## O5 gender education age ## 61617 3 1 NA 16 ## 61618 3 2 NA 18 ## 61620 2 2 NA 17 ## 61621 5 2 NA 17 ## 61622 3 1 NA 17 ## 61623 1 2 3 21 ``` --- ```r cor(bfi) ``` ``` ## A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 ## A1 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## A2 NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## A3 NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## A4 NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## A5 NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## C1 NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## C2 NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## C3 NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA NA ## C4 NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA NA ## C5 NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA NA ## E1 NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA NA ## E2 NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA NA ## E3 NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA NA ## E4 NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA NA ## E5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA NA ## N1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA NA ## N2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA NA ## N3 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA NA ## N4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA NA ## N5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 NA ## O1 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA 1 ## O2 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## O3 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## O4 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## O5 NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## gender NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## education NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## age NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA ## O2 O3 O4 O5 gender education age ## A1 NA NA NA NA NA NA NA ## A2 NA NA NA NA NA NA NA ## A3 NA NA NA NA NA NA NA ## A4 NA NA NA NA NA NA NA ## A5 NA NA NA NA NA NA NA ## C1 NA NA NA NA NA NA NA ## C2 NA NA NA NA NA NA NA ## C3 NA NA NA NA NA NA NA ## C4 NA NA NA NA NA NA NA ## C5 NA NA NA NA NA NA NA ## E1 NA NA NA NA NA NA NA ## E2 NA NA NA NA NA NA NA ## E3 NA NA NA NA NA NA NA ## E4 NA NA NA NA NA NA NA ## E5 NA NA NA NA NA NA NA ## N1 NA NA NA NA NA NA NA ## N2 NA NA NA NA NA NA NA ## N3 NA NA NA NA NA NA NA ## N4 NA NA NA NA NA NA NA ## N5 NA NA NA NA NA NA NA ## O1 NA NA NA NA NA NA NA ## O2 1.00000000 NA NA NA 0.02694778 NA -0.04254386 ## O3 NA 1 NA NA NA NA NA ## O4 NA NA 1 NA NA NA NA ## O5 NA NA NA 1 NA NA NA ## gender 0.02694778 NA NA NA 1.00000000 NA 0.04770347 ## education NA NA NA NA NA 1 NA ## age -0.04254386 NA NA NA 0.04770347 NA 1.00000000 ``` --- ```r round(cor(bfi, use = "pairwise"),2) ``` ``` ## A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 ## A1 1.00 -0.34 -0.27 -0.15 -0.18 0.03 0.02 -0.02 0.13 0.05 0.11 ## A2 -0.34 1.00 0.49 0.34 0.39 0.09 0.14 0.19 -0.15 -0.12 -0.21 ## A3 -0.27 0.49 1.00 0.36 0.50 0.10 0.14 0.13 -0.12 -0.16 -0.21 ## A4 -0.15 0.34 0.36 1.00 0.31 0.09 0.23 0.13 -0.15 -0.24 -0.11 ## A5 -0.18 0.39 0.50 0.31 1.00 0.12 0.11 0.13 -0.13 -0.17 -0.25 ## C1 0.03 0.09 0.10 0.09 0.12 1.00 0.43 0.31 -0.34 -0.25 -0.02 ## C2 0.02 0.14 0.14 0.23 0.11 0.43 1.00 0.36 -0.38 -0.30 0.02 ## C3 -0.02 0.19 0.13 0.13 0.13 0.31 0.36 1.00 -0.34 -0.34 0.00 ## C4 0.13 -0.15 -0.12 -0.15 -0.13 -0.34 -0.38 -0.34 1.00 0.48 0.09 ## C5 0.05 -0.12 -0.16 -0.24 -0.17 -0.25 -0.30 -0.34 0.48 1.00 0.06 ## E1 0.11 -0.21 -0.21 -0.11 -0.25 -0.02 0.02 0.00 0.09 0.06 1.00 ## E2 0.09 -0.23 -0.29 -0.19 -0.33 -0.09 -0.06 -0.08 0.20 0.26 0.47 ## E3 -0.05 0.25 0.39 0.19 0.42 0.12 0.15 0.09 -0.08 -0.16 -0.33 ## E4 -0.06 0.28 0.38 0.30 0.47 0.14 0.12 0.09 -0.11 -0.20 -0.42 ## E5 -0.02 0.29 0.25 0.16 0.27 0.25 0.25 0.21 -0.24 -0.23 -0.30 ## N1 0.17 -0.09 -0.08 -0.10 -0.20 -0.07 -0.02 -0.07 0.22 0.21 0.02 ## N2 0.14 -0.05 -0.09 -0.14 -0.19 -0.04 -0.01 -0.06 0.16 0.25 0.01 ## N3 0.10 -0.04 -0.04 -0.07 -0.14 -0.03 0.00 -0.07 0.21 0.24 0.05 ## N4 0.05 -0.09 -0.13 -0.17 -0.20 -0.10 -0.05 -0.11 0.26 0.34 0.23 ## N5 0.02 0.02 -0.04 -0.01 -0.08 -0.05 0.05 -0.01 0.20 0.17 0.05 ## O1 0.01 0.13 0.15 0.06 0.16 0.17 0.16 0.09 -0.09 -0.08 -0.10 ## O2 0.08 0.02 0.00 0.04 0.00 -0.11 -0.04 -0.03 0.21 0.14 0.04 ## O3 -0.06 0.16 0.22 0.07 0.24 0.19 0.19 0.06 -0.08 -0.08 -0.22 ## O4 -0.08 0.09 0.04 -0.04 0.02 0.11 0.06 0.02 0.05 0.14 0.08 ## O5 0.11 -0.09 -0.05 0.02 -0.05 -0.12 -0.05 -0.01 0.20 0.06 0.10 ## gender -0.16 0.18 0.14 0.13 0.10 0.01 0.07 0.05 -0.08 -0.09 -0.13 ## education -0.14 0.01 0.00 -0.02 0.01 0.03 0.00 0.05 -0.04 0.03 0.00 ## age -0.16 0.11 0.07 0.14 0.13 0.08 0.02 0.07 -0.15 -0.09 -0.03 ## E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O2 ## A1 0.09 -0.05 -0.06 -0.02 0.17 0.14 0.10 0.05 0.02 0.01 0.08 ## A2 -0.23 0.25 0.28 0.29 -0.09 -0.05 -0.04 -0.09 0.02 0.13 0.02 ## A3 -0.29 0.39 0.38 0.25 -0.08 -0.09 -0.04 -0.13 -0.04 0.15 0.00 ## A4 -0.19 0.19 0.30 0.16 -0.10 -0.14 -0.07 -0.17 -0.01 0.06 0.04 ## A5 -0.33 0.42 0.47 0.27 -0.20 -0.19 -0.14 -0.20 -0.08 0.16 0.00 ## C1 -0.09 0.12 0.14 0.25 -0.07 -0.04 -0.03 -0.10 -0.05 0.17 -0.11 ## C2 -0.06 0.15 0.12 0.25 -0.02 -0.01 0.00 -0.05 0.05 0.16 -0.04 ## C3 -0.08 0.09 0.09 0.21 -0.07 -0.06 -0.07 -0.11 -0.01 0.09 -0.03 ## C4 0.20 -0.08 -0.11 -0.24 0.22 0.16 0.21 0.26 0.20 -0.09 0.21 ## C5 0.26 -0.16 -0.20 -0.23 0.21 0.25 0.24 0.34 0.17 -0.08 0.14 ## E1 0.47 -0.33 -0.42 -0.30 0.02 0.01 0.05 0.23 0.05 -0.10 0.04 ## E2 1.00 -0.38 -0.51 -0.37 0.17 0.19 0.20 0.35 0.25 -0.16 0.08 ## E3 -0.38 1.00 0.42 0.38 -0.05 -0.07 -0.02 -0.15 -0.07 0.33 -0.07 ## E4 -0.51 0.42 1.00 0.32 -0.14 -0.14 -0.10 -0.29 -0.09 0.14 0.06 ## E5 -0.37 0.38 0.32 1.00 0.04 0.04 -0.06 -0.21 -0.13 0.30 -0.08 ## N1 0.17 -0.05 -0.14 0.04 1.00 0.71 0.56 0.40 0.38 -0.05 0.13 ## N2 0.19 -0.07 -0.14 0.04 0.71 1.00 0.55 0.39 0.35 -0.05 0.13 ## N3 0.20 -0.02 -0.10 -0.06 0.56 0.55 1.00 0.52 0.43 -0.03 0.11 ## N4 0.35 -0.15 -0.29 -0.21 0.40 0.39 0.52 1.00 0.40 -0.05 0.08 ## N5 0.25 -0.07 -0.09 -0.13 0.38 0.35 0.43 0.40 1.00 -0.12 0.20 ## O1 -0.16 0.33 0.14 0.30 -0.05 -0.05 -0.03 -0.05 -0.12 1.00 -0.21 ## O2 0.08 -0.07 0.06 -0.08 0.13 0.13 0.11 0.08 0.20 -0.21 1.00 ## O3 -0.23 0.39 0.21 0.29 -0.05 -0.03 -0.03 -0.06 -0.08 0.40 -0.26 ## O4 0.17 0.05 -0.10 0.00 0.08 0.13 0.18 0.21 0.11 0.18 -0.07 ## O5 0.08 -0.11 0.05 -0.11 0.11 0.04 0.06 0.04 0.14 -0.24 0.32 ## gender -0.05 0.05 0.08 0.07 0.04 0.10 0.12 0.00 0.21 -0.10 0.03 ## education -0.01 0.00 -0.04 0.06 -0.05 -0.05 -0.05 0.01 -0.05 0.03 -0.09 ## age -0.11 0.00 -0.01 0.11 -0.09 -0.10 -0.11 -0.03 -0.10 0.05 -0.04 ## O3 O4 O5 gender education age ## A1 -0.06 -0.08 0.11 -0.16 -0.14 -0.16 ## A2 0.16 0.09 -0.09 0.18 0.01 0.11 ## A3 0.22 0.04 -0.05 0.14 0.00 0.07 ## A4 0.07 -0.04 0.02 0.13 -0.02 0.14 ## A5 0.24 0.02 -0.05 0.10 0.01 0.13 ## C1 0.19 0.11 -0.12 0.01 0.03 0.08 ## C2 0.19 0.06 -0.05 0.07 0.00 0.02 ## C3 0.06 0.02 -0.01 0.05 0.05 0.07 ## C4 -0.08 0.05 0.20 -0.08 -0.04 -0.15 ## C5 -0.08 0.14 0.06 -0.09 0.03 -0.09 ## E1 -0.22 0.08 0.10 -0.13 0.00 -0.03 ## E2 -0.23 0.17 0.08 -0.05 -0.01 -0.11 ## E3 0.39 0.05 -0.11 0.05 0.00 0.00 ## E4 0.21 -0.10 0.05 0.08 -0.04 -0.01 ## E5 0.29 0.00 -0.11 0.07 0.06 0.11 ## N1 -0.05 0.08 0.11 0.04 -0.05 -0.09 ## N2 -0.03 0.13 0.04 0.10 -0.05 -0.10 ## N3 -0.03 0.18 0.06 0.12 -0.05 -0.11 ## N4 -0.06 0.21 0.04 0.00 0.01 -0.03 ## N5 -0.08 0.11 0.14 0.21 -0.05 -0.10 ## O1 0.40 0.18 -0.24 -0.10 0.03 0.05 ## O2 -0.26 -0.07 0.32 0.03 -0.09 -0.04 ## O3 1.00 0.19 -0.31 -0.04 0.09 0.04 ## O4 0.19 1.00 -0.18 0.00 0.05 0.01 ## O5 -0.31 -0.18 1.00 0.02 -0.06 -0.10 ## gender -0.04 0.00 0.02 1.00 0.01 0.05 ## education 0.09 0.05 -0.06 0.01 1.00 0.24 ## age 0.04 0.01 -0.10 0.05 0.24 1.00 ``` --- ```r round(cor(bfi, use = "complete"),2) ``` ``` ## A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 ## A1 1.00 -0.34 -0.26 -0.14 -0.19 0.02 0.01 -0.01 0.10 0.02 0.12 ## A2 -0.34 1.00 0.48 0.34 0.38 0.09 0.13 0.19 -0.14 -0.11 -0.24 ## A3 -0.26 0.48 1.00 0.38 0.50 0.10 0.14 0.13 -0.12 -0.15 -0.22 ## A4 -0.14 0.34 0.38 1.00 0.32 0.08 0.22 0.13 -0.16 -0.24 -0.14 ## A5 -0.19 0.38 0.50 0.32 1.00 0.12 0.11 0.13 -0.12 -0.16 -0.25 ## C1 0.02 0.09 0.10 0.08 0.12 1.00 0.43 0.32 -0.35 -0.25 -0.03 ## C2 0.01 0.13 0.14 0.22 0.11 0.43 1.00 0.36 -0.38 -0.30 0.02 ## C3 -0.01 0.19 0.13 0.13 0.13 0.32 0.36 1.00 -0.35 -0.35 -0.02 ## C4 0.10 -0.14 -0.12 -0.16 -0.12 -0.35 -0.38 -0.35 1.00 0.48 0.10 ## C5 0.02 -0.11 -0.15 -0.24 -0.16 -0.25 -0.30 -0.35 0.48 1.00 0.07 ## E1 0.12 -0.24 -0.22 -0.14 -0.25 -0.03 0.02 -0.02 0.10 0.07 1.00 ## E2 0.08 -0.24 -0.29 -0.20 -0.33 -0.10 -0.07 -0.09 0.21 0.26 0.47 ## E3 -0.04 0.25 0.38 0.20 0.41 0.13 0.15 0.10 -0.09 -0.17 -0.33 ## E4 -0.07 0.30 0.39 0.33 0.48 0.14 0.12 0.10 -0.12 -0.21 -0.42 ## E5 -0.02 0.30 0.26 0.16 0.27 0.26 0.25 0.22 -0.23 -0.24 -0.31 ## N1 0.16 -0.08 -0.07 -0.09 -0.19 -0.06 -0.02 -0.08 0.21 0.21 0.01 ## N2 0.13 -0.04 -0.08 -0.15 -0.19 -0.03 0.00 -0.06 0.15 0.24 0.01 ## N3 0.09 -0.02 -0.03 -0.07 -0.13 -0.01 0.01 -0.07 0.20 0.23 0.05 ## N4 0.04 -0.09 -0.13 -0.16 -0.21 -0.09 -0.04 -0.13 0.28 0.35 0.23 ## N5 0.01 0.02 -0.04 0.00 -0.08 -0.05 0.05 -0.04 0.21 0.18 0.04 ## O1 0.00 0.11 0.14 0.04 0.15 0.18 0.16 0.09 -0.10 -0.09 -0.10 ## O2 0.07 0.03 0.03 0.05 0.00 -0.13 -0.05 -0.03 0.21 0.12 0.06 ## O3 -0.06 0.15 0.22 0.04 0.22 0.19 0.18 0.06 -0.07 -0.07 -0.21 ## O4 -0.09 0.05 0.02 -0.06 0.00 0.08 0.03 0.00 0.07 0.14 0.08 ## O5 0.11 -0.08 -0.04 0.04 -0.04 -0.13 -0.06 0.00 0.18 0.05 0.09 ## gender -0.17 0.21 0.16 0.13 0.11 0.00 0.06 0.04 -0.07 -0.09 -0.15 ## education -0.14 0.02 0.00 -0.02 0.02 0.04 0.01 0.06 -0.04 0.04 0.00 ## age -0.14 0.09 0.04 0.11 0.10 0.08 0.00 0.05 -0.12 -0.07 -0.03 ## E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O2 ## A1 0.08 -0.04 -0.07 -0.02 0.16 0.13 0.09 0.04 0.01 0.00 0.07 ## A2 -0.24 0.25 0.30 0.30 -0.08 -0.04 -0.02 -0.09 0.02 0.11 0.03 ## A3 -0.29 0.38 0.39 0.26 -0.07 -0.08 -0.03 -0.13 -0.04 0.14 0.03 ## A4 -0.20 0.20 0.33 0.16 -0.09 -0.15 -0.07 -0.16 0.00 0.04 0.05 ## A5 -0.33 0.41 0.48 0.27 -0.19 -0.19 -0.13 -0.21 -0.08 0.15 0.00 ## C1 -0.10 0.13 0.14 0.26 -0.06 -0.03 -0.01 -0.09 -0.05 0.18 -0.13 ## C2 -0.07 0.15 0.12 0.25 -0.02 0.00 0.01 -0.04 0.05 0.16 -0.05 ## C3 -0.09 0.10 0.10 0.22 -0.08 -0.06 -0.07 -0.13 -0.04 0.09 -0.03 ## C4 0.21 -0.09 -0.12 -0.23 0.21 0.15 0.20 0.28 0.21 -0.10 0.21 ## C5 0.26 -0.17 -0.21 -0.24 0.21 0.24 0.23 0.35 0.18 -0.09 0.12 ## E1 0.47 -0.33 -0.42 -0.31 0.01 0.01 0.05 0.23 0.04 -0.10 0.06 ## E2 1.00 -0.40 -0.52 -0.39 0.17 0.20 0.19 0.35 0.26 -0.16 0.08 ## E3 -0.40 1.00 0.43 0.40 -0.04 -0.06 -0.01 -0.15 -0.09 0.33 -0.07 ## E4 -0.52 0.43 1.00 0.33 -0.14 -0.15 -0.13 -0.31 -0.09 0.12 0.05 ## E5 -0.39 0.40 0.33 1.00 0.04 0.05 -0.06 -0.21 -0.14 0.29 -0.09 ## N1 0.17 -0.04 -0.14 0.04 1.00 0.71 0.57 0.41 0.38 -0.05 0.14 ## N2 0.20 -0.06 -0.15 0.05 0.71 1.00 0.55 0.39 0.35 -0.05 0.12 ## N3 0.19 -0.01 -0.13 -0.06 0.57 0.55 1.00 0.52 0.43 -0.05 0.11 ## N4 0.35 -0.15 -0.31 -0.21 0.41 0.39 0.52 1.00 0.40 -0.06 0.08 ## N5 0.26 -0.09 -0.09 -0.14 0.38 0.35 0.43 0.40 1.00 -0.15 0.20 ## O1 -0.16 0.33 0.12 0.29 -0.05 -0.05 -0.05 -0.06 -0.15 1.00 -0.23 ## O2 0.08 -0.07 0.05 -0.09 0.14 0.12 0.11 0.08 0.20 -0.23 1.00 ## O3 -0.24 0.41 0.21 0.30 -0.03 -0.02 -0.03 -0.06 -0.08 0.39 -0.29 ## O4 0.17 0.04 -0.10 -0.02 0.09 0.13 0.17 0.23 0.11 0.17 -0.08 ## O5 0.08 -0.13 0.04 -0.11 0.10 0.02 0.05 0.03 0.14 -0.25 0.33 ## gender -0.08 0.05 0.11 0.08 0.04 0.09 0.11 -0.02 0.21 -0.11 0.04 ## education -0.01 0.01 -0.03 0.06 -0.04 -0.04 -0.04 0.01 -0.05 0.03 -0.10 ## age -0.10 -0.02 -0.01 0.10 -0.07 -0.09 -0.11 -0.02 -0.10 0.05 -0.04 ## O3 O4 O5 gender education age ## A1 -0.06 -0.09 0.11 -0.17 -0.14 -0.14 ## A2 0.15 0.05 -0.08 0.21 0.02 0.09 ## A3 0.22 0.02 -0.04 0.16 0.00 0.04 ## A4 0.04 -0.06 0.04 0.13 -0.02 0.11 ## A5 0.22 0.00 -0.04 0.11 0.02 0.10 ## C1 0.19 0.08 -0.13 0.00 0.04 0.08 ## C2 0.18 0.03 -0.06 0.06 0.01 0.00 ## C3 0.06 0.00 0.00 0.04 0.06 0.05 ## C4 -0.07 0.07 0.18 -0.07 -0.04 -0.12 ## C5 -0.07 0.14 0.05 -0.09 0.04 -0.07 ## E1 -0.21 0.08 0.09 -0.15 0.00 -0.03 ## E2 -0.24 0.17 0.08 -0.08 -0.01 -0.10 ## E3 0.41 0.04 -0.13 0.05 0.01 -0.02 ## E4 0.21 -0.10 0.04 0.11 -0.03 -0.01 ## E5 0.30 -0.02 -0.11 0.08 0.06 0.10 ## N1 -0.03 0.09 0.10 0.04 -0.04 -0.07 ## N2 -0.02 0.13 0.02 0.09 -0.04 -0.09 ## N3 -0.03 0.17 0.05 0.11 -0.04 -0.11 ## N4 -0.06 0.23 0.03 -0.02 0.01 -0.02 ## N5 -0.08 0.11 0.14 0.21 -0.05 -0.10 ## O1 0.39 0.17 -0.25 -0.11 0.03 0.05 ## O2 -0.29 -0.08 0.33 0.04 -0.10 -0.04 ## O3 1.00 0.17 -0.32 -0.04 0.10 0.02 ## O4 0.17 1.00 -0.18 -0.04 0.06 0.00 ## O5 -0.32 -0.18 1.00 0.04 -0.06 -0.08 ## gender -0.04 -0.04 0.04 1.00 0.01 0.05 ## education 0.10 0.06 -0.06 0.01 1.00 0.25 ## age 0.02 0.00 -0.08 0.05 0.25 1.00 ``` --- With **pairwise deletion**, different sets of cases contribute to different correlations. That maximizes the sample sizes, but can lead to problems if the data are missing for some systematic reason. **Listwise deletion** (often referred to in `R` as use complete cases) doesn't have the same issue of biasing correlations, but does result in smaller samples and potentially limited generalizability. A good practice is comparing the different matrices; if the correlation values are very different, this suggests that the missingness that affects pairwise deletion is systematic. --- ```r round(cor(bfi, use = "pairwise")- cor(bfi, use = "complete"),2) ``` ``` ## A1 A2 A3 A4 A5 C1 C2 C3 C4 C5 E1 ## A1 0.00 0.00 0.00 0.00 0.00 0.01 0.00 -0.01 0.03 0.03 -0.01 ## A2 0.00 0.00 0.00 -0.01 0.01 0.00 0.01 0.01 -0.01 -0.01 0.03 ## A3 0.00 0.00 0.00 -0.02 0.00 0.00 0.00 0.00 0.00 -0.01 0.00 ## A4 0.00 -0.01 -0.02 0.00 -0.01 0.01 0.01 0.00 0.01 0.00 0.03 ## A5 0.00 0.01 0.00 -0.01 0.00 0.00 0.00 0.00 -0.01 -0.01 0.00 ## C1 0.01 0.00 0.00 0.01 0.00 0.00 0.00 -0.01 0.01 0.00 0.00 ## C2 0.00 0.01 0.00 0.01 0.00 0.00 0.00 0.00 0.00 0.00 -0.01 ## C3 -0.01 0.01 0.00 0.00 0.00 -0.01 0.00 0.00 0.02 0.01 0.02 ## C4 0.03 -0.01 0.00 0.01 -0.01 0.01 0.00 0.02 0.00 -0.01 -0.01 ## C5 0.03 -0.01 -0.01 0.00 -0.01 0.00 0.00 0.01 -0.01 0.00 0.00 ## E1 -0.01 0.03 0.00 0.03 0.00 0.00 -0.01 0.02 -0.01 0.00 0.00 ## E2 0.01 0.01 0.00 0.01 0.00 0.01 0.01 0.01 -0.01 0.00 0.00 ## E3 0.00 0.00 0.00 -0.01 0.00 -0.02 0.00 -0.02 0.01 0.01 0.01 ## E4 0.01 -0.02 -0.02 -0.03 -0.01 0.00 0.00 -0.01 0.01 0.01 0.00 ## E5 0.00 0.00 -0.01 0.00 0.00 -0.01 0.00 0.00 0.00 0.01 0.00 ## N1 0.01 -0.01 -0.02 0.00 0.00 -0.01 0.00 0.01 0.01 0.01 0.01 ## N2 0.01 -0.01 0.00 0.00 0.00 -0.01 -0.01 0.00 0.01 0.01 0.01 ## N3 0.01 -0.02 -0.01 0.00 -0.01 -0.02 -0.01 0.01 0.01 0.01 0.00 ## N4 0.01 0.00 0.00 -0.01 0.01 -0.01 -0.01 0.02 -0.02 -0.01 0.00 ## N5 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.02 -0.02 -0.01 0.01 ## O1 0.01 0.02 0.00 0.02 0.02 -0.01 0.01 0.00 0.01 0.01 0.00 ## O2 0.01 -0.02 -0.03 -0.01 0.00 0.02 0.01 0.00 0.00 0.02 -0.01 ## O3 0.00 0.02 0.01 0.03 0.02 0.00 0.01 0.01 -0.01 -0.01 0.00 ## O4 0.01 0.03 0.01 0.02 0.01 0.03 0.03 0.02 -0.02 0.00 -0.01 ## O5 0.01 -0.01 -0.01 -0.01 -0.01 0.01 0.00 -0.01 0.01 0.01 0.01 ## gender 0.01 -0.03 -0.02 0.00 -0.01 0.01 0.01 0.01 -0.01 0.00 0.02 ## education 0.00 -0.01 -0.01 0.00 0.00 -0.01 -0.01 -0.01 0.00 -0.01 0.00 ## age -0.02 0.02 0.03 0.03 0.03 0.00 0.02 0.02 -0.03 -0.01 0.01 ## E2 E3 E4 E5 N1 N2 N3 N4 N5 O1 O2 ## A1 0.01 0.00 0.01 0.00 0.01 0.01 0.01 0.01 0.01 0.01 0.01 ## A2 0.01 0.00 -0.02 0.00 -0.01 -0.01 -0.02 0.00 0.00 0.02 -0.02 ## A3 0.00 0.00 -0.02 -0.01 -0.02 0.00 -0.01 0.00 0.00 0.00 -0.03 ## A4 0.01 -0.01 -0.03 0.00 0.00 0.00 0.00 -0.01 0.00 0.02 -0.01 ## A5 0.00 0.00 -0.01 0.00 0.00 0.00 -0.01 0.01 0.00 0.02 0.00 ## C1 0.01 -0.02 0.00 -0.01 -0.01 -0.01 -0.02 -0.01 0.00 -0.01 0.02 ## C2 0.01 0.00 0.00 0.00 0.00 -0.01 -0.01 -0.01 0.00 0.01 0.01 ## C3 0.01 -0.02 -0.01 0.00 0.01 0.00 0.01 0.02 0.02 0.00 0.00 ## C4 -0.01 0.01 0.01 0.00 0.01 0.01 0.01 -0.02 -0.02 0.01 0.00 ## C5 0.00 0.01 0.01 0.01 0.01 0.01 0.01 -0.01 -0.01 0.01 0.02 ## E1 0.00 0.01 0.00 0.00 0.01 0.01 0.00 0.00 0.01 0.00 -0.01 ## E2 0.00 0.02 0.01 0.02 0.00 0.00 0.01 -0.01 0.00 0.00 0.00 ## E3 0.02 0.00 -0.01 -0.02 -0.01 -0.01 -0.01 0.01 0.01 0.00 0.01 ## E4 0.01 -0.01 0.00 -0.02 0.01 0.01 0.03 0.02 0.00 0.01 0.01 ## E5 0.02 -0.02 -0.02 0.00 0.00 -0.01 0.00 0.00 0.01 0.00 0.00 ## N1 0.00 -0.01 0.01 0.00 0.00 0.00 -0.01 -0.01 -0.01 0.00 -0.01 ## N2 0.00 -0.01 0.01 -0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 ## N3 0.01 -0.01 0.03 0.00 -0.01 0.00 0.00 0.00 0.00 0.01 0.00 ## N4 -0.01 0.01 0.02 0.00 -0.01 0.00 0.00 0.00 0.00 0.01 0.00 ## N5 0.00 0.01 0.00 0.01 -0.01 0.00 0.00 0.00 0.00 0.03 0.00 ## O1 0.00 0.00 0.01 0.00 0.00 0.00 0.01 0.01 0.03 0.00 0.02 ## O2 0.00 0.01 0.01 0.00 -0.01 0.00 0.00 0.00 0.00 0.02 0.00 ## O3 0.02 -0.02 0.00 0.00 -0.02 -0.01 0.00 0.00 0.01 0.00 0.03 ## O4 0.00 0.01 0.01 0.01 -0.01 0.00 0.01 -0.02 0.01 0.01 0.01 ## O5 0.00 0.02 0.01 0.00 0.01 0.02 0.01 0.01 -0.01 0.01 -0.01 ## gender 0.02 -0.01 -0.03 -0.01 0.01 0.00 0.01 0.02 0.00 0.01 -0.02 ## education 0.00 0.00 -0.01 0.00 0.00 -0.01 -0.01 0.00 -0.01 -0.01 0.01 ## age 0.00 0.02 0.00 0.02 -0.01 -0.01 0.00 -0.01 0.00 0.00 0.00 ## O3 O4 O5 gender education age ## A1 0.00 0.01 0.01 0.01 0.00 -0.02 ## A2 0.02 0.03 -0.01 -0.03 -0.01 0.02 ## A3 0.01 0.01 -0.01 -0.02 -0.01 0.03 ## A4 0.03 0.02 -0.01 0.00 0.00 0.03 ## A5 0.02 0.01 -0.01 -0.01 0.00 0.03 ## C1 0.00 0.03 0.01 0.01 -0.01 0.00 ## C2 0.01 0.03 0.00 0.01 -0.01 0.02 ## C3 0.01 0.02 -0.01 0.01 -0.01 0.02 ## C4 -0.01 -0.02 0.01 -0.01 0.00 -0.03 ## C5 -0.01 0.00 0.01 0.00 -0.01 -0.01 ## E1 0.00 -0.01 0.01 0.02 0.00 0.01 ## E2 0.02 0.00 0.00 0.02 0.00 0.00 ## E3 -0.02 0.01 0.02 -0.01 0.00 0.02 ## E4 0.00 0.01 0.01 -0.03 -0.01 0.00 ## E5 0.00 0.01 0.00 -0.01 0.00 0.02 ## N1 -0.02 -0.01 0.01 0.01 0.00 -0.01 ## N2 -0.01 0.00 0.02 0.00 -0.01 -0.01 ## N3 0.00 0.01 0.01 0.01 -0.01 0.00 ## N4 0.00 -0.02 0.01 0.02 0.00 -0.01 ## N5 0.01 0.01 -0.01 0.00 -0.01 0.00 ## O1 0.00 0.01 0.01 0.01 -0.01 0.00 ## O2 0.03 0.01 -0.01 -0.02 0.01 0.00 ## O3 0.00 0.02 0.01 0.01 0.00 0.01 ## O4 0.02 0.00 0.00 0.03 -0.01 0.01 ## O5 0.01 0.00 0.00 -0.01 0.00 -0.02 ## gender 0.01 0.03 -0.01 0.00 0.00 0.00 ## education 0.00 -0.01 0.00 0.00 0.00 -0.01 ## age 0.01 0.01 -0.02 0.00 -0.01 0.00 ``` --- ## Visualizing correlations For a single correlation, best practice is to visualize the relationship using a scatterplot. A best fit line is advised, as it can help clarify the strength and direction of the relationship. [http://guessthecorrelation.com/](http://guessthecorrelation.com/) --- ![](2-correlation_files/figure-html/unnamed-chunk-10-1.png)<!-- --> --- ![](2-correlation_files/figure-html/unnamed-chunk-11-1.png)<!-- --> --- ![](2-correlation_files/figure-html/unnamed-chunk-12-1.png)<!-- --> --- ![](2-correlation_files/figure-html/unnamed-chunk-13-1.png)<!-- --> --- ![](2-correlation_files/figure-html/unnamed-chunk-14-1.png)<!-- --> --- ![](2-correlation_files/figure-html/unnamed-chunk-15-1.png)<!-- --> --- ![](2-correlation_files/figure-html/unnamed-chunk-16-1.png)<!-- --> --- ![](2-correlation_files/figure-html/unnamed-chunk-17-1.png)<!-- --> --- ## Visualizing correlation matrices A single correlation can be informative; a correlation matrix is more than the sum of its parts. Correlation matrices can be used to infer larger patterns of relationships. You may be one of the gifted who can look at a matrix of numbers and see those patterns immediately. Or you can use **heat maps** to visualize correlation matrices. ```r library(corrplot) ``` ``` ## corrplot 0.84 loaded ``` --- ```r corrplot(cor(bfi, use = "pairwise"), method = "square") ``` ![](2-correlation_files/figure-html/unnamed-chunk-19-1.png)<!-- --> --- ![](images/comm plot-1.png) .small[ [Beck, Condon, & Jackson, 2019](https://psyarxiv.com/857ev/) ] --- ## Other correlation tests: 1. Set of correlations 2. Dependent correlations (i.e., within same group) These are more easily tested via Structural Equation Modeling (SEM) 3. Intra Class Correlation (ICC) - Again, best to do these tests in another framework (e.g., interaction, SEM, MLM) --- ## Factors that influence r (and most other test statistics) 1. Restriction of range (GRE scores and success) 2. Very skewed distributions (smoking and health) 3. Non-linear associations 4. Measurement overlap (modality and content) 5. Reliability --- ## Reliability Which would you rather have? - 1-item final exam versus 30-item? - assessment via trained clinician vs tarot cards? - fMRI during minor earthquake vs no earthquake? -- All measurement includes error - Score = true score + measurement error (CTT version) - Reliability assesses the consistency of measurement; high reliability indicates less error --- ## Reliability - Cannot correlate error (randomness) with something - Because we do not measure our variables perfectly we get lower correlations compared to true correlations - If we want to have a valid measure it better be a reliable measure --- ## Reliability - think of reliability as a correlation with a measure and itself in a different world, at a different time, or a different but equal version `$$\large r_{XX}$$` --- ## Reliability - true score variance divided by observed variance - how do you assess theoretical variance i.e., true score variance? `$$\large r_{XY} = r_{X_{T} Y_{T}} {\sqrt{r_{XX}r_{YY}}}$$` `$$\large r_{XY} = .6 {\sqrt {(.70) (.70)}}$$` --- ## Reliability `$$\large r_{X_{T} Y_{T}} = = {\frac {r_{XY}} {\sqrt{r_{XX}r_{YY}}}}$$` `$$\large r_{X_{T} Y_{T}} = = {\frac {.30} {\sqrt{(.70)(.70)}}} = .42$$` ??? ### Take aways N needed for .42 = 42 N needed for .3 = 84 -- need twice as many people!! it doesn't work the other way -- you can't take your correlation and back calculate the true score, because reliabilities are also estimates. these can be wrong; the correlation you calculate is the max it could bep --- ## Most common ways to assess - cronbachs alpha ```r library (psych) alpha(measure) ## Gives average split half correlation ## Can tell you if you are assessing a single construct ``` - test - retest reliability - Kappa or ICC --- ## Reliability - if you are going to measure something do it well - applies to ALL IVs and DVs, and all designs - remember this when interpretting other's research --- ## Types of correlations - Many ways to get at relationship between two variables - Statistically the different types are almost exactly the same - Exist for historical reasons --- ## Types of correlations 1. Point Biserial + continuous and dichtomous 2. Phi coefficient + both dichotomous 3. Spearman rank order + ranked data (nonparametric) 4. Biserial (assumes dichotomous is continous) 5. Tetrachoric (assumes dichotomous is continous) + both dichotomous 6. Polychoric (assumes continous) + ordinal --- class: inverse ## Next time.... Univariate regression