Differences between revisions 159 and 160

Situations with more than two variables of interest

When considering the relationship among three or more variables, an interaction may arise. Interactions describe a situation in which the simultaneous influence of two variables on a third is not additive. Most commonly, interactions are considered in the context of Multiple Regression analyses, but they may also be evaluated using Two-Way ANOVA.

An Example Problem

To determine if prenatal exposure to cocaine alters dendritic spine density within prefrontal cortex, 20 rats were equally divided between treatment groups that were prenatally exposed to either cocaine or placebo. Further, because any effect of prenatal drug exposure might be evident at one age but not another, animals within each treatment group were studied at either 4 or 12 weeks of age. Thus, our independent variables are treatment (prenatal drug/placebo exposure) and age (let's say 4 and 12 weeks of age), and our dependent variable is spine density. The following table shows one possible outcome of such a study:

DENDRITIC SPINE DENSITY
4-Week Control	4-Week Cocaine	12-Week Control	12-Week Cocaine
7.5	5.5	8.0	5.0
8.0	3.5	10.0	4.5
6.0	4.5	13.0	4.0
7.0	6.0	9.0	6.0
6.5	5.0	8.5	4.0

There are three null hypotheses we may want to test. The first two test the effects of each independent variable (or factor) under investigation:

H₀₁: Treatment groups have the same dendritic spine density on average.
H₀₂: Age groups have the same dendritic spine density on average.

And the third tests for an interaction between these two factors:

H₀₃: The two factors (treatment and age) are independent; i.e., there is no interaction effect.

Two-Way ANOVA

A two-way ANOVA is an analysis technique that quantifies how much of the variance in a sample can be accounted for by each of two categorical variables and by their interaction.

Step 1 is to compute the group means (for each cell, row, and column):

GROUP MEANS
	4-Week	12-Week	All Ages
Control	7	9.7	8.35
Cocaine	4.9	4.7	4.8
All Prenatal Exposures	5.95	7.2	6.575
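
As a rough check on the table of means, the following sketch recomputes the cell, row, and column means from the raw data. The dictionary layout and the names `data` and `marginal_mean` are illustrative choices, not part of the original analysis.

```python
from statistics import mean

# Spine-density data from the table above, keyed by (treatment, age).
data = {
    ("control", "4wk"):  [7.5, 8.0, 6.0, 7.0, 6.5],
    ("cocaine", "4wk"):  [5.5, 3.5, 4.5, 6.0, 5.0],
    ("control", "12wk"): [8.0, 10.0, 13.0, 9.0, 8.5],
    ("cocaine", "12wk"): [5.0, 4.5, 4.0, 6.0, 4.0],
}

# Mean of each cell (one treatment at one age).
cell_means = {cell: mean(values) for cell, values in data.items()}

def marginal_mean(position, level):
    """Mean of all observations whose key has `level` at `position`
    (0 = treatment, 1 = age)."""
    pooled = [x for cell, values in data.items()
              if cell[position] == level for x in values]
    return mean(pooled)

row_means = {t: marginal_mean(0, t) for t in ("control", "cocaine")}
col_means = {a: marginal_mean(1, a) for a in ("4wk", "12wk")}
grand_mean = mean([x for values in data.values() for x in values])

print(cell_means)
print(row_means, col_means, grand_mean)
```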

And it's helpful to also plot the mean data. (It's easier to understand it that way!)

Means by group

Step 2 is to calculate the sum of squares (SS) for each group (cell) using the following formula:

$SS_g = \sum_{i=1}^{n} (x_{i,g} - \overline{X}_g)^2$

where $x_{i,g}$ is the ith measurement for group g, n is the number of measurements in the group, and $\overline{X}_g$ is the mean of group g.

For each group, this formula is implemented as follows:

4-Week Control:
- {7.5, 8, 6, 7, 6.5}, $\overline{X}_{4-week, control}$ = 7
  $SS_{4-week, control}$ = (7.5-7)² + (8-7)² + (6-7)² + (7-7)² + (6.5-7)² = 2.5
4-Week Cocaine:
- {5.5, 3.5, 4.5, 6, 5}, $\overline{X}_{4-week, cocaine}$ = 4.9
  $SS_{4-week, cocaine}$ = (5.5-4.9)² + (3.5-4.9)² + (4.5-4.9)² + (6-4.9)² + (5-4.9)² = 3.7
12-Week Control:
- {8, 10, 13, 9, 8.5}, $\overline{X}_{12-week, control}$ = 9.7
  $SS_{12-week, control}$ = (8-9.7)² + (10-9.7)² + (13-9.7)² + (9-9.7)² + (8.5-9.7)² = 15.8
12-Week Cocaine:
- {5, 4.5, 4, 6, 4}, $\overline{X}_{12-week, cocaine}$ = 4.7
  $SS_{12-week, cocaine}$ = (5-4.7)² + (4.5-4.7)² + (4-4.7)² + (6-4.7)² + (4-4.7)² = 2.8
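
The four sums of squares above can be reproduced with a short sketch like the one below. The helper name `sum_of_squares` is an illustrative choice.

```python
# Spine densities for each of the four cells (values from the data table).
groups = {
    "4-week control":  [7.5, 8.0, 6.0, 7.0, 6.5],
    "4-week cocaine":  [5.5, 3.5, 4.5, 6.0, 5.0],
    "12-week control": [8.0, 10.0, 13.0, 9.0, 8.5],
    "12-week cocaine": [5.0, 4.5, 4.0, 6.0, 4.0],
}

def sum_of_squares(values):
    """Sum of squared deviations of `values` from their own mean."""
    m = sum(values) / len(values)
    return sum((x - m) ** 2 for x in values)

ss_by_group = {name: sum_of_squares(values) for name, values in groups.items()}

for name, ss in ss_by_group.items():
    print(f"{name}: SS = {ss:.1f}")
```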

Step 3 is to calculate the between-groups sum of squares ( $SS_{B}$ ):

$SS_{B} = n \cdot \sum_{g} (\overline{X}_{g} - \overline{X})^2$

where n is the number of subjects in each group, $\overline{X}_g$ is the mean for group g, and $\overline{X}$ is the overall mean (across groups).

$SS_{B}$ = n [( $\overline{X}_{4-week, control}$ - $\overline{X}$ )² + ( $\overline{X}_{4-week, cocaine}$ - $\overline{X}$ )² + ( $\overline{X}_{12-week, control}$ - $\overline{X}$ )² + ( $\overline{X}_{12-week, cocaine}$ - $\overline{X}$ )²]
- = 5 [(7 - 6.575 )² + (4.9 - 6.575)² + (9.7 - 6.575)² + (4.7 - 6.575)²]
  = 5 [0.180625 + 2.805625 + 9.765625 + 3.515625]
  = 5 [16.2675]
  = 81.3375
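
A minimal sketch of the same between-groups calculation, starting from the raw data, might look like this. It assumes equal group sizes, as in this example; the variable names are illustrative.

```python
groups = {
    "4-week control":  [7.5, 8.0, 6.0, 7.0, 6.5],
    "4-week cocaine":  [5.5, 3.5, 4.5, 6.0, 5.0],
    "12-week control": [8.0, 10.0, 13.0, 9.0, 8.5],
    "12-week cocaine": [5.0, 4.5, 4.0, 6.0, 4.0],
}

n = 5  # subjects per group (equal group sizes are assumed)

group_means = {name: sum(v) / len(v) for name, v in groups.items()}
all_values = [x for v in groups.values() for x in v]
grand_mean = sum(all_values) / len(all_values)

# SS_B = n * sum over groups of (group mean - grand mean)^2
ss_between = n * sum((m - grand_mean) ** 2 for m in group_means.values())

print(f"grand mean = {grand_mean:.3f}, SS_B = {ss_between:.4f}")
```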

Now, in Step 4, we'll calculate the sum-of-squares within groups ( $SS_{W}$ ). This is the sum, over all groups g, of the group sums of squares from Step 2:

$\sum_{g} SS_g$

So:

$SS_{W}$ = $SS_{4-week, control}$ + $SS_{4-week, cocaine}$ + $SS_{12-week, control}$ + $SS_{12-week, cocaine}$
- = 2.5 + 3.7 + 15.8 + 2.8
  = 24.8
$df_{w}$ = N - rc
- = 20 - (2 * 2)
  = 16
$s_{W}$ ² = $SS_{W}$ / $df_{w}$
- = 24.8 / 16
  = 1.55

Note that $SS_{W}$ is also known as the "residual" or "error" since it quantifies the amount of variability after the condition means are taken into account. The degrees of freedom here are N - rc because there are N data points, but the number of means fit is r*c, giving a total of N - rc variables that are free to vary.
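
The within-groups quantities can be sketched as follows, taking the four group sums of squares from Step 2 as given. The variable names are illustrative.

```python
# Group sums of squares computed in Step 2.
ss_by_group = {
    "4-week control": 2.5,
    "4-week cocaine": 3.7,
    "12-week control": 15.8,
    "12-week cocaine": 2.8,
}

N = 20            # total number of observations
n_rows = 2        # levels of treatment (control, cocaine)
n_cols = 2        # levels of age (4 weeks, 12 weeks)

ss_within = sum(ss_by_group.values())
df_within = N - n_rows * n_cols      # N - rc
ms_within = ss_within / df_within    # s_W squared, the error variance

print(f"SS_W = {ss_within:.1f}, df_W = {df_within}, s_W^2 = {ms_within:.2f}")
```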

For Step 5, we'll calculate the sum-of-squares for the rows ( $SS_{R}$ ):

$SS_{R} = n_{R} \cdot \sum_{r} (\overline{X}_r - \overline{X})^2$

where r ranges over rows and $n_{R}$ is the number of subjects in each row (10 here).

$SS_{R}$ = $n_{R}$ [( $\overline{X}_{control}$ - $\overline{X}$ )² + ( $\overline{X}_{cocaine}$ - $\overline{X}$ )²]
- = 10 [(8.35 - 6.575)² + (4.8 - 6.575)²]
  = 10 [3.150625 + 3.150625]
  = 10 [6.30125]
  = 63.0125
- df_R = r - 1
  - = 2-1
    = 1
  s_R² = SS_R / df_R
  - = 63.0125 / 1
    = 63.0125

In Step 6, we calculate the sum-of-squares for the columns ( $SS_{C}$ ):

$SS_{C} = n_{C} \cdot \sum_{c} (\overline{X}_c - \overline{X})^2$

where c ranges over columns and $n_{C}$ is the number of subjects in each column (10 here).

$SS_{C}$ = $n_{C}$ [( $\overline{X}_{4-week}$ - $\overline{X}$ )² + ( $\overline{X}_{12-week}$ - $\overline{X}$ )²]
- = 10 [(5.95 - 6.575)² + (7.2 - 6.575)²]
  = 10 [0.390625 + 0.390625]
  = 10 [0.78125]
  = 7.8125
- df_C = c - 1
  - = 2-1
    = 1
  s_C² = SS_C / df_C
  - = 7.8125 / 1
    = 7.8125
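
Steps 5 and 6 follow the same pattern, so one helper can cover both. The sketch below starts from the marginal means in the table of group means; the function name `marginal_ss` is an illustrative choice, and each sum is multiplied by the 10 subjects contributing to each row or column mean.

```python
grand_mean = 6.575
row_means = {"control": 8.35, "cocaine": 4.8}   # treatment (rows)
col_means = {"4-week": 5.95, "12-week": 7.2}    # age (columns)
n_per_level = 10  # subjects behind each row mean and each column mean

def marginal_ss(level_means, n_per_level, grand_mean):
    """Sum of squares for one factor: n * sum((level mean - grand mean)^2)."""
    return n_per_level * sum((m - grand_mean) ** 2 for m in level_means.values())

ss_rows = marginal_ss(row_means, n_per_level, grand_mean)
ss_cols = marginal_ss(col_means, n_per_level, grand_mean)

# Each factor has 2 levels, so each has 2 - 1 = 1 degree of freedom.
ms_rows = ss_rows / (len(row_means) - 1)
ms_cols = ss_cols / (len(col_means) - 1)

print(f"SS_R = {ss_rows:.4f}, SS_C = {ss_cols:.4f}")
```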

In Step 7, calculate the sum-of-squares for the interaction ( $SS_{RC}$ ):

SS_RC = SS_B - SS_R - SS_C
- = 81.3375 - 63.0125 - 7.8125
  = 10.5125
- df_RC = (r - 1)(c - 1)
  - = (2-1)(2-1)
    = 1
  s_RC² = SS_RC / df_RC
  - = 10.5125 / 1
    = 10.5125
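
As a cross-check on the subtraction, the interaction sum of squares can also be computed directly from the means: for each cell, take the cell mean minus its row mean minus its column mean plus the grand mean, square it, sum over cells, and multiply by the cell size. The sketch below compares the two routes; the variable names are illustrative.

```python
n = 5  # subjects per cell
grand_mean = 6.575
row_means = {"control": 8.35, "cocaine": 4.8}
col_means = {"4-week": 5.95, "12-week": 7.2}
cell_means = {
    ("control", "4-week"): 7.0,
    ("cocaine", "4-week"): 4.9,
    ("control", "12-week"): 9.7,
    ("cocaine", "12-week"): 4.7,
}

# Direct route: what is left of each cell mean after removing both main effects.
ss_interaction_direct = n * sum(
    (m - row_means[r] - col_means[c] + grand_mean) ** 2
    for (r, c), m in cell_means.items()
)

# Subtraction route used in the text: SS_RC = SS_B - SS_R - SS_C.
ss_interaction_subtract = 81.3375 - 63.0125 - 7.8125

print(f"direct = {ss_interaction_direct:.4f}, by subtraction = {ss_interaction_subtract:.4f}")
```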

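Now, in Step 8, we'll calculate the total sum-of-squares ( $SS_{T}$ ). Because $SS_{B}$ is already the sum of $SS_{R}$, $SS_{C}$ and $SS_{RC}$, it must not be counted a second time:

SS_T = SS_B + SS_W = SS_R + SS_C + SS_RC + SS_W
- = 63.0125 + 7.8125 + 10.5125 + 24.8
  = 106.1375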
df_T = N - 1
- = 20-1
  = 19
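
As a check, the total sum of squares can also be computed directly from the raw data as the squared deviations of all 20 observations from the grand mean. It should equal SS_B + SS_W. A minimal sketch, with illustrative variable names:

```python
observations = [
    7.5, 8.0, 6.0, 7.0, 6.5,     # 4-week control
    5.5, 3.5, 4.5, 6.0, 5.0,     # 4-week cocaine
    8.0, 10.0, 13.0, 9.0, 8.5,   # 12-week control
    5.0, 4.5, 4.0, 6.0, 4.0,     # 12-week cocaine
]

N = len(observations)
grand_mean = sum(observations) / N

# Total variability: every observation against the grand mean.
ss_total = sum((x - grand_mean) ** 2 for x in observations)
df_total = N - 1

# The same quantity via the partition SS_B + SS_W from Steps 3 and 4.
ss_total_from_parts = 81.3375 + 24.8

print(f"SS_T = {ss_total:.4f} (df = {df_total}), SS_B + SS_W = {ss_total_from_parts:.4f}")
```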

At Step 9, we calculate the F values:

F_R = s_R² / s_W²
- = 63.0125 / 1.55
  = 40.65323
F_C = s_C² / s_W²
- = 7.8125 / 1.55
  = 5.040323
F_RC = s_RC² / s_W²
- = 10.5125 / 1.55
  = 6.782258

And, finally, at Step 10, we can organize all of the above into a table, along with the appropriate F_crit value (looked up in a table of critical values of the F distribution) that we'll use for comparison and interpretation of our computations:

F_crit (1, 16) _α=0.05 = 4.49

ANOVA TABLE
Source	SS	df	s²	F_obt	F_crit	p
rows	63.0125	1	63.0125	40.65323	4.49	p < 0.05
columns	7.8125	1	7.8125	5.040323	4.49	p < 0.05
r * c	10.5125	1	10.5125	6.782258	4.49	p < 0.05
within	24.8	16	1.55	--	--	--
total	106.1375	19	--	--	--	--
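
The F ratios and the comparison against the critical value can be sketched as below. The sums of squares and F_crit = 4.49 are taken from this section; the dictionary layout is an illustrative choice.

```python
# (sum of squares, degrees of freedom) for each effect, from Steps 5-7.
effects = {
    "rows (treatment)": (63.0125, 1),
    "columns (age)":    (7.8125, 1),
    "r * c":            (10.5125, 1),
}
ss_within, df_within = 24.8, 16
F_CRIT = 4.49  # critical F for (1, 16) degrees of freedom, as quoted in this section

ms_within = ss_within / df_within

f_values = {}
for name, (ss, df) in effects.items():
    mean_square = ss / df
    f_values[name] = mean_square / ms_within   # F = effect variance / error variance

for name, f in f_values.items():
    verdict = "significant" if f > F_CRIT else "not significant"
    print(f"{name}: F = {f:.4f} ({verdict})")
```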

Both factors (treatment and age) are significant, as indicated by the fact that F_obt > F_crit. Thus, we can reject both H₀₁ and H₀₂ and conclude that dendritic spine density is affected by prenatal cocaine exposure and by age. The interaction between the two factors (r * c) is also significant. Thus, we can also reject H₀₃ and conclude there is a significant interaction between treatment and age.

To further interpret these results, we can plot the group means as follows:

Means by group

NOTE: Remember that the statistics provided by the ANOVA quantify the effect of each factor (in this case, treatment and age). These statistics do not compare individual condition means, such as whether 4-week control differs from 12-week control. If k = the number of groups, the number of possible comparisons is k * (k-1) / 2. In the above example, we have 4 groups, so there are (4*3)/2 = 6 possible comparisons between these group means. Statistical testing of these individual comparisons requires a post-hoc analysis that corrects for experiment-wise error rate. If all possible comparisons are of interest, Tukey's HSD (Honestly Significant Difference) Test is commonly used.
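
To make the count of comparisons concrete, the sketch below enumerates every pair of group means with the standard-library function `itertools.combinations`. The group labels are simply the column headings of the data table.

```python
from itertools import combinations

group_names = ["4-week control", "4-week cocaine",
               "12-week control", "12-week cocaine"]

# Every unordered pair of groups is one possible comparison.
pairs = list(combinations(group_names, 2))

k = len(group_names)
expected = k * (k - 1) // 2   # the formula from the note above

for a, b in pairs:
    print(f"{a}  vs  {b}")
print(f"{len(pairs)} comparisons (formula gives {expected})")
```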

Tukey's HSD (Honestly Significant Difference) Test

Tukey's test is a single-step, multiple-comparison statistical procedure often used in conjunction with an ANOVA to test which group means are significantly different from one another. It is used in cases where group sizes are equal (the Tukey-Kramer procedure is used if group sizes are unequal) and it compares all possible pairs of means. Note that the Tukey's test formula is very similar to that of the t-test, except that it corrects for experiment-wise error rate. (When there are multiple comparisons being made, the probability of making a type I error (rejecting a true null hypothesis) increases, so Tukey's test corrects for this.) The formula for a Tukey's test is:

$q_{obt} = (Y_{A} - Y_{B}) / \sqrt{ s_{W}^2 / n}$

where Y_A is the larger of the two means being compared, Y_B is the smaller, s_W² is the mean squared error within, and n is the number of data points within each group. Once computed, the q_obt value is compared to a q-value from the q distribution. If the q_obt value is larger than the q_crit value from the distribution, the two means are significantly different.

So, if we wanted to use a Tukey's test to determine whether 4-week control significantly differs from 12-week control, we'd calculate it as follows:

q_obt = ( $\overline{X}_{12-week, control}$ - $\overline{X}_{4-week, control}$ ) / $\sqrt{ s_{W}^2 / n}$
- = (9.7 - 7) / $\sqrt{ 1.55 / 5}$
  = 2.7 / 0.5567764
  = 4.849343

The q_crit value may be looked up in a table of the studentized range (q) distribution using the appropriate values for k (which represents the number of group means, so 4 here) and df_w (16). So here, q_crit (4, 16) _α=0.05 = 4.05. Since q_obt > q_crit (4.85 > 4.05), we can conclude that the two means are, in fact, (honestly) significantly different.
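
The worked comparison above can be sketched in a few lines. The function name `tukey_q` is an illustrative choice, and the critical value 4.05 is the tabled value quoted in this section rather than something the code computes.

```python
import math

def tukey_q(mean_a, mean_b, ms_within, n_per_group):
    """Studentized range statistic for two group means (equal group sizes)."""
    return abs(mean_a - mean_b) / math.sqrt(ms_within / n_per_group)

Q_CRIT = 4.05  # tabled q for k = 4 means and df_w = 16, as quoted in this section

# 12-week control (mean 9.7) versus 4-week control (mean 7.0).
q_obt = tukey_q(9.7, 7.0, ms_within=1.55, n_per_group=5)
significant = q_obt > Q_CRIT

print(f"q_obt = {q_obt:.4f}, significant: {significant}")
```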

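Repeating the calculation for the other pairs shows that spine density is significantly greater in the control group than in drug-exposed animals at 12 weeks of age (q_obt = 5.0 / 0.5567764 = 8.98). At 4 weeks the difference is in the same direction, but q_obt = 2.1 / 0.5567764 = 3.77 falls just below q_crit = 4.05, so it does not reach significance under Tukey's test. The effect of prenatal drug exposure is magnified with time due to a developmental increase in spine density that occurs only in the control group.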

IMPORTANT NOTE: There are various post-hoc tests that can be used, and the choice of which method to apply is a controversial area in statistics. The different tests vary in how conservative they are: more conservative tests have a lower risk of a type I error (rejecting a true null hypothesis) but are less powerful, which means a higher risk of a type II error (failing to reject a false null hypothesis). Some common methods, listed in order of decreasing power, are: Fisher’s LSD, Newman-Keuls, Tukey HSD, Bonferroni, Scheffé. The following provides some guidelines for choosing an appropriate post-hoc procedure.

If all pairwise comparisons are truly relevant, the Tukey (-Kramer) method is often recommended.
If it is most reasonable to compare all groups against a single control, then the Dunnett test is recommended.
If only a subset of the pairwise comparisons is relevant, then the Bonferroni method is often utilized for those selected comparisons. In the present example, one would likely not be interested in comparing the 4-week control to the 12-week cocaine group, or the 12-week control to the 4-week cocaine group.
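
As a small illustration of the Bonferroni idea, the sketch below divides the overall alpha by the number of planned comparisons. The list of four planned comparisons is an assumption made for illustration: it is the six possible pairs minus the two cross-age, cross-treatment pairs mentioned above.

```python
ALPHA = 0.05  # desired experiment-wise error rate

# Hypothetical set of planned comparisons (an assumption, not from the original study).
planned_comparisons = [
    ("4-week control", "4-week cocaine"),     # treatment effect at 4 weeks
    ("12-week control", "12-week cocaine"),   # treatment effect at 12 weeks
    ("4-week control", "12-week control"),    # age effect in controls
    ("4-week cocaine", "12-week cocaine"),    # age effect in drug-exposed animals
]

# Bonferroni: each individual test is run at alpha / (number of comparisons).
per_comparison_alpha = ALPHA / len(planned_comparisons)

print(f"{len(planned_comparisons)} comparisons, each tested at alpha = {per_comparison_alpha}")
```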

Multiple Regression

Another analysis technique you could use is multiple regression. In multiple regression, we find a coefficient for each predictor (here, the two factors and their interaction) such that we are able to best predict the group means. The multiple regression computes standard errors on the coefficients, meaning that we can determine if a coefficient is significantly different from zero. On this simple example, multiple regression will give identical answers to ANOVA, but in more complex cases, multiple regression is a more powerful technique that allows you to include additional nuisance predictors that the analysis controls for before testing for significance of your independent variables.
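
One way to see the equivalence on this example is to code each factor as -1/+1 (effect coding) and fit spine density on treatment, age, and their product. The coding scheme is an assumption made for this sketch. Because the design is balanced, the three predictors are mutually orthogonal and each sums to zero, so the least-squares coefficient for a predictor x reduces to sum(x * y) / N, and the sum of squares it explains is N times the squared coefficient.

```python
data = {
    ("control", "4wk"):  [7.5, 8.0, 6.0, 7.0, 6.5],
    ("cocaine", "4wk"):  [5.5, 3.5, 4.5, 6.0, 5.0],
    ("control", "12wk"): [8.0, 10.0, 13.0, 9.0, 8.5],
    ("cocaine", "12wk"): [5.0, 4.5, 4.0, 6.0, 4.0],
}

# Effect coding (an assumption of this sketch): each factor level becomes -1 or +1.
TREATMENT_CODE = {"control": -1, "cocaine": +1}
AGE_CODE = {"4wk": -1, "12wk": +1}

# One row per animal: (treatment code, age code, interaction code, spine density).
rows = []
for (treatment, age), values in data.items():
    t, a = TREATMENT_CODE[treatment], AGE_CODE[age]
    for y in values:
        rows.append((t, a, t * a, y))

N = len(rows)

# Least-squares coefficients; the simple form is valid only because the
# predictors are orthogonal, sum to zero, and have x*x == 1 in this balanced design.
intercept = sum(row[3] for row in rows) / N
b_treatment = sum(row[0] * row[3] for row in rows) / N
b_age = sum(row[1] * row[3] for row in rows) / N
b_interaction = sum(row[2] * row[3] for row in rows) / N

# Sum of squares explained by each predictor: N * coefficient^2.
ss_treatment = N * b_treatment ** 2
ss_age = N * b_age ** 2
ss_interaction = N * b_interaction ** 2

print(f"intercept = {intercept:.3f}")
print(f"SS treatment = {ss_treatment:.4f}, SS age = {ss_age:.4f}, "
      f"SS interaction = {ss_interaction:.4f}")
```

Under these assumptions the three sums of squares match SS_R, SS_C and SS_RC from the two-way ANOVA, and the intercept is the grand mean.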


-  ⇤ ← Revision 159 as of 2011-12-16 18:57:39 → 
  Size: 15738
  Editor: KathyNordeen
  Comment:
+   ← Revision 160 as of 2011-12-16 19:00:39 → ⇥
  Size: 15740
  Editor: KathyNordeen
  Comment: