Differences between revisions 118 and 119
Revision 118 as of 2011-11-25 20:12:56
Size: 10456
Editor: CelesteKidd
Comment:
Revision 119 as of 2011-11-25 20:26:55
Size: 10520
Editor: CelesteKidd

Situations with more than two variables of interest

When considering the relationship among three or more variables, an interaction may arise. Interactions describe a situation in which the simultaneous influence of two variables on a third is not additive. Most commonly, interactions are considered in the context of Multiple Regression analyses, but they may also be evaluated using Two-Way ANOVA.
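One common way to write this down is sketched below. The symbols are assumptions introduced for illustration, not notation from the text: $\mu$ is the overall mean, $\alpha_r$ and $\beta_c$ are the separate effects of the two variables, $\gamma_{rc}$ is their interaction, and $\epsilon$ is error.

\[
y = \mu + \alpha_r + \beta_c + \gamma_{rc} + \epsilon
\]

The influence of the two variables is additive when every $\gamma_{rc}$ is zero; an interaction is present when at least one of them is not.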

An Example Problem

Suppose we are interested in studying the effects of cocaine on sleep. We might design an experiment to simultaneously test whether both the use of cocaine and the duration of usage affect the number of hours a squirrel will sleep in a night. We might give half of the squirrels we test cocaine, and the other half a placebo substance (the substance variable). And we might vary the duration of usage by administering cocaine or placebo for one of two possible durations before testing, 4 weeks or 12 weeks (the duration variable). We can then consider the treatment response (here, the number of hours slept) for each squirrel as a function of the treatment combination that was administered (substance and duration). The following table shows one possible situation:

HOURS SLEPT IN SINGLE NIGHT

4-Week Placebo (Control)   4-Week Cocaine   12-Week Placebo (Control)   12-Week Cocaine
7.5                        5.5              8.0                         5.0
8.0                        3.5              10.0                        4.5
6.0                        4.5              13.0                        4.0
7.0                        6.0              9.0                         6.0
6.5                        5.0              8.5                         4.0

There are three null hypotheses we may want to test. The first two test the effects of each factor under investigation:

  • H01: Both substance groups sleep for the same number of hours on average.

  • H02: Both treatment duration groups sleep for the same number of hours on average.

And the third tests for an interaction between these two factors:

  • H03: The two factors are independent or there is no interaction effect.

Two-Way ANOVA

A two-way ANOVA is an analysis technique that quantifies how much of the variance in a sample can be accounted for by each of two categorical variables and by their interaction.

Step 1 is to compute the group means (for each cell, row, and column):

GROUP MEANS

                 4-Week   12-Week   All Durations
Placebo          7        9.7       8.35
Cocaine          4.9      4.7       4.8
All Substances   5.95     7.2       6.575
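The group means can be reproduced with a short script. This is a minimal sketch: the dictionary layout, the key names, and the helper `marginal_mean` are assumptions made for illustration, and the scores are copied from the data table.

```python
from statistics import mean

# Hours slept, keyed by (substance, duration); values copied from the data table.
cells = {
    ("placebo", "4wk"): [7.5, 8.0, 6.0, 7.0, 6.5],
    ("cocaine", "4wk"): [5.5, 3.5, 4.5, 6.0, 5.0],
    ("placebo", "12wk"): [8.0, 10.0, 13.0, 9.0, 8.5],
    ("cocaine", "12wk"): [5.0, 4.5, 4.0, 6.0, 4.0],
}

# One mean per cell of the design.
cell_means = {key: mean(scores) for key, scores in cells.items()}


def marginal_mean(position, level):
    """Mean of all scores whose key has `level` at `position` (0 = substance, 1 = duration)."""
    return mean(x for key, scores in cells.items() if key[position] == level for x in scores)


row_means = {s: marginal_mean(0, s) for s in ("placebo", "cocaine")}
col_means = {d: marginal_mean(1, d) for d in ("4wk", "12wk")}
grand_mean = mean(x for scores in cells.values() for x in scores)

print(cell_means)
print(row_means, col_means, round(grand_mean, 3))
```

Because every cell holds the same number of squirrels, each row or column mean is also the plain average of the two cell means it spans.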

It's important to plot the numbers--it's easier to understand them that way:

[Figure: Means by group]

Step 2 is to calculate the sum of squares (SS) for each group (cell) using the following formula:

\[
SS_g = \sum_{i=1}^{n} (x_{i,g} - \overline{X}_g)^2 
\]

where $x_{i,g}$ is the i'th of the n measurements for group g, and $\overline{X}_g$ is the mean of group g.

For each group, this formula is implemented as follows:

  • 4-Week Placebo:

    • {7.5, 8, 6, 7, 6.5}, $\overline{X}_{4-week, placebo}$ = 7
      $SS_{4-week, placebo}$ = (7.5-7)^2 + (8-7)^2 + (6-7)^2 + (7-7)^2 + (6.5-7)^2 = 2.5

  • 4-Week Cocaine:

    • {5.5, 3.5, 4.5, 6, 5}, $\overline{X}_{4-week, cocaine}$ = 4.9
      $SS_{4-week, cocaine}$ = (5.5-4.9)^2 + (3.5-4.9)^2 + (4.5-4.9)^2 + (6-4.9)^2 + (5-4.9)^2 = 3.7

  • 12-Week Placebo:

    • {8, 10, 13, 9, 8.5}, $\overline{X}_{12-week, placebo}$ = 9.7
      $SS_{12-week, placebo}$ = (8-9.7)^2 + (10-9.7)^2 + (13-9.7)^2 + (9-9.7)^2 + (8.5-9.7)^2 = 15.8

  • 12-Week Cocaine:

    • {5, 4.5, 4, 6, 4}, $\overline{X}_{12-week, cocaine}$ = 4.7
      $SS_{12-week, cocaine}$ = (5-4.7)^2 + (4.5-4.7)^2 + (4-4.7)^2 + (6-4.7)^2 + (4-4.7)^2 = 2.8
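The per-group sums of squares can be checked with a few lines of code. This is a sketch that assumes the scores are taken straight from the data table; the function name `sum_of_squares` is chosen here for illustration.

```python
# Hours slept, keyed by (substance, duration); values copied from the data table.
cells = {
    ("placebo", "4wk"): [7.5, 8.0, 6.0, 7.0, 6.5],
    ("cocaine", "4wk"): [5.5, 3.5, 4.5, 6.0, 5.0],
    ("placebo", "12wk"): [8.0, 10.0, 13.0, 9.0, 8.5],
    ("cocaine", "12wk"): [5.0, 4.5, 4.0, 6.0, 4.0],
}


def sum_of_squares(scores):
    """Sum of squared deviations of the scores from their own mean."""
    group_mean = sum(scores) / len(scores)
    return sum((x - group_mean) ** 2 for x in scores)


ss_cells = {key: sum_of_squares(scores) for key, scores in cells.items()}

for key, ss in ss_cells.items():
    print(key, round(ss, 4))
```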

Step 3 is to calculate the between-groups sum of squares ($SS_{B}$):

\[
SS_B = n \cdot \sum_{g} (\overline{X}_{g} - \overline{X})^2 
\]

where n is the number of subjects in each group, $\overline{X}_g$ is the mean for group g, and $\overline{X}$ is the overall mean (across groups).

  • $SS_{B}$ = $n$ [($\overline{X}_{4-week, placebo}$ - $\overline{X}$)^2 + ($\overline{X}_{4-week, cocaine}$ - $\overline{X}$)^2 + ($\overline{X}_{12-week, placebo}$ - $\overline{X}$)^2 + ($\overline{X}_{12-week, cocaine}$ - $\overline{X}$)^2]

    • = 5 [(7 - 6.575)^2 + (4.9 - 6.575)^2 + (9.7 - 6.575)^2 + (4.7 - 6.575)^2]
      = 5 [0.180625 + 2.805625 + 9.765625 + 3.515625]
      = 5 [16.2675]
      = 81.3375
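The between-groups calculation can be sketched as follows. The variable names are illustrative assumptions, and the code relies on every group having the same size n, as in this example.

```python
from statistics import mean

# Hours slept, keyed by (substance, duration); values copied from the data table.
cells = {
    ("placebo", "4wk"): [7.5, 8.0, 6.0, 7.0, 6.5],
    ("cocaine", "4wk"): [5.5, 3.5, 4.5, 6.0, 5.0],
    ("placebo", "12wk"): [8.0, 10.0, 13.0, 9.0, 8.5],
    ("cocaine", "12wk"): [5.0, 4.5, 4.0, 6.0, 4.0],
}

n = 5  # subjects per group (equal group sizes assumed)
grand_mean = mean(x for scores in cells.values() for x in scores)

# n times the summed squared distance of each group mean from the overall mean.
ss_between = n * sum((mean(scores) - grand_mean) ** 2 for scores in cells.values())

print(round(ss_between, 4))  # about 81.3375
```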

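Now, in Step 4, we'll calculate the sum-of-squares within groups ($SS_{W}$), which is the sum of the per-group sums of squares from Step 2:

\[
SS_W = \sum_{g} SS_g
\]

So, with $SS_{4-week, placebo}$ = 2.5 for the tabled scores {7.5, 8, 6, 7, 6.5}:

  • $SS_{W}$ = $SS_{4-week, placebo}$ + $SS_{4-week, cocaine}$ + $SS_{12-week, placebo}$ + $SS_{12-week, cocaine}$

    • = 2.5 + 3.7 + 15.8 + 2.8
      = 24.8

  • $df_{W}$ = $N - rc$, where N is the total number of subjects, r the number of rows, and c the number of columns

    • = 20 - (2 * 2)
      = 16

  • $s_{W}^2$ = $SS_{W}$ / $df_{W}$

    • = 24.8 / 16
      = 1.55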
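As a check on the within-groups step, the sketch below recomputes the sum of squares, degrees of freedom, and mean square directly from the tabled scores. Variable names such as `ms_within` are assumptions made for illustration.

```python
# Hours slept, keyed by (substance, duration); values copied from the data table.
cells = {
    ("placebo", "4wk"): [7.5, 8.0, 6.0, 7.0, 6.5],
    ("cocaine", "4wk"): [5.5, 3.5, 4.5, 6.0, 5.0],
    ("placebo", "12wk"): [8.0, 10.0, 13.0, 9.0, 8.5],
    ("cocaine", "12wk"): [5.0, 4.5, 4.0, 6.0, 4.0],
}


def sum_of_squares(scores):
    """Sum of squared deviations of the scores from their own mean."""
    group_mean = sum(scores) / len(scores)
    return sum((x - group_mean) ** 2 for x in scores)


# Within-groups SS is the total of the per-cell sums of squares.
ss_within = sum(sum_of_squares(scores) for scores in cells.values())

total_n = sum(len(scores) for scores in cells.values())  # N = 20
df_within = total_n - len(cells)                         # N - rc = 20 - 4
ms_within = ss_within / df_within                        # the variance estimate s_W^2

print(round(ss_within, 4), df_within, round(ms_within, 4))
```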

For Step 5, we'll calculate the sum-of-squares for the rows ($SS_{R}$):

\[
SS_R = n_r \cdot \sum_{r} (\overline{X}_r - \overline{X})^2 
\]

where r ranges over rows and $n_r$ is the number of subjects in each row (here 10, two cells of 5).

  • $SS_{R}$ = $n_r$ [($\overline{X}_{placebo}$ - $\overline{X}$)^2 + ($\overline{X}_{cocaine}$ - $\overline{X}$)^2]

    • = 10 [(8.35 - 6.575)^2 + (4.8 - 6.575)^2]
      = 10 [3.150625 + 3.150625]
      = 10 [6.30125]
      = 63.0125

  • $df_{R}$ = r - 1

    • = 2-1
      = 1

  • $s_{R}^2$ = $SS_{R}$ / $df_{R}$

    • = 63.0125 / 1
      = 63.0125
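The row calculation can be sketched by pooling the scores for each substance across both durations. The pooled lists below are assembled from the data table, and the names are illustrative assumptions.

```python
# Hours slept for each substance (row), pooled over both durations; from the data table.
rows = {
    "placebo": [7.5, 8.0, 6.0, 7.0, 6.5, 8.0, 10.0, 13.0, 9.0, 8.5],
    "cocaine": [5.5, 3.5, 4.5, 6.0, 5.0, 5.0, 4.5, 4.0, 6.0, 4.0],
}

all_scores = [x for scores in rows.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)

n_per_row = 10  # two cells of 5 subjects in each row
ss_rows = n_per_row * sum(
    (sum(scores) / len(scores) - grand_mean) ** 2 for scores in rows.values()
)
df_rows = len(rows) - 1
ms_rows = ss_rows / df_rows

print(round(ss_rows, 4), df_rows, round(ms_rows, 4))
```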

In Step 6, we calculate the sum-of-squares for the columns ($SS_{C}$):

\[
SS_C = n_c \cdot \sum_{c} (\overline{X}_c - \overline{X})^2 
\]

where c ranges over columns and $n_c$ is the number of subjects in each column (here 10, two cells of 5).

  • $SS_{C}$ = $n_c$ [($\overline{X}_{4-week}$ - $\overline{X}$)^2 + ($\overline{X}_{12-week}$ - $\overline{X}$)^2]

    • = 10 [(5.95 - 6.575)^2 + (7.2 - 6.575)^2]
      = 10 [0.390625 + 0.390625]
      = 10 [0.78125]
      = 7.8125

  • $df_{C}$ = c - 1

    • = 2-1
      = 1

  • $s_{C}^2$ = $SS_{C}$ / $df_{C}$

    • = 7.8125 / 1
      = 7.8125
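The column calculation follows the same pattern, this time pooling the scores for each duration across both substances. As before, the pooled lists come from the data table and the names are illustrative assumptions.

```python
# Hours slept for each duration (column), pooled over both substances; from the data table.
columns = {
    "4wk": [7.5, 8.0, 6.0, 7.0, 6.5, 5.5, 3.5, 4.5, 6.0, 5.0],
    "12wk": [8.0, 10.0, 13.0, 9.0, 8.5, 5.0, 4.5, 4.0, 6.0, 4.0],
}

all_scores = [x for scores in columns.values() for x in scores]
grand_mean = sum(all_scores) / len(all_scores)

n_per_column = 10  # two cells of 5 subjects in each column
ss_columns = n_per_column * sum(
    (sum(scores) / len(scores) - grand_mean) ** 2 for scores in columns.values()
)
df_columns = len(columns) - 1
ms_columns = ss_columns / df_columns

print(round(ss_columns, 4), df_columns, round(ms_columns, 4))
```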

In Step 7, we calculate the sum-of-squares for the interaction ($SS_{RC}$):

  • $SS_{RC}$ = $SS_{B}$ - $SS_{R}$ - $SS_{C}$

    • = 81.3375 - 63.0125 - 7.8125
      = 10.5125

  • $df_{RC}$ = (r - 1)(c - 1)

    • = (2-1)(2-1)
      = 1

  • $s_{RC}^2$ = $SS_{RC}$ / $df_{RC}$

    • = 10.5125 / 1
      = 10.5125
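The subtraction above can be cross-checked against a direct formula for the interaction in a balanced design: n times the sum, over cells, of (cell mean - row mean - column mean + overall mean) squared. The sketch below assumes equal cell sizes and takes the means from the group means table; the variable names are illustrative.

```python
n = 5  # subjects per cell (equal cell sizes assumed)

# Means copied from the group means table.
cell_means = {
    ("placebo", "4wk"): 7.0,
    ("cocaine", "4wk"): 4.9,
    ("placebo", "12wk"): 9.7,
    ("cocaine", "12wk"): 4.7,
}
row_means = {"placebo": 8.35, "cocaine": 4.8}
col_means = {"4wk": 5.95, "12wk": 7.2}
grand_mean = 6.575

# Direct route: what is left of each cell mean after removing row and column effects.
ss_interaction = n * sum(
    (m - row_means[s] - col_means[d] + grand_mean) ** 2
    for (s, d), m in cell_means.items()
)

# Subtraction route used in the text.
ss_by_subtraction = 81.3375 - 63.0125 - 7.8125

print(round(ss_interaction, 4), round(ss_by_subtraction, 4))
```

Both routes agree here because the design is balanced, which is what allows the between-groups sum of squares to split cleanly into row, column, and interaction parts.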

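Now, in Step 8, we'll calculate the total sum-of-squares ($SS_{T}$). The between-groups sum of squares already contains the row, column, and interaction sums of squares ($SS_{B}$ = $SS_{R}$ + $SS_{C}$ + $SS_{RC}$), so these must not be counted twice:

  • $SS_{T}$ = $SS_{B}$ + $SS_{W}$ = $SS_{R}$ + $SS_{C}$ + $SS_{RC}$ + $SS_{W}$

    • = 63.0125 + 7.8125 + 10.5125 + 24.8
      = 106.1375

  • $df_{T}$ = N - 1

    • = 20-1
      = 19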
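The total can also be computed directly as the sum of squared deviations of all 20 scores from the overall mean, which gives a check on the partition. This is a sketch with illustrative variable names, using the scores from the data table.

```python
from statistics import mean

# Hours slept, keyed by (substance, duration); values copied from the data table.
cells = {
    ("placebo", "4wk"): [7.5, 8.0, 6.0, 7.0, 6.5],
    ("cocaine", "4wk"): [5.5, 3.5, 4.5, 6.0, 5.0],
    ("placebo", "12wk"): [8.0, 10.0, 13.0, 9.0, 8.5],
    ("cocaine", "12wk"): [5.0, 4.5, 4.0, 6.0, 4.0],
}

all_scores = [x for scores in cells.values() for x in scores]
grand_mean = mean(all_scores)

# Direct definition of the total sum of squares.
ss_total = sum((x - grand_mean) ** 2 for x in all_scores)

# The two pieces it should split into.
ss_between = 5 * sum((mean(scores) - grand_mean) ** 2 for scores in cells.values())
ss_within = sum(sum((x - mean(scores)) ** 2 for x in scores) for scores in cells.values())

print(round(ss_total, 4), round(ss_between + ss_within, 4))
```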

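At Step 9, we calculate the F values, each of which divides an effect's variance estimate by the within-groups variance estimate ($s_{W}^2$ = 24.8 / 16 = 1.55):

  • $F_{R}$ = $s_{R}^2$ / $s_{W}^2$

    • = 63.0125 / 1.55
      = 40.65323

  • $F_{C}$ = $s_{C}^2$ / $s_{W}^2$

    • = 7.8125 / 1.55
      = 5.040323

  • $F_{RC}$ = $s_{RC}^2$ / $s_{W}^2$

    • = 10.5125 / 1.55
      = 6.782258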
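The whole chain from raw scores to F values can be sketched in one place. The helper `level_mean` and the dictionary layout are assumptions made for illustration, and the code assumes the balanced 2 x 2 design used in this example.

```python
from statistics import mean

# Hours slept, keyed by (substance, duration); values copied from the data table.
cells = {
    ("placebo", "4wk"): [7.5, 8.0, 6.0, 7.0, 6.5],
    ("cocaine", "4wk"): [5.5, 3.5, 4.5, 6.0, 5.0],
    ("placebo", "12wk"): [8.0, 10.0, 13.0, 9.0, 8.5],
    ("cocaine", "12wk"): [5.0, 4.5, 4.0, 6.0, 4.0],
}
n = 5  # subjects per cell
grand_mean = mean(x for scores in cells.values() for x in scores)


def level_mean(position, level):
    """Mean of all scores at one level of a factor (0 = substance, 1 = duration)."""
    return mean(x for key, scores in cells.items() if key[position] == level for x in scores)


ss_within = sum(sum((x - mean(s)) ** 2 for x in s) for s in cells.values())
ss_between = n * sum((mean(s) - grand_mean) ** 2 for s in cells.values())
ss_rows = 2 * n * sum((level_mean(0, s) - grand_mean) ** 2 for s in ("placebo", "cocaine"))
ss_cols = 2 * n * sum((level_mean(1, d) - grand_mean) ** 2 for d in ("4wk", "12wk"))
ss_inter = ss_between - ss_rows - ss_cols

ms_within = ss_within / (20 - 4)  # df_W = N - rc

# Each effect has 1 degree of freedom here, so its mean square equals its SS.
f_values = {
    "rows": ss_rows / ms_within,
    "columns": ss_cols / ms_within,
    "r * c": ss_inter / ms_within,
}

for source, f in f_values.items():
    print(source, round(f, 4))
```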

And, finally, at Step 10, we can organize all of the above into a table, along with the appropriate $F_{CRIT}$ values (looked up in a standard F table) that we'll use for comparison and interpretation of our computations:

  • $F_{CRIT}$(1, 16) at α = 0.05 = 4.49
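If no F table is at hand, the critical value can be approximated numerically. The sketch below is not a general F routine: it only handles effects with 1 numerator degree of freedom, where an F(1, df) variable is the square of a t variable with df degrees of freedom. The function names, the Simpson's-rule integration, and the bisection search are all choices made for illustration.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + t * t / df) ** (-(df + 1) / 2)


def f_upper_tail(f, df_within, steps=2000):
    """P(F > f) for F(1, df_within), via F = t^2 and Simpson's rule (steps must be even)."""
    b = math.sqrt(f)
    h = b / steps
    total = t_pdf(0.0, df_within) + t_pdf(b, df_within)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df_within)
    central = 2 * total * h / 3  # P(-b < T < b), using the symmetry of the t density
    return 1 - central


def f_critical(alpha, df_within):
    """Value of f with P(F > f) = alpha, found by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if f_upper_tail(mid, df_within) > alpha:
            lo = mid  # tail still too heavy, so the critical value is larger
        else:
            hi = mid
    return (lo + hi) / 2


print(round(f_critical(0.05, 16), 2))  # 4.49
```

The same `f_upper_tail` function gives a p value for any obtained F with 1 and 16 degrees of freedom.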

ANOVA TABLE

Source                 SS         df   s^2       F_obt      F_crit   p
rows (substance)       63.0125    1    63.0125   40.65323   4.49     p < 0.05
columns (duration)     7.8125     1    7.8125    5.040323   4.49     p < 0.05
r * c                  10.5125    1    10.5125   6.782258   4.49     p < 0.05
within                 24.8       16   1.55      --         --       --
total                  106.1375   19   --        --         --       --

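Both factors (substance and duration of usage) are significant, as indicated by the fact that $F_{OBT}$ > $F_{CRIT}$ for each. The interaction between the two (r * c) is also significant. To interpret the significant interaction, we can look at our plot of group means:

[Figure: Means by group]

The gap between the placebo and cocaine means is larger after 12 weeks (9.7 vs. 4.7 hours) than after 4 weeks (7 vs. 4.9 hours), so the effect of the substance depends on the duration of usage.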
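The pattern in the plot can also be read off numerically by comparing the placebo-minus-cocaine difference at each duration. This is a small sketch using the cell means from the group means table; the variable names are illustrative.

```python
# Cell means copied from the group means table.
cell_means = {
    ("placebo", "4wk"): 7.0,
    ("cocaine", "4wk"): 4.9,
    ("placebo", "12wk"): 9.7,
    ("cocaine", "12wk"): 4.7,
}

# Effect of the substance (placebo minus cocaine) at each duration.
substance_effect = {
    d: cell_means[("placebo", d)] - cell_means[("cocaine", d)] for d in ("4wk", "12wk")
}

# If the two factors were purely additive, this difference would be zero.
difference_of_effects = substance_effect["12wk"] - substance_effect["4wk"]

print({d: round(e, 2) for d, e in substance_effect.items()})
print(round(difference_of_effects, 2))
```

A nonzero difference between the two simple effects is what the r * c line of the ANOVA table tests.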

Multiple Regression

Another analysis technique you could use is multiple regression. In multiple regression, we find a coefficient for each predictor (here substance, duration, and their product) such that the model best predicts the group means. The regression also computes standard errors on the coefficients, so we can determine whether a coefficient is significantly different from zero. On this simple example, multiple regression will give identical answers to ANOVA, but in more complex cases, multiple regression is a more powerful technique because it allows you to include additional nuisance predictors that the analysis controls for before testing the significance of your independent variables.
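The equivalence can be illustrated with a small regression on the same scores. This sketch makes two assumptions that are not in the text: the predictors use effect coding (-1 and +1), and the design is balanced, which makes the predictor columns mutually orthogonal so that each least-squares coefficient can be computed separately without solving a matrix equation. All names are illustrative.

```python
import math

# Hours slept, keyed by (substance, duration); values copied from the data table.
cells = {
    ("placebo", "4wk"): [7.5, 8.0, 6.0, 7.0, 6.5],
    ("cocaine", "4wk"): [5.5, 3.5, 4.5, 6.0, 5.0],
    ("placebo", "12wk"): [8.0, 10.0, 13.0, 9.0, 8.5],
    ("cocaine", "12wk"): [5.0, 4.5, 4.0, 6.0, 4.0],
}

# Effect coding (an assumption of this sketch).
code = {"placebo": -1, "cocaine": 1, "4wk": -1, "12wk": 1}

# One record per squirrel: intercept, substance, duration, product term, response.
records = [
    (1, code[s], code[d], code[s] * code[d], y)
    for (s, d), scores in cells.items()
    for y in scores
]
total_n = len(records)

# Orthogonal columns: each coefficient is sum(x * y) / sum(x * x).
coefs = [
    sum(r[j] * r[4] for r in records) / sum(r[j] ** 2 for r in records)
    for j in range(4)
]

# Residual variance, with 4 fitted coefficients.
residual_ss = sum(
    (r[4] - sum(b * r[j] for j, b in enumerate(coefs))) ** 2 for r in records
)
ms_residual = residual_ss / (total_n - 4)

# With -1/+1 codes, sum(x * x) = N for every column, so all standard errors match.
std_error = math.sqrt(ms_residual / total_n)
t_squared = [(b / std_error) ** 2 for b in coefs[1:]]

print([round(b, 4) for b in coefs])
print([round(t2, 4) for t2 in t_squared])
```

Under these assumptions the squared t statistics for the substance, duration, and product coefficients match the row, column, and interaction F values computed from the same scores.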

MoreThanTwoVariables (last edited 2012-01-14 00:05:00 by cpe-69-207-83-233)
