Statistical Tests for Experiments with Two Samples

A common experimental design takes measurements from both a test group and a control group. For example, we may want to test the effectiveness of a new drug. A scientifically reasonable way of conducting the experiment is to give the drug to the test group while giving a placebo to the control group. If there is a significant difference between the test group and the control group, we can conclude that the drug has a statistically detectable effect on patients' well-being.

An important concept in experiments with two groups is the "repeated-measures design" or "paired design". A paired design refers to studies where two or more sets of measurements are taken from a single group of subjects under different conditions. Suppose we want to study the effect of sleep on memory. Having a test group of subjects who sleep normally and another group of subjects who are deprived of sleep does not reveal much about the target effect, because any observed difference may be due to the inherent difference in memory performance between these two groups. A repeated-measures design is recommended here so that we can measure memory performance before and after sleep deprivation on the same group of subjects.

In the following, we illustrate how to conduct statistical tests in the unpaired design (i.e. test group vs. control group) and in the paired design.

Unpaired (Independent Group) t-test

The unpaired t-test is applicable when the experimental group and the control group consist of different pools of subjects. Consider the following example:

Example Problem

Let's first identify the null hypothesis: the hormone has no effect on the number of times male rats mount a female during a 20-minute period. Note that we have 20 rats here, divided into two groups of 10. Such a design is an independent group design, so an unpaired t-test is appropriate.

Solve the Problem by Hand

Now we test for a significant change in the mating rate of rats under the treatment of the hormone using an alpha = 0.05 significance level.

We define $D_0$ as the difference in population means between the two conditions. The null hypothesis states that this value is 0:

\[
D_0 = \mu_{hormone} - \mu_{placebo} = 0
\]

Then, calculate the difference in means of the two samples:

\[
\bar{D} = \bar{x}_{hormone} - \bar{x}_{placebo} = 8.4-5.6=2.8
\]

The standard error of the difference in means can be computed from the variance of each sample, $s^{2}_{hormone}$ and $s^{2}_{placebo}$, where $N = 10$ is the number of rats in each group:

\[
se_{D} = \sqrt{\frac{s^{2}_{hormone} + s^{2}_{placebo}}{N}} = \sqrt{\frac{6.197^{2} + 5.139^{2}}{10}} = 2.546
\]

Finally, we calculate the obtained t-statistic, $t_{obt}$. The formula is as follows:

\[
t_{obt} = \frac{\bar{D}-D_0}{se_{D}} = \frac{2.8 - 0}{2.546} = 1.10
\]

Looking up the t-distribution table, we find that the two-tailed critical value at df = 2N − 2 = 18 is 2.101. Since our obtained t-statistic is smaller than the critical value, we fail to reject the null hypothesis. The data do not support the hypothesis that hormone X has an effect on the frequency of male rat mounting behavior during a 20-minute period.

Note: the degrees of freedom for an independent group design are the total number of subjects minus the number of groups (here, 20 − 2 = 18).
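The hand calculation above can be reproduced with a short Python sketch. It works only from the summary statistics quoted in the text (means, standard deviations, and N = 10 per group) and assumes equal group sizes, as the formula for the standard error does. The function name `unpaired_t_from_summary` and the constant `T_CRIT` are illustrative names, not part of any library; the critical value is copied from the t table rather than computed.

```python
import math

def unpaired_t_from_summary(mean1, sd1, mean2, sd2, n, d0=0.0):
    """Unpaired t-statistic for two groups of equal size n (illustrative helper)."""
    diff = mean1 - mean2                       # difference in sample means
    se = math.sqrt((sd1 ** 2 + sd2 ** 2) / n)  # standard error of the difference in means
    t = (diff - d0) / se                       # obtained t-statistic
    df = 2 * n - 2                             # total subjects minus number of groups
    return diff, se, t, df

# Summary statistics quoted in the text (hormone group first, placebo second).
diff, se, t_obt, df = unpaired_t_from_summary(8.4, 6.197, 5.6, 5.139, n=10)

T_CRIT = 2.101  # two-tailed critical value for alpha = 0.05, df = 18 (from the t table)
reject_null = abs(t_obt) > T_CRIT

print(f"se = {se:.3f}, t = {t_obt:.2f}, df = {df}")  # se = 2.546, t = 1.10, df = 18
print("reject null" if reject_null else "fail to reject null")
```

Taking the absolute value of the statistic before comparing it with the critical value is what makes the decision two-tailed.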

Paired (Repeated-Measures) t-test

The paired group t-test is used when the two sets of measurements are taken on the same group of subjects. Consider the following example.

Example Problem

Subject No.    Placebo pill    Birth Control pill
1              102             108
2              76              76
3              66              69
4              71              78
5              68              74
6              85              85
7              82              79
8              78              78
9              79              80
10             80              81

The first step, again, is to identify the null hypothesis: The birth control pill does not affect the blood pressure of women who are taking it (i.e. $D_0 = 0$). A two-tailed test should be used since an effect in either direction (i.e. whether the birth control pill increases or decreases blood pressure) would be of interest.

Solve the Problem by Hand

Now we test for a significant change in the blood pressure of women on birth control pills using an alpha = 0.05 significance level.

First, calculate the mean blood pressure change of the sample:

\[
\bar{D} = 2.1
\]

Next, compute the standard deviation of the sample difference scores (this is done by taking the differences first and then calculating their standard deviation):

\[
s = 3.281
\]
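As a worked check, the two values above can be recovered from the table by subtracting each subject's placebo reading from her birth control pill reading. Here $D_i$ is notation introduced for the difference score of subject $i$, and $n = 10$ is the number of subjects:

\[
D_i = 6,\ 0,\ 3,\ 7,\ 6,\ 0,\ -3,\ 0,\ 1,\ 1
\]

\[
\bar{D} = \frac{\sum D_i}{n} = \frac{21}{10} = 2.1, \qquad s = \sqrt{\frac{\sum (D_i - \bar{D})^{2}}{n-1}} = \sqrt{\frac{96.9}{9}} = 3.281
\]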

Now, the standard error of the mean difference, $se_{D}$, is:

\[
se_{D} = \frac{s}{\sqrt{n}} = \frac{3.281}{\sqrt{10}} = 1.038
\]

Note that in the unpaired t-test, we calculated the standard error of the difference in means, while here we calculated the standard error of the mean difference. This is the key difference between the independent group design and the paired group design.

And, finally, our $t_{obt}$:

\[
t_{obt} = \frac{\bar{D}-D_0}{se_{D}} = \frac{2.1 -0 }{1.038} = 2.024
\]

What are the degrees of freedom for a paired t-test? Since we have only 10 subjects in our group, the df is the same as in the one-sample t-test: N - 1. Now, we can look up the critical value of t for a two-tailed test using alpha = 0.05 and the appropriate degrees of freedom (df = 10 - 1 = 9). The critical value is 2.262, which is larger than our obtained t-statistic. Thus, we retain the null hypothesis and conclude that there is insufficient evidence for the claim that the birth control pill affects the blood pressure of women who take it.
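The paired calculation can also be sketched in Python using only the standard library. The readings are copied from the example table; the variable names and the constant `T_CRIT` are illustrative, and the critical value is taken from the t table rather than computed.

```python
import math
import statistics

# Readings from the example table, in subject order 1..10.
placebo = [102, 76, 66, 71, 68, 85, 82, 78, 79, 80]
pill = [108, 76, 69, 78, 74, 85, 79, 78, 80, 81]

# Difference score per subject: birth control pill minus placebo.
diffs = [b - a for a, b in zip(placebo, pill)]

n = len(diffs)
mean_d = statistics.mean(diffs)   # mean difference
sd_d = statistics.stdev(diffs)    # sample standard deviation (n - 1 denominator)
se_d = sd_d / math.sqrt(n)        # standard error of the mean difference
t_obt = (mean_d - 0) / se_d       # D_0 = 0 under the null hypothesis
df = n - 1

T_CRIT = 2.262  # two-tailed critical value for alpha = 0.05, df = 9 (from the t table)
reject_null = abs(t_obt) > T_CRIT

print(f"mean = {mean_d:.1f}, s = {sd_d:.3f}, se = {se_d:.3f}, t = {t_obt:.3f}")
# mean = 2.1, s = 3.281, se = 1.038, t = 2.024
print("reject null" if reject_null else "retain null")
```

The only structural change from the unpaired case is that the two columns are collapsed into a single list of differences before any statistic is computed.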

Note: If you use a statistics program to run this t-test, you will find that the p-value is 0.074. When the p-value is greater than 0.05 and less than 0.1, we often refer to the result as "marginally significant". A marginally significant result often indicates a weak but probable effect, and thus is worth mentioning. Reporting a result as "marginally significant" is a common practice in experimental sciences.
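For readers without a statistics program at hand, the two-tailed p-value can be approximated by numerically integrating the density of the t distribution. The sketch below is one way to do this with Simpson's rule and the standard library; the function names `t_pdf` and `two_tailed_p` and the `steps` parameter are illustrative choices, not the method any particular package uses.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        weight = 4 if i % 2 else 2      # Simpson weights: 4 at odd nodes, 2 at even nodes
        total += weight * t_pdf(i * h, df)
    area_0_to_t = total * h / 3         # probability mass between 0 and |t|
    return 1 - 2 * area_0_to_t          # mass in both tails, by symmetry

p_value = two_tailed_p(2.024, df=9)
print(f"p = {p_value:.3f}")  # p = 0.074
```

Because the t distribution is symmetric about zero, integrating one side and doubling it is enough to obtain both tails.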


StatsWiki: TwoSamplesOneVariable (last edited 2012-01-13 06:46:50 by KathyNordeen)
