Hypothesis Testing: Two-Sample Problems for Independent Normal Distributions

Hypothesis Testing: Two-Sample Problems for Independent Normal Distributions - Equality of Means (Pooled vs. Un-pooled t-test)

Overview

In two-sample problems with independent normal populations, we often want to test whether the two population means are equal, that is, $H_0: \mu_1 = \mu_2$ against an alternative such as $H_1: \mu_1 \neq \mu_2$. The two main approaches are the pooled t-test and the un-pooled t-test (also known as Welch's t-test). The choice between them depends on whether the two population variances can be assumed equal.

Pooled t-test

The pooled t-test is used when we assume that the variances of the two populations are equal. The test statistic is calculated using a pooled estimate of the variance.

Assumptions:

The populations from which the samples are drawn are normally distributed.
The samples are independent.
The population variances are equal (homogeneity of variance).

Test Statistic:

The pooled t-test statistic is given by:

$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{S_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$

where $S_p^2$ is the pooled sample variance:

$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{n_1 + n_2 - 2}$

and

$\bar{X}_1$ and $\bar{X}_2$ are the sample means,
$S_1^2$ and $S_2^2$ are the sample variances,
$n_1$ and $n_2$ are the sample sizes.

Under the null hypothesis of equal means, $t$ follows a t-distribution with $n_1 + n_2 - 2$ degrees of freedom.
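The formulas above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `pooled_t` and the summary statistics passed to it are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df, pooled variance) for the equal-variance two-sample t-test."""
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    # Standard error of the difference in sample means
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, sp2

# Hypothetical summary statistics: equal sample sizes and equal sample variances
t_stat, df, sp2 = pooled_t(mean1=5.0, var1=2.0, n1=8, mean2=4.0, var2=2.0, n2=8)
print(f"S_p^2 = {sp2:.3f}, t = {t_stat:.3f}, df = {df}")
```

When both sample variances are equal, the pooled variance reduces to that common value, which is a quick sanity check on the weighting.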

Example:

Suppose we have two independent samples:

Sample 1: $n_1 = 10$, $\bar{X}_1 = 15$, $S_1^2 = 4$
Sample 2: $n_2 = 12$, $\bar{X}_2 = 18$, $S_2^2 = 5$

Calculate the pooled variance $S_p^2$:

$S_p^2 = \frac{(10 - 1) \cdot 4 + (12 - 1) \cdot 5}{10 + 12 - 2} = \frac{36 + 55}{20} = 4.55$

Calculate the t-statistic:

$t = \frac{15 - 18}{\sqrt{4.55 \left( \frac{1}{10} + \frac{1}{12} \right)}} = \frac{-3}{\sqrt{4.55 \cdot 0.1833}} = \frac{-3}{\sqrt{0.8342}} \approx \frac{-3}{0.9133} \approx -3.28$

Determine the degrees of freedom:

$df = 10 + 12 - 2 = 20$

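Compare the t-statistic with the critical value of the t-distribution with 20 degrees of freedom at the desired significance level. For a two-sided test at $\alpha = 0.05$, the critical value is $t_{0.025,\,20} \approx 2.086$; since $|t|$ exceeds it, we reject the null hypothesis of equal means.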
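Instead of a table lookup, the comparison can be expressed as a p-value. The sketch below is one possible approach, not a standard library routine: it integrates the t density numerically with Simpson's rule, and the helper names `t_pdf` and `two_sided_p` are assumptions introduced for illustration. It is applied to the summary statistics of the example above.

```python
import math

def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided p-value P(|T| >= |t|), using Simpson's rule on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * total * h / 3  # P(-|t| <= T <= |t|), by symmetry
    return 1 - central_mass

# Summary statistics from the worked example
sp2 = ((10 - 1) * 4 + (12 - 1) * 5) / (10 + 12 - 2)
t_stat = (15 - 18) / math.sqrt(sp2 * (1 / 10 + 1 / 12))
alpha = 0.05
p_value = two_sided_p(t_stat, df=20)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}, reject H0: {p_value < alpha}")
```

Rejecting when the p-value is below $\alpha$ is equivalent to rejecting when $|t|$ exceeds the two-sided critical value.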

Un-pooled t-test (Welch's t-test)

The un-pooled t-test is used when we do not assume equal variances. It estimates each population variance separately, which makes it more robust when the variances are unequal.

Assumptions:

The populations from which the samples are drawn are normally distributed.
The samples are independent.
The population variances are not assumed to be equal.

Test Statistic:

Welch's t-test statistic is given by:

$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{S_1^2}{n_1} + \frac{S_2^2}{n_2}}}$

The degrees of freedom are approximated using the following formula (Welch-Satterthwaite equation):

$df = \frac{\left( \frac{S_1^2}{n_1} + \frac{S_2^2}{n_2} \right)^2}{\frac{\left( \frac{S_1^2}{n_1} \right)^2}{n_1 - 1} + \frac{\left( \frac{S_2^2}{n_2} \right)^2}{n_2 - 1}}$
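The statistic and the Welch-Satterthwaite approximation translate directly into code. The following is a minimal sketch using only the standard library; the function name `welch_t` and the summary statistics are hypothetical, chosen so that the smaller sample has the larger variance.

```python
import math

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Return (t, df) for Welch's t-test from summary statistics."""
    v1, v2 = var1 / n1, var2 / n2  # estimated variances of the two sample means
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation; the result is usually not an integer
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Hypothetical data: small, noisy sample versus large, precise sample
n1, n2 = 6, 30
t_stat, df = welch_t(mean1=20.0, var1=9.0, n1=n1, mean2=22.0, var2=1.0, n2=n2)
print(f"t = {t_stat:.3f}, df = {df:.2f}")

# With equal sample sizes and equal sample variances, df equals n1 + n2 - 2
_, df_equal = welch_t(5.0, 2.0, 8, 4.0, 2.0, 8)
print(f"df in the balanced case = {df_equal:.2f}")
```

The approximate degrees of freedom always lie between $\min(n_1, n_2) - 1$ and $n_1 + n_2 - 2$, so the approximation never claims more degrees of freedom than the pooled test.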

Example:

Using the same sample data as above:

Sample 1: $n_1 = 10$, $\bar{X}_1 = 15$, $S_1^2 = 4$
Sample 2: $n_2 = 12$, $\bar{X}_2 = 18$, $S_2^2 = 5$

Calculate the t-statistic:

$t = \frac{15 - 18}{\sqrt{\frac{4}{10} + \frac{5}{12}}} = \frac{-3}{\sqrt{0.4 + 0.4167}} = \frac{-3}{\sqrt{0.8167}} \approx \frac{-3}{0.9037} \approx -3.32$

Calculate the degrees of freedom:

$df = \frac{\left( \frac{4}{10} + \frac{5}{12} \right)^2}{\frac{\left( \frac{4}{10} \right)^2}{10 - 1} + \frac{\left( \frac{5}{12} \right)^2}{12 - 1}} = \frac{0.8167^2}{\frac{0.16}{9} + \frac{0.1736}{11}} = \frac{0.6669}{0.01778 + 0.01578} = \frac{0.6669}{0.03356} \approx 19.87$

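Compare the t-statistic with the critical value of the t-distribution with approximately 20 degrees of freedom at the desired significance level. For a two-sided test at $\alpha = 0.05$, the critical value is about 2.09; since $|t|$ exceeds it, we reject the null hypothesis of equal means.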

Conclusion

The pooled t-test is used when the variances of the two populations are assumed to be equal. It uses a single pooled estimate of the variance and has $n_1 + n_2 - 2$ degrees of freedom.
The un-pooled t-test (Welch's t-test) does not assume equal variances and is more robust when they differ. It estimates the two variances separately and approximates the degrees of freedom with the Welch-Satterthwaite equation.

Choosing the correct test depends on whether the assumption of equal variances is reasonable. If there is doubt about the equality of variances, Welch's t-test is generally preferred.
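To see why the choice matters, the sketch below applies both tests to the same summary statistics. The data are hypothetical and deliberately unbalanced (the small sample has the much larger variance); the function names are assumptions introduced for illustration.

```python
import math

def pooled_t(mean1, var1, n1, mean2, var2, n2):
    """Equal-variance t-test: returns (t, df)."""
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    return (mean1 - mean2) / math.sqrt(sp2 * (1 / n1 + 1 / n2)), df

def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t-test: returns (t, df) with Welch-Satterthwaite df."""
    v1, v2 = var1 / n1, var2 / n2
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return (mean1 - mean2) / math.sqrt(v1 + v2), df

# Hypothetical data: n1 = 5 with variance 16, n2 = 40 with variance 1
stats = dict(mean1=10.0, var1=16.0, n1=5, mean2=12.0, var2=1.0, n2=40)
t_pooled, df_pooled = pooled_t(**stats)
t_welch, df_welch = welch_t(**stats)

print(f"pooled: t = {t_pooled:.2f}, df = {df_pooled}")
print(f"Welch:  t = {t_welch:.2f}, df = {df_welch:.2f}")
```

In this configuration the pooled variance is dominated by the large, low-variance sample, so the pooled test reports a larger $|t|$ with more degrees of freedom than Welch's test. That is the situation in which wrongly assuming equal variances tends to overstate the evidence against the null hypothesis.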
