Learning Objectives
By the end of this chapter, you will be able to:
- Set up hypotheses for comparing two population means
- Calculate the test statistic for difference of means
- Perform hypothesis tests comparing two independent groups
- Interpret results in practical context
When to Use Two-Sample Z-Test
Use this test when:
- Comparing two independent population means (μ₁ vs μ₂)
- Both sample sizes are large (n₁ ≥ 30 AND n₂ ≥ 30)
- Samples are independent (different groups, no matching)
flowchart TD
A[Comparing two means?]
B{Are samples independent?}
C{Are both n ≥ 30?}
D[Two-sample Z-test]
E[Use paired t-test]
F[Use two-sample t-test]
A --> B
B -->|Yes| C
B -->|No/Matched| E
C -->|Yes| D
C -->|No| F
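The decision path in the flowchart can be sketched as a small Python function. The function name `choose_test` and its return strings are hypothetical, chosen here only for illustration; the thresholds are the ones stated above.

```python
def choose_test(independent: bool, n1: int, n2: int) -> str:
    """Return the test suggested by the flowchart (illustrative helper)."""
    if not independent:
        # Matched or paired observations
        return "paired t-test"
    if n1 >= 30 and n2 >= 30:
        # Both samples are large
        return "two-sample z-test"
    # Independent samples, but at least one is small
    return "two-sample t-test"


print(choose_test(True, 50, 60))   # two-sample z-test
print(choose_test(True, 20, 60))   # two-sample t-test
print(choose_test(False, 50, 50))  # paired t-test
```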
Hypotheses for Two Means
| Type | H₀ | H₁ |
|---|---|---|
| Two-tailed | μ₁ = μ₂ | μ₁ ≠ μ₂ |
| Right-tailed | μ₁ = μ₂ | μ₁ > μ₂ |
| Left-tailed | μ₁ = μ₂ | μ₁ < μ₂ |
Alternative forms:
- H₀: μ₁ - μ₂ = 0
- H₁: μ₁ - μ₂ ≠ 0 (or > 0 or < 0)
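Each form of H₁ corresponds to a different rejection region for z. As a rough sketch, the critical values can be looked up with Python's standard library instead of a z-table. The labels `"two-sided"`, `"greater"`, and `"less"` are naming assumptions made for this example, not terms from this chapter.

```python
from statistics import NormalDist


def critical_values(alternative: str, alpha: float) -> tuple:
    """Critical z value(s) for the chosen alternative hypothesis."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if alternative == "two-sided":   # H1: mu1 != mu2
        z_star = std_normal.inv_cdf(1 - alpha / 2)
        return (-z_star, z_star)
    if alternative == "greater":     # H1: mu1 > mu2
        return (std_normal.inv_cdf(1 - alpha),)
    if alternative == "less":        # H1: mu1 < mu2
        return (std_normal.inv_cdf(alpha),)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")


for alt in ("two-sided", "greater", "less"):
    print(alt, [round(v, 3) for v in critical_values(alt, 0.05)])
```

At α = 0.05 this gives approximately ±1.96 for the two-tailed case and 1.645 or -1.645 for the one-tailed cases.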
Test Statistic Formula
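\[z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}\]
Under H₀ (μ₁ = μ₂), the hypothesized difference is 0. When the population standard deviations σ₁ and σ₂ are unknown and both samples are large, the sample standard deviations s₁ and s₂ are used in their place:
\[z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}\]
Where: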
- $\bar{x}_1, \bar{x}_2$ = sample means
- $s_1, s_2$ = sample standard deviations
- $n_1, n_2$ = sample sizes
Standard Error of Difference
\[SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}\]
This estimates how much the difference between the two sample means, $\bar{x}_1 - \bar{x}_2$, varies from sample to sample.
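A minimal Python sketch of the two formulas follows. The function names and the input numbers are hypothetical, picked so the arithmetic is easy to check by hand.

```python
import math


def standard_error(s1: float, n1: int, s2: float, n2: int) -> float:
    """Standard error of the difference between two sample means."""
    return math.sqrt(s1**2 / n1 + s2**2 / n2)


def z_statistic(mean1, mean2, s1, n1, s2, n2, hypothesized_diff=0.0):
    """z = ((mean1 - mean2) - hypothesized difference) / SE."""
    return (mean1 - mean2 - hypothesized_diff) / standard_error(s1, n1, s2, n2)


# Illustrative values only: s1 = 6, n1 = 36, s2 = 8, n2 = 64
se = standard_error(6, 36, 8, 64)        # sqrt(1 + 1)
z = z_statistic(10, 7, 6, 36, 8, 64)     # 3 / sqrt(2)
print(round(se, 4), round(z, 4))         # 1.4142 2.1213
```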
Step-by-Step Example 1: Two-Tailed Test
Problem: A study compares processing times at two government offices:
| | Office A | Office B |
|---|---|---|
| Sample size | 50 | 60 |
| Mean time (min) | 45 | 40 |
| Std deviation | 12 | 10 |
Test at α = 0.05 whether there is a significant difference in mean processing times.
Solution:
Step 1: State hypotheses
- $H_0: \mu_1 = \mu_2$ (no difference)
- $H_1: \mu_1 \neq \mu_2$ (different)
Step 2: Significance level
- α = 0.05 (two-tailed)
Step 3: Calculate test statistic
First, find the standard error:
\[SE = \sqrt{\frac{12^2}{50} + \frac{10^2}{60}} = \sqrt{\frac{144}{50} + \frac{100}{60}} = \sqrt{2.88 + 1.67} = \sqrt{4.55} \approx 2.133\]
Then calculate z:
\[z = \frac{45 - 40}{2.133} = \frac{5}{2.133} \approx 2.34\]
Step 4: Find critical value
- Two-tailed, α = 0.05: z* = ±1.96
Step 5: Decision
- |z| = 2.34 > 1.96
- Reject H₀
Step 6: Conclusion
At the 0.05 level of significance, there is sufficient evidence to conclude that the mean processing times of the two offices differ. The sample suggests that Office A has the longer mean processing time.
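The hand calculation in Example 1 can be checked with a few lines of Python. This is only a verification sketch and the variable names are illustrative. Without intermediate rounding the standard error comes out near 2.132 rather than 2.133; z still rounds to 2.34.

```python
import math
from statistics import NormalDist

# Example 1 data
n_a, mean_a, sd_a = 50, 45, 12   # Office A
n_b, mean_b, sd_b = 60, 40, 10   # Office B
alpha = 0.05

se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
z = (mean_a - mean_b) / se

# Two-tailed test: split alpha across both tails
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
reject_h0 = abs(z) > z_crit

print(f"SE = {se:.3f}, z = {z:.2f}, critical value = {z_crit:.2f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```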
Step-by-Step Example 2: Right-Tailed Test
Problem: An HR department wants to test if employees with training have higher productivity than those without.
| | With Training | Without Training |
|---|---|---|
| n | 40 | 45 |
| Mean | 85 | 78 |
| SD | 15 | 18 |
Test at α = 0.05.
Solution:
Step 1: State hypotheses
- $H_0: \mu_1 = \mu_2$
- $H_1: \mu_1 > \mu_2$ (trained > untrained)
Step 2: Significance level
- α = 0.05 (right-tailed)
Step 3: Calculate test statistic
\[SE = \sqrt{\frac{15^2}{40} + \frac{18^2}{45}} = \sqrt{\frac{225}{40} + \frac{324}{45}} = \sqrt{5.625 + 7.2} = \sqrt{12.825} \approx 3.581\]
\[z = \frac{85 - 78}{3.581} = \frac{7}{3.581} \approx 1.955\]
Step 4: Find critical value
- Right-tailed, α = 0.05: z* = 1.645
Step 5: Decision
- z = 1.955 > 1.645
- Reject H₀
Step 6: Conclusion
At the 0.05 level of significance, there is sufficient evidence to conclude that employees with training have higher productivity than those without training.
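Example 2 is a useful case for seeing why the direction of H₁ matters. The sketch below (variable names are illustrative) recomputes z and compares it with both the right-tailed and the two-tailed critical values at α = 0.05. With these data, z clears 1.645 but falls just short of 1.96, so the same sample would not reject H₀ under a two-tailed alternative.

```python
import math
from statistics import NormalDist

# Example 2 data
n1, mean1, sd1 = 40, 85, 15   # with training
n2, mean2, sd2 = 45, 78, 18   # without training
alpha = 0.05

se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
z = (mean1 - mean2) / se

std_normal = NormalDist()
right_crit = std_normal.inv_cdf(1 - alpha)          # one tail holds all of alpha
two_tail_crit = std_normal.inv_cdf(1 - alpha / 2)   # each tail holds alpha / 2

print(f"z = {z:.3f}")
print(f"Right-tailed: critical = {right_crit:.3f}, reject = {z > right_crit}")
print(f"Two-tailed:   critical = {two_tail_crit:.3f}, reject = {abs(z) > two_tail_crit}")
```

The alternative hypothesis should therefore be fixed from the research question before the data are examined.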
Step-by-Step Example 3: Finding p-Value
Problem: Compare average incomes of two districts:
| | District A | District B |
|---|---|---|
| n | 100 | 80 |
| Mean (NPR) | 42,000 | 38,000 |
| SD (NPR) | 10,000 | 12,000 |
Test if District A has higher income at α = 0.01 and find the p-value.
Solution:
Step 1: State hypotheses
- $H_0: \mu_A = \mu_B$
- $H_1: \mu_A > \mu_B$ (right-tailed)
Step 2: Calculate test statistic
\[SE = \sqrt{\frac{10000^2}{100} + \frac{12000^2}{80}} = \sqrt{1{,}000{,}000 + 1{,}800{,}000} = \sqrt{2{,}800{,}000} \approx 1673.32\]
\[z = \frac{42000 - 38000}{1673.32} = \frac{4000}{1673.32} \approx 2.39\]
Step 3: Find p-value
For a right-tailed test:
\[p\text{-value} = P(Z > 2.39) = 1 - 0.9916 = 0.0084\]
Step 4: Decision
- p-value = 0.0084 < α = 0.01
- Reject H₀
Step 5: Conclusion
At the 0.01 level of significance, there is sufficient evidence to conclude that District A has higher average income than District B. The p-value of 0.0084 indicates strong evidence against H₀.
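The table lookup for the p-value can be reproduced with `statistics.NormalDist` from the Python standard library. This is a sketch for checking Example 3, with illustrative variable names; because z is not rounded to 2.39 first, the result may differ from the table value in the later decimal places.

```python
import math
from statistics import NormalDist

# Example 3 data (NPR)
n_a, mean_a, sd_a = 100, 42_000, 10_000   # District A
n_b, mean_b, sd_b = 80, 38_000, 12_000    # District B
alpha = 0.01

se = math.sqrt(sd_a**2 / n_a + sd_b**2 / n_b)
z = (mean_a - mean_b) / se

# Right-tailed p-value: area under the standard normal curve to the right of z
p_value = 1 - NormalDist().cdf(z)

print(f"SE = {se:.2f}, z = {z:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```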
Step-by-Step Example 4: Left-Tailed Test
Problem: A policy aims to reduce wait times. Compare before and after implementation (different samples):
| | Before Policy | After Policy |
|---|---|---|
| n | 60 | 50 |
| Mean (min) | 35 | 30 |
| SD (min) | 8 | 7 |
Test at α = 0.05 if wait time decreased.
Solution:
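Step 1: State hypotheses
Let μ₁ = mean wait time before the policy and μ₂ = mean wait time after the policy.
- $H_0: \mu_1 = \mu_2$
- $H_1: \mu_2 < \mu_1$, that is, $\mu_{after} < \mu_{before}$ (wait time decreased)
To treat this as a left-tailed test, compute the difference as after minus before ($\bar{x}_2 - \bar{x}_1$), so that a decrease in wait time gives a negative z.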
Step 2: Calculate test statistic
\[SE = \sqrt{\frac{8^2}{60} + \frac{7^2}{50}} = \sqrt{1.067 + 0.98} = \sqrt{2.047} \approx 1.431\]
\[z = \frac{30 - 35}{1.431} = \frac{-5}{1.431} \approx -3.49\]
Step 3: Find critical value
- Left-tailed, α = 0.05: z* = -1.645
Step 4: Decision
- z = -3.49 < -1.645
- Reject H₀
Step 5: Conclusion
At the 0.05 level of significance, there is strong evidence that the mean wait time after the policy is lower than the mean wait time before it.
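For a left-tailed test the p-value is the area to the left of z. The following sketch applies that to Example 4, taking the difference as after minus before; the variable names are illustrative.

```python
import math
from statistics import NormalDist

# Example 4 data (minutes)
n_before, mean_before, sd_before = 60, 35, 8
n_after, mean_after, sd_after = 50, 30, 7
alpha = 0.05

se = math.sqrt(sd_before**2 / n_before + sd_after**2 / n_after)

# After minus before, so a reduction in wait time gives a negative z
z = (mean_after - mean_before) / se

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(alpha)   # left-tail critical value, about -1.645
p_value = std_normal.cdf(z)          # area to the left of z

print(f"z = {z:.2f}, critical value = {z_crit:.3f}, p-value = {p_value:.5f}")
print("Reject H0" if z < z_crit else "Fail to reject H0")
```

The critical-value rule and the p-value rule always lead to the same decision; here the p-value is well below 0.001.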
Confidence Interval for Difference of Means
\[(\bar{x}_1 - \bar{x}_2) \pm z^* \times SE\]
Example 5: 95% CI for Difference
Using Example 1 data:
- Difference = 45 - 40 = 5
- SE = 2.133
- z* = 1.96
\[95\% \text{ CI} = 5 \pm 1.96 \times 2.133 = 5 \pm 4.18 = (0.82, 9.18)\]
Interpretation: We are 95% confident that Office A’s mean processing time is between 0.82 and 9.18 minutes longer than Office B’s.
Since 0 is not in the interval, the difference is significant at the 0.05 level, which agrees with the two-tailed test in Example 1.
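The interval can be computed for any confidence level with a short helper. The function name `diff_confidence_interval` is a hypothetical choice for this sketch; it uses the same standard error as the test statistic and takes z* from the standard normal distribution.

```python
import math
from statistics import NormalDist


def diff_confidence_interval(mean1, mean2, s1, n1, s2, n2, confidence=0.95):
    """Large-sample confidence interval for mu1 - mu2 (illustrative helper)."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    # z* leaves (1 - confidence) / 2 in each tail
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    diff = mean1 - mean2
    return diff - margin, diff + margin


# Example 1 data: Office A vs Office B
low, high = diff_confidence_interval(45, 40, 12, 50, 10, 60)
print(f"95% CI: ({low:.2f}, {high:.2f})")
print("0 inside interval:", low <= 0 <= high)
```

Passing `confidence=0.99` widens the interval, since z* rises to about 2.576.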
Summary Table
| Component | Formula |
|---|---|
| Test Statistic | $z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$ |
| Standard Error | $SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$ |
| 95% CI | $(\bar{x}_1 - \bar{x}_2) \pm 1.96 \times SE$ |
Practice Problems
Problem 1
Compare mean scores:
| | Group 1 | Group 2 |
|---|---|---|
| n | 40 | 50 |
| Mean | 72 | 68 |
| SD | 10 | 12 |
Test at α = 0.05 if means differ.
Problem 2
Test if Method A produces higher output than Method B:
- Method A: n=36, $\bar{x}$=95, s=15
- Method B: n=40, $\bar{x}$=88, s=12
Use α = 0.01.
Problem 3
For the data in Problem 1, construct a 95% confidence interval for the difference in means.
Problem 4
Two factories are compared:
- Factory 1: n=100, $\bar{x}$=50, s=8
- Factory 2: n=120, $\bar{x}$=48, s=10
- (a) Test if means differ at α = 0.05
- (b) Find the p-value
- (c) Construct a 99% CI for the difference
Problem 5
If z = 2.5 for a two-tailed test comparing two means, find the p-value and state the decision at α = 0.05.
Summary
| Aspect | Key Point |
|---|---|
| Purpose | Compare two independent population means |
| Requirements | n₁ ≥ 30, n₂ ≥ 30, independent samples |
| H₀ | μ₁ = μ₂ (no difference) |
| Test Statistic | z = (difference in means) / SE |
| Decision | Same rules as single-sample z-test |
Next Topic
In the next chapter, we will study the Large Sample Test for a Single Proportion, which tests claims about population proportions.

