Learning Objectives

By the end of this chapter, you will be able to:

  • Calculate required sample size for estimating a mean
  • Calculate required sample size for estimating a proportion
  • Understand the relationship between sample size and precision
  • Handle situations where population variance is unknown
  • Apply sample size formulas to practical problems

Why Determine Sample Size?

Before collecting data, we need to know how many observations are required to achieve:

  1. Desired precision (margin of error)
  2. Desired confidence level
  3. Cost-effectiveness (not too large or too small)

flowchart TD
    A[Sample Size Planning]
    A --> B[Too Small]
    A --> C[Just Right]
    A --> D[Too Large]

    B --> B1["Imprecise estimates<br/>Wide confidence intervals"]
    C --> C1["Good precision<br/>Cost-effective"]
    D --> D1["Wasted resources<br/>Unnecessary cost"]

Sample Size for Estimating Mean (σ Known)

Formula

Starting from margin of error:

\[ME = z^* \cdot \frac{\sigma}{\sqrt{n}}\]

Solving for n:

\[n = \left(\frac{z^* \cdot \sigma}{ME}\right)^2\]

Where:

  • $n$ = required sample size
  • $z^*$ = critical z-value for the chosen confidence level (for example, 1.96 for 95%)
  • $\sigma$ = population standard deviation
  • $ME$ = desired margin of error

Step-by-Step Example 1: Sample Size for Mean

Problem: A researcher wants to estimate the average monthly expenditure of households. Historical data suggests σ = NPR 5,000. How large a sample is needed for a 95% confidence interval with margin of error NPR 800?

Solution:

Step 1: Identify given values

  • σ = 5,000
  • ME = 800
  • Confidence = 95% → z* = 1.96

Step 2: Apply formula \(n = \left(\frac{z^* \cdot \sigma}{ME}\right)^2 = \left(\frac{1.96 \times 5000}{800}\right)^2\)

\[n = \left(\frac{9800}{800}\right)^2 = (12.25)^2 = 150.0625\]

Step 3: Round up (always!) \(n = 151\)

Answer: A sample of at least 151 households is required.
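The arithmetic in Example 1 can be checked with a few lines of Python. This is a minimal sketch rather than a prescribed method; the function name `sample_size_mean` and its parameter names are illustrative choices, not taken from the text.

```python
import math


def sample_size_mean(sigma, margin, z_star):
    """Return the smallest whole n with z* * sigma / sqrt(n) <= margin."""
    if sigma <= 0 or margin <= 0 or z_star <= 0:
        raise ValueError("sigma, margin and z_star must all be positive")
    raw_n = (z_star * sigma / margin) ** 2  # usually not a whole number
    return math.ceil(raw_n)                 # always round up


# Example 1: sigma = 5000, ME = 800, 95% confidence (z* = 1.96)
n_households = sample_size_mean(sigma=5000, margin=800, z_star=1.96)
print(n_households)
```

With the values from Example 1 this prints 151, matching the hand calculation.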


Step-by-Step Example 2: Effect of Confidence Level

Problem: Using the same scenario, compare sample sizes needed for: (a) 90% confidence (b) 95% confidence (c) 99% confidence

Solution:

Using formula: $n = \left(\frac{z^* \times 5000}{800}\right)^2$

(a) 90% confidence (z* = 1.645): \(n = \left(\frac{1.645 \times 5000}{800}\right)^2 = (10.28)^2 = 105.7 \approx 106\)

(b) 95% confidence (z* = 1.96): \(n = \left(\frac{1.96 \times 5000}{800}\right)^2 = (12.25)^2 = 150.1 \approx 151\)

(c) 99% confidence (z* = 2.576): \(n = \left(\frac{2.576 \times 5000}{800}\right)^2 = (16.1)^2 = 259.2 \approx 260\)

| Confidence Level | Required n |
|------------------|------------|
| 90%              | 106        |
| 95%              | 151        |
| 99%              | 260        |

Conclusion: For a fixed σ and margin of error, a higher confidence level requires a larger sample.
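The comparison can also be reproduced without a z-table. The sketch below assumes Python 3.8 or later, where the standard library's `statistics.NormalDist` provides the inverse normal CDF; the helper names `z_critical` and `sample_size_mean` are illustrative.

```python
import math
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z* for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_mean(sigma, margin, confidence):
    """Required n for estimating a mean when sigma is treated as known."""
    z_star = z_critical(confidence)
    return math.ceil((z_star * sigma / margin) ** 2)


# Same scenario as Example 2: sigma = 5000, ME = 800
sizes = {level: sample_size_mean(5000, 800, level) for level in (0.90, 0.95, 0.99)}
for level, n in sizes.items():
    print(f"{level:.0%} confidence -> n = {n}")
```

The unrounded critical values (about 1.645, 1.960 and 2.576) give the same sample sizes as the table values used above: 106, 151 and 260.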


What If σ is Unknown?

Options:

  1. Use pilot study: Take a small preliminary sample to estimate σ
  2. Use prior research: Use σ from similar studies
  3. Use range rule: Estimate σ ≈ Range/4
  4. Use conservative estimate: Use the largest reasonable σ
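As an illustration of the pilot-study option, the sketch below estimates σ by the sample standard deviation of a small pilot sample and plugs it into the formula for the mean. The pilot values and the margin of error are hypothetical numbers invented for this example.

```python
import math
from statistics import stdev

# Hypothetical pilot sample (monthly expenditure in thousand NPR); illustrative only.
pilot = [12, 14, 14, 14, 15, 15, 17, 19]

s_pilot = stdev(pilot)   # sample standard deviation (n - 1 denominator)
margin = 1.0             # desired margin of error in thousand NPR (assumed)
z_star = 1.96            # 95% confidence

n_required = math.ceil((z_star * s_pilot / margin) ** 2)
print(round(s_pilot, 3), n_required)
```

Here the pilot standard deviation is about 2.138, which gives n = 18. A small pilot sample estimates σ imprecisely, so the resulting n inherits that uncertainty.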

Example 3: Using Range Estimate

Problem: Salaries range from NPR 20,000 to NPR 80,000. Estimate σ and calculate sample size for 95% CI with ME = 5,000.

Solution:

\[\sigma \approx \frac{\text{Range}}{4} = \frac{80{,}000 - 20{,}000}{4} = \frac{60{,}000}{4} = 15{,}000\]

\[n = \left(\frac{1.96 \times 15{,}000}{5{,}000}\right)^2 = (5.88)^2 = 34.6 \approx 35\]

Answer: Approximately 35 observations needed.
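The range-rule calculation can be sketched in Python as follows. The helper name `sigma_from_range` is an illustrative choice, and the rule itself is only a rough approximation to σ.

```python
import math


def sigma_from_range(low, high):
    """Range rule of thumb: sigma is roughly (max - min) / 4."""
    return (high - low) / 4


# Example 3: salaries from NPR 20,000 to NPR 80,000, ME = 5,000, 95% confidence
sigma_guess = sigma_from_range(20_000, 80_000)
n_salaries = math.ceil((1.96 * sigma_guess / 5_000) ** 2)
print(sigma_guess, n_salaries)
```

This reproduces σ ≈ 15,000 and n = 35.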


Sample Size for Estimating Proportion

Formula

\[n = \frac{(z^*)^2 \cdot \hat{p}(1-\hat{p})}{ME^2}\]

Or equivalently:

\[n = \hat{p}(1-\hat{p})\left(\frac{z^*}{ME}\right)^2\]

Where:

  • $\hat{p}$ = estimated proportion (from pilot study or prior knowledge)
  • $ME$ = desired margin of error for proportion
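The formula follows from the same steps used for the mean. As a sketch, assuming the usual large-sample (normal approximation) margin of error for a proportion:

\[ME = z^* \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\]

Squaring both sides and solving for n:

\[ME^2 = \frac{(z^*)^2 \cdot \hat{p}(1-\hat{p})}{n} \quad\Longrightarrow\quad n = \frac{(z^*)^2 \cdot \hat{p}(1-\hat{p})}{ME^2}\]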

Step-by-Step Example 4: Sample Size for Proportion

Problem: A researcher wants to estimate the proportion of households with internet access. A pilot study suggests about 40% have access. How many households should be surveyed for 95% confidence with margin of error 5%?

Solution:

Step 1: Identify given values

  • $\hat{p}$ = 0.40
  • ME = 0.05
  • z* = 1.96

Step 2: Apply formula \(n = \frac{(1.96)^2 \times 0.40 \times 0.60}{(0.05)^2}\)

\[n = \frac{3.8416 \times 0.24}{0.0025}\]

\[n = \frac{0.922}{0.0025} = 368.8 \approx 369\]

Answer: At least 369 households should be surveyed.
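One way to see why the result is rounded up rather than to the nearest integer is to compute the margin of error actually achieved at n = 369 and at n = 368. The sketch below assumes the normal-approximation margin of error for a proportion; the function name is illustrative.

```python
import math


def margin_for_proportion(p_hat, n, z_star=1.96):
    """Margin of error z* * sqrt(p(1-p)/n) under the normal approximation."""
    return z_star * math.sqrt(p_hat * (1 - p_hat) / n)


# Example 4: p-hat = 0.40, target ME = 0.05, 95% confidence
n_required = math.ceil(1.96 ** 2 * 0.40 * 0.60 / 0.05 ** 2)
me_at_n = margin_for_proportion(0.40, n_required)
me_one_fewer = margin_for_proportion(0.40, n_required - 1)

print(n_required)                  # required sample size
print(me_at_n <= 0.05)             # target met at n
print(me_one_fewer <= 0.05)        # target missed with one observation fewer
```

The sample size comes out as 369; the target margin is met at 369 but narrowly missed at 368.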


Conservative Sample Size (When p is Unknown)

Maximum Variability

The product $p(1-p)$ is maximized when $p = 0.5$:

\[p(1-p) = 0.5 \times 0.5 = 0.25\]
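A short way to verify this, without calculus, is to complete the square:

\[p(1-p) = p - p^2 = \frac{1}{4} - \left(p - \frac{1}{2}\right)^2 \le \frac{1}{4}\]

The squared term is zero only at $p = 0.5$, so the product never exceeds 0.25 and reaches that value only there.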

Conservative Formula

When p is completely unknown, use:

\[n = \frac{(z^*)^2 \times 0.25}{ME^2} = \frac{0.25 \times (z^*)^2}{ME^2}\]

Because $p(1-p) \le 0.25$ for every p, this sample size is sufficient whatever the true value of p turns out to be.


Step-by-Step Example 5: Conservative Approach

Problem: No prior information is available about a proportion. Calculate the sample size for 95% confidence with 3% margin of error.

Solution:

Using conservative formula with p = 0.5:

\[n = \frac{(1.96)^2 \times 0.25}{(0.03)^2}\]

\[n = \frac{3.8416 \times 0.25}{0.0009}\]

\[n = \frac{0.9604}{0.0009} = 1067.1 \approx 1068\]

Answer: 1,068 observations needed.


Comparison: Effect of p on Sample Size

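| Estimated p | p(1-p) | Sample Size (95%, ME = 5%) |
|-------------|--------|----------------------------|
| 0.10        | 0.09   | 139                        |
| 0.20        | 0.16   | 246                        |
| 0.30        | 0.21   | 323                        |
| 0.40        | 0.24   | 369                        |
| 0.50        | 0.25   | 385                        |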

Insight: Sample size is maximized at p = 0.50.
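The comparison can be regenerated with a short loop. This is a sketch using the proportion formula with z* = 1.96 and ME = 0.05; the constant and function names are illustrative.

```python
import math

Z_STAR = 1.96   # 95% confidence
MARGIN = 0.05   # 5% margin of error


def sample_size_proportion(p_hat, margin=MARGIN, z_star=Z_STAR):
    """Required n for estimating a proportion, rounded up."""
    return math.ceil(z_star ** 2 * p_hat * (1 - p_hat) / margin ** 2)


table = {p: sample_size_proportion(p) for p in (0.10, 0.20, 0.30, 0.40, 0.50)}
for p, n in table.items():
    print(f"p = {p:.2f}  p(1-p) = {p * (1 - p):.2f}  n = {n}")
```

Because p(1-p) is symmetric about 0.5, an estimate of p = 0.60 requires the same sample size as p = 0.40.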


Sample Size Quick Reference

For Means

\[n = \left(\frac{z^* \cdot \sigma}{ME}\right)^2\]

For Proportions

\[n = \frac{(z^*)^2 \cdot \hat{p}(1-\hat{p})}{ME^2}\]

Conservative (Proportion)

\[n = \frac{0.25 \times (z^*)^2}{ME^2}\]

Step-by-Step Example 6: Exam-Style Problem

Problem: A government department wants to estimate: (a) The average processing time (σ estimated at 12 minutes) with ME = 2 minutes at 99% confidence (b) The proportion of satisfied customers (previous estimate: 75%) with ME = 4% at 95% confidence

Calculate the required sample sizes.

Solution:

(a) Sample size for mean:

  • σ = 12, ME = 2, z* = 2.576

\[n = \left(\frac{2.576 \times 12}{2}\right)^2 = \left(\frac{30.912}{2}\right)^2 = (15.456)^2 = 238.9 \approx 239\]

(b) Sample size for proportion:

  • $\hat{p}$ = 0.75, ME = 0.04, z* = 1.96

\[n = \frac{(1.96)^2 \times 0.75 \times 0.25}{(0.04)^2}\]

\[n = \frac{3.8416 \times 0.1875}{0.0016} = \frac{0.7203}{0.0016} = 450.2 \approx 451\]

Answers:

  • (a) 239 observations for processing time
  • (b) 451 customers for satisfaction survey

Practical Considerations

1. Always Round UP

A sample of 150.1 requires 151 observations, not 150.

2. Account for Non-Response

If the expected response rate is 80%, inflate the required sample size accordingly: \(n_{\text{adjusted}} = \frac{n_{\text{required}}}{0.80}\)

3. Budget Constraints

Sometimes you can’t afford the calculated sample size. Options:

  • Accept larger margin of error
  • Accept lower confidence level
  • Use stratified sampling for efficiency
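To see what the first option costs in precision, the margin-of-error formula for a mean can be evaluated directly for an affordable n. The sketch below reuses σ = 5,000 from Example 1; the sample sizes 100 and 400 are hypothetical budgets chosen for illustration.

```python
import math


def achievable_margin(sigma, n, z_star):
    """Margin of error z* * sigma / sqrt(n) for a fixed sample size n."""
    return z_star * sigma / math.sqrt(n)


me_100 = achievable_margin(sigma=5000, n=100, z_star=1.96)
me_400 = achievable_margin(sigma=5000, n=400, z_star=1.96)
print(round(me_100), round(me_400))
```

With 100 observations the margin of error is about NPR 980; quadrupling the sample to 400 only halves it, to about NPR 490.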

4. Finite Population Correction

For a finite population of size N, the required sample size can be reduced: \(n_{\text{corrected}} = \frac{n}{1 + \frac{n-1}{N}}\)
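A minimal sketch of the correction in Python follows. The starting value n = 385 is the conservative size for 95% confidence and ME = 5%; the population sizes are hypothetical, and the function name is illustrative.

```python
import math


def finite_population_correction(n_required, population_size):
    """Apply n / (1 + (n - 1) / N) and round up."""
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    corrected = n_required / (1 + (n_required - 1) / population_size)
    return math.ceil(corrected)


n_small_town = finite_population_correction(385, 2_000)          # hypothetical N
n_large_city = finite_population_correction(385, 1_000_000_000)  # hypothetical N
print(n_small_town, n_large_city)
```

For N = 2,000 the requirement drops from 385 to 323, while for a very large population the correction makes no practical difference.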


Example 7: Adjusting for Non-Response

Problem: A survey requires 400 responses. If the expected response rate is 70%, how many should be contacted?

Solution:

\[n_{\text{contact}} = \frac{400}{0.70} = 571.4 \approx 572\]

Answer: Contact 572 people.
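The same adjustment as a small Python helper, shown as a sketch; the function name `contacts_needed` is illustrative.

```python
import math


def contacts_needed(responses_required, response_rate):
    """Number of people to contact so the expected responses reach the target."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be in (0, 1]")
    return math.ceil(responses_required / response_rate)


print(contacts_needed(400, 0.70))  # Example 7
print(contacts_needed(151, 0.80))  # n from Example 1 at an 80% response rate
```

This prints 572 and 189. The result is an expectation only, since the realized response rate may differ from the planned one.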


Decision Flow for Sample Size

flowchart TD
    A[What are you estimating?]
    A --> B{Mean or Proportion?}
    B -->|Mean| C{Is σ known?}
    B -->|Proportion| D{Is p estimated?}

    C -->|Yes| E["Use n = (z*σ/ME)²"]
    C -->|No| F["Estimate σ from range<br/>or pilot study"]

    D -->|Yes| G["Use n = z²p(1-p)/ME²"]
    D -->|No| H["Use p = 0.5<br/>(conservative)"]
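The decision flow can be sketched as a single Python function. The function and parameter names are hypothetical, and for an unknown σ only the range-rule branch is implemented here; a pilot-study estimate would simply be passed in as `sigma`.

```python
import math


def required_sample_size(kind, margin, z_star=1.96,
                         sigma=None, value_range=None, p_hat=None):
    """Follow the decision flow: pick the formula from what is known."""
    if kind == "mean":
        if sigma is None:
            if value_range is None:
                raise ValueError("need sigma, or a (low, high) range to estimate it")
            low, high = value_range
            sigma = (high - low) / 4          # range rule of thumb
        raw_n = (z_star * sigma / margin) ** 2
    elif kind == "proportion":
        if p_hat is None:
            p_hat = 0.5                       # conservative choice
        raw_n = z_star ** 2 * p_hat * (1 - p_hat) / margin ** 2
    else:
        raise ValueError("kind must be 'mean' or 'proportion'")
    return math.ceil(raw_n)


print(required_sample_size("mean", margin=800, sigma=5000))                   # Example 1
print(required_sample_size("mean", margin=5000, value_range=(20000, 80000)))  # Example 3
print(required_sample_size("proportion", margin=0.05, p_hat=0.40))            # Example 4
print(required_sample_size("proportion", margin=0.03))                        # Example 5
```

The four calls reproduce the answers to Examples 1, 3, 4 and 5: 151, 35, 369 and 1068.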

Practice Problems

Problem 1

How large a sample is needed to estimate population mean with:

  • σ = 20
  • ME = 3
  • 95% confidence

Problem 2

What sample size is required to estimate a proportion with:

  • Estimated p = 0.60
  • ME = 0.04
  • 99% confidence

Problem 3

If we want to halve the margin of error, by what factor must we increase the sample size?

Problem 4

A survey of voters requires ME = 3% at 95% confidence. No prior estimate of p is available. (a) Calculate conservative sample size (b) If p is estimated at 0.35, recalculate

Problem 5

A researcher has the budget for a sample of only 200 observations. If σ = 15 and 95% confidence is required, what is the achievable margin of error?


Summary

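| Scenario                 | Formula or effect on n                          |
|--------------------------|-------------------------------------------------|
| Mean (σ known)           | $n = \left(\frac{z^* \sigma}{ME}\right)^2$      |
| Proportion (p estimated) | $n = \frac{(z^*)^2 \hat{p}(1-\hat{p})}{ME^2}$   |
| Proportion (p unknown)   | $n = \frac{0.25(z^*)^2}{ME^2}$                  |
| Halve ME                 | Quadruple n                                     |
| Raise confidence level   | Increase n                                      |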
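The "halve ME, quadruple n" rule follows directly from the formula for the mean, since ME appears squared in the denominator. Writing $n'$ for the sample size required at margin $ME/2$ (notation introduced here for illustration):

\[n' = \left(\frac{z^* \sigma}{ME/2}\right)^2 = 4\left(\frac{z^* \sigma}{ME}\right)^2 = 4n\]

The same factor of 4 applies to the proportion formulas, which also have $ME^2$ in the denominator.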

Unit 4 Complete!

You have completed Unit 4: Estimation. You now understand:

  • Sampling distributions and standard errors
  • Properties of good estimators
  • How to construct confidence intervals
  • How to determine sample size

In Unit 5, we will study Hypothesis Testing, the most extensive unit, which covers statistical tests for making decisions about population parameters.