Learning Objectives

By the end of this chapter, you will be able to:

  • Understand the concept and purpose of hypothesis testing
  • Formulate null and alternative hypotheses
  • Explain Type I and Type II errors
  • Define significance level and p-value
  • Understand the logic of hypothesis testing

What is Hypothesis Testing?

Hypothesis testing is a statistical method used to make decisions about population parameters based on sample data.

flowchart LR
    A[Research Question] --> B[Formulate Hypotheses]
    B --> C[Collect Sample Data]
    C --> D[Calculate Test Statistic]
    D --> E[Make Decision]
    E --> F[Draw Conclusion]

Purpose

  • Test claims about population parameters
  • Make data-driven decisions
  • Determine if observed differences are statistically significant
  • Support evidence-based policy making

Key Terminology

Null Hypothesis (H₀)

The null hypothesis is a statement of “no effect” or “no difference.” It represents the status quo.

\[H_0: \mu = \mu_0\]

Alternative Hypothesis (H₁ or Hₐ)

The alternative hypothesis is the claim we seek evidence for. It represents the research claim and is supported only when the sample data are inconsistent with H₀.

\[H_1: \mu \neq \mu_0 \text{ (or } > \text{ or } <\text{)}\]

Example

Research Question: Has the average processing time changed from 30 minutes?

  • $H_0: \mu = 30$ (No change)
  • $H_1: \mu \neq 30$ (Changed)

Types of Alternative Hypotheses

flowchart TD
    A[Alternative Hypothesis]
    A --> B[Two-Tailed<br/>H₁: μ ≠ μ₀]
    A --> C[Right-Tailed<br/>H₁: μ > μ₀]
    A --> D[Left-Tailed<br/>H₁: μ < μ₀]

    B --> B1["Tests for 'different'<br/>'changed' 'not equal'"]
    C --> C1["Tests for 'greater'<br/>'increased' 'more'"]
    D --> D1["Tests for 'less'<br/>'decreased' 'fewer'"]

Two-Tailed Test

\[H_0: \mu = \mu_0 \quad \text{vs} \quad H_1: \mu \neq \mu_0\]

Keywords: different, changed, not equal

Right-Tailed (Upper) Test

\[H_0: \mu = \mu_0 \quad \text{vs} \quad H_1: \mu > \mu_0\]

Keywords: greater, more, increased, improved

Left-Tailed (Lower) Test

\[H_0: \mu = \mu_0 \quad \text{vs} \quad H_1: \mu < \mu_0\]

Keywords: less, fewer, decreased, reduced


Formulating Hypotheses: Examples

| Scenario | H₀ | H₁ | Type |
|---|---|---|---|
| Has average salary changed from 50,000? | μ = 50,000 | μ ≠ 50,000 | Two-tailed |
| Is average time greater than 30 min? | μ = 30 | μ > 30 | Right-tailed |
| Is defect rate less than 5%? | p = 0.05 | p < 0.05 | Left-tailed |
| Is the new drug more effective? | μ₁ = μ₂ | μ₁ > μ₂ | Right-tailed |

Type I and Type II Errors

Decision Table

| Decision | H₀ is TRUE | H₀ is FALSE |
|---|---|---|
| Reject H₀ | Type I Error (α) | Correct Decision |
| Fail to Reject H₀ | Correct Decision | Type II Error (β) |

Type I Error (α)

  • Definition: Rejecting H₀ when it is actually true
  • Also called: False Positive, α error
  • Example: Concluding a new policy is effective when it actually isn’t
  • Probability: α (significance level)

Type II Error (β)

  • Definition: Failing to reject H₀ when it is actually false
  • Also called: False Negative, β error
  • Example: Concluding a new policy has no effect when it actually does
  • Probability: β

flowchart TD
    subgraph "Reality"
        A["H₀ True"]
        B["H₀ False"]
    end

    subgraph "Decision"
        C["Reject H₀"]
        D["Fail to Reject H₀"]
    end

    A --> C
    A --> D
    B --> C
    B --> D

    A -.-> |"Type I Error (α)"| C
    B -.-> |"Type II Error (β)"| D

Power of a Test

\[\text{Power} = 1 - \beta\]

Power is the probability of correctly rejecting a false H₀.
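
To make the formula concrete, here is a minimal sketch that computes power analytically for a right-tailed z-test. It assumes σ is known and the data are normal; the function name `power_right_tailed_z` and the numbers in the usage lines are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def power_right_tailed_z(mu0, mu_true, sigma, n, alpha=0.05):
    """Power of a right-tailed z-test of H0: mu = mu0 when the true mean is mu_true.

    Assumes sigma is known (an assumption of this sketch).
    """
    std_normal = NormalDist()  # mean 0, standard deviation 1
    z_crit = std_normal.inv_cdf(1 - alpha)           # cut-off for rejecting H0
    shift = (mu_true - mu0) / (sigma / sqrt(n))      # true effect in standard-error units
    beta = std_normal.cdf(z_crit - shift)            # P(fail to reject | H0 false)
    return 1 - beta


# Hypothetical numbers: H0 says mu = 30, the true mean is 32, sigma = 5.
for n in (10, 25, 50):
    print(n, round(power_right_tailed_z(30, 32, 5, n), 3))
```

In this sketch, power rises with the sample size, and when the true mean equals μ₀ the "power" collapses to α, because every rejection is then a Type I error.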


Significance Level (α)

Definition

The significance level (α) is the maximum probability of Type I error we’re willing to accept.

Common Values

| α | Interpretation |
|---|---|
| 0.10 | 10% risk of Type I error (less strict) |
| 0.05 | 5% risk of Type I error (most common) |
| 0.01 | 1% risk of Type I error (very strict) |
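
One way to see what these α values mean is to simulate many samples from a population where H₀ really is true and count how often a test rejects it. The sketch below is only an illustration under stated assumptions: a two-tailed z-test, a normal population with hypothetical mean 50 and known σ = 10, and a hypothetical helper named `simulate_type_i_rate`.

```python
import random
from math import sqrt
from statistics import NormalDist


def simulate_type_i_rate(alpha, n=30, trials=10_000, seed=0):
    """Estimate how often a two-tailed z-test rejects H0 when H0 is TRUE."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    mu0, sigma = 50.0, 10.0            # hypothetical population; H0: mu = 50 holds
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:            # falls in the rejection region
            rejections += 1
    return rejections / trials


# Observed rejection rates should land close to each chosen alpha.
rates = {alpha: simulate_type_i_rate(alpha) for alpha in (0.10, 0.05, 0.01)}
for alpha, rate in rates.items():
    print(f"alpha = {alpha:.2f}  simulated Type I error rate = {rate:.3f}")
```

The simulated rates are estimates, so they will sit near α rather than exactly on it.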

Critical Region

The critical region (rejection region) is the set of values that lead to rejecting H₀.
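
As an illustration, the boundaries of the critical region for a z-test can be computed instead of read from a table. This is a sketch for the standard normal case only; the helper name `critical_values` is hypothetical.

```python
from statistics import NormalDist


def critical_values(alpha, tail):
    """Return the z critical value(s) that bound the rejection region."""
    std_normal = NormalDist()
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)   # alpha is split between both tails
        return (-z, z)
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)  # reject when z is above this value
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)      # reject when z is below this value
    raise ValueError("tail must be 'two', 'right', or 'left'")


for tail in ("two", "right", "left"):
    rounded = tuple(round(z, 3) for z in critical_values(0.05, tail))
    print(tail, rounded)
```

For a two-tailed test, H₀ is rejected when the statistic lies outside the interval between the two values; for a one-tailed test, only one boundary applies.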


The Logic of Hypothesis Testing

Approach: Proof by Contradiction

  1. Assume H₀ is true
  2. Calculate the probability of observing a sample result at least as extreme as ours
  3. If this probability is very low (< α), reject H₀
  4. Conclude that the data support H₁

flowchart TD
    A["Assume H₀ is true"]
    B["Calculate test statistic"]
    C["Find p-value or compare to critical value"]
    D{Is result unlikely<br/>under H₀?}
    E["Reject H₀<br/>Support H₁"]
    F["Fail to Reject H₀<br/>Insufficient evidence"]

    A --> B --> C --> D
    D -->|"p-value < α"| E
    D -->|"p-value ≥ α"| F

p-Value

Definition

The p-value is the probability of obtaining a test statistic as extreme as (or more extreme than) what we observed, assuming H₀ is true.
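
What counts as "more extreme" depends on the direction of H₁. The following sketch shows this for a z statistic under a standard normal null distribution; the function name `p_value_from_z` and the observed value 1.8 are hypothetical.

```python
from statistics import NormalDist


def p_value_from_z(z, tail):
    """p-value for an observed z statistic, assuming H0 is true."""
    std_normal = NormalDist()
    if tail == "two":
        # Extreme in either direction: double the area beyond |z|.
        return 2 * (1 - std_normal.cdf(abs(z)))
    if tail == "right":
        # Extreme means larger than observed.
        return 1 - std_normal.cdf(z)
    if tail == "left":
        # Extreme means smaller than observed.
        return std_normal.cdf(z)
    raise ValueError("tail must be 'two', 'right', or 'left'")


z_observed = 1.8  # hypothetical test statistic
for tail in ("two", "right", "left"):
    print(tail, round(p_value_from_z(z_observed, tail), 4))
```

The same statistic can therefore yield quite different p-values, which is why the tail must be fixed when the hypotheses are stated, before looking at the data.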

Decision Rule

  • If p-value < α → Reject H₀
  • If p-value ≥ α → Fail to reject H₀

Interpretation

| p-value | Evidence Against H₀ |
|---|---|
| p > 0.10 | No evidence |
| 0.05 < p ≤ 0.10 | Weak evidence |
| 0.01 < p ≤ 0.05 | Moderate evidence |
| 0.001 < p ≤ 0.01 | Strong evidence |
| p ≤ 0.001 | Very strong evidence |
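
The interpretation bands and the decision rule can be combined in a few lines. This is a sketch only: the function names are hypothetical, and the band boundaries simply mirror the table above, which is a rule of thumb rather than a formal standard.

```python
def evidence_against_h0(p_value):
    """Map a p-value to the verbal evidence label used in the table."""
    if not 0 <= p_value <= 1:
        raise ValueError("a p-value must lie between 0 and 1")
    if p_value <= 0.001:
        return "Very strong evidence"
    if p_value <= 0.01:
        return "Strong evidence"
    if p_value <= 0.05:
        return "Moderate evidence"
    if p_value <= 0.10:
        return "Weak evidence"
    return "No evidence"


def decide(p_value, alpha):
    """Apply the decision rule: reject H0 only when p-value < alpha."""
    return "Reject H0" if p_value < alpha else "Fail to reject H0"


# Hypothetical p-values, judged at alpha = 0.05.
for p in (0.20, 0.07, 0.02, 0.0004):
    print(p, evidence_against_h0(p), "->", decide(p, alpha=0.05))
```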

Test Statistic

Definition

A test statistic is a standardized value calculated from sample data, used to decide whether to reject H₀.

Common Test Statistics

| Situation | Test Statistic | Distribution |
|---|---|---|
| Mean, σ known | $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ | Normal |
| Mean, σ unknown | $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ | t-distribution |
| Proportion | $z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$ | Normal |
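
The three formulas translate directly into code. The sketch below uses only the Python standard library; the function names and the sample values are hypothetical and serve only to show the arithmetic.

```python
from math import sqrt
from statistics import mean, stdev


def z_stat_mean(sample, mu0, sigma):
    """z statistic for a mean when the population sigma is known."""
    return (mean(sample) - mu0) / (sigma / sqrt(len(sample)))


def t_stat_mean(sample, mu0):
    """t statistic for a mean when sigma is unknown (uses the sample s)."""
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(len(sample)))


def z_stat_proportion(successes, n, p0):
    """z statistic for a proportion; the standard error uses p0, not p-hat."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)


# Hypothetical sample of 8 processing times (minutes), tested against mu0 = 30.
times = [32, 29, 35, 31, 33, 30, 34, 28]
print("z:", round(z_stat_mean(times, mu0=30, sigma=4), 3))
print("t:", round(t_stat_mean(times, mu0=30), 3))

# Hypothetical quality check: 12 defects in 400 items, tested against p0 = 0.05.
print("z (proportion):", round(z_stat_proportion(12, 400, p0=0.05), 3))
```

Note that `statistics.stdev` divides by n − 1, which matches the sample standard deviation s in the t formula.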

Two Approaches to Decision Making

1. Critical Value Approach

  1. Determine critical value(s) from tables
  2. Calculate test statistic
  3. Compare: If test statistic falls in critical region → Reject H₀

2. p-Value Approach

  1. Calculate test statistic
  2. Find p-value
  3. Compare: If p-value < α → Reject H₀

For the same test and the same α, both approaches always give the same conclusion.
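
The agreement between the two approaches can be checked numerically. The sketch below does so for a right-tailed z-test; the function name `decide_both_ways` and the list of z values are hypothetical.

```python
from statistics import NormalDist


def decide_both_ways(z, alpha):
    """Right-tailed z-test: return (critical-value decision, p-value decision).

    Each element is True when that approach rejects H0.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)   # critical value approach
    p_value = 1 - std_normal.cdf(z)          # p-value approach
    return z > z_crit, p_value < alpha


# Hypothetical test statistics checked at three significance levels.
for alpha in (0.10, 0.05, 0.01):
    for z in (-1.0, 0.5, 1.5, 1.7, 2.0, 2.5):
        by_critical, by_p = decide_both_ways(z, alpha)
        print(f"alpha={alpha:.2f} z={z:>4}: critical={by_critical}, p-value={by_p}")
```

The two decisions match because "z exceeds the critical value" and "the area beyond z is smaller than α" describe the same event.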


Steps in Hypothesis Testing

Step 1: State Hypotheses

Write H₀ and H₁ based on the research question.

Step 2: Set Significance Level

Choose α (typically 0.05).

Step 3: Calculate Test Statistic

Use appropriate formula based on situation.

Step 4: Determine Critical Value or p-Value

Use statistical tables or statistical software.

Step 5: Make Decision

Compare and decide to reject or fail to reject H₀.

Step 6: State Conclusion

Interpret the result in context.
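
The six steps can be traced end to end in a short script. This is a sketch under stated assumptions: it reuses the earlier processing-time question (has the mean changed from 30 minutes?), the sample values are hypothetical, and σ = 4 is assumed known so that a z-test applies.

```python
from math import sqrt
from statistics import NormalDist, mean

# Step 1: state hypotheses. H0: mu = 30 vs H1: mu != 30 (two-tailed).
mu0 = 30.0

# Step 2: set the significance level.
alpha = 0.05

# Hypothetical sample of 16 processing times (minutes); sigma assumed known.
times = [31, 34, 29, 33, 30, 35, 32, 28, 33, 31, 36, 30, 34, 32, 29, 33]
sigma = 4.0

# Step 3: calculate the test statistic.
n = len(times)
z = (mean(times) - mu0) / (sigma / sqrt(n))

# Step 4: determine the p-value (two-tailed, so both tails count).
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: make the decision.
decision = "Reject H0" if p_value < alpha else "Fail to reject H0"

# Step 6: state the conclusion in context.
print(f"z = {z:.3f}, p-value = {p_value:.4f}, decision: {decision}")
if decision == "Reject H0":
    print("The data suggest the average processing time has changed from 30 minutes.")
else:
    print("There is insufficient evidence that the average processing time has changed.")
```

With these made-up numbers the sample mean is above 30, yet the p-value stays above 0.05, so the conclusion is "fail to reject", not "the mean is 30".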


Important Notes

“Fail to Reject” vs “Accept”

We say “fail to reject H₀” rather than “accept H₀” because:

  • We can never prove H₀ is true
  • We can only say there’s insufficient evidence against it

Statistical vs Practical Significance

  • Statistical significance: p-value < α
  • Practical significance: Is the effect large enough to matter?

With a very large sample, even a practically trivial difference can turn out to be statistically significant.
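
A quick calculation illustrates the point. In this sketch the observed difference from μ₀ is held fixed at a hypothetical 0.1 units with σ = 10 (assumed known), and only the sample size changes; the function name `two_tailed_p` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_p(diff, sigma, n):
    """Two-tailed z-test p-value for an observed mean difference `diff`."""
    z = diff / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same tiny difference (0.1 units, i.e. 1% of sigma) at growing sample sizes.
for n in (100, 10_000, 1_000_000):
    p = two_tailed_p(diff=0.1, sigma=10, n=n)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"n = {n:>9,}  p-value = {p:.4f}  ({verdict} at alpha = 0.05)")
```

The effect size never changes here, only the ability to detect it, so a small p-value alone does not say whether the difference matters in practice.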


Example: Setting Up Hypotheses

Problem: A government claims average service time is 20 minutes. A citizen group believes it takes longer. Set up hypotheses.

Solution:

  • Claim to test: Service time > 20 minutes
  • H₀: μ = 20 (Government claim)
  • H₁: μ > 20 (Citizen group claim)
  • This is a right-tailed test

Practice Problems

Problem 1

For each scenario, write H₀ and H₁:

  (a) Testing if a new training program improves scores from the previous average of 75
  (b) Testing if average commute time has changed from 45 minutes
  (c) Testing if defect rate is below the standard 3%

Problem 2

Explain the difference between Type I and Type II errors in the context of testing whether a new medication is effective.

Problem 3

A researcher uses α = 0.05. Explain what this means in terms of Type I error.

Problem 4

If p-value = 0.03 and α = 0.05, what is the decision? What if α = 0.01?

Problem 5

Why do we say “fail to reject H₀” instead of “accept H₀”?


Summary

| Concept | Definition |
|---|---|
| H₀ | Null hypothesis (no effect, status quo) |
| H₁ | Alternative hypothesis (research claim) |
| Type I Error (α) | Rejecting true H₀ |
| Type II Error (β) | Failing to reject false H₀ |
| Power | 1 - β (detecting true effect) |
| p-value | Probability of a result at least as extreme as the one observed, given H₀ true |
| Decision | Reject H₀ if p-value < α |

Next Topic

In the next chapter, we will study Basic Terminologies in Hypothesis Testing: a deeper look at one-tailed vs two-tailed tests, critical regions, and making decisions.