Learning Objectives
By the end of this chapter, you will be able to:
- Understand the concept and purpose of hypothesis testing
- Formulate null and alternative hypotheses
- Explain Type I and Type II errors
- Define significance level and p-value
- Understand the logic of hypothesis testing
What is Hypothesis Testing?
Hypothesis testing is a statistical method used to make decisions about population parameters based on sample data.
flowchart LR
A[Research Question] --> B[Formulate Hypotheses]
B --> C[Collect Sample Data]
C --> D[Calculate Test Statistic]
D --> E[Make Decision]
E --> F[Draw Conclusion]
Purpose
- Test claims about population parameters
- Make data-driven decisions
- Determine if observed differences are statistically significant
- Support evidence-based policy making
Key Terminology
Null Hypothesis (H₀)
The null hypothesis is a statement of “no effect” or “no difference.” It represents the status quo.
\[H_0: \mu = \mu_0\]
Alternative Hypothesis (H₁ or Hₐ)
The alternative hypothesis is the claim we seek evidence for. It represents the research claim and is supported only when the sample data are inconsistent with H₀.
\[H_1: \mu \neq \mu_0 \text{ (or } > \text{ or } <\text{)}\]
Example
Research Question: Has the average processing time changed from 30 minutes?
- $H_0: \mu = 30$ (No change)
- $H_1: \mu \neq 30$ (Changed)
Types of Alternative Hypotheses
flowchart TD
A[Alternative Hypothesis]
A --> B[Two-Tailed<br/>H₁: μ ≠ μ₀]
A --> C[Right-Tailed<br/>H₁: μ > μ₀]
A --> D[Left-Tailed<br/>H₁: μ < μ₀]
B --> B1["Tests for 'different'<br/>'changed' 'not equal'"]
C --> C1["Tests for 'greater'<br/>'increased' 'more'"]
D --> D1["Tests for 'less'<br/>'decreased' 'fewer'"]
Two-Tailed Test
\[H_0: \mu = \mu_0 \quad \text{vs} \quad H_1: \mu \neq \mu_0\]
Keywords: different, changed, not equal
Right-Tailed (Upper) Test
\[H_0: \mu = \mu_0 \quad \text{vs} \quad H_1: \mu > \mu_0\]
Keywords: greater, more, increased, improved
Left-Tailed (Lower) Test
\[H_0: \mu = \mu_0 \quad \text{vs} \quad H_1: \mu < \mu_0\]
Keywords: less, fewer, decreased, reduced
Formulating Hypotheses: Examples
| Scenario | H₀ | H₁ | Type |
|---|---|---|---|
| Has average salary changed from 50,000? | μ = 50,000 | μ ≠ 50,000 | Two-tailed |
| Is average time greater than 30 min? | μ = 30 | μ > 30 | Right-tailed |
| Is defect rate less than 5%? | p = 0.05 | p < 0.05 | Left-tailed |
| Is the new drug more effective? | μ₁ = μ₂ | μ₁ > μ₂ | Right-tailed |
Type I and Type II Errors
Decision Table
| | H₀ is TRUE | H₀ is FALSE |
|---|---|---|
| Reject H₀ | Type I Error (α) | Correct Decision |
| Fail to Reject H₀ | Correct Decision | Type II Error (β) |
Type I Error (α)
- Definition: Rejecting H₀ when it is actually true
- Also called: False Positive, α error
- Example: Concluding a new policy is effective when it actually isn’t
- Probability: α (significance level)
Type II Error (β)
- Definition: Failing to reject H₀ when it is actually false
- Also called: False Negative, β error
- Example: Concluding a new policy has no effect when it actually does
- Probability: β
flowchart TD
subgraph "Reality"
A["H₀ True"]
B["H₀ False"]
end
subgraph "Decision"
C["Reject H₀"]
D["Fail to Reject H₀"]
end
A --> C
A --> D
B --> C
B --> D
A -.-> |"Type I Error (α)"| C
B -.-> |"Type II Error (β)"| D
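The link between α and the Type I error can be checked by simulation. The sketch below is illustrative only: the population values (μ₀ = 30, σ = 5), the sample size, and the function name `type_i_error_rate` are assumptions chosen for the demonstration. It repeatedly draws samples from a population where H₀ is true and counts how often a two-tailed z-test rejects anyway.

```python
import random
from math import sqrt
from statistics import NormalDist

def type_i_error_rate(alpha=0.05, n=30, trials=10_000, seed=0):
    """Estimate how often a two-tailed z-test rejects a TRUE H0."""
    rng = random.Random(seed)
    mu0, sigma = 30.0, 5.0  # hypothetical population; H0: mu = 30 is true here
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        z = (x_bar - mu0) / (sigma / sqrt(n))
        if abs(z) > z_crit:  # rejecting here is a Type I error
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

With enough trials the estimated rate should land close to the chosen α, which is what "probability of Type I error = α" means in practice.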
Power of a Test
\[\text{Power} = 1 - \beta\]
Power is the probability of correctly rejecting a false H₀.
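As a rough illustration, power can be computed directly for a right-tailed z-test when σ is known and a specific true mean is assumed. The function name `power_right_tailed_z` and all numbers below are hypothetical; this is a sketch under those assumptions, not a general power calculator.

```python
from math import sqrt
from statistics import NormalDist

def power_right_tailed_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a right-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)     # reject H0 when z > z_crit
    shift = (mu1 - mu0) / (sigma / sqrt(n))    # true mean, in standard-error units
    beta = std_normal.cdf(z_crit - shift)      # P(fail to reject | true mean is mu1)
    return 1 - beta

# Hypothetical values: H0 says mu = 20, the true mean is 22, sigma = 5, n = 25.
power = power_right_tailed_z(mu0=20, mu1=22, sigma=5, n=25)
print(f"Power = {power:.3f}")
```

Increasing `n` or the gap between `mu1` and `mu0` raises the power, while lowering `alpha` reduces it.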
Significance Level (α)
Definition
The significance level (α) is the maximum probability of Type I error we’re willing to accept.
Common Values
| α | Interpretation |
|---|---|
| 0.10 | 10% risk of Type I error (less strict) |
| 0.05 | 5% risk of Type I error (most common) |
| 0.01 | 1% risk of Type I error (very strict) |
Critical Region
The critical region (rejection region) is the set of values that lead to rejecting H₀.
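For a z-test, the boundaries of the critical region follow from α and the direction of H₁. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical_region` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical_region(alpha, tail="two"):
    """Return (lower, upper) cutoffs for a z-test; None means no cutoff on that side."""
    std_normal = NormalDist()
    if tail == "two":
        upper = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split across both tails
        return (-upper, upper)
    if tail == "right":
        return (None, std_normal.inv_cdf(1 - alpha))
    if tail == "left":
        return (std_normal.inv_cdf(alpha), None)
    raise ValueError("tail must be 'two', 'right', or 'left'")

for alpha in (0.10, 0.05, 0.01):
    lower, upper = z_critical_region(alpha, "two")
    print(f"alpha = {alpha:.2f}: reject H0 if z < {lower:.3f} or z > {upper:.3f}")
```

A smaller α pushes the cutoffs further from zero, so the critical region shrinks and H₀ becomes harder to reject.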
The Logic of Hypothesis Testing
Approach: Proof by Contradiction
- Assume H₀ is true
- Calculate the probability of observing a sample result at least as extreme as ours if H₀ were true (the p-value)
- If this probability is very low (< α), reject H₀
- Conclude H₁ is supported
flowchart TD
A["Assume H₀ is true"]
B["Calculate test statistic"]
C["Find p-value or compare to critical value"]
D{Is result unlikely<br/>under H₀?}
E["Reject H₀<br/>Support H₁"]
F["Fail to Reject H₀<br/>Insufficient evidence"]
A --> B --> C --> D
D -->|"p-value < α"| E
D -->|"p-value ≥ α"| F
p-Value
Definition
The p-value is the probability of obtaining a test statistic as extreme as (or more extreme than) what we observed, assuming H₀ is true.
Decision Rule
- If p-value < α → Reject H₀
- If p-value ≥ α → Fail to reject H₀
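The definition and decision rule above can be sketched for a z statistic as follows. The helper names `z_p_value` and `decide` are hypothetical, and the normal distribution is assumed to be the correct reference distribution for the test.

```python
from statistics import NormalDist

def z_p_value(z, tail="two"):
    """p-value for an observed z statistic, given the direction of H1."""
    cdf = NormalDist().cdf
    if tail == "right":
        return 1 - cdf(z)             # P(Z >= z) under H0
    if tail == "left":
        return cdf(z)                 # P(Z <= z) under H0
    if tail == "two":
        return 2 * (1 - cdf(abs(z)))  # both tails count as "extreme"
    raise ValueError("tail must be 'two', 'right', or 'left'")

def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 only when p-value < alpha."""
    return "Reject H0" if p_value < alpha else "Fail to reject H0"

z_observed = 2.1  # hypothetical test statistic
p = z_p_value(z_observed, tail="two")
print(f"p-value = {p:.4f}: {decide(p)}")
```

Note that the same z statistic gives a smaller p-value in a one-tailed test than in a two-tailed test, which is why the direction of H₁ must be fixed before looking at the data.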
Interpretation
| p-value | Evidence Against H₀ |
|---|---|
| p > 0.10 | Little or no evidence |
| 0.05 < p ≤ 0.10 | Weak evidence |
| 0.01 < p ≤ 0.05 | Moderate evidence |
| 0.001 < p ≤ 0.01 | Strong evidence |
| p ≤ 0.001 | Very strong evidence |
Test Statistic
Definition
A test statistic is a standardized value calculated from sample data, used to decide whether to reject H₀.
Common Test Statistics
| Situation | Test Statistic | Distribution |
|---|---|---|
| Mean, σ known | $z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ | Normal |
| Mean, σ unknown | $t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ | t-distribution |
| Proportion | $z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$ | Normal |
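The three formulas in the table translate directly into code. This is a minimal sketch using only the standard library; the function names and the sample values are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev

def z_stat_mean(x_bar, mu0, sigma, n):
    """z statistic for a mean when the population sigma is known."""
    return (x_bar - mu0) / (sigma / sqrt(n))

def t_stat_mean(sample, mu0):
    """t statistic for a mean when sigma is unknown (uses the sample s)."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))

def z_stat_proportion(successes, n, p0):
    """z statistic for a proportion, with the standard error computed under H0."""
    p_hat = successes / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

# Hypothetical data for illustration only.
print(z_stat_mean(x_bar=32, mu0=30, sigma=5, n=25))
print(t_stat_mean([28, 31, 33, 35, 30, 32, 34, 29], mu0=30))
print(z_stat_proportion(successes=40, n=1000, p0=0.05))
```

The t statistic is compared against a t-distribution with n − 1 degrees of freedom, which the standard library does not provide; a statistics package or printed table is needed for that step.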
Two Approaches to Decision Making
1. Critical Value Approach
- Determine critical value(s) from tables
- Calculate test statistic
- Compare: If test statistic falls in critical region → Reject H₀
2. p-Value Approach
- Calculate test statistic
- Find p-value
- Compare: If p-value < α → Reject H₀
For a given test and significance level, both approaches always lead to the same conclusion.
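The agreement between the two approaches can be checked numerically. The sketch below does so for a right-tailed z-test at α = 0.05; the function names and the list of z values are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def reject_by_critical_value(z, alpha):
    """Right-tailed test: reject when z falls beyond the critical value."""
    return z > STD_NORMAL.inv_cdf(1 - alpha)

def reject_by_p_value(z, alpha):
    """Right-tailed test: reject when the p-value is below alpha."""
    return (1 - STD_NORMAL.cdf(z)) < alpha

alpha = 0.05
for z in (-1.0, 0.5, 1.5, 1.7, 2.5):  # hypothetical test statistics
    by_cv = reject_by_critical_value(z, alpha)
    by_p = reject_by_p_value(z, alpha)
    print(f"z = {z:>4}: critical value -> {by_cv}, p-value -> {by_p}")
```

The two rules agree because "z is beyond the critical value" and "the tail area beyond z is smaller than α" describe the same event.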
Steps in Hypothesis Testing
Step 1: State Hypotheses
Write H₀ and H₁ based on the research question.
Step 2: Set Significance Level
Choose α (typically 0.05).
Step 3: Calculate Test Statistic
Use appropriate formula based on situation.
Step 4: Determine Critical Value or p-Value
Use statistical tables or software.
Step 5: Make Decision
Compare and decide to reject or fail to reject H₀.
Step 6: State Conclusion
Interpret the result in context.
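The six steps can be sketched end to end for the earlier processing-time question (H₀: μ = 30 vs H₁: μ ≠ 30). The sample summary below (n = 40, x̄ = 31.8) and the known σ = 6 are hypothetical values invented for illustration, not data from the text.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: state hypotheses. H0: mu = 30 versus H1: mu != 30 (two-tailed).
mu0 = 30.0

# Step 2: set the significance level.
alpha = 0.05

# Hypothetical sample summary (illustrative numbers only).
n = 40
x_bar = 31.8
sigma = 6.0  # assumed known, so a z-test applies

# Step 3: calculate the test statistic.
z = (x_bar - mu0) / (sigma / sqrt(n))

# Step 4: determine the critical value and the p-value.
std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - std_normal.cdf(abs(z)))

# Step 5: make the decision.
decision = "Reject H0" if p_value < alpha else "Fail to reject H0"

# Step 6: state the conclusion in context.
print(f"z = {z:.3f}, critical value = ±{z_crit:.3f}, p-value = {p_value:.4f}")
print(f"{decision}: at alpha = {alpha}, the evidence that the mean processing "
      f"time differs from {mu0:.0f} minutes is "
      f"{'sufficient' if decision == 'Reject H0' else 'insufficient'}.")
```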
Important Notes
“Fail to Reject” vs “Accept”
We say “fail to reject H₀” rather than “accept H₀” because:
- We can never prove H₀ is true
- We can only say there’s insufficient evidence against it
Statistical vs Practical Significance
- Statistical significance: p-value < α
- Practical significance: Is the effect large enough to matter?
A very large sample might find statistically significant but practically trivial differences.
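A small numerical sketch makes the point concrete. Assume, hypothetically, a true difference of only 0.1 minutes from a target of 30 with σ = 5; the helper name `two_tailed_p` is also an assumption. The same tiny difference is far from significant at n = 100 but highly significant at n = 40,000.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p(x_bar, mu0, sigma, n):
    """Two-tailed p-value for a z-test of H0: mu = mu0 with known sigma."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical values: the observed mean differs from mu0 by only 0.1 minutes.
mu0, x_bar, sigma = 30.0, 30.1, 5.0

p_small = two_tailed_p(x_bar, mu0, sigma, n=100)
p_large = two_tailed_p(x_bar, mu0, sigma, n=40_000)

print(f"n = 100:    p-value = {p_small:.4f}")
print(f"n = 40,000: p-value = {p_large:.6f}")
```

The effect size is identical in both cases; only the sample size changed. Whether a 0.1-minute difference matters is a practical judgment that the p-value cannot answer.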
Example: Setting Up Hypotheses
Problem: A government claims average service time is 20 minutes. A citizen group believes it takes longer. Set up hypotheses.
Solution:
- Claim to test: Service time > 20 minutes
- H₀: μ = 20 (Government claim)
- H₁: μ > 20 (Citizen group claim)
- This is a right-tailed test
Practice Problems
Problem 1
For each scenario, write H₀ and H₁:
- (a) Testing if a new training program improves scores from the previous average of 75
- (b) Testing if average commute time has changed from 45 minutes
- (c) Testing if the defect rate is below the standard 3%
Problem 2
Explain the difference between Type I and Type II errors in the context of testing whether a new medication is effective.
Problem 3
A researcher uses α = 0.05. Explain what this means in terms of Type I error.
Problem 4
If p-value = 0.03 and α = 0.05, what is the decision? What if α = 0.01?
Problem 5
Why do we say “fail to reject H₀” instead of “accept H₀”?
Summary
| Concept | Definition |
|---|---|
| H₀ | Null hypothesis (no effect, status quo) |
| H₁ | Alternative hypothesis (research claim) |
| Type I Error (α) | Rejecting true H₀ |
| Type II Error (β) | Failing to reject false H₀ |
| Power | 1 - β (probability of rejecting a false H₀) |
| p-value | Probability of a result at least as extreme as the one observed, assuming H₀ is true |
| Decision | Reject H₀ if p-value < α |
Next Topic
In the next chapter, we will study Basic Terminologies in Hypothesis Testing - a deeper look at one-tailed vs two-tailed tests, critical regions, and making decisions.

