Learning Objectives

By the end of this chapter, you will be able to:

  • Define and identify unbiased estimators
  • Understand the concept of efficiency
  • Explain consistency in estimation
  • Describe sufficiency of estimators
  • Compare different estimators based on these criteria

What Makes an Estimator Good?

Not all estimators are equally good. We evaluate estimators based on several criteria:

```mermaid
flowchart TD
    A[Criteria of Good Estimators] --> B[1. Unbiasedness]
    A --> C[2. Efficiency]
    A --> D[3. Consistency]
    A --> E[4. Sufficiency]

    B --> B1["Hits the target<br/>on average"]
    C --> C1["Minimum variance"]
    D --> D1["Improves with<br/>larger samples"]
    E --> E1["Uses all available<br/>information"]
```

1. Unbiasedness

Definition

An estimator $\hat{\theta}$ is unbiased if its expected value equals the true parameter:

\[E(\hat{\theta}) = \theta\]

Bias

The bias of an estimator is:

\[\text{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta\]
  • If Bias = 0 → Unbiased
  • If Bias ≠ 0 → Biased

Visual Representation

```mermaid
flowchart LR
    subgraph "Unbiased"
        A["Average of estimates<br/>= True value"]
    end

    subgraph "Biased"
        B["Average of estimates<br/>≠ True value"]
    end
```

Common Unbiased Estimators

| Parameter | Unbiased Estimator |
|---|---|
| Population Mean (μ) | Sample Mean ($\bar{x}$) |
| Population Proportion (p) | Sample Proportion ($\hat{p}$) |
| Population Variance (σ²) | $s^2 = \frac{\sum(x_i-\bar{x})^2}{n-1}$ |

Why Divide by (n-1)?

The sample variance uses $(n-1)$ in the denominator to make it unbiased:

\[s^2 = \frac{\sum(x_i - \bar{x})^2}{n-1}\]

If we used $n$, the estimator would systematically underestimate σ²: its expected value would be $\frac{n-1}{n}\sigma^2$, because the deviations are measured from $\bar{x}$ rather than from the true mean μ.
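A quick way to see this is by simulation. The sketch below uses illustrative, assumed values (normal data with σ² = 4, n = 10, and 100,000 replications) and compares the two denominators:

```python
import numpy as np

# Illustrative setup (assumed values): normal data with true sigma^2 = 4
rng = np.random.default_rng(0)
n, sigma2, trials = 10, 4.0, 100_000

samples = rng.normal(loc=0.0, scale=np.sqrt(sigma2), size=(trials, n))
# Sum of squared deviations from the sample mean, one per replication
ss = np.sum((samples - samples.mean(axis=1, keepdims=True)) ** 2, axis=1)

print("E[SS/(n-1)] ~", ss.mean() / (n - 1))  # close to 4.0 (unbiased)
print("E[SS/n]     ~", ss.mean() / n)        # close to 3.6 = (n-1)/n * 4 (biased low)
```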

Example 1: Checking Unbiasedness

Problem: Show that the sample mean $\bar{x}$ is an unbiased estimator of μ.

Solution:

\[E(\bar{x}) = E\left(\frac{\sum x_i}{n}\right) = \frac{1}{n} \sum E(x_i) = \frac{1}{n} \cdot n \cdot \mu = \mu\]

Since $E(\bar{x}) = \mu$, the sample mean is an unbiased estimator of the population mean.
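The same fact can be checked empirically. A minimal simulation sketch, where the values μ = 50, σ = 10, and n = 25 are arbitrary choices for illustration:

```python
import numpy as np

# Arbitrary illustrative values
rng = np.random.default_rng(1)
mu, sigma, n, trials = 50.0, 10.0, 25, 200_000

# One sample mean per replication
xbars = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)
print("average of sample means:", xbars.mean())  # ~ 50.0, matching E(x-bar) = mu
```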


2. Efficiency

Definition

An estimator is efficient if it has the minimum variance among all unbiased estimators.

If $\hat{\theta}_1$ and $\hat{\theta}_2$ are both unbiased estimators of the same parameter and

\[\text{Var}(\hat{\theta}_1) < \text{Var}(\hat{\theta}_2),\]

then $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$.

Relative Efficiency

\[\text{Relative Efficiency} = \frac{\text{Var}(\text{less efficient})}{\text{Var}(\text{more efficient})}\]

Example 2: Comparing Efficiency

Problem: For a normal distribution, compare the efficiency of the sample mean and sample median as estimators of μ.

Solution:

For a normal distribution:

  • Variance of the mean: $\text{Var}(\bar{x}) = \frac{\sigma^2}{n}$
  • Variance of the median: $\text{Var}(\text{Median}) \approx \frac{\pi}{2} \cdot \frac{\sigma^2}{n} \approx \frac{1.57\sigma^2}{n}$

\[\text{Relative Efficiency} = \frac{1.57\sigma^2/n}{\sigma^2/n} = 1.57\]

Conclusion: The median's variance is about 57% larger than the mean's, so the sample mean is the more efficient estimator (relative efficiency ≈ 1.57). For normal data, the mean is preferred.
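A simulation sketch of this comparison (assumed setup: standard normal data, n = 101, 50,000 replications, all illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n, trials = 101, 50_000  # illustrative choices

samples = rng.normal(0.0, 1.0, size=(trials, n))
var_mean = samples.mean(axis=1).var()          # empirical Var(x-bar), ~ 1/n
var_median = np.median(samples, axis=1).var()  # ~ 1.57/n

print("Var(mean):  ", var_mean)
print("Var(median):", var_median)
print("relative efficiency:", var_median / var_mean)  # ~ pi/2, about 1.57
```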

Trade-off: Bias vs Variance

Sometimes a slightly biased estimator with lower variance may be preferred (Mean Squared Error approach):

\[MSE = \text{Bias}^2 + \text{Variance}\]
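For normal data this trade-off is concrete: dividing the sum of squared deviations by n+1 gives a biased estimator of σ² with a smaller MSE than the unbiased (n-1) version. A simulation sketch, assuming σ² = 1 and n = 10 for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 10, 200_000  # illustrative choices; true sigma^2 = 1

samples = rng.normal(0.0, 1.0, size=(trials, n))
ss = np.sum((samples - samples.mean(axis=1, keepdims=True)) ** 2, axis=1)

for c in (n - 1, n, n + 1):             # candidate denominators
    mse = np.mean((ss / c - 1.0) ** 2)  # MSE against true sigma^2 = 1
    print(f"denominator {c}: MSE ~ {mse:.4f}")
# Smallest MSE at n+1, even though only n-1 gives an unbiased estimator.
```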

3. Consistency

Definition

An estimator is consistent if it converges to the true parameter value as sample size increases:

\[\lim_{n \to \infty} P(|\hat{\theta}_n - \theta| < \epsilon) = 1 \quad \text{for every } \epsilon > 0\]

In simpler terms: as n → ∞, the estimator becomes increasingly likely to fall within any given small distance of the true value.

Properties of Consistent Estimators

As sample size increases:

  1. Bias approaches zero
  2. Variance approaches zero

Strictly speaking, these two conditions together are sufficient for consistency: if both the bias and the variance tend to zero, the MSE tends to zero and the estimator converges in probability to θ.

```mermaid
flowchart TD
    A["Small Sample<br/>n = 10"]
    B["Medium Sample<br/>n = 50"]
    C["Large Sample<br/>n = 500"]
    D["True Value θ"]

    A --> |"Wide spread"| D
    B --> |"Narrower"| D
    C --> |"Very close"| D
```

Example 3: Consistency

The sample mean $\bar{x}$ is a consistent estimator of μ because it is unbiased and:

\[\text{Var}(\bar{x}) = \frac{\sigma^2}{n} \to 0 \text{ as } n \to \infty\]

As sample size increases, the variance decreases, concentrating the distribution around μ.
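The sketch below makes this concrete (assumed setup: standard normal data; the sample sizes 10, 50, and 500 mirror the flowchart above):

```python
import numpy as np

rng = np.random.default_rng(4)
mu, trials = 0.0, 20_000  # illustrative choices; sigma = 1

for n in (10, 50, 500):
    xbars = rng.normal(mu, 1.0, size=(trials, n)).mean(axis=1)
    close = np.mean(np.abs(xbars - mu) < 0.1)  # estimate of P(|x-bar - mu| < 0.1)
    print(f"n={n:3d}: Var(x-bar) ~ {xbars.var():.5f}, P(|error| < 0.1) ~ {close:.3f}")
# Var(x-bar) ~ 1/n shrinks toward 0 and the probability climbs toward 1.
```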


4. Sufficiency

Definition

An estimator is sufficient if it captures all the information in the sample about the parameter.

Intuition

A sufficient statistic summarizes the data completely: no other statistic calculated from the same sample provides additional information about the parameter.

Example 4: Sufficient Statistics

For a normal population:

  • Sample mean ($\bar{x}$) is sufficient for μ
  • Sample variance ($s^2$) is sufficient for σ²

More precisely, when both parameters are unknown, the pair $(\bar{x}, s^2)$ is jointly sufficient for $(\mu, \sigma^2)$.

Once you know $\bar{x}$, no other function of the sample data provides additional information about μ.


Best Linear Unbiased Estimator (BLUE)

Definition

An estimator is BLUE if it is:

  1. Linear - a linear combination of observations
  2. Unbiased - E(estimator) = parameter
  3. Best - has minimum variance among all linear unbiased estimators

Example: Sample Mean is BLUE

The sample mean $\bar{x}$ is BLUE for μ when the observations are uncorrelated with common mean μ and equal variance σ² (the Gauss-Markov conditions):

\[\bar{x} = \frac{1}{n}(x_1 + x_2 + \cdots + x_n)\]
  • Linear: Yes (linear combination)
  • Unbiased: Yes (E($\bar{x}$) = μ)
  • Minimum Variance: Yes (among linear unbiased estimators)
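A small numerical check of the minimum-variance claim: for two independent, equal-variance observations, any linear unbiased combination has the form $w x_1 + (1-w) x_2$, with variance proportional to $w^2 + (1-w)^2$, which is smallest at $w = 1/2$ (the plain average). A sketch:

```python
import numpy as np

# Variance of w*x1 + (1-w)*x2 for independent observations with equal variance,
# in units of sigma^2; unbiasedness forces the weights to sum to 1.
weights = np.linspace(0.0, 1.0, 11)
variances = weights**2 + (1 - weights) ** 2

for w, v in zip(weights, variances):
    print(f"w = {w:.1f}: Var / sigma^2 = {v:.2f}")
# The minimum (0.50) occurs at w = 0.5, the equal weighting used by x-bar.
```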

Comparison Summary

| Criterion | Question Answered | Desirable Property |
|---|---|---|
| Unbiasedness | Does it hit the target on average? | $E(\hat{\theta}) = \theta$ |
| Efficiency | Is variance minimized? | Minimum variance |
| Consistency | Does it improve with more data? | Converges to θ as n → ∞ |
| Sufficiency | Does it use all information? | Captures all sample information |

Step-by-Step Example 5: Exam-Style Problem

Problem: Two estimators for μ are proposed:

  • Estimator A: $\hat{\mu}_A = \bar{x}$ (sample mean)
  • Estimator B: $\hat{\mu}_B = \frac{x_1 + x_n}{2}$ (average of first and last observations)

Compare these estimators in terms of unbiasedness and efficiency.

Solution:

Checking Unbiasedness:

For Estimator A: \(E(\hat{\mu}_A) = E(\bar{x}) = \mu\) ✓ Unbiased

For Estimator B: \(E(\hat{\mu}_B) = E\left(\frac{x_1 + x_n}{2}\right) = \frac{E(x_1) + E(x_n)}{2} = \frac{\mu + \mu}{2} = \mu\) ✓ Unbiased

Both are unbiased!

Checking Efficiency:

For Estimator A: \(\text{Var}(\hat{\mu}_A) = \frac{\sigma^2}{n}\)

For Estimator B: \(\text{Var}(\hat{\mu}_B) = \text{Var}\left(\frac{x_1 + x_n}{2}\right) = \frac{\text{Var}(x_1) + \text{Var}(x_n)}{4} = \frac{\sigma^2 + \sigma^2}{4} = \frac{\sigma^2}{2}\), using independence of the observations.

Comparison:

  • Var(A) = $\frac{\sigma^2}{n}$
  • Var(B) = $\frac{\sigma^2}{2}$

For n > 2: $\frac{\sigma^2}{n} < \frac{\sigma^2}{2}$

Conclusion: Both estimators are unbiased, but Estimator A (sample mean) is more efficient because it has smaller variance when n > 2.
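A simulation sketch confirming this example (assumed setup: standard normal data with n = 20, so σ²/n = 0.05 and σ²/2 = 0.5):

```python
import numpy as np

rng = np.random.default_rng(5)
n, trials = 20, 100_000  # illustrative choices; mu = 0, sigma = 1

samples = rng.normal(0.0, 1.0, size=(trials, n))
est_a = samples.mean(axis=1)                  # Estimator A: sample mean
est_b = (samples[:, 0] + samples[:, -1]) / 2  # Estimator B: first and last only

print("A: mean ~", est_a.mean(), " Var ~", est_a.var())  # ~ 0.0, ~ 0.05
print("B: mean ~", est_b.mean(), " Var ~", est_b.var())  # ~ 0.0, ~ 0.50
```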


Practical Implications

Choosing an Estimator

  1. First priority: Unbiasedness (or at least approximately unbiased)
  2. Second priority: Minimum variance (efficiency)
  3. Consider: Consistency for large samples
  4. Ideal: Use sufficient statistics

Common Situations

| To Estimate | Best Estimator |
|---|---|
| μ (normal) | $\bar{x}$ |
| σ² | $s^2 = \frac{\sum(x_i-\bar{x})^2}{n-1}$ |
| p (proportion) | $\hat{p} = \frac{x}{n}$ |
| μ₁ - μ₂ | $\bar{x}_1 - \bar{x}_2$ |

Practice Problems

Problem 1

Explain why dividing by (n-1) instead of n when calculating sample variance makes it unbiased.

Problem 2

Two unbiased estimators have variances:

  • Estimator X: Var = 100/n
  • Estimator Y: Var = 144/n

Which is more efficient and by how much?

Problem 3

Is the sample median a consistent estimator of μ for a symmetric distribution? Explain.

Problem 4

Define each of the following and give an example of each:

  (a) Unbiased estimator
  (b) Efficient estimator
  (c) Consistent estimator

Problem 5

If $E(\hat{\theta}) = \theta + 5$ and $\text{Var}(\hat{\theta}) = 16$, calculate the Mean Squared Error.


Summary

| Criterion | Formula/Condition | Practical Meaning |
|---|---|---|
| Unbiased | $E(\hat{\theta}) = \theta$ | Correct on average |
| Efficient | Minimum $\text{Var}(\hat{\theta})$ | Least spread |
| Consistent | $\hat{\theta} \to \theta$ as $n \to \infty$ | Improves with data |
| MSE | $\text{Bias}^2 + \text{Variance}$ | Total error measure |
| BLUE | Best Linear Unbiased Estimator | Optimal linear estimator |

Next Topic

In the next chapter, we will study Point and Interval Estimates - how to construct confidence intervals for population parameters.