Learning Objectives
By the end of this chapter, you will be able to:
- Explain when to use Spearman’s rank correlation instead of Pearson’s
- Calculate Spearman’s correlation for data without ties
- Handle tied ranks correctly in calculations
- Interpret the results appropriately
When to Use Spearman’s Rank Correlation
flowchart TD
A{Choose Correlation Method}
A -->|Continuous data<br/>Linear relationship<br/>Normal distribution| B[Pearson's r]
A -->|Ordinal/Ranked data<br/>OR Non-linear<br/>OR Non-normal| C[Spearman's ρ]
C --> D["Examples:"]
D --> D1["Rankings (1st, 2nd, 3rd)"]
D --> D2["Likert scales (1-5 ratings)"]
D --> D3["Ordinal categories"]
D --> D4["Data with outliers"]
Comparison: Pearson vs Spearman
| Feature | Pearson’s r | Spearman’s ρ |
|---|---|---|
| Data type | Continuous | Ordinal or ranked |
| Measures | Linear relationship | Monotonic relationship |
| Assumption | Normal distribution | No distributional assumption |
| Outliers | Sensitive | Robust |
| Calculation | Uses actual values | Uses ranks |
Spearman’s Rank Correlation Formula
\[\rho = 1 - \frac{6\sum d^2}{n(n^2-1)}\]
Where:
- $\rho$ (rho) = Spearman’s rank correlation coefficient
- $d$ = difference between ranks for each pair
- $n$ = number of paired observations
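The formula above can be sketched as a small Python function. This is a minimal illustration, not a library routine: the name `spearman_rho` and its arguments are assumptions made for this example, and the function expects ranks (not raw scores) with no ties.

```python
def spearman_rho(ranks_x, ranks_y):
    """Spearman's rho via 1 - 6*sum(d^2) / (n(n^2 - 1)).

    Hypothetical helper: expects two equal-length lists of ranks without ties.
    """
    if len(ranks_x) != len(ranks_y):
        raise ValueError("rank lists must have the same length")
    n = len(ranks_x)
    if n < 2:
        raise ValueError("need at least two paired observations")
    # d is the difference between the two ranks of each pair
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks_x, ranks_y))
    return 1 - 6 * sum_d2 / (n * (n**2 - 1))


identical = spearman_rho([1, 2, 3, 4], [1, 2, 3, 4])   # same order
reversed_ = spearman_rho([1, 2, 3, 4], [4, 3, 2, 1])   # opposite order
print(identical, reversed_)
```

Identical rankings give $+1$ and exactly reversed rankings give $-1$, which are the two ends of the coefficient's range.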
Properties
- Range: $-1 \leq \rho \leq +1$
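- The sign and magnitude are read the same way as for Pearson’s r
- Measures the strength and direction of a monotonic relationship (one that consistently increases or consistently decreases), not specifically a linear one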
Step-by-Step Example 1: No Tied Ranks
Problem: Two evaluators ranked 8 employees for promotion. Calculate Spearman’s rank correlation to assess agreement between evaluators.
| Employee | Evaluator 1 Rank | Evaluator 2 Rank |
|---|---|---|
| A | 1 | 2 |
| B | 2 | 1 |
| C | 3 | 4 |
| D | 4 | 3 |
| E | 5 | 6 |
| F | 6 | 5 |
| G | 7 | 8 |
| H | 8 | 7 |
Solution:
Step 1: Calculate differences ($d$) and squared differences ($d^2$)
| Employee | $R_1$ | $R_2$ | $d = R_1 - R_2$ | $d^2$ |
|---|---|---|---|---|
| A | 1 | 2 | -1 | 1 |
| B | 2 | 1 | 1 | 1 |
| C | 3 | 4 | -1 | 1 |
| D | 4 | 3 | 1 | 1 |
| E | 5 | 6 | -1 | 1 |
| F | 6 | 5 | 1 | 1 |
| G | 7 | 8 | -1 | 1 |
| H | 8 | 7 | 1 | 1 |
| Total | | | 0 | 8 |
Step 2: Note the values
- $n = 8$
- $\sum d^2 = 8$
Step 3: Apply the formula
\[\rho = 1 - \frac{6\sum d^2}{n(n^2-1)}\]
\[\rho = 1 - \frac{6 \times 8}{8(8^2-1)}\]
\[\rho = 1 - \frac{48}{8(64-1)}\]
\[\rho = 1 - \frac{48}{8 \times 63}\]
\[\rho = 1 - \frac{48}{504}\]
\[\rho = 1 - 0.095 = 0.905\]
Answer: $\rho = 0.905$
Interpretation: There is a very strong positive correlation between the two evaluators’ rankings. They largely agree on employee rankings.
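The arithmetic in Example 1 can be checked with a short script. This is only a sketch that rebuilds the calculation table from the two rank columns; the variable names are chosen for this illustration. It also checks that $\sum d = 0$, which always holds when both columns are complete rankings of the same $n$ items, so a non-zero total signals a ranking or subtraction error.

```python
# Ranks from the Example 1 table (employees A to H)
employees = ["A", "B", "C", "D", "E", "F", "G", "H"]
r1 = [1, 2, 3, 4, 5, 6, 7, 8]   # Evaluator 1
r2 = [2, 1, 4, 3, 6, 5, 8, 7]   # Evaluator 2

n = len(employees)
d = [a - b for a, b in zip(r1, r2)]    # rank differences
d2 = [x * x for x in d]                # squared differences

for name, a, b, diff, sq in zip(employees, r1, r2, d, d2):
    print(f"{name}: R1={a} R2={b} d={diff:+d} d^2={sq}")

sum_d = sum(d)      # sanity check: should be 0
sum_d2 = sum(d2)
rho = 1 - 6 * sum_d2 / (n * (n**2 - 1))
print(f"sum d = {sum_d}, sum d^2 = {sum_d2}, rho = {rho:.3f}")
```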
Step-by-Step Example 2: Converting Scores to Ranks
Problem: Calculate Spearman’s correlation between experience (years) and performance rating for 6 officers:
| Officer | Experience (X) | Performance (Y) |
|---|---|---|
| A | 5 | 78 |
| B | 3 | 65 |
| C | 8 | 85 |
| D | 6 | 80 |
| E | 2 | 60 |
| F | 10 | 90 |
Solution:
Step 1: Convert values to ranks (1 = lowest, n = highest)
For Experience (X):
- 2 years → Rank 1
- 3 years → Rank 2
- 5 years → Rank 3
- 6 years → Rank 4
- 8 years → Rank 5
- 10 years → Rank 6
For Performance (Y):
- 60 → Rank 1
- 65 → Rank 2
- 78 → Rank 3
- 80 → Rank 4
- 85 → Rank 5
- 90 → Rank 6
Step 2: Create calculation table
| Officer | X | Y | Rank X ($R_X$) | Rank Y ($R_Y$) | $d$ | $d^2$ |
|---|---|---|---|---|---|---|
| A | 5 | 78 | 3 | 3 | 0 | 0 |
| B | 3 | 65 | 2 | 2 | 0 | 0 |
| C | 8 | 85 | 5 | 5 | 0 | 0 |
| D | 6 | 80 | 4 | 4 | 0 | 0 |
| E | 2 | 60 | 1 | 1 | 0 | 0 |
| F | 10 | 90 | 6 | 6 | 0 | 0 |
| Total | | | | | | 0 |
Step 3: Apply the formula
\[\rho = 1 - \frac{6 \times 0}{6(6^2-1)} = 1 - 0 = 1\]
Answer: $\rho = 1$
Interpretation: There is a perfect positive correlation. The two sets of ranks are identical: officers with more experience have correspondingly higher performance ratings.
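The conversion from raw values to ranks in Example 2 can be sketched as follows. The helper `rank_without_ties` is a hypothetical name used only here, and it deliberately rejects tied values, since ties need the average-rank treatment rather than plain positions.

```python
def rank_without_ties(values):
    """Rank 1 = smallest value, n = largest; assumes all values are distinct."""
    if len(set(values)) != len(values):
        raise ValueError("tied values found; use average ranks instead")
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


# Data from the Example 2 table (officers A to F)
experience = [5, 3, 8, 6, 2, 10]
performance = [78, 65, 85, 80, 60, 90]

rank_x = rank_without_ties(experience)
rank_y = rank_without_ties(performance)

n = len(experience)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rho = 1 - 6 * sum_d2 / (n * (n**2 - 1))
print(rank_x, rank_y, rho)
```

Both rank lists come out in the same order as the table's $R_X$ and $R_Y$ columns, so every $d$ is zero and $\rho = 1$.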
Handling Tied Ranks
When two or more observations have the same value, they receive the average of the ranks they would have occupied.
How to Assign Tied Ranks
flowchart TD
A[Values: 10, 15, 15, 20] --> B[Normal ranks would be: 1, 2, 3, 4]
B --> C[Two values are tied at 15]
C --> D["They would occupy ranks 2 and 3"]
D --> E["Average = (2+3)/2 = 2.5"]
E --> F["Final ranks: 1, 2.5, 2.5, 4"]
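The averaging rule in the flowchart can be sketched in Python. This is one possible implementation under the same convention as the chapter (rank 1 = lowest value); the function name `average_ranks` is an assumption for this illustration.

```python
def average_ranks(values):
    """Assign ranks (1 = lowest); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` to cover the whole run of values equal to the current one
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # The run occupies 1-based positions start+1 .. end+1; average them
        shared = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


example = average_ranks([10, 15, 15, 20])
print(example)
```

For the values in the flowchart, the two 15s would occupy positions 2 and 3, so each receives 2.5.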
Step-by-Step Example 3: With Tied Ranks
Problem: Calculate Spearman’s correlation for satisfaction ratings from two surveys:
| Respondent | Survey 1 | Survey 2 |
|---|---|---|
| 1 | 4 | 3 |
| 2 | 3 | 3 |
| 3 | 5 | 4 |
| 4 | 3 | 2 |
| 5 | 4 | 5 |
| 6 | 2 | 1 |
Solution:
Step 1: Assign ranks with ties
For Survey 1:
- Value 2 → Rank 1
- Value 3 appears twice → Would be ranks 2, 3 → Average = 2.5
- Value 4 appears twice → Would be ranks 4, 5 → Average = 4.5
- Value 5 → Rank 6
For Survey 2:
- Value 1 → Rank 1
- Value 2 → Rank 2
- Value 3 appears twice → Would be ranks 3, 4 → Average = 3.5
- Value 4 → Rank 5
- Value 5 → Rank 6
Step 2: Create calculation table
| Respondent | X | Y | $R_X$ | $R_Y$ | $d$ | $d^2$ |
|---|---|---|---|---|---|---|
| 1 | 4 | 3 | 4.5 | 3.5 | 1 | 1 |
| 2 | 3 | 3 | 2.5 | 3.5 | -1 | 1 |
| 3 | 5 | 4 | 6 | 5 | 1 | 1 |
| 4 | 3 | 2 | 2.5 | 2 | 0.5 | 0.25 |
| 5 | 4 | 5 | 4.5 | 6 | -1.5 | 2.25 |
| 6 | 2 | 1 | 1 | 1 | 0 | 0 |
| Total | | | | | | 5.5 |
Step 3: Apply the formula
\[\rho = 1 - \frac{6 \times 5.5}{6(36-1)}\]
\[\rho = 1 - \frac{33}{6 \times 35}\]
\[\rho = 1 - \frac{33}{210} = 1 - 0.157 = 0.843\]
Answer: $\rho = 0.843$
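Interpretation: There is a very strong positive correlation between the two survey results ($\rho$ falls in the 0.80 - 1.00 band). Because the data contain ties, the value from the basic formula is an approximation.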
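Example 3 can be reproduced from the raw survey scores with the sketch below. The compact ranking helper is an assumption for this illustration: it gives each value the rank "number of smaller values, plus the midpoint of its tie group", which is equivalent to averaging the positions the tied values would occupy.

```python
def average_ranks(values):
    """Average ranks (1 = lowest): count smaller values, add tie-group midpoint."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


# Data from the Example 3 table (respondents 1 to 6)
survey_1 = [4, 3, 5, 3, 4, 2]
survey_2 = [3, 3, 4, 2, 5, 1]

rank_x = average_ranks(survey_1)
rank_y = average_ranks(survey_2)

n = len(survey_1)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rho = 1 - 6 * sum_d2 / (n * (n**2 - 1))   # basic formula, no tie adjustment
print(rank_x, rank_y, sum_d2, round(rho, 3))
```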
Adjusted Formula for Many Ties
When there are many ties, use the adjusted formula:
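\[\rho = \frac{\sum R_X^2 + \sum R_Y^2 - \sum d^2}{2\sqrt{\sum R_X^2 \cdot \sum R_Y^2}}\]
Where $\sum R_X^2$ and $\sum R_Y^2$ are the tie-corrected sums of squared deviations of the ranks from their mean (not the raw sums of squared ranks):
\[\sum R_X^2 = \frac{n(n^2-1)}{12} - \sum T_X\]
\[\sum R_Y^2 = \frac{n(n^2-1)}{12} - \sum T_Y\]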
And $T = \frac{t^3-t}{12}$ for each group of $t$ tied observations.
Note: For exams, the basic formula is usually sufficient unless specifically asked to adjust for ties.
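As an illustration of the adjusted formula, the sketch below applies it to the Example 3 data. The ranks and $\sum d^2$ are taken from that example's calculation table; the helper name `tie_correction` is an assumption for this example. With these data the adjusted value comes out at roughly 0.836, slightly below the 0.843 given by the basic formula.

```python
from collections import Counter
from math import sqrt


def tie_correction(values):
    """Sum of T = (t^3 - t) / 12 over groups of t equal values (t = 1 adds 0)."""
    return sum((t**3 - t) / 12 for t in Counter(values).values())


# Ranks from the Example 3 calculation table
rank_x = [4.5, 2.5, 6, 2.5, 4.5, 1]
rank_y = [3.5, 3.5, 5, 2, 6, 1]

n = len(rank_x)
base = n * (n**2 - 1) / 12                 # value when there are no ties
sum_x2 = base - tie_correction(rank_x)     # tie-corrected term for X
sum_y2 = base - tie_correction(rank_y)     # tie-corrected term for Y
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))

rho_adjusted = (sum_x2 + sum_y2 - sum_d2) / (2 * sqrt(sum_x2 * sum_y2))
rho_basic = 1 - 6 * sum_d2 / (n * (n**2 - 1))
print(round(rho_adjusted, 3), round(rho_basic, 3))
```

Tied values share the same average rank, so counting repeated ranks is equivalent to counting repeated raw values here.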
Interpretation Guidelines
Strength of Spearman’s Correlation
| $\lvert\rho\rvert$ Value (absolute value) | Interpretation |
|---|---|
| 0.00 - 0.19 | Negligible |
| 0.20 - 0.39 | Weak |
| 0.40 - 0.59 | Moderate |
| 0.60 - 0.79 | Strong |
| 0.80 - 1.00 | Very Strong |
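The table above can be turned into a small lookup, sketched below. The function name `strength_label` is hypothetical, and treating each band as running up to (but not including) the start of the next band is an assumption, since the table does not say how values such as 0.195 should be classified. The absolute value is used so that negative coefficients get the same strength label.

```python
def strength_label(rho):
    """Map a Spearman coefficient to the strength labels used in the table."""
    if not -1 <= rho <= 1:
        raise ValueError("rho must lie between -1 and +1")
    size = abs(rho)            # strength ignores the sign
    if size < 0.20:
        return "Negligible"
    if size < 0.40:
        return "Weak"
    if size < 0.60:
        return "Moderate"
    if size < 0.80:
        return "Strong"
    return "Very Strong"


for value in (0.905, 0.843, -0.45, 0.10):
    print(value, strength_label(value))
```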
Sign Interpretation
- Positive $\rho$: High ranks in X tend to occur with high ranks in Y
- Negative $\rho$: High ranks in X tend to occur with low ranks in Y
Real-World Applications in Public Administration
Example Applications
| Application | Variable X | Variable Y |
|---|---|---|
| Employee Evaluation | Manager’s ranking | Peer ranking |
| Policy Priority | Expert ranking | Public ranking |
| Service Quality | Citizen satisfaction rank | Department efficiency rank |
| Training Effectiveness | Pre-training rank | Post-training rank |
| Budget Allocation | Priority rank by department | Priority rank by ministry |
Comparison: Pearson vs Spearman Results
Important Note: Pearson’s r and Spearman’s ρ may give different results for the same data:
| Scenario | Pearson | Spearman | Which is Better? |
|---|---|---|---|
| Perfect linear relationship | 1.00 | 1.00 | Either |
| Monotonic but non-linear | < 1.00 | 1.00 | Spearman |
| Outliers present | Affected | Less affected | Spearman |
| Normal, continuous data | Preferred | Valid | Pearson |
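The "monotonic but non-linear" row can be illustrated with a small made-up data set in which $y = x^3$. The sketch below relies on the standard result that Spearman's ρ equals Pearson's r computed on the ranks; the data and both helper names are assumptions for this example.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def average_ranks(values):
    """Average ranks (1 = lowest): count smaller values, add tie-group midpoint."""
    return [
        sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
        for x in values
    ]


x = [1, 2, 3, 4, 5]          # illustrative data, not from the chapter
y = [v**3 for v in x]        # strictly increasing but curved

pearson_r = pearson(x, y)
spearman_rho = pearson(average_ranks(x), average_ranks(y))
print(round(pearson_r, 3), round(spearman_rho, 3))
```

Pearson's r falls short of 1 (about 0.943 here) because the points do not lie on a straight line, while Spearman's ρ is exactly 1 because the ordering of x and y is identical.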
Practice Problems
Problem 1
Two judges ranked 6 contestants. Calculate Spearman’s correlation:
| Contestant | Judge A | Judge B |
|---|---|---|
| P | 1 | 2 |
| Q | 2 | 3 |
| R | 3 | 1 |
| S | 4 | 5 |
| T | 5 | 4 |
| U | 6 | 6 |
Problem 2
Calculate Spearman’s correlation between department size and citizen complaints:
| Department | Staff (X) | Complaints (Y) |
|---|---|---|
| A | 50 | 25 |
| B | 30 | 15 |
| C | 45 | 20 |
| D | 30 | 18 |
| E | 60 | 30 |
| F | 25 | 12 |
Note: Handle the tie in staff size (both B and D have 30).
Problem 3
When would you choose Spearman’s correlation over Pearson’s correlation? Give two specific examples from public administration.
Summary
| Concept | Key Points |
|---|---|
| When to Use | Ordinal data, non-linear relationships, non-normal data, outliers |
| Formula | $\rho = 1 - \frac{6\sum d^2}{n(n^2-1)}$ |
| Range | $-1 \leq \rho \leq +1$ |
| Tied Ranks | Use average of positions |
| Interpretation | Sign and strength read as for Pearson’s r, but for monotonic association |
| Advantage | Non-parametric, robust to outliers |
Next Chapter
In the next chapter, we will study Simple Linear Regression, a method for predicting the value of one variable from another.

