Learning Objectives

By the end of this chapter, you will be able to:

  • Explain when to use Spearman’s rank correlation instead of Pearson’s
  • Calculate Spearman’s correlation for data without ties
  • Handle tied ranks correctly in calculations
  • Interpret the results appropriately

When to Use Spearman’s Rank Correlation

flowchart TD
    A{Choose Correlation Method}
    A -->|Continuous data<br/>Linear relationship<br/>Normal distribution| B[Pearson's r]
    A -->|Ordinal/Ranked data<br/>OR Non-linear<br/>OR Non-normal| C[Spearman's ρ]

    C --> D["Examples:"]
    D --> D1["Rankings (1st, 2nd, 3rd)"]
    D --> D2["Likert scales (1-5 ratings)"]
    D --> D3["Ordinal categories"]
    D --> D4["Data with outliers"]

Comparison: Pearson vs Spearman

| Feature | Pearson’s r | Spearman’s ρ |
|---|---|---|
| Data type | Continuous | Ordinal or ranked |
| Measures | Linear relationship | Monotonic relationship |
| Assumption | Normal distribution | No distributional assumption |
| Outliers | Sensitive | Robust |
| Calculation | Uses actual values | Uses ranks |

Spearman’s Rank Correlation Formula

\[\rho = 1 - \frac{6\sum d^2}{n(n^2-1)}\]

Where:

  • $\rho$ (rho) = Spearman’s rank correlation coefficient
  • $d$ = difference between ranks for each pair
  • $n$ = number of paired observations
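As a minimal sketch, the formula can be written as a short Python function. The function name `spearman_from_ranks` is chosen here for illustration, and the function assumes its inputs are already ranks with no ties.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rho via 1 - 6*sum(d^2) / (n*(n^2 - 1)).

    Assumes the inputs are ranks (1..n) with no ties.
    """
    if len(rank_x) != len(rank_y):
        raise ValueError("rank lists must have the same length")
    n = len(rank_x)
    if n < 2:
        raise ValueError("need at least two paired observations")
    # d is the rank difference for each pair; only d^2 enters the formula
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


rho_same = spearman_from_ranks([1, 2, 3, 4], [1, 2, 3, 4])      # identical rankings
rho_reversed = spearman_from_ranks([1, 2, 3, 4], [4, 3, 2, 1])  # fully reversed rankings
print(rho_same, rho_reversed)  # 1.0 -1.0
```

Identical rankings give $\sum d^2 = 0$ and therefore $\rho = 1$, while fully reversed rankings give the largest possible $\sum d^2$ and $\rho = -1$.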

Properties

  • Range: $-1 \leq \rho \leq +1$
  • Sign and magnitude are read the same way as for Pearson’s r
  • Measures the strength and direction of a monotonic relationship, which need not be linear

Step-by-Step Example 1: No Tied Ranks

Problem: Two evaluators ranked 8 employees for promotion. Calculate Spearman’s rank correlation to assess agreement between evaluators.

| Employee | Evaluator 1 Rank | Evaluator 2 Rank |
|---|---|---|
| A | 1 | 2 |
| B | 2 | 1 |
| C | 3 | 4 |
| D | 4 | 3 |
| E | 5 | 6 |
| F | 6 | 5 |
| G | 7 | 8 |
| H | 8 | 7 |

Solution:

Step 1: Calculate differences ($d$) and squared differences ($d^2$)

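| Employee | $R_1$ | $R_2$ | $d = R_1 - R_2$ | $d^2$ |
|---|---|---|---|---|
| A | 1 | 2 | -1 | 1 |
| B | 2 | 1 | 1 | 1 |
| C | 3 | 4 | -1 | 1 |
| D | 4 | 3 | 1 | 1 |
| E | 5 | 6 | -1 | 1 |
| F | 6 | 5 | 1 | 1 |
| G | 7 | 8 | -1 | 1 |
| H | 8 | 7 | 1 | 1 |
| Total | | | 0 | 8 |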

Step 2: Note the values

  • $n = 8$
  • $\sum d^2 = 8$

Step 3: Apply the formula

\[\rho = 1 - \frac{6\sum d^2}{n(n^2-1)}\]
\[\rho = 1 - \frac{6 \times 8}{8(8^2-1)}\]
\[\rho = 1 - \frac{48}{8(64-1)}\]
\[\rho = 1 - \frac{48}{8 \times 63}\]
\[\rho = 1 - \frac{48}{504}\]
\[\rho = 1 - 0.095 = 0.905\]

Answer: $\rho = 0.905$

Interpretation: There is a very strong positive correlation between the two evaluators’ rankings. They largely agree on employee rankings.
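The arithmetic in Example 1 can be checked with a few lines of Python. This is only a sketch; the variable names are chosen here for illustration, and the ranks are copied from the problem table.

```python
# Ranks for employees A..H, copied from the Example 1 table
evaluator_1 = [1, 2, 3, 4, 5, 6, 7, 8]
evaluator_2 = [2, 1, 4, 3, 6, 5, 8, 7]

n = len(evaluator_1)
d = [r1 - r2 for r1, r2 in zip(evaluator_1, evaluator_2)]
sum_d2 = sum(diff ** 2 for diff in d)

# Basic formula; valid here because there are no tied ranks
rho = 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

print(sum(d), sum_d2, round(rho, 3))  # 0 8 0.905
```

The differences sum to zero, as they must when both columns are rankings of the same $n$ items, which is a useful check on the table before squaring.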


Step-by-Step Example 2: Converting Scores to Ranks

Problem: Calculate Spearman’s correlation between experience (years) and performance rating for 6 officers:

| Officer | Experience (X) | Performance (Y) |
|---|---|---|
| A | 5 | 78 |
| B | 3 | 65 |
| C | 8 | 85 |
| D | 6 | 80 |
| E | 2 | 60 |
| F | 10 | 90 |

Solution:

Step 1: Convert values to ranks (1 = lowest, n = highest)

For Experience (X):

  • 2 years → Rank 1
  • 3 years → Rank 2
  • 5 years → Rank 3
  • 6 years → Rank 4
  • 8 years → Rank 5
  • 10 years → Rank 6

For Performance (Y):

  • 60 → Rank 1
  • 65 → Rank 2
  • 78 → Rank 3
  • 80 → Rank 4
  • 85 → Rank 5
  • 90 → Rank 6

Step 2: Create calculation table

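| Officer | X | Y | Rank X ($R_X$) | Rank Y ($R_Y$) | $d$ | $d^2$ |
|---|---|---|---|---|---|---|
| A | 5 | 78 | 3 | 3 | 0 | 0 |
| B | 3 | 65 | 2 | 2 | 0 | 0 |
| C | 8 | 85 | 5 | 5 | 0 | 0 |
| D | 6 | 80 | 4 | 4 | 0 | 0 |
| E | 2 | 60 | 1 | 1 | 0 | 0 |
| F | 10 | 90 | 6 | 6 | 0 | 0 |
| Total | | | | | | 0 |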

Step 3: Apply the formula

\[\rho = 1 - \frac{6 \times 0}{6(6^2-1)} = 1 - 0 = 1\]

Answer: $\rho = 1$

Interpretation: There is a perfect positive correlation. The two sets of ranks are identical: officers with more experience have correspondingly higher performance ratings.
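The conversion from raw scores to ranks in Example 2 can be sketched in Python as follows. The helper name `rank_without_ties` is hypothetical, and the function deliberately rejects tied values because this example has none.

```python
def rank_without_ties(values):
    """Return ranks with 1 = lowest value; assumes all values are distinct."""
    if len(set(values)) != len(values):
        raise ValueError("tied values need average ranks instead")
    # Map each value to its 1-based position in sorted order
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


# Officers A..F, copied from the Example 2 table
experience = [5, 3, 8, 6, 2, 10]
performance = [78, 65, 85, 80, 60, 90]

rank_x = rank_without_ties(experience)
rank_y = rank_without_ties(performance)

n = len(rank_x)
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rho = 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

print(rank_x)  # [3, 2, 5, 4, 1, 6]
print(rank_y)  # [3, 2, 5, 4, 1, 6]
print(rho)     # 1.0
```

Both variables produce the same rank vector, so every $d$ is zero and $\rho = 1$ even though the raw scores are on very different scales.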


Handling Tied Ranks

When two or more observations have the same value, they receive the average of the ranks they would have occupied.

How to Assign Tied Ranks

flowchart TD
    A[Values: 10, 15, 15, 20] --> B[Normal ranks would be: 1, 2, 3, 4]
    B --> C[Two values are tied at 15]
    C --> D["They would occupy ranks 2 and 3"]
    D --> E["Average = (2+3)/2 = 2.5"]
    E --> F["Final ranks: 1, 2.5, 2.5, 4"]
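The averaging rule in the flowchart can be sketched in Python. The function name `average_ranks` is chosen here for illustration; it sorts the positions, finds each run of equal values, and gives every member of the run the mean of the positions the run occupies.

```python
def average_ranks(values):
    """Return ranks with 1 = lowest; tied values share the mean of their positions."""
    # Indices of the observations, ordered from smallest to largest value
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend `end` to cover the whole run of values equal to the one at `start`
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        # The run occupies 1-based positions start+1 .. end+1; use their mean
        shared_rank = ((start + 1) + (end + 1)) / 2
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


print(average_ranks([10, 15, 15, 20]))  # [1.0, 2.5, 2.5, 4.0]
```

Averaging keeps the sum of the ranks equal to $1 + 2 + \dots + n$, so tied data remain comparable with untied data.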

Step-by-Step Example 3: With Tied Ranks

Problem: Calculate Spearman’s correlation for satisfaction ratings from two surveys:

| Respondent | Survey 1 | Survey 2 |
|---|---|---|
| 1 | 4 | 3 |
| 2 | 3 | 3 |
| 3 | 5 | 4 |
| 4 | 3 | 2 |
| 5 | 4 | 5 |
| 6 | 2 | 1 |

Solution:

Step 1: Assign ranks with ties

For Survey 1:

  • Value 2 → Rank 1
  • Value 3 appears twice → Would be ranks 2, 3 → Average = 2.5
  • Value 4 appears twice → Would be ranks 4, 5 → Average = 4.5
  • Value 5 → Rank 6

For Survey 2:

  • Value 1 → Rank 1
  • Value 2 → Rank 2
  • Value 3 appears twice → Would be ranks 3, 4 → Average = 3.5
  • Value 4 → Rank 5
  • Value 5 → Rank 6

Step 2: Create calculation table

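| Respondent | X | Y | $R_X$ | $R_Y$ | $d$ | $d^2$ |
|---|---|---|---|---|---|---|
| 1 | 4 | 3 | 4.5 | 3.5 | 1 | 1 |
| 2 | 3 | 3 | 2.5 | 3.5 | -1 | 1 |
| 3 | 5 | 4 | 6 | 5 | 1 | 1 |
| 4 | 3 | 2 | 2.5 | 2 | 0.5 | 0.25 |
| 5 | 4 | 5 | 4.5 | 6 | -1.5 | 2.25 |
| 6 | 2 | 1 | 1 | 1 | 0 | 0 |
| Total | | | | | | 5.5 |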

Step 3: Apply the formula

\[\rho = 1 - \frac{6 \times 5.5}{6(36-1)}\]
\[\rho = 1 - \frac{33}{6 \times 35}\]
\[\rho = 1 - \frac{33}{210} = 1 - 0.157 = 0.843\]

Answer: $\rho = 0.843$

Interpretation: There is a very strong positive correlation between the two survey results. Because the data contain tied ranks, the basic formula gives an approximate value here.


Adjusted Formula for Many Ties

When there are many ties, use the adjusted formula:

\[\rho = \frac{\sum R_X^2 + \sum R_Y^2 - \sum d^2}{2\sqrt{\sum R_X^2 \cdot \sum R_Y^2}}\]

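Where:

  • \(\sum R_X^2 = \frac{n(n^2-1)}{12} - \sum T_X\)
  • \(\sum R_Y^2 = \frac{n(n^2-1)}{12} - \sum T_Y\)
  • $T = \frac{t^3-t}{12}$ for each group of $t$ tied observations, summed over all tied groups within that variable

In this formula, $\sum R_X^2$ and $\sum R_Y^2$ denote the tie-corrected sums of squared deviations of the ranks from their mean, not the raw sums of squared ranks.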

Note: For exams, the basic formula is usually sufficient unless specifically asked to adjust for ties.
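To see how much the adjustment matters, the sketch below applies the adjusted formula to the ranks from Example 3. The function names `tie_correction` and `spearman_tie_adjusted` are hypothetical, and the rank lists are copied from the Example 3 calculation table.

```python
from collections import Counter
from math import sqrt


def tie_correction(ranks):
    """Sum of (t^3 - t)/12 over each group of t equal ranks (untied ranks add 0)."""
    return sum((t ** 3 - t) / 12 for t in Counter(ranks).values())


def spearman_tie_adjusted(rank_x, rank_y):
    """Tie-adjusted Spearman's rho from two lists of (average) ranks."""
    n = len(rank_x)
    base = n * (n ** 2 - 1) / 12
    ss_x = base - tie_correction(rank_x)  # corrected sum of squares for X ranks
    ss_y = base - tie_correction(rank_y)  # corrected sum of squares for Y ranks
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    # Undefined (division by zero) if every value in one variable is tied
    return (ss_x + ss_y - sum_d2) / (2 * sqrt(ss_x * ss_y))


# Average ranks from the Example 3 calculation table
rank_x = [4.5, 2.5, 6, 2.5, 4.5, 1]
rank_y = [3.5, 3.5, 5, 2, 6, 1]

rho_basic = 1 - (6 * 5.5) / (6 * (6 ** 2 - 1))
rho_adjusted = spearman_tie_adjusted(rank_x, rank_y)

print(round(rho_basic, 3), round(rho_adjusted, 3))  # 0.843 0.836
```

With only three small tied groups the two values differ in the third decimal place, which illustrates why the basic formula is often acceptable when ties are few.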


Interpretation Guidelines

Strength of Spearman’s Correlation

| $\lvert\rho\rvert$ Value | Interpretation |
|---|---|
| 0.00 - 0.19 | Negligible |
| 0.20 - 0.39 | Weak |
| 0.40 - 0.59 | Moderate |
| 0.60 - 0.79 | Strong |
| 0.80 - 1.00 | Very Strong |
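The bands in this table can be turned into a small lookup, sketched below. The function name `describe_strength` is hypothetical, and the code assumes the bands apply to the absolute value of $\rho$, with each band running up to the start of the next.

```python
def describe_strength(rho):
    """Map a Spearman coefficient to the strength label from the guideline table."""
    if not -1 <= rho <= 1:
        raise ValueError("rho must lie between -1 and +1")
    size = abs(rho)  # assumption: the sign gives direction, the magnitude gives strength
    if size < 0.20:
        return "Negligible"
    if size < 0.40:
        return "Weak"
    if size < 0.60:
        return "Moderate"
    if size < 0.80:
        return "Strong"
    return "Very Strong"


print(describe_strength(0.905))  # Very Strong
print(describe_strength(-0.45))  # Moderate
```

Such cut-offs are conventions rather than fixed rules, so the label should be reported alongside the coefficient itself.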

Sign Interpretation

  • Positive $\rho$: High ranks in X tend to occur with high ranks in Y
  • Negative $\rho$: High ranks in X tend to occur with low ranks in Y

Real-World Applications in Public Administration

Example Applications

| Application | Variable X | Variable Y |
|---|---|---|
| Employee Evaluation | Manager’s ranking | Peer ranking |
| Policy Priority | Expert ranking | Public ranking |
| Service Quality | Citizen satisfaction rank | Department efficiency rank |
| Training Effectiveness | Pre-training rank | Post-training rank |
| Budget Allocation | Priority rank by department | Priority rank by ministry |

Comparison: Pearson vs Spearman Results

Important Note: Pearson’s r and Spearman’s ρ may give different results for the same data:

| Scenario | Pearson | Spearman | Which is Better? |
|---|---|---|---|
| Perfect linear relationship | 1.00 | 1.00 | Either |
| Monotonic but non-linear | < 1.00 | 1.00 | Spearman |
| Outliers present | Affected | Less affected | Spearman |
| Normal, continuous data | Preferred | Valid | Pearson |
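The "monotonic but non-linear" row can be illustrated with a small made-up data set, $y = x^3$. This is a sketch in plain Python; the helper names are chosen here for illustration and the data are not taken from the chapter.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's r computed from deviations about the means."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_no_ties(x, y):
    """Spearman's rho via the basic formula; assumes no tied values."""
    rank = lambda values: [sorted(values).index(v) + 1 for v in values]
    rx, ry = rank(x), rank(y)
    n = len(x)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))


x = [1, 2, 3, 4, 5, 6]
y = [v ** 3 for v in x]  # always increasing, but along a curve

r = pearson(x, y)
rho = spearman_no_ties(x, y)
print(round(r, 3), rho)
```

Here Spearman’s ρ equals exactly 1 because the ranks of x and y coincide, while Pearson’s r falls below 1 because the points do not lie on a straight line.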

Practice Problems

Problem 1

Two judges ranked 6 contestants. Calculate Spearman’s correlation:

| Contestant | Judge A | Judge B |
|---|---|---|
| P | 1 | 2 |
| Q | 2 | 3 |
| R | 3 | 1 |
| S | 4 | 5 |
| T | 5 | 4 |
| U | 6 | 6 |

Problem 2

Calculate Spearman’s correlation between department size and citizen complaints:

| Department | Staff (X) | Complaints (Y) |
|---|---|---|
| A | 50 | 25 |
| B | 30 | 15 |
| C | 45 | 20 |
| D | 30 | 18 |
| E | 60 | 30 |
| F | 25 | 12 |

Note: Handle the tie in staff size (both B and D have 30).

Problem 3

When would you choose Spearman’s correlation over Pearson’s correlation? Give two specific examples from public administration.


Summary

| Concept | Key Points |
|---|---|
| When to Use | Ordinal data, non-linear relationships, non-normal data, outliers |
| Formula | $\rho = 1 - \frac{6\sum d^2}{n(n^2-1)}$ |
| Range | $-1 \leq \rho \leq +1$ |
| Tied Ranks | Use average of positions |
| Interpretation | Same as Pearson’s r |
| Advantage | Non-parametric, robust to outliers |

Next Chapter

In the next chapter, we will study Simple Linear Regression, a method for predicting the value of one variable from another.