Spearman Rank Correlation Coefficient (rs)
- It is a non-parametric measure of correlation.
- The procedure uses the two sets of ranks assigned to the sample values of X and Y.
- The Spearman rank correlation coefficient can be computed in the following cases:
  - Both variables are quantitative.
  - Both variables are qualitative ordinal.
  - One variable is quantitative and the other is qualitative ordinal.
Procedure:
- Rank the values of X from 1 to n, where n is the number of pairs of values of X and Y in the sample. Tied values receive the average of the ranks they would otherwise occupy.
- Rank the values of Y from 1 to n in the same way.
- Compute $d_i$ for each pair of observations by subtracting the rank of $Y_i$ from the rank of $X_i$.
- Square each $d_i$ and compute $\sum d_i^2$, the sum of the squared differences.
- Apply the following formula:

$$ r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} $$
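The procedure above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `spearman_rs` and the toy data are hypothetical, and `scipy.stats.rankdata` is used only to assign average ranks. The shortcut formula in the last step is exact when there are no tied values.

```python
import numpy as np
from scipy.stats import rankdata


def spearman_rs(x, y):
    """Spearman's r_s from the shortcut formula (hypothetical helper)."""
    x = np.asarray(x)
    y = np.asarray(y)
    n = len(x)
    # Steps 1-2: rank each variable from 1 to n (ties share the average rank)
    rank_x = rankdata(x, method="average")
    rank_y = rankdata(y, method="average")
    # Step 3: difference between the rank of X_i and the rank of Y_i
    d = rank_x - rank_y
    # Step 4: sum of the squared differences
    sum_d2 = float(np.sum(d ** 2))
    # Step 5: r_s = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Toy data, invented for illustration only (no ties)
x = [10, 20, 30, 40, 50]
y = [12, 25, 22, 48, 60]
rs_toy = spearman_rs(x, y)
print(round(rs_toy, 4))  # prints 0.9
```

Here only the second and third Y values are out of order relative to X, so $\sum d_i^2 = 2$ and the coefficient is close to +1.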

In a study of the relationship between level of education and income, the following data were obtained. Find the relationship between them and comment.
| Income (Y) | Level of education (X) | Sample |
|---|---|---|
| 25 | Preparatory | A |
| 10 | Primary | B |
| 8 | University | C |
| 10 | Secondary | D |
| 15 | Secondary | E |
| 50 | Illiterate | F |
| 60 | University | G |
Solutions:
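One way to reproduce the hand calculation is sketched below. It assumes the education levels are ordered Illiterate < Primary < Preparatory < Secondary < University (the order is not stated with the data), ranks both variables from lowest to highest, and gives tied values the average of their ranks. The column names are illustrative.

```python
import pandas as pd

# Study data from the table above
df = pd.DataFrame({
    "Sample": list("ABCDEFG"),
    "Education": ["Preparatory", "Primary", "University", "Secondary",
                  "Secondary", "Illiterate", "University"],
    "Income": [25, 10, 8, 10, 15, 50, 60],
})

# ASSUMPTION: ordinal order of education levels, lowest to highest
edu_level = {"Illiterate": 1, "Primary": 2, "Preparatory": 3,
             "Secondary": 4, "University": 5}

# Rank X (education) and Y (income); ties share the average rank
df["RankX"] = df["Education"].map(edu_level).rank(method="average")
df["RankY"] = df["Income"].rank(method="average")

# d_i = rank of X_i minus rank of Y_i, then square each difference
df["d"] = df["RankX"] - df["RankY"]
df["d2"] = df["d"] ** 2

n = len(df)
sum_d2 = df["d2"].sum()
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

print(df[["Sample", "RankX", "RankY", "d", "d2"]])
print(f"sum of d^2 = {sum_d2}")
print(f"rs = {rs:.2f}")
```

Under these assumptions the sum of squared differences is 64, so r_s = 1 - 6(64) / (7 × 48) ≈ -0.14, a weak negative value. Because the data contain ties, the shortcut formula is only approximate here. Routines that correct for ties by computing the Pearson correlation of the average ranks can return a slightly different value.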


Comment: There is a weak inverse (negative) correlation between level of education and income.
Python code:
import pandas as pd
from scipy.stats import rankdata, spearmanr

# Sample data: income and education level for the seven respondents (A to G)
data = {
    'Income': [25, 10, 8, 10, 15, 50, 60],
    'Education': ['Preparatory', 'Primary', 'University',
                  'Secondary', 'Secondary', 'Illiterate',
                  'University']
}
df = pd.DataFrame(data)

# Assumed ordinal order of education levels, lowest to highest.
# Primary is placed below Preparatory; an alphabetical sort of the
# labels would reverse these two, so the order is written out explicitly.
edu_order = {
    'Illiterate': 1,
    'Primary': 2,
    'Preparatory': 3,
    'Secondary': 4,
    'University': 5
}

# Convert the ordinal codes to ranks; tied levels share the average rank
df['EducationRank'] = rankdata(df['Education'].map(edu_order), method='average')

# Spearman rank correlation (spearmanr ranks Income internally)
corr, p_value = spearmanr(df['Income'], df['EducationRank'])

# Output
print("DataFrame with Ranks:")
print(df)
print(f"\nSpearman rank correlation: {corr:.4f}")
print(f"P-value: {p_value:.4f}")
Output:
DataFrame with Ranks:
   Income    Education  EducationRank
0      25  Preparatory            3.0
1      10      Primary            2.0
2       8   University            6.5
3      10    Secondary            4.5
4      15    Secondary            4.5
5      50   Illiterate            1.0
6      60   University            6.5

Spearman rank correlation: -0.1743
P-value: 0.7085
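Comments:
The correlation is negative, so education and income are inversely related in this sample: respondents with more education tended to earn less. The relationship is weak, however, and the large p-value shows that it is not statistically significant, so no firm conclusion can be drawn from these seven observations.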