📊 Quick Formula Reference Guide

Use this cheat sheet for quick revision before exams. All formulas are organized by topic.


Unit 1: Descriptive Statistics

Measures of Central Tendency

| Measure | Formula | Use When |
|---|---|---|
| Arithmetic Mean | \(\bar{X} = \frac{\sum X}{n}\) | Data is symmetric, no outliers |
| Weighted Mean | \(\bar{X}_w = \frac{\sum wX}{\sum w}\) | Different items have different importance |
| Grouped Mean | \(\bar{X} = \frac{\sum fm}{n}\) | Data is in a frequency distribution |
| Median | Middle value when sorted | Data has outliers or is skewed |
| Median (Grouped) | \(Md = L + \frac{(n/2 - cf)}{f} \times h\) | Grouped frequency data |
| Mode | Most frequent value | Categorical data or quick estimate |
| Mode (Grouped) | \(Mo = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h\) | Modal class in grouped data |

Where:

  • $L$ = Lower boundary of median/modal class
  • $cf$ = Cumulative frequency before median class
  • $f$ = Frequency of median/modal class
  • $h$ = Class width
  • $m$ = Class midpoint
  • $f_0, f_1, f_2$ = Frequencies of pre-modal, modal, post-modal classes
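
The grouped-median formula above can be sketched in Python; the class frequencies here are hypothetical, with equal-width classes starting at 0:

```python
# Grouped median: Md = L + (n/2 - cf)/f * h
# Hypothetical classes 0-10, 10-20, 20-30, 30-40 with frequencies below.
freqs = [5, 8, 12, 5]   # f for each class
n = sum(freqs)           # n = sum of frequencies = 30
h = 10                   # class width

# Locate the median class: first class whose cumulative frequency reaches n/2
cf = 0                   # cumulative frequency BEFORE the median class
for i, f in enumerate(freqs):
    if cf + f >= n / 2:
        break
    cf += f

L = i * h                # lower boundary of the median class
median = L + (n / 2 - cf) / freqs[i] * h
```

With these numbers the median class is 20–30, so the median is \(20 + \frac{15 - 13}{12} \times 10 \approx 21.67\).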

Measures of Dispersion

| Measure | Formula | Interpretation |
|---|---|---|
| Range | \(R = X_{max} - X_{min}\) | Quick spread measure |
| Variance (Population) | \(\sigma^2 = \frac{\sum(X - \mu)^2}{N}\) | Average squared deviation |
| Variance (Sample) | \(s^2 = \frac{\sum(X - \bar{X})^2}{n-1}\) | Unbiased estimate |
| Standard Deviation | \(\sigma = \sqrt{\sigma^2}\) or \(s = \sqrt{s^2}\) | Spread in original units |
| Coefficient of Variation | \(CV = \frac{s}{\bar{X}} \times 100\%\) | Compare variability across datasets |

Shortcut Formula for Variance: \(s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n-1}\)

For Grouped Data: \(s^2 = \frac{\sum f(m - \bar{X})^2}{n-1} = \frac{\sum fm^2 - \frac{(\sum fm)^2}{n}}{n-1}\)
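
A quick Python check that the shortcut formula agrees with the definitional formula (the sample values are hypothetical):

```python
# Sample variance two ways: definition vs. shortcut formula.
data = [4, 8, 6, 5, 7]   # hypothetical sample
n = len(data)
mean = sum(data) / n

# Definition: sum of squared deviations over n - 1
s2_def = sum((x - mean) ** 2 for x in data) / (n - 1)

# Shortcut: (sum X^2 - (sum X)^2 / n) / (n - 1) -- no mean subtraction needed
s2_shortcut = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)
```

Both give \(s^2 = 2.5\) here; the shortcut avoids computing each deviation, which is convenient for hand calculation.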


Unit 2: Correlation & Regression

Karl Pearson’s Correlation Coefficient

\[r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}\]

Alternative Formula: \(r = \frac{\sum(X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum(X - \bar{X})^2 \cdot \sum(Y - \bar{Y})^2}}\)

Interpretation:

  • r = +1: Perfect positive correlation
  • r = -1: Perfect negative correlation
  • r = 0: No linear correlation
  • $|r| \geq 0.7$: Strong correlation
  • $0.4 \leq |r| < 0.7$: Moderate correlation
  • $|r| < 0.4$: Weak correlation
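
The computational formula for $r$ translates directly into Python; the data below are chosen to be perfectly linear, so $r$ should come out as exactly 1:

```python
def pearson_r(xs, ys):
    """Karl Pearson's r via the computational formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    sy2 = sum(y * y for y in ys)
    num = n * sxy - sx * sy
    den = ((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2)) ** 0.5
    return num / den

# Y = 2X exactly, so this is a perfect positive correlation
r = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
```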

Spearman’s Rank Correlation

\[r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}\]

With Tied Ranks: \(r_s = \frac{n\sum R_X R_Y - \sum R_X \sum R_Y}{\sqrt{[n\sum R_X^2 - (\sum R_X)^2][n\sum R_Y^2 - (\sum R_Y)^2]}}\)

Where: $d$ = difference between ranks, $n$ = number of pairs
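
Given the rank differences $d$ (and no ties), Spearman's formula is a one-liner; the differences below are hypothetical:

```python
def spearman_rs(d_values, n):
    """Spearman's rank correlation from rank differences (no tied ranks)."""
    return 1 - 6 * sum(d * d for d in d_values) / (n * (n ** 2 - 1))

# Hypothetical rank differences for n = 5 pairs: sum of d^2 is 2
rs = spearman_rs([0, 1, -1, 0, 0], 5)
```

Here \(r_s = 1 - \frac{6 \times 2}{5 \times 24} = 0.9\), a strong positive rank correlation.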


Simple Linear Regression

Regression Line: $\hat{Y} = a + bX$

| Parameter | Formula |
|---|---|
| Slope (b) | \(b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}\) |
| Intercept (a) | \(a = \bar{Y} - b\bar{X}\) |
| Alternative for b | \(b = r \cdot \frac{s_Y}{s_X}\) |

Coefficient of Determination: $R^2 = r^2$ (proportion of variance explained)
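
The slope and intercept formulas above can be sketched as a small function; the data lie exactly on $Y = 1 + 2X$, so the fit should recover $a = 1$, $b = 2$:

```python
def fit_line(xs, ys):
    """Least-squares slope b and intercept a for Y-hat = a + bX."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sx2 = sum(x * x for x in xs)
    b = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)
    a = sy / n - b * sx / n          # a = Y-bar - b * X-bar
    return a, b

a, b = fit_line([1, 2, 3], [3, 5, 7])   # data on the exact line Y = 1 + 2X
```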


Unit 3: Probability

Basic Probability Rules

Classical Probability: \(P(A) = \frac{\text{Favorable outcomes}}{\text{Total outcomes}}\)

Complement Rule: \(P(A') = 1 - P(A)\)

Addition Rule (General): \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)

Addition Rule (Mutually Exclusive): \(P(A \cup B) = P(A) + P(B)\)

Multiplication Rule (General): \(P(A \cap B) = P(A) \cdot P(B \mid A)\)

Multiplication Rule (Independent): \(P(A \cap B) = P(A) \cdot P(B)\)

Conditional Probability: \(P(A \mid B) = \frac{P(A \cap B)}{P(B)}\)

Bayes’ Theorem: \(P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}\)
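
A worked sketch of Bayes' theorem, expanding $P(B)$ by the total probability rule; the prevalence and test rates below are hypothetical:

```python
# Bayes' theorem: P(disease | positive test), hypothetical screening numbers.
p_d = 0.01        # P(A): prevalence of the disease
p_pos_d = 0.95    # P(B|A): probability of a positive test given disease
p_pos_nd = 0.05   # P(B|A'): false-positive rate

# Total probability: P(B) = P(B|A)P(A) + P(B|A')P(A')
p_pos = p_pos_d * p_d + p_pos_nd * (1 - p_d)

# Bayes: P(A|B) = P(B|A)P(A) / P(B)
p_d_pos = p_pos_d * p_d / p_pos
```

Even with a sensitive test, $P(A \mid B) \approx 0.16$ here, because the disease is rare; this is the classic base-rate effect.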


Binomial Distribution

\[P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n!}{x!(n-x)!} p^x q^{n-x}\]
| Parameter | Formula |
|---|---|
| Mean | \(\mu = np\) |
| Variance | \(\sigma^2 = npq\) |
| Standard Deviation | \(\sigma = \sqrt{npq}\) |

Where: $n$ = trials, $p$ = success probability, $q = 1-p$, $x$ = successes
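
The binomial PMF maps directly onto Python's `math.comb`; the parameters below are an illustrative fair-coin example:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) = C(n, x) * p^x * (1-p)^(n-x)."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# n = 10 fair-coin tosses: mean np = 5, variance npq = 2.5
pmf_5 = binom_pmf(5, 10, 0.5)   # P(exactly 5 heads)
```

Here \(P(X = 5) = \binom{10}{5}(0.5)^{10} = 252/1024 \approx 0.246\), and the probabilities over all $x$ sum to 1.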


Normal Distribution

Standard Normal (Z) Score: \(Z = \frac{X - \mu}{\sigma}\)

Finding X from Z: \(X = \mu + Z \cdot \sigma\)

Properties:

  • Mean = Median = Mode = $\mu$
  • Total area under curve = 1
  • 68% within $\pm 1\sigma$, 95% within $\pm 2\sigma$, 99.7% within $\pm 3\sigma$
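
The Z-score conversion and the 68–95–99.7 rule can both be checked with the standard normal CDF, which is expressible through `math.erf`; the example score of 85 is hypothetical:

```python
from math import erf, sqrt

def z_score(x, mu, sigma):
    """Standardize: Z = (X - mu) / sigma."""
    return (x - mu) / sigma

def phi(z):
    """Standard normal CDF via the error function identity."""
    return 0.5 * (1 + erf(z / sqrt(2)))

z = z_score(85, 70, 10)           # hypothetical score: 1.5 SDs above the mean
within_1sd = phi(1) - phi(-1)     # area within +/- 1 sigma, about 0.68
```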

Unit 4: Estimation

Point Estimates

| Parameter | Point Estimator |
|---|---|
| Population Mean ($\mu$) | Sample Mean ($\bar{X}$) |
| Population Proportion ($p$) | Sample Proportion ($\hat{p}$) |
| Population Variance ($\sigma^2$) | Sample Variance ($s^2$) |

Confidence Intervals

For Mean (σ known or large sample): \(\bar{X} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\)

For Mean (σ unknown, small sample): \(\bar{X} \pm t_{\alpha/2, n-1} \cdot \frac{s}{\sqrt{n}}\)

For Proportion: \(\hat{p} \pm Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)

Common Z values:

  • 90% CI: $Z_{0.05} = 1.645$
  • 95% CI: $Z_{0.025} = 1.96$
  • 99% CI: $Z_{0.005} = 2.576$
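
The large-sample CI for a mean is a short function; the sample statistics below are hypothetical:

```python
from math import sqrt

def ci_mean(xbar, s, n, z=1.96):
    """Confidence interval X-bar +/- z * s / sqrt(n) (default 95%)."""
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin

# Hypothetical sample: mean 50, s = 10, n = 100 -> margin = 1.96 * 10/10 = 1.96
lo, hi = ci_mean(50, 10, 100)
```

This gives the interval (48.04, 51.96); with 95% confidence the procedure captures $\mu$.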

Sample Size Determination

For Estimating Mean: \(n = \left(\frac{Z_{\alpha/2} \cdot \sigma}{E}\right)^2\)

For Estimating Proportion: \(n = \frac{Z_{\alpha/2}^2 \cdot p(1-p)}{E^2}\)

Where: $E$ = margin of error (desired precision)

Note: If $p$ is unknown, use $p = 0.5$ for maximum sample size.
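
The proportion sample-size formula, with the conservative $p = 0.5$ default and rounding up (a required sample size is always rounded up, never down):

```python
from math import ceil

def n_for_proportion(E, z=1.96, p=0.5):
    """Sample size n = z^2 * p(1-p) / E^2, rounded up; p = 0.5 is worst case."""
    return ceil(z ** 2 * p * (1 - p) / E ** 2)

# Classic case: 95% confidence, 5% margin of error
n = n_for_proportion(0.05)
```

This reproduces the familiar result $n = 385$ for a 95% CI with a 5% margin of error.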


Unit 5: Hypothesis Testing

General Framework

| Component | Symbol | Description |
|---|---|---|
| Null Hypothesis | $H_0$ | Statement of no effect/difference |
| Alternative Hypothesis | $H_1$ or $H_a$ | Research hypothesis |
| Significance Level | $\alpha$ | Probability of Type I error |
| Test Statistic | Z, t, $\chi^2$ | Value calculated from the sample |
| Critical Value | $Z_c$, $t_c$ | Threshold for rejection |
| p-value | p | Probability, under $H_0$, of a result at least as extreme as the one observed |

Decision Rule: Reject $H_0$ if |Test Statistic| > Critical Value (two-tailed test) or if p-value < $\alpha$


Z-Test for Single Mean (Large Sample)

\[Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\]

If σ unknown (n ≥ 30): Use $s$ instead of $\sigma$


Z-Test for Two Means (Large Samples)

\[Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}\]

If σ unknown: Use $s_1$ and $s_2$


Z-Test for Single Proportion

\[Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\]

Z-Test for Two Proportions

\[Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}\]

Where pooled proportion: $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$
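
Combining the pooled proportion with the Z formula above gives a compact function; the counts are hypothetical:

```python
from math import sqrt

def two_prop_z(x1, n1, x2, n2):
    """Z statistic for two proportions using the pooled p-hat."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)          # pooled proportion
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical: 40/100 vs 30/100 successes
z = two_prop_z(40, 100, 30, 100)   # about 1.48, below 1.96 at alpha = 0.05
```

Since $|Z| < 1.96$, a two-tailed test at $\alpha = 0.05$ would fail to reject $H_0$ for these numbers.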


t-Test for Single Mean (Small Sample)

\[t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}\]

Degrees of freedom: $df = n - 1$


t-Test for Two Independent Means

\[t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}\]

Pooled Standard Deviation: \(s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}\)

Degrees of freedom: $df = n_1 + n_2 - 2$
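
A sketch of the pooled two-sample t statistic from summary statistics (the means, SDs, and sizes below are hypothetical):

```python
from math import sqrt

def pooled_t(x1, s1, n1, x2, s2, n2):
    """Two-sample t with pooled standard deviation; returns (t, df)."""
    sp = sqrt(((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2))
    t = (x1 - x2) / (sp * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

# Hypothetical: means 12 vs 10, both s = 2, both n = 10
t, df = pooled_t(12.0, 2.0, 10, 10.0, 2.0, 10)   # t = sqrt(5), df = 18
```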


Paired t-Test

\[t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}\]

Where:

  • $\bar{d}$ = mean of differences
  • $s_d$ = standard deviation of differences
  • $df = n - 1$
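
The paired test works entirely on the differences, as a short sketch shows (the before/after scores are hypothetical, testing $H_0\!: \mu_d = 0$):

```python
from math import sqrt

def paired_t(before, after):
    """Paired t statistic on differences d = after - before; returns (t, df)."""
    d = [a - b for a, b in zip(after, before)]
    n = len(d)
    d_bar = sum(d) / n                                   # mean difference
    s_d = sqrt(sum((x - d_bar) ** 2 for x in d) / (n - 1))
    return d_bar / (s_d / sqrt(n)), n - 1

# Hypothetical before/after scores for 4 subjects
t, df = paired_t([10, 12, 9, 11], [13, 14, 10, 13])
```

Here $\bar{d} = 2$ and $t \approx 4.90$ with $df = 3$, well above $t_{0.025,3} = 3.182$.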

Chi-Square Test for Independence

\[\chi^2 = \sum \frac{(O - E)^2}{E}\]

Expected Frequency: \(E = \frac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}}\)

Degrees of freedom: $df = (r - 1)(c - 1)$
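A minimal sketch of the test on an $r \times c$ contingency table, computing each expected frequency from the row and column totals (the 2×2 counts are hypothetical):

```python
def chi_square_independence(table):
    """Chi-square statistic and df for an r x c contingency table."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    grand = sum(row_tot)
    # E = (row total * column total) / grand total, for every cell
    chi2 = sum((o - r * c / grand) ** 2 / (r * c / grand)
               for row, r in zip(table, row_tot)
               for o, c in zip(row, col_tot))
    df = (len(table) - 1) * (len(col_tot) - 1)
    return chi2, df

# Hypothetical 2x2 table: every expected frequency is 25
chi2, df = chi_square_independence([[20, 30], [30, 20]])
```

Here $\chi^2 = 4.0$ with $df = 1$, exceeding the 0.05 critical value of 3.841, so independence would be rejected for these counts.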


Chi-Square Goodness of Fit

\[\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}\]

Degrees of freedom: $df = k - 1 - m$

Where: $k$ = categories, $m$ = parameters estimated from data


Kruskal-Wallis Test (Non-parametric ANOVA)

\[H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)\]

Where:

  • $N$ = total observations
  • $k$ = number of groups
  • $R_i$ = sum of ranks in group $i$
  • $n_i$ = size of group $i$
  • $df = k - 1$
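
The H statistic can be sketched by ranking the pooled observations (this minimal version assumes no tied values; the three groups are hypothetical):

```python
def kruskal_h(groups):
    """Kruskal-Wallis H (no tie correction) for k groups of observations."""
    pooled = sorted(x for g in groups for x in g)
    rank = {x: i + 1 for i, x in enumerate(pooled)}   # assumes distinct values
    N = len(pooled)
    H = 12 / (N * (N + 1)) * sum(
        sum(rank[x] for x in g) ** 2 / len(g) for g in groups
    ) - 3 * (N + 1)
    return H

# Hypothetical, fully separated groups: rank sums 6, 15, 24 with N = 9
H = kruskal_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

Here $H = 7.2$ with $df = k - 1 = 2$; compared against $\chi^2_{0.05,2} = 5.991$, the groups would be judged to differ.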

📋 Critical Values Quick Reference

Z Critical Values

| Confidence Level | $\alpha$ | $Z_{\alpha/2}$ |
|---|---|---|
| 90% | 0.10 | 1.645 |
| 95% | 0.05 | 1.96 |
| 99% | 0.01 | 2.576 |

Common t Critical Values (Two-tailed)

| df | α = 0.10 | α = 0.05 | α = 0.01 |
|---|---|---|---|
| 5 | 2.015 | 2.571 | 4.032 |
| 10 | 1.812 | 2.228 | 3.169 |
| 15 | 1.753 | 2.131 | 2.947 |
| 20 | 1.725 | 2.086 | 2.845 |
| 30 | 1.697 | 2.042 | 2.750 |
| ∞ (= Z) | 1.645 | 1.96 | 2.576 |

🎯 Quick Decision Guide

Which Test to Use?

| Situation | Test |
|---|---|
| One sample mean, σ known or n ≥ 30 | Z-test |
| One sample mean, σ unknown, n < 30 | t-test |
| Two sample means, independent, large samples | Z-test |
| Two sample means, independent, small samples | Independent t-test |
| Two sample means, paired/matched | Paired t-test |
| One proportion | Z-test for proportion |
| Two proportions | Z-test for two proportions |
| Categorical data, one variable | Chi-square goodness of fit |
| Categorical data, two variables | Chi-square test of independence |
| Compare 3+ groups, non-parametric | Kruskal-Wallis test |

📝 Exam Tips

  1. Always state hypotheses clearly: $H_0$ and $H_1$
  2. Check conditions: Sample size, normality, independence
  3. Use correct formula: Match test to situation
  4. Show all work: Include intermediate calculations
  5. State conclusion in context: Relate back to the problem
  6. Round appropriately: Usually 3-4 decimal places for test statistics

_Last updated: January 2026 · MPA 509: Statistics for Public Administration_