📊 Quick Formula Reference Guide
Use this cheat sheet for quick revision before exams; formulas are organized by topic.
Unit 1: Descriptive Statistics
Measures of Central Tendency
| Measure | Formula | Use When |
|---|---|---|
| Arithmetic Mean | \(\bar{X} = \frac{\sum X}{n}\) | Data is symmetric, no outliers |
| Weighted Mean | \(\bar{X}_w = \frac{\sum wX}{\sum w}\) | Different items have different importance |
| Grouped Mean | \(\bar{X} = \frac{\sum fm}{n}\) | Data is in frequency distribution |
| Median | Middle value when sorted | Data has outliers or is skewed |
| Median (Grouped) | \(Md = L + \frac{(n/2 - cf)}{f} \times h\) | Grouped frequency data |
| Mode | Most frequent value | Categorical data or quick estimate |
| Mode (Grouped) | \(Mo = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times h\) | Modal class in grouped data |
Where:
- $L$ = Lower boundary of the median/modal class
- $cf$ = Cumulative frequency before the median class
- $f$ = Frequency of the median/modal class
- $m$ = Class midpoint
- $h$ = Class width
- $f_0, f_1, f_2$ = Frequencies of the pre-modal, modal, and post-modal classes
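As a quick check, the ungrouped measures can be computed with Python's standard `statistics` module (the data values below are made up for illustration):

```python
import statistics

data = [4, 8, 6, 5, 3, 8, 9, 2, 8]  # hypothetical sample

mean = statistics.mean(data)      # sum(X) / n
median = statistics.median(data)  # middle value when sorted
mode = statistics.mode(data)      # most frequent value

print(mean, median, mode)  # mean ≈ 5.89, median = 6, mode = 8
```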
Measures of Dispersion
| Measure | Formula | Interpretation |
|---|---|---|
| Range | \(R = X_{max} - X_{min}\) | Quick spread measure |
| Variance (Population) | \(\sigma^2 = \frac{\sum(X - \mu)^2}{N}\) | Average squared deviation |
| Variance (Sample) | \(s^2 = \frac{\sum(X - \bar{X})^2}{n-1}\) | Unbiased estimate |
| Standard Deviation | \(\sigma = \sqrt{\sigma^2}\) or \(s = \sqrt{s^2}\) | Spread in original units |
| Coefficient of Variation | \(CV = \frac{s}{\bar{X}} \times 100\%\) | Compare variability across datasets |
Shortcut Formula for Variance: \(s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n-1}\)
For Grouped Data: \(s^2 = \frac{\sum f(m - \bar{X})^2}{n-1} = \frac{\sum fm^2 - \frac{(\sum fm)^2}{n}}{n-1}\)
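The definitional and shortcut variance formulas agree, which is easy to verify in Python (illustrative data; `statistics.variance` also uses the $n-1$ denominator):

```python
import statistics

data = [10, 12, 23, 23, 16, 23, 21, 16]  # hypothetical sample
n = len(data)
mean = statistics.mean(data)

# Definitional form: sum of squared deviations over n-1
s2 = sum((x - mean) ** 2 for x in data) / (n - 1)

# Shortcut form: (sum X^2 - (sum X)^2 / n) / (n - 1)
s2_shortcut = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

s = s2 ** 0.5        # standard deviation, in original units
cv = s / mean * 100  # coefficient of variation, in percent
```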
Unit 2: Correlation & Regression
Karl Pearson’s Correlation Coefficient
\[r = \frac{n\sum XY - \sum X \sum Y}{\sqrt{[n\sum X^2 - (\sum X)^2][n\sum Y^2 - (\sum Y)^2]}}\]
Alternative Formula: \(r = \frac{\sum(X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum(X - \bar{X})^2 \cdot \sum(Y - \bar{Y})^2}}\)
Interpretation:
- r = +1: Perfect positive correlation
- r = -1: Perfect negative correlation
- r = 0: No linear correlation
- |r| ≥ 0.7: Strong correlation
- 0.4 ≤ |r| < 0.7: Moderate correlation
- |r| < 0.4: Weak correlation
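A short sketch of the computational formula (the paired data are invented for illustration):

```python
import math

X = [1, 2, 3, 4, 5]  # hypothetical paired observations
Y = [2, 4, 5, 4, 5]
n = len(X)

sx, sy = sum(X), sum(Y)
sxy = sum(x * y for x, y in zip(X, Y))
sxx = sum(x * x for x in X)
syy = sum(y * y for y in Y)

r = (n * sxy - sx * sy) / math.sqrt((n * sxx - sx**2) * (n * syy - sy**2))
# r ≈ 0.77 here: a strong positive correlation
```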
Spearman’s Rank Correlation
\[r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}\]
With Tied Ranks: \(r_s = \frac{n\sum R_X R_Y - \sum R_X \sum R_Y}{\sqrt{[n\sum R_X^2 - (\sum R_X)^2][n\sum R_Y^2 - (\sum R_Y)^2]}}\)
Where: $d$ = difference between ranks, $n$ = number of pairs
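With no tied values, the rank formula is straightforward to apply (data invented to avoid ties):

```python
X = [35, 23, 47, 17, 10, 43, 9, 6, 28]  # hypothetical scores
Y = [30, 33, 45, 23, 8, 49, 12, 4, 31]
n = len(X)

def ranks(values):
    """Rank 1 = smallest; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(X), ranks(Y)))
r_s = 1 - 6 * d2 / (n * (n * n - 1))  # 1 - 6(12)/720 = 0.9
```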
Simple Linear Regression
Regression Line: $\hat{Y} = a + bX$
| Parameter | Formula |
|---|---|
| Slope (b) | \(b = \frac{n\sum XY - \sum X \sum Y}{n\sum X^2 - (\sum X)^2}\) |
| Intercept (a) | \(a = \bar{Y} - b\bar{X}\) |
| Alternative for b | \(b = r \cdot \frac{s_Y}{s_X}\) |
Coefficient of Determination: $R^2 = r^2$ (proportion of variance explained)
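The slope and intercept formulas in the table translate directly to code (small hypothetical data):

```python
X = [1, 2, 3, 4, 5]  # hypothetical predictor
Y = [2, 4, 5, 4, 5]  # hypothetical response
n = len(X)

# Slope: b = (n*sum(XY) - sum(X)*sum(Y)) / (n*sum(X^2) - (sum X)^2)
b = (n * sum(x * y for x, y in zip(X, Y)) - sum(X) * sum(Y)) / \
    (n * sum(x * x for x in X) - sum(X) ** 2)
# Intercept: a = Ybar - b * Xbar
a = sum(Y) / n - b * sum(X) / n

# Fitted line: Y-hat = 2.2 + 0.6 X; prediction at X = 4:
y_hat = a + b * 4
```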
Unit 3: Probability
Basic Probability Rules
Classical Probability: \(P(A) = \frac{\text{Favorable outcomes}}{\text{Total outcomes}}\)
Complement Rule: \(P(A') = 1 - P(A)\)
Addition Rule (General): \(P(A \cup B) = P(A) + P(B) - P(A \cap B)\)
Addition Rule (Mutually Exclusive): \(P(A \cup B) = P(A) + P(B)\)
Multiplication Rule (General): \(P(A \cap B) = P(A) \cdot P(B \mid A)\)
Multiplication Rule (Independent): \(P(A \cap B) = P(A) \cdot P(B)\)
Conditional Probability: \(P(A \mid B) = \frac{P(A \cap B)}{P(B)}\)
Bayes’ Theorem: \(P(A \mid B) = \frac{P(B \mid A) \cdot P(A)}{P(B)}\)
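A classic application of Bayes' theorem, using the total-probability expansion of $P(B)$ in the denominator (all probabilities below are invented for illustration):

```python
# Hypothetical screening test: 1% prevalence, 95% sensitivity,
# 10% false-positive rate.
p_a = 0.01        # P(A): has the condition
p_b_a = 0.95      # P(B|A): positive test given condition
p_b_not_a = 0.10  # P(B|A'): positive test given no condition

# Total probability: P(B) = P(B|A)P(A) + P(B|A')P(A')
p_b = p_b_a * p_a + p_b_not_a * (1 - p_a)

# Bayes' theorem: P(A|B) = P(B|A)P(A) / P(B)
p_a_b = p_b_a * p_a / p_b  # ≈ 0.088: most positives are false positives
```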
Binomial Distribution
\[P(X = x) = \binom{n}{x} p^x (1-p)^{n-x} = \frac{n!}{x!(n-x)!} p^x q^{n-x}\]
| Parameter | Formula |
|---|---|
| Mean | \(\mu = np\) |
| Variance | \(\sigma^2 = npq\) |
| Standard Deviation | \(\sigma = \sqrt{npq}\) |
Where: $n$ = trials, $p$ = success probability, $q = 1-p$, $x$ = successes
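A minimal sketch using `math.comb` for the binomial coefficient ($n$ and $p$ chosen arbitrarily):

```python
import math

n, p = 10, 0.3  # hypothetical: 10 trials, 30% success probability
q = 1 - p

def binom_pmf(x):
    """P(X = x) = C(n, x) * p^x * q^(n-x)."""
    return math.comb(n, x) * p**x * q**(n - x)

mean = n * p          # 3.0
variance = n * p * q  # 2.1
total = sum(binom_pmf(x) for x in range(n + 1))  # PMF sums to 1
```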
Normal Distribution
Standard Normal (Z) Score: \(Z = \frac{X - \mu}{\sigma}\)
Finding X from Z: \(X = \mu + Z \cdot \sigma\)
Properties:
- Mean = Median = Mode = $\mu$
- Total area under curve = 1
- 68% within $\pm 1\sigma$, 95% within $\pm 2\sigma$, 99.7% within $\pm 3\sigma$
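The Z transformation and the empirical rule can be verified with the standard normal CDF, built here from `math.erf`:

```python
import math

def phi(z):
    """Standard normal CDF."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu, sigma = 100, 15   # hypothetical scale
x = 130
z = (x - mu) / sigma  # Z = 2.0

within_1 = phi(1) - phi(-1)  # ≈ 0.6827 (68% rule)
within_2 = phi(2) - phi(-2)  # ≈ 0.9545 (95% rule)
within_3 = phi(3) - phi(-3)  # ≈ 0.9973 (99.7% rule)
```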
Unit 4: Estimation
Point Estimates
| Parameter | Point Estimator |
|---|---|
| Population Mean ($\mu$) | Sample Mean ($\bar{X}$) |
| Population Proportion ($p$) | Sample Proportion ($\hat{p}$) |
| Population Variance ($\sigma^2$) | Sample Variance ($s^2$) |
Confidence Intervals
For Mean (σ known or large sample): \(\bar{X} \pm Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}\)
For Mean (σ unknown, small sample): \(\bar{X} \pm t_{\alpha/2, n-1} \cdot \frac{s}{\sqrt{n}}\)
For Proportion: \(\hat{p} \pm Z_{\alpha/2} \cdot \sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\)
Common Z values:
- 90% CI: $Z_{0.05} = 1.645$
- 95% CI: $Z_{0.025} = 1.96$
- 99% CI: $Z_{0.005} = 2.576$
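For example, a 95% interval for a mean using the large-sample Z formula (the summary statistics are invented):

```python
import math

xbar, s, n = 72.5, 8.0, 64  # hypothetical sample summary
z = 1.96                    # Z_{0.025} for 95% confidence

se = s / math.sqrt(n)                  # standard error = 8/8 = 1.0
lo, hi = xbar - z * se, xbar + z * se  # (70.54, 74.46)
```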
Sample Size Determination
For Estimating Mean: \(n = \left(\frac{Z_{\alpha/2} \cdot \sigma}{E}\right)^2\)
For Estimating Proportion: \(n = \frac{Z_{\alpha/2}^2 \cdot p(1-p)}{E^2}\)
Where: $E$ = margin of error (desired precision)
Note: If $p$ is unknown, use $p = 0.5$ for maximum sample size.
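A worked example with the conservative $p = 0.5$; note that the result is always rounded up:

```python
import math

z, p, E = 1.96, 0.5, 0.05           # 95% confidence, 5-point margin of error
n_raw = z**2 * p * (1 - p) / E**2   # 384.16
n = math.ceil(n_raw)                # always round UP: 385
```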
Unit 5: Hypothesis Testing
General Framework
| Component | Symbol | Description |
|---|---|---|
| Null Hypothesis | $H_0$ | Statement of no effect/difference |
| Alternative Hypothesis | $H_1$ or $H_a$ | Research hypothesis |
| Significance Level | $\alpha$ | Probability of Type I error |
| Test Statistic | Z, t, $\chi^2$ | Calculated value |
| Critical Value | $Z_c$, $t_c$ | Threshold for rejection |
| p-value | p | Probability of observing result |
Decision Rule: Reject $H_0$ if |Test Statistic| > Critical Value or if p-value < $\alpha$
Z-Test for Single Mean (Large Sample)
\[Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}\]
If σ unknown (n ≥ 30): Use $s$ instead of $\sigma$
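A worked one-sample Z-test, two-tailed at α = 0.05 (numbers invented):

```python
import math

xbar, mu0, sigma, n = 52.5, 50.0, 10.0, 64  # hypothetical summary

z = (xbar - mu0) / (sigma / math.sqrt(n))  # 2.5 / 1.25 = 2.0
reject_h0 = abs(z) > 1.96                  # True: reject H0 at alpha = 0.05
```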
Z-Test for Two Means (Large Samples)
\[Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}\]
If σ unknown: Use $s_1$ and $s_2$
Z-Test for Single Proportion
\[Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}\]
Z-Test for Two Proportions
\[Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}\]
Where pooled proportion: $\hat{p} = \frac{x_1 + x_2}{n_1 + n_2}$
t-Test for Single Mean (Small Sample)
\[t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}\]
Degrees of freedom: $df = n - 1$
t-Test for Two Independent Means
\[t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}\]
Pooled Standard Deviation: \(s_p = \sqrt{\frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{n_1 + n_2 - 2}}\)
Degrees of freedom: $df = n_1 + n_2 - 2$
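The pooled formula in code, testing $H_0: \mu_1 = \mu_2$ (summary statistics invented):

```python
import math

x1, s1, n1 = 20.0, 3.0, 10  # hypothetical group 1 summary
x2, s2, n2 = 17.0, 4.0, 12  # hypothetical group 2 summary

# Pooled standard deviation
sp = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
t = (x1 - x2) / (sp * math.sqrt(1 / n1 + 1 / n2))  # ≈ 1.95
df = n1 + n2 - 2                                   # 20
```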
Paired t-Test
\[t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}\]
Where:
- $\bar{d}$ = mean of differences
- $s_d$ = standard deviation of differences
- $df = n - 1$
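Paired data reduce to a one-sample test on the differences (before/after scores invented):

```python
import math
import statistics

before = [72, 68, 75, 80, 66, 70]  # hypothetical scores
after = [75, 70, 74, 84, 69, 74]

d = [a - b for a, b in zip(after, before)]  # differences
n = len(d)
dbar = statistics.mean(d)   # mean of differences = 2.5
sd = statistics.stdev(d)    # n-1 denominator

t = dbar / (sd / math.sqrt(n))  # ≈ 3.27
df = n - 1                      # 5
```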
Chi-Square Test for Independence
\[\chi^2 = \sum \frac{(O - E)^2}{E}\]
Expected Frequency: \(E = \frac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}}\)
Degrees of freedom: $df = (r - 1)(c - 1)$
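Expected frequencies and the χ² statistic for a small 2×2 table (counts invented):

```python
observed = [[30, 20],  # hypothetical 2x2 contingency table
            [20, 30]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand  # expected frequency
        chi2 += (o - e) ** 2 / e

df = (len(observed) - 1) * (len(observed[0]) - 1)  # (2-1)(2-1) = 1
# chi2 = 4.0 > 3.841, so independence is rejected at alpha = 0.05
```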
Chi-Square Goodness of Fit
\[\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}\]
Degrees of freedom: $df = k - 1 - m$
Where: $k$ = categories, $m$ = parameters estimated from data
Kruskal-Wallis Test (Non-parametric ANOVA)
\[H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)\]
Where:
- $N$ = total observations
- $k$ = number of groups
- $R_i$ = sum of ranks in group $i$
- $n_i$ = size of group $i$
- $df = k - 1$
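With small untied samples, the H statistic is easy to compute by hand or in code (data invented to avoid ties):

```python
groups = [[7, 12, 14], [15, 18, 21], [24, 27, 30]]  # hypothetical groups

# Rank all observations jointly (rank 1 = smallest; no ties by construction)
pooled = sorted(x for g in groups for x in g)
rank = {x: i + 1 for i, x in enumerate(pooled)}

N = len(pooled)
H = 12 / (N * (N + 1)) * sum(
    sum(rank[x] for x in g) ** 2 / len(g) for g in groups
) - 3 * (N + 1)
df = len(groups) - 1  # 2
```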
📋 Critical Values Quick Reference
Z Critical Values
| Confidence Level | $\alpha$ | $Z_{\alpha/2}$ |
|---|---|---|
| 90% | 0.10 | 1.645 |
| 95% | 0.05 | 1.96 |
| 99% | 0.01 | 2.576 |
Common t Critical Values (Two-tailed)
| df | α = 0.10 | α = 0.05 | α = 0.01 |
|---|---|---|---|
| 5 | 2.015 | 2.571 | 4.032 |
| 10 | 1.812 | 2.228 | 3.169 |
| 15 | 1.753 | 2.131 | 2.947 |
| 20 | 1.725 | 2.086 | 2.845 |
| 30 | 1.697 | 2.042 | 2.750 |
| ∞ | 1.645 | 1.96 | 2.576 |
🎯 Quick Decision Guide
Which Test to Use?
| Situation | Test |
|---|---|
| One sample mean, σ known or n ≥ 30 | Z-test |
| One sample mean, σ unknown, n < 30 | t-test |
| Two sample means, independent, large | Z-test |
| Two sample means, independent, small | Independent t-test |
| Two sample means, paired/matched | Paired t-test |
| One proportion | Z-test for proportion |
| Two proportions | Z-test for two proportions |
| Categorical data, one variable | Chi-square goodness of fit |
| Categorical data, two variables | Chi-square independence |
| Compare 3+ groups, non-parametric | Kruskal-Wallis |
📝 Exam Tips
- Always state hypotheses clearly: $H_0$ and $H_1$
- Check conditions: Sample size, normality, independence
- Use correct formula: Match test to situation
- Show all work: Include intermediate calculations
- State conclusion in context: Relate back to the problem
- Round appropriately: Usually 3-4 decimal places for test statistics
_Last updated: January 2026 | MPA 509: Statistics for Public Administration_


