# Unit 5: Solved Numerical Problems (Part 2)
**Chi-Square Tests and Non-Parametric Tests**
This section contains 12 fully solved problems on chi-square tests and the Kruskal-Wallis test, followed by 5 practice problems.
## Section A: Chi-Square Test for Independence
### Problem 1: 2×2 Contingency Table
**Question:** A survey examines the relationship between gender and job satisfaction:
| | Satisfied | Not Satisfied | Total |
| ------ | --------- | ------------- | ----- |
| Male | 60 | 40 | 100 |
| Female | 45 | 55 | 100 |
| Total | 105 | 95 | 200 |
Test at α = 0.05 whether satisfaction is independent of gender.
**Solution:**
**Step 1: State hypotheses**
- H₀: Gender and satisfaction are independent
- H₁: Gender and satisfaction are associated
**Step 2: Calculate expected frequencies**
Formula: $E = \frac{\text{Row Total} \times \text{Column Total}}{\text{Grand Total}}$
| Cell | Calculation | Expected |
| -------------------- | --------------- | -------- |
| Male-Satisfied | (100 × 105)/200 | 52.5 |
| Male-Not Satisfied | (100 × 95)/200 | 47.5 |
| Female-Satisfied | (100 × 105)/200 | 52.5 |
| Female-Not Satisfied | (100 × 95)/200 | 47.5 |
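As a minimal sketch, the expected-frequency formula can be applied to every cell of a table at once. The function name `expected_frequencies` is illustrative and not part of the course material.

```python
def expected_frequencies(observed):
    """Return E = (row total x column total) / grand total for each cell."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

# Problem 1: rows are Male, Female; columns are Satisfied, Not Satisfied.
observed = [[60, 40], [45, 55]]
expected = expected_frequencies(observed)
print(expected)  # [[52.5, 47.5], [52.5, 47.5]]
```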
**Step 3: Create calculation table**
| Cell | O | E | (O-E) | (O-E)² | (O-E)²/E |
| ---------- | --- | ---- | ----- | ------ | --------- |
| Male-Sat | 60 | 52.5 | 7.5 | 56.25 | 1.071 |
| Male-Not | 40 | 47.5 | -7.5 | 56.25 | 1.184 |
| Female-Sat | 45 | 52.5 | -7.5 | 56.25 | 1.071 |
| Female-Not | 55 | 47.5 | 7.5 | 56.25 | 1.184 |
| **Total** | | | | | **4.510** |
$$\chi^2 = 4.510$$
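The calculation table can be reproduced in a few lines. This is a sketch only; `chi_square_statistic` is a hypothetical helper name.

```python
def chi_square_statistic(observed, expected):
    """Sum (O - E)^2 / E over paired observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Problem 1 cells in the order Male-Sat, Male-Not, Female-Sat, Female-Not.
observed = [60, 40, 45, 55]
expected = [52.5, 47.5, 52.5, 47.5]
chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 3))  # 4.511 (the table's 4.510 sums already-rounded terms)
```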
**Step 4: Find degrees of freedom and critical value**
$$df = (r-1)(c-1) = (2-1)(2-1) = 1$$
At α = 0.05, df = 1: χ²\* = 3.841
**Step 5: Decision**
χ² = 4.510 > 3.841, **Reject H₀**
**Step 6: Conclusion**
At α = 0.05, there is sufficient evidence that job satisfaction is associated with gender.
---
### Problem 2: 2×2 Shortcut Formula
**Question:** Verify Problem 1 using the shortcut formula.
**Solution:**
For a 2×2 table:
| | Col 1 | Col 2 | Total |
| ----- | ------ | ------ | ----- |
| Row 1 | a = 60 | b = 40 | 100 |
| Row 2 | c = 45 | d = 55 | 100 |
| Total | 105 | 95 | 200 |
Shortcut formula:
$$\chi^2 = \frac{n(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)}$$
$$\chi^2 = \frac{200(60 \times 55 - 40 \times 45)^2}{(100)(100)(105)(95)}$$
$$= \frac{200(3300 - 1800)^2}{99{,}750{,}000} = \frac{200 \times 2{,}250{,}000}{99{,}750{,}000}$$
$$= \frac{450{,}000{,}000}{99{,}750{,}000} = 4.511$$
This agrees with the χ² = 4.510 found in Problem 1; the difference in the third decimal comes from summing rounded terms there.
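The shortcut formula is easy to check numerically. A minimal sketch, with `chi_square_2x2` as an illustrative function name:

```python
def chi_square_2x2(a, b, c, d):
    """Shortcut chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator

chi2 = chi_square_2x2(60, 40, 45, 55)
print(round(chi2, 3))  # 4.511
```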
---
### Problem 3: 3×2 Contingency Table
**Question:** Test association between education level and voting preference:
| | Voted | Did Not Vote | Total |
| ----------- | ----- | ------------ | ----- |
| High School | 40 | 60 | 100 |
| Bachelor's | 70 | 50 | 120 |
| Graduate | 80 | 40 | 120 |
| Total | 190 | 150 | 340 |
Test at α = 0.01.
**Solution:**
**Step 1: State hypotheses**
- H₀: Education and voting are independent
- H₁: Education and voting are associated
**Step 2: Calculate expected frequencies**
| Cell | Calculation | E |
| ---------- | --------------- | ----- |
| HS-Voted | (100 × 190)/340 | 55.88 |
| HS-Not | (100 × 150)/340 | 44.12 |
| Bach-Voted | (120 × 190)/340 | 67.06 |
| Bach-Not | (120 × 150)/340 | 52.94 |
| Grad-Voted | (120 × 190)/340 | 67.06 |
| Grad-Not | (120 × 150)/340 | 52.94 |
**Step 3: Calculate χ²**
| Cell | O | E | (O-E)²/E |
| ---------- | --- | ----- | ---------- |
| HS-Voted | 40 | 55.88 | 4.514 |
| HS-Not | 60 | 44.12 | 5.718 |
| Bach-Voted | 70 | 67.06 | 0.129 |
| Bach-Not | 50 | 52.94 | 0.163 |
| Grad-Voted | 80 | 67.06 | 2.497 |
| Grad-Not | 40 | 52.94 | 3.163 |
| **Total** | | | **16.185** |
$$\chi^2 = 16.185$$
Contributions are computed from the unrounded expected frequencies, so the total can differ from the sum of the displayed terms in the last decimal.
**Step 4: Critical value**
df = (3-1)(2-1) = 2
At α = 0.01: χ²\* = 9.210
**Step 5: Decision**
χ² = 16.185 > 9.210, **Reject H₀**
**Step 6: Conclusion**
At α = 0.01, education level and voting behavior are significantly associated.
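The same steps extend to any r × c table. The sketch below assumes the table is given as a list of rows without the totals; `chi_square_independence` is a hypothetical name, and the critical value is copied from the reference table rather than computed.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected frequency
            chi2 += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df

# Problem 3: rows are High School, Bachelor's, Graduate.
voting = [[40, 60], [70, 50], [80, 40]]
chi2, df = chi_square_independence(voting)
CRITICAL_01_DF2 = 9.210  # alpha = 0.01, df = 2
print(round(chi2, 2), df, chi2 > CRITICAL_01_DF2)  # 16.18 2 True
```

Because the expected frequencies are not rounded here, the third decimal can differ slightly from a hand calculation.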
---
### Problem 4: 3×3 Contingency Table
**Question:** Employee satisfaction by department:
| | Excellent | Good | Poor | Total |
| ------- | --------- | ---- | ---- | ----- |
| Finance | 25 | 35 | 20 | 80 |
| HR | 30 | 40 | 30 | 100 |
| IT | 40 | 45 | 35 | 120 |
| Total | 95 | 120 | 85 | 300 |
Test at α = 0.05 if satisfaction differs by department.
**Solution:**
**Step 1: Calculate expected frequencies**
| Cell | E = (Row × Col)/300 |
| ------------ | --------------------- |
| Finance-Exc | (80×95)/300 = 25.33 |
| Finance-Good | (80×120)/300 = 32.00 |
| Finance-Poor | (80×85)/300 = 22.67 |
| HR-Exc | (100×95)/300 = 31.67 |
| HR-Good | (100×120)/300 = 40.00 |
| HR-Poor | (100×85)/300 = 28.33 |
| IT-Exc | (120×95)/300 = 38.00 |
| IT-Good | (120×120)/300 = 48.00 |
| IT-Poor | (120×85)/300 = 34.00 |
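A useful sanity check on expected frequencies is that each row and column of E adds up to the corresponding observed total. A short sketch of that check for this table (variable names are illustrative):

```python
import math

observed = [[25, 35, 20], [30, 40, 30], [40, 45, 35]]  # Problem 4
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Each row and column of E should reproduce the observed marginal total.
rows_ok = all(math.isclose(sum(e_row), r) for e_row, r in zip(expected, row_totals))
cols_ok = all(math.isclose(sum(e_col), c) for e_col, c in zip(zip(*expected), col_totals))
print(rows_ok, cols_ok)  # True True
```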
**Step 2: Calculate χ²**
| Cell | O | E | (O-E)²/E |
| --------- | --- | ----- | --------- |
| Fin-Exc | 25 | 25.33 | 0.004 |
| Fin-Good | 35 | 32.00 | 0.281 |
| Fin-Poor | 20 | 22.67 | 0.314 |
| HR-Exc | 30 | 31.67 | 0.088 |
| HR-Good | 40 | 40.00 | 0.000 |
| HR-Poor | 30 | 28.33 | 0.098 |
| IT-Exc | 40 | 38.00 | 0.105 |
| IT-Good | 45 | 48.00 | 0.188 |
| IT-Poor | 35 | 34.00 | 0.029 |
| **Total** | | | **1.107** |
$$\chi^2 = 1.107$$
**Step 3: Critical value**
df = (3-1)(3-1) = 4
At α = 0.05: χ²\* = 9.488
**Step 4: Decision**
χ² = 1.107 < 9.488, **Fail to Reject H₀**
**Step 5: Conclusion**
At α = 0.05, there is no significant association between department and satisfaction level.
---
## Section B: Chi-Square Goodness of Fit Test
### Problem 5: Uniform Distribution Test
**Question:** A die is rolled 120 times with results:
| Face | 1 | 2 | 3 | 4 | 5 | 6 |
| -------- | --- | --- | --- | --- | --- | --- |
| Observed | 25 | 15 | 22 | 18 | 20 | 20 |
Test at α = 0.05 if the die is fair.
**Solution:**
**Step 1: State hypotheses**
- H₀: Die is fair (uniform distribution)
- H₁: Die is not fair
**Step 2: Calculate expected frequencies**
For fair die: E = 120/6 = 20 for each face
**Step 3: Calculate χ²**
| Face | O | E | (O-E)² | (O-E)²/E |
| --------- | ------- | ------- | ------ | -------- |
| 1 | 25 | 20 | 25 | 1.25 |
| 2 | 15 | 20 | 25 | 1.25 |
| 3 | 22 | 20 | 4 | 0.20 |
| 4 | 18 | 20 | 4 | 0.20 |
| 5 | 20 | 20 | 0 | 0.00 |
| 6 | 20 | 20 | 0 | 0.00 |
| **Total** | **120** | **120** | | **2.90** |
$$\chi^2 = 2.90$$
**Step 4: Critical value**
df = k - 1 = 6 - 1 = 5
At α = 0.05: χ²\* = 11.070
**Step 5: Decision**
χ² = 2.90 < 11.070, **Fail to Reject H₀**
**Step 6: Conclusion**
At α = 0.05, there is insufficient evidence to conclude the die is unfair.
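A goodness-of-fit test only needs the observed counts and the hypothesised probabilities. The following is a sketch under that assumption; `goodness_of_fit` is an illustrative name.

```python
def goodness_of_fit(observed, probabilities):
    """Chi-square goodness-of-fit statistic and df for hypothesised probabilities."""
    n = sum(observed)
    expected = [n * p for p in probabilities]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, len(observed) - 1

rolls = [25, 15, 22, 18, 20, 20]  # Problem 5, faces 1 to 6
chi2, df = goodness_of_fit(rolls, [1 / 6] * 6)
print(round(chi2, 2), df)  # 2.9 5
```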
---
### Problem 6: Test for Given Proportions
**Question:** A manager claims customer preferences are in ratio 3:2:1 for products A, B, C. A survey of 180 customers shows:
- Product A: 100
- Product B: 55
- Product C: 25
Test at α = 0.05 if data supports the claim.
**Solution:**
**Step 1: State hypotheses**
- H₀: Preferences are in ratio 3:2:1
- H₁: Preferences are not in ratio 3:2:1
**Step 2: Calculate expected frequencies**
Total ratio = 3 + 2 + 1 = 6
- E(A) = 180 × (3/6) = 90
- E(B) = 180 × (2/6) = 60
- E(C) = 180 × (1/6) = 30
**Step 3: Calculate χ²**
| Product | O | E | (O-E)² | (O-E)²/E |
| --------- | ------- | ------- | ------ | --------- |
| A | 100 | 90 | 100 | 1.111 |
| B | 55 | 60 | 25 | 0.417 |
| C | 25 | 30 | 25 | 0.833 |
| **Total** | **180** | **180** | | **2.361** |
$$\chi^2 = 2.361$$
**Step 4: Critical value**
df = 3 - 1 = 2
At α = 0.05: χ²\* = 5.991
**Step 5: Decision**
χ² = 2.361 < 5.991, **Fail to Reject H₀**
**Step 6: Conclusion**
At α = 0.05, the data is consistent with the claimed ratio 3:2:1.
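Turning a claimed ratio into expected counts can be sketched with exact fractions, which avoids rounding until the final step. `expected_from_ratio` is a hypothetical helper.

```python
from fractions import Fraction

def expected_from_ratio(total, ratio):
    """Split `total` according to integer ratio parts, e.g. (3, 2, 1)."""
    parts = sum(ratio)
    return [total * Fraction(r, parts) for r in ratio]

observed = [100, 55, 25]  # Products A, B, C
expected = expected_from_ratio(180, (3, 2, 1))
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
print([int(e) for e in expected], round(float(chi2), 3))  # [90, 60, 30] 2.361
```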
---
### Problem 7: Test for Specified Percentages
**Question:** A city claims the following distribution of households by income:
- Low: 30%
- Middle: 50%
- High: 20%
A sample of 250 households shows:
- Low: 90
- Middle: 110
- High: 50
Test at α = 0.01 whether the sample matches the claimed distribution.
**Solution:**
**Step 1: Calculate expected frequencies**
- E(Low) = 250 × 0.30 = 75
- E(Middle) = 250 × 0.50 = 125
- E(High) = 250 × 0.20 = 50
**Step 2: Calculate χ²**
| Category | O | E | (O-E)²/E |
| --------- | --- | --- | -------- |
| Low | 90 | 75 | 3.00 |
| Middle | 110 | 125 | 1.80 |
| High | 50 | 50 | 0.00 |
| **Total** | | | **4.80** |
$$\chi^2 = 4.80$$
**Step 3: Critical value**
df = 2, α = 0.01: χ²\* = 9.210
**Step 4: Decision**
χ² = 4.80 < 9.210, **Fail to Reject H₀**
**Step 5: Conclusion**
At α = 0.01, the sample distribution is consistent with the claimed percentages.
---
### Problem 8: Day of Week Distribution
**Question:** Emergency calls over a week:
| Day | Mon | Tue | Wed | Thu | Fri | Sat | Sun |
| ----- | --- | --- | --- | --- | --- | --- | --- |
| Calls | 45 | 48 | 42 | 50 | 55 | 70 | 60 |
Test at α = 0.05 if calls are uniformly distributed.
**Solution:**
**Step 1: Calculate expected (uniform)**
Total = 370, E = 370/7 = 52.86 per day
**Step 2: Calculate χ²**
| Day | O | E | (O-E)²/E |
| --------- | --- | ----- | ---------- |
| Mon | 45 | 52.86 | 1.168 |
| Tue | 48 | 52.86 | 0.446 |
| Wed | 42 | 52.86 | 2.230 |
| Thu | 50 | 52.86 | 0.154 |
| Fri | 55 | 52.86 | 0.087 |
| Sat | 70 | 52.86 | 5.560 |
| Sun | 60 | 52.86 | 0.965 |
| **Total** | | | **10.611** |
$$\chi^2 = 10.611$$
Contributions use the unrounded E = 370/7, so the total can differ from the sum of the displayed terms in the last decimal.
**Step 3: Critical value**
df = 6, α = 0.05: χ²\* = 12.592
**Step 4: Decision**
χ² = 10.611 < 12.592, **Fail to Reject H₀**
**Step 5: Conclusion**
At α = 0.05, there is insufficient evidence that emergency calls vary by day of the week.
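When all $k$ categories share the same expected count $E = N/k$, as in Problem 8, the statistic simplifies. This gives a quick check that avoids rounding $E$:

$$\chi^2 = \sum \frac{(O_i - E)^2}{E} = \frac{\sum O_i^2}{E} - 2N + N = \frac{k}{N}\sum O_i^2 - N$$

$$\chi^2 = \frac{7}{370}\left(45^2 + 48^2 + 42^2 + 50^2 + 55^2 + 70^2 + 60^2\right) - 370 = \frac{7 \times 20118}{370} - 370 \approx 10.611$$

A term-by-term table that rounds E to 52.86 can differ from this value in the second or third decimal; the decision is unaffected.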
---
## Section C: Kruskal-Wallis Test
### Problem 9: Three Groups Comparison
**Question:** Compare satisfaction scores across three training programs:
| Program A | Program B | Program C |
| --------- | --------- | --------- |
| 82 | 75 | 90 |
| 78 | 70 | 88 |
| 85 | 72 | 92 |
| 80 | 68 | 85 |
Test at α = 0.05 if programs differ.
**Solution:**
**Step 1: Rank all data combined**
| Value | Program | Rank |
| ----- | ------- | ---- |
| 68 | B | 1 |
| 70 | B | 2 |
| 72 | B | 3 |
| 75 | B | 4 |
| 78 | A | 5 |
| 80 | A | 6 |
| 82 | A | 7 |
| 85 | A | 8.5 |
| 85 | C | 8.5 |
| 88 | C | 10 |
| 90 | C | 11 |
| 92 | C | 12 |
**Step 2: Calculate rank sums**
- $R_A$ = 5 + 6 + 7 + 8.5 = 26.5
- $R_B$ = 1 + 2 + 3 + 4 = 10
- $R_C$ = 8.5 + 10 + 11 + 12 = 41.5
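Ranking with average ranks for ties is the step most prone to slips by hand. The sketch below pools the three programs, ranks them, and recovers the rank sums; `average_ranks` is an illustrative helper, not a library function.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1), giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks

groups = {"A": [82, 78, 85, 80], "B": [75, 70, 72, 68], "C": [90, 88, 92, 85]}
pooled = [v for scores in groups.values() for v in scores]
ranks = average_ranks(pooled)

rank_sums = {}
position = 0
for name, scores in groups.items():
    rank_sums[name] = sum(ranks[position:position + len(scores)])
    position += len(scores)
print(rank_sums)  # {'A': 26.5, 'B': 10.0, 'C': 41.5}
```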
**Step 3: Calculate H statistic**
$$H = \frac{12}{N(N+1)} \sum \frac{R_i^2}{n_i} - 3(N+1)$$
$$H = \frac{12}{12(13)} \left[\frac{(26.5)^2}{4} + \frac{(10)^2}{4} + \frac{(41.5)^2}{4}\right] - 3(13)$$
$$= \frac{12}{156} \times \frac{702.25 + 100 + 1722.25}{4} - 39$$
$$= \frac{12}{156} \times 631.125 - 39 = 48.55 - 39 = 9.55$$
**Step 4: Critical value**
df = k - 1 = 2
At α = 0.05: χ²\* = 5.991
**Step 5: Decision**
H = 9.55 > 5.991, **Reject H₀**
**Step 6: Conclusion**
At α = 0.05, satisfaction scores differ significantly across the three training programs.
---
### Problem 10: Four Groups Comparison
**Question:** Response times (minutes) across four service centers:
| Center 1 | Center 2 | Center 3 | Center 4 |
| -------- | -------- | -------- | -------- |
| 5 | 8 | 12 | 15 |
| 6 | 10 | 11 | 18 |
| 4 | 9 | 14 | 16 |
Test at α = 0.05 if centers differ.
**Solution:**
**Step 1: Rank all 12 values**
| Rank | Value | Center |
| ---- | ----- | ------ |
| 1 | 4 | 1 |
| 2 | 5 | 1 |
| 3 | 6 | 1 |
| 4 | 8 | 2 |
| 5 | 9 | 2 |
| 6 | 10 | 2 |
| 7 | 11 | 3 |
| 8 | 12 | 3 |
| 9 | 14 | 3 |
| 10 | 15 | 4 |
| 11 | 16 | 4 |
| 12 | 18 | 4 |
**Step 2: Rank sums**
- $R_1$ = 1 + 2 + 3 = 6
- $R_2$ = 4 + 5 + 6 = 15
- $R_3$ = 7 + 8 + 9 = 24
- $R_4$ = 10 + 11 + 12 = 33
**Check:** 6 + 15 + 24 + 33 = 78 = 12(13)/2 ✓
**Step 3: Calculate H**
$$H = \frac{12}{12(13)} \left[\frac{36}{3} + \frac{225}{3} + \frac{576}{3} + \frac{1089}{3}\right] - 39$$
$$= \frac{12}{156} \times 642 - 39 = 49.38 - 39 = 10.38$$
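Given the rank sums and group sizes, H follows directly from the formula. A minimal sketch (the function name is hypothetical, and no tie correction is applied):

```python
def kruskal_wallis_h(rank_sums, sizes):
    """H = 12 / (N (N + 1)) * sum(R_i^2 / n_i) - 3 (N + 1), without tie correction."""
    n_total = sum(sizes)
    weighted = sum(r ** 2 / n for r, n in zip(rank_sums, sizes))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)

h = kruskal_wallis_h([6, 15, 24, 33], [3, 3, 3, 3])  # Problem 10
print(round(h, 2))  # 10.38
```

The same function gives H ≈ 9.55 for the rank sums 26.5, 10 and 41.5 from Problem 9.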
**Step 4: Critical value**
df = 3, α = 0.05: χ²\* = 7.815
**Step 5: Decision**
H = 10.38 > 7.815, **Reject H₀**
**Step 6: Conclusion**
Response times differ significantly across the four service centers.
---
### Problem 11: Kruskal-Wallis with Ties
**Question:** Quality ratings (1-10) across three suppliers:
| Supplier A | Supplier B | Supplier C |
| ---------- | ---------- | ---------- |
| 7 | 5 | 8 |
| 6 | 5 | 9 |
| 7 | 6 | 8 |
| 8 | 4 | 9 |
Test at α = 0.05.
**Solution:**
**Step 1: Rank all values (handle ties with average ranks)**
| Value | Supplier | Rank |
| ----- | -------- | ---- |
| 4 | B | 1 |
| 5 | B | 2.5 |
| 5 | B | 2.5 |
| 6 | A | 4.5 |
| 6 | B | 4.5 |
| 7 | A | 6.5 |
| 7 | A | 6.5 |
| 8 | A | 9 |
| 8 | C | 9 |
| 8 | C | 9 |
| 9 | C | 11.5 |
| 9 | C | 11.5 |
**Step 2: Rank sums**
- $R_A$ = 4.5 + 6.5 + 6.5 + 9 = 26.5
- $R_B$ = 1 + 2.5 + 2.5 + 4.5 = 10.5
- $R_C$ = 9 + 9 + 11.5 + 11.5 = 41
**Step 3: Calculate H**
$$H = \frac{12}{12(13)} \left[\frac{(26.5)^2}{4} + \frac{(10.5)^2}{4} + \frac{(41)^2}{4}\right] - 39$$
$$= \frac{12}{156} \times \frac{702.25 + 110.25 + 1681}{4} - 39$$
$$= \frac{12}{156} \times 623.375 - 39 = 47.95 - 39 = 8.95$$
**Step 4: Decision**
H = 8.95 > 5.991 (df = 2, α = 0.05), **Reject H₀**
**Step 5: Conclusion**
Quality ratings differ significantly among the three suppliers.
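The solution above uses average ranks but does not adjust H for ties. Many textbooks divide H by a correction factor $C = 1 - \sum (t^3 - t)/(N^3 - N)$, where $t$ is the size of each group of tied values. The sketch below applies that standard correction as an optional extra; it is not part of the worked solution, and `tie_correction` is an illustrative name.

```python
from collections import Counter

def tie_correction(values):
    """Return C = 1 - sum(t^3 - t) / (N^3 - N), with t the size of each tied group."""
    n = len(values)
    tied = sum(t ** 3 - t for t in Counter(values).values())
    return 1 - tied / (n ** 3 - n)

ratings = [7, 6, 7, 8, 5, 5, 6, 4, 8, 9, 8, 9]  # Suppliers A, B, C pooled
h_uncorrected = 12 / (12 * 13) * (26.5**2 / 4 + 10.5**2 / 4 + 41**2 / 4) - 3 * 13
c = tie_correction(ratings)
h_corrected = h_uncorrected / c
print(round(h_uncorrected, 2), round(c, 4), round(h_corrected, 2))  # 8.95 0.972 9.21
```

The correction always increases H slightly, so it cannot reverse a rejection; here the decision at α = 0.05 is unchanged.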
---
## Section D: Comprehensive Problems
### Problem 12: Complete Chi-Square Analysis
**Question:** A company surveyed employee engagement by tenure:
| | Engaged | Neutral | Disengaged | Total |
| --------- | ------- | ------- | ---------- | ----- |
| < 2 years | 35 | 25 | 20 | 80 |
| 2-5 years | 40 | 35 | 25 | 100 |
| > 5 years | 55 | 30 | 35 | 120 |
| Total | 130 | 90 | 80 | 300 |
a) Test independence at α = 0.05
b) Calculate expected frequencies
c) Identify cells contributing most to χ²
**Solution:**
**Part (a) & (b): Expected Frequencies**
| Cell | E |
| --------------- | --------------------- |
| <2, Engaged | (80×130)/300 = 34.67 |
| <2, Neutral | (80×90)/300 = 24.00 |
| <2, Disengaged | (80×80)/300 = 21.33 |
| 2-5, Engaged | (100×130)/300 = 43.33 |
| 2-5, Neutral | (100×90)/300 = 30.00 |
| 2-5, Disengaged | (100×80)/300 = 26.67 |
| >5, Engaged | (120×130)/300 = 52.00 |
| >5, Neutral | (120×90)/300 = 36.00 |
| >5, Disengaged | (120×80)/300 = 32.00 |
**χ² Calculation:**
| Cell | O | E | (O-E)²/E |
| --------------- | --- | ----- | --------- |
| <2, Engaged | 35 | 34.67 | 0.003 |
| <2, Neutral | 25 | 24.00 | 0.042 |
| <2, Disengaged | 20 | 21.33 | 0.083 |
| 2-5, Engaged | 40 | 43.33 | 0.256 |
| 2-5, Neutral | 35 | 30.00 | 0.833 |
| 2-5, Disengaged | 25 | 26.67 | 0.104 |
| >5, Engaged | 55 | 52.00 | 0.173 |
| >5, Neutral | 30 | 36.00 | 1.000 |
| >5, Disengaged | 35 | 32.00 | 0.281 |
| **Total** | | | **2.776** |
$$\chi^2 = 2.776$$
**Critical value:** df = (3-1)(3-1) = 4, α = 0.05: χ²\* = 9.488
**Decision:** χ² = 2.776 < 9.488, **Fail to Reject H₀**
**Conclusion:** At α = 0.05, there is no significant association between tenure and engagement.
**Part (c): Largest Contributions**
1. **> 5 years, Neutral:** 1.000
2. **2-5 years, Neutral:** 0.833
3. **> 5 years, Disengaged:** 0.281
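Ranking the cell contributions can be sketched as follows; the label lists are written out by hand to match the table in the question.

```python
labels_rows = ["<2 years", "2-5 years", ">5 years"]
labels_cols = ["Engaged", "Neutral", "Disengaged"]
observed = [[35, 25, 20], [40, 35, 25], [55, 30, 35]]  # Problem 12

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

contributions = []
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n  # expected frequency
        contributions.append(((o - e) ** 2 / e, labels_rows[i], labels_cols[j]))

# Largest contributions first.
contributions.sort(reverse=True)
for value, tenure, level in contributions[:3]:
    print(f"{tenure}, {level}: {value:.3f}")
```

Since the overall test is not significant, these contributions are descriptive only.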
---
## Practice Problems
1. Test whether the response is independent of gender:
   | | Yes | No | Total |
   | ------ | --- | --- | ----- |
   | Male | 45 | 30 | 75 |
   | Female | 35 | 40 | 75 |
   Use α = 0.05.
2. A coin is flipped 100 times: Heads = 58, Tails = 42. Test whether the coin is fair at α = 0.05.
3. Compare three groups using the Kruskal-Wallis test:
   - Group A: 10, 15, 12, 18
   - Group B: 8, 11, 9, 14
   - Group C: 20, 22, 19, 25
   Test at α = 0.05.
4. Survey results for satisfaction by area:
   | | Satisfied | Neutral | Dissatisfied |
   | ----- | --------- | ------- | ------------ |
   | Urban | 50 | 30 | 20 |
   | Rural | 35 | 40 | 25 |
   Test for independence at α = 0.01.
5. Test whether customer arrivals follow the ratio 3:2:1:1:
   - Morning: 90
   - Noon: 55
   - Afternoon: 35
   - Evening: 20
   Use α = 0.05.
---
## Chi-Square Critical Values Reference Table
| df | α = 0.10 | α = 0.05 | α = 0.01 |
| --- | -------- | -------- | -------- |
| 1 | 2.706 | 3.841 | 6.635 |
| 2 | 4.605 | 5.991 | 9.210 |
| 3 | 6.251 | 7.815 | 11.345 |
| 4 | 7.779 | 9.488 | 13.277 |
| 5 | 9.236 | 11.070 | 15.086 |
| 6 | 10.645 | 12.592 | 16.812 |
| 7 | 12.017 | 14.067 | 18.475 |
| 8 | 13.362 | 15.507 | 20.090 |
| 9 | 14.684 | 16.919 | 21.666 |
| 10 | 15.987 | 18.307 | 23.209 |
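The tabulated critical values can be sanity-checked without a statistics library. The sketch below computes the right-tail probability of a chi-square variable for integer df using the recurrence for the regularized upper incomplete gamma function; `chi2_sf` is a hypothetical helper, and each table entry in the α = 0.05 column should give a tail probability close to 0.05.

```python
import math

def chi2_sf(x, df):
    """Right-tail probability P(X > x) for a chi-square variable with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)             # Q(1, x/2)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))  # Q(1/2, x/2)
    # Recurrence: Q(a + 1, h) = Q(a, h) + h^a * e^(-h) / Gamma(a + 1).
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q

critical_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}
for df, value in critical_05.items():
    print(df, round(chi2_sf(value, df), 4))
```

The same function turns a computed χ² or H into an approximate tail probability, for example `chi2_sf(4.511, 1)` for Problem 1.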
---
## Summary of Formulas
| Test | Formula | df |
| -------------------------------- | ------------------------------------------------------------------------- | ---------- |
| **Chi-Square (Independence)** | $\chi^2 = \sum \frac{(O-E)^2}{E}$ | (r-1)(c-1) |
| **Expected Frequency** | $E = \frac{\text{Row Total} \times \text{Col Total}}{\text{Grand Total}}$ | |
| **Chi-Square (Goodness of Fit)** | $\chi^2 = \sum \frac{(O-E)^2}{E}$ | k-1 |
| **Kruskal-Wallis** | $H = \frac{12}{N(N+1)}\sum\frac{R_i^2}{n_i} - 3(N+1)$ | k-1 |
| **2×2 Shortcut** | $\chi^2 = \frac{n(ad-bc)^2}{(a+b)(c+d)(a+c)(b+d)}$ | 1 |
---
## Decision Summary
| Test | Decision Rule |
| -------------- | ------------------------------------------------ |
| Chi-Square | Reject H₀ if χ² > χ²\* |
| Kruskal-Wallis | Reject H₀ if H > χ²\* |
| All tests | Compare calculated statistic with critical value |
**Key Point:** The chi-square tests of independence and goodness of fit, and the Kruskal-Wallis test, are **right-tailed**: we reject H₀ only when the test statistic is too large.