
Chi-Squared Tests


The chi-squared test is a non-parametric statistical test used to determine whether observed data deviates significantly from expected values. It has two main applications: testing goodness of fit (to a theoretical distribution) and testing for independence (between two categorical variables).

Board Coverage

| Board | Paper | Notes |
|---|---|---|
| AQA | Paper 2 | Goodness of fit and contingency tables |
| Edexcel | S3 | Goodness of fit and test for independence |
| OCR (A) | Paper 2 | Both applications covered |
| CIE (9231) | S2 | Goodness of fit; independence with 2 × 2 tables |
Note. The chi-squared test statistic is always based on observed and expected frequencies, never on percentages or proportions. Always check the conditions (each expected frequency at least 5) before applying the test. The formula booklet provides the chi-squared distribution table.


1. The Chi-Squared Distribution

1.1 Definition

Definition. If Z1,Z2,,ZkZ_1, Z_2, \ldots, Z_k are independent standard normal random variables, then

χk2=Z12+Z22++Zk2\chi^2_k = Z_1^2 + Z_2^2 + \cdots + Z_k^2

follows a chi-squared distribution with kk degrees of freedom, written χk2\chi^2_k.

1.2 Properties

  • The distribution is defined only for χ20\chi^2 \geq 0
  • It is positively skewed, becoming more symmetric as kk increases
  • E(χk2)=kE(\chi^2_k) = k
  • Var(χk2)=2k\mathrm{Var}(\chi^2_k) = 2k
  • As kk \to \infty, the distribution approaches a normal distribution N(k,2k)N(k, 2k)
  • The distribution is additive: if Xχa2X \sim \chi^2_a and Yχb2Y \sim \chi^2_b are independent, then X+Yχa+b2X + Y \sim \chi^2_{a+b}

1.3 Critical values

Critical values are found from chi-squared tables. For a test at significance level α\alpha with ν\nu degrees of freedom, the critical value χα,ν2\chi^2_{\alpha,\nu} satisfies:

P(χν2>χα,ν2)=αP(\chi^2_\nu > \chi^2_{\alpha,\nu}) = \alpha
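In practice, tabulated critical values can be reproduced numerically. A minimal sketch using SciPy's `chi2` distribution (assuming SciPy is available):

```python
from scipy.stats import chi2

# P(chi2_nu > c) = alpha  =>  c is the (1 - alpha) quantile
alpha, nu = 0.05, 3
crit = chi2.ppf(1 - alpha, df=nu)   # tabulated value: 7.815

# Complementary check: the tail probability above crit is alpha
tail = chi2.sf(crit, df=nu)
```

`chi2.sf` also gives the p-value directly from an observed test statistic, avoiding the table lookup altogether.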


2. Goodness of Fit Test

2.1 Hypotheses

  • H0H_0: The data follows the specified distribution
  • H1H_1: The data does not follow the specified distribution

2.2 Test statistic

χ2=i=1n(OiEi)2Ei\boxed{\chi^2 = \sum_{i=1}^{n}\frac{(O_i - E_i)^2}{E_i}}

where OiO_i is the observed frequency and EiE_i is the expected frequency for category ii.

2.3 Degrees of freedom

ν=n1c\nu = n - 1 - c

where nn is the number of categories and cc is the number of parameters estimated from the data.

  • If no parameters are estimated: ν=n1\nu = n - 1
  • If the mean of a Poisson is estimated from the data: ν=n2\nu = n - 2
  • If both mean and standard deviation of a normal are estimated: ν=n3\nu = n - 3

2.4 Conditions

For the chi-squared approximation to be valid:

  1. Expected frequencies should be 5\geq 5 for each category
  2. If any expected frequency is <5< 5, merge adjacent categories before carrying out the test
  3. The observations must be independent

2.5 Yates' correction (continuity correction)

For a 2×22 \times 2 contingency table with small expected frequencies, Yates' correction adjusts the test statistic:

\chi^2_{\mathrm{Yates}} = \sum\frac{(|O_i - E_i| - 0.5)^2}{E_i}

This correction makes the test more conservative (less likely to reject H0H_0).

Warning. Merge adjacent categories if expected frequencies are too small.

2.6 Worked example: Poisson goodness of fit

Example. Over 100 days, the number of accidents per day at a factory was recorded:

| Accidents (r) | 0 | 1 | 2 | 3 | 4 | ≥ 5 |
|---|---|---|---|---|---|---|
| Days (O_r) | 38 | 32 | 18 | 8 | 3 | 1 |

Test at the 5% level whether the data follows a Poisson distribution.

Step 1: Estimate λ\lambda from the data:

rˉ=0(38)+1(32)+2(18)+3(8)+4(3)+5(1)100=109100=1.09\bar{r} = \frac{0(38)+1(32)+2(18)+3(8)+4(3)+5(1)}{100} = \frac{109}{100} = 1.09

Step 2: Calculate expected frequencies using Po(1.09)\mathrm{Po}(1.09):

P(X=0)=e1.090.3365    E0=33.65P(X=0) = e^{-1.09} \approx 0.3365 \implies E_0 = 33.65

P(X=1)=1.09e1.090.3668    E1=36.68P(X=1) = 1.09\,e^{-1.09} \approx 0.3668 \implies E_1 = 36.68

P(X=2)=1.0922e1.090.1999    E2=19.99P(X=2) = \dfrac{1.09^2}{2}e^{-1.09} \approx 0.1999 \implies E_2 = 19.99

P(X=3)=1.0936e1.090.0726    E3=7.26P(X=3) = \dfrac{1.09^3}{6}e^{-1.09} \approx 0.0726 \implies E_3 = 7.26

P(X=4)=1.09424e1.090.0198    E4=1.98P(X=4) = \dfrac{1.09^4}{24}e^{-1.09} \approx 0.0198 \implies E_4 = 1.98

P(X5)=10.99560.0044    E5=0.44P(X \geq 5) = 1 - 0.9956 \approx 0.0044 \implies E_5 = 0.44

Step 3: Merge categories so all Ei5E_i \geq 5. Merge r3r \geq 3:

| r | 0 | 1 | 2 | ≥ 3 |
|---|---|---|---|---|
| O_r | 38 | 32 | 18 | 12 |
| E_r | 33.65 | 36.68 | 19.99 | 9.68 |

Step 4: Calculate the test statistic:

\chi^2 = \frac{(38-33.65)^2}{33.65} + \frac{(32-36.68)^2}{36.68} + \frac{(18-19.99)^2}{19.99} + \frac{(12-9.68)^2}{9.68}

= \frac{18.92}{33.65} + \frac{21.90}{36.68} + \frac{3.96}{19.99} + \frac{5.38}{9.68} \approx 0.562 + 0.597 + 0.198 + 0.556 = 1.913

Step 5: Degrees of freedom: \nu = 4 - 1 - 1 = 2 (4 categories, 1 parameter estimated).

Step 6: Critical value: \chi^2_{0.05,\,2} = 5.991.

Since 1.913 < 5.991, do not reject H_0.

There is insufficient evidence to suggest the data does not follow a Poisson distribution.
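The merged-table calculation can be reproduced with SciPy's `chisquare`; `ddof=1` accounts for the estimated Poisson mean (a sketch assuming SciPy is available, with the expected values taken from Step 2 and the tail merged):

```python
from scipy.stats import chisquare

obs = [38, 32, 18, 12]              # observed counts after merging r >= 3
exp = [33.65, 36.68, 19.99, 9.68]   # expected from Po(1.09), tail merged
# ddof=1 reduces the degrees of freedom to 4 - 1 - 1 = 2
stat, p = chisquare(obs, f_exp=exp, ddof=1)
```

A p-value above 0.05 leads to the same conclusion as comparing the statistic with the tabulated critical value.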


3. Test for Independence

3.1 Contingency tables

Definition. A contingency table (or two-way table) displays the frequency distribution of two categorical variables.

3.2 Hypotheses

  • H0H_0: The two variables are independent
  • H1H_1: The two variables are not independent

3.3 Expected frequencies

For a contingency table with entries OijO_{ij} (row ii, column jj), the expected frequency is:

\boxed{E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{\text{grand total}}}

3.4 Test statistic

χ2=ij(OijEij)2Eij\boxed{\chi^2 = \sum_{i}\sum_{j}\frac{(O_{ij} - E_{ij})^2}{E_{ij}}}

3.5 Degrees of freedom

ν=(r1)(c1)\boxed{\nu = (r-1)(c-1)}

where rr is the number of rows and cc is the number of columns.

3.6 Worked example

Example. A survey of 200 people records their age group and preferred news source:

|  | TV | Online | Newspaper | Row total |
|---|---|---|---|---|
| Under 30 | 20 | 60 | 10 | 90 |
| 30 to 50 | 30 | 25 | 15 | 70 |
| Over 50 | 20 | 5 | 15 | 40 |
| Col total | 70 | 90 | 40 | 200 |

Test at the 5% level whether age group and preferred news source are independent.

Expected frequencies:

E_{11} = \dfrac{90 \times 70}{200} = 31.5, \quad E_{12} = \dfrac{90 \times 90}{200} = 40.5, \quad E_{13} = \dfrac{90 \times 40}{200} = 18.0

E_{21} = \dfrac{70 \times 70}{200} = 24.5, \quad E_{22} = \dfrac{70 \times 90}{200} = 31.5, \quad E_{23} = \dfrac{70 \times 40}{200} = 14.0

E_{31} = \dfrac{40 \times 70}{200} = 14.0, \quad E_{32} = \dfrac{40 \times 90}{200} = 18.0, \quad E_{33} = \dfrac{40 \times 40}{200} = 8.0

All expected frequencies 5\geq 5, so the test is valid.

χ2=(2031.5)231.5+(6040.5)240.5+(1018)218+(3024.5)224.5+(2531.5)231.5+(1514)214+(2014)214+(518)218+(158)28\chi^2 = \frac{(20-31.5)^2}{31.5} + \frac{(60-40.5)^2}{40.5} + \frac{(10-18)^2}{18} + \frac{(30-24.5)^2}{24.5} + \frac{(25-31.5)^2}{31.5} + \frac{(15-14)^2}{14} + \frac{(20-14)^2}{14} + \frac{(5-18)^2}{18} + \frac{(15-8)^2}{8}

=132.2531.5+380.2540.5+6418+30.2524.5+42.2531.5+114+3614+16918+498= \frac{132.25}{31.5} + \frac{380.25}{40.5} + \frac{64}{18} + \frac{30.25}{24.5} + \frac{42.25}{31.5} + \frac{1}{14} + \frac{36}{14} + \frac{169}{18} + \frac{49}{8}

4.20+9.39+3.56+1.23+1.34+0.07+2.57+9.39+6.13=37.88\approx 4.20 + 9.39 + 3.56 + 1.23 + 1.34 + 0.07 + 2.57 + 9.39 + 6.13 = 37.88

Degrees of freedom: ν=(31)(31)=4\nu = (3-1)(3-1) = 4.

Critical value: χ0.05,42=9.488\chi^2_{0.05,\,4} = 9.488.

Since 37.88>9.48837.88 > 9.488, reject H0H_0.

There is strong evidence that age group and preferred news source are not independent.
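The whole test (expected frequencies, statistic, degrees of freedom, p-value) can be reproduced in one call with SciPy's `chi2_contingency` (assuming SciPy is available):

```python
from scipy.stats import chi2_contingency

table = [[20, 60, 10],
         [30, 25, 15],
         [20,  5, 15]]
# Yates' correction is only applied to 2x2 tables (dof = 1),
# so the default correction=True has no effect on this 3x3 table
stat, p, dof, expected = chi2_contingency(table)
```

The returned `expected` array matches the hand-calculated expected frequencies, which is a useful cross-check before interpreting the statistic.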


4. Chi-Squared Test Procedure Summary

  1. State H0H_0 and H1H_1
  2. Calculate expected frequencies
  3. Check that all expected frequencies are 5\geq 5 (merge categories if necessary)
  4. Compute the test statistic χ2\chi^2
  5. Determine the degrees of freedom
  6. Compare with the critical value at the given significance level
  7. Conclude in context
Never use percentages or proportions in the chi-squared test: always use raw frequencies. The test relies on the multinomial distribution, which requires count data.


Problems


Problem 1 A die is rolled 60 times. The observed frequencies are: 1: 8, 2: 12, 3: 9, 4: 11, 5: 13, 6: 7. Test at the 5% level whether the die is fair.


Solution 1 H0H_0: The die is fair. H1H_1: The die is not fair.

Expected frequency for each face: Ei=60/6=10E_i = 60/6 = 10.

χ2=(810)210+(1210)210+(910)210+(1110)210+(1310)210+(710)210\chi^2 = \dfrac{(8-10)^2}{10} + \dfrac{(12-10)^2}{10} + \dfrac{(9-10)^2}{10} + \dfrac{(11-10)^2}{10} + \dfrac{(13-10)^2}{10} + \dfrac{(7-10)^2}{10}

=4+4+1+1+9+910=2810=2.8= \dfrac{4+4+1+1+9+9}{10} = \dfrac{28}{10} = 2.8.

ν=61=5\nu = 6 - 1 = 5. Critical value: χ0.05,52=11.07\chi^2_{0.05,\,5} = 11.07.

2.8<11.072.8 < 11.07: do not reject H0H_0. No evidence the die is biased.

If you get this wrong, revise: Goodness of Fit Test — Section 2.
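The calculation can be checked with SciPy's `chisquare`, which defaults to equal expected frequencies (a sketch, assuming SciPy is available):

```python
from scipy.stats import chisquare

obs = [8, 12, 9, 11, 13, 7]
# With no f_exp given, chisquare assumes equal expected frequencies (10 each)
stat, p = chisquare(obs)
```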


Problem 2 In a 2×22 \times 2 contingency table, the observed frequencies are: Row 1: 30, 20; Row 2: 15, 35. Test at the 5% level whether the two variables are independent.


Solution 2 Row totals: 50, 50. Column totals: 45, 55. Grand total: 100.

Expected: E11=50(45)/100=22.5E_{11} = 50(45)/100 = 22.5, E12=50(55)/100=27.5E_{12} = 50(55)/100 = 27.5, E21=22.5E_{21} = 22.5, E22=27.5E_{22} = 27.5.

χ2=(3022.5)222.5+(2027.5)227.5+(1522.5)222.5+(3527.5)227.5\chi^2 = \dfrac{(30-22.5)^2}{22.5} + \dfrac{(20-27.5)^2}{27.5} + \dfrac{(15-22.5)^2}{22.5} + \dfrac{(35-27.5)^2}{27.5}

=56.2522.5+56.2527.5+56.2522.5+56.2527.5=2.5+2.045+2.5+2.045=9.09= \dfrac{56.25}{22.5} + \dfrac{56.25}{27.5} + \dfrac{56.25}{22.5} + \dfrac{56.25}{27.5} = 2.5 + 2.045 + 2.5 + 2.045 = 9.09.

ν=(21)(21)=1\nu = (2-1)(2-1) = 1. Critical value: χ0.05,12=3.841\chi^2_{0.05,\,1} = 3.841.

9.09>3.8419.09 > 3.841: reject H0H_0. The variables are not independent.

If you get this wrong, revise: Test for Independence — Section 3.
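The same table can be passed to SciPy's `chi2_contingency`; `correction=False` matches the uncorrected hand calculation (a sketch, assuming SciPy is available):

```python
from scipy.stats import chi2_contingency

table = [[30, 20],
         [15, 35]]
# correction=False matches the uncorrected hand calculation; the default
# would apply Yates' correction because this is a 2x2 table
stat, p, dof, expected = chi2_contingency(table, correction=False)
```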


Problem 3 Explain why the chi-squared test statistic uses (OiEi)2/Ei(O_i - E_i)^2 / E_i rather than simply (OiEi)\sum(O_i - E_i).


Solution 3 The sum (OiEi)=0\sum(O_i - E_i) = 0 always, since Oi=Ei=n\sum O_i = \sum E_i = n (both sum to the total number of observations). This provides no information about the discrepancy between observed and expected.

Squaring removes the sign, and dividing by EiE_i standardises the contribution of each category. Categories with larger expected frequencies naturally have larger absolute deviations, so dividing by EiE_i gives each category appropriate weight. This leads to a test statistic whose distribution under H0H_0 is approximately χ2\chi^2.

If you get this wrong, revise: Test statistic — Section 2.2.


Problem 4 The number of emails received per day was recorded over 80 days: 0: 15, 1: 25, 2: 20, 3: 12, 4: 5, 5\geq 5: 3. Test at the 5% level whether the data follows a Poisson distribution.


Solution 4 Estimate λ\lambda: rˉ=0(15)+1(25)+2(20)+3(12)+4(5)+5(3)80=13680=1.7\bar{r} = \dfrac{0(15)+1(25)+2(20)+3(12)+4(5)+5(3)}{80} = \dfrac{136}{80} = 1.7.

Expected (Po(1.7)): P(0)=e1.70.1827E=14.62P(0) = e^{-1.7} \approx 0.1827 \to E = 14.62, P(1)=1.7e1.70.3106E=24.85P(1) = 1.7\,e^{-1.7} \approx 0.3106 \to E = 24.85, P(2)=1.722e1.70.2640E=21.12P(2) = \dfrac{1.7^2}{2}e^{-1.7} \approx 0.2640 \to E = 21.12, P(3)=1.736e1.70.1496E=11.97P(3) = \dfrac{1.7^3}{6}e^{-1.7} \approx 0.1496 \to E = 11.97, P(4)=1.7424e1.70.0636E=5.09P(4) = \dfrac{1.7^4}{24}e^{-1.7} \approx 0.0636 \to E = 5.09, P(5)=10.97050.0295E=2.36P(\geq 5) = 1 - 0.9705 \approx 0.0295 \to E = 2.36.

Merge 4\geq 4: O=8O = 8, E=5.09+2.36=7.45E = 5.09+2.36 = 7.45.

After merging: categories 0, 1, 2, 3, 4\geq 4 with OO: 15, 25, 20, 12, 8 and EE: 14.62, 24.85, 21.12, 11.97, 7.45.

\chi^2 = \dfrac{(15-14.62)^2}{14.62} + \dfrac{(25-24.85)^2}{24.85} + \dfrac{(20-21.12)^2}{21.12} + \dfrac{(12-11.97)^2}{11.97} + \dfrac{(8-7.45)^2}{7.45}

\approx 0.010 + 0.001 + 0.059 + 0.000 + 0.041 = 0.111.

\nu = 5 - 1 - 1 = 3. Critical value: \chi^2_{0.05,\,3} = 7.815.

0.111 < 7.815: do not reject H_0. There is no evidence that the data departs from a Poisson distribution.

If you get this wrong, revise: Worked example: Poisson goodness of fit — Section 2.6.
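Recomputing the expected frequencies directly from the Po(1.7) probabilities avoids rounding drift; a pure-Python sketch:

```python
import math

lam, n = 1.7, 80
obs = [15, 25, 20, 12, 8]                # counts for 0, 1, 2, 3, >= 4
probs = [math.exp(-lam) * lam**r / math.factorial(r) for r in range(4)]
probs.append(1 - sum(probs))             # tail probability P(X >= 4)
exp = [n * p for p in probs]
stat = sum((o - e) ** 2 / e for o, e in zip(obs, exp))
# nu = 5 - 1 - 1 = 3; tabulated 5% critical value is 7.815
```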


Problem 5 A 3×23 \times 2 contingency table has χ2=12.4\chi^2 = 12.4. State the degrees of freedom and determine whether H0H_0 is rejected at the 5% level.


Solution 5 ν=(31)(21)=2\nu = (3-1)(2-1) = 2.

Critical value: χ0.05,22=5.991\chi^2_{0.05,\,2} = 5.991.

Since 12.4>5.99112.4 > 5.991, reject H0H_0.

If you get this wrong, revise: Degrees of freedom — Section 3.5.


Problem 6 A 2×22 \times 2 table has observed frequencies: Row 1: 40, 60; Row 2: 55, 45. Apply Yates' correction and compare with the uncorrected statistic.


Solution 6 Row totals: 100, 100. Column totals: 95, 105. Grand total: 200.

Expected: E_{11} = 100(95)/200 = 47.5, E_{12} = 52.5, E_{21} = 47.5, E_{22} = 52.5.

Uncorrected: \chi^2 = \dfrac{(40-47.5)^2}{47.5} + \dfrac{(60-52.5)^2}{52.5} + \dfrac{(55-47.5)^2}{47.5} + \dfrac{(45-52.5)^2}{52.5} = 1.184 + 1.071 + 1.184 + 1.071 = 4.51.

Yates' corrected: \chi^2_Y = \dfrac{(7.5-0.5)^2}{47.5} + \dfrac{(7.5-0.5)^2}{52.5} + \dfrac{(7.5-0.5)^2}{47.5} + \dfrac{(7.5-0.5)^2}{52.5} = 1.032 + 0.933 + 1.032 + 0.933 = 3.93.

\nu = 1. Critical value: 3.841. Both statistics exceed 3.841, so both reject H_0, but the Yates value is only marginally significant, illustrating the correction's conservatism.

If you get this wrong, revise: Yates' correction — Section 2.5.


Problem 7 A uniform distribution is fitted to data with 5 categories and expected frequency 20 per category. The observed frequencies are 15, 22, 18, 25, 20. Test at the 1% level.


Solution 7 H0H_0: Uniform distribution. H1H_1: Not uniform.

All Ei=205E_i = 20 \geq 5.

χ2=25+4+4+25+020=5820=2.9\chi^2 = \dfrac{25+4+4+25+0}{20} = \dfrac{58}{20} = 2.9.

ν=51=4\nu = 5 - 1 = 4. Critical value at 1%: χ0.01,42=13.28\chi^2_{0.01,\,4} = 13.28.

2.9<13.282.9 < 13.28: do not reject H0H_0.

If you get this wrong, revise: Goodness of Fit Test — Section 2.


Problem 8 In a test for independence with a 4×34 \times 3 contingency table, the test statistic is χ2=18.7\chi^2 = 18.7. Test at the 5% level.


Solution 8 ν=(41)(31)=6\nu = (4-1)(3-1) = 6.

Critical value: χ0.05,62=12.59\chi^2_{0.05,\,6} = 12.59.

18.7>12.5918.7 > 12.59: reject H0H_0. There is evidence that the variables are associated.

If you get this wrong, revise: Degrees of freedom — Section 3.5.


Problem 9 State three conditions that must be satisfied before carrying out a chi-squared test, and explain the consequence of violating each.

Solution 9
  1. Expected frequencies 5\geq 5: Violating this makes the χ2\chi^2 approximation to the true distribution inaccurate, increasing the risk of Type I errors. Remedy: merge adjacent categories.

  2. Independence of observations: Violating this means the test assumes a multinomial model that does not apply, invalidating the result. Remedy: ensure the sampling method produces independent observations.

  3. Sufficiently large sample: With very small total samples, even large relative discrepancies can produce non-significant results. The test has low power. Remedy: increase sample size.

If you get this wrong, revise: Conditions — Section 2.4.


Problem 10 Data is fitted to a normal distribution with mean and standard deviation both estimated from the data. There are 8 categories. The test statistic is χ2=7.2\chi^2 = 7.2. Determine the degrees of freedom and test at the 5% level.


Solution 10 Two parameters estimated (mean and standard deviation), so ν=812=5\nu = 8 - 1 - 2 = 5.

Critical value: χ0.05,52=11.07\chi^2_{0.05,\,5} = 11.07.

7.2<11.077.2 < 11.07: do not reject H0H_0. There is insufficient evidence that the data does not follow a normal distribution.

If you get this wrong, revise: Degrees of freedom — Section 2.3.


5. Yates' Correction: When and Why

5.1 The problem with small 2×22 \times 2 tables

The chi-squared distribution is a continuous approximation to the discrete multinomial distribution. For 2×22 \times 2 tables (1 degree of freedom), this approximation is poor when expected frequencies are small. The uncorrected chi-squared test tends to reject H0H_0 too often (it is too liberal).

Yates' correction adjusts each term by subtracting 0.5 from the absolute difference before squaring:

\chi^2_{\mathrm{Yates}} = \sum_{i=1}^{4}\frac{(|O_i - E_i| - 0.5)^2}{E_i}

This reduces the test statistic, making it harder to reject H0H_0.

5.2 When to apply Yates' correction

  • Apply it to 2×22 \times 2 contingency tables
  • It is most important when the total sample size is small (typically n<40n \lt{} 40) or when any expected frequency is below 10
  • Some exam boards require it for all 2×22 \times 2 tables; check the specific mark scheme
  • Do not apply it to tables larger than 2×22 \times 2

5.3 Limitations

Yates' correction can be overly conservative — it may fail to detect a real association. For very small samples, Fisher's exact test is preferred (but this is beyond the A-Level syllabus).
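For illustration, SciPy implements Fisher's exact test for 2 × 2 tables; on an extreme hypothetical table the exact p-value is tiny (a sketch assuming SciPy is available; the table below is invented for illustration):

```python
from scipy.stats import fisher_exact

# Hypothetical extreme 2x2 table showing perfect association
table = [[10, 0],
         [0, 10]]
odds_ratio, p = fisher_exact(table, alternative='two-sided')
```

Because Fisher's test works directly with the hypergeometric distribution, it needs no expected-frequency condition at all.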


6. Worked Examples

6.1 Goodness of fit: dice fairness

Example. A die is rolled 120 times with the following results:

| Face | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Obs | 25 | 18 | 20 | 22 | 15 | 20 |

Test at the 5% level whether the die is fair.

H0H_0: The die is fair (uniform distribution). H1H_1: The die is not fair.

Expected: Ei=120/6=20E_i = 120/6 = 20 for all faces.

χ2=(2520)2+(1820)2+(2020)2+(2220)2+(1520)2+(2020)220\chi^2 = \frac{(25-20)^2 + (18-20)^2 + (20-20)^2 + (22-20)^2 + (15-20)^2 + (20-20)^2}{20}

=25+4+0+4+25+020=5820=2.9= \frac{25 + 4 + 0 + 4 + 25 + 0}{20} = \frac{58}{20} = 2.9

ν=61=5\nu = 6 - 1 = 5. Critical value: χ0.05,52=11.07\chi^2_{0.05,\,5} = 11.07.

Since 2.9<11.072.9 \lt{} 11.07, do not reject H0H_0. There is insufficient evidence that the die is biased.

6.2 Goodness of fit: genetic ratios

Example. In a genetics experiment, 200 plants are expected to show a 9:3:3:1 phenotypic ratio. The observed counts are 115, 38, 30, 17. Test at the 5% level.

H0H_0: The 9:3:3:1 ratio holds. H1H_1: The ratio does not hold.

Expected: E1=200(9/16)=112.5E_1 = 200(9/16) = 112.5, E2=200(3/16)=37.5E_2 = 200(3/16) = 37.5, E3=37.5E_3 = 37.5, E4=200(1/16)=12.5E_4 = 200(1/16) = 12.5.

All Ei5E_i \geq 5. \checkmark

χ2=(115112.5)2112.5+(3837.5)237.5+(3037.5)237.5+(1712.5)212.5\chi^2 = \frac{(115-112.5)^2}{112.5} + \frac{(38-37.5)^2}{37.5} + \frac{(30-37.5)^2}{37.5} + \frac{(17-12.5)^2}{12.5}

=6.25112.5+0.2537.5+56.2537.5+20.2512.5= \frac{6.25}{112.5} + \frac{0.25}{37.5} + \frac{56.25}{37.5} + \frac{20.25}{12.5}

0.056+0.007+1.500+1.620=3.183\approx 0.056 + 0.007 + 1.500 + 1.620 = 3.183

ν=41=3\nu = 4 - 1 = 3. Critical value: χ0.05,32=7.815\chi^2_{0.05,\,3} = 7.815.

Since 3.183<7.8153.183 \lt{} 7.815, do not reject H0H_0. The data is consistent with the 9:3:3:1 ratio.

6.3 Test for independence: smoking and disease

Example. A study of 300 adults records smoking status and whether they have a respiratory disease:

|  | Disease | No disease | Row total |
|---|---|---|---|
| Smoker | 45 | 55 | 100 |
| Non-smoker | 30 | 170 | 200 |
| Column total | 75 | 225 | 300 |

Test at the 1% level whether smoking status and respiratory disease are independent.

H0H_0: Smoking and disease are independent. H1H_1: They are not independent.

Expected frequencies:

E11=100(75)/300=25E_{11} = 100(75)/300 = 25, E12=100(225)/300=75E_{12} = 100(225)/300 = 75.

E21=200(75)/300=50E_{21} = 200(75)/300 = 50, E22=200(225)/300=150E_{22} = 200(225)/300 = 150.

All Ei5E_i \geq 5. \checkmark

χ2=(4525)225+(5575)275+(3050)250+(170150)2150\chi^2 = \frac{(45-25)^2}{25} + \frac{(55-75)^2}{75} + \frac{(30-50)^2}{50} + \frac{(170-150)^2}{150}

=40025+40075+40050+400150= \frac{400}{25} + \frac{400}{75} + \frac{400}{50} + \frac{400}{150}

=16+5.333+8+2.667=32.0= 16 + 5.333 + 8 + 2.667 = 32.0

ν=(21)(21)=1\nu = (2-1)(2-1) = 1. Critical value at 1%: χ0.01,12=6.635\chi^2_{0.01,\,1} = 6.635.

Since 32.0>6.63532.0 > 6.635, reject H0H_0 at the 1% level. There is very strong evidence that smoking status and respiratory disease are associated.

With Yates' correction:

χY2=(200.5)225+(200.5)275+(200.5)250+(200.5)2150\chi^2_Y = \frac{(20-0.5)^2}{25} + \frac{(20-0.5)^2}{75} + \frac{(20-0.5)^2}{50} + \frac{(20-0.5)^2}{150}

=380.2525+380.2575+380.2550+380.25150= \frac{380.25}{25} + \frac{380.25}{75} + \frac{380.25}{50} + \frac{380.25}{150}

=15.21+5.07+7.605+2.535=30.42= 15.21 + 5.07 + 7.605 + 2.535 = 30.42

Still highly significant (30.42>6.63530.42 > 6.635).
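Both versions of the statistic can be reproduced with SciPy's `chi2_contingency`, which applies Yates' correction by default for 2 × 2 tables (a sketch, assuming SciPy is available):

```python
from scipy.stats import chi2_contingency

table = [[45, 55],
         [30, 170]]
corrected, p_c = chi2_contingency(table)[:2]                      # Yates applied
uncorrected, p_u = chi2_contingency(table, correction=False)[:2]  # raw statistic
```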


7. Degrees of Freedom: Systematic Calculation

7.1 Goodness of fit

\nu = (\text{number of categories after merging}) - 1 - (\text{parameters estimated})

| Distribution fitted | Parameters estimated | \nu formula |
|---|---|---|
| Uniform (known) | 0 | n - 1 |
| Binomial (known n, known p) | 0 | n - 1 |
| Binomial (known n, estimate p) | 1 | n - 2 |
| Poisson (estimate \lambda) | 1 | n - 2 |
| Normal (estimate \mu, \sigma) | 2 | n - 3 |

7.2 Test for independence

ν=(r1)(c1)\nu = (r - 1)(c - 1)

| Table size | \nu |
|---|---|
| 2 × 2 | 1 |
| 2 × 3 | 2 |
| 3 × 3 | 4 |
| 3 × 4 | 6 |
| 4 × 5 | 12 |

7.3 Intuition for degrees of freedom

The degrees of freedom represent the number of independent pieces of information in the data, after accounting for constraints. In a contingency table:

  • Each row total is fixed, so each row has one fewer free value
  • Each column total is fixed, so each column has one fewer free value
  • The grand total is automatically determined

This gives (r1)(c1)(r-1)(c-1) free cells.


8. Interpretation: What "Significant" Means

8.1 In plain language

When we reject H0H_0 at the 5% level, we are saying:

"If the null hypothesis were true, there would be less than a 5% chance of obtaining a test statistic at least as extreme as the one observed."

This is not the same as saying H0H_0 is false with 95% probability. It is a statement about the probability of the data given the hypothesis, not the probability of the hypothesis given the data.

8.2 Common misinterpretations

| Statement | Correct? | Why |
|---|---|---|
| "There is a 5% chance the null hypothesis is true" | No | Confuses P(\text{data} \mid H_0) with P(H_0 \mid \text{data}) |
| "The probability of getting this result by chance is 5%" | Approximately | More precisely: the probability of a result at least this extreme by chance is 5% |
| "We have proved the alternative hypothesis" | No | We have only found evidence against H_0; the alternative could still be wrong |
| "A significant result means the effect is practically important" | Not necessarily | With a very large sample, even tiny deviations become significant |

8.3 Context matters

A significant chi-squared test tells you the observed data is unlikely under H0H_0, but it does not tell you how the data differs or whether the difference is meaningful. Always inspect the observed vs expected frequencies to understand the nature of any discrepancy.


9. Relationship to the Normal Approximation

9.1 Chi-squared as a sum of squared normals

The fundamental connection: if Xχ12X \sim \chi^2_1 (1 degree of freedom), then X=Z2X = Z^2 where ZN(0,1)Z \sim N(0,1).

This means \sqrt{\chi^2_1} \sim |Z|, i.e., the square root of a chi-squared statistic with 1 degree of freedom follows a half-normal distribution.

9.2 2×22 \times 2 tables and the normal approximation

For a 2×22 \times 2 table, the chi-squared test is equivalent to a two-proportion zz-test. If p1p_1 and p2p_2 are the sample proportions:

\chi^2 = z^2 \quad \text{where} \quad z = \frac{p_1 - p_2}{\sqrt{\hat{p}(1-\hat{p})(1/n_1 + 1/n_2)}}

and p^\hat{p} is the pooled proportion.
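The equivalence can be checked numerically on the table from Problem 2 (a pure-Python sketch):

```python
import math

# 2x2 table from Problem 2: Row 1: 30, 20; Row 2: 15, 35
x1, n1 = 30, 50                       # "successes" and total in row 1
x2, n2 = 15, 50                       # "successes" and total in row 2
p1, p2 = x1 / n1, x2 / n2
p_hat = (x1 + x2) / (n1 + n2)         # pooled proportion
z = (p1 - p2) / math.sqrt(p_hat * (1 - p_hat) * (1 / n1 + 1 / n2))
```

Squaring gives z^2 = 9.09, matching the chi-squared statistic found in Problem 2.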

9.3 Large degrees of freedom

As ν\nu increases, χν2\chi^2_\nu approaches N(ν,2ν)N(\nu, 2\nu). This means for large tables:

z = \frac{\chi^2 - \nu}{\sqrt{2\nu}} \sim N(0,1) \quad \text{approximately}

This approximation is useful when chi-squared tables do not list the required ν\nu value.
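The quality of the approximation can be inspected with SciPy, comparing the exact quantile against the normal-based one (a sketch, assuming SciPy is available; nu = 100 is an arbitrary illustrative choice):

```python
from scipy.stats import chi2, norm

nu = 100
exact = chi2.ppf(0.95, df=nu)
# Normal approximation: chi2_nu ~ N(nu, 2*nu) for large nu
approx = nu + norm.ppf(0.95) * (2 * nu) ** 0.5
```

For this nu the two critical values agree to within a couple of percent; the residual gap comes from the remaining skewness of the chi-squared distribution.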


10. Common Pitfalls

Using percentages instead of frequencies

The chi-squared test requires raw count data. If you are given percentages, you must convert back to frequencies using the sample size. Using percentages directly produces a test statistic that is off by a factor of n/100n/100 and gives completely wrong pp-values.
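The scale error is easy to demonstrate with invented numbers (the n = 200 data below is hypothetical):

```python
n = 200
obs_counts, exp_counts = [80, 120], [100, 100]
obs_pct, exp_pct = [40, 60], [50, 50]          # same data as percentages

stat_counts = sum((o - e) ** 2 / e for o, e in zip(obs_counts, exp_counts))
stat_pct = sum((o - e) ** 2 / e for o, e in zip(obs_pct, exp_pct))
# The percentage version shrinks the statistic by a factor of n/100
```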

Wrong degrees of freedom

For goodness of fit, forgetting to subtract the number of estimated parameters is the most common error. For independence tests, using r×cr \times c instead of (r1)(c1)(r-1)(c-1) will overestimate the degrees of freedom and make the test too liberal.

Small expected values

If any expected frequency is below 5, the chi-squared approximation breaks down. The remedy is to merge adjacent categories before computing the test statistic. Do not simply discard categories — this loses information and biases the result.

Not checking all expected frequencies

After merging categories to fix one small expected value, you must recheck all remaining expected values. The merge may create new expected values below 5.

Merging non-adjacent categories

When merging categories for a goodness of fit, merge categories that are logically adjacent (e.g., "4" and "5\geq 5" in a Poisson fit). Merging non-adjacent categories (e.g., "0" and "5") destroys the structure of the distribution and makes the test invalid.

Confusing one-tailed and two-tailed

The chi-squared test is inherently one-tailed (right-tailed only). Large values of χ2\chi^2 indicate discrepancy from H0H_0. Small values (close to 0) indicate good fit and are not significant. There is no such thing as a "left-tailed" chi-squared test.


11. Problem Set

Q1. A die is rolled 240 times. The observed frequencies are 1: 52, 2: 38, 3: 40, 4: 36, 5: 44, 6: 30. Test at the 5% level whether the die is fair, and identify which face(s) contribute most to the test statistic.

H0H_0: Fair die. H1H_1: Not fair.

Ei=240/6=40E_i = 240/6 = 40 for all faces.

χ2=(5240)2+(3840)2+(4040)2+(3640)2+(4440)2+(3040)240\chi^2 = \dfrac{(52-40)^2 + (38-40)^2 + (40-40)^2 + (36-40)^2 + (44-40)^2 + (30-40)^2}{40}

=144+4+0+16+16+10040=28040=7.0= \dfrac{144 + 4 + 0 + 16 + 16 + 100}{40} = \dfrac{280}{40} = 7.0.

ν=5\nu = 5. Critical value: χ0.05,52=11.07\chi^2_{0.05,\,5} = 11.07.

7.0<11.077.0 \lt{} 11.07: do not reject H0H_0.

Contributions: face 1 contributes 144/40=3.6144/40 = 3.6, face 6 contributes 100/40=2.5100/40 = 2.5. These two faces account for 6.16.1 out of 7.07.0 (87% of the statistic).

Q2. The number of customers arriving at a shop per hour was recorded over 120 hours: 0: 12, 1: 30, 2: 35, 3: 25, 4: 12, 5\geq 5: 6. Test at the 5% level whether the data follows a Poisson distribution.

H0H_0: Poisson. H1H_1: Not Poisson.

xˉ=0(12)+1(30)+2(35)+3(25)+4(12)+5(6)120=2381201.983\bar{x} = \dfrac{0(12) + 1(30) + 2(35) + 3(25) + 4(12) + 5(6)}{120} = \dfrac{238}{120} \approx 1.983.

Expected using Po(1.983): P(0)=e1.9830.1379E=16.55P(0) = e^{-1.983} \approx 0.1379 \to E = 16.55 P(1)=1.983×0.13790.2734E=32.81P(1) = 1.983 \times 0.1379 \approx 0.2734 \to E = 32.81 P(2)=1.9832/2×0.13790.2711E=32.53P(2) = 1.983^2/2 \times 0.1379 \approx 0.2711 \to E = 32.53 P(3)=1.9833/6×0.13790.1792E=21.50P(3) = 1.983^3/6 \times 0.1379 \approx 0.1792 \to E = 21.50 P(4)=1.9834/24×0.13790.0888E=10.66P(4) = 1.983^4/24 \times 0.1379 \approx 0.0888 \to E = 10.66 P(5)=10.94840.0516E=6.19P(\geq 5) = 1 - 0.9484 \approx 0.0516 \to E = 6.19

Merge 4\geq 4: O=18O = 18, E=10.66+6.19=16.85E = 10.66 + 6.19 = 16.85. All E5E \geq 5.

χ2=(1216.55)216.55+(3032.81)232.81+(3532.53)232.53+(2521.50)221.50+(1816.85)216.85\chi^2 = \dfrac{(12-16.55)^2}{16.55} + \dfrac{(30-32.81)^2}{32.81} + \dfrac{(35-32.53)^2}{32.53} + \dfrac{(25-21.50)^2}{21.50} + \dfrac{(18-16.85)^2}{16.85}

1.251+0.241+0.188+0.570+0.079=2.329\approx 1.251 + 0.241 + 0.188 + 0.570 + 0.079 = 2.329

ν=511=3\nu = 5 - 1 - 1 = 3. Critical value: χ0.05,32=7.815\chi^2_{0.05,\,3} = 7.815.

2.329<7.8152.329 \lt{} 7.815: do not reject H0H_0.

Q3. A survey of 400 adults records education level and voting preference. Test at the 5% level whether the two variables are independent.
|  | Party A | Party B | Party C | Non-voter | Total |
|---|---|---|---|---|---|
| No qualifications | 20 | 25 | 10 | 45 | 100 |
| A-levels | 30 | 40 | 25 | 30 | 125 |
| Degree | 45 | 30 | 40 | 10 | 125 |
| Postgraduate | 20 | 15 | 15 | 0 | 50 |
| Total | 115 | 110 | 90 | 85 | 400 |

H0H_0: Independent. H1H_1: Not independent.

Expected: E_{ij} = (\text{row } i \text{ total}) \times (\text{column } j \text{ total})/400.

E44=50×85/400=10.625E_{44} = 50 \times 85/400 = 10.625. All E5E \geq 5. \checkmark

E11=100(115)/400=28.75E_{11} = 100(115)/400 = 28.75, E12=27.5E_{12} = 27.5, E13=22.5E_{13} = 22.5, E14=21.25E_{14} = 21.25. E21=125(115)/400=35.9375E_{21} = 125(115)/400 = 35.9375, E22=34.375E_{22} = 34.375, E23=28.125E_{23} = 28.125, E24=26.5625E_{24} = 26.5625. E31=125(115)/400=35.9375E_{31} = 125(115)/400 = 35.9375, E32=34.375E_{32} = 34.375, E33=28.125E_{33} = 28.125, E34=26.5625E_{34} = 26.5625. E41=50(115)/400=14.375E_{41} = 50(115)/400 = 14.375, E42=13.75E_{42} = 13.75, E43=11.25E_{43} = 11.25, E44=10.625E_{44} = 10.625.

\chi^2 = \sum_{i=1}^{4}\sum_{j=1}^{4}\frac{(O_{ij} - E_{ij})^2}{E_{ij}}

Key contributions: \dfrac{(45-21.25)^2}{21.25} \approx 26.54, \dfrac{(10-22.5)^2}{22.5} \approx 6.94, \dfrac{(10-26.5625)^2}{26.5625} \approx 10.33, \dfrac{(0-10.625)^2}{10.625} = 10.625, \dfrac{(40-28.125)^2}{28.125} \approx 5.01.

Summing all 16 terms: \chi^2 \approx 71.4.

\nu = (4-1)(4-1) = 9. Critical value: \chi^2_{0.05,\,9} = 16.92.

71.4 > 16.92: reject H_0. Strong evidence that education level and voting preference are associated.
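With sixteen cells, the full calculation is tedious by hand; SciPy's `chi2_contingency` performs it in one call (a sketch, assuming SciPy is available):

```python
from scipy.stats import chi2, chi2_contingency

table = [[20, 25, 10, 45],
         [30, 40, 25, 30],
         [45, 30, 40, 10],
         [20, 15, 15, 0]]
stat, p, dof, expected = chi2_contingency(table)
# The smallest expected count is E44 = 50 * 85 / 400 = 10.625, so all E >= 5
```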

Q4. Explain why merging categories changes the degrees of freedom, and calculate the new df when a 6-category Poisson goodness of fit test (with λ\lambda estimated) has its last 3 categories merged into one.

Originally: ν=611=4\nu = 6 - 1 - 1 = 4 (6 categories, 1 parameter estimated).

After merging the last 3 into 1: we now have 4 categories total (first 3 individual + 1 merged).

New ν=411=2\nu = 4 - 1 - 1 = 2.

Merging reduces the number of categories, which reduces the degrees of freedom. A smaller \nu means a smaller critical value, but merging also pools discrepancies across the combined cells and so tends to reduce the test statistic. The merge is nevertheless required for validity whenever expected frequencies fall below 5, at the cost of some detail in the comparison.

Q5. A 2×22 \times 2 table has observed frequencies: Row 1: 10, 40; Row 2: 30, 20. Apply Yates' correction and test at the 5% level. Compare with the uncorrected test.

Row totals: 50, 50. Column totals: 40, 60. Grand total: 100.

Expected: E11=50(40)/100=20E_{11} = 50(40)/100 = 20, E12=30E_{12} = 30, E21=20E_{21} = 20, E22=30E_{22} = 30.

Uncorrected:

χ2=(1020)220+(4030)230+(3020)220+(2030)230=5+3.333+5+3.333=16.67\chi^2 = \dfrac{(10-20)^2}{20} + \dfrac{(40-30)^2}{30} + \dfrac{(30-20)^2}{20} + \dfrac{(20-30)^2}{30} = 5 + 3.333 + 5 + 3.333 = 16.67.

Yates' corrected:

χY2=(100.5)220+(100.5)230+(100.5)220+(100.5)230=90.2520+90.2530+90.2520+90.2530\chi^2_Y = \dfrac{(10-0.5)^2}{20} + \dfrac{(10-0.5)^2}{30} + \dfrac{(10-0.5)^2}{20} + \dfrac{(10-0.5)^2}{30} = \dfrac{90.25}{20} + \dfrac{90.25}{30} + \dfrac{90.25}{20} + \dfrac{90.25}{30}

=4.5125+3.0083+4.5125+3.0083=15.04= 4.5125 + 3.0083 + 4.5125 + 3.0083 = 15.04.

ν=1\nu = 1. Critical value: χ0.05,12=3.841\chi^2_{0.05,\,1} = 3.841.

Both reject H0H_0. The corrected value (15.04) is smaller than the uncorrected (16.67), as expected.
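The comparison in Q5 can be checked with a short script. This is a sketch in Python; the helper names (`expected`, `chi2_stat`) are illustrative, not from any syllabus.

```python
# Sketch: uncorrected vs Yates-corrected chi-squared for the 2x2 table of Q5.

def expected(obs):
    """Expected frequencies from the row/column margins."""
    rows = [sum(r) for r in obs]
    cols = [sum(c) for c in zip(*obs)]
    n = sum(rows)
    return [[r * c / n for c in cols] for r in rows]

def chi2_stat(obs, yates=False):
    """Pearson statistic, optionally with the continuity correction."""
    total = 0.0
    for o_row, e_row in zip(obs, expected(obs)):
        for o, e in zip(o_row, e_row):
            d = abs(o - e)
            if yates:
                d = max(d - 0.5, 0.0)
            total += d * d / e
    return total

obs = [[10, 40], [30, 20]]
print(round(chi2_stat(obs), 2))              # 16.67 uncorrected
print(round(chi2_stat(obs, yates=True), 2))  # 15.04 with Yates' correction
```

Both values exceed 3.841, so the conclusion is the same either way, as noted above.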

Q6. A normal distribution is fitted to 200 observations with mean and standard deviation estimated from the data. The expected and observed frequencies for 7 categories are calculated, with all Ei5E_i \geq 5. The test statistic is χ2=8.5\chi^2 = 8.5. Test at the 5% level and explain your choice of degrees of freedom.

Two parameters estimated (μ\mu and σ\sigma), so ν=712=4\nu = 7 - 1 - 2 = 4.

H0H_0: Data follows a normal distribution. H1H_1: Data does not follow a normal distribution.

Critical value: χ0.05,42=9.488\chi^2_{0.05,\,4} = 9.488.

8.5 < 9.488: do not reject H_0. Insufficient evidence to conclude the data is non-normal.

The degrees of freedom calculation accounts for the fact that estimating parameters from the data makes the fit appear better than it truly is. Each estimated parameter reduces the df by 1 because it uses up one piece of information from the data.


8. Advanced Worked Examples

Example 8.1: Goodness-of-fit test with merging of classes

Problem. A die is rolled 120 times. The observed frequencies are:

| Face | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| Observed | 25 | 17 | 15 | 23 | 18 | 22 |

Test at the 5% level whether the die is fair.

Solution. H0H_0: die is fair (pi=16p_i = \frac{1}{6}). Expected: Ei=20E_i = 20 for each face.

All Ei=205E_i = 20 \geq 5, so no merging needed.

χ2=(OiEi)2Ei=25+9+25+9+4+420=7620=3.8\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i} = \frac{25 + 9 + 25 + 9 + 4 + 4}{20} = \frac{76}{20} = 3.8

ν=61=5\nu = 6 - 1 = 5. Critical value at 5%: 11.0711.07.

3.8<11.073.8 < 11.07: do not reject H0H_0. There is insufficient evidence to conclude the die is biased.
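As a quick check, the statistic in Example 8.1 can be recomputed in a few lines of Python (a sketch; variable names are illustrative):

```python
# Sketch: the goodness-of-fit calculation of Example 8.1.
observed = [25, 17, 15, 23, 18, 22]
expected = [sum(observed) / len(observed)] * len(observed)  # 20 per face under H0

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
dof = len(observed) - 1  # no parameters estimated from the data

print(round(chi2, 2), dof)  # 3.8 on 5 df; compare with 11.07 from tables
```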

Example 8.2: Test for independence in a 3×3 contingency table

Problem. 300 people are classified by hair colour and eye colour:

| | Blue | Brown | Green |
| --- | --- | --- | --- |
| Blonde | 40 | 20 | 10 |
| Brown | 30 | 60 | 20 |
| Black | 10 | 40 | 70 |

Test at the 1% level whether hair colour and eye colour are independent.

Solution. H0H_0: hair colour and eye colour are independent.

Row totals: Blonde 70, Brown 110, Black 120. Column totals: Blue 80, Brown 120, Green 100. Grand total: 300.

Expected values: E_{ij} = \dfrac{R_i \times C_j}{300}, where R_i and C_j are the row and column totals.

E_{11} = \dfrac{70 \times 80}{300} = 18.67, E_{12} = 28, E_{13} = 23.33, E_{21} = 29.33, E_{22} = 44, E_{23} = 36.67, E_{31} = 32, E_{32} = 48, E_{33} = 40.

χ2=(4018.67)218.67+(2028)228+(1023.33)223.33+(3029.33)229.33+(6044)244+(2036.67)236.67+(1032)232+(4048)248+(7040)240\chi^2 = \frac{(40-18.67)^2}{18.67} + \frac{(20-28)^2}{28} + \frac{(10-23.33)^2}{23.33} + \frac{(30-29.33)^2}{29.33} + \frac{(60-44)^2}{44} + \frac{(20-36.67)^2}{36.67} + \frac{(10-32)^2}{32} + \frac{(40-48)^2}{48} + \frac{(70-40)^2}{40}

24.35+2.29+7.61+0.02+5.82+7.58+15.13+1.33+22.50=86.63\approx 24.35 + 2.29 + 7.61 + 0.02 + 5.82 + 7.58 + 15.13 + 1.33 + 22.50 = 86.63.

ν=(31)(31)=4\nu = (3-1)(3-1) = 4. Critical value at 1%: 13.2813.28.

86.63>13.2886.63 > 13.28: reject H0H_0. Very strong evidence that hair colour and eye colour are associated.
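The same calculation can be done without intermediate rounding; a Python sketch (illustrative names):

```python
# Sketch: expected frequencies and test statistic for the hair/eye table
# of Example 8.2, computed at full precision.
obs = [[40, 20, 10], [30, 60, 20], [10, 40, 70]]

row_totals = [sum(r) for r in obs]
col_totals = [sum(c) for c in zip(*obs)]
n = sum(row_totals)

expected = [[r * c / n for c in col_totals] for r in row_totals]
chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(obs, expected)
           for o, e in zip(o_row, e_row))
dof = (len(obs) - 1) * (len(obs[0]) - 1)

print(round(chi2, 2), dof)  # 86.65 on 4 df (rounding each term by hand gives 86.63)
```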

Example 8.3: Yates' continuity correction for a 2×2 table

Problem. In a drug trial, 40 out of 100 patients on drug A recovered, and 55 out of 100 on drug B recovered. Test at 5% whether the recovery rates differ, using Yates' correction.

Solution. H0H_0: recovery rate is the same for both drugs.

| | Recovered | Not recovered |
| --- | --- | --- |
| Drug A | 40 | 60 |
| Drug B | 55 | 45 |

With Yates' correction:

\chi^2 = \sum \dfrac{(|O_i - E_i| - 0.5)^2}{E_i}

E_{11} = E_{21} = 47.5, E_{12} = E_{22} = 52.5 (row totals are 100 each; column totals are 95 and 105).

In a 2 \times 2 table every cell has the same |O - E|; here |O_{ij} - E_{ij}| = 7.5, so each corrected numerator is (7.5 - 0.5)^2 = 49.

\chi^2_Y = \dfrac{(|40-47.5|-0.5)^2}{47.5} + \dfrac{(|60-52.5|-0.5)^2}{52.5} + \dfrac{(|55-47.5|-0.5)^2}{47.5} + \dfrac{(|45-52.5|-0.5)^2}{52.5}

= \dfrac{2 \times 49}{47.5} + \dfrac{2 \times 49}{52.5} \approx 2.063 + 1.867 = 3.93.

\nu = 1. Critical value at 5%: 3.84.

3.93 > 3.84: reject H_0, though only narrowly. There is some evidence of a difference in recovery rates at the 5% level.
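For a 2×2 table the corrected statistic can equivalently be computed with the standard shortcut formula \chi^2_Y = \dfrac{n(|ad - bc| - n/2)^2}{(a+b)(c+d)(a+c)(b+d)}; a Python sketch:

```python
# Sketch: Yates-corrected statistic for the drug-trial table via the 2x2
# shortcut formula, equivalent to the cell-by-cell corrected sum.
a, b = 40, 60   # drug A: recovered, not recovered
c, d = 55, 45   # drug B: recovered, not recovered
n = a + b + c + d

num = n * (abs(a * d - b * c) - n / 2) ** 2
den = (a + b) * (c + d) * (a + c) * (b + d)
chi2_yates = num / den

print(round(chi2_yates, 2))  # 3.93; compare with 3.84 at the 5% level
```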

Example 8.4: Determining degrees of freedom with estimated parameters

Problem. Data is tested against a normal distribution with mean and variance estimated from the data. The data is grouped into 8 classes. One class has expected frequency 3 and is merged with an adjacent class. State the degrees of freedom.

Solution. Original classes: 8. After merging: 7 classes.

Restrictions: total frequency (1), estimated mean (1), estimated variance (1). Total restrictions: 3.

ν=73=4\nu = 7 - 3 = \boxed{4}

Example 8.5: Chi-squared test for a geometric distribution

Problem. Customers arrive at a till. The number of customers served before the first complaint is recorded over 200 shifts:

| Count | 0 | 1 | 2 | 3 | ≥ 4 |
| --- | --- | --- | --- | --- | --- |
| Observed | 90 | 60 | 30 | 12 | 8 |

Test at 5% whether the data follows a geometric distribution.

Solution. H0H_0: data follows Geo(p)\mathrm{Geo}(p).

\bar{x} = \dfrac{0 \times 90 + 1 \times 60 + 2 \times 30 + 3 \times 12 + 4 \times 8}{200} = \dfrac{188}{200} = 0.94, treating the \geq 4 class as exactly 4 (this slightly understates the true mean).

For \mathrm{Geo}(p): E(X) = \dfrac{1-p}{p} = 0.94 \implies p = \dfrac{1}{1.94} \approx 0.515.

Expected: P(X=k) = p(1-p)^k \approx 0.515 \times 0.485^k.

E_0 = 200 \times 0.515 = 103.1, E_1 = 50.0, E_2 = 24.2, E_3 = 11.7, E_{\geq 4} = 200 - 103.1 - 50.0 - 24.2 - 11.7 = 11.0.

All E_i \geq 5, so no merging needed.

\chi^2 = \dfrac{(90-103.1)^2}{103.1} + \dfrac{(60-50.0)^2}{50.0} + \dfrac{(30-24.2)^2}{24.2} + \dfrac{(12-11.7)^2}{11.7} + \dfrac{(8-11.0)^2}{11.0}

\approx 1.66 + 2.00 + 1.39 + 0.01 + 0.82 = 5.88.

\nu = 5 - 1 - 1 = 3 (5 classes, 1 for the total, 1 for the estimated p). Critical value at 5%: 7.82.

5.88 < 7.82: do not reject H_0.
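Working the same fit at full precision, a Python sketch (the ≥ 4 class is taken as exactly 4 when estimating the mean, which slightly understates it):

```python
# Sketch: fitting Geo(p) (number of failures before the first success) to the
# geometric data above, without intermediate rounding.
counts = [0, 1, 2, 3, 4]
observed = [90, 60, 30, 12, 8]
n = sum(observed)

xbar = sum(k * o for k, o in zip(counts, observed)) / n  # 188/200 = 0.94
p = 1 / (1 + xbar)                                       # moments: E(X) = (1-p)/p

# P(X = k) = p(1-p)^k; the remaining probability goes to the >= 4 class
exp_freq = [n * p * (1 - p) ** k for k in range(4)]
exp_freq.append(n - sum(exp_freq))

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, exp_freq))
dof = len(observed) - 1 - 1  # 1 for the total, 1 for the estimated p

print(round(chi2, 2), dof)  # about 5.9 on 3 df; compare with 7.82 from tables
```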

Example 8.6: Interpreting a very small expected frequency

Problem. A 2×5 contingency table has several expected frequencies below 5. What action should be taken?

Solution. Adjacent rows or columns should be merged to ensure all expected frequencies are at least 5. The degrees of freedom must be recalculated based on the new table dimensions. If merging destroys the structure of the test (e.g., merging distinct categories that have different meanings), Fisher's exact test should be used instead.

Example 8.7: Calculating the chi-squared statistic from raw proportions

Problem. In a survey of 500 people, 60% prefer tea in the North and 45% prefer tea in the South. There are 300 Northerners and 200 Southerners. Test at 5% whether preference differs by region.

Solution. H0H_0: no association between region and preference.

Observed table:

| | Tea | No Tea |
| --- | --- | --- |
| North | 180 | 120 |
| South | 90 | 110 |

Expected: E_{11} = \dfrac{270 \times 300}{500} = 162, E_{12} = 138, E_{21} = 108, E_{22} = 92.

χ2=324162+324138+324108+324922.00+2.35+3.00+3.52=10.87\chi^2 = \frac{324}{162} + \frac{324}{138} + \frac{324}{108} + \frac{324}{92} \approx 2.00 + 2.35 + 3.00 + 3.52 = 10.87

ν=1\nu = 1. Critical value at 5%: 3.843.84.

10.87>3.8410.87 > 3.84: reject H0H_0. Significant association between region and tea preference.
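The conversion from proportions back to frequencies can be scripted as a check; a Python sketch (illustrative names):

```python
# Sketch: rebuilding the observed table of Example 8.7 from the stated
# proportions, then computing the statistic from the margins.
north, south = 300, 200
tea_north = round(0.60 * north)  # 180
tea_south = round(0.45 * south)  # 90

obs = [[tea_north, north - tea_north], [tea_south, south - tea_south]]
rows = [sum(r) for r in obs]
cols = [sum(c) for c in zip(*obs)]
n = sum(rows)

expected = [[r * c / n for c in cols] for r in rows]
chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(obs, expected)
           for o, e in zip(o_row, e_row))

print(round(chi2, 2))  # 10.87 on 1 df; compare with 3.84
```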


9. Common Pitfalls

| Pitfall | Correct Approach |
| --- | --- |
| Using observed frequencies instead of expected in the denominator | \chi^2 = \sum \dfrac{(O-E)^2}{E}, not \sum \dfrac{(O-E)^2}{O} |
| Forgetting to merge classes with E < 5 | Always check expected frequencies first; merge adjacent classes |
| Miscounting degrees of freedom | \nu = (r-1)(c-1) for independence; \nu = k - 1 - m for goodness-of-fit with m estimated parameters |
| Applying Yates' correction to tables larger than 2×2 | Yates' correction is only for 2×2 contingency tables |

10. Additional Exam-Style Questions

Question 8

A tetrahedral die is rolled 200 times. The observed frequencies for faces 1--4 are 38, 62, 55, 45. Test at the 10% level whether the die is fair.

Solution

H0H_0: fair (p=0.25p = 0.25). Ei=50E_i = 50.

χ2=144+144+25+2550=33850=6.76\chi^2 = \dfrac{144 + 144 + 25 + 25}{50} = \dfrac{338}{50} = 6.76.

ν=3\nu = 3. Critical value at 10%: 6.256.25.

6.76>6.256.76 > 6.25: reject H0H_0. The die appears biased at the 10% level.

Question 9

Explain why the chi-squared test is an approximate test and describe when it may not be appropriate.

Solution

The chi-squared distribution is a continuous approximation to the discrete distribution of the test statistic. The approximation is poor when:

  1. Expected frequencies are small (typically E<5E < 5), as the continuous approximation breaks down.
  2. The total sample size is very small.
  3. The number of classes is very large relative to the sample size.

In these cases, Fisher's exact test or exact multinomial methods should be used instead.

Question 10

In a test of independence on a 4×3 contingency table, the calculated χ2\chi^2 statistic is 18.7. Determine whether to reject H0H_0 at the 5% significance level.

Solution

ν=(41)(31)=6\nu = (4-1)(3-1) = 6. Critical value at 5%: 12.5912.59.

18.7>12.5918.7 > 12.59: reject H0H_0. There is significant evidence of an association between the row and column variables.


11. Connections to Other Topics

11.1 Chi-squared tests and Poisson/geometric distributions

Goodness-of-fit tests are commonly used to test whether data follows a Poisson or geometric distribution. See Poisson and Geometric Distributions.

11.2 Chi-squared and continuous distributions

The chi-squared distribution itself is used in confidence intervals for variance. See Exponential and Continuous Random Variables.

11.3 Chi-squared and probability

Hypothesis testing relies on understanding significance levels, pp-values, and Type I/II errors.


12. Key Results Summary

| Test Type | Degrees of Freedom | Conditions |
| --- | --- | --- |
| Goodness-of-fit | \nu = k - 1 - m | All E_i \geq 5; m = number of estimated parameters |
| Independence | \nu = (r-1)(c-1) | All E_i \geq 5 |
| Yates' correction | \nu = 1 | Only for 2×2 tables |
General procedure:

| Step | Action |
| --- | --- |
| 1 | State H_0 and H_1 |
| 2 | Calculate expected frequencies |
| 3 | Merge classes if any E_i < 5 |
| 4 | Compute \chi^2 = \sum \dfrac{(O-E)^2}{E} |
| 5 | Determine degrees of freedom |
| 6 | Compare with critical value or find p-value |
| 7 | Conclude in context |
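The computational steps can be sketched as a single helper function (illustrative Python; the merging rule here simply folds a small class into a neighbouring one, which is one of several reasonable conventions):

```python
# Sketch: goodness-of-fit statistic with automatic merging of small classes.

def gof_chi2(observed, expected, min_expected=5.0):
    obs, exp = list(observed), list(expected)
    i = 0
    while i < len(exp):                      # step 3: merge small classes
        if exp[i] < min_expected and len(exp) > 1:
            j = i + 1 if i + 1 < len(exp) else i - 1
            exp[j] += exp[i]
            obs[j] += obs[i]
            del exp[i]
            del obs[i]
        else:
            i += 1
    chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, exp))  # step 4
    return chi2, len(obs)                    # statistic and final class count

# Fair-die data from Example 8.1: nothing to merge, 3.8 on 6 - 1 = 5 df
stat, k = gof_chi2([25, 17, 15, 23, 18, 22], [20.0] * 6)
print(round(stat, 2), k - 1)
```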

13. Further Exam-Style Questions

Question 11

A teacher believes that grades in a class follow a specific distribution: 10% A, 30% B, 40% C, 20% D. In a sample of 200 students, the observed frequencies are: A: 15, B: 70, C: 80, D: 35. Test at the 5% level.

Solution

H0H_0: grades follow the specified distribution.

Expected: A: 20, B: 60, C: 80, D: 40. All 5\geq 5.

χ2=2520+10060+0+2540=1.25+1.667+0+0.625=3.542\chi^2 = \dfrac{25}{20} + \dfrac{100}{60} + 0 + \dfrac{25}{40} = 1.25 + 1.667 + 0 + 0.625 = 3.542.

ν=41=3\nu = 4 - 1 = 3. Critical value at 5%: 7.827.82.

3.542<7.823.542 < 7.82: do not reject H0H_0. The data is consistent with the teacher's belief.

Question 12

Explain the difference between a Type I error and a Type II error in the context of a chi-squared test.

Solution

Type I error: Rejecting H0H_0 when H0H_0 is true (false positive). The probability is the significance level α\alpha.

Type II error: Failing to reject H0H_0 when H0H_0 is false (false negative). The probability depends on the true distribution and sample size; it is denoted β\beta, and 1β1-\beta is the power of the test.


14. Advanced Topics

14.1 The chi-squared distribution

The chi-squared distribution with ν\nu degrees of freedom is the distribution of i=1νZi2\sum_{i=1}^{\nu} Z_i^2 where ZiN(0,1)Z_i \sim N(0,1) are independent.

Key properties:

  • Mean: ν\nu
  • Variance: 2ν2\nu
  • Additivity: if Xχν12X \sim \chi^2_{\nu_1} and Yχν22Y \sim \chi^2_{\nu_2} are independent, then X+Yχν1+ν22X+Y \sim \chi^2_{\nu_1+\nu_2}

14.2 Chi-squared confidence intervals for variance

For a sample of size n from N(\mu, \sigma^2), the quantity \dfrac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}.

A 95%95\% confidence interval for σ2\sigma^2 is:

\left[\dfrac{(n-1)s^2}{\chi^2_{n-1,\,0.025}},\; \dfrac{(n-1)s^2}{\chi^2_{n-1,\,0.975}}\right]
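As a numerical illustration with hypothetical values n = 10 and s^2 = 4 (table values \chi^2_{9,\,0.025} = 19.02 and \chi^2_{9,\,0.975} = 2.70):

```latex
\left[\frac{9 \times 4}{19.02},\; \frac{9 \times 4}{2.70}\right]
\approx [1.89,\; 13.3]
```

Note how asymmetric the interval is about s^2 = 4, reflecting the skewness of the chi-squared distribution.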

14.3 Relationship to other tests

The chi-squared test is related to:

  • The GG-test (log-likelihood ratio test), which uses G=2Oiln(Oi/Ei)G = 2\sum O_i \ln(O_i/E_i)
  • Fisher's exact test for small samples
  • The z-test for two proportions (for a 2×2 table, \chi^2 = z^2 exactly)

15. Further Exam-Style Questions

Question 13

A 3×2 contingency table yields χ2=4.5\chi^2 = 4.5. At what significance levels would H0H_0 be rejected?

Solution

ν=(31)(21)=2\nu = (3-1)(2-1) = 2.

Critical values: 1% level: 9.219.21, 5% level: 5.995.99, 10% level: 4.614.61.

4.5<4.614.5 < 4.61: H0H_0 would not be rejected at the 10% level (or any conventional level).

The pp-value is slightly above 10%.

Question 14

Explain why merging classes in a chi-squared test reduces the degrees of freedom and may reduce the power of the test.

Solution

Merging reduces the number of classes kk, which reduces ν=k1m\nu = k - 1 - m. Fewer degrees of freedom means the critical value is lower, making it easier to reject H0H_0, but merging also discards information about the differences between the merged classes. If the true deviation from H0H_0 is in the merged classes, the test loses the ability to detect it, reducing power.

Question 15

A goodness-of-fit test of a normal distribution uses 10 classes with mean and variance estimated from the data. The calculated χ2=15.2\chi^2 = 15.2. Test at the 5% level.

Solution

ν=1012=7\nu = 10 - 1 - 2 = 7 (10 classes, 1 for total, 2 estimated parameters).

Critical value at 5%: 14.0714.07.

15.2>14.0715.2 > 14.07: reject H0H_0. There is sufficient evidence to conclude the data does not follow a normal distribution.


16. Further Advanced Topics

16.1 The chi-squared distribution properties

  • χν2\chi^2_\nu is the distribution of i=1νZi2\sum_{i=1}^\nu Z_i^2 where ZiN(0,1)Z_i \sim N(0,1) i.i.d.
  • Mean =ν= \nu, Variance =2ν= 2\nu
  • For large ν\nu: χν2N(ν,2ν)\chi^2_\nu \approx N(\nu, 2\nu) (by CLT)
  • Additivity: χa2+χb2=χa+b2\chi^2_a + \chi^2_b = \chi^2_{a+b} (independent)

16.2 The GG-test (log-likelihood ratio)

An alternative to the chi-squared test using:

G=2i=1kOiln ⁣(OiEi)G = 2\sum_{i=1}^{k} O_i \ln\!\left(\frac{O_i}{E_i}\right)

For large samples, G approximately follows the \chi^2_\nu distribution under H_0, so the same critical values are used.
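On the fair-die data of Example 8.1 the two statistics are very close; a Python sketch:

```python
# Sketch: the G statistic for the fair-die data, to compare with the
# Pearson value 3.8 on the same data.
import math

observed = [25, 17, 15, 23, 18, 22]
expected = [20.0] * 6

G = 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected))
print(round(G, 2))  # 3.83, very close to the Pearson statistic
```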

16.3 Fisher's exact test

For 2×2 tables with small expected frequencies:

P=(a+b)!(c+d)!(a+c)!(b+d)!a!b!c!d!n!P = \frac{(a+b)!(c+d)!(a+c)!(b+d)!}{a!\,b!\,c!\,d!\,n!}

This gives the exact pp-value without approximation.
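The hypergeometric formula above gives the probability of one particular table with the given margins; the exact p-value sums these probabilities over all tables at least as extreme. A Python sketch for the point probability (the table (2, 3; 4, 1) is hypothetical):

```python
# Sketch: probability of one 2x2 table under fixed margins, using the
# hypergeometric formula. The p-value (not shown) sums such probabilities
# over all tables at least as extreme.
from math import factorial as f

def table_prob(a, b, c, d):
    n = a + b + c + d
    return (f(a + b) * f(c + d) * f(a + c) * f(b + d)) / (
        f(a) * f(b) * f(c) * f(d) * f(n))

print(table_prob(2, 3, 4, 1))  # 5/21, about 0.238
```

As a sanity check, the probabilities of all tables sharing one set of margins sum to 1.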

16.4 Post-hoc analysis

After rejecting H0H_0 in a goodness-of-fit test, standardised residuals identify which classes contribute most:

r_i = \dfrac{O_i - E_i}{\sqrt{E_i}}

Values with ri>2|r_i| > 2 indicate significant deviations.
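Applied to the hair/eye table of Example 8.2 (where H_0 was rejected), a Python sketch flags the cells driving the result:

```python
# Sketch: standardised residuals for the hair/eye table, flagging |r| > 2.
import math

obs = [[40, 20, 10], [30, 60, 20], [10, 40, 70]]
rows = [sum(r) for r in obs]
cols = [sum(c) for c in zip(*obs)]
n = sum(rows)

flagged = []
for i, r in enumerate(rows):
    for j, c in enumerate(cols):
        e = r * c / n
        resid = (obs[i][j] - e) / math.sqrt(e)
        if abs(resid) > 2:
            flagged.append((i, j, round(resid, 2)))

print(flagged)  # six cells stand out, led by blonde/blue (r about 4.9)
```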


17. Further Exam-Style Questions

Question 16

In a χ2\chi^2 goodness-of-fit test with 8 classes, 1 parameter estimated, and χ2=11.3\chi^2 = 11.3, find the approximate pp-value.

Solution

ν=811=6\nu = 8 - 1 - 1 = 6.

From chi-squared tables: χ6,0.052=12.59\chi^2_{6,0.05} = 12.59, χ6,0.102=10.64\chi^2_{6,0.10} = 10.64.

10.64<11.3<12.5910.64 < 11.3 < 12.59, so 0.05<p<0.100.05 < p < 0.10.

The pp-value is approximately 0.08.
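The bracketing can be sharpened: for even degrees of freedom the chi-squared upper tail has the closed form P(\chi^2_\nu > x) = e^{-x/2} \sum_{j=0}^{\nu/2 - 1} \dfrac{(x/2)^j}{j!}. A Python sketch applying it to Question 16:

```python
# Sketch: exact chi-squared upper-tail probability for EVEN degrees of
# freedom, applied to nu = 6, x = 11.3 from Question 16.
import math

def chi2_sf_even(x, nu):
    assert nu % 2 == 0, "closed form holds for even df only"
    h = x / 2
    return math.exp(-h) * sum(h ** j / math.factorial(j) for j in range(nu // 2))

print(round(chi2_sf_even(11.3, 6), 3))  # about 0.080, matching the bracketing
```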

Question 17

Explain when it is appropriate to use Fisher's exact test instead of the chi-squared test.

Solution

Fisher's exact test should be used when:

  1. The sample size is small (typically n<20n < 20 for 2×2 tables)
  2. Expected frequencies are less than 5 and cannot be fixed by merging
  3. An exact pp-value is required rather than an approximation
  4. The table is 2×2 (for larger tables, Fisher's test becomes computationally expensive)

18. Further Exam-Style Questions

Question 18

A researcher tests whether a die is fair. In 120 rolls, the observed frequencies are:

| Face | 1 | 2 | 3 | 4 | 5 | 6 |
| --- | --- | --- | --- | --- | --- | --- |
| Freq | 15 | 22 | 18 | 25 | 20 | 20 |

Carry out a chi-squared goodness-of-fit test at the 5% significance level.

Solution

H0H_0: Die is fair. H1H_1: Die is not fair.

Expected frequency for each face: 120/6=20120/6 = 20.

\chi^2 = \sum \dfrac{(O-E)^2}{E} = \dfrac{25+4+4+25+0+0}{20} = \dfrac{58}{20} = 2.9.

\nu = 6 - 1 = 5. Critical value at 5%: \chi^2_{0.05,\,5} = 11.07.

2.9<11.072.9 < 11.07, so we do not reject H0H_0. There is insufficient evidence to suggest the die is unfair.