Chi Squared Tests for Contingency Tables (Edexcel A Level Further Maths): Revision Note
Contingency Tables
What is a chi-squared test using contingency tables?
A chi-squared ($\chi^2$) test using contingency tables is a hypothesis test used to test whether two variables are independent of each other
For example whether or not favourite music genre is independent of the age group of the listener
This is sometimes called a two-way test
This is an example of a goodness of fit test
We are testing whether the data fits the modelling assumption that the variables are independent
The chi-squared ($\chi^2$) distribution is used for this test
You will use a contingency table
This is a two-way table that shows the observed frequencies for the different combinations of the two variables
An $m \times n$ contingency table has $m$ rows and $n$ columns
This does not include any rows or columns used to record the row, column and grand totals
Why might I have to combine rows or columns?
The observed values are used to calculate expected values
These are the expected frequencies for each combination assuming that the variables are independent
Your calculator may be able to calculate these for you after you input the observed frequencies
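Or else you can calculate them using the formula $\text{Expected frequency} = \dfrac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$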
None of the expected values used to run the test can be less than 5
If one of the expected values is less than 5 then you will have to combine the corresponding row or column in the table of observed values with the adjacent row or column
The decision between row or column will be based on which seems the most appropriate
For example: if the two variables are age and favourite music genre then it is more appropriate to combine age groups than types of genre
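The calculation and the "less than 5" check can be sketched in a few lines of Python. This is only an illustration: the table of observed frequencies is hypothetical (rows are age groups, columns are music genres), and the function names `expected_frequencies` and `combine_adjacent_rows` are made up for this sketch.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def combine_adjacent_rows(observed, i):
    """Merge row i with row i + 1 by adding the observed counts cell by cell."""
    merged = [a + b for a, b in zip(observed[i], observed[i + 1])]
    return observed[:i] + [merged] + observed[i + 2:]


# Hypothetical data: rows are age groups, columns are music genres.
observed = [[12, 15], [8, 10], [3, 2]]

expected = expected_frequencies(observed)
# True here: the last age group has expected frequencies below 5.
too_small = min(min(row) for row in expected) < 5

# Combine the last two age groups, then recalculate from the modified table.
combined = combine_adjacent_rows(observed, 1)
expected_after = expected_frequencies(combined)
```

Note that the observed frequencies are combined first and the expected frequencies are then recalculated from the new table, rather than adding the old expected frequencies by hand.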
What are the degrees of freedom?
There will be a minimum number of expected values you would need to know in order to be able to calculate all the expected values
This minimum number is called the degrees of freedom and is often denoted by $\nu$
For a test for independence with an $m \times n$ contingency table, $\nu = (m - 1)(n - 1)$
For example: If there are 5 rows and 3 columns then you only need to know 2 of the values in 4 of the rows as the rest can be calculated using the totals
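The idea that the remaining values follow from the totals can be checked with a short Python sketch. The numbers below are hypothetical, and `complete_table` is an illustrative helper rather than anything from the specification: it takes the top-left block of known values plus the row and column totals and fills in the last column and the last row.

```python
def degrees_of_freedom(rows, cols):
    """Degrees of freedom for a test of independence on a rows x cols table."""
    return (rows - 1) * (cols - 1)


def complete_table(known_block, row_totals, col_totals):
    """Rebuild a full table from its top-left (rows - 1) x (cols - 1) block and the totals."""
    # Last entry of each known row = row total minus the known entries in that row.
    table = [list(row) + [row_totals[i] - sum(row)] for i, row in enumerate(known_block)]
    # Last row = column total minus the entries already placed in that column.
    last_row = [col_totals[j] - sum(row[j] for row in table) for j in range(len(col_totals))]
    table.append(last_row)
    return table


# Hypothetical 3 x 3 table: only (3 - 1) * (3 - 1) = 4 values are known.
known_block = [[10, 20], [5, 15]]
full_table = complete_table(known_block, row_totals=[40, 30, 30], col_totals=[25, 45, 30])

dof_3_by_3 = degrees_of_freedom(3, 3)  # 4, matching the number of known values
dof_5_by_3 = degrees_of_freedom(5, 3)  # 8, i.e. 2 values in each of 4 rows
```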
What are the steps for a chi-squared test using contingency tables?
STEP 1: Write the hypotheses
$H_0$: Variable X and Variable Y are independent
$H_1$: Variable X and Variable Y are not independent
The hypotheses should always be stated in the context of the question
Make sure you clearly write what the variables are and don’t just call them 'Variable X' and 'Variable Y'
STEP 2: Calculate the expected frequencies
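Use the formula $\text{Expected frequency} = \dfrac{\text{Row total} \times \text{Column total}}{\text{Grand total}}$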
You will need to combine rows or columns if any of the expected frequencies are less than 5
This process is described above
After combining, calculate the new expected frequencies for the modified table
You may also be able to enter the observed frequencies as a matrix in your calculator
Use the option for a 2-way test
Your calculator will calculate the matrix of expected frequencies
STEP 3: Calculate the degrees of freedom for the test
For an $m \times n$ contingency table (after combining)
Degrees of freedom is $\nu = (m - 1)(n - 1)$
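STEP 4: Calculate $\chi^2_{\text{calc}}$ using the formula $\chi^2_{\text{calc}} = \sum \dfrac{(O_i - E_i)^2}{E_i}$
If you calculate the statistic by hand then you will also need to determine the appropriate $\chi^2$ critical value
Use the 'Percentage Points of the $\chi^2$ Distribution' table in the exam formula booklet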
If you entered the observed frequencies as a matrix in your calculator then your calculator's 2-way test option will give you the test statistic $\chi^2_{\text{calc}}$ and the associated p-value
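For a by-hand check of the test statistic, the sum of (observed minus expected) squared over expected can be sketched in Python as follows. The 2 by 2 table is hypothetical and the function name `chi_squared_statistic` is an illustrative choice. The expected frequencies are the ones that the row and column totals of this table give under independence.

```python
def chi_squared_statistic(observed, expected):
    """Sum (O - E)^2 / E over every cell of the table."""
    return sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )


# Hypothetical observed frequencies (row totals 30 and 70, column totals 40 and 60).
observed = [[10, 20], [30, 40]]
# Expected frequencies: row total * column total / 100 for each cell.
expected = [[12.0, 18.0], [28.0, 42.0]]

statistic = chi_squared_statistic(observed, expected)
print(round(statistic, 4))
```

Each cell contributes one term to the sum, so a single cell with a large difference between observed and expected can dominate the statistic.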
STEP 5: Decide whether there is evidence to reject the null hypothesis
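Compare the $\chi^2_{\text{calc}}$ statistic with the critical value you have determined
If $\chi^2_{\text{calc}}$ > critical value (or p-value < significance level) then there is sufficient evidence to reject $H_0$
If $\chi^2_{\text{calc}}$ < critical value (or p-value > significance level) then there is insufficient evidence to reject $H_0$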
STEP 6: Write your conclusion
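If you reject $H_0$
There is sufficient evidence to suggest that variable X and variable Y are not independent
If you do not reject $H_0$
There is insufficient evidence to suggest that variable X and variable Y are not independent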
Be sure to state your conclusion in the context of the question
Worked Example
At a school in Paris, it is believed that favourite film genre is related to favourite subject. 500 students were asked to indicate their favourite film genre and favourite subject from a selection and the results are indicated in the table below.
| | Comedy | Action | Romance | Thriller | Total |
| Maths | 51 | 52 | 37 | 55 | 195 |
| Sports | 59 | 63 | 41 | 33 | 196 |
| Geography | 35 | 31 | 28 | 15 | 109 |
| Total | 145 | 146 | 106 | 103 | 500 |
Using the $\chi^2$ statistic and a significance test at the 1% level, test these results to see if there is an association between favourite film genre and favourite subject. State your conclusions.
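One way to check a by-hand answer to this question is the Python sketch below, which follows the six steps using the observed frequencies from the table. It is not the official worked solution. The critical value 16.812 is an assumption taken from standard chi-squared tables for 6 degrees of freedom at the 1% level, so check it against the formula booklet, and the variable names are illustrative.

```python
# STEP 1: H0: favourite film genre and favourite subject are independent.
#         H1: favourite film genre and favourite subject are not independent.

# Observed frequencies (columns: Comedy, Action, Romance, Thriller).
observed = [
    [51, 52, 37, 55],  # Maths
    [59, 63, 41, 33],  # Sports
    [35, 31, 28, 15],  # Geography
]

# STEP 2: expected frequency = row total * column total / grand total.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
smallest_expected = min(min(row) for row in expected)  # must be at least 5

# STEP 3: degrees of freedom for a 3 x 4 table.
dof = (len(observed) - 1) * (len(observed[0]) - 1)

# STEP 4: test statistic, summing (O - E)^2 / E over all 12 cells.
statistic = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Assumed critical value: 1% point of the chi-squared distribution with 6 degrees of freedom.
CRITICAL_VALUE = 16.812

# STEP 5: reject H0 only if the statistic exceeds the critical value.
reject_h0 = statistic > CRITICAL_VALUE

print(f"degrees of freedom = {dof}, statistic = {statistic:.2f}, reject H0: {reject_h0}")
```

Under these assumptions no rows or columns need combining, and the statistic comes out at roughly 12.8, which is below the critical value. So there is insufficient evidence at the 1% level to suggest that favourite film genre and favourite subject are not independent.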

