Chi Squared Tests for Contingency Tables (Edexcel A Level Further Maths: Further Statistics 1)

Revision Note

Author: Roger

Contingency Tables

What is a chi-squared test using contingency tables?

  • A chi-squared ($\chi^2$) test using contingency tables is a hypothesis test used to determine whether two variables are independent of each other
    • For example, whether or not favourite music genre is independent of the age group of the listener
    • This is sometimes called a $\chi^2$ two-way test
  • This is an example of a goodness of fit test
    • We are testing whether the data fits the modelling assumption that the variables are independent
  • The chi-squared ($\chi^2$) distribution is used for this test
  • You will use a contingency table
    • This is a two-way table that shows the observed frequencies for the different combinations of the two variables
    • An $h \times k$ contingency table has $h$ rows and $k$ columns
      • This does not include any rows or columns used to record the row, column and grand totals

Why might I have to combine rows or columns?

  • The observed values are used to calculate expected values
    • These are the expected frequencies for each combination assuming that the variables are independent
      • Your calculator may be able to calculate these for you after you input the observed frequencies
      • Or else you can calculate them using the formula

$$\text{expected frequency} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

  • None of the expected values used to run the test can be less than 5
    • If one of the expected values is less than 5 then you will have to combine the corresponding row or column in the table of observed values with the adjacent row or column
    • The decision between row or column will be based on which seems the most appropriate
      • For example: if the two variables are age and favourite music genre then it is more appropriate to combine age groups than types of genre
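The calculation and the check described above can be sketched in a few lines of Python. This is only an illustration: the observed frequencies are hypothetical (three age groups by two genres), and the helper names `expected_frequencies`, `cells_below_five` and `combine_adjacent_rows` are made up for this sketch.

```python
def expected_frequencies(observed):
    """Expected frequency of each cell = row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def cells_below_five(expected):
    """Return the (row, column) positions of expected frequencies less than 5."""
    return [(i, j)
            for i, row in enumerate(expected)
            for j, value in enumerate(row)
            if value < 5]


def combine_adjacent_rows(observed, i):
    """Merge row i with row i + 1 by adding their observed frequencies."""
    merged = [a + b for a, b in zip(observed[i], observed[i + 1])]
    return observed[:i] + [merged] + observed[i + 2:]


# Hypothetical data: rows are three age groups, columns are two genres.
observed = [[20, 30], [25, 35], [3, 2]]
small_before = cells_below_five(expected_frequencies(observed))

# The last age group is too small, so merge it with the adjacent age group
# and recalculate the expected frequencies for the modified table.
combined = combine_adjacent_rows(observed, 1)
small_after = cells_below_five(expected_frequencies(combined))

print(small_before)  # cells with expected frequency below 5 before combining
print(combined)      # the modified table of observed frequencies
print(small_after)   # cells still below 5 after combining
```

Here the rows (age groups) are combined rather than the columns (genres), matching the guidance that neighbouring age groups are the more natural categories to merge.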

What are the degrees of freedom?

  • There will be a minimum number of expected values you would need to know in order to be able to calculate all the expected values
  • This minimum number is called the degrees of freedom and is often denoted by $\nu$
  • For a test for independence with an $h \times k$ contingency table
    • $\nu = (h - 1) \times (k - 1)$
    • For example: if there are 5 rows and 3 columns then $\nu = 4 \times 2 = 8$, as you only need to know 2 of the values in each of 4 of the rows; the rest can be calculated using the totals
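To see why only $(h - 1) \times (k - 1)$ values are needed, the sketch below rebuilds a whole 5 by 3 table from 8 known values plus the row and column totals. The numbers are hypothetical and the function name `complete_table` is made up for this illustration.

```python
def complete_table(known_block, row_totals, col_totals):
    """Fill in an h x k table from its top-left (h-1) x (k-1) block and the totals."""
    # The last entry of each of the first h-1 rows follows from its row total.
    rows = [block_row + [total - sum(block_row)]
            for block_row, total in zip(known_block, row_totals)]
    # The whole of the final row then follows from the column totals.
    last_row = [total - sum(col) for col, total in zip(zip(*rows), col_totals)]
    return rows + [last_row]


# Hypothetical 5 x 3 table: only 2 values in each of 4 rows are known.
known_block = [[4, 6], [5, 5], [8, 2], [3, 7]]
row_totals = [20, 20, 20, 20, 20]
col_totals = [25, 30, 45]

table = complete_table(known_block, row_totals, col_totals)
degrees_of_freedom = (len(row_totals) - 1) * (len(col_totals) - 1)

for row in table:
    print(row)
print(degrees_of_freedom)
```

The number of entries in `known_block` is exactly the degrees of freedom for a table of this size.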

What are the steps for a chi-squared test using contingency tables?

  • STEP 1: Write the hypotheses
    • $\mathrm{H}_0$: Variable X and Variable Y are independent
    • $\mathrm{H}_1$: Variable X and Variable Y are not independent
    • The hypotheses should always be stated in the context of the question
    • Make sure you clearly write what the variables are and don’t just call them 'Variable X' and 'Variable Y'
  • STEP 2: Calculate the expected frequencies
    • Use the formula

$$\text{expected frequency} = \frac{\text{row total} \times \text{column total}}{\text{grand total}}$$

    • You will need to combine rows or columns if any of the expected frequencies are less than 5
      • This process is described above
      • After combining, calculate the new expected frequencies for the modified table
    • You may also be able to enter the observed frequencies as a matrix in your calculator
      • Use the option for a 2-way test
      • Your calculator will calculate the matrix of expected frequencies
  • STEP 3: Calculate the degrees of freedom for the test
    • For an $h \times k$ contingency table (after combining)
      • Degrees of freedom is $\nu = (h - 1) \times (k - 1)$
  • STEP 4: Calculate the test statistic $X^2$
    • If you are working by hand, use the formula

$$X^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = \left( \sum_{i=1}^{n} \frac{O_i^2}{E_i} \right) - N$$

      • Here $O_i$ and $E_i$ are the observed and expected frequencies for each of the $n$ cells, and $N$ is the grand total
      • You will also need to determine the appropriate $\chi_\nu^2(\alpha\%)$ critical value
      • Use the 'Percentage Points of the $\chi^2$ Distribution' table in the exam formula booklet
    • If you entered the observed frequencies as a matrix in your calculator
      • Your calculator's 2-way test option will give you the test statistic $X^2$ and the associated $p$-value
  • STEP 5: Decide whether there is evidence to reject the null hypothesis
    • Compare the statistic with the critical value you have determined
      • If $X^2 >$ critical value (or $p < \alpha$) then there is sufficient evidence to reject $\mathrm{H}_0$
      • If $X^2 <$ critical value (or $p > \alpha$) then there is insufficient evidence to reject $\mathrm{H}_0$
  • STEP 6: Write your conclusion
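    • If you reject $\mathrm{H}_0$
      • There is sufficient evidence to suggest that Variable X and Variable Y are not independent
    • If you do not reject $\mathrm{H}_0$
      • There is insufficient evidence to suggest that Variable X and Variable Y are not independent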
    • Be sure to state your conclusion in the context of the question
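Steps 2 to 5 can be sketched as a short Python function. This is a rough illustration rather than a replacement for your calculator's 2-way test: the function name `chi_squared_test` and the observed frequencies are hypothetical, and the critical value is assumed to have been looked up in the percentage points table (5.991 is the 5% value for 2 degrees of freedom).

```python
def chi_squared_test(observed, critical_value):
    """Return (X^2, shortcut form of X^2, degrees of freedom, reject H0?)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # STEP 2: expected frequencies, assuming the variables are independent.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # STEP 3: degrees of freedom for an h x k table.
    dof = (len(observed) - 1) * (len(observed[0]) - 1)

    # STEP 4: both forms of the formula for the test statistic.
    pairs = [(o, e)
             for obs_row, exp_row in zip(observed, expected)
             for o, e in zip(obs_row, exp_row)]
    x_squared = sum((o - e) ** 2 / e for o, e in pairs)
    shortcut = sum(o ** 2 / e for o, e in pairs) - grand_total

    # STEP 5: reject H0 if the statistic exceeds the critical value.
    return x_squared, shortcut, dof, x_squared > critical_value


# Hypothetical 2 x 3 table, tested at the 5% level (critical value 5.991).
x_squared, shortcut, dof, reject = chi_squared_test(
    [[10, 20, 30], [30, 20, 10]], critical_value=5.991)
print(x_squared, shortcut, dof, reject)
```

Both forms of the formula give the same value, which is a useful check when working by hand. The sketch does not check for expected frequencies below 5, so any combining of rows or columns would need to be done first.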

Worked example

At a school in Paris, it is believed that favourite film genre is related to favourite subject. 500 students were asked to indicate their favourite film genre and favourite subject from a selection and the results are indicated in the table below.

             Comedy   Action   Romance   Thriller   Total
Maths          51       52        37        55       195
Sports         59       63        41        33       196
Geography      35       31        28        15       109
Total         145      146       106       103       500

Using the $\chi^2$ statistic and a hypothesis test at the 1% level of significance, test these results to see if there is an association between favourite film genre and favourite subject. State your conclusions.
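The steps above can be applied to this table with the Python sketch below. It is one possible way of checking the working, not the original worked solution. The critical value of 16.812 is an assumption read from a standard percentage points table for 6 degrees of freedom at the 1% level, so check it against your own formula booklet.

```python
# Observed frequencies from the table above
# (rows: Maths, Sports, Geography; columns: Comedy, Action, Romance, Thriller).
observed = [
    [51, 52, 37, 55],
    [59, 63, 41, 33],
    [35, 31, 28, 15],
]

row_totals = [sum(row) for row in observed]        # 195, 196, 109
col_totals = [sum(col) for col in zip(*observed)]  # 145, 146, 106, 103
grand_total = sum(row_totals)                      # 500

# Expected frequencies under H0: genre and subject are independent.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
smallest_expected = min(min(row) for row in expected)

# Test statistic X^2 = sum of (O - E)^2 / E over all 12 cells.
x_squared = sum((o - e) ** 2 / e
                for obs_row, exp_row in zip(observed, expected)
                for o, e in zip(obs_row, exp_row))

# Degrees of freedom for a 3 x 4 table.
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: 1% critical value for 6 degrees of freedom, taken from tables.
critical_value = 16.812
reject_h0 = x_squared > critical_value

print(round(smallest_expected, 3))
print(round(x_squared, 2), degrees_of_freedom, reject_h0)
```

Every expected frequency is well above 5, so no rows or columns need combining. The statistic comes out at roughly 12.8 with $\nu = 6$, which is below the assumed critical value, so there is insufficient evidence at the 1% level to reject $\mathrm{H}_0$: the data do not provide sufficient evidence of an association between favourite film genre and favourite subject.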

Author: Roger

Expertise: Maths

Roger's teaching experience stretches all the way back to 1992, and in that time he has taught students at all levels between Year 7 and university undergraduate. Having conducted and published postgraduate research into the mathematical theory behind quantum computing, he is more than confident in dealing with mathematics at any level the exam boards might throw at you.