Goodness of Fit (Edexcel A Level Further Maths): Revision Note
Goodness of Fit
What is the difference between observed values and expected values?
Goodness of fit is a measure of how well real-life observed data fits a theoretical model
For example, modelling a coin as fair then flipping it 20 times
You may observe 13 heads
You would expect 10 heads
Observed ($O_i$) and expected ($E_i$) values can be shown in a table
For example, rolling a fair die 60 times ($N = 60$)
Outcome | 1 | 2 | 3 | 4 | 5 | 6 |
Observed, $O_i$ | 12 | 7 | 8 | 10 | 14 | 9 |
Expected, $E_i$ | 10 | 10 | 10 | 10 | 10 | 10 |
Note that $\sum O_i = \sum E_i = 60$
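Expected values come from multiplying the total number of trials by each model probability. A minimal sketch for the fair-die table, where `expected_frequencies` is a hypothetical helper name and exact fractions are used to avoid rounding:

```python
from fractions import Fraction

def expected_frequencies(total, probabilities):
    """Expected value for each outcome: total trials times model probability."""
    return [total * p for p in probabilities]

# A fair die gives each of the six outcomes probability 1/6.
fair_die = [Fraction(1, 6)] * 6
observed = [12, 7, 8, 10, 14, 9]
expected = expected_frequencies(sum(observed), fair_die)

print([int(e) for e in expected])      # [10, 10, 10, 10, 10, 10]
print(sum(observed) == sum(expected))  # True
```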
How different do observed and expected values need to be before the model is not a good fit?
You can do a hypothesis test to reach a conclusion
What are the null and alternative hypotheses?
$H_0$: There is no difference between the observed and the expected distribution
$H_1$: The observed distribution cannot be modelled by the expected distribution
Let $\alpha$ be the significance level
How do I calculate the goodness of fit?
First, combine adjacent columns whose expected values are less than 5 until every expected value is at least 5
For example
Score | 1 | 2 | 3 | 4 |
Observed, $O_i$ | 15 | 6 | 4 | 1 |
Expected, $E_i$ | 12 | 8 | 4 | 2 |
The expected values of 4 and 2 are both less than 5, so combine the last two columns
Score | 1 | 2 | 3+ |
Observed, $O_i$ | 15 | 6 | 5 |
Expected, $E_i$ | 12 | 8 | 6 |
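The combining step can be sketched in code. This is one possible pooling rule, not a prescribed method: it sweeps left to right, merging neighbouring columns until the pooled expected value reaches 5, and folds any small leftover at the right-hand end into the previous column. The function name `combine_small_expected` is hypothetical.

```python
def combine_small_expected(observed, expected, minimum=5):
    """Pool adjacent columns until every pooled expected value is >= minimum."""
    pooled_obs, pooled_exp = [], []
    run_obs, run_exp, pending = 0, 0, False
    for o, e in zip(observed, expected):
        run_obs += o
        run_exp += e
        pending = True
        if run_exp >= minimum:
            # This run is large enough to stand as its own column.
            pooled_obs.append(run_obs)
            pooled_exp.append(run_exp)
            run_obs, run_exp, pending = 0, 0, False
    if pending:
        if pooled_exp:
            # Leftover columns at the end are too small: merge them leftwards.
            pooled_obs[-1] += run_obs
            pooled_exp[-1] += run_exp
        else:
            pooled_obs.append(run_obs)
            pooled_exp.append(run_exp)
    return pooled_obs, pooled_exp

print(combine_small_expected([15, 6, 4, 1], [12, 8, 4, 2]))
# ([15, 6, 5], [12, 8, 6])
```

Whatever rule is used, the observed values must be pooled in exactly the same way as the expected values.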
Then calculate the goodness of fit, $X^2$, from the formula
$X^2 = \sum \frac{(O_i - E_i)^2}{E_i}$
An alternative version of the formula that can be easier to calculate is
$X^2 = \sum \frac{O_i^2}{E_i} - N$
Where $N$ is the sum of all observed values
This is also the same as the sum of all expected values
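To see why the two versions of the formula agree, expand the square and use the fact that the observed values and the expected values both sum to the same total. A short derivation, writing $X^2$ for the goodness of fit, $O_i$ and $E_i$ for the observed and expected values, and $N$ for their common total:

```latex
\begin{aligned}
X^2 &= \sum \frac{(O_i - E_i)^2}{E_i}
     = \sum \frac{O_i^2 - 2 O_i E_i + E_i^2}{E_i} \\
    &= \sum \frac{O_i^2}{E_i} - 2\sum O_i + \sum E_i \\
    &= \sum \frac{O_i^2}{E_i} - 2N + N
     = \sum \frac{O_i^2}{E_i} - N
\end{aligned}
```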
The larger $X^2$ is, the more different the observed values are from the expected values
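As an illustration, both versions of the calculation can be checked against the fair-die table (observed 12, 7, 8, 10, 14, 9 against an expected 10 each). The function names here are hypothetical; this is only a sketch of the arithmetic.

```python
observed = [12, 7, 8, 10, 14, 9]
expected = [10, 10, 10, 10, 10, 10]

def goodness_of_fit(observed, expected):
    """Sum of (O - E)^2 / E over all columns."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def goodness_of_fit_alternative(observed, expected):
    """Sum of O^2 / E over all columns, minus the total N."""
    n = sum(observed)
    return sum(o ** 2 / e for o, e in zip(observed, expected)) - n

x_squared = goodness_of_fit(observed, expected)
x_squared_alt = goodness_of_fit_alternative(observed, expected)
print(round(x_squared, 4), round(x_squared_alt, 4))  # 3.4 3.4
```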
What are degrees of freedom?
The number of degrees of freedom, $\nu$, is equal to
The number of columns (after combining to get $E_i \geq 5$) subtract 1
If you also use the observed data to estimate a parameter, then you subtract 2 instead
For example, estimating $p$ when comparing to a $\mathrm{B}(n, p)$ distribution
You are subtracting the number of constraints (or restrictions)
This is the number of times you use the observed data to help form the expected data
This is always 1 from ensuring their totals match, $\sum O_i = \sum E_i$
Then another 1 for each parameter estimated
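The counting rule can be written as a one-line calculation. In this sketch, `degrees_of_freedom` and its parameter names are hypothetical, chosen only to mirror the wording above.

```python
def degrees_of_freedom(columns_after_combining, parameters_estimated=0):
    """Columns minus constraints: 1 for matching totals, plus 1 per estimated parameter."""
    constraints = 1 + parameters_estimated
    return columns_after_combining - constraints

print(degrees_of_freedom(6))                          # 5 (fair die, six columns)
print(degrees_of_freedom(3))                          # 2 (three columns after combining)
print(degrees_of_freedom(4, parameters_estimated=1))  # 2 (one parameter estimated from the data)
```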
How do I use the chi-squared distribution?
Once you have calculated the goodness of fit, $X^2$
Compare it to the critical value $\chi^2_\nu(\alpha)$ from the chi-squared distribution
$\nu$ is the number of degrees of freedom
Tables of critical values are provided in the exam
You need the significance level, $\alpha$
All chi-squared tests are one-tailed
If $X^2 < \chi^2_\nu(\alpha)$ then there is insufficient evidence to reject $H_0$
This means there is no significant evidence of a difference between the observed and expected distributions
In other words, "the expected distribution is a suitable model for the data"
If $X^2 > \chi^2_\nu(\alpha)$ then there is sufficient evidence to reject $H_0$
The expected distribution is not a suitable model for the data
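The comparison with a critical value can be sketched as a simple lookup. The dictionary below holds the standard 5% upper-tail critical values for 1 to 5 degrees of freedom only; the names `CRITICAL_VALUES_5_PERCENT` and `reject_h0` are hypothetical, and other significance levels would need their own table column.

```python
# 5% upper-tail critical values of the chi-squared distribution,
# indexed by degrees of freedom (1 to 5 only).
CRITICAL_VALUES_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}

def reject_h0(x_squared, degrees_of_freedom):
    """True if the goodness of fit exceeds the 5% critical value."""
    critical_value = CRITICAL_VALUES_5_PERCENT[degrees_of_freedom]
    return x_squared > critical_value

# Fair-die data (12, 7, 8, 10, 14, 9 against 10 each) gives a goodness of fit
# of 3.4, with 6 columns and so 5 degrees of freedom.
print(reject_h0(3.4, 5))  # False
```

Here 3.4 is well below 11.070, so there is insufficient evidence at the 5% level that the die is not fair.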
Alternatively, you can use your calculator to find the p-value
This is the probability of obtaining a chi-squared value of $X^2$ or more if $H_0$ is true
If the p-value is less than $\alpha$ then the result is significant (reject $H_0$)
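For checking a calculator's p-value, the upper-tail probability can be sketched with the standard library alone. This is not how a calculator works internally; it assumes a whole number of degrees of freedom and uses a closed form for the chi-squared tail built from `math.erfc` and `math.gamma`. The function name `chi_squared_p_value` is hypothetical.

```python
import math

def chi_squared_p_value(x_squared, df):
    """P(chi-squared with df degrees of freedom >= x_squared), integer df >= 1."""
    y = x_squared / 2
    if df % 2 == 0:
        a, q = 0.0, 0.0                         # even df: tail is a finite sum
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))     # odd df: start from the df = 1 tail
    # Add one term per step of 2 degrees of freedom until a reaches df / 2.
    while a < df / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q

# Fair-die data: goodness of fit 3.4 with 5 degrees of freedom.
p_value = chi_squared_p_value(3.4, 5)
print(round(p_value, 2))
print(p_value < 0.05)  # False, so do not reject at the 5% level
```

As a sanity check, feeding in a tabulated 5% critical value (for example 5.991 with 2 degrees of freedom) should return a p-value very close to 0.05.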
Examiner Tips and Tricks
The alternative formula $X^2 = \sum \frac{O_i^2}{E_i} - N$ is not given in the Formulae Booklet
Worked Example
A game is meant to award points according to the probability distribution below.
Points | 2 | 4 | 8 | 10 |
Probability | 0.6 | 0.2 | 0.15 | 0.05 |
The game is played by 40 people, giving the results below.
Points | 2 | 4 | 8 | 10 |
Frequency | 28 | 5 | 4 | 3 |
Test, at the 5% level of significance, whether or not the game is operating correctly.
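One way to approach this question is sketched below, following the steps in this note: find the expected values, combine columns with expected values below 5, calculate the goodness of fit, then compare with the 5% critical value. The variable names are hypothetical, and 5.991 is the standard 5% critical value for 2 degrees of freedom.

```python
points = [2, 4, 8, 10]
probabilities = [0.6, 0.2, 0.15, 0.05]
observed = [28, 5, 4, 3]
n = sum(observed)  # 40 players

# H0: the game follows the given distribution. Expected values are n * probability.
expected = [n * p for p in probabilities]  # 24, 8, 6, 2 (up to float rounding)

# The last expected value (2) is below 5, so combine the last two columns.
observed_pooled = observed[:2] + [observed[2] + observed[3]]  # [28, 5, 7]
expected_pooled = expected[:2] + [expected[2] + expected[3]]  # 24, 8, 8

x_squared = sum((o - e) ** 2 / e for o, e in zip(observed_pooled, expected_pooled))
df = len(observed_pooled) - 1  # no parameters estimated, so subtract 1
critical_value = 5.991         # 5% level, 2 degrees of freedom
reject = x_squared > critical_value

print(round(x_squared, 3), df, reject)  # 1.917 2 False
```

Since 1.917 is less than 5.991, there is insufficient evidence at the 5% level to reject the null hypothesis, so the data are consistent with the game operating correctly.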

