Hypothesis Tests for Goodness of Fit (College Board AP® Statistics)
Study Guide
Goodness of fit test
What is a goodness of fit test?
A goodness of fit test measures how well real-life observed data fit a theoretical model
For example, from a coin being thrown a large number of times, a random sample of 20 throws may be inspected
You may observe 13 heads
You would expect 10 heads
Observed and expected values can be shown in a table
For example, a fair dice being rolled a large number of times and a randomly selected sample of 60 rolls ($n = 60$) could be displayed as below
Outcome | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
Observed | 12 | 7 | 8 | 10 | 14 | 9 |
Expected | 10 | 10 | 10 | 10 | 10 | 10 |
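Note that the observed values and the expected values have the same total, $\sum \text{Observed} = \sum \text{Expected} = 60$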
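The expected row of the dice table can be reproduced with a short Python sketch. It assumes the fair-die model (each outcome has probability 1/6); the variable names are illustrative only.

```python
from fractions import Fraction

# Observed counts for outcomes 1 to 6, taken from the table above
observed = [12, 7, 8, 10, 14, 9]
n = sum(observed)  # sample size, 60

# Fair-die model: every outcome has probability 1/6 (exact fractions avoid rounding)
probabilities = [Fraction(1, 6)] * 6

# Expected count = sample size x model probability
expected = [n * p for p in probabilities]

print([int(e) for e in expected])       # [10, 10, 10, 10, 10, 10]
print(sum(observed) == sum(expected))   # True
```

The final line checks that both rows have the same total, which is always the case when the expected counts are built from the same sample size.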
What are the null and alternative hypotheses for a goodness of fit test?
The null hypothesis, $H_0$, is the assumption that there is no difference between the expected distribution and the actual distribution
e.g. The dice outcomes have a uniform distribution with the true probability of each outcome being $\frac{1}{6}$
It is assumed to be correct unless the sample provides convincing evidence otherwise
The alternative hypothesis, $H_a$, is the assumption that there is a difference between the expected distribution and the actual distribution
e.g. The dice outcomes do not have a uniform distribution, so at least one outcome has a true probability different from $\frac{1}{6}$
What are the conditions for a goodness of fit test?
When performing a chi-square goodness of fit test:
Observed values must come from a random sample
Observed values must be independent
They are sampled with replacement
or the sample size is less than 10% of the population size
Expected values must meet the large counts condition
Each expected value must be greater than or equal to 5
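The two conditions that involve numbers can be checked mechanically. The sketch below is a minimal illustration; the function names and the population size of 1000 are hypothetical, and whether the sample is random can only be judged from how the data were collected.

```python
def large_counts_ok(expected, minimum=5):
    """Large counts condition: every expected value is at least `minimum`."""
    return all(e >= minimum for e in expected)


def ten_percent_ok(sample_size, population_size):
    """10% condition: the sample is less than 10% of the population."""
    return sample_size < 0.10 * population_size


# Dice example: every expected count is 10
print(large_counts_ok([10, 10, 10, 10, 10, 10]))  # True

# An expected count below 5 breaks the condition
print(large_counts_ok([12, 4.5, 20]))             # False

# Hypothetical population of 1000 rolls, sample of 60
print(ten_percent_ok(60, 1000))                   # True
```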
How do I calculate a chi-square value?
The chi-square value for the goodness of fit test, $\chi^2$, can be calculated from the formula given to you in the exam
$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$
The larger $\chi^2$ is, the more different the observed values are from the expected values
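As a check on the arithmetic, the chi-square value for the dice table can be computed with a few lines of Python. This is only a sketch of the sum of (observed − expected)² ÷ expected; the function name is illustrative.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all outcomes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Dice example: observed counts against an expected count of 10 per face
observed = [12, 7, 8, 10, 14, 9]
expected = [10, 10, 10, 10, 10, 10]

# Contributions are 0.4, 0.9, 0.4, 0, 1.6 and 0.1
print(round(chi_square_statistic(observed, expected), 2))  # 3.4
```

If the observed counts matched the expected counts exactly, every term would be zero and the chi-square value would be 0.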
What are degrees of freedom?
The number of degrees of freedom, 'dof', is equal to
the number of possible outcomes subtract 1
e.g. dof for the fair die is $6 - 1 = 5$
How do I use the chi-square distribution table?
You can use the chi-square tables given to you in the exam to find the critical value
This is the threshold value that determines whether you reject the null hypothesis or not
To find the critical value from the tables, you need the significance level, $\alpha$, and the dof
The critical value is located in the cell where the relevant row and column intersect
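Looking up a critical value is a row-and-column lookup, which can be mimicked with a nested dictionary. The entries below are a small extract copied from a standard chi-square table (rounded to 2 decimal places) and are included only for illustration; in the exam, always read the value from the table provided.

```python
# Extract of a chi-square table: tail probability -> dof -> critical value
CRITICAL_VALUES = {
    0.05: {1: 3.84, 2: 5.99, 3: 7.81, 4: 9.49, 5: 11.07},
    0.01: {1: 6.63, 2: 9.21, 3: 11.34, 4: 13.28, 5: 15.09},
}


def critical_value(alpha, dof):
    """Return the table entry where the dof row meets the alpha column."""
    return CRITICAL_VALUES[alpha][dof]


# Fair die: 6 possible outcomes, so dof = 6 - 1 = 5
dof = 6 - 1
print(critical_value(0.05, dof))  # 11.07
```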
How do I conclude a hypothesis test?
Conclusions to a hypothesis test need to show two things:
a decision about the null hypothesis
an interpretation of this decision in the context of the question
To make the decision, compare the calculated goodness of fit value, $\chi^2$, to the critical value from the table
If $\chi^2 >$ critical value then the null hypothesis should be rejected
There is sufficient evidence that the expected distribution is not a suitable model for the data
If $\chi^2 <$ critical value then the null hypothesis should not be rejected
This does not prove that the observed and expected distributions are the same, only that there is insufficient evidence of a difference
The expected distribution may be a suitable model for the data
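The decision rule can be written as a single comparison. The sketch below applies it to the dice example, assuming a chi-square value of 3.4 (calculated from the dice table) and a critical value of 11.07 (5 degrees of freedom at the 5% level, from a standard table); the function name is illustrative.

```python
def decide(chi_square, critical_value):
    """Reject H0 only when the chi-square value exceeds the critical value."""
    if chi_square > critical_value:
        return "reject H0"
    return "do not reject H0"


# Dice example: 3.4 is below 11.07
print(decide(3.4, 11.07))   # do not reject H0

# A chi-square value above the critical value leads to rejection
print(decide(12.0, 11.07))  # reject H0
```

The returned decision is only the first half of a conclusion; it still needs to be interpreted in the context of the question.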
How can I perform a chi-square test on the calculator?
To complete a chi-square test on your calculator:
Add the expected values and the observed values into a table
Perform a chi-square goodness of fit test
Compare your calculated $\chi^2$ value with the critical value from the chi-square tables
Another calculator method is to:
Add the expected values and the observed values into a table
Perform a chi-square goodness of fit test
Compare the given significance level, $\alpha$, with the calculator's $p$-value
If using the $p$-value, remember
$p < \alpha$, reject the null hypothesis
$p > \alpha$, do not reject the null hypothesis
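To see where a calculator's p-value comes from, the upper-tail probability of the chi-square distribution can be sketched using only Python's standard `math` module. This is an illustrative implementation based on the closed-form tail formulas for whole-number degrees of freedom, not the method any particular calculator uses; the function name is hypothetical.

```python
import math


def chi_square_p_value(statistic, dof):
    """Upper-tail probability P(X >= statistic) for a chi-square distribution."""
    if statistic < 0 or dof < 1:
        raise ValueError("statistic must be >= 0 and dof must be >= 1")
    half = statistic / 2
    if dof % 2 == 0:
        # Even dof: exp(-x/2) times a finite sum of (x/2)^i / i!
        terms = sum(half ** i / math.factorial(i) for i in range(dof // 2))
        return math.exp(-half) * terms
    # Odd dof: a complementary error function term plus a finite sum
    tail = math.erfc(math.sqrt(half))
    extra = sum(
        half ** (i - 0.5) / math.gamma(i + 0.5)
        for i in range(1, (dof - 1) // 2 + 1)
    )
    return tail + math.exp(-half) * extra


# Dice example: chi-square value 3.4 with 5 degrees of freedom
alpha = 0.05
p = chi_square_p_value(3.4, 5)
print("reject H0" if p < alpha else "do not reject H0")  # do not reject H0
```

A small chi-square value gives a large p-value, which is why the inequality for the p-value points the opposite way to the one for the critical value.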
Examiner Tips and Tricks
Even if you perform the chi-square goodness of fit test on your calculator, it is still important to show all of your working to demonstrate full understanding. Therefore you should still calculate the $\chi^2$ value and the degrees of freedom.
If you compare the $p$-value with $\alpha$, don't forget that the inequalities are the opposite way round to when you compare the $\chi^2$ value with the critical value: a small $p$-value, like a large $\chi^2$ value, leads to rejecting the null hypothesis!
Worked Example
A game is meant to award points according to the probability distribution below.
Points | 2 | 4 | 8 | 10 |
---|---|---|---|---|
Probability | 0.6 | 0.2 | 0.15 | 0.05 |
The game is played by 100 people who are randomly selected, giving the results below.
Points | 2 | 4 | 8 | 10 |
---|---|---|---|---|
Frequency | 69 | 13 | 11 | 7 |
Test, at the 5% level of significance, whether or not the game is operating correctly.
Write the null and alternative hypotheses
$H_0$: The actual distribution of points awarded is the same as the expected distribution of points awarded
$H_a$: The actual distribution of points awarded is different to the expected distribution of points awarded
State the type of test being used
The correct inference procedure is a chi-square goodness of fit test at $\alpha = 0.05$
Calculate the expected values by multiplying each probability by the number of players (100)
Points | 2 | 4 | 8 | 10 |
---|---|---|---|---|
Probability | 0.6 | 0.2 | 0.15 | 0.05 |
Expected | 60 | 20 | 15 | 5 |
Verify the conditions for the test
All conditions for inference have been met:
The sample of players is randomly selected
All expected values are greater than or equal to 5
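Calculate the chi-square value, $\chi^2$
$\chi^2 = \frac{(69 - 60)^2}{60} + \frac{(13 - 20)^2}{20} + \frac{(11 - 15)^2}{15} + \frac{(7 - 5)^2}{5} = 1.35 + 2.45 + 1.0667 + 0.8 \approx 5.667$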
State the number of degrees of freedom
degrees of freedom = 4 - 1 = 3
Find the critical value from the chi-square tables
Find the row corresponding to 3 degrees of freedom and the column corresponding to a tail probability of 0.05
The critical value is 7.81
Compare the calculated value to the critical value and state the conclusion of the test
$\chi^2 \approx 5.667 < 7.81$, so $H_0$ is not rejected
Interpret the result in the context of the question
There is not sufficient evidence, at the 5% level of significance, that the actual distribution of points awarded is different to the expected distribution given in the table above
There is no convincing evidence that the game is not operating correctly
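The whole worked example can be checked with a short Python sketch. It assumes the critical value of 7.81 for 3 degrees of freedom at the 5% level, read from a standard chi-square table; exact fractions are used so that the expected counts carry no rounding error.

```python
from fractions import Fraction

# Model probabilities for 2, 4, 8 and 10 points, and the observed frequencies
probabilities = [Fraction("0.6"), Fraction("0.2"), Fraction("0.15"), Fraction("0.05")]
observed = [69, 13, 11, 7]
n = sum(observed)  # 100 players

# Expected counts: 60, 20, 15 and 5
expected = [n * p for p in probabilities]

# Large counts condition: every expected count is at least 5
large_counts = all(e >= 5 for e in expected)

# Chi-square value: sum of (observed - expected)^2 / expected
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

dof = len(observed) - 1   # 4 - 1 = 3
critical = 7.81           # assumed table value for dof = 3 at the 5% level
reject = chi_square > critical

print(large_counts)                 # True
print(round(float(chi_square), 3))  # 5.667
print(reject)                       # False
```

Since the chi-square value is below the critical value, the null hypothesis is not rejected, which matches the conclusion reached by hand.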