Goodness of Fit Test (DP IB Applications & Interpretation (AI)): Revision Note

Chi-Squared GOF: Uniform

What is a chi-squared goodness of fit test for a given distribution?

  • A chi-squared (χ²) goodness of fit test is used to test whether data from a sample suggests that the population has a given distribution

  • This could be that: 

    • the proportions of the population in the different categories follow a given ratio

    • the population follows a uniform distribution

      • This means all outcomes are equally likely

What are the steps for a chi-squared goodness of fit test for a given distribution?

  • STEP 1: Write the hypotheses

    • H0 : Variable X can be modelled by the given distribution

    • H1 : Variable X cannot be modelled by the given distribution

      • Make sure you clearly write what the variable is and don’t just call it X

  • STEP 2: Calculate the expected frequencies

    • Split the total frequency using the given ratio

    • For a uniform distribution: divide the total frequency N by the number of possible outcomes k

  • STEP 3: Calculate the degrees of freedom for the test

    • For k possible outcomes

    • The number of degrees of freedom is ν = k − 1

  • STEP 4: Enter the frequencies and the degrees of freedom into your GDC

    • Enter the observed and expected frequencies as two separate lists

    • Your GDC will then give you the χ² statistic and its p-value

    • The χ² statistic is denoted as χ²_calc

  • STEP 5: Decide whether there is evidence to reject the null hypothesis

    • EITHER compare the χ² statistic with the given critical value

      • If χ² statistic > critical value then reject H0

      • If χ² statistic < critical value then accept H0

    • OR compare the p-value with the given significance level

      • If p-value < significance level then reject H0

      • If p-value > significance level then accept H0

  • STEP 6: Write your conclusion

    • If you reject H0

      • There is sufficient evidence to suggest that variable X does not follow the given distribution

      • Therefore this suggests that the data is not distributed as claimed

    • If you accept H0

      • There is insufficient evidence to suggest that variable X does not follow the given distribution

      • Therefore this suggests that the data is distributed as claimed
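For reference, the statistic that the GDC reports in STEP 4 is usually defined as below. This is a sketch in notation that the note itself does not use: O_i and E_i are assumed names for the observed and expected frequencies of outcome i.

```latex
\chi^2_{\text{calc}} = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i},
\qquad
E_i = \frac{N}{k} \ \text{(uniform case)},
\qquad
\nu = k - 1
```

Large differences between observed and expected frequencies give a large χ²_calc, which is why a value above the critical value leads to rejecting H0.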

Worked Example

A car salesman is interested in how his sales are distributed and records his sales results over a period of six weeks. The data is shown in the table.

| Week | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Number of sales | 15 | 17 | 11 | 21 | 14 | 12 |

A χ² goodness of fit test is to be performed on the data at the 5% significance level to find out whether the data fits a uniform distribution.

a) Find the expected frequency of sales for each week if the data were uniformly distributed.


b) Write down the null and alternative hypotheses.


c) Write down the number of degrees of freedom for this test.


d) Calculate the p-value.


e) State the conclusion of the test. Give a reason for your answer.

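The car sales example above can be checked with a short Python sketch that uses only the standard library. It is an illustration, not the GDC method: the helper `chi2_sf` is a hypothetical name, and it assumes a whole-number degrees of freedom so that the upper-tail probability has a closed form.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-squared >= x) for a whole-number df."""
    half = x / 2.0
    if df % 2 == 0:
        q, a = math.exp(-half), 1.0                 # exact value for df = 2
    else:
        q, a = math.erfc(math.sqrt(half)), 0.5      # exact value for df = 1
    while a < df / 2:                               # step df up by 2 each pass
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q

observed = [15, 17, 11, 21, 14, 12]   # sales in weeks 1 to 6
total = sum(observed)                 # N = 90
k = len(observed)                     # 6 possible outcomes
expected = [total / k] * k            # a) uniform: N / k for every week
df = k - 1                            # c) degrees of freedom
chi2_calc = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2_sf(chi2_calc, df)      # d) p-value
reject_h0 = p_value < 0.05            # e) compare with the 5% level

print(expected[0], df, round(chi2_calc, 3), round(p_value, 3), reject_h0)
```

Under these assumptions the expected frequency is 15 per week, ν = 5, χ²_calc = 4.4 and the p-value is roughly 0.49. As this is greater than 0.05, H0 is not rejected: there is insufficient evidence that the sales are not uniformly distributed.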

Chi-Squared GOF: Binomial

What is a chi-squared goodness of fit test for a binomial distribution?

  • A chi-squared (χ²) goodness of fit test is used to test whether data from a sample suggests that the population has a binomial distribution

    • You will be given the value of p for the binomial distribution

What are the steps for a chi-squared goodness of fit test for a binomial distribution?

  • STEP 1: Write the hypotheses

    • H0 : Variable X can be modelled by the binomial distribution B(n, p)

    • H1 : Variable X cannot be modelled by the binomial distribution B(n, p)

      • Make sure you clearly write what the variable is and don’t just call it X

      • State the values of n and p clearly

  • STEP 2: Calculate the expected frequencies

    • Find the probability of each outcome using the binomial distribution, P(X = x)

    • Multiply the probability by the total frequency, P(X = x) × N

  • STEP 3: Calculate the degrees of freedom for the test

    • For k outcomes

    • The number of degrees of freedom is ν = k − 1

  • STEP 4: Enter the frequencies and the degrees of freedom into your GDC

    • Enter the observed and expected frequencies as two separate lists

    • Your GDC will then give you the χ² statistic and its p-value

    • The χ² statistic is denoted as χ²_calc

  • STEP 5: Decide whether there is evidence to reject the null hypothesis

    • EITHER compare the χ² statistic with the given critical value

      • If χ² statistic > critical value then reject H0

      • If χ² statistic < critical value then accept H0

    • OR compare the p-value with the given significance level

      • If p-value < significance level then reject H0

      • If p-value > significance level then accept H0

  • STEP 6: Write your conclusion

    • If you reject H0

      • There is sufficient evidence to suggest that variable X does not follow the binomial distribution B(n, p)

      • Therefore this suggests that the data does not follow B(n, p)

    • If you accept H0

      • There is insufficient evidence to suggest that variable X does not follow the binomial distribution B(n, p)

      • Therefore this suggests that the data follows B(n, p)

Worked Example

A stage in a video game has three boss battles. 1000 people try this stage of the video game and the number of bosses defeated by each player is recorded.

| Number of bosses defeated | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| Frequency | 490 | 384 | 111 | 15 |

A χ² goodness of fit test at the 5% significance level is used to decide whether the number of bosses defeated can be modelled by a binomial distribution with a 20% probability of success.

a) State the null and alternative hypotheses.


b) Assuming the binomial distribution holds, find the expected number of people that would defeat exactly one boss.


c) Calculate the p-value for the test.


d) State the conclusion of the test. Give a reason for your answer.

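The boss battle example above can be checked in the same way. This is a minimal sketch using only the Python standard library, assuming the model B(3, 0.2) stated in the question; `chi2_sf` is a hypothetical helper that is valid only for a whole-number degrees of freedom.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(chi-squared >= x) for a whole-number df."""
    half = x / 2.0
    if df % 2 == 0:
        q, a = math.exp(-half), 1.0                 # exact value for df = 2
    else:
        q, a = math.erfc(math.sqrt(half)), 0.5      # exact value for df = 1
    while a < df / 2:                               # step df up by 2 each pass
        q += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return q

n, p = 3, 0.2                          # three bosses, 20% chance of success
observed = [490, 384, 111, 15]         # players defeating 0, 1, 2, 3 bosses
total = sum(observed)                  # N = 1000

# P(X = x) for x = 0..n, then expected frequency = P(X = x) * N
probs = [math.comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]
expected = [total * pr for pr in probs]

df = len(observed) - 1                 # nu = k - 1 = 3
chi2_calc = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p_value = chi2_sf(chi2_calc, df)
reject_h0 = p_value < 0.05             # compare with the 5% level

print([round(e, 2) for e in expected], round(chi2_calc, 3), reject_h0)
```

Under these assumptions the expected number of people defeating exactly one boss is 384, χ²_calc is about 9.41 and the p-value is about 0.024. As this is less than 0.05, H0 is rejected: there is sufficient evidence that the number of bosses defeated does not follow B(3, 0.2).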

Chi-Squared GOF: Normal

What is a chi-squared goodness of fit test for a normal distribution?

  • A chi-squared (χ²) goodness of fit test is used to test whether data from a sample suggests that the population has a normal distribution

    • You will be given the value of μ and σ for the normal distribution

What are the steps for a chi-squared goodness of fit test for a normal distribution?

  • STEP 1: Write the hypotheses

    • H0 : Variable X can be modelled by the normal distribution N(μ, σ²)

    • H1 : Variable X cannot be modelled by the normal distribution N(μ, σ²)

      • Make sure you clearly write what the variable is and don’t just call it X

      • State the values of μ and σ clearly

  • STEP 2: Calculate the expected frequencies

    • Find the probability of each class interval using the normal distribution, P(a < X < b)

      • Beware of unbounded inequalities P(X < b) or P(X > a) for the class intervals on the 'ends'

    • Multiply the probability by the total frequency, P(a < X < b) × N

  • STEP 3: Calculate the degrees of freedom for the test

    • For k class intervals

    • The number of degrees of freedom is ν = k − 1

  • STEP 4: Enter the frequencies and the degrees of freedom into your GDC

    • Enter the observed and expected frequencies as two separate lists

    • Your GDC will then give you the χ² statistic and its p-value

    • The χ² statistic is denoted as χ²_calc

  • STEP 5: Decide whether there is evidence to reject the null hypothesis

    • EITHER compare the χ² statistic with the given critical value

      • If χ² statistic > critical value then reject H0

      • If χ² statistic < critical value then accept H0

    • OR compare the p-value with the given significance level

      • If p-value < significance level then reject H0

      • If p-value > significance level then accept H0

  • STEP 6: Write your conclusion

    • If you reject H0

      • There is sufficient evidence to suggest that variable X does not follow the normal distribution N(μ, σ²)

      • Therefore this suggests that the data does not follow N(μ, σ²)

    • If you accept H0

      • There is insufficient evidence to suggest that variable X does not follow the normal distribution N(μ, σ²)

      • Therefore this suggests that the data follows N(μ, σ²)

Worked Example

300 marbled ducks in Quacktown are weighed and the results are shown in the table below.

| Mass (g) | Frequency |
|---|---|
| m < 470 | 10 |
| 470 ≤ m < 520 | 158 |
| 520 ≤ m < 570 | 123 |
| m ≥ 570 | 9 |

A χ² goodness of fit test at the 10% significance level is used to decide whether the mass of a marbled duck can be modelled by a normal distribution with mean 520 g and standard deviation 30 g.

a) Calculate the expected frequencies, giving your answers correct to 2 decimal places.


b) Write down the null and alternative hypotheses.


c) Calculate the χ² statistic.


d) Given that the critical value is 6.251, state the conclusion of the test. Give a reason for your answer.

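The marbled duck example above can be sketched in Python with the standard library's `statistics.NormalDist`. This is an illustration under the model N(520, 30²) stated in the question. The variable names are assumptions, and the critical value 6.251 is taken from part d) rather than calculated.

```python
from statistics import NormalDist

model = NormalDist(mu=520, sigma=30)   # mean 520 g, standard deviation 30 g
boundaries = [470, 520, 570]           # class boundaries in grams
observed = [10, 158, 123, 9]           # frequencies from the table
total = sum(observed)                  # N = 300

# Cumulative probabilities at each boundary; the two end classes are
# unbounded, so the list starts at 0 and finishes at 1.
cdf = [0.0] + [model.cdf(b) for b in boundaries] + [1.0]
probs = [upper - lower for lower, upper in zip(cdf, cdf[1:])]
expected = [total * pr for pr in probs]        # a) expected frequencies

chi2_calc = sum((o - e) ** 2 / e for o, e in zip(observed, expected))  # c)
critical_value = 6.251                 # given in part d) (nu = 3, 10% level)
reject_h0 = chi2_calc > critical_value # d) compare statistic with critical value

print([round(e, 2) for e in expected], round(chi2_calc, 2), reject_h0)
```

Under these assumptions the expected frequencies are about 14.34, 135.66, 135.66 and 14.34, and χ²_calc is about 8.16. As this is greater than 6.251, H0 is rejected: there is sufficient evidence that the mass does not follow N(520, 30²).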



Author: Dan Finlay

Expertise: Maths Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.