Hypothesis Tests for Differences in Population Proportions (College Board AP® Statistics)

Revision Note

Mark Curtis

Expertise

Maths

Two-sample z-test for difference in population proportions

What is a two-sample z-test for a difference in population proportions?

  • A two-sample z-test is used to test whether or not the population proportions of two independent populations, $p_1$ and $p_2$, are equal

    • One random sample of size $n_1$ is taken from the first population

    • A different random sample of size $n_2$ is taken from the second population

    • The sample proportions are $\hat{p}_1$ and $\hat{p}_2$

      • The difference in sample proportions is $\hat{p}_1 - \hat{p}_2$

What are the hypotheses?

  • The null hypothesis, $H_0$, is the assumption that there is no difference between the population proportions

    • e.g. $H_0$: The proportion of left-handed students at both schools is equal, $p_1 = p_2$

      • It is assumed to be correct unless the sample provides sufficient evidence against it

      • It can also be written as $p_1 - p_2 = 0$

  • The alternative hypothesis, $H_a$, is how you think the population proportions might be different to each other

    • e.g. $H_a$: The proportion of left-handed students in School A is greater than in School B, $p_1 > p_2$

    • Remember that a z-test could be one-tailed ($p_1 > p_2$ or $p_1 < p_2$) or two-tailed ($p_1 \neq p_2$)

Exam Tip

When writing out your hypotheses, always fully define the symbol used for the population parameters in context, e.g. '... where $p_1$ is the proportion of left-handed students in School A and $p_2$ is the proportion of left-handed students in School B'.

What are the conditions required?

  • When performing a two-sample z-test for a difference in population proportions, you must show that it meets the following conditions:

    • Items in the two samples (or experiment) must satisfy the independence condition

      • by verifying that data is collected by random sampling

      • or random assignment (in an experiment)

      • and, if sampling without replacement, showing that both sample sizes are less than 10% of their population size

    • The sampling distribution of $\hat{p}_1 - \hat{p}_2$ must be approximately normal

      • by first calculating the combined proportion, $\hat{p}_c$, given by $\hat{p}_c = \frac{X_1 + X_2}{n_1 + n_2}$ (which assumes the null hypothesis, $p_1 = p_2$) where $X_1 = n_1 \hat{p}_1$ (the number of successes in the first sample) and $X_2 = n_2 \hat{p}_2$ (the number of successes in the second sample)

      • then using $\hat{p}_c$ to verify that $n_1 \hat{p}_c \geq 10$, $n_1 (1 - \hat{p}_c) \geq 10$, $n_2 \hat{p}_c \geq 10$ and $n_2 (1 - \hat{p}_c) \geq 10$

    • The combined proportion, $\hat{p}_c$, is also called the pooled proportion

      • It can only be used when $p_1 = p_2$ is assumed (as it is under the null hypothesis)
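
The normality check above can be sketched in a few lines of Python. This is only an illustration: the function names `pooled_proportion` and `large_counts_ok` are made up for this note, and the counts used (70 successes out of 200, 42 out of 150) are the ones from the worked example further down the page.

```python
def pooled_proportion(x1, n1, x2, n2):
    """Combined (pooled) proportion: total successes over total sample size."""
    return (x1 + x2) / (n1 + n2)


def large_counts_ok(n1, n2, p_c, threshold=10):
    """Return True if all four expected counts are at least the threshold."""
    counts = [n1 * p_c, n1 * (1 - p_c), n2 * p_c, n2 * (1 - p_c)]
    return all(count >= threshold for count in counts)


# 70 successes out of 200 in sample 1, 42 successes out of 150 in sample 2
p_c = pooled_proportion(70, 200, 42, 150)
print(p_c)                             # 0.32
print(large_counts_ok(200, 150, p_c))  # True
```

The `threshold` argument defaults to 10, but can be set to 5 if a question states that version of the condition.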

Exam Tip

The formula for the combined proportion, $\hat{p}_c = \frac{X_1 + X_2}{n_1 + n_2}$, is given in the exam, but you need to learn that $X_1 = n_1 \hat{p}_1$ (the number of successes in the first sample) and $X_2 = n_2 \hat{p}_2$ (the number of successes in the second sample).

Exam Tip

Some exam questions may change the four $\geq 10$ conditions into four $\geq 5$ conditions (changing the 10 into a 5), though this will be made clear in the question.

How do I calculate the standardized test statistic?

  • The standardized test statistic for a difference in sample proportions is a z-score given by:

    • $z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}_c (1 - \hat{p}_c) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$

      • where $\hat{p}_1$ and $\hat{p}_2$ are the sample proportions

      • $n_1$ and $n_2$ are the sample sizes

      • $\hat{p}_c$ is the combined proportion given by $\hat{p}_c = \frac{X_1 + X_2}{n_1 + n_2}$ where $X_1 = n_1 \hat{p}_1$ and $X_2 = n_2 \hat{p}_2$

      • and the zero, 0, highlights that the difference in population proportions is zero under the null hypothesis, $p_1 - p_2 = 0$
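
As a rough sketch, the test statistic could be computed like this in Python. The function name `two_proportion_z` is hypothetical, and the inputs are the success counts and sample sizes from the worked example on this page (70 of 200 and 42 of 150).

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """z-score for a difference in sample proportions, using the pooled proportion."""
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # combined (pooled) proportion
    standard_error = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    # The "- 0" is the hypothesized difference under the null hypothesis
    return ((p1_hat - p2_hat) - 0) / standard_error


z = two_proportion_z(70, 200, 42, 150)
print(round(z, 4))  # 1.3893
```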

Exam Tip

The formula for the standardized test statistic is given in the exam, $\frac{\text{statistic} - \text{parameter}}{\text{standard error of the statistic}}$, along with tables of parameters and standard errors.

There are two different standard errors for the difference in sample proportions given in the exam. For hypothesis testing, you need the second one where it says '$p_1 = p_2$ is assumed'!

How do I calculate the p-value?

  • The p-value is the probability of obtaining a test statistic as extreme as, or more extreme than, the one calculated from the two samples, assuming the null hypothesis is true

  • Use the standard normal distribution, Z, to calculate the probability of being in the extreme region (tail) that extends from the z-score given by the formula above

    • You can use either the z-tables or a calculator to find this probability

  • For a two-tail test, remember to work out the total probability across both tails

    • You can double the probability found in one tail
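
If software is available instead of tables, the tail probabilities can be sketched with the standard normal distribution from Python's standard library (`statistics.NormalDist`). The helper name `p_value_from_z` and its `tail` argument are assumptions made for this illustration.

```python
from statistics import NormalDist


def p_value_from_z(z, tail="two"):
    """Tail probability under the standard normal distribution Z."""
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "greater":  # H_a: p_1 > p_2
        return 1 - standard_normal.cdf(z)
    if tail == "less":     # H_a: p_1 < p_2
        return standard_normal.cdf(z)
    if tail == "two":      # H_a: p_1 != p_2, total across both tails
        return 2 * (1 - standard_normal.cdf(abs(z)))
    raise ValueError("tail must be 'greater', 'less' or 'two'")


one_tail = p_value_from_z(1.39, tail="greater")
print(round(one_tail, 4))  # 0.0823
```

A two-tail value computed this way may differ slightly in the last decimal place from one found by doubling a rounded table value.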

How do I conclude a hypothesis test?

  • Conclusions to a hypothesis test need to show two things:

    • a decision about the null hypothesis

    • an interpretation of this decision in the context of the question

  • To make the decision, compare the p-value to the significance level, $\alpha$

    • If the p-value is less than $\alpha$, then the null hypothesis should be rejected

    • If the p-value is greater than $\alpha$, then the null hypothesis should not be rejected

  • In a two-tailed test, the p-value is double the one-tail probability, so compare this doubled value to $\alpha$

Exam Tip

Remember that the test should be interpreted within the context of the question.

Use the same language in your conclusion that is used in the problem, e.g. 'The data provides sufficient evidence that the proportion of left-handed students in School A is greater than the proportion of left-handed students in School B'.

What are the steps on a calculator?

  • When using a calculator to conduct a z-test for a difference in population proportions, you must still write down all steps of the hypothesis testing process:

    • State the null and alternative hypotheses and clearly define your parameter

    • Describe the test being used and show that the situation meets the conditions required

    • Calculate the standardized test statistic (z-score)

    • Calculate the p-value using your calculator

    • Compare the p-value to the significance level

    • Write down the conclusion to the test and interpret it in the context of the problem

Exam Tip

Even if you perform a z-test for a difference in population proportions on your calculator, it is still important to show all of your working to demonstrate full understanding, including calculating the z-score.

Worked Example

Nova University and Terra University have over 10,000 students each. A random sample of 200 students at Nova University and a random sample of 150 students from Terra University were asked to complete a survey to measure their level of smartphone addiction. The results showed that 35% of the students sampled from Nova University were addicted to their smartphones, while 28% of the students sampled from Terra University were addicted to their smartphones.

Is there sufficient evidence, at a 0.05 level of significance, to conclude that there is a difference in the proportion of students addicted to smartphones at Nova University and Terra University?

Answer:

State the type of test being used and verify the conditions for the test

The correct inference procedure is a two-sample z-test for the difference in population proportions with $\alpha = 0.05$

  • The independence condition is satisfied, as

    • both samples were selected randomly

    • the sample size from Nova University, 200, is less than 10% of the total number of students at Nova University (10% of 'over 10,000' is 'over 1000')

    • the sample size from Terra University, 150, is less than 10% of the total number of students at Terra University (10% of 'over 10,000' is 'over 1000')

      • These conditions are required as sampling was conducted without replacement

  • The sample size is large enough for the sampling distribution of the difference in sample proportions to be approximately normally distributed, because

    • the combined proportion is $\hat{p}_c = \frac{X_1 + X_2}{n_1 + n_2}$ where $X_1 = n_1 \hat{p}_1$ and $X_2 = n_2 \hat{p}_2$

      • giving $\hat{p}_c = \frac{200 \times 0.35 + 150 \times 0.28}{200 + 150} = 0.32$

    • and the following conditions are satisfied

      • $n_1 \hat{p}_c = 200 \times 0.32 = 64 \geq 10$

      • $n_1 (1 - \hat{p}_c) = 200 \times (1 - 0.32) = 136 \geq 10$

      • $n_2 \hat{p}_c = 150 \times 0.32 = 48 \geq 10$

      • $n_2 (1 - \hat{p}_c) = 150 \times (1 - 0.32) = 102 \geq 10$

Define the population parameters, $p_1$ and $p_2$

Let $p_1$ be the proportion of all students at Nova University who are addicted to their smartphones

Let $p_2$ be the proportion of all students at Terra University who are addicted to their smartphones

Write the null and alternative hypotheses

This will be a two-tailed test, as the question asks about a difference but does not specify a direction

$H_0: p_1 = p_2$
$H_a: p_1 \neq p_2$

Calculate the standardized test statistic

$$
\begin{aligned}
z &= \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}_c (1 - \hat{p}_c) \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}} \\
&= \frac{(0.35 - 0.28) - 0}{\sqrt{0.32 (1 - 0.32) \left( \frac{1}{200} + \frac{1}{150} \right)}} \\
&= 1.389297\ldots
\end{aligned}
$$

Find the p-value for one of the tails, $P(Z > 1.389297\ldots)$, e.g. from the z-tables

$1 - 0.9177 = 0.0823$

Double this probability to find the p-value for both tails

p-value $= 0.0823 \times 2 = 0.1646$

Compare this probability to the significance level and state the conclusion of the test

$0.1646 > 0.05$, so the p-value is greater than $\alpha$

$H_0$ is not rejected

Interpret this result in the context of the question

There is not sufficient evidence to conclude that there is a difference in the proportion of students addicted to smartphones at Nova University and Terra University
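
As an optional check of the arithmetic in this worked example, the whole calculation can be sketched in Python. The function name `two_proportion_z_test` is made up for this illustration. It uses the unrounded z-score, so the p-value it reports (approximately 0.165) can differ in the last decimal place from the table-based 0.1646.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2, alpha=0.05):
    """Two-tailed two-sample z-test for a difference in population proportions."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)  # pooled proportion, assumes p_1 = p_2
    standard_error = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # total across both tails
    return z, p_value, p_value < alpha


# 35% of 200 is 70 students (Nova); 28% of 150 is 42 students (Terra)
z, p_value, reject_null = two_proportion_z_test(70, 200, 42, 150)
print(f"z = {z:.4f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

The code only checks the numbers. The conditions, hypotheses and conclusion in context still need to be written out as shown above.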

Author: Mark Curtis

Mark graduated twice from the University of Oxford: once in 2009 with a First in Mathematics, then again in 2013 with a PhD (DPhil) in Mathematics. He has had nine successful years as a secondary school teacher, specialising in A-Level Further Maths and running extension classes for Oxbridge Maths applicants. Alongside his teaching, he has written five internal textbooks, introduced new spiralling school curriculums and trained other Maths teachers through outreach programmes.