Confidence Intervals for Differences in Population Proportions (College Board AP® Statistics): Study Guide

Written by: Mark Curtis

Reviewed by: Dan Finlay

Updated on 23 October 2024

Two-sample z-interval for difference in population proportions

What is a confidence interval for the difference between two population proportions?

A confidence interval for the difference between two population proportions is
- a symmetric range of values centered about the difference between two sample proportions
- designed to capture the actual value of the difference between the two population proportions
Different samples generate different confidence intervals
- e.g. a difference of sample proportions of 0.2 may have a confidence interval of (0.15, 0.25)

How do I calculate a confidence interval for the difference between two population proportions?

The confidence interval for the difference between two population proportions is given by
- $\text{difference between sample proportions} \pm (\text{critical value}) \times (\text{standard error of the difference between the sample proportions})$
Where:
- The difference between the two sample proportions is calculated from the samples or is given to you
- The critical value is the relevant z-value
  - The critical value depends on the confidence level C%
- The standard error estimates how far the difference between the two sample proportions is likely to be from the difference between the two population proportions, and is given by $\sqrt{\frac{{\hat{p}}_{1} (1 - {\hat{p}}_{1})}{n_{1}} + \frac{{\hat{p}}_{2} (1 - {\hat{p}}_{2})}{n_{2}}}$
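The calculation described above can be sketched in a few lines of Python. This is only an illustration, not part of the exam method: the function name `two_prop_z_interval` and the sample values (0.6 and 0.5 from samples of 100 each) are made up for the example, and the critical value comes from the standard library's `NormalDist` rather than from the exam tables.

```python
from math import sqrt
from statistics import NormalDist


def two_prop_z_interval(p1_hat, n1, p2_hat, n2, confidence=0.95):
    """Return (lower, upper) limits for p1 - p2 (hypothetical helper)."""
    # Critical value z*: leaves (1 - confidence) / 2 in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of the difference between the sample proportions.
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se


# Illustrative (made-up) samples: 60% of 100 and 50% of 100.
lower, upper = two_prop_z_interval(0.6, 100, 0.5, 100, confidence=0.95)
print(round(lower, 4), round(upper, 4))
```

The interval is centered on the difference between the sample proportions (0.1 here), with the critical value and standard error setting how far it extends on each side.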

Examiner Tips and Tricks

The general formula for confidence intervals (including a table of standard errors) is given in the exam: $\text{statistic} \pm (\text{critical value}) (\text{standard error of statistic})$.

You will need to apply it appropriately using the difference between the sample proportions and the standard error of the difference of the sample proportions.

What are the conditions for a confidence interval for a difference in population proportions?

When calculating a two-sample z-interval for a difference in population proportions, you must show that it meets the following conditions:
- Items in the two samples (or experiment) must satisfy the independence condition
  - by verifying that data is collected by random sampling
  - or random assignment (in an experiment)
  - and, if sampling without replacement, showing that both sample sizes are less than 10% of their population size
- The sampling distribution of ${\hat{p}}_{1} - {\hat{p}}_{2}$ must be approximately normal, by verifying that
  - $n_{1} {\hat{p}}_{1} \geq 10$
  - $n_{1} (1 - {\hat{p}}_{1}) \geq 10$
  - $n_{2} {\hat{p}}_{2} \geq 10$
  - $n_{2} (1 - {\hat{p}}_{2}) \geq 10$
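The numerical parts of these conditions can be checked mechanically. The sketch below is an illustration only: the function names `large_counts_ok` and `ten_percent_ok` and the sample values are assumptions made for the example. Whether the data came from random sampling or random assignment cannot be checked from the numbers and must be read from the question.

```python
def large_counts_ok(p1_hat, n1, p2_hat, n2, threshold=10):
    """True if all four expected counts reach the threshold (hypothetical helper)."""
    counts = [
        n1 * p1_hat,
        n1 * (1 - p1_hat),
        n2 * p2_hat,
        n2 * (1 - p2_hat),
    ]
    return all(count >= threshold for count in counts)


def ten_percent_ok(n, population_size):
    """True if the sample is less than 10% of its population (hypothetical helper)."""
    return n < 0.10 * population_size


# Made-up samples: 40% of 50 and 12% of 50, from populations of 1000 each.
print(large_counts_ok(0.40, 50, 0.12, 50))               # 50 * 0.12 = 6 is below 10
print(large_counts_ok(0.40, 50, 0.12, 50, threshold=5))  # passes if the question uses 5
print(ten_percent_ok(50, 1000))
```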

Examiner Tips and Tricks

Some exam questions may change the four $\geq 10$ conditions into four $\geq 5$ conditions (changing the 10 into a 5), though this will be made clear in the question.

What is the margin of error?

The margin of error is the half-width of the confidence interval
- $\text{margin of error} = (\text{critical value}) (\text{standard error of the difference of the sample proportions})$
The confidence interval is
- $\text{difference in sample proportions} \pm \text{margin of error}$
The total width of a confidence interval is $2 \times \text{margin of error}$
You may be given an interval and asked to calculate its margin of error
- or another value, such as $n$
  - This involves forming and solving an equation
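As a rough sketch of working backwards from an interval, the code below recovers the margin of error as half the width, then rearranges the margin of error formula for $n$. It assumes both samples have the same size $n$, which is a simplification chosen for the example and not something every question will give you. The function names are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def margin_from_interval(lower, upper):
    """Margin of error is half the total width of the interval."""
    return (upper - lower) / 2


def equal_n_for_margin(p1_hat, p2_hat, margin, confidence=0.95):
    """Smallest common sample size n giving at most the stated margin.

    Assumes n1 = n2 = n, so that
    margin = z* * sqrt((p1_hat(1 - p1_hat) + p2_hat(1 - p2_hat)) / n).
    """
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    spread = p1_hat * (1 - p1_hat) + p2_hat * (1 - p2_hat)
    # Rearranged for n, then rounded up to a whole number of individuals.
    return ceil(z_star ** 2 * spread / margin ** 2)


print(margin_from_interval(0.15, 0.25))      # approximately 0.05
print(equal_n_for_margin(0.5, 0.5, 0.05))
```

Rounding up is used because rounding down would give a margin of error slightly larger than the one required.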

Examiner Tips and Tricks

You need to know that the width of a confidence interval increases as the confidence level increases, whereas it decreases as the sample sizes increase!
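Both effects can be seen numerically. The sketch below uses made-up sample proportions and sizes and a hypothetical helper, `interval_width`, to compare widths at different confidence levels and sample sizes.

```python
from math import sqrt
from statistics import NormalDist


def interval_width(p1_hat, n1, p2_hat, n2, confidence):
    """Total width = 2 * (critical value) * (standard error)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return 2 * z_star * se


# Same (made-up) samples, increasing confidence level: the width grows.
w90 = interval_width(0.40, 100, 0.30, 100, 0.90)
w95 = interval_width(0.40, 100, 0.30, 100, 0.95)
w99 = interval_width(0.40, 100, 0.30, 100, 0.99)
print(w90 < w95 < w99)

# Same confidence level, both sample sizes quadrupled: the width halves.
w95_large = interval_width(0.40, 400, 0.30, 400, 0.95)
print(round(w95 / w95_large, 6))
```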

How do I interpret a confidence interval for a difference in population proportions?

You must conclude calculations of a confidence interval by referring to the context
- Start by saying 'we can be C% confident that the interval from [lower limit] to [upper limit]...'
  - using the limits from the confidence interval
- then end with it capturing the difference between the population proportions in context
  - e.g. 'captures the actual difference between the proportion of left-handed students in School A and the proportion of left-handed students in School B'
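Confidence intervals for differences may have negative limits
- If both limits are negative, the interval suggests that the difference, $p_{1} - p_{2}$, is negative
  - so it is plausible that $p_{1} < p_{2}$
- If only the lower limit is negative, the interval contains zero, so the difference could plausibly be negative, zero or positive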

How do I use confidence intervals to justify a claim about a difference in population proportions?

If the difference in population proportions is claimed to be a specific value
- check if that value lies in your confidence interval
- If it does, the claimed value is plausible, so the sample data does not provide sufficient evidence that the difference in population proportions is different from that value
- If it does not, the sample data provides sufficient evidence that the difference in population proportions is not that value
Look out for confidence intervals for differences that contain zero
- This means $p_{1} - p_{2} = 0$ is a plausible value, so there is not sufficient evidence of a difference between $p_{1}$ and $p_{2}$
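The check itself is just a comparison against the two limits. The sketch below uses a hypothetical helper, `claim_is_plausible`, and made-up intervals. A value inside the interval is treated as plausible, which is not the same as proving that the claim is true.

```python
def claim_is_plausible(lower, upper, claimed_difference):
    """True if the claimed value of p1 - p2 lies inside the interval."""
    return lower <= claimed_difference <= upper


# Made-up interval (0.15, 0.25) for p1 - p2.
print(claim_is_plausible(0.15, 0.25, 0.20))  # inside: plausible, not proven
print(claim_is_plausible(0.15, 0.25, 0.00))  # outside: evidence that p1 and p2 differ

# Made-up interval (-0.03, 0.17) contains zero: no convincing evidence of a difference.
print(claim_is_plausible(-0.03, 0.17, 0.00))
```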

Worked Example

Nova University and Terra University have over 10,000 students each. A random sample of 200 students at Nova University and a random sample of 150 students from Terra University were asked to complete a survey to measure their level of smartphone addiction. The results showed that 35% of the students sampled from Nova University were addicted to their smartphones, while 28% of the students sampled from Terra University were addicted to their smartphones.

Construct a 95% confidence interval for the difference in the proportion of students addicted to smartphones at Nova University and the proportion of students addicted to smartphones at Terra University.

Answer:

Define the population parameters, $p_{1}$ and $p_{2}$

Let $p_{1}$ be the proportion of all students at Nova University who are addicted to their smartphones

Let $p_{2}$ be the proportion of all students at Terra University who are addicted to their smartphones

State the type of interval being used and verify that the conditions for the interval are met

The correct inference procedure is a two-sample z-interval for the difference in population proportions at a 95% confidence level

The independence condition is satisfied, as
- both samples were selected randomly
- the sample size from Nova University, 200, is less than 10% of the total number of students at Nova University (10% of 'over 10,000' is 'over 1000')
- the sample size from Terra University, 150, is less than 10% of the total number of students at Terra University (10% of 'over 10,000' is 'over 1000')
  - These conditions are required as sampling was conducted without replacement
The sample sizes are large enough for the sampling distribution of the difference in sample proportions to be approximately normally distributed, because the following conditions are satisfied
- $n_{1} {\hat{p}}_{1} = 200 \cdot 0.35 = 70 \geq 10$
- $n_{1} (1 - {\hat{p}}_{1}) = 200 \cdot (1 - 0.35) = 130 \geq 10$
- $n_{2} {\hat{p}}_{2} = 150 \cdot 0.28 = 42 \geq 10$
- $n_{2} (1 - {\hat{p}}_{2}) = 150 \cdot (1 - 0.28) = 108 \geq 10$

List the sample sizes, $n_{1}$ and $n_{2}$ , the sample proportions, ${\hat{p}}_{1}$ and ${\hat{p}}_{2}$ , and calculate the standard error of the difference in sample proportions, $\sqrt{\frac{{\hat{p}}_{1} (1 - {\hat{p}}_{1})}{n_{1}} + \frac{{\hat{p}}_{2} (1 - {\hat{p}}_{2})}{n_{2}}}$

$\begin{array}{rcl} n_{1} & = & 200 \\ n_{2} & = & 150 \\ {\hat{p}}_{1} & = & 0.35 \\ {\hat{p}}_{2} & = & 0.28 \\ \sqrt{\frac{{\hat{p}}_{1} (1 - {\hat{p}}_{1})}{n_{1}} + \frac{{\hat{p}}_{2} (1 - {\hat{p}}_{2})}{n_{2}}} & = & \sqrt{\frac{0.35 (1 - 0.35)}{200} + \frac{0.28 (1 - 0.28)}{150}} = 0.0498146\ldots \end{array}$

Find the z-score (critical value) for a confidence level of 95%, e.g. from the tables

Remember that a confidence level of 95% is 5% in both tails combined, so use 2.5% for a single tail in the table
(Alternatively, the $t_{\infty}$ row of the t-tables gives the z-scores, with the corresponding 'Confidence level C' shown below it)

z-score = 1.960
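If you want to check a table lookup outside the exam, the same value can be found with Python's standard library. This is a sketch for checking only, since the exam expects the tables or a calculator.

```python
from statistics import NormalDist

confidence = 0.95
tail = (1 - confidence) / 2              # 2.5% in each tail
z_star = NormalDist().inv_cdf(1 - tail)  # z-value with 97.5% of the area to its left
print(round(z_star, 3))
```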

Calculate the confidence interval using the formula given to you in the exam, $\text{Confidence interval} = \text{statistic} \pm (\text{critical value}) (\text{standard error of statistic})$

$\begin{array}{rcl} CI & = & ({\hat{p}}_{1} - {\hat{p}}_{2}) \pm z \cdot \sqrt{\frac{{\hat{p}}_{1} (1 - {\hat{p}}_{1})}{n_{1}} + \frac{{\hat{p}}_{2} (1 - {\hat{p}}_{2})}{n_{2}}} \\ & = & (0.35 - 0.28) \pm 1.960 \cdot 0.0498146\ldots \end{array}$

State the confidence interval

$(- 0.0276, 0.1676)$

Explain the confidence interval in the context of the question

We can be 95% confident that the interval from -0.0276 to 0.1676 captures the actual value of the difference in the proportion of students addicted to smartphones at Nova University and the proportion of students addicted to smartphones at Terra University
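The arithmetic in this worked example can be checked with a short script. It uses the values from the question and the table critical value of 1.960. The variable names are chosen for this sketch only.

```python
from math import sqrt

# Values from the worked example.
n1, p1_hat = 200, 0.35   # Nova University sample
n2, p2_hat = 150, 0.28   # Terra University sample
z_star = 1.960           # critical value for 95% confidence, from the tables

# Standard error of the difference in sample proportions.
se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
margin = z_star * se
diff = p1_hat - p2_hat

lower, upper = diff - margin, diff + margin
print(round(se, 7))                       # standard error
print(round(lower, 4), round(upper, 4))   # interval limits
```

Because the interval runs from a negative lower limit to a positive upper limit, it contains zero, so a difference of zero between the two population proportions remains plausible.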
