Confidence Intervals for Differences in Population Means (College Board AP® Statistics) : Study Guide

Written by: Naomi C

Reviewed by: Dan Finlay

Updated on 23 October 2024

Two-sample t-interval for difference in population means

What is a confidence interval for the difference between two population means?

A confidence interval for the difference between two population means is
- a symmetric range of values centered on the difference between the two sample means
- designed to capture the actual value of the difference between two population means
Different samples generate different confidence intervals
- e.g. a difference in sample means of 5 may have a confidence interval of (4.5, 5.5)

How do I calculate a confidence interval for the difference between two population means?

The confidence interval for the difference between two population means is given by
- $\text{difference between sample means} \pm (\text{critical value}) \times (\text{standard error of the difference between sample means})$
Where:
- The difference between the two sample means is calculated from the samples or is given to you
- The critical value is the relevant t-value
  - The critical value depends on the confidence level C%
- The standard error estimates how far the difference between the two sample means is likely to be from the difference between the two population means, $\sqrt{\frac{{s_{A}}^{2}}{n_{A}} + \frac{{s_{B}}^{2}}{n_{B}}}$
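
As a rough illustration, the formula above can be sketched in Python. The function name `two_sample_t_interval`, its parameter names and the summary statistics in the example call are assumptions made for this sketch; the critical value is passed in by hand (read from a t-table) rather than computed.

```python
import math

def two_sample_t_interval(mean_a, mean_b, s_a, s_b, n_a, n_b, t_star):
    """Return (lower, upper) for the difference mu_A - mu_B.

    t_star is the critical value, looked up separately in a t-table.
    """
    # Standard error of the difference between the two sample means
    standard_error = math.sqrt(s_a**2 / n_a + s_b**2 / n_b)
    # Margin of error = critical value x standard error
    margin = t_star * standard_error
    difference = mean_a - mean_b
    return difference - margin, difference + margin

# Hypothetical summary statistics, chosen only to show the call
lower, upper = two_sample_t_interval(50.0, 45.0, 4.0, 3.0, 16, 9, 2.306)
print(round(lower, 3), round(upper, 3))
```

The interval is always centered on the difference between the sample means, so the two limits sit the same distance either side of it.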

Examiner Tips and Tricks

The general formula for confidence intervals (including a table of standard errors) is given in the exam: $\text{statistic} \pm (\text{critical value}) \times (\text{standard error of statistic})$ .

You will need to apply it appropriately using the difference between the sample means and the standard error of the difference of the sample means.

What are conditions for a confidence interval for a difference in population means?

When calculating a confidence interval, you must show that:
- the two samples are independent of each other
- items in the samples (or experiments) must satisfy the independence condition
  - by verifying that data is collected by random sampling
  - or random assignment (in an experiment)
  - and, if sampling without replacement, showing that the sample size is less than 10% of the population size
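- the sampling distribution of the difference in sample means is approximately normal
  - This holds if both populations are approximately normally distributed
  - or if both sample sizes are large (at least 30)
  - Otherwise, the data in both samples should be approximately symmetric with no outliers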
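
The 10% check in the independence condition is simple arithmetic. A minimal sketch, assuming a hypothetical helper name and made-up population sizes:

```python
def ten_percent_condition_met(sample_size, population_size):
    """True when the sample is less than 10% of the population."""
    # Integer arithmetic avoids any floating-point rounding
    return 10 * sample_size < population_size

# Hypothetical numbers: 18 items sampled from populations of 2000 and 100
print(ten_percent_condition_met(18, 2000))
print(ten_percent_condition_met(18, 100))
```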

What is the margin of error?

The margin of error is the half-width of the confidence interval
- $\text{margin of error} = (\text{critical value}) \times (\text{standard error of the difference of the sample means})$
The confidence interval is
- $\text{difference in sample means} \pm \text{margin of error}$
The total width of a confidence interval is $2 \times \text{margin of error}$
You may be given an interval and asked to calculate its margin of error
- or another value, such as $n$
  - This involves forming and solving an equation
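
Working backwards from an interval can be sketched as follows. The function names are hypothetical, and the sample-size helper assumes both samples have the same size $n$ and that the critical value is held fixed, which is a simplification because the t critical value itself changes with $n$.

```python
import math

def margin_from_interval(lower, upper):
    """Margin of error is half the total width of the interval."""
    return (upper - lower) / 2

def difference_from_interval(lower, upper):
    """The difference in sample means is the midpoint of the interval."""
    return (lower + upper) / 2

def equal_sample_size_needed(margin, t_star, s_a, s_b):
    """Smallest common n with t_star * sqrt((s_a^2 + s_b^2) / n) <= margin.

    Assumes equal sample sizes and a fixed critical value (a simplification).
    """
    return math.ceil(t_star**2 * (s_a**2 + s_b**2) / margin**2)

print(margin_from_interval(4.5, 5.5))      # half-width of (4.5, 5.5)
print(difference_from_interval(4.5, 5.5))  # midpoint of (4.5, 5.5)
print(equal_sample_size_needed(0.5, 2.0, 1.5, 2.0))  # illustrative inputs
```

The last function comes from rearranging $\text{margin of error} = t^{*} \sqrt{\frac{{s_{A}}^{2} + {s_{B}}^{2}}{n}}$ for $n$.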

Examiner Tips and Tricks

You need to know that the width of a confidence interval increases as the confidence level increases, whereas it decreases as the sample sizes increase!
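
A small numerical sketch of both effects, using illustrative standard deviations of 0.36 and 0.28 and critical values read from a standard t-table for 17 degrees of freedom. The function name is an assumption, and the critical value is held fixed when the sample size changes to keep the comparison simple.

```python
import math

def interval_width(t_star, s_a, s_b, n_a, n_b):
    """Total width = 2 x critical value x standard error."""
    return 2 * t_star * math.sqrt(s_a**2 / n_a + s_b**2 / n_b)

# Critical values for 17 degrees of freedom at 90%, 95% and 99% confidence
t_90, t_95, t_99 = 1.740, 2.110, 2.898

# Higher confidence level -> larger critical value -> wider interval
widths_by_level = [interval_width(t, 0.36, 0.28, 18, 18) for t in (t_90, t_95, t_99)]

# Larger samples -> smaller standard error -> narrower interval
# (t_star is held at 2.110 for simplicity; in practice it also shrinks as n grows)
widths_by_n = [interval_width(2.110, 0.36, 0.28, n, n) for n in (18, 36, 72)]

print(widths_by_level)
print(widths_by_n)
```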

How do I interpret a confidence interval for a difference in population means?

You must conclude calculations of a confidence interval by referring to the context
- Start by saying 'we can be C% confident that the interval from [lower limit] to [upper limit]...'
  - using the limits from the confidence interval
- then end with it capturing the difference between the population means in context
  - e.g. 'captures the actual difference between the population means of the time taken by students from each school to run 100 m'
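Confidence intervals for differences may have negative limits
- If both limits are negative, the data suggest that the difference, $μ_{1} - μ_{2}$ , is negative
  - so $μ_{1} < μ_{2}$
- If only the lower limit is negative, the interval contains zero, so the difference could plausibly be negative, zero or positive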

How do I use confidence intervals to justify a claim about a population mean difference?

If a population mean difference is claimed to be a specific value
- check if that value lies in your confidence interval
- If it does, the claimed value is plausible, so the sample data do not provide convincing evidence against the claim
- If it does not, the sample data provide convincing evidence that the population mean difference is not that value
Look out for confidence intervals for differences that contain zero
- This means $μ_{1} - μ_{2} = 0$ is a plausible value, so there is no convincing evidence that $μ_{1}$ and $μ_{2}$ differ
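
The check itself is a simple comparison against the two limits. A minimal sketch, with a hypothetical function name and example limits chosen for illustration:

```python
def claim_is_plausible(claimed_difference, lower, upper):
    """True when the claimed value of mu_1 - mu_2 lies inside the interval."""
    return lower <= claimed_difference <= upper

interval = (-0.027, 0.427)  # example limits for mu_1 - mu_2

print(claim_is_plausible(0, *interval))    # zero lies inside the interval
print(claim_is_plausible(0.5, *interval))  # 0.5 lies outside the interval
```

A value inside the interval is consistent with the sample data; a value outside it is not.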

Worked Example

Two independent farmers each claim to grow the longest eggplants. A random sample of 18 eggplants from the thousands grown at each farm is measured. The lengths of the eggplants on both farms are normally distributed. Those from farm A have a mean length of 8.1 inches with a standard deviation of 0.36 inches and the eggplants from farm B have a mean length of 7.9 inches with a standard deviation of 0.28 inches. Find the 95% confidence interval for $μ_{A} - μ_{B}$ .

State the inference procedure being used and verify its conditions

The correct inference procedure is a two-sample t-interval with a 95% confidence level

The independence condition is satisfied, as
- the samples are taken from different farms so they are independent
- the samples of 18 eggplants were selected at random
- the sample size, 18, is less than 10% of the population (thousands)
The distribution of lengths is normal
The population standard deviations are unknown and are estimated by the sample standard deviations, so the t-distribution is used (the small sample sizes, $n = 18$ , are acceptable because the populations are normally distributed)

Define the population parameters

Let $μ_{A}$ be the mean length of the eggplants from farm A

Let $μ_{B}$ be the mean length of the eggplants from farm B

List the number of data items in the samples, $n$ , the sample mean, $\bar{x}$ , and the sample standard deviation, $s_{x}$

$\begin{array}{rcl} n_{A} & = & 18 \\ {\bar{x}}_{A} & = & 8.1 \\ s_{A} & = & 0.36 \end{array}$ $\begin{array}{rcl} n_{B} & = & 18 \\ {\bar{x}}_{B} & = & 7.9 \\ s_{B} & = & 0.28 \end{array}$

State the degrees of freedom (dof)

dof = 18 - 1 = 17 (the smaller of $n_{A} - 1$ and $n_{B} - 1$ ; here both are 17)

Using the t-table, find the critical value (t-value), using dof = 17 and a confidence level of 95%

Remember that a confidence level of 95% is 5% in both tails combined, so use 2.5% for a single tail in the table (the row 'Confidence level C' at the bottom of the t-tables helps)

t-value = 2.110

Using the formula from the formula sheet, $\text{confidence interval} = \text{statistic} \pm (\text{critical value}) \times (\text{standard error of statistic})$ , calculate the confidence interval

$\begin{array}{rcl} CI & = & ({\bar{x}}_{A} - {\bar{x}}_{B}) \pm t^{*} \cdot \sqrt{\frac{{s_{A}}^{2}}{n_{A}} + \frac{{s_{B}}^{2}}{n_{B}}} \\ & = & (8.1 - 7.9) \pm 2.110 \cdot \sqrt{\frac{0.36^{2}}{18} + \frac{0.28^{2}}{18}} \end{array}$

State the confidence interval

$(- 0.027, 0.427)$

Explain the confidence interval in the context of the question

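We can be 95% confident that the interval from -0.027 inches to 0.427 inches captures the actual value of the difference between the population means, $μ_{A} - μ_{B}$ , of the lengths of eggplants from the two farms

As the interval contains zero, there is no convincing evidence that either farm grows longer eggplants on average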
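
The arithmetic in this worked example can be checked with a short Python sketch. The variable names are assumptions, and the critical value 2.110 is entered by hand from the t-table rather than computed.

```python
import math

# Summary statistics from the worked example
n_a, mean_a, s_a = 18, 8.1, 0.36
n_b, mean_b, s_b = 18, 7.9, 0.28
t_star = 2.110  # t-table value for 17 dof at 95% confidence

# Standard error of the difference between the sample means
standard_error = math.sqrt(s_a**2 / n_a + s_b**2 / n_b)
# Margin of error = critical value x standard error
margin = t_star * standard_error

lower = (mean_a - mean_b) - margin
upper = (mean_a - mean_b) + margin
print(round(lower, 3), round(upper, 3))  # -0.027 0.427
```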
