Measures of Variability (College Board AP® Statistics): Study Guide

Written by: Naomi C

Reviewed by: Dan Finlay

Updated on 17 September 2024

Range

What is the range of a data set?

The range of a data set is the difference between the largest and smallest values in the data set
- $\text{range} = \text{largest value} - \text{smallest value}$
- If the data has units (seconds, cm, etc.), then the range has the same units as the values in the data set
The range is a measure of variability (i.e. a measure of spread)
- Recall that an average (measure of center) for a data set tells you what a 'typical' data value is
- The range tells you how spread out the data is around that average
  - A small range means all the data values are close to the average
  - A large range means that some of the data values are far from the average
The range is affected by outliers (extreme values, i.e. extremely large or extremely small)
- Outliers can cause the range of a data set to be large
- The range would then give a misleading idea about how spread out most of the data really is

Worked Example

Saffy counted the number of pairs of shoes that the students in her class owned. The results are listed below:

2, 6, 3, 3, 15, 4, 6, 7, 5, 4, 5,

5, 8, 4, 6, 6, 2, 7, 8, 5, 3

What is the range of the data set?

It can sometimes help to write the data in size order

2, 2, 3, 3, 3, 4, 4, 4, 5, 5,

5, 5, 6, 6, 6, 6, 7, 7, 8, 8, 15

The range is the largest value minus the smallest value

15 - 2 = 13

The range of the number of pairs of shoes that a student in Saffy's class owns is 13
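
As a quick check of the working above, the calculation can be sketched in Python. The function name `data_range` is an illustrative choice, not something from the course. The second call drops the value 15 to show how much a single extreme value changes the range.

```python
# Number of pairs of shoes owned by each student in Saffy's class
shoes = [2, 6, 3, 3, 15, 4, 6, 7, 5, 4, 5,
         5, 8, 4, 6, 6, 2, 7, 8, 5, 3]


def data_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)


full_range = data_range(shoes)

# Remove the single extreme value (15) to see its effect on the range
without_extreme = [v for v in shoes if v != 15]
reduced_range = data_range(without_extreme)

print(full_range)     # 13
print(reduced_range)  # 6
```

Removing one value more than halves the range here, which is why the range can be misleading when a data set contains an outlier.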

Interquartile range

What is the interquartile range (IQR) of a data set?

The interquartile range (IQR) of a data set is the difference between the third quartile (Q3) and first quartile (Q1) of the data set
- $\text{IQR} = Q_{3} - Q_{1}$
- The IQR has the same units as the values in the data set (seconds, cm, etc.)
The interquartile range is also a measure of variability (i.e. of spread)
- Half of the data values in a data set are between Q1 and Q3
  - This 'middle half' of the data set may be thought of as the 'most typical' half of the data
  - The IQR tells you how spread out the values in that middle half are
The largest and smallest values in a data set do not affect the interquartile range
- This makes the IQR a better measure of spread for data sets with outliers (extreme values, i.e. extremely large or extremely small)

Examiner Tips and Tricks

When you enter a set of data into your calculator, the 1-variable statistics function will return the IQR as one of its calculated values.

This is a useful check but you should make sure you show your working clearly.

Worked Example

Roger planted a number of hot pepper seeds and recorded the number of days it took each seed to germinate. The results are listed below:

5 5 6 6 6 7 7 7 7 7 7 7 8

8 8 8 8 8 9 9 9 9 10 10 11 23

Roger calculates that the first quartile of his data set on hot pepper seeds is 7, and the third quartile is 9.

(a) What is the interquartile range of the data set?

Answer:

The interquartile range is the third quartile minus the first quartile

9 - 7 = 2

The interquartile range of the data set is 2 days

(b) Suggest a reason why the interquartile range might be a better measure of spread than the range for this data set.

Answer:

The 23 in the data set is an outlier (extreme value) compared to all the other values

This makes the range very large (23 - 5 = 18 days), which gives a misleading idea about the spread of most of the data

The interquartile range is not affected by extreme values, and so will be a better measure of spread for this data set
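
The comparison in part (b) can be checked with a short Python sketch. It assumes the quartiles are found as the medians of the lower and upper halves of the ordered data, which reproduces Roger's values of 7 and 9. Other quartile conventions exist and can give slightly different results. The helper names `quartiles`, `iqr` and `data_range` are illustrative.

```python
from statistics import median

# Days to germinate for each of Roger's hot pepper seeds
days = [5, 5, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 8,
        8, 8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 11, 23]


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when the number of values is odd, the overall median
    is left out of both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]
    return median(lower), median(upper)


def iqr(values):
    """Return the interquartile range, Q3 minus Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1


def data_range(values):
    """Return the range: largest value minus smallest value."""
    return max(values) - min(values)


# The same data with the extreme value removed
without_23 = [d for d in days if d != 23]

print(quartiles(days))                           # (7, 9)
print(data_range(days), iqr(days))               # 18 2
print(data_range(without_23), iqr(without_23))   # 6 2.0
```

Removing the 23 cuts the range from 18 days to 6 days, while the interquartile range stays at 2 days.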

Standard deviation & variance

What is the standard deviation of a data set?

The standard deviation of a data set is a measure of variability (i.e. a measure of spread)
- It measures how the data is spread out relative to the mean
  - If the standard deviation is small then most data values are close to the mean (there is less variability)
  - If the standard deviation is large then many data values will be further away from the mean (there is greater variability)
- If the data has units (seconds, cm, etc.), then the standard deviation has the same units as the values in the data set
The Greek letter $σ$ (lower case sigma) is used for the population standard deviation
The English letter $s$ is used for the sample standard deviation

How do I calculate the standard deviation for a data set?

The standard deviation of a variable, $x$ , for a population, $σ_{x}$ , can be calculated using the formula:
- $σ_{x} = \sqrt{\frac{1}{n} \sum {(x_{i} - \bar{x})}^{2}} = \sqrt{\frac{\sum {(x_{i} - \bar{x})}^{2}}{n}}$
- This is not given to you in the exam
- In this formula:
  - $n$ is the number of values in the data set
  - $\bar{x}$ is the mean of the data set
  - $x_{i}$ is 'any data value' in the data set
The standard deviation of a variable, $x$ , for a sample, $s_{x}$ , can be calculated using the formula:
- $s_{x} = \sqrt{\frac{1}{n - 1} \sum {(x_{i} - \bar{x})}^{2}} = \sqrt{\frac{\sum {(x_{i} - \bar{x})}^{2}}{n - 1}}$
- This is given to you in the exam
Note that the formula for the population standard deviation is very similar to the formula for the sample standard deviation
- You are just dividing by $n$ rather than $n - 1$
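
The two formulas above can be sketched in Python as follows. The function names and the data values are illustrative assumptions, not taken from the course. The last two lines compare the results with `pstdev` and `stdev` from Python's standard `statistics` module.

```python
import math
import statistics


def population_sd(values):
    """Population standard deviation: divide the sum of squares by n."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_of_squares / n)


def sample_sd(values):
    """Sample standard deviation: divide the sum of squares by n - 1."""
    n = len(values)
    mean = sum(values) / n
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return math.sqrt(sum_of_squares / (n - 1))


data = [1, 2, 3, 4, 5]  # illustrative values, not from the study guide

print(round(population_sd(data), 4))  # 1.4142
print(round(sample_sd(data), 4))      # 1.5811

# The standard library functions agree with the hand-written formulas
print(math.isclose(population_sd(data), statistics.pstdev(data)))  # True
print(math.isclose(sample_sd(data), statistics.stdev(data)))       # True
```

Dividing by $n - 1$ rather than $n$ always gives the slightly larger value, and the gap shrinks as the number of data values grows.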

Examiner Tips and Tricks

In practice, you will only be asked to calculate the standard deviation for a sample, but you should be aware that the population standard deviation uses a different formula from the sample standard deviation.

If you use your calculator to check or calculate a standard deviation, make sure that you are familiar with your calculator's notation so that you are looking at the correct result.

What are the benefits and limitations of the standard deviation as a measure of variability?

The standard deviation does not tell you where the mean is, but it does give you a measure of how far (on average) the data values are from their mean
The standard deviation will not necessarily become larger if more data values are added to the data set
- Adding more data values that are a similar distance from the mean as the existing values will change the standard deviation very little
- However, adding terms that are a greater distance from the mean will increase the standard deviation
Like the mean, the standard deviation is affected by extreme values
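
A small numerical illustration of these points, using made-up values and `stdev` (the sample standard deviation) from Python's standard `statistics` module:

```python
from statistics import stdev

base = [2, 4, 6, 8]      # illustrative values; the mean is 5
similar = base + [3, 7]  # new values about as far from 5 as the others
far = base + [20]        # one new value a long way from the rest

print(round(stdev(base), 2))     # 2.58
print(round(stdev(similar), 2))  # 2.37
print(round(stdev(far), 2))      # 7.07
```

Adding two values at a typical distance from the mean changes the standard deviation only slightly, while a single extreme value almost triples it.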

What is the variance of a data set?

The variance of a data set is the square of the standard deviation
- $σ^{2}$ is used to denote the population variance
- $s^{2}$ is used to denote the sample variance
The variance is the average of the square of the differences between each data item and the mean
- If the data has units (seconds, cm, etc.), then the variance has the same units squared as the values in the data set (seconds², cm², etc.)
The standard deviation is often the measure used rather than the variance as it has the same units as the data set
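
The link between the two measures can be checked with Python's standard `statistics` module. The data values below are illustrative and are assumed to be times in seconds.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, assumed to be in seconds

pop_sd = statistics.pstdev(data)         # population standard deviation (seconds)
pop_var = statistics.pvariance(data)     # population variance (seconds squared)
sample_sd = statistics.stdev(data)       # sample standard deviation (seconds)
sample_var = statistics.variance(data)   # sample variance (seconds squared)

print(float(pop_sd), float(pop_var))           # 2.0 4.0
print(math.isclose(pop_sd ** 2, pop_var))      # True
print(math.isclose(sample_sd ** 2, sample_var))  # True
```

A population standard deviation of 2 seconds corresponds to a variance of 4 seconds², which is harder to interpret alongside the original data.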

Worked Example

A sample of 5 data items has been taken from a population. The values are listed below.

6 9 2 11 5

(a) Calculate the mean of the sample.

Answer:

Add up the values and divide by the number of values, 5

$\frac{6 + 9 + 2 + 11 + 5}{5} = 6.6$

The mean of the sample is 6.6

(b) Calculate the standard deviation of the sample.

Answer:

It is easiest to set up a table to work out the different values

$x$	$x - \bar{x}$	${(x - \bar{x})}^{2}$
$6$	$6 - 6.6 = - 0.6$	${(- 0.6)}^{2} = 0.36$
$9$	$9 - 6.6 = 2.4$	${(2.4)}^{2} = 5.76$
$2$	$2 - 6.6 = - 4.6$	${(- 4.6)}^{2} = 21.16$
$11$	$11 - 6.6 = 4.4$	${(4.4)}^{2} = 19.36$
$5$	$5 - 6.6 = - 1.6$	${(- 1.6)}^{2} = 2.56$
Total		$0.36 + 5.76 + 21.16 + 19.36 + 2.56 = 49.2$

Substitute the values into the formula $\sqrt{\frac{1}{n - 1} \sum {(x_{i} - \bar{x})}^{2}}$

$\sqrt{\frac{1}{5 - 1} \cdot 49.2} = \sqrt{12.3} = 3.50713\ldots$

The standard deviation is 3.51
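
The table method above can be reproduced in Python as a check. This is a minimal sketch with illustrative variable names. It builds the same three columns, sums the squared differences and applies the sample formula.

```python
import math

sample = [6, 9, 2, 11, 5]
n = len(sample)
mean = sum(sample) / n

# Each row: value, difference from the mean, squared difference
rows = [(x, x - mean, (x - mean) ** 2) for x in sample]
for x, deviation, squared in rows:
    print(x, round(deviation, 2), round(squared, 2))

sum_of_squares = sum(squared for _, _, squared in rows)
sample_sd = math.sqrt(sum_of_squares / (n - 1))

print(round(mean, 1))            # 6.6
print(round(sum_of_squares, 2))  # 49.2
print(round(sample_sd, 2))       # 3.51
```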
