Outliers & Resistant Measures (College Board AP® Statistics)
Study Guide
Written by: Naomi C
Reviewed by: Dan Finlay
Outliers
What are outliers?
Outliers are extreme data values that do not fit with the rest of the data
They are either a lot bigger or a lot smaller than the rest of the data
There are two primary methods for defining outliers in this course
Outliers are values that are more than 1.5 times the interquartile range (IQR) from the nearest quartile
$x$ is an outlier if $x < Q_1 - 1.5 \times \text{IQR}$ or $x > Q_3 + 1.5 \times \text{IQR}$
Outliers are values that lie two or more standard deviations above or below the mean
$x$ is an outlier if $x \le \text{mean} - 2 \times \text{SD}$ or $x \ge \text{mean} + 2 \times \text{SD}$
Outliers can have a big effect on some statistical measures
Examiner Tips and Tricks
These two methods may result in slightly different boundaries for determining whether or not a particular value is an outlier. As long as you show full working and reasoning, your answer will gain full marks.
Should I remove outliers?
The decision to remove outliers will depend on the context
Outliers should be removed if they are found to be errors
The data may have been recorded incorrectly
e.g. the age of a teenager, 17, may have been recorded as 71 by mistake
Outliers should not be removed if they are a valid part of the sample
The data may need to be checked to verify that it is not an error
e.g. the annual salaries of employees of a business might appear to have an outlier, but this could be the director’s salary
Worked Example
The ages, in years, of a number of children attending a birthday party are given below.
2, 7, 5, 4, 8, 4, 6, 5, 5, 15, 2, 5, 13
Identify any outliers within the data set.
Answer:
Method 1: IQR
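$x$ is an outlier if $x < Q_1 - 1.5 \times \text{IQR}$ or $x > Q_3 + 1.5 \times \text{IQR}$
Find the first quartile and the third quartile; this can be done by hand or by entering the data into your calculator and looking at the one-variable statistics
In order, the data is 2, 2, 4, 4, 5, 5, 5, 5, 6, 7, 8, 13, 15, so the median is 5; taking the median of the six values on each side gives $Q_1 = 4$ and $Q_3 = 7.5$
Calculate the IQR
$\text{IQR} = Q_3 - Q_1 = 7.5 - 4 = 3.5$
Find the boundaries for any possible outliers
$Q_1 - 1.5 \times \text{IQR} = 4 - 1.5(3.5) = -1.25$
$Q_3 + 1.5 \times \text{IQR} = 7.5 + 1.5(3.5) = 12.75$
Identify any values in the data set outside of these boundaries
No value is below -1.25, but 13 and 15 are both above 12.75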
The ages of 13 and 15 are outliers as they lie more than 1.5 times the interquartile range above the third quartile
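The IQR method above can be checked with a short Python sketch. It assumes the quartiles are found the way they are by hand in this example (the median of the values on each side of the overall median); other software may use a different quartile rule and give slightly different boundaries. The function names quartiles and iqr_outliers are illustrative only.

```python
from statistics import median

ages = [2, 7, 5, 4, 8, 4, 6, 5, 5, 15, 2, 5, 13]

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when the number of values is odd, the overall median
    is left out of both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])

def iqr_outliers(values):
    """Return the outliers and the (lower, upper) boundaries."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return outliers, (low, high)

outliers, bounds = iqr_outliers(ages)
print("Quartiles:", quartiles(ages))
print("Boundaries:", bounds)
print("Outliers:", sorted(outliers))
```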
Method 2: Standard deviations
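$x$ is an outlier if $x \le \text{mean} - 2 \times \text{SD}$ or $x \ge \text{mean} + 2 \times \text{SD}$
Find the mean and the standard deviation by entering the data into your calculator and looking at the one-variable statistics
Remember that this data is the entire data set so you want to use the population standard deviation
$\text{mean} = \frac{81}{13} \approx 6.231$ and $\text{SD} \approx 3.704$
Find the boundaries for any possible outliers
$\text{mean} - 2 \times \text{SD} \approx 6.231 - 2(3.704) \approx -1.18$
$\text{mean} + 2 \times \text{SD} \approx 6.231 + 2(3.704) \approx 13.64$
Identify any values in the data set outside of these boundaries
No value is below -1.18, and only 15 is above 13.64; 13 lies inside the boundaries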
The age of 15 lies more than 2 standard deviations above the mean so it is an outlier
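The standard deviation method can be checked in the same way. This is a minimal sketch that uses the population standard deviation (pstdev from Python's statistics module), as the worked example does; the function name sd_outliers and the parameter k are illustrative only.

```python
from statistics import mean, pstdev

ages = [2, 7, 5, 4, 8, 4, 6, 5, 5, 15, 2, 5, 13]

def sd_outliers(values, k=2):
    """Return values lying k or more population SDs from the mean,
    together with the (lower, upper) boundaries."""
    centre = mean(values)
    spread = pstdev(values)  # population standard deviation
    low, high = centre - k * spread, centre + k * spread
    outliers = [v for v in values if v <= low or v >= high]
    return outliers, (low, high)

sd_flagged, sd_bounds = sd_outliers(ages)
print("Boundaries:", sd_bounds)
print("Outliers:", sd_flagged)
```

Here only 15 is flagged, whereas the IQR method also flags 13, which illustrates how the two methods can disagree on borderline values.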
Resistant measures
What is a resistant measure?
A resistant measure is a statistical measure that is not greatly affected by an outlier
It is sometimes not affected by an outlier at all
Resistant measures are sometimes known as robust measures
The median and the interquartile range (IQR) are considered to be resistant measures
The mean, standard deviation, and range are considered to be nonresistant measures
The value of any of these measures could be significantly affected by an outlier
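The difference between resistant and nonresistant measures can be seen with a small Python sketch. It reuses the party ages from the worked example and then assumes, purely for illustration, that the age 15 was mistakenly recorded as 51. The helper names iqr and summary are hypothetical, and the quartiles are taken as the medians of the lower and upper halves.

```python
from statistics import mean, median, pstdev

def iqr(values):
    """IQR using the medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[-half:]) - median(ordered[:half])

def summary(values):
    """Collect the five measures discussed above."""
    return {
        "mean": mean(values),
        "standard deviation": pstdev(values),
        "range": max(values) - min(values),
        "median": median(values),
        "IQR": iqr(values),
    }

ages = [2, 7, 5, 4, 8, 4, 6, 5, 5, 15, 2, 5, 13]
# Hypothetical recording error: 15 entered as 51
ages_with_error = [51 if v == 15 else v for v in ages]

before = summary(ages)
after = summary(ages_with_error)
for name in before:
    print(f"{name}: {before[name]:.2f} -> {after[name]:.2f}")
```

In this made-up case the median and IQR do not change at all, while the mean, standard deviation, and range all increase.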