Outliers (Edexcel International AS Maths: Statistics 1)
Revision Note
Outliers
What are outliers?
- Outliers are extreme data values that do not fit with the general pattern of the data
- They can come from one or two extreme events or from mistakes in the data collection
- Outliers will affect some statistics that are calculated from the data
- They can have a big effect on the mean, but not on the median or usually the mode
- A single outlier can change the range completely, but the interquartile range is usually affected very little, if at all
- When calculating the mean or the range it is important to decide whether the outlier(s) should be included in the calculations
- The question will tell you whether to include the outliers or not
- You may have to decide which value is the outlier to be removed
- In general outliers are included if they are a valid piece of data and excluded if it is likely that they are erroneous
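The effect described above can be checked numerically. The following is a minimal sketch using a small data set invented purely for illustration (the values and the helper `data_range` are not from the syllabus); it compares the mean, median and range before and after one extreme value is added.

```python
from statistics import mean, median

# Hypothetical data, invented for illustration only
data = [3, 4, 5, 5, 6, 7, 8]
with_outlier = data + [40]  # the same data plus one extreme value


def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


# The mean moves a long way towards the outlier
print(round(mean(data), 2), mean(with_outlier))      # 5.43 9.75

# The median barely moves
print(median(data), median(with_outlier))            # 5 5.5

# The range is dominated by the outlier
print(data_range(data), data_range(with_outlier))    # 5 37
```

Here one extra value almost doubles the mean and multiplies the range by more than seven, while the median shifts by only 0.5.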
How are outliers calculated?
- Most of the time within this syllabus, outliers will be values that lie more than a particular distance below the lower quartile or above the upper quartile
- The most common way to identify an outlier is to use the formulae below, where $Q_1$ is the lower quartile and $Q_3$ is the upper quartile:
- A value that is less than $Q_1 - k \times (\text{interquartile range})$
- A value that is greater than $Q_3 + k \times (\text{interquartile range})$
- $k$ is a constant that will be given to you in the exam, commonly $k = 1.5$
- Outliers could also be situated a number of standard deviations away from the mean
- In this case the most common way to identify an outlier is to use the formulae:
- A value that is less than $\text{mean} - k \times (\text{standard deviation})$
- A value that is greater than $\text{mean} + k \times (\text{standard deviation})$
- $k$ is a constant that will be given to you in the exam
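Both definitions can be sketched in a few lines of Python. This is an illustrative sketch, not a prescribed method, and it rests on several assumptions that are not stated in the note: the function names are hypothetical, the quartiles use a round-up rule for discrete data (if n/4 is a whole number, average that value and the next; otherwise round up), the standard deviation is the population form (dividing by n), and the sample data and the choice k = 2 for the second method are invented for illustration.

```python
from statistics import mean, pstdev


def quartiles(values):
    """Return (Q1, Q3) using an assumed round-up rule for discrete data."""
    ordered = sorted(values)
    n = len(ordered)

    def quartile(numerator):
        # Position is n * numerator / 4, counted from 1
        if (n * numerator) % 4 == 0:
            i = n * numerator // 4
            # Whole-number position: average this value and the next one
            return (ordered[i - 1] + ordered[i]) / 2
        # Otherwise round the position up (index is position - 1)
        return ordered[n * numerator // 4]

    return quartile(1), quartile(3)


def iqr_fences(values, k=1.5):
    """Boundaries Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def sd_fences(values, k):
    """Boundaries mean - k*sd and mean + k*sd (population sd assumed)."""
    m = mean(values)
    s = pstdev(values)
    return m - k * s, m + k * s


def outliers(values, lower, upper):
    """Values strictly below the lower or above the upper boundary."""
    return [x for x in values if x < lower or x > upper]


# Hypothetical data, invented for illustration only
sample = [10, 12, 13, 14, 15, 16, 18, 40]

print(iqr_fences(sample))                          # (5.75, 23.75)
print(outliers(sample, *iqr_fences(sample)))       # [40]
print(outliers(sample, *sd_fences(sample, k=2)))   # [40]
```

The two methods agree here, but they need not in general, which is why the value of k and the definition to use are given in the question.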
How are outliers represented on box plots?
- On a box plot an outlier is represented as a cross either side of the maximum or minimum value
- If the maximum or minimum value is discovered to be an outlier, the new maximum or minimum value will need to be found for the box plot
- If the data value just above the minimum or just below the maximum is known, this will become the new value
- If the data value is not known, the new minimum or maximum will become the outlier boundary
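The two cases above can be expressed as a short sketch. The function names, the data and the boundary values 5.75 and 23.75 are hypothetical, chosen only to illustrate the rule; the boundaries are assumed to have been calculated already from the definition given in the question.

```python
def whisker_ends_known(values, lower, upper):
    """All data values are known: the box plot's new minimum and maximum
    are the most extreme values that are NOT outliers."""
    inside = [x for x in values if lower <= x <= upper]
    crosses = sorted(x for x in values if x < lower or x > upper)
    return min(inside), max(inside), crosses


def whisker_ends_unknown(minimum, maximum, lower, upper):
    """Only the minimum and maximum are known: if either is an outlier,
    the outlier boundary itself becomes the new end of the box plot."""
    new_min = lower if minimum < lower else minimum
    new_max = upper if maximum > upper else maximum
    return new_min, new_max


# Hypothetical data and boundaries, invented for illustration only
values = [10, 12, 13, 14, 15, 16, 18, 40]
lower, upper = 5.75, 23.75

print(whisker_ends_known(values, lower, upper))     # (10, 18, [40])
print(whisker_ends_unknown(10, 40, lower, upper))   # (10, 23.75)
```

In the first call the maximum drawn on the box plot becomes 18 and the value 40 is marked with a cross; in the second, where only the summary values are known, the maximum is drawn at the boundary 23.75.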
Worked example
The ages, in years, of a number of children attending a birthday party are given below:
2, 7, 5, 4, 8, 4, 6, 5, 5, 29, 2, 5, 13,
An outlier is defined as an observation that falls more than the interquartile range above the upper quartile or below the lower quartile
(i) Identify any outliers within the data set.
(ii) Decide which values (if any) should be removed, and justify your answer.
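Part (i) can be checked with a short sketch. It makes some assumptions that the question does not state: the quartiles are found with a round-up rule for discrete data (13 values, so the lower quartile is the 4th ordered value and the upper quartile is the 10th), and the multiplier `k` is left as a parameter so that both the definition as worded above (k = 1) and the common choice k = 1.5 can be tried. The function names are hypothetical.

```python
def quartiles(values):
    """Return (Q1, Q3) using an assumed round-up rule for discrete data."""
    ordered = sorted(values)
    n = len(ordered)

    def quartile(numerator):
        if (n * numerator) % 4 == 0:
            i = n * numerator // 4
            # Whole-number position: average this value and the next one
            return (ordered[i - 1] + ordered[i]) / 2
        # Otherwise round the position up
        return ordered[n * numerator // 4]

    return quartile(1), quartile(3)


def find_outliers(values, k):
    """Values more than k * IQR below Q1 or above Q3, in ascending order."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return sorted(x for x in values if x < lower or x > upper)


# Ages as listed in the question
ages = [2, 7, 5, 4, 8, 4, 6, 5, 5, 29, 2, 5, 13]

print(quartiles(ages))           # (4, 7), so the interquartile range is 3
print(find_outliers(ages, 1.5))  # [13, 29]
print(find_outliers(ages, 1))    # [13, 29]
```

Under these assumptions the values 13 and 29 are flagged for either choice of k. For part (ii), one reasonable justification is that 29 is unlikely to be the age of a child and so could be removed as an error (or as an adult at the party), whereas 13 is a plausible age for a child and could be kept as valid data.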
Examiner Tip
- Read the question carefully to determine which type of outlier you should be finding and to make sure you are using the correct method.