Linear Interpolation (Edexcel GCSE Statistics)
Revision Note
Written by: Roger B
Reviewed by: Dan Finlay
Median from Grouped Data
How do I find the median for grouped data?
Grouped data doesn’t contain the individual data values
So we can’t find the exact median in the usual way
We can estimate the median for grouped data using linear interpolation
This assumes the data is evenly spread across the class containing the median
STEP 1
Identify the class interval containing the median
This is the class containing the (n/2)th data value
Divide the total number of values, n, by 2
e.g. if there are 80 data values, 80 ÷ 2 = 40
Consider cumulative frequencies until you find the class interval containing the 40th value
STEP 2
Find ‘how far into’ that class interval the (n/2)th data value is
e.g. if the interval with the median contains the 36th through 43rd data values
then the 40th value is ‘5 values in’ (36, 37, 38, 39, 40)
and there are 8 values in the interval
so the 40th value is 5/8 of the way into the interval
STEP 3
Multiply the class width of the class interval containing the median by the fraction found in Step 2
e.g. if the interval with the median is
the class width is
STEP 4
Add the result from Step 3 to the lower boundary of the class interval containing the median
The result is the estimated median
e.g. the lower boundary of the class interval is 50
The estimated median is
The estimated median can also be found using the following formula:
Estimated median = L + ((n/2 − C) ÷ f) × w
L is the lower boundary of the class interval containing the median
n is the total number of data values
C is the cumulative frequency of all the class intervals before the one containing the median
f is the frequency of (i.e. the number of values in) the class interval containing the median
w is the width of the class interval containing the median (upper boundary minus lower boundary)
This formula combines all four steps of the process
but it is not on the exam formula sheet
So if you want to use it you’ll need to remember it
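As a rough illustration, the formula can be written as a short Python function. This is only a sketch: the function name `estimate_median` is an assumption made for this example, and its parameters simply mirror the symbols L, n, C, f and w defined above.

```python
def estimate_median(L, n, C, f, w):
    """Estimate the median of grouped data by linear interpolation.

    L: lower boundary of the class interval containing the median
    n: total number of data values
    C: cumulative frequency of all class intervals before the median class
    f: frequency of the median class
    w: width of the median class (upper boundary minus lower boundary)
    """
    if f <= 0:
        raise ValueError("the median class must contain at least one value")
    # (n/2 - C) is how many values 'into' the class the median lies;
    # dividing by f turns that count into a fraction of the class.
    fraction = (n / 2 - C) / f
    # Scale the class width by that fraction and add it to the lower boundary.
    return L + fraction * w
```

With the values from the worked example on this page (L = 20, n = 50, C = 22, f = 12, w = 10), `estimate_median(20, 50, 22, 12, 10)` gives 22.5.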
Examiner Tips and Tricks
The formula can be tricky to remember correctly
It’s better to understand how the method works
Then you don’t need to remember the formula
Remember that the median found this way is an estimate
You can’t find the exact median without knowing all the data values
Worked Example
A student collected data about the length of time (x hours) students in his school spent listening to music in a given week. He collected data from 50 students in total. The following table summarises the data:
Time spent, x (hours) | Number of students |
0 ≤ x ≤ 10 | 3 |
10 < x ≤ 20 | 19 |
20 < x ≤ 30 | 12 |
30 < x ≤ 40 | 10 |
40 < x ≤ 50 | 5 |
50 < x ≤ 60 | 1 |
Work out an estimate for the median amount of time spent listening to music by the students.
STEP 1: Identify the class interval containing the median
Divide the total number of values, n, by 2
Here n = 50
So we’re looking for the interval with the 25th value
Note that this is different from finding the median from a set of data values
In that case we would be looking for the value halfway between the 25th and 26th values
With linear interpolation we don’t have to worry about that!
Add a cumulative frequency column to the table and work out the cumulative frequencies
Time spent, x (hours) | Number of students | Cumulative Frequency |
0 ≤ x ≤ 10 | 3 | 3 |
10 < x ≤ 20 | 19 | 22 |
20 < x ≤ 30 | 12 | 34 |
30 < x ≤ 40 | 10 | 44 |
40 < x ≤ 50 | 5 | 49 |
50 < x ≤ 60 | 1 | 50 |
The second class interval goes up to the 22nd data value and the third class interval goes up to the 34th data value
So the median is in the third class interval
The median is in the class interval 20 < x ≤ 30
STEP 2: Find how far into the class interval the 25th data value is
The interval with the median contains the 23rd through 34th data values
The 25th data value is ‘3 values in’ to the interval (23, 24, 25)
And there are 12 data values in the interval
So the median is 1/4 of the way into the interval
STEP 3: Multiply the class width by the fraction found in Step 2
Subtract the lower boundary from the upper boundary to find the class width of the interval
30-20=10
Multiply by the fraction in Step 2
1/4 × 10 = 2.5
STEP 4: Add the result from Step 3 to the lower boundary of the class interval
20 + 2.5 = 22.5
Don’t forget the units in your answer!
Estimated median = 22.5 hours
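The four steps of this worked example can also be checked with a short Python sketch. The function name `median_from_table` and the way the table is stored (a list of lower boundary, upper boundary, frequency tuples) are assumptions made for illustration; the numbers are the ones from the table above.

```python
from itertools import accumulate


def median_from_table(classes):
    """Estimate the median from a grouped frequency table.

    classes: list of (lower_boundary, upper_boundary, frequency) tuples,
             given in increasing order of class interval.
    """
    if not classes:
        raise ValueError("the table has no class intervals")

    # STEP 1: build the cumulative frequencies and find the n/2 position.
    frequencies = [freq for _, _, freq in classes]
    cumulative = list(accumulate(frequencies))
    n = cumulative[-1]
    target = n / 2

    for (lower, upper, freq), cum in zip(classes, cumulative):
        if freq > 0 and cum >= target:
            # STEP 2: how far into this class the (n/2)th value is.
            before = cum - freq
            fraction = (target - before) / freq
            # STEP 3 and STEP 4: scale the class width, add the lower boundary.
            return lower + fraction * (upper - lower)

    raise ValueError("the table contains no data values")


music_times = [
    (0, 10, 3),
    (10, 20, 19),
    (20, 30, 12),
    (30, 40, 10),
    (40, 50, 5),
    (50, 60, 1),
]
print(median_from_table(music_times))  # 22.5
```

The loop stops at the first class whose cumulative frequency reaches n/2, which is the same check as reading down the cumulative frequency column by hand.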
How do I find the median for data on a histogram?
A histogram is a way of representing grouped data as a diagram
See the 'Histograms & Frequency Polygons' revision note for full details
The connection between the frequency density shown on the histogram and the frequency that would be shown in a grouped data table is given by the formula
frequency density = frequency ÷ class width
If you are asked to estimate a median for data in a histogram there are two options:
You can recreate the grouped data table using the frequency density formula, and then follow the method given above
Or you can work out the estimated median directly from the histogram
See the following Worked Example for how to do this
Worked Example
The histogram shows the weight, in kg, of 60 newborn bottlenose dolphins.
Find an estimate for the median weight of the dolphins in the sample, giving your answer correct to two decimal places.
60 ÷ 2 = 30, so to estimate the median we need to find the weight of the 30th dolphin
To find the frequencies represented by the different bars, rearrange the frequency density formula as frequency = frequency density × class width
The first two classes have a cumulative frequency of 20
So the median is going to be '10 dolphins into' the 10-12 kg class
The height (frequency density) of the 10-12 kg bar is 9.5
We need to find what width would give a frequency of 10
Use frequency = frequency density × class width and solve for width
10 = 9.5 × width, so width = 10 ÷ 9.5 = 1.0526...
That means that the median lies 1.05 kg into the 10-12 kg class interval
Estimated median = 11.05 kg (2 d.p.)
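The final calculation can be sketched in Python as a quick check. The function name `median_from_histogram_bar` and its parameter names are hypothetical, chosen only for this example; the values 10, 9.5 and 10 are the lower boundary, frequency density and 'number of dolphins into the class' from the working above.

```python
def median_from_histogram_bar(lower_boundary, frequency_density, values_into_class):
    """Estimate the median directly from the histogram bar that contains it.

    lower_boundary: lower boundary of the class containing the median
    frequency_density: height of that class's bar
    values_into_class: how many data values into the class the median lies
    """
    if frequency_density <= 0:
        raise ValueError("the bar containing the median must have a positive height")
    # frequency = frequency density × class width,
    # so the width that gives the required frequency is frequency ÷ frequency density.
    width_needed = values_into_class / frequency_density
    return lower_boundary + width_needed


estimate = median_from_histogram_bar(10, 9.5, 10)
print(round(estimate, 2))  # 11.05
```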