Linear Interpolation (Edexcel GCSE Statistics)

Revision Note

Median from Grouped Data

How do I find the median for grouped data?

  • Grouped data doesn’t contain the individual data values

    • So we can’t find the exact median in the usual way

  • We can estimate the median for grouped data using linear interpolation

    • This assumes the data is evenly spread across the class containing the median

  • STEP 1
    Identify the class interval containing the median

    • This is the class containing the (n/2)th data value

      • Divide the total number of values, n, by 2

    • e.g. if there are 80 data values, n/2 = 80/2 = 40

      • Consider cumulative frequencies until you find the class interval containing the 40th value

  • STEP 2
    Find ‘how far into’ that class interval the (n/2)th data value is

    • e.g. if the interval with the median contains the 36th through 43rd data values

      • then the 40th value is ‘5 values in’ (36, 37, 38, 39, 40)

      • and there are 8 values in the interval

      • so the 40th value is 5/8 of the way into the interval

  • STEP 3
    Multiply the class width of the class interval containing the median by the fraction found in Step 2

    • e.g. if the interval with the median is 50 ≤ x ≤ 70

      • the class width is 70 - 50 = 20

      • 20 × 5/8 = 12.5

  • STEP 4
    Add the result from Step 3 to the lower boundary of the class interval containing the median

    • The result is the estimated median

    • e.g. the lower boundary of 50 ≤ x ≤ 70 is 50

      • The estimated median is 50 + 12.5 = 62.5

  • The estimated median can also be found using the following formula:

    • $\text{estimated median} = L + \dfrac{\frac{n}{2} - C}{f} \times w$

      • L is the lower boundary of the class interval containing the median

      • n is the total number of data values

      • C is the cumulative frequency of all the class intervals before the one containing the median

      • f is the frequency of (i.e. the number of values in) the class interval containing the median

      • w is the width of the class interval containing the median (upper boundary minus lower boundary)

    • This formula combines all four steps of the process

      • but it is not given to you on the exam

      • So if you want to use it you’ll need to remember it
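If you like to check your arithmetic with a short program, the formula can be sketched as a small Python function. The function name `estimate_median` and its parameter names are illustrative choices, not anything defined by the exam board. The values in the usage lines are the ones from the examples in Steps 2 to 4 (n = 80, and a median class 50 ≤ x ≤ 70 holding the 36th to 43rd values, so C = 35 and f = 8).

```python
def estimate_median(lower, width, n, cum_before, freq):
    """Estimate the median of grouped data by linear interpolation.

    lower      -- L, lower boundary of the class containing the median
    width      -- w, width of that class
    n          -- total number of data values
    cum_before -- C, cumulative frequency of all classes before it
    freq       -- f, frequency of the class containing the median
    """
    fraction = (n / 2 - cum_before) / freq  # how far into the class (Step 2)
    return lower + fraction * width         # Steps 3 and 4


# Values from the example in the steps: the median class 50 <= x <= 70
# holds the 36th to 43rd values, so C = 35 and f = 8.
example = estimate_median(lower=50, width=20, n=80, cum_before=35, freq=8)
print(example)  # 62.5
```

The result agrees with the 62.5 found by working through the four steps by hand.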

Examiner Tips and Tricks

  • The formula can be tricky to remember correctly

    • It’s better to understand how the method works

    • Then you don’t need to remember the formula

  • Remember that the median found this way is an estimate

    • You can’t find the exact median without knowing all the data values
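To see why the result is only an estimate, the sketch below uses two made-up lists of raw values (hypothetical data, not taken from any exam question) that produce exactly the same grouped table but have different true medians. The two classes and the helper name `group_counts` are also illustrative assumptions.

```python
from statistics import median


# Hypothetical classes: 0 <= x <= 10 and 10 < x <= 20
def group_counts(values):
    """Count how many values fall in each of the two classes."""
    first = sum(1 for v in values if 0 <= v <= 10)
    second = sum(1 for v in values if 10 < v <= 20)
    return [first, second]


data_a = [1, 2, 11, 12]   # made-up raw values
data_b = [9, 10, 19, 20]  # different made-up raw values

print(group_counts(data_a), group_counts(data_b))  # [2, 2] [2, 2]
print(median(data_a), median(data_b))              # 6.5 14.5
```

Linear interpolation only sees the grouped table, so it would give the same estimate for both lists even though their exact medians differ.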

Worked Example

A student collected data about the length of time (x hours) students in his school spent listening to music in a given week. He collected data from 50 students in total.  The following table summarises the data:

Time spent, x (hours) | Number of students
--------------------- | ------------------
0 ≤ x ≤ 10            | 3
10 < x ≤ 20           | 19
20 < x ≤ 30           | 12
30 < x ≤ 40           | 10
40 < x ≤ 50           | 5
50 < x ≤ 60           | 1

Work out an estimate for the median amount of time spent listening to music by the students.

STEP 1: Identify the class interval containing the median

Divide the total number of values, n, by 2

Here n = 50

50/2 = 25

So we’re looking for the interval with the 25th value

Note that this is different from finding the median from a set of data values

In that case we would be looking for the value halfway between the 25th and 26th values

With linear interpolation we don’t have to worry about that!

Add a cumulative frequency column to the table and work out the cumulative frequencies

Time spent, x (hours) | Number of students | Cumulative frequency
--------------------- | ------------------ | --------------------
0 ≤ x ≤ 10            | 3                  | 3
10 < x ≤ 20           | 19                 | 22
20 < x ≤ 30           | 12                 | 34
30 < x ≤ 40           | 10                 | 44
40 < x ≤ 50           | 5                  | 49
50 < x ≤ 60           | 1                  | 50

The second class interval goes up to the 22nd data value and the third class interval goes up to the 34th data value

So the median is in the third class interval

The median is in the 20 < x ≤ 30 class interval
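The cumulative frequency column and the search for the median class can be sketched in Python as follows. The variable names are illustrative; the class boundaries and frequencies are the ones from the table in this example.

```python
from itertools import accumulate

# (lower boundary, upper boundary) for each class, from the table
classes = [(0, 10), (10, 20), (20, 30), (30, 40), (40, 50), (50, 60)]
frequencies = [3, 19, 12, 10, 5, 1]

# Running totals of the frequencies
cumulative = list(accumulate(frequencies))
print(cumulative)  # [3, 22, 34, 44, 49, 50]

n = cumulative[-1]  # total number of data values
target = n / 2      # position of the value we are looking for

# The median class is the first one whose cumulative frequency reaches n/2
median_index = next(i for i, cf in enumerate(cumulative) if cf >= target)
print(classes[median_index])  # (20, 30)
```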

STEP 2: Find how far into the class interval the (n/2)th data value is

The interval with the median contains the 23rd through 34th data values

The 25th data value is ‘3 values in’ to the interval (23, 24, 25)

And there are 12 data values in the interval

3/12 = 1/4

So the median is 1/4 of the way into the interval

STEP 3: Multiply the class width by the fraction found in Step 2

Subtract the lower boundary from the upper boundary to find the class width of the 20 < x ≤ 30 interval

30 - 20 = 10

Multiply by the fraction in Step 2

10 × 1/4 = 2.5

STEP 4: Add the result from Step 3 to the lower boundary of the class interval

20 + 2.5 = 22.5

Don’t forget the units in your answer!

Estimated median = 22.5 hours
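
As a check, the same values can be substituted into the formula given earlier in this note, with L = 20, n = 50, C = 22, f = 12 and w = 10:

```latex
\text{estimated median}
  = L + \frac{\frac{n}{2} - C}{f} \times w
  = 20 + \frac{25 - 22}{12} \times 10
  = 20 + 2.5
  = 22.5 \text{ hours}
```

This matches the answer found using the four steps.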

Author: Roger B

Expertise: Maths

Roger's teaching experience stretches all the way back to 1992, and in that time he has taught students at all levels between Year 7 and university undergraduate. Having conducted and published postgraduate research into the mathematical theory behind quantum computing, he is more than confident in dealing with mathematics at any level the exam boards might throw at you.

Author: Dan Finlay

Expertise: Maths Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.