Cumulative Frequency Charts (Edexcel GCSE Statistics)
Revision Note
Written by: Roger B
Reviewed by: Dan Finlay
Cumulative Frequency
What is cumulative frequency?
Cumulative refers to a "running total" or "adding up as you go along"
So in a table of grouped data
cumulative frequency means all of the frequencies for the different groups totalled up to the end of the group in a given row
When working out cumulative frequencies you may see tables presented in two ways
A regular grouped data table with an extra column for cumulative frequencies
e.g. rows labelled 0 ≤ x < 20, 20 ≤ x < 40, 40 ≤ x < 60, etc
Class interval | Frequency | Cumulative frequency
0 ≤ x < 20 | 14 | 14
20 ≤ x < 40 | 25 | 39 (because 14 + 25 = 39)
40 ≤ x < 60 | 29 | 68 (because 14 + 25 + 29 = 68)
60 ≤ x < 80 | 12 | 80 (because 14 + 25 + 29 + 12 = 80)
or a separate table where every group is relabelled as starting at the beginning (often zero)
e.g. rows labelled 0 ≤ x < 20, 0 ≤ x < 40, 0 ≤ x < 60, etc.
or x < 20, x < 40, x < 60, etc.
Class interval | Cumulative frequency
0 ≤ x < 20 (or x < 20) | 14
0 ≤ x < 40 (or x < 40) | 39
0 ≤ x < 60 (or x < 60) | 68
0 ≤ x < 80 (or x < 80) | 80
In the second type of table, you can subtract to find the individual frequencies
e.g. the frequency of the 20 ≤ x < 40 class interval is 39-14=25
the frequency of the 40 ≤ x < 60 class interval is 68-39=29
etc.
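The two directions described above (adding up to get cumulative frequencies, subtracting to get back to individual frequencies) can be sketched in a few lines of Python. This is only an illustration using the frequencies from the table above; the variable names are made up for the example.

```python
from itertools import accumulate

# Frequencies for the classes 0-20, 20-40, 40-60 and 60-80 from the table above
frequencies = [14, 25, 29, 12]

# Running total: each entry is the sum of all frequencies up to the end of that class
cumulative = list(accumulate(frequencies))

# Going the other way: the first frequency is the first cumulative frequency,
# and each later frequency is the difference between neighbouring cumulative frequencies
recovered = [cumulative[0]] + [
    later - earlier for earlier, later in zip(cumulative, cumulative[1:])
]

print(cumulative)  # [14, 39, 68, 80]
print(recovered)   # [14, 25, 29, 12]
```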
Cumulative Frequency Step Polygons
What is a cumulative frequency step polygon?
A cumulative frequency step polygon is a way of representing discrete data
For grouped continuous data a cumulative frequency diagram would be used instead
How do I draw a cumulative frequency step polygon?
This is best explained with an example
The numbers of eggs found in each nest in a survey of 50 American alligator nests are shown in the table below:
Number of eggs | Frequency
29 | 5
30 | 6
31 | 11
32 | 15
33 | 7
34 | 6
Total | 50
Then the cumulative frequency is the running total of the frequencies
Number of eggs | Frequency | Cumulative Frequency
29 | 5 | 5
30 | 6 | 5 + 6 = 11
31 | 11 | 11 + 11 = 22
32 | 15 | 22 + 15 = 37
33 | 7 | 37 + 7 = 44
34 | 6 | 44 + 6 = 50
Total | 50 |
We can now draw the cumulative frequency step polygon
The cumulative frequency will always go on the vertical axis
The values in the data set will appear along the horizontal axis
The cumulative frequency is zero until we get to 29 eggs
So start at the point (29, 0)
Then the cumulative frequency jumps up to 5
Draw a vertical line from (29, 0) to (29, 5) to show this jump
Nothing changes until we get to 30 eggs
Draw a horizontal line from (29, 5) to (30, 5) to show this
At 30 eggs the cumulative frequency jumps up to 11
Draw a vertical line from (30, 5) to (30, 11) to show this jump
Continue the same way through the rest of the values in the table
The graph will end at the point (34, 50)
Here is the final cumulative frequency step polygon for the numbers of eggs
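As a rough check of the drawing steps, the corner points of the step polygon can be listed in the order they would be drawn. The sketch below uses the alligator egg data from the table; the function name `step_polygon_vertices` is invented for this example.

```python
from itertools import accumulate

eggs = [29, 30, 31, 32, 33, 34]
frequencies = [5, 6, 11, 15, 7, 6]

def step_polygon_vertices(values, freqs):
    """Return the corner points of the step polygon in drawing order."""
    cumulative = list(accumulate(freqs))
    # Start on the horizontal axis, then jump up to the first cumulative frequency
    vertices = [(values[0], 0), (values[0], cumulative[0])]
    for i in range(1, len(values)):
        vertices.append((values[i], cumulative[i - 1]))  # horizontal line across
        vertices.append((values[i], cumulative[i]))      # vertical jump up
    return vertices

vertices = step_polygon_vertices(eggs, frequencies)
print(vertices[:4])   # [(29, 0), (29, 5), (30, 5), (30, 11)]
print(vertices[-1])   # (34, 50)
```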
Cumulative Frequency Diagrams
What is a cumulative frequency diagram?
A cumulative frequency diagram is a way of representing grouped continuous data
For discrete data a cumulative frequency step polygon would be used instead
A cumulative frequency diagram can be used to estimate other statistical values
For example the median, quartiles or percentiles
How do I draw a cumulative frequency diagram?
This is best explained with an example
The times taken to complete a short general knowledge quiz taken by 50 students are shown in the table below:
Time taken (seconds) | Frequency
25 to 30 | 3
30 to 35 | 8
… | 17
… | 12
… | 7
… | 3
Total | 50
Then the cumulative frequency is the running total of the frequencies
Time taken (seconds) | Frequency | Cumulative Frequency
25 to 30 | 3 | 3
30 to 35 | 8 | 3 + 8 = 11
… | 17 | 11 + 17 = 28
… | 12 | 28 + 12 = 40
… | 7 | 40 + 7 = 47
… | 3 | 47 + 3 = 50
Total | 50 |
We can now draw the cumulative frequency diagram
The most important part is that cumulative frequency is plotted against the end (upper bound) of the class interval
The end of the class interval is the x-coordinate
The cumulative frequency is the y-coordinate
For the above example the first two points to plot would be (30, 3) and (35, 11)
To explain this, consider the second row (30 to 35 seconds)
the 8 students in this group could have taken any time between 30 and 35 seconds
they cannot all be guaranteed to have been accounted for until we reach 35 seconds
Once all points from the table are plotted, a point for the start needs to be added
this will be at the lowest time from the table
i.e. at 25 seconds with a cumulative frequency of 0
so plot the point (25, 0)
Join points up with a smooth curve (this takes some practice), or by drawing straight lines from each point to the next one (use a ruler)
If you draw a curve, make sure it goes through all of the marked points
It is usually easier to draw straight lines
You will get full marks for either version
In general a cumulative frequency diagram has a stretched-S-shape appearance
a cumulative frequency diagram will never come back towards the x-axis
Here is the final cumulative frequency diagram for the quiz times
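The plotting rule (upper bound of each class against cumulative frequency, plus a starting point at the lowest value with cumulative frequency 0) can be sketched as follows. To avoid guessing any class boundaries, this illustration uses the grouped table from the start of the note (classes 0 to 20, 20 to 40, 40 to 60 and 60 to 80) rather than the quiz data; `diagram_points` is a name made up for the example.

```python
from itertools import accumulate

# (lower bound, upper bound, frequency) for each class interval
classes = [(0, 20, 14), (20, 40, 25), (40, 60, 29), (60, 80, 12)]

def diagram_points(classes):
    """Return the points to plot for a cumulative frequency diagram."""
    lowest = classes[0][0]
    points = [(lowest, 0)]  # starting point: nothing has been counted yet
    cumulative = accumulate(freq for _, _, freq in classes)
    for (_, upper, _), cf in zip(classes, cumulative):
        points.append((upper, cf))  # plot at the END of each class interval
    return points

points = diagram_points(classes)
print(points)  # [(0, 0), (20, 14), (40, 39), (60, 68), (80, 80)]
```

The y-coordinates never decrease, which matches the observation that the diagram never comes back towards the x-axis.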
Interpreting Cumulative Frequency Diagrams
How do I use and interpret a cumulative frequency diagram?
A cumulative frequency diagram provides a way to estimate key facts about the data
median
lower and upper quartiles (and interquartile range)
percentiles
These values will be estimates as the original raw data is unknown
Cumulative frequency diagrams are used with grouped data
Points are joined by a smooth curve or by straight lines
This means the data is assumed to be smoothly spread out over each interval
The median and quartiles are also key features of a box plot
It is possible to draw a box plot from a cumulative frequency diagram
This can make it easier to compare two data sets
How do I find the median, lower quartile and upper quartile from a cumulative frequency diagram?
This is all about understanding how many data values are represented by the cumulative frequency diagram
This may be stated in words within the question
If not, it will be the highest value on the cumulative frequency (y-) axis that the curve on the diagram reaches
This should be "top right" of the curve on a cumulative frequency diagram
Median
STEP 1
Find the position of the median
For $n$ data values, this will be $\frac{n}{2}$
This is different from finding the median from a set of data values
e.g. for a list of 60 data values the median would be halfway between the 30th and 31st values
But for a cumulative frequency diagram it would just be the 30th value
STEP 2
Draw a horizontal line from $\frac{n}{2}$ on the cumulative frequency (y-) axis until it hits the curve
STEP 3
Draw a vertical line from that point on the curve down to the horizontal (x-) axis
The value where that line hits the horizontal axis will be the median
Lower quartile
STEP 1
Find the position of the lower quartile
For $n$ data values this will be $\frac{n}{4}$
STEP 2
Draw a horizontal line from $\frac{n}{4}$ on the cumulative frequency axis until it hits the curve
STEP 3
Draw a vertical line from that point on the curve down to the horizontal (x-) axis
The value where that line hits the horizontal axis will be the lower quartile
Upper quartile
STEP 1
Find the position of the upper quartile
For $n$ data values this will be $\frac{3n}{4}$
STEP 2
Draw a horizontal line from $\frac{3n}{4}$ on the cumulative frequency axis until it hits the curve
STEP 3
Draw a vertical line from that point on the curve down to the horizontal (x-) axis
The value where that line hits the horizontal axis will be the upper quartile
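The "across then down" reading can be imitated numerically if the points are joined with straight lines. The sketch below is an assumption-laden illustration, not an exam method: it reuses the points from the grouped table at the start of the note (80 data values), and the helper `read_across` is a hypothetical name. A hand-drawn smooth curve would give slightly different readings.

```python
def read_across(points, position):
    """Go across from a cumulative frequency to the line, then down to the x-axis.

    Assumes the points are joined by straight lines (linear interpolation).
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if cf0 <= position <= cf1 and cf1 > cf0:
            fraction = (position - cf0) / (cf1 - cf0)
            return x0 + fraction * (x1 - x0)
    raise ValueError("position is outside the range of the diagram")

# Points (upper bound, cumulative frequency), including the starting point (0, 0)
points = [(0, 0), (20, 14), (40, 39), (60, 68), (80, 80)]
n = points[-1][1]  # total number of data values, read from the top right (80)

lower_quartile = read_across(points, n / 4)      # position 20
median = read_across(points, n / 2)              # position 40
upper_quartile = read_across(points, 3 * n / 4)  # position 60
interquartile_range = upper_quartile - lower_quartile

print(round(lower_quartile, 1), round(median, 1), round(upper_quartile, 1))
# 24.8 40.7 54.5
```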
How do I find a percentile from a cumulative frequency diagram?
Percentiles split the data into 100 parts
So the 50th percentile is another way of describing the median
The 25th and 75th percentiles are the same as the lower and upper quartiles (respectively)
To find the pth percentile
STEP 1
Find the position of the pth percentile
For $n$ data values, this will be $\frac{p}{100} \times n$
So for the 10th percentile ($p = 10$) with 60 data values ($n = 60$)
the position is $\frac{10}{100} \times 60 = 6$
STEP 2
Draw a horizontal line from $\frac{p}{100} \times n$ on the cumulative frequency axis until it hits the curve
STEP 3
Draw a vertical line from that point on the curve down to the horizontal (x-) axis
The value where that line hits the horizontal axis will be the pth percentile
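STEP 1 for percentiles is just a short calculation, and it reproduces the median and quartile positions as special cases. A minimal sketch, with `percentile_position` as a made-up helper name:

```python
def percentile_position(p, n):
    """Position on the cumulative frequency axis for the pth percentile of n values."""
    return p * n / 100

print(percentile_position(10, 60))  # 6.0  (10th percentile of 60 values)
print(percentile_position(50, 60))  # 30.0 (same position as the median)
print(percentile_position(25, 60))  # 15.0 (same position as the lower quartile)
print(percentile_position(75, 60))  # 45.0 (same position as the upper quartile)
```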
Worked Example
A company is investigating the length of telephone calls customers make to its help centre.
The company randomly selects 100 phone calls from a particular day.
The results are displayed in the cumulative frequency diagram below.
(a) Estimate the median, the lower quartile and the upper quartile.
There are 100 pieces of data, so $n = 100$.
So the median is the 50th value, the lower quartile is the 25th value and the upper quartile is the 75th value
Draw horizontal lines from these on the cumulative frequency axis until they hit the curve
Then draw vertical lines down to the time of calls axis and take readings
Median = 6.2 minutes (6 m 12 s)
Lower quartile = 4.2 minutes (4 m 12 s)
Upper quartile = 8.2 minutes (8 m 12 s)
There is no need to convert to minutes and seconds unless the question asks you to
However, writing 6 m 2 s or 6 m 20 s would be incorrect
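The conversion between decimal minutes and minutes and seconds is where the 6 m 2 s and 6 m 20 s mistakes come from: the decimal part has to be multiplied by 60. A small sketch, assuming answers are rounded to the nearest second (the function name is invented for the example):

```python
def to_minutes_and_seconds(decimal_minutes):
    """Convert a time in decimal minutes to (whole minutes, seconds)."""
    total_seconds = round(decimal_minutes * 60)  # nearest whole second
    return divmod(total_seconds, 60)

print(to_minutes_and_seconds(6.2))  # (6, 12)
print(to_minutes_and_seconds(4.2))  # (4, 12)
print(to_minutes_and_seconds(8.2))  # (8, 12)
```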
(b) The company is thinking of putting an upper limit of 12 minutes on calls to its help centre.
Estimate the number of these 100 calls that would have been beyond this limit.
Draw a vertical line up from 12 minutes on the time of calls axis until it hits the curve
Then draw a horizontal line across to the cumulative frequency axis and take a reading (in this case, 90)
This tells us that up to 12 minutes, 90 of the calls had been accounted for
The question wants the number of calls that were greater than 12 minutes so subtract this from the total of 100
100 - 90 = 10
Approximately 10 (out of 100) calls were beyond the 12 minute limit
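Part (b) reads the diagram in the opposite direction: up from a value, then across to the cumulative frequency axis, then subtract from the total. The call data behind the diagram is not given in the note, so the sketch below reuses the points from the grouped table at the start of the note and assumes straight lines between points; `read_up` is a hypothetical helper name.

```python
def read_up(points, x):
    """Go up from a value on the x-axis to the line, then across to the y-axis.

    Assumes the points are joined by straight lines (linear interpolation).
    """
    for (x0, cf0), (x1, cf1) in zip(points, points[1:]):
        if x0 <= x <= x1:
            fraction = (x - x0) / (x1 - x0)
            return cf0 + fraction * (cf1 - cf0)
    raise ValueError("value is outside the range of the diagram")

points = [(0, 0), (20, 14), (40, 39), (60, 68), (80, 80)]
total = points[-1][1]

accounted_for = read_up(points, 50)  # estimated number of values up to 50
beyond = total - accounted_for       # estimated number of values above 50

print(accounted_for, beyond)  # 53.5 26.5

# The same subtraction with the reading from the worked example
print(100 - 90)  # 10
```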