Representing Data Numerically (AQA Level 3 Mathematical Studies (Core Maths))

Revision Note

Naomi C

Author

Naomi C

Expertise

Maths

Mean, Median & Mode

What is the mode?

  • The mode is the value that appears the most often

    • The mode of 1, 2, 2, 5, 6 is 2

  • There can be more than one mode

    • The modes of 1, 2, 2, 5, 5, 6 are 2 and 5

  • The mode can also be called the modal value

  • In some situations there may be no mode
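The rules above can be checked with a short Python sketch. The helper name `modes` is hypothetical, and returning an empty list when no value repeats is an assumed convention for "no mode", not something the note specifies.

```python
from collections import Counter


def modes(values):
    """Return the most frequent value(s), in order of first appearance."""
    counts = Counter(values)
    highest = max(counts.values())
    # Assumed convention: if no value repeats, report "no mode" as [].
    if highest == 1:
        return []
    return [value for value, count in counts.items() if count == highest]


print(modes([1, 2, 2, 5, 6]))                # [2]
print(modes([1, 2, 2, 5, 5, 6]))             # [2, 5]
print(modes(["dog", "cat", "cat", "fish"]))  # ['cat']
print(modes([1, 2, 3]))                      # []
```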

What is the median?

  • The median is the middle value when you put values in size order

    • The median of 4, 2, 3 can be found by

      • ordering the numbers: 2, 3, 4

      • and choosing the middle value, 3

  • If you have an even number of values, find the midpoint of the middle two values 

    • The midpoint is the sum of the two middle values divided by 2

    • The median of 1, 2, 3, 4 is 2.5

      • 2.5 is the midpoint of 2 and 3
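As a minimal sketch of the two cases above (odd and even numbers of values), the hypothetical helper `median` below sorts the values and then picks the middle value or the midpoint of the middle two.

```python
def median(values):
    """Return the middle value of the data once it is in size order."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd number of values: take the single middle value.
        return ordered[middle]
    # Even number of values: midpoint of the two middle values.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([4, 2, 3]))     # 3
print(median([1, 2, 3, 4]))  # 2.5
```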

What is the mean?

  • The mean is the sum of the values divided by the number of values

    • The notation $\bar{x}$ is used to represent the mean

    • The mean of 1, 2, 6 is (1 + 2 + 6) ÷ 3 = 3

  • The mean can be a fraction or a decimal

    • It may need rounding

    • You do not need to force it to be a whole number

      • You can have a mean of 7.5 people, for example!

  • The mean is often selected because it uses all of the data

How do I calculate the mean from a frequency table?

  • To find the mean from a frequency table of ungrouped data, use the formula

    $\bar{x} = \dfrac{\sum fx}{\sum f}$

    • where $\sum fx$ is the sum of each data item, $x$, multiplied by its corresponding frequency, $f$

    • and $\sum f$ is the sum of all of the frequencies

  • To find the mean from a frequency table of grouped data

    • Use the same formula as for a frequency table of ungrouped data

    • Use the mid-interval value (midpoint) of each group as the value for x
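The grouped case can be sketched as follows. The function name `grouped_mean` and the table of groups are hypothetical. The result is only an estimate of the mean, because each midpoint stands in for the actual values in its group.

```python
def grouped_mean(groups):
    """Estimate the mean from (lower, upper, frequency) triples."""
    total_fx = 0
    total_f = 0
    for lower, upper, frequency in groups:
        midpoint = (lower + upper) / 2  # mid-interval value used as x
        total_fx += frequency * midpoint
        total_f += frequency
    return total_fx / total_f


# Hypothetical grouped data: 0-10 (f = 3), 10-20 (f = 5), 20-30 (f = 2)
table = [(0, 10, 3), (10, 20, 5), (20, 30, 2)]
print(grouped_mean(table))  # 14.0
```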

How do I know which average to use?

  • The mode, median and mean are different ways to measure an average

    • Units for the mean, median and mode are the same as for the data set

  • In certain situations it is better to use one average over another

  • For example:

    • If the data has extreme values (outliers)

      • Don't use the mean (it's badly affected by extreme values)

    • If the data has more than one mode 

      • Don't use the mode as it is not clear

    • If the data is non-numerical, like dog, cat, cat, fish

      • You can only use the mode

Worked Example

15 students were timed to see how long it took them to solve a mathematical problem. Their times, in seconds, are given below.

12    10    15    14    17
11    12    13    9     21
14    20    19    16    23

(a) Find the mean time, giving your answer to 3 significant figures.

Add up all the numbers (you can add the rows if it helps) 

$12 + 10 + 15 + 14 + 17 = 68$
$11 + 12 + 13 + 9 + 21 = 66$
$14 + 20 + 19 + 16 + 23 = 92$

$\text{Total} = 68 + 66 + 92 = 226$

Divide the total by the number of values (there are 15 values)

$\dfrac{226}{15} = 15.0666\ldots$

Write the mean to 3 significant figures
Remember to include the units

The mean time is 15.1 seconds (to 3 s.f.) 

(b) Find the median time.

Write the times in order and find the middle value

$9 \quad 10 \quad 11 \quad 12 \quad 12 \quad 13 \quad 14 \quad \boxed{14} \quad 15 \quad 16 \quad 17 \quad 19 \quad 20 \quad 21 \quad 23$

The median time is 14 seconds

(c) Explain why the median is a better measure of average time than the mode.

Try to find the mode (the number that occurs the most)

There are two modes: 12 and 14

Explain why the median is better

There is no clear mode (there are two modes, 12 and 14),
so the median is better

(d) If a 16th student has a time of 95 seconds, explain why the median of all 16 students would be a better measure of average time than the mean.

The 16th value of 95 is extreme (very high) compared to the other values
Means are affected by extreme values

The mean will be pulled upwards by the extreme value of 95 (from about 15.1 to about 20.1 seconds),
whereas the median will barely change (from 14 to 14.5 seconds)
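The answers to this worked example can be checked with Python's standard `statistics` module. This is only a sketch for checking the arithmetic, not an exam method. `statistics.multimode` needs Python 3.8 or later.

```python
import statistics

times = [12, 10, 15, 14, 17, 11, 12, 13, 9, 21, 14, 20, 19, 16, 23]

mean_time = statistics.mean(times)         # 226 / 15
median_time = statistics.median(times)     # 8th value once sorted
modal_times = statistics.multimode(times)  # every value tied for most frequent

# Rounding to 1 d.p. happens to match 3 significant figures here.
print(round(mean_time, 1), median_time, modal_times)  # 15.1 14 [12, 14]

# Part (d): add the 16th student's time of 95 seconds.
with_extra = times + [95]
print(statistics.mean(with_extra), statistics.median(with_extra))  # 20.0625 14.5
```

The mean moves by about 5 seconds while the median moves by only 0.5 seconds.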

Worked Example

The frequency table below shows the number of pets owned by 30 students in a class.

Number of pets, x | 0 | 1  | 2 | 3 | 4
Frequency, f      | 5 | 13 | 7 | 4 | 1

Work out the mean number of pets owned.

Multiply each data item by its corresponding frequency and add together to find $\sum fx$

$\sum fx = 0 \times 5 + 1 \times 13 + 2 \times 7 + 3 \times 4 + 4 \times 1 = 43$

The sum of the frequencies, $\sum f$, is the total number of students in the class

$\sum f = 30$

Use the formula, $\bar{x} = \dfrac{\sum fx}{\sum f}$, to calculate the mean, $\bar{x}$

$\bar{x} = \dfrac{43}{30} = 1.43333\ldots$

Round to an appropriate degree of accuracy

The mean number of pets owned by students in the class is 1.43 (3 s.f.)
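The same calculation can be reproduced in a few lines of Python. Storing the table as a dictionary that maps each value of x to its frequency f is an illustrative choice.

```python
# Number of pets (x) mapped to frequency (f), taken from the table above.
pets = {0: 5, 1: 13, 2: 7, 3: 4, 4: 1}

sum_fx = sum(x * f for x, f in pets.items())  # sum of f times x
sum_f = sum(pets.values())                    # sum of the frequencies
mean_pets = sum_fx / sum_f

print(sum_fx, sum_f, round(mean_pets, 2))  # 43 30 1.43
```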

Range, Quartiles & Outliers

What are quartiles?

  • Quartiles are measures of location

  • Quartiles divide a population or data set into four equal sections

    • The lower quartile, $Q_1$, splits the lowest 25% from the highest 75%

    • The median, $Q_2$, splits the lowest 50% from the highest 50%

    • The upper quartile, $Q_3$, splits the lowest 75% from the highest 25%

  • There are different methods for finding quartiles, depending on the number of items in the data set, $n$

    • First, list the items in size order

    • When finding the median and quartiles from raw data:

      • The median will be at position $\dfrac{n + 1}{2}$

      • The lower quartile will be at position $\dfrac{n + 1}{4}$

      • The upper quartile will be at position $\dfrac{3(n + 1)}{4}$

    • For larger data sets:

      • The median will be at position $\dfrac{n}{2}$

      • The lower quartile will be at position $\dfrac{n}{4}$

      • The upper quartile will be at position $\dfrac{3n}{4}$

      • The use of $n + 1$ rather than $n$ will still be accepted, however
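The raw-data position rule can be sketched in Python as below. The helper names and the data are hypothetical. The note does not say what to do when a position is not a whole number, so the linear interpolation used here is an assumption.

```python
def value_at_position(ordered, position):
    """Return the value at a 1-based position in an ordered list.

    Assumption: a fractional position is handled by linear interpolation
    between the two neighbouring values.
    """
    whole = int(position)
    fraction = position - whole
    if fraction == 0:
        return ordered[whole - 1]
    lower = ordered[whole - 1]
    upper = ordered[whole]
    return lower + fraction * (upper - lower)


def quartiles(values):
    """Return (Q1, Q2, Q3) using positions based on n + 1."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("this sketch needs at least 3 values")
    q1 = value_at_position(ordered, (n + 1) / 4)
    q2 = value_at_position(ordered, (n + 1) / 2)
    q3 = value_at_position(ordered, 3 * (n + 1) / 4)
    return q1, q2, q3


data = [7, 3, 12, 5, 9, 1, 15, 8, 10, 4, 6]  # hypothetical data, n = 11
print(quartiles(data))  # (4, 7, 10): the 3rd, 6th and 9th ordered values
```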

What are the range and interquartile range?

  • The range and interquartile range are both measures of dispersion

    • They describe how spread out the data is

  • The range is the largest value of the data minus the smallest value of the data

  • The interquartile range is the range of the central 50% of data

    • It is the upper quartile minus the lower quartile

$\text{IQR} = Q_3 - Q_1$

  • The units for the range and interquartile range are the same as the units for the data

  • The range can be affected by outliers (extreme values)

    • Outliers will not affect the interquartile range
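The different sensitivity of the two measures can be illustrated with a small sketch. The data are illustrative and the helper `spread` is hypothetical. `statistics.quantiles` with `method="exclusive"` interpolates at positions based on n + 1; treating it as equivalent to the position rule taught here is an assumption.

```python
import statistics


def spread(values):
    """Return (range, interquartile range) for a list of numbers."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    return max(values) - min(values), q3 - q1


data = [43, 29, 70, 51, 64, 43, 44]
# Same data with the largest value replaced by a hypothetical extreme value.
with_extreme = [43, 29, 700, 51, 64, 43, 44]

print(spread(data))          # (41, 21.0)
print(spread(with_extreme))  # (671, 21.0)
```

The range jumps from 41 to 671, while the interquartile range stays at 21 because the quartiles sit away from the extreme value.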

Exam Tip

If asked to find the range, or the interquartile range, in an exam, make sure you show your subtraction clearly (don't just write down the answer)

What are outliers?

  • Outliers are extreme data values that do not fit with the rest of the data

    • They are either a lot bigger or a lot smaller than the rest of the data

  • Outliers are defined as values that are more than $1.5 \times \text{IQR}$ from the nearest quartile

    • $x$ is an outlier if $x < Q_1 - 1.5 \times \text{IQR}$ or $x > Q_3 + 1.5 \times \text{IQR}$

  • Outliers can have a big effect on some statistical measures
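The outlier rule can be sketched in Python using the student times from the earlier worked example, including the 16th student's 95 seconds. The helper `find_outliers` is hypothetical. The quartiles come from `statistics.quantiles` with `method="exclusive"`, which is assumed here to be an acceptable stand-in for the position rule; other quartile methods can give slightly different boundaries.

```python
import statistics


def find_outliers(values):
    """Return values more than 1.5 x IQR beyond the nearest quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="exclusive")
    iqr = q3 - q1
    lower_boundary = q1 - 1.5 * iqr
    upper_boundary = q3 + 1.5 * iqr
    return [x for x in values if x < lower_boundary or x > upper_boundary]


times = [9, 10, 11, 12, 12, 13, 14, 14, 15, 16, 17, 19, 20, 21, 23, 95]
print(find_outliers(times))  # [95]
```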

Should I remove outliers?

  • The decision to remove outliers will depend on the context

  • Outliers should be removed if they are found to be errors

    • The data may have been recorded incorrectly

    • For example, the number 17 may have been recorded as 71 by mistake

  • Outliers should not be removed if they are a valid part of the sample

    • The data may need to be checked to verify that it is not an error

    • For example, the annual salaries of employees of a business might appear to have an outlier but this could be the director’s salary

Worked Example

Find the range and interquartile range for the data set given below.

43    29    70    51    64    43    44

Find the range by subtracting the minimum value from the maximum value

$70 - 29$

Range = 41

Arrange the values from smallest to largest

$29 \quad 43 \quad 43 \quad 44 \quad 51 \quad 64 \quad 70$

The lower quartile will be at position $\dfrac{n + 1}{4}$

$\dfrac{7 + 1}{4} = 2$, so the lower quartile is the 2nd value

$Q_1 = 43$

The upper quartile will be at position $\dfrac{3(n + 1)}{4}$

$\dfrac{3(7 + 1)}{4} = 6$, so the upper quartile is the 6th value

$Q_3 = 64$

Subtract the lower quartile from the upper quartile to find the interquartile range

$\text{IQR} = 64 - 43$

IQR = 21

Standard Deviation

What is standard deviation?

  • The standard deviation, $\sigma$, is a measure of dispersion

    • It describes how spread out the data is in relation to the mean

    • The greater the value of the standard deviation, the more spread out the data is

  • The units for the standard deviation are the same as the units for the data

How is standard deviation calculated?

  • The standard deviation is the square root of the average of the squared differences between each value and the mean (for a sample, the sum of the squared differences is divided by $n - 1$ rather than $n$)

  • The formula used in this course for standard deviation is the standard deviation for a sample

$\sigma_{n-1} = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$

  • You can calculate the standard deviation of a small data set by hand

  • You can also enter the data into your calculator and use the statistics calculation options to find the standard deviation of a data set

Exam Tip

If you use your calculator to find the standard deviation, make sure that you find the standard deviation for a sample, $\sigma_{n-1}$, and not the standard deviation for a population, $\sigma_n$
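The difference between the two versions can be seen in a short Python sketch with hypothetical data. In Python's standard `statistics` module, `stdev` divides by n − 1 (sample) and `pstdev` divides by n (population).

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data with mean 5

sample_sd = statistics.stdev(data)       # divides by n - 1
population_sd = statistics.pstdev(data)  # divides by n

print(round(sample_sd, 3), round(population_sd, 3))  # 2.138 2.0
```

The sample version is always the larger of the two, so picking the wrong calculator option gives a noticeably different answer for small data sets.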

Worked Example

Find the standard deviation for the data set given below.

43    29    70    51    64    43

Method 1: Calculator

You can calculate the standard deviation using your calculator

Input all of the values into a spreadsheet

Select the statistics calculation option and find the value for the standard deviation

$\sigma_{n-1} = 15.07315\ldots$

Round appropriately

15.1 (to 1 d.p.)

Method 2: By hand

To calculate the standard deviation by hand, use the formula $\sigma_{n-1} = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}}$
Start by finding the mean, $\bar{x}$

$\bar{x} = \dfrac{43 + 29 + 70 + 51 + 64 + 43}{6} = 50$

Find the difference between each data item and the mean, square it and add the results together to find $\sum (x - \bar{x})^2$

$\sum (x - \bar{x})^2 = (43 - 50)^2 + (29 - 50)^2 + (70 - 50)^2 + (51 - 50)^2 + (64 - 50)^2 + (43 - 50)^2$
$= (-7)^2 + (-21)^2 + 20^2 + 1^2 + 14^2 + (-7)^2$
$= 49 + 441 + 400 + 1 + 196 + 49$
$= 1136$

Divide the result by 1 less than the number of data items, n minus 1

$\dfrac{1136}{5} = 227.2$

Finally take the square root of the result

$\sqrt{227.2} = 15.07315\ldots$

Round appropriately

15.1 (to 1 d.p.)
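The by-hand steps of Method 2 can be mirrored in Python as a check. This is a sketch of the arithmetic only; the variable names are illustrative. The final line compares the result with `statistics.stdev`, the sample standard deviation in Python's standard library.

```python
import math
import statistics

data = [43, 29, 70, 51, 64, 43]

n = len(data)
mean = sum(data) / n                             # 300 / 6 = 50.0
squared_diffs = [(x - mean) ** 2 for x in data]  # 49, 441, 400, 1, 196, 49
sum_sq = sum(squared_diffs)                      # 1136.0
variance = sum_sq / (n - 1)                      # 1136 / 5 = 227.2
sd = math.sqrt(variance)

print(round(sd, 1))  # 15.1
print(math.isclose(sd, statistics.stdev(data)))  # True
```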


Author: Naomi C

Naomi graduated from Durham University in 2007 with a Masters degree in Civil Engineering. She has taught Mathematics in the UK, Malaysia and Switzerland covering GCSE, IGCSE, A-Level and IB. She particularly enjoys applying Mathematics to real life and endeavours to bring creativity to the content she creates.