Lines of Best Fit (Edexcel GCSE Statistics)

Revision Note

Lines of Best Fit

What is a line of best fit?

  • If a scatter graph suggests that there is a positive or negative correlation

    • a line of best fit can be drawn on the scatter graph

      • This can then be used to make predictions

How do I draw a line of best fit?

  • A line of best fit can often be drawn by eye

    • It is a straight line (use a ruler!)

    • It must extend across the full data set

    • There should be roughly the same number of points on each side of the line (along its whole length)

    • The spaces between the points and the line should roughly be the same on either side

  • If there is one extreme value (outlier) that does not fit the general pattern

    • then ignore this point when drawing a line of best fit

What is the double mean point?

  • A question may talk about the double mean point

    • This is the point $(\bar{x}, \bar{y})$

      • $\bar{x}$ is the mean of the data values that are plotted along the x-axis

      • $\bar{y}$ is the mean of the data values that are plotted along the y-axis

    • The question may give you the values of $\bar{x}$ and $\bar{y}$

      • Or you may need to calculate the means from the data

  • If a question mentions the double mean point, then the line of best fit must go through the double mean point

    • It should still follow all the other rules for drawing a line of best fit (roughly same number of points on each side, etc.)

  • If a question doesn't mention the double mean point

    • then you don't need to calculate it or worry about drawing the line through it
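Calculating the double mean point from the data can be sketched in Python. This is an illustration only: it uses the price and time values from the Worked Example on this page (that question does not itself ask for the double mean point), and the function name `double_mean_point` is made up for this sketch.

```python
from statistics import mean

# Data taken from the Worked Example on this page
prices = [320, 300, 400, 650, 220, 380, 900, 700]   # x-axis values (£)
times = [3.2, 5.3, 4.1, 2.9, 5.1, 4.3, 2.6, 3.8]    # y-axis values (seconds)


def double_mean_point(xs, ys):
    """Return the point (mean of the x-values, mean of the y-values)."""
    return mean(xs), mean(ys)


x_bar, y_bar = double_mean_point(prices, times)
print(round(x_bar, 4), round(y_bar, 4))
```

The prices add up to 3870 and the times to 31.3, so with 8 computers the double mean point is (483.75, 3.9125). If a question asked for it, the line of best fit would have to pass through that point.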

How do I use a line of best fit?

  • The line of best fit can be used to predict the value of one variable from the other variable

    • See the Worked Example

  • Predictions should only be made for values that are within the range of the given data

    • Making a prediction within the range of the given data is called interpolation

      • This will normally give a reliable result

    • Making a prediction outside of the range of the given data is called extrapolation 

      • This is much less reliable
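The difference between interpolation and extrapolation comes down to a range check, which can be sketched as below. The function name `prediction_type` is made up for this sketch, the prices are those from the Worked Example on this page, and £500 is simply an illustrative value inside the range.

```python
def prediction_type(x, xs):
    """Classify a prediction at x against the range of the plotted x-values."""
    if min(xs) <= x <= max(xs):
        return "interpolation"   # inside the data range: normally reliable
    return "extrapolation"       # outside the data range: much less reliable


# Prices (£) from the Worked Example on this page
prices = [320, 300, 400, 650, 220, 380, 900, 700]

print(prediction_type(500, prices))    # interpolation
print(prediction_type(1500, prices))   # extrapolation
```

The prices run from £220 to £900, so a prediction for a £500 computer is interpolation, while a prediction for a £1500 computer is extrapolation.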

What about the gradient and y-intercept of a line of best fit?

  • You need to be able to interpret the meaning of the gradient and y-intercept of a line of best fit

  • The gradient is the slope or 'steepness' of the line

    • A question may tell you the gradient of the line of best fit

    • If you need to find it you can calculate it using 'rise over run'

      • Pick two points on the line with coordinates $(x_1, y_1)$ and $(x_2, y_2)$

      • $\text{gradient} = \dfrac{y_2 - y_1}{x_2 - x_1}$

      • Be careful – the plotted data points will usually not be points on the line!

  • The gradient of the line of best fit tells you the rate of change of the y-axis variable with respect to the x-axis variable

    • This needs to be interpreted in context

      • For example, if the x-axis variable is distance travelled in a taxi (in miles) and the y-axis variable is the cost of the taxi ride (in pounds, £)

      • then the gradient of the line of best fit (in £ per mile) is the increase in cost, in pounds, for each extra mile travelled

  • The y-intercept is the value of the y-coordinate at the point where the line crosses the y-axis

    • This can be read off the graph

  • The y-intercept of the line of best fit tells you the value of the y-axis variable when the x-axis variable is equal to zero

    • This needs to be interpreted in context

      • For example, if the x-axis variable is distance travelled in a taxi (in miles) and the y-axis variable is the cost of the taxi ride (in pounds, £)

      • then the y-intercept of the line of best fit tells you the 'flat fee' that is added onto every taxi ride
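The 'rise over run' calculation and the taxi interpretation can be sketched together. The two points below are hypothetical values imagined as read off a taxi-fare line of best fit (they are not data from this page), and the function names are made up for this sketch. In an exam the y-intercept would normally be read off the graph; here it is calculated from y = mx + c as a check.

```python
def gradient(p1, p2):
    """Rise over run between two points that lie ON the line of best fit."""
    (x1, y1), (x2, y2) = p1, p2
    return (y2 - y1) / (x2 - x1)


def y_intercept(point, m):
    """Rearrange y = m*x + c to get c, using one point on the line."""
    x, y = point
    return y - m * x


# Hypothetical points on the line: (distance in miles, cost in £)
p1 = (2, 7)
p2 = (6, 15)

m = gradient(p1, p2)      # (15 - 7) / (6 - 2) = 2.0
c = y_intercept(p1, m)    # 7 - 2.0 * 2 = 3.0

print(f"Each extra mile adds £{m:.2f} to the fare")
print(f"The flat fee on every ride is £{c:.2f}")
```

With these made-up points the gradient is 2, meaning £2 per mile, and the y-intercept is 3, meaning a flat fee of £3.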

Exam Tip

  • Sliding a ruler around a scatter graph can help to find the right position for the line of best fit!

  • Remember to draw the line through the double mean point if the question mentions it

Worked Example

Sophie wants to know if the price of a computer is related to the speed of the computer.

She tests 8 computers by running the same program on each, measuring how many seconds it takes to finish.

Sophie's results are shown in the table below.

Price (£)      320   300   400   650   220   380   900   700

Time (secs)    3.2   5.3   4.1   2.9   5.1   4.3   2.6   3.8

(a) Draw a scatter diagram showing these results.

Plot each point carefully using crosses 

A scatter diagram drawn from the data in the question

(b) Write down the type of correlation shown and interpret this in the context of the question. 

The shape formed by the points goes from top left to bottom right (negative gradient), so there is negative correlation
As one quantity increases (price), the other decreases (time)
Note that time decreasing means that the computer is running faster

The graph shows a negative correlation
This means that the more a computer costs, the quicker it is at running the program

(c) Use a line of best fit to estimate the price of a computer that completes the task in 3.4 seconds.

First draw a line of best fit, by eye
Then draw a horizontal line from 3.4 seconds to the line of best fit
Draw a vertical line down to read off the price

A line of best fit drawn on a scatter diagram

A computer that takes 3.4 seconds to run the program should cost around £620

A range of different answers would be accepted, depending on the line of best fit
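A line drawn by eye cannot be reproduced exactly in code, so the sketch below swaps in a calculated least-squares line as a rough cross-check on the estimate. This is not the method expected in the exam, and the function name `least_squares_line` is made up for this sketch.

```python
# Data from the table in this Worked Example
prices = [320, 300, 400, 650, 220, 380, 900, 700]   # x-axis values (£)
times = [3.2, 5.3, 4.1, 2.9, 5.1, 4.3, 2.6, 3.8]    # y-axis values (seconds)


def least_squares_line(xs, ys):
    """Return (gradient, y-intercept) of the least-squares line of y on x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    m = s_xy / s_xx
    c = y_bar - m * x_bar   # this line always passes through (x_bar, y_bar)
    return m, c


m, c = least_squares_line(prices, times)

# Rearrange time = m * price + c to estimate the price for a given time
target_time = 3.4
estimated_price = (target_time - c) / m

print(f"gradient = {m:.5f} seconds per £, y-intercept = {c:.2f} seconds")
print(f"estimated price for {target_time} seconds: about £{estimated_price:.0f}")
```

The calculated gradient is negative, matching the negative correlation in part (b), and the estimate lands in the mid-£600s. That is close to the by-eye answer, which is why a range of answers is accepted.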

(d) Explain why the scatter diagram should not be used to estimate the time taken to complete the task by a computer costing £1500.

£1500 is outside the range of the data, so estimating that from the scatter diagram would be extrapolation

Using the diagram for a computer costing £1500 would be extrapolation, and results from extrapolation are usually unreliable


Author: Roger B

Roger's teaching experience stretches all the way back to 1992, and in that time he has taught students at all levels between Year 7 and university undergraduate. Having conducted and published postgraduate research into the mathematical theory behind quantum computing, he is more than confident in dealing with mathematics at any level the exam boards might throw at you.