Correlation & Regression (Edexcel International AS Maths: Statistics 1)

Revision Note

Author

Dan

PMCC

What is the product moment correlation coefficient?

  • The product moment correlation coefficient (PMCC) is a way of giving a numerical value to linear correlation of bivariate data
  • The PMCC of a sample is denoted by the letter r
    • r can take any value such that $-1 \le r \le 1$
      • This can be written as $|r| \le 1$
    • A positive value of r describes positive correlation
    • A negative value of r describes negative correlation
    • If r = 0 there is no linear correlation
    • r = 1 means perfect positive correlation and r = -1 means perfect negative correlation
    • The closer r is to 1 or -1, the stronger the correlation
  • The value of r measures how closely the points lie to a straight line, not how steep that line is, so the gradient of the regression line does not change the value of r
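
The properties listed above can be checked numerically. The following is a minimal Python sketch, not part of the original note: the helper name `pmcc` and the data values are made up for illustration, and the helper uses the deviation form of the formula for r that is defined later in this note.

```python
from math import sqrt


def pmcc(xs, ys):
    """Product moment correlation coefficient of paired data (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Sums of squared deviations and cross-deviations from the means
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    s_yy = sum((y - y_bar) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)


# Made-up illustrative data
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]

r = pmcc(xs, ys)                                  # strong positive correlation
r_steeper = pmcc(xs, [10 * y for y in ys])        # same scatter, steeper line
r_flipped = pmcc(xs, [-y for y in ys])            # reflected data, sign of r changes
r_perfect_pos = pmcc(xs, [3 + 2 * x for x in xs])  # points exactly on a rising line
r_perfect_neg = pmcc(xs, [3 - 2 * x for x in xs])  # points exactly on a falling line

print(round(r, 3), round(r_steeper, 3), round(r_flipped, 3))
print(round(r_perfect_pos, 3), round(r_perfect_neg, 3))
```

Multiplying every y value by 10 makes the line of best fit ten times steeper but leaves r unchanged, which is the sense in which the gradient does not affect r.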

[Figure: PMCC diagram]

How is the product moment correlation coefficient (PMCC) calculated?

  • For n pairs of bivariate data (x, y) we define the following statistics
    • $S_{xx} = \sum x^2 - \frac{\left(\sum x\right)^2}{n}$
    • $S_{yy} = \sum y^2 - \frac{\left(\sum y\right)^2}{n}$
    • $S_{xy} = \sum xy - \frac{\left(\sum x\right)\left(\sum y\right)}{n}$
    • These are given in the formula booklet
  • These are related to variance and can be written in several different ways:
    • $S_{xx}$
      • $\sum \left(x - \bar{x}\right)^2$
      • $n\sigma_x^2$
    • $S_{xy}$
      • $\sum xy - n\bar{x}\,\bar{y}$
  • The product moment correlation coefficient (PMCC) is then calculated using the formula
    • $r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
    • This is given in the formula booklet
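
As a rough illustration of the formulae above, the sketch below computes the three summary statistics and r for a small made-up data set, then checks that the alternative forms agree. The function name `summary_statistics` and the data are assumptions for illustration only.

```python
from math import sqrt
from statistics import mean, pvariance


def summary_statistics(xs, ys):
    """Return (S_xx, S_yy, S_xy) using the formula-booklet forms (hypothetical helper)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    return s_xx, s_yy, s_xy


# Made-up data, chosen only to keep the arithmetic simple
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 4, 8, 9]
n = len(xs)

s_xx, s_yy, s_xy = summary_statistics(xs, ys)

# Alternative forms of the same statistics
x_bar, y_bar = mean(xs), mean(ys)
s_xx_deviations = sum((x - x_bar) ** 2 for x in xs)   # sum of (x - x_bar)^2
s_xx_variance = n * pvariance(xs)                     # n times the variance of x
s_xy_means = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar

# Product moment correlation coefficient
r = s_xy / sqrt(s_xx * s_yy)

print(s_xx, s_yy, s_xy)
print(round(r, 3))
```

For this data the statistics are S_xx = 40, S_yy = 46 and S_xy = 42, giving r ≈ 0.979.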

Calculating Regression Line

If the PMCC is close to 1 or -1 then this suggests the data follows a linear model. In this case a regression line of the form y = a + bx is appropriate.

How do I calculate the equation of the regression line of y on x?

  • The gradient b of the regression line is calculated using the formula
    • $b = \frac{S_{xy}}{S_{xx}}$
    • This is given in the formulae booklet
  • The y-intercept a of the regression line is calculated using the formula
    • $a = \bar{y} - b\bar{x}$
    • This is given in the formulae booklet
    • This is found using the fact that the point $\left(\bar{x}, \bar{y}\right)$ lies on the regression line
  • If you are asked to find the equation of the regression line of x on y, swap the roles of x and y in the formulae above
    • The line has the form x = c + dy
    • $d = \frac{S_{xy}}{S_{yy}}$
    • $c = \bar{x} - d\bar{y}$
    • These are not given in the formulae booklet
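
The two regression lines can be sketched in Python as follows. This is an illustrative example under stated assumptions: the function name `regression_lines` and the data set are made up, and the coefficients follow the formulae listed above.

```python
def regression_lines(xs, ys):
    """Return ((a, b), (c, d)) for y = a + bx and x = c + dy (hypothetical helper)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    s_xx = sum(x * x for x in xs) - sum(xs) ** 2 / n
    s_yy = sum(y * y for y in ys) - sum(ys) ** 2 / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum(xs) * sum(ys) / n

    # Regression line of y on x
    b = s_xy / s_xx
    a = y_bar - b * x_bar

    # Regression line of x on y
    d = s_xy / s_yy
    c = x_bar - d * y_bar

    return (a, b), (c, d)


# Made-up data, chosen only to keep the arithmetic simple
xs = [2, 4, 6, 8, 10]
ys = [1, 3, 4, 8, 9]

(a, b), (c, d) = regression_lines(xs, ys)
x_bar = sum(xs) / len(xs)
y_bar = sum(ys) / len(ys)

print(f"y on x: y = {a:.3g} + {b:.3g}x")
print(f"x on y: x = {c:.3g} + {d:.3g}y")

# Both lines pass through the mean point (x_bar, y_bar)
print(abs(a + b * x_bar - y_bar) < 1e-9, abs(c + d * y_bar - x_bar) < 1e-9)
```

For this data the sketch gives y = -1.3 + 1.05x and approximately x = 1.43 + 0.913y. The two lines are different equations, so the x on y line is not obtained by rearranging the y on x line.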

Worked example

Ashika is a football coach to 20 children. She records how long it takes each of them to run a lap of the football pitch, p seconds, and the distance that they can kick the football, d metres. 

Ashika calculates the following summary statistics:

$S_{pp} = 687.2 \qquad \bar{p} = 62.8 \qquad \sum d = 1566 \qquad \sum d^2 = 124240 \qquad \sum pd = 99127$

(a) Calculate $S_{pd}$.

(b) Calculate the product moment correlation coefficient between p and d.

(c) Calculate the equation of the regression line of d on p, giving your answer in the form $d = a + bp$.

[Figures: worked solutions to parts (a), (b) and (c)]
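
The three parts of this worked example can be checked numerically from the summary statistics given in the question. The following is a sketch of one possible calculation, not a reproduction of the original solution: the variable names are chosen for illustration, and it assumes n = 20 because there are 20 children.

```python
from math import sqrt

# Values given in the question
n = 20            # number of children
s_pp = 687.2
p_bar = 62.8
sum_d = 1566
sum_d2 = 124240
sum_pd = 99127

# (a) S_pd needs the sum of p, recovered from the mean: sum_p = n * p_bar
sum_p = n * p_bar
s_pd = sum_pd - sum_p * sum_d / n

# (b) r needs S_dd as well
s_dd = sum_d2 - sum_d ** 2 / n
r = s_pd / sqrt(s_pp * s_dd)

# (c) Regression line of d on p: d = a + bp
b = s_pd / s_pp
d_bar = sum_d / n
a = d_bar - b * p_bar

print(f"S_pd = {s_pd:.1f}")
print(f"r = {r:.3f}")
print(f"d = {a:.3g} + {b:.3g}p")
```

Under these assumptions the calculation gives S_pd = 782.2, r ≈ 0.741 and d = 6.82 + 1.14p, with the coefficients rounded to 3 significant figures.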

Examiner Tip

  • Questions typically use variables other than x and y. It may help to relabel the independent variable as x and the dependent variable as y before calculating the equation of the regression line.

Author: Dan

Expertise: Maths

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.