PMCC & Non-linear Regression (Edexcel A Level Maths: Statistics)

Revision Note

Author

Amber

Product Moment Correlation Coefficient (PMCC)

What is the product moment correlation coefficient?

  • The product moment correlation coefficient (PMCC) is a way of giving a numerical value to linear correlation of bivariate data
  • The PMCC of a sample is denoted by the letter r
    • $r$ can take any value such that $-1 \le r \le 1$
    • A positive value of $r$ describes positive correlation
    • A negative value of $r$ describes negative correlation
    • If $r = 0$ there is no linear correlation
    • $r = 1$ means perfect positive correlation and $r = -1$ means perfect negative correlation
    • The closer $r$ is to 1 or -1, the stronger the correlation
  • The size of the gradient of the line of best fit does not change the value of $r$
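
As a quick check of that last point, the sketch below computes r for three made-up data sets whose points lie exactly on straight lines with different gradients. The function name `pmcc` and the data are illustrative assumptions. The formula used, r = Sxy / √(Sxx Syy), is the standard definition and is included only to show where the value comes from.

```python
from math import sqrt

def pmcc(xs, ys):
    """Product moment correlation coefficient from its standard definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    s_yy = sum((y - mean_y) ** 2 for y in ys)
    return s_xy / sqrt(s_xx * s_yy)

xs = [1, 2, 3, 4, 5]                   # made-up x values
shallow = [0.5 * x + 1 for x in xs]    # points on a line with gradient 0.5
steep = [10 * x + 1 for x in xs]       # points on a line with gradient 10
falling = [-3 * x + 20 for x in xs]    # points on a line with gradient -3

r_shallow = pmcc(xs, shallow)
r_steep = pmcc(xs, steep)
r_falling = pmcc(xs, falling)
print(round(r_shallow, 6), round(r_steep, 6), round(r_falling, 6))
```

Both rising lines give r = 1 even though one is twenty times steeper than the other. Only the sign of the gradient shows up in r, as the falling line gives r = -1.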

[Diagram illustrating the PMCC]

How is the product moment correlation coefficient calculated?

  • You must learn how to use your calculator to calculate the value of the PMCC, $r$, for the relationship between two variables
  • All calculators are different, so make sure you can calculate the PMCC on your own calculator
    • Make sure you know how to put your calculator into statistics mode
      • You will be given the option to turn the frequency on or off; choose off for most calculations of the PMCC
    • In statistics mode there will be a ‘statistics’ option, followed by a regression option in the form A + BX
      • Your calculator will give you two columns into which you can input the $x$ and $y$ data values
    • Once the data have been entered into your calculator, choose the ‘r’ value from the ‘STAT’ options
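
If you want to check a calculator answer, the sketch below works out r from the summations Σx, Σy, Σx², Σy² and Σxy. The data values and the function name `pmcc_from_sums` are hypothetical, chosen only for illustration. It is not a substitute for knowing your own calculator.

```python
from math import sqrt

def pmcc_from_sums(xs, ys):
    """Compute r using Sxy, Sxx and Syy built from the raw summations."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    s_xy = sum(x * y for x, y in zip(xs, ys)) - sum_x * sum_y / n
    s_xx = sum(x * x for x in xs) - sum_x ** 2 / n
    s_yy = sum(y * y for y in ys) - sum_y ** 2 / n
    return s_xy / sqrt(s_xx * s_yy)

# Hypothetical bivariate data, entered as two columns like on a calculator
x_values = [2, 4, 6, 8, 10]
y_values = [3, 7, 8, 12, 15]

r = pmcc_from_sums(x_values, y_values)
print(f"r = {r:.4f}")
```

For these values Sxy = 58, Sxx = 40 and Syy = 86, so r is close to 1 and the data show strong positive correlation.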

Worked example

[Diagram: three scatter diagrams]

Three scatter diagrams, showing observations from different bivariate data sets, are shown above.

(i)
Match each of the three scatter diagrams shown above to one of the values of $r$ given below.  You should use each given value of $r$ no more than once.
$r = -0.7134$
$r = 0.1652$
$r = 0.8134$
$r = -0.9993$
(ii)
Sketch a scatter diagram for the remaining value of r listed above. 

[Diagram: worked solution]

Non-linear Regression

At AS Level you learned how to use linear regression models to describe a relationship between two variables. However, it is possible for two variables to have a relationship that does not fit a linear model, but still shows a pattern such as exponential growth or decay. A linear regression model is only appropriate if the PMCC is close to 1 or -1.

What forms can non-linear regression models take?

  • If a bivariate data set appears to have a non-linear relationship it could fit an exponential or power model
    • A non-linear regression model could take the form $y = ax^n$ or $y = kb^x$ where $a$, $n$, $k$ and $b$ are constants
  • It is possible to use logarithms to rearrange the non-linear form of the model to obtain a linear regression model which can then be used to examine trends in the data
    • If the regression model takes the form $y = ax^n$, the $x$ and $y$ values should be coded using $X = \log x$ and $Y = \log y$
      • If $y = ax^n$ for constants $a$ and $n$, then $\log y = \log a + n \log x$, or $Y = nX + \log a$
      • Plotting $\log y$ against $\log x$ will give a linear graph
      • The intercept on the vertical axis would be $\log a$ and the gradient of the line would be $n$
      • This can be shown by taking logarithms of both sides
    • If the regression model takes the form $y = kb^x$, the $x$ and $y$ values should be coded using $X = x$ and $Y = \log y$
      • If $y = kb^x$ for constants $k$ and $b$, then $\log y = \log k + x \log b$, or $Y = (\log b)X + \log k$
      • Plotting $\log y$ against $x$ will give a linear graph
      • The intercept on the vertical axis would be $\log k$ and the gradient of the line would be $\log b$
      • This can be shown in the same way by taking logarithms of both sides
      • For example:

$y = kb^x$

Take logarithms of both sides

$\log y = \log\left(kb^x\right)$

Use the addition law for logarithms

$\log y = \log k + \log b^x$

Use the power law for logarithms

$\log y = \log k + x \log b$
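
The same steps applied to the power model, with constants a and n, give the corresponding linear form:

```latex
\begin{align*}
y &= a x^{n} \\
\log y &= \log\left(a x^{n}\right) \\
\log y &= \log a + \log x^{n} \\
\log y &= \log a + n \log x
\end{align*}
```

Writing X for log x and Y for log y turns the last line into Y = nX + log a, a straight line with gradient n and intercept log a.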

  • Using logarithms to code the data in this way is called changing the variables
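
A minimal sketch of changing the variables for the exponential model: the y values are coded as Y = log y, a least squares regression line of Y on X is fitted, and k and b are recovered from the intercept and gradient. The data are made up (generated from y = 3 × 2^x so that the result can be checked), the helper `regression_line` is hypothetical, and base 10 logarithms are an assumption.

```python
from math import log10

def regression_line(xs, ys):
    """Least squares line Y = c + mX; returns (c, m)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    m = s_xy / s_xx
    c = mean_y - m * mean_x
    return c, m

x_data = [0, 1, 2, 3, 4]
y_data = [3 * 2 ** x for x in x_data]    # 3, 6, 12, 24, 48

X = x_data                                # X = x (unchanged)
Y = [log10(y) for y in y_data]            # Y = log y

intercept, gradient = regression_line(X, Y)
k = 10 ** intercept                       # because the intercept is log k
b = 10 ** gradient                        # because the gradient is log b
print(round(k, 4), round(b, 4))
```

Because the made-up data follow the model exactly, the recovered constants agree with k = 3 and b = 2 up to rounding error. Real data would only fit approximately.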

How can non-linear regression models be used?

  • Non-linear regression models can be used in much the same way as linear regression models
  • By coding the original data using logarithms (changing the variables) a regression line of Y on X can be found
    • This can be used to make predictions for data values that are within the range of the given data (interpolation)
    • Making a prediction outside the range of the given data is called extrapolation and should be avoided, as the prediction is unreliable
  • The non-linear regression model can then be found by substituting the expressions used for the coding (such as $\log x$ and $\log y$) for $X$ and $Y$ in the regression line and rearranging
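
The sketch below shows one way these steps could be carried out for a power model. The regression line Y = 0.3 + 1.5X, the data range and the function name `predict_y` are all hypothetical, and base 10 logarithms are assumed.

```python
from math import log10

# Hypothetical regression line of Y on X for coded data, where
# X = log x and Y = log y (base 10 assumed): Y = 0.3 + 1.5X
INTERCEPT = 0.3
GRADIENT = 1.5
X_MIN, X_MAX = 2.0, 50.0      # assumed range of the original x data

def predict_y(x):
    """Predict y for an original (uncoded) x value, refusing to extrapolate."""
    if not X_MIN <= x <= X_MAX:
        raise ValueError("x is outside the data range (extrapolation)")
    coded_x = log10(x)                          # code the x value
    coded_y = INTERCEPT + GRADIENT * coded_x    # use the regression line
    return 10 ** coded_y                        # reverse the coding

# Equivalent non-linear model y = a * x**n
a = 10 ** INTERCEPT           # because the intercept is log a
n = GRADIENT                  # because the gradient is n

prediction = predict_y(10.0)
print(round(prediction, 3), round(a * 10.0 ** n, 3))
```

Predicting through the coded regression line and predicting directly from y = ax^n give the same value, which is a useful check that the rearrangement was done correctly.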

Worked example

The graph below shows the distribution of the height, $h$ m, of a group of children and the amount of time, $t$ hours, they spend napping in the day.  It is believed the data can be modelled using the form $t = kh^n$.

[Graph of the height and napping time data]

The data are coded using the changes of variables $X = \log h$ and $Y = \log t$. The regression line of Y on X is found to be $Y = -3.5X$.

(i)
Find the values of X and Y for a child that is 75 cm tall and naps for 4 hours per day, giving your answers to four decimal places.

(ii)
Using the regression line, show that a child of height 0.9 metres would be expected to nap for approximately 1.45 hours per day.

(iii)
State an assumption that was made in order to justify the use of the regression line in part (ii).

(iv)
By first substituting $\log h$ for X and $\log t$ for Y in the equation of the regression line given, show that the relationship between the height of a child and the time they spend sleeping can be modelled by $t = h^{-3.5}$.

[Diagram: worked solution, part 1]

[Diagram: worked solution, part 2]
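
The numerical parts of this worked example can be checked with a short sketch. It assumes base 10 logarithms were used for the coding (the question writes log without a base) and converts 75 cm to 0.75 m because h is measured in metres. The variable names are illustrative.

```python
from math import log10

# Part (i): code h = 0.75 m (75 cm) and t = 4 hours
X_i = round(log10(0.75), 4)
Y_i = round(log10(4), 4)

# Part (ii): regression line Y = -3.5X at h = 0.9 m
X_ii = log10(0.9)
Y_ii = -3.5 * X_ii
t_ii = 10 ** Y_ii             # reverse the coding to get t in hours

# Part (iv): the model t = h^(-3.5) should give the same prediction
t_model = 0.9 ** -3.5

print(X_i, Y_i, round(t_ii, 2), round(t_model, 2))
```

The coded values come out as X = -0.1249 and Y = 0.6021, and both routes in parts (ii) and (iv) give roughly 1.45 hours.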

Examiner Tip

  • Be careful when using original and coded data interchangeably, as it is easy to forget which one you are working with. Remember that if your regression line was calculated using coded data then you will need to reverse the coding when finding predictions. Make sure that you are familiar with logarithms, indices and their laws. Check which base of logarithm was used for coding the data: if $\log x$ (base 10) was used then it is reversed using $10^{\log x}$, but if $\ln x$ was used then it is reversed using $e^{\ln x}$.
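
To see why the base matters, the sketch below codes an arbitrary value with each type of logarithm and then reverses the coding, once correctly and once with the wrong base. The value 7.5 and the variable names are made up for illustration.

```python
from math import log10, log, e

x = 7.5                              # arbitrary positive value

coded_base10 = log10(x)              # coding with log (base 10)
coded_base_e = log(x)                # coding with ln (base e)

recovered_10 = 10 ** coded_base10    # correct reversal for base 10
recovered_e = e ** coded_base_e      # correct reversal for base e
mismatched = 10 ** coded_base_e      # wrong base: does not return x

print(recovered_10, recovered_e, mismatched)
```

The two matched reversals return the original value (up to rounding error), while the mismatched one gives a number over ten times too large.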

Author: Amber

Expertise: Maths

Amber gained a first class degree in Mathematics & Meteorology from the University of Reading before training to become a teacher. She is passionate about teaching, having spent 8 years teaching GCSE and A Level Mathematics both in the UK and internationally. Amber loves creating bright and informative resources to help students reach their potential.