Linearization of Bivariate Data (College Board AP® Statistics)
Study Guide
Written by: Mark Curtis
Reviewed by: Dan Finlay
Linearization of bivariate data
What does transforming a variable mean?
Transforming a variable means performing a mathematical operation on either the $x$-coordinates of the data points, or the $y$-coordinates
e.g. take the $x$-coordinates and square them
$(x, y)$ becomes $(x^2, y)$
A common transformation is taking the natural logarithm of the $y$-coordinates
$(x, y)$ becomes $(x, \ln y)$
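The short sketch below shows what a transformation does to a list of data points. The data values are invented purely for illustration, and the two transformations (squaring one coordinate, taking the natural logarithm of the other) are just examples of the idea.

```python
import math

# Hypothetical data points (x, y), invented for illustration only
points = [(1, 2.0), (2, 4.1), (3, 8.2), (4, 15.9)]

# Square the x-coordinates: (x, y) becomes (x^2, y)
squared_x = [(x ** 2, y) for x, y in points]

# Take the natural logarithm of the y-coordinates: (x, y) becomes (x, ln y)
log_y = [(x, math.log(y)) for x, y in points]

print(squared_x)
print(log_y)
```

Only one coordinate changes in each case; the other coordinate of every point is left exactly as it was.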
What is linearization of bivariate data?
If a scatterplot shows that data points do not follow a linear relationship
then it is sometimes possible to transform one of the variables to make the data points follow a more linear relationship
This process is called linearization of bivariate data
Examiner Tips and Tricks
When transforming variables, the type of transformation will be given to you in the exam.
How do I know if the transformed data is more linear than the untransformed data?
There are two different methods to check if the transformed data is more linear than the untransformed data:
Method 1: Create residual plots before and after the transformation
If, after the transformation, the plots are more random (no longer following curves or patterns)
then this is evidence that the transformed data is more linear than the untransformed data
Method 2: Calculate the coefficient of determination, $r^2$, before and after the transformation
If, after the transformation, $r^2$ is closer to 1,
then this is evidence that the least-squares regression line is a better model for the transformed data than the regression line for the untransformed data
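Both methods can be tried out numerically. The sketch below is only an illustration: the data set is invented (chosen to grow roughly exponentially), and the helper functions `least_squares`, `residuals` and `r_squared` are hypothetical names written for this example using the standard least-squares formulas.

```python
import math

def least_squares(xs, ys):
    """Return (intercept a, slope b) of the least-squares line y = a + b*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = y_bar - b * x_bar
    return a, b

def residuals(xs, ys):
    """Residual = observed y minus the y predicted by the fitted line."""
    a, b = least_squares(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]

def r_squared(xs, ys):
    """Coefficient of determination for the least-squares line."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum(e ** 2 for e in residuals(xs, ys))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data, roughly exponential in x (invented for illustration)
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.9, 4.2, 6.3, 8.8, 13.1, 19.0, 27.5, 41.0]
log_ys = [math.log(y) for y in ys]  # transform the y-coordinates

# Method 1: compare the residuals before and after the transformation
res_before = residuals(xs, ys)
res_after = residuals(xs, log_ys)
print("residuals before:", [round(e, 2) for e in res_before])
print("residuals after: ", [round(e, 3) for e in res_after])

# Method 2: compare r^2 before and after the transformation
r2_before = r_squared(xs, ys)
r2_after = r_squared(xs, log_ys)
print(f"r^2 before: {r2_before:.3f}, r^2 after: {r2_after:.3f}")
```

For this made-up data set, the residuals before the transformation are positive at both ends and negative in the middle (a curved pattern), and $r^2$ is closer to 1 after the transformation.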
How do I use the regression equation for the transformed data?
Find the least-squares regression line for the transformed data
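This will either have the form $f(y) = a + bx$, where the $y$-coordinates were transformed (e.g. $\ln y = a + bx$)
or the form $y = a + b\,g(x)$, where the $x$-coordinates were transformed (e.g. $y = a + bx^2$)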
Then use this equation to predict $y$-values, given $x$-values
You may need to rearrange the equation to make $y$ the subject
or you may need to transform the $x$-value before substituting it in
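The two situations above can be sketched as two small functions. This is a minimal illustration, assuming a natural-logarithm transformation of $y$ in the first case and a squaring of $x$ in the second; the function names and the coefficients used at the end are hypothetical.

```python
import math

def predict_from_log_y(a, b, x):
    """Line fitted to (x, ln y): ln y = a + b*x.
    Rearranged to make y the subject: y = e^(a + b*x)."""
    return math.exp(a + b * x)

def predict_from_squared_x(a, b, x):
    """Line fitted to (x^2, y): y = a + b*x^2.
    The x-value is squared before it is substituted in."""
    return a + b * x ** 2

# Hypothetical coefficients, chosen only to show the two calls
print(predict_from_log_y(1.0, 0.5, 2.0))      # evaluates e^(1.0 + 0.5*2.0)
print(predict_from_squared_x(3.0, 2.0, 4.0))  # evaluates 3.0 + 2.0*16
```

In the first function the work happens after substituting (undoing the logarithm); in the second it happens before substituting (transforming the $x$-value).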
Worked Example
The scatterplot below shows the population of mosquitoes in different parts of an island against the percentage cover of vegetation. The least-squares regression line and its residual plot are also shown.
A biologist claims that the natural logarithm of the population of mosquitoes will have a linear relationship with the percentage cover of vegetation. The scatterplot, least-squares regression line and residual plot for the transformed data are shown below.
(a) State, with justification, whether or not the new plots support the biologist's claim.
It is not enough to say the scatterplot looks more linear
Instead, you need to compare the residual plots
They are more random after the transformation, suggesting that the transformed data is more linear than the untransformed data
Remember to give all comments in context (copy phrases from the question to help)
Answer:
The residual plot from the scatterplot showing the population of mosquitoes in different parts of an island against the percentage cover of vegetation shows that the residuals follow a U-shaped pattern (they are not random)
The residual plot from the scatterplot showing the natural logarithm of the population of mosquitoes in different parts of an island against the percentage cover of vegetation shows that these residuals are randomly spread (not following a pattern)
This means there is evidence to say that the natural logarithm of the population of mosquitoes in different parts of an island has a more linear relationship with the percentage cover of vegetation than the population of mosquitoes itself does
This supports the claim by the biologist
(b) Given that the second regression line has a slope of 0.102 and a vertical-axis intercept of 4.29, estimate, to the nearest thousand, the population of mosquitoes in an area on the island with a vegetation cover of 65%.
Answer:
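Write out the equation of the least-squares regression line using the natural logarithm of the population instead of $y$ (the explanatory variable, the percentage cover of vegetation, is unchanged)
$\ln(\text{population}) = 4.29 + 0.102 \times (\text{percentage cover})$
Substitute in a percentage cover of 65 and simplify
$\ln(\text{population}) = 4.29 + 0.102 \times 65 = 10.92$
Rearrange the equation to make the population the subject (find $e$ to the power of the right-hand side)
$\text{population} = e^{10.92} \approx 55\,270$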
Round this answer to the nearest 1000 and give the answer in context
The population of mosquitoes is approximately 55000 in an area on the island with a vegetation cover of 65%
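The arithmetic in part (b) can be checked with a few lines of code. This is just a sanity check using the slope, intercept and vegetation cover given in the question; the variable names are chosen here for readability.

```python
import math

slope = 0.102      # slope of the second regression line, given in part (b)
intercept = 4.29   # intercept of the second regression line, given in part (b)
cover = 65         # percentage cover of vegetation

# The fitted line predicts the natural logarithm of the population
ln_population = intercept + slope * cover

# Undo the logarithm to get the predicted population itself
population = math.exp(ln_population)

# Round to the nearest thousand, as the question asks
rounded = round(population, -3)

print(ln_population, population, rounded)
```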