What is discrete data?
Discrete data refers to data that can only take certain numerical values. It is often (but not always) data that can be counted.
Examples of discrete data include:
Number of pets
Shoe size
Number of petals on a flower
What is continuous data?
Continuous data refers to data that can take any numerical value within a range. It is usually data that needs to be measured.
Examples of continuous data include:
Height
Weight
Time taken to complete a jigsaw
How do you find the mean of a set of numbers?
To find the mean of a set of numbers:
Add the values together.
Divide by the total number of values.
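As a minimal sketch of the two steps above, the calculation could be written in Python as follows. The function name `mean` and the sample data are illustrative assumptions, not part of the flashcards.

```python
def mean(values):
    """Return the mean: the sum of the values divided by how many there are."""
    if len(values) == 0:
        raise ValueError("the mean of an empty data set is undefined")
    total = sum(values)          # step 1: add the values together
    return total / len(values)   # step 2: divide by the number of values


scores = [2, 4, 4, 5, 10]  # illustrative data
print(mean(scores))  # 25 / 5 = 5.0
```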
How do you find the median of a set of numbers?
To find the median of a set of numbers:
Put the numbers in order.
Find the middle number.
If there are two middle numbers then the median is the midpoint of those numbers.
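The steps above can be sketched in Python as shown here. The function name `median` and the sample lists are assumptions made for illustration.

```python
def median(values):
    """Return the middle value of the ordered data."""
    if len(values) == 0:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)      # put the numbers in order
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]    # odd count: a single middle number
    # even count: midpoint of the two middle numbers
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 1, 3]))     # ordered: 1, 3, 7 -> 3
print(median([7, 1, 3, 5]))  # ordered: 1, 3, 5, 7 -> (3 + 5) / 2 = 4.0
```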
How do you find the mode of a set of numbers?
The mode of a set of numbers is the value that appears the most.
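A short Python sketch of this idea is given below. It returns a list because two or more values can tie for the highest count. The function name `modes` and the sample data are illustrative assumptions.

```python
from collections import Counter


def modes(values):
    """Return every value that appears the most often, in ascending order."""
    if len(values) == 0:
        raise ValueError("an empty data set has no mode")
    counts = Counter(values)             # how many times each value appears
    highest = max(counts.values())       # the largest count
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([1, 2, 2, 3]))     # [2]
print(modes([1, 2, 2, 3, 3]))  # [2, 3] (two values tie)
```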
True or False?
There can be more than one median value in a data set.
False.
There cannot be more than one median value in a data set. The median is the middle value.
The only average that can have more than one value is the mode.
True or false?
The mean is affected by extreme values.
True.
The mean is affected by extreme values, because every value in the data set contributes to the total.
True or false?
The median is affected by extreme values.
False.
The median is not affected by extreme values, because it depends only on the middle of the ordered data.
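To illustrate the last two cards, the sketch below uses Python's standard `statistics` module on a small made-up data set, then replaces the largest value with an extreme one. The data are an assumption chosen only for illustration.

```python
import statistics

typical = [3, 4, 5, 6, 7]
with_extreme = [3, 4, 5, 6, 70]  # the largest value replaced by an extreme one

# The mean rises from 5 to 17.6, while the median stays at 5.
print(statistics.mean(typical), statistics.median(typical))
print(statistics.mean(with_extreme), statistics.median(with_extreme))
```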
How can you find the total of all data values if you know the mean and the number of values?
E.g. if the mean of a set of 25 values is 3.7, what is the total of all values in the data set?
If you know the mean and number of values, you can calculate the total of values by rearranging the mean formula:
$\text{total} = \text{mean} \times \text{number of values}$
E.g. if the mean of a set of 25 values is 3.7, then the total of all values in the data set is $3.7 \times 25 = 92.5$.
How can you find the number of data values if you know the mean and the total of the values?
E.g. if the total of a set of data values is 66 and the mean is 8.25, how many data values are there?
If you know the mean and total of the data values, you can calculate the number of values by rearranging the mean formula:
$\text{number of values} = \dfrac{\text{total}}{\text{mean}}$
E.g. the total number of data values in a set of data with a total of 66 and mean of 8.25 is $66 \div 8.25 = 8$.
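Both rearrangements of the mean formula (finding the total, and finding the number of values) can be checked with a short Python sketch. The function names are hypothetical. The result of the first call is rounded before printing because decimals such as 3.7 are stored only approximately.

```python
def total_from_mean(mean, count):
    """total = mean x number of values"""
    return mean * count


def count_from_mean(total, mean):
    """number of values = total / mean"""
    return total / mean


print(round(total_from_mean(3.7, 25), 2))  # 92.5
print(count_from_mean(66, 8.25))           # 8.0
```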
How can you find the new mean of a data set if a new data item is added to the set?
E.g. a data set of 12 items has a mean of 5.7.
A new data item of 9.6 is added to the set.
What is the new mean?
To find the new mean of a data set if a new data item is added to the set:
Find the total of the current data set: $\text{total} = \text{mean} \times \text{number of values}$.
Add the new data item.
Divide by the new total number of data items.
E.g. for a data set of 12 items and a mean of 5.7, the total of the values is $12 \times 5.7 = 68.4$.
Therefore when a new data item of 9.6 is added to the set, the new mean is $\dfrac{68.4 + 9.6}{13} = \dfrac{78}{13} = 6$.
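The three steps can be sketched in Python as follows, using the numbers from the example. The function name `mean_after_adding` is an assumption, and the result is rounded before printing because decimals such as 5.7 are stored only approximately.

```python
def mean_after_adding(old_mean, old_count, new_value):
    """Return the mean after one new data item joins the set."""
    total = old_mean * old_count      # step 1: total of the current data set
    total = total + new_value         # step 2: add the new data item
    return total / (old_count + 1)    # step 3: divide by the new number of items


print(round(mean_after_adding(5.7, 12, 9.6), 2))  # 6.0
```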
How do you find the mean from a frequency table?
To find the mean from a frequency table:
Include a column for (value × frequency).
Add the values in this column.
Divide the sum by the total frequency.
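As a sketch of this method, the frequency table can be held in a Python dictionary that maps each value to its frequency. The table contents and the function name below are made up for illustration.

```python
def mean_from_frequency_table(freq_table):
    """freq_table maps each data value to its frequency."""
    total_frequency = sum(freq_table.values())
    if total_frequency == 0:
        raise ValueError("the table contains no data")
    # the extra column: value x frequency for each row
    products = [value * frequency for value, frequency in freq_table.items()]
    return sum(products) / total_frequency


pets = {0: 3, 1: 5, 2: 8, 3: 4}  # e.g. number of pets -> number of households
print(mean_from_frequency_table(pets))  # (0 + 5 + 16 + 12) / 20 = 1.65
```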
How do you find the median from a frequency table?
To find the median from a frequency table:
Make sure the values in the table are in order.
Find the $\frac{n+1}{2}$th value, where $n$ is the total frequency.
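One way to sketch this in Python is to keep a running total of the frequencies until it reaches the required position. The sketch assumes the median position is taken as (n + 1)/2, so an even total frequency needs the midpoint of two neighbouring values. The function names and the table are illustrative assumptions.

```python
def value_at(freq_table, position):
    """Return the value at a 1-based position in the ordered data."""
    running_total = 0
    for value in sorted(freq_table):          # make sure the values are in order
        running_total += freq_table[value]    # cumulative frequency so far
        if position <= running_total:
            return value
    raise ValueError("position is beyond the total frequency")


def median_from_table(freq_table):
    n = sum(freq_table.values())
    if n == 0:
        raise ValueError("the table contains no data")
    if n % 2 == 1:
        return value_at(freq_table, (n + 1) // 2)
    # even n: (n + 1) / 2 falls halfway between two positions
    lower = value_at(freq_table, n // 2)
    upper = value_at(freq_table, n // 2 + 1)
    return (lower + upper) / 2


pets = {0: 3, 1: 5, 2: 8, 3: 4}  # total frequency 20
print(median_from_table(pets))   # 10th and 11th values are both 2 -> 2.0
```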
How do you find the mode from a frequency table?
To find the mode from a frequency table, look for the value with the highest frequency.
How do you estimate the mean from grouped data?
To estimate the mean from grouped data:
Find the midpoint of each class.
Calculate (midpoint × frequency) for each class.
Sum the (midpoint × frequency) values.
Divide the sum by the total frequency.
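The midpoint method can be sketched in Python as below. Each class is written as a hypothetical `(lower, upper, frequency)` tuple, and the grouped data are made up for illustration.

```python
def estimate_mean(grouped):
    """grouped is a list of (lower, upper, frequency) tuples, one per class."""
    total_frequency = sum(frequency for _, _, frequency in grouped)
    if total_frequency == 0:
        raise ValueError("the table contains no data")
    weighted_sum = 0
    for lower, upper, frequency in grouped:
        midpoint = (lower + upper) / 2        # midpoint of the class
        weighted_sum += midpoint * frequency  # midpoint x frequency
    return weighted_sum / total_frequency


times = [(0, 10, 4), (10, 20, 9), (20, 30, 7)]
print(estimate_mean(times))  # (20 + 135 + 175) / 20 = 16.5
```

The result is only an estimate because each data point is treated as if it sat at the midpoint of its class.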
What is a class interval?
A class interval is a range of values within which data points are grouped together.
True or false?
The actual mean can be calculated from grouped data.
False.
Only an estimate of the mean can be calculated from grouped data.
How do you find the class interval containing the median from grouped data?
To find the class interval containing the median from grouped data:
Find the position of the median using $\frac{n+1}{2}$, where $n$ is the total frequency.
Use the table to determine the class interval containing this position.
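A possible Python sketch of this lookup is shown below, assuming the median position is taken as (n + 1)/2 and that each class is stored as a hypothetical `(lower, upper, frequency)` tuple. The grouped data are made up for illustration.

```python
def median_class(grouped):
    """Return (lower, upper) for the class containing the median position."""
    n = sum(frequency for _, _, frequency in grouped)
    if n == 0:
        raise ValueError("the table contains no data")
    position = (n + 1) / 2          # position of the median
    cumulative = 0
    for lower, upper, frequency in grouped:
        cumulative += frequency     # cumulative frequency up to this class
        if position <= cumulative:
            return (lower, upper)


times = [(0, 10, 4), (10, 20, 9), (20, 30, 7)]  # total frequency 20
print(median_class(times))  # position 10.5; cumulative totals 4, 13, 20 -> (10, 20)
```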
How do you find the modal class interval from grouped data?
To find the modal class interval from grouped data, look for the class interval with the highest frequency.
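In Python this could be sketched as follows, again assuming each class is stored as a hypothetical `(lower, upper, frequency)` tuple. A list is returned in case two classes share the highest frequency.

```python
def modal_classes(grouped):
    """Return every class interval that has the highest frequency."""
    if len(grouped) == 0:
        raise ValueError("the table contains no classes")
    highest = max(frequency for _, _, frequency in grouped)
    return [(lower, upper) for lower, upper, frequency in grouped
            if frequency == highest]


times = [(0, 10, 4), (10, 20, 9), (20, 30, 7)]
print(modal_classes(times))  # [(10, 20)]
```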
What does the phrase "estimate the mean" usually indicate on an exam question?
The phrase "estimate the mean" usually indicates that the data is grouped, and that you should use the midpoint method to estimate the mean.
Define the range of a data set.
The range is the difference between the highest and lowest values in a data set (i.e., highest value minus lowest value).
What is a quartile in a set of data?
A quartile is one of the values (lower quartile, median and upper quartile) that divide an ordered data set into four equal parts.
Define the interquartile range (IQR) of a data set.
The interquartile range (IQR) of a data set is the difference between the upper and lower quartiles (i.e., upper quartile minus lower quartile).
What is the lower quartile (LQ) in a set of data?
The lower quartile (LQ) is the value below which 25% of the data lies (and above which 75% of the data lies).
What is the upper quartile (UQ) of a set of data?
The upper quartile (UQ) is the value above which 25% of the data lies (and below which 75% of the data lies).
True or false?
The range is not affected by outliers.
False.
The range is affected by outliers.
True or false?
The IQR is not affected by outliers.
True.
The IQR is not affected by outliers.
State the formula for finding the lower quartile in terms of $n$.
Lower quartile $= \frac{n+1}{4}$th value
where $n$ is the total frequency
State the formula for finding the upper quartile in terms of $n$.
Upper quartile $= \frac{3(n+1)}{4}$th value
where $n$ is the total frequency
State the equation for finding the IQR in terms of the UQ and LQ.
The equation for the interquartile range (IQR) is $\text{IQR} = \text{UQ} - \text{LQ}$
Where:
$\text{UQ}$ is the upper quartile
$\text{LQ}$ is the lower quartile
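The quartile and IQR calculations can be sketched in Python as below. The sketch assumes the quartile positions are taken as (n + 1)/4 and 3(n + 1)/4, and that a position falling between two values is handled by linear interpolation. That interpolation rule is an assumption, since other conventions exist. The function names and data are illustrative.

```python
def value_at_position(ordered, position):
    """Value at a 1-based position; fractional positions are interpolated."""
    whole = int(position)
    fraction = position - whole
    below = ordered[whole - 1]
    if fraction == 0:
        return below
    above = ordered[whole]
    return below + fraction * (above - below)


def quartiles(values):
    """Return (lower quartile, upper quartile)."""
    ordered = sorted(values)
    n = len(ordered)
    if n < 3:
        raise ValueError("need at least three values")
    lq = value_at_position(ordered, (n + 1) / 4)
    uq = value_at_position(ordered, 3 * (n + 1) / 4)
    return lq, uq


data = [4, 7, 9, 10, 12, 15, 21]  # n = 7, so positions 2 and 6
lq, uq = quartiles(data)
print(lq, uq, uq - lq)  # 7 15 8
```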
Which measures should you look at when comparing distributions?
When comparing distributions, you should compare two things:
An average for the distributions, e.g. mean, median or mode.
A measure of spread for the distributions, e.g. range or IQR.
True or false?
The mode should always be used to compare the averages of distributions.
False.
The mean or median is usually used to compare the averages of distributions.
The mode can be used for non-numerical data.
What measures of spread can be used when comparing distributions?
The range or interquartile range (IQR) can be used to compare the spread of distributions.
The IQR focuses on the middle 50% of the data.
How should you compare the averages of distributions?
To compare the averages:
Give the numerical values of the averages,
e.g. data set A has a mean of 43.5 and data set B has a mean of 53.7.
Explicitly compare them,
e.g. the mean of set B is greater than the mean of set A.
Explain what the comparison means in the context of the question,
e.g. this means that, on average, the people in set B take longer on the task than the people in set A.
How should you compare the spreads of distributions?
To compare the spreads:
Give the numerical values of the range or interquartile range,
e.g. set A has a range of 7 and set B has a range of 10.
Explicitly compare them,
e.g. the range of set B is greater than the range of set A.
Explain what the comparison means in the context of the question,
e.g. this means that the times taken by individuals in set B are more spread out than the times taken by individuals in set A.
True or False?
Extreme values should be considered when comparing raw data sets.
True.
When comparing raw data sets, you should check for extreme values in either distribution and mention how they may affect the reliability of the results and comparisons.
True or false?
You should make at least two pairs of comments when comparing distributions.
True.
You should make at least two pairs of comments when comparing distributions, one pair comparing averages and one pair comparing spread.
True or False?
You must consider any assumptions or potential issues with the data when comparing data sets.
True.
When comparing distributions, you should also consider any assumptions or potential issues with the data as these could affect the validity of the comparisons.
E.g. when assessing the water quality of a river, the samples may have all been taken from one place so may not be representative of the whole river.
True or false?
The context of the question doesn't matter when comparing averages and spread.
False.
When comparing averages and spread, you must discuss them in the context of the question, not just compare the numbers.