Statistics Toolkit (Edexcel IGCSE Maths A (Modular))

Cards in this collection (62)

  • What is discrete data?

    Discrete data refers to data that can only take certain numerical values. It is often (but not always) data that can be counted.

    Examples of discrete data include:

    • Number of pets

    • Shoe size

    • Number of petals on a flower

  • What is continuous data?

    Continuous data refers to data that can take any numerical value within a range. It is usually data that needs to be measured.

    Examples of continuous data include:

    • Height

    • Weight

    • Time taken to complete a jigsaw

  • How do you find the mean of a set of numbers?

    To find the mean of a set of numbers:

    1. Add the values together.

    2. Divide by the total number of values.

  • How do you find the median of a set of numbers?

    To find the median of a set of numbers:

    1. Put the numbers in order.

    2. Find the middle number.

    If there are two middle numbers then the median is the midpoint of those numbers.

  • How do you find the mode of a set of numbers?

    The mode of a set of numbers is the value that appears the most.
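
    The three averages above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the sample list are assumptions made up for the example, not part of the flashcards.

```python
from collections import Counter


def mean(values):
    # Add the values together, then divide by how many values there are.
    return sum(values) / len(values)


def median(values):
    # Put the numbers in order, then find the middle one.
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    # Two middle numbers: the median is their midpoint.
    return (ordered[mid - 1] + ordered[mid]) / 2


def modes(values):
    # The mode is the value that appears most often; there can be more than one.
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


sample = [3, 7, 7, 2, 9, 2]  # made-up data for illustration
print(mean(sample), median(sample), modes(sample))
```

    With this sample, the values 2 and 7 both appear twice, so the set has two modes but still only one mean and one median.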

  • True or False?

    There can be more than one median value in a data set.

    False.

    There cannot be more than one median value in a data set. The median is the middle value.

    The only average that can have more than one value is the mode.

  • True or false?

    The mean is affected by extreme values.

    True.

    The mean is affected by extreme values.

  • True or false?

    The median is affected by extreme values.

    False.

    The median is not affected by extreme values.

  • How can you find the total of all data values if you know the mean and the number of values?

    E.g. if the mean of a set of 25 values is 3.7, what is the total of all values in the data set?

    If you know the mean and number of values, you can calculate the total of values by rearranging the mean formula:

    • $\text{total of values} = \text{mean} \times \text{number of values}$

    E.g. if the mean of a set of 25 values is 3.7, then the total of all values in the data set is $3.7 \times 25 = 92.5$.

  • How can you find the number of data values if you know the mean and the total of the values?

    E.g. if the total of a set of data values is 66 and the mean is 8.25, how many data values are there?

    If you know the mean and total of the data values, you can calculate the number of values by rearranging the mean formula:

    • $\text{number of values} = \frac{\text{total of values}}{\text{mean}}$

    E.g. the total number of data values in a set of data with a total of 66 and mean of 8.25 is $\frac{66}{8.25} = 8$.

  • How can you find the new mean of a data set if a new data item is added to the set?

    E.g. a data set of 12 items has a mean of 5.7.
    A new data item of 9.6 is added to the set.
    What is the new mean?

    To find the new mean of a data set if a new data item is added to the set:

    1. Find the total of the current data set: $\text{total of values} = \text{mean} \times \text{number of values}$

    2. Add the new data item

    3. Divide by the new total number of data items

    E.g. for a data set of 12 items and a mean of 5.7, the total of the values is $12 \times 5.7 = 68.4$.
    Therefore, when a new data item of 9.6 is added to the set, the new mean is $\frac{68.4 + 9.6}{13} = 6$.
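
    The same three steps can be written as a short Python sketch. The function name `new_mean` is a hypothetical helper chosen for this example.

```python
def new_mean(old_mean, old_count, new_item):
    # Step 1: recover the total of the current data set.
    total = old_mean * old_count
    # Step 2: add the new data item.
    total += new_item
    # Step 3: divide by the new number of data items.
    return total / (old_count + 1)


# The flashcard example: 12 items with mean 5.7, then 9.6 is added.
print(round(new_mean(5.7, 12, 9.6), 2))  # 6.0
```

    The result is rounded before printing because decimal values such as 5.7 are stored only approximately in floating point.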

  • How do you find the mean from a frequency table?

    To find the mean from a frequency table:

    1. Include a column for (value × frequency).

    2. Add the values in this column.

    3. Divide the sum by the total frequency.
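
    As a rough sketch, the steps above might look like this in Python. The table is stored as a dictionary mapping each value to its frequency; the numbers are invented for illustration only.

```python
# Hypothetical frequency table: value -> frequency (made-up numbers).
frequency_table = {0: 3, 1: 5, 2: 8, 3: 4}


def mean_from_frequency_table(table):
    # Steps 1 and 2: work out value x frequency for each row and add them up.
    total_fx = sum(value * freq for value, freq in table.items())
    # Step 3: divide by the total frequency.
    total_frequency = sum(table.values())
    return total_fx / total_frequency


print(mean_from_frequency_table(frequency_table))
```

    Here the (value × frequency) column sums to 33 and the total frequency is 20, so the mean is 33 ÷ 20.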

  • How do you find the median from a frequency table?

    To find the median from a frequency table:

    1. Make sure the values in the table are in order.

    2. Find the $\frac{n+1}{2}$th value, where $n$ is the total frequency.
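
    One way to sketch this in Python is to walk through the table while keeping a running total of the frequencies. The helper names and the sample table are assumptions for illustration. When the position is not a whole number (for example 10.5), this sketch takes the midpoint of the two values either side of it.

```python
import math


def value_at(table, position):
    # Return the value at a 1-based position in the ordered data.
    running_total = 0
    for value in sorted(table):
        running_total += table[value]
        if position <= running_total:
            return value
    raise ValueError("position is beyond the total frequency")


def median_from_frequency_table(table):
    n = sum(table.values())   # total frequency
    position = (n + 1) / 2    # position of the median
    lower = value_at(table, math.floor(position))
    upper = value_at(table, math.ceil(position))
    return (lower + upper) / 2


# Hypothetical frequency table: value -> frequency (made-up numbers).
print(median_from_frequency_table({0: 3, 1: 5, 2: 8, 3: 4}))
```

    In this made-up table the total frequency is 20, so the median sits at position 10.5; the 10th and 11th values are both 2.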

  • How do you find the mode from a frequency table?

    To find the mode from a frequency table, look for the value with the highest frequency.

  • How do you estimate the mean from grouped data?

    To estimate the mean from grouped data:

    1. Find the midpoint of each class.

    2. Calculate (midpoint × frequency) for each class.

    3. Sum the (midpoint × frequency) values.

    4. Divide the sum by the total frequency.
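
    A minimal Python sketch of the midpoint method follows. Each class is written as (lower bound, upper bound, frequency); the class boundaries and frequencies are invented for illustration.

```python
# Hypothetical grouped data: (lower bound, upper bound, frequency).
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 5), (30, 40, 2)]


def estimate_mean(classes):
    total_mid_f = 0
    total_frequency = 0
    for lower, upper, freq in classes:
        midpoint = (lower + upper) / 2   # step 1: midpoint of the class
        total_mid_f += midpoint * freq   # steps 2 and 3: sum of midpoint x frequency
        total_frequency += freq
    return total_mid_f / total_frequency  # step 4: divide by the total frequency


print(estimate_mean(classes))
```

    The answer is only an estimate because the midpoint stands in for every value in its class.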

  • What is a class interval?

    A class interval is a range of values within which data points are grouped together.

  • True or false?

    The actual mean can be calculated from grouped data.

    False.

    Only an estimate of the mean can be calculated from grouped data.

  • How do you find the class interval containing the median from grouped data?

    To find the class interval containing the median from grouped data:

    1. Find the position of the median using $\frac{n}{2}$, where $n$ is the total frequency.

    2. Use the table to determine the class interval containing this position.

  • How do you find the modal class interval from grouped data?

    To find the modal class interval from grouped data, look for the class interval with the highest frequency.
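
    Both lookups can be sketched in Python as follows. The data and function names are hypothetical; each class is written as (lower bound, upper bound, frequency).

```python
# Hypothetical grouped data: (lower bound, upper bound, frequency).
classes = [(0, 10, 4), (10, 20, 9), (20, 30, 5), (30, 40, 2)]


def median_class(classes):
    n = sum(freq for _, _, freq in classes)  # total frequency
    position = n / 2                         # position of the median
    running_total = 0
    for lower, upper, freq in classes:
        running_total += freq
        if position <= running_total:
            return (lower, upper)


def modal_class(classes):
    # The modal class is the one with the highest frequency.
    lower, upper, _ = max(classes, key=lambda c: c[2])
    return (lower, upper)


print(median_class(classes), modal_class(classes))
```

    With these made-up numbers the total frequency is 20, so the median is at position 10, which falls in the second class because the running totals are 4 and then 13.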

  • What does the phrase "estimate the mean" usually indicate on an exam question?

    The phrase "estimate the mean" usually indicates that the data is grouped, and that you should use the midpoint method to estimate the mean.

  • Define the range of a data set.

    The range is the difference between the highest and lowest values in a data set (i.e., highest value minus lowest value).

  • What is a quartile in a set of data?

    A quartile is one of the values (lower quartile, median and upper quartile) that divide an ordered data set into four equal parts.

  • Define the interquartile range (IQR) of a data set.

    The interquartile range (IQR) of a data set is the difference between the upper and lower quartiles (i.e., upper quartile minus lower quartile).

  • What is the lower quartile (LQ) in a set of data?

    The lower quartile (LQ) is the value below which 25% of the data lies (and above which 75% of the data lies).

  • What is the upper quartile (UQ) of a set of data?

    The upper quartile (UQ) is the value above which 25% of the data lies (and below which 75% of the data lies).

  • True or false?

    The range is not affected by outliers.

    False.

    The range is affected by outliers.

  • True or false?

    The IQR is not affected by outliers.

    True.

    The IQR is not affected by outliers.

  • State the formula for finding the lower quartile in terms of n.

    $\text{Lower quartile} = \frac{n+1}{4}\text{th value}$

    • where n is the total frequency

  • State the formula for finding the upper quartile in terms of n.

    $\text{Upper quartile} = \frac{3(n+1)}{4}\text{th value}$

    • where n is the total frequency

  • State the equation for finding the IQR in terms of the UQ and LQ.

    The equation for the interquartile range (IQR) is $\text{IQR} = \text{UQ} - \text{LQ}$

    Where:

    • UQ is the upper quartile

    • LQ is the lower quartile
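
    The quartile positions and the IQR can be sketched in Python as below. The helper names and sample data are assumptions for illustration. When a position is not a whole number, this sketch interpolates between the two neighbouring values, which is one common convention and is not stated in the flashcards. It also assumes at least three data values.

```python
import math


def value_at_position(ordered, position):
    # 1-based position; interpolate between neighbours if it is not whole.
    lower_index = math.floor(position)
    upper_index = math.ceil(position)
    lower_value = ordered[lower_index - 1]
    upper_value = ordered[upper_index - 1]
    fraction = position - lower_index
    return lower_value + fraction * (upper_value - lower_value)


def quartiles(values):
    ordered = sorted(values)
    n = len(ordered)
    lq = value_at_position(ordered, (n + 1) / 4)      # lower quartile
    uq = value_at_position(ordered, 3 * (n + 1) / 4)  # upper quartile
    return lq, uq, uq - lq                            # IQR = UQ - LQ


print(quartiles([1, 3, 5, 7, 9, 11, 13]))  # (3.0, 11.0, 8.0)
```

    For the seven made-up values, the lower quartile is the 2nd value and the upper quartile is the 6th value.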

  • Which measures should you look at when comparing distributions?

    When comparing distributions, you should compare two things:

    • An average for the distributions, e.g. mean, median or mode.

    • A measure of spread for the distributions, e.g. range or IQR.

  • True or false?

    The mode should always be used to compare the averages of distributions.

    False.

    The mean or median is usually used to compare the averages of distributions.

    The mode can be used for non-numerical data.

  • What measures of spread can be used when comparing distributions?

    The range or interquartile range (IQR) can be used to compare the spread of distributions.

    The IQR focuses on the middle 50% of the data.

  • How should you compare the averages of distributions?

    To compare the averages:

    1. Give the numerical values of the averages,
      e.g. data set A has a mean of 43.5 and data set B has a mean of 53.7.

    2. Explicitly compare them,
      e.g. the mean of set B is greater than the mean of set A.

    3. Explain what the comparison means in the context of the question,
      e.g. this means that, on average, the people in set B take longer on the task than the people in set A.

  • How should you compare the spreads of distributions?

    To compare the spreads:

    1. Give the numerical values of the range or interquartile range,
      e.g. set A has a range of 7, set B has a range of 10

    2. Explicitly compare them,
      e.g. the range of set B is greater than the range of set A.

    3. Explain what the comparison means in the context of the question,
      e.g. this means that the times taken by individuals in set B are more spread out than the times taken by individuals in set A.

  • True or False?

    Extreme values should be considered when comparing raw data sets.

    True.

    When comparing raw data sets, you should check for extreme values in either distribution and mention how they may affect the reliability of the results and comparisons.

  • True or false?

    You should make at least two pairs of comments when comparing distributions.

    True.

    You should make at least two pairs of comments when comparing distributions, one pair comparing averages and one pair comparing spread.

  • True or false?

    You must consider any assumptions or potential issues with the data when comparing data sets.

    True.

    When comparing distributions, you should also consider any assumptions or potential issues with the data as these could affect the validity of the comparisons.

    E.g. when assessing the water quality of a river, the samples may have all been taken from one place so may not be representative of the whole river.

  • True or false?

    The context of the question doesn't matter when comparing averages and spread.

    False.

    When comparing averages and spread, you must discuss them in the context of the question, not just compare the numbers.

  • The image below shows what type of statistical diagram?

    [Image: a statistical diagram with an x-axis labelled 6 to 12. Each boot symbol represents 2 students.]

    The diagram shown is a pictogram.

    A pictogram is a visual representation of discrete data using repeated symbols or icons.

  • True or false?

    Bar charts are used for continuous data.

    False.

    Bar charts are not used for continuous data.

    Bar charts are used for discrete data.

  • How do you identify the mode from a bar chart?

    To identify the mode from a bar chart, find the tallest bar (the one with the highest frequency).

  • What is a comparative bar chart?

    A comparative bar chart is a bar chart that displays two or more data sets side by side for easy comparison.

    [Image: an example of a comparative bar chart showing monthly sales of hot food and ice cream in February, March and April.]

  • True or false?

    A key is optional when creating a pictogram.

    False.

    A key is not optional when creating a pictogram.

    A pictogram requires a key that specifies the frequency represented by each symbol or icon.

  • True or false?

    Bar charts should have gaps between the bars.

    True.

    Bar charts should have gaps between the bars.

    [Image: an example bar chart showing shoe sizes in Class 11A. There is a gap between each bar.]

  • True or false?

    Pictogram symbols can be of different sizes.

    False.

    Pictogram symbols should be of the same size for easy comparison.

    (Though a pictogram may use part of a symbol to represent a frequency that is less than the value of the complete symbol.)

  • True or False?

    A two-way table is used to compare two types of characteristics.

    True.

    A two-way table is used to compare two types of characteristics.

    E.g. school year group and favourite genre of movie.

  • How do you construct a two-way table from information given in words?

    1. Identify the two characteristics, e.g. favourite colour and gender

    2. Use rows for one characteristic and columns for the other

    3. Add an extra row and column for marginal totals

                Red    Blue    Yellow    Total
    Male
    Female
    Total

  • True or false?

    The numbers needed to complete a two-way table will always be given explicitly in a question.

    False.

    When completing a two-way table, some values can be filled in directly from the question information, but some values will need to be worked out.

    E.g. you may need to subtract other values in a row from the row total to find a missing value.

  • How can you double-check your answers when completing a two-way table?

    You can double-check your answers when completing a two-way table by making sure that all row and column totals add up correctly, and that they match the grand total.

  • How can the probability of an event occurring be worked out from a two-way table?

    E.g. what is the probability that a randomly selected student's favourite subject is Physics?

                Biology    Physics    Chemistry    Total
    Year 7         12          8          10         30
    Year 8          8         13           6         27
    Total          20         21          16         57

    The probability of a particular event occurring can be worked out by dividing the number of successes by the total number.

    E.g. the probability that a student's favourite subject is Physics is $\frac{21}{57}$.
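
    A small Python sketch of this calculation, using the numbers from the table in the question, is shown here. The dictionary layout and the function name are assumptions made for the example.

```python
from fractions import Fraction

# The two-way table from the question, without the marginal totals.
table = {
    "Year 7": {"Biology": 12, "Physics": 8, "Chemistry": 10},
    "Year 8": {"Biology": 8, "Physics": 13, "Chemistry": 6},
}


def probability_of_subject(table, subject):
    # Number of successes: everyone whose favourite is this subject.
    successes = sum(row[subject] for row in table.values())
    # Total number: everyone in the table (the grand total).
    total = sum(sum(row.values()) for row in table.values())
    return Fraction(successes, total)


print(probability_of_subject(table, "Physics"))  # 7/19, which is 21/57 simplified
```

    Recomputing the column total and grand total from the individual cells also doubles as a check that the marginal totals in the table are consistent.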

  • A pie chart is drawn for a set of data where the total frequency is 180.

    What do you do to the frequency of each item to find its angle for the pie chart?

    If a pie chart is drawn for a set of data where the total frequency is 180, you multiply the frequency of an item by 2 (i.e. 360 ÷ 180) to find the size of its angle on the pie chart.

  • In a pie chart, if you know that the angle 30° represents a frequency of 10, how would you find the total frequency?

    In a pie chart, if an angle of 30° represents a frequency of 10, then you can find the total frequency by:

    • dividing 10 by 30 to find how much 1° represents,

    • then multiplying this by 360.

    Alternatively, you can see how many times 30° goes into 360° and then multiply this by 10.

  • How do you calculate the angles needed for a pie chart?

    To calculate the angles needed for a pie chart:

    • Divide each frequency by the total frequency.

    • Multiply each result by 360°.

    Alternatively:

    • Divide 360° by the total frequency.

    • Multiply each frequency by this number.
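
    The calculation can be sketched in Python as follows. The categories and frequencies are invented for illustration, chosen so that the total frequency is 180.

```python
def pie_chart_angles(frequencies):
    total = sum(frequencies.values())
    # Each angle is the item's share of the total, scaled to 360 degrees.
    return {item: freq * 360 / total for item, freq in frequencies.items()}


# Hypothetical data with a total frequency of 180.
travel = {"Walk": 45, "Bus": 90, "Car": 30, "Cycle": 15}
print(pie_chart_angles(travel))
```

    Because the total frequency here is 180, each angle comes out as twice the frequency, and the angles add up to 360°.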

  • If you are given the angles in a pie chart and the total frequency, how do you calculate the individual frequencies?

    If you are given the angles in a pie chart and the total frequency, you can calculate the individual frequencies by doing the following:

    • Divide each angle by 360°.

    • Multiply by the total frequency.

    Alternatively:

    • Divide the total frequency by 360.

    • Multiply each angle by this number.
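
    Going the other way, from angles back to frequencies, might be sketched like this in Python. The angles and the total frequency are made-up values for illustration.

```python
def frequencies_from_angles(angles, total_frequency):
    # Each frequency is the angle's share of 360 degrees, scaled to the total.
    return {item: angle * total_frequency / 360 for item, angle in angles.items()}


# Hypothetical pie chart angles with a total frequency of 180.
angles = {"Walk": 90, "Bus": 180, "Car": 60, "Cycle": 30}
print(frequencies_from_angles(angles, 180))
```

    A quick check is that the frequencies found this way add up to the total frequency.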

  • What sorts of things should you look for when reading and interpreting statistical diagrams?

    When reading and interpreting statistical diagrams, you should look for:

    • Keys

    • Shading

    • Axis labels

    • The word "frequency"

    • Any unusual or unexpected information mentioned

  • Define anomaly in the context of statistical diagrams.

    An anomaly, otherwise known as an extreme value or outlier, is a data point that is significantly different from the rest of the data.

  • True or false?

    You may be asked to comment on aspects of a statistical diagram that could be misleading or incorrect.

    True.

    You may be asked to comment on aspects of a statistical diagram that could be misleading or incorrect, such as uneven gaps in axis values or a missing key.

  • Define key in the context of statistical diagrams.

    In the context of statistical diagrams, a key is a legend that explains the meaning of symbols, colours, or shading used in the diagram.

  • True or False?

    The purpose of comparing statistical diagrams is to identify and comment on differences or similarities in averages, spread, and unusual data values for the data sets represented by the diagrams.

    True.

    The purpose of comparing statistical diagrams is to identify and comment on differences or similarities in averages, spread, and unusual data values for the data sets represented by the diagrams.

  • What should you consider when deciding which measures to compare in statistical diagrams?

    When deciding which measures to compare in statistical diagrams, you should consider:

    • Whether the mean, median or mode is the appropriate average to use.

    • Whether the range or interquartile range is the appropriate measure of spread to use.

    • Whether any assumptions or potential issues with the data could affect the reliability of the results and comparisons.

  • True or false?

    You should aim to make at least one pair of comments when comparing statistical diagrams.

    False.

    You should aim to make at least two pairs of comments when comparing statistical diagrams:

    • One pair should compare averages and comment on what this means in the context of the question.

    • The other pair should compare spread and comment on what this means in the context of the question.