Question 1

What is qualitative data?

Accepted Answer

Qualitative data is given in words, not numbers. Examples include: hair colour, favourite animal, street name, etc.

Question 2

What is quantitative data?

Accepted Answer

Quantitative data is given using numbers. Examples include: number of siblings, height, time taken to run 100 metres, etc.

Question 3

What is continuous data?

Accepted Answer

Continuous data is quantitative data that can take any value within an interval. Continuous data needs to be measured. Examples include: height, mass of apples, length of leaves, etc.

Question 4

What is discrete data?

Accepted Answer

Discrete data is quantitative data that can only take specific values from a set. Discrete data is normally counted.

Examples include: number of pets, number of times a coin is flipped until a 'tails' is obtained, etc.

(Not all discrete data is counted, however, e.g. shoe sizes.)

Question 5

What is a population?

Accepted Answer

A population is the whole set of things you are interested in observing. For example, if you are investigating the heights of giraffes in Africa then the population is the set of all giraffes in Africa.

Question 6

What is a sample?

Accepted Answer

A sample is a subset of a population used to collect data.

Question 7

What is a sampling frame?

Accepted Answer

A sampling frame is a list of all members of the population.

Question 8

What is a random sample?

Accepted Answer

A random sample is a sample in which every member of the population has an equal chance of being included.

Question 9

Name the five sampling techniques.

Accepted Answer

The five sampling techniques are: simple random sampling, systematic sampling, stratified sampling, quota sampling and convenience sampling.

Question 10

Describe how you would use simple random sampling to take a sample of 30 people from a population of 100 people.

Accepted Answer

To use simple random sampling to take a sample of 30 people from a population of 100 people: give each member of the population a unique number from 1 to 100, use a random number generator to select 30 different random numbers between 1 and 100, then take the 30 people with those numbers as the sample.
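
The steps above can be sketched in Python. This is only an illustration: the fixed seed is an assumption added so the result is repeatable, and the name `sample_ids` is hypothetical.

```python
import random

# Give each member of the population a unique number from 1 to 100.
population = list(range(1, 101))

# A seeded generator is used here only so the example is repeatable.
rng = random.Random(2024)

# Select 30 different numbers (random.sample never repeats a member).
sample_ids = rng.sample(population, k=30)

# The people holding these 30 numbers form the sample.
print(sorted(sample_ids))
```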

Question 11

True or False? Forming a sample by including every 10th person from a list is an example of stratified sampling.

Accepted Answer

False. Forming a sample by including every 10th person from a list is not an example of stratified sampling. It is an example of systematic sampling.
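
A minimal sketch of systematic sampling, assuming a hypothetical list of 100 people. Choosing the starting position at random is a common convention and is an assumption here, not something stated in the answer above.

```python
import random

# Hypothetical ordered list of 100 people.
people = [f"person_{i}" for i in range(1, 101)]

interval = 10  # take every 10th person
rng = random.Random(1)  # seeded only so the example is repeatable

# Assumed convention: start at a random position within the first interval.
start = rng.randrange(interval)

# Slice from the start position in steps of 10.
systematic_sample = people[start::interval]
print(systematic_sample)
```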

Question 12

What is the main difference between stratified sampling and quota sampling?

Accepted Answer

The main difference between stratified sampling and quota sampling is how the members of each group are selected. For stratified sampling, simple random sampling is used on each group of the population, whereas for quota sampling the members do not have to be selected randomly, e.g. convenience sampling could be used.
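
The contrast can be sketched as follows. The two year groups, their sizes and the proportional group sizes are all assumptions made for illustration. The only difference between the two samples is the selection step inside each group.

```python
import random

rng = random.Random(7)  # seeded only so the example is repeatable

# Hypothetical population split into two groups (strata).
strata = {
    "year_12": [f"y12_{i}" for i in range(60)],
    "year_13": [f"y13_{i}" for i in range(40)],
}
sample_size = 10
total = sum(len(members) for members in strata.values())

stratified = {}
quota = {}
for name, members in strata.items():
    # Assumed: each group's share of the sample matches its share of the population.
    n = round(sample_size * len(members) / total)
    # Stratified: simple random sampling within the group.
    stratified[name] = rng.sample(members, n)
    # Quota: any n members will do, here the first n available (convenience).
    quota[name] = members[:n]

print({name: len(chosen) for name, chosen in stratified.items()})
```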

Question 13

True or False? A survey is always carried out in person.

Accepted Answer

False. A survey is not always carried out in person. A survey can be done remotely, for example by post, by phone or over the internet.

Question 14

Why is the following question not suitable for a questionnaire? "How much do you appreciate your hardworking, selfless teacher?"

Accepted Answer

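"How much do you appreciate your hardworking, selfless teacher?" This question is not suitable for a questionnaire because it is a leading question: describing the teacher as "hardworking" and "selfless" pushes the respondent towards a positive answer.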

Question 15

What is meant by reliability in terms of data collection?

Accepted Answer

Reliability measures how consistent a process is at measuring a variable. A process is reliable if you get the same results by repeating the process with the same sample using the same conditions.

Question 16

What is meant by validity in terms of data collection?

Accepted Answer

Validity measures how accurate a process is at measuring a variable. A process is valid if it is accurately measuring the variable you want it to measure.

Question 17

What are the two methods to test for reliability?

Accepted Answer

The two reliability tests are: test-retest and parallel forms.

Question 18

Describe the process for the test-retest method.

Accepted Answer

The test-retest method is where you use a data collection process with a sample and then repeat the same process with the same sample at a later time.

Question 19

Describe the process for the parallel forms method.

Accepted Answer

The parallel forms method is where you give the same sample a second set of questions (or second set of experiments), which are similar to the first set.

Question 20

What are the two methods to test for validity?

Accepted Answer

The two validity tests are: content-related check and criterion-related check.

Question 21

Describe the content-related validity check.

Accepted Answer

The content-related validity check is where you check how well the process measures all aspects of the variable. The process is valid if it covers all aspects of the variable.

Question 22

Describe the criterion-related validity check.

Accepted Answer

The criterion-related validity check is where you check how well one variable predicts the outcome for another variable (called the criterion variable). If the process is valid then the variable should be a good predictor.

Question 23

The test-retest method is used to check the reliability of a process. What type of correlation should there be between the two sets of results if the process is reliable?

Accepted Answer

The test-retest method is used to check the reliability of a process. There should be a strong positive correlation between the two sets of results if the process is reliable.
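
One way to check this numerically is to compute Pearson's correlation coefficient for the two sets of results. The sketch below uses made-up scores and a hand-written helper, `pearson_r`, which is a hypothetical name. A value close to 1 suggests the process is reliable.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson's product-moment correlation coefficient for paired data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up results for the same five people on two occasions.
first_test = [12, 15, 9, 20, 17]
retest = [13, 14, 10, 19, 18]

r = pearson_r(first_test, retest)
print(round(r, 3))  # close to 1, so the two sets of results agree well
```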

Question 24

What is the mode of a data set?

Accepted Answer

The mode of a data set is the value (or values) that occurs most often.

Question 25

True or False? Any data set always has exactly one mode.

Accepted Answer

False. Not all data sets have exactly one mode. A data set may have no mode, or more than one mode.

Question 26

How do you find the median of ungrouped data without a GDC?

Accepted Answer

To find the median of ungrouped data: put the data in order from smallest to largest, then find the middle value(s). If there are two middle values then find the midpoint of them. For example, the median of 1, 2, 3, 4 is 2.5 (i.e. the midpoint of 2 and 3).
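
The same procedure, written as a short Python function for illustration (Python's built-in `statistics.median` does the same job):

```python
def median(values):
    """Median of ungrouped data: order the values, then take the middle."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd number of values: a single middle value.
        return ordered[mid]
    # Even number of values: midpoint of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([1, 2, 3, 4]))  # 2.5
```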

Question 27

True or False? The lower quartile of a set of data splits the lowest 25% from the highest 75%.

Accepted Answer

True. The lower quartile of a set of data splits the lowest 25% from the highest 75%. For example, the lower quartile of 1, 2, 3, 4 is 1.5.
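
The value 1.5 comes from taking the median of the lower half of the data. The sketch below assumes that convention (the middle value is left out of both halves when the number of values is odd). Other software may use a different quartile convention and give slightly different answers. The function name `quartiles` is hypothetical and it needs at least two values.

```python
from statistics import median

def quartiles(values):
    """Return (lower quartile, median, upper quartile) using medians of halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]   # values below the middle
    upper_half = ordered[-half:]  # values above the middle
    return median(lower_half), median(ordered), median(upper_half)

print(quartiles([1, 2, 3, 4]))  # (1.5, 2.5, 3.5)
```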

Question 28

True or False? The range of 2, 3, 1, 5, 8 is 8 - 2 = 6.

Accepted Answer

False. The range is the difference between the lowest value and the highest value. The range of 2, 3, 1, 5, 8 is 8 - 1 = 7.
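
As a quick check of the arithmetic in Python:

```python
data = [2, 3, 1, 5, 8]

# Range = highest value - lowest value, regardless of the order of the list.
data_range = max(data) - min(data)
print(data_range)  # 7
```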

Question 29

True or False? The standard deviation is a measure of central tendency.

Accepted Answer

False. The standard deviation is not a measure of central tendency. The standard deviation is a measure of dispersion: it measures how spread out the data is about the mean.

Question 30

What is the mathematical relationship between the standard deviation and the variance?

Accepted Answer

The standard deviation is the positive square root of the variance. (Equivalently, the variance is the standard deviation squared.)
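
A small numerical illustration with made-up data. It assumes the population versions of both measures (`pvariance` and `pstdev` from Python's `statistics` module). The same relationship holds for the sample versions.

```python
from math import isclose
from statistics import pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5

variance = pvariance(data)  # 4
std_dev = pstdev(data)      # 2.0

# The standard deviation squared equals the variance.
print(isclose(std_dev ** 2, variance))  # True
```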

Question 31

True or False? For the frequency table below, the modal score is 10.

Score    Frequency
5        10
10       8

Accepted Answer

False.

Score    Frequency
5        10
10       8

The mode is the value with the highest frequency. Therefore the modal score is 5.
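
The same reading of the table as a short sketch, storing the table as a dictionary that maps each score to its frequency:

```python
# score -> frequency, taken from the table in this question
frequency_table = {5: 10, 10: 8}

highest_frequency = max(frequency_table.values())

# Collect every score with the highest frequency (there may be more than one).
modes = [score for score, freq in frequency_table.items()
         if freq == highest_frequency]
print(modes)  # [5]
```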

Question 32

True or False? It is possible to find the exact median from a frequency table of ungrouped data.

Accepted Answer

True. It is possible to find the exact median from a frequency table of ungrouped data. You can find the median by finding the middle value. You can use cumulative frequency to help to find which value is the middle value. You can also use your GDC to find the median.
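
A sketch of the cumulative frequency approach, using a made-up frequency table. The function names are hypothetical. The idea is to find which value sits at the middle position(s) by keeping a running total of the frequencies.

```python
from itertools import accumulate

def value_at_position(table, position):
    """Return the value at a 1-based position in the ordered data."""
    scores = sorted(table)
    cumulative = accumulate(table[s] for s in scores)
    for score, running_total in zip(scores, cumulative):
        if position <= running_total:
            return score
    raise ValueError("position is beyond the end of the data")

def median_from_table(table):
    """Exact median from a frequency table of ungrouped data."""
    n = sum(table.values())
    if n % 2 == 1:
        return value_at_position(table, (n + 1) // 2)
    # Even total: midpoint of the two middle values.
    lower = value_at_position(table, n // 2)
    upper = value_at_position(table, n // 2 + 1)
    return (lower + upper) / 2

table = {1: 3, 2: 5, 3: 2}  # made-up data: value -> frequency
print(median_from_table(table))  # 2.0
```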

Question 33

Why is it not possible to calculate the exact mean of grouped data?

Accepted Answer

It is not possible to calculate the exact mean of grouped data because the exact individual values are unknown.
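
An estimate of the mean can still be found by treating every value in a group as if it were at the midpoint of that group. The sketch below uses made-up groups.

```python
# Made-up grouped data: (lower bound, upper bound, frequency)
groups = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]

total_frequency = sum(freq for _, _, freq in groups)

# Use each group's midpoint as a stand-in for its unknown individual values.
estimated_mean = sum((low + high) / 2 * freq
                     for low, high, freq in groups) / total_frequency
print(estimated_mean)  # 16.0
```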

Question 34

True or False? If you add 5 to each value in a data set, then the mean also increases by 5.

Accepted Answer

True. If you add 5 to each value in a data set, then the mean also increases by 5.

Question 35

What happens to the mean of a data set if each value in the data set is doubled?

Accepted Answer

If each value in a data set is doubled, then the mean is also doubled.
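
Both results for the mean can be checked on a small made-up data set:

```python
from statistics import mean

data = [3, 7, 8, 10]  # made-up values with mean 7

shifted = [x + 5 for x in data]  # add 5 to each value
doubled = [2 * x for x in data]  # double each value

print(mean(data), mean(shifted), mean(doubled))  # 7 12 14
```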

Question 36

True or False? If you add 5 to each value in a data set, then the standard deviation also increases by 5.

Accepted Answer

False. If you add 5 to each value in a data set, then the standard deviation does not change. Adding or subtracting a constant to the values in a data set does not affect the standard deviation.

Question 37

True or False? If you multiply each value in a data set by -2, then the standard deviation is also multiplied by -2.

Accepted Answer

False. If you multiply each value in a data set by -2, then the standard deviation is multiplied by 2. The standard deviation can never be negative, so it is multiplied by the size of the scale factor, ignoring the sign.
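
Both results for the standard deviation can be checked on a small made-up data set. The population standard deviation (`pstdev`) is assumed here.

```python
from math import isclose
from statistics import pstdev

data = [3, 7, 8, 10]  # made-up values

shifted = [x + 5 for x in data]  # add 5 to each value
scaled = [-2 * x for x in data]  # multiply each value by -2

# Adding a constant leaves the spread unchanged.
print(isclose(pstdev(shifted), pstdev(data)))     # True
# Multiplying by -2 doubles the spread; the sign has no effect.
print(isclose(pstdev(scaled), 2 * pstdev(data)))  # True
```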

Question 38

What is an outlier?

Accepted Answer

Outliers are extreme data values that do not fit with the rest of the data.

Question 39

What is the formula used to calculate the boundaries for outliers in a data set?

Accepted Answer

The boundaries for outliers in a data set are found by multiplying the interquartile range (IQR) by 1.5, then subtracting this from the lower quartile (Q1) and adding it to the upper quartile (Q3). As a formula: x is an outlier if x < Q1 - 1.5 × IQR or x > Q3 + 1.5 × IQR.
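
The rule translates directly into code. The function names below are hypothetical and the quartile values in the example are made up.

```python
def outlier_bounds(q1, q3):
    """Return the (lower, upper) boundaries for outliers."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def is_outlier(x, q1, q3):
    """True if x lies strictly outside the boundaries."""
    lower, upper = outlier_bounds(q1, q3)
    return x < lower or x > upper

# Made-up quartiles: Q1 = 10 and Q3 = 20, so IQR = 10.
print(outlier_bounds(10, 20))  # (-5.0, 35.0)
print(is_outlier(36, 10, 20))  # True
print(is_outlier(35, 10, 20))  # False, a value on the boundary is not an outlier
```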

Question 40

True or False? All outliers are errors.

Accepted Answer

False. Not all outliers are errors.

Question 41

Should outliers be removed from a data set?

Accepted Answer

Outliers should be removed from a data set if they are clearly errors. However, if they are possibly valid values then they should be included.

Question 42

What are the five values needed to draw a box and whisker diagram?

Accepted Answer

The five values needed to draw a box and whisker diagram are: lowest data value, lower quartile, median, upper quartile, and highest data value.
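
A sketch that computes these five values for a small data set. The quartiles are found as medians of the lower and upper halves, which is an assumed convention (other quartile conventions give slightly different values). The function name is hypothetical and it needs at least two values.

```python
from statistics import median

def five_number_summary(values):
    """Return (lowest, lower quartile, median, upper quartile, highest)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return (
        ordered[0],
        median(ordered[:half]),   # lower quartile: median of the lower half
        median(ordered),
        median(ordered[-half:]),  # upper quartile: median of the upper half
        ordered[-1],
    )

print(five_number_summary([2, 3, 1, 5, 8]))  # (1, 1.5, 3, 6.5, 8)
```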

Question 43

What proportion of the data set does the box in a box and whisker diagram represent?

Accepted Answer

The box in a box and whisker diagram represents the central 50% of the data set.

Question 44

What do the "whiskers" in a box and whisker diagram represent?

Accepted Answer

The whiskers represent the lowest 25% and the highest 25% of the data.

Question 45

True or False? To draw a cumulative frequency diagram, you plot the midpoint of a group against its frequency.

Accepted Answer

False. To draw a cumulative frequency diagram, you do not plot the midpoint of a group against its frequency. This is a mistake that students often make on the exam. You plot the endpoint of a group against its cumulative frequency.

Question 46

How can you estimate the lower quartile using a cumulative frequency diagram?

Accepted Answer

To estimate the lower quartile using a cumulative frequency diagram: Divide the total frequency by 4. Draw a horizontal line from that result on the vertical axis to the curve. Draw a vertical line down from this point to the horizontal axis. The number on the horizontal axis is an estimate for the lower quartile.
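
The points to plot can be generated as follows, using made-up groups. Each point pairs the upper endpoint of a group with the running total of the frequencies up to that endpoint.

```python
from itertools import accumulate

# Made-up grouped data: (lower bound, upper bound, frequency)
groups = [(0, 10, 4), (10, 20, 10), (20, 30, 6)]

upper_endpoints = [high for _, high, _ in groups]
cumulative = list(accumulate(freq for _, _, freq in groups))

# Plot endpoint (horizontal) against cumulative frequency (vertical).
points = list(zip(upper_endpoints, cumulative))
print(points)  # [(10, 4), (20, 14), (30, 20)]
```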
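
The reading-off step can be imitated numerically. The sketch below joins made-up plotted points with straight lines, which is an assumption: a hand-drawn smooth curve would give a slightly different estimate. The function name `read_off` is hypothetical.

```python
def read_off(points, target):
    """Estimate the horizontal value where cumulative frequency reaches target.

    points: (endpoint, cumulative frequency) pairs in increasing order.
    Straight-line segments between points stand in for the curve.
    """
    for (x0, c0), (x1, c1) in zip(points, points[1:]):
        if c1 > c0 and c0 <= target <= c1:
            return x0 + (target - c0) * (x1 - x0) / (c1 - c0)
    raise ValueError("target is outside the range of the diagram")

# Made-up points, starting from zero at the lowest endpoint.
points = [(0, 0), (10, 4), (20, 14), (30, 20)]

total_frequency = points[-1][1]
lower_quartile = read_off(points, total_frequency / 4)
print(lower_quartile)  # 11.0
```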

Question 47

What is a (frequency) histogram?

Accepted Answer

A frequency histogram is a diagram for grouped data with equal class widths, in which the height of each bar shows the frequency of that class interval.

Question 48

True or False? You need to leave a gap between the bars when drawing a histogram.

Accepted Answer

False. You do not leave a gap between the bars when drawing a histogram.

Question 49

Which measure of central tendency does a box and whisker diagram show?

Accepted Answer

A box and whisker diagram shows the median.

Question 50

When working with grouped data, which data representation clearly shows the modal group: a cumulative frequency graph or a histogram?

Accepted Answer

When working with grouped data, a histogram clearly shows the modal group.

Question 51

Which measure of central tendency (mode, median, mean) is most affected by outliers?

Accepted Answer

The mean is most affected by outliers.
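
A small made-up example shows the effect of adding a single extreme value to a data set:

```python
from statistics import mean, median, mode

data = [2, 3, 3, 4, 5]      # made-up values
with_outlier = data + [100]  # the same data plus one extreme value

print(mean(data), mean(with_outlier))      # 3.4 19.5
print(median(data), median(with_outlier))  # 3 3.5
print(mode(data), mode(with_outlier))      # 3 3
```

The mean changes a lot, the median only slightly, and the mode not at all.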

Question 52

Which is the only measure of central tendency (mode, median, mean) that can be applied to qualitative data?

Accepted Answer

The mode is the only measure of central tendency (mode, median, mean) that can be applied to qualitative data.

Question 53

True or False? When comparing two data sets, the one with the higher mean is always better.

Accepted Answer

False. When comparing two data sets, the one with the higher mean is not always better. It depends on the context. If comparing times to complete a race, the smaller mean is better. If comparing scores, the bigger mean is better.

Question 54

True or False? The interquartile range is not affected by outliers.

Accepted Answer

True. The interquartile range is not affected by outliers. The interquartile range only uses the central 50% of the data.

Question 55

When comparing two data sets, which one is more spread out about the median: the one with the bigger interquartile range or the one with the smaller interquartile range?

Accepted Answer

When comparing two data sets, the one with the bigger interquartile range is more spread out about the median.

Question 56

If you are comparing data sets that contain outliers, which measure of dispersion should be used: the standard deviation or the interquartile range?

Accepted Answer

If you are comparing data sets that contain outliers, you should use the interquartile range as this is not affected by outliers.
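
The difference between the two measures can be seen on a made-up data set in which the largest value is replaced by an extreme one. The sketch assumes the population standard deviation and quartiles found as medians of the lower and upper halves. The helper name `iqr` is hypothetical.

```python
from statistics import median, pstdev

def iqr(values):
    """Interquartile range using medians of the lower and upper halves."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[-half:]) - median(ordered[:half])

data = [2, 3, 3, 4, 5, 6]            # made-up values
with_outlier = [2, 3, 3, 4, 5, 100]  # largest value replaced by an extreme one

print(iqr(data), iqr(with_outlier))  # 2 2
print(round(pstdev(data), 2), round(pstdev(with_outlier), 2))
```

The interquartile range is unchanged, while the standard deviation becomes many times larger.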

Statistics Toolkit (DP IB Applications & Interpretation (AI))
