Hypothesis Testing using the Chi-squared Distribution (DP IB Applications & Interpretation (AI))

Flashcards

Cards in this collection (34)

What is a hypothesis test?
A hypothesis test uses a sample of data in an experiment to test a statement made about the population.
The statement is either about a population parameter or the distribution of the population.
Which hypothesis do you assume to be true when performing a hypothesis test: the null hypothesis or the alternative hypothesis?
When performing a hypothesis test, you assume the null hypothesis to be true.
What is denoted by the notation $H_{0}$?
The null hypothesis is denoted by $H_{0}$.
What is the notation for the alternative hypothesis?
The notation for the alternative hypothesis is $H_{1}$.
What is meant by the significance level of a hypothesis test?
The significance level is the probability that a hypothesis test rejects the null hypothesis when it is true.
The significance level acts as a threshold: any probability smaller than the significance level suggests that the observed event is unlikely to have happened by chance if the null hypothesis were true.
An observation of the test statistic is taken for a hypothesis test.
What is meant by the p-value of this observation?
An observation of the test statistic is taken for a hypothesis test.
The p-value is the probability of obtaining a value at least as extreme as the observation if the null hypothesis were true.
True or False?
If the p-value is less than the significance level then there is evidence to reject the null hypothesis.
True.
If the p-value is less than the significance level then there is evidence to reject the null hypothesis.
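As a minimal sketch of this decision rule (the function name and the numbers are illustrative assumptions, not part of the cards):

```python
def reject_null(p_value: float, significance_level: float) -> bool:
    """Return True when the p-value is below the significance level."""
    return p_value < significance_level

# Hypothetical p-values tested at the 5% significance level.
print(reject_null(0.03, 0.05))  # True: evidence to reject H0
print(reject_null(0.08, 0.05))  # False: insufficient evidence to reject H0
```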
What is meant by the critical region for a test statistic?
The critical region is the set of values for a test statistic that would lead to the rejection of the null hypothesis.
These are the values that are unlikely to be obtained if the null hypothesis were true.
What is meant by the critical value(s) for a test statistic?
The critical value(s) form the boundary of the critical region.
A critical value is the least extreme value that would lead to the rejection of the null hypothesis.
What is a $χ^{2}$ test for independence used for?
A $χ^{2}$ test for independence is used to test whether two variables are statistically independent of each other.
What are contingency tables in a $χ^{2}$ test for independence?
A contingency table is a two-way table that shows the observed frequencies for each combination of the two variables.
For example:
| Hair colour | Eye colour: Blue | Eye colour: Brown | Eye colour: Green |
| --- | --- | --- | --- |
| Black | 17 | 12 | 29 |
| Blonde | 31 | 25 | 21 |
True or False?
If a contingency table has m rows and n columns then the number of degrees of freedom is equal to m×n.
False.
If a contingency table has m rows and n columns then the number of degrees of freedom is equal to (m-1)×(n-1).
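A short sketch of this formula, applied to the hair colour and eye colour example (2 rows, 3 columns); the function name is an illustrative assumption:

```python
def degrees_of_freedom(rows: int, columns: int) -> int:
    """Degrees of freedom for a contingency table with the given shape."""
    return (rows - 1) * (columns - 1)

# The hair colour / eye colour table has 2 rows and 3 columns.
print(degrees_of_freedom(2, 3))  # 2
```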
True or False?
For a $χ^{2}$ test for independence, you reject the null hypothesis if the test statistic is greater than the critical value.
True.
For a $χ^{2}$ test for independence, you reject the null hypothesis if the test statistic is greater than the critical value.
What is meant by the expected frequencies for a $χ^{2}$ test for independence?
The expected frequencies for a $χ^{2}$ test for independence are the frequencies for each possible combination of outcomes of the two variables if they were independent.
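In an exam the GDC produces these values, but they can also be sketched by hand. The example below assumes the standard formula (row total × column total ÷ grand total), which the cards do not state explicitly, and uses the hair colour and eye colour frequencies as input:

```python
def expected_frequencies(observed):
    """Expected frequencies under independence for a two-way table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in column_totals] for r in row_totals]

# Observed frequencies from the hair colour / eye colour example.
observed = [
    [17, 12, 29],  # Black hair: blue, brown, green eyes
    [31, 25, 21],  # Blonde hair: blue, brown, green eyes
]

for row in expected_frequencies(observed):
    print([round(value, 2) for value in row])
```

The expected frequencies keep the same row and column totals as the observed table, which is a quick way to check the calculation.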
How should you write the null hypothesis of a $χ^{2}$ test for independence?
For example, suppose you are testing whether hair colour and eye colour are independent.
The null hypothesis of a $χ^{2}$ test for independence should be of the form:
$H_{0}$ : variable X is independent of variable Y.
For example, $H_{0}$ : hair colour is independent of eye colour.
In an exam, how do you find the $χ^{2}$ statistic for a $χ^{2}$ test for independence?
In an exam, to find the $χ^{2}$ statistic for a $χ^{2}$ test for independence you:
- use the two-way test option on your GDC,
- input the observed frequencies as a matrix,
- run the test.
Your GDC will give you the value of the $χ^{2}$ statistic as well as the p-value and the expected frequencies.
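Outside an exam, the same quantities can be reproduced by hand. The sketch below assumes the standard statistic, the sum of (observed − expected)² ÷ expected over all cells, which the cards do not state explicitly. The closed-form p-value used here only holds when there are exactly 2 degrees of freedom, as in the hair colour and eye colour example.

```python
import math

def chi_squared_independence(observed):
    """Return (statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * column_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return statistic, df

observed = [[17, 12, 29], [31, 25, 21]]
statistic, df = chi_squared_independence(observed)

# Assumption: this shortcut is valid only for df == 2.
p_value = math.exp(-statistic / 2)
print(round(statistic, 2), df, round(p_value, 4))
```

For this table the statistic is roughly 7.40 with 2 degrees of freedom, and the p-value is about 0.025, so the null hypothesis would be rejected at the 5% significance level.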
If the null hypothesis is rejected for a $χ^{2}$ test for independence, then what does this suggest about the two variables?
If the null hypothesis is rejected for a $χ^{2}$ test for independence, then this suggests that the two variables are not independent.
However, this conclusion is not definitive, as a test can reject the null hypothesis even when the variables are independent.
If the null hypothesis is not rejected for a $χ^{2}$ test for independence, then what does this suggest about the two variables?
If the null hypothesis is not rejected for a $χ^{2}$ test for independence, then there is insufficient evidence to suggest that the variables are not independent.
Therefore, this suggests that the two variables could be independent.
However, this conclusion is not definitive.
When using a chi-squared test, what number do the expected values need to be bigger than?
When using a chi-squared test, the expected values need to be bigger than 5.
What do you need to do if an expected value is less than 5 in a contingency table?
If an expected value is less than 5 in a contingency table, then you need to combine that row or column with the next row or column.
You also need to combine the corresponding rows or columns in the contingency table for the observed values.
What is a $χ^{2}$ goodness of fit test used for?
A $χ^{2}$ goodness of fit test is used to test whether data can be modelled by a specified distribution.
True or False?
For a $χ^{2}$ goodness of fit test, you reject the null hypothesis if the test statistic is less than the critical value.
False.
For a $χ^{2}$ goodness of fit test, you reject the null hypothesis if the test statistic is greater than the critical value.
What is meant by the expected frequencies for a $χ^{2}$ goodness of fit test?
The expected frequencies for a $χ^{2}$ goodness of fit test are the frequencies for each outcome if the data follows the specified distribution.
How do you find the expected frequencies for a $χ^{2}$ goodness of fit test?
To find the expected frequencies for a $χ^{2}$ goodness of fit test, you:
- find the probability of each outcome assuming the data follows the specified distribution,
- multiply the probabilities by the total frequency.
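The two steps above can be sketched as follows. The distribution B(3, 0.1) and the total frequency of 200 are illustrative assumptions, as are the function names:

```python
from math import comb

def binomial_probabilities(n, p):
    """P(X = x) for x = 0, 1, ..., n when X ~ B(n, p)."""
    return [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

def expected_frequencies(probabilities, total_frequency):
    """Multiply each probability by the total frequency."""
    return [prob * total_frequency for prob in probabilities]

# Step 1: probabilities of each outcome under the specified distribution.
probabilities = binomial_probabilities(3, 0.1)

# Step 2: scale by the total frequency (200 is a hypothetical sample size).
expected = expected_frequencies(probabilities, 200)
print([round(value, 1) for value in expected])
```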
In an exam, how do you find the $χ^{2}$ statistic for a $χ^{2}$ goodness of fit test?
In an exam, to find the $χ^{2}$ statistic for a $χ^{2}$ goodness of fit test you:
- use the goodness of fit option on your GDC,
- input the observed frequencies as a list,
- input the expected frequencies as a separate list,
- enter the number of degrees of freedom,
- run the test.
Your GDC will give you the value of the $χ^{2}$ statistic as well as the p-value.
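For reference, the statistic the GDC reports can be sketched by hand, assuming the standard formula: the sum of (observed − expected)² ÷ expected over all classes. The frequencies below are hypothetical:

```python
def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all classes."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical observed and expected frequencies for three classes.
observed = [12, 18, 20]
expected = [8, 21, 21]
print(round(chi_squared_statistic(observed, expected), 3))  # 2.476
```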
How should you write the null hypothesis of a $χ^{2}$ goodness of fit test?
For example, suppose you are testing whether the number of eggs in a nest can be modelled by B(3, 0.1).
The null hypothesis of a $χ^{2}$ goodness of fit test should be of the form:
$H_{0}$ : variable X follows the distribution...(state the distribution)
For example, $H_{0}$ : the number of eggs in a nest follows the binomial distribution B(3, 0.1).
Suppose you are performing a $χ^{2}$ goodness of fit test to test whether the following data can be modelled by $X \sim N(160, 20^{2})$.
What three probabilities would you need to calculate?
| Height | Frequency |
| --- | --- |
| $120 \leq h < 150$ | 35 |
| $150 \leq h < 180$ | 45 |
| $180 \leq h < 200$ | 20 |
Suppose you are performing a $χ^{2}$ goodness of fit test to test whether the following data can be modelled by $X \sim N(160, 20^{2})$.
You would need to calculate the following three probabilities.
| Height | Probability |
| --- | --- |
| $120 \leq h < 150$ | $P(X < 150)$ |
| $150 \leq h < 180$ | $P(150 < X < 180)$ |
| $180 \leq h < 200$ | $P(X > 180)$ |
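These three probabilities can be sketched with Python's standard library. The outer classes are extended to cover the tails of the normal distribution so that the probabilities sum to 1. Multiplying by the total frequency to obtain expected frequencies follows the usual goodness of fit procedure; the variable names are illustrative assumptions.

```python
from statistics import NormalDist

model = NormalDist(mu=160, sigma=20)  # X ~ N(160, 20^2)

probabilities = [
    model.cdf(150),                   # P(X < 150)
    model.cdf(180) - model.cdf(150),  # P(150 < X < 180)
    1 - model.cdf(180),               # P(X > 180)
]

# Observed frequencies from the height table.
observed = [35, 45, 20]
total = sum(observed)
expected = [p * total for p in probabilities]

print([round(p, 4) for p in probabilities])
print([round(e, 2) for e in expected])
```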
What is the conclusion if the null hypothesis is rejected for a $χ^{2}$ goodness of fit test?
If the null hypothesis is rejected for a $χ^{2}$ goodness of fit test then there is sufficient evidence to suggest that the data does not follow the specified distribution.
What is the conclusion if the null hypothesis is not rejected for a $χ^{2}$ goodness of fit test?
If the null hypothesis is not rejected for a $χ^{2}$ goodness of fit test then there is insufficient evidence to suggest that the data does not follow the specified distribution.
Therefore, this suggests that the data could follow the specified distribution.
However, this conclusion is not definitive.
Given observed data for a goodness of fit test, how do you estimate the value of $p$ for a binomial distribution?
Given observed data for a goodness of fit test, you can estimate the value of $p$ for a binomial distribution by finding the mean of the observed data and dividing by the number of trials, $n$, for the binomial distribution.
$p = \frac{\bar{x}}{n} = \frac{1}{n} \times \frac{\sum f x}{\sum f}$.
This formula is not given in your exam formula booklet.
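A minimal sketch of this estimate, using a hypothetical frequency table for a binomial variable with $n = 3$ trials (the data and function name are assumptions for illustration):

```python
def estimate_binomial_p(values, frequencies, n):
    """Estimate p as (sample mean) / n from a frequency table."""
    mean = sum(f * x for x, f in zip(values, frequencies)) / sum(frequencies)
    return mean / n

# Hypothetical data: number of successes out of n = 3 trials, recorded 50 times.
values = [0, 1, 2, 3]
frequencies = [10, 20, 15, 5]
print(round(estimate_binomial_p(values, frequencies, n=3), 4))  # 0.4333
```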
Given observed data for a goodness of fit test, how do you estimate the value of $m$ for a Poisson distribution?
Given observed data for a goodness of fit test, you can estimate the value of $m$ for a Poisson distribution by finding the mean of the observed data.
$m = \bar{x} = \frac{\sum f x}{\sum f}$.
This formula is not given in your exam formula booklet.
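The same idea as a short sketch, with a hypothetical frequency table of counts (the data and function name are assumptions for illustration):

```python
def estimate_poisson_m(values, frequencies):
    """Estimate m as the sample mean of a frequency table."""
    return sum(f * x for x, f in zip(values, frequencies)) / sum(frequencies)

# Hypothetical data: number of events per interval, recorded for 50 intervals.
values = [0, 1, 2, 3, 4]
frequencies = [14, 18, 10, 6, 2]
print(estimate_poisson_m(values, frequencies))  # 1.28
```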
True or False?
For a goodness of fit test, the number of degrees of freedom is always $k - 1$, where $k$ is the number of classes.
False.
For a goodness of fit test, the number of degrees of freedom is not always $k - 1$, where $k$ is the number of classes.
You also need to subtract an additional 1 for every parameter that had to be estimated.
$H_{0} : X$ can be modelled by a normal distribution.
How would you find the number of degrees of freedom for the goodness of fit test described by the null hypothesis above?
$H_{0} : X$ can be modelled by a normal distribution.
The number of degrees of freedom for the goodness of fit test is $k - 3$ where $k$ is the number of classes after combining if necessary.
This is because two parameters, the mean and the variance, will need to be estimated.
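The rule can be sketched as one small function: start from the number of classes minus 1, then subtract 1 for each estimated parameter. The function name and the choice of 6 classes are illustrative assumptions.

```python
def goodness_of_fit_df(classes: int, estimated_parameters: int = 0) -> int:
    """Degrees of freedom: k - 1, minus 1 for each estimated parameter."""
    return classes - 1 - estimated_parameters

# Hypothetical example with k = 6 classes (after combining if necessary).
print(goodness_of_fit_df(6))     # 5: distribution fully specified
print(goodness_of_fit_df(6, 2))  # 3: normal with mean and variance estimated (k - 3)
```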
$H_{0} : X$ can be modelled by $B (3, 0.4)$ .
The expected values for the hypothesis test are shown in the table.
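| $x$ | 0 | 1 | 2 | 3 |
| --- | --- | --- | --- | --- |
| Expected frequency | 8 | 21 | 18 | 3 |
What do you have to do with the expected values?
You have to combine the final two columns as the last expected value is less than 5.
| $x$ | 0 | 1 | 2 or more |
| --- | --- | --- | --- |
| Expected frequency | 8 | 21 | 21 |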
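A hedged sketch of this combining step is given below. It only handles small expected values at the upper end of the table, which is a simplifying assumption, and it merges the observed frequencies in the same way. The observed frequencies are hypothetical.

```python
def combine_tail(expected, observed, minimum=5):
    """Merge trailing classes until the last expected value is at least `minimum`.

    The observed frequencies are merged in exactly the same way.
    """
    expected, observed = list(expected), list(observed)
    while len(expected) > 1 and expected[-1] < minimum:
        last_expected = expected.pop()
        last_observed = observed.pop()
        expected[-1] += last_expected
        observed[-1] += last_observed
    return expected, observed

expected = [8, 21, 18, 3]   # expected values from the table above
observed = [12, 18, 16, 4]  # hypothetical observed frequencies
print(combine_tail(expected, observed))  # ([8, 21, 21], [12, 18, 20])
```

After combining, the number of classes drops from 4 to 3, which also reduces the number of degrees of freedom.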
