Sampling & Data Collection (CIE A Level Maths: Probability & Statistics 2)

Exam Questions

2 hours, 21 questions
1a
2 marks

A census is when data is collected from every member of the population.

Give one advantage and one disadvantage of using a census rather than a sample.

1b
1 mark

A company produces batteries, and part of its quality control process involves testing the batteries to see how long they last. State a reason why it is not practical to test every battery the company produces (i.e. to use a census for the quality control process).

1c
1 mark

In a population of 1000 members, 50 are randomly selected for a sample.
Write down the sampling fraction.
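
As a quick arithmetic check, a sampling fraction is the sample size divided by the population size. The Python sketch below is only an illustration; the helper name `sampling_fraction` is an assumption made here and is not part of the question.

```python
from fractions import Fraction


def sampling_fraction(sample_size: int, population_size: int) -> Fraction:
    """Return sample size / population size as an exact fraction."""
    if not 0 < sample_size <= population_size:
        raise ValueError("sample size must be between 1 and the population size")
    return Fraction(sample_size, population_size)


fraction = sampling_fraction(50, 1000)
print(fraction)         # 1/20
print(float(fraction))  # 0.05
```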

2a
1 mark

Computer components produced by an electronics company are each given a unique serial number. A34X processing chips are produced in batches of 2500. For quality control, the company tests a random sample of 20 A34X chips from each batch.

Suggest a suitable sampling frame from which to obtain this sample.

2b
3 marks

The company will use four-digit random numbers in order to select the 20 A34X chips from each batch.

(i)
Briefly explain why the serial numbers cannot be four-digit random numbers.
(ii)
Explain how the company can use four-digit random numbers in order to sample the A34X chips.

3
3 marks

For each scenario below, state, giving a reason, whether the scenario could lead to bias in the data collected.

(i)
In a survey of adults’ health, a sample of people aged 70 or over is used.
(ii)
A teacher asking students about the amount of work outside of lessons they attempt per week.
(iii)
Randomly sampling 20 people from a population of 100 by assigning each person in the population a unique two-digit number (00-99) and then selecting 20 using a two-digit random number generator.

4a
3 marks

A fast-food chain introducing a new vegan menu employs a researcher to investigate people’s opinions before it launches the products. The researcher decides to survey half the people eating in one branch of the fast-food chain at 2pm on a Wednesday afternoon.

Explain what is meant by the words ‘population’ and ‘sampling frame’.

You may use the context of the above scenario as an example.

4b
2 marks

The researcher decides to select people in the fast-food branch using a random sample whereby they assign each customer a unique number and then select those to be surveyed using a random number generator, ignoring any numbers generated that do not correspond to a customer.

(i)
Explain why this is a valid way to conduct a random sample.
(ii)
Give a reason why the sampling process described may still lead to biased data.
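
The selection procedure described in this question can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `select_by_random_numbers`, the 30 customers numbered 1 to 30 and the two-digit generator are all hypothetical, since the question does not say how many customers are present.

```python
import random


def select_by_random_numbers(customer_ids, sample_size, rng, digits=2):
    """Pick sample_size distinct customers by generating random numbers.

    Numbers with no matching customer, and numbers already chosen, are ignored.
    """
    # Only IDs the generator can actually produce are reachable.
    reachable = {n for n in customer_ids if 0 <= n < 10 ** digits}
    if sample_size > len(reachable):
        raise ValueError("not enough customers can be reached by the generator")
    chosen = []
    while len(chosen) < sample_size:
        number = rng.randrange(10 ** digits)  # 00-99 for a two-digit generator
        if number in reachable and number not in chosen:
            chosen.append(number)
    return chosen


# Hypothetical example: 30 customers numbered 1-30, half of them surveyed.
rng = random.Random(2024)  # seeded only so the sketch is repeatable
sample = select_by_random_numbers(range(1, 31), 15, rng)
print(sorted(sample))
```

Every customer has the same chance of being picked on each draw, which is what makes the selection random; the code says nothing about whether the customers present are representative of the wider population.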

5a
2 marks

A flatpack furniture company, AEKI, has a testing facility where its products are put through a series of safety and quality checks. 

Suggest two reasons why it would not be sensible for AEKI to test every product they produce.

5b
2 marks

Explain how AEKI could use a systematic sample to test their products.
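
For readers who want to see the general idea of systematic sampling in action, here is a rough Python sketch: pick a random starting point within the first interval, then take every k-th item. The function name `systematic_sample`, the batch of 200 products and the sample size of 10 are assumptions chosen for illustration, not details from the question.

```python
import random


def systematic_sample(items, sample_size, rng):
    """Take every k-th item after a random start, with k = len(items) // sample_size."""
    if not 0 < sample_size <= len(items):
        raise ValueError("sample size must be between 1 and the number of items")
    interval = len(items) // sample_size
    start = rng.randrange(interval)  # random position within the first interval
    return [items[start + i * interval] for i in range(sample_size)]


# Hypothetical batch of 200 products, listed in production order.
products = [f"item-{n:03d}" for n in range(1, 201)]
sample = systematic_sample(products, 10, random.Random(7))
print(sample)
```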

6a
3 marks

A high school holds an annual summer festival to raise money for events and trips throughout the year. Before this year’s festival the headteacher decided to survey the opinion of the 40 staff and 490 students using a random sample.

(i)
Suggest a suitable sampling frame for both students and staff.
(ii)
The headteacher will use a sampling fraction of one-tenth for both staff and students. Find the number of staff and students that will take part in the survey.
6b
1 mark

Suggest a possible problem that might arise with the sampling frame when selecting the staff and students.

6c
1 mark

Briefly explain why the data collected may be biased if the headteacher were to ask the staff and students for their opinions personally.

7a
2 marks

N’Oréal, a hair and beauty company, releases an advertising campaign for its new product called ‘Face Amazifier’. The small print at the bottom of the advert says the following:

‘55% of 25 existing customers agree that the product makes your face feel more amazing.’

The Advertising Standards Agency get complaints that the advert is not representative.

Explain the meaning of ‘bias’ in relation to the choice of sample N’Oréal has used.

7b
2 marks

Suggest two ways N’Oréal could improve their sample to make it more representative.

7c
3 marks

(i)
Show that 55% of 25 customers is not a valid statistic for N’Oréal to use.

(ii)
State the two closest possible alternative figures that would be valid.
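
The arithmetic behind this part can be checked with a short Python sketch. It is offered only as one possible way to reason about it: a percentage of 25 customers has to correspond to a whole number of people. The variable names are illustrative.

```python
from fractions import Fraction

customers = 25
claimed = Fraction(55, 100) * customers
print(float(claimed))  # 13.75, which is not a whole number of customers

# Percentages that 25 customers can actually produce: 0%, 4%, 8%, ..., 100%.
possible = [Fraction(k, customers) * 100 for k in range(customers + 1)]
below = max(p for p in possible if p < 55)
above = min(p for p in possible if p > 55)
print(below, above)  # 52 56
```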

1a
2 marks

A census is when data is collected from every member of the population. In the UK, the Office for National Statistics runs a compulsory census every ten years, in years ending in 1 (2001, 2011, etc), to gather information about all individuals and households in England and Wales. This information helps organisations make decisions on planning and funding for public services in each area, including transport, education and healthcare.

Suggest one advantage and one disadvantage of the census only being carried out every ten years.

1b
1 mark

A local council wants to gather opinions from residents about opening a new care home.  They decide to conduct their own survey as opposed to using information from the 2011 census.

Give a reason why the council may be wise to conduct their own survey rather than using data from the 2011 census.

1c
2 marks

Briefly describe how the council could use a random number generator to select residents for their survey.

2a
2 marks

Clyde works at an orangutan sanctuary and part of his job involves measuring the weight of each of its orangutans once a week.

The weights, to the nearest kg, of ALL their 20 adult males are listed below:

52, 57, 63, 80, 56, 66, 101, 68, 55, 96, 70, 62, 66, 64, 99, 91, 55, 92, 73, 61

State a reason why, in this case, it is important that Clyde weighs every orangutan rather than taking a sample of them.

2b
3 marks

The sanctuary decides to trial a new medication by testing it on 5 of their 20 adult male orangutans. Clyde assigns each orangutan one unique 2-digit number (from 00 to 99), then uses a 2-digit random number generator to select which orangutans will be given the new medication.  Clyde ignores any numbers generated that are not assigned to an orangutan.

(i)
Explain how Clyde’s system of selecting orangutans will produce a random sample of the population.
(ii)
Explain how Clyde could make more efficient use of the 2-digit random number generator.

3a
1 mark

A supermarket wants to gather data from its shoppers on how far they have travelled to shop there.

Give a reason why sampling shoppers from the supermarket would be appropriate in this case.

3b
3 marks

One lunchtime an employee is stationed at the door of the shop and instructed to ask the first 10 customers how far they have travelled. Give two criticisms of using this method to sample shoppers.

3c
1 mark

Briefly explain why it is impossible to have a sampling frame for the population in this case.

4
2 marks

To check the quality of produce used in a restaurant kitchen, the head chef likes to taste one item from each box of produce as soon as it arrives at the restaurant. The head chef opens each box in turn and takes the top item of produce to taste test.

(i)
Suggest a reason why the chef does not taste test every item.
(ii)
Suggest a reason why the chef does not necessarily test every type of produce delivered.
(iii)
Give one reason the chef’s method of sampling produce is not random.

5a
3 marks

A company wants to survey 15% of its staff to find out whether employees would like to continue working from home after the Covid-19 pandemic. The company has 580 members of staff.

Describe how the company can randomly sample its staff, stating clearly the sampling frame that could be used.

5b
2 marks

Work out the sampling fraction that the company will use.

6
4 marks

The wingspans of a sample of flamingos in a reserve were measured to the nearest centimetre. The results are shown in the table below:

Wingspan (cm)    Number of flamingos, f
140 - 144        2
145 - 149        5
150 - 154        8
155 - 159        3
160 - 164        6
165 - 169        1

(i)
State one condition that must have been met in order for the flamingos whose wingspans were measured to have been selected by a random sample.
(ii)
The wingspan data is collected on a weekday morning. State, with a reason, whether this could create bias in the data.
(iii)
It is estimated there are between 150 and 200 flamingos in the reserve. Assuming this estimate is accurate, show that the sampling fraction is greater than 10% regardless of the exact number of flamingos at the reserve.
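
The bound asked for in part (iii) can be sanity-checked numerically. The Python sketch below takes the frequencies from the table above and assumes the estimate of 150 to 200 flamingos holds; the variable names are illustrative only.

```python
from fractions import Fraction

frequencies = [2, 5, 8, 3, 6, 1]   # number of flamingos in each wingspan class
sample_size = sum(frequencies)     # 25 flamingos measured in total

# The sampling fraction is smallest when the population is largest (200).
smallest = Fraction(sample_size, 200)
largest = Fraction(sample_size, 150)

print(sample_size)             # 25
print(float(smallest) * 100)   # 12.5
print(smallest > Fraction(1, 10))  # True
```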

7
3 marks

For each scenario below, state, giving a reason, whether the scenario could lead to bias in the data collected.

(i)
A questionnaire containing the question “Do you agree that the UK voting age limit should be reduced to 16?”.
(ii)
A policeman asking 20 randomly chosen people at a music festival if they are in possession of any illegal substances.
(iii)
A sample of 5 dogs from a population of 10 at a rescue home, chosen by assigning each dog a single-digit number (from 0 to 9) and using a random number generator to decide which dogs are in the sample.

1a
3 marks

An online magazine which offers both free and paid for content has a large number of readers. Readers can view additional content by paying a monthly subscription fee. Based on reviews on the magazine’s website, the editor of the magazine believes that an additional type of content could be introduced. Before making any changes, the editor decides to carry out a sample survey to obtain the opinions of the readers.

(i)
Define the parent population that would be associated with the magazine.
(ii)
Briefly explain why $\bar{x}$, the mean age of a sample of the magazine’s readers, is only an estimate of $\mu$, the mean age of the parent population.
(iii)
Give one advantage the editor will have in carrying out a sample survey, compared with carrying out a survey of all readers.
1b
3 marks

The editor decides to gather opinions from only the 3750 readers who subscribe to the additional content. A random sample of 25 subscribers is selected for the sample survey.

(i)
Suggest a suitable sampling frame for the survey.
(ii)
Find the sampling fraction the editor has used.
(iii)
Give one possible disadvantage of carrying out the sample survey in this manner.

2a
3 marks

A charity running four sloth sanctuaries across a wild area of South America records data on all the sloths in its care. The numbers of sloths at the four sanctuaries are 24, 40, 16 and 56. To compare care across the 4 sanctuaries the charity decides to take a sample of sloths from each sanctuary and run some health tests on the sloths selected.

For the sanctuary with 40 sloths in its care, identify the sampling frame and briefly describe how the charity could efficiently use a three-digit random number generator to select the sloths from this sanctuary.

2b
2 marks

The charity decides to use a sampling fraction of one-eighth from each sanctuary. Determine the total number of sloths that will be selected for the health tests.
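
Applying one sampling fraction to several groups is a small piece of arithmetic that can be sketched in Python. The sanctuary sizes come from the question; the variable names are assumptions made for this illustration.

```python
from fractions import Fraction

populations = [24, 40, 16, 56]   # sloths at each of the four sanctuaries
fraction = Fraction(1, 8)        # sampling fraction applied to each sanctuary

# Each population here is a multiple of 8, so every result is a whole number.
assert all(n % fraction.denominator == 0 for n in populations)
per_sanctuary = [int(n * fraction) for n in populations]

print(per_sanctuary)        # [3, 5, 2, 7]
print(sum(per_sanctuary))   # 17
```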

2c
1 mark

The results from the health tests showed that the sloths at the sanctuary with the smallest population weighed, on average, less than those from the other three sanctuaries, whereas the average weights of the sloths in those other three sanctuaries were similar.

Suggest a reason why this does not necessarily mean that the sloths at the sanctuary with the smallest population are in poorer health than those at the other three sanctuaries.

2d
2 marks

The charity wishes to determine which sloths are suitable for release into the wild. A sloth can be released once it has reached the age of 3 and is of a healthy weight. Explain why the information from the sample conducted by the charity is of little use to them in determining which sloths are suitable for release into the wild.

3a
2 marks

Stephan is researching the effects a new energy drink has on the glucose levels of students aged 13 to 18. He decides to randomly sample 50 male and 50 female students and measure their blood glucose levels.

(i)
State one advantage of the sampling method Stephan is using.
(ii)
State one disadvantage of the sampling method Stephan is using.
3b
3 marks

Stephan has a sampling frame which is an alphabetical list of 100 of the male students in his year group from the Sixth Form college he attends. All 100 have agreed to have a blood glucose test should they be selected in the sample.

(i)
Explain how Stephan could use a two-digit random number generator to take a random sample of size 50 from the male students.
(ii)
Explain how the data could be biased.

4a
2 marks

Freda wants to conduct a survey to investigate the type of ice cream people prefer.

She decides to stand in a busy high street on a Sunday afternoon and attempt to get shoppers to answer her questions.

(i)
Define the word ‘population’ in the context of Freda’s survey.
(ii)
State one advantage and one disadvantage of Freda’s sampling method.
4b
2 marks

Explain why Freda’s sampling technique is not a random sample.

4c
3 marks

Having been unsuccessful in obtaining enough data from her previous attempt, Freda decides to look at the electoral register for her town and randomly select a new sample of people to contact.

(i)
Explain why Freda’s new method may still not produce a random sample.
(ii)
Suggest a reason why Freda may again be unsuccessful in getting enough data using this sampling technique.

5a
4 marks

The CEO of Save My Exams, Jamie, wants to find out what users would like to see on the revision website in future. He notices that around 15% of those who access the site have signed up to the mailing list to get content updates.

An employee suggests that they send out an email to all those who have signed up to the mailing list with a questionnaire for them to complete and return.

(i)
Give two reasons why the users who return the questionnaire would not form a random sample of users of the website.
(ii)
Given the site has over users, state two problems with sending out the questionnaire in this way.
5b
2 marks

Jamie decides to separate users by exam board to gather more detailed opinions. A member of the Maths Content Team suggests using a table of random numbers to select a random sample of 100 users from the 4581 CIE mailing list subscribers. The first five random numbers from the table are as follows.

02743               45290               19024               24337               90044

Explain how Jamie could use these random numbers to select the first few members in the sample.
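
One common convention for reading a random number table can be sketched in Python as below. It rests on assumptions that the question does not fix: subscribers are numbered 0001 to 4581, only the first four digits of each five-digit entry are used, and any entry that is out of range or already chosen is ignored. Other conventions (for example, using the last four digits) would be equally valid.

```python
random_numbers = ["02743", "45290", "19024", "24337", "90044"]
population_size = 4581  # CIE mailing list subscribers, numbered 0001-4581

selected = []
for entry in random_numbers:
    candidate = int(entry[:4])  # first four digits, e.g. "02743" -> 0274
    # Ignore numbers with no matching subscriber and any repeats.
    if 1 <= candidate <= population_size and candidate not in selected:
        selected.append(candidate)

print(selected)  # [274, 4529, 1902, 2433]
```

Under this convention the fifth entry gives 9004, which does not correspond to a subscriber, so it is skipped and the next entry in the table would be read.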

6a
4 marks

A researcher wishes to measure the heights of a random sample of giraffes from a large nature reserve where each of the 2400 giraffes is able to be tracked and identified.

When the current tracking system was first introduced, each giraffe was assigned a unique four‑digit ID number between 0001 and 2400.  The researcher wants to randomly select 100 of the giraffes to test a new, updated tracking system and decides to select the giraffes with the ID numbers 0001-0100.

(i)
Explain why this method of selecting the giraffes is not necessarily random.
(ii)
Identify the sampling frame in this context.
(iii)
Describe how the researcher can ensure a random selection of 100 giraffes.
6b
3 marks

Whilst attaching the new tracking devices to the 100 giraffes selected for the sample, the researcher also measured the heights of the giraffes and worked out their mean height, $\bar{h}$ m.

(i)
Explain why $\bar{h}$ m produces only an estimate of the population mean height, $\mu$ m.
(ii)
Explain how the researcher could use the results of the Central Limit Theorem to get a better estimate of the population mean.

7a
3 marks

A factory produces paper fruit baskets used for fruit pickers at ‘pick your own’ farms. The breaking load of a paper fruit basket is the maximum load that it can carry before the basket handles break. One ‘pick your own’ farm purchased 15 000 paper fruit baskets but wishes to test a sample of these to establish the breaking load of the baskets.

(i)
Identify the population in this context.
(ii)
In a census, every member of the population is sampled.
Give two reasons why a census would be unsuitable for the purpose of testing the paper fruit baskets.
7b
2 marks

The farm tests a random sample of six paper fruit baskets. The loads required for the handles to break are shown below:

2.035 kg    1.975 kg    2.128 kg    1.898 kg    2.112 kg    1.999 kg

(i)
Without performing any calculations on the sampled data, comment on the factory’s claim that the paper fruit baskets have a breaking load of 2 kg.
(ii)
Give one criticism of the sample the farm has used.
7c
3 marks

The farm decides to perform another 4 random sample tests, taking another sample of six paper baskets each time.  They calculate the mean breaking load for each sample.

(i)
Give one potential problem from performing multiple sample tests.
(ii)
Briefly describe how the farm can use the mean breaking loads from their samples to obtain a more accurate estimate of the mean breaking load of the entire population of paper fruit baskets.
