Population & Sampling (Edexcel GCSE Maths)

Revision Note

Author: Paul

Population & Sampling

What are the different types of data?

  • Primary data is data that has been collected by the person carrying out the research
    • This could be through questionnaires, surveys, experiments, etc.
  • Secondary data is data that has been collected previously
    • This could be found on the internet or through other research sources
  • Qualitative data is data that is usually given in words, not numbers, to describe something
    • For example: the colour of a teacher's car
  • Quantitative data is data given in numbers that count or measure something
    • For example: the number of pets that a student has
  • Discrete data is quantitative data that needs to be counted
    • Discrete data can only take specific, separate values (usually from a finite set)
    • For example: the number of times a coin is flipped until a ‘tails’ is obtained
  • Continuous data is quantitative data that needs to be measured
    • Continuous data can take any value within a range, so there are infinitely many possible values
    • For example: the height of a student
  • Age can be discrete or continuous depending on the context or how it is defined
    • If you mean how many years old a person is then this is discrete
    • If you mean how long a person has been alive then this is continuous

What is a population?

  • A population refers to the whole set of things that you are interested in
    • e.g. if a teacher wanted to know how long pupils in year 11 at their school spent revising each week, then the population would be all the year 11 pupils at the school
  • Population does not necessarily refer to a number of people or animals
    • e.g. if an IT expert wanted to investigate the speed of mobile phones, then the population would be all the different makes and models of mobile phones in the world

What is a sample?

  • A sample is a selected part (called a subset) of the population from which data is collected
    • e.g. for the teacher investigating year 11 revision times, a sample would be a certain number of pupils from year 11
  • A random sample is where every item in the population has an equal chance of being selected
    • e.g. every pupil in year 11 would have the same chance of being selected for the teacher's sample
  • A biased sample is where some items in the population are more likely to be selected than others, i.e. the sample is not random
    • e.g.  the teacher asks pupils from just one class
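
The difference between a random sample and a biased sample can be sketched in Python. This is only an illustration: the year group of 4 classes of 30 pupils, the pupil labels and the sample size of 12 are all made-up assumptions, not part of the teacher's example.

```python
import random

rng = random.Random(2024)  # fixed seed so the run is repeatable

# Hypothetical year group: classes A-D with 30 pupils each, labelled e.g. "A01"
year_11 = [f"{cls}{num:02d}" for cls in "ABCD" for num in range(1, 31)]

# Random sample: every pupil in the year has the same chance of being chosen
random_sample = rng.sample(year_11, 12)

# Biased sample: only pupils from class A can ever be chosen
biased_sample = [pupil for pupil in year_11 if pupil.startswith("A")][:12]

print("Random sample:", random_sample)
print("Biased sample:", biased_sample)
```

In the biased sample, pupils in classes B, C and D have no chance of being selected, so their revision habits cannot show up in the results.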

What are the advantages and disadvantages of using a population?

  • You may see or hear the word census - this is when data is collected from every member of the whole population
  • The advantages of using a population
    • Accurate results - as every member/item of the population is used
      • In reality it would be close to every member for practical reasons
    • All options/opinions/responses will be included in the results
  • The disadvantages of using a population
    • Time consuming to collect the data
    • Expensive due to the large numbers involved
    • Large amounts of data to organise and analyse

What are the advantages and disadvantages of using a sample? 

  • The advantages of using a sample
    • Quicker to collect the data
    • Cheaper, as there is less work involved
    • Less data to organise and analyse
  • The disadvantages of using a sample
    • A small sample size can lead to unreliable results
      • Sampling methods can usually be improved by taking a larger sample size
    • A sample can introduce bias
      • particularly if the sample is not random
    • A sample might not be representative of the population
      • Only a selection of options/opinions/responses might be accounted for 
      • The members/items used in the sample may all have similar responses
        e.g. even with a random sample, the teacher may happen to select pupils who all do very little revision
  • It is important to recognise that different samples (from the same population) may produce different results
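
A short simulation can show both ideas: different samples give different results, and larger samples tend to give more consistent ones. The sketch below assumes a made-up population of 200 pupils with randomly generated weekly revision hours; the sample sizes of 5 and 50 and the 500 repeats are arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

# Hypothetical population: weekly revision hours for 200 year 11 pupils
population = [round(rng.uniform(0, 12), 1) for _ in range(200)]
population_mean = statistics.mean(population)

def sample_means(sample_size, repeats=500):
    """Return the mean of each of `repeats` random samples of the given size."""
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(repeats)
    ]

small = sample_means(5)   # many small samples
large = sample_means(50)  # many larger samples

# The spread (largest mean minus smallest mean) shows how much results vary
spread_small = max(small) - min(small)
spread_large = max(large) - min(large)

print("Population mean:", round(population_mean, 2))
print("Spread of means, samples of 5: ", round(spread_small, 2))
print("Spread of means, samples of 50:", round(spread_large, 2))
```

The means from the samples of 50 should sit much closer together (and closer to the population mean) than the means from the samples of 5.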

Worked example

Mike is a biologist studying mice and has access to 600 mice that live in an enclosure.
Mike wants to sample some of the mice for a study into their response to a new drug.
He decides to sample 10 mice, selecting those nearest to the enclosure's entrance.

a)

State the population in this situation.

The population is the 600 mice living in the enclosure

b)

State two possible issues with the sampling method Mike intends to use.

The sample size is very small - just 10 mice
The mice are not being selected at random - those nearest the entrance have a greater chance of being selected

c)

Suggest one way in which Mike could improve the reliability of the results from his sample.

Mike should increase the sample size to increase the reliability of the results

Capture-Recapture

What is the capture-recapture method?

  • The capture-recapture method is a way to estimate the size of a population
    • It is used when it is impossible, time-consuming or impractical to count the whole population
    • Common examples include
      • the population of fish in a river/lake/sea
      • the population of wild animals in their natural habitat
  • The capture-recapture method is based on proportion
    • A first sample of the population is captured and each member is given an identifiable marker/tag
    • All members of the sample are then replaced (released) into the population
    • At a later time, a second sample of the population is taken
    • The proportion of the second sample that is tagged is assumed to be the same as the proportion of the population that is tagged

How do I use the capture-recapture method?

  • Let N be the size of the population
  • Take the first sample of size M, say
    • i.e.  M members of the population have been captured
    • Mark/tag every member in the sample (M members of the population now have tags)
    • Release all of the sample back into the population
    • Wait some time for those captured to mix with the rest of the population
      • the amount of time required will depend on the type of population
        e.g. fish may only need a few hours to mix but wild animals in a large habitat area may need days
  • Take the second sample of size n, say
    • Let m be the number in the second sample that have marks/tags
    • i.e.  m previously tagged members of the population have been recaptured
  • Form an equation by using the assumption of proportion
    • i.e.  the proportion of the second sample tagged is equal to the proportion of the population tagged
    • $\frac{m}{n} = \frac{M}{N}$
    • In words this is $\frac{\text{Number of 'recaptured'}}{\text{Size of second sample}} = \frac{\text{Number of 'captured'}}{\text{Size of population}}$
  • Rearrange this equation to find an estimate for the population size N
    • $N = \frac{n}{m} \times M$
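
The rearranged formula can be written as a small Python function. This is a minimal sketch: the function name, parameter names and the example numbers (40 tagged, 60 in the second sample, 12 recaptured) are assumptions chosen for illustration.

```python
def estimate_population(first_sample_size, second_sample_size, recaptured):
    """Estimate the population size N using N = (n / m) x M.

    first_sample_size  -> M, the number captured and tagged
    second_sample_size -> n, the size of the second sample
    recaptured         -> m, the number in the second sample with tags
    """
    if recaptured <= 0:
        # With no tagged members recaptured, the proportion m/n is zero
        # and the formula cannot give an estimate
        raise ValueError("at least one tagged member must be recaptured")
    return second_sample_size / recaptured * first_sample_size

# Hypothetical numbers: M = 40, n = 60, m = 12
print(estimate_population(40, 60, 12))  # 200.0
```

The result is an estimate only, so in context it would be reported as "approximately 200".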

What assumptions are made in the capture-recapture method?

  • Each member (element) of the population has an equal chance of being selected
    • this applies to both the first and second sample
  • Between the first and second samples
    • the tagged members have had sufficient time and opportunity to mix with the rest of the population
    • the population remains the same size (broadly speaking)
      • no (significant number of) births/deaths
      • no (significant number of) members leave the population - e.g. migration
    • no marks/tags have been removed or destroyed
      • e.g.  taken off, worn off
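
When these assumptions hold, the method should give estimates that are reasonably close to the true population size. The simulation below is a rough sketch of that idea only. The true size of 1000, the sample sizes of 100 and 200, and the 1000 repeats are all made-up values, and every member is given an equal chance of selection in both samples.

```python
import random

rng = random.Random(7)  # fixed seed so the run is repeatable

def simulate_estimate(true_size, first_sample_size, second_sample_size):
    """Run one capture-recapture experiment on a population of known size."""
    population = list(range(true_size))
    # First sample: capture and tag M members, then release them
    tagged = set(rng.sample(population, first_sample_size))
    # Second sample: every member again has an equal chance of selection
    second_sample = rng.sample(population, second_sample_size)
    recaptured = sum(1 for member in second_sample if member in tagged)
    if recaptured == 0:
        return None  # no estimate is possible without any recaptures
    return second_sample_size / recaptured * first_sample_size

estimates = [simulate_estimate(1000, 100, 200) for _ in range(1000)]
estimates = [e for e in estimates if e is not None]
average_estimate = sum(estimates) / len(estimates)

print("Smallest estimate:", round(min(estimates)))
print("Largest estimate: ", round(max(estimates)))
print("Average estimate: ", round(average_estimate))
```

Individual estimates vary from one experiment to the next, which is why the final answer is always given as an approximation.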

Worked example

Roger captures 50 rabbits from Lopital Woods and marks them by putting a safe tag on their ears.
Roger then releases the rabbits back into the woods.
Some time later, Roger captures 100 rabbits and finds that 8 of the rabbits have the tag in their ears.

a)

Use the capture-recapture method to estimate the size of the population of rabbits in Lopital Woods.


Start by defining an unknown to be the size of the population.

Let N be the size of the population

Write down the fraction of the population that have the tag (from the first sample).

$\frac{50}{N}$

Write down the fraction of the second sample that have the tag.

$\frac{8}{100}$

Form an equation using the assumption these two fractions (proportions) are equal.

$\frac{8}{100} = \frac{50}{N}$

Rearrange the equation (one way to do this would be to 'cross-multiply').

$8N = 5000$

Solve to find N (divide both sides of the equation by 8).

$N = \frac{5000}{8} = 625$

Answer the question in context.

There are approximately 625 rabbits in Lopital Woods

b)

State one assumption that would have to be made for the estimate to be valid.

Roger allowed enough time for the tagged rabbits to mix with the rest of the population between samples
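
As a quick check on part (a), the values from this example (M = 50, n = 100, m = 8) can be substituted directly into the rearranged form of the capture-recapture equation, which should give the same estimate:

```latex
N = \frac{n}{m} \times M = \frac{100}{8} \times 50 = 12.5 \times 50 = 625
```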


Author: Paul

Expertise: Maths

Paul has taught mathematics for 20 years and has been an examiner for Edexcel for over a decade. GCSE, A level, pure, mechanics, statistics, discrete – if it’s in a Maths exam, Paul will know about it. Paul is a passionate fan of clear and colourful notes with fascinating diagrams – one of the many reasons he is excited to be a member of the SME team.