Estimation from Statistical Data (Edexcel GCSE Statistics)

Revision Note

Estimating Population Characteristics

How can I use samples to estimate population characteristics?

  • Summary statistics calculated from a sample can be used to estimate the same statistics for the population as a whole

    • e.g. the mean, median, range, quartiles and interquartile range

  • Remember that a sample is a selection of members drawn from the population, whereas the population is all the members

    • Statistics calculated from a sample will usually not be exactly the same as the statistics for the whole population (different mean, etc.)

    • Statistics calculated from two different samples will also not usually be the same

  • As long as the sample is representative of the population

    • then you can assume that the statistics for the population are approximately the same as those for the sample

      • e.g. assume the population mean is about equal to the sample mean

  • This can be used to make predictions about the population

    • About half (50%) of the population will be above the sample median

      • and about half will be below

    • About a quarter (25%) of the population will be below the sample lower quartile

      • and about a quarter will be above the sample upper quartile

    • About half (50%) of the population will be between the sample upper and lower quartiles

  • Sample size has an impact on the reliability of estimates made about the population

    • In general a larger sample size will lead to more reliable conclusions
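
The effect of sample size can be illustrated with a short simulation. The sketch below is only an illustration and is not something you would do in the exam. It builds a made-up population of 600 weights and compares how much the sample mean varies between small and large random samples. The normal distribution, the mean of 1.65 kg, the spread of 0.5 kg and the function name `spread_of_sample_means` are all assumptions chosen for the example.

```python
import random
import statistics

# Hypothetical population: 600 made-up weights in kg (illustration only).
rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.gauss(1.65, 0.5) for _ in range(600)]


def spread_of_sample_means(sample_size, trials=200):
    """Standard deviation of the sample mean over many random samples."""
    means = [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(trials)
    ]
    return statistics.stdev(means)


small = spread_of_sample_means(10)
large = spread_of_sample_means(100)

print(f"population mean:                 {statistics.mean(population):.3f} kg")
print(f"spread of means, samples of 10:  {small:.3f} kg")
print(f"spread of means, samples of 100: {large:.3f} kg")
```

The means of the larger samples should cluster more tightly around the population mean, which is what "more reliable" means here.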

Examiner Tips and Tricks

  • If a question asks you about improving the reliability of estimates made from samples

    • the answer will almost always have to do with increasing the sample size

Worked Example

Paul has been studying a population of rabbits in Lopital Woods. He captured a sample of 50 rabbits and weighed each of the rabbits before releasing them again. He records the following data for his sample:

total weight of rabbits: 82.5 kg
lower quartile: 1.2 kg
median: 1.6 kg
upper quartile: 2.1 kg


(a) Calculate an estimate for the mean weight of the population of rabbits in Lopital Woods.

Assume that the population mean is the same as the sample mean
Calculate the sample mean by dividing the total weight by the number in the sample (50)

$\frac{82.5}{50} = 1.65$

1.65 kg

It is assumed that there are a total of 600 rabbits living in Lopital Woods.

(b) Use Paul's data to estimate how many rabbits in Lopital Woods weigh between 1.2 kg and 2.1 kg.

Those values are the lower and upper quartiles of the data set
Half of the data values fall between the lower and upper quartiles
Assume the same is true for the population as a whole

$\frac{1}{2} \times 600 = 300$

Approximately 300 rabbits


(c) Suggest a way that the reliability of Paul's results could be improved.

Use a larger sample of rabbits to calculate the statistics from
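
The arithmetic in parts (a) and (b) can be checked with a few lines of Python. This is a minimal sketch using only the figures given in the question. The variable names are chosen here for illustration.

```python
# Figures taken from Paul's sample in the worked example
total_weight_kg = 82.5
sample_size = 50
population_size = 600  # assumed total number of rabbits, as given before part (b)

# (a) Use the sample mean as the estimate of the population mean
mean_estimate_kg = total_weight_kg / sample_size

# (b) About half of the population lies between the lower and upper quartiles
between_quartiles = 0.5 * population_size

print(f"Estimated mean weight: {mean_estimate_kg:.2f} kg")
print(f"Estimated rabbits between 1.2 kg and 2.1 kg: {between_quartiles:.0f}")
```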

Petersen Capture Recapture Formula

What is the capture recapture method?

  • The capture recapture method is a way to estimate the size of a population

    • It is used when it is either impossible or impractical to count the whole population

      • e.g. too expensive or too time-consuming

    • Common examples include

      • the population of fish in a river/lake/sea

      • the population of wild animals in a natural habitat

  • The capture recapture method is based on proportion

    • A first sample of the population is captured

      • and each member is given an identifiable marker/tag

    • All members of the sample are then replaced 

      • i.e. released back into the population

    • At a later time, a second sample of the population is taken

    • The proportion of the second sample that is tagged is assumed to be the same as the proportion of the whole population that is tagged (i.e. that was captured in the first sample)

How do I use the Petersen capture recapture formula?

  • The Petersen capture recapture formula is used to find an estimate for the population size using a capture recapture experiment

    • $\text{Number in population} = \frac{\text{sample size 1} \times \text{sample size 2}}{\text{number marked in sample 2}}$

      • This formula is not on the exam formula sheet, so you need to remember it

    • A shorter version is $N = \frac{Mn}{m}$

      • N is the size of the population

      • M is the size of the first sample

      • n is the size of the second sample

      • m is the number in the second sample that have markers/tags

  • It can be easier to remember the formula if you understand where it comes from

    • The proportion of the total population captured (and marked/tagged) in the first sample is $\frac{M}{N}$

    • The proportion of the second sample that is marked/tagged is $\frac{m}{n}$

    • We assume that those two proportions are equal

      • $\frac{m}{n} = \frac{M}{N}$

    • Rearrange that equation to get the formula

      • $\frac{mN}{n} = M$

      • $mN = Mn$

      • $N = \frac{Mn}{m}$

What assumptions need to be made for the capture recapture method to be valid?

  • Each member (element) of the population must have an equal chance of being selected

    • This applies to both the first and second samples

    • This means that random sampling is used

  • Between the first and second samples

    • The tagged members must have had sufficient time and opportunity to mix with the rest of the population

    • The population remains the same size (broadly speaking)

      • No (significant number of) births/deaths

      • No (significant number of) members leave the population

        • e.g. migration

    • No marks/tags have been removed or destroyed

      • e.g. taken off, worn off

  • The process of tagging the members of the first sample must not affect the likelihood of them being recaptured
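
To see why these assumptions matter, consider what happens to the estimate if some tags are lost between the two samples. The sketch below uses illustrative numbers only (they are not measured data), and the variable names are hypothetical.

```python
# Illustrative numbers only (not measured data)
first_sample = 50     # M: animals tagged and released
second_sample = 100   # n: animals caught later
tagged_seen = 8       # m: tagged animals in the second sample if no tags are lost

estimate_no_loss = first_sample * second_sample / tagged_seen

# Suppose 2 of those 8 animals had lost their tags before the second sample,
# so only 6 are counted as tagged.
tags_lost = 2
estimate_with_loss = first_sample * second_sample / (tagged_seen - tags_lost)

print(f"No tags lost:  N is about {estimate_no_loss:.0f}")
print(f"Two tags lost: N is about {estimate_with_loss:.0f}")
```

Fewer tags counted in the second sample makes $m$ smaller, so the estimate of $N$ comes out too large.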

When is the capture recapture method reliable?

  • The sample sizes used must be large enough to be representative of the population for the method to be reliable

Worked Example

Roger captures 50 rabbits from Lopital Woods and marks them by putting a small safe tag on their ears. Roger then releases the rabbits back into the woods.

Some time later, Roger captures 100 rabbits and finds that 8 of the rabbits have the tag on their ears.

(a) Use this information to estimate the size of the population of rabbits in Lopital Woods.

If you remember the formula $N = \frac{Mn}{m}$ you can use it directly here
$M = 50$, $n = 100$, $m = 8$

$N = \frac{50 \times 100}{8} = \frac{5000}{8} = 625$


If you don't remember the formula, use the 'equal proportions' idea

The proportion of the population captured (and tagged) in the first sample is $\frac{50}{N}$
The proportion of the second sample with tags is $\frac{8}{100}$
Set those two proportions equal to each other

$\frac{8}{100} = \frac{50}{N}$

Solve that equation for N

$$
\begin{aligned}
\frac{8N}{100} &= 50 \\
8N &= 50 \times 100 \\
8N &= 5000 \\
N &= \frac{5000}{8} \\
N &= 625
\end{aligned}
$$

Answer the question in context

There are approximately 625 rabbits in Lopital Woods

(b) State one assumption that would have to be made for the estimate to be valid.

Roger allowed enough time for the tagged rabbits to mix with the rest of the population between samples
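
The estimate in part (a) can be checked both ways in Python. This is a minimal sketch using Roger's figures. It uses `fractions.Fraction` from the standard library so that the proportions stay exact, and the variable names are chosen here for illustration.

```python
from fractions import Fraction

M = 50   # rabbits tagged in the first sample
n = 100  # rabbits caught in the second sample
m = 8    # tagged rabbits in the second sample

# Method 1: use the formula N = Mn / m directly
N_formula = Fraction(M * n, m)

# Method 2: equal proportions, m/n = M/N, so N = M / (m/n)
proportion_tagged = Fraction(m, n)  # 8/100, stored in lowest terms as 2/25
N_proportion = M / proportion_tagged

print(N_formula, N_proportion)  # both methods give 625
```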

Author: Roger B

Expertise: Maths

Roger's teaching experience stretches all the way back to 1992, and in that time he has taught students at all levels between Year 7 and university undergraduate. Having conducted and published postgraduate research into the mathematical theory behind quantum computing, he is more than confident in dealing with mathematics at any level the exam boards might throw at you.

Author: Dan Finlay

Expertise: Maths Lead

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.