Run Length Encoding & Dictionary Coding (OCR A Level Computer Science)

Revision Note

Callum Davies

Written by: Callum Davies

Reviewed by: James Woodhouse

Updated on

Run Length Encoding & Dictionary Coding

What is Run-Length Encoding?

  • Run-Length Encoding (RLE) is a form of data compression that condenses identical elements into a single value with a count.

  • For a text file, "AAAABBBCCDAA" is compressed to "4A3B2C1D2A"

  • The string has four 'A's, followed by three 'B's, two 'C's, one 'D', and two 'A's. RLE shows this as "4A3B2C1D2A"

  • It is used in bitmap images to compress sequences of the same colour

  • For example, a line in an image with 5 red pixels followed by 3 blue pixels could be represented as "5R3B"

What is Dictionary Coding?

  • Dictionary coding replaces recurring sequences with shorter, unique codes

  • A 'dictionary' is compiled to map original sequences to special codes

  • This method is effective for both text and binary data

  • The phrase "for example" could be coded as 'FE' if 'FE' doesn't appear in the original text

  • A sequence of binary numbers '1010' could be replaced by a shorter unique code

Example of Dictionary Coding

  • Consider this sentence where some algorithm names are repeatedly mentioned:

    • QuickSort is faster than BubbleSort but MergeSort is more stable than QuickSort and BubbleSort.

  • Here, the names QuickSort, BubbleSort, and MergeSort are repeated.

  • Create a dictionary:

    • Start with an empty dictionary.

    • Scan the sentence for recurring sequences.

    • The dictionary might look like this:

{
   QuickSort: Q
   BubbleSort: B
   MergeSort: M
}

  • Replace the sequences

    • The repetitive words in the sentence can be replaced with the dictionary values:

      • Original: QuickSort is faster than BubbleSort but MergeSort is more stable than QuickSort and BubbleSort.

      • Compressed: Q is faster than B but M is more stable than Q and B.

  • The original string was 95 characters long, and the dictionary-coded example is 53 characters long

    • The shorter string will require less space in memory or storage

Examiner Tips and Tricks

  • RLE and Dictionary Coding serve different needs and have their advantages and disadvantages

  • RLE: More effective when data has lots of repetition

  • Dictionary Coding: More versatile but may require more computational resources

Worked Example

A survey focuses on the kinds of vehicles travelling on a motorway. For each vehicle that passes, a letter is noted:

  • For a car, 'C' is entered.

  • For a motorbike, 'M' is entered.

  • For a lorry, 'L' is entered.

  • For any other vehicle, 'O' is entered.

It's decided to compress the data generated.

Run Length Encoding has compressed the following sequence:

3C3M4C

Show the result of decompressing the sequence.

[2]

How to answer this question:
Run Length Encoding (RLE) shows the number of consecutive occurrences of a character in a sequence. To decompress, repeat each character the number of times indicated by the preceding number.

Answer:

Example answer that gets full marks:
CCCMMMCCCC

Worked Example

The following sequence represents the raw data collected during the survey:
CCCCOLLLCCCCCMOCCCCC

Show the result of compressing the sequence.

[2]

How to answer this question:
To compress using RLE, count consecutive occurrences of each character and append the count before the character itself.

Answer:

Example answer that gets full marks:
4C1O3L5C1M1O5C

Worked Example

Write pseudocode for the function "longest", which takes in a string of characters as an argument and returns an integer representing the longest continuous sequence of 'C's.

[6]


How to answer this question
Write pseudocode for a function that iterates through the string, counting continuous sequences of 'C's and keeping track of the longest such sequence.

For example, in ABCCCBAACC the longest consecutive string of 'C's is 3.

Answer:

Example answer that gets full marks

How this answer might look in pseudocode:

Function longest(s: String) -> Integer:
   max_count = 0
   current_count = 0
    
    For each character in s:
        If char equals 'C':
            Increment current_count by 1
            If current_count > max_count:
                Set max_count to current_count
        Else:
            Set current_count to 0
    
    Return max_count

How this answer might look in Python:

def longest(s): 
      max_count = 0
      current_count = 0
      
      for char in s:
            if char == 'C':
                  current_count += 1
            else:
                  current_count = 0
                  if current_count > max_count:
                        max_count = current_count
      
      return max_count

How this answer might look in Java:

public class Main {
    public static void main(String[] args) {
        System.out.println(longest("ABCACCCCCABAC"));  // Output should be 5
    }

    public static int longest(String s) {
        int maxCount = 0;
        int currentCount = 0;

        for (int i = 0; i < s.length(); i++) {
            char currentChar = s.charAt(i);

            if (currentChar == 'C') {
                currentCount++;
                if (currentCount > maxCount) {
                    maxCount = currentCount;
                }
            } else {
                currentCount = 0;
            }
        }

        return maxCount;
    }
}

Acceptable answers you could have given instead:
You could use different variable names or loop structures, but the logic should be the same to get full marks.

You've read 0 of your 5 free revision notes this week

Sign up now. It’s free!

Join the 100,000+ Students that ❤️ Save My Exams

the (exam) results speak for themselves:

Did this page help you?

Callum Davies

Author: Callum Davies

Expertise: Computer Science

Callum is an experienced teacher of GCSE and A-Level Computer Science. He has 4 years of teaching experience and has detailed knowledge of how to achieve exam success, having marked for OCR A-Level. Callum is now a software engineer and regularly mentors new engineers.

James Woodhouse

Author: James Woodhouse

Expertise: Computer Science

James graduated from the University of Sunderland with a degree in ICT and Computing education. He has over 14 years of experience both teaching and leading in Computer Science, specialising in teaching GCSE and A-level. James has held various leadership roles, including Head of Computer Science and coordinator positions for Key Stage 3 and Key Stage 4. James has a keen interest in networking security and technologies aimed at preventing security breaches.