WhitePages picoCTF 2019 Solution

Published: April 2, 2026

Description

I stopped using color in my terminal. Decode the binary in the whitepages file.

Download the file.

bash
wget <url>/whitepages.txt
  1. Step 1Examine the file with xxd
    The file appears to contain only whitespace. Run xxd to see the actual hex values of each byte. You will find two different whitespace characters being used - for example, regular space (0x20) and a special Unicode whitespace character.
    bash
    xxd whitepages.txt | head -20
    Learn more

    Unicode contains many whitespace characters beyond the regular space (U+0020). Common ones used in steganography challenges include: em space (U+2003, UTF-8: E2 80 83), en space (U+2002), thin space (U+2009), and others. All look identical in most text editors.

  2. Step 2Map whitespace characters to binary bits
    Identify the two different whitespace byte sequences. Assign one to bit 0 and the other to bit 1. Group every 8 bits into a byte and convert to ASCII.
    python
    python3 << 'EOF'
    with open('whitepages.txt', 'rb') as f:
        data = f.read()
    
    # Identify the two whitespace types from xxd output
    # e.g., 0x20 = space = 0, 0xe2 0x80 0x83 = em-space = 1
    bits = ''
    i = 0
    while i < len(data):
        if data[i:i+3] == b'\xe2\x80\x83':
            bits += '1'
            i += 3
        elif data[i] == 0x20:
            bits += '0'
            i += 1
        else:
            i += 1
    
    # Convert bits to ASCII
    result = ''
    for j in range(0, len(bits) - 7, 8):
        byte = int(bits[j:j+8], 2)
        result += chr(byte)
    print(result)
    EOF
    Learn more

    This technique is called whitespace steganography. The SNOW tool and the Whitespace programming language both use similar concepts of encoding information in invisible characters. It is effective against casual inspection but immediately visible under hex analysis.

    The key insight is that while the text looks blank, it encodes a binary string where each character of the message is represented by 8 bits of whitespace characters.

Flag

picoCTF{...}

Map the two types of whitespace characters to binary bits 0 and 1, then decode 8-bit groups as ASCII characters.

Want more picoCTF 2019 writeups?

Useful tools for Forensics

Related reading

What to try next