Description
I stopped using color in my terminal. Decode the binary in the whitepages file.
Setup
Download the file.
wget <url>/whitepages.txt
Solution
Step 1
Examine the file with xxd
Observation
The challenge file is named 'whitepages' and appears to contain only whitespace, which suggested the data might be hidden in the byte values of whitespace characters that look alike but are distinct, rather than in any visible text.
Run xxd to see the actual hex values of each byte. You will find two different whitespace characters being used: for example, regular space (0x20) and a special Unicode whitespace character.
xxd whitepages.txt | head -20
Expected output
00000000: 2020 20e2 8083 20e2 8083 2020 e280 8320     ... ...  ... 
00000010: e280 8320 20e2 8083 2020 e280 8320 e280  ...  ...  ... ..
00000020: 8320 20e2 8083 20e2 8083 e280 8320 20e2  .  ... ......  .
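As a cross-check on the hex dump, the distinct characters can also be tallied programmatically. The following is a minimal sketch, assuming the file is valid UTF-8; the helper name `tally_symbols` and the sample bytes are illustrative and not part of the challenge.

```python
from collections import Counter


def tally_symbols(data: bytes) -> Counter:
    """Count each distinct character in UTF-8 data, keyed as 'U+XXXX'."""
    text = data.decode("utf-8")
    return Counter(f"U+{ord(ch):04X}" for ch in text)


if __name__ == "__main__":
    # Hypothetical stand-in for the contents of whitepages.txt:
    # two regular spaces, one em space, one regular space.
    sample = b"\x20\x20\xe2\x80\x83\x20"
    for symbol, count in tally_symbols(sample).most_common():
        print(symbol, count)
```

To run it against the real file, read the bytes with `open('whitepages.txt', 'rb').read()` and pass them in. Seeing exactly two keys in the tally supports the two-symbol reading.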
What didn't work first
Tried: Open whitepages.txt in a text editor or cat it to the terminal to look for hidden content.
Both approaches display whitespace characters as blank space, so the file appears completely empty. Terminals and editors render U+0020 and U+2003 as blank space alike, so the difference between them is invisible. Only a hex dump tool like xxd reveals the actual byte values, making the two-character binary alphabet visible.
Tried: Run strings whitepages.txt hoping to extract embedded printable text directly.
The strings tool looks for sequences of at least 4 consecutive printable bytes. Although regular space (0x20) is printable ASCII, runs of spaces are repeatedly interrupted by 3-byte em-space sequences (E2 80 83), which are not printable ASCII. As a result strings finds little or nothing, and anything it does print consists only of spaces. The data is encoded as a binary sequence in the whitespace itself, not stored as raw ASCII.
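To make the failure concrete, here is a simplified sketch of the kind of scan strings performs. It is an approximation written for illustration (the real tool has more options and also accepts tabs), and the function name `printable_runs` is hypothetical.

```python
def printable_runs(data: bytes, min_len: int = 4) -> list[bytes]:
    """Return runs of at least min_len bytes in the range 0x20-0x7E."""
    runs = []
    current = bytearray()
    for b in data:
        if 0x20 <= b <= 0x7E:
            current.append(b)
        else:
            # A non-printable byte ends the current run.
            if len(current) >= min_len:
                runs.append(bytes(current))
            current = bytearray()
    if len(current) >= min_len:
        runs.append(bytes(current))
    return runs


if __name__ == "__main__":
    # First 16 bytes of the hex dump shown in Step 1.
    row = bytes.fromhex("202020e2808320e280832020e2808320")
    print(printable_runs(row))
```

On those 16 bytes the longest run of spaces is three bytes, so nothing reaches the 4-byte threshold.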
Learn more
Unicode contains many whitespace characters beyond the regular space (U+0020). Common ones used in steganography challenges include: em space (U+2003, UTF-8: E2 80 83), en space (U+2002), thin space (U+2009), and others. All look identical in most text editors.
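The UTF-8 encodings of these characters can be checked with Python's standard library. This is a small sketch; the helper name `describe` is illustrative.

```python
import unicodedata


def describe(cp: int) -> str:
    """Format a code point with its Unicode name and UTF-8 bytes."""
    ch = chr(cp)
    utf8 = ch.encode("utf-8").hex(" ").upper()
    return f"U+{cp:04X} {unicodedata.name(ch)}: {utf8}"


if __name__ == "__main__":
    # Regular space, en space, em space, thin space.
    for cp in (0x0020, 0x2002, 0x2003, 0x2009):
        print(describe(cp))
```

Regular space is a single byte, while the other three each take three bytes in UTF-8, which is why they stand out in a hex dump.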
Step 2
Map whitespace characters to binary bits
Observation
I noticed the xxd output showed exactly two distinct byte sequences (0x20 and the three-byte UTF-8 sequence E2 80 83 for em-space), which suggested treating them as a two-symbol binary alphabet and grouping every 8 symbols into a byte to recover ASCII text.
Identify the two different whitespace byte sequences. Assign one to bit 0 and the other to bit 1. Group every 8 bits into a byte and convert to ASCII.
python3 << 'EOF'
with open('whitepages.txt', 'rb') as f:
    data = f.read()

# Identify the two whitespace types from xxd output
# e.g., 0x20 = space = 1, 0xe2 0x80 0x83 = em-space = 0
bits = ''
i = 0
while i < len(data):
    if data[i:i+3] == b'\xe2\x80\x83':
        bits += '0'
        i += 3
    elif data[i] == 0x20:
        bits += '1'
        i += 1
    else:
        i += 1

# Convert bits to ASCII
result = ''
for j in range(0, len(bits) - 7, 8):
    byte = int(bits[j:j+8], 2)
    result += chr(byte)
print(result)
EOF
What didn't work first
Tried: Swap the bit assignments so that em-space maps to 1 and regular space maps to 0, then decode the result.
Reversing the bit mapping produces a garbled sequence of non-printable bytes rather than readable ASCII. One reliable check: printable ASCII bytes always have a leading bit of 0, so under the correct mapping the first symbol of every 8-symbol group is the one assigned to 0. Alternatively, try both assignments and keep the one that produces a valid ASCII string.
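Trying both assignments can be automated. The sketch below assumes the file contains only regular spaces and em spaces and that the message is printable ASCII; all function names are hypothetical helpers, not part of any tool.

```python
def to_symbols(data: bytes) -> str:
    """Map em space to 'E' and regular space to 'S'; drop anything else."""
    text = data.decode("utf-8", errors="ignore")
    table = {"\u2003": "E", " ": "S"}
    return "".join(table.get(ch, "") for ch in text)


def decode(symbols: str, one: str) -> bytes:
    """Treat the symbol `one` as bit 1 and the other symbol as bit 0."""
    bits = "".join("1" if s == one else "0" for s in symbols)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits) - 7, 8))


def looks_like_text(raw: bytes) -> bool:
    """True if every byte is printable ASCII or common whitespace."""
    return bool(raw) and all(b in (9, 10, 13) or 0x20 <= b <= 0x7E for b in raw)


def best_decoding(data: bytes):
    """Try both bit assignments and return the first printable result."""
    symbols = to_symbols(data)
    for one in ("S", "E"):
        raw = decode(symbols, one)
        if looks_like_text(raw):
            return raw.decode("ascii")
    return None
```

Because inverting every bit of a printable ASCII byte always sets its top bit, at most one of the two assignments can pass the printable check.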
Tried: Use the SNOW steganography tool (stegsnow) to decode the file, since SNOW is a known whitespace steganography tool.
SNOW encodes data using only trailing tabs and spaces at the end of lines, and it requires a specific file format with newlines. This challenge encodes data using Unicode em-space versus regular space throughout the body of the file, which is a different scheme entirely. Running stegsnow on this file produces no output or an error because the encoding format does not match what SNOW expects.
Learn more
This technique is called whitespace steganography. The SNOW tool and the Whitespace programming language both use similar concepts of encoding information in invisible characters. It is effective against casual inspection but immediately visible under hex analysis.
The key insight is that while the text looks blank, it encodes a binary string where each character of the message is represented by 8 bits of whitespace characters.
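As a worked example of the 8-bits-per-character idea, the letter 'p' is 0x70, or 01110000 in binary. The sketch below encodes an ASCII message using the mapping from Step 2 (space = 1, em space = 0); the function names are illustrative.

```python
def encode(message: str) -> str:
    """Encode ASCII text as whitespace: space = 1, em space = 0."""
    bits = "".join(format(ord(c), "08b") for c in message)
    return "".join(" " if b == "1" else "\u2003" for b in bits)


def visualize(hidden: str) -> str:
    """Show the hidden symbols as a readable string of 0s and 1s."""
    return "".join("1" if ch == " " else "0" for ch in hidden)


if __name__ == "__main__":
    hidden = encode("p")
    print(len(hidden))        # 8 whitespace characters for one letter
    print(visualize(hidden))  # 01110000
```

This assumes every character has a code point below 256, so each one fits in exactly 8 bits.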
Interactive tools
- Stegall: Drop any file and Stegall runs every applicable steg technique in parallel: LSB sweeps, bit planes, spectrograms, polyglot carving, metadata, whitespace decode, and a 6-layer base/ROT/XOR/zlib cascade. Recursively unpacks results and surfaces flag matches.
- Hex Viewer: View text or raw hex bytes as an xxd-style hex dump with byte offset, hex columns, and ASCII sidebar. Highlights printable characters and null bytes.
- Strings Extractor: Pull printable text from any binary, library, or image. ASCII and UTF-16 detection, configurable minimum length, flag-like highlight, no command line needed.
Flag
picoCTF{not_all_spaces_are_created_equal_...}
Per-instance flag. Multiple hash suffixes confirmed across instances (c167040c..., f71be4d2..., etc.). Prefix picoCTF{not_all_spaces_are_created_equal_} is consistent.