Transformation picoCTF 2021 Solution

Published: April 2, 2026

Description

I wonder what this really is... The file enc displays as seemingly random Unicode characters.

Download the enc file.

bash
wget <url>/enc

Solution

  1. Step 1
    Identify the encoding scheme
    Observation
    I noticed the enc file displayed as Unicode characters with code points well above 127 rather than readable ASCII text, which suggested the original bytes had been packed together into wider integer values rather than stored directly.
    Open enc in a text editor or print it to confirm this. Each character combines two consecutive ASCII bytes a and b into one Unicode code point: char = (ord(a) << 8) + ord(b). The first byte becomes the high 8 bits and the second byte becomes the low 8 bits.
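To make the packing concrete, here is a minimal sketch of an encoder that follows the formula above. The helper name pack_pairs and the sample string are illustrative assumptions, not part of the challenge files, and the sketch assumes even-length ASCII input.

```python
def pack_pairs(text: str) -> str:
    """Pack consecutive pairs of ASCII characters into single code points.

    Hypothetical helper: assumes text is pure ASCII with an even length.
    """
    return ''.join(
        chr((ord(text[i]) << 8) + ord(text[i + 1]))
        for i in range(0, len(text), 2)
    )


sample = pack_pairs('pico')           # illustrative input only
print([hex(ord(c)) for c in sample])  # ['0x7069', '0x636f']
```

Four input characters become two output characters, which is why the file looks like a short run of CJK-range glyphs.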
    Learn more

    Standard ASCII characters have code points from 0 to 127, fitting in 7 bits. A Unicode code point, however, can represent values up to 1,114,111. The encoding used here treats two ASCII bytes as the high byte and low byte of a 16-bit integer, producing a single Unicode character - effectively packing two characters into one.

    This is not a standard encoding like UTF-16; it is a custom scheme. The main clue is that every code point in enc falls in the range 0x2020 to 0x7e7e, where both the high byte and the low byte are printable ASCII (0x20 to 0x7e).
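One way to test the range observation is to check that both halves of every code point are printable ASCII (0x20 to 0x7e). The sketch below is a hedged illustration; looks_like_packed_ascii is a hypothetical helper name and the sample strings are made up.

```python
def looks_like_packed_ascii(text: str) -> bool:
    """Return True if every code point splits into two printable ASCII bytes."""
    def printable(byte: int) -> bool:
        return 0x20 <= byte <= 0x7e

    return all(
        ord(c) <= 0xffff and printable(ord(c) >> 8) and printable(ord(c) & 0xff)
        for c in text
    )


print(looks_like_packed_ascii('\u7069\u636f'))  # True: 0x70 0x69, 0x63 0x6f
print(looks_like_packed_ascii('hello'))         # False: high byte is 0x00
```

A True result does not prove the scheme, but it is a cheap signal that byte-pair packing is worth trying before anything more elaborate.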

  2. Step 2
    Reverse the encoding
    Observation
    I noticed that once the scheme was identified as pairing two ASCII bytes into a single Unicode code point via bit-shifting, applying the inverse operations (>> 8 and & 0xff) to each character would reconstruct the original plaintext and reveal the flag.
    For each Unicode character c in enc, extract the high byte with (ord(c) >> 8) and the low byte with (ord(c) & 0xff). Convert each byte back to a character and concatenate - this reconstructs the original ASCII string containing the flag.
    python
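    python3 -c "
    # Read enc as UTF-8 text so each packed pair arrives as one character
    enc = open('enc', encoding='utf-8').read().strip()
    # High byte (>> 8) is the first ASCII character, low byte (& 0xff) the second
    print(''.join(chr(ord(c) >> 8) + chr(ord(c) & 0xff) for c in enc))
    "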

    Expected output

    picoCTF{16_bits_inst34d_of_8_...}
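The same decoding can be written as a small reusable function and checked on a known input. This is a sketch only: unpack_pairs is a hypothetical name and the two-character sample is illustrative, not the contents of enc. The second print shows a cross-check that relies on big-endian UTF-16 storing each of these code points as the same two bytes, high byte first.

```python
def unpack_pairs(packed: str) -> str:
    """Split each code point into its high and low byte and map both to chars."""
    return ''.join(chr(ord(c) >> 8) + chr(ord(c) & 0xff) for c in packed)


packed = '\u7069\u636f'      # 'pi' and 'co', each packed into one code point
print(unpack_pairs(packed))  # pico

# Cross-check via the standard utf-16-be codec (valid here because every
# code point is below 0x10000 and both bytes are ASCII).
print(packed.encode('utf-16-be').decode('ascii'))  # pico
```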
    What didn't work first

    Tried: Opening enc in a hex editor and running strings or xxd to extract ASCII text directly

    The file is valid UTF-8, so strings and xxd show the raw encoded Unicode bytes rather than the packed ASCII pairs. No recognizable flag characters appear because each original character pair has been merged into a single high-code-point glyph - you need the Python bit-extraction step to reconstruct the original ASCII.
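A short check shows why byte-level tools find nothing readable. The glyph below is an illustrative example of one packed pair, not a quote from enc: its UTF-8 encoding is three bytes, none of which equals either original ASCII byte.

```python
glyph = '\u7069'                   # 'p' (0x70) and 'i' (0x69) packed together
raw = glyph.encode('utf-8')

print(raw.hex(' '))                # e7 81 a9
print(b'p' in raw or b'i' in raw)  # False: neither original byte survives
```

Every byte of a multi-byte UTF-8 sequence has its top bit set, so strings, which looks for runs of printable ASCII, skips over them.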

    Tried: Decoding the file as base64 or with Python's codecs.decode(data, 'utf-8') expecting a standard encoding

    UTF-8 decoding succeeds but gives back the same Unicode string you already have - it does not unpack the byte pairs. This is not base64 or any standard codec; it is a custom scheme where each glyph encodes two ASCII bytes via bit-packing. Applying the wrong decoding layer produces meaningless Unicode text rather than the flag.

    Learn more

    The shift-right operator >> 8 drops the lower 8 bits, leaving the original high byte. The bitwise AND & 0xff masks to only the lower 8 bits, recovering the low byte. These are standard bit manipulation operations for unpacking multi-byte values - used extensively in binary format parsing and network protocol decoding.
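As a worked example of the two operations, the sketch below unpacks a single illustrative value, 0x7069, and prints the bit patterns so the split is visible. The value is chosen for demonstration and is an assumption, not read from the challenge file.

```python
code_point = 0x7069             # illustrative value: 'p' and 'i' packed

high = code_point >> 8          # 0x70: drops the low 8 bits
low = code_point & 0xff         # 0x69: keeps only the low 8 bits

print(f'{code_point:016b}')     # 0111000001101001
print(f'{high:08b} {low:08b}')  # 01110000 01101001
print(chr(high) + chr(low))     # pi
```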

    The challenge title "Transformation" refers to the encoding transformation applied to the plaintext. Reversing a transformation requires understanding the original operation - here a straightforward bit-packing scheme with no randomness or key.

Flag

picoCTF{16_bits_inst34d_of_8_...}

The encoding pairs consecutive ASCII bytes into a single Unicode code point - reverse by extracting the high byte (>>8) and low byte (&0xff) separately.

Key takeaway

Custom bit-packing schemes disguise data by repacking bytes into wider integer types, but the transformation is always reversible given knowledge of the scheme. Recognizing that output code points cluster within a predictable numeric range (here 0x2020 to 0x7e7e, where both bytes are printable ASCII) is the key analytical step; it reveals the underlying structure without needing to reverse-engineer source code. The same analytical approach applies to proprietary binary protocols, obfuscated shellcode, and any encoding where the character-set distribution is anomalous compared to natural text.
