Description
I wonder what this really is... The file enc displays as seemingly random Unicode characters.
Setup
Download the enc file.
wget <url>/enc
Solution
Step 1
Identify the encoding scheme
Observation
I noticed the enc file displayed as Unicode characters with code points well above 127 rather than readable ASCII text, which suggested the original bytes had been packed together into wider integer values rather than stored directly.
Open enc in a text editor or print it: it contains Unicode characters with code points well above 127. This is a clue that multiple ASCII bytes were packed together. The encoding is char = chr((ord(a) << 8) + ord(b)), combining two consecutive ASCII bytes into one Unicode code point.
Learn more
Standard ASCII characters have code points from 0 to 127, fitting in 7 bits. A Unicode code point, however, can represent values up to 1,114,111. The encoding used here treats two ASCII bytes as the high byte and low byte of a 16-bit integer, producing a single Unicode character - effectively compressing two characters into one.
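A minimal sketch of the packing step described above may make it concrete. The function name pack_pairs and the sample string are illustrative assumptions, not part of the challenge files, and the sketch assumes the input has an even number of characters.

```python
def pack_pairs(text: str) -> str:
    """Pack each pair of ASCII characters into one 16-bit code point.

    Hypothetical helper for illustration; assumes len(text) is even.
    """
    if len(text) % 2 != 0:
        raise ValueError("this sketch assumes an even number of characters")
    packed = []
    for i in range(0, len(text), 2):
        high, low = ord(text[i]), ord(text[i + 1])
        # The first character becomes the high byte, the second the low byte.
        packed.append(chr((high << 8) + low))
    return "".join(packed)


sample = pack_pairs("pico")
print([hex(ord(ch)) for ch in sample])  # ['0x7069', '0x636f']
```

Here 'p' (0x70) and 'i' (0x69) combine into the single code point 0x7069, which is why four input characters come out as two glyphs.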
This is not a standard encoding like UTF-16; it is a custom scheme. The main clue comes from looking at the code points of the characters in enc: in each one, both the high byte and the low byte fall in the printable ASCII range 0x20 to 0x7e, so every code point lies between 0x2020 and 0x7e7e.
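That clue can be checked programmatically. The sketch below uses a hypothetical helper, looks_like_packed_ascii, which splits each code point into two bytes and tests whether both are printable ASCII; the sample strings are made up for illustration.

```python
def looks_like_packed_ascii(text: str) -> bool:
    """Return True if every character splits into two printable ASCII bytes.

    Hypothetical helper for illustration only.
    """
    for ch in text:
        high, low = ord(ch) >> 8, ord(ch) & 0xFF
        if not (0x20 <= high <= 0x7E and 0x20 <= low <= 0x7E):
            return False
    return True


print(looks_like_packed_ascii("\u7069\u636f"))  # True: packed "pi" and "co"
print(looks_like_packed_ascii("héllo"))         # False: the high byte of 'h' is 0
```

A result of True on the whole file is only a hint that the scheme applies, not proof, but it is a cheap test to run before attempting a decode.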
Step 2
Reverse the encoding
Observation
I noticed that once the scheme was identified as pairing two ASCII bytes into a single Unicode code point via bit-shifting, applying the inverse operations (>> 8 and & 0xff) to each character would reconstruct the original plaintext and reveal the flag.
For each Unicode character c in enc, extract the high byte with (ord(c) >> 8) and the low byte with (ord(c) & 0xff). Convert each byte back to a character and concatenate; this reconstructs the original ASCII string containing the flag.
python3 -c "
# Read the file as UTF-8 text so each glyph arrives as one code point
enc = open('enc', encoding='utf-8').read().strip()
# Split every code point into its high and low byte and join the characters
print(''.join([chr(ord(c) >> 8) + chr(ord(c) & 0xff) for c in enc]))
"
Expected output
picoCTF{16_bits_inst34d_of_8_...}
What didn't work first
Tried: Opening enc in a hex editor and running strings or xxd to extract ASCII text directly
The file is valid UTF-8, so strings and xxd show the raw encoded Unicode bytes rather than the packed ASCII pairs. No recognizable flag characters appear because each original character pair has been merged into a single high-code-point glyph - you need the Python bit-extraction step to reconstruct the original ASCII.
Tried: Decoding the file as base64 or with Python's codecs.decode(data, 'utf-8') expecting a standard encoding
UTF-8 decoding succeeds but gives back the same Unicode string you already have - it does not unpack the byte pairs. This is not base64 or any standard codec; it is a custom scheme where each glyph encodes two ASCII bytes via bit-packing. Applying the wrong decoding layer produces meaningless Unicode text rather than the flag.
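Both dead ends can be reproduced on a single glyph. The sketch below uses the code point 0x7069 (the packed form of "pi") as an assumed sample. It shows that the on-disk UTF-8 bytes contain no ASCII at all, and that decoding them as UTF-8 simply returns the same glyph.

```python
glyph = chr(0x7069)          # packed form of "pi", used here as a sample
raw = glyph.encode("utf-8")  # the bytes that strings or xxd would see on disk

print(raw)                           # b'\xe7\x81\xa9'
print(all(b >= 0x80 for b in raw))   # True: no byte is in the ASCII range
print(raw.decode("utf-8") == glyph)  # True: UTF-8 decoding returns the same glyph
```

Every code point from 0x0800 through 0xffff is stored as three UTF-8 bytes, each with its top bit set, so byte-level tools never see the original 0x70 and 0x69.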
Learn more
The shift-right operator >> 8 drops the lower 8 bits, leaving the original high byte. The bitwise AND & 0xff masks to only the lower 8 bits, recovering the low byte. These are standard bit manipulation operations for unpacking multi-byte values, used extensively in binary format parsing and network protocol decoding.
The challenge title "Transformation" refers to the encoding transformation applied to the plaintext. Reversing a transformation requires understanding the original operation, here a straightforward bit-packing scheme with no randomness or key.
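As a worked example of the shift and mask operations, the sketch below unpacks one code point by hand. The value 0x7069 is an assumed sample, not taken from the enc file.

```python
code_point = 0x7069       # sample value: 'p' (0x70) packed with 'i' (0x69)

high = code_point >> 8    # shift right by 8 bits, leaving 0x70
low = code_point & 0xFF   # mask to the lower 8 bits, leaving 0x69

print(hex(high), hex(low))   # 0x70 0x69
print(chr(high) + chr(low))  # pi

# The same split can be written as integer division and remainder by 256.
print(divmod(code_point, 256) == (high, low))  # True
```

The divmod form is arithmetically equivalent for non-negative integers; the shift-and-mask form is the more common idiom when parsing binary data.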
Flag
picoCTF{16_bits_inst34d_of_8_...}
The encoding pairs consecutive ASCII bytes into a single Unicode code point - reverse by extracting the high byte (>>8) and low byte (&0xff) separately.