Description
The provided PDF is actually a shell archive containing multiple nested formats (ar, cpio, bzip2, gzip, lzip, etc.). Extract them sequentially until you reach ASCII text, then hex-decode the contents.
Setup
Run the file as a shell archive (`sh Flag.pdf`) to extract `flag`.
Inspect each resulting file with `file` and use the appropriate extractor (ar, cpio, bzip2, gzip, lzip, lz4, lzma, lzop, xz, etc.).
Once the final ASCII file appears, hex-decode it with `xxd -r -p`.
    sh Flag.pdf
    ar x flag
    cpio --file flag.cpio -i
    bzip2 flag -d
    gunzip flag.gz
    lzip flag -d
    unlz4 flag.lz4
    lzma flag.lzma -d
    lzop flag.lzop -d
    unxz flag.xz
    xxd -r -p flag
Solution
- Step 1: Peel each layer
After each extraction, run `file flag` to identify the next compression/container type and use its counterpart to extract again.
Learn more
The `file` command reads a file's magic bytes, usually the first few bytes of the file, to determine its actual type regardless of the filename extension. Most file formats have a signature: PNG files start with `\x89PNG`, ZIP files with `PK`, gzip with `\x1f\x8b`, and so on. This makes `file` reliable even when extensions have been changed or stripped, which is exactly what this challenge does.
Shell archives (shar) are self-extracting scripts that encode file contents as shell commands. Running them with `sh` executes the script and recreates the original files. The format was common in the early Unix era for distributing source code via email or newsgroups. The "PDF" extension here is a red herring: the contents of the file identify it as a shell archive.
The succession of formats in this challenge (ar, cpio, bzip2, gzip, lzip, lz4, lzma, lzop, xz) is a tour through Unix compression and archive history. Each compressor was designed to improve on earlier tools in speed, compression ratio, or freedom from patents. `xz`, `lzma`, and `lzip` are all built on the LZMA algorithm and generally achieve the best compression ratios of the group; `gzip` remains the most widely deployed due to its long history.
- Step 2: Decode the hex
The final file is ASCII hex; `xxd -r -p` converts it back into the readable picoCTF flag.
Learn more
Hexadecimal encoding represents each byte as two hex digits (0-9, a-f), so a 10-byte file becomes a 20-character hex string. It's commonly used to display binary data in a human-readable, copy-pasteable form. The `xxd` tool both creates hex dumps (`xxd file`) and reverses them (`xxd -r -p` reads plain hex and outputs raw bytes).
The `-p` flag tells `xxd` to use "plain" hex format: just the hex digits, with no address offsets or ASCII sidebar. This is the format produced by tools like Python's `bytes.hex()` and is the cleanest form to pipe between tools.
Nested compression challenges like this one teach you to systematically identify and peel container formats rather than panicking when a file isn't what its extension claims. In real forensics work, files are frequently renamed or given wrong extensions to obscure their nature, so checking the magic bytes with `file` is a far more trustworthy test than the filename.
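The hex round trip and the magic-byte idea can both be tried on throwaway data. The following is a minimal sketch, not part of the challenge itself: the file names `sample.txt`, `sample.hex`, and `report.pdf` and the string `picoCTF{example}` are made up for illustration.

```shell
# Work inside a scratch directory so no real files are touched.
workdir=$(mktemp -d)

# Hypothetical stand-in for the real flag text.
printf 'picoCTF{example}' > "$workdir/sample.txt"

# Encode: -p emits plain hex, with no offsets and no ASCII column.
xxd -p "$workdir/sample.txt" > "$workdir/sample.hex"
cat "$workdir/sample.hex"

# Decode: -r reverses a dump, -p says the input is plain hex.
decoded=$(xxd -r -p "$workdir/sample.hex")
printf '%s\n' "$decoded"

# Magic bytes ignore the file name: gzip data begins with 1f 8b
# even when the file is given a misleading .pdf extension.
gzip -c "$workdir/sample.txt" > "$workdir/report.pdf"
magic=$(xxd -p -l 2 "$workdir/report.pdf")
printf '%s\n' "$magic"
```

The last command prints only the first two bytes of the misnamed file, which is the same kind of evidence `file` relies on when it ignores the extension.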
Flag
picoCTF{f1len@m3_m@n1pul@t10n_f0r_0b2cur17y_3c7...}
Automating the extraction loop with `while file flag | grep ...` can save time on nested compression challenges.
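One way such a loop might look is sketched below. The function name `peel` is hypothetical, the `case` patterns are assumed fragments of typical `file -b` output and may need adjusting on your system, and `--to-stdout` assumes GNU cpio. Each decompressor reads from standard input so that missing or wrong file extensions do not matter.

```shell
# peel FILE: identify FILE with `file`, unwrap one layer, and repeat
# until ASCII text remains, then hex-decode it to standard output.
# Sketch only: patterns are assumptions about `file -b` wording.
peel() {
    target=$1
    while :; do
        kind=$(file -b "$target")
        case "$kind" in
            *"ar archive"*)   ar p "$target"                  > "$target.next" ;;
            *"cpio archive"*) cpio -i --to-stdout < "$target" > "$target.next" ;;
            gzip*)            gzip  -dc < "$target"           > "$target.next" ;;
            bzip2*)           bzip2 -dc < "$target"           > "$target.next" ;;
            lzip*)            lzip  -dc < "$target"           > "$target.next" ;;
            LZ4*)             lz4   -dc < "$target"           > "$target.next" ;;
            LZMA*)            xz --format=lzma -dc < "$target" > "$target.next" ;;
            lzop*)            lzop  -dc < "$target"           > "$target.next" ;;
            XZ*)              xz    -dc < "$target"           > "$target.next" ;;
            "ASCII text"*)    break ;;
            *)                echo "unhandled type: $kind" >&2; return 1 ;;
        esac || return 1
        # Replace the current layer with the freshly extracted one.
        mv "$target.next" "$target"
    done
    xxd -r -p "$target"
}
```

After `sh Flag.pdf` has recreated `flag`, running `peel flag` would walk the layers in whatever order they appear. The cpio and ar patterns are listed before the ASCII check because `file` can describe a cpio archive as "ASCII cpio archive".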