Description
The provided PDF is actually a shell archive containing multiple nested formats (ar, cpio, bzip2, gzip, lzip, etc.). Extract them sequentially until you reach ASCII text, then hex-decode the contents.
Setup
Run the file as a shell archive (sh Flag.pdf) to extract flag.
Inspect each resulting file with file and use the appropriate extractor (ar, cpio, bzip2, gzip, lzip, lz4, lzma, lzop, xz, etc.).
Once the final ASCII file appears, hex-decode it with xxd -r -p.
    sh Flag.pdf
    ar x flag
    cpio --file flag.cpio -i
    bzip2 flag -d
    gunzip flag.gz
    lzip flag -d
    unlz4 flag.lz4
    lzma flag.lzma -d
    lzop flag.lzop -d
    unxz flag.xz
    xxd -r -p flag

Solution
Step 1: Peel each layer
After each extraction, run file flag to identify the next compression/container type and use the matching extractor.

    # Peel one layer at a time. After EACH extraction, re-run 'file flag'
    # to see the next layer, then use the matching extractor below.
    # The trick: rename 'flag' to the extension the tool expects; the
    # decompressor strips it and writes 'flag' back, ready for the next pass.
    # ar and cpio are renamed first too, so an inner member that is also
    # called 'flag' cannot clobber the archive while it is being read.
    file flag                                           # identify the current layer, then run ONE of:
    mv flag flag.ar && ar x flag.ar && rm -f flag.ar    # "ar archive" -> extracts the inner member
    mv flag flag.cpio && cpio -idu < flag.cpio          # "cpio archive" -> extracts the inner file
    mv flag flag.bz2 && bunzip2 flag.bz2                # "bzip2 compressed"
    mv flag flag.gz && gunzip flag.gz                   # "gzip compressed"
    mv flag flag.xz && unxz flag.xz                     # "XZ compressed"
    mv flag flag.lzma && unlzma flag.lzma               # "LZMA compressed"
    mv flag flag.lz && lzip -d flag.lz                  # "lzip compressed"
    mv flag flag.lz4 && unlz4 -f flag.lz4 flag          # "LZ4 compressed"
    mv flag flag.lzo && lzop -d flag.lzo                # "lzop compressed"
    # Repeat until 'file flag' reports ASCII text, then hex-decode (next step).
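The per-layer choice above is essentially a lookup from file's description to a rename plus a decompressor. A minimal Python sketch of that lookup follows. It only builds the command string and never runs it. The names EXTRACTORS and next_command are hypothetical, and the description substrings are assumptions about file's wording that may vary between versions. In particular, "current ar archive" is assumed so that tar archives do not match.

```python
from typing import Optional

# Substring of file(1) output (lowercased) -> (extension, command template).
# The substrings are assumptions about file's wording, not guaranteed output.
EXTRACTORS = {
    "current ar archive": ("ar", "ar x {f} && rm -f {f}"),
    "cpio archive": ("cpio", "cpio -idu < {f}"),
    "bzip2 compressed": ("bz2", "bunzip2 {f}"),
    "gzip compressed": ("gz", "gunzip {f}"),
    "xz compressed": ("xz", "unxz {f}"),
    "lzma compressed": ("lzma", "unlzma {f}"),
    "lzip compressed": ("lz", "lzip -d {f}"),
    "lz4 compressed": ("lz4", "unlz4 -f {f} {out}"),
    "lzop compressed": ("lzo", "lzop -d {f}"),
}


def next_command(file_output: str, name: str = "flag") -> Optional[str]:
    """Return the shell command for the current layer, or None if no layer matches."""
    desc = file_output.lower()
    for key, (ext, template) in EXTRACTORS.items():
        if key in desc:
            renamed = f"{name}.{ext}"
            # Rename first so the tool writes its output back to `name`.
            return f"mv {name} {renamed} && " + template.format(f=renamed, out=name)
    return None  # e.g. "ASCII text": nothing left to peel
```

A None result is the signal to stop peeling and move on to hex decoding.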
The file command reads magic bytes - the first few bytes of the file - to identify the format regardless of extension. file works because most containers have a distinctive signature in their header. Quick reference:

Magic bytes (first 2-6 bytes):

    gzip       1f 8b 08
    bzip2      42 5a 68            ("BZh")
    xz         fd 37 7a 58 5a 00   ("\xfd7zXZ\0")
    lzma       5d 00 00            (no good universal magic)
    lzip       4c 5a 49 50         ("LZIP")
    lz4        04 22 4d 18
    lzop       89 4c 5a 4f         ("\x89LZO")
    cpio       30 37 30 37 30      (ASCII "07070")
    ar         21 3c 61 72 63 68   ("!<arch", the start of "!<arch>\n")
    zip / jar  50 4b 03 04         ("PK\x03\x04")
    PNG        89 50 4e 47         ("\x89PNG")
    ELF        7f 45 4c 46         ("\x7fELF")

Why ship the chain inside a shell archive (shar) wearing a PDF extension? Layered misdirection. PDF makes you expect a binary you'd open in a viewer, not a script you'd execute. shar is an old Unix format: a self-extracting shell script that recreates files via inline commands. The format is harmless in itself; the trick is that file Flag.pdf identifies it as a shell script regardless of extension.

Entropy heuristic. Compressed/encrypted data has high entropy (~7.99/8.0 bits per byte); plaintext is around 4-5. Either of these gives you a quick check:

    ent flag
    python3 -c "import collections, math; b=open('flag','rb').read(); print(-sum((c/len(b))*math.log2(c/len(b)) for c in collections.Counter(b).values()))"

If entropy stays high, you're still wrapped; if it drops, you've hit text or hex.

More CLI recipes for this kind of file archaeology in Linux CLI for CTF.
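The magic-byte lookup and the entropy check above can be combined in a few lines of Python. This is a rough sketch, not a replacement for file. The names MAGIC, sniff, and entropy are hypothetical, the signature list covers only the formats in this challenge's chain, and the lzma entry is a weak heuristic because that format has no reliable magic.

```python
import math
from collections import Counter
from typing import Optional

# (prefix, name) pairs, checked in order. lzma goes last: it is only a heuristic.
MAGIC = [
    (b"\x1f\x8b", "gzip"),
    (b"BZh", "bzip2"),
    (b"\xfd7zXZ\x00", "xz"),
    (b"LZIP", "lzip"),
    (b"\x04\x22\x4d\x18", "lz4"),
    (b"\x89LZO", "lzop"),
    (b"!<arch>\n", "ar"),
    (b"07070", "cpio"),
    (b"\x5d\x00\x00", "lzma"),
]


def sniff(data: bytes) -> Optional[str]:
    """Return the format whose magic prefix matches, or None if none does."""
    for prefix, name in MAGIC:
        if data.startswith(prefix):
            return name
    return None


def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum((c / n) * math.log2(c / n) for c in Counter(data).values())
```

Calling sniff(open("flag", "rb").read(16)) after each extraction gives a second opinion alongside file. Hex text uses only 16 distinct symbols, so its entropy cannot exceed 4 bits per byte.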
Step 2: Decode the hex
The final file is ASCII hex; xxd -r -p converts it back to readable bytes. Verify the output looks like a flag before submitting.

    head -c 64 flag
    xxd -r -p flag
    xxd -r -p flag | head -c 200
Hex round-trip example. If cat flag shows 706963 6f4354 467b66 316c65..., then xxd -r -p flag emits the bytes p i c o C T F { f 1 l e .... -r reverses (read hex, write bytes), -p selects "plain" format (just the hex digits, no offsets, no ASCII sidebar). That's the same format Python's bytes.hex() produces, so it round-trips cleanly.

Sanity-check the result: it should start with picoCTF{. If it's still binary garbage, you missed an extraction layer. Run file on the "hex" input first; if file says it's still a compressed format, you stopped peeling too early.

The general lesson: in forensics, extension is a hint and magic bytes are the truth. file and the magic-byte table above let you correctly identify any container regardless of how an attacker labelled it. Hex-dump fundamentals in Hex Dumps for CTF.
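The decode-and-sanity-check step can also be sketched in Python. The helper name decode_hex_flag is hypothetical. It strips whitespace itself so it behaves the same way on older Python versions, then applies the picoCTF{ prefix check described above.

```python
def decode_hex_flag(text: str, prefix: bytes = b"picoCTF{") -> bytes:
    """Decode plain hex (whitespace allowed) and check the flag prefix."""
    # Drop spaces and newlines, much as `xxd -r -p` ignores them.
    digits = "".join(text.split())
    raw = bytes.fromhex(digits)  # raises ValueError on non-hex input
    if not raw.startswith(prefix):
        raise ValueError("decoded bytes do not start with the flag prefix; "
                         "a layer may still be unpeeled")
    return raw


# The fragment from the example above decodes to the start of the flag.
print(decode_hex_flag("706963 6f4354 467b66 316c65"))
```

A ValueError at either stage points to the same conclusion as the text: the input is not plain hex yet, so go back and run file on it.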
Flag
picoCTF{f1len@m3_m@n1pul@t10n_f0r_0b2cur17y_3c7...}
Automating the extraction loop with `while file flag | grep ...` can save time on nested compression challenges.
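One way to sketch such a loop is in Python, working in memory and covering only the layers the standard library can decode (gzip, bzip2, xz, lzma). ar, cpio, lzip, lz4, and lzop have no stdlib decoder, so this sketch stops when it meets them and leaves those layers to the command-line tools. The helper name peel is hypothetical, and the lzma prefix test is a heuristic.

```python
import bz2
import gzip
import lzma
from typing import List, Tuple


def peel(data: bytes, max_layers: int = 50) -> Tuple[bytes, List[str]]:
    """Strip stdlib-decodable compression layers; return (data, layer names)."""
    layers: List[str] = []
    for _ in range(max_layers):  # guard against a pathological endless chain
        if data.startswith(b"\x1f\x8b"):
            data = gzip.decompress(data)
            layers.append("gzip")
        elif data.startswith(b"BZh"):
            data = bz2.decompress(data)
            layers.append("bzip2")
        elif data.startswith(b"\xfd7zXZ\x00"):
            data = lzma.decompress(data, format=lzma.FORMAT_XZ)
            layers.append("xz")
        elif data.startswith(b"\x5d\x00\x00"):  # heuristic: legacy .lzma header
            data = lzma.decompress(data, format=lzma.FORMAT_ALONE)
            layers.append("lzma")
        else:
            break  # unknown or unsupported layer (ar, cpio, lzip, lz4, lzop, text)
    return data, layers
```

Whatever peel returns is either the hex text or the next layer that needs an external tool, and the layer list records what was removed along the way.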