Description
Matryoshka dolls are nested - can you extract the flag from this nested image? Download dolls.jpg.
Setup
Download dolls.jpg.
wget <url>/dolls.jpg
Solution
- Step 1: Scan for embedded files with binwalk
Run binwalk on dolls.jpg. It reports additional file signatures embedded inside the JPEG, specifically nested PNG files and ZIP archives. Extract everything with the --dd flag.
binwalk dolls.jpg
binwalk --dd='.*' dolls.jpg
binwalk scans a binary file for known file format magic bytes (signatures). A JPEG file starts with the bytes FF D8 FF; a ZIP file starts with 50 4B 03 04; a PNG with 89 50 4E 47. binwalk identifies these signatures at any offset, revealing files hidden after or within the primary file.
The --dd='.*' flag tells binwalk to extract every recognized signature into a directory named _dolls.jpg.extracted/. Without an extraction flag such as --dd or -e, binwalk only reports offsets and does not extract.
How file format magic bytes work: Almost every binary file format begins with a distinctive byte sequence called a magic number. JPEG (JFIF): FF D8 FF E0. PNG: 89 50 4E 47 0D 0A 1A 0A. ZIP/Office: 50 4B 03 04. PDF: 25 50 44 46 (%PDF). binwalk maintains a database of hundreds of these signatures and checks every byte offset in the file for a match. This is why it finds embedded files even when they sit at arbitrary positions inside another file.
Why appended data does not break a JPEG: JPEG parsers read the image from the SOI (Start of Image) marker FF D8 to the EOI (End of Image) marker FF D9. Most image viewers silently ignore any bytes after the EOI marker. This means a ZIP archive appended after the JPEG EOI creates a valid JPEG that also contains a valid ZIP - a "polyglot" file. This technique is distinct from LSB steganography: the embedded data is not hidden within the image pixels, it is simply appended as a second file.
Alternative extraction with foremost: foremost is another file carving tool that extracts embedded files by magic number. It may handle some edge cases differently from binwalk. If binwalk's extraction produces a corrupt inner file, try foremost -i dolls.jpg -o output/ as an alternative. For ZIP archives specifically, unzip -l dolls.jpg works directly because unzip locates the ZIP central directory from the end of the file, regardless of what precedes it.
- Step 2: Repeat extraction on each nested image
Navigate into the extracted directory and find the nested image file. Run binwalk --dd='.*' on it. Repeat this process four times in total - each image contains another image inside it. After the fourth extraction, you will find flag.txt.
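# binwalk writes each _<name>.extracted/ directory into the current directory, so step into it before the next run.
cd _dolls.jpg.extracted/
binwalk --dd='.*' base_images/2_c.jpg
cd _2_c.jpg.extracted/
binwalk --dd='.*' base_images/3_c.jpg
cd _3_c.jpg.extracted/
binwalk --dd='.*' base_images/4_c.jpg
cd _4_c.jpg.extracted/
cat flag.txt
# Or, if each layer is a password-protected zip whose password sits in the previous layer
# (the archive is renamed first so that rm cannot delete a zip that was just unpacked):
while z=$(ls *.zip 2>/dev/null | head -n 1); [ -n "$z" ]; do mv "$z" layer.tmp && unzip -P "$(cat hint.txt 2>/dev/null)" layer.tmp && rm layer.tmp || break; done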
This is a steganography-by-polyglot technique: a file that is simultaneously a valid JPEG (the outer image) and a ZIP archive (containing the inner image). Most image viewers display only the JPEG portion and ignore the trailing data. The challenge nests four layers deep, mimicking a Matryoshka (Russian nesting doll).
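To make the polyglot layout concrete, here is a minimal shell sketch that finds where the appended ZIP begins and splits the file into its two halves. It assumes the archive starts at the first 50 4B 03 04 signature in the file, which can misfire if those four bytes happen to occur inside the image data. The function names zip_offset and split_polyglot and the .image/.zip output suffixes are illustrative and not part of any tool.

```shell
# zip_offset FILE
# Print the 0-based byte offset of the first ZIP local-file-header
# signature (50 4B 03 04) in FILE, or print nothing if there is none.
zip_offset() {
    # Dump the file as " xx xx xx ..." (three characters per byte) so that a
    # textual match always starts on a byte boundary.
    od -An -v -tx1 "$1" | tr '\n' ' ' | tr -s ' ' |
        awk '{ i = index($0, " 50 4b 03 04"); if (i > 0) print (i - 1) / 3 }'
}

# split_polyglot FILE
# Write the bytes before the ZIP signature to FILE.image and the rest to FILE.zip.
split_polyglot() {
    off=$(zip_offset "$1")
    if [ -z "$off" ]; then
        echo "no ZIP signature found in $1" >&2
        return 1
    fi
    head -c "$off" "$1" > "$1.image"            # the part an image viewer renders
    tail -c +"$((off + 1))" "$1" > "$1.zip"     # tail -c +N counts from 1
}
```

Running split_polyglot dolls.jpg followed by unzip -l dolls.jpg.zip would then list the next layer, provided the assumption about the first signature holds.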
The base_images/ directory is not a binwalk naming convention: it is the path stored inside each ZIP archive, and it appears once the archive is unpacked. binwalk itself names the files it carves by the hex offset at which they were found. The final flag.txt appears as a plain text file inside the innermost archive.
Automating repeated extraction: With four nesting layers, running binwalk manually four times is manageable - but for deeper nesting you could script it. A simple Bash loop such as
while binwalk --dd='.*' *.jpg 2>/dev/null && cd _*.extracted/base_images 2>/dev/null; do :; done
descends through the layers automatically, assuming each extraction unpacks the archive, and stops once no further base_images/ directory appears, leaving flag.txt in the last _*.extracted/ directory. Real forensic tools like Autopsy handle recursive extraction internally, tracking provenance of each extracted file back to its source offset.
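A more defensive variant of the same idea can be sketched with unzip instead of binwalk, since unzip reads an archive appended to an image directly. This is only a sketch under stated assumptions: unzip is installed, every layer stores the next image under a name ending in .jpg, and the innermost archive holds flag.txt. The function name unnest, the layerN directory names, and the depth cap are illustrative.

```shell
# unnest IMAGE WORKDIR [MAX_DEPTH]
# Repeatedly unpack the ZIP appended to IMAGE, following the first *.jpg found
# in each layer, until a flag.txt shows up or nothing more can be unpacked.
unnest() {
    img=$1
    work=$2
    max=${3:-16}    # safety cap so a malformed file cannot loop forever
    depth=0
    while [ "$depth" -lt "$max" ]; do
        layer="$work/layer$depth"
        mkdir -p "$layer"
        # unzip exits with 1 for warnings such as "extra bytes at beginning",
        # which is expected for a polyglot, so only codes above 1 are failures.
        rc=0
        unzip -o -q "$img" -d "$layer" >/dev/null 2>&1 || rc=$?
        [ "$rc" -le 1 ] || return 1
        flag=$(find "$layer" -type f -name flag.txt | head -n 1)
        if [ -n "$flag" ]; then
            cat "$flag"
            return 0
        fi
        # Follow the first nested image; give up if this layer has none.
        img=$(find "$layer" -type f -name '*.jpg' | head -n 1)
        [ -n "$img" ] || return 1
        depth=$((depth + 1))
    done
    return 1
}
```

A call such as unnest dolls.jpg ./layers would keep every intermediate layer on disk under ./layers, which makes it easy to inspect a layer by hand if the loop stops early.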
Flag
picoCTF{...}
binwalk detects file format magic bytes anywhere inside a binary - files can be nested to any depth, and repeating the extraction on each layer uncovers them all.
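For a quick check of a single carved file, the leading magic number can also be inspected without binwalk. The sketch below covers only the four signatures quoted in this write-up and looks only at the start of the file, so it is no substitute for a full signature scan. The function name sniff is illustrative.

```shell
# sniff FILE
# Name the format suggested by the first bytes of FILE.
sniff() {
    # First 8 bytes as lowercase hex without separators, e.g. "ffd8ffe000104a46".
    magic=$(head -c 8 "$1" | od -An -v -tx1 | tr -d ' \n')
    case $magic in
        ffd8ff*)          echo "JPEG" ;;
        89504e470d0a1a0a) echo "PNG" ;;
        504b0304*)        echo "ZIP" ;;
        25504446*)        echo "PDF" ;;
        *)                echo "unknown" ;;
    esac
}
```

This is handy when binwalk names a carved file only by its offset: sniff on that file indicates whether to open it as an image or unpack it as an archive.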