Description
A PDF file hides the flag in its metadata. Find it.
Setup
Download the PDF file from the challenge page.
Install exiftool:
    sudo apt install libimage-exiftool-perl
Solution
Step 1
Read all PDF metadata with exiftool
Observation
I noticed the challenge description stated the flag was hidden in the PDF's metadata rather than its visible content, which suggested using a metadata extraction tool like exiftool to surface fields such as Author, Title, and Keywords that a standard PDF viewer never renders.
Run exiftool on the PDF to display every metadata field. Look at the Author, Title, Subject, Keywords, and Creator fields for anything unusual.
    exiftool riddle.pdf
What didn't work first
Tried: Open the PDF in a viewer and search the visible text for the flag.
The flag is stored in metadata fields that are never rendered on any page. A PDF reader displays only the page content stream, not the Document Information Dictionary or XMP stream. A metadata tool such as exiftool is needed to surface those hidden fields.
Tried: Run strings on the PDF instead of exiftool to look for the flag.
strings prints printable sequences from the raw binary and can show XMP metadata if it happens to be in plain text, but it dumps thousands of unformatted lines and may truncate or misalign multi-line fields. exiftool parses the PDF structure properly and labels every field, making it far easier to spot the anomalous value without grepping through noise.
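Spotting the anomalous value can also be partly automated. The sketch below is a rough heuristic, not part of the original walkthrough: it keeps only those lines of exiftool's labeled output whose value is a single unbroken run of base64 characters. The function name flag_candidates and the 20-character threshold are assumptions chosen for illustration.

```shell
# Hypothetical helper: read exiftool's "Label : value" output on stdin and
# print only the lines whose value is one unbroken run of 20 or more base64
# characters, optionally followed by '=' padding.
# The threshold of 20 is an arbitrary assumption; tune it for other files.
flag_candidates() {
    grep -E ':[[:space:]]+[A-Za-z0-9+/]{20,}={0,2}$'
}

# Intended use (requires exiftool and the challenge file):
#   exiftool riddle.pdf | flag_candidates
```

Ordinary values such as titles, dates, and file names contain spaces, colons, or dots, so they fall through the filter, while a long encoded blob does not.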
Learn more
PDF metadata is stored in two locations within a PDF file: the older Document Information Dictionary (fields like Author, Title, Subject, Keywords, Creator, Producer, CreationDate) and the newer XMP (Extensible Metadata Platform) stream, which stores the same and additional fields as embedded XML. exiftool reads both and displays them together.
These fields are set by the application that created the PDF (Word, Adobe Acrobat, LibreOffice, LaTeX, etc.) and can be edited freely by the document owner. They are not visible in the rendered document - a reader would never see them while reading the PDF's pages. This makes them an effective hiding spot for data that shouldn't appear in the visible content.
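The two storage locations described above can be checked for roughly without a PDF parser. This is a minimal sketch under stated assumptions: it only looks for the /Info key and an uncompressed XMP packet marker in the raw bytes, so it can miss metadata in compressed objects and may match unrelated occurrences of /Info. The function name pdf_metadata_locations is hypothetical.

```shell
# Hypothetical helper: report whether a PDF appears to reference a Document
# Information Dictionary (/Info) and whether a plain-text XMP packet
# (<x:xmpmeta ...>) is present. grep -a treats the binary file as text.
pdf_metadata_locations() {
    if grep -a -q '/Info' "$1"; then
        echo "Info dictionary: referenced"
    else
        echo "Info dictionary: not found"
    fi
    if grep -a -q '<x:xmpmeta' "$1"; then
        echo "XMP packet: found"
    else
        echo "XMP packet: not found in plain text"
    fi
}

# Intended use:
#   pdf_metadata_locations riddle.pdf
# For a properly parsed view, exiftool can label each value with its source
# group and keep duplicate tags:
#   exiftool -a -G1 riddle.pdf
```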
In real-world digital forensics, PDF metadata is valuable evidence: author names, organization names, creation timestamps, and editing software versions can identify who created a document and when. Leaked documents have been traced back to their source by author metadata. Tools like pdfinfo (from poppler-utils) and pdf-parser.py from Didier Stevens provide additional views into PDF structure.
Step 2
Decode the base64 value
Observation
I noticed one of the metadata fields exiftool returned contained an alphanumeric blob that did not resemble a normal author name or title, which suggested it was base64-encoded data that needed to be decoded with the base64 -d command to reveal the flag.
Find a metadata field containing a base64-encoded string. Decode it with the base64 command to reveal the flag.
    echo '<base64_value_from_metadata>' | base64 -d
What didn't work first
Tried: Pipe the entire exiftool output directly into base64 -d without extracting the specific field value first.
base64 -d will fail or produce garbage because the exiftool output includes field labels, colons, whitespace, and newlines that are not part of the encoded payload. Only the alphanumeric blob from the specific metadata field is valid base64 - it must be copied or extracted cleanly with something like exiftool -FieldName -b riddle.pdf before decoding.
Tried: Try base64 --decode with the -i flag to ignore non-base64 characters, expecting it to silently strip the surrounding exiftool formatting.
The -i (--ignore-garbage) flag makes base64 skip characters outside the base64 alphabet, so decoding finishes without an error. The letters and digits in the field labels are themselves valid base64 characters, however, so they are decoded along with the payload and shift its alignment. The result is meaningless binary that will not contain the flag. The correct approach is to isolate just the encoded value before decoding.
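Isolating the value before decoding can be sketched as a small filter. This is one possible approach, assuming exiftool's default "Label : value" output layout; the function name field_value is hypothetical, and the Author label is used because the flag note at the end of this writeup places the encoded value there.

```shell
# Hypothetical helper: read exiftool's default output on stdin and print only
# the value of the field whose label is given as $1, dropping the label,
# padding spaces, and the ": " separator.
field_value() {
    sed -n "s/^$1[[:space:]]*: //p"
}

# Intended use (requires exiftool and the challenge file):
#   exiftool riddle.pdf | field_value Author | base64 -d
# exiftool can also print a bare value itself, which avoids the filter:
#   exiftool -s3 -Author riddle.pdf | base64 -d
```

Either way, base64 -d receives only the encoded payload, so no label text is mixed into the decoded stream.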
Learn more
Metadata fields in PDFs (and other file formats) can hold arbitrary text strings - their content is never validated by the format specification. A base64-encoded value stored in a metadata field provides a layer of obfuscation: the raw field value looks like random alphanumeric characters to a casual viewer, but decodes to meaningful data instantly with standard tools.
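The obfuscation is easy to reproduce. The round trip below is a minimal sketch that uses a made-up placeholder string, not the real flag. It also shows a practical side effect: any text beginning with picoCTF{ encodes to base64 beginning with cGljb0NURn, which gives a recognizable prefix to look for in metadata dumps.

```shell
# Placeholder value for illustration only; this is not the challenge flag.
secret='picoCTF{example_only}'

# Encode, then decode again to show the round trip is lossless.
encoded=$(printf '%s' "$secret" | base64)
decoded=$(printf '%s' "$encoded" | base64 -d)

printf 'encoded: %s\n' "$encoded"
printf 'decoded: %s\n' "$decoded"

# The first ten base64 characters depend only on the "picoCTF{" prefix.
case "$encoded" in
    cGljb0NURn*) echo "encoded value starts with the picoCTF{ prefix" ;;
esac
```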
echo '...' | base64 -d pipes the string to base64's decode mode. The output goes to stdout, which can be further piped or redirected. If the decoded output is binary (an image, a compressed file), redirect it to a file: echo '...' | base64 -d > output.bin, then run file output.bin to determine what it is.
This two-layer approach (metadata hiding + base64 encoding) combines two independent techniques: hiding data where most tools won't look, and obfuscating it so it doesn't look suspicious at first glance. In CTF forensics, always run exiftool first (metadata), then strings (printable sequences), then binwalk (embedded files) on any given file before attempting more complex analysis.
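The triage order above can be wrapped in a small helper. This is a sketch under assumptions rather than a standard tool: the function name triage is hypothetical, each tool is run only if it is installed, and output is cut to the first 20 lines per tool to keep the overview readable.

```shell
# Hypothetical helper: run exiftool, strings, and binwalk (in that order) on
# the file given as $1, skipping any tool that is not installed.
triage() {
    for tool in exiftool strings binwalk; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "== $tool =="
            # Show only the first 20 lines; rerun the tool directly for more.
            "$tool" "$1" | head -n 20
        else
            echo "== $tool: not installed, skipping =="
        fi
    done
}

# Intended use:
#   triage riddle.pdf
```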
Flag
picoCTF{puzzl3d_m3tadata_f0und!_...}
Flag prefix verified: base64 in PDF Author field decodes to picoCTF{puzzl3d_m3tadata_f0und!_...}. Hash suffix varies per instance (observed: c2073669, 0e2de5a1). Confirmed by manual base64 decode.