Description
A PDF file hides the flag in its metadata. Find it.
Setup
Download the PDF file from the challenge page.
Install exiftool: sudo apt install libimage-exiftool-perl
Solution
- Step 1: Read all PDF metadata with exiftool
Run exiftool on the PDF to display every metadata field. Look at the Author, Title, Subject, Keywords, and Creator fields for anything unusual.
exiftool riddle.pdf
PDF metadata is stored in two locations within a PDF file: the older Document Information Dictionary (fields like Author, Title, Subject, Keywords, Creator, Producer, CreationDate) and the newer XMP (Extensible Metadata Platform) stream, which stores the same and additional fields as embedded XML. exiftool reads both and displays them together.
These fields are set by the application that created the PDF (Word, Adobe Acrobat, LibreOffice, LaTeX, etc.) and can be edited freely by the document owner. They are not visible in the rendered document - a reader would never see them while reading the PDF's pages. This makes them an effective hiding spot for data that shouldn't appear in the visible content.
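The older of the two locations, the Document Information Dictionary, can sometimes be read straight out of the file bytes. The sketch below is a rough illustration, not how exiftool works: info_dict_fields is a hypothetical helper name, and it assumes the dictionary is stored uncompressed with plain literal strings (no hex strings, UTF-16 text, escaped parentheses, or object streams).

```shell
# Hypothetical helper: print literal-string entries such as "/Author (value)"
# found in the raw bytes of the PDF given as $1.
# Assumption: the Info dictionary is uncompressed and uses simple (...) strings.
info_dict_fields() {
  # -a treats the binary PDF as text; -o prints only the matching part.
  grep -a -o -E '/(Title|Author|Subject|Keywords|Creator|Producer) ?\([^)]*\)' "$1"
}

# Usage, assuming riddle.pdf is in the current directory:
#   info_dict_fields riddle.pdf
```

If this prints nothing, the metadata is probably compressed or stored only as XMP, and exiftool remains the more reliable route.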
In real-world digital forensics, PDF metadata is valuable evidence: author names, organization names, creation timestamps, and editing software versions can identify who created a document and when. Leaked documents have been traced back to their source by author metadata. Tools like pdfinfo (from poppler-utils) and pdf-parser.py by Didier Stevens provide additional views into PDF structure.
- Step 2: Decode the base64 value
Find a metadata field containing a base64-encoded string. Decode it with the base64 command to reveal the flag.
echo '<base64_value_from_metadata>' | base64 -d
Metadata fields in PDFs (and other file formats) can hold arbitrary text strings - their content is never validated by the format specification. A base64-encoded value stored in a metadata field provides a layer of obfuscation: the raw field value looks like random alphanumeric characters to a casual viewer, but decodes to meaningful data instantly with standard tools.
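Spotting the encoded field by eye works for a short listing; a filter can help when the output is long. The following is a minimal sketch, assuming exiftool's default "Tag : Value" output layout. find_b64_fields is a hypothetical helper, and the 20-character minimum is an arbitrary threshold chosen to skip short ordinary values.

```shell
# Hypothetical helper: read "Tag : Value" lines on stdin and keep those whose
# value looks like base64, i.e. 20 or more characters from the base64
# alphabet, optionally followed by one or two '=' padding characters.
find_b64_fields() {
  grep -E ':[[:space:]]+[A-Za-z0-9+/]{20,}={0,2}[[:space:]]*$'
}

# Usage, assuming exiftool is installed and riddle.pdf is in the current directory:
#   exiftool riddle.pdf | find_b64_fields
```

This is only a heuristic: it can miss a short encoded value or match a long path-like string, so read the full listing whenever the filter returns nothing useful.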
echo '...' | base64 -d pipes the string to base64's decode mode. The output goes to stdout, which can be further piped or redirected. If the decoded output is binary (an image, a compressed file), redirect it to a file: echo '...' | base64 -d > output.bin, then run file output.bin to determine what it is.
This two-layer approach (metadata hiding + base64 encoding) combines two independent techniques: hiding data where most tools won't look, and obfuscating it so it doesn't look suspicious at first glance. In CTF forensics, always run exiftool first (metadata), then strings (printable sequences), then binwalk (embedded files) on any given file before attempting more complex analysis.
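Before decoding the real value, the base64 command can be sanity-checked with a round trip on a known string. This sketch uses the placeholder text hello rather than anything from the challenge, and printf '%s' instead of echo so that no trailing newline gets encoded.

```shell
# Encode a known placeholder string (not challenge data).
encoded=$(printf '%s' 'hello' | base64)
printf '%s\n' "$encoded"        # prints aGVsbG8=

# Decode it again; the original text comes back on stdout.
decoded=$(printf '%s' "$encoded" | base64 -d)
printf '%s\n' "$decoded"        # prints hello

# With exiftool installed, a single field's value can be piped straight in.
# Assumption: the encoded string sits in Author; substitute the real field.
#   exiftool -s3 -Author riddle.pdf | base64 -d
```

The -s3 option makes exiftool print only the value, without the tag name, so the output is suitable as direct input to base64 -d.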
Flag
picoCTF{...}
PDF metadata fields (Author, Title, Subject, Keywords) can hold arbitrary data - exiftool reads them all and is the standard first step for document forensics challenges.