Description
Sensitive text in a PDF was only visually redacted. Convert the PDF to text (or copy/paste) to reveal the hidden flag.
Setup
Install `pdftotext` (from poppler-utils or xpdf).
Run `pdftotext Financial_Report_for_ABC_Labs.pdf` to create a .txt version.
Search the text output for `picoCTF` (or simply copy the blacked-out text directly inside a PDF viewer).
pdftotext Financial_Report_for_ABC_Labs.pdf
grep -oE "picoCTF\{.*\}" Financial_Report_for_ABC_Labs.txt
Solution
- Step 1: Convert the PDF
Visual redactions don't remove the underlying text. `pdftotext` extracts everything, including the supposedly hidden sections.
Learn more
PDF (Portable Document Format) stores content as a layered document structure. A black rectangle drawn on top of text is a separate visual element - the original text data remains fully intact in the file's content stream. This is fundamentally different from actually deleting or overwriting the text.
The `pdftotext` tool (part of the poppler-utils package) strips all visual formatting and extracts the raw text content, bypassing any overlaid shapes. Even simpler: many PDF viewers let you select and copy text that appears visually redacted - the text layer is still there and selectable.
This is not a theoretical vulnerability - it has caused real-world data breaches. High-profile examples include leaked NSA documents and court filings where sensitive names were "blacked out" using this flawed method. The correct approach is to use purpose-built redaction tools that remove the text from the document, not merely cover it.
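To make the layering point concrete, here is a minimal sketch using a hand-written, uncompressed content-stream fragment. The text "Account number: 12345" and the rectangle coordinates are invented for illustration and are not taken from the challenge file. Real PDFs usually compress their streams, so a plain `grep` on the raw file would normally find nothing; that is why `pdftotext` is the practical tool.

```shell
# Hand-written content-stream fragment (hypothetical, uncompressed).
stream="$(mktemp)"
cat > "$stream" <<'EOF'
BT
/F1 12 Tf
72 700 Td
(Account number: 12345) Tj
ET
0 0 0 rg
70 695 200 16 re
f
EOF

# BT ... ET draws the text; "rg", "re" and "f" then fill a black rectangle
# over the same area. The rectangle is a separate drawing instruction, so the
# string operand of Tj is still present in the stream.
hidden="$(grep -oE '\(.*\) Tj' "$stream")"
echo "$hidden"
rm -f "$stream"
```

The text-showing operator and the rectangle fill are independent instructions, so covering one with the other changes only what is painted on screen.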
- Step 2: Search for the flag
Grep the generated text file for `picoCTF` to immediately locate the flag string.
Learn more
`grep -oE "picoCTF\{.*\}"` uses an extended regular expression to match the flag pattern. The `-o` flag prints only the matching portion (not the whole line), and `-E` enables extended regex syntax, in which `\{` and `\}` match literal braces; `.*` matches "any characters."
In forensics and incident response, pattern-matching against extracted text is a core workflow. Tools like bulk_extractor automate this at scale, scanning disk images or raw files for email addresses, URLs, credit card numbers, and other structured data patterns - even across file boundaries in unallocated space.
Proper document redaction for sensitive material requires tools certified for the purpose, such as Adobe Acrobat's built-in redaction feature (which actually removes content), or dedicated solutions used in legal and government contexts that produce a new, sanitized document with the underlying data permanently removed.
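One way to sanity-check a redacted document before distributing it is to run the same extraction an attacker would and confirm the sensitive term no longer appears. The sketch below assumes a hypothetical helper named `leaks_term` and uses made-up sample strings in place of real `pdftotext` output. A clean result only shows the term is absent from the extracted text; it does not rule out leaks through metadata or embedded images.

```shell
# leaks_term TERM: read extracted text on stdin; exit 0 if TERM is still present.
# (Hypothetical helper name, for illustration only.)
leaks_term() {
  grep -qF -- "$1"
}

# Stand-ins for extracted text (invented sample data).
covered_only='Name: Jane Doe  Salary: 90000'
truly_redacted='Name: [REDACTED]  Salary: 90000'

if printf '%s\n' "$covered_only" | leaks_term 'Jane Doe'; then
  echo "LEAK: term is still extractable"
fi

if ! printf '%s\n' "$truly_redacted" | leaks_term 'Jane Doe'; then
  echo "OK: term not found in extracted text"
fi
```

Against a real file, the same check could be run as `pdftotext redacted.pdf - | leaks_term 'Jane Doe'`, where `-` sends the extracted text to standard output and `redacted.pdf` is a placeholder file name.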
Flag
picoCTF{C4n_Y0u_S33_m3_f...}
Real-world lesson: always remove sensitive text entirely before distributing redacted documents.