It is my Birthday picoCTF 2021 Solution

Published: April 2, 2026

Description

I sent out 2 invitations to my birthday party, but some friends said the links were identical! Upload two different PDF files that share the same MD5 hash to get the flag.

Remote

Navigate to the challenge URL.

bash
# Open the challenge URL in your browser

Solution

Background: Hash Cracking for CTFs separates collision attacks (this challenge) from preimage and password recovery.
  1. Step 1
    Obtain precomputed MD5 collision PDFs
    Observation
    I noticed the challenge explicitly asks for two different PDF files that share the same MD5 hash, which is the textbook definition of an MD5 collision. MD5's collision resistance has been broken since 2004, and ready-made colliding PDF pairs are publicly available, so downloading one is the fastest path to a valid submission.
    Download a precomputed pair from the corkami/collisions repository on GitHub or from sites hosting collision examples by Vlastimil Klima and Marc Stevens.
    bash
    wget https://github.com/corkami/collisions/raw/master/examples/collision1.pdf
    bash
    wget https://github.com/corkami/collisions/raw/master/examples/collision2.pdf

    Expected output (from the md5sum, sha256sum, and diff -q checks in Step 2)

    bcc4be725bbc03c6c3dfc42f59b3df96  collision1.pdf
    bcc4be725bbc03c6c3dfc42f59b3df96  collision2.pdf
    5c694a1291411928d7b3dd679c0754e98ecd12f5feed77b8ca2ddc51f56cb0c3  collision1.pdf
    9e7dc6c1a1fe594937e4e8e77a70dfc1c88a4e02b3e0b26d5e82023111f4096b  collision2.pdf
    Files collision1.pdf and collision2.pdf differ
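
The listing above can also be read mechanically. The following is a minimal sketch, not part of the original walkthrough: `group_by_digest` is a hypothetical helper that groups file names by digest, so a digest shared by both files stands out. The sample text is copied from the expected output shown above.

```python
from collections import defaultdict


def group_by_digest(listing: str) -> dict[str, list[str]]:
    """Map each digest in md5sum/sha256sum-style output to its file names."""
    groups: dict[str, list[str]] = defaultdict(list)
    for line in listing.splitlines():
        parts = line.split()
        # Checksum lines look like "<hex digest>  <file name>"; skip the rest.
        if len(parts) != 2:
            continue
        digest, name = parts
        if all(c in "0123456789abcdef" for c in digest):
            groups[digest].append(name)
    return dict(groups)


LISTING = """\
bcc4be725bbc03c6c3dfc42f59b3df96  collision1.pdf
bcc4be725bbc03c6c3dfc42f59b3df96  collision2.pdf
5c694a1291411928d7b3dd679c0754e98ecd12f5feed77b8ca2ddc51f56cb0c3  collision1.pdf
9e7dc6c1a1fe594937e4e8e77a70dfc1c88a4e02b3e0b26d5e82023111f4096b  collision2.pdf
Files collision1.pdf and collision2.pdf differ
"""

for digest, names in group_by_digest(LISTING).items():
    # 32 hex characters is an MD5 digest, 64 is SHA-256.
    kind = "MD5" if len(digest) == 32 else "SHA-256"
    print(f"{kind} {digest[:8]}... -> {names}")
```

The single MD5 digest maps to both files, while each SHA-256 digest maps to only one, which is the pattern a usable pair should show.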
    What didn't work first

    Tried: Generate a fresh MD5 collision from scratch using hashclash or fastcoll.

    A chosen-prefix collision with hashclash takes hours of CPU time and requires compiling specialized tools. fastcoll is much quicker for identical-prefix collisions, but its output still has to end up inside two files the server accepts as PDFs. The challenge only needs two such files with matching MD5 hashes, so the precomputed public pair from corkami/collisions serves equally well and takes seconds to download.

    Tried: Download any two files that both happen to be PDFs and assume they might share an MD5 hash by coincidence.

    MD5 collisions do not happen by accident: for two unrelated files, the chance of matching 128-bit digests is about 2^-128. A collision requires deliberately crafted bytes, typically spanning two consecutive 64-byte blocks (128 bytes), chosen so that the differences they introduce in the hash function's internal state cancel out. Only purpose-built pairs like the corkami examples satisfy this.
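
To put a number on "never", here is a rough sketch of the arithmetic. It assumes MD5 digests of unrelated files behave like uniformly random 128-bit values, which is an idealisation. `accidental_collision_bound` is a hypothetical helper that returns an upper bound (number of file pairs divided by 2^128) on the chance that any two of n files match by accident.

```python
from fractions import Fraction

MD5_BITS = 128  # MD5 digests are 128 bits long


def accidental_collision_bound(n_files: int, bits: int = MD5_BITS) -> Fraction:
    """Union bound: at most (number of pairs) / 2**bits for uniform random digests."""
    pairs = n_files * (n_files - 1) // 2
    return Fraction(pairs, 2**bits)


for n in (2, 10**6, 10**12):
    bound = float(accidental_collision_bound(n))
    print(f"{n:>17,} files: chance of any accidental match <= {bound:.3e}")
```

Even a trillion unrelated files stay far below any chance worth testing, which is why grabbing two arbitrary PDFs cannot work.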

    Learn more

    MD5 (Message-Digest Algorithm 5) was designed in 1991 as a cryptographic hash function. In 2004, Xiaoyun Wang and colleagues demonstrated practical collision attacks - finding two different inputs that produce the same 128-bit hash output. In 2008, researchers used an MD5 collision to create a rogue CA certificate. MD5 is now considered completely broken for any security-critical application that depends on collision resistance.

    The corkami/collisions GitHub repository by Ange Albertini is an excellent reference on file format collisions - it demonstrates MD5 collisions for PDF, JPEG, ZIP, and many other formats. The collision PDFs look different but hash to the same MD5 value.

  2. Step 2
    Verify the collision and upload both files
    Observation
    I noticed the challenge requires two files that are genuinely different in content yet share the same MD5, so I needed to confirm with md5sum that the hashes match and with sha256sum and diff that the bytes actually differ before submitting.
    Confirm same MD5, different bytes, then upload both. The server accepts the pair because the two files differ in content while its hash comparison relies only on MD5.
    bash
    md5sum collision1.pdf collision2.pdf
    bash
    sha256sum collision1.pdf collision2.pdf
    bash
    diff -q collision1.pdf collision2.pdf
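
The same three checks can be scripted. This is a minimal Python sketch, assuming the two downloaded files sit in the current directory. `file_digest` and `collision_report` are illustrative names, not part of any tool used above.

```python
import filecmp
import hashlib
from pathlib import Path


def file_digest(path: Path, algorithm: str) -> str:
    """Hash a file in chunks so large PDFs never have to fit in memory."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def collision_report(a: Path, b: Path) -> dict[str, bool]:
    """Summarise the three checks: MD5 equal, SHA-256 equal, bytes equal."""
    return {
        "md5_equal": file_digest(a, "md5") == file_digest(b, "md5"),
        "sha256_equal": file_digest(a, "sha256") == file_digest(b, "sha256"),
        # shallow=False forces a byte-by-byte comparison, like diff -q.
        "bytes_equal": filecmp.cmp(a, b, shallow=False),
    }


if __name__ == "__main__":
    a, b = Path("collision1.pdf"), Path("collision2.pdf")
    if a.exists() and b.exists():
        # A usable pair reports md5_equal=True and the other two False.
        print(collision_report(a, b))
```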
    What didn't work first

    Tried: Upload the same PDF file twice under two different filenames and expect the server to accept the submission.

    The challenge explicitly requires two DIFFERENT files that share the same MD5. Uploading the same bytes twice would make both the MD5 and the sha256sum identical, and the server checks that the files differ in content. The diff command confirms the corkami pair has genuinely distinct bytes even though md5sum returns the same hash for both.
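
The acceptance logic described here can be modelled in a few lines. This is a hypothetical sketch of what such a server-side check might look like: the real challenge code is not shown in this writeup and may differ, and `check_submission` and its messages are invented for illustration.

```python
import hashlib


def check_submission(file1: bytes, file2: bytes) -> str:
    """Hypothetical model of the upload check; the real server code may differ."""
    # Reject byte-identical uploads first (SHA-256 stands in for a content comparison).
    if hashlib.sha256(file1).digest() == hashlib.sha256(file2).digest():
        return "rejected: files are identical"
    # Different content is accepted only when the MD5 digests still match.
    if hashlib.md5(file1).digest() != hashlib.md5(file2).digest():
        return "rejected: MD5 hashes differ"
    return "accepted: different files, same MD5"


# The same bytes uploaded twice fail the first check.
print(check_submission(b"%PDF-1.4 invite", b"%PDF-1.4 invite"))
# Two unrelated files fail the second check.
print(check_submission(b"%PDF-1.4 invite A", b"%PDF-1.4 invite B"))
```

Only a genuine collision pair reaches the final return, which is why both failed attempts in this section are rejected.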

    Tried: Use SHA-256 collision pairs instead of MD5 collision pairs, reasoning that any hash collision should work.

    No SHA-256 collision pairs exist - none have been published because SHA-256's collision resistance has not been broken. The server compares MD5 hashes specifically, so only an MD5 collision pair satisfies the challenge constraint. SHA-1 collisions do exist (the 2017 SHAttered pair is itself two PDFs), but the published pairs have different MD5 digests, so they would not help either.

    Learn more

    md5sum should show identical hashes for both files, while sha256sum and diff confirm that they are genuinely different. The colliding blocks are placed where the PDF format tolerates arbitrary-looking bytes, so both files remain valid: the visible pages look different, but the two byte sequences hash to the same MD5 value.

    Why PDFs are easy targets. The PDF format is built around indirect objects (1 0 obj ... endobj) and binary streams (stream ... endstream) that a renderer reaches by following object references, much like pointers. Both files can therefore carry the content for both documents. The colliding data, typically two consecutive 64-byte MD5 blocks generated with a tool such as fastcoll or UniColl (part of Marc Stevens's hashclash project), is positioned so that one of the few differing bytes falls inside an object reference, for example the /Pages entry of the /Catalog. Each file then points at a different page tree: same MD5, two different rendered documents.
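
To see where a colliding pair actually differs, one can list the 64-byte MD5 blocks that contain differing bytes. This is a small sketch under the assumption that the two files have equal length. `differing_blocks` is a hypothetical helper, and no particular result for the corkami pair is claimed here.

```python
MD5_BLOCK_SIZE = 64  # MD5 consumes its input in 64-byte blocks


def differing_blocks(a: bytes, b: bytes, block_size: int = MD5_BLOCK_SIZE) -> list[int]:
    """Return the indices of the blocks in which the two inputs differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have equal length for a block-by-block comparison")
    blocks = {i // block_size for i, (x, y) in enumerate(zip(a, b)) if x != y}
    return sorted(blocks)


if __name__ == "__main__":
    from pathlib import Path

    p1, p2 = Path("collision1.pdf"), Path("collision2.pdf")
    if p1.exists() and p2.exists():
        # Prints the block indices that hold the crafted, differing bytes.
        print(differing_blocks(p1.read_bytes(), p2.read_bytes()))
```

If the differences cluster in a small run of adjacent blocks, everything before and after them is byte-identical in both files.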

    Modern cryptographic standards use SHA-256 or SHA-3 instead of MD5. No practical collision attacks are known against SHA-256. Code signing, SSL certificates, and integrity verification should never rely on MD5 or on SHA-1, for which a practical collision was demonstrated in 2017.
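
As an illustration of the recommended alternative, here is a minimal sketch of an integrity check built on SHA-256 with Python's standard library. `matches_sha256` is an illustrative name, and the "published" digest below is a stand-in computed on the spot rather than a real release checksum.

```python
import hashlib
import hmac


def matches_sha256(data: bytes, expected_hex: str) -> bool:
    """Compare the SHA-256 of data with a published hex digest."""
    actual = hashlib.sha256(data).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


# Stand-in for a digest that a publisher would list next to a download.
published = hashlib.sha256(b"release-1.0 contents").hexdigest()
print(matches_sha256(b"release-1.0 contents", published))
print(matches_sha256(b"tampered contents", published))
```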

Interactive tools
  • Hash Identifier: Identify unknown hash types by length and prefix. Covers MD5, SHA-1, SHA-256, SHA-512, bcrypt, NTLM, and more.
  • Checksum Calculator: Compute CRC32, SHA-1, SHA-256, SHA-384, and SHA-512 hashes for text or uploaded files. Verify against known hashes.

Flag

picoCTF{c0ngr4ts_u_r_1nv1t3d_...}

MD5's collision resistance has been broken since 2004, and precomputed collision pairs for common file formats like PDF are publicly available.

Key takeaway

MD5's collision resistance was broken in 2004, and precomputed collision pairs for PDF, JPEG, ZIP, and other formats are now freely available. A server that assumes 'same MD5 hash implies same content' can be tricked into treating two visually distinct documents as identical, enabling document forgery, certificate spoofing, and integrity bypass. The 2008 rogue CA attack and the 2012 Flame malware both exploited MD5 collisions against real certificate infrastructure. SHA-256 or SHA-3 must replace MD5 in any system where integrity actually matters.
