Description
I sent out 2 invitations to my birthday party, but some friends said the links were identical! Upload two different PDF files that share the same MD5 hash to get the flag.
Setup
Navigate to the challenge URL.
# Open the challenge URL in your browser
Solution
Step 1
Obtain precomputed MD5 collision PDFs
Observation
I noticed the challenge explicitly asks for two different PDF files that share the same MD5 hash, which is the textbook definition of an MD5 collision. Practical MD5 collisions have been known since 2004, so precomputed collision pairs in PDF format are publicly available and are the fastest path to a valid submission.
Download a pair from the corkami/collisions repository on GitHub or from sites hosting the original Vlastimil Klima and Marc Stevens collision examples.
wget https://github.com/corkami/collisions/raw/master/examples/collision1.pdf
wget https://github.com/corkami/collisions/raw/master/examples/collision2.pdf
Expected output (from the md5sum, sha256sum, and diff checks in Step 2)
bcc4be725bbc03c6c3dfc42f59b3df96  collision1.pdf
bcc4be725bbc03c6c3dfc42f59b3df96  collision2.pdf
5c694a1291411928d7b3dd679c0754e98ecd12f5feed77b8ca2ddc51f56cb0c3  collision1.pdf
9e7dc6c1a1fe594937e4e8e77a70dfc1c88a4e02b3e0b26d5e82023111f4096b  collision2.pdf
Files collision1.pdf and collision2.pdf differ
What didn't work first
Tried: Try to generate a fresh MD5 collision from scratch using hashclash or fastcoll.
Generating a chosen-prefix MD5 collision with hashclash takes hours of CPU time and requires compiling specialized tools. fastcoll is far faster, but it still has to be built, and its output then has to be embedded in two valid PDFs. The challenge only needs two files with matching MD5 hashes, so the precomputed public pairs from corkami/collisions work just as well and take seconds to download.
Tried: Download any two files that both happen to be PDFs and assume they might share an MD5 hash by coincidence.
MD5 collisions do not happen by accident: the chance that two arbitrary files share a 128-bit hash is about 1 in 2^128. A collision requires deliberately crafted byte sequences in a 128-byte block (two 64-byte MD5 blocks) in which the differences introduced into the hash function's internal state cancel out. Only purpose-built pairs like the corkami examples satisfy this.
Learn more
MD5 (Message Digest 5) was designed in 1991 as a cryptographic hash function. By 2004, Xiaoyun Wang and colleagues demonstrated practical collision attacks - finding two different inputs that produce the same 128-bit hash output. By 2008, researchers had forged an MD5-signed SSL certificate. MD5 is now considered completely broken for any security-critical application.
The corkami/collisions GitHub repository by Ange Albertini is an excellent reference on file format collisions - it demonstrates MD5 collisions for PDF, JPEG, ZIP, and many other formats. The collision PDFs look different but hash to the same MD5 value.
Step 2
Verify the collision and upload both files
Observation
I noticed the challenge requires two files that are genuinely different in content yet share the same MD5, so I needed to confirm with md5sum that the hashes match and with sha256sum and diff that the bytes actually differ before submitting.
Confirm same MD5, different bytes, then upload both. The server accepts the pair because the files differ in content while their MD5 hashes match.
md5sum collision1.pdf collision2.pdf
sha256sum collision1.pdf collision2.pdf
diff -q collision1.pdf collision2.pdf
What didn't work first
Tried: Upload the same PDF file twice under two different filenames and expect the server to accept the submission.
The challenge explicitly requires two DIFFERENT files that share the same MD5. Uploading the same bytes twice would make both the MD5 and the sha256sum identical, and the server checks that the files differ in content. The diff command confirms the corkami pair has genuinely distinct bytes even though md5sum returns the same hash for both.
Tried: Try SHA-256 collision pairs instead of MD5 collision pairs, reasoning that any hash collision should work.
No SHA-256 collision pairs exist: none have been published because SHA-256 has not been broken. The server is specifically comparing MD5 hashes, so only MD5 collision pairs satisfy the challenge constraint. SHA-1 collisions do exist (the 2017 SHAttered pair is itself two PDFs), but those files have different MD5 hashes, so they would not help either.
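The three manual checks in this step can be bundled into one pass/fail test. The following is a minimal sketch, assuming bash with md5sum and cmp available; check_collision is a hypothetical helper name introduced here, not part of the challenge or of any tool mentioned above.

```bash
# check_collision FILE_A FILE_B   (hypothetical helper for illustration)
# Succeeds only when both files share an MD5 digest AND differ in content.
check_collision() {
    local a="$1" b="$2"
    local md5_a md5_b

    # Read from stdin so md5sum prints "<hash>  -"; keep only the hash field.
    md5_a=$(md5sum < "$a" | cut -d' ' -f1)
    md5_b=$(md5sum < "$b" | cut -d' ' -f1)

    if [ "$md5_a" != "$md5_b" ]; then
        echo "MD5 differs: not a collision"
        return 1
    fi

    # cmp -s is silent and exits 0 only when the files are byte-identical.
    if cmp -s "$a" "$b"; then
        echo "files are identical: not a collision"
        return 1
    fi

    echo "collision: same MD5 ($md5_a), different bytes"
}
```

Calling check_collision collision1.pdf collision2.pdf before uploading rules out both failure modes described above: a pair whose hashes differ, and the same file submitted twice under two names.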
Learn more
md5sum should show identical hashes for both files, while sha256sum and diff confirm they are genuinely different. The collision is achieved by placing carefully computed collision blocks in a part of the PDF that is not itself rendered: the visible pages look different, but both files hash identically under MD5.
Why PDFs are easy targets. The PDF format is built around indirect objects (1 0 obj ... endobj) and binary streams (stream ... endstream) that the renderer follows like pointers. A PDF can carry the content of both documents along with a pair of MD5 collision blocks (generated with hashclash, for example its unicoll attack from Marc Stevens). The byte that differs between the two files then changes which object a reference such as /Catalog /Pages X points to, and so selects which content is displayed. Same MD5, two different rendered documents.
Modern cryptographic standards use SHA-256 or SHA-3 instead of MD5. No practical collision attacks are known against SHA-256. Code signing, SSL certificates, and integrity verification should never rely on MD5 or SHA-1 (for which a practical collision was published in 2017).
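To see where two collision files actually diverge, the differing byte offsets can be listed directly. This is a rough sketch, assuming a POSIX cmp and awk; diff_offsets is a hypothetical helper name. For a collision pair, the offsets would be expected to fall within a short range of the file (the collision blocks), though the exact positions depend on how the pair was built.

```bash
# diff_offsets FILE_A FILE_B   (hypothetical helper for illustration)
# Prints the 1-based decimal offset of every byte that differs, one per line.
diff_offsets() {
    # cmp -l lists each differing byte as: offset, octal value in A, octal value in B.
    # awk keeps only the first column, the offset.
    cmp -l "$1" "$2" | awk '{ print $1 }'
}
```

Running diff_offsets collision1.pdf collision2.pdf | wc -l gives a quick count of how many bytes differ between the two files.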
Interactive tools
- Hash Identifier: Identify unknown hash types by length and prefix. Covers MD5, SHA-1, SHA-256, SHA-512, bcrypt, NTLM, and more.
- Checksum Calculator: Compute CRC32, SHA-1, SHA-256, SHA-384, and SHA-512 hashes for text or uploaded files. Verify against known hashes.
Flag
picoCTF{c0ngr4ts_u_r_1nv1t3d_...}
MD5's collision resistance has been broken since 2004 - precomputed collision pairs for common file formats like PDF are publicly available.