June 9, 2026

z3 for CTF: Constraint Solving from Keychecks to Crypto

A z3 solver tutorial for CTF: the four-move script skeleton, the BitVec-vs-Int type trap that silently breaks solutions, and when z3 beats angr or brute force.

The first time z3 felt like cheating (and why it isn't)

Hello! Here is the thing nobody told me when I started losing reverse-engineering challenges to a wall of arithmetic: z3 does not solve the challenge. It solves the equations you hand it. z3 is a constraint solver (a tool from Microsoft Research that takes a pile of equations like x > 5 and x * 7 == 91 and either hands you a value of x that makes all of them true, or proves none exists). It is very good at that one job. It is not a magic "press button, receive flag" machine, and the people who treat it like one are the people it lets down.

I remember the first time a z3 script printed a password I would have spent an hour recovering by hand. It felt like cheating. Then the next challenge, the exact same approach printed garbage, confidently, and I had no idea why. That gap, between "it worked and I don't know why" and "it failed and I don't know why," is the whole reason this post exists. The barrier with z3 is almost never the solver. It is two skills the tutorials skip: recognizing when a check is a system of equations you can read off the decompiler, and typing those equations so the solver's answer matches what the machine actually does.

z3 doesn't solve the challenge. It solves the equations you hand it. The skill is knowing which checks are equations, and writing them in the right type.

This is for picoCTF players who have solved a few reversing or crypto challenges and keep bouncing off z3 because every example uses Int and their binary uses 32-bit integers that wrap around. By the end you will have a four-move skeleton, the one type rule that silently breaks more solutions than anything else, a worked keycheck, and a clear sense of when z3 beats angr or plain brute force, and when it has no chance at all. The real picoCTF challenges I anchor to are secure-email-service (picoCTF 2025), where z3 cracks something brute force structurally cannot, and rolling-my-own (picoCTF 2021), where it looks like a z3 problem and absolutely is not.

Tip: z3 is a Python library. pip install z3-solver gives you both the engine and the z3 Python module. Everything in this post is from z3 import * and then a dozen lines.

Every z3 script is the same four moves

Open any z3 writeup and the script will look busy. Strip the noise and what is left is four moves, in the same order, every single time. Declare your unknowns. Add the constraints they have to satisfy. Ask z3 if that is even possible. Read the answer (z3 calls it the model, the concrete values that satisfy everything) out of the solver.

from z3 import *
# 1. Declare the unknowns (the bytes you're solving for).
x = BitVec('x', 32) # a 32-bit machine integer (BitVec explained next)
y = BitVec('y', 32)
# 2. Add the constraints the binary imposes.
s = Solver()
s.add(x + y == 0x1337)
s.add(x ^ y == 0x4242)
# 3. Ask: is there an assignment that satisfies all of them?
print(s.check()) # sat / unsat / unknown
# 4. Read the answer out of the model.
if s.check() == sat:
m = s.model()
print(m[x], m[y])

That is the entire framework. BitVec makes a variable. Solver() holds the equations. s.add(...) adds one. s.check() returns one of three words, and you have to respect all three:

  • sat (satisfiable): there is an assignment, and s.model() hands it to you. This is the happy path.
  • unsat (unsatisfiable): no assignment exists. If you expected a flag and got unsat, your constraints are wrong, not the challenge.
  • unknown: z3 gave up, usually because it hit a timeout or the math is outside what it can decide. Crucially, unknown is not unsat. You cannot conclude "no solution" from it.
Key insight: Four moves: declare, constrain, check, read. Everything you will ever add (printable-byte bounds, a loop to enumerate all solutions, a timeout, byte-slicing with Extract) is in service of one of those four. When a script misbehaves, the bug is almost always in move 1 (wrong type) or move 2 (wrong constraint).

The official z3py tutorial and the newer online z3 guide both frame it the same way: declare constants, assert constraints, call check, inspect the model. There is also a one-shot helper, solve(x + y == 0x1337), which prints a model in a single line. It is fine for a quick experiment. For anything you want to extend, use a real Solver() so you can add constraints in a loop and ask check more than once.

TakeawayIf you can write the check as a set of s.add(...) lines, z3 will finish the algebra. The entire job is getting move 1 and move 2 right.

The trap nobody warns you about: Int vs BitVec

Here is the single most expensive mistake, and it is invisible. Most z3 tutorials open with Int('x'). An Int in z3 is a real mathematical integer: unbounded, never overflows, behaves like math class. Almost every binary you will ever reverse uses fixed-width machine integers that wrap around at 2^32 or 2^64. The mismatch does not throw an error. It quietly gives you a model that is correct for the wrong arithmetic.

Watch it happen. Suppose the binary does this with a 32-bit unsigned integer (a hashing step you would read straight out of Ghidra):

// C, with uint32_t arithmetic (wraps mod 2**32)
uint32_t h = input;
h = h * 0x9E3779B1;
if (h == 0x1234567) win();

Model it with an Int and z3 reasons over the unbounded integers:

from z3 import *
x = Int('x') # WRONG: unbounded math integer
s = Solver()
s.add(x * 0x9E3779B1 == 0x1234567)
print(s.check()) # unsat

z3 says unsat. Over the unbounded integers there is no whole number you can multiply by 0x9E3779B1 to land exactly on 0x1234567, so z3 is telling the truth about the wrong question. A beginner reads unsat, concludes "impossible, I must have misread the binary," and goes in circles. Now model it as the 32-bit machine integer it actually is:

from z3 import *
x = BitVec('x', 32) # RIGHT: 32-bit, wraps mod 2**32
s = Solver()
s.add(x * 0x9E3779B1 == 0x1234567)
print(s.check()) # sat
print(hex(s.model()[x].as_long())) # the one input that works

Now it is sat. Multiplication on a BitVec wraps modulo 2^32 exactly like C, and because 0x9E3779B1 is odd it has a modular inverse, so there is exactly one 32-bit value that works and z3 finds it instantly. Same solver, same constraint, opposite answer. The only thing that changed was the type.

The second half of the trap is signed versus unsigned. The z3 Python operators <, <=, >, >=, /, %, and >> all emit the signed versions of those operations. The z3 guide says it plainly: signed comparison "takes the sign bit of bitvectors into account," while unsigned comparison "treats the bitvector as a natural number" (z3 docs). If the C code compared two unsigned ints and you wrote >, z3 can hand you a value with its top bit set that satisfies your signed check but would fail the real unsigned one. There is a whole separate set of unsigned operators for this, and matching them to the C code is the job.

You write (Python)z3 emitsUse this for unsigned C
a < bSigned less-thanULT(a, b)
a <= bSigned less-or-equalULE(a, b)
a > bSigned greater-thanUGT(a, b)
a >= bSigned greater-or-equalUGE(a, b)
a / bSigned divisionUDiv(a, b)
a % bSigned remainderURem(a, b)
a >> bArithmetic (sign-filling) shiftLShR(a, b)

The good news: +, -, *, and the bitwise ^ & | ~ wrap identically whether you think of the bits as signed or unsigned, so those need no special operator. It is comparisons, division, remainder, and right-shift where the signed default bites. And it does bite in the wild: even z3's own author, Nikolaj Bjorner, has fielded bug reports about mixing integers and bit-vectors producing surprising results. If the maintainer has to think carefully here, so do you.

Heads up: Default to BitVec for anything that came out of a binary, and pick the width the C type uses (8 for char, 32 for int, 64 for long on a 64-bit target). Reach for Int only when the challenge is genuinely about unbounded math (a system of linear equations over the integers, say), not machine code.
TakeawayWrong type, wrong answer, no error message. BitVec with the right width and the unsigned operators is how you make z3's model match the machine. This one rule fixes more broken z3 scripts than everything else combined.

Reading a keycheck straight off the decompiler

The bread-and-butter z3 challenge is the keycheck: the binary reads your input, runs it through a chain of arithmetic and bitwise operations, and compares the result to a constant. In Ghidra you will be staring at decompiled output that looks like a list of conditions on an input array. Each condition becomes one s.add(...). That is the whole translation.

Here is the shape for transcribing a decompiled if-chain. Say Ghidra shows you something like this (a is your input, one byte per index):

// decompiled check (each line is one constraint)
if (a[21] - a[3] != -20) return WRONG;
if (4 * a[11] != 380) return WRONG;
if ((a[20] ^ 0x2B) != a[7]) return WRONG;
if ((a[6] & 0x45) != 68) return WRONG;
if ((a[5] << (a[9] % 8)) != 392) return WRONG;
// ... and so on, for every input byte
return CORRECT;

You do not need to understand why those constants are what they are. You just copy each condition into z3, flipping "return WRONG unless" into "this must be true," and let the solver untangle the whole system at once:

from z3 import *
N = 32 # input length, read from the binary
a = [BitVec(f'a{i}', 8) for i in range(N)] # one 8-bit var per byte
s = Solver()
# Keep every byte printable so we can type the answer back in.
for b in a:
s.add(b >= 0x20, ULE(b, 0x7e)) # ULE: unsigned upper bound
# Transcribe the decompiled checks, one per line.
s.add(a[21] - a[3] == -20)
s.add(4 * a[11] == 380)
s.add((a[20] ^ 0x2B) == a[7])
s.add((a[6] & 0x45) == 68)
s.add(a[5] << (a[9] % 8) == 392)
# ... one s.add per condition
assert s.check() == sat
m = s.model()
flag = bytes(m[a[i]].as_long() for i in range(N))
print(flag)

That is the move. Forty conditions that would take you an hour to solve by hand as a system of simultaneous equations become forty s.add lines, and z3 returns the one input that satisfies all of them in well under a second. The printable-byte loop is doing real work: without it, z3 is free to pick any byte values that satisfy the math, including unprintable ones, and you will get a "correct" answer you cannot type into the program.

When the decompiled logic has branches (an if inside the check that picks one transform or another), you cannot use a Python if, because at script time a[i] is a symbol, not a number. You use z3's If:If(cond, then_expr, else_expr). This pattern also surfaces the masking lesson: you have to mask each intermediate with & 0xff because "type information was lost," which is the same Int-vs-BitVec lesson from the last section wearing a different hat. Model your bytes as 8-bit BitVecs from the start and the masking takes care of itself.

Tip: You do not have to retype Ghidra's output by hand for huge checks. People script the transcription, and tools like WithSecure's z3-and-angr workshop walk through building these models step by step. But for a normal picoCTF keycheck, typing the conditions is faster than automating it.
TakeawayA keycheck is a system of equations wearing an if-chain costume. One s.add per condition, printable-byte bounds, 8-bit BitVecs, and z3 does the simultaneous algebra you would otherwise grind by hand.

z3 is the solver inside angr (so the real question is who writes the constraints)

People frame it as "z3 versus angr," and the framing hides the punchline: angr uses z3. Under the hood, angr does its constraint-solving through a library called claripy, and claripy's default backend is literally a class named BackendZ3 that import z3 and calls z3.Solver(). The angr docs describe claripy as "a simple frontend to z3." So the two tools are not rivals running different engines. They are the same engine with two different ways of feeding it.

angr is not an alternative to z3. angr is symbolic execution that hands its constraints to z3 for you. The choice is who writes the equations: you, or an emulator.

That reframe is the useful part. With z3, you read the binary, understand the check, and write the constraints by hand. With angr, you point a symbolic CPU emulator at the binary, it explores execution paths, and it accumulates the constraints automatically along the way, then asks z3 the same kind of question you would have asked. angr saves you from reading the logic. It costs you path explosion: a branch inside a loop multiplies the number of states, and the emulator can eat all your RAM before z3 ever gets a turn. (The angr post on this site has the full diagnostic for that.)

So the decision is really about the shape of the problem, not brand loyalty:

When the challenge looks like thisReach forBecause
The check is a readable system of arithmetic and bitwise conditions on the input.z3You can transcribe it in minutes, the model is exact and debuggable, and you control every constraint.
You would rather not read the logic, and there are only a few input-dependent branches (no loop multiplying the paths).angrIt extracts the constraints for you and discharges them with z3. Watch for path explosion.
The keyspace is tiny: a few thousand candidates, or each position is independent.brute forceWriting constraints costs more than the loop saves. One writeup brute-forced a stuck check in a few thousand tries.
The check runs a real hash or cipher (SHA, MD5, AES) on your input.none of themz3 and angr both choke on cryptographic diffusion. Read the code or find the crypto weakness instead.

That last row is not hypothetical, and the third row is a real war story. In one hands-on account, the author reached for angr first, watched memory consumption kill the process once there were a handful of avoid conditions, and ended up brute-forcing the tiny remaining keyspace because it was only a couple thousand trials. The lesson is not "angr is bad." It is that the shape of the problem, not the tool you like, picks the tool.

Takeawayz3 and angr share a solver. Choose z3 when you can read the check and want exact control, angr when you want the constraints extracted for you, brute force when the space is small, and none of them when the binary runs real crypto.

Where z3 glows: cracking a PRNG you cannot brute force

The picoCTF 2025 challenge secure-email-service is the case where z3 does something no brute force can. The app generates MIME boundaries with Python's random.randrange(sys.maxsize), which on a 64-bit system calls getrandbits(63) under the hood. If you can predict the next boundary, you can forge a message. Predicting it means recovering the internal state of Python's random number generator, the Mersenne Twister (MT19937).

Normally you would grab a library like randcrack, feed it 624 consecutive 32-bit outputs (624 because that is the number of 32-bit words in the Mersenne Twister's internal state), and it reconstructs the state by inverting the generator's output step. Here that fails, for a beautifully specific reason. CPython's getrandbits(63) generates two 32-bit Mersenne-Twister words and then right-shifts to drop down to 63 bits, which throws away the lowest bit of one word. You can verify this in the CPython source: the relevant path does r >>= (32 - k) on the final word (_randommodule.c). So every observation you collect is one bit short of a clean 32-bit word. randcrack needs exact 32-bit values and has no way to represent "this bit is unknown," so it cannot proceed.

z3 does not care that a bit is missing. You model each of the 624 state words as a symbolic 32-bit BitVec, add a constraint for every bit you do know, and leave the unknown bits as free variables for the solver to pin down from the rest of the system. That is exactly what the symbolic_mersenne_cracker library (its class is called Untwister) does. Its own description: it models the Mersenne Twister symbolically so you can recover the state "even if the outputs are truncated." Each observation is a binary string where ? marks an unknown bit:

from symbolic_mersenne_cracker import Untwister
ut = Untwister()
for boundary in observed_boundaries: # ~800 of them, from the emails
# split the 63-bit value into its two MT word contributions,
# marking the one missing bit as '?'
ut.submit(known_bits_with_question_marks(boundary))
rng = ut.get_random() # z3 has solved for the full MT state
next_boundary = rng.getrandbits(63) # now you can predict forward

Collect around 800 boundaries, mark the missing bits, hand the over-determined system to z3, and it recovers the exact internal state. From there you predict the next boundary and forge the message. This is the thing brute force structurally cannot do: you are not guessing a value, you are solving a 624-word system with holes in it, and "solve a system with holes" is precisely what an SMT solver is for.

Heads up: The same trick generalizes. Linear-feedback shift registers (LFSRs), small custom XOR ciphers, and other generators built from shifts and XORs are all linear systems z3 eats happily. If you can read off "output bit 5 is the XOR of state bits 2 and 9," you can model it. See the stream ciphers post for the keystream-reuse cousins of this idea.
Takeawayz3 wins big when the secret is the solution to a linear system you can only partially observe. Missing bits, truncated outputs, "I have 800 noisy samples": this is its home turf, and no amount of brute force reaches it.

Where z3 dies: real crypto, and why structure beats size

Now the honest part. z3 has a hard ceiling, and the ceiling is not about size. It is about structure. A bit-vector problem with billions of states can solve in a blink if the math is linear and well-behaved. A much smaller problem can run forever if the math was designed to destroy structure. Real cryptography is designed, on purpose, to destroy structure. That is the entire point of a hash function.

Look at rolling-my-own (picoCTF 2021). The check involves MD5: a chunk of your password is hashed, and specific bytes of the digest have to come out a certain way. It looks like a constraint problem ("find the input whose MD5 has these bytes"), and a beginner's instinct is to hand the whole MD5 computation to z3 and ask for a preimage. Do not. z3 will sit there. The right move, and the one the writeup uses, is brute force over the tiny 4-character search space per chunk, which finishes in well under a second. The constraints were never the bottleneck; inverting a hash was, and you sidestep it instead of asking z3 to do the impossible.

Why can z3 read off a keycheck but not invert MD5? Two reasons, both real.

First, the mechanism. z3 solves bit-vector problems by bit-blasting: it expands the word-level math into a giant boolean circuit and hands that to its SAT engine. For linear and bitwise operations the circuit stays small. For a hash function, which is dozens of rounds of mixing built specifically so every output bit depends chaotically on every input bit, the blasted circuit explodes into something enormous and structureless, and the SAT engine has nothing to exploit. Z3's own maintainer co-authored a 2024 paper noting that even plain multiplication makes "the bit-blasting strategy perform poorly" (PolySAT, FMCAD 2024), which is why they keep inventing new techniques. A full cipher or hash is multiplication's problem many times over.

Second, the theory. Deciding bit-vector formulas is, in the worst case, brutally expensive: the satisfiability problem underneath is NP-complete, and with the binary-encoded bit-widths real solvers actually use, the bit-vector logic is formally NExpTime-complete. That is a statement about the worst case, not your average keycheck, but a hash preimage is the worst case, dressed up as a CTF challenge. Researchers who attack reduced-round SHA-256 with SAT solvers are clear that the full function stays out of reach; one such project says breaking full SHA-256 is "probably still infeasible." If a dedicated research tool cannot, your CTF script will not either.

There is also a quieter failure that catches people. z3 is, in the words of one experienced user on Hacker News, "unpredictable and highly dependent on the form in which you pose the instance, often in very unintuitive ways." A formula that is technically satisfiable can still come back unknown because the solver hit your timeout before deciding it. So set a timeout on purpose (s.set('timeout', 10000), in milliseconds) and treat unknown as "I need to model this differently," not "no solution."

Heads up: The tell that you are in z3's dead zone: the check calls a named, standard primitive (MD5, SHA-256, AES, RSA) on your input. The moment you see one of those, stop modeling and switch tools, to hash cracking, AES weaknesses, or RSA attacks. Custom toy ciphers made of XOR and shifts are fair game; real primitives are not.
Takeawayz3 wins where the math is structured and loses where a cryptographer worked hard to remove structure. "Is this linear-ish, or is it a real hash?" is the question that decides whether z3 is your tool or your time sink.

Quick reference

The skeleton, the type rules, and the decision, in one place.

from z3 import *
# 1. Declare unknowns as BitVecs with the binary's real width.
a = [BitVec(f'a{i}', 8) for i in range(N)]
# 2. Constrain: printable bounds + one s.add per decompiled condition.
s = Solver()
for b in a:
s.add(b >= 0x20, ULE(b, 0x7e))
s.add( ... ) # transcribe each check; use If(...) for branches
# 3. Check (respect all three answers).
s.set('timeout', 10000) # ms; unknown != unsat
# 4. Read the model.
if s.check() == sat:
m = s.model()
print(bytes(m[a[i]].as_long() for i in range(N)))
  • Use BitVec, not Int, for anything from a binary. Match the C width (8 / 32 / 64).
  • Unsigned C comparisons need ULT/ULE/UGT/UGE; unsigned division and shift need UDiv/URem/LShR.
  • Branches in the check become If(cond, then, else), never a Python if.
  • Pull bytes out with m[a[i]].as_long(), or slice a wider word with Extract(high, low, bv) (inclusive, bit 0 is the LSB).
  • All solutions: loop while s.check() == sat and s.add a constraint blocking the model you just found.
  • unsat means your constraints are wrong; unknown means timeout or undecidable, not "no answer."
  • Named crypto primitive in the check? Stop. z3 cannot invert real hashes or ciphers.

Related guides on this site:

External primary sources: