ARMssembly 2 picoCTF 2021 Solution

Published: April 2, 2026

Description

What integer does this ARM program print with argument 2403814618? Flag format: picoCTF{XXXXXXXX} - 8 lowercase hex characters.

Download chall_2.S.

bash
wget <url>/chall_2.S

Solution

  1. Step 1
    Trace the loop logic
    Observation
    I noticed chall_2.S is an ARM assembly source with a backward branch and a counter register, which suggested the presence of a counted loop where recognizing the iteration pattern would let me collapse the entire computation to a simple arithmetic expression.
    Open chall_2.S. The loop runs N times (where N is the input) and adds 3 each iteration, starting from 0, so the result is input * 3. This assumes an unoptimized (-O0) build; an optimizing compiler would likely have replaced the loop with a single multiply, since the input is only known at run time and cannot be folded to a constant.
    bash
    less chall_2.S
    What didn't work first

    Tried: Run objdump or strings on chall_2.S instead of reading it directly.

    chall_2.S is a plain text assembly source file, not a compiled binary, so objdump reports an unrecognized format error and strings just echoes the mnemonics verbatim. The correct approach is to open it as text with less or cat and trace the logic manually.

    Tried: Assume the loop adds the input value each iteration rather than the constant 3.

    Skimming the add instruction without checking which register holds the literal vs. the argument leads to computing input * input (squaring) instead of input * 3. Reading the immediate operand in the add line and cross-referencing with the initialized accumulator register makes clear that 3 is the stride, not the input.

    Learn more

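    A counted loop in ARM assembly generally looks like this: initialize a counter and an accumulator, add the constant to the accumulator, step the counter, compare the counter against its bound, and branch back while the loop condition still holds. The counter may count up toward the input or down toward zero; either way the body runs the same number of times. Recognizing this as repeated addition lets you replace the loop with a single multiplication.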

    General-purpose registers on AArch64 are 64 bits wide (the x registers), and each one also has a 32-bit view (the w registers). Arithmetic on a w register keeps only the lower 32 bits of the result, which is equivalent to % (2^32). This matters: the mathematical result of 2403814618 * 3 exceeds 2^32 and must be masked.

    Identifying loops in assembly. Look for three characteristic features: an initialization before the loop body, a comparison instruction (cmp or cbz/cbnz), and a backward branch (a branch to a label earlier in the code). When you spot a backward branch, find the loop variable and the accumulator, then count how many times the body runs and what it adds each iteration.

    Why -O0 matters: at -O2 or -O3 the compiler would likely fold this loop into a single multiplication, so there would be no loop left to trace. CTF challenges typically ship -O0 output to keep the assembly readable, but before collapsing the loop to a multiplication, confirm that the backward branch is really there and that the body adds a constant rather than a value that changes between iterations.
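
As a sanity check on the closed form, the loop can be modeled in a few lines of Python. This is a minimal sketch, not a transcription of chall_2.S: the names `simulate_loop`, `closed_form`, and `MASK32`, and the choice to count upward, are assumptions made for illustration. Counting up to the input or down to zero gives the same number of iterations.

```python
MASK32 = 0xFFFFFFFF  # a 32-bit register keeps only these bits


def simulate_loop(n: int, stride: int = 3) -> int:
    """Model a counted loop that adds `stride` to a 32-bit accumulator n times."""
    acc = 0
    counter = 0
    while counter < n:
        acc = (acc + stride) & MASK32  # 32-bit add wraps at 2**32
        counter += 1
    return acc


def closed_form(n: int, stride: int = 3) -> int:
    """Repeated addition collapsed to one multiplication, then wrapped."""
    return (n * stride) & MASK32


# The loop model and the closed form agree on small inputs.
for n in (0, 1, 2, 10, 1000):
    assert simulate_loop(n) == closed_form(n)

print(simulate_loop(10))  # 30
```

Running the model with the real argument would take roughly 2.4 billion iterations in Python, which is why the closed form is the practical route.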

  2. Step 2
    Compute the result with 32-bit truncation
    Observation
    I noticed the loop result is stored in a 32-bit ARM w register and that the input 2403814618 multiplied by 3 yields a 33-bit value, which suggested I needed to mask with 0xffffffff to simulate the hardware truncation before converting to the hex flag.
    Multiply the input by 3 and take the lower 32 bits. The full product 7211443854 truncates to 0xadd5e68e.
    bash
    python3 -c "print(hex((2403814618 * 3) & 0xffffffff))"

    Expected output

    0xadd5e68e
    What didn't work first

    Tried: Omit the & 0xffffffff mask and submit the full decimal or hex of 2403814618 * 3 directly.

    Python gives 7211443854 (0x1ADD5E68E), which is a 33-bit value. Submitting 0x1add5e68e as the flag fails because the ARM w register silently drops bit 32, yielding 0xadd5e68e. The mask is not optional - it is the hardware behavior of a 32-bit write.

    Tried: Use % 2**32 instead of & 0xffffffff and expect a different result.

    In Python the two expressions are equivalent for any integer (negative values included, because % with a positive modulus never returns a negative result), and both return 2916476558 (0xadd5e68e) here, so this is not the source of error. The more common mistake is forgetting any masking entirely, not choosing the wrong masking form.

    Learn more

    & 0xffffffff masks to the lower 32 bits, simulating 32-bit integer overflow. In Python, integers are arbitrarily large by default; you must apply the mask explicitly to match the behavior of C uint32_t or ARM w register arithmetic.

    Watching the truncation in binary. 7211443854 is a 33-bit value:

    7211443854          = 0x1 ADD5E68E (hex)
                        = 1 1010 1101 1101 0101 1110 0110 1000 1110 (binary)
                          ^
                          bit 32 (lost when stored in a 32-bit w register)

    after & 0xffffffff:
                            1010 1101 1101 0101 1110 0110 1000 1110
                        = 0xADD5E68E (lowercase: add5e68e)

    The leading 1 is bit 32 and gets discarded by the w-register write. To automate runs across many ARM challenges in one go, see the pwntools guide for connecting QEMU stdio to a Python harness.
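
The truncation and the flag formatting can be checked together in one short script. This is a sketch under stated assumptions: the helper names `wrapped_product` and `format_flag` are hypothetical, and the zero-padding is there because the flag format asks for exactly 8 hex characters, which plain `hex()` would not guarantee for small results.

```python
MASK32 = 0xFFFFFFFF


def wrapped_product(arg: int, stride: int = 3) -> int:
    """Multiply, then keep the low 32 bits as a w register would."""
    return (arg * stride) & MASK32


def format_flag(value: int) -> str:
    """Render a 32-bit value as 8 lowercase hex digits inside picoCTF{...}."""
    return "picoCTF{" + format(value, "08x") + "}"


full = 2403814618 * 3
print(full, full.bit_length())  # 7211443854 33

low = wrapped_product(2403814618)
print(low, hex(low))            # 2916476558 0xadd5e68e

# Both masking forms give the same result in Python.
assert full % 2**32 == full & MASK32

print(format_flag(low))         # picoCTF{add5e68e}
```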

Flag

picoCTF{add5e68e}

The loop is repeated addition of 3, so multiply the input by 3 and mask to 32 bits to get the wrapped result. For arg=2403814618: (2403814618*3) & 0xFFFFFFFF = 0xadd5e68e.

Key takeaway

Integer overflow is a property of fixed-width registers, not a bug in the assembly itself. When a computation whose mathematical result exceeds 2^32 is stored in a 32-bit register, the processor silently discards the high bits, producing a wrapped result. This behavior is predictable and must be accounted for when emulating assembly in Python or any language with arbitrary-precision integers. The same truncation arithmetic appears in vulnerability research: integer overflow bugs in C arise from the same property, where an attacker supplies a value that causes a size calculation to wrap around to a small number, leading to an undersized allocation.
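
To make the undersized-allocation case concrete, the sketch below models a size calculation done in 32-bit arithmetic. The element count, the element size, and the name `alloc_size_u32` are hypothetical values chosen for illustration; they do not come from any real program.

```python
MASK32 = 0xFFFFFFFF


def alloc_size_u32(count: int, elem_size: int) -> int:
    """Size a program would compute if count * elem_size is done in 32 bits."""
    return (count * elem_size) & MASK32


count = 0x40000001  # hypothetical attacker-supplied element count
elem_size = 4       # hypothetical 4-byte elements

true_size = count * elem_size
wrapped = alloc_size_u32(count, elem_size)

print(true_size)  # 4294967300
print(wrapped)    # 4
```

The program would request 4 bytes while later code still believes it has room for more than a billion elements, which is the same wraparound that turns 7211443854 into 0xadd5e68e in this challenge.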
