Description
What integer does this ARM program print with argument 2403814618? Flag format: picoCTF{XXXXXXXX} - 8 lowercase hex characters.
Setup
Download chall_2.S.
wget <url>/chall_2.S
Solution
Step 1
Trace the loop logic
Observation
I noticed chall_2.S is an ARM assembly source with a backward branch and a counter register. That suggested a counted loop, and recognizing the iteration pattern would let me collapse the entire computation into a simple arithmetic expression.
Open chall_2.S. The loop runs N times (where N is the input) and adds 3 each iteration, starting from 0. That's just input * 3. This assumes -O0 (no optimizer); an optimized build would have folded the loop into a single mul or a constant.
bash
less chall_2.S
What didn't work first
Tried: Run objdump or strings on chall_2.S instead of reading it directly.
chall_2.S is a plain text assembly source file, not a compiled binary, so objdump reports an unrecognized format error and strings just echoes the mnemonics verbatim. The correct approach is to open it as text with less or cat and trace the logic manually.
Tried: Assume the loop adds the input value each iteration rather than the constant 3.
Skimming the add instruction without checking which register holds the literal vs. the argument leads to computing input * input (squaring) instead of input * 3. Reading the immediate operand in the add line and cross-referencing with the initialized accumulator register makes clear that 3 is the stride, not the input.
Learn more
The loop pattern in ARM assembly looks like: initialize a counter and accumulator, compare counter to zero, add the constant, decrement the counter, and branch back if not zero. Recognizing this as repeated addition lets you replace the loop with a single multiplication.
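The collapse from loop to multiplication can be checked with a small simulation. This is a minimal Python sketch, not a transcription of chall_2.S: the function name simulate_loop and its variable names are hypothetical, and it assumes the loop shape described above (accumulator starts at 0, 3 is added per iteration, one iteration per unit of input).

```python
def simulate_loop(n: int, stride: int = 3) -> int:
    """Hypothetical model of the counted loop: add `stride` to an accumulator n times."""
    acc = 0        # accumulator starts at 0
    counter = n    # loop counter starts at the input value
    while counter != 0:
        acc += stride
        counter -= 1
    return acc


# Small inputs only: the real argument would need about 2.4 billion iterations.
for n in (0, 1, 7, 1000):
    assert simulate_loop(n) == n * 3

print(simulate_loop(1000))  # 3000
```

Because the loop and the multiplication agree for every small input, the real argument can be handled with one multiplication instead of running the loop.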
Working registers on AArch64 are 64 bits wide, but when the result is stored in a w register (32-bit), it is automatically truncated to the lower 32 bits, equivalent to % (2^32). This matters: the mathematical result of 2403814618 * 3 exceeds 2^32 and must be masked.
Identifying loops in assembly: look for three characteristic features: an initialization before the loop body, a comparison instruction (cmp or cbz/cbnz), and a backward branch (a branch to a label earlier in the code). When you spot a backward branch, find the loop variable and the accumulator, then count how many times the body runs and what it adds each iteration.
Why -O0 matters: at -O2 or -O3 the compiler folds this loop into a single multiplication and the verbatim-loop strategy fails. CTF challenges typically ship -O0 binaries to keep the assembly readable, but always sanity-check by skimming for a long loop body before assuming it's a constant-stride accumulator.
Step 2
Compute the result with 32-bit truncation
Observation
I noticed the loop result is stored in a 32-bit ARM w register and that the input 2403814618 multiplied by 3 yields a 33-bit value, which suggested I needed to mask with 0xffffffff to simulate the hardware truncation before converting to the hex flag.
Multiply the input by 3 and take the lower 32 bits. The full product 7211443854 truncates to 0xadd5e68e.
python
python3 -c "print(hex((2403814618 * 3) & 0xffffffff))"
Expected output
0xadd5e68e
What didn't work first
Tried: Omit the & 0xffffffff mask and submit the full decimal or hex of 2403814618 * 3 directly.
Python gives 7211443854 (0x1ADD5E68E), which is a 33-bit value. Submitting 0x1add5e68e as the flag fails because the ARM w register silently drops bit 32, yielding 0xadd5e68e. The mask is not optional - it is the hardware behavior of a 32-bit write.
Tried: Use % 2**32 instead of & 0xffffffff and expect a different result.
For positive integers both expressions are equivalent and both return 2916476558 (0xadd5e68e), so this is not the source of error. The more common mistake is forgetting any masking entirely, not choosing the wrong masking form.
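The equivalence of the two masking forms is easy to confirm directly. The sketch below uses only the numbers quoted in this writeup; the variable names (ARG, product, masked_and, masked_mod) are illustrative choices.

```python
ARG = 2403814618           # challenge argument
product = ARG * 3          # full-precision product (Python ints do not overflow)

masked_and = product & 0xFFFFFFFF   # keep the low 32 bits with a bit mask
masked_mod = product % 2**32        # keep the low 32 bits with a modulus

print(product)          # 7211443854
print(masked_and)       # 2916476558
print(hex(masked_and))  # 0xadd5e68e
assert masked_and == masked_mod
```

Either form works here; the bit mask just mirrors more literally what a 32-bit register write does.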
Learn more
& 0xffffffff masks to the lower 32 bits, simulating 32-bit integer overflow. In Python, integers are arbitrarily large by default; you must apply the mask explicitly to match the behavior of C uint32_t or ARM w register arithmetic.
Watching the truncation in binary. 7211443854 is a 33-bit value:
7211443854 = 0x1ADD5E68E (hex)
           = 1 1010 1101 1101 0101 1110 0110 1000 1110 (binary)
             ^ bit 32 (lost when stored in a 32-bit w register)
after & 0xffffffff:
               1010 1101 1101 0101 1110 0110 1000 1110
           = 0xADD5E68E (lowercase: add5e68e)
The leading 1 is bit 32 and gets discarded by the w-register write. To automate runs across many ARM challenges in one go, see the pwntools guide for connecting QEMU stdio to a Python harness.
Flag
picoCTF{add5e68e}
The loop is repeated addition, so multiply the input by 3 and mask to 32 bits to get the wrapped result. For arg=2403814618: (2403814618*3) & 0xFFFFFFFF = 0xadd5e68e.
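To go from the masked value to the submission string, the result has to be rendered as exactly 8 lowercase hex characters. The sketch below shows one way to do that; the helper name make_flag is hypothetical, and the zero-padding is there for results whose top hex digits would otherwise be dropped.

```python
def make_flag(arg: int) -> str:
    """Hypothetical helper: multiply by 3, wrap to 32 bits, format as the flag."""
    result = (arg * 3) & 0xFFFFFFFF                   # emulate the 32-bit w-register write
    return "picoCTF{" + format(result, "08x") + "}"   # 8 lowercase hex digits, zero-padded


print(make_flag(2403814618))  # picoCTF{add5e68e}
```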