Secure Dot Product picoCTF 2026 Solution

Published: March 20, 2026

Description

This dot-product service mixes secret AES key material into a queryable protocol and assumes the design is safe. Download remote.py, study the oracle, and recover the key material it leaks.

Download remote.py and read the server source to understand the protocol.

Launch the challenge instance to get the server's host and port.

Install hlextend.

bash
cat remote.py
bash
pip install hlextend

Solution

Want to try it yourself first?

The guided walkthrough reveals hints one step at a time.

Walk me through it
  1. Step 1
    Understand the SHA-512 authenticated oracle
    Observation
    I noticed that the server computes SHA-512 as H(secret || message) rather than HMAC, which is the classic pattern for a length extension vulnerability and suggested that the authentication layer could be bypassed entirely before attacking the dot product oracle.
    The server computes the dot product of your input vector with the secret 32-byte AES key, then returns a SHA-512 MAC of the result. The twist: the server only accepts queries where your vector has a valid SHA-512 hash - this is vulnerable to a SHA-512 length extension attack.
    Learn more

    A cryptographic oracle is any system that performs a secret cryptographic operation and returns a result you can observe. In this challenge, the oracle computes dot(query_vector, secret_key) - a linear function of the secret. Even though the result is wrapped in a SHA-512 MAC, the underlying dot product leaks information about the key through repeated queries with carefully chosen vectors.

    The authentication bypass using SHA-512 length extension is the critical vulnerability that allows you to submit arbitrary query vectors. SHA-512 (like SHA-256 and SHA-1) uses the Merkle-Damgård construction, where the hash output IS the internal compression state. Given a hash H(secret || message), an attacker who does not know secret can compute H(secret || message || padding || extension) for any extension. This is because the hash output provides the internal state needed to continue the computation.

    In practice, HMAC (Hash-based Message Authentication Code) was specifically designed to be immune to length extension attacks. HMAC computes H(key XOR opad || H(key XOR ipad || message)), which requires knowledge of the key to extend. Any protocol using H(key || data) directly as a MAC (without HMAC) is vulnerable to length extension. This challenge is a direct demonstration of why that construction is insecure.

  2. Step 2
    Forge valid vectors with SHA-512 length extension
    Observation
    I noticed the server requires a valid SHA-512 MAC on every submitted query vector, and since the MAC uses the H(secret || message) construction, the hlextend library can produce accepted forgeries for arbitrary vectors without knowing the secret prefix.
    Use hlextend (pip install hlextend). It implements length extension correctly so you don't have to maintain your own SHA-512 from-state code. You need: the known hash, the data you want to append, and the original message length in bytes (secret prefix length plus any visible query bytes).
    python
    python3 << 'EOF'
    import hlextend
    
    # Inputs you observe / pick:
    orig_hash   = "..."     # hex SHA-512 returned by the server
    orig_length = 32 + 64   # secret_prefix_len + len(visible_query_bytes)
                            # If secret length is unknown, brute over a small range
                            # (typically 16..64 bytes) until the server accepts.
    
    # What you want to append (your forged query vector, serialized):
    new_data = b"\x01" + b"\x00" * 31   # e.g. e_1 = (1,0,...,0)
    
    sha = hlextend.new('sha512')
    forged_message = sha.extend(new_data, b"A" * orig_length)  # 'A' is a placeholder
    # extend() returns the bytes you must send; sha.hexdigest() is the new hash.
    extended_hash = sha.hexdigest()
    
    # Strip the placeholder prefix; everything after orig_length bytes is what
    # the server hashes against state-from-orig_hash:
    payload = forged_message[orig_length:]
    print("Send this:", payload, extended_hash)
    EOF
    What didn't work first

    Tried: Pass the secret prefix length as just the known query length (e.g. 64 for a 64-byte visible vector), ignoring the secret bytes.

    hlextend uses original_length to reconstruct the exact internal SHA-512 state at the moment the server finalized its hash. If you omit the secret prefix length, the padding block you insert is wrong and the server recomputes a different hash than the one you forge. The forgery is rejected. You must add the secret prefix length (usually 32 bytes) to the visible query length to get the right value.

    Tried: Use hashpump (the C binary) instead of hlextend, but forget to decode its hex output before sending it to the server.

    hashpump prints the forged hash as a hex string and the forged message as raw bytes to stdout. If you pass the hex string directly as the message payload, the server receives ASCII hex characters instead of the binary blob, and the vector it parses is garbage. You must send the raw binary from hashpump's second output line (or use hlextend in Python to keep everything in bytes throughout).

    Learn more

    The original_length parameter is the part most people get wrong. It must be the total number of bytes the server hashed before producing orig_hash: the length of the secret prefix plus the length of any visible message bytes. If the server uses SHA-512(secret || your_query), then original_length = len(secret) + len(your_query_bytes). The secret length is often not exposed, so brute-force a small range (16..64 bytes) until the server accepts a forged tag.

    SHA-512 padding follows the Merkle-Damgård strengthening rule: append 0x80, then zeros, then a 128-bit big-endian length field, padded out to the next 128-byte boundary. Because the hash output IS the internal state, knowing the hash plus the original length lets you start a fresh SHA-512 computation from that state and append anything you like. hlextend handles the padding bytes you must include in the forged message.

    hlextend (Python) and hashpump (C) implement this automatically. Real systems that were vulnerable include early AWS Signature v2 and Flickr's API; both moved to HMAC. See hash cracking and length extension for CTF for the broader treatment.

  3. Step 3
    Collect 32 equations and solve over Z/256
    Observation
    I noticed the oracle returns dot(query_vector, secret_key) over bytes, which means sending the 32 standard basis vectors e_1 through e_32 via forged MACs yields each key byte directly, making SageMath over Zmod(256) the correct solver to avoid float rounding errors from numpy.
    Use the identity-matrix shortcut: send e_1 = (1,0,...,0), e_2 = (0,1,0,...,0), and so on. Each response is one byte of the key directly - no linear algebra required. If you must use non-trivial query vectors, solve over Z/256 in SageMath rather than np.linalg.solve (floats accumulate error).
    python
    # Easy path: identity-matrix queries, each response = one key byte
    python3 << 'EOF'
    key = bytearray(32)
    for i in range(32):
        v = bytearray(32)
        v[i] = 1
        # send forged (v, length-extended-hash) to server
        response = query_server(v)   # server returns dot(v, key) = key[i]
        key[i] = response % 256
    print("Recovered key:", bytes(key).hex())
    
    from Crypto.Cipher import AES
    cipher = AES.new(bytes(key), AES.MODE_CBC, iv=YOUR_IV)
    print(cipher.decrypt(YOUR_CT))
    EOF
    python
    # Non-trivial vectors? Solve over Z/256 in SageMath
    sage << 'EOF'
    M = Matrix(Zmod(256), [...])    # 32x32 of your query vectors mod 256
    Y = vector(Zmod(256), [...])    # 32 server responses mod 256
    key = M.solve_right(Y)
    print(bytes([int(k) for k in key]).hex())
    EOF
    What didn't work first

    Tried: Solve the 32x32 linear system with numpy.linalg.solve instead of SageMath over Zmod(256).

    numpy.linalg.solve operates in floating-point, so intermediate products accumulate rounding error. Key bytes are integers mod 256, not real numbers, and numpy's solver does not apply the modular reduction. The resulting floats round to wrong integers and the recovered key decrypts to garbage. SageMath's Matrix(Zmod(256), ...).solve_right() does exact arithmetic mod 256 and gives the correct answer.

    Tried: Send random non-standard basis vectors hoping the server returns enough equations, but pick vectors that are not linearly independent mod 256.

    If two query vectors are linearly dependent over Z/256, the second equation provides no new information about the key bytes and the system is under-determined. For example, v2 = 2*v1 mod 256 is dependent on v1. The matrix M is then non-invertible over Zmod(256) and SageMath's solve_right raises an error. Using the 32 standard basis vectors guarantees M is the identity matrix, which is always invertible, and reduces each query directly to one key byte.

    Learn more

    The core attack is linear algebra over the integers. Each query vector v_i produces a response r_i = v_i · k = sum(v_i[j] * k[j]) where k is the 32-byte secret key. By sending 32 linearly independent query vectors (e.g., standard basis vectors e_1, e_2, ..., e_32), you obtain 32 equations in 32 unknowns. The system M · k = r can then be solved for k using SageMath over Zmod(256) -- numpy.linalg.solve() operates on floats and accumulates rounding error for byte-range modular arithmetic.

    The simplest choice of query vectors is the identity matrix: vector e_i has a 1 in position i and 0 everywhere else. The dot product of e_i with the key is simply k[i], so each query directly reveals one byte of the key. This reduces the attack to 32 oracle queries and trivial extraction, without needing to solve a linear system at all. More complex query designs can be used to reduce the number of queries needed by querying multiple key bytes simultaneously.

    This type of attack - recovering a secret through repeated linear queries to an oracle - is related to lattice attacks on cryptographic systems and appears in attacks on certain symmetric ciphers, secret sharing schemes, and machine learning models (where gradients leak training data). The dot product oracle is a simplified version of the vulnerabilities exploited in model inversion attacks against neural networks.

    Linear-algebra recovery (toy with a 4-byte key):
      Secret key  k = (37, 91, 200, 5)
    
    Query 1: v1 = (1, 0, 0, 0)  -> r1 = 1*37 + 0 + 0 + 0   = 37
    Query 2: v2 = (0, 1, 0, 0)  -> r2 = 0 + 1*91 + 0 + 0   = 91
    Query 3: v3 = (0, 0, 1, 0)  -> r3 = 0 + 0 + 1*200 + 0  = 200
    Query 4: v4 = (0, 0, 0, 1)  -> r4 = 0 + 0 + 0 + 1*5    = 5
    
    With identity-basis queries, each response is one byte of the key.
    M = I (4x4 identity matrix), Y = (37, 91, 200, 5)
    solve M*k = Y trivially: k = Y.
    
    For non-trivial query vectors, e.g.:
      v1 = (1, 1, 0, 0) -> r1 = 37 + 91 = 128
      v2 = (1, 0, 1, 0) -> r2 = 37 + 200 = 237
      v3 = (0, 1, 0, 1) -> r3 = 91 + 5 = 96
      v4 = (1, 0, 0, 1) -> r4 = 37 + 5 = 42
    
      M = [[1,1,0,0],[1,0,1,0],[0,1,0,1],[1,0,0,1]]
      det(M) != 0, so M is invertible.
      k = M^(-1) * Y
        = (37, 91, 200, 5)  (verifiable via SageMath: Matrix(Zmod(256), M).solve_right(vector(Zmod(256), Y))).
    
    For the real challenge with a 32-byte key, you need 32 linearly
    independent queries. The identity basis is the simplest valid choice,
    turning the linear algebra into trivial coordinate readout. The
    forged length-extension hashes let you submit any vectors at all
    to the otherwise gated oracle.
Interactive tools
  • AES DecryptorDecrypt AES-CBC, AES-GCM, AES-CTR, and AES-ECB ciphertexts with a known key and IV. Hex / base64 / UTF-8 inputs, AES-128/192/256, PKCS#7 padding.

Flag

Reveal flag

picoCTF{d0t_pr0duct_l34k_...}

SHA-512 length extension forges valid query vectors accepted by the server. Collect 32 dot product equations (M·K = Y) and solve with SageMath over Zmod(256) (or read each byte directly via identity-matrix queries) to recover the 32-byte AES key.

Key takeaway

Any MAC constructed as H(secret || message) using a Merkle-Damgard hash (SHA-1, SHA-256, SHA-512) is forgeable without knowing the secret, because the hash output IS the internal compression state. HMAC was designed specifically to close this gap by double-wrapping the key. Beyond authentication, this challenge illustrates how a linear oracle (one that returns a linear function of a secret) leaks the secret through repeated structured queries, a principle that also underlies lattice attacks and model inversion attacks on machine learning systems.

Related reading

Want more picoCTF 2026 writeups?

Useful tools for Cryptography

What to try next