Description
This dot-product service mixes secret AES key material into a queryable protocol and assumes the design is safe. Download remote.py, study the oracle, and recover the key material it leaks.
Setup
Download remote.py and read the server source to understand the protocol.
Launch the challenge instance to get the server's host and port.
Install hlextend.
cat remote.pypip install hlextendSolution
Want to try it yourself first?
The guided walkthrough reveals hints one step at a time.
Step 1
Understand the SHA-512 authenticated oracleObservationI noticed that the server computes SHA-512 as H(secret || message) rather than HMAC, which is the classic pattern for a length extension vulnerability and suggested that the authentication layer could be bypassed entirely before attacking the dot product oracle.The server computes the dot product of your input vector with the secret 32-byte AES key, then returns a SHA-512 MAC of the result. The twist: the server only accepts queries where your vector has a valid SHA-512 hash - this is vulnerable to a SHA-512 length extension attack.Learn more
A cryptographic oracle is any system that performs a secret cryptographic operation and returns a result you can observe. In this challenge, the oracle computes
dot(query_vector, secret_key)- a linear function of the secret. Even though the result is wrapped in a SHA-512 MAC, the underlying dot product leaks information about the key through repeated queries with carefully chosen vectors.The authentication bypass using SHA-512 length extension is the critical vulnerability that allows you to submit arbitrary query vectors. SHA-512 (like SHA-256 and SHA-1) uses the Merkle-Damgård construction, where the hash output IS the internal compression state. Given a hash
H(secret || message), an attacker who does not knowsecretcan computeH(secret || message || padding || extension)for anyextension. This is because the hash output provides the internal state needed to continue the computation.In practice, HMAC (Hash-based Message Authentication Code) was specifically designed to be immune to length extension attacks. HMAC computes
H(key XOR opad || H(key XOR ipad || message)), which requires knowledge of the key to extend. Any protocol usingH(key || data)directly as a MAC (without HMAC) is vulnerable to length extension. This challenge is a direct demonstration of why that construction is insecure.Step 2
Forge valid vectors with SHA-512 length extensionObservationI noticed the server requires a valid SHA-512 MAC on every submitted query vector, and since the MAC uses the H(secret || message) construction, the hlextend library can produce accepted forgeries for arbitrary vectors without knowing the secret prefix.Use hlextend (pip install hlextend). It implements length extension correctly so you don't have to maintain your own SHA-512 from-state code. You need: the known hash, the data you want to append, and the original message length in bytes (secret prefix length plus any visible query bytes).pythonpython3 << 'EOF' import hlextend # Inputs you observe / pick: orig_hash = "..." # hex SHA-512 returned by the server orig_length = 32 + 64 # secret_prefix_len + len(visible_query_bytes) # If secret length is unknown, brute over a small range # (typically 16..64 bytes) until the server accepts. # What you want to append (your forged query vector, serialized): new_data = b"\x01" + b"\x00" * 31 # e.g. e_1 = (1,0,...,0) sha = hlextend.new('sha512') forged_message = sha.extend(new_data, b"A" * orig_length) # 'A' is a placeholder # extend() returns the bytes you must send; sha.hexdigest() is the new hash. extended_hash = sha.hexdigest() # Strip the placeholder prefix; everything after orig_length bytes is what # the server hashes against state-from-orig_hash: payload = forged_message[orig_length:] print("Send this:", payload, extended_hash) EOFWhat didn't work first
Tried: Pass the secret prefix length as just the known query length (e.g. 64 for a 64-byte visible vector), ignoring the secret bytes.
hlextend uses original_length to reconstruct the exact internal SHA-512 state at the moment the server finalized its hash. If you omit the secret prefix length, the padding block you insert is wrong and the server recomputes a different hash than the one you forge. The forgery is rejected. You must add the secret prefix length (usually 32 bytes) to the visible query length to get the right value.
Tried: Use hashpump (the C binary) instead of hlextend, but forget to decode its hex output before sending it to the server.
hashpump prints the forged hash as a hex string and the forged message as raw bytes to stdout. If you pass the hex string directly as the message payload, the server receives ASCII hex characters instead of the binary blob, and the vector it parses is garbage. You must send the raw binary from hashpump's second output line (or use hlextend in Python to keep everything in bytes throughout).
Learn more
The original_length parameter is the part most people get wrong. It must be the total number of bytes the server hashed before producing
orig_hash: the length of the secret prefix plus the length of any visible message bytes. If the server usesSHA-512(secret || your_query), thenoriginal_length = len(secret) + len(your_query_bytes). The secret length is often not exposed, so brute-force a small range (16..64 bytes) until the server accepts a forged tag.SHA-512 padding follows the Merkle-Damgård strengthening rule: append
0x80, then zeros, then a 128-bit big-endian length field, padded out to the next 128-byte boundary. Because the hash output IS the internal state, knowing the hash plus the original length lets you start a fresh SHA-512 computation from that state and append anything you like.hlextendhandles the padding bytes you must include in the forged message.hlextend (Python) and hashpump (C) implement this automatically. Real systems that were vulnerable include early AWS Signature v2 and Flickr's API; both moved to HMAC. See hash cracking and length extension for CTF for the broader treatment.
Step 3
Collect 32 equations and solve over Z/256ObservationI noticed the oracle returns dot(query_vector, secret_key) over bytes, which means sending the 32 standard basis vectors e_1 through e_32 via forged MACs yields each key byte directly, making SageMath over Zmod(256) the correct solver to avoid float rounding errors from numpy.Use the identity-matrix shortcut: send e_1 = (1,0,...,0), e_2 = (0,1,0,...,0), and so on. Each response is one byte of the key directly - no linear algebra required. If you must use non-trivial query vectors, solve over Z/256 in SageMath rather than np.linalg.solve (floats accumulate error).python# Easy path: identity-matrix queries, each response = one key byte python3 << 'EOF' key = bytearray(32) for i in range(32): v = bytearray(32) v[i] = 1 # send forged (v, length-extended-hash) to server response = query_server(v) # server returns dot(v, key) = key[i] key[i] = response % 256 print("Recovered key:", bytes(key).hex()) from Crypto.Cipher import AES cipher = AES.new(bytes(key), AES.MODE_CBC, iv=YOUR_IV) print(cipher.decrypt(YOUR_CT)) EOFpython# Non-trivial vectors? Solve over Z/256 in SageMath sage << 'EOF' M = Matrix(Zmod(256), [...]) # 32x32 of your query vectors mod 256 Y = vector(Zmod(256), [...]) # 32 server responses mod 256 key = M.solve_right(Y) print(bytes([int(k) for k in key]).hex()) EOFWhat didn't work first
Tried: Solve the 32x32 linear system with numpy.linalg.solve instead of SageMath over Zmod(256).
numpy.linalg.solve operates in floating-point, so intermediate products accumulate rounding error. Key bytes are integers mod 256, not real numbers, and numpy's solver does not apply the modular reduction. The resulting floats round to wrong integers and the recovered key decrypts to garbage. SageMath's Matrix(Zmod(256), ...).solve_right() does exact arithmetic mod 256 and gives the correct answer.
Tried: Send random non-standard basis vectors hoping the server returns enough equations, but pick vectors that are not linearly independent mod 256.
If two query vectors are linearly dependent over Z/256, the second equation provides no new information about the key bytes and the system is under-determined. For example, v2 = 2*v1 mod 256 is dependent on v1. The matrix M is then non-invertible over Zmod(256) and SageMath's solve_right raises an error. Using the 32 standard basis vectors guarantees M is the identity matrix, which is always invertible, and reduces each query directly to one key byte.
Learn more
The core attack is linear algebra over the integers. Each query vector
v_iproduces a responser_i = v_i · k = sum(v_i[j] * k[j])wherekis the 32-byte secret key. By sending 32 linearly independent query vectors (e.g., standard basis vectorse_1, e_2, ..., e_32), you obtain 32 equations in 32 unknowns. The systemM · k = rcan then be solved forkusing SageMath overZmod(256)--numpy.linalg.solve()operates on floats and accumulates rounding error for byte-range modular arithmetic.The simplest choice of query vectors is the identity matrix: vector
e_ihas a 1 in positioniand 0 everywhere else. The dot product ofe_iwith the key is simplyk[i], so each query directly reveals one byte of the key. This reduces the attack to 32 oracle queries and trivial extraction, without needing to solve a linear system at all. More complex query designs can be used to reduce the number of queries needed by querying multiple key bytes simultaneously.This type of attack - recovering a secret through repeated linear queries to an oracle - is related to lattice attacks on cryptographic systems and appears in attacks on certain symmetric ciphers, secret sharing schemes, and machine learning models (where gradients leak training data). The dot product oracle is a simplified version of the vulnerabilities exploited in model inversion attacks against neural networks.
Linear-algebra recovery (toy with a 4-byte key): Secret key k = (37, 91, 200, 5) Query 1: v1 = (1, 0, 0, 0) -> r1 = 1*37 + 0 + 0 + 0 = 37 Query 2: v2 = (0, 1, 0, 0) -> r2 = 0 + 1*91 + 0 + 0 = 91 Query 3: v3 = (0, 0, 1, 0) -> r3 = 0 + 0 + 1*200 + 0 = 200 Query 4: v4 = (0, 0, 0, 1) -> r4 = 0 + 0 + 0 + 1*5 = 5 With identity-basis queries, each response is one byte of the key. M = I (4x4 identity matrix), Y = (37, 91, 200, 5) solve M*k = Y trivially: k = Y. For non-trivial query vectors, e.g.: v1 = (1, 1, 0, 0) -> r1 = 37 + 91 = 128 v2 = (1, 0, 1, 0) -> r2 = 37 + 200 = 237 v3 = (0, 1, 0, 1) -> r3 = 91 + 5 = 96 v4 = (1, 0, 0, 1) -> r4 = 37 + 5 = 42 M = [[1,1,0,0],[1,0,1,0],[0,1,0,1],[1,0,0,1]] det(M) != 0, so M is invertible. k = M^(-1) * Y = (37, 91, 200, 5) (verifiable via SageMath: Matrix(Zmod(256), M).solve_right(vector(Zmod(256), Y))). For the real challenge with a 32-byte key, you need 32 linearly independent queries. The identity basis is the simplest valid choice, turning the linear algebra into trivial coordinate readout. The forged length-extension hashes let you submit any vectors at all to the otherwise gated oracle.
Interactive tools
- AES DecryptorDecrypt AES-CBC, AES-GCM, AES-CTR, and AES-ECB ciphertexts with a known key and IV. Hex / base64 / UTF-8 inputs, AES-128/192/256, PKCS#7 padding.
Flag
Reveal flag
picoCTF{d0t_pr0duct_l34k_...}
SHA-512 length extension forges valid query vectors accepted by the server. Collect 32 dot product equations (M·K = Y) and solve with SageMath over Zmod(256) (or read each byte directly via identity-matrix queries) to recover the 32-byte AES key.