Diffie-Hellman and Discrete Log for CTF: Breaking Weak Key Exchanges

You found a Diffie-Hellman exchange. Now what?

A challenge hands you a prime p, a generator g, and two public values A and B. Somewhere there is a ciphertext encrypted with a key derived from the shared secret. Unlike RSA, there is no public modulus to factor and no exponent to invert. The entire attack reduces to one question: can you solve the discrete logarithm, that is, recover the private exponent a from A = g^a mod p?

For a well-chosen 2048-bit safe prime, you cannot, and you should stop looking for a math break and start looking for an implementation bug. But CTF authors hand you weak parameters constantly. Here is the triage order, fastest move first:

Is p small? If it fits in 40 to 60 bits, baby-step giant-step (BSGS) solves the log in seconds. Under ~30 bits, plain brute force does.
Is p - 1 smooth? Factor it. If it splits into small prime powers, Pohlig-Hellman breaks even a large p quickly.
Just ask the tool. sympy.discrete_log(p, A, g) or Sage's discrete_log internally pick BSGS, Pollard rho, or Pohlig-Hellman for you. Try this before writing anything by hand.
Bad subgroup? If g has small order, or the public value sits in a tiny subgroup, the secret has very few possible values regardless of how big p is.

Recover a, compute s = B^a mod p, derive the symmetric key the way the challenge does (almost always a hash of s), and decrypt. The rest of this post is each of those steps with working Python.

picoCTF Diffie-Hellman challenges

Shared Secrets Custom encryption

Both hand you a DH exchange where a private exponent is exposed instead of hidden, so you skip the discrete log entirely and recompute the shared secret directly. They are the gentlest way to see the transcript-to-key pipeline before you meet a challenge that makes you actually solve the log.

Note: This is a sibling to the RSA Attacks guide, the Elliptic Curves guide, and the Stream Ciphers guide. Discrete log is the same hard problem that secures elliptic-curve crypto, just over a different group, so the attack ideas here transfer almost directly to ECC.

What does Diffie-Hellman actually compute?

Diffie and Hellman published the key exchange in 1976 as the first practical way for two parties to agree on a shared secret over a public channel (New Directions in Cryptography, 1976). The mechanics are short:

Public, agreed in the open:
  p   a large prime modulus
  g   a generator (base) modulo p
Alice picks secret a, sends  A = g^a mod p
Bob   picks secret b, sends  B = g^b mod p
Alice computes  s = B^a mod p = g^(b*a) mod p
Bob   computes  s = A^b mod p = g^(a*b) mod p
Both now share  s = g^(a*b) mod p

The shared secret s = g^(a*b) mod p falls out because exponentiation commutes: raising g^b to the a is the same as raising g^a to the b. Nobody ever transmits a, b, or s. That shared s is then almost always hashed down into an AES or ChaCha key, because s itself is a large structured integer, not uniform key material.

DH security rests on a single asymmetry: computing g^a mod p is cheap, but recovering a from the result is the discrete logarithm problem.

What does the attacker on the wire actually have?

You are the eavesdropper. The public channel leaks everything except the private exponents. Concretely, a CTF DH dump gives you:

p = 0x...        # the prime modulus      (PUBLIC)
g = 2            # the generator          (PUBLIC)
A = g^a mod p    # Alice's public value   (PUBLIC, but a is secret)
B = g^b mod p    # Bob's public value     (PUBLIC, but b is secret)
ct = AES_encrypt(flag, key=H(s))   # the prize
You DO NOT have: a, b, or s = g^(a*b) mod p

To get s you need exactly one of the private exponents. Recover a and you compute s = B^a mod p; recover b and you compute s = A^b mod p. Either works, so attack whichever public value sits in the weaker structure. You never need both.

Tip: DH with no authentication is also vulnerable to a man-in-the-middle, where the attacker runs two separate exchanges and relays. That is an interactive-protocol attack, not a math attack, and it only shows up in CTFs that let you talk to a live service. The rest of this post is the offline math break: you have a transcript and you must solve the log.

The discrete logarithm problem, and when it is easy

Given g, p, and A = g^a mod p, the discrete logarithm problem (DLP) asks you to find a. Over the integers, taking a log is easy. Modulo a prime the values wrap around unpredictably, so there is no notion of "bigger means larger exponent" to binary-search on. For a strong prime the best known general algorithm is the number field sieve, which is infeasible at 2048 bits.

But the difficulty is not about the size of p alone. It is about the size of the largest prime factor of the group order, which is p - 1 for the full multiplicative group. The cost of every practical DLP algorithm is tied to that largest factor. This single fact drives the whole attack surface:

When DLP is EASY (CTF territory):
  - p is small (fits in 40 to 60 bits)        -> BSGS
  - p - 1 has only small prime factors        -> Pohlig-Hellman
  - g generates a small subgroup              -> brute the subgroup
  - a itself is small (poor key generation)   -> brute / Pollard kangaroo
When DLP is HARD (look for a bug instead):
  - p is a 2048-bit safe prime, p = 2q + 1, q prime
  - g generates the full prime-order subgroup

Key insight: A "safe prime" is one where p = 2q + 1 with q also prime. Then p - 1 = 2q has a huge prime factor q, which is exactly what makes Pohlig-Hellman useless. If a challenge uses a safe prime and a proper generator, the intended break is somewhere else: reused nonces, a small private exponent, a side channel, or a parameter you have not checked yet.

So the first thing you do with any DH challenge is factor p - 1 and look at its largest prime factor. That number, not p, tells you whether you are about to win in seconds or need a different idea entirely.

Small or weak prime: brute force and baby-step giant-step

When p is small, the simplest attack is to walk the powers of g until you hit A. That is brute force, and it costs up to p multiplications. It is fine up to roughly 2^30:

def brute_dlog(g, A, p):
    cur = 1                 # g^0
    for a in range(p):
        if cur == A:
            return a
        cur = (cur * g) % p
    raise ValueError('no solution in range')

Baby-step giant-step is the standard upgrade. It trades memory for time, solving the log in about 2 * sqrt(n) operations where n is the order of g. A 50-bit modulus drops from 2^50 steps to roughly 2^25, which is a few seconds in Python. The idea: write the unknown exponent as a = i*m + j with m = ceil(sqrt(n)). Precompute every g^j (the baby steps) into a table, then take giant strides of size m until one lands in the table.

from math import isqrt
def bsgs(g, A, p, n=None):
    # solve g^a = A (mod p); n is the order of g, defaults to p-1
    if n is None:
        n = p - 1
    m = isqrt(n) + 1
    # baby steps: table of g^j for j in [0, m)
    table = {}
    cur = 1
    for j in range(m):
        table[cur] = j
        cur = (cur * g) % p
    # giant stride factor = g^(-m) mod p
    g_inv_m = pow(g, (p - 1 - m) % (p - 1), p)   # g^(-m) when ord = p-1
    # giant steps: A * (g^-m)^i, look for a baby step match
    gamma = A
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = (gamma * g_inv_m) % p
    raise ValueError('no solution found')
# usage
a = bsgs(g, A, p)
assert pow(g, a, p) == A
s = pow(B, a, p)        # the shared secret

Warning: The inverse step g^(-m) above uses ord(g) = p - 1. If g generates a smaller subgroup of order n, replace the exponent with (n - m) % n and compute the modular inverse against that order instead. Getting the order wrong is the most common reason a hand-rolled BSGS silently returns garbage.

BSGS needs O(sqrt(n)) memory for the table, so it caps out around 60-bit orders before the dictionary eats your RAM. Past that, Pollard rho for logarithms uses the same square-root time with constant memory, and every serious library implements it. You will rarely write rho by hand in a CTF; you will call the library in the next section.

Smooth p-1: the Pohlig-Hellman attack

This is the attack that breaks a large prime, and it is the one CTF authors set up most often by mistake. Pohlig and Hellman observed that if the group order factors into small primes, you can solve the discrete log in each small prime-power subgroup separately and glue the answers together with the Chinese Remainder Theorem (Pohlig and Hellman, 1978).

Suppose the order of g is n = p - 1 = q1^e1 * q2^e2 * ... * qk^ek. For each prime power qi^ei you project the problem into the subgroup of that size, where the log only has qi^ei possible values, and solve it with BSGS. The total cost is dominated by the largest prime factor, not by p:

Cost of Pohlig-Hellman ~ sum over factors of  ei * sqrt(qi)
Example: p-1 factors as  2^3 * 3 * 5 * 7 * 11 * ... (all factors < 2^20)
  -> each subgroup log is trivial, full DLP solved in well under a second,
     even when p itself is 512 or 1024 bits.
Example: p = 2q + 1 with q a 1023-bit prime
  -> largest factor is q, so cost ~ sqrt(q) ~ 2^511. Infeasible. Stop.

You almost never implement Pohlig-Hellman by hand, because sympy and Sage apply it automatically once they factor the order. The workflow is: factor p - 1 yourself first to decide whether the attack is viable, then let the library do the projection and CRT:

from sympy import factorint
factors = factorint(p - 1)
print(factors)             # {2: 1, 3: 2, 1009: 1, 7919: 1, ...}
print('largest factor:', max(factors))
# decision rule: if max(factors) is below ~2^50, Pohlig-Hellman wins.
# Then just call the high-level solver, which uses it internally:
from sympy.ntheory.residue_ntheory import discrete_log
a = discrete_log(p, A, g)
assert pow(g, a, p) == A

Key insight: Pohlig-Hellman is why "the prime is 2048 bits" tells you nothing on its own. A 2048-bit prime whose p - 1 is the product of dozens of 20-bit factors is as weak as a toy. Always factor p - 1 before you conclude a DH instance is secure. If factorint hangs, the prime is probably safe and the break is elsewhere.

Let the tools solve it: sympy.discrete_log and Sage

Before writing any custom solver, try the batteries-included call. SymPy's discrete_log is a dispatcher: it reduces the problem over a composite-order group to prime-order subproblems and picks trial multiplication, BSGS, Pollard rho, or Pohlig-Hellman depending on the structure (SymPy ntheory documentation). One line covers most CTF DH instances:

from sympy.ntheory.residue_ntheory import discrete_log
# discrete_log(modulus, value, base) solves  base^x = value (mod modulus)
a = discrete_log(p, A, g)
print('recovered a =', a)
assert pow(g, a, p) == A
s = pow(B, a, p)        # shared secret g^(a*b) mod p
print('shared secret =', s)

Mind the argument order. discrete_log(n, a, b) solves b^x = a (mod n), so the modulus comes first, the known public value second, and the base last. Swapping the last two is the most common mistake and produces a wrong or never-returning call.

When SymPy is too slow (it is pure Python), reach for Sage, whose discrete_log is backed by PARI and handles much larger smooth orders. Sage also lets you build the exact group and pass a known order, which avoids re-factoring:

# Sage
p = ...; g = ...; A = ...
F = GF(p)                      # the finite field Z/pZ
a = discrete_log(F(A), F(g))   # uses Pohlig-Hellman + BSGS/rho automatically
print(a)
# if you already know the order of g, pass it to skip factoring:
a = discrete_log(F(A), F(g), ord=order_of_g)

Tip: If the library spins forever, it is telling you the largest prime factor of the order is big. Interrupt it, run factorint(p - 1) (or Sage's factor(p - 1)) with a timeout, and read the largest factor. That number is your go/no-go signal. No factor under ~2^50 as the largest means the math break is closed and you should pivot to the encryption layer or key derivation for a bug.

From shared secret to AES key: the pattern that finishes the challenge

Solving the log is the hard part, but it is not the flag. The shared secret s is a big integer, and the challenge encrypts the flag with a symmetric key derived from s. In nearly every CTF that derivation is a plain hash, so once you have s you reproduce the server's key exactly. The canonical pattern:

from hashlib import sha256
from Crypto.Cipher import AES
from Crypto.Util.Padding import unpad
# 1) recover a from the discrete log (any method above)
a = discrete_log(p, A, g)
# 2) compute the shared secret exactly as the parties did
s = pow(B, a, p)
# 3) derive the key the same way the challenge does.
#    Read the challenge source for the EXACT recipe. Common ones:
key = sha256(str(s).encode()).digest()          # hash of the decimal string
# key = sha256(long_to_bytes(s)).digest()       # hash of the raw bytes
# key = long_to_bytes(s)[:16]                    # raw truncation (lazy authors)
# 4) decrypt. ECB and CBC are both common; match the mode and IV.
ct  = bytes.fromhex('....')
iv  = bytes.fromhex('....')      # present only for CBC/CTR
pt  = unpad(AES.new(key, AES.MODE_CBC, iv).decrypt(ct), 16)
print(pt)

The two things that trip people here are both about matching the server byte for byte. First, the key derivation: sha256(str(s)) and sha256(long_to_bytes(s)) produce different keys, so read the challenge script and copy its exact expression. Second, the cipher mode: ECB needs no IV, CBC and CTR do, and using the wrong one yields garbage even with the right key.

Note: If the challenge derives the key from s but you only recovered a value in asubgroup (see the next section), your s may be correct modulo a small factor but not equal to the true secret. When decryption fails despite a verified log, suspect a small-subgroup situation and recover the full exponent before deriving the key.

Parameter mistakes: small subgroups and a generator of small order

Even with a large, properly safe prime, the exchange breaks if the generator is chosen badly. The security of DH depends on g generating a large prime-order subgroup. When it does not, the secret lives in a space far smaller than p suggests.

Generator of small order. If g has order n much smaller than p - 1, then A = g^a only depends on a mod n. The effective key space is n, not p. Check the order before anything else:

from sympy import factorint, n_order
ord_g = n_order(g, p)            # multiplicative order of g mod p
print('order of g:', ord_g)
print('p - 1     :', p - 1)
# if ord_g is small, brute force a mod ord_g directly:
if ord_g < 2**40:
    a = bsgs(g, A, p, n=ord_g)   # pass the real order, not p-1
    s = pow(B, a, p)

Small-subgroup confinement. A subtler variant: p - 1 has a small factor q, and an attacker (or a careless author) sends a public value of order q instead of full order. The other party's response is then confined to a subgroup of size q, leaking b mod q. Stack several small factors and you reconstruct b with CRT. This is the active version of Pohlig-Hellman and it shows up in interactive DH challenges where you control one side of the exchange.

A 2048-bit prime is not a security claim. The real question is the order of the generator and the largest prime factor of that order.

Small private exponent. Some challenges generate a from a tiny range to keep the math "fast." If a < 2^40, brute force or BSGS over the exponent range recovers it no matter how strong p is, because you are searching the exponent, not the modulus. Always ask whether the secret itself might be small before assuming you need a full discrete-log solve.

Quick reference

Triage order for any DH or discrete-log challenge

Read the source. Note p, g, A, B, the key derivation, and the cipher mode.
factorint(p - 1). Read the largest prime factor. That is your difficulty.
n_order(g, p). If the generator's order is small, the key space is tiny.
Largest factor below ~2^50 or order small? Call discrete_log(p, A, g) (SymPy) or Sage. It applies BSGS and Pohlig-Hellman for you.
Need it by hand or faster? BSGS for orders up to ~2^60; Pollard rho or Sage past that.
Got a? Compute s = pow(B, a, p), derive the key exactly as the challenge does, decrypt.
Nothing works and p is a safe prime? The math is closed. Look at nonce reuse, a small private exponent, or the encryption layer.

One-liners worth memorizing

# solve base^x = value (mod modulus)
from sympy.ntheory.residue_ntheory import discrete_log
a = discrete_log(p, A, g)           # NOTE: modulus, value, base
# is this even attackable? read the largest factor
from sympy import factorint, n_order
max(factorint(p - 1))               # largest prime factor of the order
n_order(g, p)                        # actual order of the generator
# finish it
s = pow(B, a, p)                     # shared secret
key = sha256(str(s).encode()).digest()   # match the challenge recipe

Diffie-Hellman has no modulus to factor and no exponent to invert, so a DH challenge is never about DH. It is always a discrete-log problem in disguise, and your only job is to decide which weakness, small prime, smooth order, or bad generator, made that log easy.