waves over lambda picoCTF 2019 Solution

Published: April 2, 2026

Description

Decode this substitution cipher. Connect to the server to get the ciphertext.

Connect to the challenge server.

bash
nc <HOST> <PORT_FROM_INSTANCE>
  1. Step 1: Get the ciphertext
    Connect with nc. The server prints a paragraph of ciphertext encrypted with a simple substitution cipher (each letter is replaced by a different letter consistently throughout the text). Copy the entire ciphertext.
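
If you would rather script the connection than copy from a terminal, a minimal Python sketch follows. It assumes the server prints the ciphertext and then either closes the connection or goes quiet. The helper names `read_until_quiet` and `fetch_ciphertext` and the `timeout` parameter are illustrative, not part of the challenge.

```python
import socket


def read_until_quiet(sock: socket.socket, timeout: float = 2.0) -> str:
    """Read from sock until the peer closes or nothing arrives for `timeout` seconds."""
    sock.settimeout(timeout)
    chunks = []
    while True:
        try:
            data = sock.recv(4096)
        except socket.timeout:
            break  # server went quiet; assume the output is complete
        if not data:
            break  # server closed the connection
        chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def fetch_ciphertext(host: str, port: int, timeout: float = 2.0) -> str:
    """Connect the way `nc host port` does and return everything the server prints."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        return read_until_quiet(sock, timeout)
```

Call `fetch_ciphertext` with the host and port shown for your challenge instance, then paste or pipe the result into the analysis steps.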

    Why the keyspace size is misleading. A monoalphabetic substitution cipher maps each plaintext letter to a unique cipher letter. The total keyspace is 26! ≈ 4 * 10^26 keys, far too large to brute force. But the cipher leaks two structural properties: (1) letter equality is preserved, so identical plaintext letters always map to the same ciphertext letter, and (2) letter frequency is preserved, so if 'e' makes up 12.7% of the plaintext, whatever letter 'e' maps to makes up 12.7% of the ciphertext.
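
Both points can be checked in a few lines. The sketch below computes the keyspace size and shows that a random substitution leaves the profile of letter counts unchanged. The helpers `random_key` and `encrypt` and the sample sentence are made up for illustration.

```python
import math
import random
import string
from collections import Counter


def random_key(rng: random.Random) -> dict:
    """Return a random monoalphabetic key: plaintext letter -> cipher letter."""
    shuffled = list(string.ascii_lowercase)
    rng.shuffle(shuffled)  # a permutation; a letter may happen to map to itself
    return dict(zip(string.ascii_lowercase, shuffled))


def encrypt(plaintext: str, key: dict) -> str:
    """Substitute lowercase letters; leave spaces and punctuation untouched."""
    return "".join(key.get(ch, ch) for ch in plaintext.lower())


print(f"26! = {math.factorial(26):.3e}")  # size of the keyspace

key = random_key(random.Random(0))  # fixed seed so the demo is repeatable
plain = "the sheep sees the tree"
cipher = encrypt(plain, key)

# The sorted count profiles are identical: only the labels have changed.
plain_counts = sorted(Counter(c for c in plain if c.isalpha()).values())
cipher_counts = sorted(Counter(c for c in cipher if c.isalpha()).values())
print(cipher)
print(plain_counts == cipher_counts)
```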

    English letter frequencies (long text).

    E  12.7%   T  9.1%   A  8.2%   O  7.5%   I  7.0%
    N  6.7%    S  6.3%   H  6.1%   R  6.0%   D  4.3%
    L  4.0%    C  2.8%   U  2.8%   M  2.4%   W  2.4%
    F  2.2%    G  2.0%   Y  2.0%   P  1.9%   B  1.5%
    V  1.0%    K  0.8%   J  0.15%  X  0.15%  Q  0.10%   Z  0.07%

    Index of Coincidence (IoC). A statistical measure: probability that two randomly chosen letters from the text are equal. For English, IoC ~= 0.0667; for uniform random text (26 equally likely letters), IoC = 1/26 ~= 0.0385. A monoalphabetic substitution preserves IoC at the English value (since frequencies are merely relabeled). A polyalphabetic cipher (Vigenere) drops IoC toward 0.0385 because it flattens the distribution. Computing IoC of the ciphertext is the first sanity check that this is a simple substitution.
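
A minimal sketch of the IoC calculation, using the standard formula sum(n_i * (n_i - 1)) / (N * (N - 1)) over the letter counts. The function name `index_of_coincidence` is illustrative.

```python
from collections import Counter


def index_of_coincidence(text: str) -> float:
    """Probability that two distinct randomly chosen letters of text are equal."""
    letters = [c for c in text.lower() if "a" <= c <= "z"]
    n = len(letters)
    if n < 2:
        return 0.0  # not enough letters to form a pair
    counts = Counter(letters)
    return sum(k * (k - 1) for k in counts.values()) / (n * (n - 1))
```

A result near 0.066 on the ciphertext is consistent with a simple substitution, while a value closer to 0.038 points toward a polyalphabetic cipher. Short texts give noisy estimates, so treat the number as a hint rather than proof.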

    Bigram and trigram analysis sharpen the attack. The most common English bigrams are TH, HE, IN, ER, AN, RE; the most common trigrams are THE, AND, ING, HER. The single most common word is THE (~3% of all words). Once you guess that e maps to cipher letter X and t maps to cipher letter Y from frequency, look at three-letter cipher words starting with Y and ending with X: those are likely THE, which fixes the middle cipher letter as the image of h in one shot.
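
The THE search described above can be sketched as follows, assuming word spacing is preserved in the ciphertext. `the_candidates` and its parameters `t_guess` and `e_guess` are hypothetical names for your current guesses.

```python
import re
from collections import Counter


def the_candidates(ciphertext: str, t_guess: str, e_guess: str) -> list:
    """Rank three-letter cipher words shaped like t_guess ? e_guess by frequency."""
    words = re.findall(r"[a-z]+", ciphertext.lower())
    hits = Counter(
        w for w in words
        if len(w) == 3 and w[0] == t_guess and w[2] == e_guess
    )
    return hits.most_common()
```

If one candidate clearly dominates, its middle letter is a strong guess for h. If several candidates tie, one of the two frequency guesses is probably wrong.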

  2. Step 2: Perform frequency analysis
    Count letter frequencies in the ciphertext. The most frequent cipher letter probably stands for E, the second most frequent probably stands for T, and so on. Build a first-guess substitution key from these correspondences; only the top few matches tend to be reliable, so expect to correct the rest by hand.
    python
    python3 << 'EOF'
    from collections import Counter
    
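    ciphertext = """PASTE CIPHERTEXT HERE"""
    # Keep letters only and fold case so 'A' and 'a' are counted together.
    ct_letters = [c.lower() for c in ciphertext if c.isalpha()]
    total = len(ct_letters)
    freq = Counter(ct_letters).most_common()  # sorted most to least frequent
    print("Cipher letter frequencies:")
    for letter, count in freq:
        print(f"  {letter}: {count} ({count / total:.1%})")
    # Same order as the frequency table in Step 1.
    print("\nExpected English order: e t a o i n s h r d l c u m w f g y p b v k j x q z")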
    EOF
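
Turning the counts into a first-guess key can be sketched like this. The letter order is taken from the frequency table in Step 1; `first_guess_key` and `ENGLISH_ORDER` are illustrative names.

```python
from collections import Counter

# Same ordering as the frequency table in Step 1.
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def first_guess_key(ciphertext: str) -> dict:
    """Map cipher letters to plaintext letters by matching frequency ranks."""
    counts = Counter(c for c in ciphertext.lower() if "a" <= c <= "z")
    ranked = [letter for letter, _ in counts.most_common()]
    # zip stops at the shorter sequence, so unseen cipher letters stay unmapped.
    return dict(zip(ranked, ENGLISH_ORDER))
```

Rank matching is only a starting point. On a single paragraph the lower-ranked letters are often swapped with their neighbours, which is why the crib and n-gram techniques matter.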

    Crib bootstrapping. A crib is a piece of plaintext you know appears in the ciphertext. If the flag is printed in the picoCTF{...} format, the seven characters p, i, c, o, C, T, F sit immediately before the opening brace. Find any {...} pattern in the ciphertext: the seven cipher letters just before the brace give you the mappings for p, i, c, o, t and f (six distinct letters once case is ignored; case is sometimes preserved, sometimes flattened). Lock those substitutions, propagate them throughout the text, and let the partial decryption guide the rest.
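
A minimal sketch of crib locking, assuming the flag is wrapped as picoCTF{...} and the braces themselves are left unencrypted. `crib_mapping` is a hypothetical helper; it returns an empty dict when no consistent match is found.

```python
import re


def crib_mapping(ciphertext: str, crib: str = "picoCTF") -> dict:
    """Derive cipher->plain letter pairs from the letters just before a '{'."""
    pattern = r"([A-Za-z]{%d})\{" % len(crib)
    for match in re.finditer(pattern, ciphertext):
        candidate = match.group(1).lower()
        mapping = {}
        consistent = True
        for c, p in zip(candidate, crib.lower()):
            # A cipher letter seen twice must decode to the same plaintext letter.
            if mapping.setdefault(c, p) != p:
                consistent = False
                break
        # Distinct cipher letters must also decode to distinct plaintext letters.
        if consistent and len(set(mapping.values())) == len(mapping):
            return mapping
    return {}
```

The two consistency checks use the repeated c in the crib: the third and fifth cipher letters before the brace must be equal, which filters out unrelated text that happens to precede a brace.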

    Hill-climbing solver (quipqiup style). Start with a random key. Compute a fitness score: the log-likelihood of the candidate decryption under English n-gram probabilities (typically quadgram statistics from a corpus). Repeatedly swap two letters in the key and keep the swap if fitness improves; occasionally accepting a worse swap (simulated annealing) helps escape local optima. For a paragraph of text the key usually converges after roughly 5000-10000 iterations, though a run can stall and need a restart from a new random key. quipqiup.com automates this kind of search; expect a clean answer in under 10 seconds for a paragraph of text.

    fitness(text) = sum over each 4-letter window:
                      log P(quadgram | English)
    
    P(THER | English) ~ 10^-2
    P(QXJZ | English) ~ 10^-9   <- garbage
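
The scoring and swap loop above can be sketched as follows. This is a plain hill climber under stated assumptions, not quipqiup's actual implementation: it omits simulated annealing and random restarts, and it expects you to supply a large English reference text as `corpus` (no corpus is bundled). All function names are illustrative.

```python
import math
import random
import string
from collections import Counter

ALPHABET = string.ascii_lowercase


def build_quadgram_model(corpus: str):
    """Return (log10-probability table, floor) estimated from a reference corpus."""
    letters = "".join(c for c in corpus.lower() if c in ALPHABET)
    counts = Counter(letters[i:i + 4] for i in range(len(letters) - 3))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("corpus needs at least four letters")
    table = {q: math.log10(n / total) for q, n in counts.items()}
    floor = math.log10(0.01 / total)  # penalty for quadgrams never seen
    return table, floor


def fitness(text: str, table: dict, floor: float) -> float:
    """Sum of log10 P(quadgram) over every 4-letter window of the letters in text."""
    letters = "".join(c for c in text.lower() if c in ALPHABET)
    return sum(table.get(letters[i:i + 4], floor) for i in range(len(letters) - 3))


def decrypt(ciphertext: str, key: str) -> str:
    """key[i] is the plaintext letter for cipher letter ALPHABET[i]."""
    return ciphertext.lower().translate(str.maketrans(ALPHABET, key))


def hill_climb(ciphertext, table, floor, iterations=5000, seed=0):
    """Swap-two-letters hill climbing; returns (key, score) for the best key found."""
    rng = random.Random(seed)
    key = list(ALPHABET)
    rng.shuffle(key)  # random starting key
    best = fitness(decrypt(ciphertext, "".join(key)), table, floor)
    for _ in range(iterations):
        i, j = rng.sample(range(26), 2)
        key[i], key[j] = key[j], key[i]
        score = fitness(decrypt(ciphertext, "".join(key)), table, floor)
        if score > best:
            best = score  # keep the improving swap
        else:
            key[i], key[j] = key[j], key[i]  # undo the swap
    return "".join(key), best
```

With a small corpus the model is too sparse to recover a real key, so treat the result as a demonstration of the mechanics. In practice you would run several seeds and keep the highest-scoring key.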
  3. Step 3: Decode the flag
    Apply your discovered substitution key to the entire ciphertext. The flag text will appear. Look for the picoCTF{...} pattern.
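
Applying a key and searching for the flag can be sketched as below. The key is assumed to be a dict from lowercase cipher letters to lowercase plaintext letters; `apply_key` and `find_flags` are illustrative names.

```python
import re


def apply_key(ciphertext: str, key: dict) -> str:
    """Apply a cipher->plain mapping of lowercase letters, preserving case."""
    out = []
    for ch in ciphertext:
        plain = key.get(ch.lower())
        if plain is None:
            out.append(ch)  # punctuation, digits, or a letter not solved yet
        else:
            out.append(plain.upper() if ch.isupper() else plain)
    return "".join(out)


def find_flags(plaintext: str) -> list:
    """Return every picoCTF{...} substring, matching case-insensitively."""
    return re.findall(r"picoctf\{[^}]*\}", plaintext, flags=re.IGNORECASE)
```

Leaving unsolved letters unchanged makes a partial key useful: half-decoded words often reveal the next mapping to try.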

    Frequency analysis was first described by Arab mathematician Al-Kindi in the 9th century. It remained the primary cryptanalytic technique for centuries, until polyalphabetic ciphers such as Vigenere, which date from the 15th and 16th centuries, made it far less effective.

Alternate Solution

Paste the ciphertext into the Frequency Analysis tool on this site to instantly see the letter frequency distribution and start mapping cipher letters to English. For a substitution cipher with word spacing preserved, quipqiup.com can also auto-solve it using a dictionary-based hill-climbing algorithm.

Flag

picoCTF{...}

Frequency analysis on the ciphertext maps cipher letters to English letters - the flag appears in the decrypted text.
