spelling-quiz

Published: April 2, 2026

Description

A spelling quiz study guide and flag file were encrypted with the same substitution cipher. Recover the key.

Download the encrypted study guide and flag.txt from the challenge page.

Solution

  1. Step 1: Analyze the ciphertext
    The study guide is a long English text encrypted with a monoalphabetic substitution cipher -- each letter is consistently replaced by exactly one other letter. The large amount of English text makes frequency analysis effective.

    A monoalphabetic substitution cipher replaces each letter of the plaintext with a fixed corresponding letter from a scrambled alphabet. There are 26! (about 4 × 10^26) possible substitution alphabets, making exhaustive search completely infeasible. However, the cipher preserves the statistical properties of the underlying language -- letter frequencies, bigram frequencies, and common word patterns all survive the substitution; only the letter labels change.
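
    To put the size of the key space in perspective, here is a small sketch of the arithmetic. The attacker speed of 10^12 guesses per second is an assumption chosen only for illustration.

```python
import math

# Number of possible monoalphabetic keys: every ordering of the 26 letters.
keyspace = math.factorial(26)
print(keyspace)            # exact integer
print(f"{keyspace:.2e}")   # scientific notation

# Hypothetical attacker speed (an assumption, not a measured figure).
guesses_per_second = 10**12
seconds_per_year = 60 * 60 * 24 * 365
years = keyspace / (guesses_per_second * seconds_per_year)
print(f"{years:.1e} years to try every key")
```

    Even at that assumed speed, trying every key would take on the order of ten million years, which is why the attack goes after the statistics instead of the key space.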

    In English, the most frequent letters by occurrence are approximately: e, t, a, o, i, n, s, h, r, d, l, u. The most common bigrams are th, he, in, er, an. The most common trigrams are the, and, ing, her, hat. A long ciphertext preserves these ratios, making it possible to map cipher letters to plaintext letters by matching frequency distributions.
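
    The counting itself is simple. The following is a minimal sketch, not the challenge's official tooling: the helper names (letter_counts, bigram_counts, first_guess) and the sample text are made up for illustration, and the rank-by-rank pairing is only a starting point that still needs correcting by hand or by a solver.

```python
from collections import Counter

ENGLISH_ORDER = "etaoinshrdlu"  # approximate ranking quoted above

def letter_counts(text: str) -> Counter:
    """Count alphabetic characters, ignoring case."""
    return Counter(ch for ch in text.lower() if ch.isalpha())

def bigram_counts(text: str) -> Counter:
    """Count adjacent letter pairs that fall inside a single word."""
    counts = Counter()
    for word in text.lower().split():
        letters = [ch for ch in word if ch.isalpha()]
        counts.update(a + b for a, b in zip(letters, letters[1:]))
    return counts

def first_guess(ciphertext: str) -> dict:
    """Pair the most frequent cipher letters with ENGLISH_ORDER, rank by rank."""
    top = letter_counts(ciphertext).most_common(len(ENGLISH_ORDER))
    ranked = [ch for ch, _ in top]
    return dict(zip(ranked, ENGLISH_ORDER))

# Stand-in text, not the challenge file.
sample = "the theory then gathered the others there"
print(letter_counts(sample).most_common(3))
print(bigram_counts(sample).most_common(2))
```

    On the real study guide you would pass the file contents to first_guess() and then refine the resulting partial mapping.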

    The Caesar cipher is the simplest substitution cipher (shift each letter by a fixed amount) and is broken by just trying all 25 shifts. A general monoalphabetic cipher requires frequency analysis or automated solving. Historical monoalphabetic ciphers -- including the one famously used in Edgar Allan Poe's "The Gold-Bug" -- were broken exactly this way long before computers existed.
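
    For contrast, the Caesar brute force fits in a few lines. This is a sketch with a made-up message; caesar_shift is a hypothetical helper name.

```python
import string

def caesar_shift(text: str, shift: int) -> str:
    """Shift each letter `shift` places forward, wrapping around the alphabet."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    return text.translate(table)

ciphertext = caesar_shift("meet me at noon", 3)  # made-up example message

# Try every non-trivial shift and print all candidates; a human or a
# dictionary check picks out the one that reads as English.
for shift in range(1, 26):
    print(shift, caesar_shift(ciphertext, shift))
```

    With only 25 candidates, reading the output is enough. The general substitution cipher has no such shortcut.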

  2. Step 2: Break the substitution cipher
    Paste the encrypted study guide into quipqiup.com or use a tool like SubstitutionBreaker. The solver uses letter frequency statistics and common bigrams/trigrams to recover the alphabet mapping automatically.
    # Online: https://quipqiup.com

    quipqiup is an automated cryptogram solver that uses a combination of frequency analysis and dictionary-guided hill climbing. It starts with a frequency-based initial guess at the substitution key and then iteratively swaps letter mappings, keeping changes that increase how many common English words appear in the decryption. Given a sufficiently long ciphertext, it converges on the correct key within seconds.
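
    The swap-and-keep loop can be sketched as a toy hill climber. This is not quipqiup's actual implementation: the ten-word list, the scoring function, and all names below are assumptions for illustration. Practical solvers usually score with n-gram statistics or a full dictionary, because a word count this coarse gives the search very little to climb on.

```python
import random
import string

ALPHABET = string.ascii_lowercase
# Tiny stand-in word list; a real solver would load a full dictionary.
COMMON_WORDS = {"the", "and", "of", "to", "in", "is", "that", "it", "a", "for"}

def decrypt(ciphertext: str, key: str) -> str:
    """key[i] is the cipher letter that stands for ALPHABET[i]."""
    return ciphertext.translate(str.maketrans(key, ALPHABET))

def score(plaintext: str) -> int:
    """Number of words in the candidate plaintext found in COMMON_WORDS."""
    return sum(word in COMMON_WORDS for word in plaintext.split())

def hill_climb(ciphertext: str, start_key: str, steps: int = 2000, seed: int = 0):
    rng = random.Random(seed)                  # seeded so runs are repeatable
    key = list(start_key)
    best = score(decrypt(ciphertext, "".join(key)))
    for _ in range(steps):
        i, j = rng.sample(range(26), 2)
        key[i], key[j] = key[j], key[i]        # propose swapping two mappings
        candidate = score(decrypt(ciphertext, "".join(key)))
        if candidate > best:
            best = candidate                   # keep the improvement
        else:
            key[i], key[j] = key[j], key[i]    # otherwise undo the swap
    return "".join(key), best

secret = "pcubfwhvjknairmetszdxygolq"          # example key quoted in Step 3
plaintext = "the cat is in the hat and it is for the dog"
ciphertext = plaintext.translate(str.maketrans(ALPHABET, secret))
found_key, found_score = hill_climb(ciphertext, ALPHABET)
print(found_key, found_score)
```

    The loop guarantees only that the score never decreases. Whether it reaches the true key depends on the scoring function and the amount of ciphertext, so a sentence this short should not be expected to decrypt fully.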

    The automated approach works because the study guide provides far more ciphertext than is needed for reliable frequency analysis. Frequency analysis typically requires at least a few hundred characters to be reliable; a full study guide provides thousands. More text means the observed letter frequencies in the ciphertext converge tightly to the true English frequencies, making the mapping unambiguous.

    Before automated tools existed, cryptanalysts broke substitution ciphers manually using frequency tables, common short words (likely candidates for the, a, an, in), and word patterns. A ciphertext word like XYYXZ must decrypt to a word with the letter pattern ABBAC, such as teeth or deeds, which narrows the candidates sharply. This pattern-matching intuition is what automated solvers encode in their scoring functions.
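
    Word-pattern matching is easy to mechanize. The sketch below assumes a hypothetical six-word list; a real attack would run the same comparison against a full dictionary.

```python
def pattern(word: str) -> str:
    """Encode a word by order of first appearance: 'teeth' -> 'ABBAC'."""
    seen = {}
    out = []
    for ch in word.lower():
        if ch not in seen:
            seen[ch] = chr(ord("A") + len(seen))
        out.append(seen[ch])
    return "".join(out)

# Hypothetical mini word list; a real attack would use a full dictionary.
WORDS = ["teeth", "deeds", "noons", "hello", "apple", "which"]

def candidates(cipher_word: str, words=WORDS) -> list:
    """Return the words whose letter pattern matches the ciphertext word."""
    target = pattern(cipher_word)
    return [w for w in words if pattern(w) == target]

print(pattern("xyyxz"))
print(candidates("xyyxz"))
```

    Each surviving candidate proposes several letter mappings at once, which can then be checked against the rest of the ciphertext.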

  3. Step 3: Apply the recovered key to flag.txt
    Once you have the substitution key (e.g. pcubfwhvjknairmetszdxygolq mapping to abcdefghijklmnopqrstuvwxyz), apply it to decrypt flag.txt using Python's str.translate().
    python3 -c "key = 'pcubfwhvjknairmetszdxygolq'; alpha = 'abcdefghijklmnopqrstuvwxyz'; table = str.maketrans(key, alpha); print(open('flag.txt').read().translate(table))"

    str.maketrans(from, to) builds a translation table: a dictionary mapping each character in from to the corresponding character in to (the two strings must be the same length). str.translate(table) applies that mapping to every character in the string in a single pass and leaves characters with no entry, such as digits, braces, and underscores, unchanged. This is the idiomatic Python way to implement a character-level substitution without an explicit loop.
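
    A short round trip shows both directions of the table. The message below is made up and is not the real flag; the key is the example key from this step.

```python
key = "pcubfwhvjknairmetszdxygolq"
alpha = "abcdefghijklmnopqrstuvwxyz"

decrypt_table = str.maketrans(key, alpha)   # cipher letter -> plain letter
encrypt_table = str.maketrans(alpha, key)   # plain letter -> cipher letter

# The table is an ordinary dict keyed by code point.
print(decrypt_table[ord("p")] == ord("a"))

message = "spelling quiz_2026{}"            # made-up text, not the real flag
ciphertext = message.translate(encrypt_table)
print(ciphertext)
print(ciphertext.translate(decrypt_table))  # round-trips back to message
```

    Swapping the two arguments of str.maketrans() is all it takes to turn the decryption table into the encryption table.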

    The same key that decrypts the study guide decrypts the flag because both were encrypted with the same cipher. This is a common CTF pattern: you are given a large file whose plaintext is predictable (the study guide is ordinary English you can guess at) so you can recover the key, then asked to apply that key to the short target file (flag.txt), which alone would not contain enough ciphertext for frequency analysis.

    The broader lesson is about key reuse: with a cipher this weak, every message encrypted under the same key gives the attacker more material, and once the key is recovered every one of those messages falls with it. In modern symmetric cryptography, keys can be reused safely only if nonces and IVs are not; nonce reuse in stream ciphers and IV reuse in block cipher modes create exploitable relationships between messages, and the consequences range from trivial decryption (as here) to full key recovery depending on the cipher.
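
    The stream cipher case can be illustrated with plain XOR. This is a sketch, not a real cipher: the keystream bytes and both messages are made up, and a fixed byte string stands in for the output a stream cipher would produce from one key and one repeated nonce.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings of equal length."""
    return bytes(x ^ y for x, y in zip(a, b))

# Made-up keystream and messages, all 10 bytes long for simplicity.
keystream = bytes.fromhex("8f3a1c77d2094be6a15c")
p1 = b"attack now"
p2 = b"retreat!!!"

c1 = xor_bytes(p1, keystream)
c2 = xor_bytes(p2, keystream)

# The keystream cancels: c1 XOR c2 equals p1 XOR p2, with no key involved.
print(xor_bytes(c1, c2) == xor_bytes(p1, p2))

# Knowing (or guessing) one plaintext then reveals the other.
print(xor_bytes(xor_bytes(c1, c2), p1))
```

    As with the study guide and the flag, the attacker never needs the key itself, only two messages that share it.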

Flag

picoCTF{...}

A monoalphabetic substitution cipher with enough English ciphertext is breakable via frequency analysis -- the distribution of letters (e, t, a, o, i, n...) directly reveals the key.
