substitution2

Published: July 20, 2023

Description

A substitution cipher with no punctuation must be solved to recover a block of prose ending in the flag. Online solvers can handle the bulk of the decoding, but you’ll need to map the remaining characters manually.

Load the ciphertext into a substitution solver such as quipqiup.com.

The solver recovers nearly the entire plaintext, but the final line (the flag) includes underscores and digits that may be mangled.

Manually align the decoded letters with the ciphertext to reconstruct the flag.

Solution

  1. Step 1: Leverage an automatic solver
    Tools like quipqiup produce a readable paragraph about offensive competitions. Copy the output and focus on the last sentence (the one mentioning the flag).

    Without word boundaries (spaces) or punctuation, substitution solving becomes harder because solvers can't use word-shape patterns. They fall back to pure n-gram statistics, scoring candidate plaintexts by how common each sequence of 2, 3, or 4 consecutive letters is in English. For example, "th", "he", and "in" are very common bigrams, while "xz" and "qk" are extremely rare.

    Despite the added difficulty, modern solvers still handle this well for sufficiently long texts. The paragraph about offensive competitions provides enough English text for statistical patterns to emerge, allowing quipqiup to recover the correct (or near-correct) substitution alphabet. The longer the ciphertext, the more reliable the statistical attack.

    A key cryptanalytic skill demonstrated here is cross-referencing: using the partially solved plaintext to bootstrap the full solution. Our Frequency Analysis tool lets you paste the ciphertext and interactively correct the auto-generated mapping. Once most letters are known from the prose section, the letters in the flag can be decoded with the established key, while its digits and underscores are copied over exactly as they appear in the ciphertext.
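The n-gram scoring idea from this step can be sketched in a few lines of Python. This is a minimal illustration, not how quipqiup is implemented: the names `REFERENCE` and `bigram_score` are hypothetical, and the reference sample is far too small for real use (a practical solver would train on a large English corpus and usually on trigrams or quadgrams as well).

```python
import math
from collections import Counter

# Tiny illustrative sample (an assumption for this sketch); a real solver
# would build its statistics from a much larger English corpus.
REFERENCE = (
    "the quick brown fox jumps over the lazy dog while the other "
    "hunters in the north gather their things and then return there"
)


def letters_only(text):
    """Lowercase the text and drop everything that is not a letter."""
    return "".join(ch for ch in text.lower() if ch.isalpha())


def bigram_counts(text):
    """Count overlapping two-letter sequences, ignoring word boundaries."""
    s = letters_only(text)
    return Counter(s[i:i + 2] for i in range(len(s) - 1))


REF_COUNTS = bigram_counts(REFERENCE)
REF_TOTAL = sum(REF_COUNTS.values())


def bigram_score(candidate):
    """Average log-probability per bigram; higher means more English-like."""
    s = letters_only(candidate)
    if len(s) < 2:
        return float("-inf")
    total = 0.0
    for i in range(len(s) - 1):
        count = REF_COUNTS.get(s[i:i + 2], 0)
        # Add-one smoothing so unseen bigrams keep a small non-zero probability.
        total += math.log((count + 1) / (REF_TOTAL + 26 * 26))
    return total / (len(s) - 1)
```

A solver built on this idea would repeatedly tweak the candidate key, decode the ciphertext, and keep the change whenever `bigram_score` improves.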

  2. Step 2: Fix the flag characters
    Because underscores/numbers confused the solver, cross-reference the original ciphertext's ending (`qcuhUIE{...}`) with the decoded letters to get picoCTF{...}.

    The flag wrapper picoCTF{...} is a known-plaintext element: you know the plaintext and can read the corresponding ciphertext. Using such pairs to recover the key is called a known-plaintext attack (KPA). In this case, the prefix picoCTF lines up with the seven cipher characters before the opening brace, which confirms six letter substitutions immediately (c and C are the same letter in different case).

    Known-plaintext attacks are powerful in classical cryptography. For monoalphabetic ciphers, knowing even a few plaintext-ciphertext pairs directly reveals portions of the key. The Enigma machine was partly broken this way - German operators began messages with predictable weather reports and protocol headers that gave Allied cryptanalysts at Bletchley Park known plaintext to work with.

    Digits in the flag (N6R4M_4N41Y515) pass through unchanged from the ciphertext because the cipher substitutes only alphabetic characters. Underscores pass through in the same way. This means only the letter portions require decoding, and the established key from the prose section handles those directly.
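The two steps above can be sketched as a small Python helper, assuming the cipher preserves letter case and leaves non-letters untouched, as described. The function names `key_from_known_plaintext` and `decode` are hypothetical, and the flag body `{4_1}` in the usage note is a made-up placeholder, not the real flag.

```python
def key_from_known_plaintext(cipher, plain):
    """Build a ciphertext -> plaintext letter map from aligned known text."""
    key = {}
    for c, p in zip(cipher, plain):
        if not c.isalpha():
            continue  # only letters are substituted
        c, p = c.lower(), p.lower()
        if key.get(c, p) != p:
            raise ValueError(f"conflicting mapping for {c!r}")
        key[c] = p
    return key


def decode(ciphertext, key, unknown="?"):
    """Apply the key, keeping case; digits, underscores and braces pass through."""
    out = []
    for ch in ciphertext:
        if not ch.isalpha():
            out.append(ch)
            continue
        plain = key.get(ch.lower(), unknown)  # '?' marks letters not yet mapped
        out.append(plain.upper() if ch.isupper() else plain)
    return "".join(out)


# The flag prefix from the writeup gives six distinct letter mappings.
key = key_from_known_plaintext("qcuhUIE", "picoCTF")
```

With this key, `decode("qcuhUIE{4_1}", key)` yields `picoCTF{4_1}`. In practice you would merge these mappings with the key recovered from the prose, and any `?` left in the output points at a letter that still needs a manual guess.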

Alternate Solution

Use the Frequency Analysis tool on this site to count letter occurrences and guide your substitution mapping. In English text of this length the most frequent ciphertext letter usually corresponds to E, but treat that as a starting guess and confirm it against the words that emerge. For a hands-off approach, quipqiup.com auto-solves monoalphabetic ciphers with a dictionary search.
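If you would rather count letters yourself, the following Python sketch shows the idea. It assumes the ciphertext is available as a string, and the function name `letter_frequencies` is hypothetical; it is not the site's tool.

```python
from collections import Counter


def letter_frequencies(ciphertext):
    """Return (letter, share) pairs, most frequent first, ignoring non-letters."""
    letters = [ch.lower() for ch in ciphertext if ch.isalpha()]
    total = len(letters)
    return [(ch, n / total) for ch, n in Counter(letters).most_common()]
```

The first entry of the result is the natural candidate for E; the next few are often T, A, O, I, or N, which you can then test against common bigrams such as "th" and "he".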

Flag

picoCTF{...}

Classic substitution can be solved in seconds with automated tools, but always verify the final characters.
