New Vignere picoCTF 2021 Solution

Published: April 2, 2026

Description

Another Vigenere cipher? This version uses a modified encoding. Decrypt the ciphertext to find the flag.

The ciphertext is provided with the challenge - no file to download.

  1. Step 1Understand the New Vigenere encoding
    The 'New' Vigenere differs from classical Vigenere by first encoding the plaintext into a restricted alphabet using a custom base16-like scheme before applying the Vigenere cipher. Understand both layers before attempting decryption.
    Learn more

    Classical Vigenere refresher. Plaintext letter at position i is shifted by key letter at position i mod L, all modulo 26. Encryption: c_i = (p_i + k_(i mod L)) mod 26. Decryption: p_i = (c_i - k_(i mod L)) mod 26. Strength comes from the polyalphabetic shift; weakness comes from the repeating key.

    The "new" twist. Plaintext is first nibble-encoded into the 16-letter alphabet abcdefghijklmnop (a=0, b=1, ..., p=15), then Vigenere-shifted modulo 16:

    Encrypt:
      for byte b in plaintext:
          hi, lo = b >> 4, b & 0xF
          out += alpha[(hi + key[i % L]) % 16]
          out += alpha[(lo + key[(i+1) % L]) % 16]
    
    Decrypt:
      reverse: subtract key, repack nibbles into bytes

    Worked tiny example. Plaintext 'A' = 0x41; key k = "ba" = [1, 0]:

    hi = 4, lo = 1
    ct[0] = alpha[(4 + 1) % 16] = alpha[5] = 'f'
    ct[1] = alpha[(1 + 0) % 16] = alpha[1] = 'b'
    ciphertext = "fb"
    
    Decrypt "fb" with key [1, 0]:
    hi = (5 - 1) mod 16 = 4
    lo = (1 - 0) mod 16 = 1
    byte = (4 << 4) | 1 = 0x41 = 'A'   ✓

    Why the alphabet shrinks the keyspace. Modulo 16 instead of modulo 26, with key length L, total keyspace is 16^L. L = 6 gives 2^24 ≈ 16M keys - brute-forceable in seconds. The nibble-doubling also means every key length must be even when the encoded string is split into "hi" and "lo" columns - useful when interpreting Kasiski distances.

  2. Step 2Perform the Kasiski examination or frequency analysis
    Find the key length using the Kasiski test (look for repeated ciphertext substrings; their spacings are multiples of the key length). Then use index of coincidence or frequency analysis to recover the key. Split the ciphertext into L columns with cols = [ct[i::L] for i in range(L)] so each column is a single-shift Caesar.
    python
    python3 - <<'EOF'
    ciphertext = "PASTE_CIPHERTEXT_HERE"
    alpha = "abcdefghijklmnop"  # the b16 alphabet
    
    # Kasiski: find repeated trigrams and their spacings
    from collections import Counter, defaultdict
    import math
    
    def gcd_list(lst):
        result = lst[0]
        for v in lst[1:]:
            result = math.gcd(result, v)
        return result
    
    positions = defaultdict(list)
    for i in range(len(ciphertext) - 2):
        tri = ciphertext[i:i+3]
        positions[tri].append(i)
    
    spacings = []
    for tri, pos in positions.items():
        if len(pos) > 1:
            spacings.extend([pos[j+1]-pos[j] for j in range(len(pos)-1)])
    
    if spacings:
        key_len = gcd_list(spacings)
        print(f"Likely key length: {key_len}")
    
    # Index of coincidence: discriminator for the right key length
    def ic(s):
        n = len(s)
        counts = Counter(s)
        return sum(c*(c-1) for c in counts.values()) / (n*(n-1))
    
    # Average IoC across columns; structured plaintext sits well above 1/16 = 0.0625
    for L in range(2, 12):
        cols = [ciphertext[i::L] for i in range(L)]
        avg = sum(ic(c) for c in cols) / L
        print(f"L={L}: avg IoC = {avg:.4f}")
    EOF
    Learn more

    Kasiski intuition. If two identical plaintext substrings happen to align with the same key offset, they produce identical ciphertext substrings. The distance between those occurrences must therefore be a multiple of the key length L. Find every repeated trigram in the ciphertext, list the gaps between occurrences, and take the GCD of those gaps. The result is almost always L (occasionally a small multiple of L).

    Worked toy example. Ciphertext ...HJBCFHJBCFHJBC... has trigram HJB at positions 0, 5, 10. Gaps: 5, 5. GCD = 5, so the key length is 5 (or a divisor of 5, i.e., 1 - which would be a Caesar cipher and is ruled out by the IoC test).

    Index of Coincidence as a sanity check. For a 16-letter alphabet, English (or any structured plaintext) has IoC well above the uniform value 1/16 = 0.0625. Compute IoC of each candidate column slice; when key length is correct, every slice IoC should be ~0.07-0.10 (since each slice is now a simple shift cipher and preserves frequency). Wrong key lengths give IoC around 0.0625 (uniform).

    IoC(column) = sum_c [n_c * (n_c - 1)] / [N * (N - 1)]
    where n_c is the count of letter c in the column, N is column length

    Recovering each key byte. Once L is locked, split into L columns. Each column is a Caesar cipher mod 16. The most common letter in flag-format plaintext is almost always 'p' = 15 (high nibble of 'p', 'i', 'c', 'o' is 6, and the high nibble of '_' is 5; low nibbles are spread). For each column, find the most frequent ciphertext letter x, set k_i = (x - expected) mod 16. Verify the candidate key by decrypting and checking that the result starts with picoCTF.

  3. Step 3Brute-force the key or use CyberChef
    If the key is short, brute-force all possible keys. Use Python to apply the inverse b16 decode and Vigenere decryption, checking if the result contains 'picoCTF'.
    python
    python3 - <<'EOF'
    ciphertext = "PASTE_CIPHERTEXT_HERE"
    alpha = "abcdefghijklmnop"
    
    def b16_decode(s):
        result = []
        for i in range(0, len(s), 2):
            hi = alpha.index(s[i])
            lo = alpha.index(s[i+1])
            result.append(hi << 4 | lo)
        return bytes(result)
    
    def vigenere_decrypt(ct, key):
        N = len(alpha)
        result = []
        for i, c in enumerate(ct):
            k = alpha.index(key[i % len(key)])
            result.append(alpha[(alpha.index(c) - k) % N])
        return ''.join(result)
    
    # Brute force short keys
    key_len = 6  # from Kasiski
    assert len(alpha) == 16, "alphabet must be 16 chars for the base-16 conversion"
    # Sanity: 16^6 ~= 16.7M attempts, runs in a few seconds in pure Python
    for key_int in range(len(alpha) ** key_len):
        key = ''
        n = key_int
        for _ in range(key_len):
            key = alpha[n % len(alpha)] + key
            n //= len(alpha)
        assert len(key) == key_len  # avoid off-by-one base-16 conversion bugs
        decrypted_b16 = vigenere_decrypt(ciphertext, key)
        try:
            decoded = b16_decode(decrypted_b16)
            if b'picoCTF' in decoded:
                print(f"Key: {key}")
                print(f"Flag: {decoded.decode()}")
                break
        except Exception:
            pass
    EOF
    Learn more

    The b16 encoding into the abcdefghijklmnop alphabet means the Vigenere cipher operates over a 16-character alphabet (not 26). The total keyspace for a 6-character key is only 16^6 = 16,777,216 - around 17M trial decryptions, finishing in a handful of seconds in pure Python. For a larger key, use frequency analysis per stream as described in the previous step.

    Defensive assertion. The base conversion in the loop above can silently truncate when n hits zero before the inner loop completes (leading-letter a = 0 would make key shorter than expected). The assert len(key) == key_len catches that off-by-one before it reaches the cipher.

Flag

picoCTF{...}

The 'New' Vigenere adds a base-16 encoding layer over a 16-character alphabet before applying Vigenere - brute-forcing over the small (16^N) keyspace or using Kasiski/frequency analysis recovers the key.

Want more picoCTF 2021 writeups?

Useful tools for Cryptography

Related reading

What to try next