Description
Another Vigenere cipher? This version uses a modified encoding. Decrypt the ciphertext to find the flag.
Setup
The ciphertext is provided with the challenge - no file to download.
Solution
Walk me through it- Step 1Understand the New Vigenere encodingThe 'New' Vigenere differs from classical Vigenere by first encoding the plaintext into a restricted alphabet using a custom base16-like scheme before applying the Vigenere cipher. Understand both layers before attempting decryption.
Learn more
Classical Vigenere refresher. Plaintext letter at position
iis shifted by key letter at positioni mod L, all modulo 26. Encryption:c_i = (p_i + k_(i mod L)) mod 26. Decryption:p_i = (c_i - k_(i mod L)) mod 26. Strength comes from the polyalphabetic shift; weakness comes from the repeating key.The "new" twist. Plaintext is first nibble-encoded into the 16-letter alphabet
abcdefghijklmnop(a=0, b=1, ..., p=15), then Vigenere-shifted modulo 16:Encrypt: for byte b in plaintext: hi, lo = b >> 4, b & 0xF out += alpha[(hi + key[i % L]) % 16] out += alpha[(lo + key[(i+1) % L]) % 16] Decrypt: reverse: subtract key, repack nibbles into bytesWorked tiny example. Plaintext
'A'=0x41; keyk = "ba"=[1, 0]:hi = 4, lo = 1 ct[0] = alpha[(4 + 1) % 16] = alpha[5] = 'f' ct[1] = alpha[(1 + 0) % 16] = alpha[1] = 'b' ciphertext = "fb" Decrypt "fb" with key [1, 0]: hi = (5 - 1) mod 16 = 4 lo = (1 - 0) mod 16 = 1 byte = (4 << 4) | 1 = 0x41 = 'A' ✓Why the alphabet shrinks the keyspace. Modulo 16 instead of modulo 26, with key length
L, total keyspace is16^L.L = 6gives2^24 ≈ 16Mkeys - brute-forceable in seconds. The nibble-doubling also means every key length must be even when the encoded string is split into "hi" and "lo" columns - useful when interpreting Kasiski distances. - Step 2Perform the Kasiski examination or frequency analysisFind the key length using the Kasiski test (look for repeated ciphertext substrings; their spacings are multiples of the key length). Then use index of coincidence or frequency analysis to recover the key. Split the ciphertext into L columns with
cols = [ct[i::L] for i in range(L)]so each column is a single-shift Caesar.pythonpython3 - <<'EOF' ciphertext = "PASTE_CIPHERTEXT_HERE" alpha = "abcdefghijklmnop" # the b16 alphabet # Kasiski: find repeated trigrams and their spacings from collections import Counter, defaultdict import math def gcd_list(lst): result = lst[0] for v in lst[1:]: result = math.gcd(result, v) return result positions = defaultdict(list) for i in range(len(ciphertext) - 2): tri = ciphertext[i:i+3] positions[tri].append(i) spacings = [] for tri, pos in positions.items(): if len(pos) > 1: spacings.extend([pos[j+1]-pos[j] for j in range(len(pos)-1)]) if spacings: key_len = gcd_list(spacings) print(f"Likely key length: {key_len}") # Index of coincidence: discriminator for the right key length def ic(s): n = len(s) counts = Counter(s) return sum(c*(c-1) for c in counts.values()) / (n*(n-1)) # Average IoC across columns; structured plaintext sits well above 1/16 = 0.0625 for L in range(2, 12): cols = [ciphertext[i::L] for i in range(L)] avg = sum(ic(c) for c in cols) / L print(f"L={L}: avg IoC = {avg:.4f}") EOFLearn more
Kasiski intuition. If two identical plaintext substrings happen to align with the same key offset, they produce identical ciphertext substrings. The distance between those occurrences must therefore be a multiple of the key length
L. Find every repeated trigram in the ciphertext, list the gaps between occurrences, and take the GCD of those gaps. The result is almost alwaysL(occasionally a small multiple ofL).Worked toy example. Ciphertext
...HJBCFHJBCFHJBC...has trigramHJBat positions 0, 5, 10. Gaps: 5, 5. GCD = 5, so the key length is 5 (or a divisor of 5, i.e., 1 - which would be a Caesar cipher and is ruled out by the IoC test).Index of Coincidence as a sanity check. For a 16-letter alphabet, English (or any structured plaintext) has IoC well above the uniform value
1/16 = 0.0625. Compute IoC of each candidate column slice; when key length is correct, every slice IoC should be ~0.07-0.10 (since each slice is now a simple shift cipher and preserves frequency). Wrong key lengths give IoC around 0.0625 (uniform).IoC(column) = sum_c [n_c * (n_c - 1)] / [N * (N - 1)] where n_c is the count of letter c in the column, N is column lengthRecovering each key byte. Once
Lis locked, split intoLcolumns. Each column is a Caesar cipher mod 16. The most common letter in flag-format plaintext is almost always'p'= 15 (high nibble of'p', 'i', 'c', 'o'is 6, and the high nibble of'_'is 5; low nibbles are spread). For each column, find the most frequent ciphertext letterx, setk_i = (x - expected) mod 16. Verify the candidate key by decrypting and checking that the result starts withpicoCTF. - Step 3Brute-force the key or use CyberChefIf the key is short, brute-force all possible keys. Use Python to apply the inverse b16 decode and Vigenere decryption, checking if the result contains 'picoCTF'.python
python3 - <<'EOF' ciphertext = "PASTE_CIPHERTEXT_HERE" alpha = "abcdefghijklmnop" def b16_decode(s): result = [] for i in range(0, len(s), 2): hi = alpha.index(s[i]) lo = alpha.index(s[i+1]) result.append(hi << 4 | lo) return bytes(result) def vigenere_decrypt(ct, key): N = len(alpha) result = [] for i, c in enumerate(ct): k = alpha.index(key[i % len(key)]) result.append(alpha[(alpha.index(c) - k) % N]) return ''.join(result) # Brute force short keys key_len = 6 # from Kasiski assert len(alpha) == 16, "alphabet must be 16 chars for the base-16 conversion" # Sanity: 16^6 ~= 16.7M attempts, runs in a few seconds in pure Python for key_int in range(len(alpha) ** key_len): key = '' n = key_int for _ in range(key_len): key = alpha[n % len(alpha)] + key n //= len(alpha) assert len(key) == key_len # avoid off-by-one base-16 conversion bugs decrypted_b16 = vigenere_decrypt(ciphertext, key) try: decoded = b16_decode(decrypted_b16) if b'picoCTF' in decoded: print(f"Key: {key}") print(f"Flag: {decoded.decode()}") break except Exception: pass EOFLearn more
The b16 encoding into the
abcdefghijklmnopalphabet means the Vigenere cipher operates over a 16-character alphabet (not 26). The total keyspace for a 6-character key is only16^6 = 16,777,216- around 17M trial decryptions, finishing in a handful of seconds in pure Python. For a larger key, use frequency analysis per stream as described in the previous step.Defensive assertion. The base conversion in the loop above can silently truncate when
nhits zero before the inner loop completes (leading-lettera= 0 would makekeyshorter than expected). Theassert len(key) == key_lencatches that off-by-one before it reaches the cipher.
Flag
picoCTF{...}
The 'New' Vigenere adds a base-16 encoding layer over a 16-character alphabet before applying Vigenere - brute-forcing over the small (16^N) keyspace or using Kasiski/frequency analysis recovers the key.