Description

Another Vigenere cipher? This version uses a modified encoding. Decrypt the ciphertext to find the flag.

Setup

The ciphertext is provided with the challenge - no file to download.

Solution

Step 1
Understand the New Vigenere encoding
Observation
I noticed the challenge name included 'New' and the ciphertext alphabet was restricted to only 16 letters (abcdefghijklmnop), which suggested this was not a standard Vigenere cipher and required understanding the extra base-16 nibble-encoding layer before any cryptanalysis could begin.
The 'New' Vigenere differs from classical Vigenere by first encoding the plaintext into a restricted alphabet using a custom base16-like scheme before applying the Vigenere cipher. Understand both layers before attempting decryption.
Learn more
Classical Vigenere refresher. Plaintext letter at position i is shifted by key letter at position i mod L, all modulo 26. Encryption: c_i = (p_i + k_(i mod L)) mod 26. Decryption: p_i = (c_i - k_(i mod L)) mod 26. Strength comes from the polyalphabetic shift; weakness comes from the repeating key.
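As a point of comparison, here is a minimal Python sketch of the classical formulas above. It assumes uppercase A-Z input only, and the function name vigenere is illustrative rather than anything from the challenge.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """Classical Vigenere over A-Z: c_i = (p_i + k_(i mod L)) mod 26."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        p = ord(ch) - ord("A")
        k = ord(key[i % len(key)]) - ord("A")
        out.append(chr((p + sign * k) % 26 + ord("A")))
    return "".join(out)


ct = vigenere("ATTACKATDAWN", "LEMON")
print(ct)                                   # LXFOPVEFRNHR
print(vigenere(ct, "LEMON", decrypt=True))  # ATTACKATDAWN
```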
The "new" twist. Plaintext is first nibble-encoded into the 16-letter alphabet abcdefghijklmnop (a=0, b=1, ..., p=15), then Vigenere-shifted modulo 16:
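Encrypt:
  for byte index i, byte b in plaintext:
    hi, lo = b >> 4, b & 0xF
    out += alpha[(hi + key[(2*i) % L]) % 16]
    out += alpha[(lo + key[(2*i + 1) % L]) % 16]
Decrypt:
  reverse: subtract the key from each letter mod 16, then repack nibble pairs into bytes
(The key advances one position per nibble, so byte i uses key positions 2i and 2i + 1.)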
Worked tiny example. Plaintext 'A' = 0x41; key k = "ba" = [1, 0]:
hi = 4, lo = 1
ct[0] = alpha[(4 + 1) % 16] = alpha[5] = 'f'
ct[1] = alpha[(1 + 0) % 16] = alpha[1] = 'b'
ciphertext = "fb"
Decrypt "fb" with key [1, 0]:
hi = (5 - 1) mod 16 = 4
lo = (1 - 0) mod 16 = 1
byte = (4 << 4) | 1 = 0x41 = 'A' ✓
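The worked example can be reproduced with a short Python sketch. The names ALPHA, encrypt and decrypt are illustrative, and the sketch assumes the key advances one position per nibble, as in a Vigenere shift applied to the nibble-encoded string.

```python
ALPHA = "abcdefghijklmnop"  # a=0 ... p=15


def encrypt(plaintext: bytes, key: str) -> str:
    """Nibble-encode each byte, then shift every nibble by the key, mod 16."""
    nibbles = []
    for b in plaintext:
        nibbles += [b >> 4, b & 0xF]
    return "".join(
        ALPHA[(n + ALPHA.index(key[i % len(key)])) % 16]
        for i, n in enumerate(nibbles)
    )


def decrypt(ciphertext: str, key: str) -> bytes:
    """Subtract the key from every letter, then repack nibble pairs into bytes."""
    nibbles = [
        (ALPHA.index(c) - ALPHA.index(key[i % len(key)])) % 16
        for i, c in enumerate(ciphertext)
    ]
    return bytes((hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2]))


print(encrypt(b"A", "ba"))  # fb
print(decrypt("fb", "ba"))  # b'A'
```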
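Why the alphabet shrinks the keyspace. Working modulo 16 instead of modulo 26, a key of length L has 16^L possible values. L = 6 gives 16^6 = 2^24 ≈ 16.8M keys, which is well within reach of brute force. The nibble-doubling does not force the key length to be even: the key advances one position per nibble, so with an even L each key position always shifts the same kind of nibble (high or low), while with an odd L each position alternates between high and low nibbles and the high/low pattern only repeats every 2L letters. Keep this in mind when interpreting Kasiski distances and column statistics.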
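A small Python check of the keyspace arithmetic, and of how key positions line up with high and low nibbles when the key advances once per nibble. The helper nibble_roles is a hypothetical name introduced only for this illustration.

```python
def nibble_roles(key_len: int) -> list[set[str]]:
    """For each key position, which kinds of nibble (hi/lo) it ever shifts."""
    roles = [set() for _ in range(key_len)]
    # One full cycle of the hi/lo pattern fits within 2 * key_len letters.
    for n in range(2 * key_len):
        roles[n % key_len].add("hi" if n % 2 == 0 else "lo")
    return roles


print(16 ** 6 == 2 ** 24)  # True
print(nibble_roles(6)[0])  # {'hi'}
print(nibble_roles(9)[0])  # contains both 'hi' and 'lo'
```

With an even key length every column is purely high or purely low nibbles; with an odd key length every column mixes both.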
Step 2
Perform the Kasiski examination or frequency analysis
Observation
I noticed the ciphertext was long enough to contain repeated substrings and that the polyalphabetic key repeats, which suggested the Kasiski examination could extract the key length and then per-column index-of-coincidence analysis could isolate each Caesar shift independently.
Find the key length using the Kasiski test (look for repeated ciphertext substrings; their spacings are multiples of the key length). Then use index of coincidence or frequency analysis to recover the key. Split the ciphertext into L columns with cols = [ct[i::L] for i in range(L)] so each column is a single-shift Caesar.
python
python3 - <<'EOF'
import math
from collections import Counter, defaultdict

ciphertext = "PASTE_CIPHERTEXT_HERE"
alpha = "abcdefghijklmnop"  # the b16 alphabet

# Kasiski: find repeated trigrams and their spacings
def gcd_list(lst):
    result = lst[0]
    for v in lst[1:]:
        result = math.gcd(result, v)
    return result

positions = defaultdict(list)
for i in range(len(ciphertext) - 2):
    tri = ciphertext[i:i+3]
    positions[tri].append(i)

spacings = []
for tri, pos in positions.items():
    if len(pos) > 1:
        spacings.extend([pos[j+1] - pos[j] for j in range(len(pos) - 1)])

if spacings:
    # Chance repeats can drag this GCD down to 1 or 2; confirm with the IoC table below
    key_len = gcd_list(spacings)
    print(f"Likely key length: {key_len}")

# Index of coincidence: discriminator for the right key length
def ic(s):
    n = len(s)
    if n < 2:
        return 0.0  # IoC is undefined for fewer than two letters
    counts = Counter(s)
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))

# Average IoC across columns; structured plaintext sits well above 1/16 = 0.0625
for L in range(2, 12):
    cols = [ciphertext[i::L] for i in range(L)]
    avg = sum(ic(c) for c in cols) / L
    print(f"L={L}: avg IoC = {avg:.4f}")
EOF
What didn't work first
Tried: Treat the ciphertext as a standard Vigenere over the 26-letter alphabet and run a classical Kasiski tool against it.
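The ciphertext only uses the 16-letter alphabet abcdefghijklmnop, so any tool that assumes 26 letters compares its statistics against the wrong baselines. The raw IoC value does not depend on the assumed alphabet, but its interpretation does: uniformly random text over 16 symbols already scores 1/16 = 0.0625, which a 26-letter tool reads as English-like (about 0.067) rather than random (1/26 = 0.0385). Almost every candidate key length therefore looks plausible. Here 0.0625 is the floor, not the target; structured plaintext sits well above it. The output of the script in this step should be read against the 1/16 baseline.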
Tried: Take the GCD of ALL repeated trigram gaps and accept the result immediately as the key length without verifying with IoC.
Coincidental repeated trigrams unrelated to the key create spurious gaps that drive the GCD down to 1 or 2, hiding the real key length. The IoC check on each candidate column slice is necessary to confirm: correct L produces high per-column IoC (~0.07-0.10), while wrong L leaves IoC near the uniform floor of 0.0625.
Learn more
Kasiski intuition. If two identical plaintext substrings happen to align with the same key offset, they produce identical ciphertext substrings. The distance between those occurrences must therefore be a multiple of the key length L. Find every repeated trigram in the ciphertext, list the gaps between occurrences, and take the GCD of those gaps. When every repeat is genuine, the result is L or a small multiple of L. A single coincidental repeat can pull the GCD down to 1 or 2, so treat the result as a candidate and confirm it with the IoC test.
Worked toy example. Ciphertext ...HJBCFHJBCFHJBC... has trigram HJB at positions 0, 5, 10. Gaps: 5, 5. GCD = 5, so the key length is 5 (or a divisor of 5, i.e., 1 - which would be a Caesar cipher and is ruled out by the IoC test).
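The toy example can be checked with a few lines of Python. This sketch also tallies how many gaps each candidate length divides, a vote-counting variant that is more tolerant of chance repeats than a single GCD. The helper names and the cap max_len = 14 are assumptions made for the illustration.

```python
from collections import Counter, defaultdict
from functools import reduce
from math import gcd


def trigram_gaps(text: str) -> list[int]:
    """Gaps between consecutive occurrences of every repeated trigram."""
    positions = defaultdict(list)
    for i in range(len(text) - 2):
        positions[text[i:i + 3]].append(i)
    gaps = []
    for pos in positions.values():
        gaps.extend(b - a for a, b in zip(pos, pos[1:]))
    return gaps


def factor_votes(gaps: list[int], max_len: int = 14) -> Counter:
    """Count how many gaps each candidate key length divides."""
    return Counter(L for g in gaps for L in range(2, max_len + 1) if g % L == 0)


gaps = trigram_gaps("HJBCFHJBCFHJBC")
print(reduce(gcd, gaps))                  # 5
print(factor_votes(gaps).most_common(1))  # [(5, 7)]
```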
Index of Coincidence as a sanity check. For a 16-letter alphabet, English (or any structured plaintext) has IoC well above the uniform value 1/16 = 0.0625. Compute IoC of each candidate column slice; when key length is correct, every slice IoC should be ~0.07-0.10 (since each slice is now a simple shift cipher and preserves frequency). Wrong key lengths give IoC around 0.0625 (uniform).
IoC(column) = sum_c [n_c * (n_c - 1)] / [N * (N - 1)] where n_c is the count of letter c in the column, N is column length
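A worked instance of the formula, as a minimal Python sketch. The column "ggdgdg" is a made-up example, not taken from the challenge ciphertext.

```python
from collections import Counter


def ic(column: str) -> float:
    """Index of coincidence: sum n_c (n_c - 1) / (N (N - 1))."""
    n = len(column)
    if n < 2:
        return 0.0  # undefined for fewer than two letters; treated as 0 here
    return sum(c * (c - 1) for c in Counter(column).values()) / (n * (n - 1))


# Hypothetical column: 'g' four times, 'd' twice -> (4*3 + 2*1) / (6*5)
print(round(ic("ggdgdg"), 4))  # 0.4667
print(1 / 16)                  # 0.0625, the uniform floor for 16 symbols
```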
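Recovering each key letter. Once L is locked, split into L columns. Each column is a Caesar cipher mod 16. The nibble distribution of ASCII text is heavily skewed: lowercase letters a-o have high nibble 6 ('g'), p-z have high nibble 7 ('h'), digits have high nibble 3 ('d'), and '_' has high nibble 5 ('f'), while low nibbles are spread more evenly. In a column of high nibbles the most frequent plaintext letter is therefore usually 'g' (or 'd' for digit-heavy text), not a letter near the end of the alphabet. For each column, find the most frequent ciphertext letter x and set k_i = (x - expected) mod 16. When L is odd, a column mixes high and low nibbles, so treat the result as a candidate to test rather than a final answer. Verify the candidate key by decrypting and checking that the decoded bytes are printable ASCII in the expected flag format (for example, starting with picoCTF if the wrapper was encrypted too).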
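A minimal sketch of the per-column frequency guess in Python. It assumes the expected most frequent plaintext nibble letter is 'g' (high nibble 6, shared by lowercase letters a-o); that default, the helper name guess_shift, and the synthetic column are all assumptions to adjust for the actual plaintext.

```python
from collections import Counter

ALPHA = "abcdefghijklmnop"


def guess_shift(column: str, expected: str = "g") -> int:
    """k = (x - expected) mod 16, where x is the column's most frequent letter."""
    most_common_letter, _ = Counter(column).most_common(1)[0]
    return (ALPHA.index(most_common_letter) - ALPHA.index(expected)) % 16


# Synthetic high-nibble column: plaintext nibbles 6,6,3,6,6,3,6 shifted by 5.
column = "".join(ALPHA[(n + 5) % 16] for n in [6, 6, 3, 6, 6, 3, 6])
print(column)               # llillil
print(guess_shift(column))  # 5
```

On short columns the most frequent letter is a weak signal, so the guess is best used to order candidates rather than to fix the key outright.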
Step 3
Brute-force the key or use CyberChef
Observation
I noticed the 16-letter alphabet reduces the keyspace to 16^L candidates, and once the Kasiski step gave me the key length, testing all 16 Caesar shifts column by column was feasible in seconds, making a targeted brute-force with a 'picoCTF' prefix check the fastest path to the flag.
If the key is short, brute-force all possible keys. Use Python to apply the inverse b16 decode and Vigenere decryption, checking if the result contains 'picoCTF'.
python
python3 - <<'EOF'
ciphertext = "PASTE_CIPHERTEXT_HERE"
alpha = "abcdefghijklmnop"

def b16_decode(s):
    # Repack pairs of nibble letters into bytes
    result = []
    for i in range(0, len(s), 2):
        hi = alpha.index(s[i])
        lo = alpha.index(s[i+1])
        result.append(hi << 4 | lo)
    return bytes(result)

def vigenere_decrypt(ct, key):
    # Subtract the key letter from each ciphertext letter, modulo 16
    N = len(alpha)
    result = []
    for i, c in enumerate(ct):
        k = alpha.index(key[i % len(key)])
        result.append(alpha[(alpha.index(c) - k) % N])
    return ''.join(result)

# Brute force short keys
key_len = 9  # from Kasiski
assert len(alpha) == 16, "alphabet must be 16 chars for the base-16 conversion"
# Note: 16^9 is too large for exhaustive brute force; use column-by-column analysis instead (see context)
for key_int in range(len(alpha) ** key_len):
    key = ''
    n = key_int
    for _ in range(key_len):
        key = alpha[n % len(alpha)] + key
        n //= len(alpha)
    assert len(key) == key_len  # avoid off-by-one base-16 conversion bugs
    decrypted_b16 = vigenere_decrypt(ciphertext, key)
    try:
        decoded = b16_decode(decrypted_b16)
        # This marker assumes the plaintext includes the picoCTF wrapper;
        # adjust the check if only the text inside the braces was encrypted
        if b'picoCTF' in decoded:
            print(f"Key: {key}")
            print(f"Flag: {decoded.decode()}")
            break
    except Exception:
        pass
EOF
What didn't work first
Tried: Use CyberChef's built-in Vigenere Decode operation directly on the ciphertext without first reversing the b16 nibble-encoding layer.
CyberChef's Vigenere Decode operates on the 26-letter English alphabet. The ciphertext here uses a 16-letter alphabet with mod-16 arithmetic, so CyberChef produces garbage output even with the correct key. The b16 decode step (repacking nibbles into bytes) must be applied after the Vigenere reversal, not before or instead of it.
Tried: Run exhaustive brute force over all 16^9 keys in a Python loop, iterating key_int from 0 to 16^9.
16^9 is about 68.7 billion candidates. Even at an optimistic 1-5 million keys per second, full exhaustion would take roughly 4 to 19 hours, and a pure Python loop that decrypts the whole ciphertext for every key runs far slower than that. The correct approach is column-by-column analysis: test all 16 Caesar shifts independently on each of the 9 columns and discard shifts that produce non-ASCII output, which shrinks the combined search space from billions to the product of the few survivors per column.
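To illustrate the first failed attempt above, this small Python sketch undoes the same shift in a 16-letter and a 26-letter alphabet. The helper unshift is a hypothetical name, and the 26-letter case only models a generic mod-26 decoder, not any specific tool's implementation.

```python
import string

ALPHA16 = "abcdefghijklmnop"
ALPHA26 = string.ascii_lowercase


def unshift(c: str, k: str, alphabet: str) -> str:
    """Undo one Vigenere shift of letter c by key letter k in the given alphabet."""
    return alphabet[(alphabet.index(c) - alphabet.index(k)) % len(alphabet)]


print(unshift("b", "d", ALPHA16))  # o  (1 - 3 = -2, which is 14 mod 16)
print(unshift("b", "d", ALPHA26))  # y  (1 - 3 = -2, which is 24 mod 26)
```

Whenever the subtraction wraps around, the mod-26 result falls outside a-p and can no longer be repacked into nibbles.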
Learn more
The b16 encoding into the abcdefghijklmnop alphabet means the Vigenere cipher operates over a 16-character alphabet (not 26). With the correct key length of 9, the full keyspace 16^9 ≈ 68.7B is too large for pure Python brute force. In practice, cryptanalysts use column-by-column analysis: for each of the 9 columns, test all 16 Caesar shifts independently and keep only the shifts whose decrypted nibbles are consistent with valid ASCII output. That is at most 16 x 9 = 144 individual tests. The surviving shifts are then combined with itertools.product (a Cartesian product with one choice per column, not itertools.permutations), and each combined key is validated against the full decryption. Total attempts are typically in the thousands, not billions.
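The column-by-column search can be sketched in Python as follows. This is an illustration under stated assumptions, not the original solver: the helper names are hypothetical, the allowed character set is a parameter, and the demo uses a synthetic hex-digit plaintext with a made-up 3-letter key.

```python
from itertools import product

ALPHA = "abcdefghijklmnop"
HEX = b"0123456789abcdef"  # assumed plaintext charset for the demo only


def encrypt(pt: bytes, key: tuple[int, ...]) -> str:
    """Nibble-encode, then shift each nibble by the repeating key, mod 16."""
    nib = [n for b in pt for n in (b >> 4, b & 0xF)]
    return "".join(ALPHA[(n + key[i % len(key)]) % 16] for i, n in enumerate(nib))


def decrypt(ct: str, key: tuple[int, ...]) -> bytes:
    """Subtract the key mod 16, then repack nibble pairs into bytes."""
    nib = [(ALPHA.index(c) - key[i % len(key)]) % 16 for i, c in enumerate(ct)]
    return bytes((hi << 4) | lo for hi, lo in zip(nib[0::2], nib[1::2]))


def column_candidates(ct: str, key_len: int, allowed: bytes) -> list[list[int]]:
    """Per key position, the shifts that keep every nibble it touches plausible."""
    ok_hi = {b >> 4 for b in allowed}
    ok_lo = {b & 0xF for b in allowed}
    survivors = []
    for pos in range(key_len):
        keep = []
        for k in range(16):
            # Even ciphertext indices hold high nibbles, odd indices low nibbles.
            if all(
                (ALPHA.index(ct[n]) - k) % 16 in (ok_hi if n % 2 == 0 else ok_lo)
                for n in range(pos, len(ct), key_len)
            ):
                keep.append(k)
        survivors.append(keep)
    return survivors


def recover_keys(ct: str, key_len: int, allowed: bytes) -> list[str]:
    """Combine per-column survivors; keep keys whose full decryption is allowed."""
    allowed_set = set(allowed)
    keys = []
    for combo in product(*column_candidates(ct, key_len, allowed)):
        if set(decrypt(ct, combo)) <= allowed_set:
            keys.append("".join(ALPHA[k] for k in combo))
    return keys


demo_key = (2, 0, 1)  # the letters "cab"
ct = encrypt(b"deadbeef00c0ffee", demo_key)
keys = recover_keys(ct, 3, HEX)
print("cab" in keys)  # True
```

Several keys can survive on a short ciphertext; the remaining candidates are few enough to inspect by eye or to rank with an extra format check.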
Defensive assertion. A base conversion written as while n: silently truncates when n hits zero early, so any key with a leading a (= 0) comes out shorter than expected. The loop above avoids this by always running key_len iterations, which pads the key with leading a letters. The assert len(key) == key_len documents that invariant and catches a regression before a short key reaches the cipher.
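The difference between a fixed-length conversion and a truncating one is easy to see in a short Python sketch. Both function names are hypothetical and exist only for this comparison.

```python
ALPHA = "abcdefghijklmnop"


def int_to_key_fixed(n: int, length: int) -> str:
    """Always emits `length` letters, padding with leading 'a' (= 0)."""
    key = ""
    for _ in range(length):
        key = ALPHA[n % 16] + key
        n //= 16
    return key


def int_to_key_truncating(n: int) -> str:
    """Stops as soon as n reaches zero, so leading 'a' letters are lost."""
    key = ""
    while n:
        key = ALPHA[n % 16] + key
        n //= 16
    return key


print(int_to_key_fixed(17, 4))    # aabb
print(int_to_key_truncating(17))  # bb
```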

Interactive tools

Cipher Identifier & Auto-Decoder: Paste any ciphertext and the tool auto-runs every common decoder (base64, hex, Morse, ROT, Atbash, Bacon, binary, decimal, URL) and ranks the results by English-likeness.
Frequency Analysis: Analyze letter frequencies in a substitution cipher and interactively build the decryption mapping with auto-filled guesses.

Flag

picoCTF{94bf01ad4b8a63425c32c02ba4c9632f}

Static flag. The same ciphertext (bkglibgkhghkijphhhejggikgjkbhefgpienefjdioghhchffhmmhhbjgclpjfkp) and flag appear consistently across instances.

Key takeaway

The Vigenere cipher is broken by the Kasiski examination and index-of-coincidence analysis because a repeating key creates statistical patterns in the ciphertext that betray both the key length and, column by column, each individual shift. Restricting the alphabet from 26 to 16 characters makes the keyspace smaller, not larger, so the 'new' encoding actually weakens rather than strengthens the cipher. The general lesson is that repeating-key polyalphabetic ciphers fail against any adversary who can collect enough ciphertext. Secure designs avoid this kind of key reuse: the one-time pad uses truly random key material as long as the plaintext, while modern ciphers rely on computational hardness, keeping a short key secret and never exposing a simple repeating keystream.

New Vignere picoCTF 2021 Solution

Related reading

Useful tools for Cryptography

What to try next

Vigenere

New Caesar

Hidden Cipher 1

Hidden Cipher 2

rotation

reverse_cipher