Introduction
Every CTF beginner hits the same wall: you find a string that looks like random garbage, and you have no idea what it is or how to decode it. Is it Base64? Hex? Some cipher? This guide is the reference you wish you had on day one.
The most important concept to understand first is the difference between encoding and encryption. Encoding is a reversible transformation that requires no key: anyone who knows the scheme can decode it. Base64, hex, and URL encoding are all encodings. Encryption, by contrast, requires a secret key to reverse: AES, RSA, and XOR with a hidden key are all encryption. CTF challenge descriptions that use the word "encoding" almost always mean something you can reverse without any key material.
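The difference is easy to see in a few lines. The following is a minimal sketch, not a real cryptosystem: `xor_bytes` and the key `k3y` are hypothetical names chosen only for illustration.

```python
import base64

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR every byte of data with the key, repeating the key as needed."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

secret = b"picoCTF{flag}"

# Encoding: anyone who knows the scheme can reverse it, no key involved.
encoded = base64.b64encode(secret)
assert base64.b64decode(encoded) == secret

# Encryption (toy example): reversing it requires the same key.
key = b"k3y"  # hypothetical key, for illustration only
ciphertext = xor_bytes(secret, key)
assert xor_bytes(ciphertext, key) == secret      # the right key recovers the data
assert xor_bytes(ciphertext, b"bad") != secret   # a wrong key does not
```

If a challenge hands you data and no amount of plain decoding produces readable text, that is your cue to start looking for a key.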
CTF authors love to layer encodings. A flag might be hex-encoded, and the hex string might itself be Base64-encoded, and that Base64 might be URL-encoded in a query parameter. The strategy is always the same: decode one layer at a time, check whether the result looks printable and recognizable, and repeat until you see picoCTF{.
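As a sketch of that layering, the snippet below builds the exact stack described above (hex, then Base64, then URL encoding) around a sample flag and peels it off one layer at a time. The variable names are illustrative only.

```python
import base64
from urllib.parse import quote, unquote

flag = "picoCTF{flag}"

# Build the three layers: hex, then Base64, then URL encoding.
layer1 = flag.encode().hex()
layer2 = base64.b64encode(layer1.encode()).decode()
layer3 = quote(layer2, safe="")

# Peel the layers off in reverse order, inspecting each intermediate result.
step1 = unquote(layer3)                    # URL-decoded: a Base64 string
step2 = base64.b64decode(step1).decode()   # Base64-decoded: a hex string
step3 = bytes.fromhex(step2).decode()      # hex-decoded: the flag

for label, value in [("url", step1), ("base64", step2), ("hex", step3)]:
    print(f"{label:>6}: {value}")
```

Printing every intermediate value is the important habit: each one tells you which decoder to reach for next.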
To identify an unknown encoding, look at the character set and the length. Each encoding has a distinctive fingerprint: only certain characters appear, the length often follows a pattern (always even, always a multiple of 4, always a multiple of 8), and some encodings have unmistakable visual signatures like trailing = padding or % percent signs. This guide covers every common encoding with its fingerprint and decode one-liner.
Base64
Base64 is the single most common encoding in CTF competitions. It represents binary data using 64 printable ASCII characters: the uppercase letters A-Z, the lowercase letters a-z, the digits 0-9, and the two symbols + and /. An = or == suffix is used as padding to make the total length a multiple of 4.
How to spot Base64
- Character set: A-Z a-z 0-9 + / with optional trailing = or ==
- Length is always a multiple of 4 (padding makes it so)
- The string cGljb0NURg== decodes to picoCTF
- Ratio of about 4 output characters per 3 bytes of input, so longer than the original
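The length rule follows directly from the 3-bytes-to-4-characters grouping. A small sketch makes the arithmetic checkable; `base64_length` is a hypothetical helper name, not part of the standard library.

```python
import base64
import math

def base64_length(n_bytes: int) -> int:
    """Length of the padded Base64 encoding of n_bytes bytes of input."""
    # Every group of up to 3 input bytes becomes exactly 4 output characters.
    return 4 * math.ceil(n_bytes / 3)

for n in range(1, 8):
    encoded = base64.b64encode(b"A" * n).decode()
    assert len(encoded) == base64_length(n)
    print(f"{n} bytes -> {len(encoded)} chars, {encoded.count('=')} padding: {encoded}")
```

The padding count tells you how many bytes the final group held: == means one byte, = means two, and no padding means the input length was a multiple of 3.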
Decoding commands
# Decode a Base64 string on the command line
echo 'cGljb0NURntmbGFnfQ==' | base64 -d
# Decode a Base64-encoded file
base64 -d file.txt
# Decode in Python
python3 -c "import base64; print(base64.b64decode('cGljb0NURntmbGFnfQ==').decode())"
Multi-layer Base64
Some challenges encode the flag in Base64 multiple times. The repetitions challenge is a classic example: the flag is Base64-encoded six times. When the output of one decode is still a Base64-looking string, keep decoding. A quick loop handles this automatically:
python3 -c "
import base64
data = open('encoded.txt').read().strip()
for _ in range(10):
    try:
        data = base64.b64decode(data).decode()
    except Exception:
        break
print(data)
"
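One caveat with the loop above: by default, `base64.b64decode()` silently discards characters outside the Base64 alphabet, so in unlucky cases it can keep "decoding" past the real plaintext. A stricter variant, sketched here with the hypothetical helper name `peel_base64`, passes `validate=True` and also reports how many layers it removed.

```python
import base64
import binascii

def peel_base64(data: str, max_layers: int = 20) -> tuple[str, int]:
    """Strip Base64 layers until the text no longer decodes cleanly.

    Returns the final text and the number of layers removed.
    """
    layers = 0
    for _ in range(max_layers):
        try:
            # validate=True rejects characters outside the Base64 alphabet
            # instead of silently discarding them.
            decoded = base64.b64decode(data, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            break
        data = decoded.strip()
        layers += 1
    return data, layers

# Self-test: wrap a sample flag six times, then unwrap it.
wrapped = "picoCTF{flag}"
for _ in range(6):
    wrapped = base64.b64encode(wrapped.encode()).decode()
print(peel_base64(wrapped))
```

Strict validation also rejects embedded newlines, so if the challenge file wraps its Base64 across several lines, remove the line breaks before calling the function.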
Base64 URL variant
The URL-safe variant of Base64 replaces + with - and / with _, so the output is safe to embed in a URL without percent-encoding. You will see this in JWT tokens and some web challenges. Use base64.urlsafe_b64decode() in Python, or manually swap the characters before using the standard decoder.
python3 -c "import base64; print(base64.urlsafe_b64decode('cGljb0NURntmbGFnfQ==').decode())"
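JWT segments usually drop the trailing = padding, and `base64.urlsafe_b64decode()` raises an error when the padding is missing. A minimal sketch that restores it first (the helper name `b64url_decode` is an assumption, not a standard function):

```python
import base64

def b64url_decode(segment: str) -> bytes:
    """Decode URL-safe Base64, restoring any stripped '=' padding first."""
    # -len % 4 gives the number of '=' characters needed to reach a multiple of 4.
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)

print(b64url_decode("cGljb0NURntmbGFnfQ").decode())  # padding was stripped
print(b64url_decode("-_8"))                           # uses the - and _ characters
```

To inspect a JWT, split the token on the dots and pass the first two segments through the same function.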
Hex encoding
Hex encoding (hexadecimal, base 16) represents each byte as exactly two characters from the set 0-9 a-f (sometimes uppercase A-F). Because every byte becomes two characters, a hex-encoded string is always an even number of characters long. Hex strings are sometimes prefixed with 0x to signal that they are hexadecimal; strip that prefix before passing the string to bytes.fromhex(), which rejects it.
How to spot hex
- Only the characters 0-9 and a-f (or A-F)
- Always an even number of characters
- Sometimes prefixed with 0x
- The string 7069636f435446 decodes to picoCTF
Decoding commands
# Decode hex to text using xxd
echo '7069636f435446' | xxd -r -p
# Decode hex in Python
python3 -c "print(bytes.fromhex('7069636f435446').decode())"
# Read hex from stdin in Python (useful for piping)
echo '7069636f435446' | python3 -c "import sys; print(bytes.fromhex(sys.stdin.read().strip()).decode())"
The xxd -r -p flag combination is worth memorising: -r means reverse (hex to binary) and -p means plain hex input with no line numbers or offsets. Together they convert a bare hex string to raw bytes, which you can then pipe into other tools.
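Hex in real challenge files is rarely as clean as the examples above: it may carry a 0x prefix, colons between bytes, or \x escapes. `bytes.fromhex()` skips ASCII whitespace but raises ValueError on anything else, so a small normaliser helps. This is a sketch under the assumption that every non-hex character is just a separator; `decode_hex` is a hypothetical helper name.

```python
import re

def decode_hex(text: str) -> bytes:
    r"""Decode hex that may carry 0x or \x prefixes or separator characters."""
    # Drop 0x and \x prefixes first, then anything that is not a hex digit.
    cleaned = re.sub(r"0x|\\x", "", text, flags=re.IGNORECASE)
    cleaned = re.sub(r"[^0-9a-fA-F]", "", cleaned)
    if len(cleaned) % 2:
        raise ValueError("odd number of hex digits; input may be truncated")
    return bytes.fromhex(cleaned)

print(decode_hex("0x7069636f435446"))
print(decode_hex("70:69:63:6f:43:54:46"))
print(decode_hex(r"\x70\x69\x63\x6f"))
```

The function returns bytes rather than text on purpose: hex often wraps binary data, and calling .decode() too early would fail on it.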
Binary encoding
Binary encoding represents each character as its 8-bit ASCII value written in base 2. Each byte becomes exactly 8 digits, all either 0 or 1. Groups are usually separated by spaces, so the string looks like a sequence of 8-digit patterns.
How to spot binary
- Only the characters 0 and 1
- Groups of exactly 8 digits, often space-separated
- Total digit count (ignoring spaces) is a multiple of 8
- 01110000 01101001 01100011 decodes to pic
Decoding commands
python3 -c "
bits = '01110000 01101001 01100011 01101111 01000011 01010100 01000110'
print(''.join(chr(int(b, 2)) for b in bits.split()))
"
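The one-liner above relies on spaces between the groups. When the digits arrive as one unbroken run, slice the stream into 8-bit chunks instead. A minimal sketch, with `decode_binary` as a hypothetical helper name:

```python
def decode_binary(bits: str) -> str:
    """Decode 8-bit binary text, with or without separators between bytes."""
    # Keep only the binary digits so spaces, newlines, or commas are ignored.
    digits = "".join(ch for ch in bits if ch in "01")
    if len(digits) % 8:
        raise ValueError("bit count is not a multiple of 8")
    # Slice the digit stream into 8-bit groups and convert each to a character.
    return "".join(chr(int(digits[i:i + 8], 2)) for i in range(0, len(digits), 8))

print(decode_binary("011100000110100101100011"))
print(decode_binary("01110000 01101001 01100011"))
```

If the function raises because the count is not a multiple of 8, check whether the challenge used 7-bit groups before assuming the data is truncated.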
Octal encoding
Octal (base 8) uses only the digits 0-7 and is far less common than binary or hex in CTFs, but it does appear. Each character is typically represented as a 3-digit octal number. The decode is the same idea: convert each group from its base to an integer, then to a character.
# Decode octal-encoded text in Python
python3 -c "
octs = '160 151 143 157 103 124 106'
print(''.join(chr(int(o, 8)) for o in octs.split()))
"
If a string contains only 0s and 1s, it is binary (base 2). Only 0-7? It might be octal (base 8). Only 0-9 a-f? Hex (base 16). The base tells you which digits are possible.
ROT13, Caesar ciphers, and substitution
ROT13 is a simple letter-substitution cipher that shifts every letter in the alphabet forward by 13 positions. Because the alphabet has 26 letters, applying ROT13 twice returns the original text, making it self-inverse. It is used to obscure text without any key, and it appears constantly in CTF general-skills challenges.
The more general form is a Caesar cipher, which shifts by any number N from 1 to 25. To crack an unknown Caesar cipher, try every shift and look for the one that produces readable English. There are only 25 non-trivial shifts, because a shift of 0 (or 26) leaves the text unchanged.
How to spot ROT13 / Caesar
- Contains only letters (and possibly punctuation or spaces)
- The text looks like English but with shifted letters
- cvpbPGS is picoCTF under ROT13
- Non-letter characters (digits, braces, spaces) are usually left unchanged
Decoding commands
# ROT13 using tr (fast, works in bash)
echo 'cvpbPGS{synth}' | tr 'A-Za-z' 'N-ZA-Mn-za-m'
# ROT13 using Python codecs
python3 -c "
import codecs
print(codecs.decode('cvpbPGS{synth}', 'rot13'))
"
Brute-force all 26 Caesar shifts
python3 -c "
text = 'cvpbPGS{synth}'
for shift in range(26):
    result = ''
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            result += chr((ord(ch) - base + shift) % 26 + base)
        else:
            result += ch
    print(f'Shift {shift:2d}: {result}')
"
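Reading 26 lines by eye works, but when you know the flag format you can let the script pick the winner. The sketch below filters the shifts by a known fragment of plaintext (a "crib"). The function names and the default crib `picoCTF` are assumptions for illustration.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift every ASCII letter forward by `shift` positions, wrapping at Z."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)  # digits, braces, and spaces pass through unchanged
    return "".join(out)

def crack_caesar(ciphertext: str, crib: str = "picoCTF") -> list[tuple[int, str]]:
    """Return every (shift, plaintext) pair whose plaintext contains the crib."""
    candidates = ((s, caesar_shift(ciphertext, s)) for s in range(26))
    return [(s, p) for s, p in candidates if crib in p]

print(crack_caesar("cvpbPGS{synth}"))
```

If the list comes back empty, the text is probably not a plain Caesar shift, or the flag uses a different prefix; pass another crib such as a common English word.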
Atbash cipher
Atbash reverses the alphabet: A becomes Z, B becomes Y, and so on. It is its own inverse, just like ROT13. To decode, map each letter to its mirror position: chr(ord('Z') - (ord(ch) - ord('A'))) for uppercase. CyberChef has a dedicated Atbash operation.
python3 -c "
text = 'krxlXGU'  # Atbash of picoCTF
result = ''
for ch in text:
    if ch.isupper():
        result += chr(ord('Z') - (ord(ch) - ord('A')))
    elif ch.islower():
        result += chr(ord('z') - (ord(ch) - ord('a')))
    else:
        result += ch
print(result)
"
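The same mapping can be expressed as a translation table, which is shorter and makes the self-inverse property easy to check. This is an alternative sketch, not the only way to do it; `ATBASH` and `atbash` are illustrative names.

```python
import string

# Map each letter to its mirror: a<->z, b<->y, and so on, for both cases.
ATBASH = str.maketrans(
    string.ascii_lowercase + string.ascii_uppercase,
    string.ascii_lowercase[::-1] + string.ascii_uppercase[::-1],
)

def atbash(text: str) -> str:
    """Apply Atbash; non-letters pass through unchanged."""
    return text.translate(ATBASH)

print(atbash("krxlXGU{uozt}"))
assert atbash(atbash("Hello, World")) == "Hello, World"  # applying it twice is a no-op
```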
URL encoding
URL encoding (also called percent-encoding) replaces characters that are not safe in a URL with a % sign followed by two hex digits representing the character's ASCII code. A space becomes %20, an opening brace becomes %7B, and so on. You will see this encoding constantly in web challenges where a flag is embedded in a URL or in a form parameter.
How to spot URL encoding
- Contains % signs followed by exactly two hex digits
- Often mixed with regular printable ASCII characters
- %70%69%63%6f%43%54%46%7b%66%6c%61%67%7d decodes to picoCTF{flag}
Decoding commands
# Decode URL-encoded string in Python
python3 -c "from urllib.parse import unquote; print(unquote('%70%69%63%6f%43%54%46%7b%66%6c%61%67%7d'))"
# Decode form-encoded input, where + stands for a space
python3 -c "import urllib.parse; print(urllib.parse.unquote_plus('hello+world+%7Bflag%7D'))"
HTML entity encoding
HTML uses its own encoding scheme for special characters. A less-than sign becomes &lt;, a greater-than sign becomes &gt;, and an ampersand itself becomes &amp;. You encounter this in web challenges where the server reflects your input back in HTML but has encoded the angle brackets to prevent XSS. Python's html.unescape() handles this:
python3 -c "import html; print(html.unescape('&lt;script&gt;alert(&quot;flag&quot;)&lt;/script&gt;'))"
Double encoding
Web challenges sometimes double-encode payloads: a % character is itself URL-encoded as %25, so %2570 decodes first to %70 and then to p. If you decode once and get more percent signs, decode again. This trick is used in WAF-bypass challenges where filters check the encoded form but the server decodes twice.
# Decode twice for double-encoded strings
python3 -c "
from urllib.parse import unquote
encoded = '%2570%2569%2563%256f'
print(unquote(unquote(encoded)))
"
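When you do not know how many times a payload was encoded, decode until the text stops changing. A minimal sketch, with `unquote_fully` as a hypothetical helper name:

```python
from urllib.parse import unquote

def unquote_fully(text: str, max_rounds: int = 10) -> tuple[str, int]:
    """Percent-decode repeatedly until the text stops changing.

    Returns the final text and the number of decoding rounds applied.
    """
    rounds = 0
    while rounds < max_rounds:
        decoded = unquote(text)
        if decoded == text:
            break  # nothing left to decode
        text = decoded
        rounds += 1
    return text, rounds

print(unquote_fully("%252570%252569%252563%25256f"))
```

The round count is useful on its own: it tells you how many decoding passes the target application must perform for a bypass payload to land.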
Morse code
Morse code encodes letters and digits as sequences of short signals (dots, .) and long signals (dashes, -). Each character is separated by a space, and each word is separated by a forward slash / or a longer gap. In CTFs, Morse code shows up as a string of dots, dashes, and separators, and sometimes as an audio file you must transcribe.
How to spot Morse code
- Only the characters . and - (and spaces or / as separators)
- Very short strings with a regular rhythmic structure
- .-. . .- -.. decodes to READ
Decoding with Python
python3 -c "
MORSE = {
    '.-': 'A', '-...': 'B', '-.-.': 'C', '-..': 'D', '.': 'E',
    '..-.': 'F', '--.': 'G', '....': 'H', '..': 'I', '.---': 'J',
    '-.-': 'K', '.-..': 'L', '--': 'M', '-.': 'N', '---': 'O',
    '.--.': 'P', '--.-': 'Q', '.-.': 'R', '...': 'S', '-': 'T',
    '..-': 'U', '...-': 'V', '.--': 'W', '-..-': 'X', '-.--': 'Y',
    '--..': 'Z', '-----': '0', '.----': '1', '..---': '2',
    '...--': '3', '....-': '4', '.....': '5', '-....': '6',
    '--...': '7', '---..': '8', '----.': '9',
}
msg = '-- --- .-. ... . / -.-. --- -.. .'
words = msg.split(' / ')
print(' '.join(''.join(MORSE.get(c, '?') for c in w.split()) for w in words))
"
Other encodings to know
Beyond the core encodings covered above, a handful of less common encodings appear often enough that you should be able to recognise them on sight.
Base32
Base32 uses the uppercase letters A-Z and the digits 2-7 (32 characters total). It avoids ambiguous digits like 0 and 1. The output is padded with = signs to make the length a multiple of 8. It looks like all-caps Base64 with no lowercase letters.
python3 -c "import base64; print(base64.b32decode('OBWGKYLTEBWWCYLSMFZCA===').decode())"
Base58
Base58 is used in Bitcoin addresses and some password hashes. It uses the standard alphanumeric character set but removes the four characters most likely to cause confusion: 0 (zero), O (uppercase o), I (uppercase i), and l (lowercase L). It also has no +, /, or = characters. If you see a Base64-like string that just happens to be missing those four characters, it is likely Base58.
pip3 install base58 # one-time install
python3 -c "import base58; print(base58.b58decode('StV1DL6CwTryKyV').decode())"
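If you cannot install packages on the machine you are working from, Base58 is simple enough to decode by hand: treat the string as one large base-58 number and convert it to bytes. The sketch below assumes the Bitcoin alphabet; other Base58 variants order the characters differently. The function name mirrors the library's but is defined locally.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58decode(text: str) -> bytes:
    """Decode Base58 (Bitcoin alphabet) by treating the text as a base-58 number."""
    number = 0
    for ch in text:
        # str.index raises ValueError for characters outside the alphabet.
        number = number * 58 + B58_ALPHABET.index(ch)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading '1' stands for one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body

print(b58decode("StV1DL6CwTryKyV"))
```

A ValueError from this function is itself a useful signal: the string contains a character Base58 never uses, so it is probably some other encoding.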
Decimal / ASCII
Sometimes a flag is encoded as a space-separated list of decimal numbers, each representing one ASCII character code. Printable ASCII falls in the range 32 to 126 (127 is the non-printing DEL control character), so if you see a list of numbers in that range, try converting each to a character.
python3 -c "print(''.join(chr(int(n)) for n in '112 105 99 111 67 84 70'.split()))"
Braille
Unicode Braille characters (U+2800 to U+28FF) occasionally appear in CTF stego or encoding challenges. Each Unicode Braille cell represents a letter or symbol. CyberChef has a Braille decode operation. You can also identify it visually because the characters look like raised-dot patterns.
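If you prefer to stay in the terminal, the letters can be decoded with a small lookup table built from the dot patterns. This sketch assumes uncontracted English Braille and covers only the letters a-z and the blank cell; digits, punctuation, and contractions are not handled.

```python
# Dot bitmasks for a-j. Unicode Braille stores dot 1 as 0x01, dot 2 as 0x02,
# dot 3 as 0x04, dot 4 as 0x08, dot 5 as 0x10 and dot 6 as 0x20 above U+2800.
A_TO_J = [0x01, 0x03, 0x09, 0x19, 0x11, 0x0B, 0x1B, 0x13, 0x0A, 0x1A]

BRAILLE = {chr(0x2800): " "}  # the blank cell acts as a space
for i, mask in enumerate(A_TO_J):
    BRAILLE[chr(0x2800 + mask)] = "abcdefghij"[i]
    BRAILLE[chr(0x2800 + (mask | 0x04))] = "klmnopqrst"[i]  # k-t add dot 3
for i, letter in enumerate("uvxyz"):
    BRAILLE[chr(0x2800 + (A_TO_J[i] | 0x24))] = letter      # add dots 3 and 6
BRAILLE[chr(0x2800 + 0x3A)] = "w"  # w is irregular: dots 2, 4, 5 and 6

def decode_braille(text: str) -> str:
    """Map each Braille cell to a letter; unknown characters pass through."""
    return "".join(BRAILLE.get(ch, ch) for ch in text)

print(decode_braille("\u280f\u280a\u2809\u2815"))
```

Cells that come back unchanged are a hint that the challenge uses digits, punctuation, or a contracted Braille table, at which point CyberChef is the quicker route.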
Bacon cipher
The Bacon cipher (invented by Francis Bacon in 1605) encodes each letter as a 5-character sequence of two symbols, traditionally A and B. For example, AAAAA is A, AAAAB is B, and so on. In CTFs it may use any two alternating symbols (uppercase vs lowercase, bold vs normal text, or two different characters). CyberChef's Bacon Cipher Decode operation handles all common variants.
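Since each group is effectively a 5-bit number, a decoder is only a few lines. This sketch assumes the 26-letter variant, where every letter has its own code; the classic 24-letter variant merges I/J and U/V, which shifts the later letters. The `zero` and `one` parameters are hypothetical names for whichever two symbols the challenge uses.

```python
def decode_bacon(text: str, zero: str = "A", one: str = "B") -> str:
    """Decode a 26-letter Bacon cipher written with two symbols."""
    # Translate the two symbols to bits and ignore everything else (spaces etc.).
    bits = "".join("0" if ch == zero else "1" for ch in text if ch in (zero, one))
    letters = []
    # Read complete 5-symbol groups; each is a binary number 0-25 (A-Z).
    for i in range(0, len(bits) - len(bits) % 5, 5):
        value = int(bits[i:i + 5], 2)
        letters.append(chr(ord("A") + value) if value < 26 else "?")
    return "".join(letters)

print(decode_bacon("ABBBB ABAAA AAABA ABBBA"))
print(decode_bacon("01111 01000", zero="0", one="1"))
```

For text that hides the two symbols in letter case, first map each character to a symbol (for example uppercase to B, lowercase to A) and then call the function on the result.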
Identifying an unknown encoding
When you find an unidentified string in a CTF challenge, work through this decision tree from top to bottom. Stop at the first rule that matches.
- Only 0s and 1s, length a multiple of 8: Binary encoding. Decode with chr(int(b, 2)) per 8-bit group.
- Only 0-7, groups of 3: Octal encoding. Decode with chr(int(o, 8)) per group.
- Only 0-9 a-f (or A-F), even length: Hex encoding. Decode with bytes.fromhex() or xxd -r -p.
- Only A-Z 2-7 with = padding, length a multiple of 8: Base32. Decode with base64.b32decode().
- A-Za-z 0-9 + / with = or == padding, length a multiple of 4: Base64. Decode with base64 -d or base64.b64decode().
- A-Za-z 0-9 - _ with = padding: Base64 URL variant. Decode with base64.urlsafe_b64decode().
- All alphanumeric but missing 0 O I l: Base58. Decode with the base58 Python library.
- Only letters, looks like shifted English: Caesar / ROT. Try ROT13 first, then brute-force all 26 shifts.
- Only letters, each maps to its mirror position (A=Z, B=Y): Atbash. Reverse the alphabet for each character.
- Only ., -, /, and spaces: Morse code. Decode with a Morse lookup table or CyberChef.
- Contains % signs followed by two hex digits: URL encoding. Decode with urllib.parse.unquote().
- Contains & followed by a word and ;: HTML entity encoding. Decode with html.unescape().
- Space-separated numbers in the range 32-126: ASCII decimal. Decode with chr(int(n)) per number.
- Two alternating symbols in groups of 5: Bacon cipher. Use CyberChef Bacon Cipher Decode.
- None of the above match: Paste the string into CyberChef and run the Magic operation. It will suggest the most likely encoding. If it still cannot identify it, the data might be encrypted (not just encoded) and you will need a key.
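The rules above translate fairly directly into code. The sketch below is one possible rendering, not a definitive classifier: the patterns are deliberately simple, the Bacon rule only recognises the A/B form, and Caesar and Atbash share one label because they cannot be told apart by character set alone. Because several rules can match the same string, it returns every match in priority order rather than only the first. `RULES` and `candidates` are hypothetical names.

```python
import re

def _decimal_ascii(s: str) -> bool:
    """True for two or more whitespace-separated numbers in the range 32-126."""
    parts = s.split()
    return len(parts) > 1 and all(
        p.isascii() and p.isdigit() and 32 <= int(p) <= 126 for p in parts
    )

# (name, test) pairs in the same priority order as the list above.
RULES = [
    ("binary", lambda s: re.fullmatch(r"[01]{8}(\s*[01]{8})*", s)),
    ("octal", lambda s: re.fullmatch(r"[0-7]{3}(\s+[0-7]{3})*", s)),
    ("hex", lambda s: re.fullmatch(r"(0x)?([0-9a-fA-F]{2})+", s)),
    ("base32", lambda s: len(s) % 8 == 0 and re.fullmatch(r"[A-Z2-7]+=*", s)),
    ("base64", lambda s: len(s) % 4 == 0
        and re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", s)),
    ("base64url", lambda s: re.fullmatch(
        r"[A-Za-z0-9_-]*[_-][A-Za-z0-9_-]*={0,2}", s)),
    ("base58", lambda s: re.fullmatch(r"[1-9A-HJ-NP-Za-km-z]+", s)),
    ("caesar/atbash", lambda s: re.fullmatch(r"[A-Za-z\s{}_!?,.']+", s)
        and re.search(r"[A-Za-z]", s)),
    ("morse", lambda s: re.fullmatch(r"[.\-/\s]+", s) and re.search(r"[.-]", s)),
    ("url", lambda s: re.search(r"%[0-9a-fA-F]{2}", s)),
    ("html", lambda s: re.search(r"&(#\d+|#x[0-9a-fA-F]+|[A-Za-z]+);", s)),
    ("decimal", _decimal_ascii),
    ("bacon", lambda s: re.fullmatch(r"[ABab]{5}(\s*[ABab]{5})*", s)),
]

def candidates(text: str) -> list[str]:
    """Return the names of all rules that match, highest priority first."""
    text = text.strip()
    return [name for name, test in RULES if test(text)]

for sample in ["7069636f435446", "cGljb0NURg==", "cvpbPGS{synth}", ".-. . .- -.."]:
    print(f"{sample!r}: {candidates(sample)}")
```

Treat the first entry as the best guess and the rest as fallbacks: a short all-letter string, for example, can legitimately match Base58 and Caesar at the same time.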
Quick reference
Use this table as a one-stop lookup when you recognize an encoding and just need the decode command.
| Encoding | Character set | Padding | Decode command |
|---|---|---|---|
| Base64 | A-Za-z0-9+/ | = or == | echo '...' \| base64 -d |
| Base64 URL | A-Za-z0-9-_ | = | base64.urlsafe_b64decode() |
| Base32 | A-Z2-7 | = (length % 8 == 0) | base64.b32decode() |
| Base58 | A-Za-z0-9 (no 0OIl) | None | base58.b58decode() |
| Hex | 0-9a-f | None (even length) | echo '...' \| xxd -r -p |
| Binary | 01 | None (length % 8 == 0) | chr(int(b, 2)) per group |
| Octal | 0-7 | None (groups of 3) | chr(int(o, 8)) per group |
| ROT13 | A-Za-z (shifted) | None | tr A-Za-z N-ZA-Mn-za-m |
| Caesar (N) | A-Za-z (shifted) | None | brute-force 26 shifts |
| Atbash | A-Za-z (reversed) | None | chr(ord('Z') - (ord(ch) - ord('A'))) |
| URL encoding | %XX sequences | None | urllib.parse.unquote() |
| HTML entities | &name; or &#NNN; | None | html.unescape() |
| ASCII decimal | 0-9 (32-126) | None (space-separated) | chr(int(n)) per number |
| Morse code | . - / | None | dict lookup or CyberChef |
Recommended first-pass workflow for an encoded string
- Look at the character set and length. Use the decision tree in the previous section to narrow down the encoding type.
- Try the most likely encoding first using the one-liners in this table. Check whether the output looks printable and contains picoCTF.
- If the output is printable but still looks encoded, run the decision tree again on the decoded output. Repeat until you see the flag or clearly non-text binary data.
- If you are stuck, paste the string into CyberChef and run the Magic operation. It will rank likely decodings automatically.
- If Magic does not help and the string has no recognisable structure, the data is likely encrypted rather than encoded. Look for a key in the challenge files, network traffic, or source code.
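The first three steps of this workflow can be automated for simple cases. The sketch below is a bounded depth-first search over three decoders (Base64, hex, URL) that stops as soon as a known marker appears. The decoder set, the default marker `picoCTF{`, and the depth limit are all assumptions you would adjust per challenge.

```python
import base64
from urllib.parse import quote, unquote

def _b64(s: str) -> str:
    return base64.b64decode(s, validate=True).decode()

def _hex(s: str) -> str:
    return bytes.fromhex(s).decode()

def _url(s: str) -> str:
    out = unquote(s)
    if out == s:
        raise ValueError("nothing to decode")
    return out

DECODERS = {"base64": _b64, "hex": _hex, "url": _url}

def auto_decode(text: str, marker: str = "picoCTF{", max_depth: int = 6):
    """Search decoder chains depth-first; return (plaintext, chain) or None."""
    def search(current, chain):
        if marker in current:
            return current, chain
        if len(chain) >= max_depth:
            return None
        for name, decoder in DECODERS.items():
            try:
                decoded = decoder(current)
            except ValueError:  # also covers binascii.Error and UnicodeDecodeError
                continue
            if decoded and decoded.isprintable():
                found = search(decoded, chain + [name])
                if found:
                    return found
        return None
    return search(text.strip(), [])

# Demo input: a sample flag wrapped in hex, then Base64, then URL encoding.
sample = quote(base64.b64encode("picoCTF{flag}".encode().hex().encode()).decode(), safe="")
print(auto_decode(sample))
```

The returned chain records which decoders were applied and in what order, which is handy when writing up the solution afterwards.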