Base64, Hex, and Common CTF Encodings Explained

Introduction

Every CTF beginner hits the same wall: you find a string that looks like random garbage, and you have no idea what it is or how to decode it. Is it Base64? Hex? Some cipher? This guide is the reference you wish you had on day one.

The most important concept to understand first is the difference between encoding and encryption. Encoding is a reversible transformation that requires no key: anyone who knows the scheme can decode it. Base64, hex, and URL encoding are all encodings. Encryption, by contrast, requires a secret key to reverse: AES, RSA, and XOR with a hidden key are all encryption. CTF challenge descriptions that use the word "encoding" almost always mean something you can reverse without any key material.
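
The difference can be sketched in a few lines, using Base64 as the encoding and a repeating-key XOR as a stand-in for encryption. The helper name xor_bytes and the key value are illustrative assumptions, not taken from any particular challenge.

```python
import base64


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (hypothetical helper for illustration)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


message = b"picoCTF{flag}"

# Encoding: anyone who knows the scheme can reverse it, no secret needed.
encoded = base64.b64encode(message)
assert base64.b64decode(encoded) == message

# Keyed XOR: reversing it requires the same key that produced it.
key = b"s3cret"  # made-up key for this example
ciphertext = xor_bytes(message, key)
assert xor_bytes(ciphertext, key) == message
assert xor_bytes(ciphertext, b"wrong!") != message
```

If a decode attempt needs a value you do not have, you are probably looking at encryption rather than encoding.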

CTF authors love to layer encodings. A flag might be hex-encoded, and the hex string might itself be Base64-encoded, and that Base64 might be URL-encoded in a query parameter. The strategy is always the same: decode one layer at a time, check whether the result looks printable and recognizable, and repeat until you see picoCTF{.
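
As a rough illustration of that layering, the sketch below builds the hex, then Base64, then URL-encoded example from the paragraph above and peels it apart one layer at a time. The sample flag and variable names are made up for the example.

```python
import base64
from urllib.parse import quote, unquote

flag = "picoCTF{flag}"

# Build the layered sample: hex, then Base64, then percent-encoding.
hex_layer = flag.encode().hex()
b64_layer = base64.b64encode(hex_layer.encode()).decode()
url_layer = quote(b64_layer, safe="")

# Peel the layers in reverse order, inspecting each intermediate result.
step1 = unquote(url_layer)                # URL decoding gives Base64 text
step2 = base64.b64decode(step1).decode()  # Base64 gives hex text
step3 = bytes.fromhex(step2).decode()     # hex gives the flag

for label, value in [("url", step1), ("base64", step2), ("hex", step3)]:
    print(f"{label:7s} -> {value}")
```

Printing every intermediate value is the point: each one should look like a recognizable encoding, and the last one should look like a flag.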

To identify an unknown encoding, look at the character set and the length. Each encoding has a distinctive fingerprint: only certain characters appear, the length often follows a pattern (always even, always a multiple of 4, always a multiple of 8), and some encodings have unmistakable visual signatures like trailing = padding or % percent signs. This guide covers every common encoding with its fingerprint and decode one-liner.

Skip the layer-by-layer hand-decode: Paste any suspicious string into Recipe Chain and click Magic. It auto-discovers the decode pipeline (base64 → hex → ROT → XOR → Morse → URL → Atbash → Vigenère, in any order) and stops when it hits a printable, flag-shaped result. The URL captures your full recipe so you can bookmark it. Use this guide when you want to understand the fingerprints below the surface or when Magic mode hits a dead end and you need to drive the pipeline by hand.

Base64

Base64 is the single most common encoding in CTF competitions. It represents binary data using 64 printable ASCII characters: the uppercase letters A-Z, the lowercase letters a-z, the digits 0-9, and the two symbols + and /. An = or == suffix is used as padding to make the total length a multiple of 4.

How to spot Base64

Character set: A-Z a-z 0-9 + / with optional trailing = or ==
Length is a multiple of 4 when the = padding is present (some sources strip it)
The string cGljb0NURg== decodes to picoCTF
Ratio of about 4 output characters per 3 bytes of input, so longer than the original
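
The fingerprint above can be checked mechanically. The following is a minimal sketch: the loop shows how the padding depends on the input length, and looks_like_base64 is a hypothetical helper that tests only the character set and length, so it will also accept ordinary words such as "abcd".

```python
import base64
import re

# Padding depends on the input length modulo 3.
for text in ["p", "pi", "pic", "pico"]:
    encoded = base64.b64encode(text.encode()).decode()
    print(f"{len(text)} byte(s) -> {encoded!r} ({len(encoded)} chars)")

B64_RE = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def looks_like_base64(s: str) -> bool:
    """Heuristic only: right alphabet, at most two '=' at the end, length % 4 == 0."""
    return len(s) % 4 == 0 and B64_RE.fullmatch(s) is not None


print(looks_like_base64("cGljb0NURg=="))   # True
print(looks_like_base64("picoCTF{flag}"))  # False
```

Treat a positive result as a hint to try decoding, not as proof.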

Decoding commands

# Decode a Base64 string on the command line
echo 'cGljb0NURntmbGFnfQ==' | base64 -d
# Decode a Base64-encoded file
base64 -d file.txt
# Decode in Python
python3 -c "import base64; print(base64.b64decode('cGljb0NURntmbGFnfQ==').decode())"

Multi-layer Base64

Some challenges encode the flag in Base64 multiple times. The repetitions challenge is a classic example: the flag is Base64-encoded six times. When the output of one decode is still a Base64-looking string, keep decoding. A quick loop handles this automatically:

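python3 -c "
import base64
data = open('encoded.txt').read().strip()
for _ in range(10):  # upper bound on layers; raise it if needed
    try:
        # Drop line wraps, then reject anything outside the Base64 alphabet
        # so the loop stops at the flag instead of decoding it into garbage.
        data = base64.b64decode(''.join(data.split()), validate=True).decode()
    except Exception:
        break
print(data)
"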

Base64 URL variant

The URL-safe variant of Base64 replaces + with - and / with _, so the output is safe to embed in a URL without percent-encoding. You will see this in JWT tokens and some web challenges. Use base64.urlsafe_b64decode() in Python, or manually swap the characters before using the standard decoder.

# Note the '-' in the input: standard base64 would use '+' there instead.
python3 -c "import base64; print(base64.urlsafe_b64decode('cGljb0NURnt-YjY0X3VybH59').decode())"  # picoCTF{~b64_url~}
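
JWT segments are normally transmitted with the trailing = padding stripped, and Python's decoder raises an error when the padding is missing. A minimal sketch of restoring it before decoding follows; b64url_decode is a hypothetical helper name.

```python
import base64


def b64url_decode(segment: str) -> bytes:
    """Decode URL-safe Base64 whose trailing '=' padding may have been stripped."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


print(b64url_decode("cGljb0NURg"))  # b'picoCTF'
```

The expression -len(segment) % 4 gives the number of characters needed to reach the next multiple of 4, and is 0 when the input is already padded.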

Challenge using Base64

picoCTF 2023 / repetitions

Hex encoding

Hex encoding (hexadecimal, base 16) represents each byte as exactly two characters from the set 0-9 a-f (sometimes uppercase A-F). Because every byte becomes two characters, a hex-encoded string always has an even number of digits. Hex values are often prefixed with 0x to signal that they are hexadecimal; the prefix is not part of the data, so strip it before decoding.

How to spot hex

Only the characters 0-9 and a-f (or A-F)
Always an even number of characters
Sometimes prefixed with 0x
The string 7069636f435446 decodes to picoCTF

Decoding commands

# Decode hex to text using xxd
echo '7069636f435446' | xxd -r -p
# Decode hex in Python
python3 -c "print(bytes.fromhex('7069636f435446').decode())"
# Read hex from stdin in Python (useful for piping)
echo '7069636f435446' | python3 -c "import sys; print(bytes.fromhex(sys.stdin.read().strip()).decode())"

The xxd -r -p flag combination is worth memorising: -r means reverse (hex to binary) and -p means plain hex input with no line numbers or offsets. Together they convert a bare hex string to raw bytes, which you can then pipe into other tools.
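
Hex in challenges is not always a bare string: it may carry 0x or \x prefixes, or use colons, commas, or spaces between bytes. The sketch below normalises those forms before decoding; hex_to_bytes is a hypothetical helper, and the list of separators is an assumption about what you are likely to meet.

```python
import re


def hex_to_bytes(raw: str) -> bytes:
    """Strip 0x and backslash-x prefixes plus common separators, then decode."""
    cleaned = re.sub(r"0x|\\x|[\s:,]", "", raw, flags=re.IGNORECASE)
    return bytes.fromhex(cleaned)


print(hex_to_bytes("0x7069636f435446"))      # b'picoCTF'
print(hex_to_bytes("70:69:63:6f:43:54:46"))  # b'picoCTF'
```

bytes.fromhex raises ValueError if anything other than hex digits is left, which is a useful signal that the string was not hex after all.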

Challenge using hex

picoCTF 2024 / binhexa

Binary encoding

Binary encoding represents each character as its 8-bit ASCII value written in base 2. Each byte becomes exactly 8 digits, all either 0 or 1. Groups are usually separated by spaces, so the string looks like a sequence of 8-digit patterns.

How to spot binary

Only the characters 0 and 1
Groups of exactly 8 digits, often space-separated
Total digit count (ignoring spaces) is a multiple of 8
01110000 01101001 01100011 decodes to pic

Decoding commands

python3 -c "
bits = '01110000 01101001 01100011 01101111 01000011 01010100 01000110'
print(''.join(chr(int(b, 2)) for b in bits.split()))
"

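The one-liner above relies on spaces between the groups. When the bits arrive as one unbroken run, slice them into 8-digit chunks first. A minimal sketch, with bits_to_text as a hypothetical helper name:

```python
def bits_to_text(bits: str) -> str:
    """Decode a string of 0s and 1s (with or without spaces) as 8-bit characters."""
    digits = "".join(bits.split())
    if len(digits) % 8 != 0:
        raise ValueError("bit count is not a multiple of 8")
    return "".join(chr(int(digits[i:i + 8], 2)) for i in range(0, len(digits), 8))


print(bits_to_text("011100000110100101100011"))  # pic
```
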
Octal encoding

Octal (base 8) uses only the digits 0-7 and is far less common than binary or hex in CTFs, but it does appear. Each character is typically represented as a 3-digit octal number. The decode is the same idea: convert each group from its base to an integer, then to a character.

# Decode octal-encoded text in Python
python3 -c "
octs = '160 151 143 157 103 124 106'
print(''.join(chr(int(o, 8)) for o in octs.split()))
"

Tip: When you see a string made of only 0s and 1s, it is binary (base 2). Only 0-7? It might be octal (base 8). Only 0-9 a-f? Hex (base 16). The base tells you which digits are possible.

ROT13, Caesar ciphers, and substitution

ROT13 is a simple letter-substitution cipher that shifts every letter in the alphabet forward by 13 positions. Because the alphabet has 26 letters, applying ROT13 twice returns the original text, making it self-inverse. It is used to obscure text without any key, and it appears constantly in CTF general-skills challenges.

The more general form is a Caesar cipher, which shifts by any number N from 1 to 25. To crack an unknown Caesar cipher, try every shift and look for the one that produces readable English; there are only 25 candidates besides the unshifted text itself.

How to spot ROT13 / Caesar

Contains only letters (and possibly punctuation or spaces)
The text looks like English but with shifted letters
cvpbPGS is picoCTF under ROT13
Non-letter characters (digits, braces, spaces) are usually left unchanged

Decoding commands

# ROT13 using tr (fast, works in bash)
echo 'cvpbPGS{synth}' | tr 'A-Za-z' 'N-ZA-Mn-za-m'
# ROT13 using Python codecs
python3 -c "
import codecs
print(codecs.decode('cvpbPGS{synth}', 'rot13'))
"

Brute-force all 26 Caesar shifts

python3 -c "
text = 'cvpbPGS{synth}'
for shift in range(26):
    result = ''
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            result += chr((ord(ch) - base + shift) % 26 + base)
        else:
            result += ch
    print(f'Shift {shift:2d}: {result}')
"

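Instead of scanning 26 lines by eye, you can let the script look for a known flag prefix. This is a sketch under the assumption that the plaintext contains the marker picoCTF; caesar_shift and find_flag_shift are hypothetical helper names.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters forward by `shift`, leaving other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)


def find_flag_shift(text: str, marker: str = "picoCTF"):
    """Return (shift, plaintext) for the first shift that reveals the marker."""
    for shift in range(26):
        candidate = caesar_shift(text, shift)
        if marker in candidate:
            return shift, candidate
    return None


print(find_flag_shift("cvpbPGS{synth}"))
```

If the flag format is unknown, fall back to printing all candidates as in the loop above.
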
Atbash cipher

Atbash reverses the alphabet: A becomes Z, B becomes Y, and so on. It is its own inverse, just like ROT13. To decode, map each letter to its mirror position: chr(ord('Z') - (ord(ch) - ord('A'))) for uppercase. CyberChef has a dedicated Atbash operation.

python3 -c "
text = 'krxlXGU'
result = ''
for ch in text:
    if ch.isupper():
        result += chr(ord('Z') - (ord(ch) - ord('A')))
    elif ch.islower():
        result += chr(ord('z') - (ord(ch) - ord('a')))
    else:
        result += ch
print(result)
"

CyberChef tip: The ROT13 Brute Force operation in CyberChef tries all 26 shifts at once and shows all results, making it easy to spot the readable one visually.

Challenge using rotation

picoCTF 2023 / rotation

URL encoding

URL encoding (also called percent-encoding) replaces characters that are not safe in a URL with a % sign followed by two hex digits giving the byte's value (for ASCII characters, this is the ASCII code). A space becomes %20, an opening brace becomes %7B, and so on. You will see this encoding constantly in web challenges where a flag is embedded in a URL or in a form parameter.

How to spot URL encoding

Contains % signs followed by exactly two hex digits
Often mixed with regular printable ASCII characters
%70%69%63%6f%43%54%46%7b%66%6c%61%67%7d decodes to picoCTF{flag}

Decoding commands

# Decode URL-encoded string in Python
python3 -c "from urllib.parse import unquote; print(unquote('%70%69%63%6f%43%54%46%7b%66%6c%61%67%7d'))"
# Decode form-encoded data, where + stands for a space
python3 -c "import urllib.parse; print(urllib.parse.unquote_plus('hello+world+%7Bflag%7D'))"

HTML entity encoding

HTML uses its own encoding scheme for special characters. A less-than sign becomes &lt;, a greater-than sign becomes &gt;, and an ampersand itself becomes &amp;. You encounter this in web challenges where the server reflects your input back in HTML but has encoded the angle brackets to prevent XSS. Python's html.unescape() handles this:

python3 -c "import html; print(html.unescape('&lt;script&gt;alert(&quot;flag&quot;)&lt;/script&gt;'))"

Double encoding

Web challenges sometimes double-encode payloads: a % character is itself URL-encoded as %25, so %2570 decodes first to %70 and then to p. If you decode once and get more percent signs, decode again. This trick is used in WAF-bypass challenges where filters check the encoded form but the server decodes twice.

# Decode twice for double-encoded strings
python3 -c "
from urllib.parse import unquote
encoded = '%2570%2569%2563%256f'
print(unquote(unquote(encoded)))
"

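When you do not know how many times a value was encoded, you can decode until the string stops changing. A minimal sketch, with unquote_fully as a hypothetical helper and the round limit chosen arbitrarily:

```python
from urllib.parse import unquote


def unquote_fully(value: str, max_rounds: int = 10) -> str:
    """Percent-decode repeatedly until the value no longer changes."""
    for _ in range(max_rounds):
        decoded = unquote(value)
        if decoded == value:
            break
        value = decoded
    return value


print(unquote_fully("%2570%2569%2563%256f"))  # pico
```

One caveat: if the real plaintext contains a literal % followed by two hex digits, this loop will decode one layer too many, so check the intermediate values when the result looks wrong.
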
Morse code

Morse code encodes letters and digits as sequences of short signals (dots, .) and long signals (dashes, -). Each character is separated by a space, and each word is separated by a forward slash / or a longer gap. In CTFs, Morse code shows up as a string of dots, dashes, and separators, and sometimes as an audio file you must transcribe.

How to spot Morse code

Only the characters . and - (and spaces or / as separators)
Short groups of one to five symbols per letter or digit, separated by spaces
.-. . .- -.. decodes to READ

Decoding with Python

python3 -c "
MORSE = {
    '.-': 'A', '-...': 'B', '-.-.': 'C', '-..': 'D', '.': 'E',
    '..-.': 'F', '--.': 'G', '....': 'H', '..': 'I', '.---': 'J',
    '-.-': 'K', '.-..': 'L', '--': 'M', '-.': 'N', '---': 'O',
    '.--.': 'P', '--.-': 'Q', '.-.': 'R', '...': 'S', '-': 'T',
    '..-': 'U', '...-': 'V', '.--': 'W', '-..-': 'X', '-.--': 'Y',
    '--..': 'Z', '-----': '0', '.----': '1', '..---': '2',
    '...--': '3', '....-': '4', '.....': '5', '-....': '6',
    '--...': '7', '---..': '8', '----.': '9',
}
msg = '-- --- .-. ... . / -.-. --- -.. .'
words = msg.split(' / ')
print(' '.join(''.join(MORSE.get(c, '?') for c in w.split()) for w in words))
"

Online tools: morsecode.world and morsify.net decode a pasted Morse string instantly, and CyberChef also has a Morse Code decode operation. For audio-based challenges, transcribe the signal into dots and dashes first, then decode the text.

Challenge using Morse code

picoCTF 2022 / morse-code

Other encodings to know

Beyond the core encodings covered above, a handful of less common encodings appear often enough that you should be able to recognise them on sight.

Base32

Base32 uses the uppercase letters A-Z and the digits 2-7 (32 characters total). It avoids ambiguous digits like 0 and 1. The output is padded with = signs to make the length a multiple of 8. It looks like all-caps Base64 with no lowercase letters.

python3 -c "import base64; print(base64.b32decode('OBUWG32DKRDA====').decode())"  # picoCTF

Base58

Base58 is used in Bitcoin addresses and some password hashes. It uses the standard alphanumeric character set but removes the four characters most likely to cause confusion: 0 (zero), O (uppercase o), I (uppercase i), and l (lowercase L). If you see a Base64-like string that just happens to be missing those four characters, it is likely Base58.

pip3 install base58   # one-time install
python3 -c "import base58; print(base58.b58decode('StV1DL6CwTryKyV').decode())"  # hello world

Decimal / ASCII

Sometimes a flag is encoded as a space-separated list of decimal numbers, each representing one ASCII character code. Printable ASCII falls in the range 32 to 126 (127 is the non-printing DEL character), so if you see a list of numbers in that range, try converting each to a character.

python3 -c "print(''.join(chr(int(n)) for n in '112 105 99 111 67 84 70'.split()))"

Braille

Unicode Braille characters (U+2800 to U+28FF) occasionally appear in CTF stego or encoding challenges. Each Unicode Braille cell represents a letter or symbol. CyberChef has a Braille decode operation. You can also identify it visually because the characters look like raised-dot patterns.

Bacon cipher

The Bacon cipher (invented by Francis Bacon in 1605) encodes each letter as a 5-character sequence of two symbols, traditionally A and B. For example, AAAAA is A, AAAAB is B, and so on. In CTFs it may use any two distinguishable symbols or styles (uppercase vs lowercase, bold vs normal text, or two different characters). CyberChef's Bacon Cipher Decode operation handles all common variants.
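
For the plain A/B form, a decoder is short enough to write by hand. The sketch below assumes the 26-letter variant, in which each 5-symbol group is simply the letter's index in binary (A = 0, B = 1, and so on). The classic 24-letter alphabet merges I/J and U/V, so letters after I decode differently there. bacon_decode is a hypothetical helper name.

```python
import string


def bacon_decode(text: str, zero: str = "A", one: str = "B") -> str:
    """Decode 26-letter Bacon: each group of 5 symbols is a 5-bit letter index."""
    # Keep only the two cipher symbols and turn them into a bit string.
    bits = "".join("0" if ch == zero else "1" for ch in text if ch in (zero, one))
    letters = []
    # Ignore any incomplete trailing group.
    for i in range(0, len(bits) - len(bits) % 5, 5):
        index = int(bits[i:i + 5], 2)
        letters.append(string.ascii_uppercase[index] if index < 26 else "?")
    return "".join(letters)


print(bacon_decode("AABAB ABABB AAAAA AABBA"))  # FLAG
```

For other symbol pairs, pass them as zero and one, for example bacon_decode(data, zero="0", one="1").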

CyberChef Magic operation: When none of the above matches, paste your string into CyberChef and use the Magic operation. It tries dozens of encodings automatically, scores the results by how much they look like valid text, and shows you the top candidates. It is the best fallback tool for unknown encodings.

Identifying an unknown encoding

When you find an unidentified string in a CTF challenge, work through this decision tree from top to bottom. Stop at the first rule that matches.

Only 0s and 1s, length a multiple of 8: Binary encoding. Decode with chr(int(b, 2)) per 8-bit group.
Only 0-7, groups of 3: Octal encoding. Decode with chr(int(o, 8)) per group.
Only 0-9 a-f (or A-F), even length: Hex encoding. Decode with bytes.fromhex() or xxd -r -p.
Only A-Z 2-7 with = padding, length a multiple of 8: Base32. Decode with base64.b32decode().
A-Za-z 0-9 + / with = or == padding, length a multiple of 4: Base64. Decode with base64 -d or base64.b64decode().
A-Za-z 0-9 - _ with optional = padding: Base64 URL variant. Decode with base64.urlsafe_b64decode().
All alphanumeric but missing 0 O I l: Base58. Decode with the base58 Python library.
Only letters, looks like shifted English: Caesar / ROT. Try ROT13 first, then brute-force all 26 shifts.
Only letters, each maps to its mirror position (A=Z, B=Y): Atbash. Reverse the alphabet for each character.
Only ., -, /, and spaces: Morse code. Decode with a Morse lookup table or CyberChef.
Contains % signs followed by two hex digits: URL encoding. Decode with urllib.parse.unquote().
Contains & followed by a word and ;: HTML entity encoding. Decode with html.unescape().
Space-separated numbers in the range 32-126: ASCII decimal. Decode with chr(int(n)) per number.
Two alternating symbols in groups of 5: Bacon cipher. Use CyberChef Bacon Cipher Decode.
None of the above match: Paste the string into Recipe Chain (or CyberChef) and run the Magic operation. It will suggest the most likely encoding. If it still cannot identify it, the data might be encrypted (not just encoded) and you will need a key.
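
The mechanical rules in this checklist can be roughed out as a first-match classifier. This is only a heuristic sketch: RULES and guess_encoding are hypothetical names, the regular expressions are simplifications, and the rules that need human judgement (Caesar, Atbash, Base58, Bacon) are left out.

```python
import re

# (name, predicate) pairs in checklist order; the first match wins.
RULES = [
    ("binary", lambda s: re.fullmatch(r"[01 ]+", s) and len(s.replace(" ", "")) % 8 == 0),
    ("octal", lambda s: re.fullmatch(r"[0-7]{3}( [0-7]{3})*", s)),
    ("hex", lambda s: re.fullmatch(r"[0-9a-fA-F]+", s) and len(s) % 2 == 0),
    ("base32", lambda s: re.fullmatch(r"[A-Z2-7]+=*", s) and len(s) % 8 == 0),
    ("base64", lambda s: re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", s) and len(s) % 4 == 0),
    ("base64url", lambda s: re.fullmatch(r"[A-Za-z0-9_-]+={0,2}", s) and re.search(r"[-_]", s)),
    ("morse", lambda s: re.fullmatch(r"[.\-/ ]+", s)),
    ("url", lambda s: re.search(r"%[0-9a-fA-F]{2}", s)),
    ("html-entity", lambda s: re.search(r"&#?\w+;", s)),
    ("decimal", lambda s: re.fullmatch(r"\d+( \d+)+", s)
        and all(32 <= int(n) <= 126 for n in s.split())),
]


def guess_encoding(s: str) -> str:
    """Return the name of the first rule that matches, or 'unknown'."""
    s = s.strip()
    for name, matches in RULES:
        if matches(s):
            return name
    return "unknown"


for sample in ["7069636f435446", "cGljb0NURg==", "OBUWG32DKRDA====", "112 105 99"]:
    print(f"{sample!r}: {guess_encoding(sample)}")
```

Because the first match wins, ambiguous inputs go to the earlier rule: a decimal list such as "112 105 111" uses only the digits 0-7 in groups of three and is reported as octal. Always confirm a guess by decoding and checking whether the output is readable.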

Remember: Encodings can be layered. If one decode produces a string that still looks encoded, run through this checklist again on the decoded output. Multi-layer encodings of 3-6 levels deep are a common CTF pattern, which is exactly what Recipe Chain is built to handle: stack as many decode steps as you need and watch the output evolve live at each stage.

Quick reference

Use this table as a one-stop lookup when you recognize an encoding and just need the decode command.

Encoding	Character set	Padding	Decode command
Base64	A-Za-z0-9+/	= or ==	echo '...' | base64 -d
Base64 URL	A-Za-z0-9-_	=	base64.urlsafe_b64decode()
Base32	A-Z2-7	= (length % 8 == 0)	base64.b32decode()
Base58	A-Za-z0-9 (no 0OIl)	None	base58.b58decode()
Hex	0-9a-f	None (even length)	echo '...' | xxd -r -p
Binary	01	None (length % 8 == 0)	chr(int(b, 2)) per group
Octal	0-7	None (groups of 3)	chr(int(o, 8)) per group
ROT13	A-Za-z (shifted)	None	tr A-Za-z N-ZA-Mn-za-m
Caesar (N)	A-Za-z (shifted)	None	brute-force 26 shifts
Atbash	A-Za-z (reversed)	None	chr(ord('Z') - (ord(ch) - ord('A')))
URL encoding	%XX sequences	None	urllib.parse.unquote()
HTML entities	&name; or &#NNN;	None	html.unescape()
ASCII decimal	0-9 (values 32-126)	None (space-separated)	chr(int(n)) per number
Morse code	. - /	None	dict lookup or CyberChef

Recommended first-pass workflow for an encoded string

Look at the character set and length. Use the decision tree in the previous section to narrow down the encoding type.
Try the most likely encoding first using the one-liners in the quick-reference table. Check whether the output looks printable and contains picoCTF.
If the output is printable but still looks encoded, run the decision tree again on the decoded output. Repeat until you see the flag or clearly non-text binary data.
If you are stuck, paste the string into CyberChef and run the Magic operation. It will rank likely decodings automatically.
If Magic does not help and the string has no recognisable structure, the data is likely encrypted rather than encoded. Look for a key in the challenge files, network traffic, or source code.
