April 11, 2026

Base64, Hex, and Common CTF Encodings Explained

Identify and decode every encoding you will encounter in CTF competitions: Base64, hex, binary, octal, ROT13, URL encoding, Morse code, and more -- with one-liners for each.

Introduction

Every CTF beginner hits the same wall: you find a string that looks like random garbage, and you have no idea what it is or how to decode it. Is it Base64? Hex? Some cipher? This guide is the reference you wish you had on day one.

The most important concept to understand first is the difference between encoding and encryption. Encoding is a reversible transformation that requires no key: anyone who knows the scheme can decode it. Base64, hex, and URL encoding are all encodings. Encryption, by contrast, requires a secret key to reverse: AES, RSA, and XOR with a hidden key are all encryption. CTF challenge descriptions that use the word "encoding" almost always mean something you can reverse without any key material.

CTF authors love to layer encodings. A flag might be hex-encoded, and the hex string might itself be Base64-encoded, and that Base64 might be URL-encoded in a query parameter. The strategy is always the same: decode one layer at a time, check whether the result looks printable and recognizable, and repeat until you see picoCTF{.

To identify an unknown encoding, look at the character set and the length. Each encoding has a distinctive fingerprint: only certain characters appear, the length often follows a pattern (always even, always a multiple of 4, always a multiple of 8), and some encodings have unmistakable visual signatures like trailing = padding or % percent signs. This guide covers every common encoding with its fingerprint and decode one-liner.

Base64

Base64 is the single most common encoding in CTF competitions. It represents binary data using 64 printable ASCII characters: the uppercase letters A-Z, the lowercase letters a-z, the digits 0-9, and the two symbols + and /. An = or == suffix is used as padding to make the total length a multiple of 4.

How to spot Base64

  • Character set: A-Z a-z 0-9 + / with optional trailing = or ==
  • Length is a multiple of 4 whenever padding is present (some sources, such as JWT segments, strip the padding)
  • The string cGljb0NURg== decodes to picoCTF
  • Ratio of about 4 output characters per 3 bytes of input, so longer than the original
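
To make the 3-bytes-to-4-characters ratio and the = padding concrete, here is a minimal sketch that performs the encoding by hand. The function name b64_encode_by_hand is made up for illustration; in real work you would call base64.b64encode directly.

```python
import base64

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/"

def b64_encode_by_hand(data: bytes) -> str:
    """Illustrative only: encode bytes as Base64 by regrouping bits 8 -> 6."""
    # Write every byte as 8 bits and join them into one long bit string.
    bits = "".join(f"{byte:08b}" for byte in data)
    # Pad the bit string with zeros so it splits evenly into 6-bit groups.
    bits += "0" * (-len(bits) % 6)
    # Each 6-bit group is an index into the 64-character alphabet.
    chars = [ALPHABET[int(bits[i:i + 6], 2)] for i in range(0, len(bits), 6)]
    # Append '=' until the output length is a multiple of 4.
    return "".join(chars) + "=" * (-len(chars) % 4)

encoded = b64_encode_by_hand(b"picoCTF")
print(encoded)  # cGljb0NURg==
# Sanity check against the standard library encoder.
assert encoded == base64.b64encode(b"picoCTF").decode()
```

Seven input bytes give ten data characters, and two = signs bring the total to twelve, which is why the fingerprint is "length divisible by 4".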

Decoding commands

# Decode a Base64 string on the command line
echo 'cGljb0NURntmbGFnfQ==' | base64 -d
# Decode a Base64-encoded file
base64 -d file.txt
# Decode in Python
python3 -c "import base64; print(base64.b64decode('cGljb0NURntmbGFnfQ==').decode())"

Multi-layer Base64

Some challenges encode the flag in Base64 multiple times. The repetitions challenge is a classic example: the flag is Base64-encoded six times. When the output of one decode is still a Base64-looking string, keep decoding. A quick loop handles this automatically:

python3 -c "
import base64
data = open('encoded.txt').read().strip()
# Peel up to 10 layers; stop as soon as the data no longer decodes as Base64 text
for _ in range(10):
    try:
        data = base64.b64decode(data).decode()
    except Exception:
        break
print(data)
"

Base64 URL variant

The URL-safe variant of Base64 replaces + with - and / with _, so the output is safe to embed in a URL without percent-encoding. You will see this in JWT tokens and some web challenges. Use base64.urlsafe_b64decode() in Python, or manually swap the characters before using the standard decoder.

python3 -c "import base64; print(base64.urlsafe_b64decode('cGljb0NURntmbGFnfQ==').decode())"
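
JWT segments and many web tokens strip the trailing = padding, and Python's decoder raises an "Incorrect padding" error on such input. The sketch below shows one way to restore the padding before decoding. The helper name b64url_decode is hypothetical, not part of the standard library.

```python
import base64

def b64url_decode(segment: str) -> bytes:
    """Decode URL-safe Base64 even when the trailing '=' padding was stripped."""
    # Re-add as many '=' as needed to reach a multiple of 4 characters.
    padded = segment + "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(padded)

# 'cGljb0NURg' is the earlier picoCTF example with its '==' removed.
print(b64url_decode("cGljb0NURg").decode())  # picoCTF
# '-' and '_' stand in for '+' and '/' in the URL-safe alphabet.
print(b64url_decode("-_8"))  # b'\xfb\xff'
```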

Hex encoding

Hex encoding (hexadecimal, base 16) represents each byte as exactly two characters from the set 0-9 a-f (sometimes uppercase A-F). Because every byte becomes two characters, a hex-encoded string is always an even number of characters long. Hex values are sometimes prefixed with 0x to signal that they are hexadecimal; the prefix is not part of the data, so strip it before decoding.

How to spot hex

  • Only the characters 0-9 and a-f (or A-F)
  • Always an even number of characters
  • Sometimes prefixed with 0x
  • The string 7069636f435446 decodes to picoCTF

Decoding commands

# Decode hex to text using xxd
echo '7069636f435446' | xxd -r -p
# Decode hex in Python
python3 -c "print(bytes.fromhex('7069636f435446').decode())"
# Read hex from stdin in Python (useful for piping)
echo '7069636f435446' | python3 -c "import sys; print(bytes.fromhex(sys.stdin.read().strip()).decode())"

The xxd -r -p flag combination is worth memorising: -r means reverse (hex to binary) and -p means plain hex input with no line numbers or offsets. Together they convert a bare hex string to raw bytes, which you can then pipe into other tools.
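
Hex in challenge files often arrives with 0x prefixes, colons, or commas between the bytes. As a rough sketch (decode_hex is a made-up helper name), you can normalise the string first and then hand it to bytes.fromhex:

```python
import re

def decode_hex(text: str) -> bytes:
    """Decode hex that may carry 0x prefixes, spaces, colons, or commas."""
    # Drop every '0x' / '0X' prefix first, then anything that is not a hex digit.
    stripped = re.sub(r"0[xX]", "", text)
    digits = re.sub(r"[^0-9a-fA-F]", "", stripped)
    # bytes.fromhex raises ValueError if an odd number of digits remains.
    return bytes.fromhex(digits)

print(decode_hex("0x7069636f435446"))  # b'picoCTF'
print(decode_hex("70:69:63:6F"))       # b'pico'
```

An odd digit count after cleanup usually means a character was lost in copying, or the data is not hex at all.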

Binary encoding

Binary encoding represents each character as its 8-bit ASCII value written in base 2. Each byte becomes exactly 8 digits, all either 0 or 1. Groups are usually separated by spaces, so the string looks like a sequence of 8-digit patterns.

How to spot binary

  • Only the characters 0 and 1
  • Groups of exactly 8 digits, often space-separated
  • Total digit count (ignoring spaces) is a multiple of 8
  • 01110000 01101001 01100011 decodes to pic

Decoding commands

python3 -c "
bits = '01110000 01101001 01100011 01101111 01000011 01010100 01000110'
print(''.join(chr(int(b, 2)) for b in bits.split()))
"

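The one-liner above relies on spaces between the groups. When the bits arrive as one unbroken run, a sketch like the following (decode_binary is a hypothetical name) slices the run into 8-bit chunks instead:

```python
def decode_binary(text: str) -> str:
    """Decode 8-bit binary text, with or without spaces between the groups."""
    # Keep only the 0 and 1 characters, discarding spaces and newlines.
    bits = "".join(ch for ch in text if ch in "01")
    if len(bits) % 8 != 0:
        raise ValueError("bit count is not a multiple of 8")
    # Convert each 8-bit slice to its character.
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

print(decode_binary("011100000110100101100011"))    # pic
print(decode_binary("01110000 01101001 01100011"))  # pic
```
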
Octal encoding

Octal (base 8) uses only the digits 0-7 and is far less common than binary or hex in CTFs, but it does appear. Each character is typically represented as a 3-digit octal number. The decode is the same idea: convert each group from its base to an integer, then to a character.

# Decode octal-encoded text in Python
python3 -c "
octs = '160 151 143 157 103 124 106'
print(''.join(chr(int(o, 8)) for o in octs.split()))
"
Tip: When you see a string made of only 0s and 1s, it is binary (base 2). Only 0-7? It might be octal (base 8). Only 0-9 a-f? Hex (base 16). The base tells you which digits are possible.
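
Since binary, octal, decimal, and space-separated hex all follow the same "group to integer to character" idea, one small helper can cover them. This is only a sketch, and decode_groups is a made-up name:

```python
def decode_groups(text: str, base: int) -> str:
    """Convert whitespace-separated numbers in the given base to characters."""
    return "".join(chr(int(group, base)) for group in text.split())

print(decode_groups("160 151 143 157 103 124 106", 8))  # picoCTF
print(decode_groups("01110000 01101001 01100011", 2))   # pic
print(decode_groups("70 69 63", 16))                     # pic
```

If int() raises ValueError, a digit in the input is not valid for the base you guessed, which is itself a useful hint about the real base.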

ROT13, Caesar ciphers, and substitution

ROT13 is a simple letter-substitution cipher that shifts every letter in the alphabet forward by 13 positions. Because the alphabet has 26 letters, applying ROT13 twice returns the original text, making it self-inverse. It is used to obscure text without any key, and it appears constantly in CTF general-skills challenges.

The more general form is a Caesar cipher, which shifts by any number N from 1 to 25. To crack an unknown Caesar cipher, try every shift (25 non-trivial ones, or all 26 if you include the identity shift 0 for completeness) and look for the one that produces readable English.

How to spot ROT13 / Caesar

  • Contains only letters (and possibly punctuation or spaces)
  • The text looks like English but with shifted letters
  • cvpbPGS is picoCTF under ROT13
  • Non-letter characters (digits, braces, spaces) are usually left unchanged

Decoding commands

# ROT13 using tr (fast, works in bash)
echo 'cvpbPGS{synth}' | tr 'A-Za-z' 'N-ZA-Mn-za-m'
# ROT13 using Python codecs
python3 -c "
import codecs
print(codecs.decode('cvpbPGS{synth}', 'rot13'))
"

Brute-force all 26 Caesar shifts

python3 -c "
text = 'cvpbPGS{synth}'
for shift in range(26):
    result = ''
    for ch in text:
        if ch.isalpha():
            # Rotate within the matching case; leave everything else untouched
            base = ord('A') if ch.isupper() else ord('a')
            result += chr((ord(ch) - base + shift) % 26 + base)
        else:
            result += ch
    print(f'Shift {shift:2d}: {result}')
"

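Scanning 26 lines by eye works, but when you know the flag format you can let the script pick the shift. The sketch below assumes the plaintext contains a known crib such as picoCTF. Both function names are invented for illustration.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift ASCII letters forward by `shift`, leaving other characters alone."""
    out = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return "".join(out)

def crack_caesar(text: str, crib: str = "picoCTF"):
    """Return (shift, plaintext) for the first shift whose output contains crib."""
    for shift in range(26):
        candidate = caesar_shift(text, shift)
        if crib in candidate:
            return shift, candidate
    return None  # no shift produced the crib

# 'slfrFWI{iodj}' is picoCTF{flag} shifted forward by 3, so shift 23 undoes it.
print(crack_caesar("slfrFWI{iodj}"))  # (23, 'picoCTF{flag}')
```

If the crib is unknown, fall back to printing every candidate as in the brute-force loop above.
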
Atbash cipher

Atbash reverses the alphabet: A becomes Z, B becomes Y, and so on. It is its own inverse, just like ROT13. To decode, map each letter to its mirror position: chr(ord('Z') - (ord(ch) - ord('A'))) for uppercase. CyberChef has a dedicated Atbash operation.

python3 -c "
# 'krxlXGU' is picoCTF with every letter mirrored (p->k, i->r, c->x, o->l, ...)
text = 'krxlXGU'
result = ''
for ch in text:
    if ch.isupper():
        result += chr(ord('Z') - (ord(ch) - ord('A')))
    elif ch.islower():
        result += chr(ord('z') - (ord(ch) - ord('a')))
    else:
        result += ch
print(result)
"
CyberChef tip: The ROT13 Brute Force operation in CyberChef tries all 26 shifts at once and shows all results, making it easy to spot the readable one visually.

URL encoding

URL encoding (also called percent-encoding) replaces characters that are not safe in a URL with a % sign followed by two hex digits representing the character's ASCII code. A space becomes %20, an opening brace becomes %7B, and so on. You will see this encoding constantly in web challenges where a flag is embedded in a URL or in a form parameter.

How to spot URL encoding

  • Contains % signs followed by exactly two hex digits
  • Often mixed with regular printable ASCII characters
  • %70%69%63%6f%43%54%46%7b%66%6c%61%67%7d decodes to picoCTF{flag}

Decoding commands

# Decode URL-encoded string in Python
python3 -c "from urllib.parse import unquote; print(unquote('%70%69%63%6f%43%54%46%7b%66%6c%61%67%7d'))"
# Decode form-style input, where + stands for a space, with unquote_plus
python3 -c "import urllib.parse; print(urllib.parse.unquote_plus('hello+world+%7Bflag%7D'))"

HTML entity encoding

HTML uses its own encoding scheme for special characters. A less-than sign becomes &lt;, a greater-than sign becomes &gt;, and an ampersand itself becomes &amp;. You encounter this in web challenges where the server reflects your input back in HTML but has encoded the angle brackets to prevent XSS. Python's html.unescape() handles this:

python3 -c "import html; print(html.unescape('&lt;script&gt;alert(&quot;flag&quot;)&lt;/script&gt;'))"

Double encoding

Web challenges sometimes double-encode payloads: a % character is itself URL-encoded as %25, so %2570 decodes first to %70 and then to p. If you decode once and get more percent signs, decode again. This trick is used in WAF-bypass challenges where filters check the encoded form but the server decodes twice.

# Decode twice for double-encoded strings
python3 -c "
from urllib.parse import unquote
encoded = '%2570%2569%2563%256f'
print(unquote(unquote(encoded)))
"

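When you do not know how many times a payload was encoded, you can decode until the string stops changing. A minimal sketch follows. The name unquote_fully and the 10-round cap are arbitrary choices for illustration.

```python
from urllib.parse import unquote

def unquote_fully(text: str, max_rounds: int = 10) -> str:
    """Percent-decode repeatedly until the string stops changing."""
    for _ in range(max_rounds):
        decoded = unquote(text)
        if decoded == text:
            break  # nothing left to decode
        text = decoded
    return text

print(unquote_fully("%252570%252569%252563%25256f"))  # pico
```

Be aware that this over-decodes if the real plaintext legitimately contains a %XX sequence, so check the intermediate results when the output looks wrong.
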
Morse code

Morse code encodes letters and digits as sequences of short signals (dots, .) and long signals (dashes, -). Each character is separated by a space, and each word is separated by a forward slash / or a longer gap. In CTFs, Morse code shows up as a string of dots, dashes, and separators, and sometimes as an audio file you must transcribe.

How to spot Morse code

  • Only the characters . and - (and spaces or / as separators)
  • Short groups with a rhythmic structure: one to four symbols per letter, five per digit
  • .-. . .- -.. decodes to READ

Decoding with Python

python3 -c "
MORSE = {
    '.-': 'A', '-...': 'B', '-.-.': 'C', '-..': 'D', '.': 'E',
    '..-.': 'F', '--.': 'G', '....': 'H', '..': 'I', '.---': 'J',
    '-.-': 'K', '.-..': 'L', '--': 'M', '-.': 'N', '---': 'O',
    '.--.': 'P', '--.-': 'Q', '.-.': 'R', '...': 'S', '-': 'T',
    '..-': 'U', '...-': 'V', '.--': 'W', '-..-': 'X', '-.--': 'Y',
    '--..': 'Z', '-----': '0', '.----': '1', '..---': '2',
    '...--': '3', '....-': '4', '.....': '5', '-....': '6',
    '--...': '7', '---..': '8', '----.': '9',
}
msg = '-- --- .-. ... . / -.-. --- -.. .'
# Words are separated by ' / ', letters by single spaces
words = msg.split(' / ')
print(' '.join(''.join(MORSE.get(c, '?') for c in w.split()) for w in words))
"
Online tools: Sites such as morsecode.world or morsify.net let you paste in a text Morse string and get the decoded output instantly. For audio-based Morse challenges, transcribe the dots and dashes first, or use a tool that accepts audio input. CyberChef also has a Morse Code decode operation.

Other encodings to know

Beyond the core encodings covered above, a handful of less common encodings appear often enough that you should be able to recognise them on sight.

Base32

Base32 uses the uppercase letters A-Z and the digits 2-7 (32 characters total). It avoids ambiguous digits like 0 and 1. The output is padded with = signs to make the length a multiple of 8. It looks like all-caps Base64 with no lowercase letters.

python3 -c "import base64; print(base64.b32decode('OBUWG32DKRDA====').decode())"

Base58

Base58 is used in Bitcoin addresses and some password hashes. It uses the standard alphanumeric character set but removes the four characters most likely to cause confusion: 0 (zero), O (uppercase o), I (uppercase i), and l (lowercase L). If you see a Base64-like string that just happens to be missing those four characters, it is likely Base58.

pip3 install base58 # one-time install
python3 -c "import base58; print(base58.b58decode('StV1DL6CwTryKyV').decode())"
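
If you cannot install the base58 package, the decode is short enough to write with the standard library alone. The sketch below assumes the Bitcoin alphabet, which is the usual one in CTFs. The name b58decode_pure is made up to avoid clashing with the library function.

```python
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"

def b58decode_pure(text: str) -> bytes:
    """Decode Base58 (Bitcoin alphabet) using only the standard library."""
    # Treat the whole string as one big base-58 number.
    number = 0
    for ch in text:
        number = number * 58 + B58_ALPHABET.index(ch)  # ValueError on bad chars
    # Convert that number to big-endian bytes.
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    # Each leading '1' in the input stands for one leading zero byte.
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body

print(b58decode_pure("StV1DL6CwTryKyV").decode())  # hello world
```

Unlike Base64, Base58 has no fixed bytes-to-characters grouping, which is why the decoder works on the string as a single large integer.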

Decimal / ASCII

Sometimes a flag is encoded as a space-separated list of decimal numbers, each representing one ASCII character code. Printable ASCII falls in the range 32 to 126 (127 is the non-printing DEL control character), so if you see a list of numbers in that range, try converting each to a character.

python3 -c "print(''.join(chr(int(n)) for n in '112 105 99 111 67 84 70'.split()))"

Braille

Unicode Braille characters (U+2800 to U+28FF) occasionally appear in CTF stego or encoding challenges. Each Unicode Braille cell represents a letter or symbol. CyberChef has a Braille decode operation. You can also identify it visually because the characters look like raised-dot patterns.

Bacon cipher

The Bacon cipher (invented by Francis Bacon in 1605) encodes each letter as a 5-character sequence of two symbols, traditionally A and B. For example, AAAAA is A, AAAAB is B, and so on. In CTFs it may use any two distinguishable symbols (uppercase vs lowercase, bold vs normal text, or two different characters). CyberChef's Bacon Cipher Decode operation handles all common variants.
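
For a quick offline decode, the 5-symbol groups can be read as 5-bit binary numbers. The sketch below assumes the 26-letter variant, where every letter has its own code. Bacon's original 24-letter alphabet merges I/J and U/V and would need a different table. The name bacon_decode is invented for illustration.

```python
def bacon_decode(text: str) -> str:
    """Decode the 26-letter Bacon variant: A=AAAAA, B=AAAAB, ..., Z=BBAAB."""
    # Keep only the A/B symbols, ignoring spaces and anything else.
    symbols = [ch for ch in text.upper() if ch in "AB"]
    letters = []
    # Walk the symbols in complete groups of 5, ignoring any leftover tail.
    for i in range(0, len(symbols) - len(symbols) % 5, 5):
        group = symbols[i:i + 5]
        # Read A as 0 and B as 1 to get a number from 0 to 31.
        value = int("".join("1" if s == "B" else "0" for s in group), 2)
        letters.append(chr(ord("A") + value) if value < 26 else "?")
    return "".join(letters)

print(bacon_decode("AABBB AABAA ABABB ABABB ABBBA"))  # HELLO
```

When the two symbols are something else, such as lowercase vs uppercase, translate them to A and B first, for example with "".join("B" if c.isupper() else "A" for c in s if c.isalpha()).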

CyberChef Magic operation: When none of the above matches, paste your string into CyberChef and use the Magic operation. It tries dozens of encodings automatically, scores the results by how much they look like valid text, and shows you the top candidates. It is the best fallback tool for unknown encodings.

Identifying an unknown encoding

When you find an unidentified string in a CTF challenge, work through this checklist from top to bottom. Stop at the first rule that matches.

  1. Only 0s and 1s, length a multiple of 8: Binary encoding. Decode with chr(int(b, 2)) per 8-bit group.
  2. Only 0-7, groups of 3: Octal encoding. Decode with chr(int(o, 8)) per group.
  3. Only 0-9 a-f (or A-F), even length: Hex encoding. Decode with bytes.fromhex() or xxd -r -p.
  4. Only A-Z 2-7 with = padding, length a multiple of 8: Base32. Decode with base64.b32decode().
  5. A-Za-z 0-9 + / with = or == padding, length a multiple of 4: Base64. Decode with base64 -d or base64.b64decode().
  6. A-Za-z 0-9 - _ with optional = padding (often stripped, as in JWTs): Base64 URL variant. Decode with base64.urlsafe_b64decode().
  7. All alphanumeric but missing 0 O I l: Base58. Decode with the base58 Python library.
  8. Only letters, looks like shifted English: Caesar / ROT. Try ROT13 first, then brute-force all 26 shifts.
  9. Only letters, each maps to its mirror position (A=Z, B=Y): Atbash. Reverse the alphabet for each character.
  10. Only ., -, /, and spaces: Morse code. Decode with a Morse lookup table or CyberChef.
  11. Contains % signs followed by two hex digits: URL encoding. Decode with urllib.parse.unquote().
  12. Contains & followed by a word and ;: HTML entity encoding. Decode with html.unescape().
  13. Space-separated numbers in the range 32-126: ASCII decimal. Decode with chr(int(n)) per number.
  14. Two distinct symbols in groups of 5: Bacon cipher. Use CyberChef Bacon Cipher Decode.
  15. None of the above match: Paste the string into CyberChef and run the Magic operation. It will suggest the most likely encoding. If it still cannot identify it, the data might be encrypted (not just encoded) and you will need a key.
Remember: Encodings can be layered. If one decode produces a string that still looks encoded, run through this checklist again on the decoded output. Multi-layer encodings of 3-6 levels deep are a common CTF pattern.
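
The checklist can be roughly approximated in code. The sketch below is a heuristic, not a reliable detector: it returns every rule that matches, in checklist order, because short strings often fit several fingerprints. The function name candidate_encodings and the exact regular expressions are assumptions made for this example.

```python
import re

def candidate_encodings(s: str) -> list:
    """Return the names of the checklist rules that match s, in checklist order."""
    s = s.strip()
    compact = re.sub(r"\s+", "", s)  # the string with all whitespace removed
    tokens = s.split()               # whitespace-separated groups
    rules = [
        ("binary", re.fullmatch(r"[01]+", compact) and len(compact) % 8 == 0),
        ("octal", all(re.fullmatch(r"[0-7]{3}", t) for t in tokens)),
        ("hex", re.fullmatch(r"(0[xX])?[0-9a-fA-F]+", s) and len(s) % 2 == 0),
        ("base32", re.fullmatch(r"[A-Z2-7]+=*", s) and len(s) % 8 == 0),
        ("base64", re.fullmatch(r"[A-Za-z0-9+/]+={0,2}", s) and len(s) % 4 == 0),
        ("base64url", re.fullmatch(r"[A-Za-z0-9_-]+=*", s) and re.search(r"[-_]", s)),
        ("base58", re.fullmatch(r"[1-9A-HJ-NP-Za-km-z]+", s)),
        ("caesar/atbash", re.fullmatch(r"[A-Za-z\s{}_]+", s)),
        ("morse", re.fullmatch(r"[.\-/\s]+", s)),
        ("url", re.search(r"%[0-9a-fA-F]{2}", s)),
        ("html", re.search(r"&(#\d+|#[xX][0-9a-fA-F]+|[A-Za-z]\w*);", s)),
        ("decimal", all(re.fullmatch(r"[0-9]{1,3}", t) and 32 <= int(t) <= 126
                        for t in tokens)),
        ("bacon", len(set(compact)) == 2 and len(compact) % 5 == 0),
    ]
    # An empty string matches nothing, even though all() is true on no tokens.
    return [name for name, matched in rules if matched and s]

print(candidate_encodings("cGljb0NURg=="))    # ['base64']
print(candidate_encodings("cvpbPGS{synth}"))  # ['caesar/atbash']
```

Treat the first entry as the decoder to try first and the remaining entries as fallbacks; a letters-only string of the right length, for instance, can match both Base64 and Caesar.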

Quick reference

Use this table as a one-stop lookup when you recognize an encoding and just need the decode command.

Encoding       Character set          Padding                   Decode command
-------------  ---------------------  ------------------------  -----------------------------------
Base64         A-Za-z0-9+/            = or ==                   echo '...' | base64 -d
Base64 URL     A-Za-z0-9-_            =                         base64.urlsafe_b64decode()
Base32         A-Z2-7                 = (length % 8 == 0)       base64.b32decode()
Base58         A-Za-z0-9 (no 0OIl)    None                      base58.b58decode()
Hex            0-9a-f                 None (even length)        echo '...' | xxd -r -p
Binary         01                     None (length % 8 == 0)    chr(int(b, 2)) per group
Octal          0-7                    None (groups of 3)        chr(int(o, 8)) per group
ROT13          A-Za-z (shifted)       None                      tr A-Za-z N-ZA-Mn-za-m
Caesar (N)     A-Za-z (shifted)       None                      brute-force 26 shifts
Atbash         A-Za-z (reversed)      None                      chr(ord('Z') - (ord(c) - ord('A')))
URL encoding   %XX sequences          None                      urllib.parse.unquote()
HTML entities  &name; or &#NNN;       None                      html.unescape()
ASCII decimal  0-9 (32-126)           None (space-separated)    chr(int(n)) per number
Morse code     . - /                  None                      dict lookup or CyberChef

Recommended first-pass workflow for an encoded string

  1. Look at the character set and length. Use the identification checklist in the previous section to narrow down the encoding type.
  2. Try the most likely encoding first using the one-liners in the quick-reference table above. Check whether the output looks printable and contains picoCTF.
  3. If the output is printable but still looks encoded, run the checklist again on the decoded output. Repeat until you see the flag or clearly non-text binary data.
  4. If you are stuck, paste the string into CyberChef and run the Magic operation. It will rank likely decodings automatically.
  5. If Magic does not help and the string has no recognisable structure, the data is likely encrypted rather than encoded. Look for a key in the challenge files, network traffic, or source code.
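
The decode-check-repeat loop in this workflow can be automated for the most common layers. The sketch below tries only hex, Base64, and URL decoding, and assumes the flag contains the marker picoCTF{. All function names, the decoder list, and the depth limit are illustrative assumptions, not a complete solver.

```python
import base64
from urllib.parse import unquote

def try_hex(s):
    return bytes.fromhex(s).decode("ascii")

def try_base64(s):
    # validate=True rejects characters outside the Base64 alphabet.
    return base64.b64decode(s, validate=True).decode("ascii")

def try_url(s):
    return unquote(s)

DECODERS = [("hex", try_hex), ("base64", try_base64), ("url", try_url)]

def peel_layers(s, marker="picoCTF{", max_depth=8):
    """Depth-first search over decoders; returns (steps, plaintext) or None."""
    s = s.strip()
    if marker in s:
        return [], s
    if max_depth == 0:
        return None
    for name, decoder in DECODERS:
        try:
            decoded = decoder(s)
        except ValueError:  # also covers binascii.Error and UnicodeDecodeError
            continue
        # Skip decoders that changed nothing or produced unprintable output.
        if decoded == s or not decoded.isprintable():
            continue
        result = peel_layers(decoded, marker, max_depth - 1)
        if result is not None:
            steps, plaintext = result
            return [name] + steps, plaintext
    return None

# Build a two-layer test input: flag -> hex text -> Base64 of that hex text.
flag = "picoCTF{flag}"
layered = base64.b64encode(flag.encode().hex().encode()).decode()
print(peel_layers(layered))  # (['base64', 'hex'], 'picoCTF{flag}')
```

A None result means none of the three decoders led to the marker, which is the cue to move on to CyberChef Magic or to suspect encryption.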