Introduction
A hex dump is a textual representation of binary data where every byte is shown as its two-digit hexadecimal value. When you open a PNG, an ELF binary, or any raw file in a hex editor, this is what you see. CTF challenges put hex dumps in front of you constantly: forensics categories hand you mystery files with unknown contents, reverse engineering challenges require you to understand compiled binaries at the byte level, and general skills challenges sometimes include corrupted or deliberately mislabeled files that you must identify and repair.
Being comfortable reading hex output separates players who can inspect a file in thirty seconds from those who spend an hour guessing. The core skill is pattern recognition: you learn to spot magic bytes at the start of a file, ASCII strings embedded in binary data, and structural anomalies that hint at steganography or tampering.
This guide covers the full hex analysis toolkit: the xxd command-line dumper, common file signatures, string extraction, patching bytes, graphical hex editors, endianness, and CyberChef for quick transformations.
Reading xxd output
xxd is the standard tool for generating hex dumps on Linux. Every line it prints has three columns separated by whitespace:
- Byte offset (left): the position of the first byte on this line, expressed in hexadecimal. The very first line starts at 00000000.
- Hex bytes (middle): sixteen bytes shown as eight space-separated groups of four hex digits. Each group is two bytes, and each pair of hex digits is one byte.
- ASCII representation (right): the same sixteen bytes decoded as ASCII characters. Any byte that is not a printable ASCII character (values below 0x20 or above 0x7E) is shown as a dot (.).
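To make the three-column layout concrete, here is a minimal Python sketch that reproduces xxd's default line format. The function names xxd_line and xxd are illustrative, not part of any library, and the sketch ignores every xxd option.

```python
def xxd_line(offset: int, chunk: bytes) -> str:
    """Format up to 16 bytes in the style of xxd's default output."""
    # Hex column: two bytes (four hex digits) per group.
    groups = [chunk[i:i + 2].hex() for i in range(0, len(chunk), 2)]
    hex_col = " ".join(groups)
    # ASCII column: printable bytes (0x20-0x7E) as-is, everything else as '.'.
    ascii_col = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    # 39 = eight 4-digit groups plus seven separating spaces.
    return f"{offset:08x}: {hex_col:<39}  {ascii_col}"


def xxd(data: bytes) -> str:
    """Dump data sixteen bytes per line."""
    return "\n".join(
        xxd_line(i, data[i:i + 16]) for i in range(0, len(data), 16)
    )


# First 16 bytes of a PNG file: signature plus the start of the IHDR chunk.
png_header = bytes.fromhex("89504e470d0a1a0a0000000d49484452")
print(xxd(png_header))
```

The offset column is simply the index of the first byte on the line, which is why it advances by 0x10 each time.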
Annotated example
    $ xxd example.png | head -4
    00000000: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452  .PNG........IHDR
    00000010: 0000 0080 0000 0080 0806 0000 00c3 3e61  ..............>a
    00000020: 0000 0004 6741 4d41 0000 b18f 0bfc 6105  ....gAMA......a.
    00000030: 0000 0009 7048 5973 0000 0ec4 0000 0ec4  ....pHYs........

    # Column 1 (offset): 00000000, 00000010, 00000020, 00000030
    #   Each line advances by 0x10 = 16 bytes
    # Column 2 (hex): first four bytes are 89 50 4E 47
    #   That is the PNG magic number (see the Magic bytes section)
    # Column 3 (ASCII): 89 is non-printable -> shown as '.'
    #   50 4E 47 = 'P', 'N', 'G' -> shown as 'PNG'
Useful xxd flags
    # Basic dump of the whole file
    xxd file

    # Dump only the first 64 bytes (great for checking magic bytes)
    xxd -l 64 file

    # Skip to byte offset 100 and dump from there
    xxd -s 100 file

    # Skip to offset 100 and show only 32 bytes
    xxd -s 100 -l 32 file

    # Show one byte per line (useful for scripting)
    xxd -c 1 file

    # Plain hex dump with no offsets or ASCII column (useful for piping)
    xxd -p file
Tip: pipe the dump through less -S to scroll a large dump without wrapping: xxd largefile | less -S. Press q to exit.
Magic bytes and file signatures
The first few bytes of most binary formats are a fixed signature called magic bytes. They exist so the operating system and tools like file can identify a file regardless of its extension. In CTFs, challenges routinely rename files with the wrong extension (a ZIP named flag.png, an ELF named mystery.txt) to slow you down. Checking the magic bytes immediately tells you the true format.
Common file signatures
| Format | Magic bytes (hex) | ASCII hint |
|---|---|---|
| PNG | 89 50 4E 47 0D 0A 1A 0A | .PNG.... |
| JPEG | FF D8 FF | ... |
| ZIP / JAR / DOCX | 50 4B 03 04 | PK.. |
| PDF | 25 50 44 46 | %PDF |
| ELF (Linux binary) | 7F 45 4C 46 | .ELF |
| GIF | 47 49 46 38 | GIF8 |
| gzip / .gz | 1F 8B | .. |
| BMP | 42 4D | BM |
| MP3 (with ID3v2 tag) | 49 44 33 | ID3 |
| 7-Zip | 37 7A BC AF 27 1C | 7z..'. |
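The table lookup can be automated. The following is a rough Python sketch, assuming you only care about the signatures listed above; the SIGNATURES mapping and the identify function are illustrative names, and the real file command knows far more formats.

```python
# Signatures taken from the table above (hex -> format name).
SIGNATURES = {
    bytes.fromhex("89504e470d0a1a0a"): "PNG",
    bytes.fromhex("ffd8ff"): "JPEG",
    bytes.fromhex("504b0304"): "ZIP / JAR / DOCX",
    bytes.fromhex("25504446"): "PDF",
    bytes.fromhex("7f454c46"): "ELF",
    bytes.fromhex("47494638"): "GIF",
    bytes.fromhex("1f8b"): "gzip",
    bytes.fromhex("424d"): "BMP",
    bytes.fromhex("494433"): "MP3 (ID3 tag)",
    bytes.fromhex("377abcaf271c"): "7-Zip",
}


def identify(header: bytes) -> str:
    """Return the format whose magic bytes prefix the header, or 'unknown'."""
    for magic, name in SIGNATURES.items():
        if header.startswith(magic):
            return name
    return "unknown"


# A file named flag.png that really starts with the ZIP signature.
print(identify(b"PK\x03\x04\x14\x00\x00\x00"))
```

To use it on a real file, read the first few bytes with open(path, 'rb').read(8) and pass them in.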
Checking magic bytes quickly
    # Show just the first 8 bytes
    xxd -l 8 mystery_file

    # The fast way: let the 'file' command read magic bytes for you
    file mystery_file
    # Example output:
    mystery_file: PNG image data, 128 x 128, 8-bit/color RGBA, non-interlaced

    # If the extension is wrong, 'file' still reports the real type
    file flag.png
    flag.png: Zip archive data, at least v2.0 to extract
Tip: if the extension disagrees with what file reports, rename the file and treat it as the real format. A ZIP disguised as a PNG often contains the flag file inside the archive.
Finding strings and flags in hex dumps
Flags in CTF challenges are almost always printable ASCII strings (e.g. picoCTF{...}). Even inside compiled binaries, compressed archives, or raw disk images, the flag text often sits in plaintext and can be extracted without fully understanding the file format.
Using strings
The strings command scans a binary and prints every sequence of four or more consecutive printable ASCII characters. It is the fastest way to look for a flag.
    # Default: sequences of 4+ printable ASCII chars
    strings file

    # Require at least 6 characters (reduces noise)
    strings -n 6 file

    # Search the output for the flag prefix
    strings file | grep -i pico

    # Search case-insensitively for any common flag format
    strings file | grep -iE 'pico|flag|ctf'

    # Show the byte offset of each string (-t x = hex offset)
    strings -t x file | grep pico
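If strings is not available, or you want the offsets inside a script, the same idea fits in a few lines of Python. This is a simplified sketch: extract_strings is an illustrative name, it treats only bytes 0x20 to 0x7E as printable, and the sample bytes and flag text are made up for the demonstration.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Hypothetical binary blob with a flag-like string buried between junk bytes.
sample = b"\x00\x01picoCTF{example}\xff\x00ab\x00hello\x7f"

for offset, text in extract_strings(sample):
    print(f"{offset:x} {text}")
```

Raising min_len plays the same role as strings -n: the two-byte run "ab" in the sample is already skipped at the default of 4.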
Grepping through a hex dump
Sometimes you want to search for a string while also seeing the surrounding hex context. Pipe xxd output through grep:
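    # Search hex dump output for the flag prefix (matches the ASCII column)
    xxd file | grep 'pico'

    # Case-insensitive search
    xxd file | grep -i 'pico'

    # Search for a hex pattern directly (e.g. the PNG magic bytes)
    # Only matches when the bytes start at an even offset and fit on one line
    xxd file | grep '8950 4e47'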
Flags split across line boundaries
Because xxd splits output into 16-byte lines, a flag string that starts near the end of one line will be split across two lines. The grep approach above will miss it in that case. Use strings or Python instead, since both operate on the raw byte stream without artificial line breaks:
    # Python: read raw bytes and search for the flag
    python3 -c "data = open('file', 'rb').read()
    idx = data.find(b'picoCTF{')
    if idx >= 0:
        end = data.index(b'}', idx)
        print(data[idx:end+1].decode())"
Editing bytes with xxd
xxd is not just a reader. The -r (reverse) flag converts hex dump text back into binary. This lets you patch a binary by editing a plain text file, which is much easier than using a hex editor when you know exactly which bytes to change.
The xxd -r patch workflow
    # Step 1: dump the binary to a text file
    xxd binary > dump.txt

    # Step 2: open dump.txt in any text editor and change the bytes
    # Example: change the first byte from 00 to 89 (fix a PNG header)
    # Before: 00000000: 0050 4e47 0d0a 1a0a ...  .PNG....
    # After:  00000000: 8950 4e47 0d0a 1a0a ...  .PNG....

    # Step 3: convert the edited text back to binary
    xxd -r dump.txt > patched_binary

    # Step 4: verify the result
    file patched_binary
    xxd -l 8 patched_binary
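It helps to know what the reverse step actually reads: only the offset and the hex column matter, so you never need to touch the ASCII column when editing. The Python sketch below imitates that behaviour for the default dump layout. xxd_reverse is an illustrative name, and the parsing rule (cut the line at the first double space) is an assumption that holds for default xxd output only.

```python
def xxd_reverse(dump: str) -> bytes:
    """Rebuild bytes from default-format xxd text, ignoring the ASCII column."""
    out = bytearray()
    for line in dump.splitlines():
        if ":" not in line:
            continue  # not a dump line
        offset_text, rest = line.split(":", 1)
        offset = int(offset_text, 16)
        # Hex groups are single-space separated; two spaces start the ASCII column.
        hex_part = rest.lstrip(" ").split("  ", 1)[0]
        chunk = bytes.fromhex(hex_part)
        # Grow the buffer if this line lands past the current end.
        if len(out) < offset + len(chunk):
            out.extend(b"\x00" * (offset + len(chunk) - len(out)))
        out[offset:offset + len(chunk)] = chunk
    return bytes(out)


# A corrupted PNG header line: first byte is 00 instead of 89.
dump = "00000000: 0050 4e47 0d0a 1a0a                      .PNG...."
patched = dump.replace("0050 4e47", "8950 4e47", 1)  # edit the hex column only
print(xxd_reverse(patched))
```

The stale ASCII column in the edited line has no effect on the rebuilt bytes.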
When is this useful in a CTF?
- Corrupted file headers: a challenge provides a file that tools cannot open because the first few bytes were deliberately scrambled. Fix the magic bytes and the file becomes readable.
- Magic byte check bypass: a binary checks that its input starts with specific bytes before processing it. Patch your input file to satisfy the check.
- Flipping a flag byte: a program reads a single byte at a known offset and branches on it. Change that byte to take the other branch.
Surgical patches with printf and Python
For single-byte changes, xxd round-tripping is overkill. Use printf or Python to write exactly the bytes you need:
    # Overwrite bytes at a specific offset using Python
    python3 -c "data = bytearray(open('binary', 'rb').read())
    data[0] = 0x89  # fix first byte
    data[1] = 0x50  # fix second byte
    open('patched', 'wb').write(data)"

    # One-liner to write a single byte at offset 0 with dd
    printf '\x89' | dd of=binary bs=1 seek=0 count=1 conv=notrunc
Hex editors
When you need to visually explore a binary file, scroll through it, and make interactive edits, a dedicated hex editor is faster than the command line. These are the tools most CTFers reach for:
- hexedit (install: sudo apt install hexedit): Lightweight ncurses editor. Opens directly in the terminal. Use arrow keys to navigate, type hex digits to overwrite. Good for quick edits when you don't have a GUI.
- Bless (install: sudo apt install bless): A clean GTK hex editor. Shows hex and ASCII side by side with search and replace. Good default choice for Linux desktop environments.
- HxD (install: download from mh-nexus.de): Free, fast Windows hex editor with a familiar tabbed interface. Handles very large files efficiently. The standard recommendation for Windows users.
- 010 Editor (install: download from sweetscape.com; paid, trial available): The most powerful option. Its killer feature is Binary Templates: structured parsers for PNG, ELF, ZIP, PE, and many more formats that overlay field names on the hex view. Invaluable for reverse engineering known formats.
Install hexedit and Bless
    # Install both on Debian / Ubuntu / Kali
    sudo apt install hexedit bless

    # Open a file in hexedit
    hexedit file.bin

    # hexedit key bindings
    # Arrow keys   navigate
    # Tab          toggle between hex and ASCII pane
    # Ctrl+X       save and exit
    # Ctrl+C       exit without saving
    # Ctrl+S       search forward (enter hex or ASCII string)
Endianness in hex dumps
Endianness describes the byte order used to store multi-byte integers in memory. It is one of the most common sources of confusion when reading hex dumps, especially in reverse engineering challenges.
Little-endian vs big-endian
Consider the 32-bit integer 0x12345678. It occupies four consecutive bytes in memory. How those bytes are ordered depends on the architecture:
    # Big-endian (network byte order, most significant byte first)
    Address:  00 01 02 03
    Bytes:    12 34 56 78

    # Little-endian (x86 / x86-64, least significant byte first)
    Address:  00 01 02 03
    Bytes:    78 56 34 12

    # In a hex dump of a little-endian binary you would see:
    00000000: 7856 3412                                xV4.
    # and need to mentally reverse the four bytes to recover 0x12345678
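x86 and x86-64 (the architectures most CTF binaries run on) are little-endian. ARM is little-endian by default. Network protocols and file formats like PNG use big-endian (network byte order) for multi-byte fields. ELF is not fixed either way: the EI_DATA byte at offset 5 of the header declares the byte order (1 = little-endian, 2 = big-endian), so x86 and x86-64 ELF binaries store their header fields little-endian.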
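A quick way to check which interpretation you are looking at is Python's built-in int.from_bytes. The short sketch below decodes the same four bytes both ways; the variable names are arbitrary.

```python
raw = bytes.fromhex("78563412")  # the four bytes as they appear in the dump

little = int.from_bytes(raw, "little")
big = int.from_bytes(raw, "big")
print(hex(little))  # 0x12345678
print(hex(big))     # 0x78563412

# PNG stores the IHDR width big-endian: 00 00 00 80 is 128 pixels.
width = int.from_bytes(bytes.fromhex("00000080"), "big")
print(width)  # 128

# Going the other way: integer -> bytes in a chosen byte order.
assert (0x12345678).to_bytes(4, "little") == raw
```

If a decoded value looks absurdly large, such as a width of 2147483648 pixels, you have most likely picked the wrong byte order.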
Python struct for multi-byte values
When you need to read a 2-, 4-, or 8-byte integer from a binary file, Python's struct module handles the byte-order conversion for you:
    import struct

    data = open('file.bin', 'rb').read()

    # Read a 4-byte little-endian unsigned int at offset 0
    value = struct.unpack_from('<I', data, offset=0)[0]
    print(hex(value))

    # Read a 4-byte big-endian unsigned int
    value = struct.unpack_from('>I', data, offset=0)[0]
    print(hex(value))

    # Format string quick reference:
    # '<' = little-endian   '>' = big-endian
    # 'B' = uint8   'H' = uint16   'I' = uint32   'Q' = uint64
    # 'b' = int8    'h' = int16    'i' = int32    'q' = int64
CyberChef for hex manipulation
CyberChef (gchq.github.io/CyberChef) is a browser-based data transformation tool. It has over 300 operations and lets you chain them visually with no coding. For hex manipulation it is often faster than writing a Python script.
Key hex operations
- From Hex: paste a hex string like 70 69 63 6f and decode it to bytes or ASCII. Works with or without spaces and with 0x prefixes.
- To Hex: convert raw text or bytes to their hex representation, with configurable delimiters.
- Magic: paste unknown data and let CyberChef auto-detect the encoding or format. It tries dozens of transformations and scores how printable the result is. Very useful when you have no idea what encoding you are dealing with.
- XOR: XOR the input with a key, which can be entered as hex. A classic CTF encoding.
- Entropy: measure the Shannon entropy of the data. Values near 8.0 bits per byte suggest encryption or compression. Low values suggest plain text or structured data.
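When you need the same operations in a script, rough Python equivalents are short. The sketch below covers From Hex, XOR with a repeating key, and Shannon entropy; the function names are illustrative and are not CyberChef's API, and the single-byte key 0x42 is an arbitrary example.

```python
import math
from collections import Counter


def from_hex(text: str) -> bytes:
    """Decode '70 69 63 6f', '7069636f', or '0x70 0x69 ...' style input."""
    cleaned = text.replace("0x", "").replace(",", " ")
    return bytes.fromhex(cleaned)


def xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (n / total) * math.log2(n / total) for n in Counter(data).values()
    )


print(from_hex("70 69 63 6f"))                  # b'pico'
print(xor(xor(b"pico", b"\x42"), b"\x42"))      # XOR twice restores the input
print(shannon_entropy(bytes(range(256))))       # every byte value once
print(shannon_entropy(b"aaaaaaaa"))             # a single repeated byte
```

Uniformly distributed bytes score at the top of the scale and a single repeated byte scores at the bottom, which is the contrast the Entropy operation exposes.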
Chaining operations
CyberChef shines when you need to apply multiple transformations in sequence. A typical CTF chain:
    # CyberChef recipe (add these operations in order in the UI)
    1. From Hex     <- decode the hex dump to raw bytes
    2. Reverse      <- reverse the byte order
    3. To Base64    <- encode the result as base64

    # Another common chain for an encoded flag:
    1. From Base64  <- decode outer base64 layer
    2. From Hex     <- decode inner hex encoding
    3. ROT13        <- final ROT13 shift
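The second chain translates directly to the Python standard library, which is handy once you know the layers and want to automate them. This is a sketch under the assumption that every layer is plain ASCII; decode_chain and encode_chain are illustrative names, and the flag text is a made-up placeholder.

```python
import base64
import codecs


def decode_chain(blob: str) -> str:
    """From Base64 -> From Hex -> ROT13, mirroring the recipe order."""
    hex_text = base64.b64decode(blob).decode("ascii")   # outer base64 layer
    inner = bytes.fromhex(hex_text).decode("ascii")     # inner hex layer
    return codecs.decode(inner, "rot_13")               # final ROT13 shift


def encode_chain(text: str) -> str:
    """Inverse of decode_chain, used here only to build a test input."""
    rotated = codecs.encode(text, "rot_13")
    hex_text = rotated.encode("ascii").hex()
    return base64.b64encode(hex_text.encode("ascii")).decode("ascii")


blob = encode_chain("picoCTF{example}")  # hypothetical challenge data
print(blob)
print(decode_chain(blob))
```

Each line of decode_chain corresponds to one operation in the recipe, so reordering or swapping a layer is a one-line change.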
CyberChef vs the command line
Use CyberChef when you are exploring and not sure what encoding you are dealing with, when you want to visually chain operations without writing code, or when you need a quick one-off transformation. Use the command line when you are automating, working with large files (CyberChef struggles above a few MB in the browser), or need to integrate the result into a script.
Hex analysis workflow for CTF
When a forensics or binary challenge hands you a file you have never seen before, work through this checklist in order. Each step takes only seconds and rules out large classes of possibilities before you spend time on manual inspection.
Decision checklist
- Run file to identify the type. Do not trust the extension. file mystery reads the magic bytes and reports the real format. If it says "data" or "ASCII text," the file may be encoded or intentionally obfuscated.
- Run xxd -l 32 file to verify magic bytes manually. Cross-reference the first four bytes against the magic bytes table above. This confirms what file reported and shows you whether the header is partially corrupted.
- Run strings file | grep -i pico to find obvious flags. Many challenges embed the flag in plaintext. This takes one second and sometimes solves the challenge outright.
- Check file size: is it larger than expected? A PNG that should be about 200 bytes but is actually 2 MB may have data appended after the IEND chunk. Run binwalk file to scan for embedded file signatures. binwalk -e file extracts them automatically.
- Open in a hex editor for manual inspection. Scroll through the file looking for ASCII regions (visible text in the right column), repeating byte patterns, or abrupt transitions from structured data to apparent random bytes (which may indicate encryption).
- If it is an image, check for steganography. zsteg (PNG/BMP) looks for data hidden in the least significant bits of pixel values. Run zsteg -a file.png to try all common channel combinations. For JPEG files, steghide extract -sf file.jpg attempts to extract a steghide payload.
    # The six-step checklist as a command sequence
    file mystery_file
    xxd -l 32 mystery_file
    strings mystery_file | grep -i pico
    ls -lh mystery_file && binwalk mystery_file
    hexedit mystery_file
    zsteg -a mystery_file   # if it turned out to be a PNG or BMP
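The "data appended after IEND" check from the checklist can also be done by hand. A PNG is an 8-byte signature followed by chunks, each laid out as a 4-byte big-endian length, a 4-byte type, the payload, and a 4-byte CRC. The sketch below walks that list and returns whatever follows IEND. png_trailing_data and chunk are illustrative names, the sample file is built in memory, and CRCs are not verified.

```python
import struct
import zlib

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def png_trailing_data(data: bytes) -> bytes:
    """Return any bytes that follow the IEND chunk of a PNG."""
    if not data.startswith(PNG_MAGIC):
        raise ValueError("not a PNG")
    pos = len(PNG_MAGIC)
    while pos + 8 <= len(data):
        length, ctype = struct.unpack_from(">I4s", data, pos)
        pos += 8 + length + 4  # chunk header + payload + CRC
        if ctype == b"IEND":
            return data[pos:]
    raise ValueError("no IEND chunk found")


def chunk(ctype: bytes, payload: bytes) -> bytes:
    """Build one PNG chunk (helper for the demo only)."""
    crc = zlib.crc32(ctype + payload)
    return struct.pack(">I", len(payload)) + ctype + payload + struct.pack(">I", crc)


# Minimal demo file: signature, IHDR, IEND, then a hypothetical hidden payload.
ihdr = struct.pack(">IIBBBBB", 128, 128, 8, 6, 0, 0, 0)
clean = PNG_MAGIC + chunk(b"IHDR", ihdr) + chunk(b"IEND", b"")
sample = clean + b"picoCTF{example}"

print(png_trailing_data(sample))
```

An empty result means nothing was appended. binwalk remains the better general tool because it also finds payloads embedded in the middle of a file.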
Quick reference
Copy-paste cheat sheet for the most common hex analysis commands in a CTF context.
xxd flags
| Flag | Effect |
|---|---|
| -l N | Dump only the first N bytes |
| -s N | Skip N bytes before dumping |
| -c N | N bytes per line (default 16) |
| -p | Plain hex output, no offsets or ASCII column |
| -r | Reverse: convert hex dump back to binary |
| -r -p | Reverse plain hex (no offset column needed) |
| -e | Little-endian byte grouping |
| -b | Binary (bit) dump instead of hex |
Essential hex analysis commands
    file target                          # identify file type from magic bytes
    xxd -l 16 target                     # peek at the first 16 bytes
    xxd target | grep -i 'pico'          # search hex dump for flag prefix
    strings -n 6 target | grep -i pico   # extract long strings, filter for flag
    binwalk target                       # detect embedded files
    binwalk -e target                    # extract embedded files
    zsteg -a target.png                  # LSB steganography scan (PNG/BMP)
    steghide extract -sf target.jpg      # extract steghide payload (JPEG)
    xxd target > dump.txt                # dump to text for editing
    xxd -r dump.txt > patched            # reverse: text back to binary
Python struct format characters
    # Prefix: '<' = little-endian, '>' = big-endian, '=' = native
    struct.unpack_from('<B', data, off)  # uint8  (1 byte)
    struct.unpack_from('<H', data, off)  # uint16 (2 bytes)
    struct.unpack_from('<I', data, off)  # uint32 (4 bytes)
    struct.unpack_from('<Q', data, off)  # uint64 (8 bytes)