April 11, 2026

How to Read and Analyze Hex Dumps

Learn to read and analyze hex dumps for CTF challenges -- understanding the xxd format, spotting magic bytes, finding hidden strings, and using hex editors to inspect binary files.

Introduction

A hex dump is a textual representation of binary data where every byte is shown as its two-digit hexadecimal value. When you open a PNG, an ELF binary, or any raw file in a hex editor, this is what you see. CTF challenges put hex dumps in front of you constantly: forensics categories hand you mystery files with unknown contents, reverse engineering challenges require you to understand compiled binaries at the byte level, and general skills challenges sometimes include corrupted or deliberately mislabeled files that you must identify and repair.

Being comfortable reading hex output separates players who can inspect a file in thirty seconds from those who spend an hour guessing. The core skill is pattern recognition: you learn to spot magic bytes at the start of a file, ASCII strings embedded in binary data, and structural anomalies that hint at steganography or tampering.

This guide covers the full hex analysis toolkit: the xxd command-line dumper, common file signatures, string extraction, patching bytes, graphical hex editors, endianness, and CyberChef for quick transformations.

Reading xxd output

xxd is the standard tool for generating hex dumps on Linux. Every line it prints has three columns separated by whitespace:

  1. Byte offset (left): the position of the first byte on this line, expressed in hexadecimal. The very first line starts at 00000000.
  2. Hex bytes (middle): sixteen bytes shown as eight space-separated groups of four hex digits. Each group is two bytes, and each pair of hex digits within a group is one byte.
  3. ASCII representation (right): the same sixteen bytes decoded as ASCII characters. Any byte that is not a printable ASCII character (values below 0x20 or above 0x7E) is shown as a dot (.).

Annotated example

$ xxd example.png | head -4
00000000: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452 .PNG........IHDR
00000010: 0000 0080 0000 0080 0806 0000 00c3 3e61 ..............>a
00000020: 0000 0004 6741 4d41 0000 b18f 0bfc 6105 ....gAMA......a.
00000030: 0000 0009 7048 5973 0000 0ec4 0000 0ec4 ....pHYs........
# Column 1 (offset): 00000000, 00000010, 00000020, 00000030
# Each line advances by 0x10 = 16 bytes
# Column 2 (hex): first four bytes are 89 50 4E 47
# That is the PNG magic number (see the Magic bytes section)
# Column 3 (ASCII): 89 is non-printable -> shown as '.'
# 50 4E 47 = 'P', 'N', 'G' -> shown as 'PNG'
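
To make the three-column layout concrete, here is a minimal Python sketch that formats bytes the way a default xxd line is laid out. The helper names hexdump_line and hexdump are made up for this illustration, and the sketch only imitates the default format; it is not how xxd itself is implemented.

```python
def hexdump_line(offset: int, chunk: bytes) -> str:
    """Format up to 16 bytes like one line of default xxd output."""
    # Hex column: two bytes (four hex digits) per group, so 16 bytes -> 8 groups.
    hex_digits = chunk.hex()
    groups = [hex_digits[i:i + 4] for i in range(0, len(hex_digits), 4)]
    hex_col = " ".join(groups)
    # ASCII column: printable bytes (0x20..0x7E) as-is, everything else as '.'.
    ascii_col = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    # Pad the hex column to 39 characters so a short final line stays aligned.
    return f"{offset:08x}: {hex_col:<39}  {ascii_col}"


def hexdump(data: bytes, width: int = 16) -> str:
    """Join one formatted line per 16-byte slice of the input."""
    lines = []
    for offset in range(0, len(data), width):
        lines.append(hexdump_line(offset, data[offset:offset + width]))
    return "\n".join(lines)


# The first 16 bytes of the example PNG above.
png_header = bytes.fromhex("89504e470d0a1a0a0000000d49484452")
print(hexdump(png_header))
```

Running it on the sixteen header bytes reproduces the first line of the annotated example: offset 00000000, eight hex groups, then .PNG........IHDR in the ASCII column.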

Useful xxd flags

# Basic dump of the whole file
xxd file
# Dump only the first 64 bytes (great for checking magic bytes)
xxd -l 64 file
# Skip to byte offset 100 and dump from there (decimal; write 0x64 for a hex offset)
xxd -s 100 file
# Skip to offset 100 and show only 32 bytes
xxd -s 100 -l 32 file
# Show one byte per line (useful for scripting)
xxd -c 1 file
# Plain hex dump with no ASCII column (useful for piping)
xxd -p file
Tip: Pipe through less -S to scroll a large dump without wrapping: xxd largefile | less -S. Press q to exit.

Magic bytes and file signatures

The first few bytes of most binary formats are a fixed signature called magic bytes. They exist so the operating system and tools like file can identify a file regardless of its extension. In CTFs, challenges routinely rename files with the wrong extension (a ZIP named flag.png, an ELF named mystery.txt) to slow you down. Checking the magic bytes immediately tells you the true format.

Common file signatures

Format              Magic bytes (hex)          ASCII hint
PNG                 89 50 4E 47 0D 0A 1A 0A    .PNG....
JPEG                FF D8 FF                   ...
ZIP / JAR / DOCX    50 4B 03 04                PK..
PDF                 25 50 44 46                %PDF
ELF (Linux binary)  7F 45 4C 46                .ELF
GIF                 47 49 46 38                GIF8
gzip / .gz          1F 8B                      ..
BMP                 42 4D                      BM
MP3 (ID3v2 tag)     49 44 33                   ID3
7-Zip               37 7A BC AF 27 1C          7z..'.

Checking magic bytes quickly

# Show just the first 8 bytes
xxd -l 8 mystery_file
# The fast way: let the 'file' command read magic bytes for you
file mystery_file
# Example output:
mystery_file: PNG image data, 128 x 128, 8-bit/color RGBA, non-interlaced
# If the extension is wrong, 'file' still reports the real type
file flag.png
flag.png: Zip archive data, at least v2.0 to extract
CTF pattern: When a challenge gives you a file with an extension that does not match what file reports, always rename it and treat it as the real format. A ZIP disguised as a PNG often contains the flag file inside the archive.
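
The same lookup can be sketched in a few lines of Python. The SIGNATURES list below only mirrors the table in this section, and identify and identify_file are hypothetical helper names; the real file command knows far more formats and checks more than a fixed prefix.

```python
# (magic prefix, format name) pairs copied from the signature table.
SIGNATURES = [
    (bytes.fromhex("89504e470d0a1a0a"), "PNG"),
    (bytes.fromhex("ffd8ff"), "JPEG"),
    (bytes.fromhex("504b0304"), "ZIP / JAR / DOCX"),
    (b"%PDF", "PDF"),
    (b"\x7fELF", "ELF"),
    (b"GIF8", "GIF"),
    (bytes.fromhex("1f8b"), "gzip"),
    (b"BM", "BMP"),
    (b"ID3", "MP3 (ID3v2 tag)"),
    (bytes.fromhex("377abcaf271c"), "7-Zip"),
]


def identify(header: bytes) -> str:
    """Return the first format whose magic bytes prefix the header."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def identify_file(path: str) -> str:
    """Read the first 16 bytes of a file and identify them."""
    with open(path, "rb") as f:
        return identify(f.read(16))


# A file named flag.png that really starts with 50 4B 03 04:
print(identify(bytes.fromhex("504b030414000000")))  # prints: ZIP / JAR / DOCX
```

Short signatures such as BM can match by accident, so treat a hit as a hint to confirm with file rather than as proof.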

Finding strings and flags in hex dumps

Flags in CTF challenges are almost always printable ASCII strings (e.g. picoCTF{...}). Even inside compiled binaries, compressed archives, or raw disk images, the flag text often sits in plaintext and can be extracted without fully understanding the file format.

Using strings

The strings command scans a binary and prints every sequence of four or more consecutive printable ASCII characters. It is the fastest way to look for a flag.

# Default: sequences of 4+ printable ASCII chars
strings file
# Require at least 6 characters (reduces noise)
strings -n 6 file
# Search the output for the flag prefix
strings file | grep -i pico
# Search case-insensitively for any common flag format
strings file | grep -iE 'pico|flag|ctf'
# Show the byte offset of each string (-t x = hex offset)
strings -t x file | grep pico
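
If strings is not available, its core idea is easy to approximate. The sketch below is an assumption-laden imitation, not the real tool: extract_strings is a made-up name, it handles only single-byte ASCII, and picoCTF{example} is a placeholder flag.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of min_len+ printable ASCII bytes."""
    # 0x20..0x7E is the printable ASCII range used throughout this guide.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Placeholder data: two long runs and one run ("abc") that is too short.
sample = b"\x00\x01picoCTF{example}\x00\xffabc\x00longer text\x7f"
for offset, text in extract_strings(sample):
    print(f"{offset:7x} {text}")
```

On the sample it reports the placeholder flag at offset 0x2 and longer text at offset 0x18, and skips abc because it is shorter than four characters.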

Grepping through a hex dump

Sometimes you want to search for a string while also seeing the surrounding hex context. Pipe xxd output through grep:

# Search hex dump output for the flag prefix
xxd file | grep 'pico'
# Case-insensitive search
xxd file | grep -i 'pico'
# Search for a hex pattern directly (e.g. the PNG magic bytes)
xxd file | grep '8950 4e47'

Flags split across line boundaries

Because xxd splits output into 16-byte lines, most flags span two or more lines, and the search term itself can straddle a line break (for example, pi at the end of one line and co at the start of the next). The grep approach above misses the flag in that case. Use strings or Python instead, since both operate on the raw byte stream without artificial line breaks:

# Python: read raw bytes and search for the flag
python3 -c "
data = open('file', 'rb').read()
idx = data.find(b'picoCTF{')
if idx >= 0:
    end = data.find(b'}', idx)  # -1 if the closing brace is missing
    if end >= 0:
        print(data[idx:end+1].decode(errors='replace'))
"

Editing bytes with xxd

xxd is not just a reader. The -r (reverse) flag converts hex dump text back into binary. This lets you patch a binary by editing a plain text file, which is much easier than using a hex editor when you know exactly which bytes to change.

The xxd -r patch workflow

# Step 1: dump the binary to a text file
xxd binary > dump.txt
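# Step 2: open dump.txt in any text editor and change the bytes in the hex column
# (xxd -r rebuilds the file from the offset and hex columns; edits to the ASCII column have no effect)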
# Example: change the first byte from 00 to 89 (fix a PNG header)
# Before: 00000000: 0050 4e47 0d0a 1a0a ... .PNG....
# After: 00000000: 8950 4e47 0d0a 1a0a ... .PNG....
# Step 3: convert the edited text back to binary
xxd -r dump.txt > patched_binary
# Step 4: verify the result
file patched_binary
xxd -l 8 patched_binary

When is this useful in a CTF?

  • Corrupted file headers: a challenge provides a file that tools cannot open because the first few bytes were deliberately scrambled. Fix the magic bytes and the file becomes readable.
  • Magic byte check bypass: a binary checks that its input starts with specific bytes before processing it. Patch your input file to satisfy the check.
  • Flipping a flag byte: a program reads a single byte at a known offset and branches on it. Change that byte to take the other branch.

Surgical patches with printf and Python

For single-byte changes, xxd round-tripping is overkill. Use printf or Python to write exactly the bytes you need:

# Overwrite bytes at a specific offset using Python
python3 -c "
data = bytearray(open('binary', 'rb').read())
data[0] = 0x89 # fix first byte
data[1] = 0x50 # fix second byte
open('patched', 'wb').write(data)
"
# One-liner to write a single byte at offset 0 with dd (modifies 'binary' in place)
printf '\x89' | dd of=binary bs=1 seek=0 count=1 conv=notrunc
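
For larger files it can be preferable to patch in place rather than load the whole file into a bytearray. A minimal sketch, assuming a hypothetical helper named patch_bytes; it returns the overwritten bytes so the change can be undone:

```python
import os
import tempfile


def patch_bytes(path: str, offset: int, new_bytes: bytes) -> bytes:
    """Overwrite len(new_bytes) bytes at offset and return the old bytes."""
    with open(path, "r+b") as f:  # r+b: read/write without truncating
        f.seek(offset)
        old = f.read(len(new_bytes))
        if len(old) != len(new_bytes):
            raise ValueError("patch would extend past the end of the file")
        f.seek(offset)
        f.write(new_bytes)
    return old


# Demo on a throwaway file whose first PNG magic byte was zeroed out.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00PNG\r\n\x1a\n")
old = patch_bytes(tmp.name, 0, b"\x89")
with open(tmp.name, "rb") as f:
    print(old.hex(), "->", f.read(8).hex())
os.remove(tmp.name)
```

Like the dd one-liner, this modifies the file directly, so work on a copy of the challenge file.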

Hex editors

When you need to visually explore a binary file, scroll through it, and make interactive edits, a dedicated hex editor is faster than the command line. These are the tools most CTFers reach for:

hexedit (Terminal, Linux / macOS)

Install: sudo apt install hexedit

Lightweight ncurses editor. Opens directly in the terminal. Use arrow keys to navigate, type hex digits to overwrite. Good for quick edits when you don't have a GUI.

Bless (Linux GUI)

Install: sudo apt install bless

A clean GTK hex editor. Shows hex and ASCII side by side with search and replace. Good default choice for Linux desktop environments.

HxD (Windows)

Install: Download from mh-nexus.de

Free, fast Windows hex editor with a familiar tabbed interface. Handles very large files efficiently. The standard recommendation for Windows users.

010 Editor (Cross-platform: Windows / macOS / Linux)

Install: Download from sweetscape.com (paid, trial available)

The most powerful option. Its killer feature is Binary Templates: structured parsers for PNG, ELF, ZIP, PE, and many more formats that overlay field names on the hex view. Invaluable for reverse engineering known formats.

Install hexedit and Bless

# Install both on Debian / Ubuntu / Kali
sudo apt install hexedit bless
# Open a file in hexedit
hexedit file.bin
# hexedit key bindings
#   Arrow keys     navigate
#   Tab            toggle between hex and ASCII pane
#   Ctrl+S or /    search forward (hex or ASCII, depending on the active pane)
#   Ctrl+W or F2   save
#   Ctrl+X         save and exit
#   Ctrl+C         exit without saving

Endianness in hex dumps

Endianness describes the byte order used to store multi-byte integers in memory. It is one of the most common sources of confusion when reading hex dumps, especially in reverse engineering challenges.

Little-endian vs big-endian

Consider the 32-bit integer 0x12345678. It occupies four consecutive bytes in memory. How those bytes are ordered depends on the architecture:

# Big-endian (network byte order, most significant byte first)
Address: 00 01 02 03
Bytes: 12 34 56 78
# Little-endian (x86 / x86-64, least significant byte first)
Address: 00 01 02 03
Bytes: 78 56 34 12
# In a hex dump of a little-endian binary you would see:
00000000: 7856 3412 xV4.
# and need to mentally reverse the four bytes to recover 0x12345678

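x86 and x86-64 (the architectures most CTF binaries run on) are little-endian. ARM can run in either mode but is little-endian by default. Network protocols and some file formats, such as PNG, use big-endian (network byte order) for multi-byte fields. ELF is not fixed either way: the byte at offset 5 of the header (EI_DATA) records whether the file's multi-byte fields are little-endian (01) or big-endian (02), so an x86-64 ELF is little-endian throughout.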
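
A quick way to check your mental byte reversal is Python's built-in int.from_bytes. This small example interprets the same four bytes from the dump above in both byte orders:

```python
raw = bytes.fromhex("78563412")  # the four bytes as they appear in the dump

little = int.from_bytes(raw, byteorder="little")
big = int.from_bytes(raw, byteorder="big")

print(hex(little))  # 0x12345678
print(hex(big))     # 0x78563412

# Going the other way: how 0x12345678 is laid out in little-endian memory.
print((0x12345678).to_bytes(4, byteorder="little").hex())  # 78563412
```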

Python struct for multi-byte values

When you need to read a 2-, 4-, or 8-byte integer from a binary file, Python's struct module handles the byte-order conversion for you:

import struct
data = open('file.bin', 'rb').read()
# Read a 4-byte little-endian unsigned int at offset 0
value = struct.unpack_from('<I', data, offset=0)[0]
print(hex(value))
# Read a 4-byte big-endian unsigned int
value = struct.unpack_from('>I', data, offset=0)[0]
print(hex(value))
# Format string quick reference:
# '<' = little-endian   '>' = big-endian
# 'B' = uint8   'H' = uint16   'I' = uint32   'Q' = uint64
# 'b' = int8    'h' = int16    'i' = int32    'q' = int64
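
As a worked example, the sketch below decodes the first 32 bytes of the PNG from the annotated xxd example earlier in this guide. The field layout follows the PNG chunk structure (big-endian length, four-byte type, then data); the variable names are only illustrative.

```python
import struct

# First two lines of the example dump (offsets 0x00 to 0x1f).
header = bytes.fromhex(
    "89504e470d0a1a0a0000000d49484452"
    "00000080000000800806000000c33e61"
)

if header[:8] != b"\x89PNG\r\n\x1a\n":
    raise ValueError("not a PNG signature")

# After the 8-byte signature: 4-byte big-endian length, then 4-byte chunk type.
length, chunk_type = struct.unpack_from(">I4s", header, 8)

# IHDR data starts at offset 16: width and height (uint32),
# then bit depth and color type (uint8 each).
width, height, bit_depth, color_type = struct.unpack_from(">IIBB", header, 16)

print(length, chunk_type, width, height, bit_depth, color_type)
# prints: 13 b'IHDR' 128 128 8 6
```

Color type 6 means truecolor with alpha, which agrees with the "128 x 128, 8-bit/color RGBA" line that file printed for this image.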

CyberChef for hex manipulation

CyberChef (gchq.github.io/CyberChef) is a browser-based data transformation tool. It has over 300 operations and lets you chain them visually with no coding. For hex manipulation it is often faster than writing a Python script.

Key hex operations

  • From Hex: paste a hex string like 70 69 63 6f and decode it to bytes or ASCII. Works with or without spaces and with 0x prefixes.
  • To Hex: convert raw text or bytes to their hex representation, with configurable delimiters.
  • Magic: paste unknown data and let CyberChef auto-detect the encoding or format. It tries dozens of transformations and scores how printable the result is. Very useful when you have no idea what encoding you are dealing with.
  • XOR: apply a hex key XOR across the input. A classic CTF encoding.
  • Entropy: measure the Shannon entropy of the data. High entropy near 8.0 suggests encryption or compression. Low entropy suggests plain text or structured data.
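
When the data is too large for the browser, the XOR and Entropy operations are simple to approximate locally. The following is a rough sketch with made-up function names (shannon_entropy, xor_bytes) and a placeholder flag; it is not CyberChef's own code.

```python
import math
import os
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key; applying it twice restores the input."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


text = b"picoCTF{example} " * 64            # placeholder plaintext
print(round(shannon_entropy(text), 2))      # low: repetitive text
print(round(shannon_entropy(os.urandom(4096)), 2))  # close to 8: random bytes

secret = xor_bytes(b"picoCTF{example}", b"\x42")
print(xor_bytes(secret, b"\x42"))           # the same key undoes the XOR
```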

Chaining operations

CyberChef shines when you need to apply multiple transformations in sequence. A typical CTF chain:

# CyberChef recipe (add these operations in order in the UI)
1. From Hex <- decode the hex dump to raw bytes
2. Reverse <- reverse the byte order
3. To Base64 <- encode the result as base64
# Another common chain for an encoded flag:
1. From Base64 <- decode outer base64 layer
2. From Hex <- decode inner hex encoding
3. ROT13 <- final ROT13 shift
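
Once a recipe works in CyberChef, it is often worth porting it to a script. The sketch below reproduces the second chain (From Base64, From Hex, ROT13) with the standard library. decode_chain and encode_chain are hypothetical names, and the sample input is generated from a placeholder flag rather than taken from a real challenge.

```python
import base64
import codecs


def decode_chain(encoded: str) -> str:
    """From Base64 -> From Hex -> ROT13, as in the recipe above."""
    hex_text = base64.b64decode(encoded).decode("ascii")  # 1. From Base64
    rotated = bytes.fromhex(hex_text).decode("ascii")     # 2. From Hex
    return codecs.decode(rotated, "rot13")                # 3. ROT13


def encode_chain(plain: str) -> str:
    """Inverse of decode_chain, used here only to build a test input."""
    rotated = codecs.encode(plain, "rot13")
    hex_text = rotated.encode("ascii").hex()
    return base64.b64encode(hex_text.encode("ascii")).decode("ascii")


sample = encode_chain("picoCTF{example}")
print(sample)
print(decode_chain(sample))  # prints: picoCTF{example}
```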

CyberChef vs the command line

Use CyberChef when you are exploring and not sure what encoding you are dealing with, when you want to visually chain operations without writing code, or when you need a quick one-off transformation. Use the command line when you are automating, working with large files (CyberChef struggles above a few MB in the browser), or need to integrate the result into a script.

Hex analysis workflow for CTF

When a forensics or binary challenge hands you a file you have never seen before, work through this checklist in order. Each step takes only seconds and rules out large classes of possibilities before you spend time on manual inspection.

Decision checklist

  1. Run file to identify the type. Do not trust the extension. file mystery reads the magic bytes and reports the real format. If it says "data" or "ASCII text," the file may be encoded or intentionally obfuscated.
  2. Run xxd -l 32 file to verify magic bytes manually. Cross-reference the first four bytes against the magic bytes table above. This confirms what file reported and shows you whether the header is partially corrupted.
  3. Run strings file | grep -i pico to find obvious flags. Many challenges embed the flag in plaintext. This takes one second and sometimes solves the challenge outright.
  4. Check file size: is it larger than expected? A PNG that should be a few hundred bytes but is actually 2 MB may have data appended after the IEND chunk. Run binwalk file to scan for embedded file signatures. binwalk -e file extracts them automatically.
  5. Open in a hex editor for manual inspection. Scroll through the file looking for ASCII regions (visible text in the right column), repeating byte patterns, or abrupt transitions from structured data to apparent random bytes (which may indicate encryption).
  6. If it is an image, check for steganography. zsteg (PNG/BMP) looks for data hidden in the least significant bits of pixel values; run zsteg -a file.png to try all common channel combinations. steghide covers a different scheme: it extracts passphrase-protected payloads from JPEG, BMP, WAV, and AU files.
# The six-step checklist as a command sequence
file mystery_file
xxd -l 32 mystery_file
strings mystery_file | grep -i pico
ls -lh mystery_file && binwalk mystery_file
hexedit mystery_file
zsteg -a mystery_file # if it turned out to be a PNG or BMP
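
Step 4 (data appended after IEND) can also be checked by hand. The sketch below walks the PNG chunk list and returns whatever follows the IEND chunk. The names png_trailing_data and fake_chunk are hypothetical, the demo file is a structural stand-in with zeroed CRCs rather than a valid image, and the flag is a placeholder.

```python
import struct

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def png_trailing_data(data: bytes) -> bytes:
    """Walk the chunk list and return any bytes after the IEND chunk."""
    if not data.startswith(PNG_MAGIC):
        raise ValueError("not a PNG: magic bytes do not match")
    pos = len(PNG_MAGIC)
    while pos + 8 <= len(data):
        # Each chunk: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, chunk_type = struct.unpack_from(">I4s", data, pos)
        pos += 8 + length + 4
        if chunk_type == b"IEND":
            return data[pos:]
    raise ValueError("no IEND chunk found; file may be truncated")


def fake_chunk(chunk_type: bytes, payload: bytes) -> bytes:
    """Build a chunk with a zeroed CRC (the walker does not verify CRCs)."""
    return struct.pack(">I", len(payload)) + chunk_type + payload + b"\x00" * 4


demo = (PNG_MAGIC + fake_chunk(b"IHDR", b"\x00" * 13)
        + fake_chunk(b"IEND", b"") + b"picoCTF{example}")
print(png_trailing_data(demo))  # prints: b'picoCTF{example}'
```

An empty result means nothing follows IEND; anything else is worth dumping with xxd or passing to binwalk.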

Quick reference

Copy-paste cheat sheet for the most common hex analysis commands in a CTF context.

xxd flags

Flag     Effect
-l N     Dump only the first N bytes
-s N     Skip N bytes before dumping
-c N     N bytes per line (default 16)
-p       Plain hex output, no offsets or ASCII column
-r       Reverse: convert hex dump back to binary
-r -p    Reverse plain hex (no offset column needed)
-e       Little-endian byte grouping
-b       Binary (bit) dump instead of hex

Essential hex analysis commands

file target # identify file type from magic bytes
xxd -l 16 target # peek at the first 16 bytes
xxd target | grep -i 'pico' # search hex dump for flag prefix
strings -n 6 target | grep -i pico # extract long strings, filter for flag
binwalk target # detect embedded files
binwalk -e target # extract embedded files
zsteg -a target.png # LSB steganography scan (PNG/BMP)
steghide extract -sf target.jpg # extract steghide payload (JPEG)
xxd target > dump.txt # dump to text for editing
xxd -r dump.txt > patched # reverse: text back to binary

Python struct format characters

# Prefix: '<' = little-endian, '>' = big-endian, '=' = native
struct.unpack_from('<B', data, off) # uint8 (1 byte)
struct.unpack_from('<H', data, off) # uint16 (2 bytes)
struct.unpack_from('<I', data, off) # uint32 (4 bytes)
struct.unpack_from('<Q', data, off) # uint64 (8 bytes)