How to Read and Analyze Hex Dumps

Introduction

A hex dump is a textual representation of binary data where every byte is shown as its two-digit hexadecimal value. When you open a PNG, an ELF binary, or any raw file in a hex editor, this is what you see. CTF challenges put hex dumps in front of you constantly: forensics categories hand you mystery files with unknown contents, reverse engineering challenges require you to understand compiled binaries at the byte level, and general skills challenges sometimes include corrupted or deliberately mislabeled files that you must identify and repair.

Being comfortable reading hex output separates players who can inspect a file in thirty seconds from those who spend an hour guessing. The core skill is pattern recognition: you learn to spot magic bytes at the start of a file, ASCII strings embedded in binary data, and structural anomalies that hint at steganography or tampering.

This guide covers the full hex analysis toolkit: the xxd command-line dumper, common file signatures, string extraction, patching bytes, graphical hex editors, endianness, and CyberChef for quick transformations.

Reading xxd output

xxd is the standard tool for generating hex dumps on Linux. Every line it prints has three columns separated by whitespace:

Byte offset (left): the position of the first byte on this line, expressed in hexadecimal. The very first line starts at 00000000.
Hex bytes (middle): sixteen bytes shown as eight groups of four hex digits, with a space between groups. Every two hex digits are one byte, so each group holds two bytes.
ASCII representation (right): the same sixteen bytes decoded as ASCII characters. Any byte that is not a printable ASCII character (values below 0x20 or above 0x7E) is shown as a dot (.).

Annotated example

$ xxd example.png | head -4
00000000: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452  .PNG........IHDR
00000010: 0000 0080 0000 0080 0806 0000 00c3 3e61  ..............>a
00000020: 0000 0004 6741 4d41 0000 b18f 0bfc 6105  ....gAMA......a.
00000030: 0000 0009 7048 5973 0000 0ec4 0000 0ec4  ....pHYs........
# Column 1 (offset): 00000000, 00000010, 00000020, 00000030
# Each line advances by 0x10 = 16 bytes
# Column 2 (hex): first four bytes are 89 50 4E 47
# That is the PNG magic number (see the Magic bytes section)
# Column 3 (ASCII): 89 is non-printable -> shown as '.'
# 50 4E 47 = 'P', 'N', 'G' -> shown as 'PNG'
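
Rebuilding the three-column layout by hand is a quick way to internalize it. The following is a minimal Python sketch, not xxd's actual implementation; the function name xxd_line is made up for illustration, and it assumes the default of 16 bytes per line.

```python
def xxd_line(offset: int, chunk: bytes) -> str:
    """Format up to 16 bytes the way xxd's default output lays them out."""
    # Column 2: two bytes (four hex digits) per group, groups separated by a space
    groups = [chunk[i:i + 2].hex() for i in range(0, len(chunk), 2)]
    hex_col = " ".join(groups)
    # Column 3: printable ASCII (0x20-0x7E) as-is, everything else as '.'
    ascii_col = "".join(chr(b) if 0x20 <= b <= 0x7E else "." for b in chunk)
    # Column 1: offset as eight hex digits. The hex column is padded to
    # 39 characters (8 groups * 4 digits + 7 spaces) so a short final line
    # keeps its ASCII column aligned.
    return f"{offset:08x}: {hex_col:<39}  {ascii_col}"


# First 16 bytes of the PNG from the annotated example
png_header = bytes.fromhex("89504e470d0a1a0a0000000d49484452")
print(xxd_line(0, png_header))
```

Printed for the PNG header bytes, the result should match the first line of the annotated example character for character.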

Useful xxd flags

# Basic dump of the whole file
xxd file
# Dump only the first 64 bytes (great for checking magic bytes)
xxd -l 64 file
# Skip to byte offset 100 and dump from there
xxd -s 100 file
# Skip to offset 100 and show only 32 bytes
xxd -s 100 -l 32 file
# Show one byte per line (useful for scripting)
xxd -c 1 file
# Plain hex dump with no ASCII column (useful for piping)
xxd -p file

Tip: Pipe through less -S to scroll a large dump without wrapping: xxd largefile | less -S. Press q to exit.

Magic bytes and file signatures

The first few bytes of most binary formats are a fixed signature called magic bytes. They exist so the operating system and tools like file can identify a file regardless of its extension. In CTFs, challenges routinely rename files with the wrong extension (a ZIP named flag.png, an ELF named mystery.txt) to slow you down. Checking the magic bytes immediately tells you the true format.

Common file signatures

Format	Magic bytes (hex)	ASCII hint
PNG	89 50 4E 47 0D 0A 1A 0A	.PNG....
JPEG	FF D8 FF	...
ZIP / JAR / DOCX	50 4B 03 04	PK..
PDF	25 50 44 46	%PDF
ELF (Linux binary)	7F 45 4C 46	.ELF
GIF	47 49 46 38	GIF8
gzip / .gz	1F 8B	..
BMP	42 4D	BM
MP3 (with ID3v2 tag)	49 44 33	ID3
7-Zip	37 7A BC AF 27 1C	7z..'.
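
The table translates directly into a lookup that checks how a file starts. This is a minimal sketch for illustration only: the names SIGNATURES and identify are made up here, and the real file command uses a far larger signature database with additional checks.

```python
# Signatures copied from the table above; extend as needed
SIGNATURES = [
    (bytes.fromhex("89504e470d0a1a0a"), "PNG"),
    (bytes.fromhex("ffd8ff"), "JPEG"),
    (bytes.fromhex("504b0304"), "ZIP / JAR / DOCX"),
    (bytes.fromhex("25504446"), "PDF"),
    (bytes.fromhex("7f454c46"), "ELF"),
    (bytes.fromhex("47494638"), "GIF"),
    (bytes.fromhex("1f8b"), "gzip"),
    (bytes.fromhex("424d"), "BMP"),
    (bytes.fromhex("494433"), "MP3 (ID3 tag)"),
    (bytes.fromhex("377abcaf271c"), "7-Zip"),
]


def identify(data: bytes) -> str:
    """Return the first format whose magic bytes match the start of data."""
    for magic, name in SIGNATURES:
        if data.startswith(magic):
            return name
    return "unknown"


# A 'PNG' that is really a ZIP, as in the mislabeled-extension scenario
print(identify(bytes.fromhex("504b0304") + b"rest of archive"))
```

To use it on a real file, read the first few bytes with open(path, 'rb').read(16) and pass them to identify.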

Checking magic bytes quickly

# Show just the first 8 bytes
xxd -l 8 mystery_file
# The fast way: let the 'file' command read magic bytes for you
file mystery_file
# Example output:
mystery_file: PNG image data, 128 x 128, 8-bit/color RGBA, non-interlaced
# If the extension is wrong, 'file' still reports the real type
file flag.png
flag.png: Zip archive data, at least v2.0 to extract

CTF pattern: When a challenge gives you a file with an extension that does not match what file reports, always rename it and treat it as the real format. A ZIP disguised as a PNG often contains the flag file inside the archive.

Polyglots and second magic numbers: When a single file embeds two formats (e.g. a valid PNG with a ZIP archive appended after IEND), file only reports the first one. Stegall scans the entire file for every known magic signature in parallel, carves out anything embedded, and recurses into the carved blob. Drop the file in once and you get every polyglot layer at the bottom of the report.
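
The idea of scanning a whole file for signatures, rather than only its first bytes, can be sketched in a few lines of Python. This is a naive illustration and not how Stegall or binwalk work internally; the names EMBEDDED and find_signatures are hypothetical, and short signatures such as BM or 1F 8B are left out on purpose because they match random data too often.

```python
# Longer signatures only: two-byte magics produce too many false positives
EMBEDDED = {
    "PNG": bytes.fromhex("89504e470d0a1a0a"),
    "ZIP": bytes.fromhex("504b0304"),
    "PDF": bytes.fromhex("25504446"),
    "ELF": bytes.fromhex("7f454c46"),
    "7-Zip": bytes.fromhex("377abcaf271c"),
}


def find_signatures(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, format) for every signature found anywhere in data."""
    hits = []
    for name, magic in EMBEDDED.items():
        start = 0
        while (idx := data.find(magic, start)) != -1:
            hits.append((idx, name))
            start = idx + 1
    return sorted(hits)


# Synthetic polyglot: PNG magic, a stand-in for the image body, then a ZIP
blob = EMBEDDED["PNG"] + b"IEND" + EMBEDDED["ZIP"] + b"flag.txt"
for offset, name in find_signatures(blob):
    print(f"{offset:#06x}  {name}")
```

Any hit at a non-zero offset is a candidate for carving, for example with dd or a Python slice starting at that offset.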

Finding strings and flags in hex dumps

Flags in CTF challenges are almost always printable ASCII strings (e.g. picoCTF{...}). Even inside compiled binaries, compressed archives, or raw disk images, the flag text often sits in plaintext and can be extracted without fully understanding the file format.

Using `strings`

The strings command scans a binary and prints every sequence of four or more consecutive printable ASCII characters. It is the fastest way to look for a flag.

# Default: sequences of 4+ printable ASCII chars
strings file
# Require at least 6 characters (reduces noise)
strings -n 6 file
# Search the output for the flag prefix
strings file | grep -i pico
# Search case-insensitively for any common flag format
strings file | grep -iE 'pico|flag|ctf'
# Show the byte offset of each string (-t x = hex offset)
strings -t x file | grep pico
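
When strings is not installed, or you want the results inside a script, the same scan is a single regular expression in Python. This is a rough approximation of the default behaviour, not a reimplementation of the real tool; the function name ascii_strings is made up for this sketch.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of min_len+ printable ASCII bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


# Synthetic binary: a flag surrounded by non-printable bytes
sample = b"\x00\x01picoCTF{demo}\xff\x00abc\x00"
for offset, text in ascii_strings(sample):
    print(f"{offset:x} {text}")
```

The three-byte run abc is skipped because it is shorter than the four-character minimum, mirroring what strings does by default.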

Grepping through a hex dump

Sometimes you want to search for a string while also seeing the surrounding hex context. Pipe xxd output through grep:

# Search hex dump output for the flag prefix
xxd file | grep 'pico'
# Case-insensitive search
xxd file | grep -i 'pico'
# Search for a hex pattern directly (e.g. the PNG magic bytes)
xxd file | grep '8950 4e47'

Flags split across line boundaries

Because xxd breaks its output into 16-byte lines, any string longer than the bytes left on its line continues on the next one. If the search term itself straddles that break (for example, pi at the end of one line and co at the start of the next), the grep approach above will miss it. Use strings or Python instead, since both operate on the raw byte stream without artificial line breaks:

# Python: read raw bytes and search for the flag
python3 -c "
data = open('file', 'rb').read()
idx = data.find(b'picoCTF{')
# Look for the closing brace only if the prefix was found
end = data.find(b'}', idx) if idx >= 0 else -1
if end >= 0:
    print(data[idx:end+1].decode(errors='replace'))
"

Related challenge

picoCTF 2024 / endianness

Editing bytes with xxd

xxd is not just a reader. The -r (reverse) flag converts hex dump text back into binary. This lets you patch a binary by editing a plain text file, which is much easier than using a hex editor when you know exactly which bytes to change.

The `xxd -r` patch workflow

# Step 1: dump the binary to a text file
xxd binary > dump.txt
# Step 2: open dump.txt in any text editor and change the hex digits
# (xxd -r reads only the offset and hex columns, so the ASCII column can stay as-is)
# Example: change the first byte from 00 to 89 (fix a PNG header)
# Before: 00000000: 0050 4e47 0d0a 1a0a ...  .PNG....
# After:  00000000: 8950 4e47 0d0a 1a0a ...  .PNG....
# Step 3: convert the edited text back to binary
xxd -r dump.txt > patched_binary
# Step 4: verify the result
file patched_binary
xxd -l 8 patched_binary
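
To see why editing the text is enough, it helps to look at what the reverse step has to do: read the offset, read the hex column, and ignore everything else. The following Python sketch mimics that for default-format dumps. It is an illustration under those assumptions, not xxd's own code, and the name undump is made up.

```python
def undump(dump_text: str) -> bytes:
    """Rebuild binary data from default-format xxd output."""
    out = bytearray()
    for line in dump_text.splitlines():
        if ":" not in line:
            continue
        offset_str, rest = line.split(":", 1)
        try:
            offset = int(offset_str, 16)
            # The hex column ends at the first double space; the ASCII
            # column after it is discarded
            chunk = bytes.fromhex(rest.lstrip(" ").split("  ", 1)[0])
        except ValueError:
            continue  # not a dump line (for example, a comment)
        # Grow the buffer if needed, then place the bytes at their offset
        if len(out) < offset + len(chunk):
            out.extend(b"\x00" * (offset + len(chunk) - len(out)))
        out[offset:offset + len(chunk)] = chunk
    return bytes(out)


dump = (
    "00000000: 8950 4e47 0d0a 1a0a 0000 000d 4948 4452  .PNG........IHDR\n"
    "00000010: 0000 0080 0000 0080 0806 0000 00c3 3e61  ..............>a\n"
)
print(undump(dump)[:8])
```

Because the bytes are placed by offset, changing hex digits in place is safe, while inserting or deleting bytes in the middle of a line is not.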

When is this useful in a CTF?

Corrupted file headers: a challenge provides a file that tools cannot open because the first few bytes were deliberately scrambled. Fix the magic bytes and the file becomes readable.
Magic byte check bypass: a binary checks that its input starts with specific bytes before processing it. Patch your input file to satisfy the check.
Flipping a flag byte: a program reads a single byte at a known offset and branches on it. Change that byte to take the other branch.

Surgical patches with printf and Python

For single-byte changes, xxd round-tripping is overkill. Use Python, or printf piped into dd, to write exactly the bytes you need:

# Overwrite bytes at a specific offset using Python
python3 -c "
data = bytearray(open('binary', 'rb').read())
data[0] = 0x89   # fix first byte
data[1] = 0x50   # fix second byte
open('patched', 'wb').write(data)
"
# One-liner to write a single byte at offset 0 with printf and dd
# (this modifies 'binary' in place, so keep a backup copy)
printf '\x89' | dd of=binary bs=1 seek=0 count=1 conv=notrunc

Hex editors

When you need to visually explore a binary file, scroll through it, and make interactive edits, a dedicated hex editor is faster than the command line. These are the tools most CTFers reach for:

hexedit (Terminal, Linux / macOS)

Install: sudo apt install hexedit

Lightweight ncurses editor. Opens directly in the terminal. Use arrow keys to navigate, type hex digits to overwrite. Good for quick edits when you don't have a GUI.

Bless (Linux GUI)

Install: sudo apt install bless

A clean GTK hex editor. Shows hex and ASCII side by side with search and replace. Good default choice for Linux desktop environments.

HxD (Windows)

Install: Download from mh-nexus.de

Free, fast Windows hex editor with a familiar tabbed interface. Handles very large files efficiently. The standard recommendation for Windows users.

010 Editor (Cross-platform: Windows / macOS / Linux)

Install: Download from sweetscape.com (paid, trial available)

The most powerful option. Its killer feature is Binary Templates: structured parsers for PNG, ELF, ZIP, PE, and many more formats that overlay field names on the hex view. Invaluable for reverse engineering known formats.

Install hexedit and Bless

# Install both on Debian / Ubuntu / Kali
sudo apt install hexedit bless
# Open a file in hexedit
hexedit file.bin
# hexedit key bindings
# Arrow keys   navigate
# Tab          toggle between hex and ASCII pane
# Ctrl+X       save and exit
# Ctrl+C       exit without saving
# Ctrl+W       save without exiting
# Ctrl+S or /  search forward (hex or ASCII string, depending on the active pane)

Endianness in hex dumps

Endianness describes the byte order used to store multi-byte integers in memory. It is one of the most common sources of confusion when reading hex dumps, especially in reverse engineering challenges.

Little-endian vs big-endian

Consider the 32-bit integer 0x12345678. It occupies four consecutive bytes in memory. How those bytes are ordered depends on the architecture:

# Big-endian (network byte order, most significant byte first)
Address: 00  01  02  03
Bytes:   12  34  56  78
# Little-endian (x86 / x86-64, least significant byte first)
Address: 00  01  02  03
Bytes:   78  56  34  12
# In a hex dump of a little-endian binary you would see:
00000000: 7856 3412  xV4.
# and need to mentally reverse the byte groups to recover 0x12345678

x86 and x86-64 (the architectures most CTF binaries run on) are little-endian. ARM is little-endian by default. Network protocols and some file formats like PNG use big-endian (network byte order) for multi-byte fields. ELF, by contrast, stores multi-byte fields in the target architecture's endianness (little-endian on x86/x86-64, flagged by EI_DATA in the header).
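
Rather than reversing bytes in your head, you can let Python do it. The sketch below uses the built-in int.from_bytes and int.to_bytes on the same four bytes as the example above; the variable names are arbitrary.

```python
# The four bytes as they appear in a little-endian hex dump
raw = bytes.fromhex("78563412")

little = int.from_bytes(raw, "little")  # least significant byte first
big = int.from_bytes(raw, "big")        # most significant byte first

print(hex(little))  # 0x12345678
print(hex(big))     # 0x78563412

# Going the other way: how 0x12345678 is laid out in each byte order
print((0x12345678).to_bytes(4, "little").hex())  # 78563412
print((0x12345678).to_bytes(4, "big").hex())     # 12345678
```

If a value decoded from a dump looks implausibly large, trying the other byte order is usually the first thing to check.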

Python `struct` for multi-byte values

When you need to read a 2-, 4-, or 8-byte integer from a binary file, Python's struct module handles the byte-order conversion for you:

import struct
data = open('file.bin', 'rb').read()
# Read a 4-byte little-endian unsigned int at offset 0
value = struct.unpack_from('<I', data, offset=0)[0]
print(hex(value))
# Read a 4-byte big-endian unsigned int
value = struct.unpack_from('>I', data, offset=0)[0]
print(hex(value))
# Format string quick reference:
# '<' = little-endian   '>' = big-endian
# 'B' = uint8   'H' = uint16   'I' = uint32   'Q' = uint64
# 'b' = int8    'h' = int16    'i' = int32    'q' = int64
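
As a worked example, here is a sketch that decodes the big-endian IHDR fields from the 32 bytes shown in the annotated xxd example earlier in this guide. The field layout (width, height, bit depth, color type) follows the PNG format; the variable names are only illustrative.

```python
import struct

# First 32 bytes of example.png, copied from the annotated xxd output
header = bytes.fromhex(
    "89504e470d0a1a0a0000000d49484452"
    "00000080000000800806000000c33e61"
)

# Bytes 8-15: chunk length (big-endian uint32) and 4-byte chunk type
chunk_len, chunk_type = struct.unpack_from(">I4s", header, 8)

# Bytes 16-25: width, height (uint32 each), bit depth, color type (uint8 each)
width, height, bit_depth, color_type = struct.unpack_from(">IIBB", header, 16)

print(chunk_type, chunk_len)                 # b'IHDR' 13
print(width, height, bit_depth, color_type)  # 128 128 8 6
```

The decoded values agree with what file reported for this image: 128 x 128, 8-bit, with color type 6 meaning RGBA.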

Challenges where endianness matters

picoCTF 2024 / endianness, picoCTF 2024 / binhexa

CyberChef for hex manipulation

CyberChef (gchq.github.io/CyberChef) is a browser-based data transformation tool. It has over 300 operations and lets you chain them visually with no coding. For hex manipulation it is often faster than writing a Python script.

Key hex operations

From Hex: paste a hex string like 70 69 63 6f and decode it to bytes or ASCII. Works with or without spaces and with 0x prefixes.
To Hex: convert raw text or bytes to their hex representation, with configurable delimiters.
Magic: paste unknown data and let CyberChef auto-detect the encoding or format. It tries dozens of transformations and scores how printable the result is. Very useful when you have no idea what encoding you are dealing with.
XOR: apply a hex key XOR across the input. A classic CTF encoding.
Entropy: measure the Shannon entropy of the data. High entropy near 8.0 suggests encryption or compression. Low entropy suggests plain text or structured data.
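
For scripted solves, the same operations are short enough to write yourself. The following Python sketch approximates From Hex, XOR, and Entropy; it is not CyberChef's implementation, and the function names from_hex, xor, and entropy are chosen only for this example.

```python
import math
from collections import Counter


def from_hex(text: str) -> bytes:
    """Decode hex with or without spaces, commas, or 0x prefixes."""
    cleaned = text.replace("0x", "").replace(",", " ")
    return bytes.fromhex(cleaned)  # whitespace between bytes is ignored


def xor(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


print(from_hex("70 69 63 6f"))           # b'pico'
print(xor(xor(b"flag", b"\x42"), b"\x42"))  # XOR twice restores the input
print(entropy(bytes(range(256))))        # every byte value once: 8.0
```

XOR being its own inverse is what makes it such a common CTF encoding: applying the key a second time recovers the plaintext.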

Chaining operations

CyberChef shines when you need to apply multiple transformations in sequence. A typical CTF chain:

# CyberChef recipe (add these operations in order in the UI)
1. From Hex          <- decode the hex dump to raw bytes
2. Reverse           <- reverse the byte order
3. To Base64         <- encode the result as base64
# Another common chain for an encoded flag:
1. From Base64       <- decode outer base64 layer
2. From Hex          <- decode inner hex encoding
3. ROT13             <- final ROT13 shift
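
The second chain above maps onto three standard-library calls. This is a sketch assuming the layers are exactly Base64, then hex, then ROT13; the function name decode_chain and the demo flag are made up, and the demo input is built by applying the inverse steps.

```python
import base64
import codecs


def decode_chain(blob: str) -> str:
    """Undo Base64, then hex, then ROT13, in that order."""
    hex_text = base64.b64decode(blob).decode("ascii")  # 1. From Base64
    rotated = bytes.fromhex(hex_text).decode("ascii")  # 2. From Hex
    return codecs.decode(rotated, "rot_13")            # 3. ROT13


# Build a demo input by encoding in the opposite order
plain = "picoCTF{demo}"
rot = codecs.encode(plain, "rot_13")
encoded = base64.b64encode(rot.encode("ascii").hex().encode("ascii")).decode("ascii")

print(encoded)
print(decode_chain(encoded))
```

Once a recipe works in CyberChef, porting it to a function like this makes it easy to run against many inputs.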

CyberChef vs the command line

Use CyberChef when you are exploring and not sure what encoding you are dealing with, when you want to visually chain operations without writing code, or when you need a quick one-off transformation. Use the command line when you are automating, working with large files (CyberChef struggles above a few MB in the browser), or need to integrate the result into a script.

Local alternative: This site has Recipe Chain which covers the same use case for hex-adjacent CTF work: From Hex, To Hex, XOR, ROT, Magic auto-decode, and base/encoding cascades stacked into a pipeline. The full recipe is captured in the URL so you can bookmark a working solve.

Related guide

Base64 and CTF encodings

Hex analysis workflow for CTF

When a forensics or binary challenge hands you a file you have never seen before, work through this checklist in order. Each step takes only seconds and rules out large classes of possibilities before you spend time on manual inspection.

Decision checklist

Run file to identify the type. Do not trust the extension. file mystery reads the magic bytes and reports the real format. If it says "data" or "ASCII text," the file may be encoded or intentionally obfuscated.
Run xxd -l 32 file to verify magic bytes manually. Cross-reference the first four bytes against the magic bytes table above. This confirms what file reported and shows you whether the header is partially corrupted.
Run strings file | grep -i pico to find obvious flags. Many challenges embed the flag in plaintext. This takes one second and sometimes solves the challenge outright.
Check file size: is it larger than expected? An image you would expect to occupy a few kilobytes but that weighs in at 2 MB may have data appended after the IEND chunk. Run binwalk file to scan for embedded file signatures. binwalk -e file extracts them automatically.
Open in a hex editor for manual inspection. Scroll through the file looking for ASCII regions (visible text in the right column), repeating byte patterns, or abrupt transitions from structured data to apparent random bytes (which may indicate encryption).
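If it is an image, check for steganography. zsteg (PNG/BMP) looks for data hidden in the least significant bits of pixel values; run zsteg -a file.png to try all common channel combinations. steghide covers a different case: payloads embedded with steghide itself in JPEG, BMP, WAV, or AU files, usually protected by a passphrase.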

# The six-step checklist as a command sequence
file mystery_file
xxd -l 32 mystery_file
strings mystery_file | grep -i pico
ls -lh mystery_file && binwalk mystery_file
hexedit mystery_file
zsteg -a mystery_file   # if it turned out to be a PNG or BMP
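
The size check in step four can also be done by hand for PNGs: anything after the IEND chunk is not part of the image. The sketch below is a naive illustration. It assumes the first occurrence of the bytes IEND is the real end chunk, which a proper chunk-walking parser would verify, and the function name trailing_after_iend is made up.

```python
def trailing_after_iend(data: bytes) -> bytes:
    """Return whatever follows the IEND chunk, or b'' if there is none."""
    idx = data.find(b"IEND")
    if idx == -1:
        return b""
    # IEND has no data: skip the 4-byte chunk type and the 4-byte CRC
    return data[idx + 4 + 4:]


# Synthetic example: PNG magic, an empty IEND chunk, then an appended ZIP header
png_like = (
    bytes.fromhex("89504e470d0a1a0a")
    + b"\x00\x00\x00\x00IEND\xae\x42\x60\x82"
    + b"PK\x03\x04hidden"
)
extra = trailing_after_iend(png_like)
print(len(extra), extra[:4])
```

A non-empty result is worth writing to its own file and running through file and the rest of the checklist again.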

Quick reference

Copy-paste cheat sheet for the most common hex analysis commands in a CTF context.

xxd flags

Flag	Effect
-l N	Dump only the first N bytes
-s N	Skip N bytes before dumping
-c N	N bytes per line (default 16)
-p	Plain hex output, no offsets or ASCII column
-r	Reverse: convert hex dump back to binary
-r -p	Reverse plain hex (no offset column needed)
-e	Little-endian byte grouping
-b	Binary (bit) dump instead of hex

Essential hex analysis commands

file target                         # identify file type from magic bytes
xxd -l 16 target                    # peek at the first 16 bytes
xxd target | grep -i 'pico'         # search hex dump for flag prefix
strings -n 6 target | grep -i pico  # extract long strings, filter for flag
binwalk target                       # detect embedded files
binwalk -e target                    # extract embedded files
zsteg -a target.png                  # LSB steganography scan (PNG/BMP)
steghide extract -sf target.jpg      # extract steghide payload (JPEG)
xxd target > dump.txt                # dump to text for editing
xxd -r dump.txt > patched            # reverse: text back to binary

Python struct format characters

# Prefix: '<' = little-endian, '>' = big-endian, '=' = native
struct.unpack_from('<B', data, off)  # uint8  (1 byte)
struct.unpack_from('<H', data, off)  # uint16 (2 bytes)
struct.unpack_from('<I', data, off)  # uint32 (4 bytes)
struct.unpack_from('<Q', data, off)  # uint64 (8 bytes)

Challenges to practice on

picoCTF 2024 / endianness, picoCTF 2024 / binhexa
