Steganography Techniques for CTF Competitions

Introduction

Steganography is the practice of hiding data inside an ordinary-looking carrier file, such that the very existence of the hidden message is concealed. This is different from encryption, which scrambles data so it cannot be read without a key but makes no attempt to hide that something is being protected. A steganographic image looks like a perfectly normal photograph; only someone who knows where to look will find the flag.

In CTF competitions, steganography challenges appear almost exclusively in the forensics category. The carrier can be any file type: PNG, JPEG, BMP, WAV, MP3, a PDF, or even a plain text file. The five main technique families you will encounter are:

Image LSB: data encoded in the least significant bits of pixel color channels, invisible to the naked eye.
File-within-file: a second file (ZIP, ELF, PNG) appended to or embedded inside the primary file.
Audio spectrograms: images or text drawn in the frequency domain of an audio recording.
Metadata: flags or clues stored in EXIF comment fields, GPS coordinates, or thumbnail images.
Text and whitespace: data encoded in trailing spaces, Unicode zero-width characters, or homoglyph substitutions inside text files.

This guide focuses on techniques and workflow: what to try, in what order, and why. For detailed installation instructions for each individual tool, see the companion posts linked below.

Auto-solver shortcut: Most of the techniques below (LSB sweeps across every channel and bit plane, audio spectrograms, polyglot carving, EXIF, whitespace decode, plus a 6-layer base/ROT/XOR/zlib cascade on anything extracted) run in parallel inside Stegall. Drop a file in to get an instant verdict, then come back to this guide when you need to understand why a technique worked or hand-tune a specific channel.

First steps on any stego challenge

Before reaching for any specialist steganography tool, run through this checklist. It takes under a minute and occasionally reveals the flag outright, without needing anything beyond core Unix utilities.

1. Confirm the actual file type

The file extension can lie. file reads the magic bytes at the start of the file and tells you what it actually is. A PNG that is secretly a ZIP, or a JPEG that is really an ELF binary, will be obvious immediately.

file challenge.png
# Example output:
# challenge.png: PNG image data, 800 x 600, 8-bit/color RGB, non-interlaced
# If it says something unexpected:
# challenge.png: Zip archive data, at least v2.0 to extract

2. Inspect magic bytes directly

xxd prints a hex dump of the file. The first few bytes are the magic bytes that identify the format. PNG files start with 89 50 4E 47, which xxd renders as .PNG in its ASCII column (0x89 is not printable, so it shows as a dot). If the hex dump shows something else at offset 0 or reveals a second magic sequence further in, you have found something interesting.

xxd challenge.png | head -20
# Common magic bytes to recognise:
# 89 50 4E 47  ->  PNG
# FF D8 FF     ->  JPEG
# 50 4B 03 04  ->  ZIP (PK..)
# 7F 45 4C 46  ->  ELF binary
# 25 50 44 46  ->  PDF
# 52 49 46 46  ->  RIFF (WAV audio)
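
Reading a hex dump by eye only catches signatures near the top of the file. As a rough sketch, the same table can be searched at every offset in Python. The function name find_signatures is made up for this illustration, and short signatures such as FF D8 FF will produce false positives inside compressed data, so treat hits as leads rather than proof.

```python
# Signatures copied from the magic-byte list above.
SIGNATURES = {
    "PNG": bytes.fromhex("89504E47"),
    "JPEG": bytes.fromhex("FFD8FF"),
    "ZIP": bytes.fromhex("504B0304"),
    "ELF": bytes.fromhex("7F454C46"),
    "PDF": bytes.fromhex("25504446"),
    "RIFF": bytes.fromhex("52494646"),
}


def find_signatures(data: bytes) -> list[tuple[int, str]]:
    """Return (offset, format name) for every signature hit, sorted by offset."""
    hits = []
    for name, magic in SIGNATURES.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, name))
            start = data.find(magic, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    # Synthetic demo: a PNG header, some filler, then a ZIP header.
    sample = bytes.fromhex("89504E470D0A1A0A") + b"\x00" * 8 + b"PK\x03\x04rest"
    for offset, name in find_signatures(sample):
        print(f"0x{offset:08x}  {name}")
```

On a real challenge you would pass open('challenge.png', 'rb').read() instead of the synthetic sample. Any hit at a non-zero offset is worth carving out.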

3. Search for obvious embedded text

strings challenge.png | grep -i pico
# Broader search when you don't know the flag prefix:
strings challenge.png | grep -E '[A-Za-z0-9+/]{20,}='   # base64 blobs
strings challenge.png | grep -i flag
strings challenge.png | grep -i ctf
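
Finding a base64 blob is only half the job; you still have to decode it. The sketch below mirrors the grep pattern above and keeps only runs that decode to printable ASCII. The helper name decode_base64_runs and the printable-ASCII filter are assumptions made for this example, so a flag wrapped in a second encoding layer would be filtered out.

```python
import base64
import binascii
import re

# Runs of 20+ base64 characters with optional padding. The length threshold
# matches the grep pattern above and is a heuristic, not a rule.
B64_RUN = re.compile(rb"[A-Za-z0-9+/]{20,}={0,2}")


def decode_base64_runs(data: bytes) -> list[bytes]:
    """Decode every base64-looking run that yields printable ASCII."""
    results = []
    for match in B64_RUN.finditer(data):
        run = match.group()
        if b"=" not in run:
            # Unpadded runs may have picked up stray characters; trim the
            # run to a multiple of four so it can be decoded.
            run = run[: len(run) - len(run) % 4]
        try:
            decoded = base64.b64decode(run, validate=True)
        except (binascii.Error, ValueError):
            continue
        if decoded and all(32 <= b < 127 for b in decoded):
            results.append(decoded)
    return results
```

Feed it the raw file bytes. Anything it returns is a candidate flag or a hint for the next step.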

4. Read the metadata with exiftool

EXIF metadata is the first place challenge authors hide flags because it requires no pixel manipulation. One command dumps everything.

# Install:
sudo apt install libimage-exiftool-perl
# Dump all metadata:
exiftool challenge.jpg
# Focus on comment-type fields that often hide flags:
exiftool -Comment -Artist -UserComment -Description challenge.jpg

5. Check file size vs expected size

An 800x600 RGB PNG with no hidden data typically compresses to roughly 300-800 KB depending on the image content, well under the 1.44 MB of raw pixel data (800 x 600 x 3 bytes). If the file is 3 MB, something extra is in there. Use ls -lh challenge.png and compare against a reference or against what the image dimensions would suggest.
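
The size check comes down to simple arithmetic. As a rough rule of thumb (an assumption, not a guarantee), a losslessly compressed image should not be much larger than its raw pixel data, so a ratio well above 1.0 is a strong hint that something is appended or embedded. The function names here are illustrative only.

```python
def raw_pixel_bytes(width: int, height: int, channels: int = 3, bit_depth: int = 8) -> int:
    """Uncompressed size of the pixel data in bytes."""
    return width * height * channels * bit_depth // 8


def size_ratio(file_size: int, width: int, height: int, channels: int = 3) -> float:
    """File size divided by raw pixel size; values above 1.0 are suspicious."""
    return file_size / raw_pixel_bytes(width, height, channels)


if __name__ == "__main__":
    print(raw_pixel_bytes(800, 600))                  # raw bytes for 800x600 RGB
    print(round(size_ratio(3_000_000, 800, 600), 2))  # a 3 MB file vs. that size
```

A ratio below 1.0 proves nothing either way. LSB payloads do not change the pixel count, so this check only catches appended or embedded files.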

Tip: These five checks together take about 60 seconds. Do all of them before launching any specialist tool. Many CTF flags live in the EXIF Comment field or show up as a base64 blob in strings output.

LSB (Least Significant Bit) steganography

LSB steganography hides data by replacing the lowest-order bit of each color channel value in every pixel. A pixel whose red channel value is 11001010 in binary has its LSB changed from 0 to 1 to store a hidden bit, making the channel value 11001011. The visible color difference is just one step on a 256-step scale, completely undetectable to the human eye. Across an entire image, you can store roughly one bit per color channel per pixel: a 1000x1000 RGB PNG can carry around 375 KB of hidden data.
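
To make the bit manipulation concrete, here is a minimal sketch that embeds and recovers a payload in a flat list of 8-bit channel values. It assumes bits are written most-significant-bit first, one per channel value, starting at the first value. Real challenges may use a different order, and the function names are invented for this example.

```python
def embed_lsb(channel_values: list[int], payload: bytes) -> list[int]:
    """Overwrite the lowest bit of each value with successive payload bits."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(channel_values):
        raise ValueError("payload too large for carrier")
    stego = list(channel_values)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & 0xFE) | bit  # clear the LSB, then set it
    return stego


def extract_lsb(channel_values: list[int], n_bytes: int) -> bytes:
    """Rebuild n_bytes bytes from the lowest bit of each value."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for value in channel_values[i * 8 : i * 8 + 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)


def capacity_bytes(width: int, height: int, channels: int = 3) -> int:
    """One hidden bit per channel per pixel, expressed in bytes."""
    return width * height * channels // 8


if __name__ == "__main__":
    carrier = [0b11001010] * 64
    stego = embed_lsb(carrier, b"flag")
    print(extract_lsb(stego, 4))
    print(capacity_bytes(1000, 1000))
```

No value changes by more than one step, which is why the stego image is visually indistinguishable from the original.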

LSB encoding is the most common steganography technique in CTF challenges because it is easy to implement and the results look completely normal.

zsteg: automated LSB scanning for PNG and BMP

zsteg automatically tests dozens of combinations of channels (R, G, B, alpha), bit positions (bit 0 through bit 7), and byte order, then reports anything that decodes as readable text. One command is usually enough.

zsteg challenge.png          # scan common channel/bit combinations
zsteg -a challenge.png       # exhaustive mode (slower, fewer missed cases)
# Example output to look for:
# b1,rgb,lsb,xy         .. text: "picoCTF{hidden_in_lsb_42}"
# b1,rgba,lsb,xy        .. file: Zip archive data
# Extract a specific channel to a file:
zsteg -e 'b1,rgb,lsb,xy' challenge.png > extracted.bin
file extracted.bin

Pay attention to entries that report file: rather than just text. If zsteg says it found a Zip archive or PNG inside, extract that channel and open the result with the appropriate tool.

Stegsolve: visual LSB inspection

When zsteg finds nothing, Stegsolve lets you visually inspect individual bit planes. Some challenges encode data in a non-standard order (column-major instead of row-major, or MSB instead of LSB) that zsteg misses. In Stegsolve, use the left and right arrow keys to flip through R0, R1... G0, G1... B0, B1... planes. If a plane shows a hidden image or text, you have found the encoding scheme.

# Download and launch:
wget http://www.caesum.com/handbook/Stegsolve.jar -O stegsolve.jar
java -jar stegsolve.jar
# Then: File > Open, use arrow keys to cycle planes,
# or Analyse > Data Extract to combine specific bit planes.

Challenges using LSB techniques

picoCTF 2023 / HideToSee
picoCTF 2023 / hideme

File-within-file and appended data

Many file formats tolerate extra data appended after their official end marker: a PNG ends at its IEND chunk, but a ZIP (or any other file) stitched on after it stays invisible to image viewers while unzip still extracts it. Suspect this when the file is far larger than its dimensions warrant, or when strings shows a PK (ZIP) or ELF signature buried mid-file. The workhorses are binwalk challenge.png to locate signatures, binwalk -e (or -Me for recursion) to extract, and foremost or a manual dd carve when automatic extraction misfires.
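
For PNG carriers you can locate the appended data yourself by walking the chunk list. Every chunk is a 4-byte big-endian length, a 4-byte type, the data, and a 4-byte CRC, and whatever follows the IEND chunk is not part of the image. The sketch below does not verify CRCs, and the function name png_trailing_data is an assumption for illustration.

```python
import struct

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"


def png_trailing_data(data: bytes) -> bytes:
    """Return whatever follows the IEND chunk (empty bytes if nothing does)."""
    if not data.startswith(PNG_MAGIC):
        raise ValueError("not a PNG")
    offset = len(PNG_MAGIC)
    while offset + 8 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[offset : offset + 8])
        offset += 12 + length  # 4 length + 4 type + data + 4 CRC
        if chunk_type == b"IEND":
            return data[offset:]
    raise ValueError("no IEND chunk found")
```

Write the returned bytes to a file and run file on it; this is the manual equivalent of a dd carve at the offset binwalk reports.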

For the full carving workflow, magic-byte and trailer tables, polyglot construction, and challenge receipts (UnforgottenBits, hideme), see File Carving and Magic Bytes for CTF.

Metadata and EXIF data

EXIF metadata sits outside the pixel array, so a flag stashed in a comment field, in GPS coordinates, or in the embedded thumbnail JPEG leaves the image looking identical. exiftool -all challenge.jpg dumps everything; exiftool -Comment -Artist -UserComment -Description challenge.jpg targets the fields authors reach for most, and exiftool -b -ThumbnailImage challenge.jpg > thumb.jpg pulls out a thumbnail that sometimes holds the flag on its own.

For the field-by-field breakdown, the GPS-coordinate tricks, PNG text chunks versus JPEG EXIF, and worked challenge examples, see EXIF and Metadata Forensics for CTF.

Audio steganography

Audio carriers hide data three main ways: images or text painted into a spectrogram (generate one with sox audio.wav -n spectrogram -o spec.png and always try this first), LSB encoding in uncompressed WAV samples, and Morse or binary tones visible as short and long pulses in the waveform. If the audio sounds like static, the spectrogram usually reveals the answer immediately.
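
WAV LSB extraction can be sketched with nothing but the standard library. This version assumes 16-bit PCM, one hidden bit per sample, packed most-significant-bit first, and it treats interleaved channels as a single stream. All of those are guesses that a given challenge may not match. The make_demo_wav helper exists only to produce a synthetic test file.

```python
import io
import wave


def wav_lsb_bytes(wav_file, n_bytes: int) -> bytes:
    """Collect the LSB of each 16-bit sample into at most n_bytes bytes."""
    with wave.open(wav_file, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        frames = wav.readframes(wav.getnframes())
    # Samples are little-endian, so the low byte of each sample comes first.
    lsbs = [frames[i] & 1 for i in range(0, len(frames), 2)]
    out = bytearray()
    for start in range(0, min(n_bytes * 8, len(lsbs) - 7), 8):
        byte = 0
        for bit in lsbs[start : start + 8]:
            byte = (byte << 1) | bit
        out.append(byte)
    return bytes(out)


def make_demo_wav(payload: bytes) -> io.BytesIO:
    """Build a synthetic mono WAV whose sample LSBs spell out payload."""
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(8000)
        wav.writeframes(b"".join((1000 + bit).to_bytes(2, "little") for bit in bits))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(wav_lsb_bytes(make_demo_wav(b"hi"), 2))
```

wave.open accepts a path as well as a file object, so wav_lsb_bytes('audio.wav', 64) works on a real challenge file.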

For the full toolchain (Audacity and Sonic Visualiser settings, WAV LSB extraction with stego-lsb, DTMF and Morse decoding with multimon-ng, and challenge receipts), see Audio Steganography for CTF.

Text and whitespace steganography

Text-based steganography does not involve images or audio at all. The challenge provides a text file, source code file, or HTML page that looks normal but has data hidden in the whitespace or in invisible Unicode characters. These challenges require different tools from the image-focused ones.

Trailing whitespace encoding (SNOW)

The SNOW tool encodes messages by appending space and tab characters to the ends of lines. The trailing whitespace is invisible in most text editors and terminals but carries binary data. SNOW is specifically designed for this scheme. On Debian, Ubuntu, and Kali the package and command are both named stegsnow.

sudo apt install stegsnow
# Decode hidden message from a text file:
stegsnow -C message.txt
# With a passphrase (some challenges use one):
stegsnow -C -p 'password' message.txt
# Verify trailing whitespace is present:
cat -A message.txt   # -A shows $ at line end and ^I for tabs
# Lines with trailing spaces will show: 'some text   $'
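
When stegsnow returns nothing useful, it helps to look at the raw trailing whitespace yourself. The sketch below lists each line's trailing run and applies a toy mapping of space to 0 and tab to 1. That mapping is an assumption for hand-rolled challenges and is not SNOW's actual encoding, so use stegsnow for files produced by SNOW.

```python
def trailing_whitespace(text: str) -> list[tuple[int, str]]:
    """Return (line number, trailing run) for lines ending in spaces or tabs."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.rstrip(" \t")
        if len(stripped) != len(line):
            hits.append((number, line[len(stripped):]))
    return hits


def naive_bits(trailing: str) -> str:
    """Toy mapping (assumption, not SNOW's real scheme): space=0, tab=1."""
    return "".join("1" if c == "\t" else "0" for c in trailing)


if __name__ == "__main__":
    sample = "hello \t \nworld\nfoo\t"
    for number, run in trailing_whitespace(sample):
        print(number, repr(run), naive_bits(run))
```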

Zero-width and invisible Unicode characters

Some tools encode binary data using zero-width Unicode characters that are completely invisible in normal text rendering. Common culprits include:

U+200B (Zero Width Space), encoded as E2 80 8B in UTF-8
U+200C (Zero Width Non-Joiner), encoded as E2 80 8C
U+200D (Zero Width Joiner), encoded as E2 80 8D
U+FEFF (Zero Width No-Break Space / BOM), encoded as EF BB BF

# Detect zero-width characters with xxd and grep:
xxd message.txt | grep 'e2 80 8b'   # zero-width space
xxd message.txt | grep 'e2 80 8c'   # zero-width non-joiner
xxd message.txt | grep 'e2 80 8d'   # zero-width joiner
# Use Python to find and decode a zero-width binary encoding:
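python3 -c "
text = open('message.txt', encoding='utf-8').read()
zwsp = '\u200b'  # zero-width space = '1'
zwnj = '\u200c'  # zero-width non-joiner = '0'
# The 1/0 assignment is a guess; swap the two if the output is gibberish.
bits = ''.join('1' if c == zwsp else '0' if c == zwnj else '' for c in text)
print(bits)
# Ignore any incomplete trailing byte when packing the bits into bytes:
print(bytes(int(bits[i:i+8], 2) for i in range(0, len(bits) - 7, 8)))
"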

Homoglyph substitution

Homoglyphs are characters that look identical (or nearly identical) to common Latin letters but are different Unicode code points. For example, the Cyrillic letter а (U+0430) looks exactly like the Latin a (U+0061). A challenge might replace select letters in a paragraph with their homoglyph equivalents to encode a binary message.
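
One way such a binary message could be read back is to classify each letter by Unicode script. This sketch assumes the simplest possible scheme: every letter carries one bit, Latin for 0 and anything else for 1. A real challenge may only count letters that have a look-alike, or may use a different mapping, so adjust accordingly. The name homoglyph_bits is made up for this example.

```python
import unicodedata


def homoglyph_bits(text: str) -> str:
    """Latin letter -> '0', non-Latin letter -> '1' (assumed scheme)."""
    bits = []
    for ch in text:
        if not ch.isalpha():
            continue  # skip digits, punctuation and whitespace
        name = unicodedata.name(ch, "")
        bits.append("0" if name.startswith("LATIN") else "1")
    return "".join(bits)


if __name__ == "__main__":
    # Latin a, Cyrillic a (U+0430), Latin b
    print(homoglyph_bits("a\u0430b"))
```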

# Check for non-ASCII characters in a text file:
python3 -c "
text = open('message.txt').read()
for i, c in enumerate(text):
    if ord(c) > 127:
        print(f'Position {i}: U+{ord(c):04X} ({c!r})')
"

Note: Text steganography challenges often come with a hint that the file "looks normal" or asks you to look more carefully. If you open a text file and it seems unremarkable, paste its contents into a hex editor before giving up. Zero-width characters and unusual Unicode code points will be immediately visible.

Decoding the extracted bytes: The bit string you recover from whitespace or homoglyphs is rarely the flag itself. Authors typically wrap it in base64, hex, ROT, or an XOR layer (sometimes several at once). Paste the decoded bytes into Recipe Chain and run Magic mode to auto-discover the decode pipeline, or stack the operations manually if you already know the layering.
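
If you prefer to stack the operations in code, a small breadth-first search over common decodings often finds the pipeline. This is a rough sketch, not a replacement for a full decoder: it only tries base64, hex, ROT13, and single-byte XOR. The picoCTF{ marker and the depth limit are assumptions you should change per competition.

```python
import base64
import codecs


def xor_bytes(data: bytes, key: int) -> bytes:
    return bytes(b ^ key for b in data)


def candidate_decodings(data: bytes) -> dict[str, bytes]:
    """Apply one layer of each encoding; keep the ones that succeed."""
    out = {}
    try:
        out["base64"] = base64.b64decode(data, validate=True)
    except ValueError:  # binascii.Error is a ValueError subclass
        pass
    try:
        out["hex"] = bytes.fromhex(data.decode("ascii"))
    except ValueError:  # also covers UnicodeDecodeError
        pass
    try:
        out["rot13"] = codecs.encode(data.decode("ascii"), "rot_13").encode("ascii")
    except ValueError:
        pass
    return {name: blob for name, blob in out.items() if blob and blob != data}


def find_flag(data: bytes, marker: bytes = b"picoCTF{", max_depth: int = 3):
    """Search stacked decodings; return (path, plaintext) or None."""
    frontier = [([], data)]
    for _ in range(max_depth + 1):
        next_frontier = []
        for path, blob in frontier:
            if marker in blob:
                return path, blob
            for key in range(1, 256):  # single-byte XOR as a final layer
                plain = xor_bytes(blob, key)
                if marker in plain:
                    return path + [f"xor:{key}"], plain
            for name, decoded in candidate_decodings(blob).items():
                next_frontier.append((path + [name], decoded))
        frontier = next_frontier
    return None
```

The returned path records which layers were peeled off, in order, which is handy for the write-up.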

Color channel and bit-plane analysis

Sometimes a flag is encoded in a single color channel and invisible in the composite image. The red channel might carry the hidden data while the green and blue channels contain only noise, making the flag undetectable in a normal RGB view. Stegsolve is the primary tool for this kind of visual analysis.

Cycling through bit planes in Stegsolve

Stegsolve displays one bit plane at a time. Press the left and right arrow keys to cycle through all 32 planes (8 bits for each of R, G, B, and alpha). If a plane shows coherent content (readable text, a QR code, another image) rather than random noise, the data is encoded there.

Open the image: File > Open.
Use the left/right arrow keys in the main window to step through planes. The plane name appears at the bottom (e.g. Red plane 0).
Plane 0 for each channel is the LSB. Plane 7 is the MSB. Challenges using LSB encoding will show visible structure in plane 0; MSB encoding shows in plane 7.
If you find a plane with readable content, note which channel and bit number.
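
What Stegsolve shows for a single plane can be approximated in a few lines: each channel value becomes white if the chosen bit is set and black otherwise. The sketch assumes Pillow is installed for the file handling. The bit_plane helper itself is pure Python, and both function names are invented for this illustration.

```python
def bit_plane(values, bit: int) -> list[int]:
    """Map each 8-bit value to 255 if the chosen bit is set, else 0."""
    return [255 if (v >> bit) & 1 else 0 for v in values]


def save_plane(path_in: str, channel: str, bit: int, path_out: str) -> None:
    """Write one bit plane of one channel ('R', 'G' or 'B') as a grayscale image."""
    from PIL import Image  # imported lazily; requires Pillow

    img = Image.open(path_in).convert("RGB")
    band = img.getchannel(channel)
    plane = Image.new("L", img.size)
    plane.putdata(bit_plane(band.getdata(), bit))
    plane.save(path_out)


if __name__ == "__main__":
    print(bit_plane([0b11001011, 0b11001010], 0))  # LSB plane
    print(bit_plane([0x80, 0x7F], 7))              # MSB plane
```

Calling save_plane('challenge.png', 'R', 0, 'red0.png') would produce roughly what Stegsolve labels Red plane 0.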

Data Extract: combining bit planes

The Analyse > Data Extract dialog in Stegsolve lets you select which bit planes to combine and extract as a binary stream. This is useful when the flag is spread across multiple planes (e.g. bits 0-2 of the red channel, read in column-major order).

Go to Analyse > Data Extract.
Tick the checkboxes for the channels and bit positions you want to include. For standard LSB encoding, tick Red 0, Green 0, Blue 0.
Choose Row or Column order to match how the data was written.
Click Preview. Look for readable text at the top of the output. Use Save Text or Save Bin to export it.
Run strings saved.bin | grep pico or file saved.bin to identify what was extracted.
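
The effect of the Row and Column setting is easiest to see in code. This sketch takes a flat, row-major list of (R, G, B) tuples, such as what Pillow's getdata() returns, and reads one bit from the selected channels in either order. The MSB-first packing is an assumption, and the function names are hypothetical.

```python
def pack_bits(bits: list[int]) -> bytes:
    """Pack bits MSB-first into bytes, ignoring an incomplete trailing byte."""
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        byte = 0
        for b in bits[i : i + 8]:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)


def extract_bits(pixels, width, height, channels=(0, 1, 2), bit=0, order="row"):
    """Read one bit per selected channel, walking pixels by row or by column."""
    if order == "row":
        coords = [(x, y) for y in range(height) for x in range(width)]
    else:
        coords = [(x, y) for x in range(width) for y in range(height)]
    bits = [(pixels[y * width + x][c] >> bit) & 1 for x, y in coords for c in channels]
    return pack_bits(bits)


if __name__ == "__main__":
    # 2x4 image with the letter 'A' written down the columns of the red LSBs.
    pixels = [(0, 0, 0)] * 8
    pixels[2] = (1, 0, 0)
    pixels[7] = (1, 0, 0)
    print(extract_bits(pixels, 2, 4, channels=(0,), order="column"))
    print(extract_bits(pixels, 2, 4, channels=(0,), order="row"))
```

The same pixels yield different bytes depending on the order, which is why a wrong Row/Column choice turns a readable flag into noise.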

Using Python to isolate a single channel

If you know which channel contains the hidden data, you can extract it without Stegsolve using the Python Pillow library. This is useful when you want to automate the extraction or process the channel data further.

python3 -c "
from PIL import Image
img = Image.open('challenge.png').convert('RGB')
r, g, b = img.split()
# Save just the red channel as a grayscale image:
r.save('red_channel.png')
# Check if LSBs of red channel form a readable byte stream:
pixels = list(img.getdata())
bits = ''.join(str(p[0] & 1) for p in pixels)  # red channel LSBs
message = bytes(int(bits[i:i+8], 2) for i in range(0, len(bits)-7, 8))
print(message[:200])
"

Challenges using bit-plane analysis

picoCTF 2023 / MSB
picoCTF 2025 / RED

Recommended stego triage workflow

When you receive a challenge file, follow this numbered decision tree. It moves from the fastest, most automated checks to progressively more manual investigation. Stop as soon as you find the flag.

1. Identify the actual file type. Run file challenge.* and xxd challenge.* | head. If the format does not match the extension, rename the file and proceed accordingly.
2. Check metadata and strings. Run exiftool challenge.* and strings challenge.* | grep -i pico. These two commands together find a significant fraction of CTF stego flags because challenge authors frequently use EXIF comments for beginner-level challenges.
3. Scan for embedded files. Run binwalk challenge.*. If it reports a second file signature, run binwalk -e challenge.* and examine the extracted directory. This covers most of the file-within-file category automatically; fall back to a manual carve if extraction misfires.
4. For PNG and BMP images: run zsteg. zsteg challenge.png covers LSB encoding in all common channel configurations. If the quick scan finds nothing, run zsteg -a challenge.png for exhaustive mode.
5. For JPEG images: try steghide. steghide extract -sf challenge.jpg -p '' attempts extraction with a blank passphrase. If that fails, the payload may be protected by a passphrase, so check the other challenge materials for one. If none is obvious, try stegcracker challenge.jpg /usr/share/wordlists/rockyou.txt.
6. For audio files: check the spectrogram. Run sox audio.wav -n spectrogram -o spec.png and open the result. If you see text or an image, you are done. If not, open the file in Audacity and look at the waveform for Morse patterns.
7. Open in Stegsolve and cycle planes manually. This catches non-standard LSB orderings, MSB encoding, single-channel encoding, and anything that automated tools miss. Use the Data Extract dialog to try different channel and bit combinations systematically.
8. For text files: look for whitespace and Unicode. Run cat -A file.txt to see trailing whitespace, then stegsnow -C file.txt. Run xxd file.txt | grep 'e2 80' to check for zero-width Unicode characters.

Strategy note: Steps 1 through 3 are format-agnostic and take under two minutes total. Always do them before branching into format-specific tools. Steps 4 and 5 depend on the image format: zsteg for PNG, steghide for JPEG, and both for BMP. Step 7 (Stegsolve) is the manual fallback when everything automated has failed.

Quick reference

Tool matrix covering the main steganography techniques encountered in CTF competitions. For installation details, see the Steganography Tools guide.

Tool             Finds                                File types            One-liner
exiftool         EXIF comments, GPS, thumbnails       JPEG, PNG, TIFF, PDF  exiftool -all file.jpg
strings + grep   Plaintext flags, base64 blobs        Any                   strings file.png | grep -i pico
binwalk          Appended/embedded files              Any binary            binwalk -e file.png
file + xxd       True file format, magic bytes        Any                   xxd file.png | head
zsteg            LSB / MSB pixel encoding             PNG, BMP              zsteg file.png
steghide         Passphrase-protected payload         JPEG, BMP             steghide extract -sf file.jpg -p ''
stegcracker      steghide passphrase via wordlist     JPEG, BMP             stegcracker file.jpg rockyou.txt
Stegsolve        Any bit-plane pattern, MSB encoding  PNG, BMP, JPEG        java -jar stegsolve.jar
sox spectrogram  Images hidden in audio frequency     WAV, AIFF             sox audio.wav -n spectrogram -o s.png
stegsnow         Whitespace-encoded messages          TXT                   stegsnow -C file.txt

Format-to-tool decision guide

Any file, first 60 seconds: file, exiftool, strings | grep pico, binwalk
PNG image: zsteg first, then Stegsolve
JPEG image: steghide with blank password, then stegcracker if locked
BMP image: zsteg and steghide both apply
WAV / audio: sox spectrogram, then Audacity waveform view for Morse
Text / TXT file: cat -A for whitespace, stegsnow, then Python Unicode check
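
The decision guide above can be sketched as a lookup keyed on the file's leading bytes. The tool names come from the guide itself. The BMP signature (BM) and the WAVE tag at offset 8 of a RIFF header are additions not listed earlier in this post, and the function only returns a plan; it does not run anything.

```python
COMMON_CHECKS = ["file", "exiftool", "strings | grep -i pico", "binwalk"]


def triage_plan(data: bytes) -> list[str]:
    """Return the ordered tool list for a file, judged by its leading bytes."""
    if data.startswith(b"\x89PNG"):
        specific = ["zsteg", "Stegsolve"]
    elif data.startswith(b"\xff\xd8\xff"):
        specific = ["steghide (blank password)", "stegcracker"]
    elif data.startswith(b"BM"):
        specific = ["zsteg", "steghide"]
    elif data.startswith(b"RIFF") and data[8:12] == b"WAVE":
        specific = ["sox spectrogram", "Audacity waveform view"]
    else:
        try:
            data.decode("utf-8")  # valid UTF-8 is treated as a text file
            specific = ["cat -A", "stegsnow", "Python Unicode check"]
        except UnicodeDecodeError:
            specific = []  # unknown binary: the common checks still apply
    return COMMON_CHECKS + specific


if __name__ == "__main__":
    print(triage_plan(b"\x89PNG\r\n\x1a\n"))
    print(triage_plan(b"hello world\n"))
```

Pass the file's full contents, not a truncated header, so that the UTF-8 check does not trip over a multi-byte character cut in half.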