June 14, 2026

Python Reversing for CTF: Bytecode, Frozen Binaries, and Obfuscated Scripts

Decompile Python bytecode, unpack PyInstaller executables, and peel exec-obfuscation layers. Everything you need to reverse engineer Python CTF challenges.

Python is the most reversible compiled language in CTF

Here is the honest take: Python reversing is easier than reversing C, not harder. The bytecode is documented. The standard library ships a disassembler. Even frozen executables are just zipped .pyc files with a stub in front. Every secret the program uses at runtime is sitting somewhere in that artifact, because Python has to be able to load it.

The confusion comes from not knowing which tool to reach for. Someone who has never seen a .pyc file googles for ten minutes, gets three conflicting answers about uncompyle6 and pycdc, and gives up. Someone who has done it once opens the file, checks the magic bytes, picks the right tool, and has source in 30 seconds. This guide is the 30-second version.

Every secret a Python program uses at runtime has to be present in the artifact. The question is just where.

Python challenges in picoCTF fall into three categories: source code with obfuscation layered on top, compiled bytecode (.pyc files), and frozen executables that bundle the entire interpreter (PyInstaller, py2exe, cx_Freeze). Each requires a different opening move, but the logic after that is nearly identical. Identify the type, extract the real logic, read the flag comparison.

Note: This post covers reversing Python programs. Writing Python scripts to solve other challenge types is a different skill, covered in the Python for CTF guide.

Three-second triage: what are you actually looking at

Run file ./chall and strings ./chall | head -20 before doing anything else. Those two commands tell you which category you are in.

| What you see | Type | First move |
| --- | --- | --- |
| file: ASCII text, .py extension | Python source | Read it; check for exec( or eval( calls |
| magic bytes: xx 0d 0d 0a | Compiled bytecode (.pyc) | Run a decompiler, or disassemble with dis after loading the code object with marshal |
| strings: "PyInstaller" or "_MEIPASS" | Frozen executable | python pyinstxtractor.py ./chall |
| file: Zip archive, .pyz extension | Python zip archive | unzip chall.pyz -d out/ then decompile the .pyc inside |
| strings: "UPX!" alongside "PyInstaller" | UPX-packed frozen binary | upx -d ./chall first, then pyinstxtractor |

Spotting a .pyc file without an extension requires checking the first four bytes. Every .pyc starts with a two-byte magic number followed by 0d 0a. Run xxd chall | head -1 and compare the first two bytes against the table below to identify the Python version before you pick a decompiler.

# Quick check: is this a .pyc?
xxd chall | head -1
# Output: 00000000: 550d 0d0a ...   <-- Python 3.8 bytecode
# Python version from the first two bytes (little-endian magic):
#   3.8  -> 55 0d    3.9  -> 61 0d
#   3.10 -> 6f 0d    3.11 -> a7 0d
#   3.12 -> cb 0d    3.13 -> f3 0d
# Print the magic of the interpreter you are running, for comparison:
python3 -c "import importlib.util; print(importlib.util.MAGIC_NUMBER.hex())"
Tip: The challenge file often lacks a .pyc extension to obscure what it is. Always run file and xxd before assuming anything from the name. weirdSnake in picoCTF 2024 is a classic example: the file is named snake with no extension, but the magic bytes immediately identify it as Python bytecode.
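
The same check can be scripted. The sketch below is a minimal example, assuming you only care about CPython 3.8 through 3.13; the function name identify_pyc and the KNOWN_MAGIC table are illustrative, not part of any library.

```python
import struct

# Magic numbers for CPython final releases 3.8 through 3.13.
# Each is the little-endian 16-bit value stored in the first two bytes of a .pyc.
KNOWN_MAGIC = {
    3413: "3.8",
    3425: "3.9",
    3439: "3.10",
    3495: "3.11",
    3531: "3.12",
    3571: "3.13",
}


def identify_pyc(header: bytes):
    """Return a version string, 'unknown .pyc', or None if this is not a .pyc."""
    if len(header) < 4 or header[2:4] != b"\r\n":
        return None  # a .pyc always has 0d 0a in bytes 3 and 4
    (magic,) = struct.unpack("<H", header[:2])
    return KNOWN_MAGIC.get(magic, "unknown .pyc")


print(identify_pyc(bytes.fromhex("550d0d0a")))  # 3.8
print(identify_pyc(b"#!/u"))                    # None
```

On a real file, pass in open('chall', 'rb').read(4).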

Reading bytecode with dis: the universal fallback

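The Python standard library ships a disassembler, so there is nothing to install. One caveat: python3 -m dis expects Python source, not a .pyc, so for compiled files you load the code object with marshal and hand it to dis.dis. Run the script with the same Python version that produced the .pyc, because both the marshal format and the opcode set change between releases. If a decompiler fails, this still works. It is also the right tool when you want to see the exact instructions rather than an approximate source reconstruction.

# disasm_pyc.py: disassemble a .pyc with the standard library
# Usage: python3 disasm_pyc.py | less
import dis
import marshal

with open('chall.pyc', 'rb') as f:
    f.read(16)  # skip the header: 16 bytes for Python 3.7+, 12 for 3.3-3.6, 8 for older
    code = marshal.loads(f.read())

dis.dis(code)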

Most CTF flag checkers boil down to four opcodes. Learn these and you can read a flag comparison without reconstructing full source:

| Opcode | Meaning | Why CTF cares |
| --- | --- | --- |
| LOAD_CONST | Push a literal value onto the stack | Keys, ciphertext arrays, and expected values live here |
| BUILD_LIST / BUILD_STRING | Assemble N items from the stack into a list or string | A long sequence of LOAD_CONST followed by BUILD_LIST is the ciphertext array |
| COMPARE_OP | Compare top two stack values (==, !=, etc.) | The flag comparison. The expected value sits on the stack just before this |
| BINARY_XOR / BINARY_OP | Apply a binary operator to the top two stack values | Tells you exactly how the cipher works without reading any source |
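
LOAD_CONST reads from the code object's co_consts tuple, so you can also dump every literal at once instead of scrolling through the disassembly. The sketch below is one way to do that; collect_constants is a hypothetical helper, and the checker is compiled in memory so the example needs no challenge file.

```python
import types


def collect_constants(code):
    """Walk a code object and every nested code object, yielding their constants."""
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from collect_constants(const)  # function and class bodies nest here
        else:
            yield const


# Hypothetical checker standing in for a real challenge.
source = '''
def check(guess):
    return guess == "example_expected_value"
'''
code = compile(source, "<demo>", "exec")
constants = list(collect_constants(code))
print(constants)
```

For a real .pyc, replace the compile call with the marshal.loads step and keep the rest unchanged.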

The picoCTF 2024 challenge weirdSnake is the canonical bytecode XOR example. The file has no extension, but xxd snake | head -1 reveals a .pyc magic number immediately. Disassembling it with dis shows two key sequences:

# Key string built character by character:
2 0 LOAD_CONST 0 ('t')
2 LOAD_CONST 1 ('_')
4 LOAD_CONST 2 ('J')
6 LOAD_CONST 3 ('o')
8 LOAD_CONST 4 ('3')
10 BUILD_STRING 5
12 STORE_NAME 0 (key_str)
# Ciphertext integers in a list:
4 0 LOAD_CONST 0 (99)
2 LOAD_CONST 1 (116)
4 LOAD_CONST 2 (38)
...
XX BUILD_LIST N
XX STORE_NAME 1 (input_list)

The LOAD_CONST sequence before BUILD_STRING reconstructs key_str = 't_Jo3'. The LOAD_CONST integers before BUILD_LIST are the ciphertext. A BINARY_XOR opcode later in the disassembly confirms the cipher. Once you have both, decryption is a few lines of Python:

from itertools import cycle
key_list = [ord(c) for c in 't_Jo3']
input_list = [99, 116, 38, ...] # all values from LOAD_CONST sequence
flag = ''.join(chr(a ^ b) for a, b in zip(input_list, cycle(key_list)))
print(flag) # picoCTF{N0t_sO_coNfus1ng_sn@ke_...}
Warning: Use itertools.cycle when the key is shorter than the ciphertext. Plain zip(input_list, key_list) stops at the shorter sequence and silently truncates the flag. This is the most common decryption bug in Python CTF scripts.
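
To see the truncation bug in isolation, here is a small round trip. The key and message are made up for the demonstration and have nothing to do with the challenge data.

```python
from itertools import cycle

key = b"key"                      # hypothetical 3-byte key
plaintext = b"example plaintext"  # hypothetical message, 17 bytes

# Encrypt with a repeating key.
ciphertext = [p ^ k for p, k in zip(plaintext, cycle(key))]

# Wrong: zip stops when the 3-byte key runs out.
truncated = bytes(c ^ k for c, k in zip(ciphertext, key))
# Right: cycle repeats the key for the full ciphertext.
recovered = bytes(c ^ k for c, k in zip(ciphertext, cycle(key)))

print(truncated)  # b'exa'
print(recovered)  # b'example plaintext'
```

No exception is raised in the wrong version, which is why the bug is easy to miss: you simply get a flag that ends early.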

Decompiling .pyc files: which tool wins for which version

Decompilers reconstruct Python source from bytecode. The quality varies wildly by Python version. The short rule: use pycdc for Python 3.9 and later; use uncompyle6 for Python 2.x through 3.8. When both fail, fall back to dis and read the opcodes directly.

| Tool | Python version support | Install |
| --- | --- | --- |
| pycdc | 3.0 through 3.13 (best for 3.9+) | Build from source (see below) |
| uncompyle6 | 2.x through 3.8 | pip install uncompyle6 |
| decompile3 | 3.7 and 3.8 (fork of uncompyle6) | pip install decompile3 |
| pycdas | Same as pycdc (ships alongside it) | Built with pycdc; raw disassembly output |

# uncompyle6 (Python 2.x to 3.8)
pip install uncompyle6
uncompyle6 chall.pyc
# decompile3 (Python 3.7-3.8 alternative)
pip install decompile3
decompile3 chall.pyc
# pycdc (Python 3.9+ and anything uncompyle6 fails on)
git clone https://github.com/zrax/pycdc && cd pycdc
cmake . && make
./pycdc chall.pyc
# pycdas: raw disassembly (like dis but more annotated)
./pycdas chall.pyc

When a decompiler prints Internal error or produces garbled output, check the Python version first. The magic bytes in the first four bytes of the .pyc file tell you exactly which version compiled it. Feeding a Python 3.11 .pyc into uncompyle6 will always fail; switch to pycdc.

Note: The pycdas tool (shipped alongside pycdc) produces a richer disassembly than Python's built-in dis module, including symbolic names for jump targets and cleaner constant formatting. When decompilation fails and raw dis output is hard to read, pycdas is a good middle ground.
When the decompiler fails, read the opcodes. The disassembler never fails.

Unpacking PyInstaller frozen executables

PyInstaller bundles a Python interpreter, all imported modules, and your script into a single ELF or PE file. At runtime, it unpacks itself into a temporary directory (the _MEIPASS folder) and runs the bundled .pyc. From a reversing perspective, all the Python bytecode is still there, just compressed inside the binary.

The fingerprint is reliable: strings ./chall | grep -i 'PyInstaller\|_MEIPASS\|pyi-' returns hits on any PyInstaller binary. Once confirmed, run pyinstxtractor to extract the contents:

# Detect
strings ./chall | grep -i 'PyInstaller\|_MEIPASS'
# Get pyinstxtractor (single-file script, no install needed)
wget https://raw.githubusercontent.com/extremecoders-re/pyinstxtractor/master/pyinstxtractor.py
# Extract
python3 pyinstxtractor.py ./chall
# Creates: chall_extracted/
# Find the main script (same name as the binary, no .pyc extension usually)
ls chall_extracted/
file chall_extracted/chall
# Decompile
cp chall_extracted/chall chall_main.pyc
pycdc chall_main.pyc
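
If strings is not available, the detection step can be approximated in Python by searching for the marker bytes of PyInstaller's embedded archive. This is a sketch under an assumption: the cookie value below is the one pyinstxtractor is understood to search for, so verify it against the tool's source before relying on it. The function name and the synthetic test data are illustrative.

```python
# Assumed marker for PyInstaller's embedded archive ("MEI" plus five fixed bytes).
PYINSTALLER_COOKIE = b"MEI\x0c\x0b\x0a\x0b\x0e"


def looks_like_pyinstaller(data: bytes) -> bool:
    """Return True if the archive cookie appears anywhere in the file contents."""
    return PYINSTALLER_COOKIE in data


# Synthetic stand-ins for real files: a stub followed by the cookie, and a plain binary.
frozen = b"\x7fELF" + b"\x00" * 32 + PYINSTALLER_COOKIE + b"\x00" * 16
plain = b"\x7fELF" + b"\x00" * 64
print(looks_like_pyinstaller(frozen), looks_like_pyinstaller(plain))
```

On a real target, pass in open('./chall', 'rb').read().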

Two gotchas that catch people the first time:

  • Missing magic bytes. pyinstxtractor usually prepends the correct header automatically, but if decompilation fails with "bad magic number," add it by hand. The correct magic for the Python version pyinstxtractor reports is the first four bytes of chall_extracted/struct.pyc. Copy those four bytes, follow them with twelve zero bytes to complete the 16-byte header, and prepend the whole 16 bytes to your main .pyc.
  • Wrong entry point. The binary name in the extracted folder is the entry point, but large PyInstaller bundles also contain many library .pyc files. Look for the one whose name matches the binary or run grep -rl 'picoCTF\|flag\|password' chall_extracted/ to find which .pyc contains the relevant logic.
# If pycdc reports 'bad magic number', fix the header manually:
MAGIC=$(xxd -l 4 chall_extracted/struct.pyc | awk '{print $2$3}' | sed 's/../\\x&/g')
printf "${MAGIC}" > header.bin
# Append 12 zero bytes (padding for Python 3.8+ header format)
python3 -c "open('header.bin','ab').write(b'\x00'*12)"
cat header.bin chall_extracted/chall > chall_main.pyc
pycdc chall_main.pyc
Tip: If the extracted directory contains a PYZ-00.pyz_extracted folder, all the bundled modules are there as individual .pyc files. The main application logic is almost always in the top-level file matching the binary name, not inside this folder.
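
The header repair is easier to get right in Python than in shell. The sketch below rebuilds a 16-byte header (four magic bytes plus twelve zero bytes) and checks the result by loading it back. add_pyc_header is a hypothetical helper, and the demo uses the running interpreter's own magic rather than bytes copied from struct.pyc.

```python
import importlib.util
import marshal


def add_pyc_header(body: bytes, magic: bytes) -> bytes:
    """Prefix a headerless code blob with a 16-byte header (magic + 12 zero bytes)."""
    if len(magic) != 4:
        raise ValueError("magic must be exactly 4 bytes")
    return magic + b"\x00" * 12 + body


# Demo with the running interpreter: marshal a code object, then rebuild the .pyc.
body = marshal.dumps(compile("answer = 6 * 7", "<demo>", "exec"))
pyc = add_pyc_header(body, importlib.util.MAGIC_NUMBER)

# Round trip: skip the 16-byte header and load the code object back.
# exec is safe here only because we compiled this code ourselves.
namespace = {}
exec(marshal.loads(pyc[16:]), namespace)
print(namespace["answer"])  # 42
```

For an extracted bundle, read the magic with open('chall_extracted/struct.pyc', 'rb').read(4) and the body from the headerless entry-point file.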

Peeling exec() obfuscation layers

The most common obfuscation pattern in Python CTF challenges is a self-decoding script: an outer script that decodes an inner payload and runs it via exec(). The outer script might base64-decode, XOR-decrypt, Fernet-decrypt, or zlib-decompress the payload before executing it. Multiple layers stack.

The fundamental insight never changes. At some point, every layer has the decoded payload in a variable, and it calls exec(payload) or exec(payload.decode()). Replacing that one call with print(payload.decode()) exposes the inner script without executing it.

# Pattern 1: simple base64 exec
# Original: exec(base64.b64decode(b'aW1wb3J0...'))
# Fix: print(base64.b64decode(b'aW1wb3J0...').decode())
# Pattern 2: zlib + base64 chain
# Original: exec(zlib.decompress(base64.b64decode(b'eJy...')).decode())
# Fix: print(zlib.decompress(base64.b64decode(b'eJy...')).decode())
# Pattern 3: marshal code object (cannot print as string, use dis instead)
import marshal, base64, dis
code = marshal.loads(base64.b64decode(b'YwAAAAA...'))
dis.dis(code) # disassemble the hidden code object
# Pattern 4: multiple nested layers
# Run the outer print to get layer 2, then repeat
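
When layers nest, the "print, then repeat" step can be automated. The sketch below assumes every layer has exactly the zlib + base64 shape from Pattern 2; wrap is a hypothetical obfuscator used only to build test input, and peel never calls exec.

```python
import base64
import re
import zlib

# Matches one layer of the form exec(zlib.decompress(base64.b64decode(b'...')).decode())
LAYER = re.compile(
    r"^exec\(zlib\.decompress\(base64\.b64decode\(b'([A-Za-z0-9+/=]+)'\)\)"
    r"\.decode\(\)\)$"
)


def wrap(source: str) -> str:
    """Hypothetical obfuscator: hide source behind zlib + base64 + exec."""
    blob = base64.b64encode(zlib.compress(source.encode())).decode()
    return f"exec(zlib.decompress(base64.b64decode(b'{blob}')).decode())"


def peel(script: str) -> str:
    """Decode layers until no wrapper of the known shape remains."""
    while (match := LAYER.match(script)):
        script = zlib.decompress(base64.b64decode(match.group(1))).decode()
    return script


inner = "print('inner layer reached')"
layered = wrap(wrap(inner))  # two layers deep
print(peel(layered))
```

Real challenges vary the wrapper, so expect to adjust the pattern per layer; the point is that decoding is data handling and never requires running the payload.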

The picoCTF 2022 challenge unpackme.flag.py uses Fernet encryption (AES-128-CBC with HMAC-SHA256) with a hardcoded key. The outer script embeds the key as a string literal, constructs a Fernet object, decrypts the embedded ciphertext, and calls exec(plain.decode()). Replacing the exec with a print reveals the inner Python script, which itself prints the flag when run.

# Inspect any self-decoding script without running the payload
# Edit unpackme.flag.py: change exec(plain.decode()) to print(plain.decode())
sed -i 's/exec(plain.decode())/print(plain.decode())/' unpackme.flag.py
python3 unpackme.flag.py
# Prints the inner source code; the flag is inside
Warning: Never use eval() to parse unknown output from a CTF server or challenge binary. Use ast.literal_eval() instead. It parses only Python literals (strings, numbers, lists, dicts, tuples) and refuses to execute arbitrary code. The picoCTF 2025 challenge quantum-scrambler shows exactly why: the server returns a nested list literal that looks safe to eval but should never be trusted unconditionally.
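
A short demonstration of the difference, using made-up strings rather than real server output:

```python
import ast

server_output = "[[1, 2], [3, [4, 5]]]"  # hypothetical response text
parsed = ast.literal_eval(server_output)
print(parsed[1][1])  # [4, 5]

hostile = "__import__('os').getcwd()"  # an expression, not a literal
rejected = False
try:
    ast.literal_eval(hostile)
except ValueError:
    rejected = True  # literal_eval refuses anything that is not a plain literal
print("rejected:", rejected)  # rejected: True
```

With eval(), the second string would have run; with ast.literal_eval() it raises before anything executes.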

The marshal pattern is the trickiest variant. When the payload is a compiled code object rather than a string, you cannot print it as text. The code object is the same format Python uses internally for function bodies. Disassembling it with dis.dis(code) gives you the opcodes directly, which is everything you need to trace the flag logic.
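
A self-contained round trip makes the marshal pattern concrete. The hidden source below is invented for the example, and the blob is only loadable by the same Python version that created it, which is also true of real challenge payloads.

```python
import base64
import dis
import marshal

# Build a payload the way an obfuscator might: compile, marshal, base64.
source = "flag_key = 'k3y'\nprint(flag_key)"  # hypothetical hidden script
blob = base64.b64encode(marshal.dumps(compile(source, "<hidden>", "exec")))

# Recover the code object and inspect it without executing anything.
hidden = marshal.loads(base64.b64decode(blob))
print(hidden.co_consts)  # literals used by the hidden code
print(hidden.co_names)   # names it reads and writes
dis.dis(hidden)          # full instruction listing
```

Checking co_consts and co_names first often answers the question before you read a single opcode.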

The five CTF patterns: what flag checkers actually do

After working through dozens of Python reversing challenges, the same five structures keep appearing. Recognizing the pattern takes you from "how do I even start" to "oh I know exactly what to do" in seconds.

1. Hardcoded comparison loop

The simplest pattern. The script checks user input character by character against a hardcoded expected value. In source, it is a for loop with if char != expected[i]. In bytecode, it is COMPARE_OP inside a loop. The expected value is visible as a string literal or a list of integer ASCII codes in the LOAD_CONST sequence.

crackme-py (picoCTF 2021) is the cleanest example. The script encodes the flag with a simple Caesar-style rotation and compares the result to a hardcoded string. Read the rotation function, invert it, apply it to the expected string.
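
Inverting a rotation means applying the negative shift. The sketch below uses a lowercase-only alphabet and a shift of 5 purely for illustration; it is not the challenge's actual function, so read the real alphabet and shift from the source you are reversing.

```python
import string

ALPHABET = string.ascii_lowercase


def rotate(text: str, shift: int) -> str:
    """Shift lowercase letters by `shift` positions; leave everything else alone."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % len(ALPHABET)])
        else:
            out.append(ch)
    return "".join(out)


expected = rotate("example_flag", 5)  # what a checker would hardcode
print(expected)
print(rotate(expected, -5))           # inverse shift recovers the input
```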

2. XOR with key from constants

The ciphertext is stored as a list of integers. The key is a string built from individual LOAD_CONST characters. The decryption is a repeating-key XOR loop. Already covered in the dis section above using weirdSnake as the example.

3. Custom cipher in obfuscated source

Some challenges ship a .py file where the logic is deliberately obfuscated: variable names replaced with single letters, whitespace stripped, string operations inlined. The cipher itself is simple (Caesar, Vigenere, XOR, character remapping), but reading the source takes effort. bloat.py (picoCTF 2022) is a classic. The script uses cryptic variable names but the actual logic is a character lookup in a static alphabet. Read the lookup table and invert it.

For this pattern, patching is often faster than reversing. Find the comparison and replace it with a print. The patchme.py (picoCTF 2022) challenge takes this literally: the password is in the source, and grepping for the assignment finds it immediately.

4. Exec layer chain

Covered in detail in the obfuscation section above. The fingerprint is an exec( call anywhere in the file. Peel each layer by replacing the exec with a print and running the script again. Two or three layers is the typical maximum in CTF challenges.

5. Bytecode virtual machine

The most advanced pattern. The challenge ships a custom bytecode format and a Python interpreter for it. Your task is to understand the bytecode ISA (instruction set architecture) and trace execution to find the flag. The Virtual Machine 0 and Virtual Machine 1 challenges from picoCTF 2023 both use this pattern: a Python script implements a tiny VM, and the bytecode program for that VM encodes the flag check. Read the VM loop, trace the bytecode, extract the expected value.

Key insight: For bytecode VM challenges, print the dispatch table from the VM loop (the dictionary or match statement mapping opcodes to handlers) before tracing execution. This gives you the ISA in one shot and saves the time of reverse engineering each handler individually.
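
To make the idea concrete, here is a toy VM with a three-instruction ISA, a disassembler driven by the opcode table, and an interpreter. Everything in it (opcode values, mnemonics, the sample program) is hypothetical; real challenges define their own ISA, but the table-first workflow is the same.

```python
# Hypothetical ISA: opcode byte -> (mnemonic, number of operand bytes).
OPCODES = {
    0x01: ("PUSH", 1),  # push the next byte
    0x02: ("XOR", 0),   # pop two values, push a ^ b
    0x03: ("CMP", 0),   # pop two values, push 1 if equal else 0
}


def disassemble(program: bytes):
    """Turn raw VM bytecode into (offset, mnemonic, operands) tuples."""
    listing, pc = [], 0
    while pc < len(program):
        name, argc = OPCODES[program[pc]]
        operands = list(program[pc + 1 : pc + 1 + argc])
        listing.append((pc, name, operands))
        pc += 1 + argc
    return listing


def run(program: bytes):
    """Execute the program (no jumps in this ISA) and return the final stack."""
    stack = []
    for _, name, operands in disassemble(program):
        if name == "PUSH":
            stack.append(operands[0])
        elif name == "XOR":
            b, a = stack.pop(), stack.pop()
            stack.append(a ^ b)
        elif name == "CMP":
            b, a = stack.pop(), stack.pop()
            stack.append(int(a == b))
    return stack


# PUSH 0x41, PUSH 0x20, XOR, PUSH 0x61, CMP
program = bytes([0x01, 0x41, 0x01, 0x20, 0x02, 0x01, 0x61, 0x03])
for row in disassemble(program):
    print(row)
print(run(program))  # [1]
```

Once the listing is readable, the expected value (0x61 here) is sitting in a PUSH right before the CMP, just as it sits in LOAD_CONST before COMPARE_OP in CPython bytecode.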

picoCTF challenge map

Python reversing challenges span nearly every picoCTF event. The difficulty ladder is gradual enough that beginners can start at the source-reading end and work up to bytecode VM challenges without gaps.

| Challenge | Pattern | Key skill |
| --- | --- | --- |
| patchme.py (2022) | Hardcoded comparison | Grep source for password assignment |
| bloat.py (2022) | Obfuscated source cipher | Read alphabet lookup, replace exec with print |
| unpackme.flag.py (2022) | Exec layer (Fernet) | Swap exec for print to expose inner script |
| crackme-py (2021) | Rotation cipher in source | Read decode function, apply inverse |
| weirdSnake (2024) | Bytecode XOR | dis, extract LOAD_CONST sequence, XOR decrypt |
| quantum-scrambler (2025) | Permutation cipher in source | ast.literal_eval, reverse the shuffle |
| Virtual Machine 0 (2023) | Bytecode VM | Read VM loop, trace bytecode ISA |
| Virtual Machine 1 (2023) | Bytecode VM (harder) | Build a disassembler for the custom ISA |

The picoGym Exclusive Picker I, Picker II, and Picker III challenges are a good warm-up series for Python source reading, each adding one more layer of indirection before you can call the flag-printing function.

Tool quick reference

Decision tree

  1. Run file ./chall and strings ./chall | head -30.
  2. Plain .py source with exec( inside? Swap exec for print, run, repeat.
  3. Magic bytes end in 0d 0d 0a? It is a .pyc. Disassemble it with dis (load the code object with marshal first) or pick a decompiler from the version table.
  4. strings shows "PyInstaller" or "_MEIPASS"? Run pyinstxtractor, find the main .pyc, decompile.
  5. Decompiler fails? Read the opcodes with dis or pycdas. Every constant the program compares against is visible in LOAD_CONST.

Install cheat sheet

# uncompyle6 (Python 2.x - 3.8)
pip install uncompyle6
uncompyle6 chall.pyc
# decompile3 (Python 3.7-3.8 alternative)
pip install decompile3
decompile3 chall.pyc
# pycdc / pycdas (Python 3.9+ and anything uncompyle6 fails on)
git clone https://github.com/zrax/pycdc && cd pycdc
cmake . && make
./pycdc chall.pyc # decompile to source
./pycdas chall.pyc # raw disassembly
# pyinstxtractor (frozen executables)
wget https://raw.githubusercontent.com/extremecoders-re/pyinstxtractor/master/pyinstxtractor.py
python3 pyinstxtractor.py ./chall
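# dis (built-in; load the .pyc with marshal, using the same Python version that compiled it)
python3 -c "import dis, marshal, sys; f = open(sys.argv[1], 'rb'); f.seek(16); dis.dis(marshal.loads(f.read()))" chall.pyc | less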

One last thing. The reason Python reversing feels hard to beginners is not the bytecode. It is the decision paralysis at the start: which tool, which version, which file. Once you have a triage habit, the actual reversing is just reading Python with slightly worse formatting. You already know the language. The disassembler output looks alien the first time, but LOAD_CONST and COMPARE_OP are just literals and if statements in disguise. Read them that way.

For the binary side of reversing (C and C++ executables), the Ghidra Reverse Engineering guide covers the equivalent workflow. For symbolic execution that can solve Python flag checkers automatically when the logic is complex, see the angr CTF Tutorial.