April 15, 2026

What picoCTF Web Challenges Teach You About Real Bugs in Production

Every picoCTF web challenge maps to a specific mistake that still ships in production. A field guide to the five surfaces those challenges train, with writeups and the real incidents they rhyme with.

The transfer problem

Pull up the writeup for n0s4n1ty-1. Profile page takes an image upload, never checks the extension, serves the file back from a directory that executes PHP. Now pull up almost any WordPress plugin advisory from the last five years with the phrase “arbitrary file upload leading to remote code execution.” The payload is identical. The shell is identical. The branding is the only thing that changes.

That mapping is not a coincidence, and it is not limited to file uploads. After working the picoCTF web catalog end-to-end, the bug classes collapse into a small number of patterns, and those patterns are the ones that still ship in production code. The interesting question is not whether CTF transfers. It is what transfers, and which reflex each challenge is actually drilling.

Every picoCTF web challenge trains one of five reflexes. The flags are bait. The reflex is the payload.

The five surfaces

The five surfaces below cover the picoCTF web catalog from 2019 through 2025. Every challenge lives on at least one, and most real-world web CVEs you will read about this year do too.

| Surface | The mistake | picoCTF example |
| --- | --- | --- |
| Trust the client | Security state stored where the user can edit it | Cookie Monster |
| Strings become code | Input concatenated into a query, template, or expression | SQLiLite, SSTI1 |
| Operator confusion | A parser accepts a richer type than the code assumed | No Sql Injection |
| Files become code | User-supplied bytes end up in a directory the server executes | n0s4n1ty-1 |
| Leaked internals | Debug or diagnostic surfaces reachable from the public internet | head-dump, apriti-sesamo |

The rest of this post walks through each one. For every surface I will point to the picoCTF writeup in this repo, name the reflex the challenge trains, and sketch the production incident pattern it rhymes with.

Surface 1: trust the client

The first challenge that usually surprises a working developer is Cookie Monster. You log in, look at your cookies, and find a field named something like admin=false. You flip it to true, refresh, and the flag appears. That is the entire challenge.

The reflex this trains is the most valuable one in the whole catalog: every byte the client controls is attacker input, including the ones your framework gave it. Cookies, local storage, hidden form fields, JSON Web Token (JWT) payloads with alg: none, client-side feature flags. The moment trust crosses the network boundary and lands in a place the user can edit, it is not trust anymore.

In production this surface shows up as insecure direct object references, JWTs signed with the empty algorithm, and role claims read from cookies instead of server-side session stores. You do not need a zero-day to find one. You need the habit of opening DevTools on every authenticated page and reading what the server sent you.
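The difference between the broken and the fixed pattern fits in a few lines. A minimal sketch (the function and variable names are illustrative, not from any challenge or framework): the client only ever holds an opaque token, and the role lives in a server-side store the user cannot edit.

```python
import secrets

# Server-side session store: token -> session data.
SESSIONS = {}

def login(user, role):
    token = secrets.token_hex(16)
    SESSIONS[token] = {"user": user, "role": role}
    return token  # the only thing that goes into the cookie

# Broken: trusts a role claim the client can flip in DevTools.
def is_admin_broken(cookies):
    return cookies.get("admin") == "true"

# Fixed: the cookie is just a lookup key; the role is resolved server-side.
def is_admin_fixed(cookies):
    session = SESSIONS.get(cookies.get("session", ""))
    return session is not None and session["role"] == "admin"
```

The broken version is exactly the Cookie Monster challenge; the fixed version makes the same edit a no-op, because nothing the client can type changes what the server looks up.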

Pair Cookie Monster with 3v-l and any of the JWT challenges in the Cookies and JWT post. Three challenges, three hours, and you will never again ship a role check that reads from a cookie.

Surface 2: strings become code

The second surface is the one the industry has been fighting longest and still loses to: user input concatenated into something that later gets parsed as code. Structured Query Language (SQL), HTML, shell commands, template strings, object-graph navigation expressions. The parser does not care where the bytes came from. If the syntax is valid, the syntax runs.

picoCTF teaches this surface in a staircase. Start with SQLiLite, a login form that concatenates your username into a WHERE clause. Move to More SQLi for UNION-based extraction. Then jump to SSTI1, where the same reflex applies to a Jinja template instead of a database, and a payload of {{7*7}} comes back as 49 in the rendered page.

# SQLi: input becomes part of the query
username = "' OR 1=1-- -"
query = f"SELECT * FROM users WHERE username='{username}'"
# SSTI: input becomes part of the template
name = "{{ config.__class__.__init__.__globals__['os'].popen('id').read() }}"
template = f"Hello {name}"

This is exactly the class of bug behind the 2017 Equifax breach, where an unpatched Apache Struts server (CVE-2017-5638) evaluated user-controlled Object-Graph Navigation Language (OGNL) inside a header and executed arbitrary commands on the server. It is also the class of bug behind the 2023 Confluence template injection flaw (CVE-2023-22527), which sat in production for years before anyone caught it. The payloads look different. The parser trust model is the same one SSTI1 trains you to distrust.
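The fix for the SQL half of this surface is the same in every driver: keep data out of the query string and let the driver bind it separately. A sketch with Python's built-in sqlite3 module (the table and rows are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('admin', 'hunter2')")

def find_user(username):
    # Placeholder binding: the value travels to the engine separately
    # from the query text, so quotes and SQL keywords inside it are
    # never parsed as syntax.
    return conn.execute(
        "SELECT username FROM users WHERE username = ?", (username,)
    ).fetchone()
```

With binding in place, the classic bypass payload is just an unmatched username rather than a change to the WHERE clause.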

Surface 3: operator confusion

Surface 3 is the quiet one. It does not look like injection, because there are no quotes, no semicolons, no template braces. The attacker supplies a value that is syntactically legal but semantically richer than the code assumed, and the parser happily accepts it.

No Sql Injection is the canonical training exercise. The backend is Node.js with MongoDB, the login route reads JavaScript Object Notation (JSON) from the request body, and the developer wrote roughly this:

db.users.findOne({ username: req.body.username, password: req.body.password })

That looks fine until you send {"username": {"$ne": null}, "password": {"$ne": null}}. The developer thought req.body.username was a string. The MongoDB driver is happy to take an object, interprets $ne as the “not equal” operator, and the query now matches any document where both fields are not null. First user in the collection is admin. Done.

The production version of this bug shows up every time a framework auto-parses a body format richer than the developer expected. It is why mass-assignment ships in Rails apps, why PHP's strcmp with an array used to return 0, and why a decade of Node.js tutorials quietly taught people to write exploitable login code. The reflex No Sql Injection trains is: treat every parsed input as a value of the widest type your parser accepts, not the narrowest type your code expects.
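That reflex translates into one guard: narrow every parsed value back to the type you meant before it reaches the query. A sketch of that check in Python (the function names are illustrative; the same shape works in any language with a JSON parser):

```python
import json

def require_str(value, field):
    # Reject anything the JSON parser produced that is not a plain
    # string: {"$ne": null} is legal JSON but not a legal credential.
    if not isinstance(value, str):
        raise ValueError(f"{field} must be a string")
    return value

def parse_login(raw_body):
    body = json.loads(raw_body)
    return (require_str(body.get("username"), "username"),
            require_str(body.get("password"), "password"))
```

The operator-injection payload from No Sql Injection now fails at the boundary, before any query is built.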

Surface 4: files become code

n0s4n1ty-1 is the cleanest example of a pattern that keeps killing small web apps: user-supplied bytes end up in a directory the server will execute. Upload a file named shell.php containing a one-line PHP webshell, browse to /uploads/shell.php?cmd=id, and you have code execution.

Two separate controls have to fail for this to work, and in real breaches they usually both do. The first is input validation: the server should reject anything that is not an image, by content inspection rather than extension. The second is containment: the upload directory should not be inside a path the web server interprets. One failure is embarrassing. Two is a CVE.
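The first control is cheap to sketch. A minimal version of content inspection by magic bytes, paired with a server-generated filename, assuming PNG and JPEG are the only accepted types (the signatures are real file headers; the function name is illustrative):

```python
import secrets

# File signatures ("magic bytes") for the accepted image types.
SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpg",
}

def safe_upload_name(data):
    # Decide the type from the bytes, never from the client's filename,
    # and return a server-generated name with a known-safe extension.
    for magic, ext in SIGNATURES.items():
        if data.startswith(magic):
            return f"{secrets.token_hex(8)}.{ext}"
    raise ValueError("not an accepted image type")
```

A shell.php upload never reaches disk, and even a valid image lands under a name the attacker did not choose. The second control, keeping the upload directory outside any path the server executes, is configuration rather than code.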

The production version is a staple of WordPress plugin advisories and small business content management systems. A plugin accepts an avatar upload, stores it at /wp-content/uploads/avatars/ with the original filename, and PHP handlers are enabled in that directory. The writeups for those CVEs read like a word-for-word reprint of the n0s4n1ty-1 solution, and they do not get written by nation-state actors. They get written by anyone who has done this challenge once.

Surface 5: leaked internals

The last surface is also the most embarrassing for the defender, because there is no input validation flaw at all. The server simply exposes a diagnostic surface that was supposed to stay internal.

head-dump hands you a heap dump endpoint on a Spring Boot application. You download the dump, grep it for strings that look like secrets, and the flag falls out. No exploitation required. apriti-sesamo is the same idea with a different door: a debug route that was never taken down before deployment.

The production version of this bug is Spring Boot Actuator endpoints reachable on the public internet. The /actuator/heapdump path returns a binary dump of the running JVM, which can be loaded into a standard memory analyzer and grepped for database passwords, API keys, and session cookies. Shodan searches for exposed actuator endpoints return a steady stream of results every year, and the HackerOne bounty histories for large companies are full of reports with a six-word summary: “/actuator/heapdump exposed, contained database credentials.”
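The “grep it for secrets” step is itself only a few lines. A sketch of the kind of pattern scan that pulls credential-looking strings out of a dump (the regex is illustrative and deliberately not exhaustive):

```python
import re

# key=value pairs that commonly sit in plain text inside a heap dump.
SECRET_PATTERN = re.compile(
    rb"(?:password|passwd|secret|api[_-]?key|token)\s*[=:]\s*(\w{4,})",
    re.IGNORECASE,
)

def grep_secrets(dump_bytes):
    # Return candidate secret values found anywhere in the raw bytes.
    return [m.group(1).decode(errors="replace")
            for m in SECRET_PATTERN.finditer(dump_bytes)]
```

Real heap dump triage uses a proper analyzer, but this is the reflex in miniature: secrets in memory are secrets on disk the moment a diagnostic endpoint serializes them.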

The reflex head-dump trains is not a scanning tool or a payload. It is the habit of probing /debug, /actuator, /api/docs, /_profiler, /metrics, /.env, and /.git/config on every target you touch. That habit is worth more than any scanner.

Why the transfer works

The five surfaces are not a taxonomy of every web bug. They are the five surfaces picoCTF drills hardest, and they account for a conspicuously large share of the CVEs that actually get exploited in the wild. The reason CTF transfers is not that the payloads match. It is that pattern recognition under time pressure is not something you read your way into. It is something you rehearse, and the web track is 50+ reps of the same five reflexes in slightly different costumes.

Spend a weekend running the track end-to-end and you will notice the side effect. The next time you open a pull request that concatenates a string into a query, or mounts an uploads directory inside the web root, or reads a role claim from a cookie, the same signal fires that fired the first time you typed ' OR 1=1-- - into a login form. That is the entire point. Every challenge you solve calibrates the signal a little sharper.