Description
Can you find the robots? A website URL is provided - figure out where they are hiding.
Setup
Open the provided challenge website URL in your browser.
Solution
Step 1
Check robots.txt
Observation
I noticed the challenge is named 'Where Are the Robots,' a direct reference to robots.txt, the file that sites conventionally serve publicly at their web root. That suggested reading this file would reveal the hidden path.
robots.txt is a public file at the root of a website that tells search engine crawlers which paths not to crawl. Ironically, it reveals those 'secret' paths to any human visitor. Navigate to /robots.txt and read the Disallow entries.
curl http://<challenge-server>/robots.txt
Expected output
User-agent: *
Disallow: /477ce.html
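Reading the Disallow entries by eye is enough for this challenge, but the same step can be sketched as a small shell filter. The function name disallowed_paths and the CHALLENGE_SERVER placeholder are assumptions made for illustration; neither comes from the challenge.

```bash
# disallowed_paths: read robots.txt text on stdin and print one Disallow
# value per line. (Hypothetical helper name, not part of any tool.)
disallowed_paths() {
  # Drop carriage returns and trailing "# comments", then match the
  # directive name case-insensitively and print the trimmed value that
  # follows the first colon.
  tr -d '\r' | sed 's/#.*$//' | awk -F: '
    tolower($1) ~ /^[ \t]*disallow[ \t]*$/ {
      path = substr($0, index($0, ":") + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", path)
      if (path != "") print path   # an empty Disallow hides nothing
    }'
}

# Usage against a live instance (CHALLENGE_SERVER is a placeholder):
#   curl -s "http://CHALLENGE_SERVER/robots.txt" | disallowed_paths
```

Fed the expected output shown above, the filter prints /477ce.html. It ignores User-agent grouping on purpose, since for reconnaissance every listed path is interesting regardless of which crawler it targets.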
What didn't work first
Tried: Navigate to /sitemap.xml instead of /robots.txt to find hidden paths
sitemap.xml lists pages the site owner wants crawlers to find, not pages they want hidden. It returns either an XML list of public URLs or a 404. The disallowed path only appears in robots.txt under a Disallow directive - that is the file this challenge is specifically named after.
Tried: Run gobuster or dirbuster against the server to brute-force directory names
Directory brute-forcing sends thousands of requests and may run for minutes without ever finding a randomized path like /477ce.html, because a random name like that is unlikely to appear in a standard wordlist. robots.txt hands you the exact disallowed path for free in a single request, which is the intended recon step here.
Learn more
robots.txt is a plain text file placed at the root of a web server (/robots.txt) that implements the Robots Exclusion Standard. It tells well-behaved web crawlers (like Googlebot, Bingbot) which URLs they should not crawl. The format specifies User-agent (which crawler the rule applies to) and Disallow (paths that crawler should skip).
The critical security flaw: robots.txt is completely public. Any human - or malicious bot - can navigate directly to /robots.txt and read every disallowed path. This means that listing a path as Disallow actively advertises its existence to anyone curious enough to look. Common mistakes developers make include listing paths like:
- /admin - admin panels
- /backup - backup files
- /api/v1/internal - internal API endpoints
- /staging - staging environment
- /.git - version control directories
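When a robots.txt file is long, a quick keyword filter can surface entries like the ones listed above. This is a rough sketch: the function name flag_sensitive and the keyword list are illustrative assumptions, not an exhaustive or authoritative set.

```bash
# flag_sensitive: read paths on stdin (one per line) and print those that
# contain a keyword from a short illustrative list. (Hypothetical helper.)
flag_sensitive() {
  # "|| true" keeps the exit status at zero when nothing matches.
  grep -iE '(admin|backup|internal|staging|\.git)' || true
}

# Usage: pipe in any list of Disallow values, for example
#   printf '/admin\n/images\n/.git\n' | flag_sensitive
```

A match is only a hint about where to look first; a path with an unremarkable name, like the one in this challenge, can matter just as much.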
robots.txt is usually one of the first places checked during a web application penetration test or bug bounty reconnaissance phase. Directory enumeration tools like gobuster, dirbuster, and feroxbuster are built around wordlists, and not all of them read robots.txt for you, so fetch it by hand before starting a scan. The proper way to protect sensitive paths is through authentication and authorization - not by relying on crawlers to ignore them.
Step 2
Visit the disallowed path
Observation
I noticed robots.txt contained a Disallow entry for /477ce.html, which strongly suggests that this exact path exists on the server and can be reached by anyone who navigates to it directly.
robots.txt will list a disallowed path like /477ce.html. Navigate directly to that URL in your browser - the flag is displayed on that page.
Learn more
Once a disallowed path is found in robots.txt, navigating to it is trivial - just append the path to the base URL. This step highlights the core lesson: obscurity is not security. The path is "hidden" only from search engine indexes, not from direct access. Anyone who knows the URL can visit it freely.
This is sometimes called security through obscurity - the misguided belief that keeping implementation details secret provides security. Bruce Schneier and other security experts have long argued that security systems must be secure even if everything about the system except the key is public knowledge (Kerckhoffs's principle). Applying this to web security: every URL on your server should be assumed publicly known, and authorization must be enforced server-side for every request.
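One way to check server-side enforcement from the outside is to request a supposedly private path without credentials and look at the HTTP status code. The sketch below assumes a hypothetical helper named access_verdict and a placeholder host; the mapping is a heuristic, since some applications answer 200 with a login page.

```bash
# access_verdict: map an HTTP status code to a rough verdict for a path
# that is meant to be private. (Hypothetical helper, heuristic only.)
access_verdict() {
  case "$1" in
    401|403) echo "protected" ;;     # the server demanded or refused credentials
    2??)     echo "exposed" ;;       # content was served to an anonymous client
    3??)     echo "redirect" ;;      # follow the redirect and check again
    404)     echo "not found" ;;
    *)       echo "inconclusive" ;;
  esac
}

# Usage (CHALLENGE_SERVER is a placeholder; test only servers you are
# authorized to test):
#   code="$(curl -s -o /dev/null -w '%{http_code}' "http://CHALLENGE_SERVER/admin")"
#   access_verdict "$code"
```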
In bug bounty programs, robots.txt enumeration is a standard first step. Hunters have found admin panels, debug endpoints, internal APIs, and sensitive files this way on major websites. Google itself publishes its full robots.txt disallow list for google.com, which reveals interesting path structures, even though knowing those paths grants no extra access by itself.
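Step 2 can also be done from the command line instead of the browser. The following is a minimal sketch, assuming the flag appears in the page body in the picoCTF{...} format shown in the Flag section; join_url, extract_flag, and CHALLENGE_SERVER are hypothetical names used only for illustration.

```bash
# join_url: append a path to a base URL without doubling or dropping the
# slash between them. (Hypothetical helper.)
join_url() {
  base="${1%/}"     # drop one trailing slash from the base, if present
  path="/${2#/}"    # make sure the path starts with exactly one slash
  printf '%s%s\n' "$base" "$path"
}

# extract_flag: read a page on stdin and print anything shaped like a
# picoCTF flag. (Hypothetical helper.)
extract_flag() {
  # "|| true" keeps the exit status at zero when no flag is present.
  grep -oE 'picoCTF\{[^}]*\}' || true
}

# Usage (CHALLENGE_SERVER is a placeholder for your instance):
#   curl -s "$(join_url "http://CHALLENGE_SERVER" "/477ce.html")" | extract_flag
```

The path /477ce.html is the one from the expected output above; another instance may list a different path in its robots.txt.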
Flag
picoCTF{ca1cu1at1ng_Mach1n3s_...}
Per-instance flag: the suffix differs between instances. Multiple hash suffixes have been confirmed (8e32f, a44f7); the prefix picoCTF{ca1cu1at1ng_Mach1n3s_ is consistent.