Roboto Sans

Published: July 20, 2023

Description

Inspect the site’s `/robots.txt`. The developer left Base64-encoded hints pointing to the real flag file.

Navigate to `/robots.txt`; it lists disallowed paths plus several Base64 strings.

Decode the strings (e.g., with CyberChef) to learn filenames like `flag1.txt` and `js/myfile.txt`.

Visit the decoded path `js/myfile.txt` to view the flag.

```shell
curl -s http://saturn.picoctf.net:55771/robots.txt
echo 'anMvbXlmaWxlLnR4dA==' | base64 -d   # yields js/myfile.txt
curl -s http://saturn.picoctf.net:55771/js/myfile.txt
```

Solution

  1. Step 1: Read robots.txt
    The file disallows `/cgi-bin/` and `/wp-admin/`, but more importantly includes encoded clues. Decoding them reveals exact file paths.

    robots.txt is a plain-text file at the root of a website that instructs web crawlers (like Googlebot) which paths to avoid indexing. It follows the Robots Exclusion Protocol - long an informal convention, standardized as RFC 9309 in 2022, and never a security mechanism. Any browser or script can freely access Disallow-listed paths; the file is merely a polite request to well-behaved bots.
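
    To make the syntax concrete, here is a minimal robots.txt (the paths are generic examples, not the challenge's):

    ```
    User-agent: *          # applies to every crawler
    Disallow: /admin/      # polite request: don't index this path
    Disallow: /backup/
    Allow: /public/
    Sitemap: https://example.com/sitemap.xml
    ```

    Lines beginning with `#` are comments; a crawler matches itself against `User-agent` and then honors (or ignores) the `Disallow` rules that follow.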

    In penetration testing and CTF recon, robots.txt is one of the first files to check. Developers often list sensitive paths (admin panels, backup files, API endpoints) to keep them out of search results - inadvertently advertising exactly what they're trying to hide. It's essentially a roadmap of interesting endpoints.

    Common paths worth checking: /admin, /backup, /.git, /config, /api, and anything a developer would want to hide from search engines. Automated tools like dirbuster, gobuster, and ffuf combine robots.txt analysis with dictionary-based path brute-forcing.
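
    As a quick sketch of this recon step (using a locally saved sample rather than the live challenge host), the Disallow entries can be pulled out with standard shell tools:

    ```shell
    #!/bin/sh
    # Build a small sample robots.txt (contents are illustrative)
    cat > /tmp/robots_sample.txt <<'EOF'
    User-agent: *
    Disallow: /cgi-bin/
    Disallow: /wp-admin/
    anMvbXlmaWxlLnR4dA==
    EOF

    # List only the disallowed paths for manual follow-up
    grep -i '^Disallow:' /tmp/robots_sample.txt | awk '{print $2}'

    # Against a live target, pipe curl into the same filter:
    #   curl -s http://saturn.picoctf.net:55771/robots.txt | grep -i '^Disallow:'
    ```

    The same one-liner drops straight into a larger enumeration script, which is why tools like gobuster seed their wordlists from robots.txt.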

  2. Step 2: Grab the flag
    Request `http://saturn.picoctf.net:55771/js/myfile.txt`. The response contains the picoCTF flag in plain text.

    Base64 is an encoding scheme that converts binary or arbitrary data into a set of 64 printable ASCII characters. It's not encryption - it has no key and is trivially reversible. Its purpose is data transport (e.g., email attachments, embedding binary in JSON), not confidentiality.

    Spotting Base64 in the wild: strings are typically a multiple of 4 characters long, use only `A-Z`, `a-z`, `0-9`, `+`, and `/`, and often end with one or two `=` padding characters. The command `echo 'string' | base64 -d` decodes on Linux; CyberChef provides the same with a visual interface.
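
    Those spotting heuristics can be turned into a short pipeline (the grep pattern is a loose heuristic that requires `=` padding, not a strict Base64 validator) that picks candidate tokens out of a robots.txt line and decodes each one:

    ```shell
    #!/bin/sh
    # Scan input for padded Base64-looking tokens and decode each.
    # Requiring trailing '=' padding cuts down false positives but
    # will miss unpadded Base64 strings.
    echo 'Disallow: anMvbXlmaWxlLnR4dA==' |
      grep -oE '[A-Za-z0-9+/]{8,}={1,2}' |
      while read -r tok; do
        decoded=$(echo "$tok" | base64 -d 2>/dev/null) || continue
        printf '%s -> %s\n' "$tok" "$decoded"
      done
    # prints: anMvbXlmaWxlLnR4dA== -> js/myfile.txt
    ```

    Note that `base64 -d` is GNU coreutils syntax; BSD/macOS uses `base64 -D`.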

    In this challenge, Base64 was used to slightly obscure file paths - but obscurity is not security. Any attacker who reads robots.txt will immediately recognize and decode the strings. Real security requires proper access controls (authentication, authorization), not encoding tricks.

Flag

picoCTF{Who_D03sN7_L1k5_90B0T5_22ce...}

Robots.txt is frequently a gold mine for hidden endpoints during recon.
