Roboto Sans picoCTF 2022 Solution

Published: July 20, 2023

Description

Inspect the site’s /robots.txt. The developer left Base64-encoded hints pointing to the real flag file.

Open the site in your browser and have curl/CyberChef ready for inspecting and decoding the response.

$ curl -s http://saturn.picoctf.net:55771/robots.txt
  1. Step 1: Read robots.txt
    robots.txt is a plain-text directive file with three main fields: User-agent (which crawlers a rule applies to), Allow (paths those crawlers may fetch), and Disallow (paths they should skip). This site's file also contains encoded strings alongside the Disallow rules, and decoding them reveals exact file paths.

    The Base64 string anMvbXlmaWxlLnR4dA== appears among the Disallow rules. Decode it inline:

    $ echo 'anMvbXlmaWxlLnR4dA==' | base64 -d
    js/myfile.txt

    That path is the next stop.

    Learn more

    robots.txt is a plain-text file at the root of a website that tells web crawlers (like Googlebot) which paths not to crawl. It follows the Robots Exclusion Protocol, long an informal convention and published as RFC 9309 in 2022; it is not a security mechanism. Any browser or script can freely access Disallow-listed paths; the file is merely a polite request to well-behaved bots.

    In penetration testing and CTF recon, robots.txt is one of the first files to check. Developers often list sensitive paths (admin panels, backup files, API endpoints) to prevent them from appearing in search results - inadvertently advertising exactly what they're trying to hide. It's essentially a roadmap of interesting endpoints.

    Common paths worth checking: /admin, /backup, /.git, /config, /api, and anything a developer would want to hide from search engines. Automated tools like DirBuster, gobuster, and ffuf complement a manual read of robots.txt with dictionary-based path brute-forcing.

  2. Step 2: Grab the flag
    Request http://saturn.picoctf.net:55771/js/myfile.txt. The response contains the picoCTF flag in plain text.

    Learn more

    Base64 is an encoding scheme that represents binary or arbitrary data using a set of 64 printable ASCII characters. It's not encryption - it has no key and is trivially reversible. Its purpose is data transport (e.g., email attachments, embedding binary in JSON), not confidentiality.

    Spotting Base64 in the wild: strings are typically a multiple of 4 characters long, use only A-Z, a-z, 0-9, +, and /, and often end with one or two = padding characters. The command echo 'string' | base64 -d decodes on Linux; CyberChef provides the same with a visual interface.

    In this challenge, Base64 was used to slightly obscure file paths - but obscurity is not security. Any attacker who reads robots.txt will immediately recognize and decode the strings. Real security requires proper access controls (authentication, authorization), not encoding tricks.
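
The spotting rules and the decode step above can be sketched as a small shell helper. This is only an illustration, not the challenge author's tooling: the function names looks_like_base64 and decode_candidates are hypothetical, the minimum length of 8 is an arbitrary cutoff chosen to skip short words, and the script assumes a base64 that accepts -d plus a grep that supports -o (GNU and BSD versions do). A token is reported only when it has the Base64 shape and decodes to printable ASCII.

```shell
#!/bin/sh
# Sketch: find Base64-looking tokens in robots.txt text and decode them.
# Both function names are hypothetical helpers, not part of any tool.

# Succeeds when the argument has the shape described above: length a
# multiple of 4 (and at least 8, an arbitrary cutoff), Base64 alphabet
# only, and at most two trailing '=' padding characters.
looks_like_base64() {
  [ "${#1}" -ge 8 ] || return 1
  [ $(( ${#1} % 4 )) -eq 0 ] || return 1
  printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[A-Za-z0-9+/]+={0,2}$'
}

# Reads robots.txt content on stdin and prints "token -> decoded" for
# every token that passes the shape check and decodes to printable ASCII.
decode_candidates() {
  LC_ALL=C grep -oE '[A-Za-z0-9+/]+={0,2}' | while IFS= read -r token; do
    looks_like_base64 "$token" || continue
    # Count decoded bytes that fall outside printable ASCII (space..tilde).
    junk=$(printf '%s' "$token" | base64 -d 2>/dev/null |
      LC_ALL=C tr -d '\040-\176' | wc -c | tr -d ' ')
    [ "$junk" -eq 0 ] || continue
    printf '%s -> %s\n' "$token" "$(printf '%s' "$token" | base64 -d)"
  done
}
```

With the functions loaded, curl -s http://saturn.picoctf.net:55771/robots.txt | decode_candidates would list each decoded candidate path. The printable-ASCII filter is a heuristic: ordinary words such as Disallow fit the Base64 shape but decode to binary noise, so they are dropped.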

Flag

picoCTF{Who_D03sN7_L1k5_90B0T5_22ce...}

Robots.txt is frequently a gold mine for hidden endpoints during recon.
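
For recon beyond this challenge, the Allow and Disallow paths can be pulled out of robots.txt and turned into URLs to visit. The sketch below is one possible approach under stated assumptions: list_rule_paths and rule_urls are hypothetical helper names, listed paths are assumed to start with a slash, and http://example.test is a placeholder base URL rather than a real target.

```shell
#!/bin/sh
# Sketch: list the paths named by Allow/Disallow rules in robots.txt.
# Function names are hypothetical; field matching is case-insensitive.

# Reads robots.txt content on stdin, prints one rule path per line.
list_rule_paths() {
  awk '{
    line = $0
    sub(/#.*/, "", line)                      # drop trailing comments
    if (match(tolower(line), /^[ \t]*(dis)?allow[ \t]*:[ \t]*/)) {
      path = substr(line, RLENGTH + 1)        # text after "Disallow:"
      sub(/[ \t\r]+$/, "", path)              # trim trailing whitespace
      if (path != "") print path              # skip empty rules
    }
  }'
}

# Prefixes each rule path with the base URL given as the first argument.
rule_urls() {
  base=${1%/}                                 # strip one trailing slash
  list_rule_paths | while IFS= read -r path; do
    printf '%s%s\n' "$base" "$path"
  done
}
```

A typical call would be curl -s http://saturn.picoctf.net:55771/robots.txt | rule_urls http://saturn.picoctf.net:55771, which prints one candidate URL per rule for manual review.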
