Corrupted file picoMini by CMU-Africa Solution

Published: April 2, 2026

Description

A JPEG file has corrupted magic bytes. Fix the header to view the flag image.

Download the corrupted file from the challenge page.

bash
file corrupted

Solution

  1. Step 1
    Diagnose the corruption
    Observation
    I noticed the challenge described the file as having corrupted magic bytes, which suggested running the file command first to confirm the format was unrecognized, then using xxd to inspect the raw leading bytes and compare them against the expected JPEG signature.
    Run the file command - it reports 'data' instead of JPEG because the magic bytes are wrong. Inspect the raw bytes with xxd to see what the first three bytes currently are.
    bash
    file corrupted
    bash
    xxd corrupted | head

    Expected output

    corrupted: data
    
    00000000: ab4c 4aff d800 0000 0000 0000 0000 0000  .LJ.............
    00000010: 0000 0000 0000 0000 0000 0000 0000 0000  ................
    What didn't work first

    Tried: Rename the file to corrupted.jpg and open it in an image viewer hoping it displays

    Image decoders check magic bytes internally, not the file extension. The viewer still refuses to render it and reports that the file is not a valid JPEG (the exact wording depends on the viewer). You need to actually overwrite the bad bytes in the file content, not just change the filename.

    Tried: Run strings corrupted to look for flag text without fixing the file

    strings only extracts printable ASCII sequences and will not decode the flag because it is embedded in image pixel data as a rendered graphic, not as a plain text string in the file body. The flag only becomes readable after the image is rendered by an actual JPEG decoder.

    Learn more

    Magic bytes (also called file signatures) are specific byte sequences at the start of a file that identify its format. The file command reads these bytes and matches them against a database of known signatures rather than trusting the file extension. This is why renaming a PNG to .jpg still produces "PNG image data" in the output - the content determines the type, not the name.

    xxd produces a hex dump of a file - two hex characters per byte on the left, with an ASCII representation on the right. The first few bytes of the dump reveal exactly what is currently stored at the file's start. Comparing those bytes against a table of known signatures (e.g., JPEG = FF D8 FF, PNG = 89 50 4E 47, PDF = 25 50 44 46) immediately tells you what the file should be and what needs fixing. Not every format is identifiable this way: plain text has no signature, and a few formats keep theirs at a non-zero offset (MP4's ftyp box type at offset 4, ISO 9660's CD001 at 0x8001), so the first 16 bytes are a starting point, not a proof.

    In forensics challenges, file corruption is a very common technique: magic bytes are intentionally altered so the file appears unreadable at first glance. The solution almost always involves identifying the correct signature for the detected file type and restoring those bytes using a hex editor or a command-line tool.

  2. Step 2
    Restore the JPEG magic bytes
    Observation
    I noticed the xxd dump showed bytes AB 4C 4A at the file start instead of the required FF D8 FF JPEG signature, which meant I needed a binary-safe tool like dd with conv=notrunc to overwrite only those three bytes without destroying the rest of the image data.
    A valid JPEG must start with FF D8 FF. Use printf and dd to overwrite only the first three bytes without touching any image data. The conv=notrunc flag prevents truncating the rest of the file.
    bash
    printf '\xff\xd8\xff' | dd of=corrupted bs=1 count=3 conv=notrunc
    bash
    file corrupted

    Expected output

    3+0 records in
    3+0 records out
    3 bytes copied, 0.000123 s, 24.4 kB/s
    corrupted: JPEG image data, JFIF standard 1.01
    What didn't work first

    Tried: Run dd without conv=notrunc: printf '\xff\xd8\xff' | dd of=corrupted bs=1 count=3

    Without conv=notrunc, dd truncates the output file to length zero when it opens it and then writes the three bytes, so the file ends up exactly 3 bytes long: the magic bytes survive and everything after them is gone. Nothing short of a backup brings the image data back, which is a good reason to patch a copy rather than the only download.

    Tried: Use a text editor (nano, vim) to manually type the bytes FF D8 FF at the start of the file

    Typing the characters F F D 8 F F into a text editor inserts ASCII text, not the byte values 0xFF 0xD8 0xFF. Pasting the raw bytes does not work either: editors treat values above 0x7F as part of a text encoding (usually UTF-8) and will refuse to save, insert replacement characters, or re-encode them, corrupting the header further. You must use a binary-aware tool that writes raw byte values without any character encoding translation: dd, Python with the file opened in binary mode, or a dedicated hex editor (hexedit, ghex). vim can do it too, but only in binary mode (vim -b) with :%!xxd to edit and :%!xxd -r to write the bytes back.

    Learn more

    The JPEG file format specifies that every valid file must begin with the byte sequence FF D8 FF. FF D8 is the SOI (Start of Image) marker and the following FF opens the first segment marker; the byte after it selects the segment (E0 for a JFIF APP0 segment, E1 for an Exif APP1 segment), which is where file gets "JFIF standard 1.01" from. Restoring FF D8 FF therefore only repairs the bytes this challenge corrupted; check that what follows still looks like a marker (E0 ... JFIF or E1 ... Exif) before calling the repair complete. Image decoders check for this signature before attempting to parse the rest of the file - without it, they refuse to render the image.

    dd is a low-level copy utility that operates on raw bytes. The flags used here are: bs=1 (block size of 1 byte), count=3 (write exactly 3 bytes), and conv=notrunc (do not truncate the output file - without this flag, dd would overwrite the file entirely with just those 3 bytes, destroying all the image data). printf with \xff-style escape sequences outputs the exact raw bytes needed. Note that \x escapes are a bash extension; in a POSIX sh such as dash, use octal instead: printf '\377\330\377' produces the same three bytes.

    Alternatively, a Python one-liner or a hex editor like hexedit or 010 Editor can patch specific byte offsets interactively. In real forensics investigations, restoring a corrupted file header is a standard recovery technique used to repair deliberately or accidentally damaged files.

  3. Step 3
    Open the repaired image
    Observation
    I noticed the file command now reported a valid JPEG after the byte patch, which confirmed the header was restored and the image could be rendered by any standard viewer to reveal the flag.
    Open the repaired JPEG in any image viewer. The flag is visible inside the image.
    bash
    eog corrupted
    Learn more

    Once the magic bytes are restored, the file is a fully valid JPEG that any compliant decoder can render. eog (Eye of GNOME) is the default image viewer on many Linux desktops. Alternatives include display (ImageMagick), feh, or simply opening the file in a browser.

    This challenge demonstrates that file formats are defined by their internal structure, not their extension or filename. Understanding byte-level file structures is a foundational skill in digital forensics - it applies to recovering accidentally overwritten headers, analyzing malware that disguises its type, and extracting data from partially damaged storage media.
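The diagnose-and-repair procedure from Steps 1 and 2 as one runnable script. This is an added example, not code from the original writeup or the challenge authors. It uses only POSIX tools (od, dd, printf) so it also works where xxd and file are absent, writes the signature with octal escapes because POSIX printf does not guarantee \x escapes, and builds its own constructed sample (a fake JFIF header whose first three bytes are AB 4C 4A, the values in the xxd dump above, followed by filler rather than real image data). The function names and the small signature table are this example's own assumptions, and it also shows the conv=notrunc failure mode on a throwaway copy.

```shell
#!/bin/sh
# Added example, not code from the original writeup. Identifies a file by its
# leading bytes and restores a JPEG SOI marker in place using only POSIX
# tools (od, dd, printf), so it also works where xxd and file are missing.
# The function names and the signature table below are this example's own.
set -eu

# Print the first $2 bytes of file $1 as lowercase hex with no separators,
# e.g. "ffd8ff". od -v stops repeated lines being collapsed into "*".
first_bytes_hex() {
    od -An -tx1 -v -N"$2" "$1" | tr -d ' \n'
}

# Classify a file from its leading bytes. The table holds the signatures the
# writeup lists plus a few other common ones; anything else is "unknown",
# which is what the corrupted challenge file should come back as.
identify_magic() {
    hex=$(first_bytes_hex "$1" 8)
    case "$hex" in
        ffd8ff*)            echo "JPEG" ;;
        89504e470d0a1a0a*)  echo "PNG" ;;
        25504446*)          echo "PDF" ;;
        474946383[79]61*)   echo "GIF" ;;
        504b0304*)          echo "ZIP" ;;
        7f454c46*)          echo "ELF" ;;
        *)                  echo "unknown" ;;
    esac
}

# Overwrite the first three bytes of $1 with FF D8 FF and leave the rest of
# the file untouched. Octal escapes are used because POSIX printf does not
# guarantee \x escapes (dash lacks them); bash accepts both forms.
# conv=notrunc is what preserves the image data after the patched bytes.
restore_jpeg_soi() {
    printf '\377\330\377' | dd of="$1" bs=1 count=3 conv=notrunc 2>/dev/null
}

# The failure mode from the writeup: the same patch without conv=notrunc.
# Only run this on a throwaway copy. Prints the resulting file size, which
# is 3 because dd truncates the output file when it opens it.
patched_size_without_notrunc() {
    printf '\377\330\377' | dd of="$1" bs=1 count=3 2>/dev/null
    wc -c < "$1" | tr -d ' '
}

# Constructed sample standing in for the challenge download: the first three
# bytes are AB 4C 4A (the values in the writeup's xxd dump), followed by an
# intact JFIF APP0 segment header and filler instead of real scan data.
# 30 bytes in total.
make_corrupted_sample() {
    {
        printf '\253LJ'                        # AB 4C 4A: overwritten SOI
        printf '\340\000\020JFIF\000\001\001'  # E0 00 10 "JFIF" 00 01 01
        printf 'FILLER-IMAGE-DATA'             # stands in for image data
    } > "$1"
}

# Usage: sh fix_jpeg_magic.sh corrupted   (patches the named file in place).
# Without an argument the script only defines the functions above.
if [ $# -eq 1 ] && [ -f "$1" ]; then
    echo "before: $(identify_magic "$1") ($(first_bytes_hex "$1" 3))"
    restore_jpeg_soi "$1"
    echo "after:  $(identify_magic "$1") ($(first_bytes_hex "$1" 3))"
fi
```

Run it against a copy of the download first; the "before" line should read unknown (ab4c4a) and the "after" line JPEG (ffd8ff), with the file size unchanged. If identify_magic still says unknown after the patch, look at bytes 3 and 4 with first_bytes_hex: the repair only covers the SOI marker, and a damaged APP0/APP1 marker needs separate attention.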

Flag

picoCTF{r3st0r1ng_th3_by73s_...}

The flag is per-instance: reports from several independent solvers share the prefix picoCTF{r3st0r1ng_th3_by73s_ but end in different hex suffixes (1512b52a, b67c1558, 2326ca93), so expect your own suffix to differ.

Key takeaway

Most binary file formats begin with a fixed magic byte sequence that parsers check before processing any content. Corrupting or altering those leading bytes renders the file unrecognized by standard tools, even though all the underlying data remains intact. Restoring the correct signature with a hex editor or dd is a routine step in digital forensics and malware analysis, where attackers frequently change file headers to disguise executable payloads or hide data from automated scanners.
