ISO-9660 Compromise, Part 2: Finding Root

A long time ago, I dashed off a quick blog post with a curious finding after studying the ISO-9660 spec: The format stores multi-byte numbers in a format I termed “omni-endian”– the committee developing the format apparently couldn’t come to an agreement on this basic point regarding big- vs. little-endian encoding (I’m envisioning something along the lines of “tastes great! … less filling!” in the committee meetings).

I recently discovered another bit of compromise in the ISO-9660 spec: It seems that there are 2 different methods for processing the directory structure. That means it’s incumbent upon ISO-9660 creation software to fill in the data structures to support both methods, because about some ISO-reading programs out there rely on one set of data structures while the rest prefer to read the other set.

Background

As a refresher, the “ISO” extension of an ISO file refers to the ISO-9660 specification. This is a type of read-only filesystem (i.e, the filesystem is created once and never updated after initial creation) for the purpose of storing on a read-only medium, often an optical disc (CD-ROM, DVD-ROM). The level of nostalgic interest I display for the ISO-9660 filesystem reminds me of my computer science curriculum professors from the mid-90s reminiscing about ye olden days of punchcard programming, but such is my lot. I’m probably also alone in my frustration of seeing rips of, e.g., GameCube or Xbox or 3DO games being tagged with the extension .ISO since those systems use different read-only filesystems.

I recently fell in with an odd bunch called the eXoDOS project and was trying to help fill in a few gaps. One request was a 1994 game called Power Drive for DOS.


Power Drive CD-ROM

My usual CD-ROM ripping method (for the data track) is a simple ‘dd’ command from a Linux command line to copy the string of raw sectors. However, it turned out to be unusually difficult to open the resulting ISO. A few of the the options I know of worked but most didn’t. What’s the difference?

Methods that work:

  • Mounting the file with the Linux iso9660 kernel module, i.e.,
    mount -t iso9660 /dev/optical-drive /mnt

    or

    mount -t iso9660 -o loop /path/to/Power-Drive.iso /mnt
  • Directory Opus
  • Windows 10 can read the filesystem when reading the physical disc
  • Windows 10 can burn the ISO image to a new CD (“right click” -> “Burn disc image”); this method does not modify any of the existing sectors but did append 149 additional empty sectors

Methods that don’t work:

Understanding The Difference

I think I might have a handle on why some tools are able to process this disc while most can’t. There appears to be 2 sets of data structures to describe the base of the filesystem: A root directory, and a path table. These both occur in the first substantive sector of the ISO-9660 filesystem, usually sector 16.

A compact disc can be abstractly visualized as a long string of sectors, each one 2,352 bytes long. (See my Grand Unified Theory of Compact Disc post for deeper discussion.) A CD-ROM data track will contain 2048 bytes of data. Thus, sector 16 appears at 0x8000 of an ISO filesystem. I like the clarity of this description of the ISO-9660 spec. It shows that the path table is defined at byte 140 (little-endian; big comes later) and location of the root directory is at byte 158. Thus, these locations generally occur at 0x808c and 0x809e.


Primary Volume Descriptor
Primary Volume Descriptor

The path table is highlighted in green and the root directory record is highlighted in red. These absolute locations are specified in sectors. So the path table is located at sector 0x12 = offset 0x9000 in the image, while the root directory record is supposed to be at sector 0x62 = 0x31000. Checking into those sectors, it turns out that the path table is valid while the root directory record is invalid. Thus, any tool that relies on the path table will be successful in interpreting the disc, while tools that attempt to recursively traverse starting from root directory record are gonna have a bad time.

Since I was able to view the filesystem with a few different tools, I know what the root directory contains. Searching for those filenames reveals that the root directory was supposed to point to the next sector, number 0x63. So this was a bizarre off-by-1 error on the part of the ISO creation tool. Maybe. I manually corrected 0x62 -> 0x63 and that fixed the interaction with fuseiso, but not with other tools. So there may have been some other errors. Note that a quick spot-check of another, functional ISO revealed that this root directory sector is supposed to be exact, not 1-indexed.

Upon further inspection, I noticed that, while fuseiso appeared to work with that one patch, none of the files returned correct data, and none of the directories contained anything. That’s when I noticed that ALL of the sector locations described in the various directory and file records are off by 1!

Further Investigation

I have occasionally run across ISO images on the Internet Archive that return the error about not being able to read the contents when trying to “View contents” (error text: “failed to obtain file list from xyz.iso”, as seen with this ISO). Too bad I didn’t make a record of them because I would be interested to see if they have the same corruption.

Eventually, I’ll probably be able to compile an archive of deviant ISO-9660 images. A few months ago, I was processing a large collection from IA and found a corrupted ISO which had a cycle, i.e., the subdirectory pointed to a parent directory, which caused various ISO tools to loop forever. Just one of those things that is “never supposed to happen”, so why write code to deal with it gracefully?

See Also

3 thoughts on “ISO-9660 Compromise, Part 2: Finding Root

  1. b0b

    ISO9660 isn’t always read-only… on SunOS 4 at least, you can have an ISO9660 filesystem on a hard drive and mount it read-write.

  2. Multimedia Mike Post author

    @ ‘.’: Thanks for the intel regarding the ‘&debug=1’ flag. That will be useful in finding more disc images that exhibit this same weirdness.

    @b0b: That’s interesting to learn, and a bit crazy. It just doesn’t seem like the FS is optimized for random writes, especially since all files have to be contiguous. But I’m sure they knew what they were doing.

Comments are closed.