Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering



ISO Compromise

November 20th, 2007 by Multimedia Mike

Engineering is about trade-offs and compromises. One of the most fundamental trade-offs to be made when designing a storage format is whether multi-byte numbers will be encoded as little-endian or big-endian numbers. But have you ever studied the data structures involved in ISO-9660, the standard filesystem format for optical discs? It seems that the committee tasked with developing this standard was unwilling to make this one tough decision and specified most multi-byte numbers as omni-endian. I just made that term up. Maybe it could be called bi-endian or multi-endian. The raw detail is that such a number is stored first in little-endian format and then again in big-endian format. For example, 0x11223344 is stored using 8 bytes: 0x44 0x33 0x22 0x11 0x11 0x22 0x33 0x44.
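
As a rough illustration of the layout described above, here is a minimal C sketch. The helper names `put_both_endian_u32` and `get_both_endian_u32` are made up for this example and do not come from any ISO-9660 library; the reader simply decodes the little-endian half and ignores the big-endian copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write v as 8 bytes, little-endian copy first,
 * then big-endian copy, matching the layout described in the post. */
static void put_both_endian_u32(uint8_t out[8], uint32_t v)
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i]     = (uint8_t)(v >> (8 * i));        /* LE: low byte first  */
        out[4 + i] = (uint8_t)(v >> (8 * (3 - i)));  /* BE: high byte first */
    }
}

/* Hypothetical helper: decode using only the little-endian half.
 * Assembling the value with shifts works on any host byte order. */
static uint32_t get_both_endian_u32(const uint8_t in[8])
{
    assert(in != NULL);
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Calling `put_both_endian_u32(buf, 0x11223344)` should fill `buf` with the eight bytes from the example above, and `get_both_endian_u32(buf)` should give back 0x11223344.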



Do any other filesystems make this compromise? I am not that well versed in filesystems. I have studied the odd game-related optical filesystem; I had to write a manual ext2 searching tool for a sysadmin class; I also had to try to recover a virus-corrupted FAT16 filesystem (to no avail; that virus cleanly chewed up some of the most important sectors).

Anyway, if I were to go ahead and try for a new FUSE driver for ISO-9660 (or modify an existing driver), I would want to go after the main format. Plus, I would want to natively interpret that CISO format on the fly. Further, I would use this as a platform to understand what is so special about the apparent ISO-9660 data ripped from a Sega Dreamcast GD-ROM.

Are there any other ISO bastardizations to target for such a tool?

Posted in Programming | 4 Comments »

4 Responses

  1. mark cox Says:

    Perhaps it is a simple form of redundancy, with the benefit that, if you don’t care about error detection, you can just skip the first or second copy (depending on your architecture).

    Is this practice only in the metadata, or is it repeated in the data too?

  2. Multimedia Mike Says:

    I haven’t digested the whole spec, but I think it is just in the metadata, the core data structures.

    I forgot to stipulate that at least they weren’t concerned about space. Given that trade-off, perhaps it’s no big mystery that the CISO guys opted to compress as many sectors as possible.

  3. Ian Farquhar Says:

    Not that I know of myself.

    The closest thing I can think of is the X11 protocol, which negotiates endianness during the initial session setup. But that’s a network protocol, not a filesystem.

    But I’ve not seen anything else which stores/sends both formats. I’m guessing this was a political compromise.

  4. sean barrett Says:

    It would be even more awesome if it was stored interleaved, i.e. as '44 11 33 22 22 33 11 44'.

    I guess this approach lets you write readers with something like

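    #include <stdint.h>

    /* One both-endian 32-bit field: the little-endian copy comes first on disc. */
    struct both_endian_u32
    {
        uint32_t field_lilendian;
        uint32_t field_bigendian;
    };

    /* __BYTE_ORDER__ is predefined by GCC and Clang; other compilers need their own test. */
    #if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
    #define field field_bigendian
    #else
    #define field field_lilendian
    #endif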

    which lets you do things like memory map read-only files and work on the data directly (since normally you’d either swizzle on load, or have to swizzle on every access). It doesn’t seem like it could be that performance critical, having to swizzle on every access, but maybe it bloats code too much?

    I mean, ok, maybe it was just political, but that seems crazy.
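
The two ideas raised in the comments, reading whichever copy already matches the host byte order and using the second copy as a redundancy check, can be sketched in C roughly as follows. The struct and function names are hypothetical and exist only for this illustration. The sketch detects host byte order at run time rather than with a compiler macro, and it copies the field out with `memcpy` on the assumption that a field inside a memory-mapped image may not be aligned for the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed on-disc layout of one both-endian 32-bit field:
 * little-endian copy first, then big-endian copy. */
struct both_endian_u32 {
    uint32_t le;
    uint32_t be;
};

/* Return the copy that already matches the host byte order,
 * so no byte swapping is needed. */
static uint32_t read_native_u32(const uint8_t *raw)
{
    struct both_endian_u32 f;
    const uint16_t probe = 1;
    uint8_t first_byte;

    assert(raw != NULL);
    memcpy(&f, raw, sizeof f);        /* avoids unaligned access      */
    memcpy(&first_byte, &probe, 1);   /* 1 on little-endian hosts     */
    return first_byte ? f.le : f.be;
}

/* Optional redundancy check: byte i of the little-endian copy must
 * equal byte 3 - i of the big-endian copy. Returns 1 if they agree. */
static int copies_agree_u32(const uint8_t *raw)
{
    assert(raw != NULL);
    for (int i = 0; i < 4; i++) {
        if (raw[i] != raw[7 - i])
            return 0;
    }
    return 1;
}
```

Given the eight bytes 44 33 22 11 11 22 33 44, `read_native_u32` should return 0x11223344 on either kind of host, and `copies_agree_u32` should report a mismatch if any byte in one copy is corrupted.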