Pursuant to my recent post about finally understanding how Sega Dreamcast GD-ROM rips are structured, I was able to prepare the contents of various demo discs in a manner that makes exploration easy via the Internet Archive. This is due to the way that IA makes it easy to browse archives such as ZIP or ISO files (anything that 7zip knows how to unpack), and also presents the audio tracks for native playback directly through the web browser.
These are some of the interesting things I have found while perusing the various Dreamcast sampler discs.
First and foremost: Multimedia-wise, SFD and ADX files abound on all the discs. SFD files are Sofdec, a middleware format used for a lot of FMV on Dreamcast games. These were little more than MPEG video files with a non-MPEG (ADPCM instead) audio codec. VLC will usually play the video portions of these files but has trouble detecting the audio. It’s not for lack of audio codec support because it can play the ADX files just fine.
I’m finally completing something I set out to comprehend over a decade ago. I wanted to understand how data is actually laid out on a Sega Dreamcast GD-ROM drive. I’m trying to remember why I even still care. There was something about how I wanted to make sure the contents of a set of Dreamcast demo discs was archived for study.
I eventually figured it out. Read on, if you are interested in the technical details. Or, if you would like to examine the fruits of this effort, check out the Dreamcast demo discs that I took apart and uploaded to the Internet Archive.
If you care to read some geeky technical details of some of the artifacts on these sampler discs, check out this followup post on Dreamcast Finds.
Why do I still care about this? Well, see the original charter of this blog above. It’s mostly about studying multimedia formats, as well as the general operation of games and their non-multimedia data formats. It’s also something that has nagged at me ever since I extracted a bunch of Dreamcast discs years ago and tried to understand why the tracks were arranged the way they were, and how I could systematically split the files out of the filesystem. This turns out not to be as easy as it might sound, even if you can get past the obstacle of getting at the raw data.
A long time ago, I dashed off a quick blog post with a curious finding after studying the ISO-9660 spec: The format stores multi-byte numbers in a format I termed “omni-endian”– the committee developing the format apparently couldn’t come to an agreement on this basic point regarding big- vs. little-endian encoding (I’m envisioning something along the lines of “tastes great! … less filling!” in the committee meetings).
I recently discovered another bit of compromise in the ISO-9660 spec: It seems that there are 2 different methods for processing the directory structure. That means it’s incumbent upon ISO-9660 creation software to fill in the data structures to support both methods, because about some ISO-reading programs out there rely on one set of data structures while the rest prefer to read the other set.
As a refresher, the “ISO” extension of an ISO file refers to the ISO-9660 specification. This is a type of read-only filesystem (i.e, the filesystem is created once and never updated after initial creation) for the purpose of storing on a read-only medium, often an optical disc (CD-ROM, DVD-ROM). The level of nostalgic interest I display for the ISO-9660 filesystem reminds me of my computer science curriculum professors from the mid-90s reminiscing about ye olden days of punchcard programming, but such is my lot. I’m probably also alone in my frustration of seeing rips of, e.g., GameCube or Xbox or 3DO games being tagged with the extension .ISO since those systems use different read-only filesystems.
I recently fell in with an odd bunch called the eXoDOS project and was trying to help fill in a few gaps. One request was a 1994 game called Power Drive for DOS.
I recently published a tool called MobyCAIRO. The ‘CAIRO’ part stands for Computer-Assisted Image ROtation, while the ‘Moby’ prefix refers to its role in helping process artifact image scans to submit to the MobyGames database. The tool is meant to provide an accelerated workflow for rotating and cropping image scans. It works on both Windows and Linux. Hopefully, it can solve similar workflow problems for other people.
As of this writing, MobyCAIRO has not been tested on Mac OS X yet– I expect some issues there that should be easily solvable if someone cares to test it.
The rest of this post describes my motivations and how I arrived at the solution.
I have scanned well in excess of 2100 images for MobyGames and other purposes in the past 16 years or so. The workflow looks like this:
It should be noted that my original workflow featured me manually rotating the artifact on the scanner bed in order to ensure straightness, because I guess I thought that rotate functions in image editing programs constituted dark, unholy magic or something. So my workflow used to be even more arduous:
I can’t believe I had the patience to do this for hundreds of scans
Sometime last year, I was sitting down to perform some more scanning and found myself dreading the oncoming tedium of straightening and cropping the images. This prompted a pivotal question:
Why can’t a computer do this for me?
After all, I have always been a huge proponent of making computers handle the most tedious, repetitive, mind-numbing, and error-prone tasks. So I did some web searching to find if there were any solutions that dealt with this. I also consulted with some like-minded folks who have to cope with the same tedious workflow.
I came up empty-handed. So I endeavored to develop my own solution.
Problem Statement and Prior Work