Author Archives: Multimedia Mike

Parsing The Clue Chronicles

A long time ago, I procured a 1999 game called Clue Chronicles: Fatal Illusion, based on the classic board game Clue, a.k.a. Cluedo. At the time, I was big into collecting old, unloved PC games so that I could research obscure multimedia formats.



Surveying the 3 CD-ROMs contained in the box packaging revealed only Smacker (SMK) videos for full motion video which was nothing new to me or the multimedia hacking community at the time. Studying the mix of data formats present on the discs, I found a selection of straightforward formats such as WAV for audio and BMP for still images. I generally find myself more fascinated by how computer games are constructed rather than by playing them, and this mix of files has always triggered a strong “I could implement a new engine for this!” feeling in me, perhaps as part of the ScummVM project which already provides the core infrastructure for reimplementing engines for 2D adventure games.

Tying all of the assets together is a custom high-level programming language. I have touched on this before in a blog post over a decade ago. The scripts are in a series of files bearing the extension .ini (usually reserved for configuration scripts, but we’ll let that slide). A representative sample of such a script can be found here:

clue-chronicles-scarlet-1.txt

What Is This Language?
At the time I first analyzed this language, I was still primarily a C/C++-minded programmer, with a decent amount of Perl experience as a high level language, and had just started to explore Python. I assessed this language to be “mildly object oriented with C++-type comments (‘//’) and reliant upon a number of implicit library functions”. Other people saw other properties. When I look at it nowadays, it reminds me a bit more of JavaScript than C++. I think it’s sort of a Rorschach test for programming languages.

Strangely, I sort of had this fear that I would put a lot of effort into figuring out how to parse out the language only for someone to come along and point out that it’s a well-known yet academic language that already has a great deal of supporting code and libraries available as open source. Google for “spanish dolphins far side comic” for an illustration of the feeling this would leave me with.

It doesn’t matter in the end. Even if such libraries exist, how easy would they be to integrate into something like ScummVM? Time to focus on a workable approach to understanding and processing the format.

Problem Scope
So I set about to see if I can write a program to parse the language seen in these INI files. Some questions:

  1. How large is the corpus of data that I need to be sure to support?
  2. What parsing approach should I take?
  3. What is the exact language format?
  4. Other hidden challenges?

To figure out how large the data corpus is, I counted all of the INI files on all of the discs. There are 138 unique INI files between the 3 discs. However, there are 146 unique INI files after installation. This leads to a hidden challenge described a bit later.

What parsing approach should I take? I worried a bit too much that I might not be doing this the “right” way. I’m trying to ignore doubts like this, like how “SQL Shame” blocked me on a task for a little while a few years ago as I concerned myself that I might not be using the purest, most elegant approach to the problem. I know I covered language parsing a lot time ago in university computer science education and there is a lot of academic literature to the matter. But sometimes, you just have to charge in and experiment and prototype and see what falls out. In doing so, I expect to have a better understanding of the problems that need to solved and the right questions to ask, not unlike that time that I wrote a continuous integration system from scratch because I didn’t actually know that “continuous integration” was the keyword I needed.

Next, what is the exact language format? I realized that parsing the language isn’t the first and foremost problem here– I need to know exactly what the language is. I need to know what the grammar are keywords are. In essence, I need to reverse engineer the language before I write a proper parser for it. I guess that fits in nicely with the historical aim of this blog (reverse engineering).

Now, about the hidden challenges– I mentioned that there are 8 more INI files after the game installs itself. Okay, so what’s the big deal? For some reason, all of the INI files are in plaintext on the CD-ROM but get compressed (apparently, according to file size ratios) when installed to the hard drive. This includes those 8 extra INI files. I thought to look inside the CAB installation archive file on the CD-ROM and the files were there… but all in compressed form. I suspect that one of the files forms the “root” of the program and is the launching point for the game.

Parsing Approach
I took a stab at parsing an INI file. My approach was to first perform lexical analysis on the file and create a list of 4 types: symbols, numbers, strings, and language elements ([]{}()=.,:). Apparently, this is the kind of thing that Lex/Flex are good at. This prototyping tool is written in Python, but when I port this to ScummVM, it might be useful to call upon the services of Lex/Flex, or another lexical analyzer, for there are many. I have a feeling it will be easier to use better tools when I understand the full structure of the language based on the data available.
Continue reading

Dreamcast Serial Extractor

It has not been a very productive year for blogging. But I started the year by describing an unfinished project that I developed for the Sega Dreamcast, so I may as well end the year the same way. The previous project was a media player. That initiative actually met with some amount of success and could have developed into something interesting if I had kept at it.

By contrast, this post describes an effort that was ultimately a fool’s errand that I spent way too much time trying to make work.

Problem Statement
In my neverending quest to analyze the structure of video games while also hoarding a massive collection of them (though I’m proud to report that I did play at least a few of them this past year), I wanted to be able to extract the data from my many Dreamcast titles, both games and demo discs. I had a tool called the DC Coder’s Cable, a serial cable that enables communication between a Dreamcast and a PC. With the right software, you could dump an entire Dreamcast GD-ROM, which contained a gigabyte worth of sectors.

Problem: The dumping software (named ‘dreamrip’ and written by noted game hacker BERO) operated in a very basic mode, methodically dumping sector after sector and sending it down the serial cable. This meant that it took about 28 hours to extract all the data on a single disc by running at the maximum speed of 115,200 bits/second, or about 11 kilobytes/second. I wanted to create a faster method.

The Pitch
I formed a mental model of dreamrip’s operation that looked like this:



As an improvement, I envisioned this beautiful architecture:
Continue reading

Writing A Dreamcast Media Player

I know I’m not the only person to have the idea to port a media player to the Sega Dreamcast video game console. But I did make significant progress on an implementation. I’m a little surprised to realize that I haven’t written anything about it on this blog yet, given my propensity for publishing my programming misadventures.


3 Dreamcast consoles in a row

This old effort had been on my mind lately due to its architectural similarities to something else I was recently brainstorming.

Early Days
Porting a multimedia player was one of the earliest endeavors that I embarked upon in the multimedia domain. It’s a bit fuzzy for me now, but I’m pretty sure that my first exposure to the MPlayer project in 2001 arose from looking for a multimedia player to port. I fed it through the Dreamcast development toolchain but encountered roadblocks pretty quickly. However, this got me looking at the MPlayer source code and made me wonder how I could contribute, which is how I finally broke into practical open source multimedia hacking after studying the concepts and technology for more than a year at that point.

Eventually, I jumped over to the xine project. After hacking on that for awhile, I remembered my DC media player efforts and endeavored to compile xine to the console. The first attempt was to simply compile the codebase using the Dreamcast hobbyist community’s toolchain. This is when I came to fear the multithreaded snake pit in xine’s core. Again, my memories are hazy on the specifics, but I remember the engine having a bunch of threading hacks with comments along the lines of “this code deadlocks sometimes, so on shutdown, monitor this lock and deliberately break it if it has been more than 3 seconds”.

Something Workable
Eventually, I settled on a combination of FFmpeg’s libavcodec library for audio and video decoders, xine’s demuxer library, and xine’s input API, combined with my own engine code to tie it all together along with video and output drivers provided by the KallistiOS hobbyist OS for Dreamcast. Here is a simple diagram of the data movement through this player:


Architecture diagram for a Sega Dreamcast media player

Details and Challenges
This is a rare occasion when I actually got to write the core of a media player engine. I made some mistakes.

xine’s internal clock ran at 90000 Hz. At least, its internal timestamps were all in reference to a 90 kHz clock. I got this brilliant idea to trigger timer interrupts at 6000 Hz to drive the engine. Whatever the timer facilities on the Dreamcast, I found that 6 kHz was the greatest common divisor with 90 kHz. This means that if I could have found an even higher GCD frequency, I would have used that instead.

So the idea was that, for a 30 fps video, the engine would know to render a frame on every 200th timer interrupt. I eventually realized that servicing 6000 timer interrupts every second would incur a ridiculous amount of overhead. After that, my engine’s philosophy was to set a timer to fire for the next frame while beginning to process the current frame. I.e., when rendering a frame, set a timer to call back in 1/30th of a second. That worked a lot better.

As I was still keen on 8-bit paletted image codecs at the time (especially since they were simple and small for bootstrapping this project), I got to use output palette images directly thanks to the Dreamcast’s paletted textures. So that was exciting. The engine didn’t need to convert the paletted images to a different colorspace before rendering. However, I seem to recall that the Dreamcast’s PowerVR graphics hardware required that 8-bit textures be twiddled/swizzled. Thus, it was still required to manipulate the 8-bit image before rendering.

I made good progress on this player concept. However, a huge blocker for me was that I didn’t know how to make a proper user interface for the media player. Obviously, programming the Dreamcast occurred at a very low level (at least with the approach I was using), so there were no UI widgets easily available.

This was circa 2003. I assumed there must have been some embedded UI widget libraries with amenable open source licenses that I could leverage. I remember searching and checking out a library named libSTK. I think STK stood for “set-top toolkit” and was positioned specifically for doing things like media player UIs on low-spec embedded computing devices. The domain hosting the project is no longer useful but this appears to be a backup of the core code.

It sounded promising, but the libSTK developers had a different definition of “low-spec embedded” device than I did. I seem to recall that they were targeting something along with likes of a Pentium III clocked at 800 MHz with 128 MB RAM. The Dreamcast, by contrast, has a 200 MHz SH-4 CPU and 16 MB RAM. LibSTK was also authored in C++ and leveraged the Boost library (my first exposure to that code), and this all had the effect of making binaries quite large while I was trying to keep the player in lean C.

Regrettably, I never made any serious progress on a proper user interface. I think that’s when the player effort ran out of steam.

The Code
So, that’s another project that I never got around to finishing or publishing. I was able to find the source code so I decided to toss it up on github, along with 2 old architecture outlines that I was able to dig up. It looks like I was starting small, just porting over a few of the demuxers and decoders that I knew well.

I’m wondering if it would still be as straightforward to separate out such components now, more than 13 years later?

Translating Return To Ringworld

As indicated in my previous post, the Translator has expressed interest in applying his hobby towards another DOS adventure game from the mid 1990s: Return to Ringworld (henceforth R2RW) by Tsunami Media. This represents significantly more work than the previous outing, Phantasmagoria.


Return to Ringworld Title Screen
Return to Ringworld Title Screen

I have been largely successful thus far in crafting translation tools. I have pushed the fruits of these labors to a Github repository named improved-spoon (named using Github’s random name generator because I wanted something more interesting than ‘game-hacking-tools’).

Further, I have recorded everything I have learned about the game’s resource format (named RLB) at the XentaxWiki.

New Challenges
The previous project mostly involved scribbling subtitle text on an endless series of video files by leveraging a separate software library which took care of rendering fonts. In contrast, R2RW has at least 30k words of English text contained in various blocks which require translation. Further, the game encodes its own fonts (9 of them) which stubbornly refuse to be useful for rendering text in nearly any other language.

Thus, the immediate 2 challenges are:
Continue reading