Category Archives: Reverse Engineering

Brainstorming and case studies relating to craft of software reverse engineering.

LucasArts Task Force

Those wacky programmers over at LucasArts have traditionally employed more acronyms than the most bloated of bureaucracies. They started out with SCUMM (Script Creation Utility For Maniac Mansion). Then there was SMUSH and INSANE, 2 other game engines. I think GRIME was another engine, developed for Grim Fandango. I probably have some of those details at least a little incorrect but I’m too lazy to do the research right now, and I know the ScummVM hackers will perform any corrections for me. (Update: Sure enough– check the comments for clarification.)

Why do I care about any of this? A number of games based on the SMUSH engine contain FMV files with the extension .san. This apparently stands for Smush ANimation. Many years ago, shortly after I gained interest in studying game-related multimedia formats, I received a request to study the SAN format. Some reverse engineering work had already been done and put into ScummVM but it was not quite complete and was also quite rough around the edges– the kind of RE work where ASM is more or less converted directly into C. I never could quite make heads or tails of it and sometimes thought it might be easier to just start over with the raw binary.

I think it’s finally time to get serious and and put the SAN format to rest! Who’s with me? Good, now let’s get down to business. First things first, samples! I have organized SAN samples from 11 different titles (although Outlaws and The Dig are represented twice in the count because of their demos): http://www.mplayerhq.hu/MPlayer/samples/game-formats/la-san/. That directory also contains the file TGSMUSH.DLL which is allegedly capably of playing every known variation of SMUSH files.

The SAN files have a pretty straightforward FourCC-chunk format that we multimedia hackers have all seen in dozens of formats before. What coding methods does the format use for audio and video compression? That’s the big question. Not unlike other game-specific FMV formats that evolved over time as their creators refined the concepts (I’m thinking of Westwood VQA and Interplay MVE), SAN went through a few iterations. It apparently employs a range of coding techniques. One audio codec is called VIMA — most likely an abbreviation for variable IMA ADPCM — that Cyril has done an amazing job of documenting. As for video codecs there are a number of codecs and subcodecs at work and the details are still not entirely clear. There is some RLE and there appears to be some vector quantization. Further, based on the RE work already in ScummVM it appears that SAN might use hierarchical VQ, also seen in Sorenson Video 1, as well as gradient blocks, also seen in Gremlin Digital Video (though still not documented in the Wiki).

For your inspection you can find the source for the ScummVM SAN video decoder here: http://omega.xtr.net.pl/misc/smush/. The binary DLL is also available as mentioned previously. Cyril has started a new page on Smush. I think we have all the resources we need to attack this. Let’s see what we can come up with.

Xan Binary Decoding

In the time-honored tradition of avoiding real work, I made an extension to xine that will decode Origin’s Xan codec (FourCC: Xxan). This is despite my long-standing policy that I will not invest any effort into making open source programs leverage closed, binary code in order to decode data.


Wing Commander IV Title

I have no intention of committing it to the xine codebase because, really, who cares? But let’s see MPlayer decode Xan data! Ha! Oh, why do I provoke them so? They’re going to have a workable decoder 20 minutes after I post this.

If you care, I posted the xine plugin code here: http://multimedia.cx/xan/. This is based on Mário Brito’s extensive Wing Commander research. It needs a very large table for decoding (128KB of data expressed in ASCII text) and that’s contained in xandata.h[.bz2]. Ideally, I think that table is supposed to be generated by some DLL function. Xan samples and the xanlib.dll are located at the MPlayer samples repository.

This particular plugin is based on one of my old reverse engineering experiments. The reason I took on this task is because loading xanlib.dll and calling into it isn’t especially difficult. At least, none of the relevant functions are dependent upon any external functions. Thus, I just used a few strategic mmap() functions and loaded the binary code directly into specific memory regions. Oh, the code only works on x86 architectures, of course.

SDL Corruption

Pursuant to Alex’s challenge to write a Unix player for Trixter’s 8088 Corruption data file, combined with an interest in re-learning the Simple DirectMedia Layer (SDL) API, I wrote a basic program that takes said data file, a font file, and the hardwired colors in the CGA card and renders the video using SDL. I don’t think the font vectors I scavenged are 100% the same as the ones in Trixter’s IBM model 5150 PC:


Eiffel Tower Breakdancer

In particular, I’m not sure about all of those box characters. I think the box is supposed to be one flat color. Anyway, here is another shot, only from the “Tron light cycles” section of video:


Tron Light Cycles

Followed up in SDL Corruption Corrected.