Category Archives: Reverse Engineering

Brainstorming and case studies relating to the craft of software reverse engineering.

Something For The LucasArts Fanboys

I have a contact working diligently on the LucasArts SANM format, most famously used in Grim Fandango. He sent me an MPEG-4 movie transcoded from a SANM file via an experimental FFmpeg module. Screenshot:

Feeble Files DXA And Wiki Upgrade

Kirben from ScummVM tipped me off on the DXA format. Apparently, it was only ever used in one game, called Feeble Files. He reports that the Amiga and Macintosh ports used the DXA format. I was unaware that there were any commercial games for the Amiga past about 1995. Here is the requisite Wiki page on the format. It’s one of the simplest formats yet. It’s unusual in that it stores all of the audio data in a single WAV file chunk near the start. The known video coding format simply uses zlib’s deflate() function to compress either a raw frame or the result of an XOR operation between the current and previous frames.
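To make the video coding idea concrete, here is a minimal Python sketch of the raw-or-XOR scheme described above. It is not the actual DXA bitstream: the function names are made up for illustration, chunk headers and frame-type signaling are left out entirely, and Python's zlib.compress() stands in for a direct call to deflate().

```python
import zlib
from typing import Optional


def encode_frame(current: bytes, previous: Optional[bytes] = None) -> bytes:
    """Deflate the raw frame, or its XOR against the previous frame.

    Hypothetical helper: real DXA frames carry headers that are not modeled here.
    """
    if previous is None:
        payload = current  # key frame: compress the raw pixels
    else:
        if len(previous) != len(current):
            raise ValueError("frames must be the same size")
        # Unchanged pixels XOR to zero, which deflate compresses very well.
        payload = bytes(c ^ p for c, p in zip(current, previous))
    return zlib.compress(payload)


def decode_frame(data: bytes, previous: Optional[bytes] = None) -> bytes:
    """Inverse of encode_frame(): inflate, then undo the XOR if one was applied."""
    payload = zlib.decompress(data)
    if previous is None:
        return payload
    return bytes(d ^ p for d, p in zip(payload, previous))
```

The appeal of the XOR variant is that a mostly static scene turns into long runs of zero bytes before it ever reaches the compressor.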

Speaking of the Wiki, I have upgraded the MultimediaWiki to the latest, and therefore greatest, incarnation of MediaWiki: 1.6.3. I don’t see too many major differences so far. There are supposed to be some useful counter-spam features, which seem to be increasingly important. I still can’t generate math expressions in the Wiki. I’ve traced this to the absence of LaTeX processing utilities on the host machine. Why does LaTeX always have to cause such trouble? We’re stuck with plaintext math expressions until I can get around this problem somehow.

Doxygen As RE Tool

It’s important not to overlook some obvious tools even though they may not be chartered for reverse engineering purposes. Think of Doxygen. This is that tool that can look at your project full of source code and extract all of the comments and data structures and functions and automatically generate a cleanly interlinked set of HTML files to browse.

It helps if you design a project from the ground up to take advantage of Doxygen’s features. But strictly speaking, Doxygen doesn’t really care how poorly a project is commented. It will still plow through the source code, analyzing it as best it can.

I am still working diligently on Understanding VC-1. Now, the SMPTE reference decoder is quite neatly documented. However, in trying to understand functions there is the inevitable jumping back and forth between source files trying to figure out what this or that data structure contains. In this, Doxygen relieves some of the RE pain. As a small example, here is what the Doxygen-generated HTML for the VLC data structure looks like (it’s a picture; don’t bother to click the links):

This example turned out all right. Sometimes, however, the comments don’t match up with the fields since they weren’t formatted with Doxygen in mind. On average, it’s still an improvement over the standard file-hopping.

Mimic Doc

Browsing through the MultimediaWiki I found a codec called Mimic that I previously had not paid too much attention to. I remember someone on the ffmpeg-devel list asking for reverse engineering advice regarding the codec with the FourCC ML20, which is also known as Mimic. The codec was apparently a bit frustrating to work with because it only occurs on the wire: it is only used by a teleconferencing app, and there are no apps to easily save the encoded data into a handy container format.

Anyway, someone persisted (one Ole André Vadla Ravnås according to the source headers) and eventually reverse engineered the complete codec and created an open source library called libmimic, according to the Wiki page. The link on the Wiki page for libmimic does not seem to work. However, another open source teleconferencing app called Farsight incorporates the full library, which even knows how to encode video.

It’s a very simple codec for those of us (ahem) skilled in the art. I’ve done my best to document the details based on the RE’d source code. Stop me if you’ve heard this one before: transform 8×8 blocks of data with a discrete cosine transform -> quantize the DCT coefficients -> zigzag the coefficients -> Huffman code the non-zero coefficients and the zero-runs between them. The only major intraframe coding concepts the codec is missing are macroblocks and delta coding. Interframes feature frame differencing: if a block is unchanged from the block at the same position in the previous frame, leave it alone. Good design decision for a teleconferencing video codec.
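For anyone who hasn't heard this one before, the generic pipeline can be sketched in a few lines of Python. This is an illustration of the textbook technique, not a port of libmimic: the flat quantizer step of 16, the JPEG-style zigzag order, the exact-match block test, and all function names are assumptions, and the final Huffman stage is omitted (the sketch stops at the run/level pairs that would be fed to it).

```python
import math

N = 8  # block size


def dct_8x8(block):
    """Naive 2-D DCT-II of an 8x8 block (list of 8 rows of 8 numbers)."""
    def scale(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for y in range(N):
                for x in range(N):
                    total += (block[y][x]
                              * math.cos((2 * y + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * x + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * total
    return out


def quantize(coeffs, step=16):
    """Uniform quantizer; the step size is an arbitrary assumption."""
    return [[int(round(value / step)) for value in row] for row in coeffs]


def zigzag_order():
    """(row, col) positions along anti-diagonals, alternating direction."""
    return sorted(
        ((y, x) for y in range(N) for x in range(N)),
        key=lambda p: (p[0] + p[1], p[0] if (p[0] + p[1]) % 2 else p[1]),
    )


def run_level_pairs(quantized):
    """(zero_run, level) pairs for the non-zero coefficients in zigzag order."""
    pairs, run = [], 0
    for y, x in zigzag_order():
        level = quantized[y][x]
        if level == 0:
            run += 1
        else:
            pairs.append((run, level))
            run = 0
    return pairs  # trailing zeros are simply dropped


def block_unchanged(current, previous):
    """Interframe test: skip coding a block identical to its predecessor."""
    return current == previous


flat = [[128] * N for _ in range(N)]
pairs = run_level_pairs(quantize(dct_8x8(flat)))
```

A perfectly flat block like the one at the bottom collapses to a single DC coefficient, which is the whole point of the transform: smooth image areas cost almost nothing to code.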

Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering
