Tag Archives: Reverse Engineering

The 11th Hour RoQ Variation

I have been looking at the RoQ file format almost as long as I have been doing practical multimedia hacking. However, I have never figured out how the RoQ format works on The 11th Hour, which was the game for which the RoQ format was initially developed. When I procured the game years ago, I remember finding what appeared to be RoQ files and shoving them through the open source decoders but not getting the right images out.

I decided to dust off that old copy of The 11th Hour and have another go at it.



Baseline
The game consists of 4 CD-ROMs. Each disc has a media/ directory containing a series of files with the extension .gjd, likely the initials of one Graeme J. Devine. These are resource files, which are merely headerless concatenations of other files. Thus, at first glance, one .gjd file might appear to be a single RoQ file. That is the source of some of the difficulty: sending an apparent RoQ .gjd file through a RoQ player will often cause the program to complain when it encounters the header of another RoQ file.
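Since a .gjd file is just several files glued together, one way to pull it apart is to scan for the RoQ file signature and cut at each hit. The following is only a rough sketch: the 6-byte signature (chunk ID 0x1084 followed by a size field of 0xFFFFFFFF) is taken from the publicly documented later RoQ format and is an assumption for The 11th Hour files, and the function names are made up for illustration.

```python
# Assumed signature: chunk ID 0x1084 (little-endian) followed by a
# chunk size of 0xFFFFFFFF, as in later, publicly documented RoQ files.
ROQ_SIGNATURE = bytes([0x84, 0x10, 0xFF, 0xFF, 0xFF, 0xFF])


def find_roq_offsets(data, signature=ROQ_SIGNATURE):
    """Return every offset in data where the signature begins."""
    offsets = []
    pos = data.find(signature)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(signature, pos + 1)
    return offsets


def split_resource(data, signature=ROQ_SIGNATURE):
    """Cut data into pieces, each starting at a signature hit."""
    offsets = find_roq_offsets(data, signature)
    ends = offsets[1:] + [len(data)]
    return [data[start:end] for start, end in zip(offsets, ends)]
```

A signature scan like this can produce false positives if the same byte pattern happens to occur inside compressed data, and it ignores anything stored before the first hit, so treat the resulting pieces as candidates rather than guaranteed file boundaries.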

I have uploaded some samples to the usual place.

However, even the frames that a player can decode (before encountering a file boundary within the resource file) look wrong.

Investigating Codebooks Using dreamroq
Last year I wrote dreamroq, an independent RoQ playback library targeted towards embedded systems. I aimed it at a .gjd file and quickly hit a codebook error.

RoQ is a vector quantizer video codec that maintains a codebook of 256 2×2 pixel vectors. In the Quake III and later RoQ files, these are transported using a YUV 4:2:0 colorspace: 4 Y samples, a U sample, and a V sample to represent 4 pixels. This totals 6 bytes per vector. A RoQ codebook chunk contains a field that indicates the number of 2×2 vectors as well as the number of 4×4 vectors. Each of the latter is composed of four 2×2 vectors, each referenced by a 1-byte index, for 4 bytes per 4×4 vector.

Thus, the total size of a codebook chunk ought to be (# of 2×2 vectors) * 6 + (# of 4×4 vectors) * 4.
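As a quick sanity check, the formula can be written out directly. This is a minimal sketch; the function and parameter names are illustrative and not taken from dreamroq.

```python
def expected_codebook_size(num_2x2, num_4x4, bytes_per_2x2=6, bytes_per_4x4=4):
    """Expected codebook chunk payload size in bytes.

    Defaults follow the later RoQ layout described above: 6 bytes per
    2x2 vector (4 Y, 1 U, 1 V) and 4 one-byte indices per 4x4 vector.
    """
    return num_2x2 * bytes_per_2x2 + num_4x4 * bytes_per_4x4


# A full codebook of 256 2x2 vectors and 256 4x4 vectors:
print(expected_codebook_size(256, 256))  # 2560
```

Comparing this value against the size field in a codebook chunk header is a cheap way to spot a file that does not follow the later layout.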

However, this is not the case with The 11th Hour RoQ files.

Longer Codebooks And Mystery Colorspace
Juggling the numbers for a few of the codebook chunks, I empirically determined that the 2×2 vectors are represented by 10 bytes instead of 6. Now I need to determine what exactly these 10 bytes represent.
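The number juggling amounts to solving the size formula for the per-vector byte count. Here is a small sketch of that arithmetic, assuming the 4×4 vectors still take 4 bytes each. The function name is hypothetical and the figures in the example are made up for illustration, not taken from an actual game file.

```python
def infer_bytes_per_2x2(chunk_size, num_2x2, num_4x4, bytes_per_4x4=4):
    """Solve chunk_size = num_2x2 * n + num_4x4 * bytes_per_4x4 for n.

    Returns None when no whole-number answer exists, which would suggest
    that one of the assumptions about the chunk layout is wrong.
    """
    remainder = chunk_size - num_4x4 * bytes_per_4x4
    if num_2x2 <= 0 or remainder < 0 or remainder % num_2x2 != 0:
        return None
    return remainder // num_2x2


# Illustrative numbers only: a 3584-byte chunk holding 256 + 256 vectors.
print(infer_bytes_per_2x2(3584, 256, 256))  # 10
```

Running this over several codebook chunks and getting the same answer each time is what makes the 10-byte figure believable.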

I should note that I suspect that everything else about these files lines up with successive generations of the format. For example, if a file has 640×320 resolution, that amounts to 40×20 macroblocks of 16×16 pixels. dreamroq iterates through those 40×20 macroblocks (each coded as four 8×8 blocks) and precisely exhausts the VQ bitstream. So that all looks valid. I’m just puzzled by the codebook format.
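The block arithmetic behind that check is simple enough to spell out. The sketch below assumes 16×16 macroblocks that are each traversed as four 8×8 blocks, which is the layout in publicly documented RoQ and an assumption as far as these files go. The function name is illustrative.

```python
def block_counts(width, height, macroblock=16, block=8):
    """Return ((mb_columns, mb_rows), total_blocks) for a frame.

    Assumes the frame dimensions are multiples of the macroblock size
    and that each macroblock splits evenly into block x block squares.
    """
    if width % macroblock or height % macroblock:
        raise ValueError("dimensions must be multiples of the macroblock size")
    grid = (width // macroblock, height // macroblock)
    blocks_per_macroblock = (macroblock // block) ** 2
    return grid, grid[0] * grid[1] * blocks_per_macroblock


print(block_counts(640, 320))  # ((40, 20), 3200)
```

If a decoder walks exactly this many blocks and lands on the last byte of the VQ chunk, the bitstream layout is almost certainly being parsed correctly.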

Here is an example codebook dump:

Deobfuscation Redux: JavaScript

Google recently released version 12 of their Chrome browser. This version adds a new feature that can automatically deobfuscate obfuscated JavaScript source code.

Before:



After:



As a reverse engineering purist, I was a bit annoyed. Not at the feature, just the naming. This is clearly code beautification but not necessarily deobfuscation. The real obfuscation comes not from removing whitespace but from renaming variables and functions to terse 1- and 2-letter identifiers. True automated deobfuscation (which entails recovering the original variable and function identifiers as well as source code comments) is basically impossible.

Still, it makes me wonder if there is any interest in a JavaScript deobfuscator that operates similarly to my Java deobfuscator, which was one of the first things I published on this blog. The general idea is to automatically replace function names with random English verbs (since functions correspond to actions) and variable names with random animal names (I decided “English nouns” encompassed too broad a category of words). I suspect the day that someone releases a proprietary multimedia codec in a pure (though obfuscated) JavaScript format is the day that I will try to accomplish this, if it hasn’t been done already.
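To make the renaming idea concrete, here is a deliberately naive sketch. It is not the Java deobfuscator's method: it finds names with regular expressions instead of a real JavaScript parser, only notices `function name` and `var name` declarations, and picks words in a fixed order (with a numeric suffix) rather than at random so the output is reproducible. The word lists and the function name are invented for the example.

```python
import re

# Small illustrative word lists; a real tool would want far more words.
VERBS = ["fetch", "build", "scan", "merge", "paint", "shuffle"]
ANIMALS = ["otter", "heron", "lynx", "gecko", "bison", "marmot"]

IDENT = r"[A-Za-z_$][\w$]*"


def rename_identifiers(source):
    """Rename declared functions to verbs and declared variables to animals.

    Returns (new_source, mapping). Naive: parameters are left alone, and
    matching names inside strings or comments get renamed too.
    """
    func_names = re.findall(r"\bfunction\s+(" + IDENT + ")", source)
    var_names = re.findall(r"\bvar\s+(" + IDENT + ")", source)

    mapping = {}
    # dict.fromkeys() drops duplicates while keeping first-seen order.
    for i, name in enumerate(dict.fromkeys(func_names)):
        mapping[name] = f"{VERBS[i % len(VERBS)]}{i}"
    for i, name in enumerate(dict.fromkeys(var_names)):
        mapping.setdefault(name, f"{ANIMALS[i % len(ANIMALS)]}{i}")

    if not mapping:
        return source, mapping

    # Match whole identifiers only, so "a" does not match inside "var".
    names = "|".join(re.escape(name) for name in mapping)
    pattern = re.compile(r"(?<![\w$])(" + names + r")(?![\w$])")
    return pattern.sub(lambda m: mapping[m.group(1)], source), mapping


src = "function a(b){var c=b+1;return c;} a(2);"
print(rename_identifiers(src)[0])
# function fetch0(b){var otter0=b+1;return otter0;} fetch0(2);
```

Anything beyond a toy would need a proper parser to respect scoping, property names, and string literals, but the sketch shows the core substitution step.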


Camp Luna

I remember when the Mono people first announced the Moonlight project for Linux that would interoperate with Microsoft’s Silverlight. They claimed that Microsoft would release a special binary codec pack that would allow Linux users to play back their proprietary media codecs. However, this codec pack would not be allowed for use in any other application, like FFmpeg or GStreamer. How are they going to enforce that? Or so I wondered. Tonight I learned how.

I started investigating the API of the binary codec pack blobs a few weeks ago. I got as far as figuring out how Moonlight registers the codecs. Then I lost motivation, in no small part because there isn’t that much in the blob that I would deem interesting (perhaps one method for keeping people from sorting out the API). In the comments of the last post on the matter, people wondered if the codec pack included support for WMA Voice, which is still unknown. I can’t find any ‘voice’ strings in the blob. However, I do find references to lossless coding. This might pertain to Windows Lossless Audio, or it could just be a special coding mode for WMA3 Pro. Either way, I’m suddenly interested.

So I looked for interface points in the Moonlight source. Moonlight simply loads and invokes registration functions for WMA, WMV, and MP3. The registration functions don’t return any data that Moonlight stores. Moonlight doesn’t appear to load (via dlsym()) or invoke any other codec pack functions directly. So how can it possibly be interfacing? The only other way the interaction could flow is if the codec pack shared library were invoking functions in Moonlight…

Oh, no… they wouldn’t do that, would they?


IDA Pro Freeware Update

Thanks to igorsk for informing me that DataRescue has made an updated version of IDA Pro available as freeware. No longer must we suffer the quirks of the old freeware version 4.3; we get to learn a whole new set of idiosyncrasies with 4.9.


DataRescue IDA Pro -- Improved version available

The sales folks at DataRescue told me that this freeware release was in the works, to pacify me when they refused to sell me a license for the full version of IDA Pro. Interesting business model.