I have been looking at the RoQ file format almost as long as I have been doing practical multimedia hacking. However, I have never figured out how the RoQ format works on The 11th Hour, which was the game for which the RoQ format was initially developed. When I procured the game years ago, I remember finding what appeared to be RoQ files and shoving them through the open source decoders but not getting the right images out.
I decided to dust off that old copy of The 11th Hour and have another go at it.
The game consists of 4 CD-ROMs. Each disc has a media/ directory that has a series of files bearing the extension .gjd, likely the initials of one Graeme J. Devine. These are resource files which are merely headerless concatenations of other files. Thus, at first glance, one file might appear to be a single RoQ file. So that’s the source of some of the difficulty: Sending an apparent RoQ .gjd file through a RoQ player will often cause the program to complain when it encounters the header of another RoQ file.
However, even the frames that a player can decode (before encountering a file boundary within the resource file) look wrong.
Investigating Codebooks Using dreamroq
I wrote dreamroq last year– an independent RoQ playback library targeted towards embedded systems. I aimed it at a gjd file and quickly hit a codebook error.
RoQ is a vector quantizer video codec that maintains a codebook of 256 2×2 pixel vectors. In the Quake III and later RoQ files, these are transported using a YUV 4:2:0 colorspace– 4 Y samples, a U sample, and a V sample to represent 4 pixels. This totals 6 bytes per vector. A RoQ codebook chunk contains a field that indicates the number of 2×2 vectors as well as the number of 4×4 vectors. The latter vectors are each comprised of 4 2×2 vectors.
Thus, the total size of a codebook chunk ought to be (# of 2×2 vectors) * 6 + (# of 4×4 vectors) * 4.
However, this is not the case with The 11th Hour RoQ files.
Longer Codebooks And Mystery Colorspace
Juggling the numbers for a few of the codebook chunks, I empirically determined that the 2×2 vectors are represented by 10 bytes instead of 6. Now I need to determine what exactly these 10 bytes represent.
I should note that I suspect that everything else about these files lines up with successive generations of the format. For example if a file has 640×320 resolution, that amounts to 40×20 macroblocks. dreamroq iterates through 40×20 8×8 blocks and precisely exhausts the VQ bitstream. So that all looks valid. I’m just puzzled on the codebook format.
Here is an example codebook dump: