Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Duck Hieroglyphics

August 26th, 2007 by Multimedia Mike

It’s like digital archaeology: understanding the ancient (by internet time) workings of less advanced engineering efforts. Pop culture depictions of archaeology would have us believe that those who toil in the field are searching for that one artifact that would neatly solve the entire puzzle at hand. Historically, perhaps the artifact that came closest to fitting this archetype was the Rosetta Stone, and even that was incomplete (if I’m observing the pictures correctly).

So it is with my current investigation. Much like Sean Connery playing Indiana Jones’ dad searching for the Holy Grail in The Last Crusade, I run the risk of making the Duck TrueMotion 1 algorithm my lifelong obsession. I got that Duck TM1 source code into a state where I could compile it against a standalone program. It only required 38 C source files from the original vpvision source tree and a mess of attendant header files to boot.

Eventually, I noticed a .txt file somewhere in the original source tree. I have no idea what ‘hfb’ means in the tree so I’ll have a look in hfb/corelibs/hfb/generic/readme.txt:


Not so useful. Let’s try a file called iva/iva.txt:

 no documentation yet

Why do I bother? Why do I devote time to a hobby of interpreting the scrawlings that cavemen engineers splattered on their cube walls?

There are 2 copies of a file called ccreadme.txt that have statistics on the performance of various optimized colorspace converters. Someone was obviously proud of that work and saw fit to preserve the numbers for future generations.

Perhaps the only semi-useful documentation file I found is b_readme.txt, in the main tm1.0 source directory. See, there are 25 files in the tm1.0 source directory whose names have the format bXYmn.c. ‘b’ is ‘b’. As for the remainder of the characters, well, here’s where we get into the hieroglyphics: ‘X’ and ‘Y’ each represent a character, either s, f, t, or e. What do they mean? 16, 24, 32, or 8 bits, respectively. Huh? Your guess on the mapping logic is as valid as mine. ‘m’ is a number (0-3). This has to do with stretching mode, apparently, but what exactly the modes represent is not clearly explained (“Next character is a mode (0-3) such as same, stretch, and bright (implemented with interpolation or otherwise).”). ‘n’ is a number (0-2) indicating, I suppose, a stretching algorithm (“dumb, smart, or fat-freddy”). I googled for a fat freddy algorithm but only found references to an aberrant cartoon character. Apparently, even Duck/On2’s algorithms are inside jokes.

The file singles out a particular engineer, one Scott LaVarnway, for his exemplary work in the blitter department, and for one blitter in particular which is extra special and deviates from the naming convention somewhat.

I don’t know how, or if, I missed this file when I studied this codec the first time around. I do remember that it was painful to figure out which of the b????.c files I was supposed to RE. I could tell that different ones were used for 16- or 24-bit source data and implemented different interpolation modes. Now I recognize that, for understanding 24-bit decoding, bft00.c is perhaps my best bet (24-bit source data, 32-bit output target, no stretching, no interpolation algorithm).
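As a rough illustration, the naming convention can be decoded mechanically. The sketch below is not from the Duck source tree: the function and type names are made up for this example, and it assumes that ‘X’ is the source depth and ‘Y’ the target depth, which is only inferred from the reading of bft00.c above. The mode and algorithm digits are left as plain numbers because the readme does not spell out what each value means.

```python
import re
from typing import NamedTuple

# Letter-to-bit-depth mapping as reported from b_readme.txt in the post.
DEPTH_BITS = {"s": 16, "f": 24, "t": 32, "e": 8}


class BlitterName(NamedTuple):
    """Decoded fields of a bXYmn.c file name (hypothetical helper type)."""
    source_bits: int  # assumption: 'X' is the source depth
    target_bits: int  # assumption: 'Y' is the target depth
    mode: int         # 0-3, stretching mode; exact meaning unclear
    algorithm: int    # 0-2, stretching algorithm; exact meaning unclear


# 'b', two depth letters, a mode digit 0-3, an algorithm digit 0-2, then '.c'.
_PATTERN = re.compile(r"b([sfte])([sfte])([0-3])([0-2])\.c")


def parse_blitter_name(filename: str) -> BlitterName:
    """Split a blitter file name into its four encoded fields."""
    match = _PATTERN.fullmatch(filename)
    if match is None:
        raise ValueError(f"not a bXYmn.c blitter name: {filename!r}")
    x, y, m, n = match.groups()
    return BlitterName(DEPTH_BITS[x], DEPTH_BITS[y], int(m), int(n))


if __name__ == "__main__":
    print(parse_blitter_name("bft00.c"))
```

Under those assumptions, parse_blitter_name("bft00.c") yields a 24-bit source, a 32-bit target, mode 0 and algorithm 0. Any file that deviates from the convention, such as the one special blitter the readme mentions, is rejected with a ValueError rather than guessed at.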

So hopefully, that will help in figuring out the 24-bit mode. There is still the matter of many corrupted DUCK TM1 files that don’t work with the libavcodec TM1 decoder. It will be useful to feed them into this decoder somehow and see if it can handle them, and then figure out what the official decoder is doing that the lavc decoder needs to do. I thought I had a good idea of what the relevant functions were until I noticed that these particular functions were static within the file.

Now I think I have the real API figured out, at least partially: I believe I know how the data goes in, though I’m not clear on where the data comes out just yet. The initialization also doesn’t get very far before erroring out. Now I’m just debugging the thing using gdb to figure out why it’s not responding as expected.

Posted in On2/Duck, Reverse Engineering | 6 Comments »

6 Responses

  1. Kostya Says:

    From fellow hacker who understood truemotion 2 ;)

    You need only dxl{16,24}c.c and some tables in order to understand how it works. The blitters work on the decoded picture, and gamma correction is also unneeded; the only useful function from tm1.c is PopulateImage().

    As I wrote once – the art of RE’ing usually starts with determining what pieces will be needed to analyze. And TM1/TM2 contains a lot of lint.

  2. Multimedia Mike Says:

    Thanks for the tip. While perusing the blitter files, I was left wondering why none of them looked all that familiar. Now I realize that I never had to study them carefully the first time around. However, if I am trying to get this thing compiled so that it decodes a full image (i.e., not just understand the general decoding flow), won’t I need some of the blitters?

    Funny you should mention PopulateImage() since that’s the function I’m stuck on right now. I think there may be problems running the code on PowerPC even though it appears to be endian-safe. I may try it on x86.

  3. Kostya Says:

    While working on TM2 I found it very convenient to decode frames with my own program written in Perl to PNM format (it’s easiest to write).

    Also, you should still look at the mentioned files; it may be easier to write a wrapper around them, as they decode frames. I personally wince every time I look at that decoder library code.

  4. Multimedia Mike Says:

    Yeah, you and me both. Thankfully, I sincerely believe that the full Duck TM1 algorithm is the last useful bit of information to be gleaned from this code.

  5. Reimar Says:

    > the full Duck TM1 algorithm is the last useful bit of information to be gleaned from this code

    Some people might say even that is only because you have such a weird idea of “useful” ;-)
    And I am not so sure about the blitters. I think that one of the ffmpeg TM1 problems (right side of image is only gray) is that the resolution given in the file is after blitting, which includes upscaling in the X direction. But I found nothing in the files that would indicate which blitter to use…

  6. Kostya Says:

    Look at tm1.0/src/dxl24c.c, everything is there.
    It depends on the parameters passed to the decoding function: depending on them, that function calls functions from dxl16.c which decode one delta into one or two pixels.