Category Archives: Reverse Engineering

Brainstorming and case studies relating to craft of software reverse engineering.

Quacking Like Duck

I don’t learn very quickly. A good illustration of this personal shortcoming is my ongoing battle with the Duck TrueMotion 1 video codec, a conflict that is well into its 8th year at this point. This was one of my first exposures to the field of multimedia hacking as I discovered some Duck TM1-encoded AVI files on select Sega Saturn console games all the way back in 1998. I found a few Japanese Windows programs that could play the files. Early in my multimedia hacking career, I worked hard to reverse engineer the TM1 algorithm.

In 2002, On2 (formerly Duck) open sourced their VP3 codec (which would later become Theora). I kept stubbornly trying to RE TM1 from the binary when Alex notified me that On2’s open source VpVision package contains the decoders for TrueMotion 1 and 2. It also included the decoders for Duck DK3 and DK4 ADPCM, both of which I successfully had RE’d from binary code, and at around the time that VpVision had been released as open source.

The VpVision source has been quite valuable in RE’ing both the TM1 and TM2 algorithms. It’s extremely difficult to understand, however, as is often the case with code produced on a deadline. Some TM1 files still fail to work correctly with the native FFmpeg decoder. I always thought it would be useful if I could compile the code and run it in a step debugger, or perhaps profile it with other tools, in order to sort out the correct operation when decoding certain files. But the code seemed like such a mess that I didn’t think it would compile on Linux. It looked like it would only compile on Windows or maybe Mac, and only with some effort.

Continue reading →

Tonight’s ASM Strangeness

Check out this sequence:

copy 32-bit register A to register B
shift B ~~left~~ right by 0x1F (thanks to Reimar for spotting the mistake)
subtract 1 from B
logically AND B against A

Eventually, it dawned on me that the sequence saturates an integer to a minimum value of 0, only without using any of the more traditional branching logic for such an operation.

Speed demons, these programmers. They used some neat tricks. It looks like gcc didn’t, though. Either that or I don’t understand the deeper meaning of the instruction “lea esi, [esi+0]”. It strikes me as a NOP. But that’s not quite as bad as some code observed in another gcc-compiled module recently that saw fit to execute “mov eax, eax” after a function call before moving eax (the function’s return value) to its final destination.

I’m not complaining since, when the time comes (hopefully soon) to reverse engineer that module, the naive compilation will make the task more straightforward.

RE Math Puzzle

It’s late and it has been a long day, but I’m still trying to reverse engineer a particular math function. RE is such a wonderful endeavor for those who are keen on bizarre little number puzzles. Check this out and see if you can understand the overall effect:

pick some number, we’ll call it A
shift A left by 29, effectively multiplying A by 2²⁹
subtract A from the result, assign to B, so now B = A * 2²⁹ – A
multiply B by 8 and add A, assign to C: C = B * 8 + A

What is the net result of the bit shifts and subtractions? I have a feeling that there must be multiple operations combined into one, sort of like when a DCT-based video codec dequantizes and reorders coefficients at the same time. Time for some more math:

  C = B * 8 + A
  C = 8 * (A * 2²⁹ - A) + A
  C = 8 * A * 2²⁹ - 8 * A + A
  C = 2³ * 2²⁹ * A - 7 * A
  C = 2³² * A - 7 * A

In order to negate an N-bit integer in 2s complement arithmetic, you can subtract that integer from 2^N. For example, to negate 1 as a 32-bit integer: (2³² = 0x100000000) – 1 = (0xFFFFFFFF = -1). So the algorithm above appears to negate a number while multiplying it by 7 at the same time. These people were resolute to never ever use an actual ‘imul’ instruction unless absolutely necessary.

Well, I’m glad we had that little talk. I can sleep soundly tonight. I just hope this still makes sense to me in the morning.

Sega CD FMV VQ Analysis

I have amassed quite a collection of Sega CD titles over my years of multimedia hacking. Since it was an early CD-based console, I reasoned that at least some of the games would contain full motion video (FMV). In fact, essentially all Sega CD games fall into one of the following 2 categories:

Standard 16-bit Sega Genesis-type games that were enhanced by a Red Book CD audio soundtrack
Games that were driven entirely by very low-quality FMV

Mad Dog McCree (Sega CD version) -- Mayor's daughter

Screenshot of Mad Dog McCree for the Sega CD, an FMV-driven FPS

Many Sega CD games, particularly those published by Sega itself, contain many large files with the extension .sga. I have never made much headway on understanding any of these files, save for the fact that many of them use sign/magnitude 8-bit PCM audio. As for the video codec, “Cinepak” or “Cinepak for Sega” is often thrown around. I can certify that it is not the stock Cinepak data commonly seen in the early FMV era. Though perhaps the Sega CD console was the proving ground for later Cinepak technologies. A lot of Sega CD FAQs around the internet were apparently plagiarized from each other, which must have originally been plagiarized from Sega marketing material because they all shallowly list one of the system’s graphical capabilities as “Advanced compression scheme.”