Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Reverse Engineering Radius VideoVision

April 2nd, 2011 by Multimedia Mike

I was called upon to help reverse engineer an old video codec called VideoVision (FourCC: PGVV), ostensibly from a company named Radius. I’m not sure of the details exactly but I think a game developer has a bunch of original FMV data from an old game locked up in this format. The name of the codec sounded familiar. Indeed, we have had a sample in the repository since 2002. Alex B. did some wiki work on the codec some years ago. The wiki mentions that there existed a tool to transcode PGVV data into MJPEG-B data, which is already known and supported by FFmpeg.

The Software
My contacts were able to point me to some software, now safely archived in the PGVV samples directory. There is StudioPlayer2.6.2.sit.hqx, which is supposed to be a QuickTime component for working with PGVV data. I can’t even remember how to deal with .sit or .hqx data. Then there is a second archive, which is the tool that transcodes to MJPEG-B.

Disassembling for Reverse Engineering
Since I could actually unpack the transcoder, I set my sights on that. Unpacking the archive sets up a directory structure for a component. There is a binary called RadiusVVTranscoder under RadiusVVTranscoder.component/Contents/MacOS/. Basic deadlisting disassembly is performed via ‘otool’ as shown:

  otool -tV RadiusVVTranscoder | c++filt

This results in a deadlisting of both PowerPC and 32-bit x86 code, as the binary is a “fat” Mac OS X binary designed to run on both architectures. The command line also demangles C++ function signatures which gives useful insight into the parameters passed to a function.
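To see which architectures a fat binary carries without reading the whole deadlisting, the fat header can be inspected directly. The following is a minimal sketch, assuming the standard 32-bit fat header layout (a big-endian magic of 0xCAFEBABE, a slice count, then one 20-byte fat_arch record per slice). The function name list_fat_archs and the small CPU name table are mine, for illustration only.

```python
import struct

FAT_MAGIC = 0xCAFEBABE              # big-endian magic of a 32-bit fat header
CPU_NAMES = {7: "i386", 18: "ppc"}  # CPU_TYPE_X86 and CPU_TYPE_POWERPC

def list_fat_archs(data):
    """Return (name, offset, size) for each slice named in a fat header."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat Mach-O binary")
    archs = []
    for i in range(count):
        # struct fat_arch: cputype, cpusubtype, offset, size, align
        cputype, _subtype, offset, size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * i)
        archs.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return archs
```

Feeding it the first few kilobytes of the RadiusVVTranscoder binary should, if the layout assumption holds, report one PowerPC slice and one i386 slice.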

Pretty Pictures
The binary had a lot of descriptive symbols. As a basis for reverse engineering, I constructed call graphs using these symbols. Here are the 2 most relevant portions.

The codec initialization generates Huffman tables relevant to the codec:

The main decode function calls AddMJPGFrame which apparently does the heavy lifting for the transcode process:

Based on this tree, I’m guessing that luma blocks can be losslessly transcoded (perhaps with different Huffman tables) while chroma blocks may rely on a different quantization method.
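A call graph like the ones described above can be roughed out from a deadlisting with a few lines of script. This is only a sketch under assumptions: it expects C-style symbols with a leading underscore, function labels on their own line, and direct calls written as "call" or "calll" followed by a symbol name. The SAMPLE text is a hand-written imitation of a listing with placeholder names, not output from the actual binary, and calls through stubs or registers are ignored.

```python
import re

LABEL = re.compile(r"^(_[\w.]+):\s*$")       # e.g. "_SomeFunction:"
CALL = re.compile(r"\bcalll?\s+(_[\w.]+)")   # e.g. "calll _OtherFunction"

def build_call_graph(listing):
    """Map each labelled function to the set of symbols it calls directly."""
    graph = {}
    current = None
    for line in listing.splitlines():
        label = LABEL.match(line)
        if label:
            current = label.group(1)
            graph.setdefault(current, set())
            continue
        call = CALL.search(line)
        if call and current is not None:
            graph[current].add(call.group(1))
    return graph

# Hypothetical listing with placeholder symbol names.
SAMPLE = """\
_Outer:
00001000\tpushl\t%ebp
00001001\tcalll\t_Inner
00001006\tcalll\t_Leaf
0000100b\tret
_Inner:
00001010\tcalll\t_Leaf
00001015\tret
_Leaf:
00001020\tret
"""

for caller, callees in build_call_graph(SAMPLE).items():
    for callee in sorted(callees):
        print(caller, "->", callee)
```

The printed edges can be pasted into a graph drawing tool of choice. Demangled C++ signatures would need a looser label pattern than the one assumed here.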

Assembly Constructs
I started looking at the instructions (the x86 ones, of course). The binary uses a calling convention I haven’t seen before, at least not for the x86: Rather than pushing function arguments onto the stack, the code manually subtracts, e.g., 12 from the ESP register, loads 3 32-bit arguments into memory relative to ESP, and then proceeds with the function call.
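For what it is worth, a toy model suggests that this sequence leaves the stack in the same state as a series of pushes would. The sketch below assumes 32-bit arguments passed in the usual cdecl right-to-left order. A dictionary stands in for memory, and both function names are made up for illustration.

```python
def push_style(esp, memory, args):
    """Push the arguments one at a time, last argument first."""
    for value in reversed(args):
        esp -= 4                 # pushl value
        memory[esp] = value
    return esp

def sub_mov_style(esp, memory, args):
    """Reserve the space once, then store each argument at a fixed offset."""
    esp -= 4 * len(args)         # e.g. subl $12, %esp for three arguments
    for index, value in enumerate(args):
        memory[esp + 4 * index] = value   # movl value, 4*index(%esp)
    return esp

pushed, stored = {}, {}
esp_a = push_style(0x1000, pushed, [10, 20, 30])
esp_b = sub_mov_style(0x1000, stored, [10, 20, 30])
print(esp_a == esp_b, pushed == stored)
```

In both cases the first argument ends up at the lowest address, which is where the callee expects to find it just above the return address.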

I’m also a little unclear on constructs such as “call ___i686.get_pc_thunk.bx” seen throughout relevant functions such as MakeRadiusQuantizationTables().
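As far as I understand it, a thunk of this kind typically does nothing more than copy its own return address into EBX and return. That is an assumption about how such thunks usually look, not something read out of this binary. The toy model below uses a dictionary of registers and a dictionary for memory, and the function name is hypothetical.

```python
def call_get_pc_thunk_bx(regs, memory, return_address):
    """Model a call to the thunk: EBX receives the caller's next address."""
    # call: push the address of the instruction after the call
    regs["esp"] -= 4
    memory[regs["esp"]] = return_address
    # thunk body: movl (%esp), %ebx
    regs["ebx"] = memory[regs["esp"]]
    # ret: pop the return address back into EIP
    regs["eip"] = memory[regs["esp"]]
    regs["esp"] += 4
    return regs

regs = call_get_pc_thunk_bx({"esp": 0x1000, "ebx": 0, "eip": 0}, {}, 0x2005)
# Position-independent code can now address data relative to EBX.
data_address = regs["ebx"] + 0x40   # 0x40 is an arbitrary example displacement
```

After the call, the stack pointer is back where it started and EBX holds a known code address, so the function can reach its data by adding a fixed displacement.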

I’m just presenting what I have so far in case anyone else wants to try their hand.


10 Responses

  1. Owen S Says:

    Doing a SUB for all the stack space that will be needed at the start of the caller function, MOVing arguments into place, then calling, then doing this again, and so on, is pretty common in optimized code. Having multiple PUSHes in a row generates false dependencies on rSP (and can also add extra instructions and register pressure, as the arguments can’t go straight down onto the stack).

    It, of course, doesn’t really make much difference.

  2. astrange Says:

    gcc doesn’t generate push at -O.

    “call ___i686.get_pc_thunk.bx” is the PIC implementation for i386. I think it can be ignored.

  3. Pengvado Says:

    > The binary uses a calling convention I haven’t seen before, at least not for the x86

    All versions of GCC I tested do that at all optimization levels other than -Os. It isn’t even a unique calling convention per se, it’s just one possible sequence of instructions that implements cdecl.

    > call ___i686.get_pc_thunk.bx

    That’s for position-independent-code. The only way on x86_32 to get the instruction pointer into a general purpose register, is to “call” something and then pop the return address.

  4. Reimar Says:

    Hm, what makes you think that chroma uses different quantisation?
    For the AC part it still just copies the coefficients.
    I guess it’s possible that the DC part of chroma uses different quantisation, but maybe the format is just even simpler than JPEG and e.g. doesn’t use DC prediction for chroma or so?

  5. Multimedia Mike Says:

    Thanks, everyone, for the remedial reverse engineering tips. I’m clearly out of practice.

    @Reimar: I’m just wondering where the quantization comes into play. The CopyLumaBlock() function name implies a pretty simple process while TranscodeChromaBlock() implies something more in-depth. Then again, the only place where multiplication instructions show up is in MakeRadiusQuantizationTables().

  6. compn Says:

    you need stuffit expander to extract sit / hqx.

    good luck with the reverse engineering :)

  7. igorsk Says:

    [plug]IDA Pro handles OS X PIC code nicely:
    The Hex-Rays decompiler also does a pretty nice job on it:
    (note that the magic division by 50 was recognized).

  8. Multimedia Mike Says:

    @igorsk: Looks good, thanks!

  9. Louise Says:

    Did you ever get it to work? I have some old Radius files that I want to convert, too.

  10. Multimedia Mike Says:

    @Louise: We only have 1 sample file. If you can provide more, that would help us. I’ll be in touch.