Author Archives: Multimedia Mike

Electronic Lossless Viking Arts

I’m technically off the ffmpeg-devel list right now. A few days ago, my ISP started having trouble delivering email between my mail server and the list server. I’m still trying to resolve the problem. Wouldn’t you know, some interesting stuff has been going down:

Theora Superblock Traversal

Hands down, one of the hardest parts about the VP3/Theora coding method is the bizarre fragment traversal pattern. Most everything else that VP3 does is seen in other codecs. But the Hilbert pattern used for encoding and decoding coefficients creates a monster. Vector drawing program to the rescue!

Imagine a video frame with a resolution of 96×48, highly contrived for this example. The Y plane would consist of 12×6 fragments (each fragment being an 8×8 block of samples). Meanwhile, the U and V planes would each consist of 6×3 fragments. Recall that VP3 employs a notion of a superblock, which encompasses a group of 4×4 fragments. If either fragment dimension of a plane is not divisible by 4, the plane must be conceptually padded out to the nearest superblock boundary for the traversal pattern.


[Figure: Theora superblock decoding pattern]

The above drawing illustrates the order that the fragments are traversed in our hypothetical 96×48 frame when decoding coefficients. Note, in particular, the strange order in which fragments 50..53 are traversed.

I hope that’s clear enough. The challenge at hand is to establish data structures at the start of the encode or decode process. Generally, you will allocate an array of fragment data structures. Fragment indices 0, 1, 2, 3, etc. will proceed left to right, then top to bottom. The second row of the Y plane begins at index 12, the third row at index 24, and so on. But when it comes to decoding the DCT coefficients, it is necessary to traverse the fragment data structure array at indices 0, 1, 13, 12, 24, 36, 37, 25, and so on. Thus, it is customary to build an array that maps one set of indices to the other. I’m having flashbacks just thinking about it. I remember developing a completely different algorithm than the official VP3 decoder when I reimplemented the decoder for FFmpeg.
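For the curious, here is a rough sketch of how such a mapping array might be built for a single plane. This is not the algorithm from the official VP3 decoder or from FFmpeg; the names `HILBERT_4X4` and `build_traversal_map` are made up for illustration. The first eight offsets in the table come straight from the index sequence quoted above, while the remaining eight are assumed to follow the usual 4×4 Hilbert curve. The sketch also ignores the upside-down business, just like the drawing.

```python
# Assumed visiting order of the 16 fragment positions inside one 4x4
# superblock, as (x, y) offsets from the superblock's starting corner.
# The first eight match the sequence 0, 1, 13, 12, 24, 36, 37, 25 for a
# plane that is 12 fragments wide; the rest continue the Hilbert curve.
HILBERT_4X4 = [
    (0, 0), (1, 0), (1, 1), (0, 1),
    (0, 2), (0, 3), (1, 3), (1, 2),
    (2, 2), (2, 3), (3, 3), (3, 2),
    (3, 1), (2, 1), (2, 0), (3, 0),
]


def build_traversal_map(frag_width, frag_height, first_index=0):
    """Return raster fragment indices in superblock traversal order.

    frag_width / frag_height are the plane dimensions in fragments.
    first_index is the raster index of the plane's first fragment
    (hypothetical parameter, so later planes can be offset).
    """
    # Conceptually pad the plane out to whole superblocks.
    sb_cols = (frag_width + 3) // 4
    sb_rows = (frag_height + 3) // 4

    order = []
    for sb_y in range(sb_rows):
        for sb_x in range(sb_cols):
            for dx, dy in HILBERT_4X4:
                x = sb_x * 4 + dx
                y = sb_y * 4 + dy
                # Positions that fall in the padding do not exist
                # as real fragments, so they are skipped.
                if x < frag_width and y < frag_height:
                    order.append(first_index + y * frag_width + x)
    return order


# Y plane of the hypothetical 96x48 frame: 12x6 fragments.
y_map = build_traversal_map(12, 6)
print(y_map[:8])     # the sequence quoted above
print(y_map[48:56])  # first superblock of the padded second row
```

Skipping the padded positions is what produces the odd jumps in the bottom row of superblocks: the curve keeps going through fragments that are not there, and only the real ones get an entry in the map.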

Oh yeah, and remember that all VP3/Theora decoding is performed upside-down. I choose not to illustrate that fact in these drawings since this stuff is complicated enough.

Linus Is Still The Man

Linus Torvalds: a legendary figure who sat down one day and wrote an operating system. To many ordinary programmers like myself, he is a distant figurehead, difficult to comprehend. Every now and then, however, we catch a glimpse that helps us to humanize the mighty coder. And I don’t know about you, but I love a good knock-down, drag-out C vs. C++/Java/OOP flame war and this thread does not disappoint: Linus tells it like it is on the topic of C++.

Perhaps I’m too harsh on C++. In fact, there is one instance where I really appreciate the use of good, solid C++ coding: when a binary target that I wish to reverse engineer was originally authored in C++, compiled, and still has the mangled C++ symbols. The GNU binutils do a fabulous job of recovering the original class and method names, as well as argument lists.

Sometimes I think I should get off my high horse with regard to C. After all, this article from May listed C programming as one of the top 10 dead or dying computer skills, right up there with COBOL and OS/2. This is not the first time that I have encountered such sentiment, that C is going the way of raw assembler. I think it’s all a conspiracy perpetrated by the computer book publishing industry. The C language simply does not move anywhere near as many books as the latest flavor-of-the-month fad language.

What RE Looks Like

People ask me what binary reverse engineering is like. It’s tedious, that’s what it is. When it gets right down to it, you just have to concentrate on little instructions, none of which does much individually, and carefully piece together the bigger picture from huge blocks of these baby steps.

If you want to see what RE really looks like, come inside my notebook. Here are 3 choice pages, illustrating the process I used to figure out one of the inverse transforms that RealVideo 4 employs (and RealVideo 3, for that matter; both are H.264 prototypes):


RV40 transform RE notes
page 1, page 2, page 3

In this example, I used a lot of back substitution in order to figure out a series of math formulas. This case was greatly simplified by the fact that there were not very many mystery parameters to deal with. However, what complicated matters here was that there were few, if any, straightforward imul (integer multiplication) instructions. Even though multiplication figures heavily into the transform, most multiplications were performed by sequences of additions, bit shifts, and compound add/shift instructions (lea).
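To give a flavor of what that looks like, here is a small sketch of multiplications hidden inside shifts, adds, and lea. The constants 13, 7, and 17 are arbitrary picks for illustration and are not lifted from my notes or from the binary; the helper `lea` is a hypothetical stand-in that models the x86 `lea dst, [base + index*scale]` addressing form, where the scale can only be 1, 2, 4, or 8. Python integers never overflow, so the 32-bit wraparound of real registers is not modeled.

```python
def lea(base, index, scale):
    """Model x86 `lea dst, [base + index*scale]` on plain integers."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("lea only supports scale factors 1, 2, 4, 8")
    return base + index * scale


def times_13(x):
    # Two chained lea instructions, no imul in sight.
    t = lea(x, x, 2)     # t = x + 2*x = 3*x
    return lea(x, t, 4)  # x + 4*(3*x) = 13*x


def times_7(x):
    # Shift then subtract: 8*x - x = 7*x.
    return (x << 3) - x


def times_17(x):
    # Shift then add: 16*x + x = 17*x.
    return (x << 4) + x


# Back substitution in miniature: expand each step in terms of x
# and the hidden constant falls out.
for x in (1, 2, 5, 100):
    print(x, times_13(x), times_7(x), times_17(x))
```

When tracing the real thing, each intermediate register gets written out as an expression in terms of the inputs, and the expressions are substituted backward until a recognizable constant multiplier appears.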