Monthly Archives: April 2007

First Love: Vector Quantization

Someone was asking me about vector quantizer codecs recently. Sure, Wikipedia has the obligatory article. To its credit, the article is actually halfway useful these days (I seem to recall that it used to be a lot more impenetrable). It doesn’t help that the concept is identified by two terms that, by themselves, sound somewhat intimidating: ‘vector’ and ‘quantization’.

Anyway, he asked the right person about VQ codecs because I happen to love VQ codecs and can go on for days about them. In fact, I might do just that. I’ll start with a post about the theory and then describe specific examples in separate posts.


MPEG-4 Jump Start

Save for the yeoman’s work that our little community does on the MultimediaWiki, it’s generally quite difficult to come by solid technical data on specific multimedia codecs. That even holds true for the “open” MPEG codecs which are wrapped up in NDAs and licensing fees. So I was stunned to find a thick, colorful, well-illustrated book called “MPEG-4 Jump Start” at the local public library.

MPEG-4 Jump Start cover

I thought this looked highly promising because, while I know a lot of the general concepts surrounding image compression, I have never gotten too deep into MPEG-4 video compression for the simple reason that everyone else works on it. Thus, I don’t feel a need to.

Unfortunately, this book is not quite what I expected. I once asked the guru in passing whether or not FFmpeg supported the entire MPEG-4 spec. His terse response: “Very funny.” It turns out that MPEG-4 encompasses a huge number of features relating to sprite movement and 3D stuff that no one ever uses in practice these days. And that, my friends, is what this book was largely focused on. It may help to explain why Amazon presently lists used copies of this giant tome starting at $2.81.

There is, however, a followup volume entitled “More MPEG-4 Jump Start” (why do I get the feeling that MPEG probably has a separate committee dedicated to developing the names of these books?) that claims it will divulge more information about audio and video coding in the MPEG-4 scheme.

You can read an amusing passage about the unused body of MPEG-4 features under the “Enter the MPEG-4 behemoth” section at Deconstructing H.264/AVC.

Superblock Corner Cases

Look who has been playing around some more with the vector drawing program. Here’s an illustration of somewhat limited utility that still demonstrates an important point of the VP3/Theora coding scheme:

VP3/Theora superblock traversal corner cases

The image above depicts a hypothetical frame in the VP3/Theora coding scheme that has sample dimensions of 88×48, which works out to 11×6 fragments of 8×8 samples each. The valid fragments are depicted in green. Since these do not line up nicely on 32-sample superblock boundaries, the frame is rounded up to the next superblock boundary in each dimension, giving 3×2 superblocks (96×64 samples). The green fragments inside the turquoise zone are the visible fragments. The grey fragments are phantoms that still must be accounted for in the overall superblock traversal pattern when coding/decoding the transform coefficients.
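To make the arithmetic concrete, here is a minimal sketch in Python that counts the valid and phantom fragments for a given frame size. The function and key names are my own inventions for illustration; only the 8-sample fragment size and 32-sample superblock size come from the description above.

```python
FRAGMENT_SIZE = 8     # samples per fragment side
SUPERBLOCK_SIZE = 32  # samples per superblock side


def ceil_div(a, b):
    """Integer division that rounds up instead of down."""
    return -(-a // b)


def superblock_layout(width, height):
    """Count fragments and superblocks for a frame of width x height samples.

    Hypothetical helper: it only does the rounding arithmetic and knows
    nothing about the traversal order inside a superblock.
    """
    frags_per_sb_side = SUPERBLOCK_SIZE // FRAGMENT_SIZE  # 4 fragments
    frag_cols = ceil_div(width, FRAGMENT_SIZE)
    frag_rows = ceil_div(height, FRAGMENT_SIZE)
    sb_cols = ceil_div(frag_cols, frags_per_sb_side)
    sb_rows = ceil_div(frag_rows, frags_per_sb_side)
    valid = frag_cols * frag_rows
    slots = sb_cols * sb_rows * frags_per_sb_side ** 2
    return {
        "fragment_grid": (frag_cols, frag_rows),
        "superblock_grid": (sb_cols, sb_rows),
        "valid_fragments": valid,
        "phantom_fragments": slots - valid,
    }


layout = superblock_layout(88, 48)
print(layout)
```

For the 88×48 frame, the superblock grid covers 96 fragment positions, of which 66 are valid and 30 are phantoms that the traversal has to step over.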

There is also the matter of what happens when the width and height of the frame do not line up on fragment boundaries (i.e., are not divisible by 8). In that case, each dimension is rounded up to the next multiple of the fragment size for the purpose of transform coding.
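As a rough sketch of that rounding, the snippet below pads a dimension up to the next multiple of 8. The helper name `pad_to_fragment` and the 90×50 example frame are hypothetical, chosen just to show the arithmetic; this is not lifted from any decoder.

```python
def pad_to_fragment(size, fragment_size=8):
    """Round size up to the next multiple of fragment_size."""
    return (size + fragment_size - 1) // fragment_size * fragment_size


# A made-up 90x50 frame, which does not sit on 8-sample boundaries.
coded_width = pad_to_fragment(90)
coded_height = pad_to_fragment(50)
print(coded_width, coded_height)
```

A dimension that is already a multiple of 8, such as 88, comes back unchanged.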